Increasing Marker Densities & Efficient Algorithms for Genomic Evaluations

This research study examines how increasing marker density affects genomic evaluations in Holstein dairy cattle. In simulation, raising the number of markers from 50,000 to 500,000 increased reliability by only 1.6%, and more than half of that gain came from genotyping just 1,406 young bulls at the higher density. The study also describes methods for imputing genotypes and computing genomic evaluations with large numbers of markers.

RESEARCH (Open Access)

Genomic evaluations with many more genotypes

Paul M VanRaden¹*, Jeffrey R O’Connell², George R Wiggans¹, Kent A Weigel³

Abstract

Background: Genomic evaluations in Holstein dairy cattle have quickly become more reliable over the last two years in many countries as more animals have been genotyped for 50,000 markers. Evaluations can also include animals genotyped with more or fewer markers using new tools such as the 777,000 or 2,900 marker chips recently introduced for cattle. Gains from more markers can be predicted using simulation, whereas strategies to use fewer markers have been compared using subsets of actual genotypes. The overall cost of selection is reduced by genotyping most animals at less than the highest density and imputing their missing genotypes using haplotypes. Algorithms to combine different densities need to be efficient because numbers of genotyped animals and markers may continue to grow quickly.

Methods: Genotypes for 500,000 markers were simulated for the 33,414 Holsteins that had 50,000 marker genotypes in the North American database. Another 86,465 non-genotyped ancestors were included in the pedigree file, and linkage disequilibrium was generated directly in the base population. Mixed density datasets were created by keeping 50,000 (every tenth) of the markers for most animals. Missing genotypes were imputed using a combination of population haplotyping and pedigree haplotyping. Reliabilities of genomic evaluations using linear and nonlinear methods were compared.

Results: Differing marker sets for a large population were combined with just a few hours of computation. About 95% of paternal alleles were determined correctly, and > 95% of missing genotypes were called correctly.
Reliability of breeding values was already high (84.4%) with 50,000 simulated markers. The gain in reliability from increasing the number of markers to 500,000 was only 1.6%, but more than half of that gain resulted from genotyping just 1,406 young bulls at higher density. Linear genomic evaluations had reliabilities 1.5% lower than the nonlinear evaluations with 50,000 markers and 1.6% lower with 500,000 markers.

Conclusions: Methods to impute genotypes and compute genomic evaluations were affordable with many more markers. Reliabilities for individual animals can be modified to reflect success of imputation. Breeders can improve reliability at lower cost by combining marker densities to increase both the numbers of markers and animals included in genomic evaluation. Larger gains are expected from increasing the number of animals than the number of markers.

Background

Breeders now use thousands of genetic markers to select and improve animals. Previously only phenotypes and pedigrees were used in selection, but performance and parentage information was collected, stored, and evaluated affordably and routinely for many traits and many millions of animals. Genetic markers had limited use during the century after Mendel’s principles of genetic inheritance were rediscovered because few major QTL were identified and because marker genotypes were expensive to obtain before 2008. Genomic evaluations implemented in the last two years for dairy cattle have greatly improved reliability of selection, especially for younger animals, by using many markers to trace the inheritance of many QTL with small effects.

More genetic markers can increase both reliability and cost of genomic selection. Genotypes for 50,000 markers now cost <US$200 per animal for cattle, pigs, chickens, and sheep.
Lower cost chips containing fewer (2,900) markers and higher cost chips with more (777,000) markers are already available for cattle, and additional genotyping tools will become available for cattle and other species in the near future. All three billion DNA base pairs of several Holstein bulls have been fully sequenced and costs of sequence data are rapidly declining.

* Correspondence: Paul.VanRaden@ars.usda.gov
¹Animal Improvement Programs Laboratory, USDA, Building 5 BARC-West, Beltsville, MD 20705-2350, USA
Full list of author information is available at the end of the article
VanRaden et al. Genetics Selection Evolution 2011, 43:10 http://www.gsejournal.org/content/43/1/10
© 2011 VanRaden et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reliabilities of genomic predictions were compared in previous studies for up to 50,000 actual or 1 million simulated markers. Reliabilities for young animals increased gradually as marker numbers increased from a few hundred up to 50,000 [1-3], and increased slightly when markers with low minor allele frequency were included [4]. For low- to medium-density panels (300 to 3,000 markers), selection of markers with large effects preserves more reliability if only the selected markers are used in the evaluation [5], but evenly spaced markers preserve more reliability for all traits if imputation is used [6]. Reliabilities increased from 81 up to 83% as numbers of simulated markers increased from 50,000 to 100,000 using 40,000 predictor bulls [7]; however, base population alleles in that study were in equilibrium rather than disequilibrium.
Increasing marker numbers above 20,000 up to 1 million linked markers resulted in almost no gains in reliability in a simulation of 10 chromosomes and 1,500 QTL [8]. Larger gains resulted in a simulation of only one chromosome containing three to 30 QTL that accounted for all of the additive variance [9]. Many genome-wide association studies of human traits have combined large numbers of markers from different chips [10], but those studies almost always estimated effects of individual loci rather than included all the loci to estimate the total genetic effect.

Many genotypes will be missing in the future when data from denser or less dense chips are merged with current genotypes from 50,000-marker chips or when two different 50,000-marker sets are merged, as is being done in the EuroGenomics project [11,12]. Missing genotypes of descendants can be imputed accurately using low-density marker sets if ancestor haplotypes are available [13-15]. At low marker densities, haplotypes provide higher accuracy than genotypes when included in genomic evaluation [1,16]. Missing genotypes were not an immediate problem with data from a 50,000-marker set because >99% of genotypes were read correctly [17].

Fewer markers can be used to trace chromosome segments within a population once identified by high-density haplotyping. Without haplotyping, regressions could simply be computed for available SNP and the rest disregarded. With haplotyping, effects of both observed and unobserved SNP can be included. Transition to higher density chips will require including multiple marker sets in one analysis because breeders will not re-genotype most animals.

Simulated genotypes and haplotypes can be more useful than real data to test programs and hypotheses. Examples are analyses of larger data sets than are currently available or comparison of estimated haplotypes with true haplotypes, which are not observable in real data.
Most simulations begin with all alleles in the founding generation in Hardy-Weinberg equilibrium and then introduce linkage disequilibrium (LD) using many non-overlapping generations of hypothetical pedigrees [18] or fewer generations of actual pedigree [19]. Simulations can also include selection [20] or model divergent populations such as breeds [21]. Many genomic evaluation studies simulated shorter genomes and fewer chromosomes than in actual populations, presumably because computing times for obtaining complete data were too long.

Goals of this study are to 1) impute genotypes using a combination of population and pedigree haplotyping, 2) compute genomic evaluations with up to 500,000 simulated markers, and 3) evaluate potential gains in reliability from increasing numbers of markers.

Methods

Haplotyping program

Unknown genotypes can be made known (imputed) from observed genotypes at the same or nearby loci of relatives using pedigree haplotyping or from matching allele patterns (regardless of pedigree) using population haplotyping. Haplotypes indicate which alleles are on each chromosome and can distinguish the maternal chromosome provided by the ovum from the paternal chromosome provided by the sperm. Genotypes indicate only how many copies of each allele an individual inherited from its two parents.

Fortran program findhap.f90 was designed to combine population and pedigree haplotyping. Genotypes were coded numerically as 0 if homozygous for the first allele, 2 if homozygous for the second allele, and 1 if heterozygous or not known; haplotypes were coded as 0 for the first allele, 2 for the second allele, and 1 for unknown to simplify matching. The algorithm began by creating a list of haplotypes from the genotypes in the first pass, and the process was iterated so genotypes earlier in the file could be matched again using haplotype refinements that occurred later.
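To see why this numeric coding simplifies matching, here is a minimal Python sketch. It is not the findhap.f90 source, and the function names `matches` and `fill_from_homozygous` are hypothetical. It assumes, following the population haplotyping steps described in this section, that a genotype matches a stored haplotype when no homozygous locus conflicts with it, and that unknown haplotype alleles can be filled from homozygous genotype loci. With the 0/1/2 coding, a conflict is exactly an absolute difference of 2.

```python
# Genotype codes: 0 = homozygous first allele, 2 = homozygous second allele,
# 1 = heterozygous or unknown.
# Haplotype codes: 0 = first allele, 2 = second allele, 1 = unknown.


def matches(genotype, haplotype):
    """Return True if no homozygous genotype locus conflicts with the haplotype."""
    # A conflict requires a homozygous genotype (0 or 2) facing the opposite
    # known allele (2 or 0), i.e. an absolute difference of exactly 2.
    return all(abs(g - h) != 2 for g, h in zip(genotype, haplotype))


def fill_from_homozygous(genotype, haplotype):
    """Fill unknown haplotype alleles (code 1) where the genotype is homozygous."""
    return [g if (h == 1 and g != 1) else h
            for g, h in zip(genotype, haplotype)]


genotype = [0, 1, 2, 2, 1]
stored = [0, 2, 1, 2, 1]   # compatible; third and fifth alleles unknown
other = [2, 0, 2, 2, 0]    # first locus conflicts (genotype 0, allele 2)

print(matches(genotype, stored))               # True
print(matches(genotype, other))                # False
print(fill_from_homozygous(genotype, stored))  # [0, 2, 2, 2, 1]
```

Because heterozygous and unknown genotypes share code 1, they can never produce a difference of 2, so they never block a match; only homozygous loci carry evidence against a candidate haplotype.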
Steps used in the population haplotyping algorithm were:
1) each chromosome was divided into segments of about 500 markers each when analyzing the 500,000 marker or mixed datasets and 100 markers each for 50,000 marker data;
2) the first genotype was entered into the haplotype list as if it was a haplotype;
3) any subsequent genotypes that shared a haplotype were then used to split the previous genotypes into haplotypes;
4) as each genotype was compared to the list, a match was declared if no homozygous loci conflicted with the stored haplotype;
5) any remaining unknown alleles in that haplotype were imputed from homozygous alleles […]

[…] Linkage phase was considered to be correctly called if estimated phase matched true phase for each adjacent pair of heterozygous markers.

Effects of quantitative trait loci (QTL) were simulated with a heavy-tailed distribution. Standard, normal effects (s) were converted to have heavy tails using the function $2^{|s - 2|}$. The locus with the largest effect contributed 2 to 4% of the additive genetic variance across five replicates, and the number of QTL was 10,000, which is greater than the 100 QTL used previously [19]. Small advantages of nonlinear over linear models for dairy cattle traits indicate many more QTL than previously assumed in most simulations. Similarly, human stature is very heritable (i.e. 0.8) but the 50 largest SNP effects account for only 5% of the variance [22]. If a few large QTL do exist, these causative mutations could be selected for directly instead of increasing density of markers everywhere.

Five replicates of the simulated data were analyzed as five traits, and QTL effects for each trait were independent. Just one set of genotypes contained the five QTL replicates for efficiency as in [19]. All QTL were located between the markers; none of the markers had a direct effect on the traits.
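The heavy-tailed transformation of QTL effects can be illustrated with a short Python sketch. It assumes the flattened expression `2abs(s - 2)` in the extracted text denotes 2 raised to the power |s - 2|; that reading, the function names, the seed, and the use of squared effects as a rough proxy for variance share (which ignores allele frequencies) are all assumptions made for illustration, not the authors' simulation code.

```python
import random


def heavy_tail(s):
    """Map a standard normal deviate s to a heavy-tailed effect size.

    ASSUMPTION: the source expression is read as 2 ** |s - 2|.
    """
    return 2.0 ** abs(s - 2.0)


def simulate_qtl_effects(n_qtl, seed=1):
    """Draw n_qtl standard normal deviates and transform each one."""
    rng = random.Random(seed)
    return [heavy_tail(rng.gauss(0.0, 1.0)) for _ in range(n_qtl)]


print(heavy_tail(2.0), heavy_tail(0.0), heavy_tail(-1.0))  # 1.0 4.0 8.0

effects = simulate_qtl_effects(10_000)  # 10,000 QTL as in the text
squared = [e * e for e in effects]
# Rough proxy only: the true variance share also depends on allele frequency.
largest_share = max(squared) / sum(squared)
print(f"largest squared effect / sum of squared effects: {largest_share:.4f}")
```

Under this reading, a deviate two standard deviations below the mean (s = -2) maps to an effect 16 times that of a deviate at s = 2, which is what produces a few comparatively large loci among thousands of small ones.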
Error variance for each genotyped animal was calculated from the reliability of its traditional milk yield evaluation, which for cows might include only one or a few records with a 30% heritability but for bulls could include hundreds or thousands of daughter records. Daughter equivalents from parents were removed from total daughter equivalents to obtain reliability from own records and progeny (RELprog), and error variance for each animal equalled additive genetic variance times the reciprocal of reliability minus one, i.e. $\sigma_a^2 (1/\mathrm{REL}_{prog} - 1)$.

Two mixed density data sets were simulated, which included genotypes from both 500,000- and 50,000-marker chips, to determine if a few thousand higher density genotypes would be sufficient to impute, using program findhap.f90, the missing genotypes for the other animals genotyped with 50,000 markers. The first analysis included 1,406 randomly chosen young bulls with 500,000 markers and the other 32,008 animals with 50,000 markers. The second analysis had 3,726 bulls with 500,000 markers, including 2,140 older bulls that had 99% reliability plus the same 1,406 young bulls, and the other 29,788 animals had 50,000 markers.

Genomic evaluation

The vector of observed, deregressed observations (y) was modelled with an overall mean (Xb), genotypes minus twice the base allele frequency (Z) multiplied by allele effects (u), a vector of polygenic effects for genotyped animals (p), and a vector of errors (e) with differing variance depending on RELprog:

$$y = Xb + Zu + p + e$$

To solve for polygenic effects, equations for all ancestors of the genotyped animals are included along with p, so that the simple inverse for pedigree relationships could be constructed [23]. Reliabilities of solutions for Zu + p were obtained from squared correlations of estimated and true breeding values and averaged across five replicates for 14,061 young bull predictions.
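The error variance rule stated in the Methods, additive genetic variance times (1/RELprog - 1), can be sketched as follows. The function names, the unit additive variance, and the 50% reliability example are assumptions for illustration; the 99% value echoes the older bulls mentioned in the text. Treating a simulated observation as true breeding value plus error is a simplification that drops the overall mean.

```python
import random


def error_variance(sigma2_a, rel_prog):
    """Error variance of a deregressed record: sigma_a^2 * (1/RELprog - 1)."""
    if not 0.0 < rel_prog <= 1.0:
        raise ValueError("rel_prog must be in (0, 1]")
    return sigma2_a * (1.0 / rel_prog - 1.0)


def simulate_observation(true_bv, sigma2_a, rel_prog, rng):
    """Simplified record: true breeding value plus error (mean omitted)."""
    sd = error_variance(sigma2_a, rel_prog) ** 0.5
    return true_bv + rng.gauss(0.0, sd)


sigma2_a = 1.0  # hypothetical additive genetic variance
rng = random.Random(7)
for label, rel in [("animal with 50% reliability", 0.50),
                   ("bull with 99% reliability", 0.99)]:
    var_e = error_variance(sigma2_a, rel)
    y = simulate_observation(0.0, sigma2_a, rel, rng)
    print(f"{label}: error variance {var_e:.4f}, simulated record {y:.3f}")
```

The error variance shrinks from 1.0 at 50% reliability to about 0.0101 at 99% reliability, so well-proven bulls receive far more weight in the evaluation than animals with few records.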
Dense markers account for most but not all of the additive genetic variation, and the remaining fraction of variance is the polygenic contribution (poly) assumed to be 10 and 0% of genetic variance with 50,000 and 500,000 markers, respectively. Values of poly have been assumed to equal from 0 to 20% of additive genetic variance in most national evaluations of actual 50,000-marker data; poly should increase with fewer or decrease with more available markers. An initial test with 500,000 markers indicated a 0.1% decrease in reliability and slower convergence with 5% poly as compared to 0% poly in the model.

Linear and nonlinear models were both applied to the simulated data using the same methods as [24]. The nonlinear model was analogous to Bayes A [9], and a range of values was tested for the parameter controlling the shape of the distribution for both marker densities.

Reliability approximation

Approximate reliability formulas are needed because correlations of true breeding value (BV) with genomic estimated breeding value (GEBV) are not available in actual data. The maximum genomic reliability that can be obtained in practice (RELmax) is limited by the maximum marker density and by the size of the reference population. As the reference population becomes infinitely large, reliability should approach 1 minus poly because poly is the residual QTL variance not traceable by the markers on the chip.

Total daughter equivalents (DEmax) from the reference population can be obtained by summing traditional reliabilities (RELtrad) minus the reliabilities of parent average (RELpa), multiplying by the ratio of error to sire variance (k), and dividing by the equivalent reference size (n) needed to achieve 50% genomic REL [25]:

$$\mathrm{DE}_{max} = \sum (\mathrm{REL}_{trad} - \mathrm{REL}_{pa})\, k / n.$$

Genomic reliabilities for individual animals can account for their traditional reliabilities, numbers of markers genotyped, quality of imputation, and relationship to the reference population.
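The daughter-equivalent bookkeeping of this section (DEmax above, together with the formulas for RELmax, DEgen and RELtot given in this section) can be chained in a short Python sketch. Everything numeric here is hypothetical: the reference population, the value of n, and RELsnp are made up, and k is taken from the sire-model identity k = (4 - h²)/h² with the 30% heritability mentioned in the Methods, which is an assumption rather than something the paper states. The conversion of a traditional reliability to daughter equivalents, DE = k REL / (1 - REL), is likewise a standard form assumed here.

```python
def de_max(rel_trad, rel_pa, k, n):
    """DEmax = sum(RELtrad - RELpa) * k / n over the reference population."""
    return sum(t - p for t, p in zip(rel_trad, rel_pa)) * k / n


def rel_max(de, k, poly):
    """RELmax = (1 - poly) * DEmax / (DEmax + k)."""
    return (1.0 - poly) * de / (de + k)


def de_gen(relmax, rel_snp, k):
    """DEgen = k * RELmax * RELsnp / (1 - RELmax * RELsnp)."""
    x = relmax * rel_snp
    return k * x / (1.0 - x)


def de_from_rel(rel, k):
    """ASSUMPTION: standard conversion DE = k * REL / (1 - REL)."""
    return k * rel / (1.0 - rel)


def rel_tot(de_trad, de_genomic, k):
    """RELtot = (DEtrad + DEgen) / (DEtrad + DEgen + k)."""
    total = de_trad + de_genomic
    return total / (total + k)


h2 = 0.30                 # heritability mentioned in the Methods
k = (4.0 - h2) / h2       # assumed sire-model ratio of error to sire variance
n = 3000.0                # hypothetical equivalent reference size
rel_trad = [0.90] * 9000  # hypothetical reference bulls
rel_pa = [0.30] * 9000

de = de_max(rel_trad, rel_pa, k, n)
rmax = rel_max(de, k, poly=0.10)
for rel_snp in (1.0, 0.9):  # perfect vs. imperfect imputation (hypothetical)
    dg = de_gen(rmax, rel_snp, k)
    total = rel_tot(de_from_rel(0.35, k), dg, k)
    print(f"RELsnp={rel_snp:.2f}  DEgen={dg:.2f}  RELtot={total:.3f}")
```

A useful sanity check on the definitions: when the summed reliabilities equal n, DEmax equals k, and with poly = 0 the genomic reliability is exactly 50%, matching the definition of n.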
Animals that are less or more related to the reference population may have lower or higher DEmax. Accounting for individual relationships is automatic with inversion [19] or can be approximated without inversion using elements of the genomic relationship matrix [4,26].

Conversion of DEmax to genomic REL should account for the fact that genotyped SNP do not perfectly track all QTL in the genome if full sequences are not available. Multiplication by 1 - poly prevents reliability from reaching 100%. If all reference animals are genotyped at the highest chip density, the expected genomic REL for young animals without pedigree information can be calculated as:

$$\mathrm{REL}_{max} = (1 - poly)\, \mathrm{DE}_{max} / (\mathrm{DE}_{max} + k).$$

Each animal’s traditional REL is converted to daughter equivalents (DEtrad), and these are added to DEmax adjusted for any additional error introduced by genotyping at lower SNP density. The reduced daughter equivalents from genomics (DEgen) can be calculated from the squared correlation between estimated and true genotypes averaged across loci (RELsnp) for each animal as:

$$\mathrm{DE}_{gen} = k\, \mathrm{REL}_{max}\, \mathrm{REL}_{snp} / (1 - \mathrm{REL}_{max}\, \mathrm{REL}_{snp})$$

The animal’s total reliability RELtot is computed from the sum of the daughter equivalents as:

$$\mathrm{REL}_{tot} = (\mathrm{DE}_{trad} + \mathrm{DE}_{gen}) / (\mathrm{DE}_{trad} + \mathrm{DE}_{gen} + k)$$

Results

Genotype simulation

Examples of simulated and actual LD patterns are in Figures 2 and 3, respectively. Squared correlations from actual or simulated genotypes were about equal on average for markers separated by 10 to 3000 kb, but actual genotypes had a wider range of values with more very high or low squared correlations that continued across more distant markers. Further testing or a modified algorithm may be needed to obtain a closer match.
If true LD is higher than simulated, the reliability of genomic predictions should also be higher, but the advantages of higher density would be less if the lower density markers already have strong LD with the QTL.

Haplotype imputation

Measures of imputation success from 50,000 markers, 500,000 markers, and the two mixed density datasets are in Table 1. Statistics are provided separately for animals with phenotypes in the reference population, labelled old, and animals without phenotypes, labelled young. In the single-density data sets, percentage of missing genotypes was 1.0% originally but after haplotyping only 0.07% were incorrect, i.e. 93% of the missing genotypes (0.93% of all genotypes) were imputed correctly. In the two mixed density data sets, 80 to 86% of the markers were missing originally and 93 to 96% of these missing markers were imputed. The remaining 6.4% and 3.3% of alleles in the two datasets that were not observed and not imputed were set to population allele frequency. If only one allele was imputed, allele frequency was substituted for only the other, unknown allele, and these loci counted as half imputed.

Many non-genotyped ancestors with 100% of markers missing originally had sufficiently accurate imputed data to meet the 90% call rate required for genotyped animals. Thus, 1,117 ancestors could have their imputed genotypes included in the genomic evaluation. Nearly all of those animals were dams because most sires were already genotyped. Imputation of the remaining non-genotyped sires was difficult because they had few progeny and because most dams of their progeny were not genotyped.

[Figure 2: Linkage pattern among markers on a simulated chromosome. Axes: distance (kb, 0 to 3000) vs. squared correlation (0 to 1).]
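The allele frequency substitution and the imputed shares quoted above can be checked with a small Python sketch. The helper names are hypothetical, alleles are coded here as 0 or 1 copies of the second allele with `None` for unknown (a simplification that differs from the 0/1/2 haplotype coding of findhap.f90), and the example frequency of 0.3 is made up. The percentages passed to `share_imputed` are the missing-before and missing-after values reported for the two mixed-density data sets.

```python
def dosage(allele_a, allele_b, freq):
    """Copies of the second allele, with unknown alleles (None) replaced by
    the population allele frequency."""
    return sum(freq if a is None else a for a in (allele_a, allele_b))


def imputed_fraction(allele_a, allele_b):
    """1.0 if both alleles are known, 0.5 if one is (half imputed), else 0.0."""
    return sum(a is not None for a in (allele_a, allele_b)) / 2


def share_imputed(missing_before_pct, missing_after_pct):
    """Share of originally missing genotypes that were filled by imputation."""
    return 1.0 - missing_after_pct / missing_before_pct


print(dosage(1, None, 0.3))       # one allele imputed, the other set to 0.3
print(imputed_fraction(1, None))  # 0.5
print(round(100 * share_imputed(86, 6.4)))  # 93
print(round(100 * share_imputed(80, 3.3)))  # 96
```

The last two lines reproduce the "93 to 96%" range from the missing-before (86% and 80%) and missing-after (6.4% and 3.3%) figures.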
Paternal alleles were determined incorrectly for about 2% of the heterozygous markers for young animals and for about 4% for old animals in the single-density data. Rates of incorrect paternal allele calls were low because nearly all sires were genotyped, but increased to about 5% for young and 7% for old animals in the mixed-density data. The most popular sires and dams had 100% correctly called linkage phases and paternal alleles, whereas animals with fewer close relatives had somewhat fewer correct calls. Linkage phase was determined incorrectly for less than 2% of the adjacent pairs of heterozygous markers, except for old animals in the mixed-density data when only young animals had been genotyped at higher density. Five percent or fewer of the missing high-density marker genotypes were imputed incorrectly.

The most frequent individual haplotype within a segment was observed on average 5,883 times and accounted for 8.8% of all haplotypes in the population. The most frequent estimated haplotypes were also the most frequent true haplotypes, and their frequencies were similar, averaging 9.2% true vs. 8.8% estimated frequency of the most common haplotype. High frequencies for fairly long haplotypes are not surprising given the pedigree structure and large contributions from popular sires in the recent past.

Numbers of estimated haplotypes averaged 6,627 per 500-marker segment and were very consistent across segments with a SD of only 229. Numbers of true haplotypes averaged 2,735 and were smaller than estimated, possibly because genotyping errors inflated the estimated counts. Numbers of estimated haplotypes decreased to an average of 5,092 per 100-marker segment used with the 50 K single-density data, but the SD increased to 318. The number of potential haplotypes was 66,828 with two haplotypes per animal and 33,414 animals, as compared to only 6,627 observed. Thus, each estimated haplotype was observed about 10 times on average.
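The haplotype counts in the preceding paragraphs are tied together by simple arithmetic, shown here as a short Python check that uses only numbers quoted in the text (the variable names are illustrative).

```python
animals = 33_414
potential_haplotypes = 2 * animals   # two haplotypes per animal
estimated_per_segment = 6_627        # average per 500-marker segment
most_frequent_count = 5_883          # average count of the top haplotype

copies_per_haplotype = potential_haplotypes / estimated_per_segment
top_share_pct = 100 * most_frequent_count / potential_haplotypes

print(potential_haplotypes)            # 66828
print(round(copies_per_haplotype, 1))  # 10.1
print(round(top_share_pct, 1))         # 8.8
```

Both derived values agree with the text: about 10 copies of each estimated haplotype on average, and 8.8% of all haplotypes accounted for by the most frequent one.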
[Figure 3: Linkage pattern from actual Holstein genotypes on chromosome 1. Axes: distance (kb, 0 to 3500) vs. squared correlation (0 to 1).]

Table 1 Measures of imputation success for single- and mixed-density data by age group

Markers used                                    Age¹    50 K    Mixed   Mixed   500 K
Number of 500 K genotypes                               0       1,406   3,798   33,414
Missing before imputation (%)                   all     1       86      80      1
Missing after imputation (%)                    all     0.04    6.4     3.3     0.05
Genotype error rate (%)                         young   0.03    1.3     0.9     0.03
                                                old     0.04    3.4     1.7     0.04
Incorrect genotypes (%)                         young   0.06    2.6     1.7     0.06
                                                old     0.08    7.3     3.4     0.08
Incorrect linkage phase (%)                     young   0.3     1.9     1.4     0.1
                                                old     0.4     5.4     2.5     0.2
Incorrect paternity (%)                         young   2.0     4.9     5.0     2.5
                                                old     4.3     7.6     6.2     4.2
Correlation² (estimated, true genotypes)        all     0.99    0.84    0.93    0.99
Reliability of linear breeding values (%)       young   82.6    83.4    83.7    84.1
Reliability of nonlinear breeding values (%)    young   84.4    85.3    85.6    86.0
Reliability gain (nonlinear), 500 K - 50 K (%)  young   0.0     0.9     1.2     1.6

¹ old are animals with phenotypes or progeny; young are animals without.

[…] [21]. Pedigrees are not recorded for many animals in actual populations, and much of this information can be recovered even using low density genotyping.

Computation

Algorithms for imputation are rapidly evolving to meet the demands of growing genomic datasets. Several programs such as those tested by Weigel et al. [6] are available and may provide similar or better results with fewer markers or animals, but most were not designed for very large populations or very dense markers. Fortran program findhap.f90 requires little time and memory and is available at http://aipl.arsusda.gov/software/index.cfm for download. Official genomic evaluations of USDA have used findhap.f90 to impute and include genotypes of dams since April 2010 and 3,000-marker genotypes since December 2010.
Further improvements to imputation algorithms will increase accuracy and allow smaller fractions of animals to be genotyped at highest density. New methods are needed for combining multiple densities, for example 3,000, 50,000, and 500,000 markers, in the same dataset. During the 5 months of review for this manuscript, version 2 of findhap.f90 was released with better properties than those documented here for version 1. Use of pedigree haplotyping followed by population haplotyping can further improve call rates and reduce error rates with similar computation required (Mehdi Sargolzaei, U. Guelph, personal communication, 2010).

The expense of genotyping 1,000-2,000 animals at higher density can be justified for a large population such as Holstein, but larger benefits may be needed if similar numbers are required within each breed. Experimental design is becoming a more important part of animal breeding to balance the speed, reliability and cost of selection. With many new technologies and options available, breeders and breeding companies need accurate advice on the potential of each investment to yield returns. Costs of genotyping are decreasing rapidly, and imputation using less dense marker sets allows the missing genotypes to be obtained almost for free.

Conclusions

Genotypes and genomic computations are rapidly expanding the data and tools available to breeders. Very high marker density increases reliability of within-breed selection slightly (1.6%) in simulation, whereas lower densities allow breeders to apply cost-effective genomic selection to many more animals. Numbers of reference animals affect reliability more than number of markers, and animals with imputed genotypes contribute to the reference population. New methods for combining information from multiple data sets can improve gains with less cost. Individual reliabilities can be adjusted to account for the number of markers and the accuracy of imputation.
More precise estimates of reliability allow breeders to properly balance benefits vs. costs of using different marker sets.

Computer programs that combined population haplotyping with pedigree haplotyping performed well with mixtures of 500,000 and 50,000 marker genotypes simulated for subsets of 33,414 animals. Population haplotyping methods rapidly matched DNA segments for individuals with or without genotyped ancestors, and pedigree haplotyping efficiently imputed genotypes of the non-genotyped parents and correctly filled most missing alleles for progeny genotyped with lower marker density. Accurate imputation can give breeders more reliable genomic evaluations on more animals without genotyping each for all markers.

List of abbreviations used

b: intercept (genetic base); BV: true breeding value; DEmax: genomic daughter equivalents with all markers observed; DEtrad: traditional daughter equivalents; DEgen: reduced daughter equivalents from genomics; e: vector of errors; GEBV: genomic estimated breeding value; k: ratio of error to sire variance; n: equivalent reference size needed to achieve 50% genomic reliability; p: vector of polygenic effects for each genotyped animal; poly: ratio of polygenic variance to additive genetic variance; RELmax: maximum genomic reliability for an animal with all markers observed; RELpa: reliability of parent average; RELprog: reliability from own records and progeny; RELsnp: squared correlation between estimated and true genotypes averaged across loci for each animal; RELtot: animal’s total reliability from all sources; RELtrad: reliability of traditional evaluation; u: vector of allele effects; X: incidence matrix (= 1) for intercept; y: vector of observations; Z: matrix of genotypes minus twice the base allele frequency; $\sigma_a^2$: additive genetic variance.

Acknowledgements

Mel Tooker assisted with computing and Tabatha Cooper provided technical editing.
Author details

¹Animal Improvement Programs Laboratory, USDA, Building 5 BARC-West, Beltsville, MD 20705-2350, USA. ²University of Maryland School of Medicine, Baltimore, MD, 21201, USA. ³University of Wisconsin, Madison, WI, 53706, USA.

Authors’ contributions

PV derived and programmed the algorithms and drafted the paper. JO and GW suggested several improvements to the imputation methods. KW reviewed available imputation algorithms and suggested experimental designs. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Received: 24 September 2010 Accepted: 2 March 2011 Published: 2 March 2011

References

1. Calus M, Meuwissen T, de Roos A, Veerkamp R: Accuracy of genomic selection using different methods to define haplotypes. Genetics 2008, 178:553-561.
2. Solberg T, Sonesson A, Woolliams J: Genomic selection using different marker types and densities. J Anim Sci 2008, 86:2447-2454.
3. VanRaden P, Van Tassell C, Wiggans G, Sonstegard T, Schnabel R, Taylor J, Schenkel F: Invited review: Reliability of genomic predictions for North American Holstein bulls. J Dairy Sci 2009, 92:16-24.
4. Wiggans G, VanRaden P, Bacheller L, Tooker M, Hutchison J, Cooper T, Sonstegard T: Selection and management of DNA markers for use in genomic evaluation. J Dairy Sci 2010, 93:2287-2292.
5. Weigel K, de los Campos G, González-Recio O, Naya H, Wu X, Long N, Rosa G, Gianola D: Predictive ability of direct genomic values for lifetime net merit of Holstein sires using selected subsets of single nucleotide polymorphism markers. J Dairy Sci 2009, 92:5248-5257.
6. Weigel K, de los Campos G, Vazquez A, Rosa G, Gianola D, Van Tassell C: Accuracy of direct genomic values derived from imputed single nucleotide polymorphism genotypes in Jersey cattle. J Dairy Sci 2010, 93:5423-5435.
7. VanRaden P, Wiggans G, Van Tassell C, Sonstegard T, Schenkel F: Benefits from cooperation in genomics. Interbull Bull 2009, 39:67-72.
8. Harris B, Johnson D: The impact of high density SNP chips on genomic evaluation in dairy cattle. Interbull Bull 2010, 42.
9. Meuwissen T, Goddard M: The use of family relationships and linkage disequilibrium to impute phase and missing genotypes in up to whole genome sequence density genotypic data. Genetics 2010, 10.1534/genetics.1110.113936.
10. Li Y, Willer C, Sanna S, Abecasis G: Genotype imputation. Annu Rev Genomics Human Genet 2009, 10:387-406.
11. Druet T, Schrooten C, de Roos A: Imputation of genotypes from different single nucleotide polymorphism panels in dairy cattle. J Dairy Sci 2010, 93:5443-5454.
12. Lund M, de Roos A, de Vries A, Druet T, Ducrocq V, Fritz S, Guillaume F, Guldbrandtsen B, Liu Z, Reents R, Schrooten C, Seefried M, Su G: Improving genomic prediction by EuroGenomics collaboration. Proceedings of the Ninth World Congress on Genetics Applied to Livestock Production: 1-6 August 2010; Leipzig 2010, 0880.
13. Burdick J, Chen W, Abecasis G, Cheung V: In silico method for inferring genotypes in pedigrees. Nat Genet 2006, 38:1002-1004.
14. Habier D, Fernando R, Dekkers J: Genomic selection using low-density marker panels. Genetics 2009, 182:343-353.
15. Zhang Z, Druet T: Marker imputation with low-density marker panels in Dutch Holstein cattle. J Dairy Sci 2010, 93:5487-5494.
16. Villumsen T, Janss L: Bayesian genomic selection: the effect of haplotype length and priors. BMC Proc 2009, 3(Suppl 1):S11.
17. Wiggans G, Sonstegard T, VanRaden P, Matukumalli L, Schnabel R, Taylor J, Schenkel F, Van Tassell C: Selection of single-nucleotide polymorphisms and quality of genotypes used in genomic evaluation of dairy cattle in the United States and Canada. J Dairy Sci 2009, 92:3431-3436.
18. Meuwissen T, Hayes B, Goddard M: Prediction of total genetic value using genome-wide dense marker maps. Genetics 2001, 157:1819-1829.
19. VanRaden P: Efficient methods to compute genomic predictions. J Dairy Sci 2008, 91:4414-4423.
20. Sargolzaei M, Schenkel F: QMSim: a large-scale genome simulator for livestock. Bioinformatics 2009, 25:680-681.
21. Toosi A, Fernando R, Dekkers J: Genomic selection in admixed and crossbred populations. J Anim Sci 2010, 88:32-46.
22. Yang J, Benyamin B, McEvoy B, Gordon S, Henders A, Nyholt D, Madden P, Heath A, Martin N, Montgomery G: Common SNPs explain a large proportion of the heritability for human height. Nature Genet 2010, 42:565-569.
23. Henderson C: Inverse of a matrix of relationships due to sires and maternal grandsires. J Dairy Sci 1975, 58:1917-1921.
24. Cole J, VanRaden P: Visualization of results from genomic evaluations. J Dairy Sci 2010, 93:2727-2740.
25. VanRaden P, Sullivan P: International genomic evaluation methods for dairy cattle. Genet Sel Evol 2010, 42:7.
26. Liu Z, Seefried F, Reinhardt F, Reents R: Computation of approximate reliabilities. Interbull Bull 2010, 41.
27. Flaquer A, Fischer C, Wienker T: A new sex-specific genetic map of the human pseudoautosomal regions (PAR1 and PAR2). Hum Hered 2009, 68:192-200.
28. Cole J, VanRaden P, O’Connell J, Van Tassell C, Sonstegard T, Schnabel R, Taylor J, Wiggans G: Distribution and location of genetic effects for dairy traits. J Dairy Sci 2009, 92:2931-2946.
29. VanRaden PM, O’Connell JR, Wiggans GR, Weigel KA: Combining different marker densities in genomic evaluation. Interbull Bull 2010, 42:4.
30. Weigel KA, de los Campos G, Vazquez A, Van Tassell CP, Rosa GJM, Gianola D, O’Connell JR, VanRaden PM, Wiggans GR: Genomic selection and its effects on dairy cattle breeding programs. Proceedings of the Ninth World Congress on Genetics Applied to Livestock Production: 1-6 August 2010; Leipzig. communication 2010, 119:8.
31. VanRaden P: Genomic evaluations with many more genotypes and phenotypes. Proceedings of the Ninth World Congress on Genetics Applied to Livestock Production: 1-6 August 2010; Leipzig. communication 2010, 27:8.
32. Taylor J, Bean B, Marshall C, Sullivan J: Genetic and environmental components of semen production traits of artificial insemination Holstein bulls. J Dairy Sci 1985, 68:2703-2722.
33. Macciotta N, Gaspa G, Steri R, Nicolazzi E, Dimauro C, Pieramati C, Cappio-Borlino A: Using eigenvalues as variance priors in the prediction of genomic breeding values by principal component analysis. J Dairy Sci 2010, 93:2765-2774.
34. Villa-Angulo R, Matukumalli L, Gill C, Choi J, Van Tassell C, Grefenstette J: High-resolution haplotype block structure in the cattle genome. BMC Genetics 2009, 10:19-31.

doi:10.1186/1297-9686-43-10
Cite this article as: VanRaden et al.: Genomic evaluations with many more genotypes. Genetics Selection Evolution 2011 43:10.