Browsing by Subject "Linkage disequilibrium"


Bridging genomics and genetic diversity
association between sequence polymorphism and trait variation in a spring barley collection
(2009) Haseneyer, Grit; Geiger, Hartwig H.
Association analysis has become common practice in plant genetics for high-resolution mapping of quantitative trait loci (QTL), validating candidate genes, and identifying important alleles for crop improvement. In the present study, the feasibility of association mapping in barley is investigated by associating DNA polymorphisms in selected candidate genes with variation in grain quality traits, plant height, and flowering time to gain further understanding of gene functions involved in the control of these traits. (1) As a starting point, a worldwide collection of spring barley (Hordeum vulgare L.) accessions was established to serve as an association platform for the present and possible further studies. This collection of 224 accessions, sampled from the IPK genebank, consists of 109 European, 45 West Asian and North African, 40 East Asian and 30 American entries. Forty-five EST-derived polymorphic SSRs were used to determine the genetic structure. The markers were equally distributed over all seven chromosomes. Phenotypic data were assessed in field experiments performed at three locations in 2004 and 2005 in Germany. (2) Seven candidate genes were considered. Fragments of these genes were amplified and sequenced in the established collection. Single nucleotide polymorphisms (SNPs), haplotype variants, and linkage disequilibrium (LD) were investigated. (3) One gene was additionally analysed in 42 bread wheat (Triticum aestivum L.) accessions in order to compare barley and wheat for nucleotide diversity and LD. (4) Association analysis between SNPs and haplotype variants of the selected candidate genes and the phenotypic variation in thousand-grain weight, crude protein content, starch content, plant height, and flowering time was used to identify candidate genes influencing the variation of these traits in spring barley. A mixed model association-mapping method was employed for this purpose.
In the established collection, significant genotypic variation was observed for all traits under study. Genotype×environment interaction variances were much smaller than the genotypic variances and heritability coefficients exceeded 0.9. Statistical analyses of population stratification revealed two major subgroups, mainly comprising two-rowed and six-rowed accessions, respectively. Within the sequenced fragments (13 kb) of the seven candidate genes, 216 polymorphic sites and 93 haplotypes were detected, demonstrating a moderate to high level of nucleotide and haplotype diversity in the germplasm collection. Most haplotypes (74.2%) occurred at a low frequency (<0.05) and were therefore excluded from the candidate gene-based association analysis. Pair-wise LD estimates between the detected SNPs revealed different intra-gene linkage patterns. The 45 SSR markers used for analysing the population structure revealed low intra- and interchromosomal LD (r²<0.2). Significant marker-trait associations between the candidate genes and the respective target traits were identified. The barley and wheat genes showed a high level of nucleotide identity (>95%) in the coding sequences, the distribution of polymorphisms was also similar in the two species, and both genes map to a syntenic position on chromosome 3. However, the genes differed between the two collections with respect to LD and Tajima's D statistic. In the barley collection only a moderate level of LD was observed, whereas in wheat LD was absolute between polymorphic sites located in the first intron, while it decayed with distance between those sites and the sites located downstream of the first intron. Differences in Tajima's D values indicate a lower selection pressure on the gene in barley than in wheat. In conclusion, the established association platform represents an excellent resource for marker-trait association studies.
The germplasm collection displays a wide range of genotypic and phenotypic diversity providing phenotypic data for economically important traits and comprehensive information about the nucleotide and haplotype polymorphism of seven candidate genes. Association results demonstrate that the candidate gene-based approach of association mapping is an appropriate tool for characterising gene loci that have a significant impact on plant development and grain quality in spring barley.
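To illustrate the pair-wise LD estimates mentioned in this abstract, the following minimal Python sketch computes the r² statistic between two biallelic SNPs from phased haplotypes. The function name `r_squared` and the 0/1 allele coding are assumptions made for this example; the abstract does not state which software or estimator was actually used.

```python
def r_squared(haplotypes):
    """Return r^2 between two biallelic loci.

    `haplotypes` is a sequence of (a, b) pairs, one per phased haplotype,
    where a and b are 0 or 1 (hypothetical coding of the two alleles
    at locus A and locus B).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")

    # Allele frequencies of allele 1 at each locus.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    # Frequency of haplotypes carrying allele 1 at both loci.
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")

    # D is the disequilibrium coefficient; r^2 is D^2 scaled by the variances.
    d = p_ab - p_a * p_b
    return d * d / denom
```

With this sketch, two SNPs whose alleles always co-occur give r² = 1, and two SNPs whose alleles combine independently give r² = 0.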
Development and applications of Plabsoft
a computer program for population genetic data analyses and simulations in plant breeding
(2008) Maurer, Hans Peter; Melchinger, Albrecht E.
Marker-assisted breeding approaches are promising tools for enhancement of the conventional plant breeding process. They have been successfully applied in many areas such as plant variety protection, classification of germplasm, assessment of genetic diversity, mapping of genes underlying important agronomic traits, and using the mapping information for selection decisions. Powerful and flexible bioinformatic tools are urgently required for a better integration of molecular marker applications and classical plant breeding methods. The objective of my thesis work was to develop and apply Plabsoft, a computer program for population genetic data analyses and simulations in plant breeding. The assumption of Hardy-Weinberg equilibrium is a cornerstone of many concepts in population and quantitative genetics. Therefore, tests for Hardy-Weinberg equilibrium are of crucial importance, but the assumptions underlying asymptotic chi-square tests are often not met in datasets from plant breeding programs. I developed and implemented in Plabsoft a new algorithm for exact tests of Hardy-Weinberg equilibrium with multiple alleles. The newly derived algorithm has considerable computational advantages over previously described algorithms and extends substantially the range of problems that can be tested. Knowledge about the amount and distribution of linkage disequilibrium (LD) in breeding populations is of fundamental importance to assess the prospects for gene mapping with whole-genome association studies. To analyze LD in breeding populations, I implemented various LD measures in Plabsoft and developed a new significance test for these LD measures. The routines were employed to analyze LD in 497 elite maize lines from a commercial hybrid breeding program, which were fingerprinted by 81 simple sequence repeat (SSR) markers covering the entire genome. Strong LD was detected and, therefore, whole-genome association studies were recommended as promising. 
However, LD between unlinked loci will most likely result in a high rate of false positives. The prediction of hybrid performance with DNA markers facilitates the identification of superior hybrids. The single marker models used so far do not take into account the correlation between allele frequencies at linked markers. To overcome this problem, the concept of haplotype blocks was proposed. I developed and implemented in Plabsoft three alternative algorithms for haplotype block detection suitable for plant breeding. The algorithms were applied for the haplotype-based prediction of the hybrid performance of 270 hybrids, the parents of which were fingerprinted with 20 amplified fragment length polymorphism (AFLP) primer combinations. Employing haplotypes resulted in an improved prediction of hybrid performance compared with single marker models. Consequently, haplotype-based prediction methods have a high potential to improve substantially the efficiency of hybrid breeding programs. Computer simulations can be employed to solve population genetic problems in plant breeding, for which the simplifying assumptions underlying the classical population genetic theory do not hold true. However, before the start of my thesis no flexible simulation software was available. I developed algorithms for simulation of single breeding steps and entire plant breeding programs and implemented these in Plabsoft. The routines allow the simulation of plant breeding programs as they are conducted in practice. The simulation routines of Plabsoft were validated by simulating two marker-assisted backcross programs in rice conducted by the International Rice Research Institute (IRRI). In the simulations, the frequency distributions of the proportion of recurrent parent genome in the backcross populations were assessed. The simulation results were in good agreement with the experimental data. 
Therefore, computer simulations are a useful tool for pre-test estimation of selection response in marker-assisted backcrossing. The application of Plabsoft was exemplified by two studies in maize. In the first study, the expected LD decay in the intermating generations of two recurrent selection programs was determined with simulations. This application demonstrates the use of Plabsoft to solve problems for which analytical results are not available. In the second study, the forces generating and maintaining LD in a hybrid maize breeding program were investigated with computer simulations. This application demonstrates the capability of modeling complex long-term breeding programs as performed in practice. The studies of my thesis provide an example of the broad range of possible applications of Plabsoft. In addition to the presented studies, Plabsoft has so far been employed in about 40 further studies, which corroborates the usefulness of Plabsoft for integrating new genomic tools in applied plant breeding programs.
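The idea of an exact test of Hardy-Weinberg equilibrium can be sketched for the classical two-allele case. This is not the multi-allele algorithm developed for Plabsoft, whose details the abstract does not give; the function name `hwe_exact_p` and its arguments are hypothetical. The sketch enumerates every heterozygote count compatible with the observed allele counts and sums the probabilities of all outcomes that are no more likely than the observed one.

```python
from fractions import Fraction
from math import factorial


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg p-value for one biallelic locus.

    Arguments are the observed genotype counts (AA, AB, BB).
    Returns the p-value as an exact Fraction.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes given")
    n_a = 2 * n_aa + n_ab  # number of A alleles
    n_b = 2 * n_bb + n_ab  # number of B alleles

    def prob(het):
        # Probability of `het` heterozygotes given the allele counts.
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        num = factorial(n) * 2 ** het * factorial(n_a) * factorial(n_b)
        den = (factorial(hom_a) * factorial(het) * factorial(hom_b)
               * factorial(2 * n))
        return Fraction(num, den)

    observed = prob(n_ab)
    p_value = Fraction(0)
    # The heterozygote count always has the same parity as n_a.
    for het in range(n_a % 2, min(n_a, n_b) + 1, 2):
        p = prob(het)
        if p <= observed:
            p_value += p
    return p_value
```

Exact rational arithmetic keeps the sketch simple and avoids rounding issues, at the cost of speed for large samples; a production implementation would more likely work with logarithms of factorials.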
Linkage disequilibrium and association mapping in elite germplasm of European maize
(2006) Stich, Benjamin; Melchinger, Albrecht E.
Linkage mapping has become a routine tool for the identification of quantitative trait loci (QTL) in plants. An alternative, promising approach is association mapping, which has been successfully applied in human genetics to detect QTL coding for diseases. The objectives of this research were to examine the feasibility of association mapping in elite maize breeding populations and to develop appropriate biometric methods for this purpose. The feasibility of association mapping depends on the extent of linkage disequilibrium (LD) as well as on the forces generating and conserving LD in the population under consideration. The objectives of our studies were to (i) examine the extent and genomic distribution of LD between pairs of simple sequence repeat (SSR) marker loci, (ii) compare these results with those obtained with amplified fragment length polymorphism (AFLP) markers, and (iii) investigate the forces generating and conserving LD in plant breeding populations. Our studies were based on experimental data of European elite maize inbreds as well as on computer simulations modeling the breeding history of the European flint heterotic group. The experimental results on European elite maize germplasm suggested that the extent of LD between SSR markers as well as between AFLP markers is encouraging for the detection of marker-phenotype associations in genome-wide scans. In populations with a short history of recombination, SSRs are advantageous over AFLPs in that they have a higher power to detect LD. In contrast, in populations with a long history of recombination, for which no LD is expected between pairs of SSR markers, AFLP markers should be favored over SSRs because the higher marker density they provide for a fixed budget can then be exploited.
Furthermore, the results of our experimental and simulation studies indicated that not only physical linkage is a cause of LD in plant breeding populations, but also relatedness, population stratification, genetic drift, and selection. So far, in plant genetics the logistic regression ratio test (LRRT) has been applied as a population-based association mapping approach. However, this test corrects only for LD caused by population stratification. The objectives of the presented study were to (i) adapt the quantitative pedigree disequilibrium test to typical pedigrees of inbred lines produced in plant breeding programs and (ii) compare the newly developed quantitative inbred pedigree disequilibrium test (QIPDT) and the commonly employed LRRT with respect to the power and type I error rate of QTL detection. This study was based on computer simulations modeling the breeding history of the European maize heterotic groups. In QIPDT the power of QTL detection was higher with 75 extended pedigrees than in LRRT with 75 independent inbreds. Furthermore, while the type I error rate of LRRT surpassed the nominal α level, the QIPDT adhered to it. These results suggested that the QIPDT is superior to the LRRT for genome-wide association mapping if data collected routinely in plant breeding programs are available. Epistatic interactions among QTL contribute substantially to the genetic variation in complex traits. The main objectives of our study were to (i) investigate by computer simulations the power and proportion of false positives for detecting three-way interactions among QTL involved in a metabolic pathway in populations of recombinant inbred lines (RILs) derived from a nested design and (ii) compare these estimates to those obtained for detecting three-way interactions among QTL in RIL populations derived from diallel and different partial diallel mating designs.
The computer simulations of this study were based on single nucleotide polymorphism haplotype data of 26 diverse maize inbreds. The power and proportion of false positives to detect three-way interactions with 5000 RILs derived from a nested design was relatively high for both the 4 QTL and the 12 QTL scenario. Higher power to detect three-way interactions was observed for RILs derived from optimally allocated distance-based designs than for RILs derived from a nested or diallel design. Our results suggested that association mapping methods adapted to the special features of plant breeding populations have the potential to overcome the limitations of classical linkage mapping methods.
Molecular genetic analysis of modified recurrent full-sib selection in two European F2 flint maize populations
(2007) Falke, Karen Christin; Melchinger, Albrecht E.
The continuous improvement of breeding material is a main goal in plant breeding. However, many breeding methods result in a reduction of the genetic variation of the breeding material. The primary objective of recurrent selection (RS) methods is to assure a long-term selection response for the target traits while maintaining the genetic variability of the germplasm for continued selection. The main goal of this thesis research was the molecular evaluation of two European F2 flint maize populations under modified recurrent full-sib selection. In detail, the objectives were to (1) investigate linkage mapping with intermated populations, (2) verify the decay of parental linkage disequilibrium (LD) present in the F2 base population through three generations of intermating, (3) identify quantitative trait loci (QTL) of traits under selection, (4) separate the effects of random genetic drift and selection on allele frequency changes during the selection procedure in QTL regions, (5) determine the extent of LD build-up by selection, and (6) analyze the effects of LD on the selection response. Four homozygous inbred lines (A, B, C, and D) were crossed to develop 380 F2:3 lines of population A × B and 140 F3:4 lines of population C × D. The F2 generations of both populations were intermated for three generations by chain crossing to produce F2Syn3 populations. Starting from the F2Syn3 populations, four (A × B) and seven (C × D) cycles of modified recurrent full-sib selection were conducted using a pseudo-factorial mating design. The selection procedure was based on a selection index. For the marker assays, a total of 104 (A × B) and 101 (C × D) SSRs were employed to genotype random subsets of 146 F2:3 lines in A × B and 110 F3:4 lines in C × D, 148 F2Syn3 plants, and the parents of the 36 selected full-sib families in each cycle of both RS programs. Genetic linkage maps were constructed for populations F2:3 (A × B) and F3:4 (C × D) and for both intermated populations F2Syn3.
In contrast to earlier studies, mapping methods developed specifically for intermated populations were applied and, thus, the estimated recombination frequencies referred to a single meiosis. Consequently, the map expansion reported in earlier studies was neither expected nor observed. To verify the expected higher mapping resolution of an intermated population compared with the respective base population, the precision of estimates of the recombination frequency r was quantified with the average information per individual, i_r, relative to an F2 individual. For intermated populations of infinite size, i_r > 1 when r < 0.142. However, with a population size of N = 240, as used in our mapping study, i_r > 1 only when r < 0.04. Therefore, we conclude that random genetic drift has a sizeable effect on i_r and, thus, can outweigh the advantages of intermating mapping populations. Three generations of intermating were primarily conducted to reduce the initial parental LD between linked loci in the F2 populations and its negative influence on the selection response. Our results demonstrated that the observed decay of LD agreed well with the theoretical expectations. In our modified recurrent full-sib selection program, a comparatively high selection response was reached. An evaluation at the molecular level revealed that further selection response can be expected, because neither fixation nor extinction of alleles was observed at the investigated marker loci. However, using SSR markers we also detected migration effects from the first selection cycles onward. Selection and random genetic drift are the main forces affecting changes in allele frequencies and, therefore, the selection response of RS programs. In our study, we analyzed changes in allele frequencies employing a test that allows the separation of selection and random genetic drift. Significant allele frequency changes due to selection were observed for 23% (C0 vs. C1) and 20% (C0 vs.
C4) of all loci in population A × B and for 6% (C0 vs. C1) and 13% (C0 vs. C7) in population C × D. In the base populations F2:3 (A × B) and F3:4 (C × D), several QTL for selection index and its components were detected. At some of these QTL, marker loci displayed significant allele frequency changes due to selection. Selection is expected to increase the frequency of favorable alleles and simultaneously build up a negative LD between them, which causes a reduction in the additive genetic variance and, hence, a decline in the selection response. However, the development of LD in our study displayed an erratic change over the selection cycles with only slight increases in positive and negative LD. A reduction in the additive genetic variance due to a build-up of negative LD is expected only if negative LD is observed at many marker loci. Therefore, we concluded that LD due to selection did not limit the selection response in our RS programs.
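The decay of parental LD through generations of intermating can be illustrated with the textbook expectation for a large random-mating population without selection, D_t = D_0 (1 - r)^t, where r is the recombination frequency between two loci and t the number of generations. The Python sketch below assumes this standard model; the abstract does not state the exact expectation used in the thesis, and the function name `expected_ld` is hypothetical.

```python
def expected_ld(d0, r, generations):
    """Expected LD coefficient D after `generations` of random mating.

    d0          -- initial LD coefficient D_0
    r           -- recombination frequency between the two loci (0 to 0.5)
    generations -- number of random-mating (intermating) generations
    Assumes an infinitely large population with no selection or drift.
    """
    if not 0 <= r <= 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5]")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return d0 * (1 - r) ** generations


# Fraction of the initial LD remaining after three intermating generations
# for a tightly linked (r = 0.05) and an unlinked (r = 0.5) pair of loci.
for r in (0.05, 0.5):
    print(r, expected_ld(1.0, r, 3))
```

Under this model, LD between unlinked loci is reduced to one eighth of its initial value after three generations, whereas LD between tightly linked loci decays much more slowly.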