Mixed modelling for phenotypic data from plant breeding

Möhring, Jens

Doctoral Thesis

2011

Dissertation_Jens_Moehring.pdf (439.81 KB)

Abstract (English)

Phenotypic selection and genetic studies require an efficient and valid analysis of phenotypic plant breeding data. The analysis must therefore take the mating design, the field design and the genetic structure of the tested genotypes into account.

Chapter 2 analyses unbalanced multi-environment trials (METs) in maize based on a factorial mating design. The dataset, which spans 30 years, is subdivided into periods of up to three years, and variance component estimates for general and specific combining ability are calculated for each period. While mean grain yield increased with ongoing inter-pool selection, no changes were found in mean dry matter yield or in the ratios of the variance component estimates. The continuing preponderance of general combining ability variance allows hybrid selection based on general combining ability effects.

Large datasets are often analysed in a stage-wise fashion: each trial or location is analysed separately to estimate adjusted genotype means per trial or location, and these means are then submitted to a mixed model to calculate genotype main effects across trials or locations. Chapter 3 studies the influence of stage-wise analysis on genotype main effect estimates for models that account for the typical genetic structure of genotype effects in plant breeding data. For comparison, the genetic effects are assumed to be either fixed or random. The performance of several weighting methods for the stage-wise analysis is assessed by correlating the two-stage estimates with the results of a one-stage analysis and by calculating the mean square error (MSE) between the two types of estimate. For random genetic effects, the genetic structure is modelled in one of three ways: with the numerator relationship matrix, with a marker-based kinship matrix, or with crossed and nested genetic effects. Stage-wise analysis yields genotype main effect estimates comparable to those of one-stage analysis for all weighting methods and for both random and fixed genetic effects, provided the model for analysis is valid. If an invalid model is chosen, e.g. if the missing data pattern is informative, both analyses are invalid and their results can differ. An informative missing data pattern can arise when the analysis ignores information that was used to select the analysed genotypes or, if not all genotypes are tested in all environments, to select the test environments of genotypes.

Correlated information from relatives is rarely used directly in the analysis of plant breeding data, but breeders often use it implicitly in selection decisions, e.g. by looking at both the performance of a genotype and the average performance of the underlying cross. Chapter 4 proposes a model with a joint variance-covariance structure for related genotypes in the analysis of diallels. This model is compared with other diallel models that are based on assumptions about the inheritance of several independent genes, i.e. on genetic models with more restrictive assumptions about the relationship between relatives. The proposed diallel model, which uses a joint variance-covariance structure for parents and for parental effects in crosses, is shown to be a general model that subsumes other, more specialized diallel models: the latter can be obtained from the general model by adding restrictions on the variance-covariance structure. If no a priori information about the genetic model is available, the proposed general model can outperform the more restrictive models. Restrictive models can yield biased variance component estimates if the data analysed do not fulfil the restrictions.

Chapter 5 evaluates whether subdividing 21 triticale genotypes into heterotic pools is preferable. Subdividing genotypes into heterotic pools implies a factorial mating design between heterotic pools and a diallel mating design within each heterotic pool. For two or more heterotic pools, the model is extended by assuming a joint variance-covariance structure for parental effects and general combining ability effects within the diallels and within the factorials. A model with two heterotic pools is shown to have the best fit. The variance component estimates for general combining ability decrease within the heterotic pools and increase between heterotic pools.

The results in Chapters 2 to 5 show that an efficient and valid analysis of phenotypic plant breeding data is an essential part of the plant breeding process. The analysis can be performed in one or two stages. The mixed models used, which account for the field and mating design and the genetic structure, can answer questions about the genetic variance in cultivar populations under selection and about the number of heterotic pools. The proposed general diallel model, with its joint variance-covariance structure between related effects, can be further modified for factorials and other mating designs with related genotypes.
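
The comparison criteria described for Chapter 3 (correlation and MSE between two-stage and one-stage estimates) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the data are made-up numbers, the one-stage estimates are simply given rather than fitted, and inverse-variance weighting is assumed here as just one possible weighting method for the second stage. All function and variable names are hypothetical.

```python
from math import sqrt


def weighted_mean(means, variances):
    """Stage two: combine per-trial adjusted means with inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


def pearson(x, y):
    """Pearson correlation between two equally long lists of estimates."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mse(x, y):
    """Mean square error between two equally long lists of estimates."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


# Hypothetical stage-one output: for each genotype, one
# (adjusted mean, variance of that mean) pair per trial.
stage_one = {
    "G1": [(9.8, 0.20), (10.4, 0.10)],
    "G2": [(8.9, 0.25), (9.3, 0.10)],
    "G3": [(11.2, 0.15), (10.6, 0.30)],
}

# Hypothetical one-stage estimates (assumed given, not fitted here).
one_stage = {"G1": 10.1, "G2": 9.2, "G3": 11.0}

# Stage two: one weighted mean per genotype across trials.
two_stage = {
    g: weighted_mean([m for m, _ in trials], [v for _, v in trials])
    for g, trials in stage_one.items()
}

genotypes = sorted(stage_one)
two = [two_stage[g] for g in genotypes]
one = [one_stage[g] for g in genotypes]

r = pearson(two, one)
err = mse(two, one)
print(f"correlation = {r:.3f}, MSE = {err:.4f}")
```

A correlation close to 1 together with a small MSE would indicate that the two-stage estimates closely track the one-stage estimates for the chosen weighting.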

Abstract (German, translated)

Efficient analysis of plant breeding data is needed for phenotypic selection on the one hand and for genetic studies on the other. The analysis must account for the trial design and the mating design as well as the genetic structure of the genotypes under test. Chapter 2 analyses a 30-year maize dataset with a factorial mating design. The dataset is divided into periods, and overall means and variance components for general and specific combining ability (g.c.a. and s.c.a.) are determined for each period. These estimates are then compared between periods. While grain yield increases over time, no change can be demonstrated for dry matter quantity or for the ratio of the variance components. The high share of g.c.a. variance in the total genetic variance permits hybrid selection based on g.c.a.

Large datasets are often analysed in two stages. First, means per trial or location are estimated. These are then used in an analysis across the trial series to obtain overall means. Chapter 3 examines the influence of a two-stage analysis on genotypic overall means, taking into account the relationships between genotypes that are typical of plant breeding data. Two-stage analyses with different weighting methods in the second stage are compared with a one-stage analysis. The genetic effects are assumed to be random, and three approaches are used to integrate the relationship information of the genotypes: a pedigree-based similarity matrix, a marker-based similarity matrix, or a model with nested and crossed genetic effects. For comparison, the datasets are also analysed with fixed genetic effects. The quality criteria are the correlation of the overall mean estimates with those of the one-stage analysis and the mean square error (MSE) between the two. Comparable mean estimates result for all weighting methods. With invalid models, for example when the missing data pattern is not random, one-stage and two-stage analyses differ, and both are then invalid. Informative missing data patterns can arise from missing relationship information when that information was used to select the tested genotypes or genotype-environment combinations.

While information from relatives is rarely modelled directly in the analysis of plant breeding data, breeders often use it implicitly: when assessing the performance of a genotype, they often also consider the merit of the whole cross. Chapter 4 proposes a model for the analysis of diallels that uses a joint variance-covariance matrix for all correlated genetic effects. A correlation is modelled between the parental effect and the g.c.a. effect of the same parent. This model is compared with other diallel models that are based on the inheritance of many independent genes and hence on more restrictive assumptions about the variance-covariance matrix. These models can be derived from the proposed model by adding the corresponding restrictions to its variance-covariance matrix. Deviations from the more restrictive variance-covariance structures can lead to biased variance component estimates. In the absence of prior information that the true inheritance model is better represented by other diallel models, the proposed model can potentially describe diallel data better.

Chapter 5 examines the subdivision of 21 triticale genotypes into heterotic groups. A subdivision implies factorial mating designs between heterotic groups and diallel mating designs within them. For two or more heterotic groups, the model from Chapter 4 is extended by assuming a joint variance-covariance matrix for the parental effect and the g.c.a. effects of the parent in the diallel and in the factorial designs. A model with two heterotic groups shows the best model fit. The g.c.a. variance shrinks within the heterotic groups and increases between them.

The results in Chapters 2 to 5 show that an efficient and valid analysis of phenotypic plant breeding data is an essential part of plant breeding. The analysis can be carried out in one or two stages. The mixed models account for the trial and mating design and can be used to answer questions about the development of genetic variances in breeding populations or about the optimal number of heterotic groups. The proposed diallel model, with a joint variance-covariance structure for all correlated genetic effects, can be extended to factorial designs and other mating designs with correlated genotypes.
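
The joint variance-covariance structure described for Chapter 4 can be written out as a small worked formulation. This is only a sketch under stated assumptions: the notation (y, p, g, s, e and the sigma symbols) is chosen here for illustration and is not taken from the thesis, and the exact model terms used by the author may differ.

```latex
% Illustrative notation only; all symbols are assumptions, not the thesis's own.
\begin{align*}
y_{ii} &= \mu + p_i + e_{ii}
  && \text{(parent } i \text{ per se, parental effect } p_i\text{)} \\
y_{ij} &= \mu + g_i + g_j + s_{ij} + e_{ij}
  && \text{(cross of parents } i \neq j\text{, g.c.a. effects } g_i, g_j\text{, s.c.a. effect } s_{ij}\text{)} \\
\begin{pmatrix} p_i \\ g_i \end{pmatrix}
  &\sim N\!\left(
    \begin{pmatrix} 0 \\ 0 \end{pmatrix},
    \begin{pmatrix} \sigma_p^2 & \sigma_{pg} \\ \sigma_{pg} & \sigma_g^2 \end{pmatrix}
  \right)
  && \text{(joint structure for the same parent } i\text{)}
\end{align*}
% Example of a restriction: forcing p_i = 2 g_i implies
% \sigma_p^2 = 4\sigma_g^2 and \sigma_{pg} = 2\sigma_g^2,
% i.e. a correlation of 1 between p_i and g_i.
```

In this sketch, a more specialized model corresponds to fixing relations among the three parameters of the 2 x 2 matrix, while the general model estimates them freely. The restriction shown in the comment is a hypothetical example of such a constraint, not necessarily one of the models compared in the thesis.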

Publication license

Copyright

Faculty

Faculty of Agricultural Sciences

Institute

Institute of Crop Science

Examination date

2011-03-16

Supervisor

Piepho, Hans-Peter

Cite this publication

Möhring, J. (2011). Mixed modelling for phenotypic data from plant breeding. https://hohpublica.uni-hohenheim.de/handle/123456789/5478

Identification

https://hohpublica.uni-hohenheim.de/handle/123456789/5478

Language

English

Classification (DDC)

630 Agriculture

Collections

Institut für Kulturpflanzenwissenschaften

Free keywords

Diallel; Two-stage analysis (German: Diallel; Zweischrittauswertung)

Standardized keywords (GND)

Gemischtes Modell; Biometrie; Biostatistik; Pflanzenzüchtung; Hohenheim; Phänotyp

BibTeX

@phdthesis{Möhring2011,
url = {https://hohpublica.uni-hohenheim.de/handle/123456789/5478},
author = {Möhring, Jens},
title = {Mixed modelling for phenotypic data from plant breeding},
year = {2011},
school = {Universität Hohenheim},
}
