Browsing by Subject "Self-vs-self data"

Now showing 1 - 1 of 1

Biometrical tools for heterosis research
(2010) Schützenmeister, André; Piepho, Hans-Peter
Molecular biological technologies are frequently applied for heterosis research. Large datasets are generated, which are usually analyzed with linear models or linear mixed models. Both types of model make a number of assumptions, and it is important to ensure that the underlying theory applies for datasets at hand. Simultaneous violation of the normality and homoscedasticity assumptions in the linear model setup can produce highly misleading results of associated t- and F-tests. Linear mixed models assume multivariate normality of random effects and errors. These distributional assumptions enable (restricted) maximum likelihood based procedures for estimating variance components. Violations of these assumptions lead to results, which are unreliable and, thus, are potentially misleading. A simulation-based approach for the residual analysis of linear models is introduced, which is extended to linear mixed models. Based on simulation results, the concept of simultaneous tolerance bounds is developed, which facilitates assessing various diagnostic plots. This is exemplified by applying the approach to the residual analysis of different datasets, comparing results to those of other authors. It is shown that the approach is also beneficial, when applied to formal significance tests, which may be used for assessing model assumptions as well. This is supported by the results of a simulation study, where various alternative, non-normal distributions were used for generating data of various experimental designs of varying complexity. For linear mixed models, where studentized residuals are not pivotal quantities, as is the case for linear models, a simulation study is employed for assessing whether the nominal error rate under the null hypothesis complies with the expected nominal error rate. Furthermore, a novel step within the preprocessing pipeline of two-color cDNA microarray data is introduced. The additional step comprises spatial smoothing of microarray background intensities. It is investigated whether anisotropic correlation models need to be employed or isotropic models are sufficient. A self-versus-self dataset with superimposed sets of simulated, differentially expressed genes is used to demonstrate several beneficial features of background smoothing. In combination with background correction algorithms, which avoid negative intensities and which have already been shown to be superior, this additional step increases the power in finding differentially expressed genes, lowers the number of false positive results, and increases the accuracy of estimated fold changes.