Browsing by Subject "Statistik"

Business cycles and institutions
empirical analysis
(2017) Kufenko, Vadim; Hagemann, Harald
The cumulative dissertation covers diverse aspects of the empirical analysis of business cycles and institutions. Three research questions are in focus. To address the interplay between business cycles and institutions, the first research question is formulated: could Malthusian cycles be present in a frontier economy with an abundance of land, and which institutions could be responsible for the Malthusian regime and the transition from it? To consider the far-reaching implications of economic cycles for the development of economic thought, the second question is stated: can economic fluctuations quantitatively influence research output? To address the methodology of business cycle analysis, the third question is raised: how may spurious periodicities emerge and how could one test for them? The main findings of the cumulative dissertation can be summarized as follows: i) it is shown that institutional arrangements may form economic constraints or build upon existing ones, responsible for the regimes in which cyclical fluctuations take place; ii) the interaction between the economic cycles and fluctuations in bibliometric variables representing research output in Economics as a science is analysed, and empirical evidence suggests the downswings of cycles stimulate more publications on the topic of crises and business cycles; iii) spurious periodicities emerge close to filtering bounds for real and simulated data after detrending, and it is demonstrated that simultaneous significance testing of spectral density peaks against the noise spectrum across different types of signals may help to reveal spurious periodicities.
Essays in health economics
(2018) Kaiser, Micha; Sousa-Poza, Alfonso
In economic theory, a lot of attention is given to understanding and modelling the consumption decisions of individuals. Most models assume that individuals consume different market goods and maximize their utility subject to certain constraints. These constraints can be of various kinds. Besides monetary constraints, health-related constraints are vitally important during the maximization process of individuals. In such a paradigm, individuals would benefit indirectly from being in a good health state, since this would imply that they are less constrained and could therefore shift their individual utility to a higher level. Moreover, health can also be treated as a good itself. Such an approach would assign a direct effect of different health states to an individual's utility rather than incorporating health states as a source of binding constraints. Apart from the different strategies in modelling the consumption decisions, both ways of thinking have in common that the achievement as well as the maintenance of a good health state is, to some extent, a necessary condition to foster the utility maximization process. Additionally, health outcomes of individuals are highly sensitive to economic circumstances and different policy interventions. For instance, a change in the individual's income will lead to an adjustment of the optimal consumption decision and therefore also to an adjustment of the health outcome (either in a direct or indirect way). Therefore, a profound understanding of the impact of changes in economic and political processes helps to assess their effects on the health outcomes of individuals. Hence, this thesis investigates the impact of different economic factors and policy interventions on health. 
In particular, the thesis contributes to the literature in the following way: Chapter two uses 22 years of data from the German Socio-Economic Panel and information on plant closures to investigate the effects of unemployment on four indicators of unhealthy lifestyles: diet, alcohol consumption, smoking, and (a lack of) physical activity. The main goal is to assess possible causal effects of unemployment on risky behaviors. In contrast to much of the existing literature, the empirical identification strategy used in this analysis is able to clearly identify exogenous effects and therefore avoids endogeneity, which may result from reverse causality. The main results provide little evidence that unemployment gives rise to unhealthy lifestyles. Chapter three evaluates the relation between preschool care and the well-being of children and adolescents in Germany by using data from the German Health Interview and Examination Survey of Children and Adolescents. Analyzing this relationship is important to provide conclusive knowledge for parents as well as policy-makers for several reasons. While parents are interested in providing the best health outcomes for their children, policy-makers need to balance a possible trade-off between economic as well as social costs and benefits related to preschool care. Additionally, the chapter examines differences in outcomes based on child socioeconomic background by focusing on the heterogeneous effects for migrant children. The findings suggest that children who have experienced child care have a slightly lower well-being overall. For migrant children, however, the outcomes indicate a positive relation. The fourth chapter analyzes how a nationwide population-based skin cancer screening program (SCS) implemented in Germany in 2008 has impacted the number of hospital discharges following malignant skin neoplasm diagnosis and the malignant melanoma mortality rate per 100,000 inhabitants. 
To this end, panel data from the Eurostat database, which covers subregions in 22 European countries, are analyzed for the years 2000-2013. By using fixed-effects methods, the causal relationship between the skin cancer screening program and the change in diagnosis and mortality rates is identified and a policy implication is derived. While the results indicate that Germany’s nationwide SCS program is effective in terms of a higher diagnosis rate for malignant skin neoplasms and thus may contribute to an improvement in the early detection of skin cancer, there is no significant influence on the melanoma mortality rate. Chapter five analyzes how closely different income measures conform to Benford’s law, a mathematical predictor of the probable first-digit distribution across many sets of numbers. Because Benford’s law can be used to test data set reliability, a Benford analysis is applied to assess the quality of six widely used health-related survey data sets. This is of particular importance for health economists, since the majority of empirical work in this field relies on information from survey data. The findings indicate that although income generally obeys Benford’s law, almost all the data sets show substantial discrepancies from it, which can be interpreted as a strong indicator of reliability issues in the survey data. This result is confirmed by a simulation, which demonstrates that household-level income data do not manifest the same poor performance as individual-level data. This finding implies that researchers should focus on household-level characteristics whenever possible to reduce observation errors.
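As an illustration of the Benford analysis mentioned in the abstract above, the following minimal sketch compares observed first digits with the Benford probabilities P(d) = log10(1 + 1/d). The chi-square goodness-of-fit statistic, the 5% critical value for 8 degrees of freedom, and the function names are assumptions made for this example; the abstract does not state which test the thesis actually uses.

```python
import math
from collections import Counter

# 5% critical value of the chi-square distribution with 8 degrees of freedom
# (nine digit classes minus one). Using this test is an assumption of the sketch.
CHI2_CRIT_8DF = 15.507

def benford_expected():
    """Benford probabilities for leading digits 1..9: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """Leading non-zero digit of a non-zero number, read from scientific notation."""
    return int(f"{abs(x):.15e}"[0])

def benford_chi2(values):
    """Chi-square distance between observed first digits and Benford's law."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one non-zero value")
    counts = Counter(digits)
    return sum((counts.get(d, 0) - n * p) ** 2 / (n * p)
               for d, p in benford_expected().items())

# Hypothetical inputs: powers of two are known to follow Benford's law closely,
# whereas uniformly distributed leading digits do not.
powers_of_two = [2 ** k for k in range(1, 1001)]
uniform_digits = list(range(1, 10)) * 100

print(benford_chi2(powers_of_two) < CHI2_CRIT_8DF)   # conforms
print(benford_chi2(uniform_digits) > CHI2_CRIT_8DF)  # deviates
```

Applied to an income variable, a statistic above the critical value would flag a deviation from Benford's law, which the abstract interprets as a possible sign of reliability problems.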
Extensions of genomic prediction methods and approaches for plant breeding
(2013) Technow, Frank; Melchinger, Albrecht E.
Marker-assisted selection (MAS) was a first attempt to exploit molecular marker information for selection purposes in plant breeding. The MAS approach rested on the identification of quantitative trait loci (QTL). Because of inherent shortcomings of this approach, MAS failed as a tool for improving polygenic traits in most instances. By shifting focus from QTL identification to prediction of genetic values, a novel approach called 'genomic selection', originally suggested for breeding of dairy cattle, presents a solution to the shortcomings of MAS. In genomic selection, a training population of phenotyped and genotyped individuals is used for building the prediction model. This model uses the whole marker information simultaneously, without a preceding QTL identification step. Genetic values of selection candidates, which are only genotyped, are then predicted based on that model. Finally, the candidates are selected according to their predicted genetic values. Because of its success, genomic selection completely revolutionized dairy cattle breeding. It is now on the verge of revolutionizing plant breeding, too. However, several features set plant breeding programs apart from dairy cattle breeding. Thus, the methodology has to be extended to cover typical scenarios in plant breeding. Providing such extensions for important aspects of plant breeding is the main objective of this thesis. Single-cross hybrids are the predominant type of cultivar in maize and many other crops. Prediction of hybrid performance is of tremendous importance for identification of superior hybrids. Using genomic prediction approaches for this purpose is therefore of great interest to breeders. The conventional genomic prediction models estimate a single additive effect per marker. This was not appropriate for prediction of hybrid performance for two reasons. (1) The parental inbred lines of single-cross hybrids are usually taken from genetically very distant germplasm groups. 
For example, in hybrid maize breeding in Central Europe, these are the Dent and Flint heterotic groups, separated for more than 500 years. Because of the strong divergence between the heterotic groups, it seemed necessary to estimate heterotic group specific marker effects. (2) Dominance effects are an important component of hybrid performance. They had to be included in the prediction models to capture as much of the genetic variance between hybrids as possible. The use of different heterotic groups in hybrid breeding requires parallel breeding programs for inbred line development in each heterotic group. Increasing the training population size with lines from the opposite heterotic group was not attempted previously. Thus, a further objective of this thesis was to investigate whether an increase in the accuracy of genomic prediction can be achieved by using combined training sets. Important traits in plant breeding are characterized by binomially distributed phenotypes. Examples are germination rate, fertility rates, haploid induction rate and spontaneous chromosome doubling rate. No genomic prediction methods for such traits were available. Therefore, another objective was to provide methodological extensions for such traits. We found that incorporation of dominance effects for genomic prediction of maize hybrid performance led to considerable gains in prediction accuracy when the variance attributable to dominance effects was substantial compared to additive genetic variance. Estimation of marker effects specific to the Dent and Flint heterotic groups was of less importance, at least under the high marker densities available today. The main reason for this was the surprisingly high linkage phase consistency between Dent and Flint heterotic groups. Furthermore, combining individuals from different heterotic groups (Flint and Dent) into a single training population can result in considerable increases in prediction accuracy. 
Our extensions of the prediction methods to binomially distributed data yielded considerably higher prediction accuracies than approximate Gaussian methods. In conclusion, the developed extensions of prediction methods (to hybrid prediction and binomially distributed data) and approaches (training populations combining heterotic groups) can lead to considerable, cost-free gains in prediction accuracy. They are therefore valuable tools for exploiting the full potential of genomic selection in plant breeding.
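To make the idea of adding dominance effects to an additive marker model concrete, here is a minimal sketch of how additive and dominance covariates could be coded for a single-cross hybrid of two homozygous inbred lines. The 0/1 allele coding, the function names, and the marker effects are hypothetical choices for this illustration; the abstract does not specify the parameterization used in the thesis.

```python
def hybrid_covariates(parent_a, parent_b):
    """Build per-marker covariates for a hybrid of two homozygous inbred lines.

    parent_a, parent_b: lists of 0/1 allele codes, one entry per marker
    (assumed coding). Returns (additive, dominance) lists:
      additive  = allele count centered at the heterozygote: -1, 0 or 1
      dominance = 1 if the hybrid is heterozygous at the marker, else 0
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be genotyped at the same markers")
    additive = [a + b - 1 for a, b in zip(parent_a, parent_b)]
    dominance = [1 if a != b else 0 for a, b in zip(parent_a, parent_b)]
    return additive, dominance

def predict_hybrid(additive, dominance, add_effects, dom_effects, intercept=0.0):
    """Linear predictor: intercept + sum of additive and dominance marker effects."""
    value = intercept
    value += sum(x * e for x, e in zip(additive, add_effects))
    value += sum(x * e for x, e in zip(dominance, dom_effects))
    return value

# Hypothetical example: one line from each heterotic group, three markers.
dent_line = [0, 1, 1]
flint_line = [1, 1, 0]
add_cov, dom_cov = hybrid_covariates(dent_line, flint_line)

# Marker effects would normally be estimated from a training population;
# the numbers below are made up for illustration only.
prediction = predict_hybrid(add_cov, dom_cov,
                            add_effects=[0.5, 0.2, -0.1],
                            dom_effects=[0.3, 0.0, 0.4],
                            intercept=10.0)
print(add_cov, dom_cov, prediction)
```

A purely additive model corresponds to setting all dominance effects to zero, which is why heterozygous markers contribute nothing beyond the intercept in that case.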
Genomic prediction in rye
(2017) Bernal-Vasquez, Angela-Maria; Piepho, Hans-Peter
Technical progress in the genomic field is accelerating developments in plant and animal breeding programs. The access to high-dimensional molecular data has facilitated acquisition of knowledge of genome sequences in many economically important species, which can be used routinely to predict genetic merit. Genomic prediction (GP) has emerged as an approach that allows predicting the genomic estimated breeding value (GEBV) of an unphenotyped individual based on its marker profile. The approach can considerably increase the genetic gain per unit time, as not all individuals need to be phenotyped. The accuracy of the predictions is influenced by several factors, and prediction requires proper statistical models able to overcome the problem of having more predictor variables than observations. Plant breeding programs run for several years and genotypes are evaluated in multi-environment trials. Selection decisions are based on the mean performance of genotypes across locations and, later on, across years. Under these conditions, linear mixed models offer a suitable and flexible framework to undertake the phenotypic and genomic prediction analyses using a stage-wise approach, allowing refinement of each particular stage. In this work, outlier detection methods, phenotypic analyses and GP models were evaluated and compared. In particular, it was studied whether identification and removal of possible outlying observations at the plot level has an impact on the predictive ability, whether an enhancement of phenotypic models by spatial trends leads to an improvement of GP accuracy, and finally, whether the use of the kinship matrix can enhance the dissection of GEBVs from genotype-by-year (GY) interaction effects. Here, the methods related to these objectives are compared using experimental datasets from a rye hybrid breeding program. 
Outlier detection methods widely used in many German plant breeding companies were assessed in terms of control of the family-wise error rate and their merits evaluated in a GP framework (Chapter 2). The benefit of implementing the methods based on a robust scale estimate was that, in routine analysis, such procedures reliably identified spurious data. This outlier detection approach per trial at the plot level is conservative and ensures that adjusted genotype means are not severely biased due to outlying observations. Whenever possible, breeders should manually flag suspicious observations based on subject-matter knowledge. Further, removing the flagged outliers identified by the recommended methods did not reduce predictive abilities estimated by cross validation (GP-CV) using data of a complete breeding cycle. A crucial step towards an accurate calibration of the genomic prediction procedure is the identification of phenotypic models capable of producing accurate adjusted genotype mean estimates across locations and years. Using a two-year dataset connected through a single check, a three-stage GP approach was implemented (Chapter 3). In the first stage, spatial and non-spatial models were fitted per location and year to obtain adjusted genotype-tester means. In the second stage, adjusted genotype means were obtained per year, and in the third stage, GP models were evaluated. Akaike information criterion (AIC) and predictive abilities estimated from GP-CV were used as model selection criteria in the first and in the third stage. These criteria were used in the first stage because a choice had to be made between the spatial and non-spatial models, and in the third stage because the predictive abilities allow a comparison of the results of the complete analysis obtained by the alternative stage-wise approaches presented in this thesis. 
The second stage was a transitional stage where no model selection was needed for a given method of stage-wise analysis. The predictive abilities displayed a different ranking pattern for the models than the AIC, but both approaches pointed to the same best models. The highest predictive abilities obtained for the GP-CV at the last stage did not coincide with the models that AIC and predictive ability of GP-CV selected in the first stage. Nonetheless, GP-CV can be used to further support model selection decisions that are usually based only upon AIC. There was a trend for models accounting for row and column variation to have better accuracies than the counterpart model without row and column effects, suggesting that row-column designs may be a potential option for setting up breeding trials. While bulking multi-year data allows increasing the training set size and covering a wider genetic background, it remains a challenge to separate GEBVs from GY effects when there are no common genotypes across years, i.e., when years are poorly connected or totally disconnected. First, in an approach considering the two-year dataset connected through a single check, adjusted genotype means were computed per year and submitted to the GP stage (Chapter 3). The year adjustment was done in the GP model by assuming that the mean across genotypes in a given year is a good estimate of the year effect. This assumption is valid because the genotypes evaluated in a year are a sample of the population. Results indicated that this approach is more realistic than relying on the adjustment of a single check. A further approach entailed the use of kinship to dissect GY effects from GEBVs (Chapter 4). It was not obvious which method best models the GY effect, thus several approaches were compared and evaluated in terms of predictive abilities in forward validation (GP-FV) scenarios. 
It was found that for training sets formed by several disconnected years’ data, the use of kinship to model GY effects was crucial. In training sets where two or three complete cycles were available (i.e. there were some common genotypes across years within a cycle), using kinship or not yielded similar predictive abilities. It was further shown that predictive abilities are higher for scenarios with a high degree of relatedness between training and validation sets, and that predicting a selection of top-yielding genotypes was more accurate than predicting the complete validation set when kinship was used to model GY effects. In conclusion, stage-wise analysis is recommended and it is stressed that the careful choice of phenotypic and genomic prediction models should be made case by case based on subject-matter knowledge and specificities of the data. The analyses presented in this thesis provide general guidelines for breeders to develop phenotypic models integrated with GP. The methods and models described are flexible and allow extensions that can be easily implemented in routine applications.
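The abstract above mentions outlier detection at the plot level based on a robust scale estimate. As a rough illustration only, the sketch below flags residuals whose distance from the median exceeds a multiple of the scaled median absolute deviation (MAD). The choice of MAD as the robust scale, the threshold of 3.5, and the function name are assumptions of this example; the thesis compares several methods and additionally considers control of the family-wise error rate, which is not reproduced here.

```python
from statistics import median

def flag_outliers(residuals, threshold=3.5):
    """Flag residuals that are far from the median on a robust scale.

    The scale is 1.4826 * MAD, which is consistent for the standard
    deviation under normality. The threshold is an assumed rule of thumb,
    not a value taken from the thesis.
    """
    med = median(residuals)
    mad = median([abs(r - med) for r in residuals])
    scale = 1.4826 * mad
    if scale == 0:
        # All residuals (nearly) identical: nothing can be flagged robustly.
        return [False] * len(residuals)
    return [abs(r - med) / scale > threshold for r in residuals]

# Hypothetical plot-level residuals from one trial; the last value is suspicious.
plot_residuals = [0.1, -0.2, 0.05, 0.0, -0.1, 0.15, 5.0]
flags = flag_outliers(plot_residuals)
print(flags)
```

In line with the recommendation in the abstract, flagged plots would be candidates for inspection based on subject-matter knowledge rather than automatic deletion.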
The development of phenotypic protocols and adjustment of experimental designs in Pelargonium zonale breeding
(2018) Molenaar, Heike; Piepho, Hans-Peter
Ornamental plant variety improvement is limited by current phenotyping approaches and the lack of use of experimental designs. Robust phenotypic data obtained from experiments laid out to best control local variation by blocking allow adequate statistical analysis and are crucial for any breeding purpose, including MAS. Often experiments consist of multiple phases, as in P. zonale breeding, where in the first phase stock plants are cultivated to obtain the stem cutting count and in the second phase the stem cuttings are further assessed for root formation. The first analyses of rooting experiments raised questions regarding options for improving the two-phase experimental layout, for example whether there is a disadvantage to using exactly the same design in both phases. The other question was whether a design can be optimized across both phases, instead of generating a separate layout for each phase, such that the MVD can be decreased. Moreover, optimal selection methods that maximize selection gain in P. zonale breeding based on available data collected from unreplicated trials and containing pedigree information were sought. This thesis was conducted to evaluate the benefits of using two-phase experimental designs and corresponding analysis in P. zonale for production-related traits, for which it was necessary to establish phenotyping protocols. To optimize the rooting experiments with their two-phase nature, alternative approaches were explored involving two-phase design generation either in phase-wise order or across phases. Furthermore, selection methods considering pedigree information (family-index selection) or not (individual selection) were evaluated to enhance selection efficiency in P. zonale breeding. The benefits of using experimental designs in P. zonale breeding were shown by the simulated response to selection. Alternative designs were evaluated by the MVD obtained by the intra-block analysis and the joint inter-block-intra-block analysis. 
The efficiency of individual and family-index selection was evaluated in terms of heritability obtained from linear mixed models implementing the selection methods. Simulated response to selection varied greatly, depending on the genotypic variances of the breeding population and traits. However, by using efficient designs allowing adequate analysis, a varietal improvement of over 20% of stock plant reduction is possible for stem cutting count, root formation, branch count and flower count. The smallest MVD for alternative designs was most frequently obtained for designs generated across phases rather than for each phase separately, in particular when both phases of the design were separated with a single pseudolevel. Family-index selection was superior to individual selection in P. zonale, indicating that the pedigree-based BLUP procedure can further enhance selection efficiency in production-related traits in P. zonale. The quantification of genotypic variation by phenotypic protocols and the optimized two-phase designs for estimating genotypic values were necessary and successful steps in laying the foundation for effective MAS. Phenotypic protocols effectively characterized the genetic material on an observational unit level, while the two-phase experimental designs enabled effective characterization on a genotype level by adjusting entry means using linear mixed models. The resulting adjusted entry means are the basis for future genotype-phenotype association for MAS.