Theoretical and Computational Aspects of Mixture Models, with Applications to Empirical Bayes Methods
Abstract
This thesis studies mixture models, in particular the estimation of mixing distributions and their applications to empirical Bayes prediction. The objectives are twofold: to study the large-sample properties of empirical Bayes estimators, and to develop algorithms for the nonparametric estimation of mixing distributions, as well as methods inspired by the Kiefer-Wolfowitz nonparametric maximum likelihood estimator. Asymptotic optimality of empirical Bayes estimators has been studied in the past by various authors, starting with Robbins (1956) and continuing with Deely and Zimmer (1976), Robbins (1964), and Rutherford and Krutchkoff (1969). They all worked in somewhat different settings, focusing not only on mixture models but on the general empirical Bayes methodology. Moreover, these authors considered exclusively the squared loss in prediction. In this thesis, we establish asymptotic optimality of empirical Bayes estimators; the results apply not only to the squared loss but to a large class of convex loss functions. A consistency result for Bayes estimators in mixture models, again for a large class of convex loss functions, is provided under mild conditions. Decision problems involving loss functions other than the squared loss are becoming increasingly popular. For instance, Mukherjee, Brown, and Rusmevichientong (2015) recently applied a parametric empirical Bayes method to the so-called newsvendor problem, which involves a piecewise linear loss function. The last chapter of this thesis compares their methodology with one based on mixture models and discusses the potential of the latter in this field. The second part of the thesis is devoted to the estimation of the mixing distribution in mixture models. Building on the breakthrough of Koenker and Mizera (2014) (see also Dicker and Zhao (2014) and Abadie and Kasy (2017)), we propose four estimation methods/algorithms.
The Cutting-Plane Method, which for technical reasons comes last, is in fact an alternative algorithm for the Kiefer-Wolfowitz nonparametric maximum likelihood estimator studied by Koenker and Mizera. However, unlike their algorithm, the Cutting-Plane Method is also applicable in higher-dimensional parameter spaces. The same is true of the remaining three proposed methods. Projected Stochastic Gradient can work in even higher dimensions, but its convergence may be slow. Stochastic Average Approximation is generally much faster, but in some versions its estimation target differs from that of the Kiefer-Wolfowitz nonparametric maximum likelihood estimator. This is even more true of Constraint Resampling, which is in fact an autonomous and novel estimation method; its properties, as well as those of the other proposed methods, are assessed via simulations and theoretical results. The penultimate chapter is devoted to facilitating multivariate data-analytic applications of the developed algorithms. Nonparametric empirical Bayes methods are studied in the presence of explanatory variables, and a nonparametric empirical Bayes regression model is proposed. In contrast to some previous approaches, this regression model has a very simple form and inherits most of the theoretical properties of nonparametric empirical Bayes procedures. Unlike methods based on the partial linear model, its parameter estimation is equivalent to a convex optimization problem in function space, which can be solved efficiently by the proposed algorithms.
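
For readers unfamiliar with the setting, the following is a minimal sketch of nonparametric empirical Bayes prediction in a univariate Gaussian location mixture. It is not one of the four methods proposed in the thesis, nor the algorithm of Koenker and Mizera: it approximates the Kiefer-Wolfowitz estimator on a fixed grid using plain EM iterations, a standard textbook device. The function names, the grid, the known noise level `sigma`, and the iteration count are all assumptions made for illustration. The posterior mean is the Bayes rule under squared loss, and the posterior quantile is the Bayes rule under a piecewise linear (check) loss.

```python
import numpy as np

def npmle_grid_em(x, grid, sigma=1.0, n_iter=300):
    """Grid approximation of the mixing-distribution MLE via EM (illustrative)."""
    # A[i, j] = Gaussian density of x_i given a component mean grid_j
    z = (x[:, None] - grid[None, :]) / sigma
    A = np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    w = np.full(grid.size, 1.0 / grid.size)  # uniform starting weights
    for _ in range(n_iter):
        mix = A @ w                           # mixture density at each x_i
        w = w * (A.T @ (1.0 / mix)) / x.size  # EM update; weights still sum to 1
    return w, A

def posterior_matrix(w, A):
    """Row i holds the estimated posterior over the grid given x_i."""
    post = A * w
    return post / post.sum(axis=1, keepdims=True)

def posterior_mean(w, A, grid):
    """Empirical Bayes rule under squared loss."""
    return posterior_matrix(w, A) @ grid

def posterior_quantile(w, A, grid, tau):
    """Empirical Bayes rule under the check loss with level tau."""
    cdf = np.cumsum(posterior_matrix(w, A), axis=1)
    idx = np.minimum((cdf >= tau).argmax(axis=1), grid.size - 1)
    return grid[idx]

# Hypothetical simulated data: two-point mixing distribution, unit-variance noise.
rng = np.random.default_rng(0)
theta = rng.choice([-2.0, 2.0], size=400)
x = theta + rng.normal(size=theta.size)
grid = np.linspace(-5.0, 5.0, 101)

w, A = npmle_grid_em(x, grid)
theta_hat = posterior_mean(w, A, grid)
theta_q = posterior_quantile(w, A, grid, tau=0.8)
```

In this simulated example, `theta_hat` shrinks each observation toward the estimated support points and typically has a smaller mean squared error than the raw observations `x`. Grid-based EM scales poorly with the dimension of the parameter space, which is the regime the methods described in the abstract are intended to address.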
