UBC Theses and Dissertations

Statistical modelling and inference for multivariate and longitudinal discrete response data

Xu, James Jianmeng

Abstract

This thesis presents research on modelling, statistical inference and computation for multivariate discrete data. I address the problem of how to systematically model multivariate discrete response data, including binary, ordinal categorical and count data, and how to carry out statistical inference and computations. To this end, I relate the multivariate models to similar univariate models already widely used in applications and to some multivariate models that were previously available but scattered in the literature, and I introduce new classes of models. The main contributions of this thesis to multivariate discrete data analysis lie in several distinct directions. In the modelling of multivariate discrete data, I propose two new classes of multivariate parametric discrete models: multivariate copula discrete (MCD) models and multivariate mixture discrete (MMD) models. Numerous new multivariate discrete models are introduced through these two classes, and several multivariate discrete models that have appeared in the literature are unified by them. With appropriate choices of copulas, these two classes of models allow the marginal parameters and dependence parameters to vary with covariates in a natural way. By using special dependence structures, the models can be used for longitudinal data with short time series or for repeated measures data. As a result, the scope of multivariate discrete data analysis is substantially broadened. In statistical inference and computation for multivariate models, I propose the inference function of margins (IFM) approach, in which each inference function is a likelihood equation for some marginal distribution of a multivariate distribution. Examples where the approach applies are the multivariate logit model with copulas that have certain closure properties and the multivariate probit model for binary data.
This general approach makes the estimation of parameters for the multivariate models computationally feasible. The corresponding asymptotic theory, the estimation of standard errors by the Godambe information matrix as well as by the jackknife method, and the efficiency of the IFM approach relative to the full multivariate likelihood approach are studied. Particular attention is given to models with special dependence structures (e.g. exchangeable or, where applicable, AR(1)-type copula dependence), and efficient parameter estimation schemes based on IFM (the weighting approach and the pool-marginal-likelihood approach) are developed. I also give detailed assessments of the efficiency of the GEE approach for estimating regression parameters in multivariate models; such assessments are lacking in the literature. Detailed analyses of existing data sets are provided to give concrete applications of the multivariate models and the statistical inference procedures in this thesis.
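
As a rough illustration of the IFM idea described in the abstract, the following minimal Python sketch fits a bivariate binary model in two stages: the marginal probabilities come from the univariate likelihoods, and the dependence parameter is then estimated from the bivariate likelihood with the margins held fixed. It is not taken from the thesis. The Farlie-Gumbel-Morgenstern (FGM) copula, the function names (`fgm_copula`, `cell_probs`, `ifm_fit`), the absence of covariates and the golden-section search are all assumptions chosen to keep the example short and dependency-free.

```python
import math


def fgm_copula(u, v, theta):
    """FGM copula C(u, v; theta), theta in [-1, 1] (an assumed, illustrative choice)."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def cell_probs(p1, p2, theta):
    """Cell probabilities of (Y1, Y2) when P(Yj = 1) = pj and the copula
    links the two Bernoulli margins through P(Y1 = 0, Y2 = 0) = C(1 - p1, 1 - p2)."""
    u, v = 1.0 - p1, 1.0 - p2
    c = fgm_copula(u, v, theta)
    return {(0, 0): c, (0, 1): u - c, (1, 0): v - c, (1, 1): 1.0 - u - v + c}


def ifm_fit(counts, tol=1e-8):
    """Two-stage IFM-style fit from a 2x2 table of counts keyed by (y1, y2).
    Assumes both sample margins lie strictly between 0 and 1."""
    n = sum(counts.values())

    # Stage 1: each marginal parameter solves its own univariate likelihood
    # equation, which for a Bernoulli margin is the sample proportion.
    p1 = (counts[(1, 0)] + counts[(1, 1)]) / n
    p2 = (counts[(0, 1)] + counts[(1, 1)]) / n

    # Stage 2: maximise the bivariate log-likelihood over theta only,
    # with the margins fixed at their stage-1 estimates.
    def neg_loglik(theta):
        probs = cell_probs(p1, p2, theta)
        return -sum(k * math.log(probs[cell]) for cell, k in counts.items() if k > 0)

    # Golden-section search on [-1, 1]; the log-likelihood is concave in theta
    # because every cell probability is linear in theta for the FGM copula.
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = -1.0, 1.0
    while hi - lo > tol:
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if neg_loglik(a) < neg_loglik(b):
            hi = b
        else:
            lo = a
    return p1, p2, 0.5 * (lo + hi)
```

For example, `ifm_fit({(0, 0): 305, (0, 1): 395, (1, 0): 95, (1, 1): 205})` returns the two marginal proportions and the FGM dependence estimate for that hypothetical table. Standard errors (for instance via the jackknife mentioned in the abstract) are outside the scope of this sketch.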

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply; see the Terms of Use: https://open.library.ubc.ca/terms_of_use.