
Effect of misspecified response correlation in regression analysis

Files in this item

File: ubc_1996-0352.pdf
Size: 6.024Mb
Format: Adobe Portable Document Format
Title: Effect of misspecified response correlation in regression analysis
Author: Chui, Grace Shung-Lai
Degree: Master of Science - MSc
Program: Statistics
Copyright Date: 1996
Abstract: One can imagine a possible loss of parameter estimation efficiency when response correlation is ignored or misspecified in modeling a response-covariate relationship. Under what conditions is efficiency lost? How much is lost? Whether the responses are correlated or independent, the distribution of least squares parameter estimates in linear models (Gaussian responses) can be readily determined from standard theory. We find that the linear regression analysis assuming independent responses is (theoretically) never more efficient than that incorporating response dependence. The "difference" in efficiencies between these two analyses, measured by how much more readily the latter detects a non-zero regression coefficient, generally increases as the coefficient-to-noise ratio increases. To incorporate response correlation in generalized linear model (GLM) parameter estimation, Liang & Zeger (1986) extended the quasi-likelihood theory and developed the generalized estimating equations (GEE) approach. Although GEE is a popular method, the effects of misspecifying response correlation (e.g. assuming independence when responses are correlated) on parameter estimation efficiency are not obvious. To investigate such effects, we use simulation studies in which we generate count data and use the GEE approach to estimate the model parameters, using both the correct and misspecified correlation structures. The generated counts, the number of correlated responses in each cluster/replicate, and the total number of replicates are all small, to imitate health impact studies (in which hospital admission counts are often the responses). Despite possible loss of parameter estimation efficiency due to such "obstacles" intrinsic to the model, simulation results indicate that the GEE approach produces: 1. regression parameter estimates with relatively small empirical biases using either a correct or misspecified response correlation; 2. a good estimate of the response correlation matrix if its structure is correctly specified; 3. naive and robust variance estimates, both of which estimate the true variance well when the response correlation structure is correctly specified; and 4. good robust variance estimates even when the response correlation is misspecified. Furthermore, in a GLM with exchangeably correlated Poisson data and no covariates, specifying independence or exchangeable dependence yields the same intercept estimate and estimation efficiency, provided that inference is based on the robust variance estimate. The naive variance estimate can significantly underestimate the true variance if the responses are assumed independent when analyzing such a GLM.
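The last claim of the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the thesis's own code: it assumes equal cluster sizes, a log link, a dispersion parameter fixed at 1, and a shared-component construction (y_ij = z_i + e_ij) to generate exchangeably correlated Poisson counts. The function names, the seed, and the values of `mu`, `rho`, and the cluster counts are all hypothetical, and the number of clusters is deliberately larger than in the thesis so that the comparison is stable. With equal cluster sizes and no covariates, the exchangeable working correlation leads to the same estimating-equation solution (the grand mean), so only the working-independence fit is computed.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate by Knuth's multiplication method (suitable for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_clusters(rng, n_clusters, cluster_size, mu, rho):
    """Generate exchangeably correlated Poisson(mu) counts with correlation rho.

    Assumed construction: y_ij = z_i + e_ij, where z_i ~ Poisson(rho * mu) is
    shared within cluster i and e_ij ~ Poisson((1 - rho) * mu) is independent.
    """
    data = []
    for _ in range(n_clusters):
        shared = poisson(rng, rho * mu)
        data.append([shared + poisson(rng, (1.0 - rho) * mu)
                     for _ in range(cluster_size)])
    return data


def intercept_fit(data):
    """Intercept-only log-linear fit under working independence.

    Returns the intercept estimate with its naive (model-based) and
    robust (sandwich) variance estimates, taking the dispersion as 1.
    """
    n_obs = sum(len(cluster) for cluster in data)
    mu_hat = sum(sum(cluster) for cluster in data) / n_obs
    beta_hat = math.log(mu_hat)
    # "Bread": sum over clusters of D' V^{-1} D with D = mu * 1 and V = mu * I.
    bread = n_obs * mu_hat
    naive_var = 1.0 / bread
    # "Meat": sum over clusters of the squared cluster-level residual totals.
    meat = sum(sum(y - mu_hat for y in cluster) ** 2 for cluster in data)
    robust_var = meat / bread ** 2
    return beta_hat, naive_var, robust_var


rng = random.Random(1996)  # arbitrary seed, for reproducibility only
data = simulate_clusters(rng, n_clusters=500, cluster_size=4, mu=2.0, rho=0.5)
beta_hat, naive_var, robust_var = intercept_fit(data)
print("intercept estimate:", beta_hat)
print("naive variance:    ", naive_var)
print("robust variance:   ", robust_var)
print("robust / naive:    ", robust_var / naive_var)
```

Under these assumptions, a delta-method argument gives a true variance for the intercept estimate of roughly (1 + (n - 1) * rho) times the naive value, where n is the cluster size. The robust-to-naive ratio printed above should therefore fall near 2.5 for the chosen settings.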
URI: http://hdl.handle.net/2429/4521
Series/Report no.: UBC Retrospective Theses Digitization Project [http://www.library.ubc.ca/archives/retro_theses/]
Scholarly Level: Graduate

All items in cIRcle are protected by copyright, with all rights reserved.
