Australian & New Zealand Journal of Statistics

Volume 61, Issue 2 (June 2019), p. 213-233

Original Article

Sequential imputation for models with latent variables assuming latent ignorability

Lauren J. Beesley (Corresponding Author)

lbeesley@umich.edu

orcid.org/0000-0002-3788-5944

Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, MI, 48109 USA

Jeremy M. G. Taylor

Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, MI, 48109 USA

Roderick J. A. Little

Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, MI, 48109 USA

First published: 05 July 2019

https://doi.org/10.1111/anzs.12264

Citations: 6

Summary

Models that involve an outcome variable, covariates, and latent variables are frequently the target for estimation and inference. The presence of missing covariate or outcome data presents a challenge, particularly when missingness depends on the latent variables. This missingness mechanism is called latent ignorable or latent missing at random and is a generalisation of missing at random. Several authors have previously proposed approaches for handling latent ignorable missingness, but these methods rely on prior specification of the joint distribution for the complete data. In practice, specifying the joint distribution can be difficult, restrictive, or both. We develop a novel sequential imputation procedure for imputing covariate and outcome data for models with latent variables under latent ignorable missingness. The proposed method does not require a joint model; rather, we use results under a joint model to inform imputation with less restrictive modelling assumptions. We discuss identifiability and convergence-related issues, and present simulation results in several modelling settings. The method is motivated and illustrated by a study of head and neck cancer recurrence.
