
Normal distribution based pseudo ML for missing data: With applications to mean and covariance structure analysis

Author

Listed:
  • Yuan, Ke-Hai

Abstract

When missing data are either missing completely at random (MCAR) or missing at random (MAR), the maximum likelihood (ML) estimation procedure preserves many of its properties. However, in any statistical modeling, the distribution specification for the likelihood function is at best only an approximation to the real world. In particular, since the normal-distribution-based ML is typically applied to data with heterogeneous marginal skewness and kurtosis, it is necessary to know whether such a practice still generates consistent parameter estimates. When the manifest variables are linear combinations of independent random components and missing data are MAR, this paper shows that the normal-distribution-based MLE is consistent regardless of the distribution of the sample. Examples also show that the consistency of the MLE is not guaranteed for all nonnormally distributed samples. When the population follows a confirmatory factor model, and data are missing due to the magnitude of the factors, the MLE may not be consistent even when data are normally distributed. When data are missing due to the magnitude of measurement errors/uniqueness, MLEs for many of the covariance parameters related to the missing variables are still consistent. This paper also identifies and discusses the factors that affect the asymptotic biases of the MLE when data are not missing at random. The paper further shows that, under certain data models and an MAR mechanism, the MLE is asymptotically normally distributed and its asymptotic covariance matrix is consistently estimated by the commonly used sandwich-type covariance matrix. The results indicate that certain formulas and/or conclusions in the existing literature may not be entirely correct.
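
For readers unfamiliar with the estimator the abstract refers to, the following is a minimal illustrative sketch, not code from the paper. It evaluates the normal-distribution-based log-likelihood using only the observed entries of each case; the "pseudo ML" estimate is what maximizes this function over the mean and covariance parameters, whether or not the data are actually normal. The function name `observed_data_normal_loglik` and the use of NaN to mark missing values are assumptions made for this example.

```python
import numpy as np


def observed_data_normal_loglik(X, mu, Sigma):
    """Normal-theory log-likelihood summed over cases, using observed entries only.

    X     : (n, p) array; np.nan marks a missing value (an assumed coding).
    mu    : (p,) mean vector.
    Sigma : (p, p) covariance matrix.
    """
    X = np.asarray(X, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)

    total = 0.0
    for row in X:
        obs = ~np.isnan(row)              # observed coordinates for this case
        k = int(obs.sum())
        if k == 0:
            continue                      # a fully missing case contributes nothing
        d = row[obs] - mu[obs]            # deviation from the mean, observed part
        S = Sigma[np.ix_(obs, obs)]       # covariance block of the observed part
        _, logdet = np.linalg.slogdet(S)  # log-determinant of that block
        quad = d @ np.linalg.solve(S, d)  # Mahalanobis-type quadratic form
        total += -0.5 * (k * np.log(2.0 * np.pi) + logdet + quad)
    return total
```

In a mean and covariance structure analysis, `mu` and `Sigma` would be functions of the structural parameters (for example factor loadings and unique variances), and this quantity would be passed to a numerical optimizer. The sketch does not compute the sandwich-type covariance matrix mentioned in the abstract.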

Suggested Citation

  • Yuan, Ke-Hai, 2009. "Normal distribution based pseudo ML for missing data: With applications to mean and covariance structure analysis," Journal of Multivariate Analysis, Elsevier, vol. 100(9), pages 1900-1918, October.
  • Handle: RePEc:eee:jmvana:v:100:y:2009:i:9:p:1900-1918

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047-259X(09)00107-9
    Download Restriction: Full text for ScienceDirect subscribers only

    References listed on IDEAS

    1. Gourieroux, Christian & Monfort, Alain & Trognon, Alain, 1984. "Pseudo Maximum Likelihood Methods: Theory," Econometrica, Econometric Society, vol. 52(3), pages 681-700, May.
    2. C. Hendricks Brown, 1983. "Asymptotic comparison of missing data procedures for estimating factor loadings," Psychometrika, Springer;The Psychometric Society, vol. 48(2), pages 269-291, June.
    3. Geert Molenberghs & Caroline Beunckens & Cristina Sotto & Michael G. Kenward, 2008. "Every missingness not at random model has a missingness at random counterpart with equal fit," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(2), pages 371-388, April.
    4. Roderick J. A. Little, 1988. "Robust Estimation of the Mean and Covariance Matrix from Data with Missing Values," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 37(1), pages 23-38, March.
    5. Heckman, James, 2013. "Sample selection bias as a specification error," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 31(3), pages 129-137.
    6. Yuan, Ke-Hai & Jennrich, Robert I., 1998. "Asymptotics of Estimating Equations under Natural Conditions," Journal of Multivariate Analysis, Elsevier, vol. 65(2), pages 245-260, May.
    7. Amemiya, Takeshi, 1973. "Regression Analysis when the Dependent Variable is Truncated Normal," Econometrica, Econometric Society, vol. 41(6), pages 997-1016, November.
    8. Xin-Yuan Song & Sik-Yum Lee, 2002. "Analysis of structural equation model with ignorable missing continuous and polytomous data," Psychometrika, Springer;The Psychometric Society, vol. 67(2), pages 261-288, June.
    9. Carl Finkbeiner, 1979. "Estimation for the multiple factor model when data are missing," Psychometrika, Springer;The Psychometric Society, vol. 44(4), pages 409-420, December.
    10. Sik-Yum Lee, 1986. "Estimation for structural equation models with missing data," Psychometrika, Springer;The Psychometric Society, vol. 51(1), pages 93-99, March.
    11. Yuan, Ke-Hai, 1997. "A Theorem on Uniform Convergence of Stochastic Functions with Applications," Journal of Multivariate Analysis, Elsevier, vol. 62(1), pages 100-109, July.
    12. Bengt Muthén & David Kaplan & Michael Hollis, 1987. "On structural equation modeling with data that are not missing completely at random," Psychometrika, Springer;The Psychometric Society, vol. 52(3), pages 431-462, September.
    13. Tang, Man-Lai & Bentler, Peter M., 1998. "Theory and method for constrained estimation in structural equation models with incomplete data," Computational Statistics & Data Analysis, Elsevier, vol. 27(3), pages 257-270, May.
    14. Liu, Chuanhai, 1997. "ML Estimation of the Multivariate t Distribution and the EM Algorithm," Journal of Multivariate Analysis, Elsevier, vol. 63(2), pages 296-312, November.
    15. Kevin Kim & Peter Bentler, 2002. "Tests of homogeneity of means and covariance matrices for multivariate incomplete data," Psychometrika, Springer;The Psychometric Society, vol. 67(4), pages 609-623, December.
    16. White, Halbert, 1982. "Maximum Likelihood Estimation of Misspecified Models," Econometrica, Econometric Society, vol. 50(1), pages 1-25, January.

    Citations

    Cited by:

    1. Dursun Aydın & Ersin Yılmaz, 2021. "Semiparametric modeling of the right-censored time-series based on different censorship solution techniques," Empirical Economics, Springer, vol. 61(4), pages 2143-2172, October.
    2. Yuan, Ke-Hai & Savalei, Victoria, 2014. "Consistency, bias and efficiency of the normal-distribution-based MLE: The role of auxiliary variables," Journal of Multivariate Analysis, Elsevier, vol. 124(C), pages 353-370.
    3. Richard M. Golden & Steven S. Henley & Halbert White & T. Michael Kashner, 2019. "Consequences of Model Misspecification for Maximum Likelihood Estimation with Missing Data," Econometrics, MDPI, vol. 7(3), pages 1-27, September.
    4. Kano, Yutaka & Takai, Keiji, 2011. "Analysis of NMAR missing data without specifying missing-data mechanisms in a linear latent variate model," Journal of Multivariate Analysis, Elsevier, vol. 102(9), pages 1241-1255, October.
    5. Hayakawa, Kazuhiko, 2024. "Recent development of covariance structure analysis in economics," Econometrics and Statistics, Elsevier, vol. 29(C), pages 31-48.
    6. Ke-Hai Yuan & Mortaza Jamshidian & Yutaka Kano, 2018. "Missing Data Mechanisms and Homogeneity of Means and Variances–Covariances," Psychometrika, Springer;The Psychometric Society, vol. 83(2), pages 425-442, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tang, Man-Lai & Bentler, Peter M., 1998. "Theory and method for constrained estimation in structural equation models with incomplete data," Computational Statistics & Data Analysis, Elsevier, vol. 27(3), pages 257-270, May.
    2. Patrick Gagliardini & Elisa Ossola & Olivier Scaillet, 2016. "Time‐Varying Risk Premium in Large Cross‐Sectional Equity Data Sets," Econometrica, Econometric Society, vol. 84, pages 985-1046, May.
    3. Schwiebert, Jörg & Wagner, Joachim, 2015. "A Generalized Two-Part Model for Fractional Response Variables with Excess Zeros," VfS Annual Conference 2015 (Muenster): Economic Development - Theory and Policy 113059, Verein für Socialpolitik / German Economic Association.
    4. Ke-Hai Yuan & Wai Chan & Yubin Tian, 2016. "Expectation-robust algorithm and estimating equations for means and dispersion matrix with missing data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 68(2), pages 329-351, April.
    5. Tang, Man-Lai & Lee, Sik-Yum, 1998. "Analysis of structural equation models with censored or truncated data via EM algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 27(1), pages 33-46, March.
    6. Song, Weixing & Yao, Weixin, 2011. "A lack-of-fit test in Tobit errors-in-variables regression models," Statistics & Probability Letters, Elsevier, vol. 81(12), pages 1792-1801.
    7. Koul, Hira L. & Song, Weixing & Liu, Shan, 2014. "Model checking in Tobit regression via nonparametric smoothing," Journal of Multivariate Analysis, Elsevier, vol. 125(C), pages 36-49.
    8. Richard M. Golden & Steven S. Henley & Halbert White & T. Michael Kashner, 2019. "Consequences of Model Misspecification for Maximum Likelihood Estimation with Missing Data," Econometrics, MDPI, vol. 7(3), pages 1-27, September.
    9. repec:gnv:wpaper:unige:76321 is not listed on IDEAS
    10. Song, Weixing & Zhang, Yi, 2012. "Empirical L2-distance lack-of-fit tests for Tobit regression models," Journal of Multivariate Analysis, Elsevier, vol. 111(C), pages 380-396.
    11. Ke-Hai Yuan & Linda Marshall & Peter Bentler, 2002. "A unified approach to exploratory factor analysis with missing data, nonnormal data, and in the presence of outliers," Psychometrika, Springer;The Psychometric Society, vol. 67(1), pages 95-121, March.
    12. Ke-Hai Yuan & Zhiyong Zhang, 2012. "Robust Structural Equation Modeling with Missing Data and Auxiliary Variables," Psychometrika, Springer;The Psychometric Society, vol. 77(4), pages 803-826, October.
    13. Arvid Raknerud, 2002. "Identification, Estimation and Testing in Panel Data Models with Attrition: The Role of the Missing at Random Assumption," Discussion Papers 330, Statistics Norway, Research Department.
    14. Fernando Rios-Avila & Gustavo Canavire-Bacarreza, 2018. "Standard-error correction in two-stage optimization models: A quasi–maximum likelihood estimation approach," Stata Journal, StataCorp LP, vol. 18(1), pages 206-222, March.
    15. Hagmann, M. & Scaillet, O., 2007. "Local multiplicative bias correction for asymmetric kernel density estimators," Journal of Econometrics, Elsevier, vol. 141(1), pages 213-249, November.
    16. Rama Lionel Ngenzebuke, 2016. "Female say on income and child outcomes: Evidence from Nigeria," WIDER Working Paper Series 134, World Institute for Development Economic Research (UNU-WIDER).
    17. Kriwoluzky, Alexander, 2008. "Matching theory and data: Bayesian vector autoregression and dynamic stochastic general equilibrium models," SFB 649 Discussion Papers 2008-060, Humboldt University Berlin, Collaborative Research Center 649: Economic Risk.
    18. Rosario Crinò, 2010. "Service Offshoring and White-Collar Employment," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 77(2), pages 595-632.
    19. Malmendier, Ulrike M. & Botsch, Matthew J., 2020. "The Long Shadows of the Great Inflation: Evidence from Residential Mortgages," CEPR Discussion Papers 14934, C.E.P.R. Discussion Papers.
    20. Vijverberg, Wim P. & Hasebe, Takuya, 2015. "GTL Regression: A Linear Model with Skewed and Thick-Tailed Disturbances," IZA Discussion Papers 8898, Institute of Labor Economics (IZA).
    21. Tang, Niansheng & Wang, Wenjun, 2019. "Robust estimation of generalized estimating equations with finite mixture correlation matrices and missing covariates at random for longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 640-655.
