Printed from https://ideas.repec.org/a/bpj/sagmbi/v10y2011i1n17.html

Imputation Estimators Partially Correct for Model Misspecification

Authors

  • Minin, Vladimir N.
  • O'Brien, John D.
  • Seregin, Arseni

Abstract

Inference problems with incomplete observations often aim at estimating population properties of unobserved quantities. One simple way to accomplish this estimation is to impute the unobserved quantities of interest at the individual level and then take an empirical average of the imputed values. We show that this simple imputation estimator can provide partial protection against model misspecification. We illustrate the robustness of imputation estimators to model misspecification with three examples: mixture model-based clustering, estimation of genotype frequencies in population genetics, and estimation of Markovian evolutionary distances. In the final example, using a representative model misspecification, we demonstrate that in non-degenerate cases the imputation estimator asymptotically dominates the plug-in estimator. We conclude by outlining a Bayesian implementation of imputation-based estimation.
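
The contrast between a plug-in estimator and an imputation estimator can be sketched with a toy calculation. The example below is an illustration only and is not taken from the article: it assumes a biallelic locus, a working model of Hardy-Weinberg equilibrium, genotypes missing completely at random, and a hypothetical helper named `estimate_aa_frequency`. The plug-in estimate computes the AA genotype frequency from the fitted model alone, whereas the imputation estimate averages individual-level values, using the model only where a genotype is unobserved.

```python
import numpy as np


def estimate_aa_frequency(genotypes):
    """Return (plug_in, imputation) estimates of the AA genotype frequency.

    genotypes: integers in {0, 1, 2} giving the number of copies of allele A,
    or -1 when the genotype is unobserved (hypothetical encoding).
    """
    g = np.asarray(genotypes)
    observed = g >= 0

    # Allele frequency estimate under the assumed Hardy-Weinberg model,
    # computed from the observed genotypes only.
    p_hat = g[observed].mean() / 2.0

    # Plug-in estimator: model-implied AA frequency at the fitted parameter.
    plug_in = p_hat ** 2

    # Imputation estimator: observed individuals contribute their actual AA
    # indicator; unobserved individuals contribute the model-based
    # expectation of that indicator. Then take the empirical average.
    imputed = np.where(observed, (g == 2).astype(float), p_hat ** 2)
    imputation = imputed.mean()

    return plug_in, imputation


# Toy data that violate Hardy-Weinberg equilibrium (excess homozygotes):
# 40 AA, 20 Aa, 40 aa observed, plus 100 individuals with missing genotypes.
data = np.array([2] * 40 + [1] * 20 + [0] * 40 + [-1] * 100)
true_aa = 0.4  # AA frequency among the observed individuals in this toy setup

plug_in, imputation = estimate_aa_frequency(data)
print(f"plug-in: {plug_in:.3f}, imputation: {imputation:.3f}, target: {true_aa}")
```

In this toy setting the plug-in estimate is 0.25 and the imputation estimate is 0.325, against a target of 0.4. The imputation estimate remains biased because the missing genotypes are still filled in from the misspecified model, which is one way to read "partial" protection.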

Suggested Citation

  • Minin, Vladimir N. & O'Brien, John D. & Seregin, Arseni, 2011. "Imputation Estimators Partially Correct for Model Misspecification," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-24, April.
  • Handle: RePEc:bpj:sagmbi:v:10:y:2011:i:1:n:17
    DOI: 10.2202/1544-6115.1650




    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mark S. Handcock & Adrian E. Raftery & Jeremy M. Tantrum, 2007. "Model‐based clustering for social networks," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 170(2), pages 301-354, March.
    2. repec:jss:jstsof:14:i12 is not listed on IDEAS
    3. Maugis, C. & Celeux, G. & Martin-Magniette, M.-L., 2011. "Variable selection in model-based discriminant analysis," Journal of Multivariate Analysis, Elsevier, vol. 102(10), pages 1374-1387, November.
    4. Jinbo Chen & Dongyu Lin & Hagit Hochner, 2012. "Semiparametric Maximum Likelihood Methods for Analyzing Genetic and Environmental Effects with Case-Control Mother–Child Pair Data," Biometrics, The International Biometric Society, vol. 68(3), pages 869-877, September.
    5. Cathy Maugis & Gilles Celeux & Marie-Laure Martin-Magniette, 2009. "Variable Selection for Clustering with Gaussian Mixture Models," Biometrics, The International Biometric Society, vol. 65(3), pages 701-709, September.
    6. Jeffrey Andrews & Paul McNicholas, 2014. "Variable Selection for Clustering and Classification," Journal of Classification, Springer;The Classification Society, vol. 31(2), pages 136-153, July.
    7. Brisa N. Sánchez & Shan Kang & Bhramar Mukherjee, 2012. "A Latent Variable Approach to Study Gene–Environment Interactions in the Presence of Multiple Correlated Exposures," Biometrics, The International Biometric Society, vol. 68(2), pages 466-476, June.
    8. repec:jss:jstsof:18:i06 is not listed on IDEAS
    9. Hao Cheng & Ying Wei, 2018. "A fast imputation algorithm in quantile regression," Computational Statistics, Springer, vol. 33(4), pages 1589-1603, December.
    10. Zhang, Ping & Serban, Nicoleta, 2007. "Discovery, visualization and performance analysis of enterprise workflow," Computational Statistics & Data Analysis, Elsevier, vol. 51(5), pages 2670-2687, February.
    11. Hennig, Christian, 2008. "Dissolution point and isolation robustness: Robustness criteria for general cluster analysis methods," Journal of Multivariate Analysis, Elsevier, vol. 99(6), pages 1154-1176, July.
    12. Hornik, Kurt, 2005. "A CLUE for CLUster Ensembles," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 14(i12).
    13. Salter-Townshend, Michael & Murphy, Thomas Brendan, 2013. "Variational Bayesian inference for the Latent Position Cluster Model for network data," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 661-671.
    14. Stephen S. M. Lee & Mehdi Soleymani, 2015. "A Simple Formula for Mixing Estimators With Different Convergence Rates," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1463-1478, December.
    15. McNicholas, P.D. & Murphy, T.B. & McDaid, A.F. & Frost, D., 2010. "Serial and parallel implementations of model-based clustering via parsimonious Gaussian mixture models," Computational Statistics & Data Analysis, Elsevier, vol. 54(3), pages 711-723, March.
    16. Christian Hennig, 2010. "Methods for merging Gaussian mixture components," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 4(1), pages 3-34, April.
    17. Hien D. Nguyen & Geoffrey J. McLachlan & Jeremy F. P. Ullmann & Andrew L. Janke, 2016. "Spatial clustering of time series via mixture of autoregressions models and Markov random fields," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 70(4), pages 414-439, November.
    18. Crowley, Patrick M., 2008. "One money, several cycles? : evaluation of European business cycles using model-based cluster analysis," Research Discussion Papers 3/2008, Bank of Finland.
    19. Megan L. Neely & Howard D. Bondell & Jung-Ying Tzeng, 2015. "A penalized likelihood approach for investigating gene–drug interactions in pharmacogenetic studies," Biometrics, The International Biometric Society, vol. 71(2), pages 529-537, June.
    20. Alex Sharp & Glen Chalatov & Ryan P. Browne, 2023. "A dual subspace parsimonious mixture of matrix normal distributions," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(3), pages 801-822, September.
    21. Andrews, Jeffrey L. & McNicholas, Paul D. & Subedi, Sanjeena, 2011. "Model-based classification via mixtures of multivariate t-distributions," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 520-529, January.
    22. Barnes, Andrew P. & Bevan, Kev & Moxey, Andrew & Grierson, Sascha & Toma, Luiza, 2023. "Identifying best practice in Less Favoured Area mixed livestock systems," Agricultural Systems, Elsevier, vol. 208(C).
