
On histogram-based regression and classification with incomplete data

Author

Listed:
  • Eric Han

    (California State University Northridge)

  • Majid Mojirsheibani

    (California State University Northridge)

Abstract

We consider the problem of nonparametric regression with possibly incomplete covariate vectors. The proposed estimators, which are based on histogram methods, are fully nonparametric and straightforward to implement. The presence of incomplete covariates is handled by an inverse weighting method, where the weights are estimates of the conditional probabilities of having incomplete covariate vectors. We also derive various exponential bounds on the $L_1$ norms of our estimators, which can be used to establish strong consistency results for the corresponding, closely related, problem of nonparametric classification with missing covariates. As the main focus and application of our results, we consider the problem of pattern recognition and statistical classification in the presence of incomplete covariates and propose histogram classifiers that are asymptotically optimal.
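The inverse weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator, whose exact form is not given here. It assumes a single covariate `x` that may be missing, a response `y` that is always observed, and a missingness mechanism that depends only on `y`. The function names, the equal-width binning, and the bin counts are all hypothetical choices made for illustration.

```python
import numpy as np

def bin_index(values, lo, hi, n_bins):
    """Map each value to the index of its equal-width histogram cell on [lo, hi]."""
    scaled = (np.asarray(values, dtype=float) - lo) / (hi - lo) * n_bins
    return np.clip(scaled.astype(int), 0, n_bins - 1)

def observation_probability(y, observed, lo, hi, n_bins):
    """Histogram estimate of P(covariate observed | Y), evaluated at each y_i."""
    cells = bin_index(y, lo, hi, n_bins)
    counts = np.bincount(cells, minlength=n_bins)
    hits = np.bincount(cells, weights=observed.astype(float), minlength=n_bins)
    rate = np.divide(hits, counts, out=np.zeros(n_bins), where=counts > 0)
    return rate[cells]

def ipw_histogram_regression(x, y, observed, x_grid, x_range, y_range,
                             n_bins_x, n_bins_y):
    """Inverse-probability-weighted histogram estimate of E[Y | X = x] on x_grid.

    Assumption (not from the source): missingness of x depends on y only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = np.asarray(observed, dtype=bool)

    p_hat = observation_probability(y, observed, y_range[0], y_range[1], n_bins_y)

    # Complete cases get weight 1 / p_hat; incomplete cases get weight 0.
    # p_hat is positive for every complete case, since its own y-cell has a hit.
    weights = np.zeros(len(y))
    weights[observed] = 1.0 / p_hat[observed]

    # Missing x values get a placeholder; their zero weight removes them below.
    x_filled = np.where(observed, x, x_range[0])
    cells = bin_index(x_filled, x_range[0], x_range[1], n_bins_x)

    numerator = np.bincount(cells, weights=weights * y, minlength=n_bins_x)
    denominator = np.bincount(cells, weights=weights, minlength=n_bins_x)
    # Cells with no complete cases are assigned 0 by convention.
    cell_fit = np.divide(numerator, denominator,
                         out=np.zeros(n_bins_x), where=denominator > 0)

    return cell_fit[bin_index(x_grid, x_range[0], x_range[1], n_bins_x)]

# Toy data: the low-response group loses half of its covariates.
x = np.array([0.1, np.nan, 0.2, 0.3])
y = np.array([1.0, 1.0, 3.0, 3.0])
observed = np.array([True, False, True, True])
fit = ipw_histogram_regression(x, y, observed, x_grid=[0.5],
                               x_range=(0.0, 1.0), y_range=(0.0, 4.0),
                               n_bins_x=1, n_bins_y=2)
```

In this toy example the single complete case with `y = 1` receives weight 2, so it stands in for the case whose covariate is missing. An unweighted complete-case average would instead be pulled toward the fully observed group. For 0/1 class labels, thresholding such a regression estimate at 1/2 gives a plug-in classifier, which is one way to read the link between regression and classification that the abstract mentions.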

Suggested Citation

  • Eric Han & Majid Mojirsheibani, 2021. "On histogram-based regression and classification with incomplete data," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 84(5), pages 635-662, July.
  • Handle: RePEc:spr:metrik:v:84:y:2021:i:5:d:10.1007_s00184-020-00794-y
    DOI: 10.1007/s00184-020-00794-y

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00184-020-00794-y
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Timothy Reese & Majid Mojirsheibani, 2017. "On the $L_p$ norms of kernel regression estimators for incomplete data with applications to classification," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 26(1), pages 81-112, March.
    2. Qi Li & Juan Lin & Jeffrey S. Racine, 2013. "Optimal Bandwidth Selection for Nonparametric Conditional Distribution and Quantile Functions," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 31(1), pages 57-65, January.
    3. Besstremyannaya, Galina, 2015. "Measuring the effect of health insurance companies on the quality of healthcare systems with kernel and parametric regressions," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 38(2), pages 3-20.
    4. Michael S. Delgado & Daniel J. Henderson & Christopher F. Parmeter, 2014. "Does Education Matter for Economic Growth?," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 76(3), pages 334-359, June.
    5. Arribas Ivan & Perez Francisco & Tortosa-Ausina Emili, 2010. "The Determinants of International Financial Integration Revisited: The Role of Networks and Geographic Neutrality," Studies in Nonlinear Dynamics & Econometrics, De Gruyter, vol. 15(1), pages 1-55, December.
    6. Gavoille, Nicolas & Verschelde, Marijn, 2017. "Electoral competition and political selection: An analysis of the activity of French deputies, 1958–2012," European Economic Review, Elsevier, vol. 92(C), pages 180-195.
    7. Koch, Steven F. & Nkuna, Blessings & Ye, Yuxiang, 2024. "Income elasticity of residential electricity consumption in rural South Africa," Energy Economics, Elsevier, vol. 131(C).
    8. Frölich, Markus & Huber, Martin & Wiesenfarth, Manuel, 2017. "The finite sample performance of semi- and non-parametric estimators for treatment effects and policy evaluation," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 91-102.
    9. Markus Frölich & Martin Huber, 2019. "Including Covariates in the Regression Discontinuity Design," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 37(4), pages 736-748, October.
    10. Das, Sonali & Racine, Jeffrey S., 2018. "Interactive nonparametric analysis of nonlinear systems," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 510(C), pages 290-301.
    11. Christina Felfe & Martin Huber, 2017. "Does preschool boost the development of minority children?: the case of Roma children," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 180(2), pages 475-502, February.
    12. Shen-Ming Lee & Truong-Nhat Le & Phuoc-Loc Tran & Chin-Shang Li, 2023. "Estimation of logistic regression with covariates missing separately or simultaneously via multiple imputation methods," Computational Statistics, Springer, vol. 38(2), pages 899-934, June.
    13. Abhijit Sharma & Alastair Bailey & Iain Fraser, 2011. "Technology Adoption and Pest Control Strategies Among UK Cereal Farmers: Evidence from Parametric and Nonparametric Count Data Models," Journal of Agricultural Economics, Wiley Blackwell, vol. 62(1), pages 73-92, February.
    14. Christopher F. Parmeter & Valentin Zelenyuk, 2019. "Combining the Virtues of Stochastic Frontier and Data Envelopment Analysis," Operations Research, INFORMS, vol. 67(6), pages 1628-1658, November.
    15. Haupt, Harry & Schnurbus, Joachim & Semmler, Willi, 2018. "Estimation of grouped, time-varying convergence in economic growth," Econometrics and Statistics, Elsevier, vol. 8(C), pages 141-158.
    16. Roman Matousek & Nickolaos G. Tzeremes, 2021. "The asymmetric impact of human capital on economic growth," Empirical Economics, Springer, vol. 60(3), pages 1309-1334, March.
    17. Buu-Chau Truong & Nguyen Van Thuan & Nguyen Huu Hau & Michael McAleer, 2019. "Applications of the Newton-Raphson Method in Decision Sciences and Education," Advances in Decision Sciences, Asia University, Taiwan, vol. 23(4), pages 52-80, December.
    18. Marijn Verschelde & Marijke D’Haese & Glenn Rayp & Ellen Vandamme, 2013. "Challenging Small-Scale Farming: A Non-Parametric Analysis of the (Inverse) Relationship Between Farm Productivity and Farm Size in Burundi," Journal of Agricultural Economics, Wiley Blackwell, vol. 64(2), pages 319-342, June.
    19. Galina Besstremyannaya, 2014. "Urban inequity in the performance of social health insurance system: evidence from Russian regions," Working Papers w0204, Center for Economic and Financial Research (CEFIR).
    20. Philippe Polomé, 2013. "Mimic Behavior in Home Waste-waters Management," Working Papers halshs-00855051, HAL.
