Printed from https://ideas.repec.org/a/spr/metrik/v82y2019i3d10.1007_s00184-018-0703-y.html

Kernel density estimation from complex surveys in the presence of complete auxiliary information

Author

Listed:
  • Sayed A. Mostafa

    (Indiana University; North Carolina A&T State University)

  • Ibrahim A. Ahmad

    (Oklahoma State University)

Abstract

Auxiliary information is widely used in survey sampling to enhance the precision of estimators of finite population parameters, such as the finite population mean, percentiles, and distribution function. In the context of complex surveys, we show how auxiliary information can be used effectively in kernel estimation of the superpopulation density function of a given study variable. We propose two classes of “model-assisted” kernel density estimators that make efficient use of auxiliary information. For one class we assume that the functional relationship between the study variable Y and the auxiliary variable X is known, while for the other class the relationship is assumed unknown and is estimated using kernel smoothing techniques. Under the first class, we show that if the functional relationship can be written as a simple linear regression model with constant error variance, the mean of the proposed density estimator will be identical to the well-known regression estimator of the finite population mean. If we drop the intercept from the linear model and allow the error variance to be proportional to the auxiliary variable, the mean of the proposed density estimator matches the ratio estimator of the finite population mean. The properties of the new density estimators are studied under a combined design-model-based inference framework, which accounts for the underlying superpopulation model as well as the randomization distribution induced by the sampling design. Moreover, the asymptotic normality of each estimator is derived under both design-based and combined inference frameworks when the sampling design is simple random sampling without replacement. For the practical implementation of these estimators, we discuss how data-driven bandwidth estimators can be obtained. The finite sample properties of the proposed estimators are addressed via simulations and an example that mimics a real survey. These simulations show that the new estimators perform very well compared to standard kernel estimators which do not utilize the auxiliary information.
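
The abstract does not state the estimators' formulas, so the following is only a rough sketch of the idea, not the authors' estimator. It assumes a generic difference-type construction: kernels centred at fitted values m(x_i) for every population unit, plus a design-weighted correction from the sample. The linear working model, the Gaussian kernel, the fixed bandwidth h = 1, the simulated population, and all function names are illustrative assumptions. Under simple random sampling without replacement, the mean of this sketch's density coincides with the classical regression estimator of the finite population mean, which mirrors the property described in the abstract.

```python
import numpy as np

def gauss(u):
    """Standard normal kernel."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kernel_sum(grid, centers, weights, h):
    """Return sum_i weights_i * K_h(y - centers_i) at each grid point y."""
    u = (grid[:, None] - centers[None, :]) / h
    return (gauss(u) * weights).sum(axis=1) / h

def weighted_kde(grid, y_s, w_s, h):
    """Standard design-weighted KDE; uses no auxiliary information."""
    return kernel_sum(grid, y_s, w_s, h) / w_s.sum()

def difference_kde(grid, y_s, x_s, w_s, x_pop, h):
    """Hypothetical difference-type estimator with a linear working model."""
    n_pop = len(x_pop)
    # Design-weighted least squares fit of y on (1, x) from the sample.
    design = np.column_stack([np.ones_like(x_s), x_s])
    beta = np.linalg.solve(design.T @ (w_s[:, None] * design),
                           design.T @ (w_s * y_s))
    m_pop = beta[0] + beta[1] * x_pop   # fitted values, all population units
    m_s = beta[0] + beta[1] * x_s       # fitted values, sampled units
    # Model part uses x for the whole population (complete auxiliary data).
    model_part = kernel_sum(grid, m_pop, np.ones(n_pop), h)
    # Correction compares kernels at observed and fitted values in the sample.
    correction = kernel_sum(grid, y_s, w_s, h) - kernel_sum(grid, m_s, w_s, h)
    return (model_part + correction) / n_pop

# Simulated finite population (illustrative values only).
rng = np.random.default_rng(0)
N, n, h = 1000, 100, 1.0
x_pop = rng.gamma(shape=4.0, scale=2.0, size=N)
y_pop = 3.0 + 1.5 * x_pop + rng.normal(0.0, 2.0, size=N)

# Simple random sample without replacement; design weights are N / n.
idx = rng.choice(N, size=n, replace=False)
x_s, y_s = x_pop[idx], y_pop[idx]
w_s = np.full(n, N / n)

grid = np.linspace(-20.0, 100.0, 1201)
dx = grid[1] - grid[0]
f_std = weighted_kde(grid, y_s, w_s, h)
f_ma = difference_kde(grid, y_s, x_s, w_s, x_pop, h)

# Total mass and mean of the model-assisted sketch (Riemann sums on the grid).
mass = f_ma.sum() * dx
kde_mean = (grid * f_ma).sum() * dx

# Classical regression estimator of the finite population mean.
slope = np.cov(x_s, y_s)[0, 1] / np.var(x_s, ddof=1)
y_reg = y_s.mean() + slope * (x_pop.mean() - x_s.mean())

print("mass:", mass, "density mean:", kde_mean, "regression estimator:", y_reg)
```

In this sketch the agreement between `kde_mean` and `y_reg` follows from the least squares normal equations: with an intercept in the working model, the weighted residuals sum to zero, so the correction term contributes nothing to the mean. A difference-type estimate of this kind need not be nonnegative everywhere.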

Suggested Citation

  • Sayed A. Mostafa & Ibrahim A. Ahmad, 2019. "Kernel density estimation from complex surveys in the presence of complete auxiliary information," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 82(3), pages 295-338, April.
  • Handle: RePEc:spr:metrik:v:82:y:2019:i:3:d:10.1007_s00184-018-0703-y
    DOI: 10.1007/s00184-018-0703-y

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00184-018-0703-y
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    References listed on IDEAS

    1. Breunig, Robert, 2008. "Nonparametric density estimation for stratified samples," Statistics & Probability Letters, Elsevier, vol. 78(14), pages 2194-2200, October.
    2. Robert Breunig, 2001. "Density Estimation For Clustered Data," Econometric Reviews, Taylor & Francis Journals, vol. 20(3), pages 353-367.
    3. Yao, Qiwei & Tong, Howell, 1994. "Quantifying the influence of initial values on nonlinear prediction," LSE Research Online Documents on Economics 19426, London School of Economics and Political Science, LSE Library.
    4. Hayfield, Tristen & Racine, Jeffrey S., 2008. "Nonparametric Econometrics: The np Package," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 27(i05).
    5. Hansen, Bruce E., 2008. "Uniform Convergence Rates For Kernel Estimation With Dependent Data," Econometric Theory, Cambridge University Press, vol. 24(3), pages 726-748, June.
    6. Torsten Harms & Pierre Duchesne, 2010. "On kernel nonparametric regression designed for complex survey data," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 72(1), pages 111-138, July.
    7. Ingrid K. Glad & Nils Lid Hjort & Nikolai G. Ushakov, 2003. "Correction of Density Estimators that are not Densities," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 30(2), pages 415-427, June.
    8. F. J. Breidt & G. Claeskens & J. D. Opsomer, 2005. "Model-assisted estimation for complex surveys using penalised splines," Biometrika, Biometrika Trust, vol. 92(4), pages 831-846, December.
    9. Scott, David W., 2004. "Multivariate Density Estimation and Visualization," Papers 2004,16, Humboldt University of Berlin, Center for Applied Statistics and Economics (CASE).
    10. Daniel Bonnéry & F. Jay Breidt & François Coquet, 2017. "Kernel estimation for a superpopulation probability density function under informative selection," METRON, Springer;Sapienza Università di Roma, vol. 75(3), pages 301-318, December.

    Citations

    Cited by:

    1. Sayed A. Mostafa & Ibrahim A. Ahmad, 2021. "Kernel Density Estimation Based on the Distinct Units in Sampling with Replacement," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 83(2), pages 507-547, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel J. Henderson & Christopher F. Parmeter & R. Robert Russell, 2008. "Modes, weighted modes, and calibrated modes: evidence of clustering using modality tests," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 23(5), pages 607-638.
    2. Gadea Rivas, María Dolores & Gonzalo, Jesús, 2020. "Trends in distributional characteristics: Existence of global warming," Journal of Econometrics, Elsevier, vol. 214(1), pages 153-174.
    3. Juan Carlos Escanciano & Juan Carlos Pardo-Fernández & Ingrid Van Keilegom, 2017. "Semiparametric Estimation of Risk–Return Relationships," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 35(1), pages 40-52, January.
    4. Sayed A. Mostafa & Ibrahim A. Ahmad, 2021. "Kernel Density Estimation Based on the Distinct Units in Sampling with Replacement," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 83(2), pages 507-547, November.
    5. Kraus, Daniel & Czado, Claudia, 2017. "D-vine copula based quantile regression," Computational Statistics & Data Analysis, Elsevier, vol. 110(C), pages 1-18.
    6. David H. Bernstein & Christopher F. Parmeter, 2017. "Returns to Scale in Electricity Generation: Revisited and Replicated," Working Papers 2017-08, University of Miami, Department of Economics.
    7. El Ghouch, Anouar & Genton, Marc G. & Bouezmarni, Taoufik, 2012. "Measuring the Discrepancy of a Parametric Model via Local Polynomial Smoothing," LIDAM Discussion Papers ISBA 2012001, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    8. Aboubacar Amiri & Baba Thiam, 2018. "Regression estimation by local polynomial fitting for multivariate data streams," Statistical Papers, Springer, vol. 59(2), pages 813-843, June.
    9. Kemp, Gordon C.R. & Santos Silva, J.M.C., 2012. "Regression towards the mode," Journal of Econometrics, Elsevier, vol. 170(1), pages 92-101.
    10. Requillart, Vincent & Nauges, Celine & Simioni, Michel & Bontemps, Christophe, 2012. "Food Safety Regulation and Firm Productivity: Evidence from the French Food Industry," 2012 First Congress, June 4-5, 2012, Trento, Italy 124378, Italian Association of Agricultural and Applied Economics (AIEAA).
    11. Degl’Innocenti, Marta & Matousek, Roman & Sevic, Zeljko & Tzeremes, Nickolaos G., 2017. "Bank efficiency and financial centres: Does geographical location matter?," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 46(C), pages 188-198.
    12. Nishiyama, Yoshihiko & Hitomi, Kohtaro & Kawasaki, Yoshinori & Jeong, Kiho, 2011. "A consistent nonparametric test for nonlinear causality—Specification in time series regression," Journal of Econometrics, Elsevier, vol. 165(1), pages 112-127.
    13. George Halkos & Roman Matousek & Nickolaos Tzeremes, 2016. "Pre-evaluating technical efficiency gains from possible mergers and acquisitions: evidence from Japanese regional banks," Review of Quantitative Finance and Accounting, Springer, vol. 46(1), pages 47-77, January.
    14. Bonsoo Koo & Oliver Linton, 2010. "Semiparametric Estimation of Locally Stationary Diffusion Models," STICERD - Econometrics Paper Series 551, Suntory and Toyota International Centres for Economics and Related Disciplines, LSE.
    15. George E. Halkos & Nickolaos G. Tzeremes, 2015. "Measuring Seaports' Productivity: A Malmquist Productivity Index Decomposition Approach," Journal of Transport Economics and Policy, University of Bath, vol. 49(2), pages 355-376, April.
    16. Qi Li & Juan Lin & Jeffrey S. Racine, 2013. "Optimal Bandwidth Selection for Nonparametric Conditional Distribution and Quantile Functions," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 31(1), pages 57-65, January.
    17. Bodory, Hugo & Huber, Martin, 2018. "The causalweight package for causal inference in R," FSES Working Papers 493, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    18. Jesus Gonzalo & Jose Olmo, 2014. "Conditional Stochastic Dominance Tests In Dynamic Settings," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 55(3), pages 819-838, August.
    19. Patrick Saart & Jiti Gao & Nam Hyun Kim, 2014. "Semiparametric methods in nonlinear time series analysis: a selective review," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 26(1), pages 141-169, March.
    20. Tsuruta, Yasuhito, 2024. "Bias correction for kernel density estimation with spherical data," Journal of Multivariate Analysis, Elsevier, vol. 203(C).
