
Combining Survey and Non-survey Data for Improved Sub-area Prediction Using a Multi-level Model

Authors

  • Jae Kwang Kim (Iowa State University)
  • Zhonglei Wang (Iowa State University)
  • Zhengyuan Zhu (Iowa State University)
  • Nathan B. Cruze (United States Department of Agriculture)

Abstract

Combining information from different sources is an important practical problem in survey sampling. Using a hierarchical area-level model, we establish a framework to integrate auxiliary information to improve state-level area estimates. The best predictors are obtained by the conditional expectations of latent variables given observations, and an estimate of the mean squared prediction error is discussed. Sponsored by the National Agricultural Statistics Service of the US Department of Agriculture, the proposed model is applied to the planted crop acreage estimation problem by combining information from three sources, including the June Area Survey obtained by a probability-based sampling of lands, administrative data about the planted acreage and the cropland data layer, which is a commodity-specific classification product derived from remote sensing data. The proposed model combines the available information at a sub-state level called the agricultural statistics district and aggregates to improve state-level estimates of planted acreages for different crops. Supplementary materials accompanying this paper appear on-line.
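As a rough illustration of what "best predictors obtained by the conditional expectations of latent variables given observations" means, the sketch below uses a basic single-level normal-normal area model, not the multi-level model of the paper. It assumes the latent district value is normal around a synthetic (auxiliary-based) value with variance `model_var`, and the direct survey estimate is normal around the latent value with a known sampling variance. All names and numbers are hypothetical, and the variance components are treated as known rather than estimated.

```python
def shrinkage_predictor(direct, synthetic, sampling_var, model_var):
    """Conditional mean and MSPE of the latent area value given the direct estimate.

    Assumed model (illustrative only):
        theta ~ N(synthetic, model_var)
        direct | theta ~ N(theta, sampling_var)
    """
    # Weight on the direct estimate: larger when the survey is precise.
    gamma = model_var / (model_var + sampling_var)
    prediction = gamma * direct + (1.0 - gamma) * synthetic
    # Posterior variance of theta, valid when both variances are known.
    mspe = gamma * sampling_var
    return prediction, mspe


# Hypothetical district-level inputs (e.g. acreage in thousands).
direct = [120.0, 80.0, 100.0]        # survey-based direct estimates
synthetic = [110.0, 90.0, 100.0]     # values implied by auxiliary sources
sampling_var = [25.0, 100.0, 50.0]   # sampling variances of the direct estimates
model_var = 50.0                     # between-district model variance

results = [
    shrinkage_predictor(y, s, d, model_var)
    for y, s, d in zip(direct, synthetic, sampling_var)
]

# Aggregate district predictions to a state-level total.
state_total = sum(prediction for prediction, _ in results)
```

Districts with noisier direct estimates are pulled more strongly toward the auxiliary-based value, which is the basic mechanism by which sub-state information can sharpen an aggregated state-level estimate.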

Suggested Citation

  • Jae Kwang Kim & Zhonglei Wang & Zhengyuan Zhu & Nathan B. Cruze, 2018. "Combining Survey and Non-survey Data for Improved Sub-area Prediction Using a Multi-level Model," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 23(2), pages 175-189, June.
  • Handle: RePEc:spr:jagbes:v:23:y:2018:i:2:d:10.1007_s13253-018-0320-2
    DOI: 10.1007/s13253-018-0320-2


    References listed on IDEAS

    1. Jae Kwang Kim & J. N. K. Rao, 2012. "Combining data from two independent surveys: a model-assisted approach," Biometrika, Biometrika Trust, vol. 99(1), pages 85-100.
    2. Takis Merkouris, 2010. "Combining information from multiple surveys by using regression for efficient small domain estimation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 72(1), pages 27-48, January.
    3. repec:bla:istatr:v:83:y:2015:i:3:p:436-448 is not listed on IDEAS
    4. Changbao Wu & Wilson W. Lu, 2016. "Calibration Weighting Methods for Complex Surveys," International Statistical Review, International Statistical Institute, vol. 84(1), pages 79-98, April.
    5. Raghunathan, Trivellore E. & Xie, Dawei & Schenker, Nathaniel & Parsons, Van L. & Davis, William W. & Dodd, Kevin W. & Feuer, Eric J., 2007. "Combining Information From Two Surveys to Estimate County-Level Prevalence Rates of Cancer Risk Factors and Screening," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 474-486, June.
    6. Torabi, Mahmoud & Rao, J.N.K., 2014. "On small area estimation under a sub-area level model," Journal of Multivariate Analysis, Elsevier, vol. 127(C), pages 36-55.
    7. Giancarlo Manzi & David J. Spiegelhalter & Rebecca M. Turner & Julian Flowers & Simon G. Thompson, 2011. "Modelling bias in combining small area prevalence estimates from multiple surveys," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 174(1), pages 31-50, January.
    8. Takis Merkouris, 2004. "Combining Independent Regression Estimators From Multiple Surveys," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 1131-1139, December.
    9. Michael R. Elliott & William W. Davis, 2005. "Corrigendum: Obtaining cancer risk factor prevalence estimates in small areas: combining data from two surveys," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 54(5), pages 958-958, November.
    10. Michael R. Elliott & William W. Davis, 2005. "Obtaining cancer risk factor prevalence estimates in small areas: combining data from two surveys," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 54(3), pages 595-609, June.
    11. Jae Kwang Kim & Mingue Park, 2010. "Calibration Estimation in Survey Sampling," International Statistical Review, International Statistical Institute, vol. 78(1), pages 21-39, April.

    Citations


    Cited by:

    1. Erciulescu Andreea L. & Cruze Nathan B. & Nandram Balgobin, 2020. "Statistical Challenges in Combining Survey and Auxiliary Data to Produce Official Statistics," Journal of Official Statistics, Sciendo, vol. 36(1), pages 63-88, March.
    2. Camilla Salvatore, 2023. "Inference with non-probability samples and survey data integration: a science mapping study," METRON, Springer;Sapienza Università di Roma, vol. 81(1), pages 83-107, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Takis Merkouris, 2010. "Combining information from multiple surveys by using regression for efficient small domain estimation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 72(1), pages 27-48, January.
    2. Yves G. Berger & Ewa Kabzińska, 2020. "Empirical Likelihood Approach for Aligning Information from Multiple Surveys," International Statistical Review, International Statistical Institute, vol. 88(1), pages 54-74, April.
    3. Seho Park & Jae Kwang Kim & Diana Stukel, 2017. "A measurement error model approach to survey data integration: combining information from two surveys," METRON, Springer;Sapienza Università di Roma, vol. 75(3), pages 345-357, December.
    4. Giancarlo Manzi & David J. Spiegelhalter & Rebecca M. Turner & Julian Flowers & Simon G. Thompson, 2011. "Modelling bias in combining small area prevalence estimates from multiple surveys," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 174(1), pages 31-50, January.
    5. Paolo Righi, 2016. "Estimation procedure and inference for component totals of the economic aggregates in the “Frame SBS”," Rivista di statistica ufficiale, ISTAT - Italian National Institute of Statistics - (Rome, ITALY), vol. 18(1), pages 83-97.
    6. Shixiao Zhang & Peisong Han & Changbao Wu, 2023. "Calibration Techniques Encompassing Survey Sampling, Missing Data Analysis and Causal Inference," International Statistical Review, International Statistical Institute, vol. 91(2), pages 165-192, August.
    7. Marissa B. Reitsma & Sherri Rose & Alex Reinhart & Jeremy D. Goldhaber-Fiebert & Joshua A. Salomon, 2024. "Bias-Adjusted Predictions of County-Level Vaccination Coverage from the COVID-19 Trends and Impact Survey," Medical Decision Making, vol. 44(2), pages 175-188, February.
    8. Rasner, Anika & Frick, Joachim R. & Grabka, Markus M., 2013. "Statistical Matching of Administrative and Survey Data: An Application to Wealth Inequality Analysis," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 42(2), pages 192-224.
    9. Andreea Erciulescu & Jianzhu Li & Tom Krenzke & Machell Town, 2024. "Hierarchical Bayes small area estimation for county-level health prevalence to having a personal doctor," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 33(4), pages 1171-1191, September.
    10. Alessio Guandalini & Yves Tillé, 2017. "Design-based Estimators Calibrated on Estimated Totals from Multiple Surveys," International Statistical Review, International Statistical Institute, vol. 85(2), pages 250-269, August.
    11. Anika Rasner & Joachim R. Frick & Markus M. Grabka, 2013. "Statistical Matching of Administrative and Survey Data," Sociological Methods & Research, vol. 42(2), pages 192-224, May.
    12. K. Shuvo Bakar & Nicholas Biddle & Philip Kokic & Huidong Jin, 2020. "A Bayesian spatial categorical model for prediction to overlapping geographical areas in sample surveys," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(2), pages 535-563, February.
    13. Denis Devaud & Yves Tillé, 2019. "Deville and Särndal’s calibration: revisiting a 25-years-old successful optimization problem," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(4), pages 1033-1065, December.
    14. Song Cai & J.N.K. Rao, 2022. "Selection of Auxiliary Variables for Three-Fold Linking Models in Small Area Estimation: A Simple and Effective Method," Stats, MDPI, vol. 5(1), pages 1-11, February.
    15. Chipperfield James O., 2016. "Discussion," Journal of Official Statistics, Sciendo, vol. 32(2), pages 287-289, June.
    16. J. N. K. Rao, 2021. "On Making Valid Inferences by Integrating Data from Surveys and Other Sources," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 83(1), pages 242-272, May.
    17. Lu Chen & Luca Sartore & Habtamu Benecha & Valbona Bejleri & Balgobin Nandram, 2022. "Smoothing County-Level Sampling Variances to Improve Small Area Models’ Outputs," Stats, MDPI, vol. 5(3), pages 1-18, September.
    18. Linda J. Young & Lu Chen, 2022. "Using Small Area Estimation to Produce Official Statistics," Stats, MDPI, vol. 5(3), pages 1-17, September.
    19. Rong Tang & Yun Yang, 2022. "Bayesian inference for risk minimization via exponentially tilted empirical likelihood," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(4), pages 1257-1286, September.
    20. Anne Konrad & Jan Pablo Burgard & Ralf Münnich, 2021. "A Two‐level GREG Estimator for Consistent Estimation in Household Surveys," International Statistical Review, International Statistical Institute, vol. 89(3), pages 635-656, December.
