stratifyR: An R Package for optimal stratification and sample allocation for univariate populations

Author

Listed:
  • K. G. Reddy
  • M. G. M. Khan

Abstract

This R package determines the optimal stratification of univariate populations under stratified sampling designs using a parametric‐based method. It computes the optimum strata boundaries (OSB), optimum sample sizes (OSS) and several other quantities for the study variable, y, using the best‐fit probability density function of the study variable estimated from survey data. The method requires the parameters and other characteristics of the distribution of the study variable to be known, either from available data or from a hypothetical distribution if data are not available. In the implementation, the problem of determining the OSB is formulated as a mathematical programming problem and solved using a dynamic programming technique. If data on the population (i.e. the study variable) are available to the surveyor, the method estimates the best‐fit distribution and directly determines the OSB and the OSS under Neyman allocation. When the dataset is not available, stratification is based on the assumption that values of the study variable, y, are available as hypothetical realisations of proxy values of y from past or recent surveys, so certain distributional assumptions about the study variable are required. At present, the package handles stratification for populations in which the study variable follows one of the following continuous distributions: Pareto, Triangular, Right‐triangular, Weibull, Gamma, Exponential, Uniform, Normal, Lognormal and Cauchy. In this paper, the major functionalities of the package are illustrated with several real, simulated and hypothetical populations.
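
To make the two ideas in the abstract concrete, here is a minimal Python sketch of Neyman allocation and of a dynamic-programming search for strata boundaries. It is not the stratifyR implementation: the package works from a fitted density, whereas this sketch is an assumption-laden simplification that works directly on sorted data values, ignores the finite population correction, and minimises the sum of W_h * S_h (stratum weight times stratum standard deviation). The function names neyman_allocation and optimal_strata are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def neyman_allocation(weights: Sequence[float], stdevs: Sequence[float],
                      n: float) -> List[float]:
    """Split a total sample size n so that n_h is proportional to W_h * S_h."""
    products = [w * s for w, s in zip(weights, stdevs)]
    total = sum(products)
    if total <= 0:
        raise ValueError("at least one stratum needs a positive W_h * S_h")
    return [n * p / total for p in products]


def optimal_strata(values: Sequence[float], H: int,
                   min_size: int = 2) -> Tuple[List[float], List[Tuple[int, int]]]:
    """Return the sorted values and H index ranges (start, end) that minimise
    sum_h W_h * S_h, found by dynamic programming over cut positions.

    Illustrative assumption: cuts are only allowed between observed values.
    """
    y = sorted(values)
    N = len(y)
    if H < 1 or H * min_size > N:
        raise ValueError("not enough values for the requested strata")

    # Prefix sums give each candidate stratum's mean and variance in O(1).
    pre, pre_sq = [0.0], [0.0]
    for v in y:
        pre.append(pre[-1] + v)
        pre_sq.append(pre_sq[-1] + v * v)

    def cost(i: int, j: int) -> float:
        """W_h * S_h for the stratum holding sorted values i .. j-1."""
        m = j - i
        mean = (pre[j] - pre[i]) / m
        var = max((pre_sq[j] - pre_sq[i]) / m - mean * mean, 0.0)
        return (m / N) * math.sqrt(var)

    INF = float("inf")
    # best[h][j]: minimal objective when the first j values form h strata.
    best = [[INF] * (N + 1) for _ in range(H + 1)]
    cut = [[0] * (N + 1) for _ in range(H + 1)]
    best[0][0] = 0.0
    for h in range(1, H + 1):
        for j in range(h * min_size, N + 1):
            for i in range((h - 1) * min_size, j - min_size + 1):
                if best[h - 1][i] == INF:
                    continue
                c = best[h - 1][i] + cost(i, j)
                if c < best[h][j]:
                    best[h][j], cut[h][j] = c, i

    # Walk the stored cut positions backwards to recover the strata.
    ranges, j = [], N
    for h in range(H, 0, -1):
        i = cut[h][j]
        ranges.append((i, j))
        j = i
    ranges.reverse()
    return y, ranges
```

With two equally weighted strata whose standard deviations are 1 and 3, neyman_allocation assigns a total sample of 100 as 25 and 75, so the more variable stratum receives the larger share. The triple loop in optimal_strata costs on the order of H * N^2 evaluations, which is adequate for illustration but not for large populations.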

Suggested Citation

  • K. G. Reddy & M. G. M. Khan, 2020. "stratifyR: An R Package for optimal stratification and sample allocation for univariate populations," Australian & New Zealand Journal of Statistics, Australian Statistical Publishing Association Inc., vol. 62(3), pages 383-405, September.
  • Handle: RePEc:bla:anzsta:v:62:y:2020:i:3:p:383-405
    DOI: 10.1111/anzs.12301

    Download full text from publisher

    File URL: https://doi.org/10.1111/anzs.12301
    Download Restriction: no


    References listed on IDEAS

    1. Karuna Garan Reddy & Mohammad G M Khan & Sabiha Khan, 2018. "Optimum strata boundaries and sample sizes in health surveys using auxiliary variables," PLOS ONE, Public Library of Science, vol. 13(4), pages 1-34, April.
    2. M.G.M. Khan & K.G. Reddy & D.K. Rao, 2015. "Designing stratified sampling in economic and business surveys," Journal of Applied Statistics, Taylor & Francis Journals, vol. 42(10), pages 2080-2099, October.
    3. Dutang, Christophe & Goulet, Vincent & Pigeon, Mathieu, 2008. "actuar: An R Package for Actuarial Science," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 25(i07).
    4. Vincent Goulet & Christophe Dutang & Mathieu Pigeon, 2008. "actuar: An R Package for Actuarial Science," Post-Print hal-01616144, HAL.
    5. W. Bühler & T. Deutler, 1975. "Optimal stratification and grouping by dynamic programming," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 22(1), pages 161-175, December.
    6. Delignette-Muller, Marie Laure & Dutang, Christophe, 2015. "fitdistrplus: An R Package for Fitting Distributions," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 64(i04).
    7. Marie Laure Delignette-Muller & Christophe Dutang, 2015. "fitdistrplus: An R Package for Fitting Distributions," Post-Print hal-01616147, HAL.

    Citations



    Cited by:

    1. Gustavo Amorim & Ran Tao & Sarah Lotspeich & Pamela A. Shaw & Thomas Lumley & Bryan E. Shepherd, 2021. "Two‐phase sampling designs for data validation in settings with covariate measurement error and continuous outcome," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(4), pages 1368-1389, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Avanzi, Benjamin & Taylor, Greg & Wang, Melantha & Wong, Bernard, 2021. "SynthETIC: An individual insurance claim simulator with feature control," Insurance: Mathematics and Economics, Elsevier, vol. 100(C), pages 296-308.
    2. Alexandru Amărioarei & Frankie Spencer & Gefry Barad & Ana-Maria Gheorghe & Corina Iţcuş & Iris Tuşa & Ana-Maria Prelipcean & Andrei Păun & Mihaela Păun & Alfonso Rodriguez-Paton & Romică Trandafir et al., 2021. "DNA-Guided Assembly for Fibril Proteins," Mathematics, MDPI, vol. 9(4), pages 1-17, February.
    3. repec:jss:jstsof:35:i10 is not listed on IDEAS
    4. Schulte, Benedikt & Sachs, Anna-Lena, 2020. "The price-setting newsvendor with Poisson demand," European Journal of Operational Research, Elsevier, vol. 283(1), pages 125-137.
    5. Taleb-Berrouane, Mohammed & Khan, Faisal & Amyotte, Paul, 2020. "Bayesian Stochastic Petri Nets (BSPN) - A new modelling tool for dynamic safety and reliability analysis," Reliability Engineering and System Safety, Elsevier, vol. 193(C).
    6. Clément Laroche & Madalina Olteanu & Fabrice Rossi, 2023. "Pesticide concentration monitoring: Investigating spatio‐temporal patterns in left censored data," Environmetrics, John Wiley & Sons, Ltd., vol. 34(2), March.
    7. Chen Chen & Alireza Mostafizi & Haizhong Wang & Dan Cox & Lori Cramer, 2022. "Evacuation behaviors in tsunami drills," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 112(1), pages 845-871, May.
    8. Fezzi, Carlo & Menapace, Luisa & Raffaelli, Roberta, 2021. "Estimating risk preferences integrating insurance choices with subjective beliefs," European Economic Review, Elsevier, vol. 135(C).
    9. Anna Castañer & M.Mercè Claramunt & Maite Mármol, 2014. "Some optimization and decision problems in proportional reinsurance," UB School of Economics Working Papers 2014/310, University of Barcelona School of Economics.
    10. Pongnumkul, Suchit & Motohashi, Kazuyuki, 2018. "A bipartite fitness model for online music streaming services," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 490(C), pages 1125-1137.
    11. Chen, Shang & He, Liang & Cao, Yinxuan & Wang, Runhong & Wu, Lianhai & Wang, Zhao & Zou, Yufeng & Siddique, Kadambot H.M. & Xiong, Wei & Liu, Manshuang & Feng, Hao & Yu, Qiang & Wang, Xiaoming & He, J, 2021. "Comparisons among four different upscaling strategies for cultivar genetic parameters in rainfed spring wheat phenology simulations with the DSSAT-CERES-Wheat model," Agricultural Water Management, Elsevier, vol. 258(C).
    12. Faustino Prieto & Catalina B. García-García & Román Salmerón Gómez, 2024. "Modelling Global Fossil CO2 Emissions with a Lognormal Distribution: A Climate Policy Tool," Papers 2403.00653, arXiv.org.
    13. Riva-Palacio, Alan & Leisen, Fabrizio, 2021. "Compound vectors of subordinators and their associated positive Lévy copulas," Journal of Multivariate Analysis, Elsevier, vol. 183(C).
    14. Kufenko, Vadim & Geloso, Vincent, 2021. "Who are the champions? Inequality, economic freedom and the Olympics," Journal of Institutional Economics, Cambridge University Press, vol. 17(3), pages 411-427, June.
    15. Ekaterina Bulinskaya & Boris Shigida, 2021. "Discrete-Time Model of Company Capital Dynamics with Investment of a Certain Part of Surplus in a Non-Risky Asset for a Fixed Period," Methodology and Computing in Applied Probability, Springer, vol. 23(1), pages 103-121, March.
    16. Combes, Catherine & Ng, Hon Keung Tony, 2022. "On parameter estimation for Amoroso family of distributions," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 191(C), pages 309-327.
    17. Adele H. Marshall & Mariangela Zenga, 2012. "Experimenting with the Coxian Phase-Type Distribution to Uncover Suitable Fits," Methodology and Computing in Applied Probability, Springer, vol. 14(1), pages 71-86, March.
    18. Ozkok, Erengul & Streftaris, George & Waters, Howard R. & Wilkie, A. David, 2012. "Bayesian modelling of the time delay between diagnosis and settlement for Critical Illness Insurance using a Burr generalised-linear-type model," Insurance: Mathematics and Economics, Elsevier, vol. 50(2), pages 266-279.
    19. Minji Lee & Sun Ju Chung & Youngjo Lee & Sera Park & Jun-Gun Kwon & Dai Jin Kim & Donghwan Lee & Jung-Seok Choi, 2020. "Investigation of Correlated Internet and Smartphone Addiction in Adolescents: Copula Regression Analysis," IJERPH, MDPI, vol. 17(16), pages 1-12, August.
    20. Gzara, Fatma & Elhedhli, Samir & Yildiz, Burak C., 2020. "The Pallet Loading Problem: Three-dimensional bin packing with practical constraints," European Journal of Operational Research, Elsevier, vol. 287(3), pages 1062-1074.
    21. Veronesi, F. & Grassi, S. & Raubal, M., 2016. "Statistical learning approach for wind resource assessment," Renewable and Sustainable Energy Reviews, Elsevier, vol. 56(C), pages 836-850.
