
stratifyR: An R Package for optimal stratification and sample allocation for univariate populations

Author

Listed:
  • K. G. Reddy
  • M. G. M. Khan

Abstract

This R package determines the optimal stratification of univariate populations under stratified sampling designs using a parametric method. It determines the optimum strata boundaries (OSB), the optimum sample sizes (OSS) and several other quantities for the study variable, y, using the best‐fit probability density function of the study variable available from survey data. The method requires the parameters and other characteristics of the distribution of the study variable to be known, either from available data or from a hypothetical distribution if data are not available. In the implementation, the problem of determining the OSB is formulated as a mathematical programming problem and solved using a dynamic programming technique. If the population data (i.e. the values of the study variable) are available to the surveyor, the method estimates the best‐fit distribution and directly determines the OSB and the OSS under Neyman allocation. When the dataset is not available, stratification is based on the assumption that the values of the study variable, y, are available as hypothetical realisations of proxy values of y from past or recent surveys, so certain distributional assumptions about the study variable are required. At present, the package handles stratification for populations in which the study variable follows one of the following continuous distributions: Pareto, Triangular, Right‐triangular, Weibull, Gamma, Exponential, Uniform, Normal, Lognormal and Cauchy. In this paper, the major functionalities of the package are illustrated with a number of real and simulated populations as well as some hypothetical ones.
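
To make the allocation step mentioned in the abstract concrete, here is a minimal sketch of Neyman allocation for strata boundaries that are already given. It is written in Python for illustration and is not the stratifyR API. The function name `neyman_allocation`, its arguments and the toy data are assumptions introduced here. The sketch does not perform the dynamic programming search for the OSB. It only applies the textbook rule n_h = n * N_h * S_h / sum_k(N_k * S_k), where N_h is the stratum size and S_h the within-stratum standard deviation.

```python
import statistics


def neyman_allocation(values, boundaries, n):
    """Allocate a total sample of size n across strata (hypothetical helper).

    values     -- population values of the study variable y
    boundaries -- interior strata boundaries (assumed given, not optimised here)
    n          -- total sample size
    Returns fractional stratum sample sizes; rounding is left to the caller.
    """
    cuts = sorted(boundaries)
    strata = [[] for _ in range(len(cuts) + 1)]
    for y in values:
        # Stratum index = number of boundaries strictly below y,
        # so stratum h holds values with cuts[h-1] < y <= cuts[h].
        h = sum(y > c for c in cuts)
        strata[h].append(y)

    # N_h * S_h per stratum, using the population standard deviation as S_h.
    # An empty stratum contributes zero.
    weights = [len(s) * statistics.pstdev(s) if s else 0.0 for s in strata]
    total = sum(weights)
    if total == 0:
        raise ValueError("all strata have zero variance; allocation undefined")
    return [n * w / total for w in weights]


# Toy population: two strata split at y = 5, total sample of 10 units.
allocation = neyman_allocation([1, 2, 3, 4, 10, 20, 30, 40], [5], 10)
print(allocation)
```

In this toy population the second stratum is ten times more variable than the first and the two strata are the same size, so it receives ten times as many sample units. This is the behaviour Neyman allocation is designed to produce.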

Suggested Citation

  • K. G. Reddy & M. G. M. Khan, 2020. "stratifyR: An R Package for optimal stratification and sample allocation for univariate populations," Australian & New Zealand Journal of Statistics, Australian Statistical Publishing Association Inc., vol. 62(3), pages 383-405, September.
  • Handle: RePEc:bla:anzsta:v:62:y:2020:i:3:p:383-405
    DOI: 10.1111/anzs.12301

    Download full text from publisher

    File URL: https://doi.org/10.1111/anzs.12301
    Download Restriction: no

    Citations

    Cited by:

    1. Gustavo Amorim & Ran Tao & Sarah Lotspeich & Pamela A. Shaw & Thomas Lumley & Bryan E. Shepherd, 2021. "Two‐phase sampling designs for data validation in settings with covariate measurement error and continuous outcome," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(4), pages 1368-1389, October.
