Printed from https://ideas.repec.org/a/spr/jagbes/v28y2023i2d10.1007_s13253-023-00527-4.html

An Approach for Specifying Trimming and Winsorization Cutoffs

Authors

  • Kedai Cheng (University of North Carolina - Asheville)
  • Derek S. Young (University of Kentucky)

Abstract

Outliers and extreme values are common in the era of big data, especially in the collection of survey data and real analysis. Care must therefore be taken with how such values are treated in the calculation of statistical summaries, such as those involving the sample mean and sample variance. Robust alternatives based on trimming or Winsorization are often employed to mitigate the effect of those outlying points. A critical aspect of these methods, however, is the determination of the cutoff locations. One classic approach is g-and-g-times trimming/Winsorization, which removes a proportion g from each tail. However, this method does not carry any confidence statement, such as one finds with the calculation of statistical intervals. We propose the application of nonparametric statistical tolerance intervals, which capture a specified proportion of the sampled population at a given confidence level, to determine cutoff locations for trimming and Winsorization. Extensive simulation studies show that this approach yields better coverage than the g-and-g-times method, even though the latter was not designed as a confidence procedure. Census of Agriculture data since 1982 is analyzed to highlight the impact on statistical summaries regarding farm land. Supplementary materials accompanying this paper appear online.
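
To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the classic Wilks-type order-statistic construction of a two-sided nonparametric tolerance interval with symmetric ranks; the paper itself may use a refined variant. All function names (`binom_cdf`, `proportion_cutoff_ranks`, `tolerance_cutoff_ranks`, `cutoffs`, `trim`, `winsorize`) and the simulated lognormal data are hypothetical and chosen only for illustration.

```python
import math
import random


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    if k < 0:
        return 0.0
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i)
               for i in range(min(k, n) + 1))


def proportion_cutoff_ranks(n, g):
    """1-based ranks (r, s) kept when floor(g * n) points are cut per tail."""
    if not 0 <= g < 0.5:
        raise ValueError("g must lie in [0, 0.5)")
    k = math.floor(g * n)
    return (k + 1, n - k)


def tolerance_cutoff_ranks(n, content, confidence):
    """Symmetric Wilks-type ranks (r, s) = (k, n - k + 1) with k as large as
    possible such that (X_(r), X_(s)) covers at least `content` of the
    population with probability at least `confidence`.
    Returns None when even k = 1 fails (sample too small)."""
    best = None
    k = 1
    while 2 * k <= n:
        m = n - 2 * k + 1  # s - r; the coverage follows Beta(m, n - m + 1)
        # P(coverage >= content) equals P(Binomial(n, content) <= m - 1)
        if binom_cdf(m - 1, n, content) < confidence:
            break
        best = (k, n - k + 1)
        k += 1
    return best


def cutoffs(values, ranks):
    """Lower and upper cutoff values at the given 1-based ranks."""
    xs = sorted(values)
    r, s = ranks
    return xs[r - 1], xs[s - 1]


def trim(values, lo, hi):
    """Drop every value outside [lo, hi]."""
    return [v for v in values if lo <= v <= hi]


def winsorize(values, lo, hi):
    """Pull every value outside [lo, hi] back to the nearest cutoff."""
    return [min(max(v, lo), hi) for v in values]


if __name__ == "__main__":
    random.seed(1)
    data = [random.lognormvariate(0.0, 1.0) for _ in range(200)]  # skewed toy data
    n = len(data)

    # Fixed-proportion cutoffs (no confidence statement attached).
    lo_g, hi_g = cutoffs(data, proportion_cutoff_ranks(n, 0.05))
    # Tolerance-interval cutoffs: 90% content with 95% confidence.
    lo_t, hi_t = cutoffs(data, tolerance_cutoff_ranks(n, 0.90, 0.95))

    for label, lo, hi in [("proportion g", lo_g, hi_g), ("tolerance", lo_t, hi_t)]:
        t = trim(data, lo, hi)
        w = winsorize(data, lo, hi)
        print(label, "trimmed mean:", sum(t) / len(t),
              "Winsorized mean:", sum(w) / len(w))
```

In this sketch the fixed-proportion ranks depend only on g and n, whereas the tolerance-interval ranks also depend on the requested content and confidence, which is where the confidence statement enters. For small samples no symmetric pair of order statistics may satisfy the requirement, in which case the function returns None.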

Suggested Citation

  • Kedai Cheng & Derek S. Young, 2023. "An Approach for Specifying Trimming and Winsorization Cutoffs," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 28(2), pages 299-323, June.
  • Handle: RePEc:spr:jagbes:v:28:y:2023:i:2:d:10.1007_s13253-023-00527-4
    DOI: 10.1007/s13253-023-00527-4


