Printed from https://ideas.repec.org/a/eee/csdana/v115y2017icp250-266.html

On hyperbolic transformations to normality

Author

Listed:
  • Tsai, Arthur C.
  • Liou, Michelle
  • Simak, Maria
  • Cheng, Philip E.

Abstract

In the biological and social sciences, it is essential to consider data transformations to normality for detecting structural effects and for better data representation and interpretation. An array of transformations to normality has been derived for data exhibiting skewed, leptokurtic and unimodal shapes, but these transformations are less amenable to data exhibiting platykurtic shapes, such as a nearly bimodal distribution. This study proposes and constructs a new family of hyperbolic power transformations for improving the normality of raw data with varying degrees of skewness and kurtosis. An advantage of this new family is its effectiveness in transforming platykurtic or bimodal data distributions to normal. A simulation study and a real data example on mathematics achievement test scores illustrate the wide-ranging applications of the proposed family of transformations. As a cautionary note, the usefulness and limitations of the proposed method are discussed for stabilizing the variance of DNA microarray data and for symmetrizing the data distribution towards normality. The empirical applications also illustrate an example of conservative t- and ANOVA F-tests when the assumption of normality is violated.
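
The distinction the abstract draws between leptokurtic and platykurtic (nearly bimodal) shapes can be illustrated with a minimal Python sketch. It does not implement the proposed hyperbolic power transformations, whose formula is not given in this listing. It only computes sample skewness and excess kurtosis for a normal sample and for a symmetric two-component mixture. The function names, the mixture separation of 2.0, the sample size and the seed are all assumptions chosen for illustration.

```python
import random
import statistics


def sample_skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from simple moment estimators."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # zero for a normal distribution
    return skewness, excess_kurtosis


def draw_bimodal(n, separation, rng):
    """Equal mixture of N(-separation, 1) and N(+separation, 1) (illustrative only)."""
    return [rng.gauss(rng.choice((-separation, separation)), 1.0) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(20000)]
bimodal_sample = draw_bimodal(20000, 2.0, rng)

normal_skew, normal_kurt = sample_skewness_and_excess_kurtosis(normal_sample)
bimodal_skew, bimodal_kurt = sample_skewness_and_excess_kurtosis(bimodal_sample)

print(f"normal : skewness={normal_skew:.3f}, excess kurtosis={normal_kurt:.3f}")
print(f"bimodal: skewness={bimodal_skew:.3f}, excess kurtosis={bimodal_kurt:.3f}")
```

For this mixture the theoretical excess kurtosis is -2μ⁴/(1+μ²)² with μ = 2, that is -1.28, so the bimodal sample is symmetric but clearly platykurtic. This is the kind of shape the abstract says existing transformation families handle less well.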

Suggested Citation

  • Tsai, Arthur C. & Liou, Michelle & Simak, Maria & Cheng, Philip E., 2017. "On hyperbolic transformations to normality," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 250-266.
  • Handle: RePEc:eee:csdana:v:115:y:2017:i:c:p:250-266
    DOI: 10.1016/j.csda.2017.06.001

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947317301408
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2017.06.001?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. J. A. John & N. R. Draper, 1980. "An Alternative Family of Transformations," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 29(2), pages 190-197, June.
    2. Parrish, Rudolph S. & Spencer III, Horace J. & Xu, Ping, 2009. "Distribution modeling and simulation of gene expression data," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1650-1660, March.
    3. Dieter Rasch & Klaus Kubinger & Karl Moder, 2011. "The two-sample t test: pre-testing its assumptions does not pay off," Statistical Papers, Springer, vol. 52(1), pages 219-231, February.
    4. Greenacre, Michael, 2009. "Power transformations in correspondence analysis," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3107-3116, June.
    5. Gel, Yulia R. & Gastwirth, Joseph L., 2008. "A robust modification of the Jarque-Bera test of normality," Economics Letters, Elsevier, vol. 99(1), pages 30-32, April.
    6. Filidor Vilca & Mariana Rodrigues-Motta & Víctor Leiva, 2013. "On a variance stabilizing model and its application to genomic data," Journal of Applied Statistics, Taylor & Francis Journals, vol. 40(11), pages 2354-2371, November.
    7. M. C. Jones & Arthur Pewsey, 2009. "Sinh-arcsinh distributions," Biometrika, Biometrika Trust, vol. 96(4), pages 761-780.
    8. Ambroise, Jerome & Bearzatto, Bertrand & Robert, Annie & Govaerts, Bernadette & Macq, Benoit & Gala, Jean-Luc, 2011. "Impact of the spotted microarray preprocessing method on fold-change compression and variance stability," LIDAM Reprints ISBA 2011054, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    9. Purdom Elizabeth & Holmes Susan P, 2005. "Error Distribution for Gene Expression Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 4(1), pages 1-35, July.

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.


    Cited by:

    1. Michael A. Clemens & Hannah M. Postel, 2018. "Deterring Emigration with Foreign Aid: An Overview of Evidence from Low‐Income Countries," Population and Development Review, The Population Council, Inc., vol. 44(4), pages 667-693, December.
    2. Priddle, Jacob W. & Drovandi, Christopher, 2023. "Transformations in semi-parametric Bayesian synthetic likelihood," Computational Statistics & Data Analysis, Elsevier, vol. 187(C).
    3. Geovanny Marulanda & Antonio Bello & Jenny Cifuentes & Javier Reneses, 2020. "Wind Power Long-Term Scenario Generation Considering Spatial-Temporal Dependencies in Coupled Electricity Markets," Energies, MDPI, vol. 13(13), pages 1-19, July.
    4. Nicola Loperfido, 2023. "Kurtosis removal for data pre-processing," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(1), pages 239-267, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Snezhana Gocheva-Ilieva & Iliycho Iliev, 2016. "Using Generalized PathSeeker Regularized Regression for Modeling and Prediction of Output Power of CuBr Laser," Proceedings of International Academic Conferences 4006523, International Institute of Social and Economic Sciences.
    2. Ngene, Geoffrey & Tah, Kenneth A. & Darrat, Ali F., 2017. "Long memory or structural breaks: Some evidence for African stock markets," Review of Financial Economics, Elsevier, vol. 34(C), pages 61-73.
    3. Punathumparambath, Bindu & Kulathinal, Sangita & George, Sebastian, 2012. "Asymmetric type II compound Laplace distribution and its application to microarray gene expression," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1396-1404.
    4. Ambroise Jérôme & Bearzatto Bertrand & Robert Annie & Macq Benoit & Gala Jean-Luc, 2012. "Combining Multiple Laser Scans of Spotted Microarrays by Means of a Two-Way ANOVA Model," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(3), pages 1-20, February.
    5. Magne Mogstad & Joseph P Romano & Azeem M Shaikh & Daniel Wilhelm, 2024. "Inference for Ranks with Applications to Mobility across Neighbourhoods and Academic Achievement across Countries," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 91(1), pages 476-518.
    6. David Atkin & Benjamin Faber & Marco Gonzalez-Navarro, 2018. "Retail Globalization and Household Welfare: Evidence from Mexico," Journal of Political Economy, University of Chicago Press, vol. 126(1), pages 1-73.
    7. Huixia Judy Wang & Leonard A. Stefanski & Zhongyi Zhu, 2012. "Corrected-loss estimation for quantile regression with covariate measurement errors," Biometrika, Biometrika Trust, vol. 99(2), pages 405-421.
    8. Alina Bărbulescu & Cristian Ștefan Dumitriu, 2021. "On the Connection between the GEP Performances and the Time Series Properties," Mathematics, MDPI, vol. 9(16), pages 1-19, August.
    9. J. Hambuckers & C. Heuchenne, 2017. "A robust statistical approach to select adequate error distributions for financial returns," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(1), pages 137-161, January.
    10. Blasius, J. & Greenacre, M. & Groenen, P.J.F. & van de Velden, M., 2009. "Special issue on correspondence analysis and related methods," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3103-3106, June.
    11. Pora, Pierre & Wilner, Lionel, 2020. "A decomposition of labor earnings growth: Recovering Gaussianity?," Labour Economics, Elsevier, vol. 63(C).
    12. Sladana Babic & Laetitia Gelbgras & Marc Hallin & Christophe Ley, 2019. "Optimal tests for elliptical symmetry: specified and unspecified location," Working Papers ECARES 2019-26, ULB -- Universite Libre de Bruxelles.
    13. Mendonça, Suzielli M. & Cabella, Brenno C.T. & Martinez, Alexandre S., 2024. "A Multifractal Detrended Fluctuation Analysis approach using generalized functions," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 637(C).
    14. Lee, Sharon X. & McLachlan, Geoffrey J., 2022. "An overview of skew distributions in model-based clustering," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    15. Shalit, Haim, 2012. "Using OLS to test for normality," Statistics & Probability Letters, Elsevier, vol. 82(11), pages 2050-2058.
    16. Leonie Kuen & Fiona Schürmann & Daniel Westmattelmann & Sophie Hartwig & Shay Tzafrir & Gerhard Schewe, 2023. "Trust transfer effects and associated risks in telemedicine adoption," Electronic Markets, Springer;IIM University of St. Gallen, vol. 33(1), pages 1-22, December.
    17. Francisco J. Rubio & Yili Hong, 2016. "Survival and lifetime data analysis with a flexible class of distributions," Journal of Applied Statistics, Taylor & Francis Journals, vol. 43(10), pages 1794-1813, August.
    18. Tu, Shiyi & Wang, Min & Sun, Xiaoqian, 2016. "Bayesian analysis of two-piece location–scale models under reference priors with partial information," Computational Statistics & Data Analysis, Elsevier, vol. 96(C), pages 133-144.
    19. Mondal, Sagnik & Genton, Marc G., 2024. "A multivariate skew-normal-Tukey-h distribution," Journal of Multivariate Analysis, Elsevier, vol. 200(C).
    20. Frédérique Bec & Heino Bohn Nielsen & Sarra Saïdi, 2020. "Mixed Causal–Noncausal Autoregressions: Bimodality Issues in Estimation and Unit Root Testing," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 82(6), pages 1413-1428, December.
