Printed from https://ideas.repec.org/a/spr/stmapp/v24y2015i1p97-119.html

Tests for statistical significance of a treatment effect in the presence of hidden sub-populations

Author

Listed:
  • B. Karmakar
  • K. Dhara
  • K. Dey
  • A. Basu
  • A. Ghosh

Abstract

For testing the statistical significance of a treatment effect, we often compare two parts of a population: one is exposed to the treatment, and the other is not. Standard parametric or nonparametric two-sample tests are commonly used for this comparison. But direct application of these tests can yield misleading results, especially when the population has hidden sub-populations and the effect of the sub-population difference on the response dominates the treatment effect. This problem becomes more evident if these sub-populations are represented in widely different proportions in the samples obtained from the two parts. In this article, we propose some simple methods to overcome these limitations. The proposed methods first use a suitable clustering algorithm to find the hidden sub-populations, and then eliminate the sub-population effect by a suitable transformation of the data. Standard two-sample tests, when applied to the transformed data, usually yield better results. We analyze some simulated and real data sets to demonstrate the utility of the proposed methods.

Copyright Springer-Verlag Berlin Heidelberg 2015
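
The cluster-then-transform idea in the abstract can be illustrated with a small sketch. The abstract does not say which clustering algorithm or transformation the authors use, so everything specific here is an assumption made for illustration: a plain 1-D 2-means clustering of the pooled responses, centering each observation by its cluster mean, and a Welch t statistic as the two-sample test. The toy data are invented: two sub-populations with baselines near 0 and 10, a treatment effect of +1, and samples in which the two sub-populations are represented in very different proportions.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch two-sample t statistic for mean(x) - mean(y)."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def two_means_1d(values, n_iter=100):
    """Plain 1-D 2-means (Lloyd's algorithm); returns a 0/1 label per value.

    Assumes at least two distinct values, so neither cluster is ever empty.
    """
    centers = [min(values), max(values)]
    labels = []
    for _ in range(n_iter):
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        new_centers = [mean([v for v, lab in zip(values, labels) if lab == k])
                       for k in (0, 1)]
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def remove_subpopulation_effect(control, treated):
    """Hypothetical transformation: subtract each pooled cluster's mean."""
    pooled = control + treated
    labels = two_means_1d(pooled)
    cluster_mean = {k: mean([v for v, lab in zip(pooled, labels) if lab == k])
                    for k in (0, 1)}
    centered = [v - cluster_mean[lab] for v, lab in zip(pooled, labels)]
    return centered[:len(control)], centered[len(control):]


# Invented toy data: the control sample is mostly from the high-baseline
# sub-population, the treated sample mostly from the low-baseline one.
CONTROL = [0.1, -0.1, 9.7, 9.8, 9.9, 10.0, 10.0, 10.1, 10.2, 10.3]
TREATED = [0.7, 0.8, 0.9, 1.0, 1.0, 1.1, 1.2, 1.3, 10.9, 11.1]

naive_t = welch_t(TREATED, CONTROL)
adj_control, adj_treated = remove_subpopulation_effect(CONTROL, TREATED)
adjusted_t = welch_t(adj_treated, adj_control)

print("naive t:", round(naive_t, 2))        # negative: treatment looks harmful
print("adjusted t:", round(adjusted_t, 2))  # positive: true direction recovered
```

In this toy case the naive comparison points in the wrong direction because the sub-population imbalance swamps the treatment effect, while the comparison on the centered data recovers the correct sign. Note that centering by the pooled cluster mean shrinks the estimated difference somewhat, so this sketch illustrates the testing idea only and should not be read as an estimator of the effect size.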

Suggested Citation

  • B. Karmakar & K. Dhara & K. Dey & A. Basu & A. Ghosh, 2015. "Tests for statistical significance of a treatment effect in the presence of hidden sub-populations," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 24(1), pages 97-119, March.
  • Handle: RePEc:spr:stmapp:v:24:y:2015:i:1:p:97-119
    DOI: 10.1007/s10260-014-0271-x

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1007/s10260-014-0271-x
    Download Restriction: Access to full text is restricted to subscribers.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Andrea Cappozzo & Luis Angel García Escudero & Francesca Greselin & Agustín Mayo-Iscar, 2021. "Parameter Choice, Stability and Validity for Robust Cluster Weighted Modeling," Stats, MDPI, vol. 4(3), pages 1-14, July.
    2. Julian Rossbroich & Jeffrey Durieux & Tom F. Wilderjans, 2022. "Model Selection Strategies for Determining the Optimal Number of Overlapping Clusters in Additive Overlapping Partitional Clustering," Journal of Classification, Springer;The Classification Society, vol. 39(2), pages 264-301, July.
    3. Thiemo Fetzer & Samuel Marden, 2017. "Take What You Can: Property Rights, Contestability and Conflict," Economic Journal, Royal Economic Society, vol. 0(601), pages 757-783, May.
    4. Daniel Agness & Travis Baseler & Sylvain Chassang & Pascaline Dupas & Erik Snowberg, 2022. "Valuing the Time of the Self-Employed," CESifo Working Paper Series 9567, CESifo.
    5. Jeffrey S. Racine & Qi Li & Dalei Yu & Li Zheng, 2023. "Optimal Model Averaging of Mixed-Data Kernel-Weighted Spline Regressions," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 41(4), pages 1251-1261, October.
    6. Claudio Vitari & Elisabetta Raguseo, 2016. "Big data value and financial performance: an empirical investigation [Digital data, dynamic capability and financial performance: an empirical investigation in the era of Big Data]," Post-Print halshs-01923271, HAL.
    7. Martins, José & Costa, Catarina & Oliveira, Tiago & Gonçalves, Ramiro & Branco, Frederico, 2019. "How smartphone advertising influences consumers' purchase intention," Journal of Business Research, Elsevier, vol. 94(C), pages 378-387.
    8. Philippe Goulet Coulombe & Maxime Leroux & Dalibor Stevanovic & Stéphane Surprenant, 2022. "How is machine learning useful for macroeconomic forecasting?," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(5), pages 920-964, August.
    9. Gupta, Prashant & Seetharaman, A. & Raj, John Rudolph, 2013. "The usage and adoption of cloud computing by small and medium businesses," International Journal of Information Management, Elsevier, vol. 33(5), pages 861-874.
    10. Asif Khan & Chih-Cheng Chen & Kwanrat Suanpong & Athapol Ruangkanjanases & Santhaya Kittikowit & Shih-Chih Chen, 2021. "The Impact of CSR on Sustainable Innovation Ambidexterity: The Mediating Role of Sustainable Supply Chain Management and Second-Order Social Capital," Sustainability, MDPI, vol. 13(21), pages 1-25, November.
    11. Davide Fiaschi & Andrea Mario Lavezzi & Angela Parenti, 2020. "Deep and Proximate Determinants of the World Income Distribution," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 66(3), pages 677-710, September.
    12. Chen, Shih-Chih & Hung, Chung-Wen, 2016. "Elucidating the factors influencing the acceptance of green products: An extension of theory of planned behavior," Technological Forecasting and Social Change, Elsevier, vol. 112(C), pages 155-163.
    13. Orietta Nicolis & Jean Paul Maidana & Fabian Contreras & Danilo Leal, 2024. "Analyzing the Impact of COVID-19 on Economic Sustainability: A Clustering Approach," Sustainability, MDPI, vol. 16(4), pages 1-30, February.
    14. Bronzo, Marcelo & de Resende, Paulo Tarso Vilela & de Oliveira, Marcos Paulo Valadares & McCormack, Kevin P. & de Sousa, Paulo Renato & Ferreira, Reinaldo Lopes, 2013. "Improving performance aligning business analytics with process orientation," International Journal of Information Management, Elsevier, vol. 33(2), pages 300-307.
    15. Li, Pai-Ling & Chiou, Jeng-Min, 2011. "Identifying cluster number for subspace projected functional data clustering," Computational Statistics & Data Analysis, Elsevier, vol. 55(6), pages 2090-2103, June.
    16. Yaeji Lim & Hee-Seok Oh & Ying Kuen Cheung, 2019. "Multiscale Clustering for Functional Data," Journal of Classification, Springer;The Classification Society, vol. 36(2), pages 368-391, July.
    17. R.M. Thirupathi & S. Vinodh, 2016. "Application of interpretive structural modelling and structural equation modelling for analysis of sustainable manufacturing factors in Indian automotive component sector," International Journal of Production Research, Taylor & Francis Journals, vol. 54(22), pages 6661-6682, November.
    18. Forzani, Liliana & Gieco, Antonella & Tolmasky, Carlos, 2017. "Likelihood ratio test for partial sphericity in high and ultra-high dimensions," Journal of Multivariate Analysis, Elsevier, vol. 159(C), pages 18-38.
    19. Seif Obeid Al-Shbiel, 2016. "An Examination the Factors Influence on Unethical Behaviour among Jordanian external auditors: Job Satisfaction as a mediator," International Journal of Academic Research in Accounting, Finance and Management Sciences, Human Resource Management Academic Research Society, International Journal of Academic Research in Accounting, Finance and Management Sciences, vol. 6(3), pages 285-296, July.
    20. Yujia Li & Xiangrui Zeng & Chien‐Wei Lin & George C. Tseng, 2022. "Simultaneous estimation of cluster number and feature sparsity in high‐dimensional cluster analysis," Biometrics, The International Biometric Society, vol. 78(2), pages 574-585, June.
