IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0201950.html

Exploratory data analysis of a clinical study group: Development of a procedure for exploring multidimensional data

Author

Listed:
  • Bogumil M Konopka
  • Felicja Lwow
  • Magdalena Owczarz
  • Łukasz Łaczmański

Abstract

Thorough knowledge of the structure of the analyzed data makes it possible to form detailed scientific hypotheses and research questions. The structure of data can be revealed with methods for exploratory data analysis. Because many methods are available, selecting those that work well together and facilitate data interpretation is not an easy task. In this work we present a well-matched set of tools for a complete exploratory analysis of a clinical dataset and perform a case study analysis on a set of 515 patients. The proposed procedure comprises four steps: 1) robust data normalization, 2) outlier detection with Mahalanobis (MD) and robust Mahalanobis distances (rMD), 3) hierarchical clustering with Ward’s algorithm, and 4) Principal Component Analysis with biplot vectors. The analyzed set comprised elderly patients who participated in the PolSenior project. Each patient was characterized by over 40 biochemical and socio-geographical attributes. Introductory analysis showed that the case-study dataset comprises two clusters separated along the axis of sex hormone attributes. Further analysis was therefore carried out separately for male and female patients. The optimal partitioning of the male set resulted in five subgroups. Two of them were related to diseased patients: 1) patients with diabetes and 2) patients with hypogonadism. Analysis of the female set suggested that it was more homogeneous than the male set. No evidence of pathological patient subgroups was found. In the study we showed that outlier detection with MD and rMD not only identifies outliers but can also be used to assess the heterogeneity of a dataset. The case study proved that our procedure is well suited for identification and visualization of biologically meaningful patient subgroups.
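
The four steps named in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic (two groups of 50 plus one planted outlier), the normalization is assumed to be median/MAD scaling, the outlier cutoff is assumed to be the 0.975 chi-square quantile, and all function names are hypothetical. Only the classical MD is computed; a robust variant (rMD) would replace the mean and covariance with robust estimates, for example the Minimum Covariance Determinant available as `MinCovDet` in scikit-learn.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import chi2


def robust_normalize(X):
    """Center each column on its median and scale it by its MAD (assumed choice)."""
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    mad = np.where(mad == 0, 1.0, mad)  # guard against constant columns
    return (X - med) / (1.4826 * mad)   # 1.4826 makes the MAD comparable to a std. dev.


def mahalanobis(X):
    """Classical Mahalanobis distance of each row from the sample mean."""
    diff = X - X.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))


def ward_labels(X, n_clusters):
    """Hierarchical clustering with Ward's criterion, cut into n_clusters groups."""
    tree = linkage(X, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def pca_biplot(X, n_components=2):
    """PCA via SVD: sample scores, attribute loadings (biplot vectors), variance ratios."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    loadings = Vt[:n_components].T
    explained = (S ** 2 / np.sum(S ** 2))[:n_components]
    return scores, loadings, explained


# Synthetic stand-in for a clinical table: rows are patients, columns are attributes.
rng = np.random.default_rng(0)
n_per_group, n_attr = 50, 4
group_a = rng.normal(0.0, 1.0, size=(n_per_group, n_attr))
group_b = rng.normal(10.0, 1.0, size=(n_per_group, n_attr))
outlier = np.array([[0.0, 0.0, 40.0, 0.0]])  # planted outlier, last row
X = np.vstack([group_a, group_b, outlier])
truth = np.array([0] * n_per_group + [1] * n_per_group + [-1])

# 1) robust normalization, 2) outlier detection with MD
Xn = robust_normalize(X)
md = mahalanobis(Xn)
cutoff = np.sqrt(chi2.ppf(0.975, df=n_attr))  # assumed cutoff convention
flagged = md > cutoff

# 3) Ward clustering and 4) PCA on the rows that were not flagged
X_clean, truth_clean = Xn[~flagged], truth[~flagged]
labels = ward_labels(X_clean, n_clusters=2)
scores, loadings, explained = pca_biplot(X_clean)

print("rows flagged as outliers:", int(flagged.sum()))
print("variance explained by PC1, PC2:", np.round(explained, 3))
```

In a biplot, the rows of `scores` are drawn as points and the rows of `loadings` as arrows, so the attributes that pull the clusters apart can be read from the direction of the arrows.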

Suggested Citation

  • Bogumil M Konopka & Felicja Lwow & Magdalena Owczarz & Łukasz Łaczmański, 2018. "Exploratory data analysis of a clinical study group: Development of a procedure for exploring multidimensional data," PLOS ONE, Public Library of Science, vol. 13(8), pages 1-21, August.
  • Handle: RePEc:plo:pone00:0201950
    DOI: 10.1371/journal.pone.0201950

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0201950
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0201950&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Fionn Murtagh & Pierre Legendre, 2014. "Ward’s Hierarchical Agglomerative Clustering Method: Which Algorithms Implement Ward’s Criterion?," Journal of Classification, Springer;The Classification Society, vol. 31(3), pages 274-295, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Maksym Polyakov & Morteza Chalak & Md. Sayed Iftekhar & Ram Pandit & Sorada Tapsuwan & Fan Zhang & Chunbo Ma, 2018. "Authorship, Collaboration, Topics, and Research Gaps in Environmental and Resource Economics 1991–2015," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 71(1), pages 217-239, September.
    2. Giger, Markus & Mutea, Emily & Kiteme, Boniface & Eckert, Sandra & Anseeuw, Ward & Zaehringer, Julie G., 2020. "Large agricultural investments in Kenya’s Nanyuki Area: Inventory and analysis of business models," Land Use Policy, Elsevier, vol. 99(C).
    3. Walker, Nathan L. & Styles, David & Coughlan, Paul & Williams, A. Prysor, 2022. "Cross-sector sustainability benchmarking of major utilities in the United Kingdom," Utilities Policy, Elsevier, vol. 78(C).
    4. Abang Zainoren Abang Abdurahman & Syerina Azlin Md Nasir & Wan Fairos Wan Yaacob & Serah Jaya & Suhaili Mokhtar, 2021. "Spatio-Temporal Clustering of Sarawak Malaysia Total Protected Area Visitors," Sustainability, MDPI, vol. 13(21), pages 1-19, October.
    5. Mulu Abraha Woldegiorgis & Janet E. Hiller & Wubegzier Mekonnen & Jahar Bhowmik, 2018. "Disparities in maternal health services in sub-Saharan Africa," International Journal of Public Health, Springer;Swiss School of Public Health (SSPH+), vol. 63(4), pages 525-535, May.
    6. Monika Stanny & Łukasz Komorowski & Andrzej Rosner, 2021. "The Socio-Economic Heterogeneity of Rural Areas: Towards a Rural Typology of Poland," Energies, MDPI, vol. 14(16), pages 1-23, August.
    7. Anca Gabriela Ilie & Marinela Luminita Emanuela Zlatea & Cristina Negreanu & Dan Dumitriu & Alma Pentescu, 2023. "Reliance on Russian Federation Energy Imports and Renewable Energy in the European Union," The AMFITEATRU ECONOMIC journal, Academy of Economic Studies - Bucharest, Romania, vol. 25(64), pages 780-780, August.
    8. Jon Ellingsen & Vegard H. Larsen & Leif Anders Thorsrud, 2020. "News Media vs. FRED-MD for Macroeconomic Forecasting," CESifo Working Paper Series 8639, CESifo.
    9. Sokhna Dieng & Pierre Michel & Abdoulaye Guindo & Kankoe Sallah & El-Hadj Ba & Badara Cissé & Maria Patrizia Carrieri & Cheikh Sokhna & Paul Milligan & Jean Gaudart, 2020. "Application of Functional Data Analysis to Identify Patterns of Malaria Incidence, to Guide Targeted Control Strategies," IJERPH, MDPI, vol. 17(11), pages 1-23, June.
    10. Leila Fardeau & Eva Lelièvre & Loïc Trabut, 2023. "Complex households, a challenge for the study of families through census data," Working Papers 274, French Institute for Demographic Studies.
    11. Marco Cruz-Sandoval & Elisabet Roca & María Isabel Ortego, 2020. "Compositional Data Analysis Approach in the Measurement of Social-Spatial Segregation: Towards a Sustainable and Inclusive City," Sustainability, MDPI, vol. 12(10), pages 1-19, May.
    12. Yurij L. Katchanov & Yulia V. Markova, 2017. "The “space of physics journals”: topological structure and the Journal Impact Factor," Scientometrics, Springer;Akadémiai Kiadó, vol. 113(1), pages 313-333, October.
    13. Xue Ding & Mengling Qin & Linsen Yin & Dayong Lv & Yao Bai, 2023. "Research on FinTech Talent Evaluation Index System and Recruitment Strategy: Evidence From Shanghai in China," SAGE Open, vol. 13(4), pages 21582440231, November.
    14. Šubová, Nikola, 2022. "The Contribution of Energy Use and Production to Greenhouse Gas Emissions: Evidence from the Agriculture of European Countries," AGRIS on-line Papers in Economics and Informatics, Czech University of Life Sciences Prague, Faculty of Economics and Management, vol. 14(3), September.
    15. Babucea Ana-Gabriela, 2017. "Determinants Of The Recent Romanian Households' Financial Behaviour For Housing Loans - A Territorial Analysis At The Level Of Nuts 3 Regions," Annals - Economy Series, Constantin Brancusi University, Faculty of Economics, vol. 1, pages 71-80, December.
    16. Brian A Hoover & Marisol García-Reyes & Sonia D Batten & Chelle L Gentemann & William J Sydeman, 2021. "Spatio-temporal persistence of zooplankton communities in the Gulf of Alaska," PLOS ONE, Public Library of Science, vol. 16(1), pages 1-24, January.
    17. Xiao Li & Michele Guindani & Chaan S. Ng & Brian P. Hobbs, 2021. "A Bayesian nonparametric model for textural pattern heterogeneity," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(2), pages 459-480, March.
    18. Maritza Satama & Eva Iglesias, 2020. "Fuzzy Cognitive Map Clustering to Assess Local Knowledge of Ecosystem Conservation in Ecuador," Sustainability, MDPI, vol. 12(6), pages 1-26, March.
    19. Iustina Alina Boitan & Ewa Wanda Maruszewska, 2021. "Corporate governance features among European Union countries – an exploratory analysis," The Review of Finance and Banking, Academia de Studii Economice din Bucuresti, Romania / Facultatea de Finante, Asigurari, Banci si Burse de Valori / Catedra de Finante, vol. 13(1), pages 79-91, June.
    20. V. Candila & O. Cepni & G. M. Gallo & R. Gupta, 2024. "Influence of Local and Global Economic Policy Uncertainty on the volatility of US state-level equity returns: Evidence from a GARCH-MIDAS approach with Shrinkage and Cluster Analysis," Working Paper CRENoS 202414, Centre for North South Economic Research, University of Cagliari and Sassari, Sardinia.

