IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0141756.html

Clustering Acoustic Segments Using Multi-Stage Agglomerative Hierarchical Clustering

Authors

  • Lerato Lerato
  • Thomas Niesler

Abstract

Agglomerative hierarchical clustering becomes infeasible when applied to large datasets due to its O(N^2) storage requirements. We present a multi-stage agglomerative hierarchical clustering (MAHC) approach aimed at large datasets of speech segments. The algorithm is based on an iterative divide-and-conquer strategy. The data is first split into independent subsets, each of which is clustered separately. This reduces the storage required for sequential implementations, and allows concurrent computation on parallel computing hardware. The resultant clusters are merged and subsequently re-divided into subsets, which are passed to the following iteration. We show that MAHC can match and even surpass the performance of the exact implementation when applied to datasets of speech segments.
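
The divide-and-conquer loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes fixed-length feature vectors compared with Euclidean distance (speech segments of varying length would likely need a different distance), naive average-linkage clustering, centroids as cluster representatives, and fixed cluster counts. All function and parameter names (`mahc`, `ahc`, `n_subsets`, `k_per_subset`, `n_final`) are hypothetical.

```python
import math
import random


def pair_count(n):
    """Entries in the upper triangle of an n-by-n distance matrix."""
    return n * (n - 1) // 2


def centroid(cluster):
    """Mean vector of a non-empty list of equal-length tuples."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))


def ahc(points, k):
    """Naive average-linkage AHC. Returns at most k clusters as index lists."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > max(k, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(math.dist(points[a], points[b])
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters


def mahc(points, n_subsets, k_per_subset, n_final, iterations=2, seed=0):
    """Multi-stage sketch: cluster subsets, merge, re-divide, repeat."""
    pts = list(points)
    random.Random(seed).shuffle(pts)
    # Initial split into independent subsets (assumption: random, equal-sized).
    subsets = [pts[i::n_subsets] for i in range(n_subsets)]
    for it in range(iterations):
        # Stage 1: cluster each subset on its own; these calls are independent
        # and could run concurrently.
        stage1 = []
        for s in subsets:
            stage1.extend([s[i] for i in idx] for idx in ahc(s, k_per_subset))
        # Stage 2: cluster one representative per stage-1 cluster
        # (assumption: the centroid), then merge the members accordingly.
        reps = [centroid(c) for c in stage1]
        target = n_final if it == iterations - 1 else n_subsets
        groups = ahc(reps, target)
        # The merged groups become the subsets of the next iteration,
        # or the final clusters on the last pass.
        subsets = [[p for g in grp for p in stage1[g]] for grp in groups]
    return subsets


# Toy data: two well-separated 2-D blobs of 20 points each.
rng = random.Random(1)
blob_a = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(20)]
blob_b = [(100 + rng.uniform(-1, 1), 100 + rng.uniform(-1, 1)) for _ in range(20)]
clusters = mahc(blob_a + blob_b, n_subsets=4, k_per_subset=3, n_final=2)
print(sorted(len(c) for c in clusters))
print(pair_count(40), "pairwise distances at once versus", pair_count(10), "per subset")
```

The last line illustrates the storage argument: a single pass over N items holds N(N-1)/2 pairwise distances, whereas each of P equal subsets holds only (N/P)(N/P-1)/2, and the subsets can be processed one at a time or in parallel.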

Suggested Citation

  • Lerato Lerato & Thomas Niesler, 2015. "Clustering Acoustic Segments Using Multi-Stage Agglomerative Hierarchical Clustering," PLOS ONE, Public Library of Science, vol. 10(10), pages 1-24, October.
  • Handle: RePEc:plo:pone00:0141756
    DOI: 10.1371/journal.pone.0141756

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0141756
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0141756&type=printable
    Download Restriction: no


    Citations

    Cited by:

    1. Tianxiao Wang & Zhecheng Jing & Shupei Zhang & Chengqun Qiu, 2023. "Utilizing Principal Component Analysis and Hierarchical Clustering to Develop Driving Cycles: A Case Study in Zhenjiang," Sustainability, MDPI, vol. 15(6), pages 1-13, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Maurizio Vichi & Carlo Cavicchia & Patrick J. F. Groenen, 2022. "Hierarchical Means Clustering," Journal of Classification, Springer;The Classification Society, vol. 39(3), pages 553-577, November.
    2. Claudiu Vinte & Marcel Ausloos, 2022. "The Cross-Sectional Intrinsic Entropy. A Comprehensive Stock Market Volatility Estimator," Papers 2205.00104, arXiv.org.
    3. Jiao Jieying & Hu Guanyu & Yan Jun, 2021. "A Bayesian marked spatial point processes model for basketball shot chart," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 17(2), pages 77-90, June.
    4. Paulus, Michal & Kristoufek, Ladislav, 2015. "Worldwide clustering of the corruption perception," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 428(C), pages 351-358.
    5. Hyeri Choi & Min Jae Park, 2019. "Evaluating the Efficiency of Governmental Excellence for Social Progress: Focusing on Low- and Lower-Middle-Income Countries," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 141(1), pages 111-130, January.
    6. Maksym Polyakov & Morteza Chalak & Md. Sayed Iftekhar & Ram Pandit & Sorada Tapsuwan & Fan Zhang & Chunbo Ma, 2018. "Authorship, Collaboration, Topics, and Research Gaps in Environmental and Resource Economics 1991–2015," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 71(1), pages 217-239, September.
    7. Grzegorz Maciejewski & Mirosława Malinowska & Barbara Kucharska & Michał Kucia & Beata Kolny, 2021. "Sustainable Development as a Factor Differentiating Consumer Behavior: The Case of Poland," European Research Studies Journal, European Research Studies Journal, vol. 0(3), pages 934-948.
    8. Giger, Markus & Mutea, Emily & Kiteme, Boniface & Eckert, Sandra & Anseeuw, Ward & Zaehringer, Julie G., 2020. "Large agricultural investments in Kenya’s Nanyuki Area: Inventory and analysis of business models," Land Use Policy, Elsevier, vol. 99(C).
    9. Walker, Nathan L. & Styles, David & Coughlan, Paul & Williams, A. Prysor, 2022. "Cross-sector sustainability benchmarking of major utilities in the United Kingdom," Utilities Policy, Elsevier, vol. 78(C).
    10. Pierre H. H. Schneeberger & Morgan Gueuning & Sophie Welsche & Eveline Hürlimann & Julian Dommann & Cécile Häberli & Jürg E. Frey & Somphou Sayasone & Jennifer Keiser, 2022. "Different gut microbial communities correlate with efficacy of albendazole-ivermectin against soil-transmitted helminthiases," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    11. Abang Zainoren Abang Abdurahman & Syerina Azlin Md Nasir & Wan Fairos Wan Yaacob & Serah Jaya & Suhaili Mokhtar, 2021. "Spatio-Temporal Clustering of Sarawak Malaysia Total Protected Area Visitors," Sustainability, MDPI, vol. 13(21), pages 1-19, October.
    12. Mulu Abraha Woldegiorgis & Janet E. Hiller & Wubegzier Mekonnen & Jahar Bhowmik, 2018. "Disparities in maternal health services in sub-Saharan Africa," International Journal of Public Health, Springer;Swiss School of Public Health (SSPH+), vol. 63(4), pages 525-535, May.
    13. Monika Stanny & Łukasz Komorowski & Andrzej Rosner, 2021. "The Socio-Economic Heterogeneity of Rural Areas: Towards a Rural Typology of Poland," Energies, MDPI, vol. 14(16), pages 1-23, August.
    14. Renato Amorim, 2015. "Feature Relevance in Ward’s Hierarchical Clustering Using the L p Norm," Journal of Classification, Springer;The Classification Society, vol. 32(1), pages 46-62, April.
    15. Anca Gabriela Ilie & Marinela Luminita Emanuela Zlatea & Cristina Negreanu & Dan Dumitriu & Alma Pentescu, 2023. "Reliance on Russian Federation Energy Imports and Renewable Energy in the European Union," The AMFITEATRU ECONOMIC journal, Academy of Economic Studies - Bucharest, Romania, vol. 25(64), pages 780-780, August.
    16. Luiza Ossowska & Dorota Janiszewska & Natalia Bartkowiak-Bakun & Grzegorz Kwiatkowski, 2020. "Energy Consumption Versus Greenhouse Gas Emissions in EU," European Research Studies Journal, European Research Studies Journal, vol. 0(3), pages 185-198.
    17. Jon Ellingsen & Vegard H. Larsen & Leif Anders Thorsrud, 2020. "News Media vs. FRED-MD for Macroeconomic Forecasting," CESifo Working Paper Series 8639, CESifo.
    18. Sokhna Dieng & Pierre Michel & Abdoulaye Guindo & Kankoe Sallah & El-Hadj Ba & Badara Cissé & Maria Patrizia Carrieri & Cheikh Sokhna & Paul Milligan & Jean Gaudart, 2020. "Application of Functional Data Analysis to Identify Patterns of Malaria Incidence, to Guide Targeted Control Strategies," IJERPH, MDPI, vol. 17(11), pages 1-23, June.
    19. Jill F. Lundell & Brennan Bean & Jürgen Symanzik, 2023. "Let’s talk about the weather: a cluster-based approach to weather forecast accuracy," Computational Statistics, Springer, vol. 38(3), pages 1135-1155, September.
    20. Dong, Xinghui & Li, Jia & Gao, Di & Zheng, Kai, 2020. "Wind speed modeling for cascade clusters of wind turbines part 1: The cascade clusters of wind turbines," Energy, Elsevier, vol. 205(C).
