
Exploring British accents: Modelling the trap–bath split with functional data analysis

Author

Listed:
  • Aranya Koshy
  • Shahin Tavakoli

Abstract

The sound of our speech is influenced by the places we come from. Great Britain contains a wide variety of distinctive accents which are of interest to linguistics. In particular, the ‘a’ vowel in words like ‘class’ is pronounced differently in the North and the South. Speech recordings of this vowel can be represented as formant curves or as mel‐frequency cepstral coefficient curves. Functional data analysis and generalised additive models offer techniques to model the variation in these curves. Our first aim is to model the difference between typical Northern and Southern vowels /æ/ and /ɑ/, by training two classifiers on the North‐South Class Vowels dataset collected for this paper. Our second aim is to visualise geographical variation of accents in Great Britain. For this we use speech recordings from a second dataset, the British National Corpus (BNC) audio edition. The trained models are used to predict the accent of speakers in the BNC, and then we model the geographical patterns in these predictions using a soap film smoother. This work demonstrates a flexible and interpretable approach to modelling phonetic accent variation in speech recordings.
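
To make the idea of classifying vowel recordings as curves more concrete, the following is a minimal sketch of one very simple functional classifier: nearest-centroid classification under the L2 distance between discretised curves. It is not the authors' method (the abstract describes classifiers built with functional data analysis and generalised additive models). The synthetic curves, the labels 0 and 1 standing in for two accent classes, and all function names are assumptions made purely for illustration.

```python
import math
import random


def make_curve(label, n_points=20, rng=None):
    """Hypothetical 'formant-like' curve on [0, 1].

    Label 1 curves have a larger mid-vowel bump than label 0 curves;
    this is synthetic data, not real speech.
    """
    rng = rng or random
    shift = 1.0 if label == 1 else 0.0
    return [
        math.sin(math.pi * i / (n_points - 1)) * (1.0 + shift)
        + rng.gauss(0.0, 0.2)
        for i in range(n_points)
    ]


def mean_curve(curves):
    """Pointwise mean of curves sampled on a common grid."""
    n = len(curves)
    return [sum(c[i] for c in curves) / n for i in range(len(curves[0]))]


def l2_distance_sq(f, g):
    """Trapezoidal approximation to the integral of (f - g)^2 over [0, 1]."""
    d = [(a - b) ** 2 for a, b in zip(f, g)]
    h = 1.0 / (len(d) - 1)
    return h * (sum(d) - 0.5 * (d[0] + d[-1]))


def classify(curve, centroids):
    """Return the label whose mean curve is closest in L2 distance."""
    return min(centroids, key=lambda lab: l2_distance_sq(curve, centroids[lab]))


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# "Train": estimate one mean curve per class from labelled curves.
train = [(make_curve(lab, rng=rng), lab) for lab in (0, 1) for _ in range(30)]
centroids = {
    lab: mean_curve([c for c, l in train if l == lab]) for lab in (0, 1)
}

# "Predict": assign held-out curves to the nearest class mean.
test = [(make_curve(lab, rng=rng), lab) for lab in (0, 1) for _ in range(20)]
accuracy = sum(classify(c, centroids) == lab for c, lab in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In this toy setting the two classes are well separated, so the held-out accuracy should be high; real formant or MFCC curves would need alignment, smoothing, and a richer model such as those named in the abstract.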

Suggested Citation

  • Aranya Koshy & Shahin Tavakoli, 2022. "Exploring British accents: Modelling the trap–bath split with functional data analysis," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(4), pages 773-805, August.
  • Handle: RePEc:bla:jorssc:v:71:y:2022:i:4:p:773-805
    DOI: 10.1111/rssc.12555

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssc.12555
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Timothy Riffe & Rustam Tursun-Zade & Sergi Trias Llimós, 2024. "Arriaga meets Kitagawa: life expectancy decomposition with population subgroups," MPIDR Working Papers WP-2024-029, Max Planck Institute for Demographic Research, Rostock, Germany.
    2. Gerhard Tutz & Moritz Berger, 2018. "Tree-structured modelling of categorical predictors in generalized additive regression," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(3), pages 737-758, September.
    3. Thomas A. Verschut & Renny Ng & Nicolas P. Doubovetzky & Guillaume Calvez & Jan L. Sneep & Adriaan J. Minnaard & Chih-Ying Su & Mikael A. Carlsson & Bregje Wertheim & Jean-Christophe Billeter, 2023. "Aggregation pheromones have a non-linear effect on oviposition behavior in Drosophila melanogaster," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    4. Tommaso Luzzati & Angela Parenti & Tommaso Rughi, 2017. "Spatial error regressions for testing the Cancer-EKC," Discussion Papers 2017/218, Dipartimento di Economia e Management (DEM), University of Pisa, Pisa, Italy.
    5. Markus Sihvonen, 2024. "Yield curve momentum," Review of Finance, European Finance Association, vol. 28(3), pages 805-830.
    6. Davide Fiaschi & Andrea Mario Lavezzi & Angela Parenti, 2020. "Deep and Proximate Determinants of the World Income Distribution," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 66(3), pages 677-710, September.
    7. Cederbaum, Jona & Scheipl, Fabian & Greven, Sonja, 2018. "Fast symmetric additive covariance smoothing," Computational Statistics & Data Analysis, Elsevier, vol. 120(C), pages 25-41.
    8. Cai Li & Luo Xiao & Sheng Luo, 2022. "Joint model for survival and multivariate sparse functional data with application to a study of Alzheimer's Disease," Biometrics, The International Biometric Society, vol. 78(2), pages 435-447, June.
    9. Umlauf, Nikolaus & Adler, Daniel & Kneib, Thomas & Lang, Stefan & Zeileis, Achim, 2015. "Structured Additive Regression Models: An R Interface to BayesX," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 63(i21).
    10. T Masak & S Sarkar & V M Panaretos, 2023. "Separable expansions for covariance estimation via the partial inner product," Biometrika, Biometrika Trust, vol. 110(1), pages 225-247.
    11. Simon N. Wood, 2020. "Inference and computation with generalized additive models and their extensions," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(2), pages 307-339, June.
    12. Amira Elayouty & Marian Scott & Claire Miller, 2022. "Time-Varying Functional Principal Components for Non-Stationary EpCO₂ in Freshwater Systems," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 27(3), pages 506-522, September.
    13. Konrad Menzel, 2023. "Transfer Estimates for Causal Effects across Heterogeneous Sites," Papers 2305.01435, arXiv.org, revised May 2024.
    14. Xiuli Du & Xiaohu Jiang & Jinguan Lin, 2023. "Multinomial Logistic Factor Regression for Multi-source Functional Block-wise Missing Data," Psychometrika, Springer;The Psychometric Society, vol. 88(3), pages 975-1001, September.
    15. Conor Waldock & Bernhard Wegscheider & Dario Josi & Bárbara Borges Calegari & Jakob Brodersen & Luiz Jardim de Queiroz & Ole Seehausen, 2024. "Deconstructing the geography of human impacts on species’ natural distribution," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    16. Cai, Leheng & Hu, Qirui, 2024. "Simultaneous inference and uniform test for eigensystems of functional data," Computational Statistics & Data Analysis, Elsevier, vol. 192(C).
    17. Paul Ghelasi & Florian Ziel, 2024. "From day-ahead to mid and long-term horizons with econometric electricity price forecasting models," Papers 2406.00326, arXiv.org, revised Aug 2024.
    18. Valtiala, Juho & Niskanen, Olli & Torvinen, Mikael & Riekkinen, Kirsikka & Suokannas, Antti, 2023. "The relationship between agricultural land parcel size and cultivation costs," Land Use Policy, Elsevier, vol. 131(C).
    19. Goldsmith, Jeff & Scheipl, Fabian, 2014. "Estimator selection and combination in scalar-on-function regression," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 362-372.
    20. Gosztonyi, Ákos & Demmler, Joanne C. & Juhola, Sirkku & Ala-Mantila, Sanna, 2023. "Ambient air pollution-related environmental inequality and environmental dissimilarity in Helsinki Metropolitan Area, Finland," Ecological Economics, Elsevier, vol. 213(C).
