seq2R: An R Package to Detect Change Points in DNA Sequences

Author

Listed:
  • Nora M. Villanueva

    (Centro de Investigación en Nanomateriais e Biomedicina (CINBIO), Universidade de Vigo, 36310 Vigo, Spain
    Department of Statistics and Operations Research, SIDOR Research Group, University of Vigo, 36310 Vigo, Spain
    These authors contributed equally to this work.)

  • Marta Sestelo

    (CITMAga, 15782 Santiago de Compostela, Spain
    Department of Statistics and Operations Research, SIDOR Research Group, University of Vigo, 36310 Vigo, Spain
    These authors contributed equally to this work.)

  • Miguel M. Fonseca

    (Department of Biochemistry, Genetics and Immunology, 36310 Vigo, Spain)

  • Javier Roca-Pardiñas

    (CITMAga, 15782 Santiago de Compostela, Spain
    Department of Statistics and Operations Research, SIDOR Research Group, University of Vigo, 36310 Vigo, Spain)

Abstract

Identifying the mutational processes that shape the nucleotide composition of the mitochondrial genome (mtDNA) is fundamental to better understand how these genomes evolve. Several methods have been proposed to analyze DNA sequence nucleotide composition and skewness, but most of them lack any measurement of statistical support or were not developed taking into account the specificities of mitochondrial genomes. A new methodology is presented, which is specifically developed for mtDNA to detect compositional changes or asymmetries (AT and CG skews) based on nonparametric regression models and their derivatives. The proposed method also includes the construction of confidence intervals, which are built using bootstrap techniques. This paper introduces an R package, known as seq2R, that implements the proposed methodology. Moreover, an illustration of the use of seq2R is provided using real data, specifically two publicly available complete mtDNAs: the human (Homo sapiens) sequence and a nematode (Radopholus similis) mitogenome sequence.
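As a rough illustration of the compositional asymmetries mentioned in the abstract, the sketch below computes AT and CG skews over fixed windows of a toy sequence. It assumes the conventional definitions, AT skew = (A − T)/(A + T) and CG skew = (C − G)/(C + G). It is not the seq2R API and does not reproduce the paper's method: there is no nonparametric regression, no derivative estimation, and no bootstrap confidence interval. The function names, the toy sequence, and the window size are hypothetical and chosen only for illustration.

```python
from collections import Counter


def skew(window, plus, minus):
    """Return (plus - minus) / (plus + minus) for one window.

    Returns None when neither base occurs, since the ratio is undefined.
    """
    counts = Counter(window.upper())
    total = counts[plus] + counts[minus]
    if total == 0:
        return None
    return (counts[plus] - counts[minus]) / total


def windowed_skews(seq, size, step):
    """Return a list of (start, AT skew, CG skew), one tuple per full window."""
    rows = []
    for start in range(0, len(seq) - size + 1, step):
        window = seq[start:start + size]
        rows.append((start, skew(window, "A", "T"), skew(window, "C", "G")))
    return rows


# Hypothetical toy sequence: the first half is A-rich, the second half T- and G-rich.
toy = "AAAATTCCGG" + "ATTTTCGGGG"
result = windowed_skews(toy, size=10, step=10)

for start, at, cg in result:
    print(f"window at {start}: AT skew = {at:.3f}, CG skew = {cg:.3f}")
```

In this toy input the AT skew changes sign between the two windows, which is the kind of compositional shift a change-point method would aim to locate and assess statistically.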

Suggested Citation

  • Nora M. Villanueva & Marta Sestelo & Miguel M. Fonseca & Javier Roca-Pardiñas, 2023. "seq2R: An R Package to Detect Change Points in DNA Sequences," Mathematics, MDPI, vol. 11(10), pages 1-20, May.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:10:p:2299-:d:1147358

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/10/2299/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/10/2299/
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lindeløv, Jonas Kristoffer, 2020. "mcp: An R Package for Regression With Multiple Change Points," OSF Preprints fzqxv, Center for Open Science.
    2. Zhang, Wenjia & Wu, Yulin & Deng, Guobang, 2024. "Social and spatial disparities in individuals’ mobility response time to COVID-19: A big data analysis incorporating changepoint detection and accelerated failure time models," Transportation Research Part A: Policy and Practice, Elsevier, vol. 184(C).
    3. Ross, Gordon J., 2015. "Parametric and Nonparametric Sequential Change Detection in R: The cpm Package," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 66(i03).
    4. repec:jss:jstsof:23:i03 is not listed on IDEAS
    5. Rui Qiang & Eric Ruggieri, 2023. "Autocorrelation and Parameter Estimation in a Bayesian Change Point Model," Mathematics, MDPI, vol. 11(5), pages 1-22, February.
    6. Ruggieri, Eric & Antonellis, Marcus, 2016. "An exact approach to Bayesian sequential change point detection," Computational Statistics & Data Analysis, Elsevier, vol. 97(C), pages 71-86.
    7. Patrik Nosil & Zachariah Gompert & Daniel J. Funk, 2024. "Divergent dynamics of sexual and habitat isolation at the transition between stick insect populations and species," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    8. Salvatore Fasola & Vito M. R. Muggeo & Helmut Küchenhoff, 2018. "A heuristic, iterative algorithm for change-point detection in abrupt change models," Computational Statistics, Springer, vol. 33(2), pages 997-1015, June.
    9. James Nolan & Zoe Laulederkind, 2022. "Plane to See? Empirical Analysis of the 1999–2006 Air Cargo Cartel," Advances in Airline Economics, in: The International Air Cargo Industry, volume 9, pages 241-262, Emerald Group Publishing Limited.
    10. Zeileis, Achim, 2004. "Econometric Computing with HC and HAC Covariance Matrix Estimators," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 11(i10).
    11. Amey Sapre, 2014. "Madhya Pradesh: Does Agriculture Determine the State’s Growth Trajectory?," Margin: The Journal of Applied Economic Research, National Council of Applied Economic Research, vol. 8(1), pages 39-57, February.
    12. Jung, R.C. & Maderitsch, R., 2014. "Structural breaks in volatility spillovers between international financial markets: Contagion or mere interdependence?," Journal of Banking & Finance, Elsevier, vol. 47(C), pages 331-342.
    13. Ashok Chanabasangouda Patil & Shailesh Rastogi, 2020. "Multifractal Analysis of Market Efficiency across Structural Breaks: Implications for the Adaptive Market Hypothesis," JRFM, MDPI, vol. 13(10), pages 1-18, October.
    14. Addona Vittorio & Yates Philip A, 2010. "A Closer Look at the Relative Age Effect in the National Hockey League," Journal of Quantitative Analysis in Sports, De Gruyter, vol. 6(4), pages 1-19, October.
    15. Daniel Mantilla-García & Vijay Vaidyanathan, 2017. "Predicting stock returns in the presence of uncertain structural changes and sample noise," Financial Markets and Portfolio Management, Springer;Swiss Society for Financial Market Research, vol. 31(3), pages 357-391, August.
    16. DIMA, Bogdan & DIMA, Ştefana Maria & IOAN, Roxana, 2021. "Remarks on the behaviour of financial market efficiency during the COVID-19 pandemic. The case of VIX," Finance Research Letters, Elsevier, vol. 43(C).
    17. repec:jss:jstsof:11:i10 is not listed on IDEAS
    18. Abhijit Sharma & Kelvin G Balcombe & Iain M Fraser, 2009. "Non-renewable resource prices: Structural breaks and long term trends," Economics Bulletin, AccessEcon, vol. 29(2), pages 805-819.
    19. Kholodilin Konstantin Arkadievich & Siliverstovs Boriss, 2006. "On the Forecasting Properties of the Alternative Leading Indicators for the German GDP: Recent Evidence," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 226(3), pages 234-259, June.
    20. Edoardo Rainone, 2021. "Identifying deposits' outflows in real-time," Temi di discussione (Economic working papers) 1319, Bank of Italy, Economic Research and International Relations Area.
    21. Ricardo C. Pedroso & Rosangela H. Loschi & Fernando Andrés Quintana, 2023. "Multipartition model for multiple change point identification," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(2), pages 759-783, June.
    22. Fabio Clementi & Marco Gallegati & Mauro Gallegati, 2015. "Growth and Cycles of the Italian Economy Since 1861: The New Evidence," Italian Economic Journal: A Continuation of Rivista Italiana degli Economisti and Giornale degli Economisti, Springer;Società Italiana degli Economisti (Italian Economic Association), vol. 1(1), pages 25-59, March.
