Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i24p3239-d702479.html

Analysing the Protein-DNA Binding Sites in Arabidopsis thaliana from ChIP-seq Experiments

Authors

  • Ginés Almagro-Hernández

    (Departamento de Informática y Sistemas, Universidad de Murcia, CEIR Campus Mare Nostrum, 30100 Murcia, Spain
    Instituto Murciano de Investigación Biosanitaria (IMIB-Arrixaca), 30120 Murcia, Spain)

  • Juana-María Vivo

    (Instituto Murciano de Investigación Biosanitaria (IMIB-Arrixaca), 30120 Murcia, Spain
    Departamento de Estadística e Investigación Operativa, Universidad de Murcia, CEIR Campus Mare Nostrum, 30100 Murcia, Spain)

  • Manuel Franco

    (Instituto Murciano de Investigación Biosanitaria (IMIB-Arrixaca), 30120 Murcia, Spain
    Departamento de Estadística e Investigación Operativa, Universidad de Murcia, CEIR Campus Mare Nostrum, 30100 Murcia, Spain)

  • Jesualdo Tomás Fernández-Breis

    (Departamento de Informática y Sistemas, Universidad de Murcia, CEIR Campus Mare Nostrum, 30100 Murcia, Spain
    Instituto Murciano de Investigación Biosanitaria (IMIB-Arrixaca), 30120 Murcia, Spain)

Abstract

Computational genomics aims to support the discovery of how the functionality of an organism's genome is affected both by its own sequence and structure and by the network of interactions between that genome and various biological or physical factors. In this work, we focus on the analysis of ChIP-seq data, for which many methods have been proposed in recent years. However, to the best of our knowledge, those methods lack an appropriate mathematical formalism. We have developed a method based on multivariate models for the analysis of the set of peaks obtained from a ChIP-seq experiment. This method can be used to characterize an individual experiment and to compare different experiments regardless of where and when they were conducted. The method is based on a multivariate hypergeometric distribution, which fits the complexity of the biological data and is better suited to dealing with the uncertainty generated in this type of experiment than the dichotomous models used by state-of-the-art methods. We have validated this method with Arabidopsis thaliana datasets obtained from the Remap2020 database, obtaining results in accordance with the original study of these samples. Our work shows a novel way of analyzing ChIP-seq data.
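As a point of reference for the distribution named in the abstract, the following is a minimal sketch of the multivariate hypergeometric probability mass function. It is not the authors' method or code: the function name, the three genomic categories, and the region counts are hypothetical and chosen only for illustration. The sketch assumes a population of genomic regions split into categories of known sizes, from which a fixed number of peaks is drawn without replacement.

```python
from math import comb


def multivariate_hypergeom_pmf(counts, sizes):
    """Probability of drawing exactly counts[i] items from category i.

    sizes[i]  : number of items of category i in the population
    counts[i] : number of drawn items that belong to category i
    The sample size is sum(counts); sampling is without replacement.
    """
    if len(counts) != len(sizes):
        raise ValueError("counts and sizes must have the same length")
    # An outcome that draws more items than a category holds is impossible.
    if any(k < 0 or k > m for k, m in zip(counts, sizes)):
        return 0.0
    numerator = 1
    for k, m in zip(counts, sizes):
        numerator *= comb(m, k)  # ways to pick k items from category i
    # Denominator: ways to pick the whole sample from the whole population.
    return numerator / comb(sum(sizes), sum(counts))


# Hypothetical example (illustrative numbers, not data from the paper):
# 30 regions split into promoter, gene body and intergenic categories.
region_sizes = (5, 10, 15)
# Four peaks observed: 2 in promoters, 1 in a gene body, 1 intergenic.
peak_counts = (2, 1, 1)
probability = multivariate_hypergeom_pmf(peak_counts, region_sizes)
```

Unlike a dichotomous (two-category) hypergeometric model, this form assigns a joint probability to the counts in all categories at once, which is the property the abstract contrasts with the models used by existing methods.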

Suggested Citation

  • Ginés Almagro-Hernández & Juana-María Vivo & Manuel Franco & Jesualdo Tomás Fernández-Breis, 2021. "Analysing the Protein-DNA Binding Sites in Arabidopsis thaliana from ChIP-seq Experiments," Mathematics, MDPI, vol. 9(24), pages 1-26, December.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:24:p:3239-:d:702479

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/24/3239/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/24/3239/
    Download Restriction: no

    References listed on IDEAS

    1. Boland, Philip J. & Proschan, Frank, 1987. "Schur convexity of the maximum likelihood function for the multivariate hypergeometric and multinomial distributions," Statistics & Probability Letters, Elsevier, vol. 5(5), pages 317-322, August.
    2. J. Hemelrijk, 1967. "The hypergeometric, the normal and chi‐squared," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 21(3‐4), pages 225-228, September.
    3. Childs, Aaron & Balakrishnan, N., 2000. "Some approximations to the multivariate hypergeometric distribution with applications to hypothesis testing," Computational Statistics & Data Analysis, Elsevier, vol. 35(2), pages 137-154, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chu, Yu-Ming & Xia, Wei-Feng & Zhang, Xiao-Hui, 2012. "The Schur concavity, Schur multiplicative and harmonic convexities of the second dual form of the Hamy symmetric function with applications," Journal of Multivariate Analysis, Elsevier, vol. 105(1), pages 412-421.
    2. Requena, F. & Ciudad, N. Martin, 2000. "Characterization of maximum probability points in the Multivariate Hypergeometric distribution," Statistics & Probability Letters, Elsevier, vol. 50(1), pages 39-47, October.
    3. Eisuke Hida & Masafumi Akahira, 2003. "An approximation to the generalized hypergeometric distribution," Statistical Papers, Springer, vol. 44(4), pages 483-497, October.
