Composite Likelihood Modeling of Neighboring Site Correlations of DNA Sequence Substitution Rates

Authors

  • Deng, Ling

    (Johnson & Johnson)

  • Moore, Dirk F.

    (University of Medicine and Dentistry of New Jersey)

Abstract

Sequence data from a series of homologous DNA segments from related organisms are typically polymorphic at many sites, and these polymorphisms are the result of evolutionary processes. Such data may be used to estimate the substitution rates as well as the variability of these rates. Careful characterization of the distribution of this variation is essential for accurate estimation of evolutionary distances and for phylogeny reconstruction among these sequences. Many researchers have recognized the importance of the variability of substitution rates, which most have modeled using a discrete gamma distribution. Some have extended these methods to explicitly account for the correlation of substitution rates among sites using hidden Markov models; others have proposed context-dependent substitution rate schemes. We accommodate these correlations using a composite likelihood method based on a bivariate gamma distribution, which is more flexible than hidden Markov models in terms of correlation structure and more computationally tractable than the context-dependent schemes. We show that the estimates have good theoretical properties. We also use simulations to compare the maximum composite likelihood estimates to those obtained from maximum likelihood based on the independence assumption. We use data from the mitochondrial DNA of ten primates to obtain maximum composite likelihood estimates of the mean substitution rate, overdispersion, and correlation parameters, and use these estimates in a parametric phylogenetic bootstrap to assess the impact of serial correlation on the estimates of substitution rates and branch lengths.
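
To make the idea of serially correlated, gamma-distributed substitution rates concrete, the following is a minimal Python sketch. It is not the bivariate gamma model or the composite likelihood estimator of the article. It assumes a simple shared-component construction in which each site's rate is the sum of one site-specific gamma variable and two gamma variables shared with its left and right neighbors. The function names `simulate_rates` and `lag_correlation`, and the parameters `shape` and `rho`, are hypothetical and chosen only for illustration. Under this assumption each rate is marginally gamma with mean 1 and variance 1/shape, neighboring sites have correlation `rho` (at most 0.5), and sites two or more positions apart are independent.

```python
import random


def simulate_rates(n_sites, shape, rho, seed=0):
    """Simulate per-site rates with Gamma(shape, scale=1/shape) marginals.

    Illustrative assumption (not the article's model): the rate at site i is
    own[i] + shared[i] + shared[i + 1], so adjacent sites share exactly one
    gamma component and have correlation rho, where 0 <= rho <= 0.5.
    """
    if shape <= 0:
        raise ValueError("shape must be positive")
    if not 0.0 <= rho <= 0.5:
        raise ValueError("this construction only supports 0 <= rho <= 0.5")
    rng = random.Random(seed)
    scale = 1.0 / shape                       # gives a marginal mean of 1
    shared_shape = rho * shape                # shape of each shared component
    own_shape = shape - 2.0 * shared_shape    # shape of the site-specific part

    def draw(k):
        # A gamma variable with shape 0 is degenerate at 0.
        return rng.gammavariate(k, scale) if k > 0 else 0.0

    shared = [draw(shared_shape) for _ in range(n_sites + 1)]
    return [draw(own_shape) + shared[i] + shared[i + 1] for i in range(n_sites)]


def lag_correlation(values, lag=1):
    """Empirical correlation between values[i] and values[i + lag]."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    cov = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    ) / (n - lag)
    return cov / var


if __name__ == "__main__":
    rates = simulate_rates(n_sites=50000, shape=2.0, rho=0.3, seed=1)
    print("mean rate:", sum(rates) / len(rates))
    print("lag-1 correlation:", lag_correlation(rates, lag=1))
    print("lag-2 correlation:", lag_correlation(rates, lag=2))
```

An estimator that treats sites as independent ignores the lag-1 dependence simulated here, whereas a pairwise composite likelihood would sum log-densities over adjacent site pairs so that the correlation parameter becomes estimable.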

Suggested Citation

  • Deng, Ling & Moore, Dirk F., 2009. "Composite Likelihood Modeling of Neighboring Site Correlations of DNA Sequence Substitution Rates," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-22, January.
  • Handle: RePEc:bpj:sagmbi:v:8:y:2009:i:1:n:6
    DOI: 10.2202/1544-6115.1391

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Cristiano Varin, 2008. "On composite marginal likelihoods," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 92(1), pages 1-28, February.
    2. Bhat, Chandra R. & Sener, Ipek N. & Eluru, Naveen, 2010. "A flexible spatially dependent discrete choice model: Formulation and application to teenagers' weekday recreational activity participation," Transportation Research Part B: Methodological, Elsevier, vol. 44(8-9), pages 903-921, September.
    3. Papageorgiou, Ioulia & Moustaki, Irini, 2019. "Sampling of pairs in pairwise likelihood estimation for latent variable models with categorical observed variables," LSE Research Online Documents on Economics 87592, London School of Economics and Political Science, LSE Library.
    4. Ipek Sener & Chandra Bhat, 2012. "Flexible spatial dependence structures for unordered multinomial choice models: formulation and application to teenagers’ activity participation," Transportation, Springer, vol. 39(3), pages 657-683, May.
    5. Vassilis Vasdekis & Silvia Cagnone & Irini Moustaki, 2012. "A Composite Likelihood Inference in Latent Variable Models for Ordinal Longitudinal Responses," Psychometrika, Springer;The Psychometric Society, vol. 77(3), pages 425-441, July.
    6. Nobel, Anne & Lizin, Sebastien & Malina, Robert, 2023. "What drives the designation of protected areas? Accounting for spatial dependence using a composite marginal likelihood approach," Ecological Economics, Elsevier, vol. 205(C).
    7. Chandra R. Bhat & Subodh K. Dubey & Mohammad Jobair Bin Alam & Waleed H. Khushefati, 2015. "A New Spatial Multiple Discrete-Continuous Modeling Approach To Land Use Change Analysis," Journal of Regional Science, Wiley Blackwell, vol. 55(5), pages 801-841, November.
    8. Chaoubi, Ihsan & Cossette, Hélène & Marceau, Etienne & Robert, Christian Y., 2021. "Hierarchical copulas with Archimedean blocks and asymmetric between-block pairs," Computational Statistics & Data Analysis, Elsevier, vol. 154(C).
    9. Hung‐pin Lai & Subal C. Kumbhakar, 2020. "Estimation of a dynamic stochastic frontier model using likelihood‐based approaches," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 35(2), pages 217-247, March.
    10. Stanislav Anatolyev & Renat Khabibullin & Artem Prokhorov, 2012. "Reconstructing high dimensional dynamic distributions from distributions of lower dimension," Working Papers 12003, Concordia University, Department of Economics.
    11. Larribe Fabrice & Lessard Sabin, 2008. "A Composite-Conditional-Likelihood Approach for Gene Mapping Based on Linkage Disequilibrium in Windows of Marker Loci," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 7(1), pages 1-33, August.
    12. Gourieroux, C. & Monfort, A., 2018. "Composite indirect inference with application to corporate risks," Econometrics and Statistics, Elsevier, vol. 7(C), pages 30-45.
    13. Liu, Haibin & Davidson, Rachel A. & Apanasovich, Tatiyana V., 2008. "Spatial generalized linear mixed models of electric power outages due to hurricanes and ice storms," Reliability Engineering and System Safety, Elsevier, vol. 93(6), pages 897-912.
    14. Li Liu & Liming Xiang, 2014. "Semiparametric estimation in generalized linear mixed models with auxiliary covariates: A pairwise likelihood approach," Biometrics, The International Biometric Society, vol. 70(4), pages 910-919, December.
    15. Joe, Harry & Lee, Youngjo, 2009. "On weighting of bivariate margins in pairwise likelihood," Journal of Multivariate Analysis, Elsevier, vol. 100(4), pages 670-685, April.
    16. Varin, Cristiano & Host, Gudmund & Skare, Oivind, 2005. "Pairwise likelihood inference in spatial generalized linear mixed models," Computational Statistics & Data Analysis, Elsevier, vol. 49(4), pages 1173-1191, June.
    17. Bhat, Chandra R., 2011. "The maximum approximate composite marginal likelihood (MACML) estimation of multinomial probit-based unordered response choice models," Transportation Research Part B: Methodological, Elsevier, vol. 45(7), pages 923-939, August.
    18. Qiurong Cui & Zhengjun Zhang, 2018. "Max-Linear Competing Factor Models," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 36(1), pages 62-74, January.
    19. Bhat, Chandra R. & Astroza, Sebastian & Hamdi, Amin S., 2017. "A spatial generalized ordered-response model with skew normal kernel error terms with an application to bicycling frequency," Transportation Research Part B: Methodological, Elsevier, vol. 95(C), pages 126-148.
    20. Nuo Xi & Michael W. Browne, 2014. "Contributions to the Underlying Bivariate Normal Method for Factor Analyzing Ordinal Data," Journal of Educational and Behavioral Statistics, vol. 39(6), pages 583-611, December.
