
Linkage of Viral Sequences among HIV-Infected Village Residents in Botswana: Estimation of Linkage Rates in the Presence of Missing Data

Authors

Listed:
  • Nicole Bohme Carnegie
  • Rui Wang
  • Vladimir Novitsky
  • Victor De Gruttola

Abstract

Linkage analysis is useful in investigating disease transmission dynamics and the effect of interventions on them, but estimates of probabilities of linkage between infected people from observed data can be biased downward when missingness is informative. We investigate variation in the rates at which subjects' viral genotypes link across groups defined by viral load (low/high) and antiretroviral treatment (ART) status using blood samples from household surveys in the Northeast sector of Mochudi, Botswana. The probability of obtaining a sequence from a sample varies with viral load; samples with low viral load are harder to amplify. Pairwise genetic distances were estimated from aligned nucleotide sequences of HIV-1C env gp120. It is first shown that the probability that randomly selected sequences are linked can be estimated consistently from observed data. This is then used to develop estimates of the probability that a sequence from one group links to at least one sequence from another group under the assumption of independence across pairs. Furthermore, a resampling approach is developed that accounts for the presence of correlation across pairs, with diagnostics for assessing the reliability of the method. Sequences were obtained for 65% of subjects with high viral load (HVL, n = 117), 54% of subjects with low viral load but not on ART (LVL, n = 180), and 45% of subjects on ART (ART, n = 126). The probability of linkage between two individuals is highest if both have HVL, and lowest if one has LVL and the other has LVL or is on ART. Linkage across groups is high for HVL and lower for LVL and ART. Adjustment for missing data increases the group-wise linkage rates by 40–100%, and changes the relative rates between groups. 
Bias in inferences regarding HIV viral linkage that arises from differential ability to genotype samples can be reduced by appropriate methods for accommodating missing data.

Author Summary: The analysis of viral genomes has great potential for investigating transmission of disease, including the identification of risk factors and transmission clusters, and can thereby aid in targeting interventions. To make use of genetic data in this way, it is necessary to make inferences about population-level patterns of viral linkage. As with any rigorous statistical inference from sampled data to a population, it is important to consider the effect of the sampling strategy and the occurrence of missing data on the final inferences made. In this paper we highlight the effects of missing data on the resulting estimates of population-level linkage rates and develop methods for adjusting for the presence of missing data. As an example, we consider comparing the rates of linkage of HIV sequences from subjects with high viral load, low viral load, or on antiretroviral treatment, and show that comparative inferences are compromised when adjustment is not made for missing sequences and that bias in inferences can be reduced with proper adjustment.
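
The abstract's step from a pairwise linkage probability to the probability of linking to at least one sequence in another group, under independence across pairs, can be illustrated with a minimal Python sketch. This is not the authors' estimator: it only applies the standard identity 1 - (1 - p)^n for n independent pairs. The function name prob_any_link and the pairwise probability p_pair are hypothetical, and the sketch assumes that n = 117 is the number of HVL subjects, of whom 65% were sequenced.

```python
def prob_any_link(p_pair: float, n_partners: int) -> float:
    """Probability of at least one link among n_partners independent pairs.

    Assumes every pair links with the same probability p_pair and that
    pairs are independent (the abstract's independence assumption).
    """
    if not 0.0 <= p_pair <= 1.0:
        raise ValueError("p_pair must lie in [0, 1]")
    if n_partners < 0:
        raise ValueError("n_partners must be non-negative")
    return 1.0 - (1.0 - p_pair) ** n_partners


# Group size and sequencing fraction for the HVL group, as read from the abstract
# (assumption: n = 117 counts subjects, not sequences).
n_total = 117
n_sequenced = round(0.65 * n_total)

# Hypothetical pairwise linkage probability, chosen only for illustration.
p_pair = 0.005

# Counting only the sequenced partners ignores links to unsequenced subjects.
naive = prob_any_link(p_pair, n_sequenced)

# Counting all partners in the group accounts for the missing sequences.
adjusted = prob_any_link(p_pair, n_total)

print(f"sequenced partners only: {naive:.3f}")
print(f"all partners in group:   {adjusted:.3f}")
```

Because 1 - (1 - p)^n increases with n, counting only sequenced partners understates the group-wise linkage rate, which matches the direction of the adjustment reported in the abstract. The sketch does not cover correlation across pairs, which the paper handles with a resampling approach.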

Suggested Citation

  • Nicole Bohme Carnegie & Rui Wang & Vladimir Novitsky & Victor De Gruttola, 2014. "Linkage of Viral Sequences among HIV-Infected Village Residents in Botswana: Estimation of Linkage Rates in the Presence of Missing Data," PLOS Computational Biology, Public Library of Science, vol. 10(1), pages 1-16, January.
  • Handle: RePEc:plo:pcbi00:1003430
    DOI: 10.1371/journal.pcbi.1003430

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003430
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1003430&type=printable
    Download Restriction: no

