Deep-sequencing of viral genomes from a large and diverse cohort of treatment-naive HIV-infected persons shows associations between intrahost genetic diversity and viral load.

Gabrielaite, Migle; Bennedbæk, Marc; Rasmussen, Malthe Sebro; Kan, Virginia; Furrer, Hansjakob; Flisiak, Robert; Losso, Marcelo; Lundgren, Jens D; Marvig, Rasmus L (2023). Deep-sequencing of viral genomes from a large and diverse cohort of treatment-naive HIV-infected persons shows associations between intrahost genetic diversity and viral load. PLoS computational biology, 19(1), e1010756. Public Library of Science 10.1371/journal.pcbi.1010756

journal.pcbi.1010756.pdf - Published Version
Available under License Creative Commons: Attribution (CC-BY).


Infection with human immunodeficiency virus type 1 (HIV) typically results from transmission of a small and genetically uniform viral population. Following transmission, the virus population becomes more diverse because of recombination and acquired mutations through genetic drift and selection. Viral intrahost genetic diversity remains a major obstacle to curing HIV; however, the association between intrahost diversity and disease progression markers has not been investigated in large and diverse cohorts for which the majority of the genome has been deep-sequenced. Viral load (VL) is a key progression marker, and understanding its relationship to viral intrahost genetic diversity could help design future strategies for HIV monitoring and treatment.


We analysed deep-sequenced viral genomes from 2,650 treatment-naive HIV-infected persons to measure the intrahost genetic diversity of 2,447 genomic codon positions as calculated by Shannon entropy. We tested for associations between VL and amino acid (AA) entropy accounting for sex, age, race, duration of infection, and HIV population structure.
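To illustrate the diversity measure named above, here is a minimal sketch of Shannon entropy computed over the amino acids observed at a single codon position. It is not the authors' pipeline: the function name `shannon_entropy`, the use of log base 2 (entropy in bits), and the toy read data are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(residues):
    """Shannon entropy of the amino acids observed at one codon position.

    `residues` is a hypothetical list with one amino acid letter per
    sequencing read covering the position. Log base 2 is an assumption;
    the source does not state which base was used.
    """
    counts = Counter(residues)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for n in counts.values():
        p = n / total  # observed frequency of this amino acid
        entropy -= p * math.log2(p)
    return entropy


# Toy data: a position where every read agrees, and one split evenly in two.
uniform_site = ["K"] * 100
mixed_site = ["K"] * 50 + ["R"] * 50

print(shannon_entropy(uniform_site))  # 0.0
print(shannon_entropy(mixed_site))    # 1.0
```

A position where all reads carry the same amino acid has zero entropy, and entropy grows as minority variants become more frequent, which is why it can serve as a per-position measure of intrahost diversity.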


We confirmed that the intrahost genetic diversity is highest in the env gene. Furthermore, we showed that mean Shannon entropy is significantly associated with VL, especially in infections of >24 months duration. We identified 16 significant associations (p-value < 2.0 × 10⁻⁵) between VL and Shannon entropy at AA positions, which in our association analysis explained 13% of the variance in VL. Finally, an equivalent analysis based on variation in HIV consensus sequences explained only 2% of VL variance.


Our results show that viral intrahost genetic diversity is associated with VL and could be a better disease progression marker than HIV consensus sequence variants, especially in infections of longer duration. We emphasize that viral intrahost diversity should be considered when studying viral genomes and infection outcomes.


Samples included in this study were derived from participants who consented to take part in the START clinical trial (NCT00867048) (23), run by the International Network for Strategic Initiatives in Global HIV Trials (INSIGHT). All the participant sites are listed here:

Item Type: Journal Article (Original Article)


04 Faculty of Medicine > Department of Haematology, Oncology, Infectious Diseases, Laboratory Medicine and Hospital Pharmacy (DOLS) > Clinic of Infectiology

UniBE Contributor: Furrer, Hansjakob


600 Technology > 610 Medicine & health





Date Deposited: 11 Jan 2023 14:38

Last Modified: 14 Jan 2023 00:17
