Clusters of sub-Saharan African countries based on sociobehavioural characteristics and associated HIV incidence.

Merzouki, Aziza; Estill, Janne; Orel, Erol; Tal, Kali; Keiser, Olivia (2021). Clusters of sub-Saharan African countries based on sociobehavioural characteristics and associated HIV incidence. PeerJ, 9, e10660. PeerJ, Ltd 10.7717/peerj.10660

Merzouki_PeerJ_2021.pdf - Published Version
Available under License Creative Commons: Attribution (CC-BY).



HIV incidence varies widely between sub-Saharan African (SSA) countries. This variation coincides with substantial sociobehavioural heterogeneity, which complicates the design of effective interventions. In this study, we investigated how sociobehavioural heterogeneity in SSA could account for the variance in HIV incidence between countries.


We analysed national-level aggregated data from the most recent Demographic and Health Surveys of 29 SSA countries (2010-2017), which included 594,644 persons (183,310 men and 411,334 women). We preselected 48 demographic, socio-economic, behavioural and HIV-related attributes to describe each country. We used Principal Component Analysis to visualize sociobehavioural similarity between countries and to identify the variables that accounted for most of the sociobehavioural variance in SSA. We used hierarchical clustering to identify groups of countries with similar sociobehavioural profiles, and we compared the distribution of HIV incidence (estimates from UNAIDS) and of the sociobehavioural variables within each cluster.
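The two analysis steps described above can be illustrated with a minimal sketch. It is not the authors' code. The z-score standardisation, the Ward linkage, the function name `pca_and_clusters` and the synthetic input matrix are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def pca_and_clusters(X, n_components=2, n_clusters=3):
    """Project rows of X onto leading principal components and cluster them.

    X has one row per country and one column per attribute.
    Standardisation and Ward linkage are assumptions, not the paper's stated choices.
    """
    X = np.asarray(X, dtype=float)
    # z-score each attribute so that no variable dominates through its scale
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # PCA via singular value decomposition of the standardised matrix
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)           # variance share per component
    scores = U[:, :n_components] * s[:n_components]
    # Agglomerative (hierarchical) clustering, cut into at most n_clusters groups
    tree = linkage(Z, method="ward")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return scores, explained, labels


# Synthetic stand-in data: 15 hypothetical "countries", 4 hypothetical attributes,
# drawn around three well-separated centres. These are NOT survey values.
rng = np.random.default_rng(0)
centres = np.array([[0, 0, 0, 0], [10, 10, 10, 10], [-10, 10, -10, 10]])
X = np.vstack([c + rng.normal(size=(5, 4)) for c in centres])

scores, explained, labels = pca_and_clusters(X)
```

Here `scores` gives the coordinates for a two-dimensional similarity plot, `explained` gives the share of variance carried by each component, and `labels` assigns each row to a cluster.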


The most important characteristics, which explained 69% of the sociobehavioural variance across SSA among the variables we assessed, were: religion; male circumcision; number of sexual partners; literacy; uptake of HIV testing; women's empowerment; accepting attitudes toward people living with HIV/AIDS; rurality; ART coverage; and knowledge about AIDS. Our model revealed three groups of countries, each with a characteristic sociobehavioural profile. HIV incidence was mostly similar within each cluster and differed between clusters (median (IQR): 0.5/1000 (0.6/1000), 1.8/1000 (1.3/1000) and 5.0/1000 (4.2/1000)).
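A per-cluster summary of the "median (IQR)" form can be computed as in the short sketch below. It assumes that the IQR is reported as a width (third quartile minus first quartile), which is how the single figures in parentheses read. The helper name `median_iqr`, the quartile method and the incidence values are hypothetical and are not taken from the study.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, IQR), with the IQR expressed as the width Q3 - Q1."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q3 - q1


# Hypothetical incidence values per 1000 for two made-up clusters (not study data)
incidence_by_cluster = {
    "cluster A": [0.2, 0.4, 0.5, 0.9, 1.1],
    "cluster B": [3.0, 4.1, 5.0, 6.8, 9.2],
}

for name, values in incidence_by_cluster.items():
    med, iqr = median_iqr(values)
    print(f"{name}: {med:.1f}/1000 ({iqr:.1f}/1000)")
```

Different quartile conventions can give slightly different IQR values on small samples, so the method should be stated alongside the result.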


Our findings suggest that the combination of sociobehavioural factors plays a key role in determining the course of the HIV epidemic, and that similar techniques can help to predict the effects of behavioural change on the HIV epidemic and to design targeted interventions to impede HIV transmission in SSA.

Item Type:

Journal Article (Original Article)


04 Faculty of Medicine > Medical Education > Institute of General Practice and Primary Care (BIHAM)
08 Faculty of Science > Department of Mathematics and Statistics > Institute of Mathematical Statistics and Actuarial Science

UniBE Contributor:

Estill, Janne Anton Markus, Tal, Kali


600 Technology > 610 Medicine & health
300 Social sciences, sociology & anthropology > 360 Social problems & social services
500 Science > 510 Mathematics




PeerJ, Ltd


[4] Swiss National Science Foundation




Doris Kopp Heim

Date Deposited:

11 Feb 2021 09:49

Last Modified:

05 Dec 2022 15:47

Publisher DOI:

10.7717/peerj.10660
PubMed ID:


Uncontrolled Keywords:

Dimensionality reduction; HIV incidence; Hierarchical clustering; Principal component analysis; Sociobehavioural characteristics; Unsupervised machine learning



