Multimethod, multidataset analysis reveals paradoxical relationships between sociodemographic factors, Hispanic ethnicity and diabetes.

Knight, Gabriel M; Spencer-Bonilla, Gabriela; Maahs, David M; Blum, Manuel R; Valencia, Areli; Zuma, Bongeka Z; Prahalad, Priya; Sarraju, Ashish; Rodriguez, Fatima; Scheinker, David (2020). Multimethod, multidataset analysis reveals paradoxical relationships between sociodemographic factors, Hispanic ethnicity and diabetes. BMJ Open Diabetes Research & Care, 8(2), e001725. BMJ Publishing Group. doi:10.1136/bmjdrc-2020-001725

Knight__BMJ_Open_Diab_Res_Care_2020.pdf - Published Version
Available under License Creative Commons: Attribution-Noncommercial (CC-BY-NC).


INTRODUCTION

Population-level and individual-level analyses each have strengths and limitations, as do 'black-box' machine learning (ML) models and traditional, interpretable models. Diabetes mellitus (DM) is a leading cause of morbidity and mortality, with complex sociodemographic dynamics that have not been analyzed in a way that leverages both population-level and individual-level data as well as both traditional epidemiological and ML models. We analyzed complementary individual-level and county-level datasets with both regression and ML methods to study the association between sociodemographic factors and DM.

RESEARCH DESIGN AND METHODS

County-level DM prevalence, demographics, and socioeconomic status (SES) factors were extracted from the 2018 Robert Wood Johnson Foundation County Health Rankings and merged with US Census data. Analogous individual-level data were extracted from the 2007 to 2016 National Health and Nutrition Examination Survey studies and corrected for oversampling with survey weights. For county data, we used multivariate linear regression and ML regression models; for individual data, we used multivariate logistic regression and ML classification models. Regression and ML models were compared using R² for county-level data and the area under the receiver operating characteristic curve (AUC) for individual-level data.
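As an illustration of the two comparison metrics named above, the following is a minimal sketch in plain Python, not the authors' code. The function names `r_squared` and `auc` and all toy values are hypothetical; the AUC is computed as the share of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical county-level example: observed DM prevalence (%) and the
# predictions of two made-up models.
observed = [9.0, 11.0, 13.0, 15.0]
model_a = [10.0, 11.0, 13.0, 14.0]
model_b = [9.5, 11.0, 13.0, 14.5]
print(r_squared(observed, model_a), r_squared(observed, model_b))

# Hypothetical individual-level example: DM status (1 = yes) and model scores.
status = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(status, scores))
```

In practice both metrics would be estimated on held-out data, and the survey weights would need to be carried into the individual-level evaluation; the sketch omits both for brevity.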

RESULTS

Among the 3138 counties assessed, the mean DM prevalence was 11.4% (range: 3.0%-21.1%). Among the 12 824 individuals assessed, 1688 met DM criteria (13.2% unweighted; 10.2% weighted). Age, gender, race/ethnicity, income, and education were associated with DM at the county and individual levels. Higher county Hispanic ethnic density was negatively associated with county DM prevalence, while Hispanic ethnicity was positively associated with individual DM. ML outperformed regression in both datasets (mean R² of 0.679 vs 0.610, respectively (p<0.001) for county-level data; mean AUC of 0.737 vs 0.727 (p<0.0427) for individual-level data).
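The gap between the unweighted and weighted prevalence reflects the survey-weight correction for oversampling. The sketch below is a hedged illustration, not the authors' code: it reproduces the unweighted figure from the counts reported above and shows, on a hypothetical four-person sample with made-up weights, how weighting can lower a prevalence estimate. The function name `prevalence` is an assumption.

```python
def prevalence(has_dm, weights=None):
    """Weighted share of individuals with DM; equal weights if none given."""
    if weights is None:
        weights = [1.0] * len(has_dm)
    return sum(w * y for w, y in zip(weights, has_dm)) / sum(weights)


# Unweighted prevalence from the reported counts (1688 of 12 824).
unweighted = 1688 / 12824
print(f"{100 * unweighted:.1f}%")  # 13.2%

# Hypothetical toy sample: the two DM cases come from an oversampled group,
# so each represents fewer people in the population (weight 1 vs 3).
has_dm = [1, 1, 0, 0]
weights = [1.0, 1.0, 3.0, 3.0]
print(prevalence(has_dm))           # 0.5 unweighted
print(prevalence(has_dm, weights))  # 0.25 weighted
```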

CONCLUSIONS

Hispanic individuals are at higher risk of DM, while counties with larger Hispanic populations have lower DM prevalence. Analyses of population-level and individual-level data with multiple methods may afford more confidence in results and identify areas for further study.

Item Type:

Journal Article (Original Article)

Division/Institute:

04 Faculty of Medicine > Department of General Internal Medicine (DAIM) > Clinic of General Internal Medicine
04 Faculty of Medicine > Medical Education > Institute of General Practice and Primary Care (BIHAM)

UniBE Contributor:

Blum, Manuel

Subjects:

600 Technology > 610 Medicine & health
300 Social sciences, sociology & anthropology > 360 Social problems & social services

ISSN:

2052-4897

Publisher:

BMJ Publishing Group

Language:

English

Submitter:

Tobias Tritschler

Date Deposited:

03 Dec 2020 20:20

Last Modified:

05 Dec 2022 15:42

Publisher DOI:

10.1136/bmjdrc-2020-001725

PubMed ID:

33229378

Uncontrolled Keywords:

diabetes mellitus, type 2; ethnic groups; informatics; risk factors

BORIS DOI:

10.7892/boris.148785

URI:

https://boris.unibe.ch/id/eprint/148785
