Overcoming Language Barriers: Assessing the Potential of Machine Translation and Topic Modeling for the Comparative Analysis of Multilingual Text Corpora

Reber, Ueli (2019). Overcoming Language Barriers: Assessing the Potential of Machine Translation and Topic Modeling for the Comparative Analysis of Multilingual Text Corpora. Communication Methods and Measures, 13(2), pp. 102-125. Taylor & Francis. doi:10.1080/19312458.2018.1555798

Reber_2019_Overcoming language barriers.pdf - Published Version
Restricted to registered users only
License: Publisher holds copyright.

Reber_2019_Overcoming-language-barriers_accepted-manuscript.pdf - Accepted Version
License: Publisher holds copyright.


This study assesses the potential of topic models coupled with machine translation for comparative communication research across language barriers. From a methodological point of view, the robustness of the combined approach is examined. For this purpose, the results of different machine translation services (Google Translate vs. DeepL) and different translation methods (full-text vs. term-by-term) are compared. From a substantive point of view, the study tests how well the approach can be integrated into comparative study designs. To this end, the online discourses about climate change in Germany, the United Kingdom, and the United States are compared. The results show, first, that the approach is relatively robust and, second, that integrating it into comparative study designs is not a problem. It is concluded that these findings, together with the relatively moderate costs in terms of time and money, make coupling topic models with machine translation a valuable addition to the toolbox of comparative communication researchers.
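
The distinction between term-by-term and full-text translation mentioned in the abstract can be illustrated with a small sketch. It is not the author's implementation: the lookup table `GERMAN_TO_ENGLISH`, the example document, and the string `full_text_translation` are hypothetical stand-ins for the output of a translation service, and the cosine comparison of bag-of-words counts is only a crude proxy for comparing the topic models that would normally be fitted to each translated corpus.

```python
import math
from collections import Counter

# Hypothetical lookup table standing in for a machine translation service.
GERMAN_TO_ENGLISH = {
    "klimawandel": "climate",
    "erwärmung": "warming",
    "politik": "policy",
}

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,;:!?()\"'") for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def term_by_term(text, lookup):
    """Translate each token in isolation; unknown tokens pass through unchanged."""
    return Counter(lookup.get(tok, tok) for tok in tokenize(text))

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

german_doc = "Klimawandel Politik Erwärmung Klimawandel"
# Hypothetical output of a full-text translation of the same document.
full_text_translation = "climate policy warming climate"

bow_terms = term_by_term(german_doc, GERMAN_TO_ENGLISH)
bow_full = Counter(tokenize(full_text_translation))
similarity = cosine(bow_terms, bow_full)
print(f"Similarity of term-by-term and full-text input: {similarity:.2f}")
```

In a real study the two bag-of-words representations would each feed a topic model, and the resulting topics, rather than the raw counts, would be compared across translation services and methods.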

Item Type:

Journal Article (Original Article)

Division/Institute:

03 Faculty of Business, Economics and Social Sciences > Social Sciences > Institute of Communication and Media Studies (ICMB)

UniBE Contributor:

Reber, Ueli

Subjects:

300 Social sciences, sociology & anthropology

ISSN:

1931-2458

Publisher:

Taylor & Francis

Funders:

Swiss National Science Foundation; German Research Foundation (DFG)

Language:

English

Submitter:

Ueli Reber

Date Deposited:

26 Jul 2019 14:06

Last Modified:

05 Dec 2022 15:29

Publisher DOI:

10.1080/19312458.2018.1555798

Uncontrolled Keywords:

machine translation, topic model, comparative research, multilingual analysis, climate change, online discourse

BORIS DOI:

10.7892/boris.131398

URI:

https://boris.unibe.ch/id/eprint/131398
