Discourse-level Annotation over Europarl for Machine Translation: Connectives and Pronouns

Zufferey, Sandrine; Popescu-Belis, Andrei; Meyer, Thomas; Cartoni, Bruno (2012). Discourse-level Annotation over Europarl for Machine Translation: Connectives and Pronouns. In: Proceedings of the 8th International Conference on Language Resources and Evaluation. Istanbul, Turkey, 21-27 May 2012.

255_Paper.pdf - Published Version
Restricted to registered users only
License: publisher holds copyright.


This paper describes methods and results for the annotation of two discourse-level phenomena, connectives and pronouns, over a multilingual parallel corpus. About 3600 tokens in excerpts from Europarl in English and French have been annotated with disambiguation information for connectives and pronouns. This data is then used in several ways: for cross-linguistic studies, for training automatic disambiguation software, and ultimately for training and testing discourse-aware statistical machine translation systems. The paper presents the annotation procedures and their results in detail, and gives an overview of the first systems trained on the annotated resources and their use for machine translation.

Item Type: Conference or Workshop Item (Paper)
Division/Institute: 06 Faculty of Humanities > Department of Linguistics and Literary Studies > Institute of French Language and Literature
UniBE Contributor: Zufferey, Sandrine
Subjects: 800 Literature, rhetoric & criticism > 840 French & related literatures; 400 Language > 440 French & related languages
ISBN: 978-2-9517408-7-7
Series: Proceedings of the 8th International Conference on Language Resources and Evaluation
Language: English
Submitter: Sandrine Zufferey
Date Deposited: 25 Apr 2016 11:42
Last Modified: 05 Dec 2022 14:53
BORIS DOI: 10.7892/boris.78674
URI: https://boris.unibe.ch/id/eprint/78674
