Text Punctuation: An Inter-annotator Agreement Study
Authors | |
---|---|
Year of publication | 2017 |
Type | Article in Proceedings |
Conference | Text, Speech, and Dialogue: 20th International Conference, TSD 2017 |
MU Faculty or unit | |
Citation | |
web | https://link.springer.com/chapter/10.1007/978-3-319-64206-2_14 |
Doi | http://dx.doi.org/10.1007/978-3-319-64206-2_14 |
Field | Informatics |
Keywords | Comma adding;Spoken language;Inter-annotator agreement |
Description | Spoken language is hard to annotate accurately. One of the most ambiguous tasks is inserting punctuation marks into a transcription of spoken language. The punctuation marks used often depend on how annotators understand the content of the transcription. This understanding may differ because spoken language often lacks the clear structure inherent to written language, owing to the spontaneity of utterances or to skipping between ideas. We therefore suspect that inserting commas into a spoken language transcription is a very ambiguous task with low inter-annotator agreement (IAA). In this paper we analyze the IAA within a group of annotators and propose methods to increase it. We also propose and evaluate a reformulation of classical ground-truth (GT) annotations for cases where multiple annotations are available. |
Related projects: |
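
The description above does not say which agreement measure the paper uses. As a minimal sketch only, assuming each annotator's output is encoded as one binary decision per word boundary (1 = comma inserted, 0 = no comma) and assuming Cohen's kappa as the measure, agreement within a group of annotators could be estimated as follows. The function names and the encoding are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of binary labels (0 or 1)."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(a)
    # Fraction of word boundaries where both annotators made the same decision.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement derived from each annotator's own comma rate.
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:
        return 1.0  # both annotators gave the same constant label everywhere
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(annotations):
    """Average Cohen's kappa over all pairs of annotators (needs at least two)."""
    pairs = list(combinations(annotations, 2))
    if not pairs:
        raise ValueError("at least two annotators are required")
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical comma decisions of two annotators over eight word boundaries.
annotator_1 = [1, 0, 0, 1, 0, 0, 1, 0]
annotator_2 = [1, 0, 1, 1, 0, 0, 0, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Averaging pairwise kappa is only one way to summarize a group. A multi-rater statistic such as Fleiss' kappa would be an equally reasonable assumption here.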