DSL Shared task 2016: Perfect Is The Enemy of Good Language Discrimination Through Expectation-Maximization and Chunk-based Language Model
Authors | |
---|---|
Year of publication | 2016 |
Type | Article in Proceedings |
Conference | Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3) |
MU Faculty or unit | |
Citation | |
Web | https://aclanthology.info/pdf/W/W16/W16-4815.pdf |
Field | Informatics |
Keywords | language discrimination;expectation maximization;language model |
Description | In this paper we investigate two approaches to the discrimination of similar languages: the expectation-maximization algorithm for estimating the conditional probability P(word|language), and byte-level language models similar to compression-based language modelling methods. These methods reached accuracies of 86.6 % and 88.3 %, respectively, on set A of the DSL Shared Task 2016 competition. |
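
The byte-level language modelling idea named in the description can be illustrated with a minimal sketch. This is not the authors' chunk-based model: it swaps in a plain byte trigram model with add-alpha smoothing, and the names `ByteNgramModel`, `classify`, `n`, and `alpha`, as well as the toy training sentences, are assumptions made for illustration only.

```python
import math
from collections import Counter


class ByteNgramModel:
    """Add-alpha smoothed byte n-gram model for one language (illustrative only)."""

    def __init__(self, n=3, alpha=0.5):
        self.n = n
        self.alpha = alpha
        self.ngrams = Counter()    # counts of n-byte sequences
        self.contexts = Counter()  # counts of their (n-1)-byte prefixes

    def train(self, text):
        data = text.encode("utf-8")
        for i in range(len(data) - self.n + 1):
            gram = data[i:i + self.n]
            self.ngrams[gram] += 1
            self.contexts[gram[:-1]] += 1

    def log_prob(self, text):
        """Sum of log P(last byte | previous n-1 bytes) over all n-grams in text."""
        data = text.encode("utf-8")
        total = 0.0
        for i in range(len(data) - self.n + 1):
            gram = data[i:i + self.n]
            numerator = self.ngrams[gram] + self.alpha
            # 256 possible byte values share the smoothing mass.
            denominator = self.contexts[gram[:-1]] + self.alpha * 256
            total += math.log(numerator / denominator)
        return total


def classify(models, text):
    """Return the language whose model gives the text the highest log-probability."""
    return max(models, key=lambda lang: models[lang].log_prob(text))


# Toy training data (hypothetical, far too small for real use).
models = {"en": ByteNgramModel(), "de": ByteNgramModel()}
models["en"].train("the quick brown fox jumps over the lazy dog and the cat")
models["de"].train("der schnelle braune fuchs springt über den faulen hund")

print(classify(models, "the lazy dog"))
print(classify(models, "den faulen hund"))
```

Working on bytes rather than characters keeps the alphabet fixed at 256 symbols regardless of script, which is one reason such models resemble compression-based approaches. Closely related languages would need far more training data and probably longer contexts than this sketch uses.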