Frequency of Low-Frequency Words in Text Corpora
Authors | |
---|---|
Year of publication | 2010 |
Type | Article in Proceedings |
Conference | Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2010 |
MU Faculty or unit | |
Citation | |
Web | https://nlp.fi.muni.cz/raslan/2010/paper15.pdf |
Field | Linguistics |
Keywords | Computational linguistics; Language model; Low-frequency; Text analysis; Text corpora |
Description | Low-frequency words, especially words occurring only once in a text corpus, are a popular subject in text analysis, and many lexicographers also draw attention to such words. This paper presents a detailed statistical analysis of low-frequency words. The results provide important information for many practical applications, including lexicography and language modeling. |