A Spell Checker for Esperanto

The official publication record can be found on muni.cz.



Type Monograph
MU Faculty or unit Faculty of Informatics

Description This thesis gives a brief overview of spell checking software and describes the construction of a spell checker for Esperanto, implemented as a dictionary (i.e., an affix file and a word list) for the Hunspell spell checker. The word list is adapted from the word roots of the renowned Esperanto dictionary PIV. Morphologically complex words, which are common in Esperanto because of its agglutinative nature, are recognized by means of the affix file, which was built from the ready-made morpheme segmentation of word derivations appearing in the same source. The rules derived in this process were refined by a semantic classification of all the roots involved. The classification system was created from corpus analysis and several specialized dictionaries, and was combined with knowledge of each affix's capacity to accept roots from different semantic classes, taken from the PMEG reference grammar. The resulting spell checker is a working proof of concept, to be further improved and integrated into the grammar checker project of the E@I organization.
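
The idea of restricting affixes by the semantic class of a root can be illustrated with a minimal sketch. It is not the thesis's implementation and does not use Hunspell. The roots, class labels, suffix restrictions, and the function name `is_valid` are all hypothetical, invented only for illustration, and are not taken from PIV or PMEG.

```python
# Toy model of class-restricted suffixation. All data below is hypothetical
# and chosen for illustration; it is not derived from PIV or PMEG.

# Word roots mapped to an assumed semantic class.
ROOTS = {"hund": "animal", "patr": "person", "tabl": "object"}

# Suffixes mapped to the semantic classes they are assumed to accept.
SUFFIXES = {
    "in": {"animal", "person"},            # feminine suffix
    "et": {"animal", "person", "object"},  # diminutive suffix
}

# Only the simplest grammatical endings are modeled here.
ENDINGS = ("o", "a", "e", "i")


def _stem_ok(stem):
    """Accept a bare root, or a root plus one suffix allowed for its class."""
    if stem in ROOTS:
        return True
    for suffix, allowed_classes in SUFFIXES.items():
        if stem.endswith(suffix):
            root = stem[: -len(suffix)]
            # ROOTS.get returns None for unknown roots, which is never allowed.
            if ROOTS.get(root) in allowed_classes:
                return True
    return False


def is_valid(word):
    """Return True if the word is an accepted stem followed by a known ending."""
    word = word.lower()
    for ending in ENDINGS:
        if word.endswith(ending) and _stem_ok(word[: -len(ending)]):
            return True
    return False
```

Under these assumed tables, `is_valid("hundino")` accepts the word because the root `hund` belongs to a class the suffix `-in` accepts, while `is_valid("tablino")` rejects it because `tabl` is classed as an object. A real Hunspell dictionary would express such rules in the affix file rather than in program code.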