Korpus jako zdroj dat pro opravy chyb automatické morfologické analýzy

Title in English Corpus as Source of Amendments for Automatic Morphological Analysis
Authors

OSOLSOBĚ Klára

Year of publication 2007
Type Article in Proceedings
Conference Grammar & Corpora, 2nd International Conference, Abstracts
MU Faculty or unit

Faculty of Arts

Field Linguistics
Keywords corpus; automatic morphological analysis; verb form; word class; gradation
Description The aim of this paper is to show how a corpus can be used as a source for improving the description of selected grammatical phenomena in dictionaries and grammars on the one hand and in morphological taggers on the other. Two automatic morphological taggers used for tagging Czech language corpora (Hajič, 2004 and Sedláček, 2005) will be compared. We shall analyze how three phenomena, a) the synthetic future in Czech, b) the comparison of adjectives, and c) the word class transposition of words like hodně, mnoho, moc, are annotated in CNK and how they are described in Czech dictionaries (Slovník spisovného jazyka českého and Slovník spisovné češtiny pro školu a veřejnost) and grammars (Mluvnice češtiny, 1986, Česká mluvnice, 1989, Příruční mluvnice češtiny, 1996, Čeština, řeč a jazyk, 1996). We shall discuss how the analysis of corpus-mined data can be used to detect gaps in the examined materials and how it can contribute to filling them.