Přegenerování a podgenerování : Jak efektivně vyhledávat v jazykových korpusech data pro lingvistický výzkum

Title in English Over/under Generating : How to Search Data for Linguistic Analysis in Language Corpora
Authors

OSOLSOBĚ Klára

Year of publication 2024
Type Requested lectures
MU Faculty or unit

Faculty of Arts

Citation
Description In this talk, we show how to minimize overgeneration (to increase accuracy) and prevent undergeneration (to maintain coverage) in corpus-based word-formation research. Using a specific example, the retrieval of candidates for a word-formation model (kutil), we show how observation of corpus data can be used to progressively refine a corpus query. The data obtained from the corpus are analysed from both a quantitative and a qualitative point of view. Finally, we show to what extent the homonymy of nouns formed by conversion of l-participles negatively affects the results of POS disambiguation.
Related projects:
