Syntactic Patterns of Czech Multiword Expressions

Authors

NEVĚŘILOVÁ Zuzana

Year of publication 2019
Type Article in proceedings
Conference Slavonic Natural Language Processing in the 21st Century
Description We focus on an MWE collection that we created in previous work. We analyze the collection using K-means clustering of the MWE tags as they occur in a web corpus. We then compare it with another Czech MWE collection, SemLex. The comparison shows how different the two data sets are. Our collection, created from a web corpus, contains less formal language and exemplifies the use of noun phrases with noun modifiers, mainly in English borrowings. The SemLex collection, on the other hand, is extracted from a dataset containing mostly formal Czech, and its prevalent syntactic pattern is a noun phrase with an adjective modifier.
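As a rough illustration of what clustering MWE tags with K-means can look like, the following sketch groups tag sequences by their tag counts. Everything concrete here is an assumption made for illustration: the toy data in `TAGGED_MWES`, the two-tag inventory `TAGS`, the bag-of-tags feature vector, and the deterministic initialization are hypothetical and are not taken from the paper, which may use a different tag set and feature representation.

```python
from collections import Counter

# Hypothetical toy data: each MWE occurrence is a sequence of coarse POS tags.
# The tag inventory and the examples are illustrative assumptions only.
TAGGED_MWES = [
    ["ADJ", "NOUN"],
    ["ADJ", "NOUN"],
    ["ADJ", "ADJ", "NOUN"],
    ["NOUN", "NOUN"],
    ["NOUN", "NOUN"],
    ["NOUN", "NOUN", "NOUN"],
]
TAGS = ["ADJ", "NOUN"]


def vectorize(tag_seq):
    """Turn a tag sequence into a vector of tag counts (bag of tags)."""
    counts = Counter(tag_seq)
    return [float(counts[t]) for t in TAGS]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Add the point that is farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: sq_dist(p, centroids[i])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


points = [vectorize(seq) for seq in TAGGED_MWES]
labels, centroids = kmeans(points, k=2)
for seq, lab in zip(TAGGED_MWES, labels):
    print(lab, " ".join(seq))
```

On this toy input the adjective-modified and noun-modified patterns fall into separate clusters, which mirrors the kind of contrast the description draws between the two collections. With real corpus data the number of clusters and the feature representation would need to be chosen and validated separately.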
