SQAD: Simple Question Answering Database
Authors | |
---|---|
Year of publication | 2014 |
Type | Article in Proceedings |
Conference | Eighth Workshop on Recent Advances in Slavonic Natural Language Processing |
MU Faculty / Unit | |
Citation | |
Field | Informatics |
Keywords | question answering; Simple Question Answering Database; SQAD; syntax-based question answering; SBQA |
Description | In this paper, we present a new free resource for comparable evaluation of Czech question answering. The Simple Question Answering Database (SQAD) contains 3301 questions and answers extracted and processed from the Czech Wikipedia. SQAD was prepared to support precise evaluation of automatic question answering systems. No such resource was previously available for the Czech language. We describe how SQAD was created: the texts were processed by automatic tokenization (Unitok) and morphological disambiguation (Desamb), followed by semi-automatic cleaning and post-processing. We also show the results of the first version of a Czech question answering system named SBQA (syntax-based question answering). |
Related projects: |