RapCor, Francophone Rap Songs Text Corpus

Authors

PODHORNÁ-POLICKÁ Alena

Year of publication 2020
Type Article in Proceedings
Conference Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020
MU Faculty or unit

Faculty of Arts

Keywords French; text processing; rap music; hip hop; lyrics; substandard; neology; written orality; corpus building
Description The paper introduces RapCor, a specialized French text corpus built from the lyrics of francophone rap songs from the last three decades, during which rap became one of the most popular music genres. An overview of more than ten years of rap corpus building presents our motivations, text processing methods, and annotation decisions, as well as achievements and problematic issues. The published part of the rap corpora, RapCor 1288, is available in the Sketch Engine manager for interdisciplinary research and consists of 709,057 words from 1288 texts by francophone rappers. It has been used mainly for the detection and longitudinal observation of so-called “identitary neologisms”, i.e. expressions emerging from communication between peers, motivated by a search for group belonging, playfulness and expressivity. Rappers’ language is also a valuable resource for investigating metaphors and idioms that have been formed by assigning a new meaning to existing language items. The main goal of this largely substandard linguistic corpus is to uncover phonemic and semantic innovations and trends in modern French.