RapCor, Francophone Rap Songs Text Corpus
Authors | |
---|---|
Year of publication | 2020 |
Type | Article in Proceedings |
Conference | Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Languages Processing, RASLAN 2020 |
Web | online version |
Keywords | French; text processing; rap music; hip hop; lyrics; substandard; neology; written orality; corpus building |
Description | The paper introduces the RapCor corpus, a specific text corpus for French based on the lyrics of francophone rap songs from the last three decades, during which rap became one of the most popular music genres. An overview of more than ten years of rap corpora building presents our motivations, text processing methods, and annotation decisions, as well as achievements and problematic issues. The published part of the rap corpora, RapCor 1288, is available in the Sketch Engine corpus manager for interdisciplinary research and consists of 709,057 words in 1288 texts by francophone rappers. It has been used mainly for the detection and longitudinal observation of so-called “identitary neologisms”, i.e. expressions that emerge from communication between peers and are motivated by a search for group belonging, playfulness and expressivity. Rappers’ language is also a valuable resource for investigating metaphors and idioms formed by assigning a new meaning to existing language items. The main goal of this largely substandard linguistic corpus is to uncover phonemic and semantic innovations and trends in modern French. |