Maximising the Power of Semantic Textual Data: CASTEMO Data Collection and the InkVisitor Application

Authors

ZBÍRAL David, SHAW Robert Laurence John, HAMPEJS Tomáš, MERTEL Adam

Year of publication 2023
Type Appeared in Conference without Proceedings
Description The authors present Computer-Assisted Semantic Text Modelling (CASTEMO), a novel but now well-developed approach to transforming textual resources into rich structured data: CASTEMO knowledge graphs stored in JSON-based document databases. They also introduce the open-source InkVisitor research environment, which assists in the CASTEMO data collection workflow. Both the workflow and the environment were developed within the ERC-funded Dissident Networks Project (DISSINET) but are now available for use by other researchers and projects. The CASTEMO data collection approach aims to preserve the rich qualitative texture of texts while producing structured data suitable for computational analysis. It preserves the contextual embeddedness of knowledge and the natural features of human knowledge, such as conflicting evidence and information given in a non-indicative modality, e.g. questions and conditional sentences. It thus answers a significant challenge in the digital study of texts, where researchers must often choose between extracting content and analysing discursive features, and between distant and close reading. With CASTEMO, these levels can be readily interwoven into “scalable reading”. This presentation introduces the essential data modelling principles of CASTEMO, as well as its use cases and advantages for certain types of study. It also introduces the InkVisitor research environment.
