A Multi-Level Annotation Model for Fake News Detection: Implementing Kazakh-Russian Corpus via Label Studio
Sambetbayeva M., Nekessova A., Yerimbetova A., Bayangali A., Kaldarova M., Telman D., Smailov N.
August 2025, Multidisciplinary Digital Publishing Institute (MDPI)
Big Data and Cognitive Computing
2025, Volume 9, Issue 8
This paper presents a multi-level annotation model for detecting fake news in the Kazakh and Russian languages, with the aim of improving the understanding of disinformation strategies in multilingual digital media environments. Unlike traditional binary models, our approach captures the complexity of disinformation by accounting for both linguistic and cultural factors. To support this, a corpus of over 5000 news texts was manually annotated using the Label Studio platform. The annotation scheme consists of seven interrelated categories: CLAIM, SOURCE, EVIDENCE, DISINFORMATION_TECHNIQUE, AUTHOR_INTENT, TARGET_AUDIENCE, and TIMESTAMP. Inter-annotator agreement, evaluated using Cohen's kappa, ranged from 0.72 to 0.81, indicating substantial consistency. The annotated data reveal recurring patterns of disinformation, such as emotional manipulation, targeting of vulnerable individuals, and the strategic concealment of intent. Semantic relations between entities, such as CLAIM → EVIDENCE and CLAIM → AUTHOR_INTENT, were formalized to represent disinformation narratives as knowledge graphs. This study contributes the first linguistically and culturally adapted annotation model for the Kazakh and Russian languages, providing a robust empirical resource for building interpretable and context-aware fake news detection systems. The resulting annotated corpus and its semantic structure offer valuable material for further research in natural language processing, computational linguistics, and media studies in low-resource language environments.
annotation, automated fake news detection, corpus linguistics, disinformation, fake news, Kazakh language, Label Studio, Russian language, semantic relations
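The agreement measure named in the abstract, Cohen's kappa, can be illustrated with a minimal sketch. It compares the observed agreement between two annotators with the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The function name `cohens_kappa` and the two label lists below are hypothetical, made up for illustration only; they are not taken from the corpus or from the authors' code, and the paper may have computed agreement per category or with a different tool.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)

    # Observed agreement: share of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label everywhere; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy annotations (NOT data from the corpus).
annotator_1 = ["CLAIM", "CLAIM", "EVIDENCE", "SOURCE", "CLAIM", "EVIDENCE"]
annotator_2 = ["CLAIM", "EVIDENCE", "EVIDENCE", "SOURCE", "CLAIM", "EVIDENCE"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

For these toy labels the annotators agree on 5 of 6 items (p_o = 5/6) and the chance agreement is p_e = 13/36, so kappa = 17/23, roughly 0.74. Values between 0.61 and 0.80 are conventionally read as substantial agreement, which is the interpretation the abstract gives to its reported range.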
Institute of Information and Computational Technologies of the Committee of Science of the Ministry of Science and Higher Education of the Republic of Kazakhstan, Almaty, 050010, Kazakhstan
Department of Information Systems, L.N. Gumilyov Eurasian National University, Astana, 010000, Kazakhstan
School of Information Technology and Engineering, Astana International University, Astana, 010000, Kazakhstan
Department of Software Engineering, Satbayev University, Almaty, 050013, Kazakhstan
Department of Electronics, Telecommunications and Space Technologies, Satbayev University, Almaty, 050013, Kazakhstan