NLP Models for Military Terminology Analysis and Detection of Information Operations on Social Media
Abdygalym B., Sambetbayeva M., Yerimbetova A., Nekessova A., Tasbolatuly N., Smailov N., Nazymkhan A.
November 2025, Multidisciplinary Digital Publishing Institute (MDPI)
Computers
2025, Volume 14, Issue 11
This paper presents Multi_mil, a multilingual annotated corpus designed for the analysis of information operations in military discourse. The corpus consists of 1000 texts collected from social media and news platforms in Russian, Kazakh, and English, covering military and geopolitical narratives. A multi-level annotation scheme was developed, combining entity categories (e.g., military terms, geographical references, sources) with pragmatic features such as information operation type, emotional tone, author intent, and fake claim indicators. Annotation was performed manually in Label Studio with high inter-annotator agreement (κ = 0.82). To demonstrate practical applicability, baseline models and the proposed Onto-IO-BERT architecture were tested, with Onto-IO-BERT achieving the best performance (macro-F1 = 0.81). The corpus enables the identification of manipulation strategies, rhetorical patterns, and cognitive influence in multilingual contexts. Multi_mil contributes to advancing NLP methods for detecting disinformation, propaganda, and psychological operations.
annotated corpus, information operations, Label Studio, military discourse, NLP, social media analysis
International Science Complex Astana, Astana, 010000, Kazakhstan
Institute of Information and Computational Technologies of the Committee of Science of the Ministry of Science and Higher Education of the Republic of Kazakhstan, Almaty, 050010, Kazakhstan
Department of Information Systems, L.N. Gumilyov Eurasian National University, Astana, 010000, Kazakhstan
School of Engineering and Information Technology, Eurasian Technological University, Almaty, 050012, Kazakhstan
School of Information Technology and Engineering, Astana International University, Astana, 010000, Kazakhstan
Department of Electronics, Telecommunications and Space Technologies, Satbayev University, Almaty, 050013, Kazakhstan