Applying machine learning techniques for religious extremism detection on online user contents
Mussiraliyeva S., Omarov B., Yoo P., Bolatbek M.
2021, Tech Science Press
Computers, Materials and Continua
2021, Volume 70, Issue 1, pp. 915-934
In this paper, we propose a corpus for detecting religious extremism in social networks and open sources, and we compare machine learning algorithms on the binary classification problem using a previously created corpus, thereby checking whether extremist messages in the Kazakh language can be detected. To do this, we trained models using six classic machine learning algorithms: Support Vector Machine, Decision Tree, Random Forest, K Nearest Neighbors, Naive Bayes, and Logistic Regression. To increase the accuracy of detecting extremist texts, we used several feature types (statistical features, TF-IDF, POS, and LIWC) and applied oversampling and undersampling techniques to handle imbalanced data. As a result, we achieved 98% accuracy in detecting religious extremism in Kazakh texts on the collected dataset. Testing the developed models on datasets of content commonly encountered in everyday life (“Jokes”, “News”, “Toxic content”, “Spam”, “Advertising”) also showed high rates of extremism detection.
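The abstract names TF-IDF features and oversampling but gives no formulas or settings, so the following is only a minimal sketch of those two steps in plain Python. The whitespace tokenizer, the smoothed IDF variant, the function names `tfidf` and `random_oversample`, and the toy texts are all assumptions made for illustration; none of them come from the paper.

```python
import math
import random
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (TF times a smoothed IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(group + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Toy placeholder texts (hypothetical, not taken from the paper's corpus).
texts = ["neutral news text", "neutral joke text", "neutral advert text", "flagged example text"]
labels = [0, 0, 0, 1]

features = tfidf(texts)
balanced_features, balanced_labels = random_oversample(features, labels)
```

In a real experiment, oversampling would normally be applied to the training split only, so that duplicated samples do not leak into the evaluation data.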
Extremism, Machine learning, Natural language processing, NLP, Religious extremism, Social media, Social network
Al-Farabi Kazakh National University, Almaty, Kazakhstan
CSIS, Birkbeck College, University of London, London, United Kingdom