DETECTION AND CLASSIFICATION OF THREATS AND VULNERABILITIES ON HACKER FORUMS BASED ON MACHINE LEARNING

Mambetov S., Begimbayeva Y., Gurko O., Doroshenko H., Joldasbayev S., Fridman O., Kulambayev B., Babenko V., Ilhe I., Neronov S.
2024 Technology Center

Eastern-European Journal of Enterprise Technologies
2024, No. 3, Issue 9 (129), pp. 16-27

The object of this study is the process of detecting threats and vulnerabilities on hacker forums, which are a well-known source of potential dangers for Internet users. Analyzing and classifying data from these forums is difficult, however, because participants use specific slang, jargon, and similar language features, which calls for modern text-processing tools. This paper explores the application of machine learning to devise an effective method for analyzing sentiment and trends on hacker forums in order to identify potential threats and vulnerabilities in cyberspace. All necessary stages of the detection process have been developed, from data collection and preprocessing to the training of a model capable of processing “raw” unstructured data from hacker forums. Six popular machine learning algorithms, namely k-Nearest Neighbors (kNN), Random Forest, Naive Bayes, Logistic Regression, Support Vector Machines (SVM), and Decision Tree, have been implemented and compared in terms of their efficiency in detecting and classifying threats and vulnerabilities. The experiments were conducted on real data (150,000 messages). The Random Forest algorithm performed best (accuracy=0.89, recall=0.84, precision=0.91, F1-score=0.87, and ROC-AUC=0.89). The proposed machine-learning-based tool not only collects data that poses a potential threat but also processes and classifies it according to specified criteria. This allows threats and vulnerabilities to be detected at high speed. The results of the study make it possible to identify potential trends in threats and vulnerabilities. This will contribute to the improvement of cybersecurity systems and ensure more reliable protection of information resources.
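The reported Random Forest figures can be checked for internal consistency, since the F1-score is the harmonic mean of precision and recall. The following is a minimal sketch in Python, not the authors' evaluation code. The function names `f1_score` and `metrics_from_counts` and the confusion-matrix counts in the usage lines are hypothetical and chosen only for illustration; only the precision and recall values 0.91 and 0.84 come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Values reported in the abstract for Random Forest.
reported_f1 = f1_score(precision=0.91, recall=0.84)
print(round(reported_f1, 2))  # 0.87, matching the reported F1-score

# Hypothetical counts (NOT from the paper), for illustration only.
example = metrics_from_counts(tp=84, fp=8, fn=16, tn=92)
print({name: round(value, 2) for name, value in example.items()})
```

The precision and recall given in the abstract reproduce the stated F1-score of 0.87 after rounding, so the three figures are mutually consistent.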

cybersecurity, data classification, hacker forum, machine learning, threats identification


Department of Information Systems, Al-Farabi Kazakh National University, al-Farabi ave., 71, Almaty, 050040, Kazakhstan
Department of Cybersecurity, AUPET named after Gumarbek Daukeyev, Baytursynuli str., 126/1, Almaty, 050013, Kazakhstan
Satbayev University, Satbayev str., 22, Almaty, 050013, Kazakhstan
Department of Automation and Computer-Aided Technologies, Ukraine
Department of Computer Engineering, International IT University, Manasa str., 34/1, Almaty, 050040, Kazakhstan
Department of Radio Engineering, Electronics and Telecommunications, Turan University, Satpaeva str., 16a, Almaty, 050013, Kazakhstan
Department of Computer Systems, Ukraine
Department of Law, Management & Economics Daugavpils University, Vienības str., 13, Daugavpils, 5401, Latvia
Kharkiv National Automobile and Highway University, Yaroslava Mudroho str., 25, Kharkiv, 61002, Ukraine
Department of Economics and Management, V. N. Karazin Kharkiv National University, Svobody sq., 4, Kharkiv, 61022, Ukraine
