Classification of texts on emergency situations in Almaty


Andirov M.Y., Assan Z.Z., Nopembri S., Seilkhan A.M., Myrzakhmetov D.E.
2023 Institute of Metallurgy and Ore Beneficiation JSC

Kompleksnoe Ispolzovanie Mineralnogo Syra
2023, #327, Issue 4, pp. 23-31

Text classification is a multi-stage process for effectively categorizing texts that vary in structure. This article applies three machine learning algorithms, the support vector machine, logistic regression, and k-nearest neighbors, to classify texts collected from emergency news sites in Almaty. Data collection and the subsequent processing of the collected data played a particularly important role in the experiment. Before classification, the data set was preprocessed through stop-word removal, tokenization, stemming, lemmatization, feature extraction, and the construction of feature vectors. The data were obtained from open sources by automated collection using a script. The experimental results show that the classifier based on logistic regression performs best among the algorithms tested. Performance indicators were obtained for each algorithm, which allows a comparative analysis between them.
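The preprocessing and classification steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the stop-word list, the crude suffix stripper (standing in for real stemming and lemmatization), the toy corpus, and all function names are assumptions made for illustration. Only the k-nearest neighbors classifier is shown; logistic regression or a support vector machine would consume the same TF-IDF feature vectors.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word and suffix lists, chosen only for this illustration.
STOP_WORDS = {"a", "an", "the", "in", "on", "of", "and", "was", "were", "by", "after", "near"}
SUFFIXES = ("ing", "ed", "es", "s")


def preprocess(text):
    """Lowercase, tokenize, drop stop words, and strip one common suffix."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stems = []
    for tok in tokens:
        if tok in STOP_WORDS:
            continue
        for suf in SUFFIXES:
            # Keep at least three characters so short words stay intact.
            if tok.endswith(suf) and len(tok) - len(suf) >= 3:
                tok = tok[: -len(suf)]
                break
        stems.append(tok)
    return stems


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}


def tfidf(doc, idf):
    """L2-normalised sparse TF-IDF vector as a dict of term -> weight."""
    tf = Counter(t for t in doc if t in idf)  # unseen terms are ignored
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_predict(query, train_vecs, train_labels, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy corpus invented for this sketch; it is not data from the article.
TRAIN = [
    ("Firefighters extinguished a fire in an apartment building", "fire"),
    ("A warehouse fire was burning near the market", "fire"),
    ("Smoke and flames damaged the shopping centre", "fire"),
    ("Heavy rain flooded streets in the city", "flood"),
    ("River flooding damaged houses after the rain", "flood"),
    ("Water flooded the basement of a school", "flood"),
]

train_docs = [preprocess(text) for text, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
idf = fit_idf(train_docs)
train_vecs = [tfidf(doc, idf) for doc in train_docs]


def classify(text, k=3):
    """Preprocess a raw text, vectorise it, and predict its category."""
    return knn_predict(tfidf(preprocess(text), idf), train_vecs, train_labels, k)
```

Calling `classify("Fire burned a building")` on this toy corpus returns `"fire"`. A real experiment would replace the suffix stripper with a proper stemmer or lemmatizer for the language of the news texts and would evaluate each classifier on held-out data.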

emergencies, KNN, logistic regression, machine learning, NLP, preprocessing, support vector machine, text classification


Computer Science, Faculty of Information Technology, Al-Farabi Kazakh National University, Almaty, Kazakhstan
Universitas Negeri Yogyakarta, Yogyakarta, Indonesia
Computer science and information technology, Faculty of Physics and Mathematics, K. Zhubanov Aktobe Regional University, Aktobe, Kazakhstan

