Evaluating the effectiveness of machine learning methods for keyword coverage using semantic data analysis


Shaushenova A., Bayegizova A., Baidrakhmanova G., Abuova Z., Kassymova A., Bakirova D., Golenko Y.
February 2025, Institute of Advanced Engineering and Science

International Journal of Electrical and Computer Engineering
2025, Volume 15, Issue 1, pp. 559-568

This article presents a comprehensive comparative analysis of two advanced hybrid machine learning approaches for keyword extraction: bidirectional encoder representations from transformers (BERT) combined with autoencoder (AE) and term frequency-inverse document frequency (TF-IDF) combined with autoencoder. The research targets the task of semantic analysis in text data to evaluate the effectiveness of these methods in ensuring adequate keyword coverage across diverse text corpora. The study delves into the architecture and operational principles of each method, with a particular focus on the integration with autoencoders to enhance the semantic integrity and relevance of the extracted keywords. The experimental section provides a detailed performance analysis of both methods on various text datasets, highlighting how the structure and semantic richness of the source data influence the outcomes. The evaluation methodology includes precision, recall, and F1-score metrics. The paper discusses the advantages and disadvantages of each approach and their suitability for specific keyword extraction tasks. The findings offer valuable insights for the scientific community, aiding in the selection of the most appropriate text processing method for applications requiring deep semantic understanding and high accuracy in information extraction.

Bidirectional encoder representations from transformers, Hybrid methods, Inverse document frequency, Keyword extraction, Semantic data analysis, Term frequency
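
The abstract names precision, recall, and F1-score as the evaluation metrics. As a minimal sketch of how these could be computed for keyword extraction, the Python snippet below compares a set of predicted keywords against a reference set. The function name `keyword_prf`, the exact-match criterion after lowercasing, and the sample keywords are assumptions for illustration only; the article's own matching rules and data are not given here.

```python
def keyword_prf(predicted, reference):
    """Return (precision, recall, f1) for two keyword collections.

    Assumption: a predicted keyword counts as correct only if it matches
    a reference keyword exactly after trimming and lowercasing.
    """
    pred = {k.strip().lower() for k in predicted}
    ref = {k.strip().lower() for k in reference}

    true_positives = len(pred & ref)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example data, not taken from the article.
predicted_keywords = ["BERT", "autoencoder", "tf-idf", "corpus"]
reference_keywords = ["bert", "autoencoder", "keyword extraction"]

precision, recall, f1 = keyword_prf(predicted_keywords, reference_keywords)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Exact matching is the strictest choice; a semantic evaluation might instead accept near-synonyms, which would raise both precision and recall relative to this sketch.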

Department of Information Systems, S. Seifullin Kazakh Agrotechnical Research University, Astana, Kazakhstan
Department of Radio Engineering, Electronics and Telecommunications, L. N. Gumilyov Eurasian National University, Astana, Kazakhstan
Department of Computer Science and Information Technologies, K. Zhubanov Aktobe Regional University, Aktobe, Kazakhstan
Higher School of Information Technology, Zhangir Khan West Kazakhstan Agrarian and Technical University, Uralsk, Kazakhstan
Department of Information Technology, Zhangir Khan University, Uralsk, Kazakhstan
Department of Construction, Institute of Architecture and Construction, L. N. Gumilyov Eurasian National University, Astana, Kazakhstan
Department of Information Systems, S. Seifullin Kazakh Agrotechnical Research University, Astana, Kazakhstan

