SOFTWARE ANALYSIS OF SCIENTIFIC TEXTS: COMPARATIVE STUDY OF DISTRIBUTED COMPUTING FRAMEWORKS


Altynbek S., Shuitenov G., Muratbekov M., Barlybayev A.
2025, National Aerospace University Kharkiv Aviation Institute

Radioelectronic and Computer Systems
2025, Issue 2, pp. 118-131

The growing volume of scientific publications makes efficient analysis of scientific texts increasingly important. This study examines popular distributed computing frameworks for scientific text processing. An extensive review of the scientific literature was used to systematize the key features of Apache Flink, Apache Spark, and Apache Hadoop, with particular attention to their application in scientific text analysis. The review covers the architecture of each framework and highlights strengths such as high performance, scalability, and flexibility in data processing, as well as limitations such as resource requirements and customization complexity. The comparative analysis showed that Apache Flink and Apache Spark achieve high performance and scalability by computing in memory, which increases processing speed and efficiency. Both support batch and streaming data processing and provide “exactly once” processing guarantees. Apache Hadoop, by contrast, shows lower performance because it relies mainly on disk-based data processing. Apache Flink and Apache Spark also support several programming languages, including Java, Scala, and Python, which gives developers flexibility. The results give researchers and engineers a basis for choosing the framework best suited to the specific needs and objectives of their research. The practical significance of the study lies in identifying suitable tools for analyzing scientific texts, which can make data processing more efficient and accelerate scientific research in various fields.
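
For readers unfamiliar with the batch model that these frameworks distribute across a cluster, the following is a minimal single-process sketch of the map, shuffle, and reduce steps applied to term counting over scientific texts. It is an illustration only: it does not use the Hadoop, Spark, or Flink APIs, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `term_frequencies`) and the two sample documents are hypothetical, not taken from the study.

```python
import re
from collections import defaultdict


def map_phase(text):
    """Emit a (term, 1) pair for every lowercase word in one document."""
    for term in re.findall(r"[a-z]+", text.lower()):
        yield term, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for term, count in pairs:
        grouped[term].append(count)
    return grouped


def reduce_phase(grouped):
    """Sum the grouped counts for each term."""
    return {term: sum(counts) for term, counts in grouped.items()}


def term_frequencies(docs):
    """Run map, shuffle, and reduce over a dict of {doc_id: text}."""
    pairs = (pair for text in docs.values() for pair in map_phase(text))
    return reduce_phase(shuffle(pairs))


# Hypothetical sample input standing in for a corpus of abstracts.
docs = {
    "a1": "Apache Spark processes text in memory.",
    "a2": "Apache Hadoop processes text on disk.",
}
freqs = term_frequencies(docs)
```

In a distributed setting each step would run in parallel over partitions of the corpus. Whether the intermediate pairs are kept in memory or written to disk is the distinction the abstract draws between the in-memory frameworks and Hadoop's disk-based processing.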

Apache Flink, Apache Hadoop, Apache Spark, big data, machine learning, text analysis


Department of Information Technology, Kazakh University of Technology and Business, Astana, Kazakhstan
Department of Information Technology, Esil University, Astana, Kazakhstan
Faculty of Information Technology, L.N. Gumilyov Eurasian National University, Astana, Kazakhstan

