Greedy Optimization Method for Extractive Summarization of Scientific Articles


Akhmetov I., Gelbukh A., Mussabayev R.
2021, Institute of Electrical and Electronics Engineers Inc.

IEEE Access
2021, vol. 9, pp. 168141-168153

This work presents a method for summarizing scientific articles from the arXiv and PubMed datasets using a greedy extractive summarization algorithm. We used the approach together with Variable Neighborhood Search (VNS) to learn the top line, that is, the best quality attainable in extractive text summarization as measured by ROUGE scores. The algorithm first selects for the summary the sentences of the text that contain the largest number of words with high TF-IDF values, and it tunes the minimum document frequency parameter of the TF-IDF vectorization. As a result, the method achieves ROUGE-1/ROUGE-2 scores of 0.43/0.12 on arXiv and 0.40/0.13 on PubMed. These results are comparable to those of state-of-the-art models that rely on complex neural network architectures, substantial computational resources, and large amounts of training data. In contrast, our method uses a straightforward statistical inference methodology.
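The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the regex-based sentence splitter and tokenizer, the smoothed IDF formula, the choice to treat each sentence as a "document" for TF-IDF, and the `top_k` and `max_sentences` parameters are all assumptions made for illustration. The sketch omits the VNS stage and does not tune `min_df`; it only exposes that parameter.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased alphanumeric tokens; a real system would use a proper tokenizer.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_weights(sentences, min_df=1):
    # Each sentence is treated as one "document" (assumption).
    n = len(sentences)
    tokenized = [tokenize(s) for s in sentences]
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    tf = Counter(t for tokens in tokenized for t in tokens)
    weights = {}
    for term, doc_freq in df.items():
        if doc_freq >= min_df:  # drop terms below the minimum document frequency
            idf = math.log((1 + n) / (1 + doc_freq)) + 1  # smoothed IDF
            weights[term] = tf[term] * idf
    return weights


def greedy_summary(text, max_sentences=2, top_k=10, min_df=1):
    sentences = split_sentences(text)
    weights = tfidf_weights(sentences, min_df=min_df)
    # Keep the top_k terms with the highest TF-IDF weight.
    top_terms = set(sorted(weights, key=weights.get, reverse=True)[:top_k])
    # Score each sentence by how many distinct top terms it contains.
    scores = [len(set(tokenize(s)) & top_terms) for s in sentences]
    # Greedily take the highest-scoring sentences first (ties: earlier sentence).
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    chosen = [i for i in ranked[:max_sentences] if scores[i] > 0]
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `greedy_summary(article_text, max_sentences=5, top_k=30, min_df=2)` on a full article text would return up to five sentences in document order; the parameter values here are arbitrary and would need tuning against ROUGE on a validation set.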

Extractive text summarization, greedy algorithm, variable neighborhood search


Institute of Information and Computational Technologies, Almaty, 050010, Kazakhstan
Faculty of Information Technologies (FIT), Kazakh-British Technical University, Almaty, 050000, Kazakhstan
CIC, Instituto Politécnico Nacional, Mexico City, 07738, Mexico

