Greedy Optimization Method for Extractive Summarization of Scientific Articles
Akhmetov I., Gelbukh A., Mussabayev R.
2021, Institute of Electrical and Electronics Engineers Inc.
IEEE Access
2021, vol. 9, pp. 168141-168153
This work presents a method for summarizing scientific articles from the arXiv and PubMed datasets using a greedy extractive summarization algorithm. We used the approach together with Variable Neighborhood Search (VNS) to estimate the top line, that is, the best achievable quality of extractive text summarization in terms of ROUGE scores. The algorithm first selects for the summary the sentences that contain the largest number of words with high TF-IDF values, and it tunes the minimum document frequency parameter of the TF-IDF vectorization. As a result, the method achieves ROUGE-1/ROUGE-2 scores of 0.43/0.12 on arXiv and 0.40/0.13 on PubMed. These results are comparable to those of state-of-the-art models that rely on complex neural network architectures, substantial computational resources, and large amounts of training data. In contrast, our method uses a straightforward statistical inference methodology.
Extractive text summarization, greedy algorithm, variable neighborhood search
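
The greedy selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`tokenize`, `tfidf_weights`, `greedy_summary`), the smoothed IDF formula, the choice to treat each sentence as one "document", and the coverage-based gain are all assumptions made for illustration. The VNS component and ROUGE evaluation are not shown.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def tfidf_weights(tokenized, min_df=1):
    """Return {term: weight}, treating each sentence as one 'document'.

    Assumption: weight = total term frequency * smoothed IDF.
    Terms that occur in fewer than min_df sentences are dropped.
    """
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    tf = Counter(term for tokens in tokenized for term in tokens)
    return {
        term: tf[term] * (math.log((1 + n) / (1 + df[term])) + 1.0)
        for term in tf
        if df[term] >= min_df
    }


def greedy_summary(sentences, max_sentences=2, min_df=1):
    """Greedily pick sentences that add the most not-yet-covered TF-IDF weight."""
    tokenized = [tokenize(s) for s in sentences]
    weights = tfidf_weights(tokenized, min_df)
    term_sets = [{t for t in tokens if t in weights} for tokens in tokenized]

    covered, chosen = set(), []
    while len(chosen) < min(max_sentences, len(sentences)):
        best, best_gain = None, 0.0
        for i, terms in enumerate(term_sets):
            if i in chosen:
                continue
            gain = sum(weights[t] for t in terms - covered)
            if gain > best_gain:  # strict '>' keeps the earliest sentence on ties
                best, best_gain = i, gain
        if best is None:  # no remaining sentence adds any new weighted term
            break
        chosen.append(best)
        covered |= term_sets[best]

    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Raising `min_df` in this sketch removes rare terms from the vocabulary before selection, which is one plausible reading of the "minimum document frequency parameter tuning" mentioned in the abstract.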
Institute of Information and Computational Technologies, Almaty, 050010, Kazakhstan
Faculty of Information Technologies (FIT), Kazakh-British Technical University, Almaty, 050000, Kazakhstan
CIC, Instituto Politécnico Nacional, Mexico City, 07738, Mexico