Dynamic optimization of min-df in the GreedSum algorithm for enhanced extractive summarization


Aubakirov S., Akhmetov I., Gelbukh A., Mussabayev R.
September 2025, Springer Nature

Artificial Intelligence Review
2025, Volume 58, Issue 9

This study aims to improve extractive text summarization by dynamically optimizing the min-df (minimum document frequency) parameter in the GreedSum algorithm. To achieve this, three methods are proposed for dynamic tuning: a geometric approach, a percentile-based threshold, and a clustering-based adaptation strategy. These methods were applied to a large-scale dataset of 17,038 scientific articles from arXiv and PubMed. The experiments demonstrate a 2% improvement in ROUGE-1 F-measure over fixed min-df settings, achieving a peak ROUGE-1 F1-score of 45%. This performance surpasses several established extractive and hybrid baselines. Our contributions include a comparative evaluation of dynamic tuning strategies and the demonstration of their effectiveness for adaptive summarization. These findings confirm the potential of dynamic min-df optimization for generating accurate and efficient summaries, with future research focused on deep learning integration and real-time, multi-modal summarization.
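To make the role of min-df concrete, the sketch below shows one way a percentile-based threshold could be derived from the data rather than fixed in advance. It is an illustration under stated assumptions, not the authors' implementation: the function names, the nearest-rank percentile rule, the tokenizer, and the toy sentences are all hypothetical, and the abstract does not specify how GreedSum computes its threshold.

```python
from collections import Counter
import math
import re


def document_frequencies(sentences):
    """Count, for each term, how many sentences contain it at least once."""
    df = Counter()
    for sentence in sentences:
        # A set ensures a term is counted once per sentence.
        df.update(set(re.findall(r"[a-z0-9]+", sentence.lower())))
    return df


def percentile_min_df(df, percentile):
    """Choose min_df as a nearest-rank percentile (0-100) of the df values.

    Hypothetical rule for illustration; not taken from the paper.
    """
    values = sorted(df.values())
    if not values:
        return 1  # fall back to keeping every term
    rank = max(1, math.ceil(percentile / 100 * len(values)))
    return values[rank - 1]


def filter_vocabulary(df, min_df):
    """Keep only terms that occur in at least min_df sentences."""
    return {term for term, count in df.items() if count >= min_df}


# Toy input (assumed data, not from the study's corpus).
sentences = [
    "dynamic tuning improves summarization",
    "summarization uses term frequency",
    "term frequency drives sentence selection",
    "dynamic tuning adapts the threshold",
]

df = document_frequencies(sentences)
min_df = percentile_min_df(df, 75)
vocabulary = filter_vocabulary(df, min_df)
print(min_df, sorted(vocabulary))
```

With a fixed min-df the same cutoff is applied to every document; here the cutoff follows the document's own frequency distribution, which is the general idea behind the dynamic strategies the abstract describes.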

Extractive text summarization, GreedSum dynamic parameter optimization, Natural language processing, TF-IDF vectorization


Department, Satbayev University, Kanysh Satpayev 22 street, Almaty, 050013, Kazakhstan
Department, Kazakh-British Technical University, Tole Bi 59 street, Almaty, 050000, Kazakhstan
Department, Instituto Politécnico Nacional (IPN), Mexico City, Mexico
