Finetuning Large Language Models for Vulnerability Detection
Shestov A., Levichev R., Mussabayev R., Maslov E., Zadorozhny P., Cheshkov A., Mussabayev R., Toleu A., Tolegen G., Krassovitskiy A.
2025, Institute of Electrical and Electronics Engineers Inc.
IEEE Access
2025, vol. 13, pp. 38889-38900
This paper presents the results of finetuning large language models (LLMs) to detect vulnerabilities in Java source code. We leverage WizardCoder, a recent improvement of the state-of-the-art LLM StarCoder, and adapt it for vulnerability detection through further finetuning. To accelerate training, we modify WizardCoder’s training procedure, and we investigate optimal training regimes. For the imbalanced dataset, which has many more negative examples than positive ones, we also explore different techniques to improve classification performance. The finetuned WizardCoder model improves ROC AUC and F1 on both balanced and imbalanced vulnerability datasets over a CodeBERT-like model, demonstrating the effectiveness of adapting pretrained LLMs for vulnerability detection in source code. The key contributions are finetuning the state-of-the-art code LLM WizardCoder, increasing its training speed without harming performance, optimizing the training procedure and regimes, handling class imbalance, and improving performance on difficult vulnerability detection datasets. These results demonstrate the potential of transfer learning, that is, finetuning large pretrained language models for specialized source code analysis tasks.
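The abstract mentions handling class imbalance and reporting F1, but it does not name the specific techniques used. As an illustration only, the following minimal sketch shows one common option, a class-weighted binary cross-entropy, together with a plain F1 computation. The function names and the choice of weighting (negatives divided by positives) are assumptions made for this example and are not taken from the paper.

```python
import math


def positive_class_weight(labels):
    """Hypothetical helper: weight positives by n_neg / n_pos so that
    both classes contribute equally to the total loss."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    return n_neg / n_pos


def weighted_bce(probs, labels, pos_weight):
    """Mean binary cross-entropy with the positive term scaled by pos_weight."""
    eps = 1e-12  # clamp probabilities to avoid log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def f1_score(preds, labels):
    """F1 for the positive (vulnerable) class from hard 0/1 predictions."""
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data: one vulnerable function among five (assumed, not from the paper).
    labels = [1, 0, 0, 0, 0]
    weight = positive_class_weight(labels)
    print(weight)
    print(weighted_bce([0.5] * 5, labels, weight))
    print(f1_score([1, 1, 0, 0, 0], labels))
```

With this weighting, a single positive example among four negatives carries as much total loss as all the negatives combined, which discourages a classifier from simply predicting "not vulnerable" everywhere.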
cybersecurity, finetuning, Large language models, LoRA, PEFT, StarCoder, vulnerability detection, WizardCoder
Sber AI Lab, Moscow, 117312, Russian Federation
SaluteDevices, Moscow, 117312, Russian Federation
Satbayev University, AI Research Lab, Almaty, 050000, Kazakhstan
Huawei Russian Research Institute, Software Development Tools Cloud Technology Laboratory, Moscow, 121099, Russian Federation