A Systematic Evaluation of Large Language Models and Retrieval-Augmented Generation for the Task of Kazakh Question Answering
Mansurova A., Tleubayeva A., Nugumanova A., Shomanov A., Seker S.E.
November 2025, Multidisciplinary Digital Publishing Institute (MDPI)
Information (Switzerland)
2025, Volume 16, Issue 11
This paper presents a systematic evaluation of large language models (LLMs) and retrieval-augmented generation (RAG) approaches for question answering (QA) in Kazakh, a low-resource language. We assess proprietary models (GPT-4o, Gemini 2.5-flash) and open-source Kazakh-oriented models (KazLLM-8B, Sherkala-8B, Irbis-7B) in closed-book and RAG settings. Within a three-stage evaluation framework, we benchmark retriever quality, examine LLM abilities such as knowledge-gap detection, external truth integration, and context grounding, and measure gains from realistic end-to-end RAG pipelines. Our results show a clear pattern: proprietary models lead in closed-book QA, but RAG narrows the gap substantially. Under the Ideal RAG setting, KazLLM-8B improves from its closed-book baseline of 0.427 to an answer correctness of 0.867, closely matching GPT-4o’s score of 0.869. In the end-to-end RAG setup, KazLLM-8B paired with the Snowflake retriever achieves answer correctness of up to 0.754, surpassing GPT-4o’s best score of 0.632. Despite these improvements, RAG outcomes are inconsistent: high retrieval metrics do not guarantee high end-to-end QA accuracy. The findings highlight the importance of retrievers and context grounding strategies in enabling open-source Kazakh models to deliver competitive QA performance in a low-resource setting.
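The size of the gains reported in the abstract can be checked with simple arithmetic. The sketch below uses only the answer-correctness scores quoted above. The dictionary keys and the `gain` helper are illustrative names, not part of the paper, and the code assumes the GPT-4o score of 0.869 refers to the same Ideal RAG setting.

```python
# Answer-correctness scores quoted in the abstract.
# Key names are hypothetical labels chosen for this sketch.
scores = {
    "kazllm_closed_book": 0.427,
    "kazllm_ideal_rag": 0.867,
    "gpt4o_ideal_rag": 0.869,   # assumed to be GPT-4o under Ideal RAG
    "kazllm_e2e_best": 0.754,   # KazLLM-8B with the Snowflake retriever
    "gpt4o_e2e_best": 0.632,
}


def gain(after: float, before: float) -> float:
    """Absolute difference in answer correctness, rounded to 3 decimals."""
    return round(after - before, 3)


# Improvement of KazLLM-8B when moving from closed-book to Ideal RAG.
ideal_gain = gain(scores["kazllm_ideal_rag"], scores["kazllm_closed_book"])

# Remaining gap to GPT-4o under Ideal RAG.
ideal_gap = gain(scores["gpt4o_ideal_rag"], scores["kazllm_ideal_rag"])

# Lead of KazLLM-8B over GPT-4o in the end-to-end setup.
e2e_lead = gain(scores["kazllm_e2e_best"], scores["gpt4o_e2e_best"])

print(f"Ideal RAG gain for KazLLM-8B: {ideal_gain}")
print(f"Gap to GPT-4o under Ideal RAG: {ideal_gap}")
print(f"End-to-end lead over GPT-4o: {e2e_lead}")
```

Under these assumptions, the gain from Ideal RAG is 0.44 absolute points, the remaining gap to GPT-4o is 0.002, and the end-to-end lead is 0.122.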
information retrieval (IR), Kazakh language, language model evaluation, large language models (LLM), low-resource language, question answering (QA) system, retrieval augmented generation (RAG), sentence embeddings
Big Data and Blockchain Technologies Research and Innovation Center, Astana IT University, Astana, 020000, Kazakhstan
School of Artificial Intelligence and Data Science, Astana IT University, Astana, 020000, Kazakhstan
Computer Science Department, Nazarbayev University, Astana, 020000, Kazakhstan
Department of Computer Engineering, Faculty of Computer and Information Technologies, Istanbul University, Istanbul, 34320, Turkey