Development of a Hybrid Span-QA Model With Ontology Integration for Semantic Enrichment of Answers
Yergesh M., Yergesh B., Sharipbay A., Seker S.E., Maxutova K.
2025, Institute of Electrical and Electronics Engineers Inc.
IEEE Access
2025, vol. 13, pp. 165927-165940
In natural language processing, building accurate and semantically rich question answering (QA) systems for low-resource languages remains a largely unresolved challenge. This study presents a hybrid extractive QA model tailored for Kazakh, a morphologically complex and digitally underrepresented language, that integrates dense retrieval with an ontology-based semantic prefixing mechanism. Unlike conventional approaches that rely solely on retrieval and reading comprehension, our architecture injects dynamically constructed semantic definitions of domain-specific terms into the answer context, enabling deeper understanding and improved accuracy. Leveraging a novel dataset of Kazakh QA pairs generated with GPT-4 and validated by experts, we introduce a dual-stream hybrid model (Hybrid B) that combines ontology-driven enrichment with fine-tuned transformer models for the retriever and reader components. The proposed system achieves 88% F1 and 76% exact match, substantially outperforming established baselines, including KazQAD. By anchoring the QA pipeline in a custom-built educational ontology relevant to the target domain, this research demonstrates how semantic structure can compensate for linguistic and data scarcity, paving the way for scalable QA systems in other low-resource languages such as Turkish. Beyond its empirical results, the study contributes a reproducible and modular framework for retrieval-augmented, ontology-aware QA, with implications for educational platforms, intelligent tutoring systems, and domain-specific information access. Through this work, we argue that the future of multilingual QA lies in hybrid architectures that combine symbolic structure with neural adaptability.
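The ontology-based semantic prefixing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the ontology entries, passages, and function names (`retrieve`, `semantic_prefix`, `build_reader_input`) are hypothetical, English strings stand in for Kazakh text, and a bag-of-words cosine similarity stands in for the dense retriever. The extractive reader is omitted; the sketch only shows how definitions of matched domain terms might be prepended to the retrieved context before it is passed to a reader.

```python
import math
import re
from collections import Counter

# Hypothetical toy ontology (term -> definition); English stand-ins for Kazakh.
ONTOLOGY = {
    "algorithm": "a finite sequence of steps for solving a problem",
    "variable": "a named storage location holding a value",
}

# Hypothetical passage collection for the retriever to search.
PASSAGES = [
    "An algorithm is executed step by step by the computer.",
    "A variable stores data that a program can change.",
    "The school year starts in September.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def embed(text):
    """Assumption: bag-of-words counts stand in for a dense sentence encoder."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages):
    """Return the passage most similar to the question."""
    q = embed(question)
    return max(passages, key=lambda p: cosine(q, embed(p)))


def semantic_prefix(question, passage, ontology):
    """Collect definitions of ontology terms found in the question or passage."""
    tokens = set(tokenize(question)) | set(tokenize(passage))
    defs = [f"{term}: {definition}."
            for term, definition in sorted(ontology.items())
            if term in tokens]
    return " ".join(defs)


def build_reader_input(question, passages, ontology):
    """Retrieve a passage and prepend the ontology definitions to it."""
    passage = retrieve(question, passages)
    prefix = semantic_prefix(question, passage, ontology)
    context = f"{prefix} {passage}".strip()
    return {"question": question, "context": context}


if __name__ == "__main__":
    result = build_reader_input("What does a variable store?", PASSAGES, ONTOLOGY)
    print(result["context"])
```

In a fuller pipeline, the returned `question` and `context` pair would be handed to a fine-tuned extractive reader, and `embed` would be replaced by a sentence encoder; both are left out here to keep the sketch dependency-free.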
Extractive QA, Kazakh language, low-resource language, ontology, retrieval-augmentation, sentence-BERT
L.N. Gumilyov Eurasian National University, Department of Artificial Intelligence Technologies, Astana, 010000, Kazakhstan
L.N. Gumilyov Eurasian National University, Digital Development Department, Astana, 010000, Kazakhstan
Istanbul University, Faculty of Computer and Information Technologies, Department of Computer Engineering, Istanbul, 34320, Turkey