The issues of developing the historical subcorpus of the National Corpus of the Kazakh Language


Fazylzhanova A., Seitbekova A., Kobdenova G., Seidamat A., Ayazbayev G.
1 May 2025, De Gruyter Mouton

Lodz Papers in Pragmatics
2025, Vol. 21, Issue 1, pp. 169-191

This study investigates the main issues in the next stage of development of the historical subcorpus of the National Corpus of the Kazakh Language. Examining the subcorpus brought out issues of metatext markup, software improvement, and transcription. The analysis of the historical subcorpus showed the following: it contains texts from the 12th and the 14th-20th centuries; search is carried out by parameters such as author, text style, text graphics (script), text title, text genre, and century; Arabic, Cyrillic, and Latin graphics are represented; and the genres covered include poems, prose, heroic songs, articles, religious epics, and novels. The study of the first phase of the historical subcorpus showed that there is a need to draw on the experience of other national corpora and to develop mechanisms for the active inclusion of texts from different periods, in particular from the fifth to nineteenth centuries, the tenth to fifteenth centuries, and the sixteenth to nineteenth centuries. Improving metatextual markup, providing more information about the texts, and solving transcription problems are also important issues.

genre, graphics, meta-markup, software, transcription


Institute of Linguistics Named after A. Baitursynov, 29 Kurmangazy Str., Almaty, 050010, Kazakhstan

