Transformer-Based Re-Ranking Model for Enhancing Contextual and Syntactic Translation in Low-Resource Neural Machine Translation
Javed A., Zan H., Mamyrbayev O., Abdullah M., Ahmed K., Oralbekova D., Dinara K., Akhmediyarova A.
January 2025, Multidisciplinary Digital Publishing Institute (MDPI)
Electronics (Switzerland)
2025, Volume 14, Issue 2
Neural machine translation (NMT) plays a vital role in modern communication by bridging language barriers and enabling effective information exchange across diverse linguistic communities. Due to the limited availability of data in low-resource languages, NMT faces significant translation challenges. Data sparsity limits NMT models’ ability to learn, generalize, and produce accurate translations, which leads to low coherence and poor context awareness. This paper proposes a transformer-based approach incorporating an encoder–decoder structure, bilingual curriculum learning, and contrastive re-ranking mechanisms. Our approach enriches the training dataset using back-translation and enhances the model’s contextual learning through BERT embeddings. An incomplete-trust (in-trust) loss function is introduced to replace the traditional cross-entropy loss during training. The proposed model effectively handles out-of-vocabulary words and integrates named entity recognition techniques to maintain semantic accuracy. Additionally, the self-attention layers in the transformer architecture enhance the model’s syntactic analysis capabilities, which enables better context awareness and more accurate translations. Extensive experiments are performed on a diverse Chinese–Urdu parallel corpus, developed using human effort and publicly available datasets such as OPUS, WMT, and WiLi. The proposed model demonstrates a BLEU score improvement of 1.80% for Zh→Ur and 2.22% for Ur→Zh compared to the highest-performing comparative model. This significant enhancement indicates better translation quality and accuracy.
BERT, M2M model, neural machine translation, re-ranking, syntactic analysis, transformer
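The abstract names an incomplete-trust (in-trust) loss as a replacement for cross-entropy but does not give its formula. The sketch below illustrates one formulation from the noisy-label literature, which adds to the usual cross-entropy a term that mixes the model's own prediction with the label, so that a possibly noisy label is only partly trusted. This form, the function names, and the weights `alpha`, `beta`, and `delta` are all assumptions made for illustration; they are not taken from the paper and may differ from what the authors used.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def in_trust_loss(logits, target, alpha=1.0, beta=1.0, delta=0.5, eps=1e-8):
    """Illustrative in-trust loss for a single token (assumed formulation).

    loss = alpha * CE + beta * DCE, where
      CE  = -log p[target]
      DCE = -sum_i p_i * log(delta * p_i + (1 - delta) * q_i)
    p is the model distribution and q the one-hot label.
    alpha, beta, delta are hypothetical defaults, not values from the paper.
    """
    p = softmax(logits)
    q = [1.0 if i == target else 0.0 for i in range(len(p))]
    # Standard cross-entropy against the label.
    ce = -math.log(max(p[target], eps))
    # Label is trusted only with weight (1 - delta); the rest comes from p.
    dce = -sum(
        pi * math.log(max(delta * pi + (1.0 - delta) * qi, eps))
        for pi, qi in zip(p, q)
    )
    return alpha * ce + beta * dce


if __name__ == "__main__":
    # Confident correct prediction vs. confident wrong prediction.
    print(in_trust_loss([5.0, 0.0, 0.0], target=0))
    print(in_trust_loss([0.0, 5.0, 0.0], target=0))
```

With `beta=0` the function reduces to plain cross-entropy, which makes it easy to compare the two objectives on the same batch.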
School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, 450001, China
Institute of Information and Computational Technologies, Almaty, 050010, Kazakhstan
School of Software, Henan University, Kaifeng, 475001, China
Academy of Logistics and Transport, Almaty, 050010, Kazakhstan
Institute of Automation and Information Technologies, Satbayev University, Almaty, 050013, Kazakhstan