Comparative evaluation of machine learning models for museum exhibit recognition from video-derived datasets
Ipalakova M., Bolatov Z., Daineko Y., Sharshova R., Abdugapparova K., Tsoy D.
2 October 2025, PeerJ Inc.
PeerJ Computer Science
2025, Volume 11
This study evaluates the performance of multiple deep learning models for automatic recognition of museum artifacts using image frames extracted from real-world video footage. A comparative analysis is conducted across eight state-of-the-art architectures, namely MobileNetV3, ResNetV2, EfficientNetV2, You Only Look Once v8 (YOLOv8), Visual Geometry Group 16 (VGG16), ConvNeXtTiny, SwinV2-Base, and Dual Attention Vision Transformer (DaViT), on a custom dataset collected in an actual museum environment. The dataset comprises labeled video frames categorized by artifact type and is used to train and test models for both classification and object detection tasks. Results indicate that YOLOv8, MobileNetV3, and DaViT achieve superior performance for real-time mobile and augmented reality (AR) applications, while ResNetV2 and SwinV2-Base provide high classification accuracy suitable for archival and cataloging systems. This work offers practical guidance on dataset design, model choice, and deployment strategies for artificial intelligence (AI)-powered cultural heritage technologies.
Artificial Intelligence, Augmented reality (AR), Computer Vision, Cultural heritage, Data Mining and Machine Learning, Machine learning, Museum exhibit recognition, Neural Networks, Object detection, Real-time recognition, Video-derived datasets
International Information Technology University, Almaty, Kazakhstan