Comparative evaluation of machine learning models for museum exhibit recognition from video-derived datasets


Ipalakova M., Bolatov Z., Daineko Y., Sharshova R., Abdugapparova K., Tsoy D.
2 October 2025, PeerJ Inc.

PeerJ Computer Science
2025#11

This study evaluates the performance of multiple deep learning models for automatic recognition of museum artifacts using image frames extracted from real-world video footage. A comparative analysis is conducted across eight state-of-the-art architectures (MobileNetV3, ResNetV2, EfficientNetV2, You Only Look Once v8 (YOLOv8), Visual Geometry Group 16 (VGG16), ConvNeXtTiny, SwinV2-Base, and Dual Attention Vision Transformer (DaViT)) on a custom dataset collected in an actual museum environment. The dataset comprises labeled video frames categorized by artifact type and is used to train and test models for both classification and object detection tasks. Results indicate that YOLOv8, MobileNetV3, and DaViT achieve superior performance for real-time mobile and augmented reality (AR) applications, while ResNetV2 and SwinV2-Base provide high classification accuracy suitable for archival and cataloging systems. This work offers practical guidance on dataset design, model choice, and deployment strategies for artificial intelligence (AI)-powered cultural heritage technologies.

Artificial Intelligence, Augmented reality (AR), Computer Vision, Cultural heritage, Data Mining and Machine Learning, Machine learning, Museum exhibit recognition, Neural Networks, Object detection, Real-time recognition, Video-derived datasets


International Information Technology University, Almaty, Kazakhstan

