Air Pollution Forecasting in Almaty Based on Meteorological Data Using Machine Learning for Sustainable Environmental Management
Domalatov Y., Chezhimbayeva K., Issakov B., Amirgalina A., Zharkymbekova M., Sakitzhanov M., Tokseit D.
31 December 2025, International Information and Engineering Technology Association
International Journal of Sustainable Development and Planning
2025, Vol. 20, Issue 12, pp. 5227-5246
Sustainable air quality management in large, rapidly growing megacities requires forecasting systems that can account for nonlinear interactions between meteorological conditions and the dynamics of suspended particles. Almaty, characterized by pronounced mountain-valley circulation and frequent winter inversions, is one of the cities in Central Asia where PM2.5 and PM10 concentrations regularly exceed WHO recommendations. In this study, an interpretable model for short-term and conditional medium-term air pollution forecasting was developed based on Random Forest and LSTM algorithms, using data from AQICN, AirKaz, Dashboard.air.org.kz, Ogimet and ERA5 for 2020–2024. Modelling was performed in two scenarios: (A) using only pollutant concentration lags and (B) adding a complete set of meteorological parameters, including temperature, relative humidity, wind speed, boundary layer height (BLH), surface pressure and cloud cover. Accuracy assessment at 7- and 30-day horizons showed that including meteorological data significantly improves forecast quality, especially for PM2.5, with Random Forest providing the most stable RMSE and MAE values. The LSTM model is highly sensitive to short-term peak values and reflects the dynamics of pollution episodes more accurately. Feature importance analysis shows the key role of atmospheric stability (BLH), wind regime, and autocorrelation structure in the formation of winter smog situations. Compared with the baseline methods (Persistence and Seasonal Naïve), the models perform poorly over a 7-day horizon and in some cases are inferior to the Persistence method, whereas over a 30-day horizon the improvement reaches 40% for PM2.5 and 15% for PM10. The developed system has high potential for integration into digital monitoring platforms, early warning services, and Smart City solutions.
The study fills an existing scientific gap in the field of interpretable weather-dependent air quality forecasting for cities with mountain-valley circulation in Central Asia and strengthens the analytical basis for sustainable environmental management.
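The comparison against the Persistence and Seasonal Naïve baselines mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weekly season length of 7 days, and the definition of improvement as relative error reduction are assumptions made here for illustration, since the abstract does not specify them. The sample values in the usage note are made up and are not data from the study.

```python
import math


def persistence_forecast(history, horizon):
    """Persistence baseline: repeat the last observed value for each step."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season=7):
    """Seasonal Naive baseline: repeat the last full season cyclically.

    season=7 (weekly cycle for daily data) is an assumption, not a value
    taken from the paper.
    """
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def improvement_pct(model_error, baseline_error):
    """Relative error reduction versus a baseline, in percent.

    A negative value means the model is worse than the baseline.
    This definition is assumed here; the abstract does not state it.
    """
    return 100.0 * (baseline_error - model_error) / baseline_error
```

As a usage example, with a hypothetical daily PM2.5 history `[30, 32, 35, 40, 38, 36, 34]` and a 3-day horizon, `persistence_forecast` returns `[34, 34, 34]` and `seasonal_naive_forecast` returns `[30, 32, 35]`. Passing a model's RMSE and a baseline's RMSE to `improvement_pct` gives a percentage of the kind quoted in the abstract, under the assumed definition.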
air quality management, LSTM, machine learning, PM2.5/PM10 pollution forecasting, Random Forest, Smart City
Department of Economics, Management and Finance, Sarsen Amanzholov East Kazakhstan University, Ust-Kamenogorsk, 070000, Kazakhstan
Humboldt Innovation GmbH, Humboldt University of Berlin, Berlin, 10099, Germany
Department of Telecommunications Engineering, Almaty University of Power Engineering and Telecommunications named after Gumarbek Daukeyev, Almaty, 050013, Kazakhstan
Department of Industrial Safety and Ecology, Abylkas Saginov Karaganda Technical University, Karaganda, 100027, Kazakhstan
Department of Electrical Power Engineering, Almaty University of Power Engineering and Telecommunications named after Gumarbek Daukeyev, Almaty, 050013, Kazakhstan
Department of Information Security, L.N. Gumilyov Eurasian National University, Astana, 010008, Kazakhstan