Data-driven total organic carbon prediction using feature selection methods incorporated in an automated machine learning framework


Macêdo B.D.S., Wayo D.D.K., Campos D., De Santis R.B., Martinho A.D., Yaseen Z.M., Saporetti C.M., Goliatt L.
December 2025, Nature Research

Scientific Reports
2025, Volume 15, Issue 1

An accurate assessment of shale gas resources is essential for their sustainable development. Total organic carbon (TOC) analysis is therefore fundamental for understanding the distribution and quality of hydrocarbon source rocks within a shale gas reservoir. Elevated TOC is often associated with the presence of source rocks, indicating potential for oil and gas production. TOC is usually assessed with laboratory methods, which can be time-consuming and costly. Data-driven models have been successfully applied to model the relationship between TOC and other constituents and to predict TOC content. However, these methods depend on extensive parameter adjustments that must be carried out carefully for each sedimentary environment. In this context, Automated Machine Learning (AutoML) is an alternative for accurately predicting TOC that avoids time-consuming fine-tuning steps during model development. This study aims to develop an AutoML strategy for estimating TOC from well log data. The procedure automates preprocessing and the search for the best method parameters, reducing execution time. Among the methods evaluated, Extremely Randomized Trees (XT) performed best on the test set (R = 0.8632, MSE = 0.1806). The proposed strategy provides a powerful data-driven method that can be applied to real-world well data to assist in data analysis and subsequent decision-making.
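The two scores quoted in the abstract can be illustrated with a minimal sketch. It assumes that R denotes the Pearson correlation coefficient between measured and predicted TOC, which the abstract does not state, and that MSE is the ordinary mean squared error. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed meaning of R)."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)


# Hypothetical TOC values (wt%), invented for illustration only.
toc_measured = [1.0, 2.0, 3.0, 4.0]
toc_predicted = [1.5, 1.5, 3.5, 3.5]

print(f"R   = {pearson_r(toc_measured, toc_predicted):.4f}")
print(f"MSE = {mse(toc_measured, toc_predicted):.4f}")
```

A higher R and a lower MSE on held-out test samples indicate a better fit, which is the sense in which the abstract ranks XT above the other methods evaluated.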




Department of Computer Science, Federal University of Lavras, MG, Lavras, 37200-000, Brazil
Faculty of Chemical and Process Engineering Technology, Universiti Malaysia Pahang Al-Sultan Abdullah, Kuantan, 26300, Malaysia
Department of Petroleum Engineering, School of Mining and Geosciences, Nazarbayev University, Astana, 010000, Kazakhstan
Computational Modeling Program, Engineering Faculty, Federal University of Juiz de Fora, Juiz de Fora, 36036-900, Brazil
Department of Computer Science, Federal University of Juiz de Fora, Juiz de Fora, 36036-900, Brazil
Exact Sciences and Technology Department, Púnguè University, Tete Delegation, Campus Universitário de Cambinde - EN106, Tete, Matundo, Mozambique
Civil and Environmental Engineering Department, King Fahd University of Petroleum & Minerals, Dhahran, 31261, Saudi Arabia
Department of Computational Modeling, Polytechnic Institute, Rio de Janeiro State University, Nova Friburgo, 22000-900, Brazil
Department of Computational and Applied Mechanics, Federal University of Juiz de Fora, Juiz de Fora, 36036-900, Brazil

