Multi-Task Attention-Guided Deep Learning for Simultaneous Breast Cancer Detection and Density Estimation in Mammography

Esen G., Nurtas M., la Paglia L., Amankulov J., Matkerim B., Altaibek A.
2025 Institute of Electrical and Electronics Engineers Inc.

IEEE Access
2025, vol. 13, pp. 198938–198951

Early detection of breast cancer is critical for improving patient outcomes, and deep learning (DL) has shown promise in mammographic analysis. However, many existing systems tackle a single task (e.g., malignancy classification) and provide limited interpretability, which hinders clinical adoption. We present a multi-task, attention-guided framework that jointly predicts malignancy (benign vs. malignant) and BI-RADS breast density (A–D) from a single mammogram while producing interpretable attention maps. The architecture couples a pretrained convolutional backbone with lightweight channel–spatial attention and two task-specific heads. We evaluate six backbones (ResNet-50, DenseNet-121, EfficientNet-B0/B3/B7, MobileNetV3-Large) under a rigorous evaluation protocol with a stratified hold-out split (64%/16%/20%) and 5-fold cross-validation. To support reliable deployment, we report comprehensive metrics including discrimination measures (ROC-AUC, PR-AUC, F1-score, sensitivity, specificity), calibration metrics (Brier score, Expected Calibration Error), and ablation studies contrasting single-task vs. multi-task learning and models with vs. without attention mechanisms. Across all evaluation protocols, multi-task learning consistently outperforms single-task baselines by 2.8% in pathology accuracy and 3.5% in density estimation, indicating that density cues provide complementary information for malignancy assessment. Among backbones, EfficientNet-B3 attains the strongest discrimination (AUC = 0.962, accuracy = 93.6%) on the hold-out test set with excellent calibration (Brier = 0.010), while MobileNetV3-Large achieves competitive performance (AUC = 0.940, accuracy = 91.2%) with only 4.26M parameters, an optimal accuracy–efficiency trade-off for resource-constrained deployment. Cross-validation results demonstrate robust generalization (AUC = 0.954 ± 0.003, accuracy = 93.0 ± 0.3%).
Attention maps highlight diagnostically relevant regions, including architectural distortions and suspicious calcifications, offering an interpretable visual rationale aligned with radiological assessment. We discuss limitations including dataset size (410 images), class imbalance in rare density categories, and calibration challenges (ECE = 0.365–0.419), and outline directions for external validation on multi-institutional cohorts and integration of uncertainty quantification methods.
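
The two calibration metrics named in the abstract, the Brier score and the Expected Calibration Error (ECE), can be illustrated for the binary malignancy task with a minimal sketch. It assumes equal-width binning over the predicted positive-class probability with 10 bins; the abstract does not state the binning scheme the authors used, and the function names and toy values below are hypothetical, not taken from the paper.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE with equal-width bins over the predicted positive-class probability.

    Assumption: reliability-diagram style ECE, comparing the mean predicted
    probability in each bin with the observed positive rate in that bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[idx].append((p, y))

    n = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / n) * abs(pos_rate - mean_conf)
    return ece


# Toy, made-up predictions (not data from the paper): 1 = malignant, 0 = benign.
probs = [0.9, 0.8, 0.2, 0.1]
labels = [1, 1, 0, 0]
print("Brier:", brier_score(probs, labels))
print("ECE:  ", expected_calibration_error(probs, labels))
```

The two metrics can disagree: the Brier score averages squared errors over all cases, while ECE measures the gap between confidence and observed frequency per bin, so a model may score well on one and poorly on the other.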

Attention mechanism, breast cancer, clinical decision support, deep learning, interpretable AI, mammography, multi-task learning

International Information Technology University, Faculty of Information Technology, Almaty, 050000, Kazakhstan
Al-Farabi Kazakh National University, Faculty of Information Technology and Artificial Intelligence, Almaty, 050040, Kazakhstan
Institute for High Performance Computing and Networking (ICAR), National Research Council (CNR) of Italy, Palermo, 90146, Italy
Kazakh Institute of Oncology and Radiology, Almaty, 050000, Kazakhstan
