Explainable Machine Learning for Multi-Class Classification of Internet Firewall Traffic

Titik Misriati(1), Riska Aryanti(2*)

(1) Universitas Bina Sarana Informatika, Jakarta
(2) Universitas Bina Sarana Informatika, Jakarta
(*) Corresponding Author

Abstract


The increasing diversity and scale of network traffic make accurate and interpretable firewall analysis difficult. This research aims to bridge the gap between predictive performance and model transparency by developing an explainable machine learning framework for multi-class firewall traffic classification. The study uses the Internet Firewall Data dataset, which consists of 65,532 network traffic instances distributed across four firewall action classes, and evaluates seven classification algorithms: Decision Tree, Random Forest, XGBoost, Support Vector Machine, k-Nearest Neighbors, Naïve Bayes, and Logistic Regression. The dataset was partitioned using a stratified 80:20 hold-out approach to preserve the original class distribution. The experimental process involved data preprocessing, normalization, and validation on an independent test set using accuracy, precision, recall, and F1-score. The findings show that XGBoost achieves the highest performance, with an accuracy of 99.81%, followed by Decision Tree and Random Forest. This indicates that ensemble and tree-based approaches are highly effective in modeling complex and non-linear traffic patterns. To improve interpretability, the study incorporates explainable artificial intelligence techniques, including feature importance and SHAP analysis. The results show that traffic-related attributes significantly influence classification outcomes, providing meaningful insights into firewall decision behavior.
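The evaluation protocol described in the abstract (stratified 80:20 hold-out, normalization, and accuracy, precision, recall, and F1-score on the held-out set) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. It assumes a synthetic four-class dataset in place of the Internet Firewall Data dataset, made-up class weights, macro-averaged metrics, and only two of the seven classifiers. XGBoost and SHAP are left out to keep the example dependency-light, and the impurity-based feature importance of the Random Forest stands in for the explainability step. The function name `run_sketch` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def run_sketch(seed=0):
    """Hypothetical helper: stratified hold-out evaluation on synthetic data."""
    # ASSUMPTION: synthetic stand-in for the firewall log, with four
    # imbalanced classes; the weights and feature count are made up.
    X, y = make_classification(
        n_samples=2000, n_features=11, n_informative=6, n_classes=4,
        weights=[0.55, 0.25, 0.15, 0.05], random_state=seed,
    )

    # Stratified 80:20 hold-out: each class keeps (almost) the same share
    # in the training and test partitions.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    # Fit the scaler on the training partition only, so that no statistics
    # from the test set leak into preprocessing.
    scaler = StandardScaler().fit(X_train)
    X_train_s = scaler.transform(X_train)
    X_test_s = scaler.transform(X_test)

    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train_s, y_train)
        y_pred = model.predict(X_test_s)
        # ASSUMPTION: macro averaging; the abstract does not state which
        # averaging scheme was used.
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_test, y_pred, average="macro", zero_division=0
        )
        results[name] = {
            "accuracy": float(accuracy_score(y_test, y_pred)),
            "precision": float(precision),
            "recall": float(recall),
            "f1": float(f1),
        }

    # Impurity-based importance from the forest; a simpler stand-in for
    # the SHAP analysis mentioned in the abstract.
    importances = models["random_forest"].feature_importances_
    return results, (y_train, y_test), importances


if __name__ == "__main__":
    results, _, importances = run_sketch()
    for name, metrics in results.items():
        print(name, {k: round(v, 4) for k, v in metrics.items()})
    print("top feature index:", int(np.argmax(importances)))
```

The scores printed by this sketch describe the synthetic data only and are not comparable to the 99.81% accuracy reported for XGBoost on the real dataset.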


Keywords


Ensemble Machine Learning; Firewall Log Analysis; Network Traffic Classification; SHAP Explainability; XAI





DOI: http://dx.doi.org/10.61944/bids.v5i1.163



Copyright (c) 2026 Titik Misriati, Riska Aryanti

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Bulletin of Informatics and Data Science
Asosiasi Peneliti Data Science Indonesia
Email: pdsi.bids@gmail.com