Research Article | Peer-Reviewed

Gradient Boosting Revisited: Comparative Analysis of Selected Advances on Real-World Tabular Data

Received: 26 April 2026     Accepted: 9 May 2026     Published: 12 June 2026
Abstract

Gradient Boosting has become one of the approaches designed to improve general predictive performance and to overcome specific learning challenges. Although the technique is mature, new adaptive variants continue to be developed to enhance flexibility, efficiency, and overall predictive power. However, few benchmarking studies have sought to establish the generalization abilities of these techniques, especially the newer variants, under varying conditions. This study therefore conducts a systematic analysis of seven Gradient Boosting models (XGBoost, LightGBM, CatBoost, HistGradientBoosting, GradientBoosting, AdaBoost, and the adaptive MorphBoost) on ten benchmark datasets that pose different challenges. All models were trained using a fixed 80:20 train–test split, with 3-fold cross-validation performed solely on the training portion to estimate stability. Performance was measured using accuracy, F1-score, and ROC-AUC to support a fair and reproducible comparison. The findings indicate that CatBoost produced the highest mean accuracy (0.9400) and a near-perfect ROC-AUC (0.9915), which suggests that it generalizes effectively across diverse data types. HistGradientBoosting is identified as the most stable model across datasets, combining good performance with computational efficiency, followed by LightGBM and XGBoost. MorphBoost shows promise on binary and high-dimensional datasets where its implementation is fully supported, though its current lack of native multiclass handling limits its general applicability. Overall, the research confirms that no single model fits all circumstances; rather, dataset characteristics directly influence model performance. These results offer practical guidance on the choice of boosting models and point to areas where future research, particularly on adaptive and hybrid boosting techniques, can further enhance performance and generalization.

Published in Machine Learning Research (Volume 11, Issue 1)
DOI 10.11648/j.mlr.20261101.14
Page(s) 37-52
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2026. Published by Science Publishing Group

Keywords

Gradient Boosting, XGBoost, LightGBM, CatBoost, MorphBoost, NGBoost, Ensemble Learning, Stacking Ensemble, Tabular Data, Explainable AI

Cite This Article
  • APA Style

    Agebure, M. A., Wiredu, J. K., Akobre, S. (2026). Gradient Boosting Revisited: Comparative Analysis of Selected Advances on Real-World Tabular Data. Machine Learning Research, 11(1), 37-52. https://doi.org/10.11648/j.mlr.20261101.14


    ACS Style

    Agebure, M. A.; Wiredu, J. K.; Akobre, S. Gradient Boosting Revisited: Comparative Analysis of Selected Advances on Real-World Tabular Data. Mach. Learn. Res. 2026, 11(1), 37-52. doi: 10.11648/j.mlr.20261101.14


    AMA Style

    Agebure MA, Wiredu JK, Akobre S. Gradient Boosting Revisited: Comparative Analysis of Selected Advances on Real-World Tabular Data. Mach Learn Res. 2026;11(1):37-52. doi: 10.11648/j.mlr.20261101.14


  • BibTeX

    @article{10.11648/j.mlr.20261101.14,
      author = {Moses Apambila Agebure and Japheth Kodua Wiredu and Stephen Akobre},
      title = {Gradient Boosting Revisited: Comparative Analysis of Selected Advances on Real-World Tabular Data},
      journal = {Machine Learning Research},
      volume = {11},
      number = {1},
      pages = {37-52},
      doi = {10.11648/j.mlr.20261101.14},
      url = {https://doi.org/10.11648/j.mlr.20261101.14},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.mlr.20261101.14},
     year = {2026}
    }
    


  • RIS

    TY  - JOUR
    T1  - Gradient Boosting Revisited: Comparative Analysis of Selected Advances on Real-World Tabular Data
    AU  - Moses Apambila Agebure
    AU  - Japheth Kodua Wiredu
    AU  - Stephen Akobre
    Y1  - 2026/06/12
    PY  - 2026
    N1  - https://doi.org/10.11648/j.mlr.20261101.14
    DO  - 10.11648/j.mlr.20261101.14
    T2  - Machine Learning Research
    JF  - Machine Learning Research
    JO  - Machine Learning Research
    SP  - 37
    EP  - 52
    PB  - Science Publishing Group
    SN  - 2637-5680
    UR  - https://doi.org/10.11648/j.mlr.20261101.14
    VL  - 11
    IS  - 1
    ER  - 

