Ensemble-Based Detection and Classification of Liver Diseases Caused by Hepatitis C

Document Type: Original Article

Authors

Faculty of Engineering & Technology, University of Mazandaran, Babolsar, Iran

Abstract

The liver, the largest internal organ in the human body, carries out more than 500 metabolic functions essential to maintaining the body. The Hepatitis C Virus (HCV) poses a grave threat to liver health, so liver disease must be identified early to halt progression to carcinoma and potentially save lives. This research trains ensemble-based algorithms to detect and classify Hepatitis, Fibrosis, and Cirrhosis. After preprocessing, 80% of the dataset was used to train five ensemble-based algorithms: AdaBoost, Random Forest, Rotation Forest, XGBoost, and LightGBM. The algorithms were evaluated on four performance metrics: accuracy, precision, recall, and F1-score. LightGBM performed best, with an accuracy of 98.37%. Rotation Forest followed with 96.74%, XGBoost with 95.12%, Random Forest with 94.19%, and AdaBoost with 93.30%. These findings indicate that LightGBM is a promising algorithm for detecting and classifying liver diseases. By applying ensemble-based machine learning, this research contributes to ongoing efforts to enhance early detection, improve patient outcomes, and support more effective management of liver-related ailments in clinical settings.
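To make the evaluation protocol concrete, the following is a minimal sketch, not the authors' implementation, of an 80/20 split and of the four reported metrics for a three-class problem. The helper names (`split_80_20`, `macro_metrics`), the toy labels, and the use of macro-averaging for precision, recall, and F1-score are assumptions made for illustration; the abstract does not state how the multi-class metrics were averaged.

```python
import random
from collections import Counter


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) in an 80/20 split."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def macro_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro-averaging (unweighted mean over classes) is an assumption here.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels (hypothetical, not taken from the study's dataset).
y_true = ["hepatitis", "hepatitis", "fibrosis", "fibrosis", "cirrhosis", "cirrhosis"]
y_pred = ["hepatitis", "fibrosis", "fibrosis", "fibrosis", "cirrhosis", "cirrhosis"]
scores = macro_metrics(y_true, y_pred)
train, test = split_80_20(range(10))
```

In practice, the predictions passed to `macro_metrics` would come from each trained ensemble model applied to the held-out 20% of the data, so that all five algorithms are compared on the same test set.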

Keywords


Volume 1, Issue 1
March 2024
Pages 32-42
  • Receive Date: 01 January 2024
  • Revise Date: 02 February 2024
  • Accept Date: 15 September 2024
  • First Publish Date: 15 September 2024
  • Publish Date: 15 March 2024