Hybrid Algorithms for Heart Data
1st Abyan Arya Syahputra Harahap, Information System, Institute Technology Sepuluh Nopember, Surabaya, Indonesia
2nd Morin Adepatrick Damanik, Information System, Institute Technology Sepuluh Nopember, Surabaya, Indonesia
3rd Inayatul Hafizhah Lubis, Information System, Institute Technology Sepuluh Nopember, Surabaya, Indonesia
[email protected] [email protected] [email protected]
The gradient boosting decision tree algorithm was proposed by Friedman in 2001 [13]. Its principle is to use gradient descent to generate each new tree based on all previous trees, making the objective function as small as possible. XGBoost, short for extreme gradient boosting, is a tree ensemble model that can be used for classification and regression. When XGBoost is applied to regression problems, new regression trees are added continuously, and each newly generated CART tree is fitted to the residuals of the previous model. K denotes the number of trees in the fully trained model, and the sum of the outputs of the K trees is used as the final predicted value [14].

Heart Dataset Kaggle
Classifier    Accuracy
MLDS          92.72%
XGB           73.52%
LR            70.7%
SVM           72.46%
KNN           69.22%
DT            63.14%
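The boosting procedure described above (each new tree is fitted to the residuals of the current model, and the outputs of the K trees are summed) can be illustrated with a minimal sketch. This is plain gradient boosting for squared error with depth-1 trees on one feature, not XGBoost itself, which additionally uses second-order gradient information and regularization. The function names, the toy data, and the learning rate are assumptions made for illustration only.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree (stump) to 1-D inputs by least squares."""
    mean = sum(residuals) / len(residuals)
    best = (float("inf"), xs[0], mean, mean)  # fallback: constant prediction
    # Skip the largest value so the right side of a split is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1:]  # (threshold, left_value, right_value)


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted(xs, ys, n_trees=20, learning_rate=0.3):
    """Add n_trees stumps one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)  # initial prediction: mean of the targets
    preds = [base] * len(ys)
    trees = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        trees.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, trees, learning_rate


def predict(model, x):
    """Final prediction: base value plus the (scaled) sum over all K trees."""
    base, trees, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in trees)


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data (assumed for illustration): a noisy step function.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
for k in (0, 1, 5, 20):
    print(k, round(mse(fit_boosted(xs, ys, n_trees=k), xs, ys), 4))
```

The training error printed for increasing K shrinks as more trees are added, which is the behaviour the residual-fitting description implies.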
TABLE II. ACCURACY RESULT BASED ON MIDTERM REPORT WITH DIFFERENT METHODS

Heart Dataset Kaggle
Classifier                         Accuracy
Random Forest                      98.59%
Artificial Neural Network (ANN)    95.12%
Naive Bayes (NB)                   81.46%

TABLE III. ACCURACY RESULTS WITH HYBRID ALGORITHMS

Heart Dataset Kaggle
Classifier                          Accuracy
XGBoost + Grid Search               100%
Random Forest + Randomized Search   91%
XGBoost + Randomized Search         89%

We analyzed and compared the accuracy of the methods used in the journal paper, the methods from our previous midterm report, and the methods with hybrid algorithms. The accuracies reported in the journal paper range from 63.14% to 92.72%, and those from our midterm methods range from 81.46% to 98.59%. With the hybrid algorithms (XGBoost + Grid Search, Random Forest + Randomized Search, and XGBoost + Randomized Search), the accuracies range from 89% to 100%. After several trials, the highest accuracy, 100%, was obtained with XGBoost + Grid Search, compared with 92.72% for the best method in the journal paper (MLDS) and 98.59% for Random Forest, the best method in our midterm report. We therefore conclude that the best hybrid algorithm achieves higher accuracy than the best method in either prior experiment.

REFERENCES
[1] Uddin, M.N. and Halder, R.K., 2021. An ensemble method based multilayer dynamic system to predict cardiovascular disease using machine learning approach. Informatics in Medicine Unlocked, 24, p.100584.
[2] Ray, S., 2019, February. A quick review of machine learning algorithms. In 2019 International conference on machine learning, big data, cloud and parallel computing (COMITCon) (pp. 35-39). IEEE.
[3] Ahmed, M.R., Mahmud, S.H., Hossin, M.A., Jahan, H. and Noori, S.R.H., 2018, December. A cloud based four-tier architecture for early detection of heart disease with machine learning algorithms. In 2018 IEEE 4th International Conference on Computer and Communications (ICCC) (pp. 1951-1955). IEEE.
[4] Sheeba, P.T., Roy, D. and Syed, M.H., 2022. A metaheuristic-enabled training system for ensemble classification technique for heart disease prediction. Advances in Engineering Software, 174, p.103297.
[5] Bhattacharyya, S. and Dutta, P., 2012. Handbook of Research on Computational Intelligence for Engineering, Science, and Business. IGI Global.
[6] Javeed, A., Zhou, S., Liao, Y., Qasim, I., Noor, A. and Nour, R., 2019. An intelligent learning system based on random search algorithm and optimized random forest model for improved heart disease detection. IEEE Access, 7, pp.180235-180243. https://doi.org/10.1109/ACCESS.2019.2952107
[7] Fordana, M.D.Y. and Rochmawati, N., 2022. Optimisasi Hyperparameter CNN Menggunakan Random Search Untuk Deteksi COVID-19 Dari Citra X-Ray Dada. Journal of Informatics and Computer Science (JINACS), 4(01), pp.10-18.
[8] Javeed, A., Zhou, S., Liao, Y., Qasim, I., Noor, A. and Nour, R., 2019. An intelligent learning system based on random search algorithm and optimized random forest model for improved heart disease detection. IEEE Access, 7, pp.180235-180243.
[9] Yan, H., He, Z., Gao, C., Xie, M., Sheng, H. and Chen, H., 2022. Investment estimation of prefabricated concrete buildings based on XGBoost machine learning algorithm. Advanced Engineering Informatics, 54, p.101789.
[10] Hsu, C., Chang, C. and Lin, C., 2003. A Practical Guide to Support Vector Classification. Technical report, pp.1-16.
[11] Pan, S., Zheng, Z., Guo, Z. and Luo, H., 2022. An optimized XGBoost method for predicting reservoir porosity using petrophysical logs. Journal of Petroleum Science and Engineering, 208, p.109520.
[12] Gong, J., Zhong, X., Tan, J., Liu, Y., Rao, Q., Xiang, T. and Wang, H., 2020. "Grid search + XGBoost" algorithm to establish a predictive model for children with septic shock. Medical Journal of the People's Liberation Army, pp.1-7 (in Chinese).
[13] Friedman, J.H., 2001. Greedy function approximation: a gradient
boosting machine. Annals of statistics, pp.1189-1232.
[14] Nguyen, H., Bui, X.N., Bui, H.B. and Cuong, D.T., 2019. Developing an
XGBoost model to predict blast-induced peak particle velocity in an
open-pit mine: a case study. Acta Geophysica, 67(2), pp.477-490.
[15] Dataset from: https://www.kaggle.com/datasets/johnsmith88/heart-disease-dataset. Access date: 2022-10-10.