Abstract: Heart disease (HD), including heart attack, is a leading cause of death worldwide, and accurately estimating a patient's risk remains a significant challenge in medical data analysis. Early detection and continuous monitoring by physicians can significantly reduce mortality, but heart disease is not always easy to detect, and physicians cannot monitor patients around the clock. Machine learning (ML) offers a promising way to improve diagnosis by making more accurate predictions from healthcare data collected worldwide. This study applies several feature selection methods to develop an effective ML technique for early-stage heart disease prediction. Three feature selection methods were used: chi-square, analysis of variance (ANOVA), and mutual information (MI), producing three selected feature subsets designated SF-1, SF-2, and SF-3. We then evaluated ten ML classifiers to identify the best approach and feature subset: Naive Bayes, support vector machine (SVM), voting, XGBoost, AdaBoost, bagging, decision tree (DT), K-nearest neighbor (KNN), random forest (RF), and logistic regression (LR). The proposed prediction method was validated using a private dataset, a publicly available dataset, and multiple cross-validation techniques. To address class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) was applied. Experimental results showed that the AdaBoost classifier performed best on the combined datasets with the SF-2 feature subset, achieving 96.84% accuracy, 95.32% sensitivity, 91.12% specificity, 94.67% precision, a 92.36% F1 score, and 98.50% AUC. In addition, an explainable artificial intelligence approach based on SHAP is being developed to provide insight into the system's prediction process. The proposed technique shows significant promise for the healthcare sector, enabling early-stage heart disease prediction at reduced cost and in less time.
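To illustrate the oversampling step named in the abstract, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbors. It is not the authors' implementation and not a library API. The function name `smote_sketch`, its parameters, and the two-feature toy records are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Simplified illustration of SMOTE (assumption: numeric features only,
    at least two minority samples, Euclidean distance).
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other samples
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen sample, excluding itself
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical minority-class records with two numeric features (toy values).
toy_minority = [[54.0, 140.0], [61.0, 150.0], [58.0, 135.0], [65.0, 160.0]]
toy_synthetic = smote_sketch(toy_minority, n_new=3, k=2, seed=42)
for row in toy_synthetic:
    print([round(v, 2) for v in row])
```

In practice, oversampling of this kind is usually applied to the training folds only, so that synthetic samples do not leak into the data used for evaluation.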