IMPLEMENTASI KOMBINASI METODE RESAMPLING PADA KLASIFIKASI PENYAKIT STROKE DENGAN ALGORITMA K-NEAREST NEIGHBOR DAN SELEKSI FITUR INFORMATION GAIN

One of the main problems in the medical world is stroke. Stroke is the second leading cause of death in the world. Based on the results of the Basic Health Research (Riskesdas) in 2018, 713,783 people in Indonesia suffer from stroke every year. However, diagnosing a stroke takes...

Bibliographic Details
Main Author: Muhammad Fathurrahman (Author)
Format: Book
Published: 2023-07-06.

MARC

LEADER 00000 am a22000003u 4500
001 repoupnvj_25249
042 |a dc 
100 1 0 |a Muhammad Fathurrahman  |e author 
245 0 0 |a IMPLEMENTASI KOMBINASI METODE RESAMPLING PADA KLASIFIKASI PENYAKIT STROKE DENGAN ALGORITMA K-NEAREST NEIGHBOR DAN SELEKSI FITUR INFORMATION GAIN 
260 |c 2023-07-06. 
500 |a http://repository.upnvj.ac.id/25249/1/ABSTRAK.pdf 
500 |a http://repository.upnvj.ac.id/25249/13/AWAL.pdf 
500 |a http://repository.upnvj.ac.id/25249/3/BAB%201.pdf 
500 |a http://repository.upnvj.ac.id/25249/4/BAB%202.pdf 
500 |a http://repository.upnvj.ac.id/25249/5/BAB%203.pdf 
500 |a http://repository.upnvj.ac.id/25249/6/BAB%204.pdf 
500 |a http://repository.upnvj.ac.id/25249/7/BAB%205.pdf 
500 |a http://repository.upnvj.ac.id/25249/8/DAFTAR%20PUSTAKA.pdf 
500 |a http://repository.upnvj.ac.id/25249/9/RIWAYAT%20HIDUP.pdf 
500 |a http://repository.upnvj.ac.id/25249/10/LAMPIRAN.pdf 
500 |a http://repository.upnvj.ac.id/25249/11/HASIL%20PLAGIARISME.pdf 
500 |a http://repository.upnvj.ac.id/25249/12/ARTIKEL%20KI.pdf 
520 |a One of the main problems in the medical world is stroke. Stroke is the second leading cause of death in the world. Based on the results of the Basic Health Research (Riskesdas) in 2018, 713,783 people in Indonesia suffer from stroke every year. However, diagnosing a stroke takes quite a long time, even though brain cells die every minute that blood flow in the brain is blocked. Data mining can be used to predict disease. When building data mining models, class imbalance is a problem because it can harm classification results: the machine learning model pays more attention to the majority class and ignores the minority class. In this study, stroke prediction was carried out using the K-Nearest Neighbor (K-NN) algorithm combined with the resampling techniques SMOTE, Tomek Links, and ENN. The study also examined the effect of information gain feature selection on the model. Under 10-fold cross-validation, the K-NN model with SMOTE and Tomek Links predicted stroke with an accuracy of 83.5%, an f1-score of 12.5%, and a recall of 24.7%, while K-NN with SMOTE and ENN obtained an accuracy of 78%, an f1-score of 16.8%, and a recall of 45%. When information gain feature selection was applied, recall and f1-score improved for both combinations: SMOTE and Tomek Links produced an accuracy of 79.9%, an f1-score of 18.3%, and a recall of 46.6%, and SMOTE and ENN obtained an accuracy of 76%, an f1-score of 20%, and a recall of 59%. The experiments show that resampling can improve model performance on imbalanced data, raising the recall and f1-score values by 54% and 7%, respectively. 
546 |a id 
690 |a QA75 Electronic computers. Computer science 
690 |a T Technology (General) 
655 7 |a Thesis  |2 local 
655 7 |a NonPeerReviewed  |2 local 
787 0 |n http://repository.upnvj.ac.id/25249/ 
787 0 |n https://repository.upnvj.ac.id 
856 4 1 |u http://repository.upnvj.ac.id/25249/  |z Link Metadata