IMPLEMENTASI KOMBINASI METODE RESAMPLING PADA KLASIFIKASI PENYAKIT STROKE DENGAN ALGORITMA K-NEAREST NEIGHBOR DAN SELEKSI FITUR INFORMATION GAIN
Stroke is one of the main problems in medicine and the second leading cause of death worldwide. According to the 2018 Basic Health Research survey (Riskesdas), the prevalence of stroke in Indonesia is 713,783 people per year. However, diagnosing a stroke takes...
Main Author: | Muhammad Fathurrahman |
---|---|
Format: | Book |
Published: | 2023-07-06 |
Subjects: | QA75 Electronic computers. Computer science; T Technology (General) |
Online Access: | Link Metadata |
MARC
LEADER | 00000 am a22000003u 4500 | ||
---|---|---|---|
001 | repoupnvj_25249 | ||
042 | |a dc | ||
100 | 1 | 0 | |a Muhammad Fathurrahman, . |e author |
245 | 0 | 0 | |a IMPLEMENTASI KOMBINASI METODE RESAMPLING PADA KLASIFIKASI PENYAKIT STROKE DENGAN ALGORITMA K-NEAREST NEIGHBOR DAN SELEKSI FITUR INFORMATION GAIN |
260 | |c 2023-07-06. | ||
500 | |a http://repository.upnvj.ac.id/25249/1/ABSTRAK.pdf | ||
500 | |a http://repository.upnvj.ac.id/25249/13/AWAL.pdf | ||
500 | |a http://repository.upnvj.ac.id/25249/3/BAB%201.pdf | ||
500 | |a http://repository.upnvj.ac.id/25249/4/BAB%202.pdf | ||
500 | |a http://repository.upnvj.ac.id/25249/5/BAB%203.pdf | ||
500 | |a http://repository.upnvj.ac.id/25249/6/BAB%204.pdf | ||
500 | |a http://repository.upnvj.ac.id/25249/7/BAB%205.pdf | ||
500 | |a http://repository.upnvj.ac.id/25249/8/DAFTAR%20PUSTAKA.pdf | ||
500 | |a http://repository.upnvj.ac.id/25249/9/RIWAYAT%20HIDUP.pdf | ||
500 | |a http://repository.upnvj.ac.id/25249/10/LAMPIRAN.pdf | ||
500 | |a http://repository.upnvj.ac.id/25249/11/HASIL%20PLAGIARISME.pdf | ||
500 | |a http://repository.upnvj.ac.id/25249/12/ARTIKEL%20KI.pdf | ||
520 | |a Stroke is one of the main problems in medicine and the second leading cause of death worldwide. According to the 2018 Basic Health Research survey (Riskesdas), the prevalence of stroke in Indonesia is 713,783 people per year. However, diagnosing a stroke takes quite a long time, even though brain cells die every minute that blood flow in the brain is blocked. Data mining can be used to predict disease. When building data mining models, class imbalance is a problem because it degrades classification results: the machine learning model pays more attention to the majority class and ignores the minority class. In this study, stroke prediction was carried out using the K-Nearest Neighbor (K-NN) algorithm combined with the resampling techniques SMOTE, Tomek Links, and ENN. The study also examined the effect of information gain feature selection on the model. With 10-fold cross-validation, the K-NN model with SMOTE and Tomek Links predicted stroke with an accuracy of 83.5%, an f1-score of 12.5%, and a recall of 24.7%, while K-NN with SMOTE and ENN obtained 78% accuracy, a 16.8% f1-score, and 45% recall. With information gain feature selection, f1-score and recall increased for both methods: SMOTE and Tomek Links produced 79.9% accuracy, an 18.3% f1-score, and 46.6% recall, and SMOTE and ENN obtained 76% accuracy, a 20% f1-score, and 59% recall. The experiments show that resampling can improve model performance on imbalanced data, raising recall and f1-score by 54% and 7%, respectively. | ||
546 | |a id | ||
690 | |a QA75 Electronic computers. Computer science | ||
690 | |a T Technology (General) | ||
655 | 7 | |a Thesis |2 local | |
655 | 7 | |a NonPeerReviewed |2 local | |
787 | 0 | |n http://repository.upnvj.ac.id/25249/ | |
787 | 0 | |n https://repository.upnvj.ac.id | |
856 | 4 | 1 | |u http://repository.upnvj.ac.id/25249/ |z Link Metadata |
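
The abstract in field 520 mentions information gain feature selection without describing how it is computed. As a rough illustration only, the sketch below ranks discrete features by information gain, that is, the entropy of the class labels minus their expected entropy after splitting on a feature. It is not taken from the thesis: the function names, the feature names (`hypertension`, `gender`), and the toy values are all hypothetical, and continuous features would need to be discretized before this version applies.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return feature names sorted by information gain, highest first."""
    gains = {name: information_gain(values, labels) for name, values in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)


# Hypothetical toy data: names and values are made up for illustration only.
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0]
toy_columns = {
    "hypertension": [1, 1, 0, 0, 0, 0, 0, 0],  # separates the two classes exactly
    "gender": [0, 1, 0, 1, 0, 1, 0, 1],        # same class mix in both groups
}
print(rank_features(toy_columns, toy_labels))
```

In a selection step, features whose gain falls below a chosen threshold (or outside the top k) would be dropped before training the classifier; which cutoff the thesis used is not stated in this record.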