COMPARISON OF EXTREME GRADIENT BOOSTING ALGORITHM AND ARTIFICIAL NEURAL NETWORK ON DIABETES PREDICTION
ctrlnum |
31409 |
---|---|
fullrecord |
<?xml version="1.0"?>
<dc schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"><relation>http://repository.unika.ac.id/31409/</relation><title>COMPARISON OF EXTREME GRADIENT BOOSTING ALGORITHM AND ARTIFICIAL NEURAL NETWORK ON DIABETES PREDICTION</title><creator>CARLA, JEVON</creator><subject>004 Data processing & computer science</subject><description>Diabetes is one of the serious diseases and it causes the sufferer to have high blood sugar
because the body cannot produce the amount of insulin required to regulate glucose. It may cause
complications or increase the risk of developing other diseases such as heart disease, kidney
disease, and blindness. One of the best ways to fight this disease is early diagnosis. When many
patient records are available, machine learning classification algorithms play a major role in
predicting whether a person has diabetes. The dataset used is the Diabetes UCI Dataset from
Kaggle, which was collected using direct questionnaires from patients of Sylhet Diabetes
Hospital in Sylhet, Bangladesh, and approved by a doctor. The dataset has 520 records and 17
attributes. Several studies have been conducted in the last few decades, and some of them show that
Artificial Neural Networks (ANN) are among the best algorithms for diabetes prediction. Extreme
Gradient Boosting (XGBoost) is a popular machine learning algorithm for
classification. For that reason, the writer wants to find out whether XGBoost can be used
for diabetes prediction and to compare it with ANN. Models for both algorithms were trained with
the same data split ratios: 80:20, 75:25, 70:30, 60:40, and 50:50. There are four ANN models, with
3, 4, 5, and 6 hidden layers, and two XGBoost models: the first with default parameters and
the second with hyperparameter
tuning. The accuracy, precision, recall, and F1 score of the models are compared to find out
which one performs better. Overall, XGBoost achieved better performance, although
the third ANN model achieved the highest score at 80:20. At 75:25, XGBoost with
hyperparameter tuning achieved the highest score, while XGBoost with default parameters had
the same score as the third ANN model. At 70:30, the third ANN model and both
XGBoost models had the same score, which was the highest among all ratios. At 60:40,
the first to third ANN models and XGBoost with default parameters had the same accuracy,
but the third ANN model had the highest recall and lower precision than the XGBoost models.
And with 50:50 XGBoost 2 has the best overall performances than the other models.</description><date>2023</date><type>Thesis:Thesis</type><type>PeerReview:NonPeerReviewed</type><type>Book:Book</type><language>eng</language><identifier>http://repository.unika.ac.id/31409/1/19.K1.0017-JEVON%20CARLA-COVER_a.pdf</identifier><type>Book:Book</type><language>eng</language><identifier>http://repository.unika.ac.id/31409/2/19.K1.0017-JEVON%20CARLA-BAB%20I_a.pdf</identifier><type>Book:Book</type><language>eng</language><identifier>http://repository.unika.ac.id/31409/3/19.K1.0017-JEVON%20CARLA-BAB%20II_a.pdf</identifier><type>Book:Book</type><language>eng</language><identifier>http://repository.unika.ac.id/31409/4/19.K1.0017-JEVON%20CARLA-BAB%20III_a.pdf</identifier><type>Book:Book</type><language>eng</language><identifier>http://repository.unika.ac.id/31409/5/19.K1.0017-JEVON%20CARLA-BAB%20IV_a.pdf</identifier><type>Book:Book</type><language>eng</language><identifier>http://repository.unika.ac.id/31409/6/19.K1.0017-JEVON%20CARLA-BAB%20V_a.pdf</identifier><type>Book:Book</type><language>eng</language><identifier>http://repository.unika.ac.id/31409/7/19.K1.0017-JEVON%20CARLA-BAB%20VI_a.pdf</identifier><type>Book:Book</type><language>eng</language><identifier>http://repository.unika.ac.id/31409/8/19.K1.0017-JEVON%20CARLA-DAPUS_a.pdf</identifier><type>Book:Book</type><language>eng</language><identifier>http://repository.unika.ac.id/31409/9/19.K1.0017-JEVON%20CARLA-LAMP_a.pdf</identifier><identifier> CARLA, JEVON (2023) COMPARISON OF EXTREME GRADIENT BOOSTING ALGORITHM AND ARTIFICIAL NEURAL NETWORK ON DIABETES PREDICTION. Other thesis, Universitas Katholik Soegijapranata Semarang. </identifier><recordID>31409</recordID></dc>
|
language |
eng |
format |
Thesis:Thesis Thesis PeerReview:NonPeerReviewed PeerReview Book:Book Book |
author |
CARLA, JEVON |
title |
COMPARISON OF EXTREME GRADIENT BOOSTING ALGORITHM AND ARTIFICIAL NEURAL NETWORK ON DIABETES PREDICTION |
publishDate |
2023 |
topic |
004 Data processing & computer science |
url |
http://repository.unika.ac.id/31409/1/19.K1.0017-JEVON%20CARLA-COVER_a.pdf http://repository.unika.ac.id/31409/2/19.K1.0017-JEVON%20CARLA-BAB%20I_a.pdf http://repository.unika.ac.id/31409/3/19.K1.0017-JEVON%20CARLA-BAB%20II_a.pdf http://repository.unika.ac.id/31409/4/19.K1.0017-JEVON%20CARLA-BAB%20III_a.pdf http://repository.unika.ac.id/31409/5/19.K1.0017-JEVON%20CARLA-BAB%20IV_a.pdf http://repository.unika.ac.id/31409/6/19.K1.0017-JEVON%20CARLA-BAB%20V_a.pdf http://repository.unika.ac.id/31409/7/19.K1.0017-JEVON%20CARLA-BAB%20VI_a.pdf http://repository.unika.ac.id/31409/8/19.K1.0017-JEVON%20CARLA-DAPUS_a.pdf http://repository.unika.ac.id/31409/9/19.K1.0017-JEVON%20CARLA-LAMP_a.pdf http://repository.unika.ac.id/31409/ |
contents |
Diabetes is a serious disease that causes the sufferer to have high blood sugar
because the body cannot produce the amount of insulin required to regulate glucose. It may cause
complications or increase the risk of developing other diseases such as heart disease, kidney
disease, and blindness. One of the best ways to fight this disease is early diagnosis. When many
patient records are available, machine learning classification algorithms play a major role in
predicting whether a person has diabetes. The dataset used is the Diabetes UCI Dataset from
Kaggle, which was collected using direct questionnaires from patients of Sylhet Diabetes
Hospital in Sylhet, Bangladesh, and approved by a doctor. The dataset has 520 records and 17
attributes. Several studies have been conducted in the last few decades, and some of them show that
Artificial Neural Networks (ANN) are among the best algorithms for diabetes prediction. Extreme
Gradient Boosting (XGBoost) is a popular machine learning algorithm for
classification. For that reason, the writer wants to find out whether XGBoost can be used
for diabetes prediction and to compare it with ANN. Models for both algorithms were trained with
the same data split ratios: 80:20, 75:25, 70:30, 60:40, and 50:50. There are four ANN models, with
3, 4, 5, and 6 hidden layers, and two XGBoost models: the first with default parameters and
the second with hyperparameter
tuning. The accuracy, precision, recall, and F1 score of the models are compared to find out
which one performs better. Overall, XGBoost achieved better performance, although
the third ANN model achieved the highest score at 80:20. At 75:25, XGBoost with
hyperparameter tuning achieved the highest score, while XGBoost with default parameters had
the same score as the third ANN model. At 70:30, the third ANN model and both
XGBoost models had the same score, which was the highest among all ratios. At 60:40,
the first to third ANN models and XGBoost with default parameters had the same accuracy,
but the third ANN model had the highest recall and lower precision than the XGBoost models.
At 50:50, the second XGBoost model (with hyperparameter tuning) had the best overall performance among all models. |
id |
IOS2679.31409 |
institution |
Universitas Katolik Soegijapranata |
institution_id |
334 |
institution_type |
library:university library |
library |
Perpustakaan Universitas Katolik Soegijapranata |
library_id |
522 |
collection |
Unika Repository |
repository_id |
2679 |
subject_area |
Accounting Architecture Economics |
city |
SEMARANG |
province |
JAWA TENGAH |
repoId |
IOS2679 |
first_indexed |
2023-04-12T19:49:07Z |
last_indexed |
2023-04-27T22:48:39Z |
recordtype |
dc |
_version_ |
1765772020057899008 |
score |
17.538404 |
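
The abstract compares models by accuracy, precision, recall, and F1 score across five split ratios of a 520-record dataset. As a minimal sketch of that arithmetic (not the thesis's own code), the Python below computes the split sizes and the four metrics from confusion-matrix counts. The function names are hypothetical, reading each ratio as train:test is an assumption, and the counts in the example are invented for illustration, not results from the thesis.

```python
def split_sizes(n_records, train_pct):
    """Return (train, test) sizes for a ratio such as 80:20.

    Assumes the first number of each ratio is the training share.
    """
    n_train = n_records * train_pct // 100
    return n_train, n_records - n_train


def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Split sizes for the five ratios named in the abstract (520 records).
for train_pct in (80, 75, 70, 60, 50):
    print(train_pct, split_sizes(520, train_pct))

# Hypothetical counts for a 104-record test set; illustration only.
print(classification_metrics(tp=60, fp=2, fn=4, tn=38))
```

A model with higher recall but lower precision, as the abstract reports for the third ANN model at 60:40, would show fewer false negatives (`fn`) and more false positives (`fp`) in these counts.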