Feature Selection on Pregnancy Risk Classification Using C5.0 Method
Main Authors: | Azhar, Yufis, Afdian, Riz |
---|---|
Format: | Article PeerReviewed Book |
Language: | eng |
Published: | Program Studi Elektro dan Informatika Fakultas Teknik Universitas Muhammadiyah Malang, 2018 |
Subjects: | |
Online Access: |
http://eprints.umm.ac.id/60841/1/Azhar%20Afdian%20-%20Pregnancy%20Risk%20Classification%20Feature%20Selection%20C5.0.pdf http://eprints.umm.ac.id/60841/3/Peer%20Review%20-%20Azhar%20-%20Pregnancy%20Risk%20Classification%20Feature%20Selection%20C5.0.pdf http://eprints.umm.ac.id/60841/2/Similarity%20-%20Azhar%20Afdian%20-%20Pregnancy%20Risk%20Classification%20Feature%20Selection%20C5.0.pdf http://eprints.umm.ac.id/60841/ http://kinetik.umm.ac.id/index.php/kinetik/article/view/703 |
Table of Contents:
- The maternal mortality rate in Indonesia is still relatively high. Several factors contribute to this, including pregnant women's lack of awareness of the risk status of their pregnancy. Several methods have been proposed for early detection of pregnancy risk. However, none has highlighted which features are most influential in classifying that risk. In this research, we use data on pregnant women from one of the health centers in Malang, Indonesia, as the dataset. The dataset has 107 features, so feature selection is needed for the classification process. We propose using the C5.0 method to select important features while classifying the dataset into low, high, and very high pregnancy risk. C5.0 was chosen because it has a better pruning algorithm and requires relatively less memory than C4.5. Other classification methods (SVM, Naive Bayes, and Nearest Neighbor) are then used to compare accuracy between the dataset that uses all features and the dataset that uses only the selected features. The test results show that feature selection can increase accuracy by up to 5%.
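The idea of keeping only the features that help separate the risk classes can be illustrated with a minimal sketch. It is not C5.0 itself and not the authors' implementation. It ranks each feature on its own by information gain, the entropy-based measure that C4.5-family decision trees build on, whereas C5.0 grows and prunes a full tree. The records, the feature names (`high_blood_pressure`, `age_over_35`, `visit_on_weekday`), and the `min_gain` threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the records on one categorical feature."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_features(rows, labels, min_gain=0.1):
    """Return features whose gain reaches min_gain, highest gain first.

    min_gain is an arbitrary illustrative threshold, not a value from the article.
    """
    gains = {f: information_gain(rows, labels, f) for f in rows[0]}
    kept = [f for f, g in gains.items() if g >= min_gain]
    return sorted(kept, key=lambda f: -gains[f])


# Hypothetical toy records; the real dataset has 107 features.
rows = [
    {"high_blood_pressure": "no", "age_over_35": "no", "visit_on_weekday": "yes"},
    {"high_blood_pressure": "no", "age_over_35": "no", "visit_on_weekday": "no"},
    {"high_blood_pressure": "yes", "age_over_35": "no", "visit_on_weekday": "yes"},
    {"high_blood_pressure": "yes", "age_over_35": "no", "visit_on_weekday": "no"},
    {"high_blood_pressure": "yes", "age_over_35": "yes", "visit_on_weekday": "yes"},
    {"high_blood_pressure": "yes", "age_over_35": "yes", "visit_on_weekday": "no"},
]
labels = ["low", "low", "high", "high", "very_high", "very_high"]

selected = select_features(rows, labels)
print("Selected features:", selected)
```

In this toy data, `visit_on_weekday` carries no information about the risk class and is dropped, while the other two features are kept. The reduced feature set could then be passed to a separate classifier and its accuracy compared with that of the full feature set.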