Software Fault Prediction Using Filtering Feature Selection in Cluster-Based Classification
Main Authors: | Muhamad, Fachrul Pralienka Bani; Department of Informatics Engineering, Polytechnic of Indramayu, Indramayu 45252, Indonesia, Siahaan, Daniel Oranova; Department of Informatics Engineering, Faculty of Information Technology, Institut Teknologi Sepuluh Nopember (ITS), Kampus ITS Sukolilo, Surabaya 60111, Indonesia, Fatichah, Chastine; Department of Informatics Engineering, Faculty of Information Technology, Institut Teknologi Sepuluh Nopember (ITS), Kampus ITS Sukolilo, Surabaya 60111, Indonesia |
---|---|
Format: | Article info application/pdf eJournal |
Language: | eng |
Published: | Institut Teknologi Sepuluh Nopember, 2018 |
Subjects: | |
Online Access: |
http://iptek.its.ac.id/index.php/jps/article/view/3508 http://iptek.its.ac.id/index.php/jps/article/view/3508/2678 |
Contents:
- Accurate software fault prediction can support testing effort and improve software quality. Previous researchers proposed combining Entropy-Based Discretization (EBD) with Cluster-Based Classification (CBC). However, irrelevant and redundant features in software fault datasets tend to decrease prediction accuracy. This study proposes improving CBC outcomes by integrating filter-based feature selection methods: Information Gain (IG), Gain Ratio (GR), and One-R (OR). In experiments on two datasets from the public NASA MDP repository (PC2 and PC3), the combination of CBC and IG yields the best average accuracy compared with GR and OR. It achieves an average probability of detection (pd) of 67.52% and an average probability of false alarm (pf) of 37.42%, whereas CBC without feature selection yields an average pd of 65.38% and an average pf of 49.95%. It can be concluded that IG improves CBC outcomes, increasing average pd by 2.14 percentage points and reducing average pf by 12.53 percentage points.
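
The abstract names Information Gain as the filter and reports results as pd and pf without defining them. The sketch below illustrates both ideas on toy data. It assumes the definitions commonly used in fault prediction work (pd = TP / (TP + FN), pf = FP / (FP + TN)) and an already discretized feature; the function names and sample values are hypothetical and are not taken from the article.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG = H(labels) - sum over v of P(v) * H(labels | feature == v).

    Assumes the feature is discrete (for example, after discretization).
    """
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def pd_pf(actual, predicted):
    """Return (pd, pf), where 1 marks a faulty module and 0 a fault-free one.

    Assumed definitions: pd = TP / (TP + FN), pf = FP / (FP + TN).
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


# Toy data (hypothetical): four modules, two of them faulty.
labels = [1, 1, 0, 0]
relevant_feature = ["high", "high", "low", "low"]  # separates the classes perfectly
irrelevant_feature = ["x", "y", "x", "y"]          # carries no class information

ig_relevant = information_gain(relevant_feature, labels)
ig_irrelevant = information_gain(irrelevant_feature, labels)

# Toy predictions (hypothetical): 3 faulty and 5 fault-free modules.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
pd_value, pf_value = pd_pf(actual, predicted)
```

A filter method ranks features by a score such as `information_gain` and keeps only the top-ranked ones before the classifier is trained, which is why a feature like `irrelevant_feature` would be dropped. A better classifier raises pd while lowering pf.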