Clustering and Feature Selection Technique for Improving Internet Traffic Classification Using K-NN

Main Authors: Wiradinata, Trianggoro, Paramita, Adi Suryaputra
Format: Other application/pdf
Language: eng
Published: Journal of Advances in Computer Networks, 2017
Subjects: Clustering, classification, feature, bandwidth
Online Access: http://dspace.uc.ac.id/handle/123456789/730
ctrlnum 123456789-730
fullrecord <?xml version="1.0"?> <dc schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"><title>Clustering and Feature Selection Technique for Improving Internet Traffic Classification Using K-NN</title><creator>Wiradinata, Trianggoro</creator><creator>Paramita, Adi Suryaputra</creator><subject>Clustering, classification, feature, bandwidth</subject><description>Abstract: This research uses the K-Nearest Neighbour (K-NN) algorithm to classify internet data traffic. K-NN is suitable for large amounts of data and can produce accurate classifications, but it is computationally expensive because it calculates the distance to every existing data point. One way to overcome this weakness is to cluster the data before classification, because clustering does not require much computing time. The clustering algorithm used is Fuzzy C-Means, which does not need the number of clusters to be determined in advance; clusters form naturally from the input dataset. However, Fuzzy C-Means has the disadvantage that its clustering results often differ between runs on the same input data, because the initial dataset given to it is not optimal. To optimize the initial dataset, this research applies a feature selection algorithm; once the main features of the dataset have been selected, the output of Fuzzy C-Means becomes consistent. Feature selection is expected to provide an initial dataset that is optimal for Fuzzy C-Means. The feature selection algorithm used in this study is Principal Component Analysis (PCA). PCA removes non-significant attributes to create an optimal dataset and can improve the performance of the clustering and classification algorithms.
The result of this study is a method that combines classification, clustering, and feature extraction; the three methods were successfully modeled into a classification method for internet bandwidth usage data that has high accuracy and fast performance.</description><date>2017-02-02T04:40:08Z</date><date>2017-02-02T04:40:08Z</date><date>2016-03-01</date><type>Other:Other</type><identifier>1793-8244</identifier><identifier>http://dspace.uc.ac.id/handle/123456789/730</identifier><language>eng</language><type>File:application/pdf</type><type>File:application/pdf</type><publisher>Journal of Advances in Computer Networks</publisher><recordID>123456789-730</recordID></dc>
language eng
format Other:Other
Other
File:application/pdf
File
author Wiradinata, Trianggoro
Paramita, Adi Suryaputra
title Clustering and Feature Selection Technique for Improving Internet Traffic Classification Using K-NN
publisher Journal of Advances in Computer Networks
publishDate 2017
topic Clustering
classification
feature
bandwidth
url http://dspace.uc.ac.id/handle/123456789/730
contents Abstract: This research uses the K-Nearest Neighbour (K-NN) algorithm to classify internet data traffic. K-NN is suitable for large amounts of data and can produce accurate classifications, but it is computationally expensive because it calculates the distance to every existing data point. One way to overcome this weakness is to cluster the data before classification, because clustering does not require much computing time. The clustering algorithm used is Fuzzy C-Means, which does not need the number of clusters to be determined in advance; clusters form naturally from the input dataset. However, Fuzzy C-Means has the disadvantage that its clustering results often differ between runs on the same input data, because the initial dataset given to it is not optimal. To optimize the initial dataset, this research applies a feature selection algorithm; once the main features of the dataset have been selected, the output of Fuzzy C-Means becomes consistent. Feature selection is expected to provide an initial dataset that is optimal for Fuzzy C-Means. The feature selection algorithm used in this study is Principal Component Analysis (PCA). PCA removes non-significant attributes to create an optimal dataset and can improve the performance of the clustering and classification algorithms. The result of this study is a method that combines classification, clustering, and feature extraction; the three methods were successfully modeled into a classification method for internet bandwidth usage data that has high accuracy and fast performance.
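The cluster-then-classify idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data is synthetic, the labels "web" and "video" are made up, the PCA step is omitted, the cluster count `c` is passed explicitly, and the farthest-first seeding is an assumption added here so that repeated runs give the same centres. The point it shows is that K-NN only needs to measure distances inside the cluster nearest to the query, not across the whole dataset.

```python
import math
from collections import Counter


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first(points, c):
    """Deterministic seeding (an assumption of this sketch, not from the paper)."""
    centres = [points[0]]
    while len(centres) < c:
        centres.append(max(points, key=lambda p: min(dist(p, q) for q in centres)))
    return centres


def fuzzy_c_means(points, c, m=2.0, iters=30):
    """Basic fuzzy c-means loop; returns the c cluster centres."""
    centres = farthest_first(points, c)
    dims = len(points[0])
    power = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Membership of every point in every cluster (each row sums to 1).
        u = []
        for p in points:
            d = [max(dist(p, q), 1e-12) for q in centres]
            u.append([1.0 / sum((d[j] / d[k]) ** power for k in range(c))
                      for j in range(c)])
        # Each centre becomes the mean of all points weighted by u ** m.
        centres = []
        for j in range(c):
            w = [row[j] ** m for row in u]
            total = sum(w)
            centres.append(tuple(
                sum(wi * p[k] for wi, p in zip(w, points)) / total
                for k in range(dims)))
    return centres


def nearest_centre(p, centres):
    """Index of the centre closest to p."""
    return min(range(len(centres)), key=lambda j: dist(p, centres[j]))


def build_buckets(points, labels, centres):
    """Group labelled points by their nearest cluster centre."""
    buckets = [[] for _ in centres]
    for p, y in zip(points, labels):
        buckets[nearest_centre(p, centres)].append((p, y))
    return buckets


def knn_predict(query, centres, buckets, k=3):
    """Majority vote among the k nearest points in the query's own cluster."""
    candidates = buckets[nearest_centre(query, centres)]
    if not candidates:  # fall back to a full scan if the bucket is empty
        candidates = [item for b in buckets for item in b]
    nearest = sorted(candidates, key=lambda item: dist(query, item[0]))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]


# Synthetic two-feature data; the labels "web" and "video" are made up.
points = ([(0.1 * i, 0.1 * j) for i in range(5) for j in range(5)] +
          [(5 + 0.1 * i, 5 + 0.1 * j) for i in range(5) for j in range(5)])
labels = ["web"] * 25 + ["video"] * 25

centres = fuzzy_c_means(points, c=2)
buckets = build_buckets(points, labels, centres)
print(knn_predict((0.15, 0.25), centres, buckets))
print(knn_predict((5.30, 5.10), centres, buckets))
```

With this toy data each query is compared against the 25 points of one cluster instead of all 50. A feature-reduction step such as PCA would be applied to `points` before clustering.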
id IOS2777.123456789-730
institution Universitas Ciputra
institution_id 331
institution_type library:university
library Perpustakaan Universitas Ciputra
library_id 281
collection DSpace Perpustakaan Universitas Ciputra
repository_id 2777
subject_area Accounting
Architecture
Economics
city KOTA SURABAYA
province JAWA TIMUR
repoId IOS2777
first_indexed 2018-03-09T22:50:47Z
last_indexed 2019-07-02T16:30:03Z
recordtype dc