Clustering Using the Agglomerative Hierarchical Clustering and Bisecting K-Means Algorithms, and Semantic Web-Based Intelligent Search, in a Case Study of Final Project Documents of the Informatics Engineering Department, Universitas Muhammadiyah Malang

Main Author: Isanta, Septiyan Andika
Format: Thesis NonPeerReviewed
Published: 2012
Subjects:
Online Access: http://eprints.umm.ac.id/19143/
Table of Contents:
  • Document searching and clustering are frequently studied techniques because of their importance in text mining and information retrieval systems. In data mining, there are two clustering approaches: partitional algorithms and hierarchical algorithms. This study aims to develop a prototype of a semantic-based intelligent search and clustering system, and to compare the performance of clustering algorithms in a case study of final project documents. The partitional algorithm studied is Bisecting K-Means, using the number approach and the SSE approach. The hierarchical algorithm studied is agglomerative hierarchical clustering with the Single-Link, Complete-Link, and Average-Link approaches. The parameters used to compare the performance of the algorithms are precision, recall, and F-measure. The performance of the partitional clustering technique is evaluated using SSE (cohesion), SSB (separation), and TSS, while the hierarchical clustering technique is evaluated using the Cophenetic Correlation Coefficient (CPCC). The parameters used for testing the intelligent search are precision, recall, and F-measure. The evaluation results show that Bisecting K-Means, evaluated with SSE, SSB, and TSS, produced appropriate groups. The agglomerative hierarchical clustering algorithms, evaluated with the Cophenetic Correlation Coefficient (CPCC), also produced reasonably suitable clustering results. Overall, the Bisecting K-Means algorithm performs better than the agglomerative hierarchical clustering algorithm: it gives good results and its time complexity for grouping is much lower. In the evaluation of search results, searches that do not use the ontology achieve better precision, recall, and F-measure than searches that use it.
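
The abstract names SSE (cohesion), SSB (separation), and TSS without giving their formulas. The sketch below is an illustration only and is not taken from the thesis. It assumes the standard definitions: SSE is the summed squared distance of each point to its cluster centroid, SSB is the size-weighted squared distance of each centroid to the overall mean, and TSS is the summed squared distance of each point to the overall mean, so that TSS = SSE + SSB. The function names and the toy data are hypothetical.

```python
from math import isclose


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_validity(clusters):
    """Return (SSE, SSB, TSS) for a list of clusters, each a list of points.

    Assumed standard definitions, not the thesis's own code.
    """
    all_points = [p for c in clusters for p in c]
    overall = centroid(all_points)
    sse = 0.0  # cohesion: spread of points around their own centroid
    ssb = 0.0  # separation: spread of centroids around the overall mean
    for c in clusters:
        m = centroid(c)
        sse += sum(sq_dist(p, m) for p in c)
        ssb += len(c) * sq_dist(m, overall)
    tss = sum(sq_dist(p, overall) for p in all_points)
    return sse, ssb, tss


# Hypothetical toy data: two well-separated clusters in 2-D.
clusters = [
    [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0]],
    [[8.0, 8.0], [9.0, 8.0]],
]
sse, ssb, tss = cluster_validity(clusters)
print(f"SSE={sse:.3f}  SSB={ssb:.3f}  TSS={tss:.3f}")
```

Because TSS is fixed for a given data set, a lower SSE implies a higher SSB under these definitions, which is why the two measures are usually reported together.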