Clustering Final Project Documents Using K-Means Clustering as an Analysis of Information Retrieval System Implementation

  • Very Kurnia Bakti, Program Studi Teknik Komputer, Politeknik Harapan Bersama
  • Jatmiko Indriyatno, Program Studi Teknik Komputer, Politeknik Harapan Bersama
Keywords: k-means, clustering

Abstract

Document search for final projects at Politeknik Harapan Bersama currently displays results ranked by sequential document matches, commonly called document ranking. As a result, the retrieved documents are not grouped accurately by final project theme. Clustering algorithms can be used to categorise or group documents. One such algorithm is K-Means, a simple algorithm developed by MacQueen in 1967. In this study, clustering the Indonesian-language abstracts of final project documents with the K-Means algorithm produced reasonably good clusters, so K-Means clustering can be recommended as adequate for use in a retrieval system application. The indicator is the Davies-Bouldin Index computed for the resulting clusters, which is very close to zero at 0.001.
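The abstract does not describe the preprocessing or implementation, so the following is only a minimal sketch of the general approach it names: K-Means over document vectors, scored with the Davies-Bouldin Index (lower values indicate more compact, better separated clusters). The TF-IDF weighting, Euclidean distance, random initialisation, the function names (`tfidf_vectors`, `kmeans`, `davies_bouldin`), and the toy `abstracts` are all assumptions made for illustration, not the authors' method or data.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Convert whitespace-tokenised texts into L2-normalised TF-IDF vectors."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    doc_freq = Counter(tok for toks in tokenised for tok in set(toks))
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        vec = [counts[t] * (math.log(len(docs) / doc_freq[t]) + 1.0) for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iterations=100, seed=0):
    """Plain K-Means: assign each point to its nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: random initial centroids
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin Index; assumes k >= 2 and pairwise distinct centroids."""
    k = len(centroids)
    scatter = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        dists = [euclidean(p, centroids[c]) for p in members]
        scatter.append(sum(dists) / len(dists) if dists else 0.0)
    worst_ratios = [
        max(
            (scatter[i] + scatter[j]) / euclidean(centroids[i], centroids[j])
            for j in range(k)
            if j != i
        )
        for i in range(k)
    ]
    return sum(worst_ratios) / k


# Hypothetical toy "abstracts", not data from the study.
abstracts = [
    "network router firewall security",
    "firewall network intrusion detection",
    "image processing edge detection",
    "image segmentation color processing",
]
vectors = tfidf_vectors(abstracts)
labels, centroids = kmeans(vectors, k=2)
score = davies_bouldin(vectors, labels, centroids)
print(labels, round(score, 3))
```

Because K-Means depends on its initial centroids, different seeds can yield different clusters and index values; a real evaluation would typically repeat the run and, for Indonesian text, add stemming and stop-word removal before vectorising.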

Published
2017-02-16