COMPARISON OF CLUSTERING ALGORITHMS FOR THYROID DATABASE

  • Cveta Martinovska Bande, University Goce Delcev, Computer Science Faculty
  • Anastasija Samardziska, student
Keywords: data mining, clustering, data analysis

Abstract

This work proposes a methodology for analyzing, visualizing and clustering data on patients with different symptoms from a thyroid database. In previous work the thyroid data were analyzed using the WITT algorithm. That clustering method correctly formed the clusters for the control group and the hypothyroid patients but failed to cluster the hyperthyroid patients. In this paper we analyze the data using several algorithms: K-means, hierarchical clustering, the EM algorithm, DBSCAN and the Cobweb algorithm. The goal is to measure how closely the clusters produced match the class labels and so determine which algorithms give better results. Classification-oriented measures are used to validate the clustering results. We propose several preprocessing steps to address the large amount of noise and the unbalanced classes in the given data set.
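
As an illustration of matching clusters against class labels, the sketch below computes purity, one common classification-oriented measure. The abstract does not name the specific measures used in the paper, so purity is an assumption here, and the function names, cluster ids and patient labels are hypothetical toy values, not data from the thyroid database.

```python
from collections import Counter, defaultdict


def contingency(cluster_ids, class_labels):
    """Count how many items of each class fall into each cluster."""
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have equal length")
    table = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        table[cluster][label] += 1
    return table


def purity(cluster_ids, class_labels):
    """Fraction of items that belong to the majority class of their cluster."""
    if not class_labels:
        raise ValueError("at least one item is required")
    table = contingency(cluster_ids, class_labels)
    matched = sum(max(counts.values()) for counts in table.values())
    return matched / len(class_labels)


def majority_classes(cluster_ids, class_labels):
    """Map each cluster to its most frequent class label."""
    table = contingency(cluster_ids, class_labels)
    return {c: counts.most_common(1)[0][0] for c, counts in table.items()}


# Hypothetical toy assignment: three clusters, three classes.
clusters = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
labels = ["normal", "normal", "normal", "hyper",
          "hypo", "hypo", "hypo",
          "hyper", "hyper", "normal"]

print(purity(clusters, labels))
print(majority_classes(clusters, labels))
```

Purity rewards clusterings in which each cluster is dominated by a single class, but it grows trivially as the number of clusters increases, so it is best read alongside other measures when comparing algorithms that produce different numbers of clusters.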

Author Biography

Cveta Martinovska Bande, University Goce Delcev Computer Science Faculty
Published
2020-06-24
How to Cite
Martinovska Bande, C., & SAMARDZISKA, A. (2020). COMPARISON OF CLUSTERING ALGORITHMS FOR THOROID DATABASE. Balkan Journal of Applied Mathematics and Informatics, 3(1), pp. 73-84. Retrieved from https://js.ugd.edu.mk/index.php/bjami/article/view/3520