COMPARISON OF CLUSTERING ALGORITHMS FOR THYROID DATABASE

  • Cveta Martinovska Bande, University Goce Delcev, Computer Science Faculty
  • Anastasija Samardziska, student

Abstract

This work proposes a methodology for analyzing, visualizing and clustering data on patients with different symptoms from a thyroid database. In previous work the thyroid data were analyzed using the WITT algorithm. That clustering method correctly formed clusters for the control group and for the hypothyroid patients, but it failed to cluster the hyperthyroid patients. In this paper we analyze the data using several algorithms: K-means, hierarchical clustering, the EM algorithm, DBSCAN and the Cobweb algorithm. The goal is to measure how closely the clusters produced by each algorithm match the class labels, and so to determine which algorithms give better results. Classification-oriented measures are used to validate the clustering results. We also propose several preprocessing steps to address the large amount of noise and the unbalanced classes in the given data set.
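As an illustration of what matching clusters against class labels can look like, the following minimal sketch computes two commonly used classification-oriented measures, purity and size-weighted entropy, from a list of cluster assignments and a list of class labels. The abstract does not say which measures the paper uses, so the choice of these two is an assumption; the function names and the toy data are hypothetical and are not taken from the thyroid database.

```python
from collections import Counter, defaultdict
from math import log2


def contingency(cluster_ids, class_labels):
    """Count how many items of each class fall into each cluster."""
    table = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        table[cluster][label] += 1
    return table


def purity(cluster_ids, class_labels):
    """Fraction of items that belong to the majority class of their cluster."""
    table = contingency(cluster_ids, class_labels)
    majority_total = sum(max(counts.values()) for counts in table.values())
    return majority_total / len(class_labels)


def weighted_entropy(cluster_ids, class_labels):
    """Size-weighted average of the class entropy inside each cluster (0 is best)."""
    table = contingency(cluster_ids, class_labels)
    n = len(class_labels)
    total = 0.0
    for counts in table.values():
        size = sum(counts.values())
        h = -sum((k / size) * log2(k / size) for k in counts.values())
        total += (size / n) * h
    return total


# Hypothetical toy data: an unbalanced three-class labelling and one
# possible cluster assignment. These values are made up for illustration.
classes = ["normal"] * 6 + ["hypo"] * 3 + ["hyper"] * 3
clusters = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 2, 2]

if __name__ == "__main__":
    print("purity:", round(purity(clusters, classes), 3))
    print("weighted entropy:", round(weighted_entropy(clusters, classes), 3))
```

Higher purity and lower entropy indicate closer agreement with the class labels. Note that purity alone tends to favor solutions with many small clusters, which is one reason to report more than one measure when comparing algorithms.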

Published
Jun 24, 2020
How to Cite
MARTINOVSKA BANDE, Cveta; SAMARDZISKA, Anastasija. COMPARISON OF CLUSTERING ALGORITHMS FOR THYROID DATABASE. Balkan Journal of Applied Mathematics and Informatics, [S.l.], v. 3, n. 1, p. 73-84, June 2020. Available at: <http://js.ugd.edu.mk/index.php/bjami/article/view/3520>. Date accessed: 15 July 2020.
Section
Articles