Pertanika Journal

Performance Comparison of Classification Algorithms for Medical Diagnosis

Anju Jain, Saroj Ratnoo and Dinesh Kumar

Pertanika Journal of Tropical Agricultural Science, Volume 26, Issue 2, April 2018

Keywords: Classification algorithms, machine learning, medical diagnosis, performance evaluation

Published on: 30 Apr 2018

Abstract

Knowledge extraction from medical datasets is a challenging task. Medical datasets are known for their complexity in terms of noise, missing values and imbalanced class distributions. Classification algorithms can assist medical experts in disease diagnosis, provided that the classification models are evaluated rigorously and methodically, with appropriate sampling techniques, performance metrics and statistical tests. An ad hoc approach to evaluation can result in unexpectedly high misclassification rates, which may prove very costly in terms of people's health and lives. In this paper, we illustrate a methodology for evaluating and comparing multiple classification algorithms on multiple medical datasets. The example experiment applies five well-known machine learning algorithms, namely Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Ant Colony Optimisation (ACO) and Genetic Algorithm (GA), to discover classification models for disease diagnosis from 11 publicly available medical datasets from the UCI Machine Learning Repository. Through a stepwise evaluation process, we conclude that the Random Forest classifier performed significantly better than the other algorithms in diagnosing various diseases. The paper also addresses class imbalance and non-uniform misclassification costs, which are prevalent in datasets for disease diagnosis.
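
The abstract does not name the specific metrics or statistical tests used in the paper. As a minimal sketch of what comparing several classifiers across several datasets can look like, the Python below assumes two common choices: balanced accuracy as an imbalance-aware metric, and the Friedman test statistic computed from per-dataset ranks (without a tie correction). The function names and the classifier labels `clf_a`, `clf_b` and `clf_c` are hypothetical, and the scores in the usage example are placeholders, not results from the paper.

```python
from typing import Dict, List


def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity and specificity for a binary confusion matrix.

    Unlike plain accuracy, this does not reward a model for simply
    predicting the majority class on an imbalanced dataset.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2.0


def average_ranks(scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Rank classifiers on each dataset (1 = best) and average the ranks.

    `scores` maps a classifier name to its scores, one per dataset, in the
    same dataset order for every classifier. Tied scores share the mean of
    the ranks they span.
    """
    names = list(scores)
    n_datasets = len(scores[names[0]])
    totals = {name: 0.0 for name in names}
    for i in range(n_datasets):
        # Order classifiers on dataset i from highest to lowest score.
        ordered = sorted(names, key=lambda name: scores[name][i], reverse=True)
        pos = 0
        while pos < len(ordered):
            end = pos
            while (end + 1 < len(ordered)
                   and scores[ordered[end + 1]][i] == scores[ordered[pos]][i]):
                end += 1
            # Mean of the 1-based ranks pos+1 .. end+1 for this tie group.
            shared_rank = (pos + 1 + end + 1) / 2.0
            for name in ordered[pos:end + 1]:
                totals[name] += shared_rank
            pos = end + 1
    return {name: totals[name] / n_datasets for name in names}


def friedman_statistic(scores: Dict[str, List[float]]) -> float:
    """Friedman chi-square statistic from average ranks (no tie correction).

    With k classifiers and N datasets:
        chi2 = 12N / (k(k+1)) * (sum_j R_j^2 - k(k+1)^2 / 4)
    """
    ranks = average_ranks(scores)
    k = len(ranks)
    n = len(next(iter(scores.values())))
    sum_sq = sum(r * r for r in ranks.values())
    return 12.0 * n / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)


if __name__ == "__main__":
    # Placeholder scores for three hypothetical classifiers on four datasets.
    placeholder_scores = {
        "clf_a": [0.90, 0.85, 0.80, 0.95],
        "clf_b": [0.80, 0.80, 0.70, 0.90],
        "clf_c": [0.70, 0.75, 0.75, 0.85],
    }
    print(average_ranks(placeholder_scores))
    print(friedman_statistic(placeholder_scores))
    print(balanced_accuracy(tp=8, fn=2, tn=80, fp=10))
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom; if the null hypothesis of equal performance is rejected, a post hoc test is typically used to identify which classifiers differ. In practice, the per-dataset scores would come from a resampling scheme such as stratified cross-validation rather than a single train/test split.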

ISSN 1511-3701

e-ISSN 2231-8542

Article ID: JST-0856-2017