e-ISSN 2231-8526
ISSN 0128-7680
Zulfikar Sembiring, Khairul Najmy Abdul Rani and Amiza Amir
Pertanika Journal of Science & Technology, Pre-Press
DOI: https://doi.org/10.47836/pjst.33.4.13
Keywords: Accuracy, activation function, body motion datasets, gradient descent (GD), learning rate, long short-term memory (LSTM), Recurrent Neural Network (RNN), running time
Published: 2025-07-04
In this study, a Recurrent Neural Network (RNN) architecture is used to analyse and compare the seven most widely used first-order stochastic gradient-based optimization algorithms: Adaptive Moment Estimation (ADAM), Root Mean Square Propagation (RMSprop), Stochastic Gradient Descent (SGD), Adaptive Gradient (AdaGrad), Adaptive Delta (AdaDelta), Nesterov-accelerated Adaptive Moment Estimation (NADAM), and Maximum Adaptive Moment Estimation (AdaMax). The study uses the body motion datasets from the University of California, Irvine (UCI) Machine Learning (ML) repository. The experiment evaluates combinations of optimizer, long short-term memory (LSTM) architecture, activation function, and learning rate. The main aim is to understand how well each optimizer performs in terms of test accuracy and training time across various learning rates and activation functions. The outcomes vary by configuration, with some achieving higher accuracy and shorter training times than others. The AdaGrad model, which uses the exponential and sigmoid activation functions with a learning rate of 0.001, achieves a training time of 17.1 minutes and a test accuracy of 78.31%, making it the top-performing configuration. The exponential activation function consistently outperforms the alternatives, delivering high accuracy and short running times across numerous models and optimizers, while the Softmax activation function consistently underperforms.
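A minimal sketch of the kind of comparison described in the abstract, assuming a Keras/TensorFlow implementation. The layer sizes, input shape, and placement of the activation function are illustrative assumptions, not details taken from the paper.

```python
# Sketch only: framework choice, layer sizes, and input shape are assumptions,
# not specifications from the paper.
from tensorflow import keras
from tensorflow.keras import layers


def build_lstm(optimizer_cls, activation, learning_rate,
               timesteps=128, features=9, classes=6):
    """Build a small LSTM classifier with a given optimizer, activation, and learning rate."""
    model = keras.Sequential([
        layers.Input(shape=(timesteps, features)),
        layers.LSTM(64, activation=activation),
        layers.Dense(classes, activation="softmax"),
    ])
    model.compile(
        optimizer=optimizer_cls(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# The seven optimizers compared in the study; each would be paired with an
# activation function and a learning rate, then evaluated on test accuracy
# and training time.
optimizers = {
    "ADAM": keras.optimizers.Adam,
    "RMSprop": keras.optimizers.RMSprop,
    "SGD": keras.optimizers.SGD,
    "AdaGrad": keras.optimizers.Adagrad,
    "AdaDelta": keras.optimizers.Adadelta,
    "NADAM": keras.optimizers.Nadam,
    "AdaMax": keras.optimizers.Adamax,
}

# Example: the best-performing configuration reported in the abstract
# (AdaGrad, exponential activation, learning rate 0.001).
model = build_lstm(keras.optimizers.Adagrad, "exponential", 1e-3)
model.summary()
```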