e-ISSN 2231-8526
ISSN 0128-7680
Zulfikar Sembiring, Khairul Najmy Abdul Rani and Amiza Amir
Pertanika Journal of Science & Technology, Pre-Press
DOI: https://doi.org/10.47836/pjst.33.4.13
Keywords: Accuracy, activation function, body motion datasets, gradient descent (GD), learning rate, long short-term memory (LSTM), Recurrent Neural Network (RNN), running time
Published: 2025-07-04
In this study, a Recurrent Neural Network (RNN) architecture is used to analyse and compare the seven most widely used first-order stochastic gradient-based optimization algorithms: Adaptive Moment Estimation (ADAM), Root Mean Square Propagation (RMSprop), Stochastic Gradient Descent (SGD), Adaptive Gradient (AdaGrad), Adaptive Delta (AdaDelta), Nesterov-accelerated Adaptive Moment Estimation (NADAM), and Maximum Adaptive Moment Estimation (AdaMax). The study uses the body motion datasets from the University of California, Irvine (UCI) Machine Learning (ML) repository. The experiment evaluates combinations of optimizer, long short-term memory (LSTM) architecture, activation function, and learning rate. The main aim is to understand how well each optimizer performs in terms of test accuracy and training time across various learning rates and activation functions. The outcomes vary by configuration, with some achieving higher accuracy and shorter training times than others. The AdaGrad model, which uses the exponential and sigmoid activation functions with a learning rate of 0.001, achieves a training time of 17.1 minutes and a test accuracy of 78.31%, making it the top-performing configuration. The exponential activation function consistently outperforms the alternatives, delivering high accuracy and short running times across numerous models and optimizers, while the Softmax activation function consistently underperforms.
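A minimal sketch of the kind of comparison described in the abstract, assuming a Keras/TensorFlow implementation. The layer sizes, input shape, and placement of the activation function are illustrative assumptions, not details taken from the paper.

```python
# Sketch only: framework choice, layer sizes, and input shape are assumptions,
# not specifications from the paper.
from tensorflow import keras
from tensorflow.keras import layers


def build_lstm(optimizer_cls, activation, learning_rate,
               timesteps=128, features=9, classes=6):
    """Build a small LSTM classifier with a given optimizer, activation, and learning rate."""
    model = keras.Sequential([
        layers.Input(shape=(timesteps, features)),
        layers.LSTM(64, activation=activation),
        layers.Dense(classes, activation="softmax"),
    ])
    model.compile(
        optimizer=optimizer_cls(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# The seven optimizers compared in the study; each would be paired with an
# activation function and a learning rate, then evaluated on test accuracy
# and training time.
optimizers = {
    "ADAM": keras.optimizers.Adam,
    "RMSprop": keras.optimizers.RMSprop,
    "SGD": keras.optimizers.SGD,
    "AdaGrad": keras.optimizers.Adagrad,
    "AdaDelta": keras.optimizers.Adadelta,
    "NADAM": keras.optimizers.Nadam,
    "AdaMax": keras.optimizers.Adamax,
}

# Example: the best-performing configuration reported in the abstract
# (AdaGrad, exponential activation, learning rate 0.001).
model = build_lstm(keras.optimizers.Adagrad, "exponential", 1e-3)
model.summary()
```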