e-ISSN 2231-8526
ISSN 0128-7680

JST-4013-2022


Intelligence System via Machine Learning Algorithms in Detecting the Moisture Content Removal Parameters of Seaweed Big Data

Olayemi Joshua Ibidoja, Fam Pei Shan, Mukhtar Eri Suheri, Jumat Sulaiman and Majid Khan Majahar Ali

Pertanika Journal of Science & Technology, Volume 31, Issue 6, October 2023


Keywords: Big data, drying, machine learning, seaweed, variable selection

Published on: 12 October 2023

Identifying the parameters that govern moisture content removal has become essential in seaweed research, as it can reduce cost and improve both the quality and quantity of the seaweed. Many parameters are involved in the seaweed drying process, which makes it hard to find a model that determines which of them matter. This study compares the performance of machine learning algorithms on seaweed big data. Four machine learning algorithms (bagging, boosting, support vector machine, and random forest) were used to determine the significant parameters from data obtained from a v-GHSD (v-Groove Hybrid Solar Drier). The mean absolute percentage error (MAPE) and the coefficient of determination (R²) were used to assess the models. Variable selection is especially important in big data, where the number of variables and parameters can exceed the number of observations: it reduces model complexity, avoids the curse of dimensionality, reduces cost, removes irrelevant variables, and increases precision. A total of 435 drying parameters were considered for moisture content removal, and each algorithm was used to select the 15, 25, 35 and 45 most significant parameters. For random forest with the 45 parameters of highest variable importance, the MAPE and R² are 2.13 and 0.9732, respectively. It performed best, with the lowest error and the highest R². These results indicate that random forest is the best of the four algorithms for identifying the vital drying parameters for moisture content removal.
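The two evaluation metrics and the "keep the k most important parameters" step can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the parameter names, and all numbers below are hypothetical, and MAPE is assumed here to be expressed as a percentage. In the study, the importance scores would come from a fitted model such as random forest; here they are simply supplied by hand.

```python
from typing import Dict, List, Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (assumes no actual value is 0)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def top_k_parameters(importance: Dict[str, float], k: int) -> List[str]:
    """Return the k parameter names with the highest importance scores."""
    return sorted(importance, key=importance.get, reverse=True)[:k]


# Hypothetical moisture-content readings (%) and model predictions.
actual = [80.0, 60.0, 40.0, 20.0]
predicted = [78.0, 63.0, 38.0, 21.0]

# Hypothetical importance scores; the names are illustrative only.
importance = {
    "temperature": 0.50,
    "humidity": 0.30,
    "solar_radiation": 0.15,
    "wind_speed": 0.05,
}

print(f"MAPE = {mape(actual, predicted):.3f}")
print(f"R^2  = {r_squared(actual, predicted):.3f}")
print("Top 2:", top_k_parameters(importance, 2))
```

A lower MAPE and an R² closer to 1 both indicate a better fit, which is the sense in which the abstract ranks the algorithms at each subset size (15, 25, 35 and 45 parameters).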
