e-ISSN 2231-8526
ISSN 0128-7680
Norlina Mohd Sabri, Izzatul Syahirah Ismail, Nik Marsyahariani Nik Daud and Nor Azila Awang Abu Bakar
Pertanika Journal of Science & Technology, Volume 33, Issue S3, December 2025
DOI: https://doi.org/10.47836/pjst.33.S3.07
Keywords: Awareness, climate change, public sentiment, Support Vector Machine
Published on: 2025-04-24
Climate change threatens human society and natural ecosystems, yet public opinion surveys have found that public awareness and concern remain low. If society is unaware of climate change, activities such as open burning, deforestation, and the excessive release of carbon dioxide are unlikely to be reduced. Of the several methods for gauging public opinion on climate change, sentiment analysis of Twitter posts is one of the most convenient and efficient. This study uses machine learning techniques to collect and analyze public opinion on climate change from Twitter. Given the increasing occurrence of natural disasters worldwide, understanding public awareness of climate change is crucial. The objective of the study is to analyze public sentiment on climate change awareness using the Support Vector Machine (SVM) algorithm. The methodology consists of several phases: data collection, pre-processing, labeling, feature extraction, and classifier evaluation. The evaluation results indicate that SVM achieved an accuracy of 91% with an 80:20 data split. The SVM classifier also produced high precision, recall, and F1-score. The government and non-governmental organizations (NGOs) could use the study results to help spread awareness of climate change issues. Future work will improve the classifier by analyzing non-English tweets and using SentiWordNet to handle word ambiguity in the messages.
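The phases named in the abstract (feature extraction, an 80:20 split, SVM training, and evaluation by accuracy, precision, recall, and F1-score) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the bag-of-words features, the Pegasos-style sub-gradient training of a linear SVM, the hyperparameters, all function names, and the toy tweets are assumptions made up for illustration. In practice a library classifier and richer pre-processing would likely be used.

```python
import random
from collections import Counter


def extract_features(text):
    """Bag-of-words counts; an assumed stand-in for the study's feature extraction."""
    return Counter(text.lower().split())


def split_80_20(rows, seed=0):
    """Shuffle the rows and split them into 80% training and 20% test data."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * 0.8)
    return rows[:cut], rows[cut:]


def train_linear_svm(samples, epochs=50, lam=0.01):
    """Linear SVM trained by sub-gradient descent on the hinge loss (Pegasos-style).

    samples: list of (feature Counter, label) pairs with labels in {+1, -1}.
    """
    w, t = {}, 0
    for _ in range(epochs):
        for feats, y in samples:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y * sum(w.get(f, 0.0) * v for f, v in feats.items())
            for f in w:  # shrink weights (L2 regularisation)
                w[f] *= 1.0 - eta * lam
            if margin < 1:  # sample violates the margin: move towards it
                for f, v in feats.items():
                    w[f] = w.get(f, 0.0) + eta * y * v
    return w


def predict(w, feats):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(w.get(f, 0.0) * v for f, v in feats.items())
    return 1 if score >= 0 else -1


def evaluate(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def run_demo():
    # Hypothetical labelled tweets (+1 = aware/concerned, -1 = dismissive).
    corpus = [
        ("we must act on climate change now", 1),
        ("deforestation is making floods worse", 1),
        ("stop open burning and protect the air", 1),
        ("rising seas worry me a lot", 1),
        ("cutting emissions matters for our children", 1),
        ("climate change is a hoax", -1),
        ("the weather has always changed nothing new", -1),
        ("global warming is fake news", -1),
        ("too much fuss about carbon", -1),
        ("nothing to worry about it is just weather", -1),
    ]
    train, test = split_80_20(corpus)
    w = train_linear_svm([(extract_features(t), y) for t, y in train])
    y_true = [y for _, y in test]
    y_pred = [predict(w, extract_features(t)) for t, _ in test]
    print(evaluate(y_true, y_pred))


if __name__ == "__main__":
    run_demo()
```

With only ten toy tweets the printed metrics carry no meaning; the sketch only shows how the reported figures (accuracy, precision, recall, F1-score on a held-out 20%) are derived from a trained classifier's predictions.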