PERTANIKA JOURNAL OF SCIENCE AND TECHNOLOGY

 

e-ISSN 2231-8526
ISSN 0128-7680


A Reliable Multimetric Straggling Task Detection

Lukuman Saheed Ajibade, Kamalrulnizam Abu Bakar, Muhammed Nura Yusuf and Babangida Isyaku

Pertanika Journal of Science & Technology, Volume 32, Issue 5, August 2024

DOI: https://doi.org/10.47836/pjst.32.5.19

Keywords: Big data, MapReduce, progress score, straggling tasks, stragglers

Published on: 26 August 2024

Detecting straggling tasks is one of the most difficult problems in using MapReduce to parallelise and distribute large-scale data processing. Straggler detection means identifying tasks that are running on weak nodes. The overall execution time is the sum of the execution times of five stages: two in the Map phase (copy and combine) and three in the Reduce phase (shuffle, sort, and reduce). The main objective of this study is to calculate the remaining time to complete a task and the time already taken, and to detect the straggler(s) in parallel execution. The proposed method, Reliable Multimetric Straggling Task Detection (RMSTD), uses three metrics to detect straggling tasks: Progress Score (PS), Progress Rate (PR), and Remaining Time (RT). The results were compared with popular algorithms in this domain, namely Longest Approximate Time to End (LATE) and Combinatory Late-Machine (CLM). RMSTD was shown to be capable of detecting straggling tasks, accurately estimating execution time, and supporting task acceleration, and it outperforms LATE by 23.30% and CLM by 19.51%.
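The abstract does not give the formulas behind the three metrics, so the following is only a minimal sketch of how PS, PR, and RT can be combined, not the authors' RMSTD implementation. It assumes the definitions commonly associated with the LATE heuristic: PR = PS / elapsed time and RT = (1 - PS) / PR. The names `TaskSample` and `detect_stragglers` and the `slow_factor` threshold are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskSample:
    """One observation of a running task (hypothetical structure)."""
    task_id: str
    progress_score: float  # PS, fraction of work done, in [0, 1]
    elapsed_s: float       # seconds since the task started


def progress_rate(t: TaskSample) -> float:
    """PR = PS / elapsed time (assumed LATE-style definition)."""
    if t.elapsed_s <= 0:
        return 0.0
    return t.progress_score / t.elapsed_s


def remaining_time(t: TaskSample) -> float:
    """RT = (1 - PS) / PR; infinite if the task has made no progress."""
    pr = progress_rate(t)
    if pr == 0:
        return float("inf")
    return (1.0 - t.progress_score) / pr


def detect_stragglers(tasks: List[TaskSample],
                      slow_factor: float = 0.5) -> List[TaskSample]:
    """Flag tasks whose PR is below slow_factor times the mean PR.

    The threshold rule is an illustrative assumption. Flagged tasks are
    returned with the longest estimated remaining time first.
    """
    if not tasks:
        return []
    mean_rate = sum(progress_rate(t) for t in tasks) / len(tasks)
    flagged = [t for t in tasks if progress_rate(t) < slow_factor * mean_rate]
    return sorted(flagged, key=remaining_time, reverse=True)


if __name__ == "__main__":
    samples = [
        TaskSample("t1", 0.80, 100.0),
        TaskSample("t2", 0.75, 100.0),
        TaskSample("t3", 0.10, 100.0),
    ]
    print([t.task_id for t in detect_stragglers(samples)])
```

With these sample values, only `t3` falls below half of the mean progress rate, so it is the only task flagged. A real detector would also need to weight the five stages named above, which this sketch leaves out.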
