USING ARTIFICIAL INTELLIGENCE TECHNOLOGIES TO PREDICT FAILURES IN CLOUD INFRASTRUCTURES

Authors

DOI:

https://doi.org/10.35546/kntu2078-4481.2026.1.38

Keywords:

deep learning, machine learning, anomaly detection, AIOps, MLOps, Random Forest, XGBoost, LSTM, GRU

Abstract

Cloud infrastructures have become an integral part of business processes, which makes their continuity and stability essential. With the rapid development of artificial intelligence (AI), there is growing interest in applying it to predict failures in cloud systems. The purpose of the article is to study the AI models and methods used for failure prediction and to assess their effectiveness, their advantages, and the challenges organizations face when implementing them. The study uses theoretical methods (analysis, synthesis, abstraction, induction, deduction) and empirical methods, including description. AI-based failure prediction in cloud infrastructures covers technical failures, performance anomalies, cyber threats, service failures, integration issues, and human error; models and methods such as Random Forest, XGBoost, LSTM, and GRU, together with AIOps technologies, help improve system reliability. The article reviews models and methods for predictive time series analytics and anomaly detection, and examines AIOps technologies that automate monitoring and risk management. It analyzes the potential benefits of implementing AI in cloud systems, including improved forecasting accuracy, the ability to process large amounts of data, and early detection of anomalies. It also examines the challenges organizations face, such as the need for large data sets, high computational costs, and cybersecurity risks. Based on this analysis, the article gives practical recommendations for integrating AI technologies into monitoring and risk management processes: using digital twins, AI models for early anomaly detection, graph neural networks, and MLOps tools to automate the testing of cloud infrastructure. These measures are expected to minimize the risks associated with updates and increase system resilience.
Therefore, the use of AI technologies for failure prediction in cloud infrastructures can significantly increase their reliability and performance, but requires careful planning and consideration of possible challenges.
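
As a rough illustration of what early anomaly detection on a monitoring time series can look like, the sketch below flags points that deviate sharply from the recent past. It is a simple rolling z-score baseline, not one of the models reviewed in the article (Random Forest, XGBoost, LSTM, GRU). The function name, the window size, the threshold, and the sample CPU utilization values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices of points that deviate from the preceding window.

    A point is flagged when its distance from the mean of the previous
    `window` samples exceeds `threshold` sample standard deviations.
    (Hypothetical helper; window and threshold are illustrative defaults.)
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change from the constant level as anomalous.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up CPU utilization samples (percent) with one spike at index 7.
cpu_utilization = [41.0, 43.0, 42.0, 44.0, 42.0, 43.0, 41.0, 95.0, 44.0, 42.0]
anomalous_indices = rolling_zscore_anomalies(cpu_utilization)
print(anomalous_indices)
```

A baseline of this kind needs no training data, which is one reason it is often used as a reference point before moving to learned models; it does not capture seasonality or dependencies between metrics.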

Published

2026-04-30