MODELING DECISION-MAKING FOR MICROSERVICE RESOURCE MANAGEMENT AND VIRTUAL MACHINE SELECTION IN A KUBERNETES ENVIRONMENT
DOI:
https://doi.org/10.35546/kntu2078-4481.2025.3.2.67
Keywords:
microservices, Kubernetes, resource management, load forecasting, autoscaling, optimization, virtual machines, information technology, information systems
Abstract
Modern cloud systems built on microservice architectures face a critical challenge: managing resources efficiently under dynamic workloads. The standard reactive mechanisms of the Kubernetes orchestrator, such as the Horizontal Pod Autoscaler, often prove insufficient because they respond with a delay, do not consider future load trends, and are subject to the inertia of launching new microservice replicas and virtual machines. This results in inefficient resource utilization, performance fluctuations (flapping), and excessive operational costs. In response to these challenges, the paper proposes a comprehensive mathematical model for optimal proactive resource management that combines load forecasting, dynamic scaling of microservices with startup delays taken into account, and optimal pod placement on virtual machines. The aim of the work is to develop a mathematical optimization model for automated microservice resource management and dynamic virtual machine selection. The subject domain is formalized by defining the key sets: classes of virtual machines, active virtual machine instances, types of microservices, and their replicas (pods). Load is forecast over the planning horizon from historical CPU and memory metrics, and for each type of microservice the number of replicas required to handle the expected load is calculated. The objective function and constraints of the problem of optimal resource management and virtual machine selection are constructed and formalized under a set of accepted assumptions. The proposed approach increases the resilience and performance of microservice applications while reducing operational costs by eliminating excessive overprovisioning and consolidating workloads on the most cost-effective types of virtual machines. The results provide a theoretical foundation for further research and for the practical implementation of an intelligent orchestration system for Kubernetes.
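The abstract does not state the model's formulas, so the following Python sketch is only an illustration of the three stages it names (forecasting, replica calculation, pod placement with virtual machine selection), not the paper's model. Everything in it is an assumption made for the example: the linear-trend forecast, the `headroom` factor, the first-fit-decreasing heuristic, the restriction to a single virtual machine class per plan, and all names (`VmClass`, `forecast_peak`, `desired_replicas`, `place_pods`, `cheapest_plan`).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VmClass:
    """Hypothetical VM class: capacity and hourly price (illustrative units)."""
    name: str
    cpu: float    # vCPU cores
    mem: float    # GiB
    price: float  # cost per hour


def forecast_peak(history, horizon):
    """Assumed forecast: extrapolate the average step of the history linearly
    and return the highest predicted load over the next `horizon` steps."""
    if len(history) < 2:
        raise ValueError("need at least two historical points")
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return max(history[-1] + slope * k for k in range(1, horizon + 1))


def desired_replicas(peak_load, per_replica_capacity, headroom=1.2):
    """Replicas needed to serve the forecast peak with a safety margin."""
    return max(1, math.ceil(peak_load * headroom / per_replica_capacity))


def place_pods(pods, vm_class):
    """First-fit-decreasing placement of (cpu, mem) pods on VMs of one class.
    Returns a list of VMs, each represented by the list of pods it hosts."""
    vms = []  # each entry: [free_cpu, free_mem, hosted_pods]
    for cpu, mem in sorted(pods, reverse=True):
        if cpu > vm_class.cpu or mem > vm_class.mem:
            raise ValueError("pod does not fit on this VM class")
        for vm in vms:
            if vm[0] >= cpu and vm[1] >= mem:
                vm[0] -= cpu
                vm[1] -= mem
                vm[2].append((cpu, mem))
                break
        else:
            # No existing VM has room, so start a new one.
            vms.append([vm_class.cpu - cpu, vm_class.mem - mem, [(cpu, mem)]])
    return [vm[2] for vm in vms]


def cheapest_plan(pods, vm_classes):
    """Try each VM class on its own and keep the cheapest feasible plan.
    Returns (vm_class, hourly_cost, placement) or None if nothing fits."""
    best = None
    for vm_class in vm_classes:
        try:
            placement = place_pods(pods, vm_class)
        except ValueError:
            continue
        cost = len(placement) * vm_class.price
        if best is None or cost < best[1]:
            best = (vm_class, cost, placement)
    return best


# Usage with made-up numbers: requests per second observed in recent intervals.
history = [100.0, 120.0, 140.0, 160.0]
peak = forecast_peak(history, horizon=3)
replicas = desired_replicas(peak, per_replica_capacity=50.0)
pods = [(0.5, 1.0)] * replicas  # each pod requests 0.5 vCPU and 1 GiB
classes = [VmClass("small", 2.0, 4.0, 0.10), VmClass("large", 4.0, 8.0, 0.17)]
plan = cheapest_plan(pods, classes)
```

Sizing replicas for the forecast peak rather than the current load is what makes the sketch proactive: capacity is requested before the load arrives, which leaves time for pod and virtual machine startup. A full optimization model would also allow mixed virtual machine classes and would penalize frequent scaling changes, both of which this heuristic omits.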