APPROACH TO DEVELOPING ARCHITECTURE OF A HETEROGENEOUS MULTICOMPUTER TASK PLANNING SYSTEM
DOI:
https://doi.org/10.32782/KNTU2618-0340/2021.4.2.1.29
Keywords:
heterogeneous multicomputer systems, software architecture, fault tolerance, interaction model, consensus algorithms
Abstract
When developing the architecture of a distributed task scheduling system, developers inevitably face the problem of making the system operate as a single whole. Heterogeneous multicomputer systems built from nodes running network operating systems scale well, but ensuring their security and transparency is challenging. A sound architectural approach helps to offset these shortcomings and to ensure fault tolerance and data consistency. Because the design of a distributed task scheduling system must address load balancing across the nodes that execute tasks, the literature on this topic is reviewed, along with other recent literature on the development of distributed systems in general. The design of a distributed system is considered in terms of interaction models, message brokers, types of architecture, and consensus algorithms. Several models of interaction in distributed systems are examined: remote procedure call (RPC), remote method invocation (RMI), message-oriented middleware (MOM), and streaming; the most flexible one for building a distributed task scheduling system is identified. The article compares message brokers (RabbitMQ, Apache Kafka, ZeroMQ) for routing messages within a distributed system, with an emphasis on the reliability of message delivery. In addition, grid and cluster architectures are considered, their key features are summarized, and the characteristics of the developed system are presented. The Paxos and Raft methods for ensuring data consistency in distributed systems are also described. A failover model is presented to simplify system development and startup.
In addition, a BPMN diagram of task execution within the developed distributed system and a diagram of the system's architecture are presented. The article reports the results of an experiment to determine the scalability of the developed system and analyzes the features of the Go (Golang) programming language, in which the system is written.