CN106817408B

CN106817408B - Distributed server cluster scheduling method and device

Info

Publication number: CN106817408B
Application number: CN201611229087.5A
Authority: CN
Inventors: 鲁逸丁; 李彪朋; 任明; 徐景良
Original assignee: China Unionpay Co Ltd
Current assignee: China Unionpay Co Ltd
Priority date: 2016-12-27
Filing date: 2016-12-27
Publication date: 2020-09-29
Anticipated expiration: 2036-12-27
Also published as: CN106817408A

Abstract

The invention discloses a distributed server cluster scheduling method and device, and relates to the technical field of operation platform management. The method includes: a master server receives an execution message sent by a network server via a message queue server, wherein the execution message includes the pending tasks corresponding to each client server; the master server obtains the state of each slave server in a slave server cluster, wherein the slave server cluster includes at least two slave servers; the master server determines the slave servers to be called according to the number of client servers with pending tasks, the maximum threshold of client servers that a slave server can call, and the state of each slave server; and the master server distributes the pending tasks corresponding to each client server in the execution message to the slave servers to be called. In the embodiments of the present invention, by calling the slave servers in the server cluster, tasks are quickly delivered to the corresponding clients for execution, so that client utilization is maximized and high availability is achieved.

Description

Distributed server cluster scheduling method and device

Technical Field

The invention relates to the technical field of operation platform management, and in particular to a distributed server cluster scheduling method and device.

Background

Distributed database systems have become an important and rapidly developing field of information processing. They have the following advantages:

1. They solve the problem of organizations that are geographically scattered but whose data must be interconnected. In a banking system, for example, the head office and the branches are located in different cities or in different districts of a city. Each needs to process its own business data and also to exchange and process data with the others, which requires a distributed system.

2. If an organization needs to expand by adding new, relatively autonomous organizational units, a distributed database system can be extended with minimal impact on the existing organization.

3. They meet the need for load balancing. Data is partitioned so as to maximize local application access, which minimizes interference between processors. The load is shared among the processors, which avoids critical bottlenecks.

4. When an organization already has several database systems and the need for global applications grows, a distributed database system can be built bottom-up from those databases.

5. A distributed database system is no less likely to fail than a centralized database system of the same scale, but because the impact of a failure is confined to local data applications, the reliability of the system as a whole is relatively high.

With the rapid growth of enterprise business, the number of servers and the demand for server maintenance increase day by day. However, the prior art uses a single-node server as the task dispatching platform, which lacks a high-availability mechanism, executes tasks slowly, and cannot manage computer clusters in a timely and efficient manner.

In summary, the prior art does not provide an effective and efficient method for dispatching tasks.

Summary of the Invention

The present invention provides a distributed server cluster scheduling method and device to solve the problem that the prior art does not provide an effective and efficient method for dispatching tasks.

An embodiment of the present invention provides a distributed server cluster scheduling method, the method including:

a master server receives an execution message sent by a network server via a message queue server, wherein the execution message includes the pending tasks corresponding to each client server;

the master server obtains the state of each slave server in a slave server cluster, wherein the slave server cluster includes at least two slave servers;

the master server determines the slave servers to be called according to the number of client servers with pending tasks, the maximum threshold of client servers that a slave server can call, and the state of each slave server;

the master server distributes the pending tasks corresponding to each client server in the execution message to the slave servers to be called.

In this embodiment of the present invention, after receiving the pending tasks, the master server determines the slave servers to be called according to the number of tasks, the maximum threshold of clients that each slave server can call, and the current running state of each slave server in the slave server cluster, and dispatches the tasks to those slave servers. By calling the slave servers in the server cluster, tasks can be quickly dispatched to the corresponding clients for execution, so that client utilization is maximized; and when a slave server in the slave server cluster fails, tasks can still be delivered to the clients, which achieves high availability.
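The selection step can be illustrated with a minimal Python sketch. It is not part of the claimed method: the `SlaveStatus` record, the `select_slaves` function, and the rule of preferring slaves with the most free capacity are all assumptions made for illustration, since the text only states which inputs the master server considers.

```python
from dataclasses import dataclass


@dataclass
class SlaveStatus:
    """Hypothetical record of what the master knows about one slave server."""
    name: str
    healthy: bool   # True if the state is normal
    in_use: int     # client servers this slave is already calling


def select_slaves(n_clients, max_threshold, slaves):
    """Pick slaves whose free capacity covers n_clients.

    Returns a list of (slave name, number of clients to hand over), or
    None when the healthy slaves cannot cover the request.
    """
    # Only healthy slaves below the maximum threshold are usable.
    usable = [s for s in slaves if s.healthy and s.in_use < max_threshold]
    # Assumption: use the slaves with the most free capacity first.
    usable.sort(key=lambda s: max_threshold - s.in_use, reverse=True)

    plan, remaining = [], n_clients
    for s in usable:
        if remaining == 0:
            break
        take = min(max_threshold - s.in_use, remaining)
        plan.append((s.name, take))
        remaining -= take
    return plan if remaining == 0 else None


slaves = [
    SlaveStatus("slave-1", healthy=True, in_use=2),
    SlaveStatus("slave-2", healthy=False, in_use=0),
    SlaveStatus("slave-3", healthy=True, in_use=0),
]
plan = select_slaves(n_clients=6, max_threshold=5, slaves=slaves)
print(plan)
```

In this example the abnormal slave is skipped, and a request that exceeds the total free capacity returns `None`, which is the situation in which more slave servers would be needed.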

Further, the master server distributing the pending tasks corresponding to each client server in the execution message to the slave servers to be called includes:

the master server splits the pending tasks corresponding to each client server in the execution message into task lists, one for each slave server to be called, and sends each task list to the corresponding slave server;

After the master server distributes the pending tasks corresponding to each client server in the execution message to the slave servers to be called, the method further includes:

for each slave server to be called, the slave server calls the corresponding client servers according to the received task list; the slave server receives the task processing results from the client servers and sends the task processing results to the network server through the message queue server.

In this embodiment of the present invention, the master server dispatches the acquired tasks to the slave servers that call the clients corresponding to those tasks, so that each slave server can determine which clients to call to execute the tasks.
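As a hedged illustration of the splitting step, the sketch below cuts a list of (client, task) pairs into one task list per slave server. The function name `split_into_task_lists` and the shape of the `plan` argument (slave name plus the number of clients that slave should call) are hypothetical; the text does not prescribe a data format.

```python
def split_into_task_lists(tasks, plan):
    """Cut (client, task) pairs into one task list per slave.

    tasks: list of (client name, task content) pairs from the execution message.
    plan:  list of (slave name, number of clients that slave should call).
    """
    if sum(count for _, count in plan) < len(tasks):
        raise ValueError("plan does not cover every pending task")
    task_lists, start = {}, 0
    for slave, count in plan:
        chunk = tasks[start:start + count]
        if chunk:  # skip slaves that end up with nothing to do
            task_lists[slave] = chunk
        start += count
    return task_lists


tasks = [
    ("client-1", "task content 1"),
    ("client-2", "task content 2"),
    ("client-3", "task content 3"),
]
task_lists = split_into_task_lists(tasks, [("slave-1", 2), ("slave-3", 1)])
for slave, task_list in task_lists.items():
    print(slave, task_list)
```

Each slave then only needs its own list to know which client servers to call.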

Further, the master server obtaining the state of each slave server in the slave server cluster includes:

a management server obtains the state of each slave server in the slave server cluster;

the management server synchronizes the state of each slave server to at least one standby management server;

the master server determines whether the running state of the management server is normal;

if the master server determines that the state of the management server is abnormal, it calls any one of the at least one standby management server to obtain the state of each slave server in the slave server cluster.

In this embodiment of the present invention, the management server synchronizes the state of each slave server to the standby management server, so that the state of each slave server can still be obtained in time when the management server fails, which achieves high availability.
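The failover behaviour can be sketched as follows. This is an illustrative model only: the `ManagementServer` class, its `alive` flag, and the `sync_states` and `get_slave_states` helpers are hypothetical names, and a real deployment would replicate state over the network rather than by copying a dictionary.

```python
class ManagementServer:
    """Hypothetical stand-in for a management server or one of its standbys."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.slave_states = {}

    def is_normal(self):
        return self.alive


def sync_states(primary, standbys):
    """Copy the primary's view of the slave states to every standby."""
    for standby in standbys:
        standby.slave_states = dict(primary.slave_states)


def get_slave_states(primary, standbys):
    """Ask the primary first; fall back to any standby that is normal."""
    for server in [primary, *standbys]:
        if server.is_normal():
            return server.name, dict(server.slave_states)
    raise RuntimeError("no management server is available")


primary = ManagementServer("mgmt")
standby = ManagementServer("mgmt-standby")
primary.slave_states = {"slave-1": "normal", "slave-2": "abnormal"}
sync_states(primary, [standby])

primary.alive = False  # simulate a failure of the management server
source, states = get_slave_states(primary, [standby])
print(source, states)
```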

Further, the management server obtaining the state of each slave server in the slave server cluster includes:

for any slave server in the slave server cluster, if the management server cannot obtain the heartbeat detection information of the slave server within a set time, the state of the slave server is determined to be abnormal.

In this embodiment of the present invention, each slave server stays connected to the management server through heartbeat detection messages. If the management server does not receive a heartbeat detection message from a slave server within the set time, that slave server is considered unusable, and its state information is no longer provided when the master server queries the states.
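A minimal sketch of heartbeat-based state detection is shown below. The `HeartbeatMonitor` class and the 10-second timeout are assumptions chosen for illustration; timestamps are passed in explicitly so that the example is deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time of each registered slave server."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}

    def heartbeat(self, slave, now):
        """Record that a heartbeat from `slave` arrived at time `now`."""
        self.last_seen[slave] = now

    def state(self, slave, now):
        """A slave is abnormal if no heartbeat arrived within the timeout."""
        last = self.last_seen.get(slave)
        if last is None or now - last > self.timeout:
            return "abnormal"
        return "normal"

    def normal_slaves(self, now):
        """Only slaves in the normal state are reported to the master."""
        return [s for s in self.last_seen if self.state(s, now) == "normal"]


monitor = HeartbeatMonitor(timeout_seconds=10)
monitor.heartbeat("slave-1", now=0)
monitor.heartbeat("slave-2", now=0)
monitor.heartbeat("slave-1", now=8)  # slave-2 stays silent after time 0

print(monitor.state("slave-1", now=15))
print(monitor.state("slave-2", now=15))
print(monitor.normal_slaves(now=15))
```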

Further, the method also includes:

if the master server determines that the usage of every slave server in the slave server cluster equals the maximum threshold, a new slave server is added, and the newly added slave server is registered in the management server.

In this embodiment of the present invention, when it is determined that the existing slave server cluster can no longer meet the task processing requirements, slave servers are added dynamically, which achieves scalability.
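The scale-out condition can be sketched as a simple check. The names `needs_new_slave`, `scale_out`, and the list used as the management server's registry are hypothetical; how a new slave server is actually provisioned is outside the scope of this sketch.

```python
def needs_new_slave(usage, max_threshold):
    """True when every slave already calls max_threshold client servers."""
    return all(in_use == max_threshold for in_use in usage.values())


def scale_out(usage, max_threshold, registry, new_slave_name):
    """Add and register a new slave only if the whole cluster is full."""
    if needs_new_slave(usage, max_threshold):
        usage[new_slave_name] = 0        # the new slave starts idle
        registry.append(new_slave_name)  # register it with the management server
        return True
    return False


usage = {"slave-1": 5, "slave-2": 5}
registry = ["slave-1", "slave-2"]
added = scale_out(usage, max_threshold=5, registry=registry, new_slave_name="slave-3")
print(added, registry)
```

A second call with the same arguments adds nothing, because the new slave still has free capacity.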

The present invention also provides a distributed server cluster scheduling device, including:

a receiving unit, configured to receive an execution message sent by a network server via a message queue server, wherein the execution message includes the pending tasks corresponding to each client server;

an obtaining unit, configured to obtain the state of each slave server in a slave server cluster, wherein the slave server cluster includes at least two slave servers;

a determining unit, configured to determine the slave servers to be called according to the number of client servers with pending tasks, the maximum threshold of client servers that a slave server can call, and the state of each slave server;

a sending unit, configured to distribute the pending tasks corresponding to each client server in the execution message to the slave servers to be called.

In this embodiment of the present invention, by calling the slave servers in the server cluster, tasks can be quickly dispatched to the corresponding clients for execution, so that client utilization is maximized; and when a slave server in the slave server cluster fails, tasks can still be delivered to the clients, which achieves high availability.

Further, the determining unit is specifically configured to:

split the pending tasks corresponding to each client server in the execution message into task lists, one for each slave server to be called, and send each task list to the corresponding slave server;

call the corresponding client servers according to the received task list; receive the task processing results from the client servers, and send the task processing results to the network server through the message queue server.

Further, the obtaining unit is specifically configured to:

obtain the state of each slave server in the slave server cluster;

synchronize the state of each slave server to at least one standby management server;

determine whether the running state of the management server is normal;

if the state of the management server is determined to be abnormal, call any one of the at least one standby management server to obtain the state of each slave server in the slave server cluster;

for any slave server in the slave server cluster, if the heartbeat detection information of the slave server cannot be obtained within a set time, determine that the state of the slave server is abnormal.

Further, the device also includes:

a dynamic adding unit, configured to add a new slave server if it is determined that the usage of every slave server in the slave server cluster equals the maximum threshold, the newly added slave server being registered in the management server.

Description of the Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic structural diagram of a distributed server cluster scheduling system according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a message queue in a message queue server according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a distributed server cluster scheduling device according to an embodiment of the present invention;

FIG. 4 is a schematic flowchart of a distributed server cluster scheduling method according to an embodiment of the present invention.

Detailed Description

In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

The present invention provides a distributed server cluster scheduling system which, as shown in FIG. 1, includes:

a master server 101, a network server 102, a message queue server 103, slave servers 104, client servers 105 and a management server 106.

The network server 102 is used to manage the system, to obtain information and feed it back to the upper layer, or to obtain commands from the upper layer and execute them in the system. In this embodiment of the present invention, the network server 102 can define tasks, compile them into a task list, and send the list to the client servers 105 for execution, wherein each task in the task list corresponds to a client server 105.

For example, in this embodiment of the present invention, the task list formulated by the network server 102 is shown in Table 1: task 1 is dispatched to client 1, task 2 to client 2, and task 3 to client 3.

| Task content | Client list |
| --- | --- |
| Task content 1 | Client 1 |
| Task content 2 | Client 2 |
| Task content 3 | Client 3 |

Table 1: Task list

In this embodiment of the present invention, the network server 102 encapsulates the task list into an execution message and sends it to the message queue server 103. Optionally, as shown in FIG. 2, the message queue server 103 holds multiple messages arranged in the order in which they were received. Assuming that the execution message is the third message in the queue, the message queue server 103 processes the execution message after it has processed the first two messages.

Optionally, in this embodiment of the present invention, the message queue server 103 subscribes to the messages of the network server 102; that is, when a new message appears in the network server 102, the message queue server 103 can obtain it.
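The first-in, first-out handling of the execution message can be sketched with an in-memory queue. The JSON layout of the message and the `build_execution_message` helper are assumptions for illustration; the text does not specify a wire format or a particular message queue product.

```python
import json
from collections import deque


def build_execution_message(task_list):
    """Wrap a task list into a single execution message (hypothetical format)."""
    return json.dumps({"type": "execute", "tasks": task_list})


queue = deque()
# Two unrelated messages arrive before the execution message.
queue.append(json.dumps({"type": "other", "id": 1}))
queue.append(json.dumps({"type": "other", "id": 2}))
queue.append(build_execution_message([
    {"client": "client-1", "task": "task content 1"},
    {"client": "client-2", "task": "task content 2"},
    {"client": "client-3", "task": "task content 3"},
]))

processed = []
while queue:
    message = json.loads(queue.popleft())  # oldest message first
    processed.append(message["type"])

print(processed)
```

The execution message is handled third, after the two messages that were received before it.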

After receiving the execution message sent by the message queue server 103, the master server 101 extracts the task list from the execution message. For example, if the master server 101 receives the task list in Table 1 above, it determines, according to the task list, which slave servers 104 in the slave server cluster need to be called.

In this embodiment of the present invention, the slave server cluster is a server cluster with a large number of slave servers 104, and the slave servers 104 in the cluster can call the client servers 105.

Optionally, in this embodiment of the present invention, the number of client servers 105 that each slave server 104 in the cluster can call has a maximum value, which can be set to m; that is, one slave server 104 can call at most m client servers 105.

In this embodiment of the present invention, before the master server 101 calls the slave servers 104, it first needs to determine the state of each slave server 104 in the slave server cluster. Optionally, the state of a slave server 104 is either abnormal or normal.

In this embodiment of the present invention, the master server 101 can obtain the states of the slave servers 104 through the management server 106. When the slave server cluster is established, each slave server 104 in the cluster sends a connection request to the management server 106, and the management server 106 can determine from the connection request that the slave server 104 has joined the slave server cluster.

After a slave server 104 establishes a connection with the management server 106, heartbeat detection messages are used to determine whether the slave server 104 is abnormal. Optionally, if the management server 106 does not receive a heartbeat detection message from the slave server 104 within a set time, the state of that slave server 104 is considered abnormal.

Optionally, in this embodiment of the present invention, the management server 106 stores a state information table of all slave servers 104 registered in the management server 106, and the table is updated periodically. For example, as shown in Table 2, N slave servers 104 are registered in the management server 106; after the state information table has been updated once, it holds two states for each slave server.

| Slave server | State 1 | State 2 |
| --- | --- | --- |
| Slave server 1 | Normal | Normal |
| Slave server 2 | Normal | Abnormal |
| ... | ... | ... |
| Slave server N | Normal | Normal |

Table 2: State information table in the management server

Optionally, in this embodiment of the present invention, the management server 106 also stores two further state information tables: one listing the slave servers 104 whose state is normal, as shown in Table 3, and one listing the slave servers 104 whose state is abnormal, as shown in Table 4.

| Slave server | State |
| --- | --- |
| Slave server 1 | Normal |
| Slave server 3 | Normal |
| ... | ... |
| Slave server N1 | Normal |

Table 3: State information table in the management server (slave servers in the normal state)

| Slave server | State |
| --- | --- |
| Slave server 2 | Abnormal |
| Slave server 4 | Abnormal |
| ... | ... |
| Slave server N2 | Abnormal |

Table 4: State information table in the management server (slave servers in the abnormal state)

Table 3 lists all slave servers 104 whose latest state is normal, for example slave server 1 and slave server 3. Optionally, what is stored is the identification information of each slave server 104, which can be its hardware address or any other identifier that uniquely identifies the slave server 104.

Table 4 lists all slave servers 104 whose latest state is abnormal, for example slave server 2 and slave server 4. Optionally, what is stored is the identification information of each slave server 104, which can be its hardware address or any other identifier that uniquely identifies the slave server 104.
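The relationship between the periodically updated state table (Table 2) and the normal and abnormal lists (Tables 3 and 4) can be sketched as follows. The helper names `record_states` and `split_by_latest_state` and the dictionary layout are assumptions for illustration only.

```python
def record_states(history, latest):
    """Append the newest probe result of each slave to its state history."""
    for slave, state in latest.items():
        history.setdefault(slave, []).append(state)


def split_by_latest_state(history):
    """Derive the normal and abnormal lists from the most recent state."""
    normal = [s for s, states in history.items() if states[-1] == "normal"]
    abnormal = [s for s, states in history.items() if states[-1] != "normal"]
    return normal, abnormal


history = {}
record_states(history, {"slave-1": "normal", "slave-2": "normal"})    # state 1
record_states(history, {"slave-1": "normal", "slave-2": "abnormal"})  # state 2

normal, abnormal = split_by_latest_state(history)
print(history)
print(normal, abnormal)
```

After one update each slave has two recorded states, and only the latest one decides which list the slave appears in.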

Optionally, in this embodiment of the present invention, the management server 106 can also obtain the working status of each slave server 104 in the list of slave servers whose state is normal. Here, the working status indicates whether a slave server 104 has already called client servers 105 and, if so, how many. As shown in Table 5, the management server 106 also stores, for each slave server 104 in the normal list, the number of client servers 105 it has called.

| Slave server | State | Usage |
| --- | --- | --- |
| Slave server 1 | Normal | 2 client servers called |
| Slave server 3 | Normal | 0 client servers called |
| Slave server 4 | Normal | m client servers called |
| ... | ... | ... |

Table 5: Working status table of the slave servers stored in the management server

The master server 101 can determine from the working status table in Table 5 whether a slave server 104 can be called and how many more client servers 105 that slave server 104 can call. For example, in this embodiment of the present invention, the maximum number of client servers 105 that a slave server 104 can call is m. Slave server 4 in Table 5 has already called m client servers 105, so it cannot call any more. Slave server 1 has already called 2 client servers 105, so it can still call m-2 client servers 105.
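The remaining capacity implied by Table 5 can be computed directly. In the sketch below the value m = 5 and the function name `remaining_capacity` are assumptions made only to obtain concrete numbers.

```python
def remaining_capacity(in_use, max_threshold):
    """Free client slots per slave: the threshold minus the clients already called."""
    return {slave: max_threshold - used for slave, used in in_use.items()}


m = 5  # assumed maximum threshold
table5 = {"slave-1": 2, "slave-3": 0, "slave-4": m}

free = remaining_capacity(table5, m)
callable_slaves = [slave for slave, slots in free.items() if slots > 0]
print(free)
print(callable_slaves)
```

Slave server 4 has no free slots and is therefore excluded, while slave server 1 can still call m-2 client servers.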

Optionally, in this embodiment of the present invention, the management server 106 replicates, at the underlying data level, all of the information lists it has obtained about the slave servers 104 to at least one standby management server 106. When the master server 101 asks the management server 106 for information about the slave servers 104, it first needs to determine whether the management server 106 is abnormal; if the management server 106 is abnormal, the master server 101 calls any one of the standby management servers 106 that is working normally to query the states of the slave servers 104.

The master server 101 determines which slave servers 104 in the slave server cluster to call according to the number of client servers 105 that need to execute tasks in the received task list, the current states of the slave servers 104, and the maximum threshold of client servers 105 that a slave server 104 can call.

In this embodiment of the present invention, assume that the number of client servers 105 that need to execute tasks is n and that the queried slave server cluster contains 20 slave servers 104 in total, of which 4 have called 0 client servers, 10 have called m client servers, and 6 have called 2 client servers. The master server 101 first determines whether the number of client servers 105 that need to execute tasks satisfies Formula 1:

n ≤ (m-2)×6 + 4m (Formula 1)

If Formula 1 is satisfied, for example n = (m-2)×6 + 4m, then all slave servers 104 in the slave server cluster are called except the 10 slave servers 104 that have already called m client servers 105; each of the 4 slave servers 104 that have called 0 client servers 105 calls m client servers 105, and each of the 6 slave servers 104 that have called 2 client servers 105 calls m-2 client servers 105.
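As a worked illustration of Formula 1, the total free capacity C of the 20 slave servers can be written out and evaluated for an assumed threshold of m = 5 (the value 5 is chosen only for illustration and does not appear in the original text):

```latex
\begin{aligned}
C &= 6\,(m-2) + 4\,m + 10\,(m-m) = 10m - 12,\\
m = 5 &\;\Rightarrow\; C = 10 \cdot 5 - 12 = 38,\\
n \le 38 &\;\Rightarrow\; \text{Formula 1 holds and the existing cluster suffices.}
\end{aligned}
```

Under this assumption the cluster can take on at most 38 additional client servers; a task list with more than 38 client servers would not satisfy Formula 1.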

The master server 101 distributes the n client servers 105 that need to execute tasks in the task list among the slave servers 104 selected above; that is, the n client servers 105 are randomly assigned to the 10 available slave servers 104, which together provide 10m-12 free slots. After receiving the client servers 105 it needs to call, each slave server 104 calls the corresponding client servers 105 to execute the tasks in the task list.
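One way to sketch the random distribution while respecting each slave server's free capacity is to shuffle a list of free slots. The function `assign_clients`, the value m = 5, and the seeded random generator are assumptions for illustration; the text does not say how the random assignment is performed.

```python
import random


def assign_clients(clients, free_slots, rng=None):
    """Randomly assign clients to slaves without exceeding any slave's free slots."""
    if rng is None:
        rng = random.Random()
    if len(clients) > sum(free_slots.values()):
        raise ValueError("not enough free capacity for all clients")
    # One entry per free slot, then shuffle to randomize the assignment.
    slots = [slave for slave, free in free_slots.items() for _ in range(free)]
    rng.shuffle(slots)
    assignment = {}
    for client, slave in zip(clients, slots):
        assignment.setdefault(slave, []).append(client)
    return assignment


m = 5  # assumed maximum threshold
free_slots = {f"slave-{i}": m for i in range(1, 5)}              # 4 idle slaves
free_slots.update({f"slave-{i}": m - 2 for i in range(5, 11)})   # 6 slaves already calling 2
clients = [f"client-{i}" for i in range(1, 39)]                  # n = 10m - 12 = 38

assignment = assign_clients(clients, free_slots, rng=random.Random(0))
for slave in sorted(assignment):
    print(slave, len(assignment[slave]))
```

Because n equals the total free capacity here, every idle slave ends up with m clients and every partially used slave with m-2 clients, whichever random order is drawn.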

In this embodiment of the present invention, the slave server 104 receives the task processing results from the client servers 105 it called after they have processed the tasks, and sends the task processing results to the message queue server 103, which forwards them to the network server 102.

Optionally, in this embodiment of the present invention, when it is determined that Formula 1 is not satisfied, the number of slave servers 104 in the slave server cluster can be increased dynamically. Optionally, a new slave server 104 is called from the database and registered in the management server 106.

Based on the same concept, an embodiment of the present invention also provides a distributed server cluster scheduling device which, as shown in FIG. 3, includes:

a receiving unit 301, configured to receive an execution message sent by a network server via a message queue server, wherein the execution message includes the pending tasks corresponding to each client server;

an obtaining unit 302, configured to obtain the state of each slave server in a slave server cluster, wherein the slave server cluster includes at least two slave servers;

a determining unit 303, configured to determine the slave servers to be called according to the number of client servers with pending tasks, the maximum threshold of client servers that a slave server can call, and the state of each slave server;

a sending unit 304, configured to distribute the pending tasks corresponding to each client server in the execution message to the slave servers to be called.

Further, the determining unit 303 is specifically configured to:

Further, the obtaining unit 302 is specifically configured to:

Further, the device also includes:

a dynamic adding unit 305, configured to add a new slave server if it is determined that the usage of every slave server in the slave server cluster equals the maximum threshold, the newly added slave server being registered in the management server.

Based on the same concept, the present invention also provides a distributed server cluster scheduling method which, as shown in FIG. 4, includes:

Step 401: the master server receives an execution message sent by the network server via the message queue server, wherein the execution message includes the tasks to be processed corresponding to each client server;

Step 402: the master server obtains the status of each slave server in the slave server cluster, wherein the slave server cluster includes at least two slave servers;

Step 403: the master server determines the slave servers that need to be called according to the number of client servers with tasks to be processed, the maximum threshold of client servers that a slave server can call, and the status of each slave server;

Step 404: the master server distributes the tasks to be processed corresponding to each client server in the execution message to the slave servers that need to be called.
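Steps 401 to 404 can be sketched as follows. This is an illustration under stated assumptions, not the patented selection procedure: the execution message is modelled as a dictionary from client server to task list, slave status as the strings "normal" and "abnormal", and the slave servers are chosen by a simple ceiling division over the maximum threshold. The function names `formula_allows` and `select_and_distribute` are hypothetical. The helper `formula_allows` only evaluates the inequality n ≤ (m-2)×6+4m given in the claims; how that condition combines with the selection is not spelled out here, so the two are kept separate.

```python
def formula_allows(n: int, m: int) -> bool:
    """Evaluate the claimed condition n <= (m - 2) * 6 + 4 * m, where n is the
    number of client servers to call and m is the maximum number of client
    servers a slave server calls."""
    return n <= (m - 2) * 6 + 4 * m


def select_and_distribute(tasks_by_client: dict, slave_states: dict, max_threshold: int) -> dict:
    """Return {slave: {client: tasks}} for the slave servers that need to be called.

    tasks_by_client: contents of the execution message (step 401), assumed shape.
    slave_states:    status of each slave server (step 402), assumed shape.
    """
    # Step 403: consider only slave servers whose status is normal.
    healthy = [s for s, state in slave_states.items() if state == "normal"]
    clients = list(tasks_by_client)
    # Assumed policy: each slave server takes at most max_threshold client servers.
    needed = -(-len(clients) // max_threshold)  # ceiling division
    if needed > len(healthy):
        raise RuntimeError("not enough healthy slave servers")
    chosen = healthy[:needed]

    # Step 404: split the pending tasks into one task list per chosen slave server.
    task_lists = {slave: {} for slave in chosen}
    for i, client in enumerate(clients):
        slave = chosen[i // max_threshold]
        task_lists[slave][client] = tasks_by_client[client]
    return task_lists
```

For example, with three client servers, a maximum threshold of 2, and slave servers `s1` and `s3` normal while `s2` is abnormal, the sketch assigns two client servers to `s1` and one to `s3`. With m = 10, the claimed inequality holds for n up to 88.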

Further, the method also includes:

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Although preferred embodiments of the present invention have been described, those skilled in the art may make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the present invention.

Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, provided that these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims

1. A distributed server cluster scheduling method, wherein the method comprises:

The master server receives an execution message sent by the network server via the message queue server, wherein the execution message includes the tasks to be processed corresponding to each client server;

The master server obtains the status of each slave server in the slave server cluster, wherein the slave server cluster includes at least two slave servers;

The master server determines the slave servers that need to be called according to the number of client servers with tasks to be processed, the maximum threshold of client servers that a slave server can call, and the status of each slave server, wherein if the following formula is satisfied, the master server calls the slave server corresponding to the client server:

n≤(m-2)×6+4m

wherein n is the number of client servers that need to be called, and m is the maximum number of client servers that a slave server calls;

The master server distributes the tasks to be processed corresponding to each client server in the execution message to the slave servers that need to be called.

2. The method according to claim 1, wherein the master server distributes the tasks to be processed corresponding to each client server in the execution message to the slave servers that need to be called, comprising:

The master server divides the tasks to be processed corresponding to the client servers in the execution message into task lists corresponding to the slave servers that need to be called, and sends each task list to the corresponding slave server that needs to be called;

After the master server distributes the pending tasks corresponding to each client server in the execution message to the slave servers to be called, the method further includes:

For each slave server to be called, the slave server calls the corresponding client server according to the received task list; the slave server receives the task processing result from the client server, and sends the task processing result to the network server through the message queue server.

3. The method according to claim 1, wherein the master server obtains the status of each slave server in the slave server cluster, comprising:

The management server obtains the status of each slave server in the slave server cluster;

The management server synchronizes the state of each slave server to at least one standby management server;

The master server determines whether the running state of the management server is normal;

If the master server determines that the state of the management server is abnormal, the master server calls any one of the at least one standby management server to obtain the status of each slave server in the slave server cluster.

4. The method according to claim 3, wherein the management server obtains the status of each slave server in the slave server cluster, comprising:

For any slave server in the slave server cluster, if the management server cannot obtain the heartbeat detection information of the slave server within a set time, it is determined that the state of the slave server is abnormal.

5. The method according to claim 3, wherein the method further comprises:

If the master server determines that the usage rate of each slave server in the slave server cluster is equal to the maximum threshold, it adds a new slave server, and the newly added slave server is registered in the management server.

6. A distributed server cluster scheduling device, comprising:

a receiving unit, configured to receive, through the master server, an execution message sent by the network server via the message queue server, wherein the execution message includes the tasks to be processed corresponding to each client server;

an obtaining unit, configured to obtain the status of each slave server in the slave server cluster through the master server, wherein the slave server cluster includes at least two slave servers;

a determining unit, configured to determine, through the master server, the slave servers that need to be called according to the number of client servers with tasks to be processed, the maximum threshold of client servers that a slave server can call, and the status of each slave server, wherein if the following formula is satisfied, the determining unit calls the slave server corresponding to the client server:

n≤(m-2)×6+4m

a sending unit, configured to distribute, through the master server, the tasks to be processed corresponding to each client server in the execution message to the slave servers that need to be called.

7. The device according to claim 6, wherein the determining unit is specifically configured to:

The master server divides the tasks to be processed corresponding to the client servers in the execution message into task lists corresponding to the slave servers that need to be called, and sends each task list to the corresponding slave server that needs to be called;

Call the corresponding client server through the master server according to the received task list; receive the task processing result of the client server through the master server, and send the task processing result to the network server through the message queue server.

8. The device according to claim 6, wherein the acquiring unit is specifically configured to:

Obtain the status of each slave server in the slave server cluster through the management server;

Synchronize the status of each slave server to at least one standby management server through the management server;

Determine, through the master server, whether the running state of the management server is normal;

If it is determined through the master server that the state of the management server is abnormal, call any one of the at least one standby management server to obtain the status of each slave server in the slave server cluster.

9. The device according to claim 8, wherein the acquiring unit is specifically configured to:

10. The apparatus of claim 8, wherein the apparatus further comprises:

a dynamic adding unit, configured to add a new slave server if it is determined through the master server that the usage rate of each slave server in the slave server cluster is equal to the maximum threshold, wherein the newly added slave server is registered in the management server.
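The heartbeat rule of claim 4 and the standby fallback of claim 3 can be sketched as follows. This is a minimal illustration under assumptions: heartbeats are modelled as last-seen timestamps, each management server is modelled as a dictionary with hypothetical keys `running` and `states`, and the first running standby is chosen, whereas the claims only require that any one standby be called. The function names are likewise hypothetical.

```python
import time


def slave_status(last_heartbeat: dict, timeout: float, now=None) -> dict:
    """Mark a slave server abnormal if no heartbeat arrived within the set time.

    last_heartbeat maps slave name -> timestamp of the last heartbeat received.
    """
    now = time.monotonic() if now is None else now
    return {
        name: ("normal" if now - seen <= timeout else "abnormal")
        for name, seen in last_heartbeat.items()
    }


def states_with_failover(primary: dict, standbys: list) -> dict:
    """Return slave states from the management server, or from a standby
    management server when the primary is not running normally."""
    if primary["running"]:
        return primary["states"]
    for standby in standbys:
        # Standbys are assumed to hold states synchronized from the primary.
        if standby["running"]:
            return standby["states"]
    raise RuntimeError("no management server available")
```

With a timeout of 5 seconds and the current time at 103, a slave server last heard from at 100 is reported as normal and one last heard from at 90 as abnormal.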