
CN113590313B - Load balancing method, device, storage medium and computing equipment - Google Patents


Info

Publication number
CN113590313B
Authority
CN
China
Prior art keywords
service
scheduling
scheduling group
processors
type
Legal status
Active
Application number
CN202110773763.XA
Other languages
Chinese (zh)
Other versions
CN113590313A
Inventor
刘迎冬
张晓龙
陈谔
陈洁
刘秀颖
Current Assignee
Hangzhou Netease Shuzhifan Technology Co ltd
Original Assignee
Hangzhou Netease Shuzhifan Technology Co ltd
Application filed by Hangzhou Netease Shuzhifan Technology Co ltd
Priority to CN202110773763.XA
Publication of CN113590313A
Application granted
Publication of CN113590313B
Status: Active

Classifications

    • G06F 9/505: Allocation of resources (e.g. of the central processing unit [CPU]) to service a request, the resource being a machine (e.g. CPUs, servers, terminals), considering the load
    • G06F 12/0875: Addressing of a memory level in which the access to the desired data or data block requires associative addressing means (caches), with dedicated cache, e.g. instruction or stack
    • G06F 15/80: Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors

Abstract

An embodiment of the present disclosure provides a load balancing method, comprising the following steps: receiving a capacity expansion instruction for a first type of service, where the capacity expansion instruction includes identification information of a second scheduling group; in response to the capacity expansion instruction, allocating a plurality of processors to the first type of service from the second scheduling group corresponding to the identification information, calling an interface to modify the binding relationship between the allocated processors and services so as to bind those processors to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service; and performing load balancing between the first scheduling group and the capacity expansion scheduling group using a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.

Description

Load balancing method, device, storage medium and computing equipment
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and more particularly, to a load balancing method, apparatus, storage medium, and computing device.
Background
This section is intended to provide a background or context to the embodiments of the disclosure recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
Hybrid deployment refers to deploying the processing tasks corresponding to at least two different types of services on the same computing device or the same computing device cluster. For example, in practical applications, online and offline services may be deployed on the same server or server cluster.
With the development of processor technology, the computing device used to carry processing tasks may be one with multiple CPU cores. To allow such a computing device to schedule work in a balanced manner among its onboard CPUs, the CPUs may typically be partitioned into scheduling domains (Sched Domains) within the operating system kernel of the computing device.
A scheduling domain is a set of CPUs, abstracted in the operating system kernel, that share attributes and scheduling policies. Each scheduling domain may in turn contain one or more scheduling groups (Sched Groups), each of which the scheduling domain treats as a separate scheduling unit. The computing device may schedule its processing tasks among the scheduling groups based on a load balancing policy, so that the processing tasks borne by the scheduling groups are load balanced.
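For orientation, the Linux kernel represents this hierarchy with linked structures. The following is a simplified C sketch loosely modeled on the kernel's internal struct sched_domain and struct sched_group; the field names are illustrative, not the kernel's exact definitions:

```c
#include <stdint.h>

/* Simplified sketch of the scheduling hierarchy, loosely modeled on the
 * Linux kernel's struct sched_domain / struct sched_group
 * (kernel/sched/sched.h). Field names are illustrative. */
struct sched_group_sketch {
    struct sched_group_sketch *next; /* groups of a domain form a ring */
    uint64_t cpu_mask;               /* CPUs belonging to this group   */
    unsigned long load;              /* aggregate load of the group    */
};

struct sched_domain_sketch {
    struct sched_domain_sketch *parent; /* enclosing (higher) level       */
    struct sched_group_sketch  *groups; /* ring of groups in this domain  */
    uint64_t span;                      /* all CPUs covered by the domain */
};
```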
In practical applications, when a computing device adopts hybrid deployment, service isolation is generally required for the different types of services borne by the device, so as to avoid interference between the co-deployed services.
Currently, service isolation for the different types of services borne by a computing device is generally achieved by binding a specific service type to each scheduling group on the device.
For example, take the case of deploying an online service and an offline service on the same server. Assuming the server is provided with multiple CPU cores and the CPUs are divided into a first scheduling group and a second scheduling group, the online service may be bound to the first scheduling group and the offline service to the second scheduling group, thereby isolating the online service from the offline service.
However, although binding a specific service type to each scheduling group achieves service isolation, in practice the bound service types may impose certain limits on the load balancing scheduling of processing tasks among the scheduling groups, so that the processing tasks borne by the scheduling groups cannot reach load balance.
Disclosure of Invention
In a first aspect of embodiments of the present disclosure, a load balancing method is provided, applied to a computing device comprising a plurality of processors. The processors all adopt a three-level cache architecture and are divided into at least a first scheduling group and a second scheduling group; at least some of the processors in the first scheduling group are bound to a first type of service, and at least some of the processors in the second scheduling group are bound to a second type of service. The computing device opens an interface for modifying the binding relationship between processors and services. A third-level cache is shared among the processors within the first scheduling group, and among the processors within the second scheduling group; no third-level cache is shared between the processors of the first scheduling group and those of the second scheduling group. The method comprises the following steps:
receiving a capacity expansion instruction for the first type of service, where the capacity expansion instruction includes identification information of the second scheduling group;
in response to the capacity expansion instruction, allocating a plurality of processors to the first type of service from the second scheduling group corresponding to the identification information, calling the interface to modify the binding relationship between the allocated processors and services so as to bind those processors to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service; and
performing load balancing between the first scheduling group and the capacity expansion scheduling group using a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.
In one embodiment of the present disclosure, calling the interface, modifying the binding relationship between the plurality of processors and services, and binding the plurality of processors to the first type of service includes:
calling the interface, performing a dynamic hot modification on the second scheduling group, and binding the plurality of processors to the first type of service.
In one embodiment of the present disclosure, the second scheduling group includes a reserved set of processors that are not bound to the second type of service and are used for bearing processing tasks of the first type of service;
allocating a plurality of processors to the first type of service from the second scheduling group includes:
allocating processors to the first type of service from the reserved processor set in the second scheduling group.
In one embodiment of the present disclosure, the capacity expansion instruction includes identification information of the processors to be allocated to the first type of service from the second scheduling group;
allocating a plurality of processors to the first type of service from the second scheduling group includes:
allocating to the first type of service the processors in the second scheduling group that correspond to the identification information included in the capacity expansion instruction.
In one embodiment of the present disclosure, the first scheduling group and the second scheduling group belong to the same scheduling domain; topology data describing the topology structure of the scheduling domain is maintained in the kernel of the operating system carried by the computing device, where the topology data includes description information describing the binding relationships between processors and services in the respective scheduling groups of the scheduling domain;
the interface opened by the computing device includes a user-mode interface for modifying the description information maintained in the operating system kernel;
calling the interface, modifying the binding relationship between the processors and services, binding the processors to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service includes:
calling the user-mode interface, modifying the description information of the second scheduling group, creating in the description information the binding relationships between the processors and the first type of service, and creating the capacity expansion scheduling group based on the processors bound to the first type of service.
In one embodiment of the present disclosure, the plurality of processors onboard the computing device employ a NUMA architecture; the first scheduling group and the second scheduling group belong to the same scheduling domain formed by all processors under the NUMA architecture; the first scheduling group comprises a first NUMA node bound to the first type of service under the NUMA architecture; the second scheduling group comprises a second NUMA node bound to the second type of service under the NUMA architecture; and a third-level cache is shared among the processors within the first NUMA node and among the processors within the second NUMA node, but not between the two nodes.
In one embodiment of the present disclosure, creating the binding relationship between the plurality of processors and the first type of service includes:
creating a binding relationship between the plurality of processors and the service processing processes corresponding to the first type of service.
In one embodiment of the present disclosure, the binding relationship includes an affinity relationship between a processor and a business process.
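On Linux, one concrete way to establish such an affinity between a business process and a set of processors is the sched_setaffinity system call. The sketch below is illustrative only: the CPU range 4-7 stands in for whichever processors have been allocated to the first type of service.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

/* Bind a business process (pid) to CPUs 4-7. The CPU numbers are
 * illustrative, standing in for the processors allocated to the
 * first type of service. */
int bind_process_to_cpus(pid_t pid)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    for (int cpu = 4; cpu <= 7; cpu++)
        CPU_SET(cpu, &mask);

    if (sched_setaffinity(pid, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return -1;
    }
    return 0;
}
```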
In one embodiment of the present disclosure, the first type of service comprises an online service and the second type of service comprises an offline service; or,
the first type of service comprises an offline service and the second type of service comprises an online service.
In one embodiment of the present disclosure, the load balancing policy includes:
preferentially performing task scheduling among scheduling groups bound to the same service.
In one embodiment of the present disclosure, performing load balancing for the first scheduling group and the capacity expansion scheduling group includes:
when any target processor in the capacity expansion scheduling group meets the load balancing processing condition, determining which of the first scheduling group and the capacity expansion scheduling group has the higher service load;
if the first scheduling group has the higher service load, further identifying the processor with the highest service load in the first scheduling group; and
scheduling at least part of the processing tasks of the first type of service borne by the identified processor to the target processor.
In one embodiment of the present disclosure, the load balancing processing condition includes any one of the following:
the target processor meets a condition for periodic load balancing;
the number of processing tasks borne by the target processor is below a threshold.
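A minimal sketch of the balancing pass described above is given below; the data structures, the migration callback, and the balancing granularity are assumptions for illustration, not the disclosure's actual implementation.

```c
#include <stddef.h>

/* Hypothetical per-group accounting used by the sketch below. */
struct group_stat {
    unsigned long total_load;  /* summed load of the group's CPUs */
    unsigned long *cpu_load;   /* per-CPU load within the group   */
    size_t ncpus;
};

/* One balancing pass for a target CPU in the capacity expansion group
 * that meets the load balancing condition: if the first group is the
 * busier one, pull work from its most loaded CPU. migrate_tasks() is a
 * stand-in for the actual task migration primitive. */
void balance_once(const struct group_stat *first,
                  const struct group_stat *expansion, int target_cpu,
                  void (*migrate_tasks)(size_t src_cpu, int dst_cpu))
{
    if (first->total_load <= expansion->total_load)
        return;                    /* expansion group already busier */

    size_t busiest = 0;            /* locate the most loaded CPU     */
    for (size_t i = 1; i < first->ncpus; i++)
        if (first->cpu_load[i] > first->cpu_load[busiest])
            busiest = i;

    migrate_tasks(busiest, target_cpu); /* move part of its tasks over */
}
```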
In a second aspect of embodiments of the present disclosure, a storage medium is provided for use in a computing device comprising a plurality of processors. The processors all adopt a three-level cache architecture and are divided into at least a first scheduling group and a second scheduling group; at least some of the processors in the first scheduling group are bound to a first type of service, and at least some of the processors in the second scheduling group are bound to a second type of service. The computing device opens an interface for modifying the binding relationship between processors and services. A third-level cache is shared among the processors within the first scheduling group, and among the processors within the second scheduling group; no third-level cache is shared between the processors of the first scheduling group and those of the second scheduling group. The storage medium has stored thereon computer instructions which, when executed by a processor, perform the steps of the following method:
receiving a capacity expansion instruction triggered when the first type of service meets a capacity expansion condition, where the capacity expansion instruction includes identification information of the second scheduling group;
in response to the capacity expansion instruction, allocating a plurality of processors to the first type of service from the second scheduling group corresponding to the identification information, calling the interface to modify the binding relationship between the allocated processors and services so as to bind those processors to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service; and
performing load balancing between the first scheduling group and the capacity expansion scheduling group using a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.
In a third aspect of embodiments of the present disclosure, an apparatus is provided for application to a computing device comprising a plurality of processors. The processors all adopt a three-level cache architecture and are divided into at least a first scheduling group and a second scheduling group; at least some of the processors in the first scheduling group are bound to a first type of service, and at least some of the processors in the second scheduling group are bound to a second type of service. The computing device opens an interface for modifying the binding relationship between processors and services. A third-level cache is shared among the processors within the first scheduling group, and among the processors within the second scheduling group; no third-level cache is shared between the processors of the first scheduling group and those of the second scheduling group. The apparatus comprises:
a receiving module, configured to receive a capacity expansion instruction triggered when the first type of service meets a capacity expansion condition, where the capacity expansion instruction includes identification information of the second scheduling group;
a creation module, configured to, in response to the capacity expansion instruction, allocate a plurality of processors to the first type of service from the second scheduling group corresponding to the identification information, call the interface to modify the binding relationship between the allocated processors and services so as to bind those processors to the first type of service, and create a capacity expansion scheduling group based on the processors bound to the first type of service; and
a scheduling module, configured to perform load balancing between the first scheduling group and the capacity expansion scheduling group using a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.
In a fourth aspect of embodiments of the present disclosure, a computing device is provided, comprising: a plurality of processors; and a memory for storing processor-executable instructions. The processors all adopt a three-level cache architecture and are divided into at least a first scheduling group and a second scheduling group; at least some of the processors in the first scheduling group are bound to a first type of service, and at least some of the processors in the second scheduling group are bound to a second type of service. The computing device opens an interface for modifying the binding relationship between processors and services. A third-level cache is shared among the processors within the first scheduling group, and among the processors within the second scheduling group; no third-level cache is shared between the processors of the first scheduling group and those of the second scheduling group. The processors execute the executable instructions to implement the following method:
receiving a capacity expansion instruction for the first type of service, where the capacity expansion instruction includes identification information of the second scheduling group;
in response to the capacity expansion instruction, allocating a plurality of processors to the first type of service from the second scheduling group corresponding to the identification information, calling the interface to modify the binding relationship between the allocated processors and services so as to bind those processors to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service; and
performing load balancing between the first scheduling group and the capacity expansion scheduling group using a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.
In the above embodiments of the present disclosure, at least the following advantageous effects are provided:
On the one hand, the computing device opens an interface for modifying the binding relationship between processors and services. When the first type of service bound to the first scheduling group meets the capacity expansion condition, a plurality of processors can therefore be allocated to the first type of service from the second scheduling group bound to the second type of service, the interface can be called to modify the binding relationship of the allocated processors so as to bind them to the first type of service, and a capacity expansion scheduling group can then be created based on the processors bound to the first type of service to share the processing tasks of the first type of service borne by the first scheduling group. This avoids the situation where the first scheduling group, left unexpanded, carries an excessive service load that impairs the first type of service it bears.
On the other hand, when performing load balancing for the first type of service, the computing device generally needs to schedule the corresponding processing tasks among the scheduling groups bound to the first type of service. Since, in addition to the first scheduling group bound to the first type of service, a capacity expansion scheduling group is created from the processors allocated out of the second scheduling group and bound to the first type of service, the computing device can perform load balancing for the processing tasks of the first type of service based on its load balancing policy, scheduling those tasks from the first scheduling group to the capacity expansion scheduling group and thereby relieving the load pressure on the first scheduling group.
Moreover, the processors in the capacity expansion scheduling group are bound to the first type of service rather than the second type of service. This avoids the problem that, during load balancing between the first scheduling group and the capacity expansion scheduling group, inconsistent bound service types would prevent the processing tasks borne by the first scheduling group from being scheduled to the capacity expansion scheduling group, leaving the two groups unable to reach load balance; as a result, the processing resources of the processors in the capacity expansion scheduling group can be fully utilized.
Drawings
The above, as well as additional purposes, features, and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description when read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which:
FIG. 1 schematically illustrates a computing device employing a three-level processor cache architecture according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a computing device employing a NUMA architecture according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flowchart of a load balancing method according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flowchart of load balancing processing for the first scheduling group and the capacity expansion scheduling group according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a block diagram of a load balancing apparatus according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a hardware architecture diagram of a computing device according to an embodiment of the present disclosure;
FIG. 7 schematically illustrates a software product implementing a load balancing method according to an embodiment of the present disclosure.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present disclosure will be described below with reference to several exemplary embodiments. It should be understood that these embodiments are presented merely to enable one skilled in the art to better understand and practice the present disclosure and are not intended to limit the scope of the present disclosure in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Those skilled in the art will appreciate that embodiments of the present disclosure may be implemented as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the following forms: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to an embodiment of the disclosure, a load balancing method, medium, device and computing equipment are provided.
Furthermore, any number of elements in the figures is for illustration rather than limitation, and any naming is used only for distinction and carries no limiting meaning.
The principles and spirit of the present disclosure are explained in detail below with reference to several representative embodiments thereof.
Application Scenario Overview
Referring to fig. 1, fig. 1 is a schematic diagram of a computing device employing a three-level processor cache architecture according to the present disclosure.
As shown in fig. 1, the computing device may be equipped with a plurality of physical CPUs (central processing units) that all access the same physical memory. A physical CPU refers to CPU hardware actually inserted on the motherboard of the computing device, and the number of physical CPUs mounted on the computing device refers to the number of such CPU hardware units. Each physical CPU may be a multi-core CPU comprising a plurality of CPU cores (such as the CPU cores shown in fig. 1). Furthermore, to improve each CPU core's efficiency of access to memory, a multi-level cache may be provided for each CPU core.
The multi-level cache refers to temporary storage located between the CPU and main memory. Generally, the multi-level cache includes three levels: an L1 cache (first-level cache), an L2 cache (second-level cache), and an L3 cache (third-level cache).
The L1 cache is generally arranged inside a CPU core and is private to that core; it stores the data the CPU is about to access and the instructions to be executed. In practical applications, the L1 cache may be further divided into an L1 D-Cache for storing data and an L1 I-Cache for storing instructions.
The L2 cache is also usually arranged inside a CPU core and private to that core, and stores data the CPU needs to access.
In practical applications, the L2 cache may instead be arranged outside the CPU cores and shared by multiple CPU cores; this is not specifically limited in this specification. For example, the L2 cache shown in fig. 1 is arranged inside a CPU core and used by that core alone, but it may also, like the L3 cache, be arranged outside the CPU cores and shared by multiple cores.
The L3 cache is generally arranged outside the CPU cores and shared by multiple CPU cores, and stores data the CPU needs to access.
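On Linux, this cache topology can be inspected from user space through sysfs. A brief sketch follows; the sysfs paths are standard on Linux, though the index numbering can vary by machine:

```c
#include <stdio.h>

/* Print which CPUs share each cache level with cpu0. On typical Linux
 * systems index0/index1 are the L1 data/instruction caches, index2 is
 * L2 and index3 is L3, but the numbering can vary by machine. */
int main(void)
{
    char path[128], buf[256];
    for (int idx = 0; idx <= 3; idx++) {
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu0/cache/index%d/shared_cpu_list",
                 idx);
        FILE *f = fopen(path, "r");
        if (!f)
            continue;
        if (fgets(buf, sizeof(buf), f))
            printf("cache index%d shared by CPUs: %s", idx, buf);
        fclose(f);
    }
    return 0;
}
```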
In the three-level processor cache architecture shown in fig. 1, since all physical CPUs access the same physical memory, each physical CPU faces a performance bottleneck when the total number of CPU cores grows too large.
Referring to FIG. 2, FIG. 2 is a schematic diagram of a computing device employing a NUMA (Non-Uniform Memory Access) architecture as shown in the present disclosure.
The NUMA architecture is a processor architecture derived from the three-level processor cache architecture shown in fig. 1. Under the NUMA architecture, independent memory can be set for each physical CPU, which, compared with the architecture shown in fig. 1, relieves the performance bottleneck each physical CPU faces when the number of CPU cores is large.
As shown in fig. 2, under a NUMA architecture, a CPU on a computing device may be divided into multiple NUMA nodes.
For example, as shown in fig. 2, under the NUMA architecture, each physical CPU carried by the computing device may be respectively used as an independent NUMA node; for example, assuming that a computing device has N physical CPUs mounted thereon, the N physical CPUs may be divided into N NUMA nodes, one for each individual physical CPU.
As shown in fig. 2, an L3 cache is shared among the CPU cores within each NUMA node; no L3 cache is shared between the CPU cores of one NUMA node and those of other NUMA nodes.
Under the NUMA architecture, after the CPUs on the computing device are divided into a plurality of NUMA nodes, each NUMA node can be allocated its own exclusive memory.
For example, in practical applications, the memory carried by the computing device may be evenly allocated to the NUMA nodes according to the total number of nodes; assuming that a computing device using the NUMA mechanism has N NUMA nodes in total and that the capacity of its memory is M, each NUMA node is allocated exclusive memory of capacity M/N.
With continued reference to FIG. 2, in a computing device employing a NUMA architecture, interconnections between NUMA nodes can be achieved through a NUMA interconnection module. Specific technical details of interconnection of the NUMA nodes through the NUMA interconnection module are not described in the present specification.
On the one hand, a NUMA node can access its exclusive memory through an internal channel (such as an IO bus), which is called local access; on the other hand, it can also remotely access the exclusive memory of other NUMA nodes through the NUMA interconnection module, which is called remote access.
The memory allocated to a NUMA node may be referred to as that node's local memory; the memory allocated to other NUMA nodes may be referred to as its remote memory.
In practical applications, when a NUMA node accesses its local memory through the internal channel, it can generally access the corresponding data directly based on the memory address. Accessing the remote memory of other NUMA nodes, however, must go through the NUMA interconnection module and is therefore typically slower than accessing local memory.
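The distinction between local and remote memory is visible to applications through libraries such as libnuma. A minimal sketch, assuming libnuma is installed (link with -lnuma); the allocation size and node choice are illustrative:

```c
#include <numa.h>
#include <stdio.h>

/* Allocate memory on a specific NUMA node versus the local node.
 * Requires libnuma (link with -lnuma). */
int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    size_t sz = 1 << 20;                  /* 1 MiB, illustrative      */
    void *local  = numa_alloc_local(sz);  /* memory local to the
                                             calling CPU's node       */
    void *remote = numa_alloc_onnode(sz, numa_max_node());
                                          /* memory of the highest-
                                             numbered (remote) node   */
    if (local)
        numa_free(local, sz);
    if (remote)
        numa_free(remote, sz);
    return 0;
}
```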
The number of CPUs mounted on the computing device generally refers to the number of logical CPUs mounted on the computing device.
In practical applications, each CPU core of each physical CPU shown in figs. 1 and 2 may be further simulated as a pair of logical CPUs using Hyper-Threading technology.
In this case, the number of CPUs mounted on the computing device depends on the number of physical CPUs mounted on the computing device and the number of CPU cores included in each physical CPU.
For example, assume a computing device mounts N physical CPUs, each of which includes M CPU cores, and each CPU core is further simulated as a pair of logical CPUs based on Hyper-Threading; the number of logical CPUs on the computing device is then N×M×2. For instance, if the computing device is equipped with 2 physical CPUs, each a multi-core CPU with 4 CPU cores (N=2, M=4), the number of logical CPUs on the computing device is 2×4×2=16.
A computing device such as that shown in fig. 2 may be used to undertake processing tasks corresponding to at least two different types of services. For example, in one example, an online service and an offline service may be deployed in a hybrid manner on the computing device described above.
In addition, when at least two different types of services are deployed together on the computing device, in order that the processing tasks borne by the device can be scheduled in a balanced manner among its logical CPUs, the logical CPUs mounted on the computing device may generally be divided into a plurality of scheduling domains in the operating system kernel.
When the operating system kernel of the computing device divides the logical CPUs into scheduling domains, all the logical CPUs of the computing device may generally be divided into one scheduling domain. Among all the logical CPUs carried by the computing device, those sharing an L3 cache are then divided into the same scheduling group; that is, the CPUs in each resulting scheduling group share an L3 cache.
As shown in fig. 2 and described above, an L3 cache is shared among the logical CPUs included in the same physical CPU; therefore, this way of dividing scheduling groups is equivalent to placing the logical CPUs of each physical CPU mounted on the computing device into the same scheduling group. The number of scheduling groups finally divided is the same as the number of physical CPUs mounted on the computing device.
For example, referring to fig. 2, for a computing device employing the NUMA architecture, each physical CPU acts as a separate NUMA node; therefore, when scheduling groups are divided in the above manner, all the logical CPUs carried by the computing device can be divided into one large scheduling domain, and the logical CPUs included in the NUMA node corresponding to each physical CPU then form an independent scheduling group.
In this case, assuming the computing device includes N NUMA nodes, the logical CPUs included in each NUMA node together form an independent scheduling group, and the scheduling domain contains a total of N scheduling groups corresponding to the N NUMA nodes.
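The division rule above can be reproduced from user space: logical CPUs that report the same L3 shared_cpu_list belong to the same scheduling group. A minimal sketch, assuming standard Linux sysfs paths and an illustrative upper bound of 64 logical CPUs:

```c
#include <stdio.h>
#include <string.h>

#define NCPUS 64 /* illustrative upper bound on logical CPUs */

/* Group logical CPUs by the set of CPUs they share an L3 cache with;
 * CPUs reporting an identical shared_cpu_list fall into the same
 * scheduling group under the division described above. */
int main(void)
{
    char path[128], list[256];
    static char seen[NCPUS][256];
    int ngroups = 0;

    for (int cpu = 0; cpu < NCPUS; cpu++) {
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%d/cache/index3/shared_cpu_list",
                 cpu);
        FILE *f = fopen(path, "r");
        if (!f)
            break;                      /* no such CPU: stop scanning */
        if (!fgets(list, sizeof(list), f)) {
            fclose(f);
            break;
        }
        fclose(f);

        int known = 0;                  /* already saw this L3 span?  */
        for (int g = 0; g < ngroups; g++)
            if (strcmp(seen[g], list) == 0)
                known = 1;
        if (!known) {
            strcpy(seen[ngroups++], list);
            printf("scheduling group %d: CPUs %s", ngroups, list);
        }
    }
    return 0;
}
```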
In practical applications, to avoid interference between the different types of services in a hybrid deployment, service isolation may also be performed on the different types of services borne by the computing device.
One common solution for isolating services is the cpuset scheme. The cpuset scheme implements service isolation by binding the service processing process corresponding to a service to one or more specific CPU cores, restricting that process to execute only on those cores. Because the cpuset scheme is easy to implement and provides good isolation, it is currently a widely applied, general-purpose isolation scheme.
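On Linux, the cpuset mechanism is typically exposed through the cgroup filesystem. The following is a minimal sketch of creating a dedicated cpuset and moving a service process into it; the mount point, group name, and CPU/memory-node ranges are assumptions that depend on the system's cgroup v1 configuration:

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Write a value to a cgroup control file; returns 0 on success. */
static int cg_write(const char *path, const char *val)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    int ok = fputs(val, f) >= 0;
    fclose(f);
    return ok ? 0 : -1;
}

/* Create a cpuset named "online" pinned to CPUs 0-7 on memory node 0,
 * then move the service process (pid) into it. Paths assume a cgroup
 * v1 cpuset hierarchy mounted at /sys/fs/cgroup/cpuset. */
int isolate_online_service(pid_t pid)
{
    char buf[32];
    mkdir("/sys/fs/cgroup/cpuset/online", 0755);
    if (cg_write("/sys/fs/cgroup/cpuset/online/cpuset.cpus", "0-7") ||
        cg_write("/sys/fs/cgroup/cpuset/online/cpuset.mems", "0"))
        return -1;
    snprintf(buf, sizeof(buf), "%d", (int)pid);
    return cg_write("/sys/fs/cgroup/cpuset/online/tasks", buf);
}
```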
When the cpuset scheme is adopted for service isolation, the logical CPUs carried by the computing device have already been divided into scheduling domains and scheduling groups in the operating system kernel; therefore, the different types of services carried by the computing device can be deployed to different scheduling groups and bound to their respective scheduling groups, thereby achieving service isolation.
The following illustration takes as an example a computing device on which an online service and an offline service are deployed in a hybrid manner.
Assume the computing device includes two NUMA nodes in total, denoted NUMA node A and NUMA node B. NUMA node A corresponds to the first scheduling group; NUMA node B corresponds to the second scheduling group.
In this case, the online service may be deployed to the first scheduling group, whose CPUs bear the processing tasks corresponding to the online service; correspondingly, the offline service may be deployed to the second scheduling group, whose CPUs bear the processing tasks corresponding to the offline service.
Meanwhile, to achieve isolation between the online service and the offline service, the cpuset scheme described above may be adopted to bind the online service to the first scheduling group and the offline service to the second scheduling group.
After the online service is deployed to the first scheduling group and bound to it, the computing device can subsequently perform load balancing, based on a load balancing policy, on the processing tasks of the online service among the CPUs in the first scheduling group, so that those tasks reach a load-balanced state across the CPUs in the first scheduling group.
Correspondingly, after the offline service is deployed to the second scheduling group and bound to it, the computing device can likewise perform load balancing, based on a load balancing policy, on the processing tasks of the offline service among the CPUs in the second scheduling group, so that those tasks also reach a load-balanced state.
Further, in practical applications, when the processing tasks borne by one of the first or second scheduling groups become overloaded (for example, the number of processing task processes borne reaches a threshold), it may be necessary to expand that scheduling group and allocate new CPUs from the other scheduling group to bear the processing tasks.
For example, if the task load carried in NUMA node A (i.e., the first scheduling group) bound to the online service is excessive, a number of CPUs may be allocated from NUMA node B (i.e., the second scheduling group) to NUMA node A to bear the processing tasks corresponding to the online service.
However, when a scheduling group is expanded, the service type bound to the new CPUs allocated from other scheduling groups is generally inconsistent with the service type bound to the expanded scheduling group. This inconsistency may impose certain limits on the load balancing scheduling of processing tasks among the scheduling groups, with the result that the processing tasks borne by the scheduling groups cannot reach load balance.
For example, suppose the task load carried in NUMA node A, which is bound to the online service, is excessive, and that after several CPUs are allocated from NUMA node B to NUMA node A, those allocated CPUs remain bound to the offline service; suppose also that the load balancing policy adopted by the computing device preferentially schedules tasks among scheduling groups bound to the same service.
Then even though several CPUs have been allocated from NUMA node B to NUMA node A, they are not bound to the online service but are still bound to the offline service; consequently, when performing load balancing on the CPUs in NUMA node A, the computing device still cannot schedule the excess processing tasks of the online service on the CPUs in NUMA node A to the newly allocated CPUs.
In this way, the processing resources of the allocated CPUs are not fully utilized, and the newly allocated CPUs and the CPUs in NUMA node A remain in a state in which load balance cannot be reached. For example, the CPUs in NUMA node A may still be overloaded, while the service load of the online service carried by the CPUs allocated to NUMA node A from NUMA node B remains low or even zero.
Therefore, in a scenario where multiple services are deployed in a hybrid manner on a computing device and service isolation is achieved through the cpuset scheme, once capacity expansion of CPUs across scheduling groups is involved, the processing resources of the expanded CPUs may not be fully utilized.
Summary of the Invention
As described above, in a scenario where multiple services are deployed in a hybrid manner on a computing device using an architecture as shown in fig. 1 or fig. 2, and service isolation is achieved through the cpuset scheme, once expansion of CPUs across scheduling groups is involved, the processing resources of the expanded CPUs generally cannot be fully utilized.
In view of this, the present disclosure provides a load balancing method that, when expanding CPUs across scheduling groups in such a scenario, fully utilizes the processing resources of the expanded CPUs and enables the expanded CPUs and the existing CPUs to reach load balance.
The core technical concept of this specification is as follows:
and opening modification authorities aiming at binding relations between CPUs and services in each scheduling group of the computing equipment, so that when the first type of services bound with the first scheduling group of the computing equipment meet capacity expansion conditions, a plurality of CPUs can be allocated for the first type of services from the second scheduling group bound with the second type of services, the CPUs are bound with the first type of services in a mode of modifying the binding relations between the CPUs and the services, and then the capacity expansion scheduling group is created based on the CPUs bound with the first type of services so as to share the load pressure of the first type of services born by the first scheduling group.
In this way, since the computing device generally needs to perform load balancing processing on the first type of service, a processing task corresponding to the first type of service is subjected to load balancing scheduling among a plurality of scheduling groups bound to the first type of service; on the basis of the first scheduling group bound with the first type of service, a capacity expansion scheduling group is created based on a plurality of processors bound with the first type of service distributed from the second scheduling group, so that the computing equipment can carry out load balancing processing on processing tasks corresponding to the first type of service based on a carried load balancing strategy, the processing tasks of the first type of service carried by the first scheduling group are scheduled to the capacity expansion scheduling group, and the load pressure of the first scheduling group is relieved;
Moreover, the processors in the capacity expansion scheduling group can be bound with the first type of service instead of the second type of service; therefore, the problem that the processing tasks borne by the first scheduling group and the processing tasks borne by the capacity expansion scheduling group are unbalanced due to the fact that the service types bound by the first scheduling group and the capacity expansion scheduling group are inconsistent and the processing tasks borne by the first scheduling group cannot be scheduled to the capacity expansion scheduling group in the process of carrying out load balancing processing between the first scheduling group and the capacity expansion scheduling group can be avoided, and therefore the processing resources of the processors in the capacity expansion scheduling group can be fully utilized and the processing resources of the processors in the capacity expansion scheduling group can not be fully utilized is avoided.
Exemplary method
The technical idea of the present specification will be described in detail by specific examples.
Referring to fig. 3, fig. 3 is a flowchart of a load balancing method according to an exemplary embodiment. The method is applied to a computing device.
The computing device may include a plurality of processors, each adopting the three-level cache architecture shown in fig. 1. The processors are divided into at least a first scheduling group and a second scheduling group; at least some of the processors in the first scheduling group are bound to the first type of service, and at least some of the processors in the second scheduling group are bound to the second type of service. The computing device opens an interface for modifying the binding relationship between processors and services. An L3 cache is shared among the processors within the first scheduling group, and among the processors within the second scheduling group; no L3 cache is shared between the processors of the first scheduling group and those of the second scheduling group. The method performs the following steps:
Step 301: receiving a capacity expansion instruction for the first type of service, where the capacity expansion instruction includes identification information of the second scheduling group.
The computing device can be any form of hardware device on which multiple types of services can be deployed in a hybrid manner. For example, in one example, the computing device may be a computing device in a cloud computing platform for bearing computing tasks; in another example, it may be a server that provides various types of real-time or non-real-time services to users.
The specific service types of the first type of service and the second type of service are not specifically limited in this specification; in practical applications, they may be any types of service that can be deployed together on the same computing device.
In one example, the first type of service may be an online service and the second type of service an offline service; alternatively, the first type of service may be an offline service and the second type of service an online service. An online service is typically a service with high real-time requirements; by contrast, an offline service has low real-time requirements.
For example, taking the computing device as a computing device in a cloud computing platform that bears computing tasks, the online service may be a real-time cloud computing task, and the offline service may be an offline computing task performed in the background.
As another example, taking the computing device as a server of a music playing APP that provides various real-time or offline services to users, the online service may be the online playback service provided to users, and the offline service may be the offline music download service provided to users.
In another example, the first type of service and the second type of service may be service types other than online and offline services.
For example, taking the computing device as the server of a music playing APP, the first type of service may be a music copyright purchase service provided to ordinary users, and the second type of service may be a music copyright revenue-sharing service oriented to professional musicians. The computing device may be equipped with a plurality of physical CPUs, each of which may be a multi-core CPU including a plurality of CPU cores; when the computing device supports Hyper-Threading, each CPU core may be further simulated as a pair of logical CPUs. In this case, the number of CPUs mounted on the computing device generally refers to the number of logical CPUs, which depends on the number of physical CPUs and the number of CPU cores included in each physical CPU.
For example, in one example, assuming the computing device mounts 2 physical CPUs, each a multi-core CPU with 4 CPU cores, and each CPU core is further simulated into a pair of logical CPUs using Hyper-Threading, the number of logical CPUs mounted on the computing device is 4×2×2=16.
The CPUs carried by the computing device can be divided into a plurality of scheduling domains, and each scheduling domain can be further divided into a plurality of scheduling groups.
In an application scenario where the first type of service and the second type of service are deployed together on the computing device, the divided scheduling groups can include: a first scheduling group for bearing the processing tasks of the first type of service, and a second scheduling group for bearing the processing tasks of the second type of service.
For example, when implemented, all logical CPUs onboard the computing device may be divided into one scheduling domain. Further, since the logical CPUs under each physical CPU share an L3 cache, the physical CPUs on the computing device may be divided into different scheduling groups; in this case, the first scheduling group and the second scheduling group correspond to different physical CPUs.
In one embodiment shown, the computing device may specifically employ the NUMA architecture shown in fig. 2. When the computing device employs a NUMA architecture, it may include a plurality of NUMA nodes, for example at least a first NUMA node and a second NUMA node. In this case, all logical CPUs under the NUMA architecture can be divided into one scheduling domain, with the first NUMA node forming the first scheduling group and the second NUMA node forming the second scheduling group.
For the divided first and second scheduling groups, the cpuset scheme may still be used to bind the first type of service to the first scheduling group and the second type of service to the second scheduling group, thereby achieving service isolation between the first and second types of service deployed together on the computing device.
In this specification, when the first type of service or the second type of service meets a capacity expansion condition, capacity expansion for that service may be performed by triggering a capacity expansion instruction.
The capacity expansion conditions can be flexibly customized by the user based on actual capacity expansion requirements. For example, a capacity expansion condition may include the load of the CPUs bearing the processing tasks of the first or second type of service reaching a threshold, or the number of processing tasks corresponding to the first or second type of service reaching a threshold (for example, the number of service processes reaching a threshold), and so on; these are not exhaustively listed in this specification.
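A brief sketch of how such a capacity expansion condition might be evaluated; the thresholds and accounting inputs are user-defined assumptions, not values from the disclosure:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thresholds; in practice these would be configured by
 * the user based on actual capacity expansion requirements. */
#define UTIL_THRESHOLD  90   /* percent CPU utilization     */
#define TASK_THRESHOLD  512  /* number of service processes */

/* Returns true when the service bound to a scheduling group satisfies
 * a capacity expansion condition. */
bool needs_expansion(const unsigned int *cpu_util_pct, size_t ncpus,
                     size_t service_task_count)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < ncpus; i++)
        sum += cpu_util_pct[i];
    unsigned long avg = ncpus ? sum / ncpus : 0;

    return avg >= UTIL_THRESHOLD || service_task_count >= TASK_THRESHOLD;
}
```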
In one example, the capacity expansion instruction may be triggered automatically by the computing device when the first or second type of service meets the capacity expansion condition. In another example, the computing device may provide a service-related command line tool to a user (such as a service administrator); for example, the command line tool may be client software that provides an instruction input service. In this case, the capacity expansion instruction may be one manually input by the user through the command line tool when the first or second type of service meets the capacity expansion condition.
In this specification, it is assumed that the first type of service meets the capacity expansion condition; at this time, a number of CPUs need to be expanded for the first type of service from the second scheduling group bound to the second type of service, to bear the processing tasks of the first type of service.
In this case, the capacity expansion instruction triggered when the first type of service meets the capacity expansion condition may specifically include the identifier of the second scheduling group. The computing device can receive and process the capacity expansion instruction to expand capacity for the first type of service.
Step 302: in response to the capacity expansion instruction, allocating a plurality of processors to the first type of service from the second scheduling group corresponding to the identification information, calling the interface to modify the binding relationship between the allocated processors and services so as to bind those processors to the first type of service, and creating a capacity expansion scheduling group based on the processors bound to the first type of service.
When the computing device receives the capacity expansion instruction, it can respond to the instruction and allocate a number of CPUs to the first type of service from the second scheduling group.
In one embodiment, in the second scheduling group, a plurality of CPUs which are not bound with the second type of service may be reserved, and the CPU sets are used for bearing processing tasks corresponding to the first type of service; wherein the number of reserved CPUs in the CPU set can be flexibly set based on the total number of CPUs actually contained by the computing device.
In this case, when allocating a CPU for a first type of service requiring capacity expansion from the second scheduling group, the CPU may be specifically allocated for the first type of service from the reserved CPU set.
Of course, a number of CPUs not bound to the first type of service may likewise be reserved in the first scheduling group, with that CPU set used for bearing processing tasks corresponding to the second type of service; when the second type of service meets the capacity expansion condition, CPUs are allocated for it from that set.
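A minimal sketch of such reserved-set allocation, using the glibc cpu_set_t macros; the structure layout and helper name are hypothetical:

#define _GNU_SOURCE
#include <sched.h>  /* cpu_set_t and the CPU_* macros (glibc) */

/* Hypothetical bookkeeping: each scheduling group tracks the CPUs bound to
 * its own service type plus a reserved set left unbound for expansion. */
struct group_cpus {
    cpu_set_t bound;     /* CPUs bound to this group's service type */
    cpu_set_t reserved;  /* CPUs held back for the other service type */
};

/* Move up to n CPUs from the donor group's reserved set into *out.
 * Returns the number of CPUs actually allocated. */
static int alloc_from_reserved(struct group_cpus *donor, int n,
                               int total_cpus, cpu_set_t *out)
{
    int allocated = 0;
    CPU_ZERO(out);
    for (int cpu = 0; cpu < total_cpus && allocated < n; cpu++) {
        if (CPU_ISSET(cpu, &donor->reserved)) {
            CPU_CLR(cpu, &donor->reserved);
            CPU_SET(cpu, out);
            allocated++;
        }
    }
    return allocated;
}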
In the illustrated embodiment, the capacity expansion instruction may specifically include identification information of the CPUs to be allocated to the first type of service from the second scheduling group. For example, when the capacity expansion instruction is manually entered by the user through the command line tool, it may include identification information, specified by the user, of the CPUs that need to be allocated from the second scheduling group to the first type of service.
In this case, when allocating a CPU for a first type of service requiring capacity expansion from the second scheduling group, a plurality of CPUs in the second scheduling group corresponding to the identification information included in the capacity expansion instruction may be specifically allocated to the first type of service.
In this specification, the CPUs allocated for the first type of service from the second scheduling group may by default be bound to the second type of service. To remove this default binding, the computing device may open to the user the permission to modify the binding relationship between CPUs and services in each scheduling group under the CPU set scheme, by opening an interface for modifying the binding relationship between processors and services.
In this case, after allocating several CPUs for the first type of service from the second scheduling group, the above interface may further be called to modify the binding relationship between the allocated CPUs and the service, bind the allocated CPUs to the first type of service, and then create the capacity expansion scheduling group based on the allocated CPUs bound to the first type of service.
In the illustrated embodiment, when the interface is called to modify the binding relationship between the allocated CPUs and the service, a dynamic hot modification may specifically be adopted to adjust the binding relationship between the CPUs in the second scheduling group and the service in real time, so as to bind those CPUs to the first type of service.
By adopting dynamic hot modification, the binding relationship between the CPUs allocated for the first type of service from the second scheduling group and the first type of service can take effect immediately, without restarting the computing device.
Of course, in practical applications, besides dynamic hot modification, a conventional cold modification, in which the change takes effect only after the device is restarted, may also be adopted; this specification places no particular limitation here.
In the illustrated embodiment, the kernel of the operating system carried by the computing device generally describes the scheduling domains and scheduling groups faithfully according to the physical topology of the physical CPUs carried by the device, and generally maintains, in the kernel, topology data describing the topology of the scheduling domain to which the first scheduling group and the second scheduling group belong. In practice, this topology data is typically a structure body maintained in the operating system kernel; for example, a structure named struct sched_domain_topology_level.
The topology data further comprises description information describing the binding relationship between the processors and the services in each scheduling group of the scheduling domain. In practice, this description information is typically a member of that structure; for example, a field of type sched_domain_mask_f named mask, maintained in the operating system kernel.
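For orientation, in the mainline Linux kernel this structure has roughly the following shape (simplified here; exact fields vary across kernel versions, and some debug-only members are omitted or conditional):

/* Simplified from <linux/sched/topology.h> in mainline Linux. */
typedef const struct cpumask *(*sched_domain_mask_f)(int cpu);
typedef int (*sched_domain_flags_f)(void);

struct sched_domain_topology_level {
    sched_domain_mask_f  mask;      /* per-CPU mask callback: which CPUs belong
                                       together at this level -- the description
                                       information discussed above */
    sched_domain_flags_f sd_flags;  /* scheduler flags for this level */
    int                  flags;
    int                  numa_level;
    struct sd_data       data;
    char                *name;      /* level name (debug builds) */
};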
In the related art, because the kernel of the operating system generally describes the scheduling domains and scheduling groups faithfully according to the physical topology of the physical CPUs carried by the device, the above structure and description information generally reflect the actual physical topology of the physical CPUs that the computing device carries.
In this specification, in order to enable the two physically isolated scheduling groups to flexibly lend CPUs to each other for expansion, the permission to modify the above description information in the structure may be opened, so that the user can flexibly adjust the binding relationship between the CPUs in the scheduling groups and the services based on actual expansion requirements, without being limited by the physical topology of the physical CPUs carried by the computing device.
It should be noted that, as described above, the modification of the description information may be a dynamic hot modification, ensuring that it takes effect immediately without restarting the computing device.
In the illustrated embodiment, the interface opened by the computing device for modifying the binding relationship between processors and services may specifically be a user-mode interface for modifying the description information maintained in the kernel of the operating system; for example, an API that opens the modification permission for the description information.
In this case, after the computing device allocates a number of CPUs for the first type of service from the second scheduling group, the user-mode interface may further be called to modify the description information corresponding to the second scheduling group and create, for the allocated CPUs, a binding relationship with the first type of service.
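A minimal user-space sketch of invoking such an interface is given below; the sysfs-style path and the payload format are entirely hypothetical, since this specification only requires that some user-mode entry point for the description information exist:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Rebind the given CPUs of the second scheduling group to the first type of
 * service by rewriting the group's description information. Path and payload
 * format are illustrative only. */
static int rebind_cpus_to_first_service(const char *cpulist)
{
    const char *path = "/sys/kernel/service_sched/group2/binding"; /* hypothetical */
    char payload[64];

    snprintf(payload, sizeof(payload), "cpus=%s service=first", cpulist);

    int fd = open(path, O_WRONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }
    ssize_t n = write(fd, payload, strlen(payload));
    if (n < 0)
        perror("write");
    close(fd);
    return n < 0 ? -1 : 0;
}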
For example, in one case, assume the CPUs allocated for the first type of service come from the reserved CPU set and are therefore not bound to the second type of service; the binding relationship between these CPUs and the first type of service can then simply be added to the description information corresponding to the second scheduling group.
In another case, assume the CPUs allocated for the first type of service are not from the reserved CPU set and are bound to the second type of service in advance; the description information corresponding to the second scheduling group can then be modified, so that the binding relationship between these CPUs and the second type of service maintained in the description information is rewritten as a binding relationship between these CPUs and the first type of service.
In one implementation, when creating a binding relationship with the first type of service for the allocated CPUs, the allocated CPUs may specifically be bound to the service processing processes corresponding to the first type of service, with a binding relationship between the allocated CPUs and those service processing processes created in the description information.
The binding relationship between a service processing process of the first type of service and a CPU generally refers to an affinity relationship between that process and the CPU. A business process having an affinity relationship with a certain CPU generally means that the process should run on that CPU for as long as possible and not be migrated to other CPUs.
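On Linux, this kind of process-to-CPU affinity is conventionally established with the sched_setaffinity(2) system call; a minimal sketch, with the CPU numbers and function name chosen arbitrarily for illustration:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

/* Pin one service process to the CPUs newly allocated from the second
 * scheduling group; CPU numbers 8 and 9 are arbitrary examples. */
static int pin_service_process(pid_t pid)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(8, &set);
    CPU_SET(9, &set);

    /* After this call the scheduler keeps the process on CPUs 8-9 and does
     * not migrate it elsewhere -- the affinity relationship described above. */
    if (sched_setaffinity(pid, sizeof(cpu_set_t), &set) == -1) {
        perror("sched_setaffinity");
        return -1;
    }
    return 0;
}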
Further, after the description information of the second scheduling group has been modified by calling the user-mode interface and a binding relationship with the first type of service has been created for the allocated CPUs, a capacity expansion scheduling group can be created based on those CPUs: the reconstruction of the second scheduling group is triggered, and a capacity expansion scheduling group bound to the first type of service is split off from the second scheduling group.
After reconstruction, the second scheduling group comprises two scheduling groups: one is the capacity expansion scheduling group bound to the first type of service, and the other, consisting of the remaining CPUs, is a scheduling group bound to the second type of service.
Step 303, performing load balancing processing on the first scheduling group and the capacity expansion scheduling group by adopting a load balancing policy, so as to schedule at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group.
In this specification, after the computing device has responded to the capacity expansion instruction, allocated a number of CPUs for the first type of service from the second scheduling group, and created a capacity expansion scheduling group bound to the first type of service based on the allocated CPUs, it may subsequently perform load balancing processing on the first scheduling group and the capacity expansion scheduling group based on the load balancing policy, scheduling at least part of the processing tasks of the first type of service borne by the first scheduling group to the capacity expansion scheduling group so as to relieve the load pressure on the first scheduling group.
In one embodiment shown, the load balancing policy employed by the computing device may still be a policy that prioritizes service scheduling among service scheduling groups bound to the same class of service.
The binding relationship of the newly created capacity expansion scheduling group has been modified, in the manner described above, so that the group is bound to the first type of service; at this point the first scheduling group and the capacity expansion scheduling group are both bound to the first type of service. Therefore, under the policy of preferentially scheduling among scheduling groups bound to the same service, at least part of the processing tasks of the first type of service borne by the first scheduling group can be scheduled normally to the capacity expansion scheduling group.
In practical applications, when the computing device adopts a load balancing policy to perform load balancing processing on the first scheduling group and the capacity expansion scheduling group, the load balancing processing is generally performed by taking each CPU in the first scheduling group and the capacity expansion scheduling group as a unit.
In implementation, the computing device can determine in turn whether each CPU in the first scheduling group and the capacity expansion scheduling group meets the load balancing processing condition; when any CPU meets the condition, a load balancing algorithm associated with the load balancing policy can be adopted to perform load balancing processing for that CPU.
The specific type of the load balancing algorithm used by the computing device and corresponding to the load balancing policy is not particularly limited in this specification.
In one embodiment shown, the load balancing algorithm may specifically be the CFS (Completely Fair Scheduler) algorithm.
Referring to fig. 4, fig. 4 is a flowchart of a load balancing process for the first scheduling group and the capacity expansion scheduling group according to an exemplary embodiment, which includes the following steps:
Step 401, when any target processor in the capacity expansion scheduling group meets a load balancing processing condition, determining the scheduling group with the larger service load between the first scheduling group and the capacity expansion scheduling group;
Based on the CFS algorithm, when any target CPU in the capacity expansion scheduling group meets the load balancing condition, the scheduling group with the larger service load between the first scheduling group and the capacity expansion scheduling group is determined; for example, the scheduling group carrying the larger number of processes.
Step 402, if the service load of the first scheduling group is larger, further identifying the processor with the highest service load in the first scheduling group, and scheduling at least part of the processing tasks of the first type of service borne by that processor to the target processor.
If the service load of the first scheduling group is larger, the CPU with the highest service load in the first scheduling group can be further identified, and at least part of the processing tasks of the first type of service borne by that CPU can be scheduled to the target CPU; for example, at least part of the service processing processes of the first type of service borne by the most heavily loaded CPU are migrated to the target CPU.
Through this scheduling mode, the processing tasks borne by the first scheduling group can gradually be shared with the newly expanded capacity expansion scheduling group, so that the processing tasks borne by the CPUs in the first scheduling group and the capacity expansion scheduling group reach a load-balanced state, thereby relieving the load pressure on the first scheduling group.
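The two steps above amount to a busiest-group, then busiest-CPU selection. A simplified sketch in C, with all structure and helper names hypothetical (the kernel's actual CFS implementation weighs load more carefully than a raw task count):

/* Hypothetical stand-ins for the kernel's per-CPU run queues and groups. */
struct cpu_rq { int cpu; int nr_tasks; };
struct sched_grp { struct cpu_rq *rqs; int nr_cpus; int total_tasks; };

/* Step 401: pick the more heavily loaded of the two groups
 * (here measured simply by total task count). */
static struct sched_grp *busier_group(struct sched_grp *first,
                                      struct sched_grp *expand)
{
    return first->total_tasks > expand->total_tasks ? first : expand;
}

/* Step 402: within the busier group, find the CPU carrying the most tasks;
 * part of its tasks would then be migrated to the idle target CPU. */
static struct cpu_rq *busiest_cpu(struct sched_grp *g)
{
    struct cpu_rq *busiest = &g->rqs[0];
    for (int i = 1; i < g->nr_cpus; i++)
        if (g->rqs[i].nr_tasks > busiest->nr_tasks)
            busiest = &g->rqs[i];
    return busiest;
}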
Of course, if after a period of time any CPU other than the target CPU satisfies the load balancing processing condition, the load balancing processing for the first scheduling group and the capacity expansion scheduling group proceeds similarly to the above and is not repeated here. It should be noted that the load balancing processing conditions generally depend on the load balancing algorithm adopted by the computing device and are not particularly limited in this specification.
For example, if the load balancing algorithm employed is a CFS algorithm, common load balancing processing conditions may include the following:
In one case, load balancing processing may be performed periodically for each CPU in the first scheduling group and the capacity expansion scheduling group; for example, a user may configure for each CPU a fixed time at which load balancing processing is performed, and the times for different CPUs may differ. In this case, a given CPU is considered to satisfy the load balancing condition when the current time reaches the time configured for that CPU.
In another case, the load balancing condition for each CPU in the first scheduling group and the capacity expansion scheduling group may be that the number of processing tasks carried by the CPU is below a threshold. In this case, a given CPU is deemed to have satisfied the load balancing condition when the number of processing tasks it carries falls below the threshold.
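A compact sketch of these two trigger conditions (the structure, the interval bookkeeping, and the threshold value are illustrative assumptions):

#include <stdbool.h>
#include <time.h>

/* Hypothetical per-CPU balancing state; the fields and threshold are
 * illustrative, not fixed by this specification. */
struct cpu_balance_state {
    time_t next_balance;  /* time of the next scheduled periodic balance */
    int    nr_tasks;      /* processing tasks currently carried by this CPU */
};

#define IDLE_TASK_THRESHOLD 2

static bool meets_balance_condition(const struct cpu_balance_state *c,
                                    time_t now)
{
    /* Case 1: the CPU's configured periodic balancing time has arrived. */
    if (now >= c->next_balance)
        return true;
    /* Case 2: the CPU carries fewer tasks than the threshold, i.e. it is
     * nearly idle and may pull work from busier CPUs. */
    return c->nr_tasks < IDLE_TASK_THRESHOLD;
}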
In the above embodiment, when the computing device performs load balancing processing for the first type of service, it generally needs to schedule the processing tasks corresponding to the first type of service among the scheduling groups bound to that service. Therefore, given the existing first scheduling group bound to the first type of service, creating a capacity expansion scheduling group based on the processors allocated from the second scheduling group and bound to the first type of service enables the computing device, based on its load balancing policy, to perform load balancing processing on the processing tasks of the first type of service, scheduling tasks borne by the first scheduling group to the capacity expansion scheduling group and relieving the load pressure on the first scheduling group.
Moreover, the processors in the capacity expansion scheduling group are bound to the first type of service rather than the second type of service. This avoids the situation in which, because the service types bound to the first scheduling group and the capacity expansion scheduling group are inconsistent, the processing tasks borne by the first scheduling group could not be scheduled to the capacity expansion scheduling group during load balancing, leaving the load between the two groups unbalanced and the processing resources of the processors in the capacity expansion scheduling group underutilized.
In an exemplary embodiment of the present disclosure, a load balancing apparatus is also provided. Fig. 5 shows a schematic structural diagram of the load balancing apparatus 500, and as shown in fig. 5, the load balancing apparatus 500 may include: a receiving module 510, a creating module 520 and a scheduling module 530. Wherein:
The receiving module 510 is configured to receive a capacity expansion instruction for the first class of service; wherein the capacity expansion instruction comprises identification information of the second scheduling group;
the creation module 520 is configured to allocate a plurality of processors for the first type of service from the second scheduling group corresponding to the identification information in response to the capacity expansion instruction, call the interface, modify the binding relationship between the plurality of processors and the service, bind the plurality of processors with the first type of service, and create a capacity expansion scheduling group based on the plurality of processors bound with the first type of service;
the scheduling module 530 is configured to perform load balancing processing on the first scheduling group and the capacity expansion scheduling group by adopting a load balancing policy, so as to schedule at least part of processing tasks in the first type of services borne by the first scheduling group to the capacity expansion scheduling group.
The specific details of the foregoing modules of the load balancing apparatus 500 have been described in the foregoing description of the load balancing method flow, and thus are not repeated herein.
It should be noted that although several modules or units of the load balancing apparatus 500 are mentioned in the detailed description above, such partitioning is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
An electronic device 600 according to such an embodiment of the present disclosure is described below with reference to fig. 6. The electronic device 600 shown in fig. 6 is merely an example and should not be construed to limit the functionality and scope of use of embodiments of the present disclosure in any way.
As shown in fig. 6, the electronic device 600 is in the form of a general purpose computing device. Components of the electronic device 600 may include, but are not limited to: at least one processing unit 601, at least one storage unit 602, and a bus 603 connecting the different system components (including the storage unit 602 and the processing unit 601).
Wherein the storage unit stores program code executable by the processing unit 601 such that the processing unit 601 performs the steps of the various embodiments described herein.
The storage unit 602 may include readable media in the form of volatile memory units, such as a random access memory (RAM) 6021 and/or a cache memory 6022, and may further include a read-only memory (ROM) 6023.
The storage unit 602 may also include a program/utility 6024 having a set (at least one) of program modules 6025, such program modules 6025 including, but not limited to: an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.
Bus 603 may be one or more of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 604 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 600, and/or any devices (e.g., routers, modems, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 605. Also, the electronic device 600 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through a network adapter 606. As shown, the network adapter 606 communicates with other modules of the electronic device 600 over the bus 603. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 600, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, and includes several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium having stored thereon a program product capable of implementing the method described above in the present specification is also provided. In some possible embodiments, the various aspects of the present disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the disclosure as described in the "exemplary methods" section of this specification, when the program product is run on the terminal device.
Referring to fig. 7, a program product 70 for implementing the above-described method according to an embodiment of the present disclosure is described, which may employ a portable compact disc read-only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following its general principles and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It should be noted that although several units/modules or sub-units/modules of the apparatus are mentioned in the above detailed description, this division is merely exemplary and not mandatory. Indeed, the features and functionality of two or more units/modules described above may be embodied in one unit/module in accordance with embodiments of the present disclosure. Conversely, the features and functions of one unit/module described above may be further divided into multiple units/modules to be embodied.
Furthermore, although the operations of the methods of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.
While the spirit and principles of the present disclosure have been described with reference to several particular embodiments, it is to be understood that the disclosure is not limited to the particular embodiments disclosed, nor does the division into aspects imply that features in these aspects cannot be combined to advantage; that division is made for convenience of description only. The disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (14)

1. A load balancing method applied to a computing device, the computing device comprising a plurality of processors employing a NUMA architecture; wherein, the processors all adopt a three-level cache architecture; the plurality of processors is divided into at least a first schedule group and a second schedule group; at least part of the processors in the first scheduling group are bound with a first type of service; at least part of the processors in the second scheduling group are bound with a second class of service; the first scheduling group and the second scheduling group belong to the same scheduling domain under the NUMA architecture; topology data for describing the topology structure of the scheduling domain is maintained in the kernel of the operating system carried by the computing device; wherein the topology data includes description information for describing binding relations between processors and services in respective scheduling groups in a scheduling domain; the computing device opens a user-mode interface for modifying the description information maintained in the kernel of the operating system; a third-level cache is shared among the processors in the first scheduling group, and a third-level cache is shared among the processors in the second scheduling group; a third-level cache is not shared between the processors in the first scheduling group and the processors in the second scheduling group; comprising the following steps:
Receiving a capacity expansion instruction aiming at the first type of service; wherein the capacity expansion instruction comprises identification information of the second scheduling group;
responding to the capacity expansion instruction, distributing a plurality of processors for the first type of service from a second scheduling group corresponding to the identification information, calling the user mode interface, modifying the description information of the second scheduling group, creating binding relation between the plurality of processors and the first type of service in the description information, and creating a capacity expansion scheduling group based on the plurality of processors bound with the first type of service;
and carrying out load balancing processing on the first scheduling group and the capacity expansion scheduling group by adopting a load balancing strategy so as to schedule at least part of processing tasks in the first type of services borne by the first scheduling group to the capacity expansion scheduling group.
2. The method of claim 1, invoking the interface, modifying the binding relationship of the plurality of processors to services, binding the plurality of processors to the first type of services, comprising:
and calling the interface, carrying out dynamic thermal modification on the second scheduling group, and binding the plurality of processors with the first type of service.
3. The method of claim 2, wherein the second scheduling group includes a reserved set of processors not bound to the second type of traffic for assuming processing tasks of the first type of traffic;
allocating a plurality of processors for the first type of service from the second scheduling group, including:
and allocating processors for the first type of service from the reserved processor set in the second scheduling group.
4. The method of claim 2, wherein the capacity expansion instruction includes identification information of a plurality of processors allocated for the first type of service from the second scheduling group;
the allocating a plurality of processors for the first type of service from the second scheduling group comprises the following steps:
and distributing a plurality of processors in the second scheduling group corresponding to the identification information included in the capacity expansion instruction to the first type of service.
5. The method of claim 2, a first scheduling domain comprising a first NUMA node under the NUMA architecture that is bound to a first type of traffic; the second scheduling domain comprises a second NUMA node bound with a second class of service under the NUMA architecture; and a third-level cache is shared between the processor in the first NUMA node and the processor in the second NUMA node.
6. The method of claim 1, creating a binding relationship of the number of processors with the first type of traffic, comprising:
and creating a binding relation between the plurality of processors and the service processing processes corresponding to the first type of service.
7. The method of claim 6, the binding relationship comprising an affinity relationship between a processor and a business process.
8. The method of claim 1, the first type of traffic comprising online traffic; the second class of services includes offline services; or,
the first type of service comprises an offline service; the second type of service includes an online service.
9. The method of claim 1, the load balancing policy comprising:
the service scheduling is preferentially performed among service scheduling groups bound with the same service.
10. The method of claim 9, performing load balancing processing for the first schedule group and the capacity expansion schedule group, comprising:
when any target processor in the capacity expansion scheduling group meets the load balancing processing condition, determining a scheduling group with larger service load in the first scheduling group and the capacity expansion scheduling group;
if the service load of the first scheduling group is larger, further confirming the processor with the highest service load in the first scheduling group; the method comprises the steps of,
And dispatching at least part of processing tasks in the first type of services borne by the determined processor with the highest service load to the target processor.
11. The method of claim 10, the load balancing processing conditions comprising any one of the following:
the target processor meets the condition of periodically carrying out load balancing processing;
the number of processing tasks carried by the target processor is below a threshold.
12. A storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method of any of claims 1-11.
13. A traffic scheduling apparatus applied to a computing device comprising a plurality of processors employing a NUMA architecture; wherein, the processors all adopt a three-level cache architecture; the plurality of processors is divided into at least a first schedule group and a second schedule group; at least part of the processors in the first scheduling group are bound with a first type of service; at least part of the processors in the second scheduling group are bound with a second class of service; the first scheduling group and the second scheduling group belong to the same scheduling domain under the NUMA architecture; topology data for describing the topology structure of the scheduling domain is maintained in the kernel of the operating system carried by the computing device; wherein the topology data includes description information for describing binding relations between processors and services in respective scheduling groups in a scheduling domain; the computing device opens a user-mode interface for modifying the description information maintained in the kernel of the operating system; a third-level cache is shared among the processors in the first scheduling group, and a third-level cache is shared among the processors in the second scheduling group; a third-level cache is not shared between the processors in the first scheduling group and the processors in the second scheduling group; comprising the following steps:
The receiving module receives a capacity expansion instruction aiming at the first type of service; wherein the capacity expansion instruction comprises identification information of the second scheduling group;
the creation module responds to the capacity expansion instruction, allocates a plurality of processors for the first type of service from a second scheduling group corresponding to the identification information, calls the user state interface, modifies the description information of the second scheduling group, creates binding relations between the plurality of processors and the first type of service in the description information, and creates a capacity expansion scheduling group based on the plurality of processors bound with the first type of service;
and the scheduling module is used for carrying out load balancing processing on the first scheduling group and the capacity expansion scheduling group by adopting a load balancing strategy so as to schedule at least part of processing tasks in the first type of services borne by the first scheduling group to the capacity expansion scheduling group.
14. A computing device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any of claims 1-11 by executing the executable instructions.
CN202110773763.XA 2021-07-08 2021-07-08 Load balancing method, device, storage medium and computing equipment Active CN113590313B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110773763.XA CN113590313B (en) 2021-07-08 2021-07-08 Load balancing method, device, storage medium and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110773763.XA CN113590313B (en) 2021-07-08 2021-07-08 Load balancing method, device, storage medium and computing equipment

Publications (2)

Publication Number Publication Date
CN113590313A CN113590313A (en) 2021-11-02
CN113590313B true CN113590313B (en) 2024-02-02

Family

ID=78246449

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110773763.XA Active CN113590313B (en) 2021-07-08 2021-07-08 Load balancing method, device, storage medium and computing equipment

Country Status (1)

Country Link
CN (1) CN113590313B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117369985A (en) * 2022-07-01 2024-01-09 华为技术有限公司 Processor resource isolation method and device, electronic equipment and medium
CN115129458B (en) * 2022-09-02 2022-11-25 腾讯科技(深圳)有限公司 Container-based process scheduling method, device, equipment and storage medium
CN116880773B (en) * 2023-09-05 2023-11-17 苏州浪潮智能科技有限公司 A memory expansion device and data processing method and system

Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2365729A1 (en) * 2001-12-20 2003-06-20 Platform Computing (Barbados) Inc. Topology aware scheduling for a multiprocessor system
US6775231B1 (en) * 1999-09-24 2004-08-10 Cisco Technology Dynamic weighted resource sharing
US7596788B1 (en) * 2004-05-11 2009-09-29 Platform Computing Corporation Support of non-trivial scheduling policies along with topological properties
WO2012016439A1 (en) * 2010-08-06 2012-02-09 中兴通讯股份有限公司 Method, device and equipment for service management
CN103729248A (en) * 2012-10-16 2014-04-16 华为技术有限公司 Method and device for determining tasks to be migrated based on cache perception
CN105812220A (en) * 2014-12-31 2016-07-27 北京华为数字技术有限公司 Number transmitting method, device and terminal
CN108134810A (en) * 2016-12-01 2018-06-08 中国移动通信有限公司研究院 A kind of method and its system of determining scheduling of resource component
WO2018171322A1 (en) * 2017-03-20 2018-09-27 中兴通讯股份有限公司 Virtual network function and method for implementing service processing thereof, and storage medium
CN108696370A (en) * 2017-04-06 2018-10-23 中国移动通信集团甘肃有限公司 A kind of server and business-binding reconciliation binding method, apparatus and system
CN108768877A (en) * 2018-07-20 2018-11-06 网宿科技股份有限公司 A kind of distribution method of burst flow, device and proxy server
CN109491794A (en) * 2018-11-21 2019-03-19 联想(北京)有限公司 Method for managing resource, device and electronic equipment
CN110362402A (en) * 2019-06-25 2019-10-22 苏州浪潮智能科技有限公司 A kind of load-balancing method, device, equipment and readable storage medium storing program for executing
CN111078363A (en) * 2019-12-18 2020-04-28 深信服科技股份有限公司 NUMA node scheduling method, device, equipment and medium for virtual machine
CN111414236A (en) * 2020-03-23 2020-07-14 佳讯飞鸿(北京)智能科技研究院有限公司 Online adjusting method and device for CPU and memory and virtual electronic equipment
CN111787094A (en) * 2020-06-29 2020-10-16 腾讯科技(深圳)有限公司 Data processing method, device, storage medium and equipment
CN111930516A (en) * 2020-09-17 2020-11-13 腾讯科技(深圳)有限公司 Load balancing method and related device
CN112039963A (en) * 2020-08-21 2020-12-04 广州虎牙科技有限公司 Processor binding method and device, computer equipment and storage medium
CN112199194A (en) * 2020-10-14 2021-01-08 广州虎牙科技有限公司 Container cluster-based resource scheduling method, device, equipment and storage medium
CN112231058A (en) * 2020-10-16 2021-01-15 苏州浪潮智能科技有限公司 Method and device for creating cloud host by breaking NUMA topological limitation
CN112306554A (en) * 2020-11-19 2021-02-02 北京亚鸿世纪科技发展有限公司 Optimization method and device for transition of high-performance-requirement software from Grantley platform to Purley platform
CN112532687A (en) * 2020-11-03 2021-03-19 杭州朗澈科技有限公司 Method and system for capacity expansion of kubernets load balancer
CN112667380A (en) * 2020-12-30 2021-04-16 珠海亿智电子科技有限公司 Multiprocessor task scheduling method, device and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7733891B2 (en) * 2005-09-12 2010-06-08 Zeugma Systems Inc. Methods and apparatus to support dynamic allocation of traffic management resources in a network element
US8156495B2 (en) * 2008-01-17 2012-04-10 Oracle America, Inc. Scheduling threads on processors
US8516493B2 (en) * 2011-02-01 2013-08-20 Futurewei Technologies, Inc. System and method for massively multi-core computing systems
KR20150091836A (en) * 2014-02-04 2015-08-12 한국전자통신연구원 Granted Memory Providing System, Granted Memory Registering and Allocating Method
US10748210B2 (en) * 2016-08-09 2020-08-18 Chicago Mercantile Exchange Inc. Systems and methods for coordinating processing of scheduled instructions across multiple components
US10942769B2 (en) * 2018-11-28 2021-03-09 International Business Machines Corporation Elastic load balancing prioritization

Patent Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6775231B1 (en) * 1999-09-24 2004-08-10 Cisco Technology Dynamic weighted resource sharing
CA2365729A1 (en) * 2001-12-20 2003-06-20 Platform Computing (Barbados) Inc. Topology aware scheduling for a multiprocessor system
US7596788B1 (en) * 2004-05-11 2009-09-29 Platform Computing Corporation Support of non-trivial scheduling policies along with topological properties
WO2012016439A1 (en) * 2010-08-06 2012-02-09 中兴通讯股份有限公司 Method, device and equipment for service management
CN103729248A (en) * 2012-10-16 2014-04-16 华为技术有限公司 Method and device for determining tasks to be migrated based on cache perception
CN105812220A (en) * 2014-12-31 2016-07-27 北京华为数字技术有限公司 Number transmitting method, device and terminal
CN108134810A (en) * 2016-12-01 2018-06-08 中国移动通信有限公司研究院 A kind of method and its system of determining scheduling of resource component
WO2018171322A1 (en) * 2017-03-20 2018-09-27 中兴通讯股份有限公司 Virtual network function and method for implementing service processing thereof, and storage medium
CN108696370A (en) * 2017-04-06 2018-10-23 中国移动通信集团甘肃有限公司 A kind of server and business-binding reconciliation binding method, apparatus and system
CN108768877A (en) * 2018-07-20 2018-11-06 网宿科技股份有限公司 A kind of distribution method of burst flow, device and proxy server
CN109491794A (en) * 2018-11-21 2019-03-19 联想(北京)有限公司 Method for managing resource, device and electronic equipment
CN110362402A (en) * 2019-06-25 2019-10-22 苏州浪潮智能科技有限公司 A kind of load-balancing method, device, equipment and readable storage medium storing program for executing
CN111078363A (en) * 2019-12-18 2020-04-28 深信服科技股份有限公司 NUMA node scheduling method, device, equipment and medium for virtual machine
CN111414236A (en) * 2020-03-23 2020-07-14 佳讯飞鸿(北京)智能科技研究院有限公司 Online adjusting method and device for CPU and memory and virtual electronic equipment
CN111787094A (en) * 2020-06-29 2020-10-16 腾讯科技(深圳)有限公司 Data processing method, device, storage medium and equipment
CN112039963A (en) * 2020-08-21 2020-12-04 广州虎牙科技有限公司 Processor binding method and device, computer equipment and storage medium
CN111930516A (en) * 2020-09-17 2020-11-13 腾讯科技(深圳)有限公司 Load balancing method and related device
CN112199194A (en) * 2020-10-14 2021-01-08 广州虎牙科技有限公司 Container cluster-based resource scheduling method, device, equipment and storage medium
CN112231058A (en) * 2020-10-16 2021-01-15 苏州浪潮智能科技有限公司 Method and device for creating cloud host by breaking NUMA topological limitation
CN112532687A (en) * 2020-11-03 2021-03-19 杭州朗澈科技有限公司 Method and system for capacity expansion of kubernets load balancer
CN112306554A (en) * 2020-11-19 2021-02-02 北京亚鸿世纪科技发展有限公司 Optimization method and device for transition of high-performance-requirement software from Grantley platform to Purley platform
CN112667380A (en) * 2020-12-30 2021-04-16 珠海亿智电子科技有限公司 Multiprocessor task scheduling method, device and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Software-Defined "Hardware" Infrastructures: A Survey on Enabling Technologies and Open Research Directions; A. Roozbeh et al.; 《IEEE Communications Surveys & Tutorials》; full text *
Optimization of the Load Balancing Policy for Tiled Many-Core Processors; Y. Liu, S. Kato and M. Edahiro; 《IEEE Access》; full text *
Design and Implementation of a Parallel Architecture for a Packet Processing Engine Based on General-Purpose Multi-Core Processors; 刘仲举; 《China Masters' Theses Full-text Database, Information Science and Technology》; full text *
Research on Virtual CPU Scheduling Mechanisms for Non-Uniform Memory Access Architectures; 甘清甜; 《China Masters' Theses Full-text Database, Information Science and Technology》; full text *

Also Published As

Publication number Publication date
CN113590313A (en) 2021-11-02

Similar Documents

Publication Publication Date Title
CN113590313B (en) Load balancing method, device, storage medium and computing equipment
EP3811206B1 (en) Network-accessible computing service for micro virtual machines
US8301746B2 (en) Method and system for abstracting non-functional requirements based deployment of virtual machines
US7620953B1 (en) System and method for allocating resources of a core space among a plurality of core virtual machines
US8725875B2 (en) Native cloud computing via network segmentation
US11385883B2 (en) Methods and systems that carry out live migration of multi-node applications
JP2024535764A (en) Using Remote Pods in Kubernetes
KR20210096259A (en) Allocate compute resources
US20240143377A1 (en) Overlay container storage driver for microservice workloads
US20170235702A1 (en) Remote direct memory access-based method of transferring arrays of objects including garbage data
US20200329098A1 (en) Migrating a network service to a container-based platform
CN113556375A (en) Cloud computing service method, apparatus, electronic device and computer storage medium
CN117118930A (en) Switch, memory sharing method, system, computing device and storage medium
JP7473287B2 (en) Method, apparatus and computer program for provisioning resources associated with multiple instances of a computer system - Patents.com
CN114546587A (en) A method for expanding and shrinking capacity of online image recognition service and related device
US11360824B2 (en) Customized partitioning of compute instances
US7669202B1 (en) Resource management
US10216599B2 (en) Comprehensive testing of computer hardware configurations
CN116909756B (en) Cross-cloud service method and device, electronic equipment and storage medium
US12079595B2 (en) Runtime support for role types that extend underlying types
US10610780B1 (en) Pre-loaded content attribute information
US10133496B1 (en) Bindable state maintaining components
CN114490031B (en) A cloud resource management method for cloud desktop system server
US12294521B1 (en) Low-latency paths for data transfers between endpoints which utilize intermediaries for connectivity establishment
CN114244724B (en) Method and device for evolution of metropolitan area network control plane to containerization

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: 310052 Room 301, Building No. 599, Changhe Street Network Business Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Hangzhou NetEase Shuzhifan Technology Co.,Ltd.

Address before: 310052 Room 301, Building No. 599, Changhe Street Network Business Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: HANGZHOU LANGHE TECHNOLOGY Ltd.

GR01 Patent grant
GR01 Patent grant
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载