CN109783224B - Task allocation method and device based on load allocation and terminal equipment - Google Patents

Info

Publication number
CN109783224B
Authority
CN
China
Prior art keywords
task
calculation
node
computing
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811502296.1A
Other languages
Chinese (zh)
Other versions
CN109783224A (en
Inventor
王路生
陆进
陈斌
宋晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201811502296.1A
Publication of CN109783224A
Application granted
Publication of CN109783224B
Status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention is applicable to the technical field of data processing and provides a task allocation method and device based on load allocation, a terminal device, and a computer-readable storage medium. The method comprises the following steps: binding at least two computing nodes, and performing computing power statistics on each bound computing node to obtain an average computing power value of each computing node; if the task type of a computing task is a timing task, acquiring a task computing power value of the computing task, determining, according to the average computing power values, the computing nodes that satisfy the task computing power value, taking the determined computing nodes as target nodes, and distributing the computing task to all target nodes; and if the task type is a maximum computing power task, determining each bound computing node as a target node and distributing the computing task to all target nodes. Because tasks are allocated according to the average computing power value of each computing node and the specific task type of the computing task, the flexibility of task allocation and the processing efficiency of tasks are both improved.

Description

Task allocation method and device based on load allocation and terminal equipment
Technical Field
The present invention belongs to the technical field of data processing, and in particular, to a method and an apparatus for task allocation based on load scheduling, a terminal device, and a computer-readable storage medium.
Background
With the development of mathematics and computer technology, deep learning has become a current focus of research. Deep learning is a branch of machine learning: a neural network that simulates the analysis and learning mechanisms of the human brain is established, and deep learning tasks are sent to the neural network for processing, so that data such as images, sounds, or text can be interpreted.
In existing deep learning frameworks, deep learning tasks are generally distributed evenly among the computing nodes of a processing unit (such as a central processing unit). Because the processing capacities of the computing nodes may be unequal, and some computing nodes may be at risk of crashing, task processing efficiency is low. In the prior art, therefore, the task allocation mode is rigid and task processing efficiency is low.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for task allocation based on load allocation, a terminal device, and a computer-readable storage medium, so as to solve the problems in the prior art that task allocation is not flexible and task processing efficiency is low.
A first aspect of an embodiment of the present invention provides a task allocation method based on load allocation, including:
binding at least two computing nodes, and performing computing power statistics on each bound computing node to obtain an average computing power value of each computing node, wherein the computing nodes are central processor cores, graphics processor cores or neural network processor cores;
if the task type of a computing task is a timing task, acquiring a task computing power value of the computing task, determining, according to the average computing power values, the computing nodes that satisfy the task computing power value, determining those computing nodes as target nodes, and distributing the computing task to all the target nodes, wherein the computation amount distributed to each target node corresponds to the average computing power value of that target node;
and if the task type is a maximum computing power task, determining each bound computing node as a target node, and distributing the computing task to all the target nodes.
A second aspect of the embodiments of the present invention provides a task allocation apparatus based on load scheduling, including:
a computing power statistics unit, configured to bind at least two computing nodes and perform computing power statistics on each bound computing node to obtain an average computing power value of each computing node, wherein the computing nodes are central processor cores, graphics processor cores or neural network processor cores;
a first allocation unit, configured to, if the task type of a computing task is a timing task, acquire a task computing power value of the computing task, determine, according to the average computing power values, the computing nodes that satisfy the task computing power value, determine those computing nodes as target nodes, and distribute the computing task to all the target nodes, wherein the computation amount distributed to each target node corresponds to the average computing power value of that target node;
and a second allocation unit, configured to, if the task type is a maximum computing power task, determine each bound computing node as a target node and distribute the computing task to all the target nodes.
A third aspect of the embodiments of the present invention provides a terminal device, where the terminal device includes a memory, a processor, and a computer program that is stored in the memory and is executable on the processor, and the processor implements the following steps when executing the computer program:
binding at least two computing nodes, and performing computing power statistics on each bound computing node to obtain an average computing power value of each computing node, wherein the computing nodes are central processor cores, graphics processor cores or neural network processor cores;
if the task type of a computing task is a timing task, acquiring a task computing power value of the computing task, determining, according to the average computing power values, the computing nodes that satisfy the task computing power value, determining those computing nodes as target nodes, and distributing the computing task to all the target nodes, wherein the computation amount distributed to each target node corresponds to the average computing power value of that target node;
and if the task type is a maximum computing power task, determining each bound computing node as a target node, and distributing the computing task to all the target nodes.
A fourth aspect of embodiments of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of:
binding at least two computing nodes, and performing computing power statistics on each bound computing node to obtain an average computing power value of each computing node, wherein the computing nodes are central processor cores, graphics processor cores or neural network processor cores;
if the task type of a computing task is a timing task, acquiring a task computing power value of the computing task, determining, according to the average computing power values, the computing nodes that satisfy the task computing power value, determining those computing nodes as target nodes, and distributing the computing task to all the target nodes, wherein the computation amount distributed to each target node corresponds to the average computing power value of that target node;
and if the task type is a maximum computing power task, determining each bound computing node as a target node, and distributing the computing task to all the target nodes.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
Computing nodes that support task processing are bound, and the average computing power value of each computing node is calculated; if the task type of the acquired computing task is a timing task, the computing nodes that satisfy the task computing power value of the computing task are determined according to the average computing power values, and the computing task is distributed to those computing nodes; and if the task type is a maximum computing power task, the computing task is distributed to all the bound computing nodes for processing. Because the average computing power value of each computing node is calculated and tasks are allocated according to the specific task type of the computing task, the flexibility of task allocation and the processing efficiency of tasks are improved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of an implementation of a task allocation method based on load scheduling according to an embodiment of the present invention;
Fig. 2 is a flowchart of an implementation of a task allocation method based on load scheduling according to a second embodiment of the present invention;
Fig. 3 is a flowchart of an implementation of a task allocation method based on load scheduling according to a third embodiment of the present invention;
Fig. 4 is a flowchart of an implementation of a task allocation method based on load scheduling according to a fourth embodiment of the present invention;
Fig. 5 is a flowchart of an implementation of a task allocation method based on load scheduling according to a fifth embodiment of the present invention;
Fig. 6 is a block diagram of a task allocation apparatus based on load scheduling according to a sixth embodiment of the present invention;
Fig. 7 is a schematic diagram of a terminal device according to a seventh embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
Fig. 1 shows an implementation flow of the task allocation method based on load scheduling according to the embodiment of the present invention, which is detailed as follows:
In S101, at least two computing nodes are bound, and computing power statistics are performed on each bound computing node to obtain an average computing power value of each computing node, wherein the computing nodes are central processor cores, graphics processor cores or neural network processor cores.
In the embodiment of the present invention, all computing nodes in the terminal device are first determined according to the actual configuration of the terminal device, and at least two of them are bound. A computing node is the smallest computing unit with task processing capability; for example, it may be a Central Processing Unit (CPU) core, a Graphics Processing Unit (GPU) core, or a Neural Network Processing Unit (NPU) core, and it may also be a Digital Signal Processing (DSP) unit, a hardware acceleration unit, or the like. Binding computing nodes means taking them as candidate nodes for processing computing tasks. Different binding modes exist depending on the actual configuration of the terminal device and on the computing task. For example, after all computing nodes on the terminal device are determined, all of them can be bound directly; if the terminal device has a heterogeneous CPU + GPU configuration and the computing task is specified to be processed only by the CPU, only the computing nodes that are CPU cores are bound; and if the terminal device has a heterogeneous CPU + GPU configuration and the computing task is specified to be processed only by the GPU, only the computing nodes that are GPU cores are bound. The embodiment of the present invention does not limit the specific manner of binding; for example, a preset binding parameter can be set in a configuration file for each computing node to be bound, and the processing of the computing task is then restricted to the computing nodes corresponding to the binding parameter.
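The binding modes described above can be illustrated with a minimal Python sketch. The `ComputeNode` class, the `kind` labels, and the `bind_nodes` function are hypothetical names introduced only for illustration; the patent does not prescribe any particular data structure or configuration format.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class ComputeNode:
    """Hypothetical description of one computing node (one core)."""
    name: str
    kind: str  # assumed labels: "cpu", "gpu", "npu", ...


def bind_nodes(all_nodes: Iterable[ComputeNode],
               allowed_kinds: Optional[Set[str]] = None) -> List[ComputeNode]:
    """Return the nodes that become candidates for a computing task.

    allowed_kinds=None binds every node; otherwise only nodes whose kind
    is listed are bound (e.g. {"cpu"} for a CPU-only task).
    """
    nodes = list(all_nodes)
    if allowed_kinds is None:
        return nodes
    return [node for node in nodes if node.kind in allowed_kinds]


# Illustrative CPU + GPU heterogeneous configuration.
nodes = [
    ComputeNode("cpu0", "cpu"),
    ComputeNode("cpu1", "cpu"),
    ComputeNode("gpu0", "gpu"),
]
cpu_only = bind_nodes(nodes, allowed_kinds={"cpu"})
everything = bind_nodes(nodes)
```

Here the `allowed_kinds` argument plays the role of the preset binding parameter; in practice it could equally be read from a configuration file.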
Computing power statistics are performed on each bound computing node to obtain its average computing power value, which indicates the node's capacity for processing computing tasks. For convenience of explanation, the unit of the computing power value (and of the average computing power value) in the embodiment of the present invention is Operations Per Second (OPS), but this does not limit the embodiment of the present invention. When computing power statistics are performed, the processor to which a computing node physically belongs is determined first (for example, a central processor core physically belongs to a central processing unit), and the average computing power value of the computing node is then derived from the computing power value of that processor. Specifically, one way is to read the configuration parameters of the processor directly, obtain the computing power value of the processor from the configuration parameters, and divide that value by the number of computing nodes contained in the processor to obtain the average computing power value of each computing node. Another way is to hand a task with a known computation amount to the processor for processing, calculate the computing power value of the processor from the time it takes to process the task, and then divide that value by the number of computing nodes contained in the processor.
For ease of understanding, the embodiment of the present invention uses the number of operations as the unit of computation amount; in an actual application scenario the computation amount may also be expressed in other forms. For example, if the computation amount of a task is 100G (giga, i.e. billion) operations and the processor to which computing node Node_A belongs takes 2 seconds to complete the task, the computing power value of the processor is 100/2 = 50 GOPS; if the processor contains 2 computing nodes, the average computing power value of Node_A is 50/2 = 25 GOPS.
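The benchmark-based way of obtaining the average computing power value can be sketched as follows. The function names are assumptions made for this illustration; amounts are in G operations and power values in GOPS, as in the example above.

```python
def processor_power_gops(workload_g_ops: float, duration_s: float) -> float:
    """Computing power of a processor, from a task of known size and its run time."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return workload_g_ops / duration_s


def average_node_power_gops(processor_gops: float, node_count: int) -> float:
    """Average computing power of each computing node inside the processor."""
    if node_count <= 0:
        raise ValueError("a processor must contain at least one computing node")
    return processor_gops / node_count


# Worked example from the text: 100G operations finished in 2 seconds
# by a processor that contains 2 computing nodes.
power = processor_power_gops(100, 2)          # 50.0 GOPS
node_a = average_node_power_gops(power, 2)    # 25.0 GOPS
```

The same `average_node_power_gops` step applies when the processor's computing power value is read from its configuration parameters instead of being measured.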
Optionally, a computing power table is established based on all the average computing power values, and the mapping between average computing power values and computing nodes is recorded in it. That is, after the average computing power value of each bound computing node has been calculated, the table stores which value belongs to which node. The computing power table may be a database table or a table in another form, and it makes subsequent task allocation more convenient.
In S102, if the task type of a computing task is a timing task, a task computing power value of the computing task is acquired, the computing nodes that satisfy the task computing power value are determined according to the average computing power values, the determined computing nodes are taken as target nodes, and the computing task is distributed to all the target nodes, wherein the computation amount distributed to each target node corresponds to the average computing power value of that target node.
In the embodiment of the present invention, a computing task to be processed is allocated according to its task type. The computing task may implement functions such as deep learning, monitoring, or retrieval, which is not limited in the embodiment of the present invention. Specifically, if the task type of the computing task is a timing task, that is, the computing task has a fixed timeliness requirement, the task computing power value required by the computing task is acquired, and the computing nodes that satisfy the task computing power value are determined according to the average computing power values. For ease of distinction, the determined computing nodes are called target nodes, and the computing task is distributed to all target nodes, where the sum of the average computing power values of the target nodes is greater than or equal to the task computing power value. The task type and the task computing power value of the computing task can be specified in advance by a user. For example, a real-time video processing task must finish its processing by the time each frame of the video ends, so its timeliness requirement is fixed and its task type can be designated as a timing task.
When the computing task is distributed to the target nodes, the computation amount distributed to each target node is also determined. First, a target execution duration is determined according to the timeliness requirement of the computing task, and the total computation amount of the computing task is then determined from the target execution duration and the task computing power value. For example, if the task computing power value is 100 GOPS and the target execution duration (timeliness requirement) is 2 seconds, the total computation amount is 100 × 2 = 200G operations. If there is only one target node, the total computation amount is distributed to it; if there are at least two target nodes, the total computation amount is split into at least two computation amounts according to the average computing power value of each target node, and these are distributed to the respective target nodes.
When there are at least two target nodes, the average computing power value of each target node is multiplied by the target execution duration of the computing task to obtain the maximum supported amount of that target node, and the computation amount is distributed according to the maximum supported amounts and the total computation amount. Two distribution modes exist. The first is a priority distribution mode: computation is distributed to the target nodes in the numerical order of their average computing power values, each target node in turn receiving a computation amount equal to its maximum supported amount, until the total computation amount has been distributed. The numerical order may be from the largest average computing power value to the smallest or from the smallest to the largest. For example, suppose the total computation amount of the computing task is 200G operations, the target execution duration is 2 seconds, and the target nodes are Node_B and Node_C, with average computing power values of 70 GOPS and 60 GOPS respectively. The maximum supported amounts of Node_B and Node_C are then 140G operations and 120G operations. If the order is from largest to smallest, Node_B is first given a computation amount equal to its maximum supported amount, i.e. 140G operations, and Node_C is then given the remaining 60G operations.
The second is a balanced distribution mode: computation is distributed to the target nodes in proportion to their average computing power values. In the example of Node_B and Node_C, the ratio of the average computing power values is 7:6, so the computation amount distributed to Node_B is 200 × (7/(7+6)) ≈ 108G operations and the computation amount distributed to Node_C is 200 × (6/(7+6)) ≈ 92G operations. If the computation amount distributed to a target node exceeds its maximum supported amount, the excess may be redistributed to other target nodes for further balancing, so that processing of the computing task does not overrun its deadline. The priority mode is more efficient at distribution time, while the balanced mode achieves load balancing and reduces the loss caused by the failure of any single target node; either may be chosen according to the actual application scenario.
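The two distribution modes can be sketched in Python as follows. This is only one possible reading of the description: the function names are hypothetical, computation amounts are in G operations, and power values are in GOPS.

```python
from typing import Dict


def max_supported(avg_power: Dict[str, float], duration_s: float) -> Dict[str, float]:
    """Maximum amount each target node can finish within the target duration."""
    return {name: power * duration_s for name, power in avg_power.items()}


def allocate_priority(total: float, avg_power: Dict[str, float],
                      duration_s: float, descending: bool = True) -> Dict[str, float]:
    """Priority mode: fill nodes to their maximum supported amount, in order."""
    caps = max_supported(avg_power, duration_s)
    order = sorted(avg_power, key=avg_power.get, reverse=descending)
    remaining = total
    shares = {}
    for name in order:
        share = min(caps[name], remaining)
        shares[name] = share
        remaining -= share
    if remaining > 1e-9:
        raise ValueError("target nodes cannot finish the task within the duration")
    return shares


def allocate_balanced(total: float, avg_power: Dict[str, float]) -> Dict[str, float]:
    """Balanced mode: split the total in proportion to average computing power."""
    power_sum = sum(avg_power.values())
    return {name: total * power / power_sum for name, power in avg_power.items()}


# Worked example from the text: 200G operations, 2 seconds, 70 and 60 GOPS.
targets = {"Node_B": 70, "Node_C": 60}
priority = allocate_priority(200, targets, 2)   # Node_B: 140, Node_C: 60
balanced = allocate_balanced(200, targets)      # roughly 108 and 92
```

When every maximum supported amount is derived from the same target execution duration and the total fits within their sum, a proportional share never exceeds a node's maximum supported amount, so this sketch omits the redistribution of excess.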
Optionally, all bound computing nodes are sorted in the numerical order of their average computing power values to generate a task allocation sequence; the smallest number of computing nodes that satisfies the task computing power value is determined according to the task allocation sequence, the determined computing nodes are taken as target nodes, and the computing task is distributed to the target nodes. In the embodiment of the present invention, because the task type of the computing task is a timing task, the number of target nodes participating in the computation should be kept as small as possible to reduce resource consumption. To this end, all bound computing nodes can be sorted in the numerical order of their average computing power values to generate the task allocation sequence, where the order may be from the largest average computing power value to the smallest or from the smallest to the largest. The smallest number of computing nodes satisfying the task computing power value is then determined according to the generated sequence, the determined computing nodes are taken as target nodes, and the computing task is distributed to them.
For example, suppose the bound computing nodes are Node_D, Node_E and Node_F, with average computing power values of 50 GOPS, 60 GOPS and 70 GOPS respectively. If the order is from largest to smallest, the generated task allocation sequence is Node_F - Node_E - Node_D. If the task computing power value of the computing task is 100 GOPS, Node_F at the head of the sequence is first determined as a target node; because 100 - 70 = 30 GOPS of the task computing power value remains uncovered, Node_E is also determined as a target node, at which point the task computing power value is covered and the determination of target nodes is complete. In this way the task computing power value is satisfied with as few target nodes as possible, which reduces the consumption of computing resources.
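The selection of target nodes along the task allocation sequence can be sketched as a simple greedy loop. The function name `select_target_nodes` is hypothetical, and the sketch assumes power values in GOPS.

```python
from typing import Dict, List


def select_target_nodes(avg_power: Dict[str, float], task_power: float,
                        descending: bool = True) -> List[str]:
    """Walk the task allocation sequence until the task power value is covered."""
    sequence = sorted(avg_power, key=avg_power.get, reverse=descending)
    targets = []
    remaining = task_power
    for name in sequence:
        if remaining <= 0:
            break
        targets.append(name)
        remaining -= avg_power[name]
    if remaining > 0:
        raise ValueError("bound nodes cannot satisfy the task computing power value")
    return targets


# Worked example from the text: 50, 60 and 70 GOPS nodes, 100 GOPS task.
bound = {"Node_D": 50, "Node_E": 60, "Node_F": 70}
chosen = select_target_nodes(bound, 100)   # Node_F first, then Node_E
```

Sorting from largest to smallest is what keeps the number of selected nodes low; with the opposite order the same loop still covers the task power value but may use more nodes.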
In S103, if the task type is a maximum computing power task, each bound computing node is determined as a target node, and the computing task is distributed to all the target nodes.
If the task type of the computing task is a maximum computing power task (that is, the computing task should be processed as fast as possible and has no fixed timeliness requirement), such as a search task or a picture retrieval task, each bound computing node is determined as a target node, and the computing task is distributed to all target nodes to speed up processing. When the task type is a maximum computing power task, an average computation amount may be calculated from the total computation amount of the computing task and the number of target nodes and distributed to each target node. For example, if the total computation amount is 200G operations and there are 10 target nodes, the average computation amount is 200/10 = 20G operations, and distributing this amount to each target node achieves load balancing. In addition, when the computing task is obtained from an external system, task type identification can be simplified by mapping the timing task type to a first preset identifier and the maximum computing power task type to a second preset identifier. The external system chooses, according to the specific content of the computing task, whether to add the first or the second preset identifier to the task name of the computing task. After the terminal device obtains the computing task, it reads the identifier to determine the task type; for example, if the task name contains the first preset identifier, the task type is determined to be a timing task.
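A minimal sketch of the identifier lookup and of the even split for maximum computing power tasks is given below. The identifier strings `#timing` and `#maxpower` are invented for this illustration; the patent only states that two preset identifiers exist, not what they look like.

```python
from typing import Dict, List

TIMING_TAG = "#timing"        # hypothetical first preset identifier
MAX_POWER_TAG = "#maxpower"   # hypothetical second preset identifier


def task_type(task_name: str) -> str:
    """Infer the task type from the identifier embedded in the task name."""
    if TIMING_TAG in task_name:
        return "timing"
    if MAX_POWER_TAG in task_name:
        return "max_power"
    raise ValueError("task name carries no known type identifier")


def allocate_evenly(total: float, node_names: List[str]) -> Dict[str, float]:
    """Give every target node the same average computation amount."""
    if not node_names:
        raise ValueError("at least one target node is required")
    share = total / len(node_names)
    return {name: share for name in node_names}


# Worked example from the text: 200G operations over 10 target nodes.
node_names = [f"node{i}" for i in range(10)]
shares = allocate_evenly(200, node_names)   # 20.0G operations per node
kind = task_type("image-search#maxpower")
```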
Preferably, bound computing nodes that are idle are determined as the target nodes. To prevent a computing node from being overloaded, the utilization rate of each bound computing node may be collected when target nodes are determined, and only computing nodes that are idle, i.e. whose utilization rate is below a preset utilization threshold (e.g. 10%), are determined as target nodes. The utilization rate of a computing node may be obtained by executing a specific query command in the operating system of the terminal device; for example, if the operating system is Linux, the utilization rate can be obtained by executing a query command such as top.
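The idle-node filter can be sketched as below. How the utilization figures are collected is left out; the dictionary of utilization rates and the function name `idle_nodes` are assumptions for this illustration, with rates expressed as fractions between 0 and 1.

```python
from typing import Dict, List


def idle_nodes(utilization: Dict[str, float], threshold: float = 0.10) -> List[str]:
    """Return bound nodes whose utilization rate is below the preset threshold."""
    return [name for name, rate in utilization.items() if rate < threshold]


# Illustrative utilization rates for three bound nodes.
rates = {"cpu0": 0.05, "cpu1": 0.42, "gpu0": 0.00}
candidates = idle_nodes(rates)   # cpu0 and gpu0 qualify at the 10% threshold
```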
As can be seen from the embodiment shown in Fig. 1, at least two computing nodes are bound and the average computing power value of each computing node is calculated. If the task type of the computing task is a timing task, the task computing power value of the computing task is acquired, the computing nodes satisfying it are determined as target nodes, and the computing task is distributed to those target nodes; if the task type is a maximum computing power task, the computing task is distributed to all bound computing nodes. Because each unit capable of computation in the terminal device is treated as a computing node and the computing task is distributed to the corresponding computing nodes according to its task type, the flexibility of task allocation and the processing efficiency of computing tasks are improved.
Fig. 2 shows a method that extends the first embodiment of the present invention. This embodiment provides an implementation flowchart of a task allocation method based on load allocation; as shown in Fig. 2, the task allocation method may include the following steps:
In S201, computing power monitoring is performed on each target node to which the computing task has been distributed, to obtain a real-time computing power value of the target node.
After the computing task has been distributed, computing power monitoring is performed on each target node that received part of the task, to obtain the current real-time computing power value of the target node. The monitoring method differs according to the processor to which the target node belongs; for example, if the target node is a central processor core, i.e. its processor is a central processing unit, computing power monitoring can be performed with monitoring software such as CPU-Z.
In S202, if the real-time computation force value is smaller than the average computation force value of the target node, and an absolute value of a difference between the real-time computation force value and the average computation force value exceeds a preset fluctuation value, determining the target node as a node to be evaluated, reducing the computation amount allocated to the node to be evaluated according to the real-time computation force value, and allocating redundant computation amount to other idle target nodes.
If the monitored real-time computing power value is smaller than the average computing power value of the target node, and the absolute value of the difference between the two exceeds a preset fluctuation value, the target node is determined as a node to be evaluated. The ratio of the real-time computing power value to the average computing power value is calculated and multiplied by the computation amount initially allocated to the node to be evaluated, and the product is taken as the computation amount reallocated to that node. The fluctuation value can be set according to the fluctuation range of the target node in the actual application scenario; for example, it can be set to 30 GOPS. The surplus computation amount can be allocated to other idle target nodes, where, as before, an idle target node is a target node whose utilization rate is less than the utilization rate threshold. It should be noted that in all other cases, such as when the real-time computing power value is not less than the average computing power value, or when the absolute value of the difference does not exceed the fluctuation value, the computation amount is not updated, and computing power monitoring of the target node continues until the computing task on the target node has finished executing.
For example, suppose the fluctuation value is 30 GOPS, the computation amount initially allocated to the target node is 300G operations, the average computing power value of the target node is 60 GOPS, and the real-time computing power value obtained by monitoring the target node is 20 GOPS. Because the real-time computing power value is smaller than the average computing power value, and the absolute value of their difference, 40 GOPS, exceeds the fluctuation value of 30 GOPS, the target node is determined as the node to be evaluated. The ratio of the real-time computing power value to the average computing power value is 1:3, so the computation amount reallocated to the node to be evaluated is 300 × (1/3) = 100G operations, and the surplus 200G operations are allocated to other idle target nodes.
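The check and rescaling described for S202 can be sketched as below, using the figures from the example. The function name `reallocate` and its parameter names are hypothetical and chosen only for illustration.

```python
def reallocate(initial_ops, avg_power, realtime_power, fluctuation):
    """Return (ops kept by the node, surplus ops) for one monitored node.

    The node is treated as a node to be evaluated only when its real-time
    value is below its average value AND the gap exceeds the fluctuation
    value; otherwise the allocation is left unchanged.
    """
    degraded = (realtime_power < avg_power
                and abs(realtime_power - avg_power) > fluctuation)
    if not degraded:
        return initial_ops, 0
    # Scale the initial amount by the ratio real-time / average.
    new_ops = initial_ops * realtime_power / avg_power
    return new_ops, initial_ops - new_ops


# Figures from the example: 300G operations, 60 GOPS average,
# 20 GOPS real-time, 30 GOPS fluctuation value.
new_ops, surplus = reallocate(300, 60, 20, 30)
```

Here `new_ops` is 100 and `surplus` is 200 (in G operations), matching the worked example.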
Optionally, after the computation amount reallocated to the node to be evaluated is calculated from the ratio of the real-time computing power value to the average computing power value, that amount is further adjusted by a preset margin amount. Once the computing power value of the node to be evaluated has dropped outside its normal fluctuation range, it may continue to drop. To avoid delaying the processing of the computing task, a margin amount can therefore be preset: the margin amount is subtracted from the computation amount calculated from the ratio, and the result is used as the final computation amount reallocated to the node to be evaluated. For example, if the margin amount is 10G operations and the computation amount calculated from the ratio is 30G operations, the final computation amount is 30 - 10 = 20G operations, and 20G operations are reallocated to the node to be evaluated.
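The optional margin adjustment amounts to a single subtraction, sketched below. The name `apply_margin` is hypothetical, and clamping the result at zero is an assumption of this sketch; the description does not say what happens when the margin amount exceeds the reallocated amount.

```python
def apply_margin(reallocated_ops, margin_ops):
    """Subtract the preset margin amount from the reallocated amount.

    Clamping at zero is an assumption made for this sketch.
    """
    return max(reallocated_ops - margin_ops, 0)


# Figures from the example: 30G operations reallocated, 10G margin.
final_ops = apply_margin(30, 10)
```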
As can be seen from the embodiment shown in fig. 2, in the embodiment of the present invention, the real-time computation value of the target node is obtained by monitoring the computation force of the target node to which the computation task has been allocated, and if the obtained real-time computation value is smaller than the average computation value of the target node and the absolute value of the difference between the real-time computation value and the average computation value exceeds the preset fluctuation value, the target node is determined as the node to be evaluated, the computation amount allocated to the node to be evaluated is reduced according to the real-time computation value, and the redundant computation amount is allocated to other idle target nodes.
Fig. 3 shows a method that, on the basis of the second embodiment of the present invention, refines the process of determining a target node as a node to be evaluated, reducing the computation amount allocated to the node to be evaluated according to the real-time computing power value, and allocating the surplus computation amount to other idle target nodes. It is an implementation flowchart of a task allocation method based on load allocation provided by an embodiment of the present invention. As shown in fig. 3, the task allocation method may include the following steps:
In S301, the processor where the node to be evaluated is located is determined as the processor to be evaluated.
Because a drop in the computing power value of the node to be evaluated may be caused by a processor fault or by processor frequency reduction, in scenarios where the determined target nodes are not confined to a single processor, the processor where the node to be evaluated is located is determined as the processor to be evaluated.
In S302, the computation amounts allocated to all the target nodes in the processor to be evaluated are uniformly reduced according to the real-time computing power value, and the surplus computation amount is allocated to other idle target nodes not belonging to the processor to be evaluated.
Once the processor to be evaluated is determined, the ratio of the real-time computing power value to the average computing power value of the node to be evaluated is calculated. For each target node on the processor to be evaluated, this ratio is multiplied by the computation amount initially allocated to that target node, and the product is taken as the computation amount reallocated to that target node. The surplus computation amount is allocated to other idle target nodes that do not belong to the processor to be evaluated. It is worth mentioning that if a processor to be evaluated contains at least two nodes to be evaluated, the ratio of the real-time computing power value to the average computing power value is calculated for each of them, and the lowest ratio is used to calculate the computation amounts reallocated to the target nodes.
For example, suppose the target nodes on the processor to be evaluated are Node_G, Node_H and Node_I, of which Node_G has been confirmed as the node to be evaluated; the computation amounts initially allocated to Node_G, Node_H and Node_I are 50G operations, 100G operations and 150G operations respectively; and the ratio of the real-time computing power value to the average computing power value of Node_G is 1:5. The computation amounts reallocated to Node_G, Node_H and Node_I are then 10G operations, 20G operations and 30G operations respectively, and the surplus 240G operations are allocated to other idle target nodes that do not belong to the processor to be evaluated.
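The processor-wide reduction in S302 can be sketched as follows, reusing the figures from the example. The function name `reduce_processor` and the dictionary layout are hypothetical; `fractions.Fraction` is used only so that the ratio 1:5 is represented exactly.

```python
from fractions import Fraction


def reduce_processor(initial_ops, ratios):
    """Scale every target node on one processor by the lowest ratio.

    initial_ops: node name -> operations initially allocated, for all
        target nodes on the processor to be evaluated.
    ratios: node name -> real-time / average ratio, for the nodes to be
        evaluated on that processor (at least one entry).
    Returns (node name -> reallocated operations, surplus operations).
    """
    ratio = min(ratios.values())            # lowest ratio wins
    new_ops = {node: ops * ratio for node, ops in initial_ops.items()}
    surplus = sum(initial_ops.values()) - sum(new_ops.values())
    return new_ops, surplus


initial = {"Node_G": 50, "Node_H": 100, "Node_I": 150}
new_ops, surplus = reduce_processor(initial, {"Node_G": Fraction(1, 5)})
```

With these inputs the three nodes keep 10G, 20G and 30G operations and the surplus is 240G operations, as in the example.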
As can be seen from the embodiment shown in fig. 3, in the embodiment of the present invention, the processor in which the node to be evaluated is located is determined as the processor to be evaluated, the computation amounts allocated to all target nodes in the processor to be evaluated are uniformly reduced according to the real-time computation value, and the redundant computation amounts are allocated to other idle target nodes that do not belong to the processor to be evaluated.
Fig. 4 shows a method that, on the basis of the second embodiment of the present invention, extends the process of reducing the computation amount allocated to the node to be evaluated according to the real-time computing power value and allocating the surplus computation amount to other idle target nodes. It is an implementation flowchart of a task allocation method based on load allocation provided by an embodiment of the present invention. As shown in fig. 4, the task allocation method may include the following steps:
In S401, at least two real-time computing power values of the node to be evaluated are obtained in a preset monitoring time period, and a computing power variation trend value of the node to be evaluated is analyzed according to all the obtained real-time computing power values.
In the embodiment of the present invention, a target node determined as a node to be evaluated may also be monitored continuously. Specifically, at least two real-time computing power values of the node to be evaluated are obtained according to a preset monitoring time period and a preset monitoring frequency, both of which may be determined from the target execution duration of the computing task in the actual application scenario. Preferably, the monitoring time period is shorter than the target execution duration. For example, when the target execution duration is 10 minutes, the monitoring time period is set to the 5 minutes after the node to be evaluated is confirmed, and the monitoring frequency is set to once every 5 seconds. For all the obtained real-time computing power values, a coordinate system is established with the monitoring time on the horizontal axis and the real-time computing power value on the vertical axis. In order of monitoring time from earliest to latest, the slope of the straight line through each pair of adjacent real-time computing power values is calculated, and the sum of all the slopes is taken as the computing power variation trend value. For example, suppose three real-time computing power values Value_A, Value_B and Value_C are obtained at a monitoring interval of 5 seconds, with values of 50 GOPS, 25 GOPS and 75 GOPS and monitoring times of the 5th, 10th and 15th second respectively. The slope of the line through Value_A and Value_B is (25 - 50) / (10 - 5) = -5, the slope of the line through Value_B and Value_C is (75 - 25) / (15 - 10) = 10, and the computing power variation trend value is -5 + 10 = 5.
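The trend value calculation can be sketched as a sum of slopes between samples that are adjacent in time. The function name `trend_value` and the `(time, value)` tuple layout are hypothetical.

```python
def trend_value(samples):
    """Sum the slopes between time-adjacent (time_seconds, power) samples.

    At least two samples with distinct times are expected.
    """
    ordered = sorted(samples)               # order by monitoring time
    return sum((p2 - p1) / (t2 - t1)
               for (t1, p1), (t2, p2) in zip(ordered, ordered[1:]))


# Figures from the example: 50, 25 and 75 GOPS at seconds 5, 10 and 15.
trend = trend_value([(5, 50), (10, 25), (15, 75)])
```

One property worth noting: when the monitoring interval is constant, the sum of adjacent slopes telescopes to (last value - first value) / interval, so the trend value then depends only on the first and last samples. In the example, (75 - 50) / 5 = 5.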
In S402, if the calculation force variation trend value is less than zero and the newly obtained real-time calculation force value is less than a preset valley value, the node to be evaluated is unbound, and the calculation amount currently allocated to the node to be evaluated is reallocated to other idle target nodes.
If the computing power variation trend value is less than zero and the most recently obtained real-time computing power value is less than the preset valley value, the node to be evaluated is considered unable to support normal processing of the computing task; the node to be evaluated is unbound, and the computation amount currently allocated to it is reallocated to other idle target nodes. The valley value can be set according to the actual application scenario, for example to 10 GOPS. If the computing power variation trend value is less than zero but the most recently obtained real-time computing power value is greater than or equal to the valley value, no operation is performed and computing power monitoring continues, because the computing power value of the node to be evaluated may still rise. If, by the end of the monitoring time period, all the obtained real-time computing power values have been greater than or equal to the valley value and the computing power variation trend value has remained less than zero, the computing power value of the node to be evaluated is considered abnormal over a long period; the node to be evaluated is likewise unbound at the end of the monitoring time period, and the computation amount currently allocated to it is reallocated to other idle target nodes.
In S403, if the calculation force variation trend value is greater than or equal to zero and the newly obtained real-time calculation force value is greater than or equal to the average calculation force value, reassigning the calculation amount initially assigned to the node to be evaluated.
If the computing power variation trend value is greater than or equal to zero and the most recently obtained real-time computing power value is greater than or equal to the average computing power value, the computing power value of the node to be evaluated is considered to have recovered to the level of the average computing power value, and the computation amount initially allocated to the node to be evaluated is allocated to it again.
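The branches of S402 and S403 can be condensed into one decision function, sketched below. The name `decide`, the returned labels and the `period_over` flag are hypothetical. As a simplification, the sketch looks at a single trend value when the monitoring time period ends, rather than verifying that the trend value stayed below zero throughout the period.

```python
def decide(trend, latest_power, avg_power, valley, period_over=False):
    """Return the action to take for a node to be evaluated.

    "unbind"          -> unbind the node and move its work elsewhere
    "restore"         -> give the node its initial computation amount again
    "keep_monitoring" -> no operation, continue monitoring
    """
    if trend < 0 and latest_power < valley:
        return "unbind"                     # S402: falling and below valley
    if trend < 0 and period_over:
        return "unbind"                     # still falling when period ends
    if trend >= 0 and latest_power >= avg_power:
        return "restore"                    # S403: recovered to average
    return "keep_monitoring"


# Hypothetical figures: 60 GOPS average, 10 GOPS valley value.
action = decide(trend=-5, latest_power=8, avg_power=60, valley=10)
```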
As can be seen from the embodiment shown in fig. 4, in the embodiment of the present invention, computing power is monitored during the preset monitoring time period and the computing power variation trend value of the node to be evaluated is analyzed. If the computing power variation trend value is less than zero and the most recently obtained real-time computing power value is less than the preset valley value, the node to be evaluated is unbound, and the computation amount currently allocated to it is reallocated to other idle target nodes. If the computing power variation trend value is greater than or equal to zero and the most recently obtained real-time computing power value is greater than or equal to the average computing power value, the computation amount initially allocated to the node to be evaluated is allocated to it again. By monitoring the computing power of the node to be evaluated during the monitoring time period and performing different operations, such as unbinding or restoring the initial allocation, according to the monitoring result, the embodiment of the invention makes the allocation track the changes in the computing power value of the node to be evaluated more closely and improves the applicability of the allocation.
Fig. 5 shows a method that, on the basis of the first embodiment of the present invention, refines the process of performing computing power statistics on each bound computing node to obtain the average computing power value of each computing node. It is an implementation flowchart of a task allocation method based on load allocation provided by an embodiment of the present invention. As shown in fig. 5, the task allocation method may include the following steps:
In S501, the bound computing nodes located in the same processor are classified into a node group, where the node group includes at least one computing node.
Because the configuration parameters read from a processor may be inaccurate owing to processor aging or differences in specific configuration, and because the computing power values of the computing nodes within one processor are not necessarily equal, the embodiment of the present invention performs computing power statistics on the computing nodes according to how they actually execute tasks. First, the bound computing nodes located in the same processor are classified into a node group, where a node group includes at least one computing node. It should be understood that a node group only indicates that the computing nodes located in the same processor are classified together, and does not refer to a specific storage format. For example, if the bound computing nodes Node_J and Node_K are central processor cores and the bound computing nodes Node_L and Node_M are graphics processor cores, Node_J and Node_K are classified into one node group, and Node_L and Node_M are classified into another node group.
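Grouping bound computing nodes by processor can be sketched as below. The function name `group_by_processor` and the processor labels `"cpu"` and `"gpu"` are hypothetical; as the description notes, a node group does not imply any particular storage format, so the dictionary is only one convenient choice.

```python
from collections import defaultdict


def group_by_processor(bound_nodes):
    """Group node names by the processor they belong to.

    bound_nodes: iterable of (node_name, processor_id) pairs.
    Returns processor_id -> list of node names (each list is non-empty).
    """
    groups = defaultdict(list)
    for node, processor in bound_nodes:
        groups[processor].append(node)
    return dict(groups)


groups = group_by_processor([
    ("Node_J", "cpu"), ("Node_K", "cpu"),
    ("Node_L", "gpu"), ("Node_M", "gpu"),
])
```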
In S502, a preset statistical task is fragmented according to the number of the computing nodes in the node group, and the fragmented statistical task is distributed to each of the computing nodes in the node group for processing.
After at least one node group is obtained, for each node group, a preset statistical task is fragmented according to the number of computing nodes in the node group, and the fragments are distributed to the computing nodes in the node group for processing. For example, if the computing nodes in a node group are Node_N and Node_O and the total computation amount of the statistical task is 100G operations, the total is divided in two: 50G operations are assigned to Node_N for processing, and the other 50G operations are assigned to Node_O. As another example, if the computing nodes in a node group are Node_P, Node_Q and Node_R and the statistical task totals 15 gigabytes (GB) of data, the total is divided into three equal parts: 5 GB of data is assigned to Node_P, another 5 GB to Node_Q, and the remaining 5 GB to Node_R. Preferably, the preset statistical task has the same function as the computing task, that is, it is the same kind of task, which makes the statistics better suited to the subsequent execution of the computing task.
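The equal fragmentation in S502 can be sketched as follows; the function name `fragment` is hypothetical, and the unit of `total_amount` (operations or bytes) is left to the caller, as in the two examples.

```python
def fragment(total_amount, nodes):
    """Split a statistical task equally among the nodes of one node group.

    total_amount: size of the whole statistical task (any unit).
    nodes: non-empty list of node names in the group.
    """
    share = total_amount / len(nodes)
    return {node: share for node in nodes}


ops_plan = fragment(100, ["Node_N", "Node_O"])             # G operations
data_plan = fragment(15, ["Node_P", "Node_Q", "Node_R"])   # GB of data
```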
In S503, the processing duration of each computing node is obtained, and the average computing power value is calculated according to the processing duration and the computation amount of the fragmented statistical task, where the processing duration is the time the computing node takes to complete its fragment of the statistical task.
After the computation amount of the statistical task is distributed to the computing nodes in the node group, the processing duration of each computing node is obtained, and the average computing power value of the computing node is obtained by dividing the computation amount by the processing duration, where the processing duration is the time the computing node takes to complete its fragment of the statistical task. For example, if the processing duration of a computing node is 2 seconds and the computation amount processed is 100G operations, the average computing power value of the computing node is 100G operations / 2 seconds = 50 GOPS.
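The calculation in S503 can be sketched as below. The names `average_power` and `measure` are hypothetical, and timing a Python callable with `time.perf_counter` merely stands in for however a real system would time a fragment running on a specific computing node.

```python
import time


def average_power(ops, duration_s):
    """Average computing power value = computation amount / processing duration."""
    if duration_s <= 0:
        raise ValueError("processing duration must be positive")
    return ops / duration_s


def measure(workload, ops):
    """Run one fragment and derive its average computing power value.

    workload: zero-argument callable that performs the fragment.
    ops: computation amount of that fragment, in the caller's unit.
    """
    start = time.perf_counter()
    workload()
    elapsed = time.perf_counter() - start
    return average_power(ops, elapsed)


# Figures from the example: 100G operations in 2 seconds.
gops = average_power(100, 2)
```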
As can be seen from the embodiment shown in fig. 5, in the embodiment of the present invention, computing nodes that are bound and located in the same processor are classified into a node group, a preset statistical task is fragmented according to the number of the computing nodes in the node group, the fragmented statistical task is respectively distributed to each computing node in the node group for processing, then the processing time of each computing node is obtained, and an average computation value is calculated according to the processing time and the computation amount of the fragmented statistical task. According to the embodiment of the invention, the average calculation force value is obtained by enabling the calculation node to process the statistical task, so that the accuracy of the average calculation force value is improved, and accurate calculation amount distribution is conveniently carried out according to the average calculation force value.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Fig. 6 is a block diagram illustrating a structure of a task allocation apparatus based on load scheduling according to an embodiment of the present invention, and referring to fig. 6, the task allocation apparatus includes:
the calculation force counting unit 61 is used for binding at least two calculation nodes and carrying out calculation force counting on each bound calculation node to obtain an average calculation force value of each calculation node, wherein the calculation nodes are central processor cores, graphics processor cores or neural network processor cores;
a first allocating unit 62, configured to obtain a task force value of the computing task if the task type of the computing task is a timing task, analyze the computing node that meets the task force value according to the average force value, determine the analyzed computing node as a target node, and allocate the computing task to all the target nodes, where a computation amount allocated to the target node corresponds to the average force value of the target node;
a second allocating unit 63, configured to determine each bound computing node as the target node if the task type is the maximum computation power task, and allocate the computing task to all the target nodes.
Optionally, the task allocation device further includes:
the monitoring unit is used for carrying out calculation force monitoring on the target node which is distributed with the calculation task to obtain a real-time calculation force value of the target node;
and the evaluation unit is used for determining the target node as the node to be evaluated if the real-time computing force value is smaller than the average computing force value of the target node and the absolute value of the difference value between the real-time computing force value and the average computing force value exceeds a preset fluctuation value, reducing the computing amount distributed to the node to be evaluated according to the real-time computing force value, and distributing redundant computing amount to other idle target nodes.
Optionally, the evaluation unit comprises:
the processor evaluation unit is used for determining the processor where the node to be evaluated is positioned as the processor to be evaluated;
and the reducing unit is used for uniformly reducing the calculated amount distributed to all the target nodes in the processor to be evaluated according to the real-time calculated force value, and distributing redundant calculated amount to other idle target nodes which do not belong to the processor to be evaluated.
Optionally, the evaluation unit further comprises:
the analysis unit is used for acquiring at least two real-time calculation force values of the node to be evaluated in a preset monitoring time period and analyzing a calculation force change trend value of the node to be evaluated according to all the acquired real-time calculation force values;
the unbinding unit is used for unbinding the node to be evaluated and reallocating the calculated amount currently allocated to the node to be evaluated to other idle target nodes if the calculated force variation trend value is smaller than zero and the newly acquired real-time calculated force value is smaller than a preset valley value;
and the redistribution unit is used for redistributing the calculated amount which is originally distributed to the node to be evaluated if the calculated force variation trend value is greater than or equal to zero and the newly obtained real-time calculated force value is greater than or equal to the average calculated force value.
Optionally, the calculation force statistic unit 61 includes:
the classification unit is used for classifying the bound computing nodes which are positioned on the same processor into a node group, wherein the node group comprises at least one computing node;
the fragmentation unit is used for fragmenting a preset statistical task according to the number of the computing nodes in the node group and distributing the fragmented statistical task to each computing node in the node group for processing;
and the computing unit is used for acquiring the processing time of each computing node and computing the average computation force value according to the processing time and the computation amount of the fragmented statistical task, wherein the processing time is the time the computing node takes to complete its fragment of the statistical task.
Optionally, the first distribution unit 62 comprises:
the sequencing unit is used for sequencing all the bound computing nodes according to the numerical sequence of the average computing force value to generate a task allocation sequence;
and the first allocating subunit is used for analyzing the minimum computing node meeting the task computing force value according to the task allocating sequence, determining the analyzed computing node as the target node, and allocating the computing task to the target node.
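One possible reading of the sequencing unit and the first allocating subunit is sketched below. The description here does not settle whether "minimum computing node" means the fewest nodes or the smallest sufficient node; this sketch assumes the former. The function name `select_targets`, the descending sort order, and the rule that the summed average values must reach the task value are all assumptions made for illustration.

```python
def select_targets(avg_power, required_power):
    """Pick the fewest nodes whose summed average power meets the task value.

    avg_power: node name -> average computing power value.
    required_power: task computing power value of the timing task.
    Returns a list of node names, or None if the bound nodes together
    cannot satisfy the task value.
    """
    # Task allocation sequence: strongest node first (assumed order).
    order = sorted(avg_power, key=avg_power.get, reverse=True)
    chosen, total = [], 0
    for node in order:
        if total >= required_power:
            break
        chosen.append(node)
        total += avg_power[node]
    return chosen if total >= required_power else None


# Hypothetical average values in GOPS.
nodes = {"a": 60, "b": 40, "c": 100}
targets = select_targets(nodes, 150)
```

Taking the strongest nodes first yields the smallest possible number of nodes for a given task value, which is why the greedy order is used in this sketch.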
Therefore, the task allocation device based on load allocation provided by the embodiment of the invention allocates tasks according to the average calculation force value and the task type of the calculation task, and the flexibility of task allocation and the efficiency of task processing are improved.
Fig. 7 is a schematic diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 7, the terminal device 7 of this embodiment includes: a processor 70, a memory 71 and a computer program 72 stored in said memory 71 and operable on said processor 70, such as a load-scheduling based task allocation program. The processor 70, when executing the computer program 72, implements the steps of the above-mentioned various load-scheduling-based task allocation method embodiments, such as the steps S101 to S103 shown in fig. 1. Alternatively, the processor 70, when executing the computer program 72, implements the functions of the units in the above-described embodiments of the task assigning apparatus, such as the functions of the units 61 to 63 shown in fig. 6.
Illustratively, the computer program 72 may be divided into one or more units, which are stored in the memory 71 and executed by the processor 70 to accomplish the present invention. The one or more units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 72 in the terminal device 7. For example, the computer program 72 may be divided into a computation force statistics unit, a first distribution unit and a second distribution unit, each unit having the following specific functions:
the calculation force counting unit is used for binding at least two calculation nodes and carrying out calculation force statistics on each bound calculation node to obtain an average calculation force value of each calculation node, wherein the calculation nodes are central processor cores, graphics processor cores or neural network processor cores;
the first distribution unit is used for acquiring a task force value of the computing task if the task type of the computing task is a timing task, analyzing the computing nodes meeting the task force value according to the average force value, determining the analyzed computing nodes as target nodes, and distributing the computing task to all the target nodes, wherein the calculated amount distributed to the target nodes corresponds to the average force value of the target nodes;
and the second distributing unit is used for determining each bound computing node as the target node and distributing the computing task to all the target nodes if the task type is the maximum computing power task.
The terminal device 7 may be a computing device such as a desktop computer, a notebook, a palm computer, and a cloud server. The terminal device may include, but is not limited to, a processor 70, a memory 71. It will be appreciated by those skilled in the art that fig. 7 is merely an example of a terminal device 7 and does not constitute a limitation of the terminal device 7 and may comprise more or less components than shown, or some components may be combined, or different components, for example the terminal device may further comprise input output devices, network access devices, buses, etc.
The processor 70 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 71 may be an internal storage unit of the terminal device 7, such as a hard disk or a memory of the terminal device 7. The memory 71 may also be an external storage device of the terminal device 7, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 7. Further, the memory 71 may also include both an internal storage unit and an external storage device of the terminal device 7. The memory 71 is used for storing the computer program and other programs and data required by the terminal device. The memory 71 may also be used to temporarily store data that has been output or is to be output.
It is obvious to those skilled in the art that, for convenience and simplicity of description, the above division of each functional unit is only used for illustration, and in practical applications, the above function distribution may be performed by different functional units according to needs, that is, the internal structure of the terminal device is divided into different functional units to perform all or part of the above described functions. Each functional unit in the embodiments may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the application. The specific working process of the units in the system may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed terminal device and method may be implemented in other ways. For example, the terminal device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in another form.
The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the methods in the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments are implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content covered by the computer-readable medium may be appropriately expanded or restricted according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals under legislation and patent practice.
The above-mentioned embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (9)

1. A task allocation method based on load allocation, characterized by comprising the following steps:
binding at least two computing nodes, and performing computing power statistics on each bound computing node to obtain an average computing power value of each computing node, wherein the computing nodes are central processor cores, graphics processor cores, or neural network processor cores;
if the task type of a computing task is a timing task, acquiring a task computing power value of the computing task, determining by analysis, according to the average computing power values, the computing nodes that satisfy the task computing power value, determining those computing nodes as target nodes, and allocating the computing task to all the target nodes, wherein the computation amount allocated to each target node corresponds to the average computing power value of that target node;
if the task type is a maximum computing power task, determining each bound computing node as a target node, and allocating the computing task to all the target nodes, wherein a maximum computing power task is a task that is to be processed as fast as possible and has no fixed timeliness requirement;
performing computing power monitoring on the target nodes to which the computing task has been allocated to obtain a real-time computing power value of each target node;
if the real-time computing power value is smaller than the average computing power value of the target node, and the absolute value of the difference between the real-time computing power value and the average computing power value exceeds a preset fluctuation value, determining the target node as a node to be evaluated, reducing the computation amount allocated to the node to be evaluated according to the real-time computing power value, and allocating the excess computation amount to other idle target nodes.
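As a non-limiting illustration of the allocation and monitoring steps of claim 1, the following Python sketch may be considered. It rests on assumptions that the claim does not state: the computation amount is split strictly in proportion to the average computing power values, the reduced share is scaled by the ratio of the real-time value to the average value, and the excess is spread over the idle nodes in proportion to their average values. The names `Node`, `allocate_proportionally`, `is_degraded`, and `rebalance` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    avg_power: float       # average computing power value from the statistics step
    assigned: float = 0.0  # computation amount currently allocated to this node


def allocate_proportionally(nodes, total_amount):
    """Split total_amount across nodes in proportion to avg_power (assumed rule)."""
    total_power = sum(n.avg_power for n in nodes)
    for n in nodes:
        n.assigned = total_amount * n.avg_power / total_power


def is_degraded(node, realtime_power, fluctuation):
    """True when the real-time value is below the average value and the
    absolute difference exceeds the preset fluctuation value."""
    below = realtime_power < node.avg_power
    return below and abs(realtime_power - node.avg_power) > fluctuation


def rebalance(node, realtime_power, idle_nodes):
    """Shrink the degraded node's share and hand the excess to idle nodes.

    The scaling rule (realtime / average) is an assumption for illustration.
    Returns the excess computation amount that was moved.
    """
    reduced = node.assigned * realtime_power / node.avg_power
    excess = node.assigned - reduced
    node.assigned = reduced
    idle_power = sum(n.avg_power for n in idle_nodes)
    for n in idle_nodes:
        n.assigned += excess * n.avg_power / idle_power
    return excess
```

In this sketch a node whose real-time value stays within the fluctuation band is left untouched, which mirrors the two-part condition of the final step of the claim.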
2. The task allocation method according to claim 1, wherein the determining the target node as a node to be evaluated, reducing the computation amount allocated to the node to be evaluated according to the real-time computing power value, and allocating the excess computation amount to other idle target nodes comprises:
determining the processor on which the node to be evaluated is located as a processor to be evaluated;
and uniformly reducing, according to the real-time computing power value, the computation amount allocated to all the target nodes in the processor to be evaluated, and allocating the excess computation amount to other idle target nodes that do not belong to the processor to be evaluated.
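The processor-wide reduction of claim 2 may be sketched as follows, purely by way of example. The sketch assumes that "uniformly" means applying the same ratio (real-time value divided by average value of the node to be evaluated) to every target node on the same processor, and that all nodes on other processors are idle and share the excess equally. The function name `reduce_processor_share` and its parameters are hypothetical.

```python
def reduce_processor_share(assigned, processor_of, degraded_node, ratio):
    """Return a new allocation after a processor-wide reduction.

    assigned:      dict mapping node name -> allocated computation amount
    processor_of:  dict mapping node name -> processor identifier
    degraded_node: name of the node to be evaluated
    ratio:         real-time value / average value of that node (assumed rule)
    """
    proc = processor_of[degraded_node]
    on_proc = [n for n in assigned if processor_of[n] == proc]
    others = [n for n in assigned if processor_of[n] != proc]
    if not others:
        raise ValueError("no target node outside the processor to be evaluated")

    updated = dict(assigned)
    excess = 0.0
    for n in on_proc:
        cut = assigned[n] * (1.0 - ratio)  # same ratio for every node on the processor
        updated[n] = assigned[n] - cut
        excess += cut
    for n in others:
        updated[n] += excess / len(others)  # equal split is an assumption
    return updated
```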
3. The task allocation method according to claim 1, wherein after reducing the computation amount allocated to the node to be evaluated according to the real-time computing power value and allocating the excess computation amount to other idle target nodes, the method further comprises:
acquiring at least two real-time computing power values of the node to be evaluated within a preset monitoring time period, and determining a computing power change trend value of the node to be evaluated from all the acquired real-time computing power values;
if the computing power change trend value is smaller than zero and the most recently acquired real-time computing power value is smaller than a preset valley value, unbinding the node to be evaluated and reallocating the computation amount currently allocated to the node to be evaluated to other idle target nodes;
and if the computing power change trend value is greater than or equal to zero and the most recently acquired real-time computing power value is greater than or equal to the average computing power value, reallocating the computation amount that was initially allocated to the node to be evaluated.
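The trend-based decision of claim 3 may be illustrated by the sketch below. The claim does not define how the change trend value is computed; here it is assumed to be the least-squares slope of samples taken at equal intervals. The functions `trend` and `decide` and the returned labels are hypothetical, and the label "restore" reflects one reading of the final step, namely that the initial computation amount is given back to the node.

```python
def trend(samples):
    """Least-squares slope of samples taken at equally spaced times (assumed metric)."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two real-time values are required")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def decide(samples, avg_power, valley):
    """Classify the node to be evaluated from its monitored values.

    Returns "unbind", "restore", or "keep" (the last covers cases the
    claim leaves unspecified).
    """
    slope = trend(samples)
    latest = samples[-1]
    if slope < 0 and latest < valley:
        return "unbind"
    if slope >= 0 and latest >= avg_power:
        return "restore"
    return "keep"
```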
4. The task allocation method according to claim 1, wherein the performing computing power statistics on each bound computing node to obtain an average computing power value of each computing node comprises:
classifying the bound computing nodes located on the same processor into a node group, wherein the node group comprises at least one computing node;
fragmenting a preset statistical task according to the number of computing nodes in the node group, and distributing the fragments of the statistical task to the computing nodes in the node group for processing;
acquiring the processing time of each computing node, and calculating the average computing power value according to the processing time and the computation amount of the fragment of the statistical task, wherein the processing time is the time taken by the computing node to process its fragment of the statistical task.
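By way of example only, the statistics of claim 4 may be sketched as follows. The sketch assumes that the statistical task is split into equal fragments and that the average computing power value is the fragment's computation amount divided by the processing time; the claim only states that the value is calculated from these two quantities. The names `group_by_processor` and `average_power` are hypothetical.

```python
from collections import defaultdict


def group_by_processor(nodes):
    """Group (node_id, processor_id) pairs into node groups keyed by processor."""
    groups = defaultdict(list)
    for node_id, processor_id in nodes:
        groups[processor_id].append(node_id)
    return dict(groups)


def average_power(task_amount, group, processing_times):
    """Estimate each node's computing power from an equally fragmented task.

    task_amount:      computation amount of the preset statistical task
    group:            list of node ids in one node group
    processing_times: dict mapping node id -> time spent on its fragment
    """
    fragment = task_amount / len(group)  # equal fragments are an assumption
    return {node: fragment / processing_times[node] for node in group}
```

For instance, with a task amount of 1000 split over two nodes that need 2 and 4 time units respectively, the sketch yields values of 250 and 125 per time unit.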
5. The task allocation method according to claim 1, wherein the determining by analysis, according to the average computing power values, the computing nodes that satisfy the task computing power value, determining those computing nodes as target nodes, and allocating the computing task to the target nodes comprises:
sorting all the bound computing nodes in numerical order of their average computing power values to generate a task allocation sequence;
determining by analysis, according to the task allocation sequence, the smallest computing node that satisfies the task computing power value, determining that computing node as the target node, and allocating the computing task to the target node.
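The selection of claim 5 may be sketched as below, under the assumptions that the task allocation sequence is sorted in ascending order and that a node "satisfies" the task computing power value when its average value is greater than or equal to it. The function name `select_target` is hypothetical, and returning `None` when no node qualifies is a choice made for this sketch, not something the claim specifies.

```python
def select_target(avg_powers, task_power):
    """Pick the node with the smallest average power that still meets task_power.

    avg_powers: dict mapping node id -> average computing power value
    task_power: task computing power value required by the timing task
    """
    # Ascending order forms the task allocation sequence (assumed direction).
    sequence = sorted(avg_powers.items(), key=lambda item: item[1])
    for node, power in sequence:
        if power >= task_power:
            return node
    return None
```

Choosing the smallest sufficient node leaves the more powerful nodes free for later tasks with higher requirements.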
6. A task allocation apparatus based on load allocation, comprising:
a computing power statistics unit, configured to bind at least two computing nodes and perform computing power statistics on each bound computing node to obtain an average computing power value of each computing node, wherein the computing nodes are central processor cores, graphics processor cores, or neural network processor cores;
a first allocation unit, configured to, if the task type of a computing task is a timing task, acquire a task computing power value of the computing task, determine by analysis, according to the average computing power values, the computing nodes that satisfy the task computing power value, determine those computing nodes as target nodes, and allocate the computing task to all the target nodes, wherein the computation amount allocated to each target node corresponds to the average computing power value of that target node;
a second allocation unit, configured to, if the task type is a maximum computing power task, determine each bound computing node as a target node and allocate the computing task to all the target nodes, wherein a maximum computing power task is a task that is to be processed as fast as possible and has no fixed timeliness requirement;
a monitoring unit, configured to perform computing power monitoring on the target nodes to which the computing task has been allocated to obtain a real-time computing power value of each target node;
and an evaluation unit, configured to, if the real-time computing power value is smaller than the average computing power value of the target node and the absolute value of the difference between the real-time computing power value and the average computing power value exceeds a preset fluctuation value, determine the target node as a node to be evaluated, reduce the computation amount allocated to the node to be evaluated according to the real-time computing power value, and allocate the excess computation amount to other idle target nodes.
7. A task allocation terminal device based on load allocation, wherein the terminal device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor executes the computer program to implement the following steps:
binding at least two computing nodes, and performing computing power statistics on each bound computing node to obtain an average computing power value of each computing node, wherein the computing nodes are central processor cores, graphics processor cores, or neural network processor cores;
if the task type of a computing task is a timing task, acquiring a task computing power value of the computing task, determining by analysis, according to the average computing power values, the computing nodes that satisfy the task computing power value, determining those computing nodes as target nodes, and allocating the computing task to all the target nodes, wherein the computation amount allocated to each target node corresponds to the average computing power value of that target node;
if the task type is a maximum computing power task, determining each bound computing node as a target node, and allocating the computing task to all the target nodes, wherein a maximum computing power task is a task that is to be processed as fast as possible and has no fixed timeliness requirement;
performing computing power monitoring on the target nodes to which the computing task has been allocated to obtain a real-time computing power value of each target node;
if the real-time computing power value is smaller than the average computing power value of the target node, and the absolute value of the difference between the real-time computing power value and the average computing power value exceeds a preset fluctuation value, determining the target node as a node to be evaluated, reducing the computation amount allocated to the node to be evaluated according to the real-time computing power value, and allocating the excess computation amount to other idle target nodes.
8. The terminal device of claim 7, further comprising:
performing computing power monitoring on the target nodes to which the computing task has been allocated to obtain a real-time computing power value of each target node;
if the real-time computing power value is smaller than the average computing power value of the target node, and the absolute value of the difference between the real-time computing power value and the average computing power value exceeds a preset fluctuation value, determining the target node as a node to be evaluated, reducing the computation amount allocated to the node to be evaluated according to the real-time computing power value, and allocating the excess computation amount to other idle target nodes.
9. A computer-readable storage medium storing a computer program which, when executed by a processor, carries out the steps of the task allocation method according to any one of claims 1 to 5.
CN201811502296.1A 2018-12-10 2018-12-10 Task allocation method and device based on load allocation and terminal equipment Active CN109783224B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811502296.1A CN109783224B (en) 2018-12-10 2018-12-10 Task allocation method and device based on load allocation and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811502296.1A CN109783224B (en) 2018-12-10 2018-12-10 Task allocation method and device based on load allocation and terminal equipment

Publications (2)

Publication Number Publication Date
CN109783224A CN109783224A (en) 2019-05-21
CN109783224B true CN109783224B (en) 2022-10-14

Family

ID=66495796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811502296.1A Active CN109783224B (en) 2018-12-10 2018-12-10 Task allocation method and device based on load allocation and terminal equipment

Country Status (1)

Country Link
CN (1) CN109783224B (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110231987A (en) * 2019-06-21 2019-09-13 深圳市网心科技有限公司 A kind of data processing method and relevant apparatus
CN110928676B (en) * 2019-07-18 2022-03-11 国网浙江省电力有限公司衢州供电公司 Power CPS load distribution method based on performance evaluation
CN110502323B (en) * 2019-07-18 2022-02-18 国网浙江省电力有限公司衢州供电公司 Real-time scheduling method for cloud computing tasks
CN110851529B (en) * 2019-11-01 2024-05-28 腾讯科技(深圳)有限公司 Computing power scheduling method and related equipment
CN110837421B (en) * 2019-11-13 2022-09-20 北京知道创宇信息技术股份有限公司 Task allocation method and device
CN111090517A (en) * 2019-11-26 2020-05-01 中国建设银行股份有限公司 Job scheduling method, device, equipment and storage medium
CN111580974B (en) * 2020-05-08 2023-06-27 抖音视界有限公司 GPU instance allocation method, device, electronic equipment and computer readable medium
CN114077524A (en) * 2020-08-07 2022-02-22 展讯半导体(南京)有限公司 Computing power sharing exception reporting, processing method and device, storage medium, and terminal equipment
CN112003930A (en) * 2020-08-21 2020-11-27 深圳柏成科技有限公司 Task allocation method, device, equipment and storage medium
CN112150262B (en) * 2020-09-29 2023-09-19 中国银行股份有限公司 Account checking data processing method and device
CN114691351A (en) * 2020-12-31 2022-07-01 维沃移动通信有限公司 Information processing method, device, equipment and storage medium
CN112835703B (en) * 2021-02-26 2024-04-26 大众问问(北京)信息科技有限公司 Task processing method, device, equipment and storage medium
CN113641124B (en) * 2021-08-06 2023-03-10 珠海格力电器股份有限公司 Calculation force distribution method and device, controller and building control system
CN113626200A (en) * 2021-08-24 2021-11-09 Oppo广东移动通信有限公司 Task load calculation method, device, storage medium and terminal
CN113886089B (en) * 2021-10-21 2024-01-26 上海勃池信息技术有限公司 Task processing method, device, system, equipment and medium
CN114040479B (en) * 2021-10-29 2023-04-28 中国联合网络通信集团有限公司 Method and device for selecting computing power node and computer readable storage medium
CN114880106B (en) * 2021-11-17 2025-01-28 中信科智联科技有限公司 A computing power matching method, device and management platform
CN113934545A (en) * 2021-12-17 2022-01-14 飞诺门阵(北京)科技有限公司 Video data scheduling method, system, electronic equipment and readable medium
CN114708480B (en) * 2022-03-04 2024-11-01 深圳海星智驾科技有限公司 Load balancing control method and device and electronic equipment
CN114579317A (en) * 2022-03-17 2022-06-03 北京爱笔科技有限公司 Model testing method and system
CN114661441A (en) * 2022-03-25 2022-06-24 深圳海星智驾科技有限公司 Computing platform, computing force expansion method and device of computing platform
CN115359427B (en) * 2022-08-30 2025-09-02 深圳市芯存科技有限公司 Intelligent security method, storage medium and device
CN115794561B (en) * 2022-12-28 2023-08-04 声龙(新加坡)私人有限公司 Calculation power monitoring method, device and storage medium of calculation power server
CN115729715B (en) * 2023-01-10 2023-09-01 摩尔线程智能科技(北京)有限责任公司 Load distribution method, device, equipment and medium for GPU (graphics processing Unit) system
CN117009089B (en) * 2023-09-28 2023-12-12 南京庆文信息科技有限公司 Robot cluster supervision and management system based on distributed computing and UWB positioning
CN117472551B (en) * 2023-12-27 2024-03-01 四川弘智远大科技有限公司 Cloud computing hardware acceleration control system and method based on GPU integration
CN117632380B (en) * 2024-01-25 2024-08-20 泰德网聚(北京)科技股份有限公司 Low-code workflow system for automatically generating script based on user demand

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101278264A (en) * 2005-09-28 2008-10-01 英特尔公司 Reliable computing with a many-core processor
WO2011102219A1 (en) * 2010-02-19 2011-08-25 日本電気株式会社 Real time system task configuration optimization system for multi-core processors, and method and program
CN108776934A (en) * 2018-05-15 2018-11-09 中国平安人寿保险股份有限公司 Distributed data computational methods, device, computer equipment and readable storage medium storing program for executing

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7941805B2 (en) * 2006-08-15 2011-05-10 International Business Machines Corporation Affinity dispatching load balancer with precise CPU consumption data
EP2427022B1 (en) * 2010-09-06 2016-11-09 ABB Research Ltd. Method for reassigning the role of a wireless node in a wireless network
US8645454B2 (en) * 2010-12-28 2014-02-04 Canon Kabushiki Kaisha Task allocation multiple nodes in a distributed computing system
KR101812583B1 (en) * 2011-07-21 2018-01-30 삼성전자주식회사 Apparatus or task assignment, method for task assignment and a computer-readable storage medium
CN103793272B (en) * 2013-12-27 2017-05-24 北京天融信软件有限公司 Periodical task scheduling method and periodical task scheduling system
CN106528280B (en) * 2015-09-15 2019-10-29 阿里巴巴集团控股有限公司 A kind of method for allocating tasks and system
US11625738B2 (en) * 2016-08-28 2023-04-11 Vmware, Inc. Methods and systems that generated resource-provision bids in an automated resource-exchange system
CN108268318A (en) * 2016-12-30 2018-07-10 华为技术有限公司 A kind of method and apparatus of distributed system task distribution
CN108710537A (en) * 2018-04-09 2018-10-26 平安科技(深圳)有限公司 A kind of task processing method, storage medium and server

Also Published As

Publication number Publication date
CN109783224A (en) 2019-05-21

Similar Documents

Publication Publication Date Title
CN109783224B (en) Task allocation method and device based on load allocation and terminal equipment
CN111176852B (en) Resource allocation method, device, chip and computer readable storage medium
CN111176792B (en) Resource scheduling method and device and related equipment
US10558498B2 (en) Method for scheduling data flow task and apparatus
CN108446176B (en) Task allocation method, computer readable storage medium and terminal device
CN110119876B (en) Work order processing method and device
CN110231987A (en) A kind of data processing method and relevant apparatus
CN112148468B (en) Resource scheduling method and device, electronic equipment and storage medium
CN110351375B (en) A data processing method, device and computer device, and readable storage medium
WO2018095066A1 (en) Method and device for task grouping, electronic device, and computer storage medium
CN109345108A (en) Task allocation method, device, device and storage medium
CN110347602B (en) Method and device for executing multitasking script, electronic equipment and readable storage medium
CN112190927B (en) Game resource allocation method based on cloud computing and cloud game service platform
CN112559147A (en) Dynamic matching algorithm, system and equipment based on GPU resource occupation characteristics
CN111680085A (en) Data processing task analysis method and device, electronic equipment and readable storage medium
CN118394592B (en) A Paas platform based on cloud computing
CN111176833A (en) Task allocation method and system for multiprocessing nodes
CN116467082A (en) A resource allocation method and system based on big data
CN112988383A (en) Resource allocation method, device, equipment and storage medium
WO2016206441A1 (en) Method and device for allocating virtual resource, and computer storage medium
CN109324898B (en) Service processing method and system
CN113886086A (en) Cloud platform computing resource allocation method, system, terminal and storage medium
US20140047454A1 (en) Load balancing in an sap system
CN113419863A (en) Data distribution processing method and device based on node capability
CN111796934A (en) Task issuing method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant