
CN119382159A - Intelligent decision-making method and system for distribution network based on knowledge embedding and multi-agent system - Google Patents

Intelligent decision-making method and system for distribution network based on knowledge embedding and multi-agent system

Info

Publication number
CN119382159A
CN119382159A
Authority
CN
China
Prior art keywords
network
agent
scheduling
energy
distribution network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411507652.4A
Other languages
Chinese (zh)
Inventor
陈赟
王佳裕
潘智俊
赵文恺
罗潇
林震宇
汤蕾
傅超然
洪祎祺
王晓慧
贺兴
马墅研
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Shanghai Electric Power Co Ltd
Original Assignee
State Grid Shanghai Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Shanghai Electric Power Co Ltd filed Critical State Grid Shanghai Electric Power Co Ltd
Priority to CN202411507652.4A priority Critical patent/CN119382159A/en
Publication of CN119382159A publication Critical patent/CN119382159A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for AC mains or AC distribution networks
    • H02J3/12Circuit arrangements for AC mains or AC distribution networks for adjusting voltage in AC networks by changing a characteristic of the network load
    • H02J3/14Circuit arrangements for AC mains or AC distribution networks for adjusting voltage in AC networks by changing a characteristic of the network load by switching loads on to, or off from, network, e.g. progressively balanced loading
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for AC mains or AC distribution networks
    • H02J3/28Arrangements for balancing of the load in a network by storage of energy
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for AC mains or AC distribution networks
    • H02J3/38Arrangements for parallely feeding a single network by two or more generators, converters or transformers
    • H02J3/381Dispersed generators
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for AC mains or AC distribution networks
    • H02J3/38Arrangements for parallely feeding a single network by two or more generators, converters or transformers
    • H02J3/46Controlling of the sharing of output between the generators, converters, or transformers
    • H02J3/466Scheduling the operation of the generators, e.g. connecting or disconnecting generators to meet a given demand
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/10Power transmission or distribution systems management focussing at grid-level, e.g. load flow analysis, node profile computation, meshed network optimisation, active network management or spinning reserve management
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2203/00Indexing scheme relating to details of circuit arrangements for AC mains or AC distribution networks
    • H02J2203/20Simulating, e.g. planning, reliability check, modelling or computer assisted design [CAD]
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2300/00Systems for supplying or distributing electric power characterised by decentralized, dispersed, or local generation
    • H02J2300/20The dispersed energy generation being of renewable origin
    • H02J2300/22The renewable source being solar energy
    • H02J2300/24The renewable source being solar energy of photovoltaic origin
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J2300/00Systems for supplying or distributing electric power characterised by decentralized, dispersed, or local generation
    • H02J2300/20The dispersed energy generation being of renewable origin
    • H02J2300/28The renewable source being wind energy

Landscapes

  • Engineering & Computer Science (AREA)
  • Power Engineering (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention relates to an intelligent decision-making method and system for a power distribution network based on knowledge embedding and a multi-agent system. The method comprises the following steps: collecting operation data of each distributed energy device in the power distribution network in real time and preprocessing the data; and, based on the preprocessed operation data and a distributed multi-layer energy control architecture built on a multi-agent system, making intelligent energy scheduling decisions by a method combining expert knowledge embedding with reinforcement learning, thereby completing the decision process. Compared with the prior art, the invention achieves efficient distributed energy scheduling in complex and changeable energy scenarios, and improves the intelligence, flexibility and robustness of power distribution network scheduling.

Description

Intelligent decision-making method and system for power distribution network based on knowledge embedding and multi-agent system
Technical Field
The invention relates to the technical field of smart power grids, and in particular to an intelligent decision-making method and system for a power distribution network based on knowledge embedding and a multi-agent system.
Background
With ever-increasing energy demand and the rapid development of distributed energy resources (such as photovoltaic generation, wind power, energy storage systems and electric vehicles), the complexity of modern power systems has outgrown conventional centralized energy scheduling. The highly decentralized, unpredictable and diverse nature of distributed energy resources leaves existing energy scheduling systems performing poorly under uncertainty, real-time load changes and dynamic demand.
Traditional scheduling systems often depend on a single centralized control architecture, which makes it difficult to fully exploit the advantages of distributed energy resources and easily leads to energy waste, scheduling delay and load imbalance. In addition, as the energy internet, smart grid and related technologies continue to develop, the complexity of modern power systems keeps increasing, and so does the difficulty of scheduling and management. A more intelligent, distributed and adaptive energy scheduling method is therefore needed to cope with complex energy network scheduling demands and to realize efficient coordination and optimization of distributed energy resources. The multi-agent system, as an emerging scheduling paradigm, can realize collaborative optimization of a distributed energy system at different levels through hierarchical control and agent collaboration, thereby improving the flexibility and response speed of the system.
Patent application CN114841448A discloses a hierarchical-partition load optimization regulation and control method based on a multi-agent system. The method designs a hierarchical-partition dynamic regulation framework that, combined with a multi-agent system, aggregates widely dispersed distributed resources of various types and characteristics through cloud-edge cooperation and Internet-of-Things technology, and adopts the multi-agent system for distributed management and control to realize efficient energy interaction. On this basis, a multi-objective optimization strategy that is economical, low-polluting and safe is proposed to realize hierarchical-partition optimal regulation of demand-side resources. However, this approach relies on accurate modeling, so its performance degrades when distributed energy resource scheduling in a complex energy system involves significant complexity and uncertainty.
Therefore, those skilled in the art urgently need a multi-agent intelligent decision-making method capable of coping with the complexity and uncertainty of distributed energy resource scheduling in complex energy systems.
Disclosure of Invention
The invention aims to provide an intelligent decision-making method and system for a power distribution network based on knowledge embedding and a multi-agent system, so as to improve the utilization rate of distributed energy sources in complex environments.
The aim of the invention can be achieved by the following technical scheme:
A power distribution network intelligent decision method based on knowledge embedding and a multi-agent system comprises the following steps:
collecting operation data of each distributed energy device in the power distribution network in real time, and preprocessing;
Hierarchical division is carried out on the power distribution network, and a distributed multi-layer energy control framework based on a multi-agent system is constructed;
Based on the preprocessed operation data and a distributed multi-layer energy control architecture based on a multi-agent system, an expert knowledge embedding and reinforcement learning combined method is adopted to conduct intelligent energy scheduling decision, and a decision process is completed.
Further, the operation data of each distributed energy device comprises voltage, current and power output of photovoltaic power generation, wind power generation, energy storage equipment, electric vehicles and a power network.
Further, the preprocessing operation includes normalizing the operation data of each distributed energy device, the normalization being expressed as:

$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

where X_norm is the normalized data, X is the raw acquired data, and X_min, X_max are respectively the minimum and maximum values in the data.
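For illustration, a minimal Python sketch of this min-max normalization (the function name and sample data are ours, not the patent's):

```python
import numpy as np

def min_max_normalize(x: np.ndarray) -> np.ndarray:
    """Min-max normalization: X_norm = (X - X_min) / (X_max - X_min)."""
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:  # guard against constant signals (division by zero)
        return np.zeros_like(x, dtype=float)
    return (x - x_min) / (x_max - x_min)

# e.g., a day of photovoltaic power readings in kW
pv_power = np.array([0.0, 1.2, 3.5, 4.8, 2.1, 0.3])
print(min_max_normalize(pv_power))  # values scaled into [0, 1]
```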
Further, the step of constructing the multi-agent-system-based distributed multi-layer energy control architecture includes:
dividing the power distribution network from top to bottom into three layers, namely a distribution-network leading layer, a regional coordination layer and a device unit layer, and constructing a distributed multi-layer energy control architecture based on a multi-agent system, comprising distribution-network leading layer agents, regional coordination layer agents and device unit layer agents;
wherein the distribution-network leading layer, dominated by the distribution network, controls the whole distributed multi-layer energy control architecture through the distribution-network leading layer agent, formulates incentive signals according to the scheduling optimization target, and communicates with the regional coordination layer agents to realize cross-regional collaborative scheduling;
the regional coordination layer is used for the operation scheduling and control of the distributed energy devices within each region; each region is provided with one regional coordination layer agent, which communicates with the device unit layer agents and performs local optimization by combining the energy consumption demand within the region with the scheduling optimization target;
the device unit layer comprises multiple distributed energy devices, each provided with a device unit layer agent, which communicates with the regional coordination layer agent to obtain scheduling instructions so as to perform energy scheduling for each distributed energy device.
Further, the step of making an intelligent energy scheduling decision includes:
according to the scheduling optimization target, a scheduling model of the power distribution network is constructed, wherein the scheduling model of the power distribution network takes minimizing the operating cost as its objective, expressed as:

$$\min E_{DSO} = \sum_{t=1}^{T}\left[\lambda_t^{grid} P_t^{grid} + \sum_{i=1}^{N}\left(\lambda_{i,t}^{buy} P_{i,t}^{buy} - \lambda_{i,t}^{sell} P_{i,t}^{sell}\right) + \sum_{m=1}^{M}\Delta P_{m,t}^{loss}\right]$$

where E_DSO denotes the operating cost; P_t^grid denotes the electricity purchased by the distribution system operator from the transmission network, and λ_t^grid the corresponding purchase price; P_{i,t}^sell and P_{i,t}^buy respectively denote the electricity sold to and purchased from micro-grid i, and λ_{i,t}^sell and λ_{i,t}^buy the corresponding selling and purchasing prices; N is the number of MGs connected to the distribution network; ΔP_{m,t}^loss is the energy loss of branch m; and M is the total number of branches;
Acquiring expert knowledge from an expert knowledge base;
Based on the preprocessed operation data, the expert knowledge provides scheduling guidance, the scheduling model is solved in combination with reinforcement learning, and energy scheduling decisions are made according to the solution results.
Further, the reinforcement learning adopts a deep deterministic policy gradient (DDPG) algorithm.
Further, the deep deterministic policy gradient algorithm adopts an actor-critic dual neural network architecture, and the training steps of the actor-critic dual neural network architecture comprise:
1) Initialization: initializing the parameters θ^π of the actor network, the parameters θ^Q of the critic network, the parameters θ^{π'} of the target actor network, the parameters θ^{Q'} of the target critic network, and an experience replay buffer;
2) Exploration and interaction: the actor network selects an action according to the current policy, and the agent executes the selected action and interacts with the environment to obtain the current state, the reward and the next state;
3) Experience replay: storing the current state, action, reward and next state obtained from the interaction into the experience replay buffer;
4) Critic network update: randomly sampling a batch of experiences from the experience replay buffer; the critic network uses the sampled experiences to compute its loss function, and the critic network parameters θ^Q are updated by gradient descent;
5) Actor network update: computing the policy gradient from the output of the critic network, and updating the actor network parameters θ^π by gradient ascent, the policy gradient being computed as:

$$\nabla_{\theta^{\pi}} J \approx \frac{1}{q}\sum_{t}\nabla_{a} Q(s,a\mid\theta^{Q})\big|_{s=s_{i,t},\,a=\pi(s_{i,t})}\,\nabla_{\theta^{\pi}}\pi(s\mid\theta^{\pi})\big|_{s=s_{i,t}}$$

where ∇_{θ^π} J is the policy gradient of the objective function, π(s|θ^π) is the actor network policy, ∇_a Q and ∇_{θ^π} π are respectively the gradients of the critic network Q-function and of the actor network, n is the number of agents, q is the number of sampled experiences, and s_{i,t} is the state of the i-th agent at time t;
6) Target network update: periodically copying the parameters of the actor network and the critic network into the corresponding target actor network and target critic network, respectively;
7) Policy evaluation and improvement: defining an evaluation function that computes the expected Q-function value so as to measure the policy effect of the agent, the evaluation function being:

$$J(\pi) = \mathbb{E}_{s\sim p^{\pi}}\left[Q(s,\pi(s))\right]$$

where J(π) is the evaluation function, representing the expected value of the Q-function Q(s,π(s)) obtained by acting according to policy π when the state s follows the probability distribution p^π; Q(s,π(s)) is the Q-function value obtained when the agent acts according to policy π; and p^π is the probability distribution function;
8) Iterative execution: repeating steps 2)-7) until the iteration stop condition is met.
Further, the reward update is expressed as:

$$r'_{i,t} = r_{i,t} + \gamma\,Q'\!\left(s_{i,t+1},\pi'(s_{i,t+1}\mid\theta^{\pi'})\mid\theta^{Q'}\right),\qquad err = \frac{1}{q}\sum_{t}\left(r'_{i,t} - Q(s_{i,t},a_{i,t}\mid\theta^{Q})\right)^{2}$$

where r'_{i,t} is the target reward of the i-th agent at time t; r_{i,t} is the instant reward of the i-th agent at time t; γ is a discount factor balancing current against future rewards; Q'(·|θ^{Q'}) is the target critic network Q-function value; π'(·|θ^{π'}) is the target actor network policy; err is the error between the predicted Q value and the target Q value; a_{i,t} is the action of the i-th agent at time t; and q is the number of sampled experiences.
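A minimal numerical sketch of this target-reward and error computation under assumed array shapes (the names are illustrative; the patent prescribes the formulas, not an implementation):

```python
import numpy as np

def td_target(r, q_next, gamma=0.99):
    # r'_{i,t} = r_{i,t} + gamma * Q'(s_{i,t+1}, pi'(s_{i,t+1}))
    return r + gamma * q_next

def critic_error(q_pred, q_target):
    # err: mean squared gap between predicted and target Q over q samples
    return np.mean((q_target - q_pred) ** 2)

r = np.array([1.0, 0.5, -0.2])      # instant rewards of one agent over a batch
q_next = np.array([4.0, 3.5, 2.8])  # target critic values at the next states
q_pred = np.array([4.8, 3.9, 2.5])  # current critic predictions
print(critic_error(q_pred, td_target(r, q_next)))
```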
Further, the method also comprises the step of optimizing a scheduling model by adopting a closed-loop feedback mechanism so as to adjust the intelligent energy scheduling decision, and comprises the following steps:
Dynamically comparing an actual scheduling execution result of the power distribution network energy scheduling decision with an expected scheduling scheme, and identifying an execution deviation;
And correcting according to the execution deviation, and adaptively optimizing the scheduling model to dynamically adjust the energy scheduling decision.
The invention also provides an intelligent decision system of the power distribution network based on the knowledge embedding and multi-agent system, which comprises the following components:
The data acquisition module is used for acquiring the operation data of each distributed energy device in the power distribution network in real time and preprocessing the operation data;
The control architecture construction module is used for carrying out hierarchical division on the power distribution network and constructing a distributed multi-layer energy control architecture based on a multi-agent system;
And the scheduling decision module is used for performing intelligent energy scheduling decision by adopting a method combining expert knowledge embedding and reinforcement learning based on the preprocessed operation data and a multi-agent system-based distributed multi-layer energy control architecture to complete the decision process.
Compared with the prior art, the invention has the following beneficial effects:
(1) According to the invention, a distributed multi-layer energy control architecture based on a multi-agent system is constructed according to the power distribution network, and the distributed energy resources of the power distribution network are scheduled by combining expert knowledge embedding and reinforcement learning technology.
(2) The distributed multi-layer energy control architecture based on the multi-agent system realizes the fine management and collaborative optimization of energy resources of different layers, and improves the flexibility and response speed of the system.
(3) According to the invention, expert domain knowledge is embedded into a scheduling decision, the intelligence and the interpretation of the power distribution network are enhanced, the decision in a complex scene is more reasonable and reliable, and the scheduling strategy can be optimized in a self-adaptive manner by combining a reinforcement learning algorithm, so that the dynamic change and uncertainty in an energy system are effectively treated, and more efficient energy management and optimization control are realized.
(4) The invention sets up a closed-loop feedback mechanism to optimize the scheduling model, thereby continuously accumulating operating experience, obtaining optimal scheduling paths and strategies, and improving the overall efficiency and robustness of the scheduling model.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of a distributed multi-layered energy control architecture based on a multi-agent system according to the present invention;
Fig. 3 is a DDPG training framework of the present invention.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples. The present embodiment is implemented on the premise of the technical scheme of the present invention, and a detailed implementation manner and a specific operation process are given, but the protection scope of the present invention is not limited to the following examples.
Example 1
The embodiment provides an intelligent decision-making method for a power distribution network based on knowledge embedding and a multi-agent system. First, real-time operation data of the distributed energy resources are acquired through a distributed sensor network; the data include parameters such as voltage, current and power output of photovoltaic generation, wind power generation, energy storage devices and electric vehicles, and the data are standardized. Then, based on a multi-agent system, the power grid is divided into a distribution-network leading layer, a regional coordination layer and a device unit layer, with the agents of each layer responsible for the corresponding scheduling tasks to ensure the stability and flexibility of the system. Intelligent scheduling decisions are made by combining expert knowledge with a deep deterministic policy gradient (DDPG) algorithm: an adaptive scheduling strategy is generated through reinforcement learning, and energy allocation is dynamically optimized. Finally, scheduling execution results are monitored and adjusted in real time through a closed-loop feedback mechanism, the differences between actual execution and expected results are analyzed, and the scheduling model parameters are optimized. Specifically, as shown in fig. 1, the method includes the following steps:
S1, data acquisition and preprocessing.
The invention first collects the operation data of the distributed energy resources in real time through a distributed sensor network. The collected data cover the key operating parameters of various distributed energy resources, such as photovoltaic generation, wind power generation, energy storage systems and electric vehicles, including voltage, current, power output, load variation and state of charge. During acquisition, the sensors transmit real-time data to the central database of the system through Internet-of-Things devices. To ensure data consistency and availability, the invention standardizes the data from different sources as described above, providing a basis for subsequent optimization decisions.
S2, constructing a distributed multi-layer energy control architecture based on a multi-agent system.
In the invention, the power grid is divided from top to bottom into three layers, namely a distribution-network leading layer, a regional coordination layer and a device unit layer, and distributed multi-level energy control is realized through a multi-agent system, as shown in fig. 2. In this architecture, each layer is controlled by a corresponding agent, and the energy distribution and scheduling of the whole system are optimized through coordination, ensuring effective utilization of resources and stable operation of the system.
(1) Distribution-network leading layer: with the distribution network in the leading role, this layer realizes cross-regional collaborative scheduling by communicating with the agents of the regional coordination layers. The distribution-network leading layer formulates incentive signals according to the optimization target of the whole system; the signals are transmitted to the device units through the agents at each level, ensuring stable and safe operation of the whole system. Its core function is global scheduling and management of multiple virtual power plants and their distributed resources.
Taking minimization of the operating cost of the power distribution network as the optimization target, a scheduling model of the power distribution network is established to perform energy scheduling for each distributed energy device. The scheduling model is:

$$\min E_{DSO} = \sum_{t=1}^{T}\left[\lambda_t^{grid} P_t^{grid} + \sum_{i=1}^{N}\left(\lambda_{i,t}^{buy} P_{i,t}^{buy} - \lambda_{i,t}^{sell} P_{i,t}^{sell}\right) + \sum_{m=1}^{M}\Delta P_{m,t}^{loss}\right] \tag{1}$$

where E_DSO denotes the operating cost; P_t^grid denotes the electricity purchased by the distribution system operator from the transmission network at time t, and λ_t^grid the corresponding purchase price; P_{i,t}^sell and P_{i,t}^buy respectively denote the electricity sold to and purchased from micro-grid i, and λ_{i,t}^sell and λ_{i,t}^buy the corresponding selling and purchasing prices; N is the number of micro-grids (MGs) connected to the distribution network; ΔP_{m,t}^loss is the energy loss of branch m; and M is the total number of branches.
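For concreteness, the operating-cost objective above could be evaluated for a candidate schedule roughly as follows (the function name, array shapes and sign convention follow our reading of the variable definitions; this is an illustrative sketch, not the patent's code):

```python
import numpy as np

def dso_operating_cost(p_grid, price_grid, p_buy, price_buy,
                       p_sell, price_sell, branch_loss):
    """E_DSO summed over T periods.

    p_grid, price_grid : (T,)   purchase volume/price from the transmission grid
    p_buy, price_buy   : (T, N) purchases from each of N micro-grids
    p_sell, price_sell : (T, N) sales to each micro-grid (revenue, hence subtracted)
    branch_loss        : (T, M) energy loss on each of M branches
    """
    grid_cost = np.sum(p_grid * price_grid)
    mg_cost = np.sum(p_buy * price_buy) - np.sum(p_sell * price_sell)
    loss_cost = np.sum(branch_loss)
    return grid_cost + mg_cost + loss_cost

T, N, M = 24, 3, 33                     # assumed horizon, micro-grids, branches
rng = np.random.default_rng(0)
print(dso_operating_cost(
    p_grid=rng.uniform(0, 5, T), price_grid=np.full(T, 0.6),
    p_buy=rng.uniform(0, 2, (T, N)), price_buy=np.full((T, N), 0.5),
    p_sell=rng.uniform(0, 2, (T, N)), price_sell=np.full((T, N), 0.55),
    branch_loss=rng.uniform(0, 0.1, (T, M))))
```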
(2) Regional coordination layer: responsible for the operation scheduling and control of the devices within its region. The agents of this layer acquire real-time device state data through communication with the device unit layer agents, and perform local optimization by combining the energy consumption demand within the region with the overall system target. The regional coordination layer responds to the incentive signals sent by the upper-layer agent, realizing autonomous optimization of the various energy devices within the region and ensuring the energy balance of the local area.
(3) Device unit layer: controls each distributed energy resource unit, such as photovoltaic generation, energy storage devices, wind power and flexible loads. A device unit layer agent obtains scheduling instructions through communication with the regional coordination layer and adjusts the power output and operating state of its device in real time. The device unit layer is not only responsible for the safe operation of the devices, but also maximizes the consumption of clean energy during regulation, promoting efficient energy utilization.
Through this hierarchical control structure, collaborative optimization is realized among all levels, effectively improving the response speed and scheduling flexibility of the system, reducing energy waste, and maximizing the consumption of clean energy and the economic operation of the system.
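The three-layer agent hierarchy described above can be pictured with a structural skeleton like the following (all class and method names are invented for illustration; real agents would run local optimizations rather than the toy dispatch shown here):

```python
from dataclasses import dataclass, field

@dataclass
class DeviceAgent:                      # device unit layer
    name: str
    def execute(self, instruction: dict) -> None:
        print(f"{self.name}: set output to {instruction['setpoint_kw']} kW")

@dataclass
class RegionAgent:                      # regional coordination layer
    region: str
    devices: list = field(default_factory=list)
    def dispatch(self, incentive: float) -> None:
        # local optimization would go here; we just fan out a toy setpoint
        for dev in self.devices:
            dev.execute({"setpoint_kw": 10.0 * incentive})

@dataclass
class DistributionLeadAgent:            # distribution-network leading layer
    regions: list = field(default_factory=list)
    def coordinate(self) -> None:
        for region in self.regions:     # one incentive signal per region
            region.dispatch(incentive=1.0)

lead = DistributionLeadAgent(regions=[
    RegionAgent("east", devices=[DeviceAgent("PV-1"), DeviceAgent("ESS-1")]),
])
lead.coordinate()
```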
S3, intelligent decision making is carried out based on expert knowledge embedding and reinforcement learning.
Intelligent scheduling decisions for distributed energy resources are realized by combining expert knowledge embedding with the deep deterministic policy gradient (DDPG) reinforcement learning algorithm. The method not only draws on experts' domain experience, but also continuously optimizes the scheduling strategy in a complex dynamic environment through the reinforcement learning algorithm, so that the system can make optimal decisions under different load and uncertainty scenarios.
The expert knowledge base stores scheduling experience and decision rules for various typical scenarios, including historical scheduling experience, load characteristics under different meteorological conditions, equipment operating constraints, and strategies for coping with sudden events. By embedding this expert knowledge into the intelligent decision system, guidance can be provided during reinforcement learning, so that scheduling decisions are driven not only by data but also by the practical experience of experts. For example, during a hot summer peak period, historical experience shows that the air-conditioning load in an area increases dramatically, causing a surge in power demand; photovoltaic generation provides sufficient power during the day but falls short in the evening. Based on historical consumption during similar peaks, the expert knowledge base provides scheduling guidance: the energy storage devices should be charged preferentially during the day in preparation for the evening peak, then discharged gradually in the evening according to load demand, while schedulable loads such as air conditioning and electric-vehicle charging are shifted appropriately to shave the power peak. By embedding such scheduling strategies into the intelligent decision system, the reinforcement learning process can not only generate scheduling strategies from real-time data but also draw on the empirical knowledge provided by experts, so that the energy storage devices allocate energy reasonably under rapid load growth, and grid scheduling becomes more forward-looking and safer.
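One common way to embed such expert rules into a reinforcement learner is to shape or override the agent's proposed action before execution. A hedged sketch along the lines of the summer-peak example (the rule thresholds and field names are invented):

```python
def apply_expert_rules(action, state):
    """Adjust an RL-proposed storage action using expert scheduling rules.

    action: proposed storage power in kW (>0 discharge, <0 charge)
    state:  dict with 'hour' and 'soc' (state of charge in [0, 1])
    """
    hour, soc = state["hour"], state["soc"]
    # Rule: on peak days, charge during daylight to prepare for the evening peak
    if 9 <= hour <= 16 and soc < 0.9 and action > 0:
        action = -abs(action)          # override discharge with charging
    # Rule: during the evening peak, discharge gradually, never below 20% SoC
    if 18 <= hour <= 22 and soc <= 0.2:
        action = min(action, 0.0)      # block further discharge
    return action

print(apply_expert_rules(action=5.0, state={"hour": 11, "soc": 0.5}))  # -> -5.0
```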
The reinforcement learning adopts the DDPG algorithm. As shown in fig. 3, the DDPG algorithm adopts an actor-critic structure, explores agent actions with a stochastic policy, and updates the algorithm with a deterministic policy gradient. The actor and the critic are two deep neural networks (DNNs): the critic, also called the Q network, can obtain global information and global actions of the system, while the actor, also called the π network, acts only on its local environment. The two networks are trained separately to minimize their loss functions and realize network updating; to mitigate update instability, the algorithm creates a backup network, i.e., a target network, for each of them.
In the critic network, the parameter vector θ^Q is used to estimate the Q-function value Q*(s,a|θ^Q); the Q-function, also known as the state-action value function, evaluates the decision quality of the participants and provides gradient-direction information to the algorithm. In the actor network, the parameter vector θ^π is used to estimate the policy π*(s|θ^π) that maps states to actions, makes decisions and outputs continuous actions. An evaluation function J(π) is defined as the expected Q-function value to measure the effect of the agent's policy:

$$J(\pi) = \mathbb{E}_{s\sim p^{\pi}}\left[Q(s,\pi(s))\right] \tag{2}$$

where p^π is the probability distribution function, Q(s,π(s)) denotes the Q-function value obtained when the agent acts according to policy π, and J(π) denotes the expected value of Q(s,π(s)) when the state s follows the probability distribution p^π.
The agent accumulates experience through interaction with the environment so as to update the networks; in essence this is policy updating. The actor network selects actions according to an initially random policy, the critic network evaluates those actions and outputs the corresponding Q-function values, and the actor is guided by the evaluation values to update its policy parameters; repeating these steps gradually increases the cumulative reward obtained by the policy. The update process is as follows:

$$r'_{i,t} = r_{i,t} + \gamma\,Q'\!\left(s_{i,t+1},\pi'(s_{i,t+1}\mid\theta^{\pi'})\mid\theta^{Q'}\right) \tag{3}$$

where equation (3) represents the reward update: r'_{i,t} is the target reward, Q'(·|θ^{Q'}) is the target critic network Q-function value, and π'(·|θ^{π'}) is the target actor network policy.

$$err = \frac{1}{q}\sum_{t}\left(r'_{i,t} - Q(s_{i,t},a_{i,t}\mid\theta^{Q})\right)^{2} \tag{4}$$

where equation (4) represents the critic network error, and network updating is realized by minimizing this error; Q(s_{i,t},a_{i,t}|θ^Q) is the actual critic network Q-function value and q is the number of sampled experiences.

$$\nabla_{\theta^{\pi}} J \approx \frac{1}{q}\sum_{t}\nabla_{a} Q(s,a\mid\theta^{Q})\big|_{s=s_{i,t},\,a=\pi(s_{i,t})}\,\nabla_{\theta^{\pi}}\pi(s\mid\theta^{\pi})\big|_{s=s_{i,t}} \tag{5}$$

where equation (5) represents the network policy gradient, which determines the update direction of the algorithm; ∇_a Q and ∇_{θ^π} π are respectively the gradients of the critic network and the actor network.
Specifically, the training steps of the actor-critic dual neural network architecture comprise:
1) Initialization: initializing the parameters θ^π of the actor network, the parameters θ^Q of the critic network, the parameters θ^{π'} of the target actor network, the parameters θ^{Q'} of the target critic network, and an experience replay buffer;
2) Exploration and interaction: the actor network selects an action according to the current policy, and the agent executes the selected action and interacts with the environment to obtain the current state, the reward and the next state;
3) Experience replay: storing the current state, action, reward and next state obtained from the interaction into the experience replay buffer;
4) Critic network update: randomly sampling a batch of experiences from the experience replay buffer; the critic network uses the sampled experiences to compute its loss function, and the critic network parameters θ^Q are updated by gradient descent;
5) Actor network update: computing the policy gradient from the output of the critic network, and updating the actor network parameters θ^π by gradient ascent, the policy gradient being computed according to formula (5);
6) Target network update: periodically copying the parameters of the actor network and the critic network into the corresponding target actor network and target critic network, respectively;
7) Policy evaluation and improvement: defining an evaluation function that computes the expected Q-function value to measure the agent's policy effect, as shown in formula (2);
8) Iterative execution: repeating steps 2)-7) until the iteration stop condition is met.
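A compact PyTorch sketch of steps 1)-8) follows. It is a generic single-agent DDPG skeleton under stated assumptions (toy state and action sizes, a plain list as the replay buffer, a stand-in environment, and periodic hard copies for the target-network update of step 6); it is not the patent's implementation:

```python
import copy, random
import torch
import torch.nn as nn

S, A, GAMMA = 8, 2, 0.99                     # assumed sizes and discount factor
actor = nn.Sequential(nn.Linear(S, 64), nn.ReLU(), nn.Linear(64, A), nn.Tanh())
critic = nn.Sequential(nn.Linear(S + A, 64), nn.ReLU(), nn.Linear(64, 1))
target_actor, target_critic = copy.deepcopy(actor), copy.deepcopy(critic)  # step 1
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)
buffer = []                                  # experience replay: (s, a, r, s')

def env_step(s, a):                          # stand-in environment (assumption)
    return torch.randn(S), -a.pow(2).sum()   # next state, toy reward

s = torch.randn(S)
for step in range(2000):
    with torch.no_grad():                    # step 2: explore with noisy policy
        a = actor(s) + 0.1 * torch.randn(A)
    s2, r = env_step(s, a)
    buffer.append((s, a, r, s2))             # step 3: store the transition
    s = s2
    if len(buffer) < 64:
        continue
    batch = random.sample(buffer, 64)
    bs = torch.stack([b[0] for b in batch])
    ba = torch.stack([b[1] for b in batch])
    br = torch.stack([b[2] for b in batch]).unsqueeze(1)
    bs2 = torch.stack([b[3] for b in batch])
    # step 4: critic update, minimize (target reward - Q)^2 by gradient descent
    with torch.no_grad():
        y = br + GAMMA * target_critic(torch.cat([bs2, target_actor(bs2)], 1))
    q = critic(torch.cat([bs, ba], 1))
    loss_c = nn.functional.mse_loss(q, y)
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()
    # step 5: actor update, ascend the critic's value of the actor's action
    loss_a = -critic(torch.cat([bs, actor(bs)], 1)).mean()
    opt_a.zero_grad(); loss_a.backward(); opt_a.step()
    if step % 100 == 0:                      # step 6: periodic hard target copy
        target_actor.load_state_dict(actor.state_dict())
        target_critic.load_state_dict(critic.state_dict())
```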
In the distributed multi-layer energy control architecture, the intelligent decisions obtained by reinforcement learning are executed across the different layers, enabling the system to operate efficiently and safely. First, the distribution-network leading layer communicates with the agents of each regional coordination layer, receives and integrates state information from the regions, and generates a scheduling strategy according to the global optimization target; the main goal of this layer is to coordinate and schedule multiple virtual power plants and distributed resources to ensure system stability, and the intelligent scheduling strategy generated by the reinforcement learning algorithm at this layer is transmitted to the regional level through an incentive mechanism, promoting coordination and reasonable resource allocation among regions. Second, the agents of the regional coordination layer execute these intelligent decisions and scheduling in light of the actual state of the devices within their regions; acting as regional proxies, they fine-tune the generated actions according to regional load conditions, device states and other information to realize the energy balance of the whole system. For example, an agent may optimize the charge and discharge timing of the energy storage devices within its region through the reinforcement learning strategy, or coordinate the use of renewable energy sources, to ensure efficient energy use. At the device unit layer, the agents correspond to the individual distributed energy resources, such as photovoltaic generation, energy storage devices or wind power equipment; at this level, an agent performs specific operations based on the real-time state of its device, such as adjusting the output of photovoltaic generation or controlling the charging and discharging of an energy storage device. The device unit layer agents directly execute the scheduling instructions from the upper layers, ensuring quick response and safe operation of the devices in actual operation.
Through this layered design, the intelligent decisions of reinforcement learning are not only optimized at the global level but also realized layer by layer down to the region and device levels, ensuring efficient scheduling and resource optimization from the global to the local scale.
S4, optimizing a scheduling model by a closed loop feedback mechanism.
In the invention, the closed-loop feedback mechanism plays a key role in scheduling optimization: by monitoring the execution of the scheduling scheme in real time and comparing the actual execution results against the expected scheduling scheme, the scheduling model is continuously optimized. The closed-loop feedback mechanism ensures that the system can adapt to changes under different scenarios and load conditions, and continuously improves the precision and efficiency of scheduling decisions according to the execution effect. Specifically, the feedback mechanism includes the following steps:
S4.1, real-time monitoring and data feedback. During execution of the scheduling scheme, the system tracks the running state of the distributed energy resources in real time through the sensor network and the monitoring module, including the charging and discharging of the generation and energy storage devices, load changes and device operating states. All real-time data are transmitted to the central control system through Internet-of-Things devices and dynamically compared with the previously generated scheduling scheme to identify possible deviations during execution.
S4.2, feedback data analysis and error correction. By analyzing the deviation of the actual execution results from the expected scheme, the system identifies anomalies or inconsistencies in scheduling. For example, if the actual output power of a distributed energy device is below expectation, or the discharge timing of an energy storage device does not achieve the expected effect, the system captures these anomalies via the feedback mechanism. The system then performs error correction according to the deviation values and adjusts the parameters of the scheduling strategy, ensuring that the next scheduling period is adjusted and optimized for the problems of the previous one.
S4.3, dynamically adjusting the scheduling model. Based on the feedback data, the scheduling model in the system is adaptively optimized. The system dynamically adjusts the scheduling strategies of the various distributed energy sources by combining the reinforcement learning algorithm (DDPG) with expert knowledge. For example, when the system recognizes a sudden change in electricity demand, the scheduling model can immediately adjust the charging and discharging strategy of the energy storage devices and redistribute the load according to actual demand, so as to ensure stable operation of the whole power grid.
S4.4, long-term optimization and strategy upgrading. In addition to short-term feedback adjustment, the system also upgrades its strategies based on long-term scheduling data. Through closed-loop feedback over multiple scheduling periods, the system continuously accumulates operating experience and gradually optimizes the scheduling rules in the expert knowledge base. In long-term operation, the system summarizes optimal scheduling paths and strategies through big-data analysis, improving the overall efficiency and robustness of the scheduling model.
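A minimal sketch of the deviation check and correction at the heart of this closed loop (the thresholds, field names and correction rule are illustrative assumptions):

```python
def feedback_adjust(expected_kw, actual_kw, params, tol=0.05, step=0.1):
    """Compare executed dispatch against the plan and nudge a model parameter.

    expected_kw / actual_kw: planned vs. measured power per device (dicts)
    params: scheduling-model parameters, e.g. a per-device output bias
    """
    for dev, plan in expected_kw.items():
        deviation = (actual_kw[dev] - plan) / max(abs(plan), 1e-6)
        if abs(deviation) > tol:                   # execution deviation found
            # corrective update: shift the device's expected output toward reality
            params[dev] = params.get(dev, 0.0) + step * deviation * plan
    return params

params = feedback_adjust({"ESS-1": 100.0}, {"ESS-1": 80.0}, {})
print(params)  # ESS-1 under-delivered by 20% -> negative bias correction
```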
The embodiment applies the method to an actual scene, and mainly comprises the following steps:
Step 1) acquiring real-time operation data of the distributed energy resources, including parameters of photovoltaic generation, wind power generation and energy storage devices such as voltage, current, power output, charge and discharge state, and load demand. Through the distributed sensor network, the system monitors these key data in real time. All collected data are standardized by the data preprocessing module so that different types of energy resources can be processed uniformly, providing reliable data support for subsequent scheduling analysis;
Step 2) constructing the distributed multi-layer energy control architecture based on the multi-agent system. The power grid system is divided into three layers, namely the distribution-network leading layer, the regional coordination layer and the device unit layer, with the agents of each level responsible for the corresponding scheduling tasks. The distribution-network leading layer agent formulates the global scheduling strategy in cooperation with each regional coordination layer, ensuring the overall stability and efficient operation of the system. The regional coordination layer agents integrate the operating states of the distributed energy sources within their regions to generate local optimization schemes. The device unit layer agents control specific energy resource devices, ensuring they run within safe ranges and maximize clean-energy consumption;
Step 3) making intelligent scheduling decisions based on expert knowledge embedding and the reinforcement learning (DDPG) algorithm. Under the guidance of the expert knowledge base, the system combines historical and real-time data and optimizes the energy scheduling strategy through the DDPG algorithm. The DDPG algorithm generates optimal scheduling actions according to the current system state, such as adjusting the charge and discharge timing of the energy storage devices or redistributing load, so that the system can adapt to complex, changeable demands and uncertainty. The expert knowledge base provides scheduling experience for special scenarios, allowing the system to converge faster during reinforcement learning and to make reasonable scheduling decisions;
Step 4) monitoring the execution effect of the scheduling scheme in real time through the closed-loop feedback mechanism. The system compares the actual execution results with the expected scheduling scheme, identifies possible deviations, analyzes their causes, and corrects the scheduling model through the feedback mechanism. The system optimizes the scheduling model parameters according to the feedback results, ensuring that the scheduling strategy is more accurate and efficient in the next round of execution.
Example 2
The embodiment provides a power distribution network intelligent decision system based on knowledge embedding and a multi-agent system, which comprises:
The data acquisition module is used for acquiring the operation data of each distributed energy device in the power distribution network in real time and preprocessing the operation data;
The control architecture construction module is used for carrying out hierarchical division on the power distribution network and constructing a distributed multi-layer energy control architecture based on a multi-agent system;
And the scheduling decision module is used for performing intelligent energy scheduling decision by adopting a method combining expert knowledge embedding and reinforcement learning based on the preprocessed operation data and a multi-agent system-based distributed multi-layer energy control architecture to complete the decision process.
The rest is the same as in Example 1.
The above functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The storage medium includes a U disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, an optical disk, or other various media capable of storing program codes.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The solutions in the embodiments of the invention may be implemented in various computer languages, such as the object-oriented programming language Java and the scripting language JavaScript.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1.一种基于知识嵌入和多代理系统的配电网智能决策方法,其特征在于,包括以下步骤:1. A distribution network intelligent decision-making method based on knowledge embedding and multi-agent system, characterized by comprising the following steps: 实时采集配电网中各分布式能源设备的运行数据,并进行预处理;Collect the operating data of each distributed energy device in the distribution network in real time and perform pre-processing; 对配电网进行层次划分,构建基于多代理系统的分布式多层能量控制架构;Divide the distribution network into layers and build a distributed multi-layer energy control architecture based on a multi-agent system; 基于所述预处理后的运行数据和基于多代理系统的分布式多层能量控制架构,采用专家知识嵌入与强化学习相结合的方法进行智能化能源调度决策,完成决策过程。Based on the pre-processed operation data and the distributed multi-layer energy control architecture based on the multi-agent system, a method combining expert knowledge embedding and reinforcement learning is adopted to make intelligent energy scheduling decisions and complete the decision-making process. 2.根据权利要求1所述的一种基于知识嵌入和多代理系统的配电网智能决策方法,其特征在于,所述各分布式能源设备的运行数据包括光伏发电、风力发电、储能设备、电动汽车及电力网络的电压、电流、功率输出。2. According to claim 1, a distribution network intelligent decision-making method based on knowledge embedding and multi-agent system is characterized in that the operating data of each distributed energy device includes the voltage, current and power output of photovoltaic power generation, wind power generation, energy storage equipment, electric vehicles and power networks. 3.根据权利要求1所述的一种基于知识嵌入和多代理系统的配电网智能决策方法,其特征在于,所述预处理操作包括对所述各分布式能源设备的运行数据进行标准化处理,所述标准化处理的表达式为:3. According to the method of intelligent decision-making of distribution network based on knowledge embedding and multi-agent system in claim 1, it is characterized in that the preprocessing operation includes standardizing the operation data of each distributed energy device, and the expression of the standardization is: 式中,Xnorm为标准化后的数据,X为原始获取数据,Xmin、Xmax分别为数据中的最小值和最大值。Where X norm is the standardized data, X is the original acquired data, X min and X max are the minimum and maximum values in the data, respectively. 4.根据权利要求1所述的一种基于知识嵌入和多代理系统的配电网智能决策方法,其特征在于,所述构建基于多代理系统的分布式多层能量控制架构的步骤包括:4. 
According to the method of intelligent decision-making of distribution network based on knowledge embedding and multi-agent system in claim 1, it is characterized in that the step of constructing a distributed multi-layer energy control architecture based on multi-agent system comprises: 将配电网由上至下划分三个层次,分别为配网主导层、区域协调层和设备单元层,并构建基于多代理系统的分布式多层能量控制架构,包括配网主导层代理、区域协调层代理和设备单元层代理;The distribution network is divided into three levels from top to bottom, namely the distribution network dominant layer, regional coordination layer and equipment unit layer, and a distributed multi-layer energy control architecture based on a multi-agent system is constructed, including the distribution network dominant layer agent, regional coordination layer agent and equipment unit layer agent; 其中,所述配网主导层以配电网为主导,通过配网主导层代理进行整个分布式多层能量控制架构的把控,根据调度优化目标制定激励信号,并与区域协调层代理进行通信,以实现跨区域的协同调度;The distribution network leading layer is dominated by the distribution network, and controls the entire distributed multi-layer energy control architecture through the distribution network leading layer agent, formulates incentive signals according to the scheduling optimization goals, and communicates with the regional coordination layer agent to achieve cross-regional collaborative scheduling; 所述区域协调层用于各区域内各分布式能源设备的运行调度与控制,每个区域设置一个区域协调层代理,区域协调层代理通过与设备单元层代理进行通信,并结合区域内的能源消耗需求和所述调度优化目标,进行局部优化;The regional coordination layer is used for operation scheduling and control of each distributed energy device in each region. Each region is provided with a regional coordination layer agent, which communicates with the device unit layer agent and performs local optimization in combination with the energy consumption demand in the region and the scheduling optimization target. 所述设备单元层包括多种分布式能源设备,每个分布式能源设备设置一个设备单元层代理,所述设备单元层代理与区域协调层代理进行通信,获取调度指令,以对各分布式能源设备进行能源调度。The device unit layer includes a variety of distributed energy devices, each distributed energy device is provided with a device unit layer agent, and the device unit layer agent communicates with the regional coordination layer agent to obtain scheduling instructions to perform energy scheduling for each distributed energy device. 5.根据权利要求4所述的一种基于知识嵌入和多代理系统的配电网智能决策方法,其特征在于,所述进行智能化能源调度决策的步骤包括:5. 
The method for intelligent decision-making of a distribution network based on knowledge embedding and multi-agent system according to claim 4, characterized in that the step of making intelligent energy dispatching decisions comprises: 根据所述调度优化目标,构建配电网的调度模型,其中所述配电网的调度模型以最小化运行成本为目标,表达式为:According to the dispatch optimization target, a dispatch model of the distribution network is constructed, wherein the dispatch model of the distribution network aims to minimize the operating cost, and the expression is: 式中,EDSO表示运行成本,Pt grid表示配电系统运营商从输电网的购电电量;表示配电系统运营商从输电网的购电电价;分别表示配电系统运营商向微电网i的售电和购电电量;分别表示配电系统运营商向微电网i的售电和购电电价;N为连到配网的MG个数;为支路电能损耗;M为支路总数;In the formula, E DSO represents the operating cost, P t grid represents the amount of electricity purchased by the distribution system operator from the transmission grid; It represents the price at which the distribution system operator purchases electricity from the transmission grid; They represent the amount of electricity sold and purchased by the distribution system operator to microgrid i respectively; They represent the electricity sales and purchase prices from the distribution system operator to the microgrid i, respectively; N is the number of MGs connected to the distribution network; is the branch power loss; M is the total number of branches; 从专家知识库中获取专家知识;Acquire expert knowledge from expert knowledge base; 基于所述预处理后的运行数据,所述专家知识提供调度指导,结合强化学习对所述调度模型进行求解,根据求解结果进行能源调度决策。Based on the preprocessed operating data, the expert knowledge provides scheduling guidance, the scheduling model is solved in combination with reinforcement learning, and energy scheduling decisions are made according to the solution results. 6.根据权利要求5所述的一种基于知识嵌入和多代理系统的配电网智能决策方法,其特征在于,所述强化学习采用深度确定性策略梯度算法。6. According to the method of intelligent decision-making of distribution network based on knowledge embedding and multi-agent system in claim 5, it is characterized in that the reinforcement learning adopts deep deterministic policy gradient algorithm. 7.根据权利要求6所述的一种基于知识嵌入和多代理系统的配电网智能决策方法,其特征在于,所述深度确定性策略梯度算法的架构采用actor-critic双重神经网络架构,所述actor-critic双重神经网络架构的训练步骤包括:7. 
The method for intelligent decision-making of a distribution network based on knowledge embedding and multi-agent system according to claim 6, characterized in that the architecture of the deep deterministic policy gradient algorithm adopts an actor-critic dual neural network architecture, and the training steps of the actor-critic dual neural network architecture include: 1)初始化:初始化actor网络的参数θQ、critic网络的参数θπ、目标actor网络的参数θQ'、目标critic网络的参数θπ'以及经验回放缓冲区;1) Initialization: Initialize the parameters θ Q of the actor network, the parameters θ π of the critic network, the parameters θ Q ' of the target actor network, the parameters θ π ' of the target critic network, and the experience replay buffer; 2)探索与交互:actor网络根据当前策略选择动作,智能体执行选定的动作,并与环境进行交互,得到当前状态、奖励和下一状态;2) Exploration and interaction: The actor network selects actions based on the current strategy, the agent executes the selected actions and interacts with the environment to obtain the current state, reward, and next state; 3)经验回放:将交互得到的当前状态、动作、奖励和下一状态存储到经验回放缓冲区;3) Experience replay: store the current state, action, reward and next state obtained from the interaction into the experience replay buffer; 4)critic网络更新:从所述经验回放缓冲区中随机抽取一批经验,critic网络使用抽取的经验计算critic网络的损失函数,并通过梯度下降方法更新critic网络参数的参数θQ4) Critic network update: randomly extract a batch of experiences from the experience playback buffer, the critic network uses the extracted experiences to calculate the loss function of the critic network, and updates the parameters θ Q of the critic network parameters by the gradient descent method; 5)actor网络更新:根据critic网络的输出计算critic网络的策略梯度,通过梯度上升方法更新actor网络参数的参数θπ,其中所述critic网络的策略梯度的计算表达式为:5) Actor network update: Calculate the policy gradient of the critic network according to the output of the critic network, and update the parameters θ π of the actor network parameters by the gradient ascent method, where the calculation expression of the policy gradient of the critic network is: 式中,为critic网络目标函数的策略梯度,为actor网络策略, 分别为critic网络Q-函数值和actor网络的梯度,n为智能体数量,q为样本经验数,si,t为第i个智能体在t时刻的状态;In the formula, is the policy gradient of the critic network objective function, is the actor network strategy, are the Q-function value of the critic network and the gradient of the actor network, n is the number of agents, q is the number of sample experiences, s i,t is the state of the i-th agent at time t; 6)目标网络更新:定期将actor网络和critic网络的参数分别复制到对应的actor网络和目标critic网络中;6) Target network update: Regularly copy the parameters of the actor network and critic network to the corresponding actor network and target critic network respectively; 7)策略评估与改进:定义评价函数计算Q-函数期望值,以衡量智能体的策略效果,所述评价函数的表达式为:7) Strategy evaluation and improvement: Define an evaluation function to calculate the expected value of the Q-function to measure the strategy effect of the agent. The expression of the evaluation function is: 式中,J(π)为评价函数,表示状态s服从概率分布pπ时,按策略π执行动作得到的Q-函数Q(s,π(s))期望值,Q(s,π(s))为智能体按策略π执行动作得到的Q-函数值,pπ为概率分布函数;Where J(π) is the evaluation function, which means the expected value of the Q-function Q(s,π(s)) obtained by executing actions according to strategy π when state s obeys probability distribution pπ, Q(s,π(s)) is the Q-function value obtained by the agent executing actions according to strategy π, and is the probability distribution function; 8)迭代执行:重复执行步骤2)-7),直至满足迭代停止条件。8) Iterative execution: Repeat steps 2)-7) until the iteration stop condition is met. 8.根据权利要求7所述的一种基于知识嵌入和多代理系统的配电网智能决策方法,其特征在于,所述奖励更新的表达式为:8. 
8. The method for intelligent decision-making of a distribution network based on knowledge embedding and multi-agent system according to claim 7, characterized in that the reward update is expressed as:

r'_{i,t} = r_{i,t} + \gamma\,Q'\big(s_{i,t+1},\pi'(s_{i,t+1})\big)

err = \frac{1}{q}\sum_{t}\big(r'_{i,t} - Q(s_{i,t},a_{i,t})\big)^{2}

where r'_{i,t} is the target reward of the i-th agent at time t; r_{i,t} is the immediate reward of the i-th agent at time t; γ is the discount factor that weighs current rewards against future rewards; Q' is the target critic network Q-function value; π' is the target actor network policy; err is the error between the predicted Q value and the target Q value; a_{i,t} is the action of the i-th agent at time t; and q is the number of sampled experiences.

9. The method for intelligent decision-making of a distribution network based on knowledge embedding and multi-agent system according to claim 1, characterized by further comprising a step of optimizing the dispatch model through a closed-loop feedback mechanism so as to adjust the intelligent energy dispatching decisions, comprising the following steps:

dynamically comparing the actual dispatch execution results of the distribution network energy dispatching decisions with the expected dispatch plan to identify execution deviations;

making corrections according to the execution deviations and adaptively optimizing the dispatch model to dynamically adjust the energy dispatching decisions.

10. An intelligent decision-making system for a distribution network based on knowledge embedding and multi-agent system, characterized by comprising:

a data acquisition module, configured to collect the operating data of each distributed energy device in the distribution network in real time and to preprocess the data;

a control architecture construction module, configured to divide the distribution network into layers and to construct a distributed multi-layer energy control architecture based on a multi-agent system;

a dispatch decision module, configured to make intelligent energy dispatching decisions based on the preprocessed operating data and the multi-agent-based distributed multi-layer energy control architecture, using a method combining expert knowledge embedding with reinforcement learning, so as to complete the decision-making process.
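To make the reward-update expressions of claim 8 concrete, the following short Python sketch computes the target reward r'_{i,t} and the error err over a batch of q sampled experiences. The callables target_actor, target_critic, and critic stand in for the trained networks and are assumptions of this sketch.

```python
# A minimal sketch of claim 8's reward update: r' = r + gamma * Q'(s', pi'(s'))
# and the mean squared error between predicted and target Q values.
import numpy as np

def reward_update(r, s, a, s_next, target_actor, target_critic, critic, gamma=0.99):
    """All arrays are batches over q sampled experiences of one agent."""
    r_target = r + gamma * target_critic(s_next, target_actor(s_next))  # r'_{i,t}
    err = np.mean((r_target - critic(s, a)) ** 2)                       # error over q samples
    return r_target, err
```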
CN202411507652.4A 2024-10-28 2024-10-28 Intelligent decision-making method and system for distribution network based on knowledge embedding and multi-agent system Pending CN119382159A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411507652.4A CN119382159A (en) 2024-10-28 2024-10-28 Intelligent decision-making method and system for distribution network based on knowledge embedding and multi-agent system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411507652.4A CN119382159A (en) 2024-10-28 2024-10-28 Intelligent decision-making method and system for distribution network based on knowledge embedding and multi-agent system

Publications (1)

Publication Number Publication Date
CN119382159A true CN119382159A (en) 2025-01-28

Family

ID=94331726

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411507652.4A Pending CN119382159A (en) 2024-10-28 2024-10-28 Intelligent decision-making method and system for distribution network based on knowledge embedding and multi-agent system

Country Status (1)

Country Link
CN (1) CN119382159A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119784121A (en) * 2025-03-13 2025-04-08 四川中烟工业有限责任公司 A digital production operation management method for data quality assessment and analysis
CN120222428A (en) * 2025-03-21 2025-06-27 中能建氢能源有限公司 Electricity-hydrogen coupling intelligent control method and system considering wind-solar forecast error

Similar Documents

Publication Publication Date Title
CN119382159A (en) Intelligent decision-making method and system for distribution network based on knowledge embedding and multi-agent system
US20210143639A1 (en) Systems and methods of autonomous voltage control in electric power systems
CN105046395B (en) Method for compiling day-by-day rolling plan of power system containing multiple types of new energy
CN119783997A (en) A virtual power plant peak load optimization scheduling method, system, electronic equipment and medium
CN113159341A (en) Power distribution network aid decision-making method and system integrating deep reinforcement learning and expert experience
CN110414725B (en) Wind power plant energy storage system scheduling method and device integrating prediction and decision
CN115934333A (en) Historical data perception-based cloud computing resource scheduling method and system
CN119382128A (en) A method for allocating power between source, grid, load and storage based on prediction and reinforcement learning multi-objectives
CN119651605A (en) A method for adding control branches in multiple scenarios during demand response execution
CN112950001A (en) Intelligent energy management and control system and method based on cloud edge closed-loop architecture
CN114219045A (en) Dynamic early warning method, system and device for risk of power distribution network and storage medium
CN119009988A (en) Power grid optimal scheduling method and related device
CN119476670A (en) Power multi-agent dynamic collaborative inspection method and system
CN119543228A (en) A primary frequency modulation control method and system based on collaborative optimization strategy
CN117638938A (en) Control methods, devices and electronic equipment for power systems
CN119740841A (en) Multi-time scale water dispatching method based on irrigation area
CN118798584B (en) A method, system, medium and device for robust optimization scheduling of building microgrid
CN115795992A (en) An Online Scheduling Method of Park Energy Internet Based on Virtual Deduction of Operating Situation
CN119834221A (en) Urban power grid load recovery method and device based on user requirements
CN119651766A (en) Method, device and electronic equipment for determining resource carrying capacity of power grid
CN117767433A (en) Real-time county energy internet scheduling method and system based on digital twin
KR20240095661A (en) Method and apparatus for heat pump control based on reinforcement learning and air conditioning system using the same
CN119010203A (en) AC/DC micro-grid energy scheduling method and device
CN116702988A (en) Carbon neutral calculation cost optimization method and equipment for economic dispatch of carbon capture power plants in smart grid
CN115829248A (en) Multi-time-scale hierarchical optimization scheduling method and system for virtual power plant

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination