Disclosure of Invention
To overcome the defects of the prior art, the invention provides a multi-coal mixed combustion control method for a fluidized bed boiler. The method dynamically adjusts the bed height and air flow velocity of the fluidized bed boiler through real-time monitoring of the particle state, reinforcement learning control and secondary phase volume fraction feedback, thereby ensuring uniform combustion of anthracite and lignite and improving combustion efficiency.
A fluidized bed boiler can effectively limit pollutant formation because of its large combustion space and low combustion temperature. In practice, however, coal types differ in density, volatile content, ash content, particle size and other properties, so mixed combustion of multiple coal types poses technical challenges for combustion control. Denser coal tends to settle at the bottom of the bed, while lighter coal is easily carried to the upper part of the bed by the air flow; this uneven distribution leads to incomplete combustion and reduced energy utilization. Larger coal particles need a higher air flow velocity to stay suspended, whereas smaller particles readily rise to the upper part of the boiler or are discharged with the flue gas, so differences in particle size distribution affect bed stability and the combustion reaction rate. Coal with a high volatile content burns quickly and releases a large amount of heat, whereas coal with a low volatile content burns more slowly and less intensely; this difference in combustion rate makes the temperature field inside the boiler uneven and thereby lowers combustion efficiency. Mixed combustion of multiple coal types therefore requires that coals with different volatile contents be uniformly distributed in the bed and brought into full contact with the air so that they react completely.
To achieve the above purpose, the present invention provides the following technical solution:
A multi-coal mixed combustion control method for a fluidized bed boiler, comprising the following steps:
Step 1, data acquisition and state monitoring: the density, volume and suspension state of the anthracite and lignite in the fluidized bed, together with the bed temperature and air flow velocity, are monitored in real time by an intelligent sensor network; secondary phase volume fraction equations are used to represent the distribution of the different particles in the bed; the dynamic distribution of each particle type in the bed is obtained from the real-time sensor data; and the change in the secondary phase volume fractions is determined;
Step 2, combustion efficiency and temperature distribution analysis: the spatial distribution of the bed temperature is measured, the combustion efficiency is analyzed by combining the temperature gradient with the particle volume fractions, and overheated regions and regions of insufficient combustion are identified;
Step 3, intelligent control of air flow velocity and bed height: a control model based on reinforcement learning continuously adjusts the bed height and the air flow velocity according to historical data and the current combustion state;
Step 4, secondary phase volume fraction feedback adjustment: the secondary phase volume fraction equations are solved to obtain the suspension states of the different particles and the trend of bed changes, and the feedback parameters for air flow velocity and bed height are adjusted accordingly; taking into account the different densities and combustion rates of the particles, the bed height is adjusted dynamically in real time using a preset adaptive height adjustment formula, ensuring uniform combustion of the anthracite and lignite;
Step 5, output and feedback control: based on actual operating data, the control process is optimized with a preset optimization control function, and the optimal bed height and air flow velocity are output after repeated iteration and feedback optimization.
As a further aspect of the present invention, in step 1, the intelligent sensor network comprises sub-nodes, which monitor specified physical quantities, and a master node, which aggregates and analyzes the data and makes decisions. The master node performs anomaly detection on the sub-node data using a Gaussian mixture model; when an anomaly is detected, the master node issues an adaptive adjustment signal based on the data fed back by the sub-nodes, and the working state of each sub-node sensor is adjusted automatically. Communication between the master node and the sub-nodes uses Internet of Things technology. The data acquisition frequency of the sub-nodes is set by an adaptive filter algorithm that predicts the trend of bed changes from historical data; when an abrupt change in sub-node data is detected, the acquisition frequency is automatically raised from the normal acquisition frequency of 100 Hz to a raised acquisition frequency of 500 Hz.
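The anomaly check and the switch between the two acquisition frequencies can be illustrated with a minimal sketch. It assumes a one-dimensional Gaussian mixture whose parameters have already been fitted; the mixture parameters, the log-likelihood threshold and all function names are hypothetical and are not taken from the specification. Only the 100 Hz and 500 Hz values come from the text.

```python
import math

NORMAL_HZ = 100  # normal acquisition frequency stated in the text
RAISED_HZ = 500  # acquisition frequency used after an abrupt change is detected


def gmm_log_likelihood(x, weights, means, stds):
    """Log-likelihood of a scalar reading x under a 1-D Gaussian mixture."""
    density = 0.0
    for w, m, s in zip(weights, means, stds):
        z = (x - m) / s
        density += w * math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))
    # Guard against log(0) for readings far outside every component.
    return math.log(max(density, 1e-300))


def is_anomalous(x, weights, means, stds, log_threshold=-10.0):
    """Flag a reading whose log-likelihood falls below an assumed threshold."""
    return gmm_log_likelihood(x, weights, means, stds) < log_threshold


def next_sampling_hz(x, weights, means, stds):
    """Choose the sub-node acquisition frequency for the next interval."""
    return RAISED_HZ if is_anomalous(x, weights, means, stds) else NORMAL_HZ


# Hypothetical mixture for bed temperature in deg C (two operating regimes).
WEIGHTS, MEANS, STDS = [0.6, 0.4], [850.0, 900.0], [10.0, 12.0]

if __name__ == "__main__":
    for reading in (855.0, 1000.0):
        print(reading, next_sampling_hz(reading, WEIGHTS, MEANS, STDS))
```

In a deployed system the mixture would be refitted from recent sub-node data and the threshold tuned per physical quantity; the fixed values here only keep the example self-contained.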
As a further aspect of the present invention, in step 1, the secondary phase volume fraction equations comprise an anthracite secondary phase volume fraction equation and a lignite secondary phase volume fraction equation, which involve the following quantities:
Anthracite secondary phase volume fraction equation: the secondary phase volume fraction of the anthracite particles; the time variable; the velocity field of the anthracite particles, which is expressed in terms of the air flow velocity, the density of the anthracite particles, the density of the gas stream, the gravitational acceleration, the radius of the anthracite particles and the dynamic viscosity of the gas; the generation term of the anthracite particles, which is expressed in terms of the combustion rate constant of anthracite, the bed temperature, the ignition temperature of anthracite and a step function, such that the generation term becomes active when the bed temperature exceeds the ignition temperature and is 0 otherwise; and the deposition term of the anthracite particles, which is expressed in terms of the height of the anthracite bed layer;
Lignite secondary phase volume fraction equation: the secondary phase volume fraction of the lignite particles; the velocity field of the lignite particles, which is expressed in terms of the density and the radius of the lignite particles; the generation term of the lignite particles, which is expressed in terms of the combustion rate constant of lignite, the bed temperature and the combustion onset temperature of lignite; and the consumption term of the lignite particles, which is expressed in terms of the consumption rate constant of lignite combustion.
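As an illustration of how the velocity field and the temperature-gated generation term might be evaluated, the following sketch uses assumed functional forms: Stokes' law for the settling velocity of a small sphere, and a generation term proportional to the local volume fraction that is switched on by a step function. These forms, the function names and the numerical values are assumptions chosen for illustration and are not the formulas of the specification. Stokes' law holds only at low particle Reynolds number.

```python
def heaviside(x):
    """Step function: 1 above zero, 0 otherwise."""
    return 1.0 if x > 0.0 else 0.0


def stokes_settling_velocity(rho_p, rho_g, radius, mu, g=9.81):
    """Terminal settling velocity of a small sphere under Stokes' law (m/s)."""
    return 2.0 * (rho_p - rho_g) * g * radius ** 2 / (9.0 * mu)


def particle_velocity(u_gas, rho_p, rho_g, radius, mu):
    """Assumed net upward particle velocity: gas velocity minus settling velocity."""
    return u_gas - stokes_settling_velocity(rho_p, rho_g, radius, mu)


def generation_term(k, alpha, bed_temp, ignition_temp):
    """Assumed form k * alpha, active only above the ignition temperature."""
    return k * alpha * heaviside(bed_temp - ignition_temp)


if __name__ == "__main__":
    # Hypothetical values: hot gas density 0.3 kg/m^3, viscosity 4.5e-5 Pa*s,
    # 50 micrometre particle radius, gas velocity 1.0 m/s.
    for name, rho in (("anthracite", 1500.0), ("lignite", 1200.0)):
        print(name, particle_velocity(1.0, rho, 0.3, 5e-5, 4.5e-5))
    print(generation_term(0.5, 0.2, bed_temp=900.0, ignition_temp=800.0))
```

With equal radii, the denser particle has the lower net upward velocity, which matches the observation that denser coal tends to settle at the bottom of the bed.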
As a further aspect of the present invention, step 2 is carried out as follows:
Step 21, spatial interpolation of the temperature distribution: the acquired temperature data are interpolated by inverse distance weighting to construct the temperature field in the bed, a three-dimensional temperature distribution map of the bed is built, and the temperature gradient is calculated;
Step 22, combustion efficiency analysis: the combustion efficiency is calculated from the average temperature of the temperature field in the fluidized bed and the combustion reaction rate; the distribution of anthracite and lignite in the bed is obtained from the secondary phase volume fraction equations; and the analysis of the combustion efficiency is refined using a combustion efficiency formula;
Step 23, identification of overheated regions and regions of insufficient combustion: regions whose calculated temperature exceeds a preset normal combustion temperature are identified as overheated, and their boundaries are located by temperature gradient analysis; regions whose temperature is below the ignition temperature of the fuel are identified as regions of insufficient combustion; and, using the particle volume fraction equations, regions in which the anthracite volume fraction is above a preset value and the temperature is below the ignition temperature are identified as regions of insufficient anthracite combustion with serious deposition.
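Steps 21 and 23 can be sketched as follows. The example interpolates the temperature at an arbitrary point from scattered sensor readings by inverse distance weighting and then classifies the point. The function names, the weighting exponent of 2 and the threshold values are assumptions made for illustration.

```python
import math


def idw_temperature(point, sensors, power=2.0):
    """Inverse distance weighted temperature at point (x, y, z).

    sensors is a list of ((x, y, z), temperature) pairs.
    """
    num = 0.0
    den = 0.0
    for pos, temp in sensors:
        d = math.dist(point, pos)
        if d == 0.0:
            return temp  # the point coincides with a sensor
        w = 1.0 / d ** power
        num += w * temp
        den += w
    return num / den


def classify_region(temp, anthracite_fraction, t_max, t_ignition, fraction_limit):
    """Label a location following the rules of step 23."""
    if temp > t_max:
        return "overheated"
    if temp < t_ignition:
        if anthracite_fraction > fraction_limit:
            return "under-burned with anthracite deposition"
        return "under-burned"
    return "normal"


if __name__ == "__main__":
    sensors = [((0.0, 0.0, 0.0), 800.0), ((2.0, 0.0, 0.0), 900.0)]
    t = idw_temperature((1.0, 0.0, 0.0), sensors)
    print(t, classify_region(t, 0.1, t_max=950.0, t_ignition=700.0, fraction_limit=0.3))
```

Evaluating `idw_temperature` on a regular grid of points yields the three-dimensional temperature map from which the gradient is then computed.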
As a further aspect of the present invention, in step 3, the intelligent control of air flow velocity and bed height is carried out as follows:
Step 31, parameter initialization: the initial air flow velocity and initial bed height are set from the best historical operating conditions; the learning rate, exploration rate and discount factor of the reinforcement learning are set; the operating ranges of the air flow velocity and bed height are set; the action adjustment ranges of the air flow velocity and bed height are set; and the trigger conditions for adjustment actions are set;
Step 32, definition of the reinforcement learning model: the combustion state parameters of the state space are defined, including the temperature distribution, air flow velocity, bed height, combustion efficiency and the anthracite and lignite particle volume fractions; a reward function is constructed from the combustion efficiency, the temperature gradient, the uniformity of the particle distribution and a penalty on the three-dimensional volumes of the overheated and under-burned regions; and the three-dimensional volume penalty term is introduced so that, in the Q value update formula, overheated and under-burned regions are reduced while the combustion efficiency is optimized;
Step 33, execution of intelligent control decisions: for the current state, an action is selected with the ε-greedy strategy and the air flow velocity and bed height are adjusted in real time; after the action is performed, the system waits until the combustion state has stabilized, collects a new round of data, updates the combustion state, recalculates every term of the reward function, and updates the Q value using the latest reward and the next state; the cycle of performing an action, obtaining feedback and updating the Q value is repeated, and the optimal adjustment strategy for air flow velocity and bed height is found gradually over multiple rounds of learning.
As a further aspect of the present invention, in step 32, the reward function $R(s_t, a_t)$ is the value assigned to the current state space and action space, and involves the following quantities:
$s_t$ is the state space at the current time; the real-time combustion efficiency and the ideal combustion efficiency; $(x, y, z)$, the spatial coordinates of a temperature measuring point; $a_t$, the action space at the current time, which consists of the adjustments to the air flow velocity and the bed height, each bounded by a limit value set according to actual requirements; $T$, the bed temperature distribution in the fluidized bed boiler; $\nabla T$, the temperature gradient at different locations within the bed, $\nabla T = \frac{\partial T}{\partial x}\mathbf{i} + \frac{\partial T}{\partial y}\mathbf{j} + \frac{\partial T}{\partial z}\mathbf{k}$, where $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ are the unit vectors in the $x$, $y$ and $z$ directions respectively; the volume fraction difference between the anthracite and lignite particles; the volume of the overheated region; the volume of the region of insufficient combustion; and four weight coefficients, for the combustion efficiency term, the temperature gradient term, the particle distribution difference term and the penalty term for the overheated and under-burned regions respectively, each set according to actual requirements.
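One way such a reward could be assembled from the four weighted terms is sketched below. The linear combination, the normalization of the efficiency by the ideal efficiency, the default weights and the function name are all assumptions made for illustration; the specification's own formula may differ.

```python
def reward(eta, eta_ideal, grad_mean, alpha_diff, v_overheat, v_underburn,
           w1=1.0, w2=0.01, w3=0.5, w4=0.2):
    """Hypothetical reward: efficiency term minus three weighted penalties.

    eta, eta_ideal : real-time and ideal combustion efficiency (0..1)
    grad_mean      : mean temperature gradient magnitude in the bed (K/m)
    alpha_diff     : anthracite minus lignite volume fraction
    v_overheat     : volume of the overheated region (m^3)
    v_underburn    : volume of the under-burned region (m^3)
    """
    efficiency_term = w1 * (eta / eta_ideal)
    gradient_penalty = w2 * grad_mean
    distribution_penalty = w3 * abs(alpha_diff)
    volume_penalty = w4 * (v_overheat + v_underburn)
    return efficiency_term - gradient_penalty - distribution_penalty - volume_penalty


if __name__ == "__main__":
    print(reward(0.90, 0.95, 20.0, 0.1, 0.5, 0.5))
    print(reward(0.90, 0.95, 20.0, 0.1, 0.0, 0.0))
```

Because the terms have different units, the weights also act as scaling factors; in practice they would be tuned so that no single term dominates the reward.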
As a further aspect of the present invention, in step 32, the Q value update formula is:
$$Q_{\text{new}}(s_t, a_t) = Q(s_t, a_t) + \alpha \left[ R(s_t, a_t) + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$$
Wherein: $Q_{\text{new}}(s_t, a_t)$ is the updated cumulative expected reward obtainable by executing action $a_t$ in the current state $s_t$; $Q(s_t, a_t)$ is the old Q value; $\alpha$ is the learning rate; $R(s_t, a_t)$ is the immediate reward term; $\gamma$ is the discount factor; and $\max_{a'} Q(s_{t+1}, a')$ is the maximum Q value obtainable by selecting the optimal action $a'$ in the state $s_{t+1}$ at the next time step.
As a further aspect of the present invention, in the Q value update formula of step 32, the action is selected by the $\varepsilon$-greedy policy of the reinforcement learning model: with probability $1 - \varepsilon$ the action with the largest current Q value is selected as the optimal action, $a^{*} = \arg\max_{a} Q(s_t, a)$, where $a$ is an action in the action space and $\varepsilon$ is the exploration rate; with probability $\varepsilon$ an action is selected at random in order to explore new policy space. $\varepsilon$ starts from the initially set exploration rate, so that the system at first selects actions at random to explore different optimization strategies, and it decreases gradually by a preset step as the system learns a better control strategy.
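The update rule and the ε-greedy selection described above correspond to standard tabular Q-learning, which can be sketched as follows. The discrete action set, the step sizes, the default learning rate and discount factor, and the lower bound on the exploration rate are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical discrete actions:
# (change in air flow velocity in m/s, change in bed height in m).
ACTIONS = [(dv, dh) for dv in (-0.1, 0.0, 0.1) for dh in (-0.05, 0.0, 0.05)]


def select_action(q, state, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def update_q(q, state, action, reward, next_state, lr=0.1, gamma=0.9):
    """Tabular Q-learning update; returns the new Q value."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += lr * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def decay_epsilon(epsilon, step=0.01, floor=0.05):
    """Reduce the exploration rate by a preset step, down to an assumed floor."""
    return max(floor, epsilon - step)


if __name__ == "__main__":
    q = defaultdict(float)  # unseen (state, action) pairs start at 0
    print(update_q(q, "s0", ACTIONS[0], reward=1.0, next_state="s1"))
    print(select_action(q, "s0", epsilon=0.0))
```

A continuous combustion state would need to be discretized, or the table replaced by a function approximator, before this update can be applied; the string states here only keep the example short.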
As a further aspect of the present invention, in step 4, the secondary phase volume fraction feedback adjustment is carried out as follows:
Step 41, solving the secondary phase volume fraction equations: the anthracite and lignite secondary phase volume fraction equations are solved by the finite difference method, iterating over a time step, to predict the anthracite particle deposition region and the suspension state of the lignite particles at future time points;
Step 42, analyzing the trend of bed changes: the trend is identified and analyzed from the anthracite particle deposition region and the lignite particle suspension state obtained in step 41; an anthracite deposition region larger than a preset value indicates that the bed height has dropped and particle combustion is insufficient, and a number of suspended particles exceeding a preset value indicates that the air flow velocity is too high or the bed height is unsuitable;
Step 43, adjusting the feedback control parameters: according to the trend of the secondary phase volume fractions, the air flow velocity and the bed height are increased when anthracite particle deposition exceeds a preset value, the air flow velocity is decreased when the quantity of suspended lignite particles exceeds a preset value, and the bed height is decreased when the quantity of suspended lignite particles falls below a preset value, so that the bed height and air flow velocity are adjusted in real time;
Step 44, iterative optimization of the feedback adjustment: the air flow velocity and bed height parameters are adjusted on the basis of feedback, the adjustment range is dynamically optimized using actual operating data, each adjustment is followed by a preset stabilization period before the next adjustment, and the control effect is optimized over multiple iterations with an adaptive learning rate and adjustment amplitude;
Step 45, verifying the adjustment result: after the air flow velocity and bed height have been adjusted, the secondary phase volume fraction equations are solved again to verify the effect of the adjustment, and the combustion efficiency and temperature distribution are monitored in real time.
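Steps 41 and 43 can be illustrated with a one-dimensional sketch. It assumes that each volume fraction obeys a transport equation of the form ∂α/∂t + ∂(αv)/∂z = S − D on a vertical grid with closed bottom and top walls, advanced with an explicit first-order upwind scheme. The equation form, the boundary treatment, the adjustment steps and the function names are assumptions for illustration only. The explicit scheme is stable when |v|·dt/dz ≤ 1.

```python
def step_volume_fraction(alpha, velocity, source, sink, dz, dt):
    """Advance the volume fraction by one explicit upwind time step.

    alpha, source, sink : per-cell lists ordered from bottom to top
    velocity            : uniform particle velocity, upward positive (m/s)
    """
    n = len(alpha)
    # Flux through each cell face; faces 0 and n are the closed bottom and top walls.
    flux = [0.0] * (n + 1)
    for f in range(1, n):
        upwind = alpha[f - 1] if velocity >= 0.0 else alpha[f]
        flux[f] = velocity * upwind
    return [
        alpha[i] - dt / dz * (flux[i + 1] - flux[i]) + dt * (source[i] - sink[i])
        for i in range(n)
    ]


def feedback_adjustment(anthracite_deposit, lignite_suspended,
                        deposit_limit, suspended_high, suspended_low,
                        dv=0.1, dh=0.05):
    """Rules of step 43: returns (velocity change, bed height change)."""
    d_velocity = 0.0
    d_height = 0.0
    if anthracite_deposit > deposit_limit:
        d_velocity += dv  # lift deposited anthracite
        d_height += dh
    if lignite_suspended > suspended_high:
        d_velocity -= dv  # too much lignite carried upward
    elif lignite_suspended < suspended_low:
        d_height -= dh
    return d_velocity, d_height


if __name__ == "__main__":
    zeros = [0.0] * 4
    alpha = [0.1] * 4
    for _ in range(3):  # settling anthracite: negative (downward) velocity
        alpha = step_volume_fraction(alpha, -0.5, zeros, zeros, dz=0.1, dt=0.1)
    print(alpha)
    print(feedback_adjustment(alpha[0], 0.1, 0.2, 0.4, 0.05))
```

With a downward velocity and no source or sink, the scheme conserves the total volume fraction while material accumulates in the bottom cell, which is the deposition trend that step 42 looks for.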
Compared with the prior art, the invention has the following technical effects. Data acquisition and state monitoring capture, in real time, the suspension state, density and volume distribution of the anthracite and lignite particles in the bed together with the bed temperature and air flow velocity, ensuring accurate and timely data. Analysis of the combustion efficiency and temperature distribution identifies regions of insufficient combustion and overheated regions, providing a basis for subsequent adjustment. The reinforcement learning control algorithm dynamically adjusts the bed height and air flow velocity according to historical data and the current state, and can adapt to changes in load and fuel composition. Secondary phase volume fraction feedback adjustment monitors and solves for the distribution and suspension state of the particles in real time, so that the bed height is dynamically optimized and combustion uniformity and efficiency are improved. The feedback control mechanism outputs the optimal bed height and air flow velocity after repeated optimization iterations, achieving more efficient and uniform combustion, reducing the uneven combustion caused by anthracite deposition and overly rapid lignite combustion, and improving fuel utilization and system efficiency. The invention thus addresses the technical problem of adjusting the bed height and air flow velocity in real time during mixed combustion of anthracite and lignite in a fluidized bed boiler.
Detailed Description
The embodiments of the present invention are described in detail below with reference to the accompanying drawings. Based on the embodiments herein, all other technical solutions obtainable by a person of ordinary skill in the art without inventive effort fall within the scope of protection of the present invention.
As shown in FIG. 1, the method for controlling the mixed combustion of multiple coals in the fluidized bed boiler provided by the invention comprises the following steps:
Step 1, data acquisition and state monitoring: the density, volume and suspension state of the anthracite and lignite in the fluidized bed, together with the bed temperature and air flow velocity, are monitored in real time by an intelligent sensor network; secondary phase volume fraction equations are used to represent the distribution of the different particles in the bed; the dynamic distribution of each particle type in the bed is obtained from the real-time sensor data; and the change in the secondary phase volume fractions is determined;
Step 2, combustion efficiency and temperature distribution analysis: the spatial distribution of the bed temperature is measured, the combustion efficiency is analyzed by combining the temperature gradient with the particle volume fractions, and overheated regions and regions of insufficient combustion are identified;
Step 3, intelligent control of air flow velocity and bed height: a control model based on reinforcement learning continuously adjusts the bed height and the air flow velocity according to historical data and the current combustion state;
Step 4, secondary phase volume fraction feedback adjustment: the secondary phase volume fraction equations are solved to obtain the suspension states of the different particles and the trend of bed changes, and the feedback parameters for air flow velocity and bed height are adjusted accordingly; taking into account the different densities and combustion rates of the particles, the bed height is adjusted dynamically in real time using a preset adaptive height adjustment formula, ensuring uniform combustion of the anthracite and lignite;
Step 5, output and feedback control: based on actual operating data, the control process is optimized with a preset optimization control function, and the optimal bed height and air flow velocity are output after repeated iteration and feedback optimization.
According to the invention, data acquisition and state monitoring capture, in real time, the suspension state, density and volume distribution of the anthracite and lignite particles in the bed together with the bed temperature and air flow velocity, ensuring accurate and timely data. Analysis of the combustion efficiency and temperature distribution identifies regions of insufficient combustion and overheated regions, providing a basis for subsequent adjustment. The reinforcement learning control algorithm dynamically adjusts the bed height and air flow velocity according to historical data and the current state, and can adapt to changes in load and fuel composition. Secondary phase volume fraction feedback adjustment monitors and solves for the distribution and suspension state of the particles in real time, so that the bed height is dynamically optimized and combustion uniformity and efficiency are improved. The feedback control mechanism outputs the optimal bed height and air flow velocity after repeated optimization iterations, achieving more efficient and uniform combustion, reducing the uneven combustion caused by anthracite deposition and overly rapid lignite combustion, and improving fuel utilization and system efficiency. The invention thus addresses the technical problem of adjusting the bed height and air flow velocity in real time during mixed combustion of anthracite and lignite in a fluidized bed boiler.
In step 1, the intelligent sensor network comprises sub-nodes, which monitor specified physical quantities, and a master node, which aggregates and analyzes the data and makes decisions. The master node performs anomaly detection on the sub-node data using a Gaussian mixture model; when an anomaly is detected, the master node issues an adaptive adjustment signal based on the data fed back by the sub-nodes, and the working state of each sub-node sensor is adjusted automatically. Communication between the master node and the sub-nodes uses Internet of Things technology. The data acquisition frequency of the sub-nodes is set by an adaptive filter algorithm that predicts the trend of bed changes from historical data; when an abrupt change in sub-node data is detected, the acquisition frequency is automatically raised from the normal acquisition frequency of 100 Hz to a raised acquisition frequency of 500 Hz.
The sub-node data are preprocessed by multi-sensor data fusion, which reduces the influence of a single sensor failure or data deviation on overall system performance and keeps the data stable and consistent. The Gaussian mixture model is used not only for anomaly detection in the master node but also to adjust the detection threshold dynamically according to the statistical characteristics of the physical quantity monitored by each sub-node, so that anomalous data are detected more accurately. The adaptive filter algorithm of the sub-nodes relies not only on historical data trends but also on real-time feedback signals to adjust the weights of the prediction model, improving prediction accuracy and reducing over-sampling and delay. The system includes a sub-node fault detection mechanism: when a sensor node suffers a hardware or communication fault, the system automatically activates redundant sensors or adjusts the acquisition frequency of adjacent nodes to compensate for the missing data. Based on the anomaly detection result, the master node not only adjusts the working state of the sub-node sensors but also optimizes the overall operating strategy of the system, for example by adjusting the bed height and air flow velocity, through decision logic based on a machine learning algorithm, so as to improve combustion efficiency. The master node and the sub-nodes are coordinated by a high-precision time synchronization mechanism, which keeps data acquisition and feedback timely and consistent and avoids data delay or invalidation caused by unsynchronized acquisition. Communication uses a low-power wide-area Internet of Things standard such as LoRa or NB-IoT, so that data transmission between the sub-nodes and the master node remains efficient and stable in large-scale monitoring scenarios.
In step 1, the secondary phase volume fraction equations comprise an anthracite secondary phase volume fraction equation and a lignite secondary phase volume fraction equation, which involve the following quantities:
Anthracite secondary phase volume fraction equation: the secondary phase volume fraction of the anthracite particles; the time variable; the velocity field of the anthracite particles, which is expressed in terms of the air flow velocity, the density of the anthracite particles, the density of the gas stream, the gravitational acceleration, the radius of the anthracite particles and the dynamic viscosity of the gas; the generation term of the anthracite particles, which is expressed in terms of the combustion rate constant of anthracite, the bed temperature, the ignition temperature of anthracite and a step function, such that the generation term becomes active when the bed temperature exceeds the ignition temperature and is 0 otherwise; and the deposition term of the anthracite particles, which is expressed in terms of the height of the anthracite bed layer;
Lignite secondary phase volume fraction equation: the secondary phase volume fraction of the lignite particles; the velocity field of the lignite particles, which is expressed in terms of the density and the radius of the lignite particles; the generation term of the lignite particles, which is expressed in terms of the combustion rate constant of lignite, the bed temperature and the combustion onset temperature of lignite; and the consumption term of the lignite particles, which is expressed in terms of the consumption rate constant of lignite combustion.
Constructing separate secondary phase volume fraction equations for anthracite and lignite addresses the problem of uneven combustion in the fluidized bed boiler. The anthracite and lignite equations each contain their own generation term and deposition or consumption term, so they describe the dynamic distribution and combustion of the different coal particles in the bed. The velocity field expression in the anthracite equation combines factors such as particle density, air flow velocity and bed temperature, so that the combustion rate can be adjusted dynamically according to actual conditions, and the generation term takes effect only once the ignition temperature of the anthracite has been reached, avoiding combustion that starts too early or too late. The lignite equation describes the progress and consumption of lignite combustion through its consumption term and combustion rate constant. Monitoring and regulating the combustion states of anthracite and lignite separately through the secondary phase volume fraction equations makes the temperature and particle distribution in each region of the bed more uniform, reduces the uneven combustion caused by anthracite deposition and overly rapid lignite combustion, and improves combustion efficiency and system stability.
It should be noted that step 2 is carried out as follows:
Step 21, spatial interpolation of the temperature distribution: the acquired temperature data are interpolated by inverse distance weighting to construct the temperature field in the bed, a three-dimensional temperature distribution map of the bed is built, and the temperature gradient is calculated;
Step 22, combustion efficiency analysis: the combustion efficiency is calculated from the average temperature of the temperature field in the fluidized bed and the combustion reaction rate; the distribution of anthracite and lignite in the bed is obtained from the secondary phase volume fraction equations; and the analysis of the combustion efficiency is refined using a combustion efficiency formula;
Step 23, identification of overheated regions and regions of insufficient combustion: regions whose calculated temperature exceeds a preset normal combustion temperature are identified as overheated, and their boundaries are located by temperature gradient analysis; regions whose temperature is below the ignition temperature of the fuel are identified as regions of insufficient combustion; and, using the particle volume fraction equations, regions in which the anthracite volume fraction is above a preset value and the temperature is below the ignition temperature are identified as regions of insufficient anthracite combustion with serious deposition.
Spatially interpolating the acquired temperature data by inverse distance weighting yields a three-dimensional temperature distribution map of the bed, so that the temperature field is constructed more accurately and changes in the temperature gradient are clearly visible. This provides a basis for identifying overheated regions and regions of insufficient combustion. From the average temperature of the temperature field and the combustion reaction rate, combined with the secondary phase volume fraction equations of the anthracite and lignite, the combustion efficiency of different regions is calculated. This makes the combustion efficiency analysis finer, so that the combustion process can be adjusted dynamically and fuel utilization improved. By calculating the temperature gradient, overheated regions that exceed the normal combustion temperature and under-burned regions that have not reached the fuel ignition temperature can be located in the bed. In addition, the secondary phase volume fraction equations allow regions where the anthracite burns insufficiently and is seriously deposited to be identified.
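The temperature gradient mentioned above can be approximated numerically once a temperature field is available. The following sketch uses central differences on an arbitrary scalar field given as a Python callable; the function names, the step size and the linear test field are assumptions for illustration.

```python
import math


def temperature_gradient(field, point, h=0.01):
    """Central-difference gradient of a scalar field at point (x, y, z)."""
    grad = []
    for axis in range(3):
        hi = list(point)
        lo = list(point)
        hi[axis] += h
        lo[axis] -= h
        grad.append((field(tuple(hi)) - field(tuple(lo))) / (2.0 * h))
    return tuple(grad)


def gradient_magnitude(grad):
    """Euclidean norm of the gradient vector."""
    return math.sqrt(sum(g * g for g in grad))


if __name__ == "__main__":
    # Hypothetical field: temperature rising by 50 K per metre of bed height.
    field = lambda p: 800.0 + 50.0 * p[2]
    g = temperature_gradient(field, (0.5, 0.5, 1.0))
    print(g, gradient_magnitude(g))
```

Large gradient magnitudes mark the boundaries between hot and cold regions, which is how the boundaries of an overheated region can be located.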
It should be noted that, in step 3, the intelligent control of air flow velocity and bed height is carried out as follows:
Step 31, parameter initialization: the initial air flow velocity and initial bed height are set from the best historical operating conditions; the learning rate, exploration rate and discount factor of the reinforcement learning are set; the operating ranges of the air flow velocity and bed height are set; the action adjustment ranges of the air flow velocity and bed height are set; and the trigger conditions for adjustment actions are set;
Step 32, definition of the reinforcement learning model: the combustion state parameters of the state space are defined, including the temperature distribution, air flow velocity, bed height, combustion efficiency and the anthracite and lignite particle volume fractions; a reward function is constructed from the combustion efficiency, the temperature gradient, the uniformity of the particle distribution and a penalty on the three-dimensional volumes of the overheated and under-burned regions; and the three-dimensional volume penalty term is introduced so that, in the Q value update formula, overheated and under-burned regions are reduced while the combustion efficiency is optimized;
Step 33, execution of intelligent control decisions: for the current state, an action is selected with the ε-greedy strategy and the air flow velocity and bed height are adjusted in real time; after the action is performed, the system waits until the combustion state has stabilized, collects a new round of data, updates the combustion state, recalculates every term of the reward function, and updates the Q value using the latest reward and the next state; the cycle of performing an action, obtaining feedback and updating the Q value is repeated, and the optimal adjustment strategy for air flow velocity and bed height is found gradually over multiple rounds of learning.
With real-time monitoring and the reinforcement learning algorithm, the air flow velocity and bed height can be adjusted dynamically, keeping the different coal types in a suitable suspension state during combustion and improving combustion uniformity. Initializing the parameters from the best historical operating conditions speeds up learning, and through continued iteration and Q value updates the optimal control strategy is found gradually over multiple rounds of learning, which reduces the effort of manual tuning. The reward function not only optimizes combustion efficiency but also includes a three-dimensional volume penalty term that reduces the overheated and under-burned regions, improving the stability and fuel utilization of the whole system.
In step 32, the formula of the reward function involved is:
Wherein: $s_t$ is the state space at the current time; $\eta$ is the real-time combustion efficiency; $\eta^{*}$ is the ideal combustion efficiency; $(x, y, z)$ are the spatial coordinates of the temperature measuring point; $a_t$ is the action space at the current time, comprising the adjustment $\Delta v$ of the air flow velocity and the adjustment $\Delta h$ of the bed height, the limit values of $\Delta v$ and $\Delta h$ being set according to actual requirements; $R(s_t, a_t)$ is the value of the reward function corresponding to the current state space and action space; $T(x, y, z)$ is the bed temperature distribution in the fluidized bed boiler; $\nabla T$ is the temperature gradient at different locations within the bed, $\nabla T = \frac{\partial T}{\partial x}\mathbf{i} + \frac{\partial T}{\partial y}\mathbf{j} + \frac{\partial T}{\partial z}\mathbf{k}$, where $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ are the unit vectors in the $x$, $y$ and $z$ directions respectively; $\Delta\phi$ is the volume fraction difference between the anthracite and lignite particles; $V_{\mathrm{oh}}$ is the volume of the overheated region; $V_{\mathrm{uc}}$ is the volume of the under-burned region; and $w_1$, $w_2$, $w_3$ and $w_4$ are the weight coefficients of the combustion efficiency term, the temperature gradient term, the particle distribution difference term and the overheated and under-burned region penalty term respectively, set according to actual requirements.
The reward function combines the combustion efficiency, the temperature gradient, the particle distribution difference and the volume penalty for the overheated and under-burned regions. Because these four quantities are weighed together, the air flow velocity and the bed height of the fluidized bed boiler are optimized against the combustion process as a whole, and the bias that a single optimization target would cause is avoided. The reinforcement learning model continuously adjusts the air flow velocity and the bed height according to the real-time combustion state, so the combustion process is optimized dynamically and the adaptability of the system is improved. The penalty terms on the temperature gradient and on the volume fraction difference reduce local overheating and insufficient combustion, so that combustion is more uniform and the overall combustion efficiency is improved.
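As a non-limiting sketch, one reward that is consistent with the four terms described above is R = w1·(η/η*) − w2·mean|∇T| − w3·Δφ − w4·(V_oh + V_uc). This specific algebraic form is an assumption and is not stated in the text. The function name `reward`, the temperature thresholds `t_hot` and `t_low` used to delimit the overheated and under-burned regions, and the use of the mean absolute difference of the two volume fraction fields for Δφ are likewise hypothetical choices made for the example.

```python
import numpy as np


def reward(temp, spacing, eta, eta_ideal, phi_anth, phi_lig, weights,
           t_hot=950.0, t_low=750.0):
    """Assumed reward: efficiency term minus three weighted penalty terms.

    temp      : 3-D array of bed temperatures on a regular grid
    spacing   : (dx, dy, dz) grid spacing in metres
    phi_anth, phi_lig : 3-D arrays of particle volume fractions
    weights   : (w1, w2, w3, w4)
    t_hot, t_low : hypothetical thresholds delimiting the penalised regions
    """
    w1, w2, w3, w4 = weights
    # Temperature gradient components along x, y and z.
    gx, gy, gz = np.gradient(temp, *spacing)
    grad_mag = np.sqrt(gx ** 2 + gy ** 2 + gz ** 2)
    # Three-dimensional volumes of the overheated and under-burned regions.
    cell_volume = float(np.prod(spacing))
    v_overheat = np.count_nonzero(temp > t_hot) * cell_volume
    v_underburn = np.count_nonzero(temp < t_low) * cell_volume
    # Difference between the anthracite and lignite distributions.
    dphi = float(np.mean(np.abs(phi_anth - phi_lig)))
    return float(w1 * eta / eta_ideal
                 - w2 * grad_mag.mean()
                 - w3 * dphi
                 - w4 * (v_overheat + v_underburn))
```

Under these assumptions a perfectly uniform bed at the ideal efficiency scores exactly w1, and any hot spot lowers the reward through both the gradient term and the volume penalty.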
In step 32, the Q value update formula is:
$$Q_{\mathrm{new}}(s_t, a_t) = Q(s_t, a_t) + \lambda \left[ R(s_t, a_t) + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$$
Wherein: $Q_{\mathrm{new}}(s_t, a_t)$ is the updated value of the cumulative expected reward obtainable by executing action $a_t$ in the current state $s_t$; $Q(s_t, a_t)$ is the old Q value; $\lambda$ is the learning rate; $R(s_t, a_t)$ is the instant reward term; $\gamma$ is the discount factor; and $\max_{a'} Q(s_{t+1}, a')$ is the maximum Q value obtainable by selecting the optimal action $a'$ in the state $s_{t+1}$ at the next time step.
By introducing the instant reward and the maximum Q value of the next state, the Q value updating formula realizes dynamic learning and optimization of the combustion control process of the fluidized bed boiler. The instant reward term ensures that the system receives feedback on the combustion efficiency, temperature uniformity and related states immediately after each action, while the discount factor brings the maximum reward obtainable later into the Q value update, which strengthens the attention of the system to long-term benefit and avoids short-sighted decisions. Meanwhile, the learning rate $\lambda$ controls the speed of the Q value update, so that the system gradually converges toward an optimal control strategy. Compared with the prior art, the method adaptively optimizes the air flow velocity and the bed height, achieves efficient and stable combustion control under complex combustion states through continuously iterated Q value updates, and improves the self-learning capability and control precision of the system.
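For illustration, the standard tabular Q-learning update, with the learning rate written as `lam` to match the symbol lambda used above, can be sketched as follows. The function name `q_update`, the dictionary-based Q table and the default parameter values are assumptions of the example rather than details given in the text.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             lam=0.1, gamma=0.9):
    """Apply one Q-learning update in place and return the new Q value.

    q       : mapping from (state, action) to Q value, missing keys read as 0
    actions : every action available in next_state
    lam     : learning rate, gamma : discount factor (illustrative defaults)
    """
    # Maximum Q value obtainable from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Instant reward plus discounted future value.
    target = reward + gamma * best_next
    # Move the old estimate toward the target at the learning rate.
    q[(state, action)] += lam * (target - q[(state, action)])
    return q[(state, action)]


q_table = defaultdict(float)
q_table[("s1", "raise_velocity")] = 2.0
new_q = q_update(q_table, "s0", "raise_velocity", 1.0, "s1",
                 ["raise_velocity", "lower_velocity"], lam=0.5, gamma=0.9)
```

In this usage the target is 1.0 + 0.9 × 2.0 = 2.8, and with a learning rate of 0.5 the previously zero entry moves halfway to it.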
Note that, in the Q value update formula of step 32, the action $a_t$ is selected by the $\varepsilon$-greedy policy of the reinforcement learning model: with probability $1-\varepsilon$, the action with the largest current Q value is selected as the optimal action, $a_t = \arg\max_{a} Q(s_t, a)$, where $a$ is an action in the action space and $\varepsilon$ is the exploration rate; with probability $\varepsilon$, an action is selected at random to try, so as to explore new regions of the policy space. Starting from the initially set exploration rate, the system randomly selects actions to explore different optimization strategies, and the exploration rate is decreased by a preset step length as the system gradually learns a better optimization control strategy.
The $\varepsilon$-greedy strategy balances exploration and exploitation in the reinforcement learning process. With probability $1-\varepsilon$, the action with the largest current Q value is selected, so that the system optimizes the air flow velocity and the bed height under the best control strategy learned so far and the combustion efficiency is improved. With probability $\varepsilon$, an action is selected at random, which allows the system to explore new regions of the policy space and avoids becoming trapped in a local optimum. As the system gradually learns a better control strategy, the exploration rate decreases, so that decisions rely increasingly on the learned optimal strategy. Compared with the prior art, this stepwise strategy optimization is more flexible and adapts to complex combustion environments and dynamically changing working conditions, thereby achieving finer control and improving the self-learning capability and overall operating efficiency of the system.
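A minimal sketch of this action selection, with a stepwise decay of the exploration rate, is given below. The names `select_action` and `decay_epsilon` are hypothetical, and the lower bound `floor` on the exploration rate is an assumption added so that the rate never becomes negative; the text only states that the rate decreases by a preset step length.

```python
import random


def select_action(q, state, actions, epsilon, rng=random):
    """Epsilon-greedy choice over a Q table keyed by (state, action)."""
    if rng.random() < epsilon:
        # Explore: try a randomly chosen action.
        return rng.choice(actions)
    # Exploit: take the action with the largest current Q value.
    return max(actions, key=lambda a: q.get((state, a), 0.0))


def decay_epsilon(epsilon, step=0.01, floor=0.05):
    """Lower the exploration rate by a preset step, never below `floor`."""
    return max(floor, epsilon - step)


q_table = {("s0", "raise_velocity"): 1.4, ("s0", "lower_velocity"): 0.2}
choice = select_action(q_table, "s0",
                       ["raise_velocity", "lower_velocity"], epsilon=0.0)
```

With `epsilon=0.0` the selection is purely greedy, which is the limiting behaviour the strategy approaches as learning proceeds.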
In the step 4, the specific implementation process of the secondary phase volume fraction feedback adjustment is as follows:
Step 41, solving the secondary phase volume fraction equations: the anthracite secondary phase volume fraction equation and the lignite secondary phase volume fraction equation are solved by the finite difference method, iterating with a time step $\Delta t$, to predict the anthracite particle deposition region and the suspension state of the lignite particles at a future time point;
Step 42, analyzing the change trend of the bed: the change trend of the bed is identified and analyzed based on the anthracite particle deposition region and the suspension state of the lignite particles obtained in step 41, wherein an anthracite particle deposition region larger than a preset value indicates that the bed height has dropped and particle combustion is insufficient, and a number of suspended particles rising above a preset value indicates that the air flow velocity is too high or the bed height is unsuitable;
Step 43, adjusting the feedback control parameters: according to the change trend of the secondary phase volume fractions, the air flow velocity and the bed height are increased when the deposition of anthracite particles rises above a preset value, the air flow velocity is decreased when the suspended quantity of lignite particles rises above a preset value, and the bed height is decreased when the suspended quantity of lignite particles falls below a preset value, so that the bed height and the air flow velocity are adjusted in real time;
Step 44, iteratively optimizing the feedback adjustment: the air flow velocity and bed height parameters are adjusted based on the feedback, the adjustment range is dynamically optimized in combination with actual operating data, a preset stabilization period is allowed to elapse after each adjustment before the next adjustment is made, and the control effect is optimized over multiple iterations of feedback with an adaptive learning rate and adjustment amplitude;
Step 45, verifying the adjustment result: after the air flow velocity and the bed height have been adjusted, the adjustment effect is verified by solving the secondary phase volume fraction equations again, and the combustion efficiency and the temperature distribution are monitored in real time.
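The finite difference prediction of step 41 can be illustrated with a deliberately simplified model. The sketch below assumes a one-dimensional volume fraction profile along the bed height, a constant particle velocity `velocity` (negative for settling, positive for entrainment) and no source terms, and uses a first-order explicit upwind scheme; none of these modelling choices, nor the names `upwind_step`, `predict` and `zone_mean`, are taken from the text.

```python
import numpy as np


def upwind_step(alpha, velocity, dz, dt):
    """One explicit upwind step of d(alpha)/dt + u * d(alpha)/dz = 0.

    alpha must be a float array; the inflow boundary cell is held fixed.
    """
    c = velocity * dt / dz  # Courant number
    if abs(c) > 1.0:
        raise ValueError("time step violates the CFL condition |u|*dt/dz <= 1")
    new = alpha.copy()
    if velocity >= 0:
        # Upward transport: difference against the cell below.
        new[1:] = alpha[1:] - c * (alpha[1:] - alpha[:-1])
    else:
        # Downward transport: difference against the cell above.
        new[:-1] = alpha[:-1] - c * (alpha[1:] - alpha[:-1])
    return new


def predict(alpha, velocity, dz, dt, n_steps):
    """Iterate the scheme n_steps times to reach a future time point."""
    for _ in range(n_steps):
        alpha = upwind_step(alpha, velocity, dz, dt)
    return alpha


def zone_mean(alpha, start, stop):
    """Mean volume fraction over cells start..stop-1 (index 0 is the bed bottom)."""
    return float(alpha[start:stop].mean())
```

Comparing `zone_mean` over the bottom cells of the predicted anthracite profile, or over the top cells of the predicted lignite profile, with a preset value gives the deposition and suspension indicators that steps 42 and 43 act on.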
The secondary phase volume fraction feedback regulation scheme solves a secondary phase volume fraction equation through a finite difference method, accurately predicts the deposition area of anthracite particles and the suspension state of lignite particles at a future time point, and provides a scientific basis for real-time regulation of the fluidized bed boiler. When the change trend of the bed is analyzed, the problems of excessive deposition of anthracite particles or unstable suspension of lignite can be accurately identified, and the air flow speed and the bed height are dynamically optimized by adjusting feedback control parameters, so that the more uniform and efficient combustion process is ensured. Through iterative optimization of feedback adjustment, the system can self-adaptively learn the optimal control strategies under different load conditions, verify the adjustment effect by solving the secondary phase volume fraction equation again after adjustment, ensure that the adjustment measures are effective and gradually improve the control precision. Compared with the prior art, the method obviously improves the combustion uniformity and efficiency of the fluidized bed boiler, reduces the phenomena of anthracite deposition and insufficient lignite combustion, has higher control flexibility and adaptability, and ensures that the system is more stable and efficient in performance under complex working conditions.
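The threshold rules of steps 42 and 43 can likewise be sketched as a single function. The name `feedback_adjust`, the step sizes `dv` and `dh`, the operating ranges and every threshold value below are hypothetical placeholders chosen for the example; in practice they would be set from actual operating data as step 44 describes.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return min(max(value, low), high)


def feedback_adjust(deposition, suspension, velocity, height, *,
                    dep_max, susp_max, susp_min,
                    dv=0.1, dh=0.05,
                    v_range=(1.0, 6.0), h_range=(0.5, 2.0)):
    """Return the adjusted (velocity, height) following the rules of step 43.

    deposition : predicted anthracite fraction in the deposition region
    suspension : predicted lignite fraction in the suspended region
    """
    if deposition > dep_max:
        # Excess anthracite deposition: raise air velocity and bed height.
        velocity += dv
        height += dh
    if suspension > susp_max:
        # Too much lignite carried upward: lower the air velocity.
        velocity -= dv
    elif suspension < susp_min:
        # Too little lignite in suspension: lower the bed height.
        height -= dh
    # Keep both set points inside their permitted operating ranges.
    return clamp(velocity, *v_range), clamp(height, *h_range)


new_v, new_h = feedback_adjust(0.30, 0.10, 4.0, 1.0,
                               dep_max=0.20, susp_max=0.15, susp_min=0.05)
```

In this usage only the deposition threshold is exceeded, so both the air flow velocity and the bed height are raised by one step.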
The foregoing is merely illustrative of the present application, and the present application is not limited thereto; any variation or substitution that a person skilled in the art could readily conceive falls within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Finally, the foregoing description is merely illustrative of specific embodiments of the present invention and is not intended to limit the invention, and any modifications, equivalents, improvements or others that fall within the spirit and principles of the invention are intended to be included within the scope of the invention.