CN118134729B - Intelligent forecasting method and system for urban flood control - Google Patents
- Publication number
- CN118134729B (CN202410561250.6A)
- Authority
- CN
- China
- Prior art keywords
- flood
- data
- forecast
- rainfall
- historical
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
- G06Q50/265—Personal security, identity or safety
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01W—METEOROLOGY
- G01W1/00—Meteorology
- G01W1/10—Devices for predicting weather conditions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/0895—Weakly supervised learning, e.g. semi-supervised or self-supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A10/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
- Y02A10/40—Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Tourism & Hospitality (AREA)
- Biomedical Technology (AREA)
- Environmental & Geological Engineering (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Computer Security & Cryptography (AREA)
- Ecology (AREA)
- Biodiversity & Conservation Biology (AREA)
- Educational Administration (AREA)
- Atmospheric Sciences (AREA)
- Environmental Sciences (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
An intelligent forecasting method for urban flood control comprises the following steps: collecting urban flood control data, and extracting rainfall data, historical flood data and non-rainfall flood driving factors from the urban flood control data; constructing four flood forecasting models, namely a statistical model, a physical model, a machine learning model and a historical similarity model; for each forecasting model, extracting data from the urban flood control data and the similar rainfall data to construct a training set, and training each forecasting model in sequence; outputting a flood forecast from each trained forecasting model; and combining the four flood forecasts by a weighting method to obtain a comprehensive flood forecast, i.e., the intelligent forecast for urban flood control. The invention improves the accuracy and timeliness of flood forecasting.
Description
Technical Field
The invention relates to an intelligent forecasting method for urban flood control.
Background
In the field of smart cities and disaster prevention and mitigation, urban flood control has long been an important topic. Traditional urban flood forecasting methods rely mainly on historical experience and statistical models. With the development of artificial intelligence, neural networks are increasingly applied to urban flood control. For example, convolutional neural networks are used to extract rainfall features, which are combined with terrain, pipe-network and other factors to predict waterlogging risk; other techniques integrate weather radar data, hydrological models and machine learning algorithms to issue flood risk warnings at block-level resolution.
Although many intelligent urban flood forecasting methods exist, several problems still need to be solved: first, the driving factors of flood formation are not analyzed in depth, so the interpretability and applicability of forecasting models need to be strengthened; second, multi-source heterogeneous data are not sufficiently fused and exploited, so data quality and real-time performance need to be improved; third, the uncertainty assessment and dynamic correction mechanisms of forecasting models are incomplete, so the credibility of forecast results needs to be improved; fourth, refined forecasting schemes for complex urban environments are lacking, so the early-warning capability for extreme rainstorms and waterlogging is insufficient.
Therefore, research and innovation are needed.
Summary of the Invention
Purpose of the invention: to provide an intelligent forecasting method for urban flood control that solves the above problems of the prior art. In another aspect, an intelligent forecasting system for urban flood control is provided.
Technical solution: according to one aspect of the present application, an intelligent forecasting method for urban flood control is provided, comprising the following steps:
Step S1: collecting urban flood control data, and extracting rainfall data, historical flood data and non-rainfall flood driving factors from the urban flood control data; calculating the synchrony between each non-rainfall flood driving factor and the flood process, and constructing a set of key non-rainfall flood driving factors; based on the historical flood data and the corresponding rainfall data, calculating the mapping relationship between flood events and rainfall data to identify similar rainfall and form similar rainfall data;
Step S2: constructing four flood forecasting models, namely a flood forecasting statistical model, a flood forecasting physical model, a flood forecasting machine learning model and a flood forecasting historical similarity model;
Step S3: for each forecasting model, extracting data from the urban flood control data and the similar rainfall data to construct a training set, and training each forecasting model in sequence;
Step S4: outputting a flood forecast from each trained forecasting model; combining the four flood forecasts by a weighting method to obtain a comprehensive flood forecast, i.e., the intelligent forecast for urban flood control.
According to another aspect of the present application, an intelligent forecasting system for urban flood control is provided, comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the processor, and the instructions, when executed by the processor, implement the intelligent forecasting method for urban flood control described in any of the above technical solutions.
Beneficial effects: the intelligent forecasting method for urban flood control provides new technical support and solutions for urban flood control work, and improves a city's flood control capability and its efficiency in responding to floods. The related technical effects are described in detail below in connection with the specific embodiments.
Brief Description of the Drawings
FIG. 1 is a flow chart of the present invention.
FIG. 2 is a flow chart of step S1 of the present invention.
FIG. 3 is a flow chart of step S3 of the present invention.
FIG. 4 is a flow chart of step S4 of the present invention.
Detailed Description
As shown in FIG. 1, the following technical solution is proposed.
According to one aspect of the present application, an intelligent forecasting method for urban flood control comprises the following steps:
Step S1: collecting urban flood control data, and extracting rainfall data, historical flood data and non-rainfall flood driving factors from the urban flood control data; calculating the synchrony between each non-rainfall flood driving factor and the flood process, and constructing a set of key non-rainfall flood driving factors; based on the historical flood data and the corresponding rainfall data, calculating the mapping relationship between flood events and rainfall data to identify similar rainfall and form similar rainfall data;
Step S2: constructing four flood forecasting models, namely a flood forecasting statistical model, a flood forecasting physical model, a flood forecasting machine learning model and a flood forecasting historical similarity model;
Step S3: for each forecasting model, extracting data from the urban flood control data and the similar rainfall data to construct a training set, and training each forecasting model in sequence;
Step S4: outputting a flood forecast from each trained forecasting model; combining the four flood forecasts by a weighting method to obtain a comprehensive flood forecast, i.e., the intelligent forecast for urban flood control.
Traditional flood forecasting focuses mainly on rainfall and ignores the influence of non-rainfall driving factors such as the underlying surface, reservoir operation and tidal backwater. By introducing mathematical measures such as the Fréchet distance and mutual information, the method quantifies the relationship between each driving factor and the flood process, and screens out the key non-rainfall factors that are highly synchronous with floods. This quantitative driving-factor analysis helps reveal the mechanism of flood formation and the causal characteristics of typical flood events, deepens the understanding of the physical mechanism, and provides a solid basis for subsequent model construction and scenario analysis.
By analyzing the correspondence between historical rainfall and floods with two complementary statistical indicators, the Pearson correlation coefficient and the DTW (dynamic time warping) distance, rainfall events whose characteristics resemble the current rainfall are identified from historical data and used as reference samples for subsequent forecasting. This similarity analysis compensates for the limitations of a single physical model and uses the empirical knowledge in historical data to assist forecasting, improving the interpretability and confidence of the forecast. In addition, the dual-indicator screening considers both the correlation of rainfall amounts and the similarity of the rainfall time distribution, which keeps the selected samples representative and provides high-quality training data for subsequent model integration.
Compared with a single model, this solution combines several modeling paradigms, namely statistical learning, physical mechanism, data-driven learning and similarity reasoning, to describe the flood evolution process from different perspectives. The statistical model focuses on extracting historical regularities from the data, the physical model focuses on simulating the runoff generation and concentration mechanism, the machine learning model is good at mining association patterns in complex data, and the similarity model focuses on reasoning from analogous cases. This multi-perspective, multi-path integration framework retains the strengths of traditional methods while adopting newer artificial intelligence techniques, so that the models complement one another and the robustness and adaptability of the forecast are improved.
Urban flood forecasting faces challenges such as rapidly changing rainfall and frequent changes to the underlying surface, which place high demands on the real-time and dynamic behavior of the forecasting model. To address this, the solution adopts a rolling update strategy: a moving time window continuously incorporates the latest monitoring data into the training samples, and the model is dynamically corrected with measured data from the validation period. The solution also introduces an adversarial verification mechanism, which tests the robustness of the model with artificially constructed extreme-scenario data and adjusts the model parameters accordingly. This combination of incremental learning and active verification lets the forecasting model keep evolving and improving, so that it can adapt to a changing external environment and maintain high forecast accuracy and reliability.
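The rolling update described above can be sketched as an index generator for a moving time window. This is only an illustration under stated assumptions: the function name `rolling_windows` and the parameters `train_len`, `valid_len` and `step` are hypothetical, and the patent does not specify window sizes or how retraining is triggered.

```python
import numpy as np

def rolling_windows(n_samples, train_len, valid_len, step):
    """Yield (train_idx, valid_idx) index arrays for a moving time window.

    Hypothetical helper: the training window always ends where the
    validation window begins, so newer observations enter training
    as the window advances by `step` samples.
    """
    start = 0
    while start + train_len + valid_len <= n_samples:
        train_idx = np.arange(start, start + train_len)
        valid_idx = np.arange(start + train_len, start + train_len + valid_len)
        yield train_idx, valid_idx
        start += step

# Example: 10 time steps, train on 4, validate on the next 2, advance by 2.
windows = list(rolling_windows(n_samples=10, train_len=4, valid_len=2, step=2))
for train_idx, valid_idx in windows:
    print(train_idx.tolist(), valid_idx.tolist())
```

Each forecasting model would be refitted (or incrementally updated) on `train_idx` and corrected against the measurements at `valid_idx`; how that correction is applied is left open here, as it is in the text.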
Because the models differ in principle, structure, parameters and training data, their forecasts will inevitably show some deviations and inconsistencies. Extracting the best combined forecast from these inconsistent results is a major challenge in ensemble learning. To this end, the solution adopts either game theory or Bayesian model averaging: model weights are optimized by setting a Nash-equilibrium objective function, or the probability distribution over the weights is adjusted dynamically according to real-time forecast performance. In addition, the solution applies techniques such as Monte Carlo dropout to assess the uncertainty of the forecast through random perturbation, and uses indicators such as confidence intervals to quantify forecast credibility. These ensemble strategies and uncertainty analyses exploit the complementarity of the models, reduce the one-sidedness of any single model, and make the forecast more robust and balanced.
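As a minimal sketch of the Bayesian model averaging option, the weight update below treats each model's latest forecast error as Gaussian. The Gaussian likelihood, the `sigma` parameter and the function names `bma_update` and `combine` are assumptions made for illustration; the patent does not state which likelihood is used.

```python
import numpy as np

def bma_update(prior, forecasts, observed, sigma=1.0):
    """One Bayesian update of the model weights.

    posterior_k is proportional to prior_k * N(observed | forecast_k, sigma^2).
    The Gaussian error model and sigma are illustrative assumptions.
    """
    prior = np.asarray(prior, dtype=float)
    err = np.asarray(forecasts, dtype=float) - observed
    log_post = np.log(prior) - 0.5 * (err / sigma) ** 2
    log_post -= log_post.max()          # shift for numerical stability
    post = np.exp(log_post)
    return post / post.sum()            # normalise so the weights sum to 1

def combine(weights, forecasts):
    """Posterior-weighted average of the individual forecasts."""
    return float(np.dot(weights, forecasts))

# Four models (statistical, physical, machine learning, similarity),
# equal prior weights, hypothetical water-level forecasts in metres.
prior = [0.25, 0.25, 0.25, 0.25]
forecasts = [10.0, 12.0, 9.0, 15.0]
weights = bma_update(prior, forecasts, observed=10.0)
print(weights, combine(weights, forecasts))
```

Repeating `bma_update` at every step, with the previous posterior as the new prior, gives the dynamically adjusted weights the paragraph refers to.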
In summary, constructing the set of key driving factors reveals the flood formation mechanism and lays the foundation for forecasting; similarity analysis makes full use of historical experience and improves interpretability; multi-model integration combines methods from several disciplines and strengthens robustness; incremental adversarial learning adapts to a dynamic environment and maintains timeliness; Bayesian integration and uncertainty quantification support decision-making and improve credibility. The method addresses the following technical problems of the prior art: quantitative characterization and screening of the influence of non-rainfall factors; similarity measurement and matching in complex data environments; integrated optimization and automatic updating of multi-source heterogeneous models; robustness and adaptability of forecasting models under dynamically changing scenarios; and interpretability, credibility and uncertainty control of forecast results.
According to one aspect of the present application, step S1 further comprises:
Step S11: collecting urban flood control data, including rainfall data, historical flood data and non-rainfall flood driving factors, wherein the non-rainfall flood driving factors include non-rainfall meteorological data;
Step S12: extracting each historical flood process from the historical flood data, together with the non-rainfall flood driving factors corresponding to that flood process, and constructing a set of non-rainfall flood driving factors for each historical flood process;
Step S13: for each historical flood process, calculating the synchrony between each non-rainfall flood driving factor and the flood process using the Fréchet distance, and constructing the set of key non-rainfall flood driving factors for each historical flood process;
Step S14: based on the historical flood data and the corresponding rainfall data, calculating the mapping relationship between flood events and rainfall, identifying similar rainfall, and forming similar rainfall data.
According to one aspect of the present application, step S13 further comprises:
Step S13a: collecting the data of each non-rainfall flood driving factor in turn to obtain a data sequence for each non-rainfall flood driving factor;
Step S13b: for each historical flood process, calculating the Fréchet distance and the mutual information between each non-rainfall flood driving factor and the historical flood process, and treating a non-rainfall flood driving factor whose Fréchet distance is below a threshold and whose mutual information is above a threshold as having synchrony with the flood process above the threshold;
Step S13c: for each historical flood process, extracting in sequence the non-rainfall flood driving factors whose synchrony is above the threshold, and constructing the set of key non-rainfall flood driving factors for each historical flood process.
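Steps S13a to S13c can be sketched as follows. This is a hedged illustration, not the patented implementation: it uses the discrete Fréchet distance on min-max normalised series (normalisation is an assumption, added because factors and flood levels have different units), a histogram estimate of mutual information (the `bins` value is an assumption), and hypothetical names `discrete_frechet`, `mutual_information` and `key_factors`.

```python
import numpy as np

def discrete_frechet(p, q):
    """Discrete Frechet distance between two 1-D series (dynamic programming)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    n, m = len(p), len(q)
    d = np.abs(p[:, None] - q[None, :])      # pairwise point distances
    ca = np.empty((n, m))
    ca[0, 0] = d[0, 0]
    for i in range(1, n):
        ca[i, 0] = max(ca[i - 1, 0], d[i, 0])
    for j in range(1, m):
        ca[0, j] = max(ca[0, j - 1], d[0, j])
    for i in range(1, n):
        for j in range(1, m):
            best_prev = min(ca[i - 1, j], ca[i - 1, j - 1], ca[i, j - 1])
            ca[i, j] = max(best_prev, d[i, j])
    return float(ca[-1, -1])

def mutual_information(x, y, bins=8):
    """Histogram estimate of mutual information, in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0                              # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def minmax(x):
    """Scale a series to [0, 1]; an assumed preprocessing step."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)

def key_factors(flood, factors, max_frechet, min_mi):
    """Keep factors with Frechet distance below and mutual information above threshold."""
    flood_n = minmax(flood)
    keep = []
    for name, series in factors.items():
        fd = discrete_frechet(minmax(series), flood_n)
        mi = mutual_information(series, flood)
        if fd < max_frechet and mi > min_mi:
            keep.append(name)
    return keep

# Hypothetical flood hydrograph and two candidate driving factors.
flood = [0, 1, 4, 9, 4, 1, 0]
factors = {"tide": [0, 2, 8, 18, 8, 2, 0], "wind": [3, 1, 2, 3, 1, 2, 3]}
print(key_factors(flood, factors, max_frechet=0.2, min_mi=0.5))
```

The two thresholds are placeholders; in practice they would have to be calibrated on the historical flood set.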
According to one aspect of the present application, step S14 further comprises:
Step S14a: based on the historical flood processes and the corresponding rainfall data, setting a Pearson correlation coefficient threshold and a DTW distance threshold;
Step S14b: for each historical flood process, constructing the mapping relationship between the historical flood process and the rainfall data using the Pearson correlation coefficient and the DTW distance respectively;
Step S14c: calculating the Pearson correlation coefficient and the DTW distance between all rainfall data and the historical flood processes, and identifying as similar rainfall any rainfall whose correlation coefficient is greater than the Pearson threshold and whose DTW distance is less than the DTW threshold.
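The dual-threshold screening of steps S14a to S14c might look like the sketch below. It assumes that all series are sampled on the same grid and have equal length (required for the Pearson coefficient), and it compares a reference series against candidate rainfall series; which series plays the reference role, and the names `dtw_distance` and `similar_rainfall`, are assumptions for illustration.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def similar_rainfall(reference, candidates, min_r, max_dtw):
    """Return ids of candidates with Pearson r > min_r AND DTW distance < max_dtw."""
    similar = []
    for event_id, series in candidates.items():
        r = np.corrcoef(reference, series)[0, 1]   # Pearson correlation
        if r > min_r and dtw_distance(reference, series) < max_dtw:
            similar.append(event_id)
    return similar

# Hypothetical hourly rainfall (mm): event A resembles the reference, B does not.
reference = [0, 2, 5, 2, 0]
candidates = {"A": [0, 2, 6, 2, 0], "B": [5, 2, 0, 2, 5]}
print(similar_rainfall(reference, candidates, min_r=0.8, max_dtw=2.0))
```

The quadratic-time DTW loop is adequate for short event series; for long records a windowed or library implementation would be a more practical choice.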
In this embodiment, traditional flood forecasting relies mainly on rainfall and hydrological elements and ignores meteorological factors such as temperature, humidity and wind speed. In fact, these non-rainfall factors influence the water vapor cycle and the underlying surface conditions, and thereby the runoff generation and concentration process of the catchment, so they are important contributors to flood formation. This step introduces the concept of non-rainfall driving factors and includes them as candidate features in the subsequent analysis. Using multiple data sources together helps to understand and characterize the causes of floods from a more complete perspective and usefully extends the data dimensions. In addition, the data cover several professional fields, including rainfall, hydrology and meteorology, and several forms, including measurements, statistics and forecasts, which provides systematic, high-quality and diverse data support for modeling and analysis.
Typical flood events are extracted from the historical data, and the relevant non-rainfall influencing factors are extracted around each flood, forming event-based driving factor sets. Unlike common time-series modeling, this step builds training samples from "flood events": each sample consists of one complete flood process and its influencing factors. Organizing the data by event respects the continuity and integrity of flood evolution and avoids mechanically cutting the record into time slices, while also highlighting the individual characteristics of each flood, which makes differences between events easier to mine. Moreover, extracting driving factors around flood events follows the causal logic of hydrological processes and helps to describe the intrinsic connection between each factor and the flood, so it is a mechanism-oriented sample construction strategy. This event-driven approach overcomes the limitations of traditional time-driven methods and provides a data perspective better matched to how floods occur, making subsequent analysis and modeling more targeted.
The Fréchet distance and mutual information are used to quantify the synchrony between non-rainfall factors and the flood process. The Fréchet distance measures the similarity of two time series and is sensitive to the shape of the series, so it is well suited to describing how consistently a factor and the flood evolve over time. Mutual information measures the dependence between two random variables and is sensitive to nonlinear relationships, so it is well suited to describing the dependence between the values of a factor and of the flood. Combining the two measures screens the key factors that best represent flood variation along two dimensions, temporal consistency and value dependence: the first captures how well the factor matches the shape of the flood hydrograph, and the second captures how strongly the factor is associated with the magnitude of flood variation. This is a dual synchrony characterization method based on spatiotemporal characteristics, and it provides more reliable and complete feature support for subsequent modeling and analysis. In addition, filtering for highly synchronous factors with thresholds removes redundant information and reduces the complexity of subsequent analysis.
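For reference, the two measures named above have the following standard definitions. The patent text does not give formulas, so the discrete form of the Fréchet distance and the symbols $P$, $Q$, $L$, $X$ and $Y$ are assumptions chosen for sampled time series.

```latex
% Discrete Frechet distance between a sampled factor series P = (p_1, ..., p_n)
% and a sampled flood series Q = (q_1, ..., q_m). L ranges over couplings:
% sequences of index pairs from (1,1) to (n,m) in which each index
% either stays the same or advances by one at every step.
\delta_{dF}(P, Q) = \min_{L} \; \max_{(i,j) \in L} \lvert p_i - q_j \rvert

% Mutual information between the (discretised) factor X and flood variable Y.
I(X; Y) = \sum_{x} \sum_{y} p(x, y) \, \log \frac{p(x, y)}{p(x) \, p(y)}
```

A small $\delta_{dF}$ means the two curves can be traversed together without ever being far apart, and $I(X;Y) = 0$ exactly when the two variables are independent.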
One of the technical problems to be solved is how to quickly find, among massive historical rainfall data, the group most similar to the current rainfall and use it as reference samples to guide subsequent forecasting. This step does not simply compare rainfall amounts; it considers both the numerical correlation of the rainfall and the similarity of its time distribution. The Pearson correlation coefficient measures the strength of the linear correlation between two sets of rainfall data, while the DTW distance measures the shape similarity of two rainfall time-course curves. The two indicators describe rainfall similarity from the perspectives of static values and dynamic change respectively, and together cover the key characteristics of the rainfall. In addition, setting a threshold for each indicator and using "and" logic to select rainfall events that satisfy both numerical correlation and time-course similarity further improves the quality of the similar samples, making them more representative and comparable. This similar-rainfall identification method applies similarity reasoning and supplements the sample information through analogy, which compensates for the shortcomings of purely data-driven methods and improves the interpretability and reliability of the forecast.
In summary, this step introduces non-rainfall driving factors to extend the data dimensions and deepen the understanding of flood mechanisms; builds samples from flood events, which respects the integrity of the flood process and conforms to hydrological regularities; applies the Fréchet distance and mutual information to quantify factor-flood synchrony from several angles and provide reliable features; and combines the Pearson correlation coefficient and the DTW distance to characterize rainfall similarity using similarity reasoning. This step addresses the following technical problems: how to select, from multi-source data, the driving factors most valuable for flood forecasting; how to design a sample organization that matches how floods occur; how to quantitatively evaluate the correlation and synchrony between driving factors and the flood process; and how to quickly find the most similar rainfall events in historical data as references.
According to one aspect of the present application, step S3 further comprises:
Step S31: extracting the flood processes, and dividing the urban flood control data, rainfall data and non-rainfall flood driving factors of each flood process, in chronological order, into K consecutive segments, where K is a natural number greater than 1; at the initial stage, the current flood process is given by the four flood forecasting models;
Step S32: taking each time segment as one training step and, for each segment in sequence, extracting the non-rainfall flood driving factors, the rainfall data and the corresponding historical flood processes, calculating the Fréchet distance between the current flood process and the historical flood data, and screening out similar flood processes; forecasting with the four flood forecasting models and outputting a comprehensive flood forecast, then using the subsequent course of the similar flood processes as a validation set to validate the four flood forecasting models; repeating until the four trained flood forecasting models are obtained;
Step S33: performing adversarial verification of the models at a preconfigured interval, constructing adversarial samples by adding perturbations to the input non-rainfall flood driving factor data, and testing the predictive ability of the models under extreme conditions; analyzing the vulnerability of the models to the adversarial samples, and correcting and optimizing the models accordingly to improve the robustness of flood forecasting under adverse conditions.
According to one aspect of the present application, step S32 further comprises assigning weights to the four flood forecasting models during the training of each step:
Step S32a: using a game-theoretic method with Nash equilibrium as the coordination objective, assigning weights to the flood forecasts of the four flood forecasting models, and combining the forecasts according to these weights to obtain a comprehensive flood forecast; or,
integrating the four flood forecasting models by Bayesian model averaging, wherein a prior probability distribution is set; the posterior probability of each model is dynamically updated through Bayes' formula according to the model's real-time prediction performance; and the predictions of the four models are averaged, weighted by their posterior probabilities, to obtain the comprehensive flood forecast;
Step S32b: comparing the comprehensive flood forecast with the historical flood data that actually occurred for that flood process, and adjusting the weight values based on the difference between the two;
Step S32c: applying the adjusted weight values to the flood forecasts of the four flood forecasting models to obtain a comprehensive flood forecast;
Step S32d: quantifying the uncertainty of the comprehensive flood forecast, wherein Monte Carlo dropout is used to randomly drop model neurons and the prediction is repeated multiple times to obtain a probability distribution of the flood forecast;
Step S32e: calculating the mean and confidence interval of the flood forecast distribution and outputting a probabilistic flood forecast that includes the uncertainty level; and evaluating the reliability of the flood forecast, with the width of the confidence interval serving as the indicator of the magnitude of uncertainty.
In this embodiment, the training data are constructed with a rolling time window. Traditional data-driven models usually split the data randomly into a training set and a test set, ignoring the temporal order of the samples. This step instead takes the dynamic evolution of floods into account: by moving the analysis time window, each flood process is divided into several consecutive stages, each containing a period of historical data and the future data to be predicted. This time-sequential organization retains the way a flood evolves in its different stages while preserving the continuity of the whole process, and so is closer to the actual application scenario of flood forecasting. In addition, the method lets the model produce one forecast per stage, which gives flood control decision-making more dynamic and timely forecast information. Introducing the integrated forecast of the four types of models at the initial stage quickly provides a baseline forecast and a good starting point for subsequent training, in the manner of a transfer learning strategy.
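The staged, time-ordered organization described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names `split_into_segments` and `rolling_steps`, the near-equal segment lengths, and the sample water-level values are all assumptions made for the example. It shows only how one flood process is cut into K consecutive segments and how each training step pairs the data seen so far with the next segment to be predicted.

```python
def split_into_segments(series, k):
    """Split one flood-event series into k consecutive, near-equal segments."""
    if k < 2 or k > len(series):
        raise ValueError("k must satisfy 2 <= k <= len(series)")
    base, extra = divmod(len(series), k)
    segments, start = [], 0
    for i in range(k):
        # The first `extra` segments take one additional sample each.
        end = start + base + (1 if i < extra else 0)
        segments.append(series[start:end])
        start = end
    return segments


def rolling_steps(series, k):
    """Yield (history, target) pairs: every segment seen so far, then the next one."""
    segments = split_into_segments(series, k)
    for i in range(1, k):
        history = [value for seg in segments[:i] for value in seg]
        yield history, segments[i]


# Hypothetical water levels for a single flood process (illustrative values only).
stage = [1.0, 1.2, 1.9, 3.1, 4.0, 3.6, 2.8, 2.0, 1.5, 1.3]
steps = list(rolling_steps(stage, 4))
for history, target in steps:
    print(len(history), "observed samples ->", target)
```

Because the split never shuffles samples, every target segment lies strictly after its history, which is the property the embodiment contrasts with a random train/test split.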
This embodiment also addresses how to achieve incremental updating and adaptive correction of model parameters. Flood forecasting faces challenges such as shifting data distributions and evolving understanding of mechanisms, so the model must be able to learn continuously and improve itself. A complete training-forecasting-validation workflow is built on the staged data: the model is trained and corrected once at each time step, and the subsequent real data are fed back to the model so that it can promptly capture new changes in how floods evolve and steadily reduce forecast deviation. In terms of training data sources, several kinds of heterogeneous data are fused, including non-rainfall influencing factors, rainfall data and similar historical floods, so that the model can learn the underlying mechanisms of flood formation comprehensively. In terms of training effect, using the subsequent course of similar floods as the validation set allows the model's predictive ability to be evaluated objectively, with the aim of better extrapolation to real floods. This mode of forecasting while training, with rolling updates and continuous validation, gives the model the ability to keep evolving; it improves the dynamic character of the forecast and enhances its interpretability, and is a modeling strategy tailored to local conditions.
This embodiment also uses game theory and Bayesian model averaging to optimize the integration weights of the multiple models. By setting Nash equilibrium as the objective, the game-theoretic method can reach a Pareto optimum of the multi-model forecast from a dynamic, non-cooperative perspective, and this equilibrium helps improve the robustness of the forecast. Bayesian model averaging uses real-time forecast performance and adjusts the model weights through posterior probability updates, so that the integrated result adaptively approaches the best-performing model; this adaptivity helps improve the accuracy of the forecast. The core of both integration strategies is a mutual evaluation mechanism among the models that dynamically strengthens the models that forecast well and weakens those that do not, thereby maximizing overall forecasting performance. This way of selecting and fusing models is important for improving the adaptability of the forecasting system in complex and changeable flood scenarios.
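As a hedged illustration of the Bayesian model averaging branch, the sketch below updates four model weights after one observation and then combines the next forecasts. The source does not specify the likelihood; a Gaussian likelihood with a fixed spread `sigma` is assumed here purely for the example, and the function names and numeric values are hypothetical.

```python
import math


def bma_update(weights, forecasts, observed, sigma=0.5):
    """One Bayes update: posterior_i is proportional to prior_i * N(observed | forecast_i, sigma^2).

    The Gaussian normalizing constant is the same for every model and cancels
    when the weights are renormalized, so it is omitted.
    """
    likelihoods = [math.exp(-0.5 * ((observed - f) / sigma) ** 2) for f in forecasts]
    unnormalized = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # Every likelihood underflowed; fall back to the prior weights.
        return list(weights)
    return [u / total for u in unnormalized]


def combine(weights, forecasts):
    """Posterior-weighted average of the individual model forecasts."""
    return sum(w * f for w, f in zip(weights, forecasts))


# Uniform prior over the four models (statistical, physical, ML, similarity).
prior = [0.25, 0.25, 0.25, 0.25]
previous_forecasts = [3.0, 3.4, 2.6, 4.1]   # hypothetical peak stages in metres
posterior = bma_update(prior, previous_forecasts, observed=3.3)

next_forecasts = [3.2, 3.5, 2.9, 4.0]
combined = combine(posterior, next_forecasts)
print([round(w, 3) for w in posterior], round(combined, 3))
```

Repeating `bma_update` at every training step, with the posterior of one step used as the prior of the next, gives the dynamic weight adjustment described in the text.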
Finally, the comprehensive forecast of the current stage is compared with the flood process that subsequently occurs to obtain the forecast deviation, and the weight parameters of each model are adjusted accordingly. This is in effect a supervised learning process: by feeding the real outcome back to the model, the model learns from its errors, calibrates itself, and moves steadily toward the optimal state. The adjusted model weights are then reapplied to the current stage to generate an updated comprehensive forecast. This dynamic feedback mechanism gives the model the ability to correct itself continuously; it reduces forecast deviation and lets the model adapt to new trends in flood evolution so that the forecast remains valid. The idea draws on modern control theory and is a typical closed-loop optimization strategy, which matters for improving the adaptivity and robustness of the forecasting model.
This embodiment further introduces probabilistic forecasting and uncertainty analysis. Traditional flood forecasting mostly uses deterministic models that output a single fixed forecast value, ignoring the randomness of the flood itself and the uncertainty of the model. This step applies techniques such as Monte Carlo dropout: the model structure is randomly perturbed and the prediction is repeated many times, yielding a probability distribution of the forecast. This shift from deterministic to probabilistic forecasting respects the inherent uncertainty of flood phenomena and provides a quantitative indicator of model reliability. The mean and confidence interval of the forecast distribution give flood control decision-making richer reference information: the mean reflects the expected level of the forecast, while the confidence interval reflects its credibility. The width of the confidence interval indicates the magnitude of the uncertainty, and its change over time reveals the trend in model reliability. Expressing uncertainty in this way goes beyond the traditional deterministic approach, makes the forecast more complete and transparent, and provides an important basis for risk assessment and management.
In some embodiments, it is recognized that even a well-trained model will inevitably encounter extreme and abnormal situations in practice. This step artificially constructs adversarial non-rainfall driving factor data that deliberately challenge and mislead the model, in order to test its predictive ability under adverse conditions. This adversarial validation helps find the model's weak points and reveals the key factors that may affect forecast quality. Based on the vulnerability analysis, the model structure or parameters can be adjusted as the situation requires to improve resistance to interference. In addition, including adversarial samples in training acts as data augmentation, letting the model learn more extreme patterns and strengthening the extrapolation ability of the forecast. This mechanism of active validation and targeted optimization safeguards the robustness and generalization of the model and is important for the practicality and reliability of the forecasting system. The approach draws on the idea of adversarial learning from the field of artificial intelligence.
In summary: the rolling time window conforms to the dynamic evolution of floods and improves forecast timeliness; the training-forecasting-validation workflow achieves incremental model updating and adaptive correction and improves forecast continuity; the combination of game theory and Bayesian model averaging optimizes the model integration strategy and improves forecast adaptability; feedback of the forecast deviation achieves model self-correction and improves forecast accuracy; Monte Carlo dropout quantifies forecast uncertainty and improves forecast transparency; and adversarial validation and optimization strengthen model robustness and improve forecast reliability. These techniques address the following technical problems: how to establish a model training paradigm that conforms to the dynamic evolution of floods; how to achieve continuous evolution and self-improvement of model parameters; how to weigh the forecasts of multi-source heterogeneous models; how to reduce forecast deviation and improve prediction accuracy; how to characterize the inherent uncertainty of forecasts; and how to evaluate and improve model robustness.
According to one aspect of the present application, step S4 further comprises:
Step S41: running each of the four trained flood forecasting models to obtain their respective flood forecasts;
Step S42: combining the forecasts by a weighting method to obtain the final comprehensive flood forecast, namely the intelligent forecast for urban flood control.
According to one aspect of the present application, step S41 further comprises:
Step S41a: extracting the flood data and rainfall data that have occurred in the current period, together with the corresponding non-rainfall flood driving factors, and inputting them into the statistical flood forecasting model to obtain the flood forecast of the statistical model;
Step S41b: extracting the flood data and rainfall data that have occurred in the current period, together with terrain data, land use data and the urban layout plan, and inputting them into the physical flood forecasting model to obtain the flood forecast of the physical model;
Step S41c: extracting the flood data and rainfall data that have occurred in the current period, together with non-rainfall meteorological data, terrain data, land use data and the urban layout plan, and inputting them into the machine learning flood forecasting model to obtain the flood forecast of the machine learning model;
Step S41d: extracting the flood data and rainfall data that have occurred in the current period, together with the historical similar flood data corresponding to the current rainfall data, and inputting them into the historical-similarity flood forecasting model to obtain the flood forecast of the historical-similarity model.
According to one aspect of the present application, step S42b further comprises:
Step S42b1: extracting and classifying the flood type according to the comprehensive flood forecast and, for each flood type, calculating the degree of similarity between the flood process and the historical flood processes;
Step S42b2: calculating, from the classification result and the similarity measures, the similarity probability distribution between the comprehensively forecast flood and each class of historical floods;
Step S42b3: adjusting the weight values according to the similarity probability distribution of the comprehensively forecast flood, fusing the different similarity measures by weighting, and correcting the forecast to obtain the final forecast result.
According to one aspect of the present application, step S42b1 further comprises:
classifying floods on the basis of historical flood data to obtain flood types, including slow-rise slow-recession floods, steep-rise steep-fall floods, multi-peak floods and gentle floods;
performing cluster analysis on the historical flood events and the comprehensively forecast flood process to obtain the central flood process of each class;
calculating the similarity between each flood process and the central flood process of each class, and classifying accordingly, wherein the DTW distance is used for slow-rise slow-recession floods; the Euclidean distance is used for steep-rise steep-fall floods; a Euclidean distance weighted by flood peak features is used for multi-peak floods; and an area similarity coefficient is used for gentle floods.
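Two of the four type-specific measures named above have standard definitions and can be sketched directly. The example below is a minimal, unoptimized version of the DTW distance (with absolute difference as the local cost, an assumption) and the Euclidean distance. The peak-feature weighted distance and the area similarity coefficient are not defined in the text, so they are deliberately left out rather than guessed. The hydrograph values are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance with |a_i - b_j| as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def euclidean_distance(a, b):
    """Point-by-point Euclidean distance; both series must have equal length."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# A slow-rise hydrograph and the same shape arriving one time step later.
centre = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
delayed = [0.0, 0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
print(dtw_distance(centre, delayed))   # the time shift is absorbed by the warping
```

DTW tolerates shifts and stretching along the time axis, which suits slowly evolving floods, whereas the Euclidean distance compares values at identical time steps and is therefore sensitive to the timing of a sharp peak.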
In this embodiment, the input features required by each model are extracted specifically for that model. The four types of models differ in modeling approach and in the data they need: the statistical model mainly uses time series such as rainfall and flood data and establishes the forecast relationship through correlation and causality analysis; the physical model relies more on geographic information such as terrain and land use and obtains its forecast through mechanistic derivation and numerical simulation; the machine learning model draws on multi-source heterogeneous data covering meteorology, hydrology, geography and planning and forecasts through data mining and pattern recognition; and the similarity model focuses on historical flood records and forecasts through analogy and similarity-based reasoning. This step takes the characteristics of each model into account and selects the feature subset that best reflects each model's strengths, which is a differentiated data preparation strategy. It meets the modeling needs of the different models while avoiding interference from irrelevant or redundant data, so that each model can forecast as effectively as possible. The data types also span several specialist fields, including rainfall, floods, meteorology, terrain, land use and urban layout.
This embodiment also addresses how to weigh the forecasts of multiple models to form a comprehensive and reliable final forecast. Because the models differ in their basic assumptions, structural frameworks and parameter settings, their forecasts often differ as well and may deviate substantially from one another. Extracting a consistent comprehensive forecast from these seemingly contradictory results is a common challenge in multi-model integration. To address it, this step provides a model correction method based on similarity analysis and probability fusion. First, the current flood is assigned to a typical class according to the comprehensive forecast, and its similarity to the historical floods of that class is calculated. This uses similarity-based reasoning: comparison with similar historical floods gives a preliminary judgment of whether the comprehensive forecast is reasonable. Next, the probability distribution of the comprehensive forecast over the flood classes is obtained from the similarity analysis, and the weight coefficient of each model is adjusted dynamically on that basis. This combines probability theory with model integration: the probability distribution expresses how likely each flood type is and serves as a reference for optimizing the model combination, which enhances the adaptability of the forecast. Finally, the adjusted weights are used to compute a weighted average of the multi-model forecasts, and the result is error-corrected to obtain the final corrected forecast. Giving equal attention to fusion and correction uses the probability weights as a guide while also correcting forecast deviation, thereby improving the reliability of the comprehensive forecast as far as possible.
In addition, this embodiment adopts a differentiated similarity measure for each type of flood. Traditional identification of similar floods mostly uses a fixed distance measure, such as the Euclidean or Manhattan distance, ignoring the differences in genesis and evolution among flood types. To address this, this step provides a class-specific similarity analysis. First, cluster analysis of historical flood data is used to identify typical flood classes, namely slow-rise slow-recession, steep-rise steep-fall, multi-peak and gentle floods, and to determine the central flood process of each class; this follows the idea of semi-supervised learning. Then the similarity between the comprehensively forecast flood and each central flood is calculated, and the class of the forecast flood is determined from it. Different measures are used for different classes: for slow-rise slow-recession floods, the DTW distance emphasizes the overall similarity of the time series; for steep-rise steep-fall floods, the Euclidean distance emphasizes the local similarity of the waveform; for multi-peak floods, weighting by flood peak features emphasizes the similarity of key features; and for gentle floods, the area similarity coefficient emphasizes the similarity of the cumulative volume. This class-specific analysis respects the characteristics of each flood type and reflects the guiding role of hydrological knowledge in data mining. It fuses qualitative experience with quantitative calculation, characterizes flood similarity more precisely, and provides a more reliable reference for subsequent model optimization.
Overall: model input features are extracted per model, which meets differentiated needs and lets each model play to its strengths; similarity analysis is applied to make a preliminary judgment of forecast reasonableness, drawing on similarity-based reasoning and historical experience; model weights are adjusted dynamically using a probability distribution, which optimizes the integration strategy; weighted averaging and error correction are given equal attention, coordinating weight fusion with deviation correction to form a reliable forecast; and differentiated similarity measures are used for different flood types, fusing qualitative experience with quantitative analysis. The following technical problems are addressed: how to account for the characteristics of different models and extract the best input features; how to use historical flood experience to make a preliminary judgment of forecast reasonableness; how to weigh the forecasts of different models to form a consistent forecast; how to reduce forecast deviation and improve prediction accuracy; and how to make similarity analysis more targeted and interpretable.
In the present application, the non-rainfall driving factors include: the Northern Hemisphere Polar Vortex Area Index (NHPVA), Northern Hemisphere Polar Vortex Intensity Index (NHPVI), Northern Hemisphere Polar Vortex Central Longitude Index (NHPVCLON), Northern Hemisphere Polar Vortex Central Latitude Index (NHPVCLAT), Western Pacific Subtropical High Area Index (WPSHA), Western Pacific Subtropical High Intensity Index (WPSHI), Western Pacific Subtropical High Ridge Position Index (WPSHRP), Western Pacific Subtropical High Western Ridge Point Index (WPSHWRP), Western Pacific Subtropical High Northern Boundary Position Index (WPSHNBP), Eurasian Zonal Circulation Index (EZC), Eurasian Meridional Circulation Index (EMC), East Asian Trough Position Index (EATP), East Asian Trough Intensity Index (EATI), Nino1+2 region sea surface temperature index, Nino3 region sea surface temperature index, Nino4 region sea surface temperature index, Nino3.4 region sea surface temperature index, trough (T), ridge (R), upper-level jet (HJ), shear line (SL), low vortex (VO), cyclone (CL), front (FS), typhoon (TY), cold air (CA) and low-level jet (LJ).
It should be noted that, during training, the weights are adjusted by the game-theoretic method or by Bayesian model averaging so that the forecasting models fit better. When the trained models are used, the current flood process can be compared with historical flood processes for similarity, weights are then assigned on the basis of that similarity, and the forecast is checked and compared against the similar floods to give more accurate results.
In another embodiment of the present application, the Fréchet distance and the mutual information are calculated as follows.
Calculation of the Fréchet distance: given the time series of a flood driving factor and of a flood process, $X=\{x_1, x_2, \ldots, x_n\}$ and $Y=\{y_1, y_2, \ldots, y_n\}$, where $n$ is the sequence length, the Fréchet distance is calculated in the following steps.
Calculate the Euclidean distance from each element $x_i$ of $X$ to every element of $Y$ to obtain the distance matrix $D_{X\to Y}$:
$D_{X\to Y}(i,j)=\sqrt{(x_i-y_j)^2},\quad i,j=1,2,\ldots,n$;
For each element $x_i$ of $X$, find the nearest element $y_{j_i}$ of $Y$, that is, $j_i=\arg\min_j D_{X\to Y}(i,j)$;
Calculate the distance from $x_i$ to $y_{j_i}$: $d_{X\to Y}(i)=D_{X\to Y}(i,j_i)$.
Similarly, calculate the Euclidean distance matrix $D_{Y\to X}$ from each element $y_j$ of $Y$ to every element of $X$, find the nearest neighbor $x_{i_j}$ of $y_j$ in $X$, and calculate the distance $d_{Y\to X}(j)=D_{Y\to X}(j,i_j)$.
Take the maximum over both directions to obtain the final Fréchet distance: $d_F(X,Y)=\max\left(\max_i d_{X\to Y}(i),\ \max_j d_{Y\to X}(j)\right)$. The smaller the Fréchet distance, the more similar the shapes of the two time series.
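The steps above translate almost line for line into code. The sketch below follows the procedure exactly as written; the function names and sample values are illustrative assumptions. One observation is worth stating: because each element is matched to its nearest neighbor regardless of time order, the quantity computed by this procedure coincides with the symmetric Hausdorff distance between the two value sets, whereas the classical discrete Fréchet distance additionally requires both series to be traversed monotonically in time.

```python
def directed_distance(xs, ys):
    """max_i d(i), where d(i) is the distance from xs[i] to its nearest element of ys.

    For scalar samples, sqrt((x - y)**2) reduces to abs(x - y).
    """
    return max(min(abs(x - y) for y in ys) for x in xs)


def nearest_neighbour_distance(xs, ys):
    """Maximum of the two directed distances, as in the final step of the text."""
    return max(directed_distance(xs, ys), directed_distance(ys, xs))


# Hypothetical normalized driving-factor and flood series.
factor = [1.0, 2.0, 3.0]
flood = [1.0, 2.0, 6.0]

print(directed_distance(factor, flood))            # 3.0 is 1.0 away from 2.0
print(directed_distance(flood, factor))            # 6.0 is 3.0 away from 3.0
print(nearest_neighbour_distance(factor, flood))   # larger of the two directions
```

In practice the two series would normally be scaled to a common range first, since a driving-factor index and a water level are measured in different units; that preprocessing is an assumption and is not shown.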
Mutual information measures the mutual dependence between two random variables. For discrete random variables $X$ and $Y$, the mutual information is defined as: $I(X;Y)=\sum_{x\in X}\sum_{y\in Y}P(x,y)\log_2\dfrac{P(x,y)}{P(x)P(y)}$;
where $P(x)$ and $P(y)$ are the marginal probability distributions of $X$ and $Y$ respectively, and $P(x,y)$ is their joint probability distribution. The mutual information is calculated in the following steps.
Discretize the time series $X$ and $Y$ of the flood driving factor and the flood process. Methods such as equal-width binning or equal-frequency binning can be used to convert the continuous variables into discrete ones.
Calculate the empirical probability distributions $P(x)$ and $P(y)$ of the discretized $X$ and $Y$; that is, count the occurrences of each discrete value and divide by the total number of samples.
Calculate the empirical joint probability distribution $P(x,y)$ of $X$ and $Y$: for each pair of discrete values $(x,y)$, count how often the pair occurs together and divide by the total number of samples. Substitute these distributions into the mutual information formula to calculate $I(X;Y)$. The larger the mutual information, the stronger the dependence between the two variables.
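The three steps can be sketched with equal-width binning and empirical frequency counts. This is a minimal illustration under stated assumptions: the helper names and the default of four bins are choices made for the example, and the text equally allows equal-frequency binning.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each value to a bin index in 0..n_bins-1 using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so it is clamped to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(xs, ys, n_bins=4):
    """Empirical mutual information in bits between two equal-length series."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    bx, by = equal_width_bins(xs, n_bins), equal_width_bins(ys, n_bins)
    n = len(xs)
    count_x, count_y, count_xy = Counter(bx), Counter(by), Counter(zip(bx, by))
    mi = 0.0
    for (i, j), c in count_xy.items():
        p_joint = c / n
        p_x, p_y = count_x[i] / n, count_y[j] / n
        mi += p_joint * math.log2(p_joint / (p_x * p_y))
    return mi


# A series compared with itself carries its full entropy; here four equally
# likely bins give log2(4) = 2 bits.
levels = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
print(mutual_information(levels, levels))
```

Only bin pairs that actually occur enter the sum, which is why no special handling of zero-probability cells is needed. The estimate depends on the number of bins, so the same binning should be used when comparing different driving factors.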
Uncertainty quantification of the comprehensive flood forecast: a portion of the neurons in the trained model are randomly dropped, and the uncertainty of the result is estimated by repeating the prediction multiple times. For the comprehensive flood forecasting model, Monte Carlo dropout (MC-DO) is implemented in the following steps.
Add a dropout layer after each of several fully connected layers of the comprehensive flood forecasting model. Dropout means that, during training, the outputs of a portion of the neurons are randomly set to zero with a given probability $p$.
Fit the model with dropout to the training data until the model converges. During training, a portion of the neurons are randomly dropped each time a batch of data passes through the model. This is equivalent to training an ensemble of models that share parameters.
In the test phase, perform $T$ forward passes with random dropout for each input sample $x$. In each pass a portion of the neurons in the model are randomly dropped, producing one prediction $y^*_t$. This is equivalent to sampling $T$ models from the ensemble to make predictions.
Step 4: Compute the mean and variance of the T predictions $\{y^*_1, y^*_2, \ldots, y^*_T\}$:
$$E(y^*) = \frac{1}{T}\sum_{t=1}^{T} y^*_t, \qquad \mathrm{Var}(y^*) = \frac{1}{T}\sum_{t=1}^{T}\left(y^*_t - E(y^*)\right)^2$$
The mean reflects the expected prediction, and the variance reflects its uncertainty.
Step 5: Assuming the predictions follow a normal distribution, compute the confidence interval at confidence level $1-\alpha$:
$$\left[E(y^*) - z_{\alpha/2}\sqrt{\mathrm{Var}(y^*)},\; E(y^*) + z_{\alpha/2}\sqrt{\mathrm{Var}(y^*)}\right]$$
where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution. The confidence interval gives the uncertainty range of the prediction. Randomly dropping neurons approximates a Bayesian neural network, so the model uncertainty is estimated at low computational cost. The wider the confidence interval, the more uncertain the model's prediction.
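The MC-DO procedure can be illustrated on a toy network. The sketch below is hypothetical throughout: the one-hidden-layer ReLU network, the weight layout, the inverted-dropout scaling by 1/(1-p), and all function names are assumptions chosen to keep the example self-contained. Here `alpha` is treated as the significance level, so the interval has nominal coverage 1 - alpha.

```python
import math
import random
from statistics import NormalDist


def forward_with_dropout(x, w_hidden, w_out, p, rng):
    """One stochastic pass through a toy one-hidden-layer ReLU network.

    Each hidden unit is zeroed with probability p; surviving units are
    scaled by 1/(1-p) (inverted dropout, an assumed convention).
    """
    hidden = []
    for weights in w_hidden:
        activation = max(0.0, sum(w * xi for w, xi in zip(weights, x)))
        keep = rng.random() >= p
        hidden.append(activation / (1.0 - p) if keep else 0.0)
    return sum(w * h for w, h in zip(w_out, hidden))


def mc_dropout_interval(x, w_hidden, w_out, p=0.2, T=200, alpha=0.05, seed=0):
    """Mean, variance and normal-theory interval over T dropout passes."""
    rng = random.Random(seed)
    preds = [forward_with_dropout(x, w_hidden, w_out, p, rng) for _ in range(T)]
    mean = sum(preds) / T
    var = sum((y - mean) ** 2 for y in preds) / T    # 1/T form, as in Step 4
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)      # upper alpha/2 quantile
    half_width = z * math.sqrt(var)
    return mean, var, (mean - half_width, mean + half_width)
```

With p = 0 every pass is identical, so the variance is zero and the interval collapses to a point; increasing p widens the interval, which mirrors the statement that a wider interval indicates a less certain prediction.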
Adversarial validation is used to evaluate the robustness of a machine learning model. Its goal is to generate adversarial samples with small perturbations that cause the model's predictions to change substantially, thereby exposing the model's vulnerabilities. The general steps of adversarial validation are as follows:
Select a trained flood forecasting model f_θ, where θ denotes the model parameters.
Randomly select a sample x from the test set and compute the model's predicted flood label y* = f_θ(x).
Construct an adversarial objective function J(x') such that, when the input is the perturbed sample x', the model's prediction differs substantially from the original prediction y*. Commonly used objective functions include the cross-entropy loss and the maximum-confidence loss.
Maximize the objective function with an optimization algorithm such as gradient ascent to generate the adversarial perturbation δ:
$$\max_{\delta}\; J(x+\delta) \quad \text{s.t.} \quad \|\delta\|_p \le \varepsilon$$
where $\|\delta\|_p$ denotes the $L_p$ norm and $\varepsilon$ bounds the size of the perturbation. Usually $p = \infty$ is chosen, in which case $\|\delta\|_\infty \le \varepsilon$ means that the absolute value of each element of the perturbation does not exceed $\varepsilon$.
Add the perturbation to the original sample to obtain the adversarial sample: $x_{adv} = x + \delta$.
Feed the adversarial sample into the model to obtain its predicted label y*_adv = f_θ(x_adv), and compare y*_adv with y*. A large difference indicates that the model is sensitive to perturbations and is not sufficiently robust.
Repeat the above steps on a large number of samples to evaluate the model's average adversarial robustness. Robustness metrics can include prediction accuracy on adversarial samples and average prediction loss. Analyze the model's adversarial weak points, then improve the model structure or training method to increase robustness. Common methods include adversarial training, gradient regularization, and input preprocessing.
By simulating potential adversarial attacks, adversarial validation proactively identifies model weaknesses and helps build a safer and more reliable flood forecasting system.
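A minimal sketch of the perturbation search under an L-infinity bound is given below. It is not the patent's procedure: it uses projected sign-gradient ascent with a finite-difference gradient, a squared-deviation objective J(x') = (f(x') - y*)^2, and a random starting perturbation (because the gradient of this J is zero at δ = 0). The function names and the step size eps/4 are assumptions made for the example.

```python
import random


def numeric_gradient(fn, x, h=1e-5):
    """Central-difference gradient of a scalar function fn at point x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((fn(up) - fn(down)) / (2.0 * h))
    return grad


def sign(v):
    return (v > 0) - (v < 0)


def linf_adversarial_sample(model, x, eps, steps=10, seed=0):
    """Search for x_adv = x + delta with max|delta_i| <= eps that maximizes J."""
    rng = random.Random(seed)
    y_star = model(x)  # original prediction y*

    def objective(x_prime):
        # Assumed objective: squared deviation from the original prediction.
        return (model(x_prime) - y_star) ** 2

    # Random start inside the eps-ball, since grad J vanishes at delta = 0.
    delta = [rng.uniform(-eps, eps) for _ in x]
    step = eps / 4.0
    for _ in range(steps):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        grad = numeric_gradient(objective, x_adv)
        # Ascent step on each coordinate, then projection back onto the ball.
        delta = [max(-eps, min(eps, di + step * sign(gi)))
                 for di, gi in zip(delta, grad)]
    return [xi + di for xi, di in zip(x, delta)]
```

For a linear model the largest achievable change in the prediction is eps times the sum of absolute weights, which gives a quick check that the search reaches the boundary of the allowed perturbation set.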
The process of step S42b1 is as follows:
According to their causes, development, and characteristics, floods can be divided into four types: slow-rising and slow-receding floods, steep-rising and steep-falling floods, multi-peak floods, and gentle floods. An adapted similarity calculation method is designed for each of these four types:
Perform cluster analysis on the historical flood events and the comprehensive forecast flood processes to obtain the central flood process of each flood type. Common clustering algorithms such as K-means or hierarchical clustering can be used.
Compute the similarity between each flood process and the central flood process of each type, and classify the flood process accordingly. The similarity is computed as follows:
For slow-rising and slow-receding floods, the dynamic time warping (DTW) distance is used:
Step 1: Normalize the water level series $Q_1=\{q_{11}, q_{12}, \ldots, q_{1n}\}$ and $Q_2=\{q_{21}, q_{22}, \ldots, q_{2m}\}$ of the two flood processes to remove the influence of units and scale.
Step 2: Construct the distance matrix $D \in \mathbb{R}^{n\times m}$ between the two series, where $D(i,j)$ is the Euclidean distance between $q_{1i}$ and $q_{2j}$.
Step 3: Find the optimal matching path from $D(1,1)$ to $D(n,m)$ on the distance matrix that minimizes the cumulative distance along the path. The path can be found by dynamic programming:
$$M(i,j) = D(i,j) + \min\{M(i-1,j-1),\; M(i-1,j),\; M(i,j-1)\}$$
where $M(i,j)$ is the minimum cumulative distance from $(1,1)$ to $(i,j)$.
Step 4: The cumulative distance $M(n,m)$ of the optimal matching path is the DTW distance between the two flood processes and is used to measure their similarity.
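The four DTW steps map directly onto a short dynamic program. The sketch below assumes min-max normalization for Step 1 (the text does not name a normalization) and uses the absolute difference as the pointwise distance, which is the Euclidean distance for scalar water levels; the function names are illustrative.

```python
def min_max_normalize(series):
    """Scale a series to [0, 1] (assumed normalization for Step 1)."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # constant series map to all zeros
    return [(v - lo) / span for v in series]


def dtw_distance(q1, q2):
    """DTW distance via M(i,j) = D(i,j) + min of the three predecessors."""
    n, m = len(q1), len(q2)
    inf = float("inf")
    # M has an extra border row and column so that M[0][0] = 0 seeds the path.
    M = [[inf] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(q1[i - 1] - q2[j - 1])  # D(i, j)
            M[i][j] = d + min(M[i - 1][j - 1], M[i - 1][j], M[i][j - 1])
    return M[n][m]
```

Because the warping path may repeat elements, two series that differ only in how long they dwell at a level (for example `[0, 1, 2]` and `[0, 1, 1, 2]`) have distance zero, which is the property that makes DTW suitable for slowly evolving flood hydrographs of different durations.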
For steep-rising and steep-falling floods, the Euclidean distance is used:
Step 1: Normalize the water level series $Q_1$ and $Q_2$ of the two flood processes to remove the influence of units and scale.
Step 2: Bring the two series to the same length $L$; missing values can be filled by linear interpolation.
Step 3: Compute the Euclidean distance between corresponding elements of the two series: $d(Q_1, Q_2) = \sqrt{\sum_{i=1}^{L} (q_{1i} - q_{2i})^2}$
The smaller the Euclidean distance, the more similar the two flood processes.
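One way to realize Steps 2 and 3 is to resample both series onto a common number of evenly spaced points by linear interpolation and then take the Euclidean distance. Treating "bringing to the same length" as resampling to the longer of the two lengths is an assumption of this sketch, as are the function names; normalization (Step 1) is assumed to have been applied beforehand.

```python
import math


def resample_linear(series, length):
    """Linearly interpolate a series onto `length` evenly spaced points."""
    if len(series) == 1 or length == 1:
        return [series[0]] * length
    out = []
    for k in range(length):
        pos = k * (len(series) - 1) / (length - 1)  # fractional source index
        i = min(int(pos), len(series) - 2)
        frac = pos - i
        out.append(series[i] * (1.0 - frac) + series[i + 1] * frac)
    return out


def euclidean_distance(q1, q2):
    """Euclidean distance after aligning both series to a common length L."""
    L = max(len(q1), len(q2))
    a, b = resample_linear(q1, L), resample_linear(q2, L)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Unlike DTW, this comparison is pointwise in time, so a shift in the timing of a sharp rise is penalized, which matches the intent of using it for steep floods where timing matters.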
For multi-peak floods, a Euclidean distance weighted by flood peak features is used:
Step 1: Extract features from the two flood processes $Q_1$ and $Q_2$ to obtain their peak water levels $\{p_{11}, \ldots, p_{1k_1}\}$ and $\{p_{21}, \ldots, p_{2k_2}\}$ and their peak occurrence times $\{t_{11}, \ldots, t_{1k_1}\}$ and $\{t_{21}, \ldots, t_{2k_2}\}$.
Step 2: Compute the peak feature distance matrix $D_p \in \mathbb{R}^{k_1\times k_2}$ of the two flood processes, where $D_p(i,j) = w_p\,|p_{1i} - p_{2j}| + w_t\,|t_{1i} - t_{2j}|$, and $w_p$ and $w_t$ are the weights of the peak water level and the peak occurrence time, respectively.
Step 3: Solve the optimal matching on the peak feature distance matrix with the Hungarian algorithm; the weighted sum of the matched peak distances is the similarity measure of the two flood processes.
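The matching step can be sketched as follows. Note that this example does not implement the Hungarian algorithm: it enumerates all assignments by brute force, which finds the same minimum-cost matching but is only practical for a handful of peaks. The default weights of 0.5 and the function names are assumptions made for illustration.

```python
from itertools import permutations


def peak_distance_matrix(peaks1, times1, peaks2, times2, wp=0.5, wt=0.5):
    """D_p(i, j) = wp * |p1i - p2j| + wt * |t1i - t2j|."""
    return [[wp * abs(p1 - p2) + wt * abs(t1 - t2)
             for p2, t2 in zip(peaks2, times2)]
            for p1, t1 in zip(peaks1, times1)]


def match_peaks(dp):
    """Minimum-cost one-to-one matching by exhaustive search.

    Stand-in for the Hungarian algorithm; returns (total_cost, pairs) where
    each pair is (index in flood 1, index in flood 2).
    """
    rows, cols = len(dp), len(dp[0])
    transposed = rows > cols
    if transposed:  # make sure the smaller side is enumerated as rows
        dp = [list(col) for col in zip(*dp)]
        rows, cols = cols, rows
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(cols), rows):
        cost = sum(dp[i][j] for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    pairs = [(j, i) if transposed else (i, j) for i, j in enumerate(best_perm)]
    return best_cost, pairs
```

When the two floods have different numbers of peaks, this sketch matches every peak of the flood with fewer peaks and leaves the surplus peaks unmatched; how unmatched peaks should be penalized is not specified in the text and is left out here.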
For gentle floods, the area similarity coefficient is used:
Step 1: Normalize the water level series $Q_1$ and $Q_2$ of the two flood processes to remove the influence of units and scale.
Step 2: Compute the areas $A_1$ and $A_2$ enclosed by each water level series and the time axis; the trapezoidal rule can be used as an approximation.
Step 3: Compute the area similarity coefficient of the two flood processes: $s(Q_1, Q_2) = \dfrac{2\min(A_1, A_2)}{A_1 + A_2}$
The area similarity coefficient takes values in [0, 1]; the closer it is to 1, the more similar the two flood processes.
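The area similarity coefficient is straightforward to compute. The sketch below assumes equally spaced samples with time step `dt` and non-negative (normalized) water levels; the function names and the convention of returning 1.0 when both areas are zero are choices made for this example.

```python
def trapezoid_area(levels, dt=1.0):
    """Area between the series and the time axis by the trapezoidal rule."""
    return sum((a + b) / 2.0 * dt for a, b in zip(levels, levels[1:]))


def area_similarity(q1, q2, dt=1.0):
    """s(Q1, Q2) = 2 * min(A1, A2) / (A1 + A2), in [0, 1] for A1, A2 >= 0."""
    a1, a2 = trapezoid_area(q1, dt), trapezoid_area(q2, dt)
    if a1 + a2 == 0:
        return 1.0  # two flat zero series are treated as identical
    return 2.0 * min(a1, a2) / (a1 + a2)
```

For example, a triangular hydrograph `[0, 1, 0]` has area 1 and `[0, 2, 0]` has area 2, so their coefficient is 2/3. The coefficient compares total volume only and ignores shape, which is acceptable for gentle floods where the hydrograph has little structure.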
Step S3c: Based on the classification results and similarity measures of the flood processes, compute the similarity probability distribution between the comprehensive forecast flood and each type of historical flood.
Step S3d: Based on the similarity probability distribution of the comprehensive forecast flood, fuse the forecast results obtained with the different similarity measures by weighting, and generate the final probabilistic forecast.
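The text does not state how similarity measures are converted into probabilities or how the fusion weights are chosen. Purely as an illustration, the sketch below assumes a softmax over negative distances for step S3c and a probability-weighted average of per-type forecast series for step S3d; the `temperature` parameter and both function names are hypothetical.

```python
import math


def similarity_probabilities(distances, temperature=1.0):
    """Map distances to flood types onto probabilities (assumed softmax form).

    Smaller distance gives larger probability. Similarity scores such as the
    area similarity coefficient s would first need converting, e.g. to 1 - s.
    """
    scores = [-d / temperature for d in distances]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_forecasts(probabilities, forecasts):
    """Probability-weighted average of equally long forecast series."""
    length = len(forecasts[0])
    return [sum(p * f[t] for p, f in zip(probabilities, forecasts))
            for t in range(length)]
```

With equal distances the probabilities are uniform and the fused forecast is the plain average of the candidate series; as one distance shrinks relative to the others, the fused forecast approaches the forecast associated with that flood type.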
According to another aspect of the present application, the process of constructing the training set in step S31 is further as follows:
Step S31a: Extract the collected flood processes and divide them into N time periods;
Step S31b: Extract the key non-rainfall flood driving factor data for each of the N time periods;
Step S31c: Input the key non-rainfall flood driving factor data, hydrological data, and non-rainfall meteorological data of the N time periods into the four flood forecasting models, combine the results by linear weighting to obtain the comprehensive flood forecast, and extract the N flood processes as the training set.
Step S32a: Take one of the N flood processes and compute the Fréchet distance between it and every historical flood record in turn; historical floods whose Fréchet distance to it is below a threshold are identified as similar flood processes;
Step S32b: Determine the similar flood processes for each of the N flood processes in turn;
Step S32c: Extract the subsequent flood processes of the similar floods as the validation set.
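The similarity screening in step S32a can be sketched with the discrete Fréchet distance. This is an assumed reading: the text says only "Fréchet distance", and the example below uses the standard discrete variant on scalar water level series with the absolute difference as the pointwise distance. The function names and the threshold value are illustrative.

```python
def discrete_frechet(p, q):
    """Discrete Fréchet distance between two scalar series."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]  # ca[i][j]: coupling distance up to (i, j)
    for i in range(n):
        for j in range(m):
            d = abs(p[i] - q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]


def similar_floods(process, history, threshold):
    """Indices of historical floods whose distance to `process` is below threshold."""
    return [idx for idx, past in enumerate(history)
            if discrete_frechet(process, past) < threshold]
```

The recurrence resembles DTW, but it tracks the largest pointwise gap along the best alignment instead of the sum of gaps, so a single large deviation is enough to exceed the threshold.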
500 historical flood processes of a certain city were extracted, and each flood process was divided into K segments with a step of 10 minutes; a final segment shorter than 10 minutes forms a segment of its own. The corresponding urban flood control data, rainfall data, and non-rainfall flood driving factors are divided into the corresponding K segments.
The four flood forecasting models are trained and validated on each of 50 historical flood processes in turn, as follows:
For the a-th historical flood process, a = 1, 2, …, 50, the four flood forecasting models use the urban flood control data, rainfall data, and non-rainfall flood driving factors of segment 1 (the first 10 minutes) to compute the current flood process, and the model parameters are trained to minimize the error between the computed process and the process actually observed in segment 1 of the a-th historical flood;
Compute the Fréchet distance between the current flood process and the other 499 historical flood records, select the most similar historical flood process, and use segment 2 (the second 10 minutes) of that flood as the validation set;
Proceed in the same way through every segment of the a-th historical flood process;
Proceed in the same way to train and validate the four flood forecasting models on each of the 50 historical flood processes.
The model is further subjected to adversarial validation: adversarial samples are constructed by adding perturbations to the input non-rainfall flood driving factor data, and the model's predictive ability under extreme conditions is tested. The model's vulnerability to adversarial samples is analyzed, and the model is corrected and optimized accordingly to improve the robustness of flood forecasting under adverse conditions.
According to another aspect of the present application, an intelligent forecasting system for urban flood control is provided, characterized in that it comprises:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the processor, and the instructions, when executed by the processor, implement the intelligent forecasting method for urban flood control according to any one of the above.
The preferred embodiments of the present invention are described in detail above. However, the present invention is not limited to the specific details of these embodiments. Within the scope of the technical concept of the present invention, various equivalent transformations can be made to its technical solutions, and all such equivalent transformations fall within the protection scope of the present invention.
Claims (5)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410561250.6A CN118134729B (en) | 2024-05-08 | 2024-05-08 | Intelligent forecasting method and system for urban flood control |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118134729A | 2024-06-04 |
| CN118134729B | 2024-07-05 |
Family ID: 91248298
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410561250.6A (Active) CN118134729B (en) | 2024-05-08 | 2024-05-08 | Intelligent forecasting method and system for urban flood control |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118134729B (en) |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119151240B (en) * | 2024-11-12 | 2025-10-03 | 中科晟通(山东)信息技术有限公司 | A method and system for urban governance scheduling based on dynamic evaluation |
| CN120069211A (en) * | 2025-02-13 | 2025-05-30 | 天津白泽技术有限公司 | Regional water affair intelligent early warning system and method based on cloud computing |
| CN120044642B (en) * | 2025-04-24 | 2025-09-23 | 成都润联科技开发有限公司 | Meteorological short-term forecasting method based on multi-source data fusion and real-time modeling |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107292098A (en) * | 2017-06-15 | 2017-10-24 | 河海大学 | Medium-and Long-Term Runoff Forecasting method based on early stage meteorological factor and data mining technology |
| CN117010726A (en) * | 2023-09-29 | 2023-11-07 | 水利部交通运输部国家能源局南京水利科学研究院 | Intelligent early warning method and system for urban flood control |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112801342A (en) * | 2020-12-31 | 2021-05-14 | 国电大渡河流域水电开发有限公司 | Adaptive runoff forecasting method based on rainfall runoff similarity |
| US20230123322A1 (en) * | 2021-04-16 | 2023-04-20 | Strong Force Vcn Portfolio 2019, Llc | Predictive Model Data Stream Prioritization |
Also Published As
| Publication number | Publication date |
|---|---|
| CN118134729A (en) | 2024-06-04 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |