CN110503570A - A method, system, device and storage medium for detecting abnormal power consumption data - Google Patents

Publication number
CN110503570A
Authority
CN
China
Prior art keywords
data
load
electricity consumption
abnormal
management
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910641996.7A
Other languages
Chinese (zh)
Inventor
刘恬语
张涛
刘松梅
王桢干
刘伟
徐蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Industrial Co Ltd Of Strand Intense Source
State Grid Jiangsu Electric Power Co Ltd
Yancheng Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Binhai Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
State Grid Corp of China SGCC
Original Assignee
Electric Industrial Co Ltd Of Strand Intense Source
State Grid Jiangsu Electric Power Co Ltd
Yancheng Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Binhai Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
State Grid Corp of China SGCC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Industrial Co Ltd Of Strand Intense Source, State Grid Jiangsu Electric Power Co Ltd, Yancheng Power Supply Co of State Grid Jiangsu Electric Power Co Ltd, Binhai Power Supply Co of State Grid Jiangsu Electric Power Co Ltd, State Grid Corp of China SGCC filed Critical Electric Industrial Co Ltd Of Strand Intense Source
Priority to CN201910641996.7A priority Critical patent/CN110503570A/en
Publication of CN110503570A publication Critical patent/CN110503570A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Health & Medical Sciences (AREA)
  • Development Economics (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Water Supply & Treatment (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method for detecting abnormal electricity consumption data, comprising the steps of acquiring data, cleaning the data, reducing the data dimensionality, building a model, and screening abnormal users. The invention also relates to a system, an electronic device, and a storage medium for detecting abnormal electricity consumption data. The invention effectively addresses existing problems in line loss management: it supports data mining and analysis of abnormal line loss in the station-area power consumption system, makes line loss management more transparent and efficient, and serves as a comprehensive management tool, with the ultimate goal of saving energy, reducing losses, and standardizing management.

Description

A method, system, device and storage medium for detecting abnormal power consumption data

Technical field

The invention relates to the technical field of electricity consumption information collection, and in particular to a method for detecting abnormal electricity consumption data.

Background

With the rapid development of the information age, the Internet and the information and communication industries were the first to pursue research on big data. For the power industry, big data likewise has far-reaching research significance and promising application prospects. As the next-generation power system evolves, data-driven power supply chains will gradually replace traditional ones. The rollout of electricity consumption information collection systems provides the data foundation that China's power industry needs for management and operation decisions and for power supply service optimization based on power data analysis. At the same time, as consumption data such as energy data, operating condition data, and event information grow exponentially, the characteristics of big data become increasingly pronounced and the demand for applications of electricity consumption big data becomes increasingly urgent. This massive volume of data comes mainly from various metering devices and systems, and equipment failures, communication failures, grid fluctuations, and management issues have produced a large amount of abnormal electricity consumption data. Faced with this growth, most power departments still analyze abnormal data using only traditional statistical methods, and mostly depend on on-site inspection to do so. Because of limits on manpower, materials, and funding, the deeper causes behind abnormal data cannot be extracted effectively, which leads to "data disaster" and "data waste". Traditional analysis methods can therefore no longer meet the requirements; data mining is needed to uncover the deeper patterns behind abnormal electricity consumption data, to rule out what is coincidental in the data, and to extract what is systematic.

Because low-voltage customers are numerous and change frequently, station-area line loss management commonly suffers from abnormal line loss caused by management issues such as unclear household-transformer relationships, poor meter-reading quality, electricity theft, and metering faults. In recent years, many domestic power supply companies have faced, to varying degrees, a common dilemma: "large investment, small return" in reducing station-area line loss. The root cause is that over the past decade the main factor affecting station-area line loss has shifted to management losses, while the direction of retrofit investment has remained unchanged.

Summary of the invention

To overcome the deficiencies of the prior art, the present invention provides a method for detecting abnormal electricity consumption data. Through an integrated application that combines a real-time database with cloud computing and a cloud real-time storage platform, the invention uses efficient parallel computing to achieve high throughput for big data batch-processing tasks. It uses the isolation forest algorithm, which is stable and robust to noise, to identify users with abnormal data, analyze the causes of line loss, and strengthen station-area line loss management.

The present invention provides a method for detecting abnormal electricity consumption data, comprising the following steps:

Acquiring data: electricity consumption data is obtained by means of electricity consumption information collection;

Data cleaning: the collected electricity consumption data is cleaned and the types of dirty data in it are detected, yielding valid electricity consumption data; the types of dirty data include missing values, duplicate values, extreme maximum and minimum values, load spikes, and negative shock values;

Data dimensionality reduction: daily load characteristic indices are used to reduce the feature dimensionality of the valid electricity consumption data; the daily load characteristic indices include the load rate, the peak-valley difference rate, the maximum utilization hour rate, the peak-period load rate, the flat-period load rate, and the valley-period load rate;

Model building: a number of isolation trees are built into an isolation forest, a first analysis model is established using the isolation forest algorithm, and the model is evaluated using evaluation curves;

Screening abnormal users: the first analysis model is used to screen the target data, and data mining is performed on the screened data to identify users with abnormal electricity consumption.
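As a rough illustration of the dimensionality reduction step, the sketch below condenses 24 hourly load values into the six daily load characteristic indices named above. The text does not define the indices, so the formulas here are assumptions based on common usage: the peak, flat, and valley hour sets, the `high_fraction` parameter, and the definition of the maximum utilization hour rate are all hypothetical.

```python
from statistics import mean

# Hypothetical tariff periods (hours 0-23); real peak/flat/valley hours vary by utility.
PEAK_HOURS = set(range(8, 12)) | set(range(17, 21))
VALLEY_HOURS = set(range(0, 8))
FLAT_HOURS = set(range(24)) - PEAK_HOURS - VALLEY_HOURS


def daily_load_features(load, high_fraction=0.9):
    """Reduce 24 hourly load values to six daily load characteristic indices."""
    if len(load) != 24:
        raise ValueError("expected 24 hourly load values")
    peak, valley, avg = max(load), min(load), mean(load)
    if peak <= 0 or avg <= 0:
        raise ValueError("cleaned load data must have a positive mean and peak")

    def period_rate(hours):
        # Assumed definition: mean load in the period relative to the daily mean.
        return mean(load[h] for h in hours) / avg

    return {
        "load_rate": avg / peak,
        "peak_valley_diff_rate": (peak - valley) / peak,
        # Assumed definition: share of hours at or above high_fraction of the peak.
        "max_utilization_hour_rate": sum(v >= high_fraction * peak for v in load) / 24,
        "peak_load_rate": period_rate(PEAK_HOURS),
        "flat_load_rate": period_rate(FLAT_HOURS),
        "valley_load_rate": period_rate(VALLEY_HOURS),
    }
```

Each user-day then becomes a six-element feature vector instead of a 24-point curve, which is the form a tree-based anomaly detector would consume.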

Preferably, the means of electricity consumption information collection includes cloud storage, which is used to store the electricity consumption data in a distributed manner across multiple independent storage servers; the storage server types include a metadata management service, a volume management service, and a block data management service.

Preferably, the data cleaning step further includes filling missing values according to the periodic fluctuation of the electricity load, using the following formula:

where X_i denotes the electricity load at the current moment; i is the moment at which load data is missing, taking values from 1 to 24; and a1 and a2 are the weighting coefficients for, respectively, the loads at the corresponding moment on the previous and following days and the loads at the two time points immediately before and after the current moment.
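The formula itself is not reproduced in this text, so the following is only a sketch of one plausible reading of the variable definitions: the missing value is a weighted combination of the mean load at the same hour on the previous and following days (weight a1) and the mean of the two neighbouring hours on the current day (weight a2). The function name, the default weights, and the renormalisation when a reference value is unavailable are assumptions.

```python
def fill_missing_load(prev_day, cur_day, next_day, a1=0.5, a2=0.5):
    """Return a copy of cur_day with None entries filled.

    Assumed form (not taken from the source):
        X_i = a1 * mean(same hour on previous/next day) + a2 * mean(X_{i-1}, X_{i+1})
    Indices are 0-based here, whereas the text numbers the hours 1-24.
    """
    filled = list(cur_day)
    for i, value in enumerate(cur_day):
        if value is not None:
            continue
        day_refs = [d[i] for d in (prev_day, next_day) if d[i] is not None]
        hour_refs = [cur_day[j] for j in (i - 1, i + 1)
                     if 0 <= j < len(cur_day) and cur_day[j] is not None]
        parts = []
        if day_refs:
            parts.append((a1, sum(day_refs) / len(day_refs)))
        if hour_refs:
            parts.append((a2, sum(hour_refs) / len(hour_refs)))
        total_weight = sum(w for w, _ in parts)
        if total_weight == 0:
            continue  # nothing usable to interpolate from; leave the gap
        # Renormalise so the weights in use sum to 1.
        filled[i] = sum(w * v for w, v in parts) / total_weight
    return filled
```

For example, with the same hour at 10 and 14 on the adjacent days and both neighbouring hours at 20, equal weights give 0.5 * 12 + 0.5 * 20 = 16.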

Preferably, before the data acquisition step, the method further includes the step of:

Establishing a management plan: station-area line loss management indicators are established, whose status identifiers include the coverage class, household-transformer class, collectable class, data class, and line loss class; the collected electricity consumption data of multiple station areas is labeled by status, and corresponding control measures are taken for each status, forming a station-area line loss management plan.

Preferably, the model building step further includes evaluating the model using the receiver operating characteristic (ROC) curve, the area under the curve (AUC), the cumulative recall curve, and the P-R curve, the last plotted with precision on the vertical axis and recall on the horizontal axis.
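To make the evaluation metrics concrete, here is a minimal sketch that computes AUC and the points of a P-R curve from anomaly scores and ground-truth labels (1 for abnormal, 0 for normal). It assumes labeled validation data exists, which the text does not describe; the function names are illustrative.

```python
def roc_auc(labels, scores):
    """AUC via the pairwise (Mann-Whitney) identity; tied scores count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall_points(labels, scores):
    """(recall, precision) pairs as the cut-off moves down the ranked scores.

    Recall is the horizontal coordinate and precision the vertical one.
    Each sample is treated as its own cut-off, so tied scores are not merged.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    points, true_pos = [], 0
    for k, (_, y) in enumerate(ranked, start=1):
        true_pos += y
        points.append((true_pos / total_pos, true_pos / k))
    return points
```

The recall coordinates of the same sweep, plotted against the fraction of users inspected, would give a cumulative recall curve.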

Preferably, the isolation forest algorithm includes a first-stage algorithm and a second-stage algorithm: the first stage builds multiple isolation trees that together form the isolation forest; the second stage uses the resulting isolation forest to evaluate the test data and computes an anomaly score for each data point under test.
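The two stages can be sketched in plain Python as follows. This follows the standard isolation forest formulation (random axis-aligned splits, score s = 2^(-E[h(x)] / c(n))) rather than any implementation detail given in the text; the tree count, subsample size, and seed are conventional defaults chosen here as assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Stage 1: isolate points with random axis-aligned splits."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    dim = rng.randrange(len(rows[0]))
    lo, hi = min(r[dim] for r in rows), max(r[dim] for r in rows)
    if lo == hi:  # simplification: stop if the chosen feature is constant
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def fit_forest(rows, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on a random subsample of the rows."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(rows, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, psi


def path_length(x, node, depth=0):
    """Depth at which x lands in a leaf, plus the expected depth of an unbuilt subtree."""
    if "size" in node:
        return depth + c_factor(node["size"])
    child = node["left"] if x[node["dim"]] < node["split"] else node["right"]
    return path_length(x, child, depth + 1)


def anomaly_score(x, forest):
    """Stage 2: score in (0, 1]; values near 1 indicate points that are easy to isolate."""
    trees, psi = forest
    mean_path = sum(path_length(x, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(psi))
```

In this setting each row would be one user's feature vector, and users whose scores exceed a chosen threshold would be flagged for follow-up.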

An electronic device, comprising: a processor;

a memory; and a program, wherein the program is stored in the memory and configured to be executed by the processor, the program including instructions for performing the method for detecting abnormal electricity consumption data.

A computer-readable storage medium storing a computer program which, when executed by a processor, performs the method for detecting abnormal electricity consumption data.

A system for detecting abnormal electricity consumption data, comprising a data acquisition module, a data cleaning module, a data dimensionality reduction module, a model building module, and an abnormal user screening module; wherein,

the data acquisition module is used to obtain electricity consumption data by means of electricity consumption information collection;

the data cleaning module is used to clean the collected electricity consumption data and detect the types of dirty data in it, yielding valid electricity consumption data; the types of dirty data include missing values, duplicate values, extreme maximum and minimum values, load spikes, and negative shock values;

the data dimensionality reduction module is used to reduce the feature dimensionality of the valid electricity consumption data using daily load characteristic indices, which include the load rate, the peak-valley difference rate, the maximum utilization hour rate, the peak-period load rate, the flat-period load rate, and the valley-period load rate;

the model building module is used to build a number of isolation trees into an isolation forest, establish a first analysis model using the isolation forest algorithm, and evaluate the model using evaluation curves;

the abnormal user screening module is used to screen the target data with the first analysis model and perform data mining on the screened data to identify users with abnormal electricity consumption.

Preferably, the system further includes a management plan module, which is used to establish station-area line loss management indicators whose status identifiers include the coverage class, household-transformer class, collectable class, data class, and line loss class; to label the collected electricity consumption data of multiple station areas by status; and to take corresponding control measures for each status, forming a station-area line loss management plan;

the data acquisition module includes a cloud storage unit, which is used to store the electricity consumption data in a distributed manner across multiple independent storage servers; the storage server types include a metadata management service, a volume management service, and a block data management service;

the data cleaning module includes a missing value filling unit, which fills missing values according to the periodic fluctuation of the electricity load, using the following formula:

where X_i denotes the electricity load at the current moment; i is the moment at which load data is missing, taking values from 1 to 24; and a1 and a2 are the weighting coefficients for, respectively, the loads at the corresponding moment on the previous and following days and the loads at the two time points immediately before and after the current moment.

Compared with the prior art, the beneficial effects of the present invention are:

1) The method for detecting abnormal electricity consumption data is a new station-area line loss management method suited to the development needs of the smart grid. It effectively addresses problems in current station-area management, makes station-area line loss management more transparent and efficient, plays a comprehensive management role in marketing management, and ultimately achieves the goals of saving energy, reducing losses, and standardizing management.

2) The established station-area line loss management indicator system has five status identifiers (coverage, household-transformer, collectable, data, and line loss) together with their hierarchy. For station areas in different states, different control methods, control cycles, and responsible departments are defined according to the respective management focus, ultimately driving each station area to progress toward a healthy state.

3) The invention effectively addresses existing problems in line loss management: it supports data mining and analysis of abnormal line loss in the station-area power consumption system, makes line loss management more transparent and efficient, and serves as a comprehensive management tool, ultimately achieving the goals of saving energy, reducing losses, and standardizing management.

4) Cloud computing can provide high-quality, on-demand services by using distributed software and hardware resources and information, and has been applied successfully in many fields such as search engines, social networks, and communications. In smart grid informatization, cloud computing's capacity for efficient large-scale data access and parallel computation allows it to provide high-quality data processing services for information systems, including the electricity consumption information collection system, and to give solid technical support to the information infrastructure of the smart grid era.

The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the description, preferred embodiments are described in detail below with reference to the accompanying drawings. Specific embodiments of the invention are given in detail by the following examples and drawings.

Description of the drawings

The drawings described here provide a further understanding of the present invention and form part of this application. The exemplary embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:

Fig. 1 is the overall flowchart of the method for detecting abnormal electricity consumption data of the present invention;

Fig. 2 is a schematic diagram of the progression of station-area line loss management indicator states in the method;

Fig. 3 is a schematic diagram of constructing an isolation tree in the method;

Fig. 4 is a schematic diagram of the data dimensionality reduction processing in the method;

Fig. 5 is a schematic diagram of screening abnormal users in the method;

Fig. 6 is a schematic diagram of the overall structure of the service-oriented architecture of the system for detecting abnormal electricity consumption data of the present invention;

Fig. 7 is a schematic diagram of the overall structure of the system for detecting abnormal electricity consumption data of the present invention.

Detailed description

The present invention is further described below with reference to the drawings and specific embodiments. Note that, provided they do not conflict, the embodiments or technical features described below may be combined arbitrarily to form new embodiments.

A method for detecting abnormal electricity consumption data, as shown in Fig. 1, includes the following steps:

S0. Establishing a management plan: station-area line loss management indicators are established, whose status identifiers include the coverage class, household-transformer class, collectable class, data class, and line loss class; the collected electricity consumption data of multiple station areas is labeled by status, and corresponding control measures are taken for each status, forming a station-area line loss management plan. In one embodiment, as shown in Fig. 2:

1.1 Initial station area: data preparation; the station area is brought under hierarchical progressive management;

1.2 Installation-class station area: schedule equipment installation sensibly to bring coverage to 100%;

1.3 Household-transformer-class station area: verify the household-transformer relationships in the station area to bring accuracy to 100%;

1.4 Collectable-class station area: collect repeatedly and analyze faults to bring the collection rate to 95%;

1.5 Data-class station area: collect repeatedly and analyze errors to bring the collection rate to 95%;

1.6 Line-loss-class station area: analyze the causes of the abnormal line loss rate and formulate loss reduction measures;

1.7 Compliant station area: take consolidation measures to maintain the compliant state.

Specifically, the station-area line loss management indicators comprise five status identifiers (coverage, household-transformer, collectable, data, and line loss) together with their hierarchy. Based on the collected electricity load data, each station area is assigned one of the following statuses, and control measures are formulated for the management focus of each type of station area, forming a station-area line loss management method based on the electricity consumption information collection system. The specific measures are as follows:

Coverage class: the installation rate of collection devices in the station area has not reached 100%; an installation plan for collection devices should be arranged sensibly;

Household-transformer class: collection coverage has reached 100%, but the household-transformer relationships are not yet accurate; they should be verified by combining internal record checks with on-site inspection;

Collectable class: collection coverage has reached 100%, but the collection rate has not yet reached 95%; the collection rate should be tallied and the causes of missed and erroneous collection analyzed;

Data class: coverage has reached 100%, the collection rate has reached 95%, and the household-transformer relationships are correct, but the error between the collected data and manual meter readings is greater than the mean; a reasonable meter-reading plan should be formulated;

Line loss class: coverage, collection rate, and accuracy have all reached 100% and the household-transformer relationships are correct, but the line loss rate is abnormal; the causes should be analyzed promptly and loss reduction measures formulated.
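The status hierarchy above reads naturally as an ordered rule check, where a station area is assigned the first status whose condition it fails. The sketch below is one way to express that; the `StationArea` fields and the boolean flags for the data-error and line-loss checks are hypothetical simplifications, while the 100% and 95% thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class StationArea:
    coverage: float          # share of meters with collection devices installed, 0..1
    relationship_ok: bool    # household-transformer relationships verified as correct
    collection_rate: float   # share of readings successfully collected, 0..1
    data_error_ok: bool      # collected vs. manual readings within tolerance
    line_loss_ok: bool       # line loss rate within the normal range


# Control measure per status, paraphrased from the list above.
MEASURES = {
    "coverage": "arrange a collection device installation plan",
    "household-transformer": "verify relationships via records and on-site checks",
    "collectable": "tally the collection rate and analyze missed or erroneous collection",
    "data": "formulate a reasonable meter-reading plan",
    "line loss": "analyze the abnormal line loss rate and formulate loss reduction measures",
    "compliant": "take consolidation measures to stay compliant",
}


def classify(area: StationArea) -> str:
    """Return the first status in the hierarchy whose requirement is not met."""
    if area.coverage < 1.0:
        return "coverage"
    if not area.relationship_ok:
        return "household-transformer"
    if area.collection_rate < 0.95:
        return "collectable"
    if not area.data_error_ok:
        return "data"
    if not area.line_loss_ok:
        return "line loss"
    return "compliant"
```

Checking the conditions in this fixed order mirrors the progression of states, so a station area only reaches the line loss class once the earlier data-quality conditions hold.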

S1. Acquiring data: electricity consumption data is obtained by means of electricity consumption information collection. In one embodiment, for line-loss-class station areas, cloud storage technology is used to collect, classify, and process the electricity consumption information data of multiple such station areas. Using the distributed file storage mechanism of cloud storage, the data is stored in a distributed manner across multiple independent storage servers, which provide volume management, metadata management, and block data management services.

Metadata refers to a file's name, attributes, and data block location information. Because metadata is accessed frequently, the system loads and caches it in memory to improve access efficiency.

Block data refers to the data blocks produced by splitting file data into pieces of a fixed size, which are distributed across different storage node servers. The storage space provided by a pair of metadata servers and the storage server nodes they manage is called a volume.

The volume management server virtualizes and consolidates multiple volumes and presents a single unified access space for the cloud real-time storage platform.

S2. Data cleaning: the collected electricity consumption data is cleaned and the types of dirty data in it are detected to obtain valid electricity consumption data. The dirty data types are: missing values, duplicate values, extreme maximum and minimum values, load spikes, and impact negative values. In one embodiment, the dirty data types are analyzed and summarized, and targeted measures are then taken according to how each type manifests, deleting redundant data from the data set while preserving its integrity. Common dirty data types are: 1) missing values: empty cells in the table; 2) duplicate values: a user's load data for a given moment is repeated; 3) extreme maximum and minimum values: load data is too large or too small; 4) load spikes: a sudden increase or decrease between adjacent periods; 5) impact negative values: readings fall continuously over a period of time.
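The five dirty data types can be illustrated with a small detector. This is a sketch only: the function name `flag_dirty`, the thresholds `lo`, `hi`, and `spike`, and the run length `run` are hypothetical parameters, since the text does not state how each type is detected.

```python
def flag_dirty(times, loads, lo, hi, spike, run=3):
    """Return, per dirty-data type, the indices of the affected readings.

    times: timestamps (any hashable); loads: floats, with None for an empty cell.
    """
    flags = {k: [] for k in ("missing", "duplicate", "extreme", "spike", "falling")}
    seen = set()
    for i, (t, x) in enumerate(zip(times, loads)):
        if t in seen:                      # same moment reported more than once
            flags["duplicate"].append(i)
        seen.add(t)
        if x is None:                      # empty cell in the table
            flags["missing"].append(i)
        elif x < lo or x > hi:             # too small or too large
            flags["extreme"].append(i)
    # A spike is taken here as a point that jumps away from both neighbours.
    for i in range(1, len(loads) - 1):
        a, b, c = loads[i - 1], loads[i], loads[i + 1]
        if None not in (a, b, c) and abs(b - a) > spike and abs(b - c) > spike:
            flags["spike"].append(i)
    # "Impact negative values" are read here as `run` or more consecutive drops.
    drops = 0
    for i in range(1, len(loads)):
        a, b = loads[i - 1], loads[i]
        drops = drops + 1 if (a is not None and b is not None and b < a) else 0
        if drops >= run:
            flags["falling"].append(i)
    return flags
```

Keeping detection separate from repair lets each type be handled differently afterwards, for example dropping duplicates but filling missing values.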

For data with severe gaps, the missing values are filled according to the periodic fluctuation of the electricity load: the mean of the loads at the same time point on the two adjacent days (the day before and the day after) and of the loads at the two time points immediately before and after the current moment is computed, together with the rate of load change of the following day relative to the preceding day, and the gap is filled with the mean plus the load change. The calculation is as follows:

In the formula, X_i is the electricity load at the current moment; i is the moment at which load data is missing, taking values 1 to 24; α1 and α2 are the weighting coefficients for, respectively, the loads at the corresponding moment on the preceding and following days and the loads at the two time points before and after the current moment. For abnormal noise points, the rectangle method is used to integrate the load data at each collection moment of the day to obtain a repaired value for the electricity quantity, calculated as follows:

where X_i is the repaired electricity quantity, F is the number of load data collections in one day, P_i is the load at moment i, and ΔT is the load data collection interval.
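Neither formula is reproduced in the text, so the sketch below rests on assumptions. From the definitions of F, P_i, and ΔT, the rectangle method is taken to mean summing P_i·ΔT over the F readings of the day. The gap-filling function is only one plausible reading of the description (a weighted combination of the adjacent-day mean and the adjacent-time mean); it omits the load-change term and should not be taken as the patent's formula. The function names and the default weights are hypothetical.

```python
def fill_missing(prev_day, next_day, today, i, a1=0.5, a2=0.5):
    """Estimate today's load at an interior index i (assumed form, see lead-in).

    a1 weights the mean of the same time point on the adjacent days;
    a2 weights the mean of the two neighbouring time points today.
    """
    day_mean = (prev_day[i] + next_day[i]) / 2
    near_mean = (today[i - 1] + today[i + 1]) / 2
    return a1 * day_mean + a2 * near_mean


def energy_by_rectangle_rule(loads_kw, interval_h):
    """Approximate daily energy (kWh) as the sum of P_i * dT over all readings."""
    return sum(p * interval_h for p in loads_kw)
```

With 96 readings at a 15-minute interval, `interval_h` would be 0.25.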

S3. Dimensionality reduction: daily load characteristic indices are used to reduce the feature dimension of the valid electricity consumption data. The indices are the load rate, peak-valley difference rate, maximum utilization hour rate, peak-period load rate, flat-period load rate, and valley-period load rate. In one embodiment, as shown in Figure 4, a load curve is a time series, and electricity load data is readily affected by many factors such as temperature, income, and electricity price policy. These effects are intrinsic features of the time series that distance measures cannot fully reflect, so distance alone cannot guarantee similarity in the shape or profile of the series. Moreover, curves with a pronounced load shape, such as daily load curves, exhibit undesirable equidistance in high dimensions (points tend to appear nearly equally far apart). To reflect the similarity between loads adequately while keeping computation efficient, this embodiment selects six commonly used daily load characteristic indices: load rate, peak-valley difference rate, maximum utilization hour rate, peak-period load rate, flat-period load rate, and valley-period load rate. Viewed across the whole day, the peak period, the flat period, and the valley period, these give a fairly comprehensive picture of the consumption characteristics of different users. The six indices are used to reduce the feature dimension of the valid load curve matrix.
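The text names the six indices but does not define them. The sketch below uses definitions that are common in load analysis and are assumptions here: load rate as average over maximum load, peak-valley difference rate as (max − min) over max, and each period load rate as the period's average load over the daily average. The maximum utilization hour rate is left out because its definition cannot be inferred from the text, and the period boundaries are hypothetical inputs.

```python
def daily_load_features(load, peak_hours, flat_hours, valley_hours):
    """Reduce one daily load curve to a few characteristic indices.

    load: list of readings for one day; *_hours: indices into `load`
    belonging to the peak, flat, and valley periods (assumed inputs).
    """
    p_max, p_min = max(load), min(load)
    p_avg = sum(load) / len(load)

    def period_mean(hours):
        return sum(load[h] for h in hours) / len(hours)

    return {
        "load_rate": p_avg / p_max,
        "peak_valley_diff_rate": (p_max - p_min) / p_max,
        "peak_load_rate": period_mean(peak_hours) / p_avg,
        "flat_load_rate": period_mean(flat_hours) / p_avg,
        "valley_load_rate": period_mean(valley_hours) / p_avg,
    }
```

Applied row by row, this turns an n×24 load curve matrix into an n×5 feature matrix in this sketch (n×6 with all indices the text lists).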

S4. Model building: several isolation trees are assembled into an isolation forest, the first analysis model is built with the isolation forest algorithm, and the model is evaluated using evaluation curves;

In one embodiment, as shown in Figure 3, an isolation tree (iTree) is constructed as follows: 1. randomly select one feature from the six daily load characteristic indices; 2. randomly select a value k of that feature; 3. split the records on that feature, placing records whose value is less than k in the left branch and records whose value is greater than or equal to k in the right branch; 4. recursively construct the left and right branches until one of the following conditions is met: a. the data set passed in contains only one record or several identical records; b. the tree has reached the height limit.

Specifically, an isolation forest of t iTrees is built as follows:

1. Randomly select ψ sample points from the training data as a subsample and place them in the root node of a tree;

2. Randomly choose a dimension and randomly generate a cut point P within the data of the current node;

3. The cut point defines a hyperplane that divides the data space of the current node into two subspaces: data whose value in the chosen dimension is less than P goes to the left of the current node, and data greater than or equal to P goes to the right.

4. Repeat steps 2 and 3 recursively in the child nodes, creating new child nodes until the data can no longer be split or the tree depth reaches log₂ψ.
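The construction steps above can be sketched in plain Python. This is an illustrative sketch rather than the patent's implementation: the dictionary node layout, the function names, and the choice to split only on features that still vary within a node are assumptions, and the height limit is rounded up to a whole number.

```python
import math
import random


def build_itree(rows, depth, limit, rng):
    """Recursively isolate rows; a leaf records how many rows reached it."""
    if len(rows) <= 1 or depth >= limit or all(r == rows[0] for r in rows):
        return {"size": len(rows)}
    # Only features that still vary can split the node (assumption).
    dims = [d for d in range(len(rows[0]))
            if min(r[d] for r in rows) < max(r[d] for r in rows)]
    q = rng.choice(dims)                      # random dimension
    lo = min(r[q] for r in rows)
    hi = max(r[q] for r in rows)
    p = rng.uniform(lo, hi)                   # random cut point P
    left = [r for r in rows if r[q] < p]      # values below P go left
    right = [r for r in rows if r[q] >= p]    # values at or above P go right
    return {"dim": q, "cut": p,
            "left": build_itree(left, depth + 1, limit, rng),
            "right": build_itree(right, depth + 1, limit, rng)}


def build_iforest(data, t, psi, seed=0):
    """Build t trees, each on its own random subsample of psi rows."""
    rng = random.Random(seed)
    psi = min(psi, len(data))
    limit = math.ceil(math.log2(psi))         # height limit of about log2(psi)
    return [build_itree(rng.sample(data, psi), 0, limit, rng) for _ in range(t)]


def tree_height(node):
    """Height of a tree: 0 for a leaf, else 1 plus the taller subtree."""
    if "cut" not in node:
        return 0
    return 1 + max(tree_height(node["left"]), tree_height(node["right"]))
```

Here `data` is a list of equal-length tuples, for example one tuple of daily load indices per user.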

S5. Screening abnormal users: the first analysis model built in the model building step is used to screen the target data, and data mining is performed on the screened data to identify users with abnormal electricity consumption. In one embodiment, as shown in Figure 5, an iForest is composed of a number of differing iTrees, and the model is evaluated using the ROC curve and AUC together with the cumulative recall curve and the P-R curve. The iForest evaluates one user at a time, and each evaluation traverses every iTree. The position of the leaf node in which the queried object lands is recorded, and the anomaly score is computed from its average path length. Finally, the user is assessed by the size of the anomaly score to decide whether the user under test is abnormal.

Specifically, the receiver operating characteristic (ROC) curve remains unchanged when the distribution of positive and negative samples in the test set changes. For the continuous values output by a binary classification model, samples above the threshold are classed as positive and samples below it as negative. Lowering the threshold identifies more positives and so raises recall, but it also classes more negative samples as positive, which raises the false positive rate. The ROC curve visualizes this trade-off. In ROC space, the point (0,1) represents the ideal classifier, and the closer the ROC curve lies to (0,1), the better the classification. The AUC is the area under the ROC curve: AUC = 1 corresponds to the ideal classifier; AUC = 0.5 is equivalent to random guessing, meaning the model has no predictive value; values between 0.5 and 1 indicate performance better than random guessing.
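As an illustration of the AUC, the sketch below uses the standard equivalence between the area under the ROC curve and the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the label encoding (1 for abnormal, 0 for normal) are assumptions for the example.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # A pair counts 1 if the positive scores higher, 0.5 on a tie, else 0.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small labeled test set; a rank-based computation would scale better.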

For the P-R curve, precision is plotted on the vertical axis against recall on the horizontal axis, giving the precision-recall curve, or "P-R curve" for short. As the classification threshold moves from high to low, precision generally decreases and recall increases. When evaluating a classifier, the closer the P-R curve lies to the point (1,1), the better the classification.
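One point on the P-R curve can be computed for a given threshold as sketched below. The function name, the label encoding (1 for abnormal, 0 for normal), and the conventions for empty denominators are assumptions for the example.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when scores above `threshold` are called positive."""
    pred = [s > threshold for s in scores]
    tp = sum(1 for p, y in zip(pred, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(pred, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(pred, labels) if not p and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0  # nothing flagged: by convention
    recall = tp / (tp + fn) if tp + fn else 0.0     # no positives in the data
    return precision, recall
```

Sweeping the threshold from the highest score to the lowest and collecting the resulting pairs traces out the curve.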

Specifically, the generated iForest is used to evaluate the test data, and an anomaly score is computed for each sample under test. Any data point x is passed through every iTree to obtain its depth h(x) in each iTree and its average depth across the iTrees, from which the sample's anomaly score is computed. The anomaly score of the sample x is defined as:

$$s(x, \psi) = 2^{-\frac{E(h(x))}{c(\psi)}}$$

where h(x) is the depth of the node at which the sample x ends up in an iTree; E(h(x)) is the mean of h(x) over all t iTrees; and c(ψ) is the average path length of a binary search tree built from ψ points:

$$c(\psi) = 2H(\psi - 1) - \frac{2(\psi - 1)}{\psi}$$

H(k) = ln(k) + ζ, where ζ is Euler's constant.

From the definition of the anomaly score: as E(h(x)) → 0, s → 1; as E(h(x)) → ψ − 1, s → 0; and as E(h(x)) → c(ψ), s → 0.5. That is, the closer s(x) is to 1, the more likely the point is anomalous, and the closer it is to 0, the more likely it is normal.
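The limiting behavior described here can be checked numerically. The sketch below assumes the usual isolation forest score, s = 2^(−E(h(x))/c(ψ)) with c(ψ) = 2H(ψ−1) − 2(ψ−1)/ψ, which is consistent with the three limits stated in the text; the handling of ψ ≤ 2 follows common practice and is an assumption, as are the function names.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler's constant, written as zeta in the text


def c(psi):
    """Average path length of a binary search tree built from psi points."""
    if psi <= 1:
        return 0.0
    if psi == 2:
        return 1.0
    harmonic = math.log(psi - 1) + EULER_GAMMA   # H(psi - 1)
    return 2.0 * harmonic - 2.0 * (psi - 1) / psi


def anomaly_score(mean_path_length, psi):
    """Score in (0, 1]; values near 1 suggest an anomaly. Requires psi >= 2."""
    return 2.0 ** (-mean_path_length / c(psi))
```

For ψ = 256, c(ψ) is roughly 10.2, so a sample isolated after only two or three splits on average scores well above 0.5.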

An electronic device, comprising: a processor;

a memory; and a program, wherein the program is stored in the memory and configured to be executed by the processor, the program comprising the abnormal electricity consumption data detection method.

A computer-readable storage medium storing a computer program which, when executed by a processor, carries out the abnormal electricity consumption data detection method.

An abnormal electricity consumption data detection system, as shown in Figure 7, comprising a data acquisition module, a data cleaning module, a dimensionality reduction module, a model building module, and an abnormal user screening module, wherein:

the data acquisition module obtains electricity consumption data through electricity consumption information collection;

the data cleaning module cleans the collected electricity consumption data and detects the types of dirty data in it to obtain valid electricity consumption data, the dirty data types comprising missing values, duplicate values, extreme maximum and minimum values, load spikes, and impact negative values;

the dimensionality reduction module uses daily load characteristic indices to reduce the feature dimension of the valid electricity consumption data, the indices comprising the load rate, peak-valley difference rate, maximum utilization hour rate, peak-period load rate, flat-period load rate, and valley-period load rate;

the model building module assembles several isolation trees into an isolation forest, builds the first analysis model with the isolation forest algorithm, and evaluates the model using evaluation curves;

the abnormal user screening module uses the first analysis model to screen the target data and performs data mining on the screened data to identify users with abnormal electricity consumption.

Further, the system includes a management plan module for establishing station-area line loss management indicators, whose status labels comprise the coverage, customer-transformer, recoverable, data, and line loss categories; the collected electricity consumption data of multiple station areas is labeled by status, and corresponding control measures are taken for each status to form a station-area line loss management plan;

the data acquisition module includes a cloud storage unit that spreads the electricity consumption data across multiple independent storage servers, the server types comprising metadata management, volume management, and block data management services;

the data cleaning module includes a missing value filling unit that fills missing values according to the periodic fluctuation of the electricity load, using the following formula:

where X_i is the electricity load at the current moment; i is the moment at which load data is missing, taking values 1 to 24; and a1 and a2 are the weighting coefficients for, respectively, the loads at the corresponding moment on the preceding and following days and the loads at the two time points before and after the current moment.

In a specific embodiment, the system is designed around a service-oriented architecture. The data acquisition module uses dedicated-transformer terminals conforming to the 05-edition protocol, which can sample each user's energy meter every 15 minutes, giving 96 points of voltage, current, and electricity quantity data per 24 hours; the data set S is an n×24 initial load curve matrix made up of n daily load curves. The module stores the large volume of collected data in distributed fashion using cloud storage technology. After processing, the data show that from September 2018 to March 2019 a county power company had 701 station areas with a total capacity of 349,000 kVA, an average capacity of 497.8 kVA per station area, a cumulative energy loss of 46,000 kWh, and an average station-area line loss rate of 2.69%.

Further, as shown in Figure 6, the collection cluster periodically collects information from user terminals and stores the data in the cloud storage and query environment by calling the storage interface. The data storage and query environment stores the collected information with high concurrency and provides indexing and efficient querying of electricity consumption data to the layers above. The parallel ETL (Extraction-Transformation-Loading) environment handles data exchange between the archive information in the existing relational database and the cloud computing environment; an ETL management tool is used to define the data table mappings and the task execution strategy, and the system uses parallel ETL tools to track, fetch, and consistency-check the data in associated systems in real time. The parallel analysis and computing environment runs the isolation forest algorithm to mine abnormal data. The front-end interfaces include an SQL-like (Structured Query Language) interface, Web services, client packages, and so on, and provide query and analytical computing services to external systems. The mapping tool uses an SQL-to-Map/Reduce optimization technique based on query rewriting: it converts the original SQL into a query graph and applies rewriting rules to derive multiple forms. This supports assisted migration, correctness verification, and performance optimization of applications written as stored procedures when they are moved to the cloud computing environment, which can greatly reduce the cost of migrating relational database applications to cloud computing, improve development efficiency, and raise the overall performance of parallel computing.

The cloud storage unit uses the parallel ETL environment to break formerly compute-intensive, complex tasks into atomic subtasks and distribute them to different task-processing nodes for concurrent, synchronized processing, improving data processing efficiency and capacity and safeguarding data processing performance.

The system further includes a loss-reduction decision-support module, which mainly comprises a loss-reduction decision support function and management of a loss-reduction plan library. The module inspects users with abnormal electricity consumption data, focusing on the following: a. whether electricity theft is occurring in the station area; b. changes in the station area's load, including whether load has been switched or transferred; c. whether the station area's transformer is lightly or heavily loaded; d. the operating condition of reactive power compensation equipment; e. whether the three-phase load is balanced; f. voltage quality; g. whether transformers, lines, and metering equipment are suitable and working normally (the main causes of abnormal energy metering devices are meter faults, instrument transformer faults, junction box faults, terminal faults, and so on); h. whether the low-voltage supply radius is too long; i. line loss anomalies arising from other causes.

Table 1 shows the analysis statistics for line-loss-category station areas produced by the abnormal electricity consumption data detection system for March 2019:

As Table 1 shows, 694 station areas currently meet the line loss rate target, about 99% of all station areas under management. Low coverage of collection equipment is the main factor affecting overall progress in station-area management and control. A more detailed breakdown by station area shows that the main reason for low collection coverage is the low installation rate of collection equipment for non-residential customers in most station areas. Once the cause has been established, the installation plan for collection equipment should be adjusted. Inaccurate customer-transformer relationships rank second among the factors affecting management and control: of the 456 station areas with 100% collection coverage, 312 have accurate customer-transformer relationships, an accuracy of 68%. An investigation of the 144 station areas with inaccurate relationships found two main causes: first, records for some older station areas have been lost; second, loads changed substantially during operation but the records were not updated in time. While scheduling collection equipment installation sensibly, attention should also be paid to verifying customer-transformer relationships, and a two-way station-area customer identification instrument can be used to assist on-site verification.

The above are merely preferred embodiments of the present invention and do not limit the invention in any form. Any person of ordinary skill in the art can readily implement the invention as shown in the accompanying drawings and described above. Equivalent changes made by those skilled in the art through minor alterations, modifications, and developments of the technical content disclosed above, without departing from the scope of the technical solution of the invention, are equivalent embodiments of the invention; likewise, any equivalent alterations, modifications, and developments made to the above embodiments in accordance with the substantive technology of the invention remain within the scope of protection of the technical solution of the invention.

Claims (10)

1. A method for detecting abnormal electricity consumption data, characterized by comprising the following steps:
data acquisition: obtaining electricity consumption data through electricity consumption information collection;
data cleaning: cleaning the collected electricity consumption data and detecting the types of dirty data in it to obtain valid electricity consumption data, the dirty data types comprising missing values, duplicate values, extreme maximum and minimum values, load spikes, and impact negative values;
dimensionality reduction: using daily load characteristic indices to reduce the feature dimension of the valid electricity consumption data, the indices comprising the load rate, peak-valley difference rate, maximum utilization hour rate, peak-period load rate, flat-period load rate, and valley-period load rate;
model building: assembling several isolation trees into an isolation forest, building a first analysis model with the isolation forest algorithm, and evaluating the model using evaluation curves;
screening abnormal users: screening target data with the first analysis model and performing data mining on the screened data to identify users with abnormal electricity consumption.

2. The method for detecting abnormal electricity consumption data according to claim 1, characterized in that the electricity consumption information collection includes cloud storage, the cloud storage being used to spread the electricity consumption data across multiple independent storage servers, the server types comprising a metadata management service, a volume management service, and a block data management service.

3. The method for detecting abnormal electricity consumption data according to claim 1 or 2, characterized in that the data cleaning step further comprises filling missing values according to the periodic fluctuation of the electricity load, using the following formula:
where X_i is the electricity load at the current moment; i is the moment at which load data is missing, taking values 1 to 24; and a1 and a2 are the weighting coefficients for, respectively, the loads at the corresponding moment on the preceding and following days and the loads at the two time points before and after the current moment.

4. The method for detecting abnormal electricity consumption data according to claim 3, characterized by further comprising, before the data acquisition step, the step of:
establishing a management plan: establishing station-area line loss management indicators, whose status labels comprise the coverage, customer-transformer, recoverable, data, and line loss categories; labeling the collected electricity consumption data of multiple station areas by status; and taking corresponding control measures for each status to form a station-area line loss management plan.

5. The method for detecting abnormal electricity consumption data according to claim 1, characterized in that the model building step further comprises evaluating the model using the receiver operating characteristic (ROC) curve, the area under the curve (AUC), the cumulative recall curve, and the P-R curve, the P-R curve having precision as the ordinate and recall as the abscissa.

6. The method for detecting abnormal electricity consumption data according to claim 1 or 5, characterized in that the isolation forest algorithm comprises a first-stage algorithm and a second-stage algorithm, the first-stage algorithm comprising building multiple isolation trees to form an isolation forest, and the second-stage algorithm comprising using the generated isolation forest to evaluate test data and computing an anomaly score for the data under test.

7. An electronic device, characterized by comprising: a processor; a memory; and a program, wherein the program is stored in the memory and configured to be executed by the processor, the program being for performing the method of claim 1.

8. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, performs the method of claim 1.

9. An abnormal electricity consumption data detection system, characterized by comprising a data acquisition module, a data cleaning module, a dimensionality reduction module, a model building module, and an abnormal user screening module, wherein:
the data acquisition module obtains electricity consumption data through electricity consumption information collection;
the data cleaning module cleans the collected electricity consumption data and detects the types of dirty data in it to obtain valid electricity consumption data, the dirty data types comprising missing values, duplicate values, extreme maximum and minimum values, load spikes, and impact negative values;
the dimensionality reduction module uses daily load characteristic indices to reduce the feature dimension of the valid electricity consumption data, the indices comprising the load rate, peak-valley difference rate, maximum utilization hour rate, peak-period load rate, flat-period load rate, and valley-period load rate;
the model building module assembles several isolation trees into an isolation forest, builds a first analysis model with the isolation forest algorithm, and evaluates the model using evaluation curves;
the abnormal user screening module uses the first analysis model to screen the target data and performs data mining on the screened data to identify users with abnormal electricity consumption.

10. The abnormal electricity consumption data detection system according to claim 1, characterized by further comprising a management plan module for establishing station-area line loss management indicators, whose status labels comprise the coverage, customer-transformer, recoverable, data, and line loss categories; the collected electricity consumption data of multiple station areas is labeled by status, and corresponding control measures are taken for each status to form a station-area line loss management plan;
the data acquisition module includes a cloud storage unit that spreads the electricity consumption data across multiple independent storage servers, the server types comprising a metadata management service, a volume management service, and a block data management service;
the data cleaning module includes a missing value filling unit that fills missing values according to the periodic fluctuation of the electricity load, using the following formula:
where X_i is the electricity load at the current moment; i is the moment at which load data is missing, taking values 1 to 24; and a1 and a2 are the weighting coefficients for, respectively, the loads at the corresponding moment on the preceding and following days and the loads at the two time points before and after the current moment.
CN201910641996.7A 2019-07-16 2019-07-16 A method, system, device and storage medium for detecting abnormal power consumption data Pending CN110503570A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910641996.7A CN110503570A (en) 2019-07-16 2019-07-16 A method, system, device and storage medium for detecting abnormal power consumption data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910641996.7A CN110503570A (en) 2019-07-16 2019-07-16 A method, system, device and storage medium for detecting abnormal power consumption data

Publications (1)

Publication Number Publication Date
CN110503570A (en) 2019-11-26

Family

ID=68586132

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910641996.7A Pending CN110503570A (en) 2019-07-16 2019-07-16 A method, system, device and storage medium for detecting abnormal power consumption data

Country Status (1)

Country Link
CN (1) CN110503570A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107657288A (en) * 2017-10-26 2018-02-02 国网冀北电力有限公司 A kind of power scheduling flow data method for detecting abnormality based on isolated forest algorithm
CN108011782A (en) * 2017-12-06 2018-05-08 北京百度网讯科技有限公司 Method and apparatus for pushing warning information
CN108985632A (en) * 2018-07-16 2018-12-11 国网上海市电力公司 A kind of electricity consumption data abnormality detection model based on isolated forest algorithm
CN110189232A (en) * 2019-05-14 2019-08-30 三峡大学 Abnormal Analysis Method of Electricity Information Collection Data Based on Isolated Forest Algorithm

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111177138A (en) * 2019-12-30 2020-05-19 深圳市恒泰能源科技有限公司 Big data analysis method, device, equipment and storage medium for power demand side
CN111522864A (en) * 2020-04-21 2020-08-11 国网四川省电力公司电力科学研究院 Enterprise production mode recognition and transfer production early warning method based on electricity consumption data
CN111522864B (en) * 2020-04-21 2020-11-10 国网四川省电力公司电力科学研究院 Enterprise production mode recognition and transfer production early warning method based on electricity consumption data
CN111611255A (en) * 2020-04-30 2020-09-01 广东良实机电工程有限公司 Equipment energy consumption energy-saving management method and device, terminal equipment and storage medium
CN111694822A (en) * 2020-04-30 2020-09-22 云南电网有限责任公司信息中心 Low-voltage distribution network operation state data acquisition system and acquisition method thereof
CN111611255B (en) * 2020-04-30 2023-12-12 广东良实机电工程有限公司 Equipment energy consumption energy-saving management method and device, terminal equipment and storage medium
CN111669368B (en) * 2020-05-07 2022-12-06 宜通世纪科技股份有限公司 End-to-end network sensing abnormity detection and analysis method, system, device and medium
CN111669368A (en) * 2020-05-07 2020-09-15 宜通世纪科技股份有限公司 End-to-end network sensing abnormity detection and analysis method, system, device and medium
CN111666276A (en) * 2020-06-11 2020-09-15 上海积成能源科技有限公司 Method for eliminating abnormal data by applying isolated forest algorithm in power load prediction
CN112362292A (en) * 2020-10-30 2021-02-12 北京交通大学 Method for anomaly detection of wind tunnel test data
CN113033897A (en) * 2021-03-26 2021-06-25 国网上海市电力公司 Method for identifying station area subscriber variation relation based on electric quantity correlation of subscriber branch
CN113657872A (en) * 2021-09-02 2021-11-16 南方电网数字电网研究院有限公司 Method and device for analyzing abnormal archive information of power consumer and computer equipment
CN114386471A (en) * 2021-10-29 2022-04-22 国网陕西省电力公司西安供电公司 Anomaly detection method, apparatus, device and medium for power data
CN115630022A (en) * 2022-09-19 2023-01-20 淮安明日网络科技有限公司 Micro-module linking system for picture window and product
CN117955084A (en) * 2023-12-11 2024-04-30 广东电网有限责任公司 Power distribution network self-healing capacity analysis method and device based on data driving
CN117874459A (en) * 2023-12-28 2024-04-12 西安中创新能网络科技有限责任公司 A power consumption abnormality monitoring system and method based on topological structure
CN118035660A (en) * 2024-01-31 2024-05-14 浙江清芯微电子有限公司 Metering parameter intelligent cleaning method and system based on self-contained MCU carrier chip
CN118035660B (en) * 2024-01-31 2024-09-24 浙江清芯微电子有限公司 Metering parameter intelligent cleaning method and system based on self-contained MCU carrier chip

Similar Documents

Publication Publication Date Title
CN110503570A (en) A method, system, device and storage medium for detecting abnormal power consumption data
CN110189232A (en) Abnormal Analysis Method of Electricity Information Collection Data Based on Isolated Forest Algorithm
CN111639237B (en) Electric power communication network risk assessment system based on clustering and association rule mining
CN106570581B (en) Load prediction system and method under energy internet environment based on Attribute Association
CN111860600B (en) User electricity utilization characteristic selection method based on maximum correlation minimum redundancy criterion
CN114048870A (en) An abnormal monitoring method of power system based on intelligent mining of log features
CN110807550A (en) Distribution transformer overload identification early warning method based on neural network and terminal equipment
CN105184455A (en) High dimension visualized analysis method facing urban electric power data analysis
CN113435610B (en) Method for determining classified line loss based on low-voltage internet of things sensing terminal
CN115905319B (en) A method and system for automatically identifying abnormal electricity charges of massive users
CN116797049B (en) Quantitative assessment method for differentiated energy-saving potential of distribution network
CN114118269A (en) Energy big data aggregation analysis method based on typical business scenarios
JP2024015999A (en) Big data screening system for abnormal capacity of distribution transformers
CN118917810A (en) Customs comprehensive information intelligent supervision system based on big data and artificial intelligence
CN119782966B (en) Power data quality evaluation method and system based on anomaly detection
CN113256444A (en) Low-voltage transformer area household transformation relation identification method and device
CN115796665A (en) Multi-index carbon efficiency grading evaluation method and device for green energy power generation project
CN110490220A (en) A kind of bus load discrimination method and system
CN112257964B (en) A method for demand aggregation modeling of load-intensive urban smart parks
CN114154776A (en) Distribution network operation planning comprehensive evaluation and investment benefit analysis method
CN112381422A (en) Method and device for determining performance of photovoltaic power station
CN117371206A (en) Power grid frequency stability characteristic analysis method, device, equipment and medium
CN117350447A (en) A multi-source heterogeneous power data fusion algorithm suitable for power grids
CN117493923A (en) Method and system for repairing abnormal data of low-voltage distribution transformer area containing distributed photovoltaic
CN112488360B (en) Distribution transformer abnormality analysis and early warning method based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20191126)