CN106951431A - A cluster monitoring data collection method

Info

Publication number: CN106951431A
Application number: CN201710036456.7A
Authority: CN
Inventors: 吴宗泽; 黄振豪; 张勰; 何煦; 林志勇; 蔡旭坤; 巫辉强
Original assignee: South China University of Technology SCUT; Guangdong University of Technology
Current assignee: South China University of Technology SCUT; Guangdong University of Technology
Priority date: 2016-12-16
Filing date: 2017-01-18
Publication date: 2017-07-14

Abstract

The invention discloses a cluster monitoring data collection method, comprising: a business module pushes cluster monitoring data to a monitoring module, where the data is stored in the key-value data structure KVData; the Redis instance of each cluster node is initialized, the cluster monitoring data is persisted in a hash table, and a set stores the indexes of non-zero monitoring data; based on the hash table and set provided by Redis, a statistics module applies a unified statistical method to aggregate the cluster monitoring data within a single statistical period; once every n minutes, the indexes of the non-zero data in KVData are collected and the monitoring data is reported to the data center. The method offers a unified collection format, aggregation before reporting, a small storage footprint at the cluster monitoring data center, easy extension of monitoring items, efficient collection and reporting of cluster monitoring data, and low coupling between the collection module and the business modules.

Description

A cluster monitoring data collection method

Technical field

The invention relates to the technical field of IT cluster monitoring, and in particular to a cluster monitoring data collection method.

Background

With the rapid development of Internet technology, people's lives have become ever more closely tied to the Internet, and Internet services are used around the clock. As these services become part of daily life, their quality and stability matter more and more. To provide stable services under high concurrency, Internet service providers usually deploy services on distributed clusters, and cluster monitoring is an important safeguard for the stability of such clusters. No cluster service can be 100% reliable, so cluster monitoring helps development and operations staff quickly detect, locate, and resolve faults and restore service. Cluster monitoring is built on data: a monitoring service is an automated anomaly detection and discovery service, and data is both the object it inspects and an important basis for judging how a service is running.

In existing cluster services, the number of nodes is generally large, there are many monitoring items, and different modules use different data collection formats, so data management and analysis are costly. In addition, each node usually reports data at a fixed frequency, so the volume of data collected by the monitoring system is huge and can easily become a bottleneck of the cluster monitoring service. To reflect the running state of cluster service programs fully and accurately, the first task is to collect and report cluster monitoring data efficiently. It is therefore necessary to design a reasonable data storage structure and an efficient statistical method: adopt a unified storage format for all monitoring items, aggregate the monitoring data before it is reported so as to reduce the back-end load, and collect and report the aggregated data efficiently.

Summary of the invention

The object of the present invention is to overcome the above defects of the prior art by providing a cluster monitoring data collection method.

This object can be achieved by the following technical solution:

A cluster monitoring data collection method, comprising the following steps:

S1. A business module pushes cluster monitoring data to a monitoring module, where the data is stored in the key-value data structure KVData. Each key corresponds to a single monitoring item of a single node, each value is the numerical value of that monitoring item, and the cluster monitoring data is addressed by node ID and single-node monitoring item Item;

S2. The Redis instance of each cluster node is initialized. The cluster monitoring data is persisted in a hash table provided by Redis, and a set provided by Redis stores the indexes of non-zero monitoring data;

S3. Based on the hash table and set provided by Redis, a statistics module applies a unified statistical method to aggregate the cluster monitoring data within a single statistical period;

S4. Once every n minutes, a statistics reporting module traverses the set to obtain the indexes of non-zero data, looks up the corresponding monitoring statistics in the Redis hash table according to the resulting index set, and reports the data.

Further, step S2 specifically comprises:

S2.1. Initializing the hash table: each cluster node creates a hash table in its Redis instance. The hash table is named after the node ID, the different monitoring items Item are the fields of the hash table, and the field values store the data of the corresponding monitoring items;

S2.2. Initializing the set: when a cluster node is initialized, it creates an empty set in its Redis instance. The key of the set is the node ID, and the set stores the indexes of non-zero monitoring data.

Further, step S3 specifically comprises:

S3.1. Obtaining the timestamp clock, which records the start of the statistical period. If clock is 0, it is set to the current system time; if clock is not 0, this step is skipped;

S3.2. Creating a transaction;

S3.3. Generating the command that accumulates monitoring data into the monitoring item of the hash table;

S3.4. Generating the command that writes the index of the non-zero monitoring data into the set;

S3.5. Executing the commands to aggregate the cluster monitoring data within a single statistical period.
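Steps S3.1 to S3.5 can be sketched as a small command builder. This is a minimal illustration, not the patented implementation: the class `PeriodClock`, the function `build_batch`, and the key names in the usage note are hypothetical, and mapping "create a transaction" and "execute" onto the Redis commands MULTI and EXEC is an assumption. The text names both the hash table and the set after the node ID; because a Redis key holds a single type, the sketch takes the two key names as separate parameters. Skipping zero values is also an assumption.

```python
import time
from typing import List, Optional, Tuple

Command = Tuple[str, ...]


class PeriodClock:
    """Tracks the start of the current statistical period (S3.1)."""

    def __init__(self) -> None:
        self.clock = 0.0  # 0 means no period is in progress

    def ensure_started(self, now: Optional[float] = None) -> float:
        # Only set the timestamp when it is 0; otherwise skip, as in S3.1.
        if self.clock == 0:
            self.clock = time.time() if now is None else now
        return self.clock


def build_batch(hash_key: str, set_key: str, item: str, value: int) -> List[Command]:
    """Build the command batch for one pushed value (S3.2 to S3.5)."""
    if value == 0:
        return []  # assumption: a zero value changes nothing and is not indexed
    return [
        ("MULTI",),                                # S3.2: open a transaction (assumed)
        ("HINCRBY", hash_key, item, str(value)),   # S3.3: accumulate into the hash field
        ("SADD", set_key, item),                   # S3.4: index the non-zero item
        ("EXEC",),                                 # S3.5: execute the batch (assumed)
    ]
```

For example, `build_batch("node-01:stats", "node-01:nonzero", "rpc_calls", 5)` yields four commands whose second entry is `("HINCRBY", "node-01:stats", "rpc_calls", "5")`.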

Further, step S4 specifically comprises:

S4.1. Generating a random number Rdm_value and setting the lock timeout;

S4.2. Setting the lock lock_redis_ID;

S4.3. Obtaining all members of the set and saving the indexes of the non-zero monitoring items;

S4.4. Traversing the non-zero monitoring items in the hash table according to the indexes, reading the data, and reporting it;

S4.5. Creating a Redis transaction;

S4.6. Generating the Redis command that clears the hash table data;

S4.7. Generating the Redis command that empties the set;

S4.8. Executing the commands;

S4.9. Releasing the lock lock_redis_ID;

S4.10. Setting the timestamp clock to 0;

S4.11. Repeating the report to the data center once every n minutes.
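The text does not say how Rdm_value and the lock timeout in S4.1, S4.2, and S4.9 are used. One common reading is that the random value is stored as the lock's value, so that only the holder can release it, and that the timeout frees the lock if the holder fails. The in-memory model below sketches that reading; the class `TokenLock` and its methods are hypothetical. With a real Redis instance, a pattern like this is usually built on `SET key value NX PX milliseconds`.

```python
import secrets
import time
from typing import Dict, Optional, Tuple


class TokenLock:
    """In-memory model of named locks, each holding (token, expiry time)."""

    def __init__(self) -> None:
        self._held: Dict[str, Tuple[str, float]] = {}

    def acquire(self, name: str, timeout_s: float,
                now: Optional[float] = None) -> Optional[str]:
        """Return a fresh token if the lock is free or expired, else None."""
        now = time.monotonic() if now is None else now
        current = self._held.get(name)
        if current is not None and current[1] > now:
            return None  # still held and not yet expired
        token = secrets.token_hex(16)  # plays the role of Rdm_value
        self._held[name] = (token, now + timeout_s)
        return token

    def release(self, name: str, token: str) -> bool:
        """Release only if the caller presents the token that took the lock."""
        current = self._held.get(name)
        if current is None or current[0] != token:
            return False  # not the owner; leave the lock untouched
        del self._held[name]
        return True
```

Comparing the token on release prevents a reporter whose lock has already expired from releasing a lock that another reporter has since acquired.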

Further, the key of the key-value data structure KVData is formed by concatenating the node ID and the single-node monitoring item Item as strings, with a comma between the two substrings to separate them logically, so that the string-typed KVData key is stored in the form "ID,Item".

Further, the monitoring items include the performance indicators common to every module under the cluster RPC call framework: interface call latency, call success rate, volume of reported data, monitoring data specific to each business logic, and exception data.

Further, in step S3 the statistics module aggregates single-node monitoring data by node ID and monitoring item, which reduces the space needed to store monitoring data and the volume of monitoring data reported later.
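The effect of aggregating by node ID and monitoring item can be illustrated with a small worked example. The function `aggregate` and the sample item names are hypothetical; the sketch only shows how many pushed values collapse into how many reported values.

```python
from collections import Counter
from typing import Iterable, Tuple


def aggregate(samples: Iterable[Tuple[str, str, int]]) -> Counter:
    """Sum pushed values per (node ID, monitoring item) pair."""
    totals: Counter = Counter()
    for node_id, item, value in samples:
        totals[(node_id, item)] += value
    return totals


# 1000 successful calls and 3 errors pushed one at a time within one period.
samples = [("node-01", "rpc_calls", 1)] * 1000 + [("node-01", "rpc_errors", 1)] * 3
totals = aggregate(samples)
print(len(samples), "pushed values ->", len(totals), "reported values")
# prints: 1003 pushed values -> 2 reported values
```

However many values are pushed during a period, the number of values stored and reported is bounded by the number of distinct monitoring items on the node.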

Further, the command for accumulating monitoring data into a monitoring item of the hash table is "HINCRBY [node ID] [monitoring item name] [monitoring data value]";

The command for writing the index of non-zero monitoring data into the set is "SADD [node ID] [monitoring item name]";

In step S3.5, if Redis is locked, the cluster monitoring data is placed in a waiting queue.

Further, the command for obtaining all members of the set is "SMEMBERS [node ID]";

The Redis command for clearing the hash table data is "HSET [node ID] [monitoring item name] 0";

The Redis command for emptying the set is "SINTER [node ID]".

Further, n is 2.

Compared with the prior art, the present invention has the following advantages and effects:

(1) It unifies the data collection format of the monitoring items of different modules as KVData, which makes data aggregation and processing easier and improves the efficiency of data collection and reporting.

(2) Monitoring data is managed with a key-value data structure, so monitoring items can be added smoothly and the scheme is highly extensible.

(3) KVData is managed with the in-memory database Redis, which gives fast reads and writes.

(4) It provides a statistical method for monitoring data that avoids the performance loss caused by every cluster node frequently reporting large amounts of data, making the reporting of monitoring data efficient.

(5) During aggregation, a Redis set stores the indexes of non-zero monitoring items, which shortens the time needed to find valid data.

(6) The data reported to the data center is aggregated by node ID and monitoring item, which improves reporting efficiency and reduces the load on the back-end monitoring data center.

Description of the drawings

Fig. 1 is a diagram of the logical steps of the cluster monitoring data collection method disclosed by the present invention;

Fig. 2 is an architecture diagram of the cluster monitoring data collection system of the method;

Fig. 3 is a flow chart of the initialization of the single-node monitoring data collection service;

Fig. 4 is a flow chart of single-node monitoring data statistics in the cluster;

Fig. 5 is a flow chart of monitoring data reporting.

Detailed description

To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

Embodiment

This embodiment discloses a cluster monitoring data collection method. Fig. 1 shows the logical steps of the method and Fig. 2 shows the architecture of the cluster monitoring data collection system. The method comprises the following steps:

S1. A business module pushes cluster monitoring data to a monitoring module, where the data is stored in the key-value data structure KVData. Each key corresponds to a single monitoring item of a single node, each value is the numerical value of that monitoring item, and the cluster monitoring data is addressed by node ID and single-node monitoring item Item.

In a specific implementation, the key of the key-value data structure KVData is formed by concatenating the node ID and the single-node monitoring item Item as strings, with a comma between the two substrings to separate them logically, so that the string-typed KVData key is stored in the form "ID,Item".
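The "ID,Item" key format can be sketched as two small helpers. The function names `make_key` and `split_key` and the sample node and item names are hypothetical; rejecting commas in the node ID is an assumption made so that the key can be split back unambiguously.

```python
from typing import Tuple

SEPARATOR = ","  # the text specifies a comma between node ID and item


def make_key(node_id: str, item: str) -> str:
    """Build the KVData key in the form "ID,Item"."""
    if SEPARATOR in node_id:
        raise ValueError("node ID must not contain the separator")
    return f"{node_id}{SEPARATOR}{item}"


def split_key(key: str) -> Tuple[str, str]:
    """Recover (node ID, item); split once so the item may contain commas."""
    node_id, item = key.split(SEPARATOR, 1)
    return node_id, item


# One KVData entry: the key addresses a single item of a single node.
kv_data = {make_key("node-01", "rpc_call_ms"): 37}
```

Here `kv_data` holds the single entry `"node-01,rpc_call_ms"` with value 37, and `split_key("node-01,rpc_call_ms")` returns the pair `("node-01", "rpc_call_ms")`.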

The cluster monitoring data is addressed by node ID and single-node monitoring item Item.

The monitoring items include the performance indicators common to every module under the cluster RPC call framework: interface call latency, call success rate, volume of reported data, monitoring data specific to each business logic, and exception data.

In step S1, the different monitoring items of all business modules use a unified storage format. The key-value data structure is easy to extend, and indexing monitoring data by node ID and monitoring item speeds up addressing.

The key point of this step is to unify the storage structure of the different monitoring items of each module.

S2. The Redis instance of each cluster node is initialized. The cluster monitoring data is persisted in a hash table provided by Redis, and a set provided by Redis stores the indexes of non-zero monitoring data.

The flow chart of this step is shown in Fig. 3. The step specifically comprises:

S2.1. When a cluster node is initialized, it creates a hash table in its Redis instance. The hash table is named after the node ID, the different monitoring items Item are the fields of the hash table, and the field values store the data of the corresponding monitoring items;

S2.2. When a cluster node is initialized, it creates an empty set in its Redis instance. The key of the set is the node ID, and the set stores the indexes of non-zero monitoring data.

In a specific implementation, the Redis instance of the cluster node is initialized, the cluster monitoring data is persisted in the hash table provided by Redis, and the set stores the indexes of non-zero monitoring data; that is, each member of the set is the field name of a non-zero monitoring value in the hash table. The field names of all non-zero monitoring data can therefore be obtained by traversing the members of the set.
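The storage layout set up in S2.1 and S2.2 can be modeled in memory as follows. This is an illustrative stand-in for a node's Redis instance, not a Redis client; the class `NodeStore`, the key suffixes, and the initial field value of 0 are assumptions. Because a Redis key holds a value of a single type, a hash and a set cannot both live under the bare node ID, so the sketch derives two hypothetical key names from it.

```python
from typing import Dict, List, Set


class NodeStore:
    """In-memory stand-in for one node's Redis hash and set (illustrative only)."""

    def __init__(self, node_id: str, items: List[str]) -> None:
        self.node_id = node_id
        # S2.1: one hash per node, one field per monitoring item (assumed to start at 0).
        self.hash: Dict[str, int] = {item: 0 for item in items}
        # S2.2: an empty set that will hold the field names of non-zero data.
        self.nonzero: Set[str] = set()

    def hash_key(self) -> str:
        """Hypothetical Redis key name for the node's hash."""
        return f"{self.node_id}:stats"

    def set_key(self) -> str:
        """Hypothetical Redis key name for the node's non-zero index set."""
        return f"{self.node_id}:nonzero"


store = NodeStore("node-01", ["rpc_call_ms", "rpc_calls", "rpc_errors"])
```

After initialization, `store.hash` has one zero-valued field per monitoring item and `store.nonzero` is empty, so a reporter that reads only the set has nothing to send until data arrives.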

The key point of this step is initializing the hash table and the set in Redis.

S3. Based on the hash table and set provided by Redis, the statistics module applies a unified statistical method to aggregate the cluster monitoring data within a single statistical period;

The flow chart of this step is shown in Fig. 4. The step specifically comprises:

S3.1. Obtaining the timestamp clock. If clock is 0, it is set to the current system time; if clock is not 0, this step is skipped;

S3.2. Creating a transaction;

S3.3. Generating the command that accumulates monitoring data into the monitoring item of the hash table;

S3.4. Generating the command that writes the index of the non-zero monitoring data into the set;

S3.5. Executing the commands to aggregate the cluster monitoring data within a single statistical period.

In step S3, the statistics module aggregates single-node monitoring data by node ID and monitoring item, which reduces the space needed to store monitoring data and the volume of monitoring data reported later.

In step S3, the timestamp clock records the start of the statistical period.

In step S3.3, monitoring data is accumulated into the monitoring item of the hash table with the command "HINCRBY [node ID] [monitoring item name] [monitoring data value]".

In step S3.4, the index of the non-zero data is written into the set with the command "SADD [node ID] [monitoring item name]".

In step S3.5, if Redis is locked, the cluster monitoring data is placed in a waiting queue.
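The waiting-queue behavior of step S3.5 can be sketched with an in-memory model. The class `NodeStats` is hypothetical, and the text does not say when queued data is applied; replaying the queue in arrival order when the lock is released is an assumption.

```python
from collections import deque
from typing import Deque, Dict, Set, Tuple


class NodeStats:
    """In-memory stand-in for one node's Redis hash and set (illustrative only)."""

    def __init__(self) -> None:
        self.hash: Dict[str, int] = {}   # monitoring item -> accumulated value
        self.nonzero: Set[str] = set()   # names of items holding non-zero data
        self.locked = False              # True while a report is in progress
        self.waiting: Deque[Tuple[str, int]] = deque()

    def record(self, item: str, value: int) -> None:
        """Accumulate a value, or queue it if the store is locked (S3.5)."""
        if self.locked:
            self.waiting.append((item, value))
            return
        self._apply(item, value)

    def _apply(self, item: str, value: int) -> None:
        self.hash[item] = self.hash.get(item, 0) + value  # models HINCRBY
        self.nonzero.add(item)                            # models SADD

    def unlock(self) -> None:
        """Release the lock and replay queued values in arrival order (assumed)."""
        self.locked = False
        while self.waiting:
            self._apply(*self.waiting.popleft())
```

Queuing rather than dropping values means data pushed during a report is counted in the next statistical period instead of being lost.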

The key point of this step is aggregating the data of the different monitoring items of each module on a single node.

S4. Once every n minutes, the statistics reporting module traverses the set to obtain the indexes of non-zero data, looks up the corresponding monitoring statistics in the Redis hash table according to the resulting index set, and reports the data.

The flow chart of this step is shown in Fig. 5. The step specifically comprises:

S4.1. Generating a random number Rdm_value and setting the lock timeout;

S4.2. Setting the lock lock_redis_ID;

S4.3. Obtaining all members of the set and saving the indexes of the non-zero monitoring items;

S4.4. Traversing the non-zero monitoring items in the hash table according to the indexes, reading the data, and reporting it;

S4.5. Creating a Redis transaction;

S4.6. Generating the Redis command that clears the hash table data;

S4.7. Generating the Redis command that empties the set;

S4.8. Executing the commands;

S4.9. Releasing the lock lock_redis_ID;

S4.10. Setting the timestamp clock to 0;

S4.11. Repeating the report to the data center once every n minutes.

In a preferred implementation, n is 2, i.e. the data reporting logic runs once every 2 minutes. It should be emphasized that this value does not limit the technical solution.

In step S4.3, all members of the set are obtained with the command "SMEMBERS [node ID]".

In step S4.6, the hash table data is cleared with the command "HSET [node ID] [monitoring item name] 0".

In step S4.7, the set is emptied with the command "SINTER [node ID]".

In step S4.10, the timestamp clock is set to 0 to end the previous statistical period and start a new one.

In this method for efficiently reporting cluster monitoring data, each time the single-node monitoring data of a cluster node has been reported successfully, the original monitoring data is cleared to avoid counting it twice.
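One reporting cycle (S4.3 to S4.8) can be sketched over plain Python containers. The function `report_cycle` and the `send` callback are hypothetical, and the lock handling of S4.1, S4.2, and S4.9 is left out for brevity. Note that in Redis, SINTER returns the intersection of sets and does not modify them, so an actual implementation would likely need a removing command such as DEL or SREM to empty the set; the sketch simply clears a Python set.

```python
from typing import Callable, Dict, Set


def report_cycle(stats_hash: Dict[str, int], nonzero: Set[str],
                 send: Callable[[Dict[str, int]], bool]) -> Dict[str, int]:
    """Report the non-zero items once; clear them only if the report succeeds."""
    members = set(nonzero)                                   # S4.3: models SMEMBERS
    payload = {item: stats_hash[item] for item in members}   # S4.4: read by index
    if payload and send(payload):
        for item in members:
            stats_hash[item] = 0                             # S4.6: models HSET ... 0
        nonzero.clear()                                      # S4.7: empty the set
        return payload
    return {}  # nothing to report, or the report failed; data is kept
```

Because only the items listed in the set are read, fields that stayed at zero during the period are never touched, and a failed report leaves both the hash and the set intact for the next cycle.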

The monitoring data of each node in the cluster is collected and reported by the data reporting module. By setting the CPU affinity of the data reporting module, the method can reduce the coupling between the data reporting service and the business modules, so that the collection and reporting of cluster monitoring data and the business modules do not affect each other.
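How CPU affinity might be set for a reporting process can be sketched with Python's standard library. This is only an illustration: the function `pin_to_one_cpu` and the policy of choosing the highest-numbered allowed CPU are hypothetical, and `os.sched_setaffinity` is available only on some platforms (notably Linux), which the sketch checks for.

```python
import os
from typing import Optional, Set


def pin_to_one_cpu() -> Optional[Set[int]]:
    """Pin the calling process to a single CPU; return the new CPU set, or None."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # affinity control is not available on this platform
    allowed = sorted(os.sched_getaffinity(0))  # CPUs this process may run on
    target = {allowed[-1]}                     # hypothetical policy: last allowed CPU
    os.sched_setaffinity(0, target)            # pid 0 means the calling process
    return set(os.sched_getaffinity(0))
```

Calling this at the start of the reporting process confines it to one core, so that the business modules can be kept on the remaining cores by a complementary affinity setting.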

The key point of this step is reporting the already aggregated monitoring data to the data center.

In summary, the cluster monitoring data collection method provides a single key-value data structure, KVData, for the different monitoring items of each cluster node. KVData stores the monitoring data and unifies the way it is collected. The hash table and set of the Redis database maintain the single-node monitoring data, which is aggregated by monitoring item, persisted, and reported according to a policy, effectively improving the efficiency of cluster monitoring data collection and reporting.

The above embodiment is a preferred implementation of the present invention, but implementations of the invention are not limited to it. Any other change, modification, substitution, combination, or simplification that does not depart from the spirit and principle of the invention is an equivalent replacement and falls within the scope of protection of the present invention.

Claims

1. A cluster monitoring data collection method, characterized in that the method comprises the following steps:

S1. A business module pushes cluster monitoring data to a monitoring module, where the data is stored in the key-value data structure KVData; each key corresponds to a single monitoring item of a single node, each value is the numerical value of that monitoring item, and the cluster monitoring data is addressed by node ID and single-node monitoring item Item;

S2. The Redis instance of each cluster node is initialized; the cluster monitoring data is persisted in a hash table provided by Redis, and a set provided by Redis stores the indexes of non-zero monitoring data;

S3. Based on the hash table and set provided by Redis, a statistics module applies a unified statistical method to aggregate the cluster monitoring data within a single statistical period;

S4. Once every n minutes, a statistics reporting module traverses the set to obtain the indexes of non-zero data, looks up the corresponding monitoring statistics in the Redis hash table according to the resulting index set, and reports the data.

2. The cluster monitoring data collection method according to claim 1, characterized in that step S2 specifically comprises:

S2.1. Initializing the hash table: each cluster node creates a hash table in its Redis instance, the hash table is named after the node ID, the different monitoring items Item are the fields of the hash table, and the field values store the data of the corresponding monitoring items;

S2.2. Initializing the set: when a cluster node is initialized, it creates an empty set in its Redis instance, the key of the set is the node ID, and the set stores the indexes of non-zero monitoring data.

3. The cluster monitoring data collection method according to claim 1, characterized in that step S3 specifically comprises:

S3.1. Obtaining the timestamp clock, which records the start of the statistical period; if clock is 0, it is set to the current system time, and if clock is not 0, this step is skipped;

S3.2. Creating a transaction;

S3.3. Generating the command that accumulates monitoring data into the monitoring item of the hash table;

S3.4. Generating the command that writes the index of the non-zero monitoring data into the set;

S3.5. Executing the commands to aggregate the cluster monitoring data within a single statistical period.

4. The cluster monitoring data collection method according to claim 1, characterized in that step S4 specifically comprises:

S4.1. Generating a random number Rdm_value and setting the lock timeout;

S4.2. Setting the lock lock_redis_ID;

S4.3. Obtaining all members of the set and saving the indexes of the non-zero monitoring items;

S4.4. Traversing the non-zero monitoring items in the hash table according to the indexes, reading the data, and reporting it;

S4.5. Creating a Redis transaction;

S4.6. Generating the Redis command that clears the hash table data;

S4.7. Generating the Redis command that empties the set;

S4.8. Executing the commands;

S4.9. Releasing the lock lock_redis_ID;

S4.10. Setting the timestamp clock to 0;

S4.11. Repeating the report to the data center once every n minutes.

5. The cluster monitoring data collection method according to claim 1, characterized in that the key of the key-value data structure KVData is formed by concatenating the node ID and the single-node monitoring item Item as strings, with a comma between the two substrings to separate them logically, so that the string-typed KVData key is stored in the form "ID,Item".

6. The cluster monitoring data collection method according to claim 1, characterized in that the monitoring items include the performance indicators common to every module under the cluster RPC call framework, the performance indicators comprising: interface call latency, call success rate, volume of reported data, monitoring data specific to each business logic, and exception data.

7. The cluster monitoring data collection method according to claim 1, characterized in that in step S3 the statistics module aggregates single-node monitoring data by node ID and monitoring item, reducing the space needed to store monitoring data and the volume of monitoring data reported later.

8. The cluster monitoring data collection method according to claim 3, characterized in that

the command for accumulating monitoring data into a monitoring item of the hash table is "HINCRBY [node ID] [monitoring item name] [monitoring data value]";

the command for writing the index of non-zero monitoring data into the set is "SADD [node ID] [monitoring item name]";

in step S3.5, if Redis is locked, the cluster monitoring data is placed in a waiting queue.

9. The cluster monitoring data collection method according to claim 4, characterized in that

the command for obtaining all members of the set is "SMEMBERS [node ID]";

the Redis command for clearing the hash table data is "HSET [node ID] [monitoring item name] 0";

the Redis command for emptying the set is "SINTER [node ID]".

10. The cluster monitoring data collection method according to claim 1 or 4, characterized in that n is 2.