CN107957962A - Efficient graph partitioning method and system for large-graph computing - Google Patents
Efficient graph partitioning method and system for large-graph computing
- Publication number
- CN107957962A (application number CN201711375929.2A)
- Authority
- CN
- China
- Prior art keywords
- vertex
- DRAM
- NVM
- entry
- processing unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0866—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
- G06F12/0871—Allocation or management of cache space
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0877—Cache access modes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0893—Caches characterised by their organisation or structure
- G06F12/0895—Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The present invention provides an efficient graph partitioning method and system for large-graph computing. The method divides the graph data into vertices and places the vertices in random order into a queue. The first vertex in the queue is assigned to a partition, i.e., to a processing unit; after assignment, the vertex's partition information is stored as the value, keyed by each of its neighbors, in the form of dictionary entries in DRAM or NVM. For each subsequent vertex, the method first checks whether DRAM or NVM holds an entry keyed by that vertex. If such an entry exists, the vertex is assigned to a processing unit directly according to the partition information recorded in that entry; if not, it is assigned to the processing unit with the smallest load. After each vertex is assigned, its partition information is stored as the value, keyed by each of its neighbors, as dictionary entries in the corresponding cache. Because each lookup goes directly to the entry keyed by the current vertex, partitioning efficiency is improved.
Description
Technical Field
The present invention relates to the field of computing, and in particular to an efficient graph partitioning method and system for large-graph computing.
Background Art
Today, graphs are huge and keep growing. A graph of the neural connections of a brain can reach hundreds of terabytes; the most typical example is the World Wide Web, from which search engines can crawl a link graph of roughly one trillion edges, a figure estimated to exceed ten trillion in the future. Facebook, the world's largest social network, currently has about one billion users and, correspondingly, tens of billions of relationship links. Because of memory limits, an ordinary computer cannot process such graphs (large graphs), which poses a severe challenge to common graph computations such as finding connected components, counting triangles, and computing PageRank. A standard solution is to divide the graph data into multiple subgraphs and load them onto different processing units for distributed computation. To this end, distributed frameworks such as Spark, Pregel, Giraph, and Trinity have been developed. These systems mainly distribute tasks to the processing units with a pseudo-random hash function over node IDs. Although this achieves load balancing, the heavy communication between partitions makes computation run slower than under an algorithm that produces high-quality partitions. Fortunately, these systems support user-defined partitioning, so a user can replace the built-in hash with a more sophisticated partitioning scheme.
Graph partitioning is a prerequisite for distributed computation and directly determines the communication volume between subregions and the running time of the graph computation. Offline partitioning requires repeated iterative computation with frequent access to vertices and edges, which leads to high time complexity and large memory requirements, making it hard to apply to large-scale graphs. Streaming partitioning, thanks to its efficient one-pass management, has developed steadily in recent years: the latest Fennel algorithm approaches or even surpasses METIS in partition quality, and streaming algorithms can also handle dynamic large graphs effectively.
In existing streaming methods, when a new vertex is loaded into memory, the partition information of its neighbors is already held in memory, and each step first looks up the neighbors and then looks up the partition of each neighbor. The lookup cost is therefore directly proportional to the number of neighbors, and lookup efficiency is low. In practice, many large graphs exceed the memory capacity of an ordinary computer, some even reaching terabyte scale, so keeping all vertices and their partition information in the cache is infeasible; and storing part of the intermediate data on disk inevitably degrades performance.
The emergence of non-volatile memory (NVM) helps alleviate these problems of the traditional storage architecture. NVM is a new class of storage media that is byte-addressable and non-volatile, with low static power consumption and high density; NVM is the general term for non-volatile storage media with these characteristics, and it can serve as main memory or as external storage. The latency of processor accesses to NVM is close to that of DRAM and better than that of flash and disk. Because NVM is byte-addressable, it can be attached directly to the memory bus, and the processor can access its data with load/store instructions instead of time-consuming I/O operations. NVM is close to DRAM in access latency and read/write performance, and its static power consumption is lower than DRAM's. However, graph partitioning is a read-write-intensive computation, and NVM has poor write performance, high write power consumption, and a limited number of write cycles, so using NVM directly as main memory would shorten its lifetime because of the high frequency of writes.
Summary of the Invention
To overcome the above defects of the prior art, the object of the present invention is to provide an efficient graph partitioning method and system for large-graph computing.
To achieve this object, the present invention provides an efficient graph partitioning method for large-graph computing, comprising the following steps:
S1: divide the graph data into vertices, and place the vertices in random order into a queue;
S2: assign the first vertex in the queue to a partition, i.e., to a processing unit; after assignment, store the vertex's partition information as the value, keyed by each of its neighbors, in the form of dictionary entries in DRAM, or in NVM if the data volume in DRAM exceeds a set threshold;
S3: for each subsequent vertex in the queue, first check whether DRAM or NVM holds an entry keyed by that vertex. If so, the vertex's assignment follows the corresponding entry in DRAM or NVM directly: according to the values in the entry, the vertex is assigned to the corresponding processing unit. After each vertex is assigned, its partition information is stored as the value, keyed by each of its neighbors, as dictionary entries in the corresponding cache;
if neither DRAM nor NVM holds an entry keyed by that vertex, assign the vertex to the processing unit with the smallest load, then store the assigned vertex's partition information as the value, keyed by each of its neighbors, as dictionary entries in DRAM, or in NVM if the data volume in DRAM exceeds the set threshold, until every vertex in the queue has been assigned. Here, the processing unit with the smallest load is the partition holding the fewest vertices.
Because each step looks up the entry keyed directly by the current vertex, the method achieves high cache efficiency.
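As an illustration, steps S1-S3 can be sketched in Python as follows. This is a minimal single-tier sketch: the DRAM/NVM split, thresholds, and timestamps are omitted, and all names (`partition_stream`, `cache`, etc.) are illustrative rather than taken from the patent.

```python
import random
from collections import defaultdict

def partition_stream(adj, k, seed=0):
    """Greedily partition a graph {vertex: [neighbors]} into k parts.

    `cache` mirrors the dictionary entries of steps S2/S3: the key is a
    still-unassigned vertex, the value is the list of partition ids that
    already hold one of that vertex's neighbors.
    """
    rng = random.Random(seed)
    queue = list(adj)
    rng.shuffle(queue)                 # S1: vertices in random order
    load = [0] * k                     # vertices currently per partition
    cache = defaultdict(list)
    assignment = {}
    for v in queue:
        if v in cache:                 # S3: an entry keyed by v exists
            hints = cache.pop(v)       # entry is redundant after use
            # partition already holding the most neighbors of v
            part = max(range(k), key=lambda i: (hints.count(i), -load[i]))
        else:                          # no entry: least-loaded partition
            part = min(range(k), key=lambda i: load[i])
        assignment[v] = part
        load[part] += 1
        for u in adj[v]:               # store v's partition under each
            if u not in assignment:    # still-unassigned neighbor's key
                cache[u].append(part)
    return assignment

# Example: a 6-vertex ring partitioned into 3 parts
ring = {1: [2, 6], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5, 1]}
parts = partition_stream(ring, 3)
```

Each lookup touches only the single entry keyed by the current vertex, rather than scanning the partition records of every neighbor, which is the direct-lookup behavior the method claims.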
Further, during assignment, the vertex to be assigned is placed in the partition that contains the largest number of its neighbors.
Further, the partition can be chosen by the following formula: ind = argmax_{1<=i<=k} f1(v, S_i^t) * f2(S_i^t), where S_ind is the partition assigned to vertex v, N(v) denotes the neighbors of v, S_i^t is the set of vertices already assigned to partition i at time t, n is the number of vertices of the graph, k is the number of partitions, f1(v, S_i^t) = |N(v) ∩ S_i^t| is the number of neighbors of v already in partition i at time t, f2 is a load-balancing penalty function, and n/k is the average load.
Further, during partitioning of the graph data, whenever an entry is generated or updated, its timestamp is updated at the same time;
when the data volume in DRAM exceeds the set threshold, the entries with the earliest timestamps in the cache are migrated to NVM;
when the cached data in DRAM falls back to the set threshold and NVM is not empty, the entries with the latest timestamps in NVM are migrated back to DRAM.
This scheme reduces the number of writes to NVM and extends its lifetime.
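A minimal sketch of this timestamped two-tier migration follows. The class name `TieredCache`, the dict-based tiers, and the integer clock are all illustrative assumptions; real NVM management is considerably more involved.

```python
class TieredCache:
    """Toy DRAM/NVM entry store with timestamp-driven migration."""

    def __init__(self, dram_limit):
        self.dram, self.nvm = {}, {}   # both tiers modeled as dicts
        self.stamp = {}                # key -> last create/update time
        self.clock = 0
        self.dram_limit = dram_limit   # the "set threshold"

    def put(self, key, value):
        self.clock += 1
        self.stamp[key] = self.clock             # refresh on every update
        self.dram.setdefault(key, []).append(value)
        if len(self.dram) > self.dram_limit:     # DRAM over threshold:
            oldest = min(self.dram, key=self.stamp.get)
            self.nvm[oldest] = self.dram.pop(oldest)  # demote oldest entry

    def take(self, key):
        entry = self.dram.pop(key, None)
        if entry is None:
            entry = self.nvm.pop(key, None)      # fall through to NVM
        self.stamp.pop(key, None)
        # DRAM back under threshold and NVM non-empty: promote newest back
        while len(self.dram) < self.dram_limit and self.nvm:
            newest = max(self.nvm, key=self.stamp.get)
            self.dram[newest] = self.nvm.pop(newest)
        return entry

cache = TieredCache(dram_limit=2)
for key, part in [("v2", "S1"), ("v3", "S1"), ("v4", "S2")]:
    cache.put(key, part)
```

Only entries evicted from DRAM are ever written to NVM, so NVM write traffic stays proportional to DRAM overflow rather than to the total number of entry updates.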
Further, when assigning a subsequent vertex of the queue, if DRAM or NVM holds an entry keyed by that vertex, the entry is deleted from the cache after the vertex has been assigned to the corresponding processing unit according to the values in the entry; this improves the utilization of memory space.
The present invention also provides an efficient graph partitioning system for large-graph computing, comprising a CPU, DRAM, NVM, and processing units, the processing units being the computing units of a cluster. The CPU is communicatively connected with the DRAM, the NVM, and the processing units, and the DRAM is communicatively connected with the NVM; the CPU, DRAM, NVM, and processing units cache the large graph according to the efficient graph partitioning method for large-graph computing described in any one of claims 1-3.
Beneficial effects of the present invention:
1. The present application need not consider whether memory capacity is sufficient; compared with disk or solid-state storage, partitioning efficiency is improved and approaches in-memory processing speed;
2. With the present invention, the entry keyed by the current vertex can be looked up directly, noticeably improving efficiency;
3. The number of writes to NVM is markedly reduced, extending the average lifetime of the NVM.
Additional aspects and advantages of the invention will be set forth in part in the description that follows, and in part will be obvious from the description or may be learned by practice of the invention.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 shows the hybrid storage architecture of the present invention;
Fig. 2 shows an example of graph partitioning;
Fig. 3 is a schematic diagram of the vertex queue of a large graph;
Figs. 4-6 compare the memory footprint during processing between the present invention and two other caching methods;
Figs. 7-9 compare the number of writes under different replacement strategies.
Detailed Description of the Embodiments
Embodiments of the present invention are described in detail below; examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary, intended only to explain the present invention, and are not to be construed as limiting it.
In the description of the present invention, unless otherwise specified and limited, the terms "mounted", "connected", and "coupled" are to be understood broadly: a connection may, for example, be mechanical or electrical, or an internal communication between two elements, and may be direct or indirect through an intermediary. Those of ordinary skill in the art can understand the specific meanings of these terms according to the specific situation.
Fig. 1 shows a hybrid storage structure in which DRAM serves as the first-level cache and NVM as a second tier in front of the disk. Based on this structure, the present invention provides an efficient graph partitioning method for large-graph computing, comprising the following steps:
S1: the graph data is divided into vertices, and the vertices are placed in random order into a queue.
S2: the first vertex in the queue is assigned to a partition, i.e., to a processing unit, where a processing unit is a computing unit of the cluster. After assignment, the vertex's partition information is stored as the value, keyed by each of its neighbors, in the form of dictionary entries in DRAM, or in NVM if the data volume in DRAM exceeds the set threshold.
S3: for each subsequent vertex in the queue, first check whether DRAM or NVM holds an entry keyed by that vertex. If so, the vertex is assigned directly according to the corresponding entry in DRAM or NVM: based on the values in the entry, the vertex is assigned to the corresponding processing unit. After each vertex is assigned, its partition information is stored as the value, keyed by each of its neighbors, as dictionary entries in the corresponding cache.
When assigning a subsequent vertex of the queue, if DRAM or NVM holds an entry keyed by that vertex, the entry is deleted from the cache after the vertex has been assigned to the corresponding processing unit according to the values in the entry.
If no such entry exists, the vertex is assigned to the processing unit with the smallest load; the assigned vertex's partition information is then stored as the value, keyed by each of its neighbors, as dictionary entries in DRAM, or in NVM if the data volume in DRAM exceeds the set threshold, until every vertex in the queue has been assigned. The processing unit with the smallest load is the partition holding the fewest vertices.
During assignment, the vertex to be assigned is placed in the partition that contains the largest number of its neighbors.
Specifically, the following formula can be used for partitioning: ind = argmax_{1<=i<=k} f1(v, S_i^t) * f2(S_i^t), where N(v) denotes the neighbors of vertex v, S_i^t is the set of vertices already assigned to partition i at time t, n is the number of vertices of the graph, and k is the number of partitions. As the formula shows, each vertex is examined only once. The first factor, f1(v, S_i^t) = |N(v) ∩ S_i^t|, is the number of neighbors of v already in partition i at time t; to balance the partition loads, it is multiplied by a penalty function f2, e.g. f2(S_i^t) = 1 - |S_i^t|/(n/k), where n/k is the average load. The subregion with the largest product is selected, and vertex v is assigned to it.
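The scoring step of this formula can be sketched as follows, assuming the linear penalty f2(S_i) = 1 - |S_i|/(n/k); the function name and arguments are illustrative, not from the patent.

```python
def choose_partition(neighbors, parts, n):
    """Return the index i maximizing f1(v, S_i) * f2(S_i).

    neighbors: set of already-placed neighbors N(v) of the vertex v
    parts:     list of sets; parts[i] is S_i, the vertices in partition i
    n:         total number of vertices of the graph
    """
    avg_load = n / len(parts)                   # n / k, the average load
    def score(i):
        f1 = len(neighbors & parts[i])          # neighbors of v in S_i
        f2 = 1.0 - len(parts[i]) / avg_load     # assumed linear penalty
        return f1 * f2
    return max(range(len(parts)), key=score)

# v has two placed neighbors in S_0, but S_0 is already over the
# average load, so the penalty steers v toward the lighter S_1
parts = [{1, 2, 3}, {4, 5}, {6}]
best = choose_partition({2, 3, 5}, parts, n=10)
```

With a large n (a nearly empty machine), the penalty barely bites and the neighbor count f1 dominates; as a partition fills past n/k, its score drops and vertices are pushed elsewhere.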
As shown in Table 1 (leftmost column: data sets), the hybrid storage structure of the present invention is compared with other storage structures in terms of partitioning time. For the non-volatile storage device, the disk, and the solid-state drive, the DRAM cache capacity is set to 5%, 9%, 13%, and 17% of the corresponding graph, with the cache contents in every case organized in the storage structure introduced by the present invention. The hybrid storage structure of the present invention takes much less time than the other storage structures.
With the DRAM capacity set to be unlimited, a comparison against the original partitioning time also shows directly, from the table data, the superiority of the cache content structure introduced by the present invention.
Table 1. Comparison of partitioning time on different media
The following example illustrates the partitioning process: the graph on the left of Fig. 2 is divided into three subregions with the above algorithm. The left part of Fig. 2 is the basic representation of the graph, stored on the computer's disk as a file and loaded into memory. The computation proceeds as shown in Fig. 3 and Table 2: the large-graph data is first divided into vertices, and the vertices are placed in random order into a queue, as in Fig. 3. Table 2 shows the contents of the dynamically cached data (the adjacency structure) while graph G is partitioned from time T to time T+6. At time T, the vertex with ID 1 (v1) is processed; after its assignment, its partition information is stored in the cache as the value under each neighbor's key (v2 → S1, v3 → S1). At time T+1, vertex v3 is computed; since the neighbor-partition information of v3 is already in the cache (v3 → S1), the partition of this vertex is computed directly from that entry. Once v3 has been assigned, its entry (v3 → S1) has become redundant and is deleted. The algorithm ends when all vertices have been assigned.
A vertex can be assigned to only one partition, while a key in the cache can map to multiple values; the algorithm computes with the counts of already-assigned neighbors. For example, (v2 → S1) does not mean that v2 is assigned to S1; it only records that one neighbor of v2 lies in S1. Likewise, (v4: S2, S3) indicates that v4 has one neighbor in subregion S2 and one in S3; this is the information substituted into the above formula. In computing terms, the value is the data to be stored and the key is the identifier under which it is stored.
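The entry semantics of this example can be illustrated with an ordinary dictionary (a toy illustration, not the patent's storage layout; the vertex and partition names follow the example above):

```python
from collections import Counter

# key = still-unassigned vertex; value = partitions that each already
# contain one of its neighbors (one key may map to several values)
cache = {"v4": ["S2", "S3"], "v5": ["S1", "S1"]}

# per-partition neighbor counts: exactly the f1 values of the formula
f1_v4 = Counter(cache["v4"])   # one neighbor in S2, one in S3
f1_v5 = Counter(cache["v5"])   # two neighbors in S1
```

Counting repeated values under one key yields the per-partition neighbor counts f1 without ever enumerating the vertex's neighbors individually.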
Table 2. Contents of the dynamically cached data while graph G is partitioned from time T to time T+6
Figs. 4-6 show, for three cache content storage structures, the cached data volume as a percentage of the corresponding graph size as partitioning proceeds. OCS denotes the cache content storage structure of the present invention; OCSnodel denotes the same structure but without the deletion operation; NPS denotes the storage structure of the original algorithm, whose basic entry is a vertex together with its partition information. As the figures show, the footprint of the NPS and OCSnodel structures keeps growing as vertices are processed, with OCSnodel finally reaching 100% of the original graph size. With the OCS structure of the present invention, the capacity keeps decreasing after reaching its peak. The maximum capacity required by NPS is smaller than that of OCS and OCSnodel, but it still amounts to nearly 30%; for terabyte-scale graphs, an ordinary single machine clearly cannot cope.
To reduce the number of writes to NVM, a cyclic least-recently-used data migration strategy is adopted: during partitioning of the graph data, whenever an entry is generated or updated, its timestamp is updated at the same time;
when the data volume in DRAM exceeds the set threshold, the entries with the earliest timestamps in the cache are migrated to NVM;
when the cached data in DRAM falls back to the set threshold and NVM is not empty, the entries with the latest timestamps in NVM are migrated back to DRAM.
As shown in Figs. 7-9, the DRAM cache capacity is set to a percentage of the corresponding graph (5%, 10%, 15%). When the cached data in DRAM exceeds the specified size, data is written to NVM. To verify the superiority of this strategy, experiments were also run with other strategies and the results compared.
Random is a random strategy: when the in-memory cache is full, entries are chosen at random and written to NVM. LFU is the least-frequently-used strategy: a counter is added to each entry, entry updates are the basic unit, and the entries updated least often are migrated to NVM. LRU is the least-recently-used strategy: entries carry timestamps, and the oldest entries are migrated to NVM. LRUloop is the cyclic least-recently-used strategy; its DRAM-to-NVM migration is the same as LRU, but, as Fig. 3 shows, once processing reaches roughly 0.05%-0.2% of the vertex queue, the required cache capacity keeps shrinking and spare space appears in DRAM. The present invention exploits this space with a write-back policy: when the spare space grows to a certain size, the entries in NVM with the most recent timestamps are migrated back to DRAM, further reducing the number of writes. The results in Figs. 7-9 show the superiority of this strategy: its write count is clearly lower than that of the other strategies.
The present invention also provides an efficient graph partitioning system for large-graph computing, comprising a CPU, DRAM, NVM, and processing units, the processing units being the computing units of a cluster. The CPU is communicatively connected with the DRAM, the NVM, and the processing units, the DRAM is communicatively connected with the NVM, and the CPU, DRAM, NVM, and processing units cache the large graph according to the efficient graph partitioning method for large-graph computing described above.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic use of these terms does not necessarily refer to the same embodiment or example, and the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principle and spirit of the present invention; the scope of the invention is defined by the claims and their equivalents.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711375929.2A CN107957962A (en) | 2017-12-19 | 2017-12-19 | It is a kind of to calculate efficient figure division methods and system towards big figure |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107957962A true CN107957962A (en) | 2018-04-24 |
Family
ID=61959253
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711375929.2A Pending CN107957962A (en) | 2017-12-19 | 2017-12-19 | It is a kind of to calculate efficient figure division methods and system towards big figure |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107957962A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140379985A1 (en) * | 2013-06-25 | 2014-12-25 | International Business Machines Corporation | Multi-level aggregation techniques for memory hierarchies |
CN104778126A (en) * | 2015-04-20 | 2015-07-15 | 清华大学 | Method and system for optimizing transaction data storage in non-volatile memory |
Non-Patent Citations (1)
Title |
---|
QI LI et al.: "Streaming Graph Partitioning for Large Graphs with Limited Memory", IEEE * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112912865A (en) * | 2018-07-27 | 2021-06-04 | 浙江天猫技术有限公司 | Graph data storage method, system and electronic device |
CN112912865B (en) * | 2018-07-27 | 2024-06-07 | 浙江天猫技术有限公司 | Graph data storage method and system and electronic equipment |
US12182201B2 (en) | 2018-07-27 | 2024-12-31 | Zhejiang Tmall Technology Co., Ltd. | Graph data storage method, system and electronic device |
CN109522428A (en) * | 2018-09-17 | 2019-03-26 | 华中科技大学 | A kind of external memory access method of the figure computing system based on index positioning |
CN109522428B (en) * | 2018-09-17 | 2020-11-24 | 华中科技大学 | An external memory access method for graph computing system based on index positioning |
CN111292225A (en) * | 2018-12-07 | 2020-06-16 | 三星电子株式会社 | Partitioning graphics data for large-scale graphics processing |
CN111292225B (en) * | 2018-12-07 | 2022-05-31 | 三星电子株式会社 | Partitioning graphics data for large-scale graphics processing |
CN111209106A (en) * | 2019-12-25 | 2020-05-29 | 北京航空航天大学杭州创新研究院 | A streaming graph partitioning method and system based on caching mechanism |
CN111209106B (en) * | 2019-12-25 | 2023-10-27 | 北京航空航天大学杭州创新研究院 | Flow chart dividing method and system based on caching mechanism |
CN113377523A (en) * | 2021-01-13 | 2021-09-10 | 绍兴文理学院 | Heterogeneous sensing stream graph partitioning method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20230066084A1 (en) | Distributed storage system | |
US11163699B2 (en) | Managing least recently used cache using reduced memory footprint sequence container | |
US10055161B1 (en) | Data reduction techniques in a flash-based key/value cluster storage | |
US9934231B2 (en) | System and methods for prioritizing data in a cache | |
WO2021218038A1 (en) | Storage system, memory management method, and management node | |
US9805048B2 (en) | System and method for managing a deduplication table | |
CN107957962A (en) | It is a kind of to calculate efficient figure division methods and system towards big figure | |
US9665305B1 (en) | Tiering data between two deduplication devices | |
CN110058822B (en) | Transverse expansion method for disk array | |
US9582421B1 (en) | Distributed multi-level caching for storage appliances | |
CN103885728B (en) | A kind of disk buffering system based on solid-state disk | |
TWI627536B (en) | System and method for a shared cache with adaptive partitioning | |
US10235044B2 (en) | System and methods for storage data deduplication | |
US20170075741A1 (en) | Prioritizing Data Reconstruction in Distributed Storage Systems | |
US11169927B2 (en) | Efficient cache management | |
CN104317742A (en) | Thin provisioning method for optimizing space management | |
CN108932150B (en) | Caching method, device and medium based on SSD and disk hybrid storage | |
CN104111804A (en) | Distributed file system | |
CN103279429A (en) | Application-aware distributed global shared cache partition method | |
US10101934B1 (en) | Memory allocation balancing for storage systems | |
US10705977B2 (en) | Method of dirty cache line eviction | |
US10176103B1 (en) | Systems, devices and methods using a solid state device as a caching medium with a cache replacement algorithm | |
WO2020024591A1 (en) | Cost-effective deployments of a pmem-based dmo system | |
CN103020077A (en) | Method for managing memory of real-time database of power system | |
US10802748B2 (en) | Cost-effective deployments of a PMEM-based DMO system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20180424 |