
CN106294772B - The buffer memory management method of distributed memory columnar database - Google Patents


Info

Publication number
CN106294772B
Authority
CN
China
Prior art keywords
cache
caching
node
weight
queue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610659223.8A
Other languages
Chinese (zh)
Other versions
CN106294772A (en)
Inventor
段翰聪
闵革勇
张建
郑松
詹文翰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201610659223.8A priority Critical patent/CN106294772B/en
Publication of CN106294772A publication Critical patent/CN106294772A/en
Application granted granted Critical
Publication of CN106294772B publication Critical patent/CN106294772B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2455: Query execution
    • G06F16/24552: Database cache management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a cache management method for a distributed in-memory columnar database, comprising: establishing a buffer queue in the cache master node; cutting, with each physical task as root node, the physical execution plan in which it resides, to obtain the cache computation trajectory corresponding to each physical task; building a cache feature tree in the cache master node according to the cache computation trajectory of each physical task; when a query request arrives, parsing the SQL statement into a physical execution plan by the query execution engine; traversing each node of the physical execution plan level by level starting from its root node, and judging whether the cache computation trajectory corresponding to each physical task matches the cache feature tree; if it matches, reading the cached actual data of the physical task directly from a cache slave node, and otherwise computing the physical task. The cache management method for a distributed in-memory columnar database provided by the invention rapidly detects whether the cache is hit through an efficient cache matching algorithm, improving query efficiency.

Description

The buffer memory management method of distributed memory columnar database
Technical field
The present invention relates to the field of computer software, and in particular to a cache management method for a distributed in-memory columnar database.
Background technique
With the development of the information age, data volumes are growing explosively, and extracting valuable information from this mass of data is a major challenge facing society today. On-line analytical processing (OLAP, On-Line Analytical Processing) systems have demonstrated powerful data analysis capabilities and are widely used in commercial fields such as banking, telecommunications, and stock exchanges.
A distributed in-memory columnar database supporting an OLAP system allows users to extract and analyze valuable information from mass data along multiple dimensions; this information may be a simple report, or it may be a complex analysis result. As query statements grow more complex, queries take longer to execute, and highly complex query statements appear very frequently in distributed in-memory columnar databases that support OLAP systems.
Query requests in a database are strongly correlated semantically, so partial query results are likely to have appeared in historical queries. A distributed in-memory columnar database therefore introduces a cache management system to save certain historical query information, reducing the number of times repeated queries are executed. Fig. 1 is a structural diagram of the cache management system of a distributed in-memory columnar database. The cache management system includes a query execution engine (Query Engine) 11, a cache master node (Cache Master) 12, a standby node (Standby Cache Master) 13, and at least one cache slave node (Cache Slave) 14. The query execution engine 11 is responsible for parsing and executing user SQL requests and returning query results. The cache master node 12 manages all cache slave nodes 14, maintains cache metadata, evicts caches according to the relevant algorithms, and maintains cache consistency. The standby node 13 periodically synchronizes all cache metadata from the cache master node 12; when the cache master node 12 fails, the standby node immediately takes its place and continues to provide the cache service. The cache slave nodes 14 store the actual cached data and respond to the query execution engine 11's requests to read caches.
How to cache the most frequently accessed data as far as possible in a distributed setting, and thereby accelerate database queries, is exactly the problem a distributed cache management system must solve.
Summary of the invention
The problem to be solved by the invention is how to cache the most frequently accessed data as far as possible in a distributed setting and thereby accelerate database queries.
The present invention is achieved through the following technical solutions:
A cache management method for a distributed in-memory columnar database, where the cache management system of the distributed in-memory columnar database includes a query execution engine, a cache master node, and at least one cache slave node. The cache management method includes: establishing a buffer queue in the cache master node, each element of which corresponds to the cache metadata of one physical task; cutting, with each physical task as root node, the physical execution plan in which it resides, to obtain the cache computation trajectory corresponding to each physical task; building a cache feature tree in the cache master node according to the cache computation trajectories of the physical tasks; when a query request arrives, parsing the SQL statement into a physical execution plan by the query execution engine; traversing each node of the physical execution plan level by level starting from its root node, and judging whether the cache computation trajectory corresponding to each physical task matches the cache feature tree; if it matches, reading the cached actual data of the physical task directly from a cache slave node, and otherwise computing the physical task.
The invention uses the cache computation trajectory to uniquely identify a cache. When a query request arrives, it is only necessary to match the cache computation trajectory of each physical task against the cache feature tree to rapidly detect whether the cache is hit, which to a certain extent reduces the computation of repeated tasks in the distributed database, saves query time, and improves query efficiency.
Further, the elements of the buffer queue are arranged in descending order of weight.
Further, the weight of each element in the buffer queue is obtained as W_i = q_i × (a × S_i + b × P_i), where W_i is the weight of the i-th element; q_i is the weight factor of the physical task corresponding to the i-th element; S_i is the space-time ratio of the i-th element, computed from t_i, the time of the longest path to the root node in the cache computation trajectory of the i-th element, k_i, the storage-strategy constant of the i-th element, and m_i, the system storage space actually occupied by the i-th element; P_i is the hit frequency of the i-th element, computed from n_i, the historical hit count of the i-th element, d_i, the interval since its last hit, and v_i, its average hit interval; a is the impact factor of the space-time ratio on the weight; b is the impact factor of the hit frequency on the weight; and i is a positive integer.
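The text gives W_i = q_i × (a × S_i + b × P_i) explicitly, but the exact formulas for S_i and P_i appear only as images in the original and are not reproduced here. The sketch below therefore assumes plausible forms (recompute cost per byte for S_i, a recency-scaled hit rate for P_i); only the outer weight formula and the rule P_i = 1 for a freshly inserted element come from the text.

```python
def element_weight(t_i, k_i, m_i, n_i, d_i, v_i,
                   q_i=1.0, a=1.0, b=1.0, first_insert=False):
    """Weight W_i = q_i * (a*S_i + b*P_i) for one buffer-queue element.

    t_i: time (ms) of the longest path to the root in the element's
         cache computation trajectory; k_i: storage-strategy constant;
    m_i: bytes actually occupied; n_i/d_i/v_i: historical hit count,
         interval since last hit, average hit interval.
    The forms of S_i and P_i below are assumptions for illustration.
    """
    s_i = (t_i * k_i) / m_i        # assumed space-time ratio: recompute cost per byte
    if first_insert:
        p_i = 1.0                  # P_i = 1 when the element first enters the queue
    else:
        p_i = n_i * v_i / d_i      # assumed: more hits, more recent -> higher frequency
    return q_i * (a * s_i + b * p_i)
```

Tuning a up strengthens the space-time term, as the embodiment suggests for memory-constrained systems.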
Further, the cache management method further includes performing the following steps when the cache management system receives a storage request for a new cache: step S1, updating the weight of each element in the buffer queue; step S2, judging whether the current remaining space of the cache management system is sufficient to store the new cache; if it is sufficient, executing step S3, in which the cache master node notifies a cache slave node to store the new cache, puts the new cache metadata into the buffer queue, and records the cache computation trajectory of the new cache in the cache feature tree; if it is not sufficient, executing step S4, judging whether the weight of the new cache is greater than the weight of the last element in the buffer queue; if it is not greater, executing step S5, in which the cache master node notifies the cache slave node to refuse to store the new cache; if it is greater, executing step S6, judging whether any query operation is still using the last element in the buffer queue; if some query operation is still using it, executing step S7, marking the last element in the buffer queue as to-be-deleted; if no query operation is using it, executing step S8, in which the cache master node deletes the last element in the buffer queue, reclaims the storage space occupied by the evicted cache in the cache management system, deletes the cache computation trajectory corresponding to that element from the cache feature tree, and returns to step S2.
Further, the cache management method further includes: setting a corresponding reference count for each element in the buffer queue; when an element is hit for the first time, setting its reference count to 1 and starting a timer; if, before the timer expires, the cache master node receives feedback that an SQL statement query using the element has finished, decrementing the element's reference count by 1 and resetting the timer; and when the element's reference count equals 0, closing the timer.
Further, judging whether any query operation is still using the last element in the buffer queue is judging whether the reference count corresponding to the last element in the buffer queue is 0.
Further, the cache computation trajectory of each physical task includes the result computation time of each node, the result storage size, and the data transmission time represented by each edge.
Further, all tables involved in the nodes of the cache feature tree have version numbers.
Compared with the prior art, the present invention has the following advantages and benefits:
The cache management method for a distributed in-memory columnar database provided by the invention rapidly detects whether the cache is hit through an efficient cache matching algorithm, and guarantees system availability and stability through a reasonable cache life-cycle algorithm, which to a certain extent reduces the computation of repeated tasks in the distributed database, saves query time and storage space, and improves query efficiency.
Detailed description of the invention
The drawings described here provide a further understanding of the embodiments of the invention and constitute part of the application; they do not limit the embodiments of the invention. In the drawings:
Fig. 1 is a structural diagram of the cache management system of a distributed in-memory columnar database;
Fig. 2 is a structural diagram of the physical execution plan of the embodiment of the invention;
Fig. 3 is a structural diagram of a cache computation trajectory of the embodiment of the invention;
Fig. 4 is a structural diagram of the cache feature tree of the embodiment of the invention;
Fig. 5 is a structural diagram of the computation trajectory of T2-Join of the embodiment of the invention;
Fig. 6 is a structural diagram of the computation trajectory of T3-Join of the embodiment of the invention;
Fig. 7 is a structural diagram of the cache matching result of the embodiment of the invention;
Fig. 8 is a flow diagram of the cache eviction method of the embodiment of the invention.
Specific embodiment
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the embodiments and drawings. The exemplary embodiments of the invention and their description are used only to explain the invention and are not a limitation of the invention.
Embodiment
This embodiment provides a cache management method for a distributed in-memory columnar database. The structure of the cache management system of the distributed in-memory columnar database is shown in Fig. 1 and includes a query execution engine, a cache master node, a standby node, and at least one cache slave node.
When a query request arrives, the query execution engine parses the SQL statement into a physical execution plan represented as a DAG. Each node in the physical execution plan represents one physical task; physical tasks are divided into GetColumn, Join, Filter, Group, BuildRow, and so on, and each edge represents the transfer of a computed result between two physical tasks. The physical execution plan of a typical query statement (SELECT A.id FROM A, B WHERE A.id = B.id AND A.id ≤ 100 AND B.id ≤ 80) is shown in Fig. 2. In the cache management system, the granularity of cached data is the computed result of a single physical task. When the cached result is that of BuildRow, the final query result of the whole SQL statement has been cached. This embodiment uses the cache computation trajectory to uniquely identify a cache.
First, a buffer queue is established in the cache master node, each element of which corresponds to the cache metadata of one physical task. As mentioned above, the granularity of cached data is the computed result of a single physical task. When the buffer queue is first established, the cache management system has enough storage space; therefore, when a cache request is received, the computed result of the physical task is stored directly in a cache slave node, and the cache metadata of the physical task becomes an element of the buffer queue.
Next, with each physical task as root node, the physical execution plan in which it resides is cut to obtain the cache computation trajectory corresponding to each physical task. Specifically, each physical task whose result is to be cached corresponds to a physical execution plan. Within that plan, the plan is cut with the physical task requesting caching as root node to obtain a subgraph, and this subgraph is the cache computation trajectory corresponding to that root node. The cache computation trajectory records all information from the raw data to the generation of the cache, including the result computation time of each node, the result storage size, and the data transmission time represented by each edge. Each node of the cache computation trajectory is a feature point, and all feature points and their relations constitute the feature of the cache. Taking the physical execution plan shown in Fig. 2 as an example, if the user wants to cache the computed result of the T3-Join physical task, the plan is cut with that task node as root node, and the resulting cache computation trajectory is shown in Fig. 3.
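The cutting step above amounts to collecting the sub-DAG reachable below the chosen root task, together with the per-node and per-edge timing data the trajectory records. A minimal sketch, with illustrative node names and fields (the text does not specify a data layout):

```python
from dataclasses import dataclass, field

@dataclass
class PlanNode:
    """One physical task in the execution-plan DAG (fields illustrative)."""
    name: str                    # e.g. "T3-Join", "T5-GetColumn"
    compute_ms: float            # result computation time of this node
    result_bytes: int            # result storage size
    children: list = field(default_factory=list)  # (child, transfer_ms) pairs

def cut_trajectory(root):
    """Cut the plan at `root`: collect every node and edge reachable
    below it. This subgraph is the cache computation trajectory that
    uniquely identifies the cached result of `root`."""
    nodes, edges = [], []
    stack = [root]
    while stack:
        node = stack.pop()
        nodes.append(node.name)
        for child, transfer_ms in node.children:
            edges.append((child.name, node.name, transfer_ms))
            stack.append(child)
    return nodes, edges
```

For the Fig. 2 example, cutting at T3-Join would yield the T3/T5/T6 subgraph of Fig. 3.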
A cache feature tree is then built in the cache master node according to the cache computation trajectories of the physical tasks. The cache feature tree records the feature points of each cache and their relations; it is organized hierarchically by table relations, where the layer number represents the number of tables involved at that layer and the bottom layer involves only one table. Each node of the cache feature tree represents one cache feature point, lines between feature points indicate their mutual dependencies, and the feature of a cache is constituted by a series of feature points and their relations.
To prevent cached data from becoming inconsistent with the raw data in the database, all tables involved in the nodes of the cache feature tree have version numbers, and matching can proceed only when the table version numbers agree. When a user attempts to read or write a cache, the input contains the latest version numbers of the tables involved in the current distributed database; when the cache master node finds that the version of some table has expired, it deletes all caches related to that table. A typical cache feature tree is shown in Fig. 4, where a solid box indicates that the cache described by the feature point exists and a dashed box indicates that it does not; the latter exists only to guarantee the feature integrity of the cache it belongs to.
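The version-based invalidation just described can be sketched as follows; the feature-point identifiers and the dictionary layout are illustrative assumptions, not the patent's data structures:

```python
class CacheFeatureTree:
    """Minimal sketch of version tracking in the feature tree: every
    feature point records the tables it involves and their version
    numbers; a stale table invalidates every related cache."""

    def __init__(self):
        self.points = {}          # feature-point id -> {"tables": {name: version}}

    def add_point(self, point_id, tables):
        self.points[point_id] = {"tables": dict(tables)}

    def invalidate_stale(self, current_versions):
        """Delete every feature point involving a table whose recorded
        version differs from the current one; return the deleted ids."""
        stale = [pid for pid, p in self.points.items()
                 if any(current_versions.get(t, v) != v
                        for t, v in p["tables"].items())]
        for pid in stale:
            del self.points[pid]
        return stale
```

For example, bumping table A's version removes every feature point that involves A while leaving B-only points intact.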
When a query request arrives, the query execution engine parses the SQL statement into a physical execution plan, represented as a DAG of physical tasks. How a query execution engine parses an SQL statement into a physical execution plan is known to those skilled in the art and is not described here.
Each node of the physical execution plan is traversed level by level starting from its root node, and the cache computation trajectory of each physical task is checked for a match against the cache feature tree. Cache matching includes two cases: exact matching and partial matching. Exact matching means that all feature points of the requested cache computation trajectory and their relations can be found in the cache feature tree; partial matching is essentially the same, the only difference being that the root node of the cache computation trajectory is a subset of some feature point in the cache feature tree. If there is a match, the cached actual data of the physical task is read directly from a cache slave node; otherwise the physical task is computed.
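The hit condition above, that every feature point and every relation of the trajectory must be found in the feature tree, can be sketched as below. The per-node comparison (including the subset test used for partial matches) is assumed to be done separately and supplied as a mapping; that interface is an illustrative assumption.

```python
def match(traj_nodes, traj_edges, feature_of, feature_relations):
    """Check one cache computation trajectory against the feature tree.

    `feature_of` maps a trajectory node to its matching feature point
    (or None on a miss); `feature_relations` is the set of
    (child feature, parent feature) lines in the tree. Returns True
    only if every node and every edge of the trajectory is found."""
    for node in traj_nodes:
        if feature_of.get(node) is None:
            return False                      # one missed feature point -> miss
    for child, parent in traj_edges:
        if (feature_of[child], feature_of[parent]) not in feature_relations:
            return False                      # relation absent -> miss
    return True
```

This reproduces the embodiment's examples: T3-Join hits because relations (1_1, 2_1) and (1_2, 2_1) exist, while T2-Join misses because (1_5, 2_1) does not.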
On the basis of the cache feature tree described in Fig. 4, taking the SQL statement described by Fig. 2 (SELECT A.id FROM A, B WHERE A.id = B.id AND A.id ≤ 100 AND B.id ≤ 80) as an example, each node is traversed level by level starting from the root node of the physical execution plan, and the corresponding cache matching steps are as follows:
Detect whether T1-BuildRow hits the cache. The cache computation trajectory of T1-BuildRow is the entire DAG; starting from the root node, each feature point is scanned level by level, and as soon as one feature point misses, the task misses the cache. Since node T1 involves the two tables A and B, level 2 of the feature tree is searched for this feature point; the actual result is that there is none, so T1-BuildRow misses the cache.
Detect whether T2-Join hits the cache. The cache computation trajectory of T2-Join is shown in Fig. 5. Node T2 matches feature point 2_1 in the cache feature tree, node T5 matches feature point 1_1, and node T4 is a subset of feature point 1_5, but the relation (1_5, 2_1) does not exist, so T2-Join misses the cache.
Detect whether T3-Join hits the cache. The cache computation trajectory of T3-Join is shown in Fig. 6. Node T3 matches feature point 2_1 in the cache feature tree, node T6 matches feature point 1_2, and node T5, as established above, matches feature point 1_1; the relation (T5, T3) matches the relation (1_1, 2_1) and the relation (T6, T3) matches the relation (1_2, 2_1), so T3-Join matches exactly.
Node T4 is a subset of feature point 1_5, i.e. T4-GetColumn matches partially; node T5 matches feature point 1_1, so T5-GetColumn matches exactly; node T6 matches feature point 1_2, so T6-GetColumn matches exactly.
After the above steps, the cache matching result of the SQL statement is obtained as shown in Fig. 7. As can be seen from Fig. 7, the query execution engine only needs to compute the T1 and T2 physical tasks to complete the query of the whole SQL statement, greatly improving query efficiency.
Since the storage space of the cache management system is limited, after the storage space is fully occupied, existing caches must be evicted when a storage request for a new cache is received. Cache eviction is one of the cores of a cache management system; it decides the swapping in and out of caches and thereby affects the cache hit rate and the stability of the cache service. In this embodiment, the elements of the buffer queue are arranged in descending order of weight, with the smallest weight at the tail of the queue. Whenever a cache is evicted, the element is popped from the tail.
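The descending-weight queue with tail eviction can be sketched as follows; the storage layout is an illustrative assumption, and a real implementation would also re-sort after the weight refresh of step S1, which this sketch omits by sorting on insert only:

```python
import bisect

class BufferQueue:
    """Buffer queue kept in descending weight order; eviction pops the
    tail, i.e. the element with the smallest weight."""

    def __init__(self):
        self._items = []          # (weight, metadata), descending by weight

    def insert(self, weight, metadata):
        # keep descending order: bisect over negated weights
        keys = [-w for w, _ in self._items]
        self._items.insert(bisect.bisect_right(keys, -weight), (weight, metadata))

    def tail(self):
        return self._items[-1] if self._items else None

    def evict_tail(self):
        return self._items.pop()
```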
Suppose t_i is the time (unit: ms) of the longest path to the root node in the cache computation trajectory of the i-th element, m_i is the system storage space (unit: bytes) actually occupied by the i-th element, and k_i is the storage-strategy constant of the i-th element; different storage strategies differ in how cached data is read and how long that takes. The space-time ratio S_i of the i-th element is computed from these quantities.
Suppose d_i is the interval (unit: ms) since the last hit of the i-th element, n_i is the historical hit count of the i-th element, and v_i is the average hit interval (unit: ms) of the i-th element. The hit frequency P_i of the i-th element is computed from these quantities; if the i-th element has just been added to the queue for the first time, P_i = 1.
In conclusion in the buffer queue i-th element weight calculation formula are as follows: Wi=qi×(a×Si+b× Pi).Wherein, WiFor the weight of i-th element, qiFor the weight factor of the corresponding physical tasks of i-th element.I-th element pair The physical tasks answered are more complicated, the weight factor q of the corresponding physical tasks of i-th elementiValue it is bigger.Such as Join task pair The value of the corresponding weight factor of value ratio GetColumn task for the weight factor answered is big.Under default situations, i-th element pair The weight factor q for the physical tasks answerediEqual to 1.0.A and b is constant, respectively represent space-time than with hit frequency to weight Impact factor.If system memory space is smaller, the value of a can be tuned up, increase the influence that space-time compares weight, it is on the contrary then Turn a down.
In this embodiment, a corresponding reference count is set for each element in the buffer queue, representing the number of query operations using that cache. When an element is hit for the first time, its reference count is set to 1 and a timer is started, whose default value can be configured according to the actual situation, for example 30 seconds. If, before the timer expires, the cache master node receives feedback that some SQL statement query using the element has finished, the element's reference count is decremented by 1 and the timer is reset. When the element's reference count equals 0, the timer is closed. The purpose of the timer is to prevent a failure of the query execution engine from leaving the cache master node waiting forever for the query-finished feedback.
When a cache is to be evicted, if its reference count is not 0, some query operation is still using it; deleting it immediately could cause the query execution engine to fail to obtain the cache, incurring extra time for task failure recovery. The evicted cache is therefore only marked as to-be-deleted, and is really deleted only when its reference count reaches 0.
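The reference count plus expiry timer can be sketched as below. The text only specifies setting the count to 1 on the first hit; incrementing on later hits, the injectable clock, and the `in_use` check are illustrative assumptions.

```python
import time

class RefCount:
    """Reference count with an expiry timer for one cached element.

    The timer guards against a crashed query engine that never sends
    its query-finished feedback: once it expires, the count no longer
    blocks deletion."""

    def __init__(self, timeout_s=30.0, clock=time.monotonic):
        self.count = 0
        self.deadline = None
        self._timeout = timeout_s
        self._clock = clock

    def on_hit(self):
        self.count = 1 if self.count == 0 else self.count + 1
        self.deadline = self._clock() + self._timeout   # (re)start the timer

    def on_query_finished(self):
        self.count -= 1
        if self.count == 0:
            self.deadline = None                        # close the timer
        else:
            self.deadline = self._clock() + self._timeout  # reset it

    def in_use(self):
        """True while some query still holds the cache and the timer
        has not expired."""
        if self.count == 0:
            return False
        return self.deadline is not None and self._clock() < self.deadline
```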
Fig. 8 is a flow diagram of the cache eviction method of the embodiment of the invention. When the system receives a storage request for a new cache, the cache master node performs the following steps:
Step S1, update the weight of each element in the buffer queue. Since the interval since each element's last hit grows over time, the weights of the elements in the buffer queue must be updated first.
Step S2, judge whether the current remaining space of the cache management system is sufficient to store the new cache.
If it is sufficient, execute step S3: the cache master node notifies a cache slave node to store the new cache, puts the new cache metadata into the buffer queue, and records the cache computation trajectory of the new cache in the cache feature tree.
If it is not sufficient, execute step S4: judge whether the weight of the new cache is greater than the weight of the last element in the buffer queue.
If the weight of the new cache is not greater than the weight of the last element in the buffer queue, execute step S5: the cache master node notifies the cache slave node to refuse to store the new cache.
If the weight of the new cache is greater than the weight of the last element in the buffer queue, execute step S6: judge whether any query operation is still using the last element in the buffer queue, that is, judge whether the reference count corresponding to the last element in the buffer queue is 0.
If some query operation is still using the last element in the buffer queue, execute step S7: mark the last element in the buffer queue as to-be-deleted. When the reference count corresponding to that element reaches 0, the element is deleted from the buffer queue.
If no query operation is using the last element in the buffer queue, execute step S8: the cache master node deletes the last element in the buffer queue, reclaims the storage space occupied by the evicted cache in the cache management system, deletes the cache computation trajectory corresponding to that element from the cache feature tree, and returns to step S2.
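Steps S2 through S8 form a loop (S8 returns to S2), which can be sketched as one function; step S1's weight refresh is assumed to be done by the caller, and the handling after a to-be-deleted mark (step S7) is not detailed in the text, so the sketch simply reports the decision. The callback interface is an illustrative assumption.

```python
def handle_new_cache(queue, free_space, new_weight, new_size, in_use, evict):
    """Eviction loop of Fig. 8. `queue` is a list of (weight, size, id)
    in descending weight order; `in_use(id)` checks the reference count;
    `evict(id)` deletes the tail cache and its trajectory from the
    feature tree. Returns the decision taken for the new cache."""
    while True:
        if free_space >= new_size:             # S2 -> S3: store the new cache
            return "store"
        if not queue:                          # nothing left to evict (assumed)
            return "reject"
        tail_weight, tail_size, tail_id = queue[-1]
        if new_weight <= tail_weight:          # S4 -> S5: refuse storage
            return "reject"
        if in_use(tail_id):                    # S6 -> S7: defer deletion
            return "mark-to-delete"
        queue.pop()                            # S6 -> S8: evict tail, retry S2
        evict(tail_id)
        free_space += tail_size
```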
The specific embodiments described above further detail the objectives, technical solutions, and beneficial effects of the present invention. It should be understood that the foregoing is merely a specific embodiment of the invention and is not intended to limit the scope of protection of the invention; any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the invention shall be included within the scope of protection of the invention.

Claims (5)

1.一种分布式内存列式数据库的缓存管理方法,所述分布式内存列式数据库的缓存管理系统包括查询执行引擎、缓存主控节点以及至少一个缓存从节点,其特征在于,所述缓存管理方法包括:1. A cache management method of a distributed memory columnar database, the cache management system of the distributed memory columnar database comprises a query execution engine, a cache master node and at least one cache slave node, characterized in that the cache Management methods include: 在缓存主控节点中建立缓存队列,所述缓存队列中每项元素对应为一个物理任务的缓存元数据;A cache queue is established in the cache master control node, and each element in the cache queue corresponds to the cache metadata of a physical task; 以每个物理任务为根节点切割其所在的物理执行计划以获得每个物理任务对应的缓存计算轨迹;每个物理任务对应的缓存计算轨迹包括每个节点的结果计算时间、结果存储大小以及每条边代表的数据传输时间;Take each physical task as the root node to cut the physical execution plan where it is located to obtain the cache calculation track corresponding to each physical task; the cache calculation track corresponding to each physical task includes the result calculation time, result storage size and The data transfer time represented by the edge; 根据每个物理任务对应的缓存计算轨迹在缓存主控节点中构建缓存特征树;Build a cache feature tree in the cache master node according to the cache calculation trajectory corresponding to each physical task; 在查询请求到来时,查询执行引擎将SQL语句解析成物理执行计划;When a query request arrives, the query execution engine parses the SQL statement into a physical execution plan; 从物理执行计划的根节点开始层次遍历物理执行计划中每个节点,判断每个物理任务对应的缓存计算轨迹是否与所述缓存特征树匹配;Starting from the root node of the physical execution plan, traverse each node in the physical execution plan hierarchically, and determine whether the cache calculation track corresponding to each physical task matches the cache feature tree; 若匹配,直接从缓存从节点中读取该物理任务的缓存实际数据,否则计算该物理任务;If it matches, directly read the actual data of the physical task from the cache slave node, otherwise calculate the physical task; 
所述缓存队列中每项元素按权重从大到小的顺序排列;所述缓存队列中每项元素的权重根据Wi=qi×(a×Si+b×Pi)获得,其中,Wi为第i项元素的权重,qi为第i项元素对应的物理任务的权重因子,Si为第i项元素的时空比且ti为第i项元素对应的缓存计算轨迹中到根节点最长路径的时间,ki为第i项元素对应的存储策略常量,mi为第i项元素实际所占的系统存储空间,Pi为第i项元素的命中频率且ni为第i项元素的历史命中次数,di为第i项元素距离上一次命中的时间间隔,vi为第i项元素平均命中时间间隔,a为时空比对权重的影响因子,b为命中频率对权重的影响因子,i为正整数。Each element in the cache queue is arranged in descending order of weight; the weight of each element in the cache queue is obtained according to W i =q i ×(a×S i +b×P i ), wherein, Wi is the weight of the i -th element, qi is the weight factor of the physical task corresponding to the i -th element, S i is the space-time ratio of the i-th element and t i is the time from the longest path to the root node in the cache calculation track corresponding to the i -th element, ki is the storage policy constant corresponding to the i -th element, mi is the system storage space actually occupied by the i-th element, P i is the hit frequency of the i-th element and n i is the number of historical hits of the i-th element, d i is the time interval from the i-th element to the previous hit, vi is the average hit time interval of the i -th element, a is the influence factor of the space-time ratio on the weight, b is the impact factor of the hit frequency on the weight, i is a positive integer. 2.根据权利要求1所述的分布式内存列式数据库的缓存管理方法,其特征在于,还包括当所述缓存管理系统收到新缓存的存储请求时进行如下步骤:2. 
2. The cache management method according to claim 1, characterized by further comprising, when the cache management system receives a storage request for a new cache:

Step S1: updating the weight of every element in the cache queue;

Step S2: determining whether the current remaining space of the cache management system is sufficient to store the new cache;

if the current remaining space is sufficient, performing Step S3: the cache master node notifies a cache slave node to store the new cache, puts the metadata of the new cache into the cache queue, and records the computation trajectory of the new cache in the cache feature tree;

if the current remaining space is not sufficient, performing Step S4: determining whether the weight of the new cache is greater than the weight of the last element in the cache queue;

if the weight of the new cache is not greater than the weight of the last element, performing Step S5: the cache master node notifies the cache slave node to refuse to store the new cache;

if the weight of the new cache is greater than the weight of the last element, performing Step S6: determining whether a query operation is still using the last element in the cache queue;

if a query operation is still using the last element, performing Step S7: marking the last element in the cache queue as to-be-deleted;
if no query operation is using the last element, performing Step S8: the cache master node deletes the last element from the cache queue, reclaims the storage space occupied by the evicted cache, deletes the computation trajectory corresponding to that element from the cache feature tree, and returns to Step S2.

3. The cache management method according to claim 2, characterized by further comprising:

setting a reference count for each element in the cache queue; when an element is hit for the first time, setting its reference count to 1 and starting a timer;

if, before the timer expires, the cache master node receives feedback that an SQL query using the element has finished, decrementing the element's reference count by 1 and resetting the timer;

when the element's reference count reaches 0, stopping the timer.

4. The cache management method according to claim 3, characterized in that determining whether a query operation is still using the last element in the cache queue is determining whether the reference count of that element is 0.
5. The cache management method according to claim 1, characterized in that the tables involved in all nodes of the cache feature tree have version numbers.
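Claims 2 through 4 together describe an admission-and-eviction loop guarded by reference counts. The sketch below is one plausible reading of steps S2-S8 under assumed names; Step S1 (re-weighting every element) is left to the caller, and because the claim does not state what follows the S7 mark, the sketch simply declines the new cache at that point.

```python
def admit(cache_queue, capacity, used, new):
    """Steps S2-S8 of claim 2 as a loop. `cache_queue` is a list of
    dicts kept in descending 'weight' order; each entry also carries
    'size' and 'refcount' (claims 3-4: in use iff refcount > 0).
    Returns (stored, used)."""
    while used + new["size"] > capacity:           # S2: not enough room
        if not cache_queue:                        # nothing left to evict
            return False, used
        victim = cache_queue[-1]                   # lowest-weight element
        if new["weight"] <= victim["weight"]:      # S4 -> S5: refuse storage
            return False, used
        if victim["refcount"] > 0:                 # S6 -> S7: still in use
            victim["to_delete"] = True             # mark; reclaim later
            return False, used
        cache_queue.pop()                          # S8: evict and reclaim...
        used -= victim["size"]                     # ...then retry S2
    cache_queue.append(new)                        # S3: store the new cache
    cache_queue.sort(key=lambda e: e["weight"], reverse=True)
    return True, used + new["size"]
```

For example, with capacity 10 and 8 units used, a new cache of weight 6 and size 4 evicts an unused weight-5 victim of size 4 and is admitted, while a new cache of weight 4 is refused outright.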
CN201610659223.8A 2016-08-11 2016-08-11 The buffer memory management method of distributed memory columnar database Active CN106294772B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610659223.8A CN106294772B (en) 2016-08-11 2016-08-11 The buffer memory management method of distributed memory columnar database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610659223.8A CN106294772B (en) 2016-08-11 2016-08-11 The buffer memory management method of distributed memory columnar database

Publications (2)

Publication Number Publication Date
CN106294772A CN106294772A (en) 2017-01-04
CN106294772B true CN106294772B (en) 2019-03-19

Family

ID=57669593

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610659223.8A Active CN106294772B (en) 2016-08-11 2016-08-11 The buffer memory management method of distributed memory columnar database

Country Status (1)

Country Link
CN (1) CN106294772B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107329814B (en) * 2017-06-16 2020-05-26 电子科技大学 A Distributed Memory Database Query Engine System Based on RDMA
CN111949210B (en) * 2017-06-28 2024-11-15 华为技术有限公司 Metadata storage method, system and storage medium in distributed storage system
US12135655B2 (en) 2017-07-27 2024-11-05 International Business Machines Corporation Saving track metadata format information for tracks demoted from cache for use when the demoted track is later staged into cache
US10691566B2 (en) 2017-07-27 2020-06-23 International Business Machines Corporation Using a track format code in a cache control block for a track in a cache to process read and write requests to the track in the cache
US11036641B2 (en) 2017-08-09 2021-06-15 International Business Machines Corporation Invalidating track format information for tracks demoted from cache
US10579532B2 (en) 2017-08-09 2020-03-03 International Business Machines Corporation Invalidating track format information for tracks in cache
CN107784103A (en) * 2017-10-27 2018-03-09 北京人大金仓信息技术股份有限公司 A kind of standard interface of access HDFS distributed memory systems
CN109376014B (en) * 2018-10-19 2021-07-02 郑州云海信息技术有限公司 A distributed lock manager implementation method and system
CN109492005A (en) * 2018-11-07 2019-03-19 郑州云海信息技术有限公司 A kind of B+ tree read buffer method and relevant apparatus
CN110389965B (en) * 2018-11-30 2023-03-14 上海德拓信息技术股份有限公司 Multidimensional data query and cache optimization method
CN110119275B (en) * 2019-05-13 2021-04-02 电子科技大学 A Distributed Memory Columnar Database Compilation Executor Architecture
CN110162272B (en) * 2019-05-23 2020-06-12 北京邮电大学 Memory computing cache management method and device
CN113312385B (en) * 2020-07-07 2025-02-28 阿里巴巴集团控股有限公司 Cache operation method, device and system, storage medium and operation device
CN112528081B (en) * 2020-12-24 2021-09-14 长沙翔宇信息科技有限公司 Space situation relative motion trajectory multilevel loading display method and device
CN114817262B (en) * 2022-04-27 2023-03-28 电子科技大学 Graph traversal algorithm based on distributed graph database
CN117909258B (en) * 2024-03-18 2024-05-14 北京开源芯片研究院 Optimization method and device for processor cache, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103177057A * 2011-12-20 2013-06-26 SAP AG Many core algorithms for in-memory column store databases
CN103970870A * 2014-05-12 2014-08-06 Huawei Technologies Co., Ltd. Database query method and server


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Optimization and Implementation of the Execution Engine in DWMS Column Storage; Zhang Qi; China Master's Theses Full-text Database, Information Science and Technology; 2012-06-15; main text pp. 12-50
Query Optimization for RDF Data Based on Semantic Caching; Xu Fangfang; China Master's Theses Full-text Database, Information Science and Technology; 2015-07-15; main text pp. 3-35 and 43
Research on Execution Plan Caching in the Dameng Embedded Database; Wang Zhufeng; China Master's Theses Full-text Database, Information Science and Technology; 2012-06-15; full text

Also Published As

Publication number Publication date
CN106294772A (en) 2017-01-04

Similar Documents

Publication Publication Date Title
CN106294772B (en) The buffer memory management method of distributed memory columnar database
CN102521406B (en) Distributed query method and system for complex task of querying massive structured data
CN102521405B (en) Massive structured data storage and query methods and systems supporting high-speed loading
CN104301360B (en) A kind of method of logdata record, log server and system
US9189506B2 (en) Database index management
US8732163B2 (en) Query optimization with memory I/O awareness
EP2746970B1 (en) Timeline index for managing temporal data
CN111143389A (en) Transaction execution method and device, computer equipment and storage medium
CN111159252A (en) Transaction execution method, apparatus, computer equipment and storage medium
KR101496179B1 (en) System and method for searching information based on data absence tagging
US9576038B1 (en) Consistent query of local indexes
US10417265B2 (en) High performance parallel indexing for forensics and electronic discovery
US10754854B2 (en) Consistent query of local indexes
Kolchinsky et al. Lazy evaluation methods for detecting complex events
CN107077492A (en) The expansible transaction management based on daily record
CN106484906A (en) A kind of distributed objects storage system flash back method and device
CN109313640A (en) Method and system for data base optimization
CN105740472A (en) Distributed real-time full-text search method and system
CN110309233A (en) Method, apparatus, server and the storage medium of data storage
CN105574054B (en) A kind of distributed caching range query method, apparatus and system
CN105740445A (en) A database query method and device
CN104317957B (en) A kind of open platform of report form processing, system and report processing method
CN102955792A (en) Method for implementing transaction processing for real-time full-text search engine
KR101429046B1 (en) Method for searching, inputting, deleting and garbage collecting of data in database having key-value structure
CN104854587A (en) Maintenance of active database queries

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant