+

CN115048377B - A spatiotemporal keyword query method in a hybrid storage blockchain environment - Google Patents

A spatiotemporal keyword query method in a hybrid storage blockchain environment Download PDF

Info

Publication number
CN115048377B
CN115048377B CN202210650145.0A CN202210650145A CN115048377B CN 115048377 B CN115048377 B CN 115048377B CN 202210650145 A CN202210650145 A CN 202210650145A CN 115048377 B CN115048377 B CN 115048377B
Authority
CN
China
Prior art keywords
block
chain
tree
query
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210650145.0A
Other languages
Chinese (zh)
Other versions
CN115048377A (en
Inventor
信俊昌
姜吉宁
郝琨
姚钟铭
刘红飞
王之琼
徐石成
黄敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Neusoft Corp
Northeastern University China
Original Assignee
Neusoft Corp
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neusoft Corp, Northeastern University China filed Critical Neusoft Corp
Priority to CN202210650145.0A priority Critical patent/CN115048377B/en
Publication of CN115048377A publication Critical patent/CN115048377A/en
Application granted granted Critical
Publication of CN115048377B publication Critical patent/CN115048377B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2246Trees, e.g. B+trees
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2255Hash tables
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24553Query execution of query operations
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471Distributed queries
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Bioethics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

本发明针对链上链下混合存储区块链的时空关键字查询问题,提出了一种混合存储区块链环境下的时空关键字查询方法,涉及计算机区块链查询技术领域;首先,针对查询语义较弱,构建按属性分类且赋予语义的区块链模型CSBM,为区块内的事务划分属性类型并添加语义;其次,针对查询效率较低,构建基于B2F‑BKM结构的链上两级索引结构,该结构能够对所有区块和事务进行索引,有效提升查询效率;最后,针对通信开销较大,设计链上链下时空关键字查询方法,通过遍历B2F‑BKM结构进行链上条件查询,然后根据连接属性值进行链上链下数据连接查询,相比传统查询方法减少了不必要的通信开销;经试验证明本发明索引具有良好稳定性,查询效率大幅提升并且通信开销明显降低。

Aiming at the spatiotemporal keyword query problem of on-chain and off-chain hybrid storage blockchain, the present invention proposes a spatiotemporal keyword query method in a hybrid storage blockchain environment, which relates to the technical field of computer blockchain query. Firstly, in view of the weak query semantics, a blockchain model CSBM classified by attributes and endowed with semantics is constructed, and attribute types are divided and semantics are added for transactions in the block. Secondly, in view of the low query efficiency, an on-chain two-level index structure based on the B2F -BKM structure is constructed, and the structure can index all blocks and transactions, effectively improving the query efficiency. Finally, in view of the large communication overhead, an on-chain and off-chain spatiotemporal keyword query method is designed, which performs on-chain conditional query by traversing the B2F -BKM structure, and then performs on-chain and off-chain data connection query according to the connection attribute value, thereby reducing unnecessary communication overhead compared with the traditional query method. Experiments have proved that the index of the present invention has good stability, greatly improves the query efficiency and significantly reduces the communication overhead.

Description

Space-time keyword query method in mixed storage block chain environment
Technical Field
The invention belongs to the technical field of computer block chain query, and particularly relates to a space-time keyword query method in a mixed storage block chain environment.
Background
In recent years, with the increasing popularity of digital cryptocurrency, the underlying technology, namely blockchain technology, has become the focus of current social attention. Essentially, the blockchain can be regarded as a distributed database, has the characteristics of decentralization, non-tampering, traceability, data transparency and the like, and can well solve the trust problem among a plurality of participants in a supply chain and other scenes. The space-time data is a common data expression form in the fields of supply chains, the Internet of things and the like, and a large amount of space-time data is stored on the blockchain along with the application of the blockchain in the scenes. The space-time keyword query uses time, space and keywords as query conditions, and searches all information meeting the conditions, so that the space-time keyword query is widely applied to the above scenes.
The current commonly used blockchain storage mode is a mixed storage mode of a chain up-chain and a chain down-chain, operation transactions which can be shared by multiple parties are stored on the chain, and private data of all parties is stored under the chain. When the space-time keyword query is performed in this mode, the result after the link-up and link-down data connection needs to be returned. Because most of the existing query methods use B+ tree indexes in blocks, the query efficiency is lower for space-time keyword query with multi-dimensional search space. In addition, the existing method needs to read all data in the link table under the chain for connection during inquiry, so that a large amount of communication overhead is caused, and user experience is affected. Therefore, the research on the space-time keyword query method in the mixed storage block chain environment has important significance.
Taking a real scene as an example, in a supply chain scene, a plurality of participants store production information, logistics information, sales information and the like of the production information, the logistics information, the sales information and the like of the production information, the logistics information and the sales information are stored in a blockchain in a public transparent manner for multiparty data sharing and collaborative processing, and private data of each participant is stored under the chain. When the user sends the space-time keyword query Q shown in the figure, the transactions meeting the query conditions under the chain are required to be returned to the querying user after being connected through the connection attribute. The existing query method needs to query respectively according to the B+ tree index of each attribute column in the block, intersections are taken from query results to obtain all transactions meeting query conditions, then all data in the link table under the chain are read to be connected to the blockchain, and finally the connected results are returned to the query user. The drawbacks of this query approach are mainly the following three aspects:
first, query semantics are weaker. While transactions on a blockchain contain a variety of attributes, existing blockchain platforms typically store transactions in the form of key-value or bespoke semantics. The key-value mode does not support relational query such as selection, connection and the like, and can not query the transaction according to the attribute name, and the semantic-given storage mode needs to construct an independent index for each attribute column, and can not merge indexes according to attribute types.
Second, the query efficiency is low. The on-chain transactions include attributes such as time, space, and keywords, distributed among the blocks. Most of the existing blockchain query methods avoid traversing all transactions by using B+ tree indexes in the blocks, but the structure has low query efficiency for space-time keyword query with multidimensional search space.
Third, communication overhead is large. Because the data on the link and the data under the link need to be acquired simultaneously during the query, the existing method needs to read all the data in the link table under the link for connection, and when the amount of the data under the link is very large and the result of the connection is small, a great deal of unnecessary communication overhead is generated.
Disclosure of Invention
Aiming at the defects of the prior art, the invention designs a space-time keyword query method under a mixed storage block chain environment.
A space-time keyword query method under a mixed storage block chain environment comprises the following specific steps:
Step 1, constructing a block chain model which is CSBM classified according to attributes and endowed with semantics;
The method comprises the steps of 1.1, CSBM comprises a plurality of blocks CS-B classified according to attributes and endowed with semantics, CSBM=CS-B 1+CS-B2+···+CS-Bn, each CS-B comprises a Block head and a Block CS-B body classified according to attributes and endowed with semantics, wherein CS-B=B head+CS-B body, B head has the same structure as a traditional Block chain Block head and comprises a front Block hash value PrevHash for connecting blocks into a chain, a Block Height for recording the position of a current Block on the chain, a Timestamp for recording the generation time of the Block, mining difficulty Bits and random numbers Nonce, merkle Root in the Block head is replaced by a Root node BKM-tree Root of a BKM-tree constructed in step 2, all transactions in the Block are generated based on hash values of transactions in the Block, and the transactions in the CS-B body comprise attribute types and are added;
Defining T x as a transaction ,Tx={Key=v1,Time={time=v2},Coordinate={longitude=v3,latitude=v4},Keywords={kw1,kw2,···,kwn},Others={oth1,oth2,···,othn}},Key classified according to attributes and endowed with semantics in CSBM as a main key of the transaction, calculating a hash value of the transaction, and using a 16-system character string with the length of 64 to represent the transaction, wherein Time is a Time attribute type, time is a timestamp of the transaction, coordinatate is a space attribute type, longitude is the longitude of the transaction, latitude is the latitude of the transaction, v i is an attribute value, keywords is a keyword attribute type, kw i is an attribute name of a certain keyword of the transaction, others is other attribute types, oth i is names of other attributes of the transaction, and setting different attribute names according to different application occasions and transaction types;
Step 2, constructing an on-chain two-stage index structure based on a B 2 F-BKM structure, wherein the structure consists of two parts, namely an inter-block B 2 F-tree and an intra-block BKM-tree, and indexing all blocks and transactions is completed;
Step 2.1, constructing BKM tree in each block to replace original Merkle tree, extracting all the space attribute type values and key attribute type values of the transactions in the block in the process of constructing new blocks, simulating KD tree structure, dividing the region according to the space attribute data of the transactions until each region only contains one transaction, storing the transactions in leaf nodes only, storing information related to data division in non-leaf nodes, adding bloom filter fields into KD tree, constructing bloom filter according to the key attribute data of the transactions, including all key words of child nodes in bloom filter of non-leaf nodes, calculating hash values of each node from bottom to top according to the construction mode of Merkle tree to form BKM-tree, and recording root node in block head;
Step 2.2, constructing a B 2 F-tree among blocks based on Block numbers, block creation time and keywords in the blocks, acquiring new Block data, wherein the new Block data comprises Block numbers of the new blocks, block-id, block creation time Timestamp, bloom filters bloomfilter-bf formed by the keywords in the blocks and address offsets in the Block chain files, constructing B 2 F-tree nodes, wherein the formats of keys in the leaf nodes are Block-id, timestamp and bf, the Block-id is from a Block Height at a Block head, the Timestamp is from Timestamp at the Block head and bf is the same as bloom filters in BKM-tree root nodes in the blocks, and the pointers in the leaf nodes record the address offsets of the blocks in the Block chain files;
Step 2.3, constructing a chain two-stage index structure B 2 F-BKM structure;
Step 3, designing a method for inquiring space-time keywords under the upper chain of a chain, acquiring inquiry conditions, inquiring the conditions on the chain according to a B 2 F-BKM structure of a two-stage index structure on the chain, namely, B 2 F-tree and BKM-tree, extracting connection attribute values of inquired matters, inquiring under the upper chain of the chain according to the values, and returning a final result;
Defining that space-time keyword query in a mixed storage blockchain environment is composed of four tuples, wherein Q= [ [ T s,Te ], S, W, join-attr ], wherein [ T s,Te ] is a time range of user query, S is a space range of user query, W is a keyword data set of user query and comprises a plurality of keywords, join-attr is a connection condition in the form of an equation, the left side of the equation is a keyword attribute name to be connected on a chain, and the right side of the equation is a connection keyword attribute name in a database table under the chain;
Traversing B 2 F-tree according to time and keyword condition in query Q, selecting branch pointer according to time condition based on 'timestamp, bf' portion of key in B 2 F-tree node, judging whether the node pointed by said pointer contains query keyword W according to bloom filter, if so, continuing traversing, otherwise, stopping, according to said method until leaf node, finding out all keys containing query keyword W and time is greater than or equal to 'T s', less than or equal to 'T e' and obtaining the positions of these blocks by pointer;
Step 3.3, after a specific block is queried, traversing BKM-tree in the block in turn according to space and keyword conditions in query Q, selecting subnodes from top to bottom according to a search mode of KD-tree, judging whether all query keywords are contained in bloom filters of the subnodes, traversing the subnodes if the subnodes are contained, and discarding the branch if the subnodes are not contained;
Step 3.4, extracting the value of the connection attribute of original data in Onchain-RS according to the connection condition join-attr in the query Q, querying the data in the under-chain database table according to the value, adding the under-chain result set Offchain-RS, reading the data in Offchain-RS and Onchain-RS, connecting the under-chain data through the connection attribute, and storing the result in the result set ResultSet;
Returning ResultSet, terminating the current calculation and waiting for the next call;
The invention has the beneficial technical effects that:
according to the invention, when the space-time keyword query is performed on the block chain based on the chain up-chain and down-chain hybrid storage mode, the query efficiency is low by applying the existing method, a large amount of unnecessary communication overhead can be generated, and the user experience is affected; therefore, the invention aims to provide a space-time keyword query method in a mixed storage block chain environment, which improves the query efficiency and reduces the communication overhead compared with the traditional query method. The method has the following specific beneficial technical effects:
1. The method provided by the invention processes the space-time keyword query problem under the mixed storage blockchain environment based on the blockchain model CSBM classified according to the attribute and endowed with the semantic, and can realize the efficient space-time keyword query under the mixed storage blockchain environment.
2. The space-time keyword query method constructs an on-chain two-stage index structure based on a B 2 F-BKM structure, the structure consists of two parts, namely an inter-block B 2 F-tree and an intra-block BKM-tree, and indexes of all blocks and transactions are completed.
3. According to the block chain connection query method facing the hybrid storage mode, the link condition query is performed by traversing the B 2 F-BKM structure, and then the link data connection query is performed on the link according to the connection attribute value.
Drawings
FIG. 1 is a schematic diagram of a supply chain scene in a space-time keyword query method in a hybrid storage blockchain environment, wherein the left side is a blockchain model CSBM classified according to attributes on a chain and provided with semantics, and the right side is an under-chain database;
FIG. 2 is a schematic diagram of a structure of a block CS-B classified by attribute and provided with semantics in a space-time keyword query method in a hybrid memory blockchain environment according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the overall structure of a two-level index structure B 2 F-BKM structure on a chain in a space-time keyword query method in a hybrid memory blockchain environment according to an embodiment of the present invention;
FIG. 4 is a flowchart of a structure of B 2 F-BKM for constructing a two-level index structure on a chain in a space-time keyword query method in a mixed storage blockchain environment according to an embodiment of the present invention;
FIG. 5 is a flowchart of a method for querying space-time keywords in a chain up-link and a chain down-link in a method for querying space-time keywords in a mixed storage block chain environment according to an embodiment of the invention;
FIG. 6 is a diagram illustrating an example query process of a space-time keyword query method under a chain-up and chain-down in a space-time keyword query method in a hybrid memory blockchain environment according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the drawings and examples;
Aiming at the problem of space-time keyword query of a chain up-chain and down-chain hybrid storage block chain, the invention provides a space-time keyword query method in a hybrid storage block chain environment. Firstly, aiming at weak query semantics, a blockchain model (Categorical AND SEMANTIC Blockchain Model, CSBM) which is classified by attributes and endows semantics is constructed, attribute types are divided for transactions in a block, and semantics are added. Secondly, aiming at low query efficiency, an on-chain two-stage index structure based on a B 2 F-BKM structure is constructed, and the structure can index all blocks and transactions, so that the query efficiency is effectively improved. Finally, a method for querying space-time keywords under the uplink of the chain is designed aiming at larger communication cost, the on-chain conditional query is performed by traversing the B 2 F-BKM structure, and then the data connection query under the uplink of the chain is performed according to the connection attribute value, so that unnecessary communication cost is reduced compared with the traditional query method.
A space-time keyword query method under a mixed storage block chain environment comprises the following specific steps:
Step 1, constructing a blockchain model (Categorical AND SEMANTIC Blockchain Model, CSBM) classified by attributes and endowed with semantics, wherein the structure is shown in the left part of the figure 1.
Step 1.1 CSBM contains a plurality of blocks classified by attribute and assigned semantics (CS-B), csbm=cs-B 1+CS-B2+···+CS-Bn, each CS-B contains two parts, a block header (B head) and a block classified by attribute and assigned semantics (CS-B body), the structure is shown in fig. 2. CS-b=b head+cs-B body. The Bhead has the same structure as the Block head of the traditional Block chain, and comprises a front Block hash value PrevHash for connecting blocks into a chain, a Block Height for recording the position of the current Block on the chain, a time stamp for recording the generation time of the Block, mining difficulty Bits and random numbers Nonce, wherein Merkle Root (Merkle Root) in the Block head is replaced by a Root node (BKM-tree Root) of the BKM-tree constructed in the step 2, and the hash value based on the transaction in the Block is generated. All transactions within the block are contained in the CS-B body, attribute types are divided for the transactions and semantics are added;
Step 1.2, defining T x as a transaction ,Tx={Key=v1,Time={time=v2},Coordinate={longitude=v3,latitude=v4},Keywords={kw1,kw2,···,kwn},Others={oth1,oth2,···,othn}},Key classified according to attributes and endowed with semantics in CSBM as a main key of the transaction, obtaining by calculating a hash value of the transaction, and using a 16-ary character string with the length of 64 to represent the transaction, wherein Time is a Time attribute type, time is a timestamp of the transaction, coordinates is a space attribute type, longitude is longitude of the transaction, latitude is latitude of the transaction, v i is an attribute value, keywords is a keyword attribute type, kw i is an attribute name of a certain keyword of the transaction, others is other attribute types, and oth i is names of other attributes of the transaction. Setting different attribute names according to different application occasions and transaction types;
In this example, transaction Keywords = { operation, good, factory } for recording production information, and transaction Keywords = { operation, good, factory, driver } for recording logistics information. Taking transaction T j+1 in fig. 2 as an example, the main key of the transaction is '8 c..9B', and a piece of logistics information is recorded, wherein at the time of 'T 2, the driver Tom transports goods 2 produced by the factory 2 to a (134.3°e,42.5°n) position';
And 2, constructing an on-chain two-stage index structure based on a B 2 F-BKM structure, wherein the structure is shown in figure 3. The structure consists of two parts, namely an inter-block B 2 F-tree and an intra-block BKM-tree, and the indexing of all blocks and transactions is completed, and the construction flow is shown in figure 4, and the specific process is as follows:
And 2.1, constructing a BKM-tree in each block to replace the original Merkle tree. In the process of constructing a new block, all transaction space attribute type values and keyword attribute type values in the block are extracted. Simulating a KD-tree structure, dividing regions according to the space attribute data of the transactions until each region only contains one transaction, wherein the transactions are only stored in leaf nodes, and non-leaf nodes store information related to data division; adding a bloom filter field into the KD-tree, constructing a bloom filter according to the key word attribute data of the transaction, wherein the bloom filter of the non-leaf node contains all key words of child nodes of the non-leaf node;
In this example, as shown on the right side of fig. 3, BKM-trees are built for all transactions in block n, for example, the root node contains the hash value of the node, spatial partition related information, and a bloom filter BF 7, which contains the key data for all transactions in block n. The leaf node P j+1 records the spatial information (134.3°e,42.5°n), the Key information (transport 2,goods2, tom) and the Key value (8 c..9b) of the transaction T j+1, and the hash value of this node is calculated from the above information in series hash.
And 2.2, constructing an inter-block B 2 F-tree based on the block number, the block creation time and the intra-block keywords. New block data is obtained, including the block number (block-id) of the new block, the block creation time (timestamp), the bloom filter (bloomfilter-bf) made up of intra-block keys, and the address offset of the block in the blockchain file. The B 2 F-tree node is constructed, and the format of the keys in the leaf nodes is 'Block-id, timestamp, bf', wherein the Block-id is from the Block Height (Block Height) at the Block head, the Timestamp is from the Timestamp (Timestamp) at the Block head, and bf is the same as the bloom filter in the BKM-tree root node in the Block. Pointers in leaf nodes record the address offsets of the blocks located in the blockchain file. All keywords of subtrees of the non-leaf nodes are contained in a bloom filter of the non-leaf nodes and are used for filtering pointers, judging whether the nodes pointed by the pointers contain corresponding keywords or not, and updating a B 2 F-tree according to a block-id and a B+ tree node insertion mode, wherein the structure is shown in figure 3;
In this example, as shown in FIG. 3, a B 2 F-tree is constructed for all blocks, e.g., the root node's bloom filter BF 5 contains all the keys in its subtree bloom filter BF 1,BF2, and the leaf node ((1, ts 1,bf1),p1) indicates that the first block was created at ts 1, and the key data of the intra-block transaction constitutes the location where bloom filter BF 1,p1 points to the block.
Step 2.3, constructing a chain two-stage index structure B 2 F-BKM structure;
And 3, designing a method for querying space-time keywords under the upper chain of the chain. Acquiring query conditions, carrying out on-chain conditional query according to a B 2 F-BKM structure of the on-chain two-stage index structure, namely a B 2 F-tree and a BKM-tree, extracting a connection attribute value of a queried transaction, carrying out on-chain and off-chain connection query according to the value, and returning a final result. The flow is shown in fig. 5, the example query process is shown in fig. 6, and the specific process is as follows:
Defining that space-time keyword query in a mixed storage blockchain environment is composed of four tuples, wherein Q= [ [ T s,Te ], S, W, join-attr ], wherein [ T s,Te ] is a time range of user query, S is a space range of user query, W is a keyword data set of user query and comprises a plurality of keywords, join-attr is a connection condition in the form of an equation, the left side of the equation is a keyword attribute name to be connected on a chain, and the right side of the equation is a connection keyword attribute name in a database table under the chain;
In this example, query q= [ [ t 7,t9],[S],[goods=goods2, operator=transport ], driver=driver.
And 3.2, traversing the B 2 F-tree according to the time and the keyword condition in the query Q. Based on the 'timestamp, bf' part of the keys in the B 2 F-tree node, starting from the root node, selecting a branch pointer according to time conditions, judging whether the node pointed by the pointer contains a query keyword W according to a bloom filter, and continuing traversing if the node contains the query keyword W, otherwise, stopping traversing;
In this example, the location information of the block is obtained by traversing the B 2 F-tree to obtain that only the block with the block number 9 meets the query time condition and includes all the query keywords.
And 3.3, after a specific block is inquired, traversing the BKM-tree positioned in the block in sequence according to the space and the keyword condition in the inquiry Q. And selecting the child node from top to bottom according to the searching mode of the KD-tree, judging whether all query keywords are contained in the bloom filter of the child node, traversing the child node if the query keywords are contained, and discarding the branch if the query keywords are not contained. According to this method, all transactions within the block that meet the space and key conditions are found up to the leaf node. Obtaining Key data in leaf nodes, namely hash values (Key) of transactions, obtaining original data according to the Key data, and adding the data into a result set (Onchain-RS) on a chain if a time condition is met;
In this example, the original data T j and T j+1 are found by traversing the BKM-tree in block 9 to obtain that nodes P j and P j+1 meet space and keyword query conditions, then reading the hash values Key j and Key j+1 stored in nodes P j and P j+1, finding that the original data meet time conditions, and adding the original data of transactions T j and T j+1 to Onchain-RS.
And 3.4, extracting the value of the connection attribute of the original data in Onchain-RS according to the connection condition join-attr in the query Q, querying the data in the database table under the chain according to the value, and adding an under-chain result set (Offchain-RS). After the data in Offchain-RS and Onchain-RS are read and connected with the data under the chain through the connection attribute, the result is stored in a result set ResultSet, in the example, the values Tom and James of the connection attribute Driver are obtained according to the original data, the data about the drivers Tom and James are searched in the under-chain Driver table, then the on-chain data are connected with the under-chain data through the connection attribute value, the results R1 and R2 are obtained, and the results are stored in a result set ResultSet.
And 3.5, returning ResultSet, terminating the current calculation and waiting for the next call.

Claims (6)

1. The space-time keyword query method in the mixed storage block chain environment is characterized by comprising the following steps:
Step 1, constructing a block chain model which is CSBM classified according to attributes and endowed with semantics;
The method comprises the steps of 1.1, CSBM comprises a plurality of blocks CS-B classified according to attributes and endowed with semantics, CSBM=CS-B 1+CS-B2+···+CS-Bn, each CS-B comprises a block head B head and a block CS-B body classified according to attributes and endowed with semantics, wherein CS-B=B head+CS-B body, B head has the same structure as a traditional block chain block head, merkle Root in the block head is replaced by a Root node BKM-tree Root of a BKM-tree constructed in step 2 and generated based on hash values of transactions in the block, all transactions in the block are contained in the CS-B body, attribute types are divided for the transactions, and semantics are added;
Defining T x as a transaction ,Tx={Key=v1,Time={time=v2},Coordinate={longitude=v3,latitude=v4},Keywords={kw1,kw2,···,kwn},Others={oth1,oth2,···,othn}},Key classified according to attributes and endowed with semantics in CSBM as a main key of the transaction, calculating a hash value of the transaction, and using a 16-system character string with the length of 64 to represent the transaction, wherein Time is a Time attribute type, time is a timestamp of the transaction, coordinatate is a space attribute type, longitude is the longitude of the transaction, latitude is the latitude of the transaction, v i is an attribute value, keywords is a keyword attribute type, kw i is an attribute name of a certain keyword of the transaction, others is other attribute types, oth i is names of other attributes of the transaction, and setting different attribute names according to different application occasions and transaction types;
Step 2, constructing an on-chain two-stage index structure based on a B 2 F-BKM structure, wherein the structure consists of two parts, namely an inter-block B 2 F-tree and an intra-block BKM-tree, and indexing all blocks and transactions is completed;
Step 3, designing a method for inquiring space-time keywords under the upper chain of the chain, acquiring inquiry conditions, inquiring the conditions on the chain according to the B 2 F-BKM structure of the two-stage index structure on the chain, namely, B 2 F-tree and BKM-tree, extracting connection attribute values of inquired transactions, inquiring under the upper chain of the chain according to the values, and returning a final result.
2. The method of claim 1, wherein the B head comprises a pre-Block hash value PrevHash connecting blocks into a chain, a Block Height for recording the position of the current Block on the chain, a Timestamp for recording the Block generation time, and mining difficulty Bits and random number nonces.
3. The method for querying space-time keywords in a hybrid memory blockchain environment according to claim 1, wherein step 2 specifically comprises:
Step 2.1, constructing BKM-tree in each block to replace original Merkle tree, extracting all the space attribute type values and key attribute type values of the transactions in the block in the process of constructing a new block, simulating KD-tree structure, dividing the region according to the space attribute data of the transactions until each region only contains one transaction, storing the transactions in leaf nodes only, storing information related to data division in non-leaf nodes, adding bloom filter fields into KD-tree, constructing bloom filter according to the key attribute data of the transactions, including all keys of child nodes of the non-leaf nodes, calculating hash values of each node from bottom to top according to the construction mode of Merkle tree to form BKM-tree, and recording root nodes in block heads;
Step 2.2, constructing a B 2 F-tree among blocks based on Block numbers, block creation time and keywords in the blocks, acquiring new Block data, constructing B 2 F-tree nodes, judging whether the nodes pointed by the pointers contain corresponding keywords or not according to the filtering pointers, and updating B 2 F-tree according to the inserting mode of the Block-id and the tree nodes according to B+ by constructing B 2 F-tree nodes, wherein the key formats in the leaf nodes are 'Block-id, timestamp and bf' from the Block Height of the Block head, timestamp is the same as the bloom filter in the BKM-tree root node in the Block head, pointers in the leaf nodes record the address offset of the blocks in the Block chain file;
And 2.3, constructing a chain two-stage index structure B 2 F-BKM structure.
4. The method for querying space-time keywords in a mixed storage blockchain environment according to claim 3, wherein the obtaining new block data includes a block number block-id of a new block, a block creation time timestamp, a bloom filter bloomfilter-bf composed of keywords in the block, and an address offset of the block in the blockchain file.
5. The method for querying space-time keywords in a hybrid memory blockchain environment according to claim 1, wherein the step 3 is specifically:
Step 3.1, defining that space-time keyword query in a mixed storage blockchain environment is composed of quadruples, wherein Q= [ [ T s,Te ], S, W, join-attr ];
Traversing B 2 F-tree according to time and keyword condition in query Q, selecting branch pointer according to time condition based on 'timestamp, bf' portion of key in B 2 F-tree node, judging whether the node pointed by said pointer contains query keyword W according to bloom filter, if so, continuing traversing, otherwise, stopping, according to said method until leaf node, finding out all keys containing query keyword W and time is greater than or equal to 'T s', less than or equal to 'T e' and obtaining the positions of these blocks by pointer;
Step 3.3, after a specific block is queried, traversing BKM-tree in the block in turn according to space and keyword conditions in query Q, selecting subnodes from top to bottom according to a search mode of KD-tree, judging whether all query keywords are contained in bloom filters of the subnodes, traversing the subnodes if the subnodes are contained, and discarding the branch if the subnodes are not contained;
Step 3.4, extracting the value of the connection attribute of original data in Onchain-RS according to the connection condition join-attr in the query Q, querying the data in the under-chain database table according to the value, adding the under-chain result set Offchain-RS, reading the data in Offchain-RS and Onchain-RS, connecting the under-chain data through the connection attribute, and storing the result in the result set ResultSet;
and 3.5, returning ResultSet, terminating the current calculation and waiting for the next call.
6. The method for searching for space-time keywords in a hybrid memory blockchain environment according to claim 5, wherein [ T s,Te ] in step 3.1 is a time range of user searching, S is a space range of user searching, W is a keyword data set of user searching, the keyword data set comprises a plurality of keywords, join-attr is a connection condition in the form of an equation, the left side of the equation is a keyword attribute name to be connected on a chain, and the right side of the equation is a connection keyword attribute name in a database table under the chain.
CN202210650145.0A 2022-06-10 2022-06-10 A spatiotemporal keyword query method in a hybrid storage blockchain environment Active CN115048377B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210650145.0A CN115048377B (en) 2022-06-10 2022-06-10 A spatiotemporal keyword query method in a hybrid storage blockchain environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210650145.0A CN115048377B (en) 2022-06-10 2022-06-10 A spatiotemporal keyword query method in a hybrid storage blockchain environment

Publications (2)

Publication Number Publication Date
CN115048377A CN115048377A (en) 2022-09-13
CN115048377B true CN115048377B (en) 2025-03-28

Family

ID=83161322

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210650145.0A Active CN115048377B (en) 2022-06-10 2022-06-10 A spatiotemporal keyword query method in a hybrid storage blockchain environment

Country Status (1)

Country Link
CN (1) CN115048377B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116628285B (en) * 2023-07-21 2023-11-03 北京邮电大学 Blockchain transaction data query method and device
CN118626517B (en) * 2024-07-10 2025-09-12 浙江大学 A classification retrieval method and its connection query method based on the tamper-proof property of blockchain data

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108769150A (en) * 2018-05-14 2018-11-06 百度在线网络技术(北京)有限公司 Data processing method, device, clustered node and the storage medium of block chain network
CN112800065A (en) * 2021-02-09 2021-05-14 北京工业大学 Efficient data retrieval method based on improved block storage structure

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102544628B1 (en) * 2019-03-08 2023-06-19 한국전자통신연구원 System for a data sharing platform in a block chain based distributed data sharing environment, method for searching data index in the system and method for providing seartch index in the system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108769150A (en) * 2018-05-14 2018-11-06 百度在线网络技术(北京)有限公司 Data processing method, device, clustered node and the storage medium of block chain network
CN112800065A (en) * 2021-02-09 2021-05-14 北京工业大学 Efficient data retrieval method based on improved block storage structure

Also Published As

Publication number Publication date
CN115048377A (en) 2022-09-13

Similar Documents

Publication Publication Date Title
US7287033B2 (en) Efficient traversals over hierarchical data and indexing semistructured data
CN107491561B (en) Ontology-based urban traffic heterogeneous data integration system and method
CN115048377B (en) A spatiotemporal keyword query method in a hybrid storage blockchain environment
CN106933833B (en) Method for quickly querying position information based on spatial index technology
US9870382B2 (en) Data encoding and corresponding data structure
Chu et al. A relational approach to incrementally extracting and querying structure in unstructured data
CN103440245A (en) Line and column hybrid storage method of database system
CN103123650B (en) A kind of XML data storehouse full-text index method mapped based on integer
NZ505767A (en) Database management system with layered index arranged in blocks
CN101963993B (en) Method for fast searching database sheet table record
CN114969101B (en) SQL Statement Processing Method and Device
CN110928882B (en) Memory database indexing method and system based on improved red black tree
CN113704248B (en) Block chain query optimization method based on external index
CN106503040A (en) It is suitable for KV data bases and its creation method of SQL query method
CN113590610B (en) Blood relationship expression method based on Elastic Search
CN116881243A (en) Learning type indexing method and system based on time sequence data characteristics
CN101256579A (en) Method for inquesting data organization in database
Jin et al. Making RDBMSs efficient on graph workloads through predefined joins
CN103425789B (en) The querying method of a kind of space-time data and device
CN113434511A (en) Hilbert curve-based clustering index method
CN102236662A (en) Database query and control method
Bertino et al. The indispensability of dispensable indexes
JP2001216307A (en) Relational database management system and storage medium stored with same
CN114791967B (en) Storage and query method of temporal RDF data based on bit matrix model
Wang et al. Complete Join Reordering for Null-Intolerant Joins

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载