+

CN111858607B - Data processing method, device, electronic equipment and computer readable medium - Google Patents

Data processing method, device, electronic equipment and computer readable medium Download PDF

Info

Publication number
CN111858607B
CN111858607B CN202010728918.3A CN202010728918A CN111858607B CN 111858607 B CN111858607 B CN 111858607B CN 202010728918 A CN202010728918 A CN 202010728918A CN 111858607 B CN111858607 B CN 111858607B
Authority
CN
China
Prior art keywords
target
hash
data
partition
hash partition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010728918.3A
Other languages
Chinese (zh)
Other versions
CN111858607A (en
Inventor
邱海港
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd filed Critical Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN202010728918.3A priority Critical patent/CN111858607B/en
Publication of CN111858607A publication Critical patent/CN111858607A/en
Application granted granted Critical
Publication of CN111858607B publication Critical patent/CN111858607B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2246Trees, e.g. B+trees
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2379Updates performed during online database operations; commit processing
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/52Program synchronisation; Mutual exclusion, e.g. by means of semaphores
    • G06F9/526Mutual exclusion algorithms

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a data processing method, a device, electronic equipment and a computer readable medium, relating to the technical field of databases, comprising the following steps: acquiring a plurality of target index information which are preset and used for indexing data in a database; performing hash partition operation on the plurality of target index information to obtain at least one target hash partition; each target hash partition comprises a hash value of corresponding target index information; setting a corresponding B tree for each target hash partition in at least one target hash partition; and performing target data operation on the data in the database based on each target hash partition and the corresponding B tree. The method solves the technical problems of low efficiency and resource waste of the conventional database index operation.

Description

Data processing method, device, electronic equipment and computer readable medium
Technical Field
The present invention relates to the field of databases, and in particular, to a data processing method, apparatus, electronic device, and computer readable medium.
Background
In the existing databases, the data volume is large, the databases store massive data, and the number of lines of single-table data in many databases reaches tens of millions or even hundreds of millions of data volumes. The data base is a table with billions of data, and the data volume is huge. The database also needs to provide data querying, updating, etc. functions to the application. In the existing data query mode, a B tree or a b+ tree may be used.
For example, in a data table of more than 1 million or tens of millions of data volumes, a B-tree may be found by halving over 30 times for one data processing thread. Although there are not many single lookups, the number of queries is multiplied by several times if multiple data processing threads initiate data processing operations at the same time. If the data processing thread is 2000, the number of queries is 30 by 2000, and is amplified by 2000. In a high concurrency state, the B-tree also involves reading of the disk during the lookup, and disk IO is very time consuming, involving many unnecessary disk queries due to the binary lookup.
Disclosure of Invention
Accordingly, an object of the present invention is to provide a data processing method, apparatus, electronic device and computer readable medium, so as to alleviate the technical problems of low efficiency and resource waste of the existing database indexing operation.
In a first aspect, an embodiment of the present invention provides a data processing method, including: acquiring a plurality of target index information which are preset and used for indexing data in a database; performing hash partition operation on the plurality of target index information to obtain at least one target hash partition; each target hash partition comprises a hash value of corresponding target index information; setting a corresponding B tree for each target hash partition in the at least one target hash partition; and executing target data operation on the data in the database based on each target hash partition and the corresponding B tree.
Further, performing a hash partition operation on the plurality of target index information, and obtaining at least one target hash partition includes: calculating the hash value of each target index information to obtain a plurality of target hash values; acquiring at least one preset initial hash partition; and determining an initial hash partition to which each target hash value belongs to obtain at least one target hash partition.
Further, determining the initial hash partition to which each of the target hash values belongs includes: acquiring the number of the at least one initial hash partition to obtain a target number; performing remainder calculation on each target hash value based on the target quantity to obtain a first calculation result; and determining an initial hash partition to which each target hash value belongs based on the first calculation result.
Further, each initial hash partition contains corresponding sequence flag information; determining, based on the first calculation result, an initial hash partition to which each target hash value belongs includes: and determining target sequence mark information which is the same as each first calculation result in at least one sequence mark information, and determining an initial hash partition corresponding to the target sequence mark information as the initial hash partition to which each target hash value belongs.
Further, performing a target data operation on data in the database based on each of the target hash partitions and its corresponding B-tree includes: when the target data is operated for data writing operation, determining data index information of data to be written, and calculating a hash value of the data index information of the data to be written to obtain a first hash value; determining a hash partition to which the first hash value belongs from the at least one target hash partition to obtain a first hash partition; and establishing a corresponding B tree node on a B tree corresponding to the first hash partition, wherein the B tree node comprises at least one of data index information of the data to be written and storage position information of the data to be written in a database.
Further, determining, in the at least one target hash partition, a hash partition to which the first hash value belongs, where obtaining the first hash partition includes: performing remainder calculation on the first hash value based on the number of the at least one target hash partition to obtain a second calculation result; and determining a hash partition to which the first hash value belongs based on the second calculation result to obtain the first hash partition.
Further, performing the target data operation on the data in the database based on each of the target hash partitions and its corresponding B-tree further comprises: when the target data operation is a data query operation, determining data index information of data to be queried, and calculating a hash value of the data index information of the data to be queried to obtain a second hash value; determining a hash partition to which the second hash value belongs in the at least one target hash partition to obtain a second hash partition; determining a B tree corresponding to the second hash partition; and traversing and searching the B tree corresponding to the second hash partition to obtain a storage position of the data to be queried in a database, and reading the data to be queried according to the storage position.
Further, determining, in the at least one target hash partition, a hash partition to which the second hash value belongs, where obtaining the second hash partition includes: performing remainder calculation on the second hash value based on the number of the at least one target hash partition to obtain a third calculation result; and determining the hash partition to which the second hash value belongs based on the third calculation result to obtain the second hash partition.
Further, before said performing a target data operation on data in the database based on each of said target hash partitions and its corresponding B-tree, the method further comprises: and controlling the B tree corresponding to the target hash partition corresponding to the target data operation to be in a locking state.
In a second aspect, an embodiment of the present invention further provides a data processing apparatus, including: the acquisition unit is used for acquiring a plurality of target index information which are preset and used for indexing the data in the database; the partition unit is used for executing hash partition operation on the plurality of target index information to obtain at least one target hash partition; each target hash partition comprises a hash value of corresponding target index information; a building unit, configured to set a corresponding B-tree for each target hash partition in the at least one target hash partition; and the operation unit is used for executing target data operation on the data in the database based on each target hash partition and the corresponding B tree.
In a third aspect, an embodiment of the present invention further provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the method described in the first aspect when the processor executes the computer program.
In a fourth aspect, embodiments of the present invention also provide a computer readable medium having non-volatile program code executable by a processor, the program code causing the processor to perform the steps of the method described in the first aspect above.
In the embodiment of the application, firstly, a plurality of target index information which is preset and used for indexing data in a database is obtained; then, hash partition operation is carried out on the plurality of target index information, and at least one target hash partition is obtained; each target hash partition comprises a hash value of corresponding target index information; then, setting a corresponding B tree for each target hash partition in at least one target hash partition; finally, performing a target data operation on the data in the database based on each target hash partition and its corresponding B-tree. According to the method and the device, the target index information is subjected to hash partition, and each hash partition is provided with one B tree, so that the method and the device can be executed on the corresponding B tree of the corresponding target hash partition in the process of executing data updating processing, and the data of different target hash partitions are updated at the same time, so that the problem of a traditional data serial index mode is solved, the efficiency of target data operation is improved, useless disk inquiry is reduced, and the technical problems of low efficiency and resource waste of the existing database index operation are solved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
In order to make the above objects, features and advantages of the present invention more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a data processing method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for determining a target hash partition according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention;
fig. 4 is a schematic diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Embodiment one:
According to an embodiment of the present invention, there is provided an embodiment of a data processing method, it being noted that the steps shown in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that herein.
FIG. 1 is a flow chart of a data processing method according to an embodiment of the present invention, as shown in FIG. 1, the method includes the steps of:
step S102, a plurality of target index information which is preset and used for indexing the data in the database is obtained.
Specifically, in the present application, the target index information is index information preset by a user, and the index information is used for indexing data in a database.
Step S104, hash partition operation is carried out on the plurality of target index information, and at least one target hash partition is obtained; each target hash partition contains a hash value of corresponding target index information.
It should be noted that, the hash partition may be understood as hash buckets, where each hash bucket corresponds to a hash value of target index information of data in the database, and one hash bucket may correspond to one or more hash values.
The target index information is applicable to the character index, a hash value corresponding to each target index information is calculated through the character index, and a hash partition corresponding to the target index information is determined according to the hash value corresponding to the target index information data, so that at least one target hash partition is obtained.
Step S106, setting a corresponding B tree for each target hash partition in at least one target hash partition;
It should be noted that the B-tree is a self-balancing tree, and can keep data ordered. The data structure can enable the actions of searching data, sequentially accessing, inserting data and deleting data to be completed in logarithmic time. The B-tree contains a plurality of nodes, each node containing index information of the data and storage locations of the data in the database.
In the application, a target hash chain table can be constructed according to at least one target hash partition. The target hash chain table is a chain table constructed based on hash partitions, for example, a corresponding data field and a pointer field may be constructed in advance for each target hash partition, where the data field is used to store a hash value corresponding to the target hash partition, and the pointer field is used to determine the next "node" or "element". And setting a B tree pointer corresponding to at least one target hash partition, wherein the B tree pointer is used for determining a B tree corresponding to the at least one target hash partition.
Step S108, performing target data operation on the data in the database based on each target hash partition and the corresponding B tree.
In an optional embodiment of the present application, before performing the target data operation on the data in the database based on each target hash partition and the B tree corresponding to the target hash partition, the B tree corresponding to the target hash partition corresponding to the target data operation may be further controlled to be in a locked state.
It should be noted that after determining the target hash partition corresponding to the target data operation and before executing the target data operation on the data in the database based on each target hash partition and the B tree corresponding to the target hash partition, a lock flag needs to be set for the target hash partition corresponding to the target data operation, so that the B tree corresponding to the target hash partition corresponding to the target data operation is in a locked state, and when the target hash partition corresponding to the target data operation is in a locked state, executing the target data operation on the data in the database based on the target hash partition corresponding to the target data operation and the B tree corresponding to the target hash partition.
In the embodiment of the application, a plurality of target index information which is preset and used for indexing the data in the database is obtained; performing hash partition operation on the plurality of target index information to obtain at least one target hash partition; each target hash partition comprises a hash value of corresponding target index information; setting a corresponding B tree for each target hash partition in at least one target hash partition; and performing target data operation on the data in the database based on each target hash partition and the corresponding B tree. According to the method and the device, the target index information is subjected to hash partition, and each hash partition is provided with one B tree, so that the method and the device can be executed on the corresponding B tree of the corresponding target hash partition in the process of executing data updating processing, and the data of different target hash partitions are updated at the same time, so that the problem of a traditional data serial index mode is solved, the efficiency of target data operation is improved, useless disk inquiry is reduced, and the technical problems of low efficiency and resource waste of the existing database index operation are solved.
Finally, it should be noted that, since the hash values cannot be sorted, the B-trees corresponding to at least one target hash partition have different ranges and cross each other, and if sorting is required, secondary sorting is required.
In the embodiment of the present invention, as shown in fig. 2, step S104 includes the following steps:
step S11, calculating a hash value of each target index information to obtain a plurality of target hash values;
Step S12, at least one preset initial hash partition is obtained;
Step S13, determining an initial hash partition to which each target hash value belongs, and obtaining at least one target hash partition.
In the embodiment of the invention, firstly, a hash value of each target index information is calculated to obtain a plurality of target hash values.
Next, at least one preset initial hash partition is obtained, and it should be noted that, when the at least one initial hash partition corresponds to one or more hash values, for example, when the target number of the at least one initial hash partition is 997 (i.e., the hash chain contains 997 initial hash partitions), the sequence flag information of the first initial hash partition is 1, the hash value corresponding to the hash partition 1 is a hash value obtained by dividing the hash value of the target index information by the hash value obtained by dividing the remainder of 997 by 1, and the hash value corresponding to the first initial hash partition may be multiple hash values such as 1, 998, 1995, …, and the like, and the hash value determining method corresponding to the remaining initial hash partitions and the hash value corresponding to the first initial hash partition are the same as the above procedure, which is not repeated here.
After determining the target hash value of each target index information, determining an initial hash partition to which each target hash value belongs, and obtaining at least one target hash partition, wherein the method comprises the following steps:
Step S31, obtaining the number of at least one initial hash partition to obtain a target number;
Step S32, performing remainder calculation on each target hash value based on the target quantity to obtain a first calculation result;
Step S33, determining an initial hash partition to which each target hash value belongs based on the first calculation result.
In the embodiment of the invention, after the number of at least one initial hash partition is determined, remainder calculation is performed on each target hash value and the target number, the remainder corresponding to each target hash value is determined, and the remainder corresponding to each target hash value is determined as a first calculation result.
Then, an initial hash partition to which each target hash value belongs is determined according to the first calculation result.
For example, the number of the initial hash partitions is 997, and if one of the target index information has a target hash value of 201, the remainder is 201 if it is necessary to calculate 201 divided by 997 (i.e., the first calculation result is 201), and if it is determined that the initial hash partition to which the target hash value 201 belongs is the 201 st initial hash partition.
It should be noted that, in the embodiment of the present invention, each initial hash partition includes corresponding sequence flag information, so when determining an initial hash partition to which each target hash value belongs based on the first calculation result, the same target sequence flag information as the first calculation result may be determined in the plurality of sequence flag information, and the initial hash partition to which the target sequence flag information corresponds may be determined as the initial hash partition to which the target hash value belongs.
In the present application, the target data operation described in step S108 includes any one of the following: data writing operation and data inquiring operation.
Based on this, the description will be made below of the target data operation as a data write operation and the target data operation as a data poll operation, respectively.
In the embodiment of the present invention, when the target data is operated as a data write operation, step S108 includes the steps of:
step S41, when the target data is operated as a data writing operation, determining data index information of the data to be written, and calculating a hash value of the data index information of the data to be written to obtain a first hash value;
Step S42, determining a hash partition to which a first hash value belongs in at least one target hash partition to obtain the first hash partition;
Step S43, corresponding B tree nodes are established on the B tree corresponding to the first hash partition, wherein the B tree nodes comprise at least one of data index information of the data to be written and storage position information of the data to be written in a database.
In the embodiment of the invention, when the target data is operated as the data writing operation, after the data to be written is obtained, the data index information of the data to be written is determined, and the hash value (namely, the first hash value) of the data index information of the data to be written is calculated.
Then, a hash partition to which a hash value (i.e., a first hash value) of data index information of the data to be written belongs is determined, and the first hash partition is obtained.
Specifically, when determining the hash partition to which the hash value of the data index information of the data to be written belongs, a remainder of dividing the first hash value by the number of at least one target hash partition may be calculated, thereby obtaining a second calculation result, and determining the hash partition to which the remainder (i.e., the second calculation result) belongs as the first hash partition.
For example, when the number of at least one target hash partition is 997 and it is determined that the hash value of the data index information of the data to be written (i.e., the first hash value) is 201, then 201 is calculated and divided by 997, and the remainder obtained is 201 (i.e., the second calculation result), then the hash partition to which 201 belongs is determined as the first hash partition.
And finally, building a B tree node corresponding to the data to be written in the B tree corresponding to the first hash partition, and writing the data index information of the data to be written in and the storage position of the data to be written in the database into the B tree node, thereby completing the writing operation of the data to be written in.
In the prior art, when a plurality of users write data in a database, each user needs to add a write lock when writing data, so that the users need to be queued, and after the current user finishes writing data, the next user can perform data writing operation.
In the application, the number of hash value partitions is 997, the number of users is 500, and when the hash values of the data written by 500 users are respectively in different partitions in the 997 hash value partitions, the 500 users can write the 500 data into the database at the same time, thereby improving the data writing efficiency of the database, and the application supports 997 data writing operations to be simultaneously parallel.
In the embodiment of the present invention, when the target data operation is a data query operation, step S108 includes the steps of:
Step S51, when the target data operation is a data query operation, determining data index information of the data to be queried, and calculating a hash value of the data index information of the data to be queried to obtain a second hash value;
Step S52, determining a target hash partition to which the second hash value belongs in at least one target hash partition, and obtaining a second hash partition;
step S53, determining a B tree corresponding to the second hash partition;
Step S54, traversing and searching the B tree corresponding to the second hash partition to obtain the storage position of the data to be queried in the database, and reading the data to be queried according to the storage position.
In the embodiment of the invention, under the condition that the target data operation is a data query operation, firstly, determining the data index information of the data to be queried, and calculating the hash value of the data index information of the data to be written to obtain a second hash value.
After the second hash value is obtained, determining a target hash partition to which the second hash value belongs in at least one target hash partition, and determining the target hash partition as the second hash partition.
In the present application, when determining the hash partition to which the second hash value belongs in the at least one target hash partition to obtain the second hash partition, a remainder of dividing the second hash value by the number of the at least one target hash partition may be calculated first, and the remainder may be used as a third calculation result.
And then, determining the hash partition to which the second hash value belongs according to the third calculation result, thereby obtaining the second hash partition.
For example, when the number of at least one target hash partition is 997 and the second hash value is determined to be 101, the 101 is calculated and divided by 997, and the remainder obtained is 101, and the hash partition to which the 101 belongs is determined to be the second hash partition.
And finally, traversing and searching the B tree corresponding to the second hash partition to obtain the storage position of the data to be queried in the database, and reading the data to be queried according to the storage position, thereby completing the query operation of the data to be queried.
In the prior art, a database contains a large amount of data, so that a B tree of the database contains a large amount of nodes, when a user queries the data in the database, the user can query the data to be queried after traversing a large amount of nodes, the query speed is low, useless disk query can occur in the query process, and the query times of IO are more.
In the embodiment of the invention, a large amount of data is partitioned according to the hash value, and each partition contains a part of a large amount of data, so that the height of a B tree corresponding to each partition is reduced compared with that of a B tree in the prior art, when the data in a database is queried, the hash value partition to which the data to be queried belongs is determined by calculating the hash value of the data to be queried, and the data to be queried is queried from the B tree corresponding to the hash value partition, thereby reducing useless disk query, reducing IO query times and improving the query speed of the data.
In addition, in the prior art, when a plurality of users query data in a database, since each user needs to write a lock when querying the data, the users need to be queued, and after the current user finishes querying, the next user can query.
For example, a table with data volume exceeding 1 hundred million or tens of millions is included in the database, the B tree is searched for more than 25 times by two times, although the search is not more than one time, under high concurrency, the number of times of inquiry is amplified by several times, if 1000 times are concurrent, the number of times of inquiry is 25 x 1000, and if data update is needed, the writing lock is added, the whole B tree is locked, and 1000 concurrency is realized, so that the performance is seriously affected. The main disadvantages are that under high concurrency, B-trees also involve reading of disks during lookup, disk IO is very time consuming, and binary lookup involves many unnecessary disk queries.
In the application, the number of target hash partitions is 997, and the number of users is 500, and when the data to be queried of 500 users are respectively located in different partitions in 997 target hash partitions, 500 users can query the B-tree corresponding to 500 target hash partitions at the same time, so that data required by 500 users are queried.
Embodiment two:
the embodiment of the invention also provides a data processing device, which is mainly used for executing the data processing method provided in the first embodiment of the invention, and the data processing device provided in the embodiment of the invention is specifically described below.
Fig. 3 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention, and as shown in fig. 3, the data processing apparatus mainly includes the following units:
an obtaining unit 10, configured to obtain a plurality of target index information that are preset to index data in a database;
A partition unit 20, configured to perform hash partition operation on the multiple target index information to obtain at least one target hash partition; each target hash partition comprises a hash value of corresponding target index information;
A building unit 30, configured to set a corresponding B-tree for each target hash partition in the at least one target hash partition;
an operation unit 40 for performing a target data operation on the data in the database based on each of the target hash partitions and its corresponding B-tree.
In the embodiment of the application, a plurality of target index information which is preset and used for indexing the data in the database is obtained; performing hash partition operation on the plurality of target index information to obtain at least one target hash partition; each target hash partition comprises a hash value of corresponding target index information; setting a corresponding B tree for each target hash partition in at least one target hash partition; and executing target data operation on the data in the database based on each target hash partition and the corresponding B tree. According to the method and the device, the target index information is subjected to Hash partition, and each Hash partition is provided with one B tree, so that the data can be executed on the B tree corresponding to the corresponding target Hash partition in the process of executing data updating processing, and the data of different target Hash partitions are updated at the same time, so that the problem of a traditional data serial index mode is solved, the efficiency of target data operation is improved, useless disk inquiry is reduced, and the technical problems of low efficiency of the existing database index operation and resource waste are solved.
Optionally, the partition unit is configured to: calculating the hash value of each target index information to obtain a plurality of target hash values; acquiring at least one preset initial hash partition; and determining an initial hash partition to which each target hash value belongs to obtain at least one target hash partition.
Optionally, the partition unit is further configured to: acquiring the number of the at least one initial hash partition to obtain a target number; performing remainder calculation on each target hash value based on the target quantity to obtain a first calculation result; and determining an initial hash partition to which each target hash value belongs based on the first calculation result.
Optionally, each initial hash partition contains corresponding sequence flag information, and the partition unit is further configured to: and determining target sequence mark information which is the same as each first calculation result in at least one sequence mark information, and determining an initial hash partition corresponding to the target sequence mark information as the initial hash partition to which each target hash value belongs.
Optionally, the first operation unit is configured to: when the target data is operated for data writing operation, determining data index information of data to be written, and calculating a hash value of the data index information of the data to be written to obtain a first hash value; determining a hash partition to which the first hash value belongs from the at least one target hash partition to obtain a first hash partition; and establishing a corresponding B tree node on a B tree corresponding to the first hash partition, wherein the B tree node comprises at least one of data index information of the data to be written and storage position information of the data to be written in a database.
Optionally, the first operation unit is further configured to: performing remainder calculation on the first hash value based on the number of the at least one target hash partition to obtain a second calculation result; and determining a hash partition to which the first hash value belongs based on the second calculation result to obtain the first hash partition.
Optionally, the first operation unit is further configured to: when the target data operation is a data query operation, determining data index information of data to be queried, and calculating a hash value of the data index information of the data to be queried to obtain a second hash value; determining a hash partition to which the second hash value belongs in the at least one target hash partition to obtain a second hash partition; determining a B tree corresponding to the second hash partition; and traversing and searching the B tree corresponding to the second hash partition to obtain a storage position of the data to be queried in a database, and reading the data to be queried according to the storage position.
Optionally, the first operation unit is further configured to: performing remainder calculation on the second hash value based on the number of the at least one target hash partition to obtain a third calculation result; and determining the hash partition to which the second hash value belongs based on the third calculation result to obtain the second hash partition.
Optionally, the first operation unit is further configured to: and before the target data operation is executed on the data in the database based on each target hash partition and the corresponding B tree, controlling the B tree corresponding to the target hash partition corresponding to the target data operation to be in a locking state.
The data processing device provided in the embodiment of the present invention has the same implementation principle and technical effects as those of the previous method embodiment, and for brevity, reference may be made to the corresponding content in the previous method embodiment for the part of the device embodiment that is not mentioned.
Embodiment III:
The embodiment of the application also provides an electronic device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor executes the computer program to realize the steps of the method in the first embodiment.
Referring to fig. 4, an embodiment of the present invention further provides an electronic device 100, including: a processor 90, a memory 91, a bus 92 and a communication interface 93, said processor 90, communication interface 93 and memory 91 being connected by bus 92; the processor 90 is arranged to execute executable modules, such as computer programs, stored in the memory 91.
The memory 91 may include a high-speed random access memory (RAM, random Access Memory), and may further include a non-volatile memory (non-volatile memory), such as at least one disk memory. The communication connection between the system network element and the at least one other network element is implemented via at least one communication interface 93 (which may be wired or wireless), and may use the internet, a wide area network, a local network, a metropolitan area network, etc.
Bus 92 may be an ISA bus, a PCI bus, an EISA bus, or the like. The buses may be classified as address buses, data buses, control buses, etc. For ease of illustration, only one bi-directional arrow is shown in FIG. 4, but not only one bus or type of bus.
The memory 91 is configured to store a program, and the processor 90 executes the program after receiving an execution instruction, and the method executed by the apparatus for flow defining disclosed in any of the foregoing embodiments of the present invention may be applied to the processor 90 or implemented by the processor 90.
The processor 90 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuitry in hardware or instructions in software in processor 90. The processor 90 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU for short), a network processor (Network Processor, NP for short), etc.; but may also be a digital signal processor (DIGITAL SIGNAL Processing, DSP), application SPECIFIC INTEGRATED Circuit (ASIC), off-the-shelf Programmable gate array (Field-Programmable GATE ARRAY, FPGA) or other Programmable logic device, discrete gate or transistor logic device, discrete hardware components. The disclosed methods, steps, and logic blocks in the embodiments of the present invention may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be embodied directly in the execution of a hardware decoding processor, or in the execution of a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in the memory 91 and the processor 90 reads the information in the memory 91 and in combination with its hardware performs the steps of the method described above.
In addition, in the description of embodiments of the present invention, unless explicitly stated and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Embodiments of the present application provide a computer readable medium having a processor executable non-volatile program code, including a computer readable storage medium storing the processor executable non-volatile program code, where the program code includes instructions for executing the method described in the foregoing method embodiments, and specific implementation may be referred to the method embodiments and will not be described herein.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be other manners of division in actual implementation, and for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, device or unit indirect coupling or communication connection, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer readable storage medium executable by a processor. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Finally, it should be noted that: the above examples are only specific embodiments of the present invention, and are not intended to limit the scope of the present invention, but it should be understood by those skilled in the art that the present invention is not limited thereto, and that the present invention is described in detail with reference to the foregoing examples: any person skilled in the art may modify or easily conceive of the technical solution described in the foregoing embodiments, or perform equivalent substitution of some of the technical features, while remaining within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (11)

1. A method of data processing, comprising:
acquiring a plurality of target index information which are preset and used for indexing data in a database;
Performing hash partition operation on the plurality of target index information to obtain at least one target hash partition; each target hash partition comprises a hash value of corresponding target index information;
Setting a corresponding B tree for each target hash partition in the at least one target hash partition;
Performing target data operation on the data in the database based on each target hash partition and the corresponding B tree thereof;
Setting a corresponding B-tree for each of the at least one target hash partition includes:
Constructing a target hash chain table according to the at least one target hash partition; the target hash chain table comprises B tree pointers which are correspondingly arranged for the at least one target hash partition, wherein the B tree pointers are used for determining B trees corresponding to the at least one target hash partition;
Before said performing a target data operation on data in a database based on each of said target hash partitions and its corresponding B-tree, said method further comprises:
and controlling the B tree corresponding to the target hash partition corresponding to the target data operation to be in a locking state.
2. The method of claim 1, wherein performing a hash partition operation on the plurality of target index information to obtain at least one target hash partition comprises:
Calculating the hash value of each target index information to obtain a plurality of target hash values;
acquiring at least one preset initial hash partition;
And determining an initial hash partition to which each target hash value belongs to obtain at least one target hash partition.
3. The method of claim 2, wherein determining the initial hash partition to which each of the target hash values belongs comprises:
acquiring the number of the at least one initial hash partition to obtain a target number;
performing remainder calculation on each target hash value based on the target quantity to obtain a first calculation result;
and determining an initial hash partition to which each target hash value belongs based on the first calculation result.
4. The method of claim 3, wherein each initial hash partition contains corresponding order marker information;
Determining, based on the first calculation result, an initial hash partition to which each target hash value belongs includes:
And determining target sequence mark information which is the same as each first calculation result in at least one sequence mark information, and determining an initial hash partition corresponding to the target sequence mark information as the initial hash partition to which each target hash value belongs.
5. The method of claim 1, wherein performing a target data operation on data in a database based on each of the target hash partitions and its corresponding B-tree comprises:
when the target data is operated for data writing operation, determining data index information of data to be written, and calculating a hash value of the data index information of the data to be written to obtain a first hash value;
determining a hash partition to which the first hash value belongs from the at least one target hash partition to obtain a first hash partition;
and establishing a corresponding B tree node on a B tree corresponding to the first hash partition, wherein the B tree node comprises at least one of data index information of the data to be written and storage position information of the data to be written in a database.
6. The method of claim 5, wherein determining, among the at least one target hash partition, a hash partition to which the first hash value belongs, the first hash partition comprising:
Performing remainder calculation on the first hash value based on the number of the at least one target hash partition to obtain a second calculation result;
And determining a hash partition to which the first hash value belongs based on the second calculation result to obtain the first hash partition.
7. The method of claim 1, wherein performing a target data operation on data in a database based on each of the target hash partitions and its corresponding B-tree further comprises:
When the target data operation is a data query operation, determining data index information of data to be queried, and calculating a hash value of the data index information of the data to be queried to obtain a second hash value;
Determining a hash partition to which the second hash value belongs in the at least one target hash partition to obtain a second hash partition;
determining a B tree corresponding to the second hash partition;
And traversing and searching the B tree corresponding to the second hash partition to obtain a storage position of the data to be queried in a database, and reading the data to be queried according to the storage position.
8. The method of claim 7, wherein determining, in the at least one target hash partition, a hash partition to which the second hash value belongs, the second hash partition comprising:
performing remainder calculation on the second hash value based on the number of the at least one target hash partition to obtain a third calculation result;
and determining the hash partition to which the second hash value belongs based on the third calculation result to obtain the second hash partition.
9. A data processing apparatus, comprising:
The acquisition unit is used for acquiring a plurality of target index information which are preset and used for indexing the data in the database;
The partition unit is used for executing hash partition operation on the plurality of target index information to obtain at least one target hash partition; each target hash partition comprises a hash value of corresponding target index information;
A building unit, configured to set a corresponding B-tree for each target hash partition in the at least one target hash partition;
The operation unit is used for controlling the B tree corresponding to the target hash partition corresponding to the target data operation to be in a locking state; performing target data operation on the data in the database based on each target hash partition and the corresponding B tree thereof;
The construction unit is further configured to: constructing a target hash chain table according to the at least one target hash partition; the target hash chain table comprises B tree pointers which are correspondingly arranged for the at least one target hash partition, and the B tree pointers are used for determining B trees corresponding to the at least one target hash partition.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any of the preceding claims 1 to 8 when the computer program is executed.
11. A computer readable medium having non-volatile program code executable by a processor, the program code causing the processor to perform the steps of the method of any one of the preceding claims 1 to 8.
CN202010728918.3A 2020-07-24 2020-07-24 Data processing method, device, electronic equipment and computer readable medium Active CN111858607B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010728918.3A CN111858607B (en) 2020-07-24 2020-07-24 Data processing method, device, electronic equipment and computer readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010728918.3A CN111858607B (en) 2020-07-24 2020-07-24 Data processing method, device, electronic equipment and computer readable medium

Publications (2)

Publication Number Publication Date
CN111858607A CN111858607A (en) 2020-10-30
CN111858607B true CN111858607B (en) 2024-10-25

Family

ID=72947086

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010728918.3A Active CN111858607B (en) 2020-07-24 2020-07-24 Data processing method, device, electronic equipment and computer readable medium

Country Status (1)

Country Link
CN (1) CN111858607B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112925783B (en) * 2021-03-26 2025-01-21 北京金山云网络技术有限公司 Business data processing method and device, electronic device and storage medium
CN113672524B (en) * 2021-08-20 2024-07-02 上海哔哩哔哩科技有限公司 Data processing method and system based on multi-level cache
CN114416384A (en) * 2021-12-23 2022-04-29 北京达佳互联信息技术有限公司 A data processing method, device, electronic device and storage medium
CN114780541B (en) * 2022-04-01 2024-04-12 港珠澳大桥管理局 Data partitioning method, device, equipment and medium in micro batch flow processing system
CN116795863A (en) * 2023-06-01 2023-09-22 北京人大金仓信息技术股份有限公司 Database query statement processing methods, storage media and computer equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101677009A (en) * 2008-09-17 2010-03-24 索尼株式会社 Information processing system, information processing method, and program
CN111061680A (en) * 2018-10-15 2020-04-24 北京京东尚科信息技术有限公司 Data retrieval method and device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103412917B (en) * 2013-08-08 2016-08-10 广西大学 The Database Systems of a kind of extendible polymorphic type FIELD Data coordinated management and management method
CN104391863A (en) * 2014-10-23 2015-03-04 中国建设银行股份有限公司 Data storage method and device
US20160378824A1 (en) * 2015-06-24 2016-12-29 Futurewei Technologies, Inc. Systems and Methods for Parallelizing Hash-based Operators in SMP Databases
CN105117417B (en) * 2015-07-30 2018-04-17 西安交通大学 A kind of memory database Trie tree indexing means for reading optimization
CN107784110B (en) * 2017-11-03 2020-07-03 北京锐安科技有限公司 A kind of index establishment method and apparatus
CN109902092B (en) * 2019-02-22 2020-05-05 广州荔支网络技术有限公司 Operation method and device of data storage system and mobile terminal
CN110083601B (en) * 2019-04-04 2021-11-30 中国科学院计算技术研究所 Key value storage system-oriented index tree construction method and system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101677009A (en) * 2008-09-17 2010-03-24 索尼株式会社 Information processing system, information processing method, and program
CN111061680A (en) * 2018-10-15 2020-04-24 北京京东尚科信息技术有限公司 Data retrieval method and device

Also Published As

Publication number Publication date
CN111858607A (en) 2020-10-30

Similar Documents

Publication Publication Date Title
CN111858607B (en) Data processing method, device, electronic equipment and computer readable medium
CN110321344B (en) Information query method and device for associated data, computer equipment and storage medium
CN108089893B (en) Method and device for determining redundant resources, terminal equipment and storage medium
US6546394B1 (en) Database system having logical row identifiers
CN103514201B (en) Method and device for querying data in non-relational database
US20160103858A1 (en) Data management system comprising a trie data structure, integrated circuits and methods therefor
KR101678149B1 (en) Data searching method of database, apparatus and computer program for the same
CN108205571B (en) Key value data table connection method and device
CN114490737B (en) A method and terminal for improving database deep paging query efficiency
CN116578239A (en) Method, electronic device and storage medium for partitioning memory
CN111858606B (en) Data processing method, device and electronic device
CN108984615B (en) Data query method and system and storage medium
CN117909301B (en) Index-based object query method, device, equipment and medium
CN113946584A (en) A QRB Tree Index Method for Massive Vector Data Retrieval
KR102354343B1 (en) Spatial indexing method and apparatus for blockchain-based geospatial data
CN104537016B (en) A kind of method and device of determining file place subregion
CN111666302A (en) User ranking query method, device, equipment and storage medium
CN116955341A (en) Database integrity evaluation method, system and application thereof
CN116880780A (en) Tree data writing method, device, machine-readable medium and memory
JP2020135530A (en) Data management equipment, data retrieval methods and programs
CN116401245A (en) A data index construction method and system
CN113360551B (en) Method and system for storing and rapidly counting time sequence data in shooting range
CN115455207A (en) Reference relation retrieval method and device, electronic equipment and storage medium
CN117743316A (en) Data processing method and device, storage medium and electronic equipment
CN113742344A (en) Method and device for indexing power system data

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载