CN112000688A - Query method and query system based on universal query language - Google Patents

Info

Publication number
CN112000688A
CN112000688A CN202010816565.2A CN202010816565A CN112000688A CN 112000688 A CN112000688 A CN 112000688A CN 202010816565 A CN202010816565 A CN 202010816565A CN 112000688 A CN112000688 A CN 112000688A
Authority
CN
China
Prior art keywords
query
engine
optimization
query engine
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010816565.2A
Other languages
Chinese (zh)
Inventor
郭祖凯
秦建伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Shuyun Information Technology Co ltd
Original Assignee
Hangzhou Shuyun Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a query method and a query system based on a universal query language, and belongs to the technical field of query engines. The method comprises the following steps: converting the query content into a relational algebra expression and analyzing the expression to obtain its query features; matching the features of each query engine against the query features, calculating the cost each query engine would incur to satisfy the query features, and selecting the query engine with the minimum cost; optimizing the relational algebra expression using the query optimization rules bound to the selected query engine, and rewriting the affected query features; calling the adaptation service of the specific query engine on the rewritten query features, converting them into the query language of the target engine, and submitting the target-language query to the query engine to obtain a query result; and writing the query result into an external storage module and notifying the user that the query has finished, whereupon the user obtains the query result through an access connection.

Description

Query method and query system based on universal query language
Technical Field
The invention belongs to the technical field of query engines, and particularly relates to a query method and a query system based on a universal query language.
Background
With the rapid development of big data technology, traditional data query methods can no longer meet the performance requirements of querying massive data sets. Various data query engines based on big data architectures have therefore been developed, such as Elasticsearch, Hive, and NoSQL stores.
Specialized big data query engines play an important role in improving query performance on big data and have solved the problem to a certain extent. However, different query engines focus on different problems, so each has its advantages in different scenarios. Each existing engine has its own method of optimizing query performance and its own focus; as a result, a single query engine can deliver optimal performance only in the scenarios it targets, and its optimization effect is limited.
Disclosure of Invention
The present invention aims to solve the above technical problems, and provides a query method and a query system based on a universal query language.
In order to achieve the purpose, the invention adopts the following technical scheme:
A query method based on a universal query language comprises the following steps:
S1, parsing the structured query, converting the query content into a relational algebra expression, and analyzing the relational algebra expression to obtain its query features;
S2, matching the features of each query engine against the query features of the relational algebra expression, calculating the cost each query engine would incur to satisfy the query features, and selecting the query engine with the minimum cost;
S3, after the query engine is selected, optimizing the relational algebra expression using the query optimization rules bound to that query engine and, where the storage structure of the actual query engine differs from the view structure seen by the user, rewriting the affected query features so that the user's query requirement is expressed correctly;
S4, calling the adaptation service of the specific query engine on the rewritten query features, converting the query features into the query language of the target engine, and submitting the target-language query to the query engine to obtain a query result;
S5, writing the query result into an external storage module and notifying the user that the query has finished, whereupon the user obtains the query result through an access connection.
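The five steps above can be read as a pipeline. The following is a minimal sketch under stated assumptions, not the implementation described by the invention: the names `Engine`, `extract_features`, and `run_query`, the nested-dictionary form of the relational algebra expression, and the per-feature cost table are all hypothetical, and step S3 is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Engine:
    """Hypothetical engine descriptor: a name, per-feature costs, and an adapter."""
    name: str
    feature_cost: Dict[str, float]        # assumed cost of serving each query feature
    translate: Callable[[dict], str]      # S4: algebra -> engine-specific query text
    execute: Callable[[str], List[dict]]  # S4: run the translated query


def extract_features(algebra: dict) -> Set[str]:
    """S1 (stub): collect operator names from a nested relational-algebra dict."""
    features = {algebra["op"]}
    for child in algebra.get("inputs", []):
        features |= extract_features(child)
    return features


def run_query(algebra: dict, engines: List[Engine],
              store: Dict[str, List[dict]], query_id: str) -> str:
    features = extract_features(algebra)                    # S1

    def cost(engine: Engine) -> float:
        # S2: sum of per-feature costs; an unsupported feature costs infinity.
        return sum(engine.feature_cost.get(f, float("inf")) for f in features)

    engine = min(engines, key=cost)                         # S2: cheapest engine
    # S3 (engine-bound optimization and rewriting) is omitted in this sketch.
    target_query = engine.translate(algebra)                # S4: convert
    store[query_id] = engine.execute(target_query)          # S5: write the result
    return engine.name                                      # caller can now notify the user
```

Here `store` stands in for the external storage module, and the returned engine name is only a convenience for inspecting which engine was chosen.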
Preferably, in step S2, the query engine is selected as follows: the selection is divided into a plurality of stages; at each stage, the query cost of each query engine for that stage is determined from the query features and the query engine features; a query engine proceeds to the next stage when its cost is controllable and is excluded from the next stage when its cost is uncontrollable; and after all stages have been calculated, the query engine with the minimum total cost is selected as the target query engine.
Preferably, the query engine selection method further comprises: after all stages have been calculated, backtracking over and comparing the costs of the query engines to select the optimal query engine.
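The staged selection can be illustrated as follows. This is a sketch only: the source does not say how "controllable" is decided, so the `budget` threshold is an assumption, as are the function name `select_engine` and the list-of-stage-costs input.

```python
from typing import Dict, List, Optional


def select_engine(stage_costs: Dict[str, List[float]], budget: float) -> Optional[str]:
    """Pick the engine with the smallest total cost across all stages.

    stage_costs maps an engine name to its cost at each stage. An engine is
    dropped as soon as one stage cost exceeds `budget` (treated here as the
    "uncontrollable" case; the threshold is an assumption of this sketch).
    """
    totals: Dict[str, float] = {}
    for name, costs in stage_costs.items():
        total = 0.0
        for cost in costs:
            if cost > budget:       # cost not controllable: stop evaluating this engine
                break
            total += cost
        else:
            totals[name] = total    # the engine survived every stage
    if not totals:
        return None                 # no engine passed all stages
    # Final comparison over all surviving engines.
    return min(totals, key=totals.get)
```

For example, with stage costs `{"A": [8, 5], "B": [22, 1], "C": [1, 100]}` and a budget of 50, engine C is dropped at its second stage and A wins with a total of 13 against 23 for B.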
Preferably, in step S3, the optimization and rewriting method comprises the following steps:
A1. loading the optimization rules, which are bound to actual query engines, so that the applicable optimization rules are determined once the actual query engine has been selected;
A2. receiving the query features, matching them against the optimization rules, and, when an optimization rule applies to the query features, using it to optimize and modify them;
A3. outputting the optimized query features once they are considered optimal.
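Steps A1 to A3 resemble a conventional rule-application loop that runs until no rule fires. The sketch below assumes that reading; the rule `merge_filters`, the registry `RULES_BY_ENGINE`, the engine key `engine_a`, and the dictionary plan format are hypothetical, and "considered optimal" is approximated as "no rule changed the plan in a full pass".

```python
from typing import Callable, Dict, List, Optional

# A rule returns a rewritten plan, or None when it does not apply.
Rule = Callable[[dict], Optional[dict]]


def merge_filters(plan: dict) -> Optional[dict]:
    """Example rule: collapse filter(filter(x)) at the plan root into one filter."""
    if plan["op"] == "filter" and plan["input"]["op"] == "filter":
        inner = plan["input"]
        return {"op": "filter",
                "preds": inner["preds"] + plan["preds"],
                "input": inner["input"]}
    return None


# A1: rules are bound to engines; the engine name here is illustrative.
RULES_BY_ENGINE: Dict[str, List[Rule]] = {"engine_a": [merge_filters]}


def optimize(plan: dict, engine: str, max_passes: int = 10) -> dict:
    rules = RULES_BY_ENGINE.get(engine, [])   # A1: load the rules for the chosen engine
    for _ in range(max_passes):
        changed = False
        for rule in rules:                    # A2: match each rule and apply it if it fits
            rewritten = rule(plan)
            if rewritten is not None:
                plan, changed = rewritten, True
        if not changed:                       # A3: nothing left to improve
            break
    return plan
```

The `max_passes` bound is a safety limit for rule sets that could otherwise rewrite each other's output indefinitely.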
The invention also provides a query system based on the universal query language, comprising: a query analysis module, which analyzes the query features in combination with the query engine features using a tree-based multi-stage cost calculation and backtracking algorithm, calculates the cost of each query engine for the query features at each stage, and, after all stages have been calculated, backtracks over and compares the costs of the query engines to select the optimal query engine; a query optimization rewriting module, which performs query optimization and rewriting on the relational algebra expression; and a query language conversion module, which converts the universal query language into the specific query language of each query engine.
Preferably, the query system further includes an analysis rule configuration module, which is used to configure the optimization rules bound to each query engine. After the query analysis module selects the optimal query engine, the query optimization rewriting module loads from the analysis rule configuration module the optimization rules bound to the actual query engine and matches the query features against them. When an optimization rule applies to the query features, the query optimization rewriting module uses it to optimize and modify them, and once the module considers the query features optimal, it outputs the optimized query features.
Preferably, the query analysis module includes a query router and a query engine analysis module. The query router propagates the data query features of a previous node through the network to a destination node, and the node receiving the query features sends the matching data back to the original node. The query engine analysis module calculates the query cost of each query engine for the stage from the previous node to the destination node; a query engine proceeds to the next stage when its cost is controllable and is excluded from the next stage when its cost is uncontrollable, and after all stages have been calculated, the query engine with the minimum total cost is selected as the target query engine.
After the technical scheme is adopted, the invention has the following advantages:
The invention provides a query method and a query system that serve multiple query engines through a universal query language, so that users can use different query engines without mastering the query language of each one. The characteristics of the various query engines are integrated to provide more comprehensive data query and application capabilities, and service possibility is guaranteed. The impact on users of introducing a new storage engine is removed: users need not attend to the details of the new engine, which lowers the difficulty of adopting new storage technology. The company's core data is managed effectively, and users need not consider how the data is stored. Different query engines are assigned to users with different permissions, so that user data access rights are controlled and data security is guaranteed.
Drawings
FIG. 1 is a flowchart illustrating the steps of a query method based on a universal query language according to the present invention;
FIG. 2 is a schematic structural diagram of a query system based on a universal query language according to the present invention;
in the figure:
1-query analysis module; 101-query router; 102-query engine analysis module; 2-analysis rule configuration module; 3-query optimization rewriting module; 4-query language conversion module.
Detailed Description
The present invention will be described in further detail with reference to the following drawings and specific examples.
A query method based on a universal query language comprises the following steps:
S1, parsing the structured query, converting the query content into a relational algebra expression, and analyzing the relational algebra expression to obtain its query features;
S2, matching the features of each query engine against the query features of the relational algebra expression, calculating the cost each query engine would incur to satisfy the query features, and selecting the query engine with the minimum cost;
S3, after the query engine is selected, optimizing the relational algebra expression using the query optimization rules bound to that query engine and, where the storage structure of the actual query engine differs from the view structure seen by the user, rewriting the affected query features so that the user's query requirement is expressed correctly;
S4, calling the adaptation service of the specific query engine on the rewritten query features, converting the query features into the query language of the target engine, and submitting the target-language query to the query engine to obtain a query result;
S5, writing the query result into an external storage module and notifying the user that the query has finished, whereupon the user obtains the query result through an access connection.
In step S2, the query engine is selected as follows: the selection is divided into a plurality of stages; at each stage, the query cost of each query engine for that stage is determined from the query features and the query engine features; a query engine proceeds to the next stage when its cost is controllable and is excluded from the next stage when its cost is uncontrollable; after all stages have been calculated, the costs of the query engines are backtracked over and compared, and the query engine with the minimum total cost is selected as the target query engine.
In step S3, the optimization and rewriting method comprises the following steps:
A1. loading the optimization rules, which are bound to actual query engines, so that the applicable optimization rules are determined once the actual query engine has been selected;
A2. receiving the query features, matching them against the optimization rules, and, when an optimization rule applies to the query features, using it to optimize and modify them;
A3. outputting the optimized query features once they are considered optimal.
The invention also provides a query system based on the universal query language, which comprises: a query analysis module 1, an analysis rule configuration module 2, a query optimization rewriting module 3 and a query language conversion module 4.
The query analysis module 1 comprises a query router 101 and a query engine analysis module 102.
The query router 101 propagates a data query message from one node through the network to the destination node, and the node receiving the message sends the data matching it back to the original node. Such queries are typically described in a natural language or a high-level language, so the query language conversion module 4 must convert the universal query language into the specific query language of each storage engine.
The query engine analysis module 102 calculates the query cost of each query engine for the stage from the previous node to the destination node. A query engine proceeds to the next stage when its cost is controllable and is excluded from the next stage when its cost is uncontrollable. After all stages have been calculated, the query engine with the minimum total cost is selected as the target query engine.
In the optimal selection algorithm, the query engine analysis module 102 analyzes and calculates the optimal query engine using configured weighted collaborative feature tree paths. The cost of a query engine at a given stage is determined jointly by the query features and the features of that query engine. Suppose that at some stage k the query features are represented by a node with value n, that the corresponding node in the feature tree of candidate query engine A has value n+8, and that the corresponding node in the feature tree of candidate query engine B has value n+22. The cost of this stage is then 8 for A and 22 for B, so A has the advantage at this stage.
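The stage-k example can be reproduced numerically. The sketch below assumes that the stage cost is the distance between the engine's node value and the query's node value, which matches the figures 8 and 22 in the text but is not stated as a general formula in the source; the function names and the sample value of `n` are illustrative.

```python
from typing import Dict


def stage_cost(query_node: int, engine_node: int) -> int:
    """Cost of one stage, taken here as the distance between node values (assumed)."""
    return abs(engine_node - query_node)


def cheapest_at_stage(query_node: int, engine_nodes: Dict[str, int]) -> str:
    """Return the engine whose feature-tree node is closest to the query node."""
    costs = {name: stage_cost(query_node, value) for name, value in engine_nodes.items()}
    return min(costs, key=costs.get)


n = 100  # arbitrary value for the query-feature node at stage k
print(stage_cost(n, n + 8))                              # 8  (engine A)
print(stage_cost(n, n + 22))                             # 22 (engine B)
print(cheapest_at_stage(n, {"A": n + 8, "B": n + 22}))   # A
```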
The query analysis module 1 analyzes the query features in combination with the query engine features using a tree-based multi-stage cost calculation and backtracking algorithm. It calculates the cost of each query engine for the query features at each stage and, after all stages have been calculated, backtracks over and compares the costs of the query engines to select the optimal query engine.
The analysis rule configuration module 2 is used to configure the optimization rules bound to each query engine. After the query analysis module 1 selects the optimal query engine, the query optimization rewriting module 3 loads from the analysis rule configuration module 2 the optimization rules bound to the actual query engine and matches the query features against them. When an optimization rule applies to the query features, the query optimization rewriting module 3 uses it to optimize and modify them, and once the module considers the query features optimal, it outputs the optimized query features.
The query optimization rewriting module 3 performs query optimization and rewriting on the relational algebra expression. The module obtains the relational algebra expression by parsing the structured query, and the expression correctly expresses the query intent. Query optimization improves query efficiency and reduces query time; query rewriting ensures that the query that actually reaches the query engine returns an accurate result.
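One simple case of such rewriting is mapping the column names a user sees in a view onto the names used in the engine's storage structure. The sketch below shows only that case and is an assumption about what rewriting might involve; the function `rewrite_columns`, the plan format, and the sample names are hypothetical.

```python
from typing import Dict


def rewrite_columns(plan: dict, view_to_storage: Dict[str, str]) -> dict:
    """Return a copy of the plan with user-facing column names replaced by storage names."""
    out = dict(plan)  # shallow copy so the caller's plan is left untouched
    if "columns" in out:
        # Names without a mapping are assumed to be identical in view and storage.
        out["columns"] = [view_to_storage.get(c, c) for c in out["columns"]]
    if "input" in out:
        out["input"] = rewrite_columns(out["input"], view_to_storage)
    return out
```

A real rewrite would also have to cover predicates, joins, and type differences; this sketch deliberately stops at column names.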
The query language conversion module 4 converts the universal query language into the specific query language of each query engine so that the query can be executed normally. The module maps the meaning of the relational algebra expression onto a specific query language context, which holds the query requirement expressed by the relational algebra expression. It then processes that context to build a data query request that the specific query engine can accept and execute, and submits the request to the database for querying.
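To make the conversion concrete, the sketch below renders one engine-neutral query description in two target forms: SQL text and an Elasticsearch-style search body built from `bool`, `filter`, and `term` clauses. The neutral dictionary format and the function names `to_sql` and `to_es_body` are assumptions for illustration; the source does not specify the internal representation.

```python
from typing import Any, Dict


def to_sql(q: Dict[str, Any]) -> str:
    """Render a single-table equality query as SQL text.

    Values are inlined without escaping, which is acceptable only for a sketch;
    real code should use parameter binding.
    """
    where = " AND ".join(f"{col} = '{val}'" for col, val in q["equals"].items())
    sql = f"SELECT {', '.join(q['columns'])} FROM {q['table']}"
    return f"{sql} WHERE {where}" if where else sql


def to_es_body(q: Dict[str, Any]) -> Dict[str, Any]:
    """Render the same query as an Elasticsearch-style search request body."""
    return {
        "_source": q["columns"],
        "query": {
            "bool": {"filter": [{"term": {col: val}} for col, val in q["equals"].items()]}
        },
    }


query = {"table": "orders", "columns": ["id", "amount"], "equals": {"status": "paid"}}
print(to_sql(query))      # SELECT id, amount FROM orders WHERE status = 'paid'
print(to_es_body(query))
```

Keeping one neutral description and one small translator per engine is what lets a new engine be added without changing what the user writes.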
Query engines include, but are not limited to: the Lucene-based search engine Elasticsearch, relational databases, the data warehouse tool Hive, non-relational (NoSQL) databases, and other query engines.
Corresponding to the query engines, the query results are written into the external storage module in forms including: Elasticsearch instances; MySQL and other types of SQL relational database instances; Hive instances; and MongoDB, bit, graph and other types of query engine instances.
Through Cost-Based Optimization (CBO) and Rule-Based Optimization (RBO), the invention can accurately optimize and rewrite the query content submitted by a user and quickly match it to a query engine.
The invention lets users use different query engines without mastering the query language of each one. The characteristics of the various query engines are integrated to provide more comprehensive data query and application capabilities, and service possibility is guaranteed. The impact on users of introducing a new storage engine is removed: users need not attend to the details of the new engine, which lowers the difficulty of adopting new storage technology. The company's core data is managed effectively, and users need not consider how the data is stored. Different query engines are assigned to users with different permissions, so that user data access rights are controlled and data security is guaranteed.
Embodiments of the present invention other than the preferred embodiments described above will be apparent to those skilled in the art, who can make various changes and modifications without departing from the spirit of the invention as defined in the appended claims.

Claims (7)

1. A query method based on a universal query language is characterized by comprising the following steps:
S1, parsing the structured query, converting the query content into a relational algebra expression, and analyzing the relational algebra expression to obtain its query features;
S2, matching the features of each query engine against the query features of the relational algebra expression, calculating the cost each query engine would incur to satisfy the query features, and selecting the query engine with the minimum cost;
S3, after the query engine is selected, optimizing the relational algebra expression using the query optimization rules bound to that query engine and, where the storage structure of the actual query engine differs from the view structure seen by the user, rewriting the affected query features so that the user's query requirement is expressed correctly;
S4, calling the adaptation service of the specific query engine on the rewritten query features, converting the query features into the query language of the target engine, and submitting the target-language query to the query engine to obtain a query result;
S5, writing the query result into an external storage module and notifying the user that the query has finished, whereupon the user obtains the query result through an access connection.
2. The query method based on the universal query language according to claim 1, wherein in step S2 the query engine is selected as follows: the selection is divided into a plurality of stages; at each stage, the query cost of each query engine for that stage is determined from the query features and the query engine features; a query engine proceeds to the next stage when its cost is controllable and is excluded from the next stage when its cost is uncontrollable; and after all stages have been calculated, the query engine with the minimum total cost is selected as the target query engine.
3. The query method based on the universal query language according to claim 2, wherein the query engine selection method further comprises: after all stages have been calculated, backtracking over and comparing the costs of the query engines to select the optimal query engine.
4. The query method based on the universal query language according to claim 1, wherein in step S3 the optimization and rewriting method comprises the following steps:
A1. loading the optimization rules, which are bound to actual query engines, so that the applicable optimization rules are determined once the actual query engine has been selected;
A2. receiving the query features, matching them against the optimization rules, and, when an optimization rule applies to the query features, using it to optimize and modify them;
A3. outputting the optimized query features once they are considered optimal.
5. A query system based on a universal query language, comprising:
the query analysis module (1) analyzes the query characteristics and combines the query engine characteristics through a tree-based multi-stage cost calculation and backtracking algorithm, calculates and obtains the cost of the query engine at a certain stage for the query characteristics, and backtracks and compares the cost of each query engine after the calculation at all stages is finished so as to select an optimal query engine;
the query optimization rewriting module (3), the query optimization rewriting module (3) realizes query optimization and rewriting through a relational algebra expression;
a query language conversion module (4), the query language conversion module (4) being configured to convert a generic query language into a respective query engine specific query language.
6. The query system based on the universal query language according to claim 5, further comprising an analysis rule configuration module (2), wherein the analysis rule configuration module (2) is used to configure the optimization rules bound to each query engine; after the query analysis module (1) selects the optimal query engine, the query optimization rewriting module (3) loads from the analysis rule configuration module (2) the optimization rules bound to the actual query engine and matches the query features against them; when an optimization rule applies to the query features, the query optimization rewriting module (3) uses it to optimize and modify them; and once the query optimization rewriting module (3) considers the query features optimal, it outputs the optimized query features.
7. The query system based on the universal query language according to claim 6, wherein the query analysis module (1) comprises a query router (101) and a query engine analysis module (102); the query router (101) propagates the data query features of a previous node through the network to a destination node, and the node receiving the query features sends the matching data back to the original node; the query engine analysis module (102) calculates the query cost of each query engine for the stage from the previous node to the destination node; a query engine proceeds to the next stage when its cost is controllable and is excluded from the next stage when its cost is uncontrollable; and after all stages have been calculated, the query engine with the minimum total cost is selected as the target query engine.
CN202010816565.2A 2020-08-14 2020-08-14 Query method and query system based on universal query language Pending CN112000688A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010816565.2A CN112000688A (en) 2020-08-14 2020-08-14 Query method and query system based on universal query language

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010816565.2A CN112000688A (en) 2020-08-14 2020-08-14 Query method and query system based on universal query language

Publications (1)

Publication Number Publication Date
CN112000688A true CN112000688A (en) 2020-11-27

Family

ID=73472421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010816565.2A Pending CN112000688A (en) 2020-08-14 2020-08-14 Query method and query system based on universal query language

Country Status (1)

Country Link
CN (1) CN112000688A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113742359A (en) * 2021-01-19 2021-12-03 北京沃东天骏信息技术有限公司 Method and device for inquiring presence, electronic equipment and storage medium
CN114238308A (en) * 2021-10-14 2022-03-25 多点生活(成都)科技有限公司 Cross perspective table generation method and device, electronic equipment and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104750690A (en) * 2013-12-25 2015-07-01 中国移动通信集团公司 Query processing method, device and system
CN105824957A (en) * 2016-03-30 2016-08-03 电子科技大学 Query engine system and query method of distributive memory column-oriented database
CN109086376A (en) * 2018-07-24 2018-12-25 北京大学 More querying methods and device based on SPARQL query language
CN110110165A (en) * 2019-04-01 2019-08-09 跬云(上海)信息科技有限公司 Dynamic routing method and device for query engine in precomputation system
CN110750560A (en) * 2019-10-25 2020-02-04 东北大学 A system and method for optimizing network multi-connection

Similar Documents

Publication Publication Date Title
CN111597209B (en) Database materialized view construction system, method and system creation method
CN110377715A (en) Reasoning type accurate intelligent answering method based on legal knowledge map
CN115062070A (en) A question-and-answer-based text table data query method
CN118012900A (en) A natural language intelligent query method and device based on multi-agent interaction
CN104636478A (en) Information query method and device
CN114817307B (en) Less-sample NL2SQL method based on semi-supervised learning and meta-learning
CN109857846B (en) Method and device for matching user question and knowledge point
CN110008308B (en) Method and device for supplementing information for user question
CN117667991A (en) Structured query language generation method, verification method and device
CN118484516B (en) Industry large model-oriented multi-level theme type search enhancement generation method and system
CN118227653A (en) Method for converting full-link natural language into structured query language
CN112000688A (en) Query method and query system based on universal query language
CN118761417A (en) A method for improving large model knowledge question answering using triple proofreading mechanism
CN118503273B (en) Text-to-SQL conversion method and system based on large pre-training model
CN120045686A (en) Knowledge graph-based interactive intelligent analysis method, device and medium
CN119046313B (en) Query statement generation method based on relational graph
CN120144606A (en) Model training method, device and data processing method
CN119671317A (en) A new energy project intelligent planning method and system based on large language model
CN119903071A (en) A processing method, device, equipment and medium based on database query
CN119025552A (en) A natural language intelligent number asking method, device and storage medium
CN115617954B (en) Question answering method, device, electronic device and storage medium
CN117808923A (en) Image generation method, system, electronic device and readable storage medium
CN114328924B (en) Relation classification method based on pre-training model combined with syntactic subtree
CN115221198A (en) Data query method and device
CN119759598B (en) Intelligent scheduling method of computing network resources based on large model intention perception

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20201127
