CN114925093A

CN114925093A - Method and device for processing database query statement

Info

Publication number: CN114925093A
Application number: CN202210520530.3A
Authority: CN
Inventors: 潘毅; 唐铭豆; 余璜
Original assignee: Beijing Oceanbase Technology Co Ltd
Current assignee: Beijing Oceanbase Technology Co Ltd
Priority date: 2022-05-13
Filing date: 2022-05-13
Publication date: 2022-08-19

Abstract

The disclosure provides a method and a device for processing a database query statement. The method comprises: receiving a database query statement, wherein the database query statement comprises a main query and a sub-query, the query condition of the sub-query is a correlated condition of the main query, the main query comprises a filter condition for filtering based on the query result of the sub-query, and the partition table data of the main query and the sub-query are distributed on a plurality of servers; querying target data from the partition table data of the main query, wherein the target data comprises first data related to the main query and second data related to the sub-query; distributing the first data and the second data to a target server, wherein the target server can obtain all partition table data of the sub-query; executing, based on the second data, the query on the partition table data of the sub-query in parallel to obtain a first query result; and performing, according to the first query result, a filtering operation on the first data in parallel by using the filter condition.

Description

Method and device for processing database query statement

Technical Field

The present disclosure relates to the field of databases, and in particular to a method and a device for processing database query statements.

Background

When a distributed database executes a structured query language (SQL) statement containing a correlated sub-query, the traditional execution plan pulls the data involved in the main query and the sub-query back to the local node and evaluates the sub-query's filter condition there in a single thread, rescanning the sub-query once for every row of the main query. Such an execution plan is difficult to parallelize, so execution performance is poor.

Summary of the Invention

In view of this, the present disclosure provides a method and a device for processing database query statements, which can improve the execution performance of a distributed database when it executes SQL statements containing correlated sub-queries.

In a first aspect, a method for processing a database query statement is provided, comprising: receiving a database query statement, wherein the database query statement includes a main query and a sub-query, the query condition of the sub-query is a correlated condition of the main query, the main query includes a filter condition for filtering based on the query result of the sub-query, and the partition table data of the main query and the sub-query are distributed on multiple servers; querying target data from the partition table data of the main query, wherein the target data includes first data related to the main query and second data related to the sub-query; distributing the first data and the second data to a target server, wherein the target server can obtain all partition table data of the sub-query; executing, based on the second data, the query on the partition table data of the sub-query in parallel to obtain a first query result; and performing, according to the first query result, the filtering operation on the first data in parallel by using the filter condition.

Optionally, distributing the first data and the second data to the target server comprises: distributing the first data and the second data to the target server by adaptively selecting a hash or random distribution method.

Optionally, the method further comprises: collecting the distribution status of the first data and the second data; and determining, according to the distribution status, whether to use the hash or the random distribution method to distribute the first data and the second data to the target server.

Optionally, the target server being able to obtain all partition table data of the sub-query comprises: the target server obtaining all partition table data of the sub-query through a DISTRIBUTED TABLE SCAN operator.

In a second aspect, a device for processing a database query statement is provided, comprising: a receiving module, configured to receive a database query statement, wherein the database query statement includes a main query and a sub-query, the query condition of the sub-query is a correlated condition of the main query, the main query includes a filter condition for filtering based on the query result of the sub-query, and the partition table data of the main query and the sub-query are distributed on multiple servers; a query module, configured to query target data from the partition table data of the main query, wherein the target data includes first data related to the main query and second data related to the sub-query; a distribution module, configured to distribute the first data and the second data to a target server, wherein the target server can obtain all partition table data of the sub-query; the query module being further configured to execute, based on the second data, the query on the partition table data of the sub-query in parallel to obtain a first query result; and a filtering module, configured to perform, according to the first query result, the filtering operation on the first data in parallel by using the filter condition.

Optionally, the distribution module is further configured to distribute the first data and the second data to the target server by adaptively selecting a hash or random distribution method.

Optionally, the device further comprises: a collection module, configured to collect the distribution status of the first data and the second data; and a determination module, configured to determine, according to the distribution status, whether to use the hash or the random distribution method to distribute the first data and the second data to the target server.

Optionally, the target server obtains all table data of the sub-query through the DISTRIBUTED TABLE SCAN operator.

In a third aspect, a computer-readable storage medium is provided. The storage medium stores a computer program which, when executed, implements the method according to the first aspect.

In a fourth aspect, a computer program product is provided, comprising executable code which, when executed, implements the method according to the first aspect.

In the solution of the embodiments of the present disclosure, the first data related to the main query and the second data related to the sub-query are distributed to a target server, the target server being a server that can obtain all partition table data of the sub-query. Because the target server can execute tasks in parallel, it can execute the sub-query in parallel based on the second data and perform the filtering operation on the first data in parallel. That is, the filter condition of the sub-query can be evaluated in parallel, which improves the execution performance of the SQL statement.

Description of Drawings

FIG. 1 is an example system diagram of a distributed database provided by an embodiment of the present disclosure.

FIG. 2 is a schematic flowchart of a method for processing a database query statement provided by an embodiment of the present disclosure.

FIG. 3 is an example flowchart of a method for processing a database query statement provided by another embodiment of the present disclosure.

FIG. 4 is an example flowchart of a method for processing a database query statement provided by yet another embodiment of the present disclosure.

FIG. 5 is a schematic structural diagram of a device for processing a database query statement provided by an embodiment of the present disclosure.

FIG. 6 is a schematic structural diagram of a device for processing a database query statement provided by another embodiment of the present disclosure.

Detailed Description

The technical solutions in the embodiments of the present disclosure are described clearly and completely below. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure.

With the development of the Internet and the continuous surge in data volume, distributed databases are gradually replacing stand-alone databases. Read operations account for the majority of database operations, which in SQL means that data query statements are used frequently. In an SQL statement, a structure of the form "select-from-where" is called a query block. When one query block is nested inside another query block as its query condition, the nested block is called a sub-query.

An SQL statement can consist of a main query and a sub-query. The main query is also called the outer query, and the sub-query is also called the inner query. If the execution of the sub-query does not depend on the query result of the main query, the sub-query is a non-correlated sub-query. Non-correlated sub-queries are executed from the inside out, and each query needs to be executed only once. That is, the sub-query is executed first, its result is cached, and then the main query is executed.

If the execution of the sub-query depends on the query result of the main query, the sub-query is called a correlated sub-query. In SQL containing a correlated sub-query, the sub-query must be traversed in full once for each row processed by the main query. With this execution method, the execution time of an SQL statement containing a correlated sub-query grows with the number of rows in the main query. For correlated sub-queries with complex nesting levels, the execution time of the SQL grows exponentially.

Specifically, a correlated sub-query indicates that two queries are associated through some condition. The concepts related to correlated sub-queries are explained below using the following SQL statement.

SELECT T2.v2
FROM T2
WHERE T2.v2 = (SELECT Sum(v2)
               FROM T1
               WHERE T2.v1 < T1.v1);

The above SQL statement contains a main query and a sub-query. The main query is "SELECT T2.v2 FROM T2 WHERE T2.v2 = ()", and the sub-query is "(SELECT Sum(v2) FROM T1 WHERE T2.v1 < T1.v1)". The sub-query includes the expression "WHERE T2.v1 < T1.v1", which references the field v1 of the main query's table T2. That is, the query result on the sub-query's table T1 depends on the main query's table T2. Therefore, the sub-query is a correlated sub-query.

In addition, the main query includes a filter condition for filtering based on the query result of the sub-query. This filter condition may also be called the sub-query filter condition. In the above SQL statement, the filter condition is "WHERE T2.v2 = ()".

In a correlated sub-query, an expression that references the main query and thereby associates the sub-query with the main query is a correlated expression. For example, the expression "WHERE T2.v1 < T1.v1" in the above SQL associates the sub-query's table T1 with the main query's table T2, so it is the correlated expression of this correlated sub-query. A correlated expression is also called a correlated condition.
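To make the semantics of the example statement concrete, the following minimal sketch runs it against small in-memory tables using Python's standard `sqlite3` module. The sample rows and the helper name `run_example` are illustrative assumptions, SQLite is a single-node engine used here only to show the expected result, and the aggregate column is written as `T1.v2` to make the scoping explicit.

```python
import sqlite3


def run_example(t1_rows, t2_rows):
    """Run the example statement on in-memory tables and return the T2.v2 values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T1 (v1 INT, v2 INT)")
    conn.execute("CREATE TABLE T2 (v1 INT, v2 INT)")
    conn.executemany("INSERT INTO T1 VALUES (?, ?)", t1_rows)
    conn.executemany("INSERT INTO T2 VALUES (?, ?)", t2_rows)
    rows = conn.execute(
        "SELECT T2.v2 FROM T2 "
        # The correlated condition T2.v1 < T1.v1 ties the sub-query to the outer row.
        "WHERE T2.v2 = (SELECT SUM(T1.v2) FROM T1 WHERE T2.v1 < T1.v1) "
        "ORDER BY T2.v2"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


# Illustrative sample data (not from the source document).
T1_ROWS = [(1, 10), (2, 20), (3, 30)]
T2_ROWS = [(0, 60), (1, 50), (2, 99), (3, 0)]
RESULT = run_example(T1_ROWS, T2_ROWS)
```

With this data, the rows with v1 = 0 and v1 = 1 pass the filter. The row with v1 = 3 matches no T1 row, so the sub-query yields NULL and the comparison is not satisfied.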

The execution process of the correlated sub-query in the above SQL statement is described below:

Step 1: take a tuple (that is, a row) from the main query's table T2 and pass the value of its correlated column v1 (that is, T2.v1) to the sub-query.

Step 2: the sub-query uses the value of column v1 of table T2 as its condition and executes one complete sub-query on table T1 to obtain a query result, which it passes to the main query. One complete execution of the sub-query is called a loop.

Step 3: perform the filtering operation. The query result of the sub-query is compared with the value of column v2 of the tuple to determine whether the v2 value of table T2 satisfies the filter condition.

Step 4: take the next tuple from the main query's table T2 and repeat steps 1 to 3 until all tuples in the outer query's table T2 have been processed.
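The four steps above can be sketched as a nested loop. This is a minimal illustration, assuming tables are held as lists of (v1, v2) tuples; the function name `correlated_filter` is hypothetical and not part of the disclosure.

```python
def correlated_filter(t2_rows, t1_rows):
    """Return T2.v2 for each T2 row whose v2 equals SUM(T1.v2) over rows with T2.v1 < T1.v1."""
    result = []
    for t2_v1, t2_v2 in t2_rows:  # steps 1 and 4: take the next tuple from T2
        # Step 2: one complete pass over T1 (one "loop") using T2.v1 as the condition.
        matching = [v2 for v1, v2 in t1_rows if t2_v1 < v1]
        # SQL SUM over zero rows is NULL, modeled here as None.
        subquery_value = sum(matching) if matching else None
        # Step 3: compare the sub-query result with T2.v2.
        if subquery_value is not None and t2_v2 == subquery_value:
            result.append(t2_v2)
    return result
```

The inner scan over T1 runs once per T2 row, which is why execution time grows with the number of rows in the main query.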

When a database processes an SQL statement, it generates an execution plan for the statement. An execution plan is the execution path or algorithm of the SQL statement, and it affects the statement's execution performance. In some scenarios, the optimizer, for example, picks from many candidate execution plans an execution method that uses fewer resources and presents it.

For an SQL statement with a dependent sub-query (that is, a correlated sub-query), the database generates a sub-query execution plan (subplan). In a distributed database, the execution logic of a subplan is usually to first pull the data involved in the main query (such as T2.v2 and T2.v1 above) back to the local node, then execute a complete sub-query for each row of that data and pull the execution result back to the local node. The filter condition associated with the sub-query is then evaluated locally, which is also called sub-query filter condition calculation. Note that this filter condition is the SQL clause that filters on the sub-query result, for example the WHERE clause that matches the sub-query result against the main query.

The execution of an SQL statement containing a correlated sub-query in a distributed database is introduced below with reference to FIG. 1.

FIG. 1 is an example structural diagram of a distributed database provided by an embodiment of the present application. The distributed database 100 in FIG. 1 may include a user node (client node) 110, a coordinator node 120 and a computing node (compute node) 130.

The user node 110 may be, for example, a client such as a computer, a tablet computer or a mobile phone, and is used to send the SQL statement to be queried.

The coordinator node 120, also called the coordinator, interacts with the user node 110. It provides a query interface to the user node 110, receives the SQL statement sent by the user node 110, converts the SQL statement into multiple executable sub-query plans, and distributes them to the computing nodes 130.

The computing node 130 is responsible for receiving the query tasks issued by the coordinator node 120 and executing the sub-query plans. The computing node 130 has both computing and data storage functions. As an example, the distributed database may further include storage nodes for storing data. The present disclosure does not limit the structure of the distributed database.

Specifically, in the distributed database, an SQL statement is executed as follows: the coordinator node 120 converts the received SQL statement into multiple sub-query plans that can be executed in parallel on different computing nodes 130 and, based on information such as the CPU utilization and free memory size of each computing node, selects suitable computing nodes 130 to send them to. The computing nodes 130 execute the sub-query plans and return the results to the coordinator node 120. The coordinator node 120 then sends the computed result to the user node 110.
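The node selection mentioned above could look roughly like the following sketch. The source only states that CPU utilization, free memory size and similar information are considered. The concrete ranking rule (lowest CPU utilization first, then most free memory) and the names `NodeStats` and `pick_nodes` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    name: str
    cpu_utilization: float  # fraction between 0.0 and 1.0
    free_memory_mb: int


def pick_nodes(nodes, count):
    """Pick `count` nodes, preferring low CPU utilization, then more free memory (assumed rule)."""
    ranked = sorted(nodes, key=lambda n: (n.cpu_utilization, -n.free_memory_mb))
    return [n.name for n in ranked[:count]]
```

A real coordinator would likely weigh more signals, such as where the relevant partitions are stored.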

Note that the process in which a computing node 130 sends the sub-query execution result to the coordinator node 120 may be called pulling back to the local node. That is, the traditional sub-query execution plan pulls the query result of the main query and the query result of the sub-query back to the local node and evaluates the filter condition of the correlated sub-query there.

In some application scenarios, the computing nodes 130 may be deployed on multiple servers. For example, computing nodes 1 to N shown in FIG. 1 are deployed on N servers respectively. The coordinator node 120 can be deployed on only one server. In other words, any computing node 130 can serve as the coordinator node 120, but there can be only one coordinator node 120.

Note that SQL semantics require that, when the sub-query's filter condition is evaluated, the sub-query return only one loop result at a time. If the sub-query returns multiple loop results at once, the system reports an error. Because the traditional execution plan evaluates the sub-query filter condition locally in a single thread, execution efficiency is poor.

The concrete implementation of a sub-query execution plan in a distributed database is described below using SQL(1) in example (1).

The SQL(1) statements are as follows:

create table T1(v1 int, v2 int) partition by hash(v1) partitions 4;
create table T2(v1 int, v2 int);
create index T1_idx1 on T1(v2) local;

SELECT T2.v2
FROM T2
WHERE T2.v2 = (SELECT Sum(v2)
               FROM T1
               WHERE T2.v1 < T1.v1);

In the above SQL(1) statement, the partition table T2 of the main query and the partition table T1 of the sub-query are both distributed on multiple servers. The aggregate calculation of the sub-query (that is, the sum calculation) must be recomputed for every row of T2. The aggregate result is then matched against the v2 value of that row of T2. If the condition is satisfied, the row is returned as a query result. As an example, the sub-query filter condition in SQL(1) is "WHERE T2.v2 = (...)".

When SQL(1) is executed in a distributed database, the following execution plan (1) is generated:

ID  OPERATOR               NAME
0   SUBPLAN FILTER
1   PX COORDINATOR
2   EXCHANGE OUT DISTR     :EX10000
3   PX PARTITION ITERATOR
4   TABLE SCAN             T2
5   SCALAR GROUP BY
6   PX COORDINATOR
7   EXCHANGE OUT DISTR     :EX20000
8   MERGE GROUP BY
9   PX PARTITION ITERATOR
10  TABLE SCAN             T1

In some application scenarios, an execution plan can be divided into multiple local sub-plans, and a local sub-plan can contain one or more operators. For example, local sub-plans can be delimited by "EXCHANGE" operators. In execution plan (1), operators 2 to 4 form the local sub-plan of the main query (also called a data flow operation, DFO), and operators 7 to 10 form the local sub-plan of the sub-query.
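The division at EXCHANGE boundaries can be illustrated with a small sketch over execution plan (1). The plan listing in this document does not show operator nesting, so the `depth` values below are an assumption about the tree shape, and `split_into_dfos` is a hypothetical helper rather than a database API.

```python
# (id, depth, operator); the depth values are assumed, not given in the source.
PLAN = [
    (0, 0, "SUBPLAN FILTER"),
    (1, 1, "PX COORDINATOR"),
    (2, 2, "EXCHANGE OUT DISTR"),
    (3, 3, "PX PARTITION ITERATOR"),
    (4, 4, "TABLE SCAN"),
    (5, 1, "SCALAR GROUP BY"),
    (6, 2, "PX COORDINATOR"),
    (7, 3, "EXCHANGE OUT DISTR"),
    (8, 4, "MERGE GROUP BY"),
    (9, 5, "PX PARTITION ITERATOR"),
    (10, 6, "TABLE SCAN"),
]


def split_into_dfos(plan):
    """Group operators into sub-plans rooted at each EXCHANGE OUT operator."""
    dfos = {}
    stack = []  # open sub-plans as (root_id, root_depth), innermost last
    for op_id, depth, name in plan:
        # Close sub-plans whose subtree has ended (operator is not deeper than the root).
        while stack and depth <= stack[-1][1]:
            stack.pop()
        if name.startswith("EXCHANGE OUT"):
            stack.append((op_id, depth))
            dfos[op_id] = []
        if stack:
            dfos[stack[-1][0]].append(op_id)
    return dfos
```

Under the assumed nesting, this yields one sub-plan containing operators 2 to 4 and another containing operators 7 to 10, matching the grouping described in the text. Operators outside any EXCHANGE subtree stay with the coordinator.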

As an example, a local sub-plan can be executed by multiple threads. For instance, when the main query's partition table T2 is distributed on multiple servers, multiple threads can scan the partitions of T2 on these servers in parallel.

Execution plan (1) is executed as follows:

Operator 1, "PX COORDINATOR" (that is, the coordinator), starts multiple threads to scan the main query's partition table T2, and the scan results are sent to the coordinator.

The coordinator starts the sub-query execution plan and starts multiple threads to scan the sub-query's partition table T1. In one implementation, each worker thread of the sub-query scans table T1 using the relevant query result of the main query (such as T2.v1).

Each worker thread of the sub-query obtains a query result and returns it to the coordinator. In some embodiments, the query result may be an aggregate result. That is, each worker thread aggregates the query result over the partition table data it scanned. In other embodiments, the query result may be a data set, for example a set of T1.v1 values.

After receiving the query result of each worker thread of the sub-query, the coordinator can perform a global aggregation over these results to obtain the execution result of the sub-query.

After obtaining the execution result of the sub-query, the coordinator evaluates the filter condition of the correlated sub-query, that is, the condition "WHERE T2.v2 = (...)". As described above, this sub-query filter condition is evaluated in a single thread.

It can therefore be seen that, in execution plan (1), neither the aggregate calculation of the sub-query nor the evaluation of the sub-query filter condition can be parallelized.
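The flow of execution plan (1) can be approximated by the sketch below. It assumes that each partition of T1 is a list of (v1, v2) tuples, that worker threads return partial sums, and that the coordinator is the calling thread. The names `partial_sum` and `traditional_plan` are illustrative, and Python's `ThreadPoolExecutor` merely stands in for the database's worker threads.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(partition, t2_v1):
    """Worker side: aggregate T1.v2 within one partition for rows with t2_v1 < T1.v1."""
    values = [v2 for v1, v2 in partition if t2_v1 < v1]
    return sum(values) if values else None


def traditional_plan(t2_rows, t1_partitions):
    """Coordinator side: one sub-query round per T2 row, then a single-threaded filter."""
    output = []
    n = len(t1_partitions)
    with ThreadPoolExecutor(max_workers=n) as pool:
        for t2_v1, t2_v2 in t2_rows:  # serial loop on the coordinator
            # The partitions of T1 are scanned in parallel for this one T2 row.
            partials = list(pool.map(partial_sum, t1_partitions, [t2_v1] * n))
            present = [p for p in partials if p is not None]
            total = sum(present) if present else None  # global aggregation
            if total is not None and t2_v2 == total:  # filter on the coordinator thread
                output.append(t2_v2)
    return output
```

Only the partition scans run concurrently. The outer loop, the global aggregation and the filter all execute on one thread, which is the bottleneck the text describes.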

In addition, when SQL(1) serves as the execution condition of a higher-level SQL statement or needs to be joined with other tables, so that the execution result of SQL(1) must be sent to other SQL query statements, execution plan (1) creates a single point of execution at the sub-query filter, so this sending process cannot be executed in parallel either.

In summary, when a distributed database executes an SQL statement containing a correlated sub-query, processes such as the sub-query calculation and the sub-query filter condition calculation cannot run in parallel, resulting in poor execution performance. Moreover, there is currently no particularly good optimization technique. Performance is even worse when the main query returns a large number of rows or when the sub-query is a relatively complex distributed plan.

To address these problems, the embodiments of the present disclosure provide a new execution method in which the sub-query filter condition is evaluated not by the coordinator node but by target servers among the computing nodes. The relevant data from the main query's partition table data can be sent to the target servers, which evaluate the sub-query filter condition in parallel, thereby improving the execution performance of the SQL statement.

FIG. 2 is a schematic flowchart of a method 200 for processing a database query statement provided by an embodiment of the present disclosure. It should be understood that FIG. 2 shows steps or operations of the method 200, but these steps or operations are only examples, and the steps may also be performed in other orders. The method 200 may include steps S210 to S250, as described below.

In step S210, a database query statement is received. The database query statement includes a main query and a sub-query, the query condition of the sub-query is a correlated condition of the main query, and the partition table data of the main query and the sub-query are distributed on multiple servers.

The database query statement may be any of the SQL statements described above. Because the database query statement includes a main query and a sub-query, and the query condition of the sub-query is a correlated condition of the main query, the database query statement may be an SQL statement containing a correlated sub-query.

In a distributed database, the partition table data of the main query and the sub-query can be distributed on multiple servers. As an example, each server may store the data of one or more partitions of the main query and/or the sub-query.

Referring to example (1), the partition table data T2 of the main query may be distributed on, for example, two servers, server 1 and server 2, and the partition table data T1 of the sub-query may be distributed on, for example, three servers, server 3, server 4 and server 5.

In some embodiments, the main query may include a filter condition for filtering based on the query result of the sub-query. This filter condition may be the sub-query filter condition described above. In SQL(1), the filter condition is the "WHERE T2.v2 = (...)" clause.

In step S220, target data is queried from the partition table data of the main query. The target data may include first data related to the main query and second data related to the sub-query.

In SQL(1), the first data may be T2.v2 and the second data may be T2.v1. The second data can be understood as the data in the main query's partition table data that is associated with the sub-query.

The coordinator starts multiple worker threads, according to how the main query's partition table data is distributed, to query that data and obtain the target data. The target data may be, for example, a row of the main query's partition table data, or one or more values in that row. For example, the target data may include the v1 and v2 columns of the main query's partition table data. As another example, the target data may be the entire T2 table.

As an example, because both server 1 and server 2 store data of table T2, the coordinator can dispatch worker threads to both server 1 and server 2 to query table T2. For instance, the coordinator starts three threads on each of server 1 and server 2 to scan the main query's partition table data T2 and obtain the T2.v1 and T2.v2 values of each row of T2.
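The parallel scan in this example can be sketched as follows. The sketch assumes each server's share of T2 is a list of row dictionaries and uses Python's `ThreadPoolExecutor` in place of the database's worker threads. `scan_chunk` and `parallel_scan` are illustrative names, and splitting a server's rows into strided chunks is an assumption about how work might be divided.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_chunk(rows):
    """One worker thread: read its share of T2 rows and keep the (v1, v2) pair of each row."""
    return [(row["v1"], row["v2"]) for row in rows]


def parallel_scan(servers, threads_per_server=3):
    """Scan the T2 rows stored on every server, using several worker threads per server."""
    # Each server's rows are split into one chunk per worker thread (assumed strategy).
    chunks = [
        rows[i::threads_per_server]
        for rows in servers.values()
        for i in range(threads_per_server)
    ]
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        parts = list(pool.map(scan_chunk, chunks))
    return [pair for part in parts for pair in part]
```

Each returned pair carries both the first data (T2.v2) and the second data (T2.v1) of one row.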

上述步骤S220可以是由主查询的计算节点执行的。该计算节点例如可以为存储有主查询的分区表数据的计算节点。一个计算节点上可以有一个或多个工作线程。如果一个计算节点有多个工作线程，该多个工作线程可以并行地执行主查询的分区表数据的查询。The above-mentioned step S220 may be performed by the computing node of the main query. The computing node may be, for example, a computing node that stores partition table data of the main query. There can be one or more worker threads on a compute node. If a computing node has multiple worker threads, the multiple worker threads can execute the query of the partition table data of the main query in parallel.

In step S230, the first data and the second data are distributed to target servers. A target server is able to obtain all partition table data of the sub-query. There may be multiple target servers.

In some embodiments, this step may be performed by the coordinator node described above: the main query's worker threads send the first data and the second data to the coordinator node, which then distributes them to the target servers. In other embodiments, this step may be performed by the main query's worker threads: after obtaining the first data and the second data, they distribute the data directly to the target servers.

A target server is a server on which the sub-query's partition table data is distributed. There may be several such servers, for example server 3, server 4, and server 5 described above.

When the first data and the second data are distributed to the target servers, the first data and the second data of the same row may be sent to the same server. For example, T2.v1 and T2.v2 of the first row of T2 can be sent to server 3, those of the second row to server 4, and those of the third row to server 5.

A target server may run one or more worker threads, so distributing the first data and the second data to a target server can be understood as distributing them to that server's worker threads. The coordinator node can start one or more worker threads on each target server according to how the sub-query's partition table data is distributed, and can distribute first data and second data to each worker thread. In some embodiments, the coordinator node sends the first data and the second data of the same row to the same worker thread.

In step S240, based on the second data, the query on the sub-query's partition table data is executed in parallel to obtain a first query result. This step may be performed by the target servers.

A target server may use the second data as the query condition when querying the sub-query's partition table data. The coordinator node may send different second data to different target servers, which can then execute their queries in parallel based on the second data each has received.

Continuing with server 3, server 4, and server 5 as the example: server 3 receives T2.v1 of the first row of T2 and can use it as the query condition for a query on T1. Server 4 receives T2.v1 of the second row of T2 and can use it as the query condition for a query on T1. Server 5 receives T2.v1 of the third row of T2 and can use it as the query condition for a query on T1. Because the three servers receive T2.v1 from different rows, they can query T1 in parallel.

The first query result is the result of the sub-query. Taking SQL(1) above as an example, the first query result is the result of executing "(SELECT Sum(v2) FROM T1 WHERE T2.v1<T1.v1)". The first query result may include one result for each row's T2.v1.

In step S250, according to the first query result, the filtering operation on the first data is performed in parallel using the filter condition.

In the embodiments of the present disclosure, the sub-query filter condition is evaluated on the target servers.

In the traditional solution, the result of the sub-query must be returned to the coordinator node, which performs the filtering. In method 200, by contrast, the coordinator node distributes the main query's result (e.g., the target data) directly to the sub-query. For example, the coordinator node starts multiple worker threads on the target servers according to how the sub-query's partition table data is distributed, and after receiving the main query's target data these worker threads scan the sub-query's partition table data in parallel.

Continuing with server 3, server 4, and server 5 as the example: server 3 uses T2.v1 of the first row of T2 as the query condition, queries T1, and obtains query result 1. Server 3 then performs the filtering operation, that is, it compares query result 1 with T2.v2 of the first row of T2. If they are equal, T2.v2 of the first row is returned; otherwise no data is returned. Server 4 uses T2.v1 of the second row of T2 as the query condition, queries T1, and obtains query result 2. It then compares query result 2 with T2.v2 of the second row of T2. If they are equal, T2.v2 of the second row is returned; otherwise no data is returned. Server 5 uses T2.v1 of the third row of T2 as the query condition, queries T1, and obtains query result 3. It then compares query result 3 with T2.v2 of the third row of T2. If they are equal, T2.v2 of the third row is returned; otherwise no data is returned.
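The per-row work done by one target server in this example can be sketched as follows. This is a minimal illustration rather than the implementation of the disclosure: `T1_ROWS`, `evaluate_subquery`, and `filter_row` are hypothetical names, the sample rows are invented, and an empty sub-query result is modeled as `None` to mirror SQL's `SUM` over zero rows returning NULL.

```python
T1_ROWS = [(1, 10), (2, 20), (3, 30)]  # hypothetical (v1, v2) rows of T1

def evaluate_subquery(outer_v1, t1_rows=T1_ROWS):
    """SELECT SUM(v2) FROM T1 WHERE outer_v1 < T1.v1; None models NULL."""
    matched = [v2 for v1, v2 in t1_rows if outer_v1 < v1]
    return sum(matched) if matched else None

def filter_row(outer_v1, outer_v2, t1_rows=T1_ROWS):
    """Return outer_v2 if T2.v2 equals the sub-query result, else None."""
    result = evaluate_subquery(outer_v1, t1_rows)
    if result is not None and outer_v2 == result:
        return outer_v2
    return None
```

Each server only needs the (T2.v1, T2.v2) pair of the row it was handed, which is why the pair is shipped together.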

In some embodiments, a target server may run multiple worker threads. For example, server 3, server 4, and server 5 may each run 2 threads. In this case the coordinator node can start 2 threads on each of server 3, server 4, and server 5 to scan the sub-query's partition table data T1. That is, the coordinator distributes the main query's target data from step S220 to these 6 threads so that T1 is scanned in parallel. Each worker thread proceeds in the same way as server 3, server 4, and server 5 above, so for brevity the details are not repeated here.

In some embodiments, the coordinator distributes the first scan result to the target servers using an adaptively selected random or hash distribution method. Adaptive random-or-hash distribution means that, during execution, random or hash distribution is chosen dynamically according to how the data is actually being distributed.

As an example, the random distribution method may be round-robin distribution. Round-robin distribution spreads rows evenly over the sub-query's threads, so every thread receives about the same amount of data and data skew is avoided.
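A minimal sketch of round-robin distribution, assuming rows are plain tuples and workers are identified by index (`round_robin_distribute` is a hypothetical helper, not an operator named in the disclosure):

```python
from itertools import cycle

def round_robin_distribute(rows, n_workers):
    """Assign rows to workers in turn; returns one list of rows per worker."""
    buckets = [[] for _ in range(n_workers)]
    for row, target in zip(rows, cycle(range(n_workers))):
        buckets[target].append(row)
    return buckets
```

Bucket sizes differ by at most one row regardless of the row contents, which is the evenness property the text relies on.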

Hash distribution can also spread data evenly over the sub-query's threads. In addition, when identical rows occur in the main query's partition table data, hash distribution sends them to the same thread. Because that thread has cached the sub-query result previously computed for this data, it does not need to compute the sub-query again when it receives an identical row, which improves execution efficiency.
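The caching benefit can be illustrated with a hypothetical sub-query worker that memoizes its results. The class name, the `scans` counter, and the choice to key the cache on the correlated value T2.v1 are assumptions for this sketch; the text only states that a previously computed result is cached.

```python
class SubqueryWorker:
    """Hypothetical sub-query worker that caches results per outer value."""

    def __init__(self, t1_rows):
        self.t1_rows = t1_rows
        self.cache = {}
        self.scans = 0  # number of times T1 was actually scanned

    def subquery(self, outer_v1):
        if outer_v1 not in self.cache:
            self.scans += 1
            matched = [v2 for v1, v2 in self.t1_rows if outer_v1 < v1]
            # None models SQL NULL for SUM over zero rows.
            self.cache[outer_v1] = sum(matched) if matched else None
        return self.cache[outer_v1]
```

If identical rows were spread over several workers instead, each worker would have to scan T1 once for the same value.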

However, in some scenarios the main query contains many rows with identical data, and hash distribution sends all of them to the same thread. That thread becomes saturated while other threads may sit idle. The resulting data skew degrades execution efficiency.
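The skew can be made visible by counting how many rows each worker would receive under hash distribution. In this sketch the hash function (CRC32 of the row's text form) is an assumption chosen only because it is stable across runs; the disclosure does not specify one.

```python
import zlib
from collections import Counter

def hash_target(row, n_workers):
    """Stable hash of a row; identical rows always map to the same worker."""
    return zlib.crc32(repr(row).encode("utf-8")) % n_workers

def hash_loads(rows, n_workers):
    """Count the rows each worker would receive under hash distribution."""
    return Counter(hash_target(row, n_workers) for row in rows)
```

With 90 copies of one row among 100 rows, a single worker necessarily receives at least 90 rows, however many workers exist.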

With random distribution, the rows of the main query are handed to the sub-query's worker threads in turn. This avoids data skew, but it may forgo the performance gain of skipping the sub-query computation for identical rows.

To address this, method 200 provided by the embodiments of the present disclosure uses an adaptive random-or-hash method to distribute the first data and the second data of the main query to all worker threads of the sub-query.

Specifically, the first scan result of the main query may initially be distributed using a default distribution method. For example, hash distribution is used to send the first data and the second data to the sub-query's worker threads.

Then the distribution result for a specified number of rows is sampled. If the hash distribution result is uneven, distribution continues with the random method. If it is even, hash distribution continues to be used.

In one implementation, sampling works as follows: each worker thread of the main query reports its own distribution result to the coordinator node, which aggregates the reports and decides which distribution method to use. After choosing hash or random distribution, the coordinator node synchronizes the decision to all worker threads of the main query.
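The coordinator's decision can be sketched as an aggregation over per-worker reports. The report format (a mapping from target index to rows sent), the function name, and the threshold `max_ratio` are hypothetical; the disclosure does not state how unevenness is measured.

```python
from collections import Counter

def decide_distribution(worker_reports, n_targets, max_ratio=2.0):
    """Aggregate {target_index: rows_sent} reports and pick a method.

    Returns "random" when the busiest target received more than
    max_ratio times the mean load, otherwise "hash".
    """
    total = Counter()
    for report in worker_reports:
        total.update(report)  # adds the per-target counts
    rows = sum(total.values())
    if rows == 0:
        return "hash"  # nothing sampled yet: keep the default
    mean = rows / n_targets
    return "random" if max(total.values()) > max_ratio * mean else "hash"
```

The returned value is what the coordinator would then synchronize to every main-query worker thread.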

As one embodiment, a target server can obtain all partition table data of the sub-query through the "DISTRIBUTED TABLE SCAN" operator. That is, the "DISTRIBUTED TABLE SCAN" operator provides the ability to scan data in a distributed manner, so the worker threads on the target servers can perform the sub-query scan and evaluate the sub-query filter condition in parallel.
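The internals of the "DISTRIBUTED TABLE SCAN" operator are not described here. The sketch below only illustrates the property the text depends on, namely that one worker can evaluate the sub-query over every partition of T1 wherever it is stored. The partition layout, row values, and function names are hypothetical.

```python
# Hypothetical layout: T1 partitions keyed by the server that stores them.
T1_PARTITIONS = {
    "server 3": [(1, 10)],
    "server 4": [(2, 20)],
    "server 5": [(3, 30)],
}

def scan_partition(rows, outer_v1):
    """Partial result for one partition: (sum of matching v2, match count)."""
    matched = [v2 for v1, v2 in rows if outer_v1 < v1]
    return sum(matched), len(matched)

def distributed_sum(outer_v1, partitions=T1_PARTITIONS):
    """Merge partial sums from every partition; None models SQL NULL."""
    partials = [scan_partition(rows, outer_v1) for rows in partitions.values()]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total if count else None
```

Tracking the match count alongside the partial sum lets the merge step distinguish "no matching rows" from "matching rows that sum to zero".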

After a target server, that is, a worker thread of the sub-query, obtains the first scan result of the main query, it can scan the sub-query's partition table data in parallel with the other workers and, once the sub-query result is available, evaluate the sub-query filter condition in parallel as well.

As the above process shows, method 200 provided by the present disclosure scans the sub-query in parallel, so both the sub-query computation and the evaluation of the sub-query filter condition can run in parallel. In addition, if the query result computed on the target servers has to be passed on to other SQL statements, that step can also run in parallel. The new execution plan provided by the embodiments of the present disclosure can therefore greatly improve the execution performance of SQL statements.

The method provided by the embodiments of the present disclosure is further described below with reference to FIG. 3 and example (2). FIG. 3 is a flowchart of a method for executing the SQL(1) statement according to an embodiment of the present disclosure. Example (2) is the execution plan (2) generated when the SQL(1) statement of example (1) is executed according to the method provided by the embodiments of the present disclosure. Execution plan (2) is as follows:

The execution logic of execution plan (2) is as follows:

Step S310: parallel execution (PX) schedules the DFOs containing the main query and the sub-query for execution.

Operators 3 to 6 of execution plan (2) form the local sub-plan of the main query, for example the DFO containing the main query, which executes on the servers where the main query's partition table data T2 is distributed. Operators 7 and 8 form the local sub-plan of the sub-query, for example the DFO containing the sub-query, which executes on the servers where the sub-query's partition table data T1 is distributed, such as the target servers described above.

PX is operator 0, "PX COORDINATOR", in execution plan (2), and it executes on the coordinator. That is, the coordinator uses the PX operator to schedule the DFOs containing the main query and the sub-query to start scanning.

Step S320: the main query's worker threads scan the T2 data and, by default, distribute it by hash to the sub-query's worker threads.

The DFO containing the main query scans the main query's partition table data T2 with multiple threads, obtaining a row of T2. This row is then distributed by hash to the threads running the sub-query. The distribution can be implemented, for example, by operator 4, "EXCHANGE OUT DISTR(Adaptive)", in execution plan (2).

Step S330: adaptive sampling decides whether to distribute by hash or by random.

FIG. 4 is an example flowchart of a data distribution process provided by an embodiment of the present disclosure. As shown in FIG. 4, PX samples the distribution result of the main query and synchronizes the chosen distribution method to the main query, which then continues distributing.

Specifically, PX samples how the main query's threads have distributed their data, for example the distribution of the first 1000 rows of the main query. In one implementation, each thread of the main query reports its own distribution to PX, and PX aggregates the reports.

If the distribution is even, or its evenness is within a preset range, or it does not cause data skew, distribution continues with the original hash method. PX also synchronizes the decision to use hash distribution to all worker threads of the main query, so that they keep sending scan results to the sub-query's worker threads by hash.

If the distribution is uneven, or its evenness is outside the preset range, or it has already caused data skew, PX switches from hash distribution to random distribution. PX also synchronizes the decision to use random distribution to all worker threads of the main query, so that from then on they send the main query's scan results to the sub-query's worker threads by random distribution.

Each time the main query has distributed the specified number of rows (that is, 1000 rows), PX again decides, based on the aggregated distribution result, whether to continue distributing the main query's scan result by random or by hash, and synchronizes the decision to all worker threads of the main query. Repeating this process effectively avoids data skew and keeps the data distribution within a controllable range.
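The repeated re-evaluation can be sketched as a router that revisits its choice after every window of rows. Several details here are assumptions rather than part of the disclosure: the class and parameter names, the skew threshold `max_ratio`, the pluggable `hash_fn`, and in particular the idea of continuing to count would-be hash targets while in random mode so that a later window can switch back to hash. The single-process class also stands in for the reporting between worker threads and PX.

```python
import zlib
from collections import Counter

def _crc_hash(row):
    # Stable default hash (an assumption; the source names no hash function).
    return zlib.crc32(repr(row).encode("utf-8"))

class AdaptiveDistributor:
    """Routes rows by hash or round robin, re-deciding every `window` rows."""

    def __init__(self, n_workers, window=1000, max_ratio=2.0, hash_fn=_crc_hash):
        self.n = n_workers
        self.window = window
        self.max_ratio = max_ratio
        self.hash_fn = hash_fn
        self.mode = "hash"            # default method, as in step S320
        self._hash_loads = Counter()  # would-be hash targets in this window
        self._seen = 0
        self._next_rr = 0

    def route(self, row):
        """Return the worker index for this row under the current mode."""
        h = self.hash_fn(row) % self.n
        self._hash_loads[h] += 1
        if self.mode == "hash":
            target = h
        else:
            target = self._next_rr
            self._next_rr = (self._next_rr + 1) % self.n
        self._seen += 1
        if self._seen == self.window:
            self._redecide()
        return target

    def _redecide(self):
        """Switch to random if hashing this window would have been skewed."""
        mean = self.window / self.n
        skewed = max(self._hash_loads.values()) > self.max_ratio * mean
        self.mode = "random" if skewed else "hash"
        self._hash_loads.clear()
        self._seen = 0
```

A window of identical rows flips the router to random distribution, and a later window of well-spread rows flips it back to hash, so the cache benefit is recovered when the skew disappears.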

Step S340: after receiving the main query's row data (such as the first data and the second data described above), the sub-query's worker threads scan the sub-query data in parallel and evaluate the filter condition.

After receiving the scan results distributed by the main query, the threads of the DFO containing the sub-query scan the sub-query's partition table data T1 in parallel. Operator 8, "DISTRIBUTED TABLE SCAN", provides the ability to scan data in a distributed manner.

The sub-query scan result is matched against the main query scan result received earlier, and the rows that satisfy the sub-query filter condition are returned. This is done by multiple threads in parallel.
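Step S340 can be approximated in a single process with a thread pool, where each task handles one main-query row. This is only a sketch under stated assumptions: `T1`, `process_row`, and `run_parallel` are hypothetical names, the data is invented, and local threads stand in for the worker threads that the disclosure places on different servers.

```python
from concurrent.futures import ThreadPoolExecutor

T1 = [(1, 10), (2, 20), (3, 30)]  # hypothetical (v1, v2) rows of T1

def process_row(row):
    """Scan T1 for one main-query row, then apply T2.v2 = (sub-query)."""
    outer_v1, outer_v2 = row
    matched = [v2 for v1, v2 in T1 if outer_v1 < v1]
    result = sum(matched) if matched else None  # None models SQL NULL
    return row if result is not None and outer_v2 == result else None

def run_parallel(main_rows, n_threads=3):
    """Evaluate sub-query and filter for all rows on a pool of threads."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # map() yields results in input order, so output order is stable.
        kept = pool.map(process_row, main_rows)
        return [row for row in kept if row is not None]
```

No task depends on another task's result, which is what allows the sub-query scan and the filter to be spread over as many workers as are available.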

Step S350: the rows that satisfy the filter condition are returned to the client.

The row data satisfying the filter condition was obtained in step S340; this row data is returned to the client.

In summary, the embodiments of the present disclosure propose a new execution approach: the main query's scan result is distributed to the sub-query by an adaptive distribution method, and splitting the main query's data in this way allows the sub-query filter condition to be evaluated in parallel, so that the execution performance of the corresponding execution plan can improve linearly as the degree of parallelism grows. On this basis, the embodiments further propose adaptively choosing hash or random distribution for the main query result, which mitigates the poor performance caused by data skew during parallel execution.

The method embodiments provided by the present disclosure have been described in detail above with reference to FIGS. 1 to 4. The apparatus embodiments provided by the present disclosure are described below with reference to FIGS. 5 and 6. It should be understood that the description of the apparatus embodiments corresponds to that of the method embodiments, so for parts not described in detail, reference may be made to the foregoing method embodiments.

FIG. 5 is a schematic structural diagram of an apparatus for processing a database query statement according to an embodiment of the present disclosure. The apparatus may be a database, such as a distributed database. In some scenarios, the apparatus may be a native distributed database, that is, an independently developed distributed database that is not obtained by secondary development or wrapping of an existing distributed database. The apparatus may also be another kind of database, which is not limited in the embodiments of the present disclosure. The apparatus of an embodiment of the present disclosure is described below with reference to FIG. 5.

The apparatus 500 in FIG. 5 may include a receiving module 510, a query module 520, a distribution module 530, and a filtering module 540.

The receiving module 510 is configured to receive a database query statement, where the database query statement includes a main query and a sub-query, the query condition of the sub-query is a condition correlated with the main query, the main query includes a filter condition that filters based on the query result of the sub-query, and the partition table data of the main query and the sub-query is distributed over multiple servers.

The query module 520 is configured to query target data from the partition table data of the main query, where the target data includes first data related to the main query and second data related to the sub-query.

The distribution module 530 is configured to distribute the first data and the second data to a target server, where the target server is able to obtain all partition table data of the sub-query.

The query module 520 is further configured to execute, based on the second data, the query on the partition table data of the sub-query in parallel to obtain a first query result.

The filtering module 540 is configured to perform, according to the first query result, the filtering operation on the first data in parallel using the filter condition.

Optionally, the apparatus 500 further includes: a collection module, configured to collect information on how the first data and the second data have been distributed; and a judgment module, configured to determine, according to that information, whether to distribute the first data and the second data to the target server by hash or by random distribution.

Optionally, the target server obtains all partition table data of the sub-query through the DISTRIBUTED TABLE SCAN operator.

FIG. 6 is a schematic structural diagram of an apparatus for processing a database query statement according to another embodiment of the present disclosure. The apparatus 600 shown in FIG. 6 may be a database or a server. The apparatus 600 may include a memory 610 and a processor 620.

The memory 610 may be used to store executable code. The processor 620 may be used to execute the executable code stored in the memory 610 to implement the steps of the methods described above.

In some embodiments, the apparatus 600 may further include a network interface 630, through which the processor 620 exchanges data with external devices.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of the present disclosure are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., Digital Video Disc (DVD)), or a semiconductor medium (e.g., Solid State Disk (SSD)).

本领域普通技术人员可以意识到，结合本公开实施例描述的各示例的单元及算法步骤，能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行，取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能，但是这种实现不应认为超出本公开的范围。Those skilled in the art can realize that the units and algorithm steps of each example described in conjunction with the embodiments of the present disclosure can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this disclosure.

在本公开所提供的几个实施例中，应该理解到，所揭露的系统、装置和方法，可以通过其它的方式实现。例如，以上所描述的装置实施例仅仅是示意性的，例如，所述单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，例如多个单元或组件可以结合或者可以集成到另一个系统，或一些特征可以忽略，或不执行。另一点，所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口，装置或单元的间接耦合或通信连接，可以是电性，机械或其它的形式。In the several embodiments provided by the present disclosure, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are only illustrative. For example, the division of the units is only a logical function division. In actual implementation, there may be other division methods. For example, multiple units or components may be combined or Can be integrated into another system, or some features can be ignored, or not implemented. On the other hand, the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.

所述作为分离部件说明的单元可以是或者也可以不是物理上分开的，作为单元显示的部件可以是或者也可以不是物理单元，即可以位于一个地方，或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.

The above are only specific embodiments of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive of within the technical scope disclosed herein shall fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims

1. A method of processing a database query statement, comprising:

receiving a database query statement, wherein the database query statement comprises a main query and a sub-query, the query condition of the sub-query is a condition correlated with the main query, the main query comprises a filtering condition for filtering based on the query result of the sub-query, and the partition table data of the main query and the sub-query are distributed on a plurality of servers;

querying target data from the partition table data of the main query, wherein the target data comprises first data relevant to the main query and second data relevant to the sub-query;

distributing the first data and the second data to a target server, wherein the target server can obtain all partition table data of the sub-query;

based on the second data, executing the query of the partition table data of the sub-query in parallel to obtain a first query result;

and performing, according to the first query result, a filtering operation on the first data in parallel by using the filtering condition.

2. The method of claim 1, wherein distributing the first data and the second data to a target server comprises:

distributing the first data and the second data to the target server by adaptively selecting a hash or random distribution mode.

3. The method of claim 1, further comprising:

collecting the distribution of the first data and the second data;

and determining, according to the distribution, whether to use a hash or random distribution mode to distribute the first data and the second data to the target server.

4. The method of claim 1, wherein the target server being capable of obtaining all partition table data of the sub-query comprises:

the target server acquiring all partition table data of the sub-query through a DISTRIBUTED TABLE SCAN operator.

5. An apparatus for processing a database query statement, comprising:

a receiving module configured to receive a database query statement, wherein the database query statement comprises a main query and a sub-query, the query condition of the sub-query is a condition correlated with the main query, the main query comprises a filtering condition for filtering based on the query result of the sub-query, and the partition table data of the main query and the sub-query are distributed on a plurality of servers;

a query module configured to query target data from the partition table data of the main query, wherein the target data comprises first data relevant to the main query and second data relevant to the sub-query;

a distribution module configured to distribute the first data and the second data to a target server, wherein the target server can obtain all partition table data of the sub-query;

wherein the query module is further configured to execute, based on the second data, the query of the partition table data of the sub-query in parallel to obtain a first query result;

and a filtering module configured to perform, according to the first query result, a filtering operation on the first data in parallel by using the filtering condition.

6. The apparatus of claim 5, wherein the distribution module is further configured to distribute the first data and the second data to the target server by adaptively selecting a hash or random distribution mode.

7. The apparatus of claim 5, further comprising:

an acquisition module configured to collect the distribution of the first data and the second data;

and a judging module configured to determine, according to the distribution, whether to use a hash or random distribution mode to distribute the first data and the second data to the target server.

8. The apparatus of claim 5, wherein the target server obtains all partition table data of the sub-query through a DISTRIBUTED TABLE SCAN operator.

9. A computer-readable storage medium having stored thereon executable code which, when executed, implements the method of any one of claims 1 to 4.
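
By way of non-limiting illustration only, the flow recited in claims 1 to 3 might be sketched as follows. This is a single-process simulation, not the disclosed implementation: the row layout (`id`, `key`, `v`), the function names, the skew threshold of 0.5, the use of round-robin as a stand-in for random distribution, and the example sub-query (maximum `v` among sub-query rows whose `k` matches the main-query row) are all assumptions made for the sketch. Each simulated target server is given access to every sub-query partition, standing in for a scan over all partition table data of the sub-query.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def choose_distribution(keys, skew_threshold=0.5):
    """Assumed heuristic: use 'random' when one key dominates, else 'hash'."""
    if not keys:
        return "hash"
    most_common_count = Counter(keys).most_common(1)[0][1]
    return "random" if most_common_count / len(keys) > skew_threshold else "hash"


def distribute(rows, num_servers, mode):
    """Assign each main-query row to one simulated target server."""
    buckets = [[] for _ in range(num_servers)]
    for i, row in enumerate(rows):
        # Round-robin is used here as a deterministic stand-in for random.
        target = hash(row["key"]) % num_servers if mode == "hash" else i % num_servers
        buckets[target].append(row)
    return buckets


def scan_subquery(sub_partitions, key):
    """Scan all sub-query partitions: max(v) over rows where k == key."""
    values = [r["v"] for part in sub_partitions for r in part if r["k"] == key]
    return max(values) if values else None


def process_on_server(rows, sub_partitions):
    """Run the sub-query per row, then apply the main query's filter (v > result)."""
    kept = []
    for row in rows:
        result = scan_subquery(sub_partitions, row["key"])  # uses the second data
        if result is not None and row["v"] > result:        # filters the first data
            kept.append(row["id"])
    return kept


def run_query(main_rows, sub_partitions, num_servers=2):
    """Choose a distribution mode, distribute rows, and process buckets in parallel."""
    mode = choose_distribution([r["key"] for r in main_rows])
    buckets = distribute(main_rows, num_servers, mode)
    with ThreadPoolExecutor(max_workers=num_servers) as pool:
        parts = list(pool.map(lambda b: process_on_server(b, sub_partitions), buckets))
    return mode, sorted(i for part in parts for i in part)
```

In this sketch the hash mode keeps rows with the same correlated key on the same server, while the fallback mode spreads a heavily repeated key across servers; either way the result is the same because every server can read all sub-query partitions.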