WO2018100419A1 - Système et procédé de réduction de la distribution de données par poussée d'agrégation - Google Patents
Système et procédé de réduction de la distribution de données par poussée d'agrégation Download PDFInfo
- Publication number
- WO2018100419A1 WO2018100419A1 PCT/IB2016/057307 IB2016057307W WO2018100419A1 WO 2018100419 A1 WO2018100419 A1 WO 2018100419A1 IB 2016057307 W IB2016057307 W IB 2016057307W WO 2018100419 A1 WO2018100419 A1 WO 2018100419A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- request
- grouping
- group
- clause
- responses
- Prior art date
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
- G06F16/24534—Query rewriting; Transformation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
Definitions
- the present subject matter described herein in general, relates to database technologies, and more particularly, to systems and methods for optimization of running group by queries with aggregation on distinct column values where the group by is not done on distribution key.
- a database system provides a high-level view of data, but ultimately the data have to be stored as bits on one or more storage nodes.
- a vast majority of databases today store data on magnetic disk (and, increasingly, on flash storage) and fetch data into main memory for processing, or copy data onto tapes and other backup nodes for archival storage.
- the physical characteristics of storage nodes play a major role in the way data are stored, in particular because access to a random piece of data on disk is much slower than memory access: Disk access takes tens of milliseconds, whereas memory access takes a tenth of a microsecond.
- a database management system is generally system software for creating and managing databases.
- the DBMS provides users and programmers with a systematic way to create, retrieve, update and manage data.
- the DBMS is a collection of programs that enables you to store, modify, and extract information from a database.
- the database system can be a distributed database system, wherein the database is distributed over multiple disparate computers or nodes.
- Distributed databases are formed by placing and distributing the rows of the database in multiple nodes.
- the rules for distribution of the rows may be broadly classified into a fixed distribution strategy and random distribution strategy.
- the mapping of the node to the row is governed by a mapping function that would uniquely identify the node to fetch the data, and the distribution function is dependent on one of the column values of the row.
- Example of the fixed distribution strategy is Hash ranged distribution.
- random distribution strategy the mapping of a node to a row is not fixed and can vary based on several factors.
- Example of the random distribution strategy is Round robin ranged distribution.
- the query plans may be adjusted based on the distribution strategy.
- the adjustments may be to favor local query processing to distribution of rows and to favor low data redistribution and maximize processing power of the data nodes.
- a main objective of the present disclosure is to solve the technical problem as recited above by providing system and method for aggregation pushdown with double hashing thereby avoiding heavy data redistribution either in distributed databases query execution models or also in a single node instance that uses parallel execution of a single query.
- the present disclosure provides a system for aggregation pushdown.
- the system comprises a processor, and a memory coupled to the processor for executing a plurality of modules stored in the memory.
- the plurality of modules comprises a receiving module, a re-ordering module, and an execution module.
- the receiving module configured to receive at least request related to a data set, the request comprises at least a group by clause.
- the re-ordering module configured to rearrange the request for receiving from a plurality of data sources a respective plurality of responses related to a request received related to the data set, wherein the request is rearranged such that at least a distribution key is used as the group by clause.
- the execution module configured to execute the request rearranged to generate at least a response to the request received.
- the request is rearranged in the case(s) where the request involves grouping operation with distribution key is used within a DISTINCT condition in an aggregate method and distribution key is not part of group by clause. [0011]
- a method for aggregation pushdown is disclosed.
- the method comprises receiving at least request related to a data set, the request comprises at least a group by clause; rearranging the request for receiving from a plurality of data sources a respective plurality of responses related to a request received related to the data set, wherein the request is rearranged such that at least a distribution key is used as the group by clause; and executing the request rearranged to generate at least a response to the request received.
- the request is rearranged in the case(s) where the request involves grouping operation with distribution key is used within a DISTINCT condition in an aggregate method and distribution key is not part of group by clause.
- a system for aggregation pushdown is disclosed.
- the system comprises a processor, and a memory coupled to the processor for executing a plurality of modules stored in the memory.
- the plurality of modules comprises a receiving module, a re-ordering module, and an execution module.
- the receiving module configured to receive at least request related to a data set, the request comprises at least a group by clause on a non distribution column and/or an aggregate function with DISTINCT over the distribution key.
- the re-ordering module is configured to rearrange the request to multi-staged group by plan for receiving, from a plurality of data sources, a respective plurality of responses related to the request received related to the data set, wherein the request is rearranged such that at least a distribution key is used as the group by clause.
- the execution module is configured to execute the request rearranged, to multi-staged group by plan, to generate at least a response to the request received. [0013] In one implementation, a method for aggregation pushdown is disclosed.
- the method comprises receiving at least request related to a data set, the request comprises at least a group by clause on a non distribution column and/or an aggregate function with DISTINCT over the distribution key; rearranging the request to multi- staged group by plan for receiving, from a plurality of data sources, a respective plurality of responses related to the request received related to the data set, wherein the request is rearranged such that at least a distribution key is used as the group by clause; executing the request rearranged, to multi-staged group by plan, to generate at least a response to the request received.
- the present disclosure by systems, and methods, for queries involving group by, push down the grouping clause to the local nodes for execution and avoid redistribution.
- the present disclosure involves rewriting the query received from the user to use the distribution key as a grouping by clause column.
- the present disclosure deals with a usage of "DISTINCT" clause on a distribution key in aggregate methods with grouping clause.
- a given query contains a) group by clause on a non distribution column, b) contains an aggregate function with DISTINCT over the distribution key
- the present disclosure splits the query to multi staged group by plan. In a first stage, the grouping is done based on the grouping column and the distribution key. In a next step, grouping is done on the result from previous grouping by the distribution column alone.
- the aggregate with DISTINCT is converted to a simpler form of aggregate, because the DISTINCT entries of the aggregated column are achieved by the previous grouping phase. The two groupings done along with the aggregation is done locally.
- the amount of data to be sent over the network is very less.
- the next group by is done by multiple data nodes with redistribution or by a single data node, as the number of records expected to flow out of each local node is very less.
- Figure 1 illustrates an overall processing of the query for aggregation pushdown according to the available prior-art techniques and in accordance with an embodiment of the present subject matter.
- Figure 1(a) illustrates a naive implementation of the query.
- Figure 1(b) illustrates an execution of the query in accordance with the prior arts/ exiting databases.
- Figure 1(c) illustrates processing of the query for aggregation pushdown in accordance with an embodiment of the present subject matter.
- Figure 2 illustrates a system for aggregation pushdown, in accordance with an embodiment of the present subject matter.
- Figure 3 illustrates a method for aggregation pushdown, in accordance with an embodiment of the present subject matter.
- the terms “plurality” and “a plurality” as used herein may include, for example, “multiple” or “two or more”.
- the terms “plurality” or “a plurality” may be used throughout the specification to describe two or more components, devices, elements, units, parameters, or the like.
- the method embodiments described herein are not constrained to a particular order or sequence. Additionally, some of the described method embodiments or elements thereof can occur or be performed simultaneously, at the same point in time, or concurrently.
- FIG. 1 an overall processing of the query in distributed database system as compared to the prior-art techniques is disclosed.
- the present disclosure for queries involving group by clause a method is disclosed through which the grouping clause is pushed down to a local nodes for execution and thereby avoiding redistribution.
- the present disclosure involves rewriting/re-arranging/reordering the query to use the distribution key as a grouping by clause column.
- the horizontal line as shown in the figure 1 indicates the boundary of local processing and processing before redistribution.
- the items marked below the horizontal line is localized process, and the ones above are post redistribution processing.
- the query may be rewritten so that a group by clause is introduced based on the distribution key.
- the key in this rewrite of the query are- i) the query is split into multiple staged grouping query, and ii) The DISTINCT is removed from the query.
- Figure 1(a) illustrates a naive implementation of the query. As shown in the figure 1(a), the scan is performed on all the local nodes and the complete data set is redistributed. A person skilled in that art may understand that this may be a worst performing query plan though very simple to implement. The complete data scanned from each data node has to be redistributed to all the data nodes to perform aggregation.
- Figure 1(b) illustrates an execution of the query in accordance with the prior arts/ exiting databases.
- Figure 1(b) shows a bit intelligent query plan compared to the first one as shown in figure 1(a).
- the group by is done in two stages. In the first stage, the grouping is done based on the actual grouping column and the distribution key. However, then redistribution is done. The number of records distributed here is also high as, there might be multiple entries many groups that fall under the combination. In the second stage of grouping will happen only based on the actual grouping column will find that the values from the distribution column does not overlap from one other, however, the grouping will done by on many rows.
- Figure 1(c) illustrates processing of the query for aggregation pushdown in accordance with an embodiment of the present subject matter.
- grouping is done in three stages.
- the first grouping involves the queries' actual grouping column and the distribution key as used by many solutions (described in the figure 1(b)).
- the second grouping which achieves the technical advancement over the prior-art techniques, the grouping is again done based on the actual group by column.
- Original Query select count(distinct(dist_f3)) from tl;
- a system for aggregation pushdown is illustrated, in accordance with an embodiment of the present subject matter.
- a system 200 for aggregation pushdown is disclosed.
- the present subject matter is explained considering that the present disclosure is implemented in the system 200, it may be understood that the present disclosure may also be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a notebook, a workstation, a mainframe computer, a server, a network server, and the like. It will be understood that the system 200 may be accessed by multiple users, or applications residing on the database system.
- Examples of the system 200 may include, but are not limited to, a portable computer, a personal digital assistant, a handheld node, sensors, routers, gateways and a workstation.
- the system 200 is communicatively coupled to each other and/or other nodes or a nodes or apparatuses to form a network (not shown).
- Examples of the database system may include, but are not limited to, a portable computer, a personal digital assistant, a handheld node, sensors, routers, gateways and a workstation.
- the system 200 is communicatively coupled to each other and/or other nodes or a nodes or apparatuses to form a network (not shown).
- the network may be a wireless network, a wired network or a combination thereof.
- the network can be implemented as one of the different types of networks, such as GSM, CDMA, LTE, UMTS, intranet, local area network (LAN), wide area network (WAN), the internet, and the like.
- the network may either be a dedicated network or a shared network.
- the shared network represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), and the like, to communicate with one another. Further the network may include a variety of network nodes, including routers, bridges, servers, computing nodes, storage nodes, and the like.
- the system 200 may include a processor 202, an interface 204, and a memory 206.
- the processor 202 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any nodes that manipulate signals based on operational instructions.
- the at least one processor is configured to fetch and execute computer-readable instructions or modules stored in the memory 206.
- the interface (I/O interface) 204 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like.
- the I/O interface may allow the database system, the first node, the second node, and the third node to interact with a user directly. Further, the I/O interface may enable the system 200 to communicate with other nodes or nodes, computing nodes, such as web servers and external data servers (not shown).
- the I/O interface can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, GSM, CDMA, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite.
- the I/O interface may include one or more ports for connecting a number of nodes to one another or to another server.
- the I/O interface may provide interaction between the user and the system 200 via, a screen provided for the interface.
- the memory 206 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
- volatile memory such as static random access memory (SRAM) and dynamic random access memory (DRAM)
- non-volatile memory such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
- ROM read only memory
- erasable programmable ROM erasable programmable ROM
- flash memories such as compact flash drives, etc.
- a system 200 for aggregation pushdown comprises a processor 202, and a memory 206 coupled to the processor 202 for executing a plurality of modules present in the memory 206.
- the plurality of modules comprises a receiving module 208, a re-ordering module 210, a parser 212, an execution module 214, a designation module 216, a determination module 218, and a generation module 220.
- the receiving module 208 may be configured to receive at least request related to a data set, the request comprises at least a group by clause.
- the re-ordering module 210 may be configured to rearrange the request for receiving from a plurality of data sources a respective plurality of responses related to a request received related to the data set, wherein the request is rearranged such that at least a distribution key is used as the group by clause.
- the execution module 214 may be configured to execute the request rearranged to generate at least a response to the request received.
- the generation module 220 may be further configured to generate a response to the client by grouping information in the plurality of responses based on a grouping clause included in the request.
- the parser 212 may be configured to parse the request to determine the data set to receive responses based on request. The parsing preferably determines a table name.
- the designating module 216 may be configured to designate at least a parameter included in the request as a distribution key based on the dataset determined. The parameter is based on an association of the table name with a field included in the request.
- the determination module 218 may be configured to determine a plurality of data sources to provide the responses based on a value of the distribution key. The value is an index identifying the data source.
- the generation module 220 may be configured to generate the response to the request based on the plurality of responses received from the plurality of data source.
- FIG. 3 a method for aggregation pushdown is illustrated, in accordance with an embodiment of the present subject matter.
- the method may be described in the general context of computer executable instructions.
- computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, etc., that perform particular functions or implement particular abstract data types.
- the method may also be practiced in a distributed computing environment where functions are performed by remote processing devices that are linked through a communications network.
- computer executable instructions may be located in both local and remote computer storage media, including memory storage devices.
- At block 302 at least request related to a data set is received.
- the request may include at least a group by clause.
- the request received is rearranged. Based on the rearranged request, the present disclosure receives a respective plurality of responses related to the request received related to the data set from a plurality of data sources. The request is rearranged such that at least a distribution key is used as the group by clause.
- the request received is parsed to determine the data set to receive responses based on request. The parsing is preferably performed to determine a table name from the request received.
- the system may start executing the query.
- At block 310 during execution, the based on the dataset determined, at least a parameter included in the request is designated as a distribution key. The parameter is based on an association of the table name with a field included in the request. [0055] At block 312, a plurality of data sources to provide the responses is determined based on a value of the distribution key. The value is an index identifying the data source. [0056] At block 314, the response to the request is generated based on the plurality of responses received from the plurality of data source
- the request rearranged is executed to generate at least a response to the request received.
- a response to the client is generated by grouping information in the plurality of responses based on a grouping clause included in the request.
- a system for aggregation pushdown is disclosed.
- the system comprises a processor, and a memory coupled to the processor for executing a plurality of modules present in the memory.
- the plurality of modules comprises a receiving module, a re-ordering module, and an execution module.
- the receiving module configured to receive at least request related to a data set, the request comprises at least a group by clause on a non distribution column and/or an aggregate function with DISTINCT over the distribution key.
- the re-ordering module is configured to rearrange the request to multi-staged group by plan for receiving, from a plurality of data sources, a respective plurality of responses related to the request received related to the data set, wherein the request is rearranged such that at least a distribution key is used as the group by clause.
- the execution module is configured to execute the request rearranged, to multi-staged group by plan, to generate at least a response to the request received.
- a method for aggregation pushdown is disclosed.
- the method comprises receiving at least request related to a data set, the request comprises at least a group by clause on a non distribution column and/or an aggregate function with DISTINCT over the distribution key; rearranging the request to multi- staged group by plan for receiving, from a plurality of data sources, a respective plurality of responses related to the request received related to the data set, wherein the request is rearranged such that at least a distribution key is used as the group by clause; executing the request rearranged, to multi-staged group by plan, to generate at least a response to the request received.
- the present disclosure deals with a usage of "DISTINCT” clause on a distribution key in aggregate methods with grouping clause.
- a given query contains a) group by clause on a non distribution column, b) contains an aggregate function with DISTINCT over the distribution key
- the present disclosure splits the query to multi staged group by plan. In a first stage, the grouping is done based on the grouping column and the distribution key. In a next step, grouping is done on the result from previous grouping by the distribution column alone.
- the aggregate with DISTINCT is converted to a simpler form of aggregate, because the DISTINCT entries of the aggregated column are achieved by the previous grouping phase. The two groupings done along with the aggregation is done locally.
- the amount of data to be sent over the network is very less.
- the next group by is done by multiple data nodes with redistribution or by a single data node, as the number of records expected to flow out of each local node is very less.
- the present disclosure enables to improve the query's latency as more nodes are added to the cluster. • The present disclosure provides a mechanism to push down the grouping clause to the local nodes and avoids redistribution.
- the present disclosure provides a mechanism to rewrite/rearrange/reorder the query to use the distribution key as a grouping by clause column.
- the present disclosure enables to avoid heavy data redistribution for grouping queries.
- the disclosed system, apparatus, and method may be implemented in other manners.
- the described apparatus embodiment is merely exemplary.
- the unit division is merely logical function division and may be other division in actual implementation.
- a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed.
- the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented through some interfaces.
- the indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.
- the functions When the functions are implemented in a form of a software functional unit and sold or used as an independent product, the functions may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present disclosure essentially, or the part contributing to the prior art, or a part of the technical solutions may be implemented in a form of a software product.
- the computer software product is stored in a storage medium, and includes several instructions for instructing a computer node (which may be a personal computer, a server, or a network node) to perform all or a part of the steps of the methods described in the embodiment of the present disclosure.
- the foregoing storage medium includes: any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
- program code such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
La présente invention concerne un système et un procédé de poussée d'agrégation avec double hachage. Contrairement aux techniques de l'état de la technique, la présente invention concerne des systèmes et des procédés, pour des interrogations impliquant un groupe, par poussée vers le bas de la clause de regroupement vers les noeuds locaux pour l'exécution et pour éviter une redistribution. La présente invention concerne. Il consiste à réécrire la requête reçue de l'utilisateur pour utiliser la clé de distribution en tant que groupement par colonne de clause.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IB2016/057307 WO2018100419A1 (fr) | 2016-12-02 | 2016-12-02 | Système et procédé de réduction de la distribution de données par poussée d'agrégation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IB2016/057307 WO2018100419A1 (fr) | 2016-12-02 | 2016-12-02 | Système et procédé de réduction de la distribution de données par poussée d'agrégation |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018100419A1 true WO2018100419A1 (fr) | 2018-06-07 |
Family
ID=62241256
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IB2016/057307 WO2018100419A1 (fr) | 2016-12-02 | 2016-12-02 | Système et procédé de réduction de la distribution de données par poussée d'agrégation |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2018100419A1 (fr) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110246550A1 (en) * | 2010-04-02 | 2011-10-06 | Levari Doron | System and method for aggregation of data from a plurality of data sources |
-
2016
- 2016-12-02 WO PCT/IB2016/057307 patent/WO2018100419A1/fr active Application Filing
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110246550A1 (en) * | 2010-04-02 | 2011-10-06 | Levari Doron | System and method for aggregation of data from a plurality of data sources |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US12099472B2 (en) | Utilizing metadata to prune a data set | |
CN108431804B (zh) | 将多个容器数据库分组为单个容器数据库集群的能力 | |
US10769148B1 (en) | Relocating data sharing operations for query processing | |
US10296498B2 (en) | Coordinated hash table indexes to facilitate reducing database reconfiguration time | |
US10713248B2 (en) | Query engine selection | |
US8620903B2 (en) | Database distribution system and methods for scale-out applications | |
US8214356B1 (en) | Apparatus for elastic database processing with heterogeneous data | |
US10585887B2 (en) | Multi-system query execution plan | |
US20180246934A1 (en) | Adjusting partitioning policies of a database system in view of storage reconfiguration | |
CN107077453B (zh) | 用于使用集群缓存进行数据库查询的并行优化的系统和方法 | |
US20170116321A1 (en) | Ability to group multiple container databases as a single container database cluster | |
US20190324967A1 (en) | System and method for dynamic database split generation in a massively parallel or distributed database environment | |
US11514022B2 (en) | Streams on shared database objects | |
US20040181522A1 (en) | Shared memory router system and method for node communication in a distributed system | |
US10417257B2 (en) | Non-blocking database table alteration | |
US20150006509A1 (en) | Incremental maintenance of range-partitioned statistics for query optimization | |
US20170322963A1 (en) | Apparatus and Method for Creating User Defined Variable Size Tags on Records in RDBMS | |
US9239852B1 (en) | Item collections | |
US9129037B2 (en) | Disappearing index for more efficient processing of a database query | |
US20080133493A1 (en) | Method for maintaining database clustering when replacing tables with inserts | |
Chang et al. | Integration and optimization of multiple big data processing platforms | |
Zhang et al. | Design and implementation of a real-time interactive analytics system for large spatio-temporal data | |
US20170322973A1 (en) | System and Method to Optimize Queries on a View | |
WO2018100419A1 (fr) | Système et procédé de réduction de la distribution de données par poussée d'agrégation | |
US10558637B2 (en) | Modularized data distribution plan generation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 16922946 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 16922946 Country of ref document: EP Kind code of ref document: A1 |