-
Notifications
You must be signed in to change notification settings - Fork 7.5k
Parallel replicas support projection #82807
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Workflow [PR], commit [ca610c2] Summary: ❌
|
c7220dd
to
6f968f4
Compare
6f968f4
to
3442cfe
Compare
PTAL @alexey-milovidov |
128b892
to
4a0e891
Compare
4a0e891
to
7fff9e3
Compare
src/Processors/QueryPlan/Optimizations/QueryPlanOptimizationSettings.cpp
Outdated
Show resolved
Hide resolved
src/Processors/QueryPlan/Optimizations/enableMemoryBoundMerging.cpp
Outdated
Show resolved
Hide resolved
src/Processors/QueryPlan/Optimizations/enableMemoryBoundMerging.cpp
Outdated
Show resolved
Hide resolved
src/Processors/QueryPlan/Optimizations/optimizeUseNormalProjection.cpp
Outdated
Show resolved
Hide resolved
src/Processors/QueryPlan/Optimizations/optimizeUseNormalProjection.cpp
Outdated
Show resolved
Hide resolved
src/Processors/QueryPlan/Optimizations/optimizeUseNormalProjection.cpp
Outdated
Show resolved
Hide resolved
src/Processors/QueryPlan/Optimizations/optimizeUseAggregateProjection.cpp
Outdated
Show resolved
Hide resolved
@nickitat please take another look, thanks. |
src/Core/SettingsChangesHistory.cpp
Outdated
@@ -51,6 +51,7 @@ const VersionToSettingsChangesMap & getSettingsChangesHistory() | |||
{"azure_connect_timeout_ms", 1000, 1000, "New setting"}, | |||
{"azure_sdk_use_native_client", false, true, "New setting"}, | |||
{"opentelemetry_trace_cpu_scheduling", false, false, "New setting to trace `cpu_slot_preemption` feature."}, | |||
{"parallel_replicas_support_projection", true, true, "New setting. Optimization of projections can be applied in parallel replicas. Effective only with enabled parallel_replicas_local_plan."}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be old_value=false
, new_value=true
, so when the compatibility is set to match one of the older versions, we will have this feature disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, done.
std::optional<size_t> getNumberOfCurrentReplica() { return number_of_current_replica; } | ||
/// After projection optimization, ReadFromMergeTree may be replaced with a new reading step, and the ParallelReadingExtension must be forwarded to the new step. | ||
/// Meanwhile, the ParallelReadingExtension originally in ReadFromMergeTree might be detached. | ||
void detachParallelReadingExtension(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Methods should have the following semantics:
x getX() {
return x;
}
void cancel() {
x.reset()/clear();
}
X detachX() {
auto ret = move(x);
x.reset()/clear();
return move(ret);
}
I.e., detach
is get + cancel
.
So, getParallelReadingExtension
now does what it is supposed to do. But detachParallelReadingExtension
acts like cancel
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your patient explanation of the meaning of detach
. However, it may not fit the semantics here.
The original parent part reading step needs to pass ParallelReadingExtension
to the projection reading step, but it doesn't necessarily require "detach" because the original parent part reading step may still need ParallelReadingExtension
, and at the same time, the projection reading step may need to clear the received ParallelReadingExtension
.
Therefore, perhaps clearParallelReadingExtension
is the most semantically appropriate here.
@@ -59,6 +80,18 @@ void enableMemoryBoundMerging(QueryPlan::Node & node) | |||
{ | |||
if (aggregating_step->memoryBoundMergingWillBeUsed()) | |||
{ | |||
/// When parallel_replicas_local_plan is enabled and the local plan has been optimized by a projection, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a test case to ensure memory-bound merging and projections can coexist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I’ve added the test cases.
I’ve noticed that when parallel replicas support projections while aggregation_in_order
is active, it often leads to errors, causing different replicas to select different coordinator modes. Therefore, it might be a good idea to disable parallel replicas from reading projections when aggregation_in_order
is active
/// Previously we didn't maintain backward compatibility and every change was breaking. | ||
/// Particularly, we had an equality check for the version. To work around that code | ||
/// in previous server versions we now have to lie to them about the version. | ||
const UInt64 version = initiator_protocol_version >= DBMS_MIN_REVISION_WITH_VERSIONED_PARALLEL_REPLICAS_PROTOCOL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When connecting to the initiator, we read the initiator_protocol_version
value from the socket. In newer versions, it is set to the current DBMS_PARALLEL_REPLICAS_PROTOCOL_VERSION
, but previous versions had different values. Apparently, the test for backward compatibility of the protocol was broken some time ago, so the issue does not reproduce in the CI. I will fix it shortly here, and you'll be able to see the problem.
@@ -0,0 +1,22 @@ | |||
---normal : contains both projections and parts --- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that this test will also succeed if we set enable_parallel_replicas=0
, which is not good because it tests the behaviour of parallel replicas specifically. So, I'd like to ensure that the test will fail with disabled parallel replicas (because we have some traces of using parallel replicas in the reference file).
please take another look, thanks. @nickitat |
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
This PR closes #82659.
To simplify the implementation, support for projection is only enabled when
parallel_replicas_local_plan
is active.A new setting
parallel_replicas_support_projection
has been added to control whether projection support is enabled.Changes Overview
Coordinator + ReadPool Changes
part_info
.part_info + projection_name
.Query Plan Changes for Projection Optimization
Documentation entry for user-facing changes