-
Notifications
You must be signed in to change notification settings - Fork 3
rework queries with variable sets so they use indexes #83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
dmoverton
approved these changes
Jun 24, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Changes aggregation pipelines that are produced for query requests that include variable sets. Previously we used a
$facet
sub-pipelines where each sub-pipeline was a copy of the query pipeline with different variable values interpolated. Variable values were written directly into the pipelines at use sites.This PR removes the
$facet
stage, and instead feeds variable sets into the start of the pipeline using a$documents
stage which feeds into a$lookup
stage that contains the query pipeline as a sub-pipeline. So we only have one copy of the query pipeline. Instead if interpolating variable values into that pipeline, we use MongoDB variables to reference values from the$documents
stage.This change has two big benefits:
$lookup
instead of$facet
allows the use of indexes/explain
requests would fail if the connector is the target of a remote joinThe downside is that in my testing use of a
$documents
stage fails in MongoDB 5 with this error:We're investigating this issue to see if it can be fixed. For this PR I have skipped integration tests involving variable sets when testing MongoDB 5. (The tests do run in MongoDB versions 6 & 7.)
A couple of considerations came up while making the changes for this PR. The biggest is that since we are now injecting variables in the first stage of the pipeline we need to know types for those variables up front to apply JSON-to-BSON conversions. To handle that I updated the
plan_for_query_request
state struct to record types for each use of each variable, and i added avariable_types
map to theQueryPlan
struct. It is possible for one variable to appear in multiple contexts in the query with different types. Previously we did the BSON conversion at each use site, so we would convert to BSON differently at each site as appropriate for the expected type. Moving variable serialization to a central point requires either an attempt at type unification, or setting separate values for each variable for each distinct type context. I opted for the latter. So the variables written into the aggregation pipeline have names with type suffixes to distinguish different possible BSON serializations.The other consideration was that the pipeline change results in a different response shape. So I also had to touch response serialization. Thankfully the new response serialization came out a little simpler than it used to be.
MDB-113