rework queries with variable sets so they use indexes #83

hallettj · 2024-06-21T19:21:43Z

Changes the aggregation pipelines that are produced for query requests that include variable sets. Previously we used a $facet stage with sub-pipelines, where each sub-pipeline was a copy of the query pipeline with different variable values interpolated. Variable values were written directly into the pipelines at use sites.

This PR removes the $facet stage. Instead, variable sets enter at the start of the pipeline through a $documents stage, which feeds into a $lookup stage that contains the query pipeline as a sub-pipeline. So we only have one copy of the query pipeline. Instead of interpolating variable values into that pipeline, we use MongoDB variables to reference values from the $documents stage.
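As a rough illustration of the pipeline shape described above, here is a minimal sketch in Python. It is not the connector's actual code: the helper name `build_variable_sets_pipeline`, the collection `albums`, the field `artistId`, the variable `artist_id`, and the output field `query` are all assumptions made up for the example. It also assumes every variable set has the same keys.

```python
def build_variable_sets_pipeline(variable_sets, collection, query_pipeline):
    """Hypothetical helper: one $documents stage feeding one $lookup stage."""
    # Collect the variable names to bind with `let`.
    names = sorted({name for variables in variable_sets for name in variables})
    return [
        # Each variable set becomes one input document.
        {"$documents": variable_sets},
        {
            "$lookup": {
                "from": collection,
                # Bind each document field to a MongoDB variable of the same name.
                "let": {name: f"${name}" for name in names},
                # A single copy of the query pipeline, run once per input document.
                "pipeline": query_pipeline,
                "as": "query",  # assumed output field name
            }
        },
    ]


# The sub-pipeline refers to the variable as $$artist_id instead of an inlined value.
query_pipeline = [{"$match": {"$expr": {"$eq": ["$artistId", "$$artist_id"]}}}]

pipeline = build_variable_sets_pipeline(
    [{"artist_id": 1}, {"artist_id": 2}], "albums", query_pipeline
)
empty_pipeline = build_variable_sets_pipeline([], "albums", query_pipeline)
```

With an empty list of variable sets the sketch still produces both stages, with `$documents` set to an empty array. A pipeline that begins with `$documents` is run as a database-level aggregate rather than against a collection.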

This change has two big benefits:

- using $lookup instead of $facet allows the use of indexes
- the new pipeline is valid even with zero variable sets, which fixes a bug where connector /explain requests would fail if the connector is the target of a remote join

The downside is that in my testing use of a $documents stage fails in MongoDB 5 with this error:

Kind: Command failed: Error code 5491300 (Location5491300): $documents' is not allowed in user requests, labels: {}

We're investigating this issue to see if it can be fixed. For this PR I have skipped integration tests involving variable sets when testing MongoDB 5. (The tests do run in MongoDB versions 6 & 7.)

A couple of considerations came up while making the changes for this PR. The biggest is that since we are now injecting variables in the first stage of the pipeline, we need to know the types of those variables up front to apply JSON-to-BSON conversions. To handle that I updated the plan_for_query_request state struct to record types for each use of each variable, and I added a variable_types map to the QueryPlan struct.

It is possible for one variable to appear in multiple contexts in the query with different types. Previously we did the BSON conversion at each use site, so we would convert to BSON differently at each site as appropriate for the expected type. Moving variable serialization to a central point requires either an attempt at type unification, or setting separate values for each variable for each distinct type context. I opted for the latter. So the variables written into the aggregation pipeline have names with type suffixes to distinguish different possible BSON serializations.
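To make the "separate value per type context" idea concrete, here is a small Python sketch. Everything in it is hypothetical: the naming scheme in `binding_name`, the `escape` rule, and the `CONVERTERS` table are stand-ins invented for illustration, not the connector's real escaping function or its JSON-to-BSON conversion. The sketch does not try to prove that generated names are collision-free.

```python
from datetime import datetime

# Hypothetical stand-ins for JSON-to-BSON conversion, keyed by type name.
CONVERTERS = {
    "int": int,
    "string": str,
    "date": datetime.fromisoformat,
}


def escape(text):
    # Keep ASCII letters and digits; hex-encode every other character so the
    # result only contains characters that are safe in a variable name.
    return "".join(
        ch if ch.isascii() and ch.isalnum() else f"_{ord(ch):x}_" for ch in text
    )


def binding_name(name, type_name):
    # Assumed scheme: a lowercase prefix, the escaped name, then a type suffix.
    return f"v_{escape(name)}__{escape(type_name)}"


def documents_for(variable_sets, variable_types):
    """Build one input document per variable set, with one field per (name, type)."""
    docs = []
    for variables in variable_sets:
        doc = {}
        for name, types in variable_types.items():
            for type_name in sorted(types):
                convert = CONVERTERS[type_name]
                doc[binding_name(name, type_name)] = convert(variables[name])
        docs.append(doc)
    return docs


# One variable used in two type contexts gets two differently serialized values.
variable_types = {"since": {"string", "date"}}
docs = documents_for([{"since": "2024-06-21"}], variable_types)
```

Here the single request variable `since` ends up as two fields in the input document, one holding a string and one holding a date, so each use site can reference the serialization that matches its expected type.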

The other consideration was that the pipeline change results in a different response shape. So I also had to touch response serialization. Thankfully the new response serialization came out a little simpler than it used to be.

MDB-113

dmoverton

Looks good!

hallettj added 18 commits June 19, 2024 13:51

create indexes in mongodb fixtures

3255766

capture expected types of variables

3301e30

map request variables to $documents stage, replace $facet with $lookup

251ea94

test variable name escaping function

55c0464

tests for query_variable_name

f240360

use escaping in variable function to make it infallible

e4aba4d

replace variable map lookups with mongodb variable references

b1906ee

some test updates, delegate to variable function

3a007ab

fix make_selector

8e4aced

run db.aggregate if query request has variable sets

68af4e1

update response serialization for change in foreach response shape

637f6d9

update one of the foreach unit tests

8079fe2

update some stale comments

1454296

handle responses with aggregates, update tests

6b0be71

handle aggregate responses without rows

45cfef7

Merge branch 'main' into jesse/rework-variable-sets

7e6e0f6

add test for binary comparison bug that I incidentally fixed

e317ab6

skip remote relationship integration tests in mongodb 5

ec60490

hallettj requested review from codedmart and dmoverton June 21, 2024 19:21

hallettj self-assigned this Jun 21, 2024

hallettj added 3 commits June 21, 2024 12:24

update changelog

624e794

note breaking change in changelog

0c088e4

change aggregate target in explain to match target in query

fb695ce

dmoverton approved these changes Jun 24, 2024

dmoverton merged commit 6a5e208 into main Jun 24, 2024
1 check passed

dmoverton deleted the jesse/rework-variable-sets branch June 24, 2024 22:59
