add native queries, functions or virtual collections defined by pipelines #45

hallettj · 2024-04-18T21:51:35Z

Describe your changes

Implements native queries, which are defined as aggregation pipelines. If the root target of a query request is a native query, its pipeline forms the start of the overall query plan pipeline. In that case the query is executed as a MongoDB aggregate command with no target collection, as opposed to our other queries, which run an aggregate command against a target collection.
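As a rough sketch of the planning rule described above (not the connector's actual code): stages are modeled as plain strings rather than BSON documents, and `QueryTarget`, `AggregatePlan`, and `plan_query` are hypothetical names made up for illustration.

```rust
/// Hypothetical stand-in for a pipeline stage; the real connector works with BSON documents.
type Stage = String;

/// Where a query request is rooted (illustrative names, not the connector's types).
#[derive(Debug, Clone, PartialEq)]
enum QueryTarget {
    /// A regular collection: the aggregate command runs against that collection.
    Collection { name: String },
    /// A native query: its configured pipeline supplies the initial stages.
    NativeQuery { pipeline: Vec<Stage> },
}

/// The pieces needed to issue an aggregate command.
#[derive(Debug, PartialEq)]
struct AggregatePlan {
    /// `None` means an aggregate command with no target collection.
    collection: Option<String>,
    pipeline: Vec<Stage>,
}

/// Build the plan: a native query's pipeline comes first, followed by the
/// stages derived from the rest of the query request.
fn plan_query(target: &QueryTarget, request_stages: Vec<Stage>) -> AggregatePlan {
    match target {
        QueryTarget::Collection { name } => AggregatePlan {
            collection: Some(name.clone()),
            pipeline: request_stages,
        },
        QueryTarget::NativeQuery { pipeline } => {
            let mut stages = pipeline.clone();
            stages.extend(request_stages);
            AggregatePlan {
                collection: None,
                pipeline: stages,
            }
        }
    }
}
```

The only difference between the two branches is whether a collection is named and whether the configured stages are prepended; everything after that point in the plan is shared.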

Native queries currently cannot be the target of a relation.

There is a really basic native query in fixtures to test with. If you run services with arion up -d you can see it as a query field called hello.

The changes were going to result in a large-ish amount of very similar but technically incompatible code for converting configuration to ndc types, both for schema responses and for processing query requests. To avoid that I pushed configuration processing into the configuration crate. This makes it easier to share that logic, moves a bunch of errors from connector runtime to configuration parsing time, and moves computation to connector startup time instead of response-handling time. This resulted in a bunch of changes:

- The MongoConfig type is gone. Instead, configuration-related data is kept in Configuration, which is now passed directly to mongodb-agent-common functions. Database-connection data is put in a new type, ConnectorState. So now we have properly separated configuration and state types, which is what the ndc-sdk API expects.
- The configuration crate has a new dependency on ndc-models. We need to keep the ndc-models version matched with ndc-sdk. To make that easier I moved configuration for those dependencies to the workspace Cargo.toml and added a note.
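To make the configuration/state split concrete, here is a minimal sketch under stated assumptions. The struct fields, the `validate` signature, and `describe_query` are all hypothetical and much simpler than the real crates. The point it illustrates is that validation runs once at startup, so request handlers only ever see a checked `Configuration` alongside a separate `ConnectorState`.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Simplified, hypothetical native query definition.
#[derive(Debug, Clone)]
struct NativeQuery {
    result_document_type: String,
    pipeline: Vec<String>,
}

/// Validated configuration: built once at startup, read-only afterwards.
#[derive(Debug)]
struct Configuration {
    native_queries: BTreeMap<String, NativeQuery>,
}

/// Connection-related state, kept apart from configuration.
/// A real connector would hold a database client here; this is a placeholder.
#[derive(Debug)]
struct ConnectorState {
    database_name: String,
}

impl Configuration {
    /// Check every native query against the known object types and report all
    /// problems at configuration parsing time instead of at request time.
    fn validate(
        known_object_types: &BTreeSet<String>,
        native_queries: BTreeMap<String, NativeQuery>,
    ) -> Result<Configuration, Vec<String>> {
        let errors: Vec<String> = native_queries
            .iter()
            .filter(|(_, query)| !known_object_types.contains(&query.result_document_type))
            .map(|(name, query)| {
                format!(
                    "native query {} references unknown type {}",
                    name, query.result_document_type
                )
            })
            .collect();
        if errors.is_empty() {
            Ok(Configuration { native_queries })
        } else {
            Err(errors)
        }
    }
}

/// A request-time function takes both values but never re-validates.
fn describe_query(config: &Configuration, state: &ConnectorState, name: &str) -> Option<String> {
    let query = config.native_queries.get(name)?;
    Some(format!(
        "{}: {} stage(s) -> {}",
        state.database_name,
        query.pipeline.len(),
        query.result_document_type
    ))
}
```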

Issue ticket number and link

MDB-103

dmoverton · 2024-04-19T06:30:05Z

crates/configuration/src/configuration.rs

+    /// responses they are separate concepts. So we want a set of [CollectionInfo] values for
+    /// functions for query processing, and we want it separate from `collections` for the schema
+    /// response.
+    pub function_collection_infos: BTreeMap<String, ndc::CollectionInfo>,


Will functions and function_collection_infos have the same set of keys?
If so, would it be better to represent this as a BTreeMap<String, (ndc::FunctionInfo, ndc::CollectionInfo)>?

That does make sense. But I think it's a matter of style. There isn't any case where we want to look up both FunctionInfo and CollectionInfo - the schema handler wants one, the query handler wants the other. We shouldn't be constructing Configuration values without going through validate except in tests, so I'm not concerned about accidentally creating an invalid state.

Ok I made this change. It does make the validate function simpler. But I'd rather not combine the procedure maps since I like keeping the ndc stuff somewhat separated from the mongodb-specific stuff.
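For illustration, the combined representation discussed in this thread might look roughly like the following. `FunctionInfo` and `CollectionInfo` are placeholder structs standing in for the ndc-models types, and the accessor names are hypothetical. Keeping both values under one key means the two sets of keys cannot drift apart, while each handler still reads only the half it needs.

```rust
use std::collections::BTreeMap;

/// Placeholder for `ndc::FunctionInfo` (the real type lives in ndc-models).
#[derive(Debug, Clone, PartialEq)]
struct FunctionInfo {
    name: String,
}

/// Placeholder for `ndc::CollectionInfo`.
#[derive(Debug, Clone, PartialEq)]
struct CollectionInfo {
    name: String,
}

/// One map keyed by function name, holding both views of the same function.
struct Configuration {
    functions: BTreeMap<String, (FunctionInfo, CollectionInfo)>,
}

impl Configuration {
    /// What a schema handler would read: only the function descriptions.
    fn function_infos(&self) -> impl Iterator<Item = &FunctionInfo> {
        self.functions.values().map(|(function, _)| function)
    }

    /// What a query handler would read: the collection-shaped view of one function.
    fn function_collection_info(&self, name: &str) -> Option<&CollectionInfo> {
        self.functions.get(name).map(|(_, collection)| collection)
    }
}
```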

dmoverton · 2024-04-19T06:37:15Z

crates/configuration/src/configuration.rs


    /// Native procedures allow arbitrary MongoDB commands where types of results are
    /// specified via user configuration.
-    #[serde(default, skip_serializing_if = "BTreeMap::is_empty")]
    pub native_procedures: BTreeMap<String, NativeProcedure>,


Likewise, could procedures and native_procedures be represented as a single map BTreeMap<String, (ndc::ProcedureInfo, NativeProcedure)>?

Yep that makes sense. But again I think it's a matter of style since there are no cases where we want both values together.

dmoverton · 2024-04-19T07:07:26Z

crates/configuration/src/serialized/native_procedure.rs

+    /// be merged with the definitions in `schema.json`. This allows you to maintain hand-written
+    /// types for native procedures without having to edit a generated `schema.json` file.
+    #[serde(default, skip_serializing_if = "BTreeMap::is_empty")]
+    pub object_types: BTreeMap<String, ObjectType>,


I'm wondering whether all user-defined object types should be in one place, rather than each native procedure/query having their own. That would make it easier to re-use types between procedures/queries. It would also provide a place where users could override type definitions inferred by the schema inference.

Yeah I think you make a good point. I think the way I have it set up has a convenience advantage because you can define single-use types in the same place they're used. But it might be more confusing. @codedmart already ran into this trying to write a procedure that reuses a collection object type.

I think this is a good topic for discussion. I think it's outside the scope of this PR since I already established this convention in native procedures. But there's still time to change it in another ticket.
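To make the trade-off concrete, here is a minimal sketch of merging per-definition object types into the types from `schema.json`. The `ObjectType` struct and `merge_object_types` function are hypothetical, and rejecting duplicate names is an assumption made for this sketch, not something this PR states about the connector's behavior.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified object type: just a list of field names.
#[derive(Debug, Clone, PartialEq)]
struct ObjectType {
    fields: Vec<String>,
}

/// Merge object types declared alongside individual native procedures/queries
/// into the types from the generated schema. Assumption for this sketch: a
/// name defined more than once is reported as an error rather than overridden.
fn merge_object_types(
    schema_types: BTreeMap<String, ObjectType>,
    per_definition_types: Vec<BTreeMap<String, ObjectType>>,
) -> Result<BTreeMap<String, ObjectType>, Vec<String>> {
    let mut merged = schema_types;
    let mut duplicates = Vec::new();
    for types in per_definition_types {
        for (name, object_type) in types {
            if merged.contains_key(&name) {
                duplicates.push(name);
            } else {
                merged.insert(name, object_type);
            }
        }
    }
    if duplicates.is_empty() {
        Ok(merged)
    } else {
        Err(duplicates)
    }
}
```

Under this scheme every type ends up in one namespace anyway, which is part of why a single shared location for user-defined types is worth discussing.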

hallettj added 30 commits April 5, 2024 18:05

configuration for native queries

c3fa8bb

update tests to accept database in query executor instead of collection

d000d2f

begin query plan with top-level native query

4c6269b

update fixtures with a basic native query

4e86df9

give query request conversion access to function result types

2a14c9b

update query conversion tests

fab2329

wip:

80877fd

Merge branch 'main' into jesse/native-queries-using-pipelines

29dd28f

rename test helper

03207c4

log query request target

8ec01c3

interpolate native query arguments

63e42c3

make QueryConfig implement Copy

190feed

update fixture

79c1efa

unit test for native query

153ec52

move native query fixture to default connector config

1e9c769

avoid crate rebuilds when fixtures change

4c12875

support representations as collections or as functions

6aa94b6

include native queries in schema response

61a48eb

push configuration processing into configuration crate

5705911

pass around Configuration instead of MongoConfig

006e89c

wip: pass around Configuration instead of MongoConfig

9979b38

wip: remove dead schema code from common, config type changes

9203c55

wip: QueryContext copies from Configuration

efea08f

wip: match ndc lib revs

df6a388

Merge branch 'main' into jesse/native-queries-using-pipelines

532c899

match ndc-models tag from sdk

6878b0d

wip: configuration types

9dd80fb

configuration changes compile - next up, tests

afd0a97

update tests for type changes

e2c4cfa

lint fixes

8f30bfd

hallettj requested review from daniel-chambers and dmoverton April 18, 2024 21:51

hallettj self-assigned this Apr 18, 2024

hallettj added 6 commits April 18, 2024 14:55

fix native query fixture

c088dcb

we need to preserve function object types for query request processing

91af19d

remove debugging output

6e4bcf9

add missing list item dash in error message

be8cbcd

change field name from "type" to "result_document_type"

db180e8

oh wait - for consistency that should be camelCase

9db1014

dmoverton reviewed Apr 19, 2024

dmoverton reviewed Apr 19, 2024

hallettj added 2 commits April 19, 2024 09:24

remove commented-out code

081adac

combine FunctionInfos and CollectionInfos into one map

cc1f467

codedmart approved these changes Apr 19, 2024

hallettj merged commit 2067659 into main Apr 19, 2024

hallettj deleted the jesse/native-queries-using-pipelines branch April 19, 2024 21:14
