
add option to skip rows on response type mismatch #162


Merged
41 changes: 37 additions & 4 deletions CHANGELOG.md
@@ -6,10 +6,47 @@ This changelog documents the changes between release versions.

### Added

- Add option to skip rows on response type mismatch ([#162](https://github.com/hasura/ndc-mongodb/pull/162))

### Changed

### Fixed

### Option to skip rows on response type mismatch

When sending response data for a query, if we encounter a value that does not match the type declared in the connector
schema, the default behavior is to respond with an error. That prevents the user from getting any data. This change adds
an option to silently skip rows that contain type mismatches so that the user can get a partial set of result data.

This can come up if, for example, you have database documents with a field that nearly always contains an `int` value,
but in a handful of cases that field contains a `string`. Introspection may determine that the type of the field is
`int` if the random document sampling does not happen to check one of the documents with a `string`. Then when you run
a query that _does_ read one of those documents, the query fails because the connector refuses to return a value of an
unexpected type.
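
For illustration, a collection shaped like the following could produce this situation (these are hypothetical documents, built with the `doc!` macro from the `mongodb` crate): random sampling is likely to see only the `int` values and infer `count: int`, while the third document later trips up serialization.

```rust
use mongodb::bson::doc;

fn main() {
    // Hypothetical documents: `count` is almost always an int, but one
    // document stores a string. A random sample may never include the
    // string, so introspection infers the field type as `int`.
    let docs = vec![
        doc! { "_id": 1, "count": 42 },
        doc! { "_id": 2, "count": 7 },
        doc! { "_id": 3, "count": "unknown" }, // the document that breaks the query
    ];
    for d in &docs {
        println!("{d}");
    }
}
```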

The new option, `onResponseTypeMismatch`, has two possible values: `fail` (the existing, default behavior) or `skipRow`
(the new, opt-in behavior). If you set the option to `skipRow` in the example case above, the connector will silently
exclude documents with unexpected `string` values from the response. This gives you access to the "good" data. The
behavior is opt-in because we don't want to exclude data without users being aware that it might be happening.

The option is set in connector configuration in `configuration.json`. Here is an example configuration:

```json
{
  "introspectionOptions": {
    "sampleSize": 1000,
    "noValidatorSchema": false,
    "allSchemaNullable": false
  },
  "serializationOptions": {
    "extendedJsonMode": "relaxed",
    "onResponseTypeMismatch": "skipRow"
  }
}
```

The `skipRow` behavior does not affect aggregations or queries that do not request the field with the unexpected type.
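
To make the two behaviors concrete, here is a minimal, self-contained sketch of the policy using hypothetical types; it is not the connector's actual serialization code, but it shows how the same rows either fail the whole query or are filtered down to the well-typed ones:

```rust
#[derive(Clone, Copy)]
enum OnResponseTypeMismatch {
    Fail,
    SkipRow,
}

#[derive(Clone, Debug)]
enum Value {
    Int(i64),
    Str(String),
}

/// Serialize rows whose field is declared as `int` in the schema, applying the
/// configured policy when a row holds a value of the wrong type.
fn serialize_rows(rows: &[Value], policy: OnResponseTypeMismatch) -> Result<Vec<i64>, String> {
    let mut out = Vec::new();
    for value in rows {
        match value {
            Value::Int(n) => out.push(*n),
            Value::Str(s) => match policy {
                OnResponseTypeMismatch::Fail => {
                    return Err(format!("expected int, found string {s:?}"));
                }
                OnResponseTypeMismatch::SkipRow => continue, // silently drop this row
            },
        }
    }
    Ok(out)
}

fn main() {
    let rows = vec![Value::Int(42), Value::Str("unknown".into()), Value::Int(7)];
    // fail: one bad row makes the whole query error out
    assert!(serialize_rows(&rows, OnResponseTypeMismatch::Fail).is_err());
    // skipRow: the bad row is dropped and the rest is returned
    assert_eq!(serialize_rows(&rows, OnResponseTypeMismatch::SkipRow).unwrap(), vec![42, 7]);
}
```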

## [1.7.2] - 2025-04-16

### Fixed
@@ -22,10 +22,6 @@ This changelog documents the changes between release versions.

- Add watch command while initializing metadata ([#157](https://github.com/hasura/ndc-mongodb/pull/157))

### Changed

### Fixed

## [1.7.0] - 2025-03-10

### Added
20 changes: 20 additions & 0 deletions crates/configuration/src/configuration.rs
@@ -244,6 +244,26 @@ pub struct ConfigurationSerializationOptions {
    /// used for output. This setting has no effect on inputs (query arguments, etc.).
    #[serde(default)]
    pub extended_json_mode: ExtendedJsonMode,

    /// When sending response data the connector may encounter data in a field that does not match
    /// the type declared for that field in the connector schema. This option specifies what the
    /// connector should do in this situation.
    #[serde(default)]
    pub on_response_type_mismatch: OnResponseTypeMismatch,
}

/// Options for connector behavior on encountering a type mismatch between query response data, and
/// declared types in schema.
#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, Deserialize, Serialize)]
#[serde(rename_all = "camelCase")]
pub enum OnResponseTypeMismatch {
    /// On a type mismatch, send an error instead of response data. Fails the entire query.
    #[default]
    Fail,

    /// If any field in a response row contains data of an incorrect type, exclude that row from
    /// the response.
    SkipRow,
}

fn merge_object_types<'a>(
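
As a quick check of how the serde attributes above behave, here is a self-contained sketch (assuming `serde` and `serde_json` as dependencies) that mirrors the new enum and shows the camelCase mapping: `"skipRow"` deserializes to `SkipRow`, and omitting the field falls back to the default, `Fail`.

```rust
use serde::{Deserialize, Serialize};

#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, Deserialize, Serialize)]
#[serde(rename_all = "camelCase")]
enum OnResponseTypeMismatch {
    #[default]
    Fail,
    SkipRow,
}

#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase")]
struct SerializationOptions {
    #[serde(default)]
    on_response_type_mismatch: OnResponseTypeMismatch,
}

fn main() {
    // "skipRow" maps to the SkipRow variant thanks to rename_all = "camelCase".
    let opts: SerializationOptions =
        serde_json::from_str(r#"{ "onResponseTypeMismatch": "skipRow" }"#).unwrap();
    assert_eq!(opts.on_response_type_mismatch, OnResponseTypeMismatch::SkipRow);

    // Leaving the field out falls back to the default variant, Fail.
    let defaults: SerializationOptions = serde_json::from_str("{}").unwrap();
    assert_eq!(defaults.on_response_type_mismatch, OnResponseTypeMismatch::Fail);
}
```
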
5 changes: 4 additions & 1 deletion crates/configuration/src/lib.rs
@@ -7,7 +7,10 @@ pub mod schema;
pub mod serialized;
mod with_name;

pub use crate::configuration::Configuration;
pub use crate::configuration::{
    Configuration, ConfigurationIntrospectionOptions, ConfigurationOptions,
    ConfigurationSerializationOptions, OnResponseTypeMismatch,
};
pub use crate::directory::parse_configuration_options_file;
pub use crate::directory::read_existing_schemas;
pub use crate::directory::write_schema_directory;
7 changes: 4 additions & 3 deletions crates/mongodb-agent-common/src/mongo_query_plan/mod.rs
@@ -1,9 +1,10 @@
use std::collections::BTreeMap;

use configuration::ConfigurationSerializationOptions;
use configuration::{
    native_mutation::NativeMutation, native_query::NativeQuery, Configuration, MongoScalarType,
};
use mongodb_support::{ExtendedJsonMode, EXTENDED_JSON_TYPE_NAME};
use mongodb_support::EXTENDED_JSON_TYPE_NAME;
use ndc_models as ndc;
use ndc_query_plan::{ConnectorTypes, QueryContext, QueryPlanError};

@@ -15,8 +16,8 @@ use crate::scalar_types_capabilities::SCALAR_TYPES;
pub struct MongoConfiguration(pub Configuration);

impl MongoConfiguration {
    pub fn extended_json_mode(&self) -> ExtendedJsonMode {
        self.0.options.serialization_options.extended_json_mode
    pub fn serialization_options(&self) -> &ConfigurationSerializationOptions {
        &self.0.options.serialization_options
    }

    pub fn native_queries(&self) -> &BTreeMap<ndc::FunctionName, NativeQuery> {
@@ -33,7 +33,8 @@ pub async fn execute_query_request(
    tracing::debug!(?query_plan, "abstract query plan");
    let pipeline = pipeline_for_query_request(config, &query_plan)?;
    let documents = execute_query_pipeline(database, config, &query_plan, pipeline).await?;
    let response = serialize_query_response(config.extended_json_mode(), &query_plan, documents)?;
    let response =
        serialize_query_response(config.serialization_options(), &query_plan, documents)?;
    Ok(response)
}
