-
Notifications
You must be signed in to change notification settings - Fork 3
add uuid scalar type #148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add uuid scalar type #148
Conversation
pub fn is_supertype(a: &BsonScalarType, b: &BsonScalarType) -> bool { | ||
matches!((a, b), (Double, Int)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this to a method on BsonScalarType
S::Date => convert_date(&from_string(expected_type, value)?)?, | ||
S::Timestamp => deserialize::<json_formats::Timestamp>(expected_type, value)?.into(), | ||
S::BinData => deserialize::<json_formats::BinData>(expected_type, value)?.into(), | ||
S::UUID => convert_uuid(&from_string(expected_type, value)?)?, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry about the churn in this function. This is the only added line. The logic in other lines is unchanged.
JsonSchema, | ||
)] | ||
#[serde(try_from = "BsonType", rename_all = "camelCase")] | ||
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, Sequence, JsonSchema)] | ||
pub enum BsonScalarType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We serialize BSON scalar types using the names in bson_name()
. But so for this has been implicit - the serde serialization and the bson_name
function didn't reference each other. The camelCase
renaming just happened to match the strings in bson_name
.
I changed this now because serde camel-cases UUID
to uUID
which I think is wrong. I replaced the derived Serialize
and Deserialize
implementations with hand-written implementations that call bson_name
and from_bson_name
.
I also removed the try_from = "BsonType"
part. That has a hack for parsing collection validators. In the schema::Property::Scalar
variant we declare the bson_type
field as BsonScalarType
, but actually it can be an array type. So we need to parse that field as a BsonType
, and attempt to convert it to a scalar type. Instead of doing that for all BsonScalarType
deserializations I moved that logic into the schema
module.
/// If there is a BSON scalar type that encompasses both a and b, return it. This does not | ||
/// require a and to overlap. The returned type may be equal to a or b if one is a supertype of | ||
/// the other. | ||
pub fn common_supertype(a: BsonScalarType, b: BsonScalarType) -> Option<BsonScalarType> { | ||
fn helper(a: BsonScalarType, b: BsonScalarType) -> Option<BsonScalarType> { | ||
if a == b { | ||
Some(a) | ||
} else if a.is_binary() && b.is_binary() { | ||
Some(S::BinData) | ||
} else { | ||
match (a, b) { | ||
(S::Double, S::Int) => Some(S::Double), | ||
_ => None, | ||
} | ||
} | ||
} | ||
helper(a, b).or_else(|| helper(b, a)) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
common_supertype
is more general than is_supertype
: it can provide a common ancestor type given two types that don't overlap, or don't fully overlap with each other. This is a bit of future-proofing in case we add more binary subtype scalar types. For example if a field is found to have UUID
and also Vector
values we want to unify those to BinData
.
} | ||
} | ||
|
||
/// True iff we consider a to be a supertype of b. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit typo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not a typo - "iff" is short for "if and only if"
…uuids-serialized-as-strings
UUIDs are stored in BSON using the
binData
type which we already support. But thebinData
type is not especially user friendly to work with in its generalized form, and UUIDs are a common use case. It is helpful to surface UUIDs as a first-class scalar type that serializes to a string.The
binData
type has a numericsubType
field to give a hint as to how the stored binary data should be interpreted. There are two subtypes for UUIDs, 3 & 4, but subtype 4 is indicated as the "current" representation for UUIDs. TheUUID()
constructor inmongosh
producesbinData
values with subtype 4.This change:
UUID
UUID
toString
UUID
values to BSON asbinData
values with subtype 4 - inputs are parsed from string representations using functions provided by the BSON library that is bundled with the MongoDB Rust driverUUID
values to JSON as strings using provided BSON library functions - for example,4ca4b7e7-6f6a-445b-b142-9d6d252d92bc
UUID
when encountering fields withbinData
values with subtype 4BinData
instead for fields where subtype 4 occurs alongsidebinData
values with other subtypesThis allows for more user-friendly queries involving UUIDs. For example filtering by UUID changed from this:
and now looks like this: