-
Notifications
You must be signed in to change notification settings - Fork 144
RUST-871 Support direct serialization to BSON bytes #279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RUST-871 Support direct serialization to BSON bytes #279
Conversation
@@ -92,16 +92,6 @@ functions: | |||
${PREPARE_SHELL} | |||
.evergreen/run-tests-u2i.sh | |||
|
|||
"run serde tests": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than the serde tests being their own task in evergreen, I updated each of the other tasks to run it too with their associated feature flags on.
@@ -83,6 +89,24 @@ fn write_f128<W: Write + ?Sized>(writer: &mut W, val: Decimal128) -> Result<()> | |||
writer.write_all(&raw).map_err(From::from) | |||
} | |||
|
|||
#[inline] | |||
fn write_binary<W: Write>(mut writer: W, bytes: &[u8], subtype: BinarySubtype) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is just copied from below
@@ -0,0 +1,458 @@ | |||
mod document_serializer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend starting the review in this file and then branching out to the others as you encounter them here.
} | ||
|
||
fn serialize_struct(self, name: &'static str, _len: usize) -> Result<Self::SerializeStruct> { | ||
let value_type = match name { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BSON types which don't fit into serde's data model (e.g. ObjectId
) are serialized as structs matching the extended JSON format for that type. The ValueSerializer
is then in charge of interpreting the extended JSON format and producing actual BSON from it. The regular serializer works this way too, but it goes from Bson
to Bson
rather than serde data model -> BSON.
The "name" of the struct is used to incidate which BSON type, and the values are basically just the BSON type's extended JSON wrapper name. The only divergence is we also use $codeWithScope
to differentiate from just a regular $code
with no scope.
If the name doesn't match any of the special strings, the struct will just be serialized as a document.
@@ -698,6 +698,20 @@ impl Bson { | |||
} | |||
} | |||
|
|||
["$numberDecimalBytes"] => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unintuitively, this function is used in the existing serializer to convert between the intermediate serialized form (i.e. and "extended document") and the final serialized form. Now that decimal128 can serialize this way, this needed to be updated to interpret it.
|
||
/// A serializer used specifically for serializing the serde-data-model form of a BSON type (e.g. | ||
/// `Binary`) to raw bytes. | ||
pub(crate) struct ValueSerializer<'a> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This type is used to interpret any extended JSON formatted types in serde's data model and serialize them to actual BSON. It keeps track of where it's at via a state machine.
type Ok = (); | ||
type Error = Error; | ||
|
||
fn serialize_field<T: ?Sized>(&mut self, key: &'static str, value: &T) -> Result<()> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method handles all the state transitions.
@@ -551,8 +607,12 @@ impl Serialize for Decimal128 { | |||
where | |||
S: ser::Serializer, | |||
{ | |||
let value = Bson::Decimal128(self.clone()); | |||
value.serialize(serializer) | |||
let mut state = serializer.serialize_struct("$numberDecimal", 1)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the decimal128
flag was enabled, this used to produce actual extended JSON format (i.e. { "$numberDecimal": <decimal as a string> }
), but per the bug we discussed in triage, the current implementation's to_string
is lossy. For that reason, it seems safer to produce this unique format that does round trip properly until we have a complete decimal128 implementation.
For users relying on the output of this, their code will break, but that's probably better than them silently losing data.
Also note that we classify the decimal128
feature flag as "experimental" and subject to breaking changes, so this is within our contract. Code that omits the flag will not break.
@@ -61,63 +61,95 @@ fn run_test(test: TestFile) { | |||
|
|||
let canonical_bson = hex::decode(&valid.canonical_bson).expect(&description); | |||
|
|||
let bson_to_native_cb = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and updated the naming scheme of the different things here. Let me know if this makes sense. I know the old one got the point of being unreadable to me at least, so hopefully this is better.
RUST-871
This PR introduces the
to_writer
andto_vec
methods, which can serialize a genericT
directly to BSON bytes without going throughDocument
. This should enable substantial performance improvements in the driver, as well as be generally useful in the BSON library.As part of testing this feature, I enabled
decimal128
serde testing on evergreen (RUST-446). decimal128 tests were originally added to evergreen in #279.