Speed up BSON unmarshalling #410

Closed
wants to merge 3 commits into from


craiggwilson

Currently, when unmarshalling into a struct that is missing a field, or into bson.Raw, the unmarshaller reads the parts that should be skipped into blackHole. This lets the entire document be checked for corruption, at the expense of speed. For bson.Raw in particular, it also means that every document is unmarshalled twice: once when it is first read, and again when the bson.Raw is unmarshalled into its final form. This PR stops doing that. It verifies each element that is being skipped but doesn't descend into it. This is particularly relevant for containers like arrays and documents.
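To illustrate the idea of verifying a skipped element without descending into it, here is a minimal sketch. It is not the code in this PR: `skipValue` and `errCorrupt` are hypothetical names, only a handful of element kinds are handled, and the checks assume the standard BSON layout (little-endian int32 length prefixes, with documents, arrays and strings ending in a NUL byte).

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// errCorrupt is a hypothetical error value used only by this sketch.
var errCorrupt = errors.New("corrupt BSON document")

// fixed checks that at least n bytes are available for a fixed-size value.
func fixed(data []byte, n int) (int, error) {
	if len(data) < n {
		return 0, errCorrupt
	}
	return n, nil
}

// skipValue reports how many bytes the value of the given element kind
// occupies at the start of data. Containers are validated only by their
// length prefix and trailing NUL; their contents are never decoded.
func skipValue(kind byte, data []byte) (int, error) {
	switch kind {
	case 0x01, 0x09, 0x11, 0x12: // double, datetime, timestamp, int64
		return fixed(data, 8)
	case 0x10: // int32
		return fixed(data, 4)
	case 0x08: // bool
		return fixed(data, 1)
	case 0x0A: // null
		return 0, nil
	case 0x07: // ObjectId
		return fixed(data, 12)
	case 0x02: // string: int32 byte count (including trailing NUL), then bytes
		if len(data) < 4 {
			return 0, errCorrupt
		}
		n := int(int32(binary.LittleEndian.Uint32(data)))
		if n < 1 || n > len(data)-4 || data[4+n-1] != 0 {
			return 0, errCorrupt
		}
		return 4 + n, nil
	case 0x03, 0x04: // document, array: int32 total size includes prefix and NUL
		if len(data) < 4 {
			return 0, errCorrupt
		}
		n := int(int32(binary.LittleEndian.Uint32(data)))
		if n < 5 || n > len(data) || data[n-1] != 0 {
			return 0, errCorrupt
		}
		return n, nil
	}
	return 0, fmt.Errorf("element kind 0x%02x is not handled in this sketch", kind)
}
```

Skipping a nested document this way costs a length read and two bounds checks regardless of how large the document is. The flip side is that a malformed element inside the container goes unnoticed until something actually decodes it.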

The effect is a massive speedup (I've measured up to 6x), depending on the complexity of the documents, when using commands that return cursors as arrays, namely the new find command and the aggregate command. The downside is that a corruption error surfaces later in the program than it used to, and may never surface at all if a field is ignored or a bson.Raw is never ultimately unmarshalled. I feel these are acceptable trade-offs.

As part of verifying this, I've implemented the entire bson_corpus as generated code, which is checked in (so as long as the corpus doesn't change, there is no need to regenerate it).

@fmpwizard
Contributor

If you close and reopen the PR, Travis will rerun this build and, thanks to #462, your PR should pass all tests on all MongoDB versions.

@craiggwilson
Author

oh, probably need to rebase...
