-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Serialize Message Metadata into Flatbuffers #32
Conversation
@josevalim This PR is still a draft... |
And the plan is looking good so far! |
I don't think I'm going to make any Erlang changes after this, so just as well I guess. :) |
This commit adds the implementation and tests to serialize a RecordBatch.
In addition to serializing schema fields, there have been some soft changes. In creating this commit, I've come to realise that Atoms are a bad thing to keep in Rust structs. So, in this commit, I've also changed Name and Dictionary to use enums to represent undefined instead of an atom.
6558c27
to
477734b
Compare
This commit serializes messages to their planus, or Rust form. It does not serialize it to proper flatbuffers yet.
This commit adds the final commit required to serialize message metadata to IPC. Note though, that this function has not been integrated with `serde_arrow_ipc_message:to_ipc` function.
@josevalim, There haven't been any Erlang changes since you last reviewed, but you still may want to take a look. |
pub type CustomMetadata = Vec<HashMap<String, String>>; | ||
|
||
/// Returns an example `Message` of a `Schema` | ||
pub fn schema() -> Message { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally don't like "utils" as a module name because it tends to become a blackhole of all things that don't quite have a place.
If I understood correctly, schema
and message_batch
a just dummy functions that will be replaced where they are currently called by actual code in the future.
If this is indeed the case, I think it would be best if they were prefixed with dummy_
and set in the message
module instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are functions which return testing data, and are used solely in testing.
Presently they are are used in the test_decode
function in crate
, and in various Rust specific tests.
Should I still prefix them with dummy
and move them or is keeping them in utils
or in a test_utils
good enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is even more indicative that this module should be something else :)
Maybe test/fixtures
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then you won't need to prefix. Not sure if this would work, but maybe calling it as:
use test::fixtures
...
fixtures::schema(...)
would be readable enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks sound good.
However, test_decode is not a Rust specific test - it's included in the NIF, and tested from Erlang which is why I'm not very sure on what to do with it.
This means that schema
and record_batch
are used out of Rust tests.
The functions prefixed arrow_
can probably go to test/fixtures
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should test_decode
actually be exposed in the NIF?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's a function to test if everything decodes properly into Erlang.
This PR only implements the encoding part of this NIF, and I will have to deal with decoding later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that function will be removed as soon as the decoding is implemented, right?
I believe it's fine to leave it where it is then :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've put the arrow_*
into a test::fixtures
module in lib.rs
. I hoping this is what you had in mind when you said test::fixtures
? I wasn't able to find a way to keep them in a file in the tests
dir and use them in unit tests in the src
dir.
Since these are test only functions, it's better to keep them in a test specific module.
Since the code in schema and record_batch are no longer dead, we need not allow dead code in the clippy checks.
@@ -49,10 +46,156 @@ mod atoms { | |||
} | |||
} | |||
|
|||
#[cfg(test)] | |||
pub mod test { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you could have this extracted to a test.rs
file in the src
directory. This is a non-blocker, though :)
This commit adds support to serialize message metadata into flatbuffers.