-
Notifications
You must be signed in to change notification settings - Fork 457
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(pageserver): use json for timeline metadata encoding #7663
Conversation
3024 tests run: 2892 passed, 0 failed, 132 skipped (full report)Flaky tests (4)Postgres 15
Postgres 14
Code coverage* (full report)
* collected from Rust tests only The comment gets automatically updated with the latest test results
2682a32 at 2024-05-09T20:37:21.601Z :recycle: |
Signed-off-by: Alex Chi Z <chi@neon.tech>
9970d45
to
5418e80
Compare
can we somehow avoid that? It would be great if it were a native json object instead of inside a string inside the json. |
pageserver/src/tenant/metadata.rs
Outdated
@@ -175,17 +203,18 @@ impl TimelineMetadata { | |||
} | |||
|
|||
pub fn to_bytes(&self) -> Result<Vec<u8>, SerializeError> { | |||
let body_bytes = self.body.ser()?; | |||
let body_bytes = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For forward compatibility, can we write the old format for timelines that have not switched to aux files v2 yet? That way we can retain the option to rollback the next release, but also not have to wait another week to roll out the write path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code logic to implement this is like: if aux file v2 is enabled, decode with JSON; however, we cannot know whether it's enabled before decoding the metadata body. It's kind of egg-and-chicken problem.
The alternative is to maintain two versions of timeline metadata here and do not automatically upgrade the timeline metadata version, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... which means that aux v2 enabled -> metadata v3, aux v2 disabled -> metadata v2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in 396b9cf
Signed-off-by: Alex Chi Z <chi@neon.tech>
@arpad-m My current concern is that |
Hmm yeah there is no "normal forms" of json that I know of that one could use for obtaining the reliability one needs to compute checksums, nor is there some other standardized checksumming algorithm. The algorithms that do employ checksums or signatures of json objects usually just compute it on the string representation, but then one has to include it as json string obviously. Edit: JWT for example |
note that forward compatibility test passed, which means something is going wrong with the test case itself |
maybe related: c20a379? |
3dcb191
to
03038da
Compare
forward compat broke as expected after the revert 03038da |
03038da
to
396b9cf
Compare
Now we write the old binary format by default and do not automatically upgrade to the new JSON format unless aux file flag is set (aux file flag is part of the next PR in the queue). Ready for review, |
/// the in-memory representation of the data. If the header format is the old one, the | ||
/// body will be serialized using bincode. Otherwise, it will be serialized using JSON, | ||
/// which is the default method for bodyv3. | ||
const METADATA_FORMAT_VERSION: u16 = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make an enum for these format versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Don't have to do it in this PR, just general question)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, I agree -- with more versions in the future, there should be an enum to map from FORMAT_VERSION_VX to a number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how this change will be used, but I don't think we should change TimelineMetadata at all -- we have index_part.json
where we should also render this as plain json.
We can never let go of reading the old serialization format because of detached tenants, but at least I cannot see a reason from this PR to keep using it in favor of modifying only IndexPart
. I'll fire up an alternative for real json format, but in the meantime, I'd like to see some more justification for these changes.
The PR which adds support for interpreting the index_part.json
field as the current format (before this PR) or plain json (in future): #7693. The "next weeks" follow-up to stop using Vec<u8>
or bincode in json format completely: #7699.
To reiterate from those PR's:
- I don't think we should add any data to
TimelineMetadata
- Instead add all fields to
index_part.json
where we already have a versioning/migration story
- Instead add all fields to
- Current
TimelineMetadata
will be needed until we've confirmed that there are no longerindex_part.json
out there using the old format- Until then, we shouldn't change
TimelineMetadata
- Until then, we shouldn't change
replaced by #7693 |
Problem
close #7540
Summary of changes
Change the encoding of timeline metadata to JSON. Note that the index_part still uses a weird representation that:
but using JSON at least gives us easy forward compatibility that serde_json ignores unknown fields, and new fields can be defined as
Option<X>
so that forward/backward compatibility is guaranteed.Checklist before requesting a review
Checklist before merging