-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix array schema when time traveling #4549
Conversation
This pull request has been linked to Shortcut Story #35424: Schema evolution: time traveling should pick the correct schema.. |
@Shelnutt2 from left field! |
a8ca015
to
7566609
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing code solves the basic problem. It doesn't seem to account for all possible cases, though.
tiledb/sm/array/array_directory.cc
Outdated
for (auto& uri : array_schema_uris_) { | ||
auto name = uri.remove_trailing_slash().last_path_part(); | ||
|
||
// Skip the old schema URI name since it doesn't have timestamps | ||
if (name == constants::array_schema_filename) { | ||
continue; | ||
} | ||
|
||
std::pair<uint64_t, uint64_t> ts_range; | ||
throw_if_not_ok(utils::parse::get_timestamp_range(uri, &ts_range)); | ||
|
||
if (ts_range.second > latest_ts && ts_range.second <= timestamp_end_) { | ||
latest_uri = uri; | ||
latest_ts = ts_range.second; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you extract this loop to a function to somewhere in storage_format/
? It is, after all, parsing URIs, which is something we want to centralize there regardless.
This particular one caught my eye for another reason. We're going to be adding submillisecond sequence numbers to our basic timestamps. While it won't directly impinge on this particular code, we're going to want to make everything that uses timestamps generic to avoid bleeding internals of the new abstraction into every use of it.
No need to do more than extract it. We can rewrite it later, and if it's already moved, we won't have to do so in order to proceed with the work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you extract this loop to a function to somewhere in
storage_format/
? It is, after all, parsing URIs, which is something we want to centralize there regardless.
Not sure wha you mean here. I'm obviously calling a function to parse URIs, but I don't think that means this code is doing the parsing or belongs in storage_format
. I had contemplated putting this logic in ArrayDirectory::latest_array_schema_uri()
but I haven't got any clue how hot that function is. I could go back and make the latest_array_schema_uri_
and optional and then only invoke this logic when that member is nullopt
. However, given the logic happens exactly once I wasn't particularly concerned in just putting the logic in the one place its used given I don't see where else we'd apply it.
For the sub-millisecond sequences, I was under the impression that we were going to encode that into the UUID? Is there a design doc on the new approach?
No need to do more than extract it. We can rewrite it later, and if it's already moved, we won't have to do so in order to proceed with the work.
Why would we need to rewrite this? Because of the sequence numbers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sub-millisecond sequences, I was under the impression that we were going to encode that into the UUID
Yes, but that's not where the story ends. It's where it begins. What we're effectively doing is increasing our temporal resolution by not using just a clock. It's essentially a new temporal type that's going to end up everywhere in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we need to rewrite this? Because of the sequence numbers?
Exactly this.
The larger issue is that storage_format
involves any of (1) a file format, (2) a file name, or (3) directory contents.
tiledb/sm/array/array_directory.cc
Outdated
// Skip the old schema URI name since it doesn't have timestamps | ||
if (name == constants::array_schema_filename) { | ||
continue; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe there's a defect here that if there's a timestamp requested that precedes any of the files that includes times, it won't find anything. It should probably find this one, but it's unambiguously discarded.
Similarly, I find it odd that this code doesn't throw anywhere. If a timestamp is requested that precedes all available schema, it should fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe there's a defect here that if there's a timestamp requested that precedes any of the files that includes times, it won't find anything. It should probably find this one, but it's unambiguously discarded.
I'm guessing you missed this line?
Similarly, I find it odd that this code doesn't throw anywhere. If a timestamp is requested that precedes all available schema, it should fail.
Yeah, I started off thinking that as well but then I got to considering all of the different edge cases around our requirements for reading arrays that have a __array_schema.tdb
and I realized we can't blindly fail here since those arrays have an un-timestamped schema available.
The other thing to keep in mind is that we can time travel before the first schema was written and write fragments. This is done a lot in the test suite, but more importantly, its possible users have done this. So changing the array schema latest could theoretically break some users existing arrays if they're doing fun things like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can time travel before the first schema was written and write fragments
It's a defect that we allow this. Such data is fundamentally ambiguous, because it could have been written with any schema. Rather than double down on supporting defective behavior, I'd prefer that we fail to read such data with a "contact TileDB" message. We might be able to infer what the data means from the schema history, but that's a manual process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eric-hughes-tiledb this is out of scope for 2.19 and we need this PR in for the release per discussion with Isaiah. If we do what you suggest we are going to end up with a lot of support calls from customers as we have a lot of arrays out there that do this. Let’s file a follow up story for this an take more time to figure out how we can correct this moving forward.
7566609
to
127493f
Compare
We were using the latest array schema even while traveling. This changes the latest array schema selection when in ArrayDirectoryMode::READ to use the newest array schema that is less than or equal to the current end timestamp. If no array schemas are newer than the timestamp end value, the earliest schema is used.
127493f
to
8e54edd
Compare
We were using the latest array schema even while traveling. This changes the latest array schema selection when in ArrayDirectoryMode::READ to use the newest array schema that is less than or equal to the current end timestamp. If no array schemas are newer than the timestamp end value, the earliest schema is used. --- TYPE: BUG DESC: Fix array schema selection when time traveling
PR #4549 attempted to fix the latest array schema selection. Unfortunately, it didn't take into account that some array schemas might have the same timestamps. This change uses the fact that array schema URIs are lexicographically sorted (names generated in the same timestamps use extra sorting bits in the UUID) to chose the latest array URI whilst keeping the original intent of #4549 and checking against the end timestamp. --- TYPE: BUG DESC: Fix array latest schema selection for same MS timestamps schemas.
) PR #4549 attempted to fix the latest array schema selection. Unfortunately, it didn't take into account that some array schemas might have the same timestamps. This change uses the fact that array schema URIs are lexicographically sorted (names generated in the same timestamps use extra sorting bits in the UUID) to chose the latest array URI whilst keeping the original intent of #4549 and checking against the end timestamp. [sc-49806] --- TYPE: BUG DESC: Fix array latest schema selection for same MS timestamps schemas.
We were using the latest array schema even while traveling. This changes the latest array schema selection when in ArrayDirectoryMode::READ to use the newest array schema that is less than or equal to the current end timestamp. If no array schemas are newer than the timestamp end value, the earliest schema is used.
TYPE: BUG
DESC: Fix array schema selection when time traveling