fix(query): Add NestedCheckpointReader for input format parser #6385
use super::BufferRead;

// this struct can avoid NestedCheckpointReader<NestedCheckpointReader<R>> in recursive functions
pub struct NestedCheckpointReader<R: BufferRead> {
Is this just CheckpointReader renamed to NestedCheckpointReader?
No, it can maintain multiple checkpoints.
There is also a big difference in the implementation: on rollback, it keeps reading from its own buffer instead of adding the data back to the front of the inner reader's buffer.
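The behaviour described in this reply can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it assumes std::io::Read in place of the project's BufferRead trait, and the field names (buffer, pos, checkpoints) are guesses based on the diff snippets in this conversation. While at least one checkpoint is active, bytes read from the inner reader are retained; a rollback only moves the read position, and the inner reader is never touched.

```rust
use std::io::{self, Read};

/// Simplified, hypothetical reader with a stack of checkpoints.
pub struct NestedCheckpointReader<R: Read> {
    inner: R,
    buffer: Vec<u8>,         // bytes retained while a checkpoint is active
    pos: usize,              // current read position inside `buffer`
    checkpoints: Vec<usize>, // saved positions, innermost last
}

impl<R: Read> NestedCheckpointReader<R> {
    pub fn new(inner: R) -> Self {
        Self { inner, buffer: Vec::new(), pos: 0, checkpoints: Vec::new() }
    }

    pub fn push_checkpoint(&mut self) {
        self.checkpoints.push(self.pos);
    }

    /// Move back to the innermost checkpoint without removing it.
    pub fn rollback_to_checkpoint(&mut self) -> io::Result<()> {
        match self.checkpoints.last() {
            Some(&p) => {
                self.pos = p;
                Ok(())
            }
            None => Err(io::Error::new(io::ErrorKind::Other, "no checkpoint to roll back to")),
        }
    }

    pub fn pop_checkpoint(&mut self) {
        self.checkpoints.pop();
        // Nothing left to replay and nobody can roll back: drop the retained bytes.
        if self.checkpoints.is_empty() && self.pos == self.buffer.len() {
            self.buffer.clear();
            self.pos = 0;
        }
    }
}

impl<R: Read> Read for NestedCheckpointReader<R> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        // Replay retained bytes first (this is what a rollback exposes again).
        if self.pos < self.buffer.len() {
            let n = out.len().min(self.buffer.len() - self.pos);
            out[..n].copy_from_slice(&self.buffer[self.pos..self.pos + n]);
            self.pos += n;
            return Ok(n);
        }
        let n = self.inner.read(out)?;
        if self.checkpoints.is_empty() {
            // No checkpoint active: the retained bytes are fully consumed.
            self.buffer.clear();
            self.pos = 0;
        } else {
            // Keep a copy so an active checkpoint can be rolled back to.
            self.buffer.extend_from_slice(&out[..n]);
            self.pos = self.buffer.len();
        }
        Ok(n)
    }
}
```

Because checkpoints are plain positions on a stack, an inner rollback leaves the outer checkpoint intact, which is what allows nesting.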
} else {
    reader.rollback_to_checkpoint()?;
    reader.pop_checkpoint();
    self.inner.de_text_quoted(reader, format)?;
We cannot construct a temporary CheckpointReader only when it is needed, because of the recursive call to de_text_quoted(): the compiler complains about a potentially unbounded type CheckpointReader<CheckpointReader<CheckpointReader<...>>>.
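To illustrate the point: if a generic function wraps its reader in a new wrapper type before recursing, each recursion depth needs a new type instantiation, and the compiler cannot finish monomorphizing it. With a single reader type that owns a stack of checkpoints, every depth receives the same `&mut` reference. The sketch below is hypothetical (the `Reader` type and `sum_nested` function are made up for illustration and read from a byte slice); it sums integers in input such as `[1,[2,3]]`, taking a checkpoint at each level. Empty lists are not handled.

```rust
/// Hypothetical stand-in for the PR's reader: one type, a stack of checkpoints.
pub struct Reader<'a> {
    data: &'a [u8],
    pos: usize,
    checkpoints: Vec<usize>,
}

impl<'a> Reader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, pos: 0, checkpoints: Vec::new() }
    }
    pub fn push_checkpoint(&mut self) {
        self.checkpoints.push(self.pos);
    }
    pub fn rollback_to_checkpoint(&mut self) {
        if let Some(&p) = self.checkpoints.last() {
            self.pos = p;
        }
    }
    pub fn pop_checkpoint(&mut self) {
        self.checkpoints.pop();
    }
    fn next(&mut self) -> Option<u8> {
        let b = self.data.get(self.pos).copied();
        if b.is_some() {
            self.pos += 1;
        }
        b
    }
}

/// Sums the integers in a (possibly nested) list such as `[1,[2,3]]`.
pub fn sum_nested(reader: &mut Reader<'_>) -> Result<u64, String> {
    // Speculatively read one byte to see whether a list starts here.
    reader.push_checkpoint();
    if reader.next() == Some(b'[') {
        reader.pop_checkpoint();
        let mut total = 0;
        loop {
            // Same `Reader` type at every depth: no wrapper is created.
            total += sum_nested(reader)?;
            match reader.next() {
                Some(b',') => continue,
                Some(b']') => return Ok(total),
                other => return Err(format!("unexpected {:?}", other)),
            }
        }
    }
    // Not a list: undo the speculative read and parse a plain number.
    reader.rollback_to_checkpoint();
    reader.pop_checkpoint();
    let mut n: u64 = 0;
    let mut digits = 0;
    while let Some(b) = reader.data.get(reader.pos).copied() {
        if !b.is_ascii_digit() {
            break;
        }
        n = n * 10 + u64::from(b - b'0');
        reader.pos += 1;
        digits += 1;
    }
    if digits == 0 {
        Err("expected a number".into())
    } else {
        Ok(n)
    }
}
```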
    self.checkpoints.push(self.pos)
}

// todo(youngsofun): add CheckPointGuard to make it safer.
It would be great to have a guard.
I will do it in the next PR, along with some other minor refactors. Let's fix the bug first.
Wait for another reviewer approval
CI Passed
    format: &FormatSettings,
) -> Result<()> {
    reader.push_checkpoint();
I could not find a good way to implement a guard for the checkpoint:
- Most of the time we need to call pop_checkpoint early, like here, so a guard does not help.
- A guard needs to hold a mutable reference to the reader in order to implement Drop, so the following code would have to use guard.reader.read(), which is inconvenient, and the guard cannot be passed to an inner consumer (like de_text_json here).
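For readers unfamiliar with the pattern, the trade-off in the second point can be seen in a small sketch. Everything here is hypothetical (a toy `Reader` that only tracks a position, and a `CheckpointGuard` that was never part of this PR): the guard must own the `&mut Reader` so that `Drop` can pop the checkpoint, which means the reader is only reachable through the guard while it is alive.

```rust
/// Toy reader that only tracks a position; hypothetical, for illustration.
pub struct Reader {
    pos: usize,
    checkpoints: Vec<usize>,
}

impl Reader {
    pub fn new() -> Self {
        Self { pos: 0, checkpoints: Vec::new() }
    }
    pub fn advance(&mut self, n: usize) {
        self.pos += n;
    }
    pub fn pos(&self) -> usize {
        self.pos
    }
    pub fn depth(&self) -> usize {
        self.checkpoints.len()
    }
    /// Push a checkpoint and hand back a guard that pops it when dropped.
    pub fn checkpoint(&mut self) -> CheckpointGuard<'_> {
        self.checkpoints.push(self.pos);
        CheckpointGuard { reader: self }
    }
}

/// Holds the exclusive borrow of the reader for as long as the guard lives.
pub struct CheckpointGuard<'a> {
    pub reader: &'a mut Reader,
}

impl CheckpointGuard<'_> {
    pub fn rollback(&mut self) {
        if let Some(&p) = self.reader.checkpoints.last() {
            self.reader.pos = p;
        }
    }
}

impl Drop for CheckpointGuard<'_> {
    fn drop(&mut self) {
        // The checkpoint is popped only at end of scope, not earlier.
        self.reader.checkpoints.pop();
    }
}
```

In use, every read inside the guarded scope is spelled `guard.reader.advance(..)` rather than `reader.advance(..)`, and popping early means dropping the guard explicitly, which is the inconvenience described above.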
I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/
Summary
Methods in TypeDeserializer may be called recursively, and each recursive call may need its own checkpoint.
Add a new NestedCheckpointReader and use it throughout the parsing process.
Changelog
Related Issues
Fixes #6353