Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Silent downcast overflow #21

Closed
crepererum opened this issue Jan 20, 2022 · 5 comments
Closed

Silent downcast overflow #21

crepererum opened this issue Jan 20, 2022 · 5 comments

Comments

@crepererum
Copy link

When using VarintReader with types smaller than 64bits, the current implementation first reads a 64bit integer:

impl VarIntProcessor {
fn push(&mut self, b: u8) -> Result<()> {
if self.i >= 10 {
return Err(io::Error::new(
io::ErrorKind::InvalidData,
"Unterminated varint",
));
}
self.buf[self.i] = b;
self.i += 1;
Ok(())
}
fn finished(&self) -> bool {
self.i > 0 && (self.buf[self.i - 1] & MSB == 0)
}
fn decode<VI: VarInt>(&self) -> Option<VI> {
Some(VI::decode_var(&self.buf[0..self.i])?.0)
}
}

and then casts it to a smaller integer:

fn decode_var(src: &[u8]) -> Option<(Self, usize)> {
let (n, s) = u64::decode_var(src)?;
Some((n as Self, s))
}

fn decode_var(src: &[u8]) -> Option<(Self, usize)> {
let (n, s) = i64::decode_var(src)?;
Some((n as Self, s))
}

Now imagine a data stream w/ 9 bits 0xff followed by 0x00, which would be a pretty large 64bit integer. If you use VarIntReader::read_varint::<i32>(...) to decode that, it will sucessfully read the 64bit varint, consume all the bytes and during the conversion to 32bit will silently truncate the result. What it should do instead is to read only the number of bytes that are at max required for 32bit (5?) and then fail with "Unterminated varint".

@dermesser
Copy link
Owner

Yes this is an unsatisfactory state. However, with an unterminated varint error, there would be a fragment being left in the input stream which will give a nonsense number on next read (at least as nonsensical as a truncated int).

Now one could write documentation that the next integer after such an error must be ignored... but that also doesn't seem optimal to me.

In any case, I'm open for debate on this topic.

@crepererum
Copy link
Author

I think if your read / deserialize data from a stream and get an error, you have to stop reading or re-synchronize the stream (the latter one is mostly only possible for low-level network protocols). I think a user cannot expect a parser to automatically recover from broken data, because there is no way to know which part was broken (in the example case above was it one byte that was broken or was a 32bit value serialized as 64bit? or did we mess up before reading the varint?). It's not only about the "next integer" btw. because likely the protocol the user is trying to deserialize consists of other data types before and after the varint in question.

With your argument, I think you should NEVER return an "unterminated varint error" because you could always argue that there could in theory be a 128bit, 256bit, ... wide varint.

@dermesser
Copy link
Owner

That's a good point, I will look into it.

@dermesser
Copy link
Owner

dermesser commented Jan 28, 2022

@crepererum would you mind reviewing the change in #22?

dermesser added a commit that referenced this issue Feb 22, 2022
@dermesser
Copy link
Owner

I believe that this issue can be closed with the fix implemented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants