This repository has been archived by the owner on Mar 4, 2021. It is now read-only.
Aegisthus will actually miss rows in compressed SSTables, because it does not track the SSTable's size correctly.
SSTableRecordReader tracks its position in the file to determine whether there are still records to read, and, judging from SSTableReader, that position is measured in the uncompressed stream.
However, in AegSplit, after initializing the compressedInputStream, we never adjust end to the actual uncompressed stream length (available from CompressionMetadata). Since the uncompressed position grows past the compressed on-disk size, the reader stops early and drops every row after that point.
I have a roughly working patch and will submit a PR in a few days.
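For reference, here is a minimal sketch of the kind of fix I mean, assuming the Cassandra CompressionMetadata API (CompressionMetadata.create and its dataLength field); the surrounding class and field names are illustrative stand-ins, not the actual AegSplit code:

```java
import org.apache.cassandra.io.compress.CompressionMetadata;

// Illustrative sketch only: for compressed SSTables, set the split's end
// offset to the uncompressed data length instead of the on-disk file size.
// Class and field names here are hypothetical stand-ins for AegSplit's.
public class CompressedSplitSketch {
    private long end; // offset the record reader compares its position against

    public void adjustEndForCompression(String dataFilePath, boolean compressed) {
        if (compressed) {
            // CompressionMetadata records the original (uncompressed) data length.
            CompressionMetadata metadata = CompressionMetadata.create(dataFilePath);
            // The reader's position is in the uncompressed stream, so 'end'
            // must be the uncompressed length or the reader stops early.
            end = metadata.dataLength;
        }
    }
}
```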
Awesome. I wrote the compressed reader as a proof of concept when we were considering adopting compressed SSTables a couple of years ago. We never actually adopted them, so I haven't had a chance to verify the output against any real cases.
I believe this issue is resolved, so I am going to go ahead and close it now. There is another project that goes deeper on compressed SSTables, https://github.com/fullcontact/hadoop-sstable, which is probably even better. Netflix doesn't have many compressed tables, so we haven't had a need to optimize this use case.