Adds ISD Compression #606
LGTM! General question - does one have to read/write the intermediary uncompressed file, or can one just work with the compressed ISD? It looks like the workflow right now would be: write uncompressed, compress. Then read compressed, uncompress and write to disk, do something with the data. Ideally, could one just instantiate a sensor model using the compressed ISD without having to use all the disk space to uncompress? Maybe that is a knoten issue?
@jlaura It kind of does seem like a knoten/SET issue, but if we are going to support compressed ISDs we might as well provide a basic API so people don't have to implement their own. Maybe a follow up task could be to add a Something else to consider: add a flag to
Clarifying questions for me. If they are pedantic, ignore them and feel free to merge!
help="Output a compressed isd json file with .br file extension. "
     "Ale uses the brotli compression algorithm. "
     "To decompress an isd file run: python -c \"import ale.isd_generate as isdg; isdg.decompress_json('/path/to/isd.br')\""
)
Cool! So using this, one could write directly to compressed. Then a SET could read the compressed ISD straight to memory and use it?
I don't think anything in ale takes in an isd file. I'm working on having knoten take in these compressed isds directly.
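A minimal sketch of the in-memory read discussed in this thread, assuming the compressed ISD is brotli-compressed JSON. The name `load_compressed_isd` and the `decompress` parameter are hypothetical and are not part of ale or knoten. The codec is injectable only so that the sketch can be exercised without the third-party `brotli` package installed.

```python
import json


def load_compressed_isd(path, decompress=None):
    """Return the ISD stored in a compressed file as a dict.

    Nothing is written to disk: the file is read, decompressed in
    memory, and parsed as JSON.
    """
    if decompress is None:
        # Assumption: the third-party `brotli` package is installed.
        import brotli
        decompress = brotli.decompress
    with open(path, "rb") as f:
        raw = f.read()
    # json.loads accepts UTF-8 encoded bytes directly.
    return json.loads(decompress(raw))
```

A caller could then pass the resulting dict (or `json.dumps` of it) to whatever builds the sensor model, without ever writing an uncompressed file.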
Cool! I didn't mean for this comment to spill into more work over there! Much appreciated.
ale/isd_generate.py
Outdated
os.rename(uncompressed_json_file, os.path.splitext(uncompressed_json_file)[0] + '.br')

return os.path.splitext(uncompressed_json_file)[0] + '.br'
Does this have to write and then re-write? Is there no way for brotli to compress in memory and skip the intermediary file? Perhaps using something like StringIO or cStringIO (both standard library afaik).
I updated this function to accept json data directly instead of reading from a file, eliminating the need to write the json data to a file.
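A sketch of what compressing from in-memory JSON data could look like, so that no uncompressed file is written first. This is not the function from the PR: `write_compressed_isd` and its `compress` parameter are hypothetical names, and the default assumes the third-party `brotli` package is installed.

```python
import json
import os


def write_compressed_isd(isd, out_path, compress=None):
    """Serialize `isd` (a dict) and write it compressed next to `out_path`.

    Returns the path of the compressed file, which is `out_path` with
    its extension replaced by `.br`.
    """
    if compress is None:
        # Assumption: the third-party `brotli` package is installed.
        import brotli
        compress = brotli.compress
    # The JSON text only ever exists in memory as bytes.
    payload = json.dumps(isd).encode("utf-8")
    br_path = os.path.splitext(out_path)[0] + ".br"
    with open(br_path, "wb") as f:
        f.write(compress(payload))
    return br_path
```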
with open(compressed_json_file, 'rb') as f:
    data = f.read()
with open(compressed_json_file, 'wb') as f:
    f.write(brotli.decompress(data))
I am confused by the decompression here. Does this read the compressed file and then write the decompressed data to disk? Would it make more sense to simply pipe the decompressed output to stdout? That would be more bash-like.
This is for situations where a user wants to revert to a readable JSON file saved locally. We could add a function to read the JSON from binary format, if you think that would be appropriate.
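For comparison, the stdout suggestion raised in this thread could be sketched as follows. The name `decompress_to_stream` and its parameters are hypothetical, and the default codec assumes the third-party `brotli` package is installed. The compressed file on disk is left untouched.

```python
import sys


def decompress_to_stream(path, out=None, decompress=None):
    """Write the decompressed contents of `path` to a binary stream.

    Defaults to stdout, so the output can be piped or redirected
    from a shell instead of overwriting the compressed file.
    """
    if decompress is None:
        # Assumption: the third-party `brotli` package is installed.
        import brotli
        decompress = brotli.decompress
    if out is None:
        out = sys.stdout.buffer  # binary view of stdout
    with open(path, "rb") as f:
        out.write(decompress(f.read()))
```

Redirecting the output to a file would still recover a readable local JSON file, which is the use case the on-disk decompression targets.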
closes #604
Licensing
This project is mostly composed of free and unencumbered software released into the public domain, and we are unlikely to accept contributions that are not also released into the public domain. Somewhere near the top of each file should have these words: