Parquetjs browser support #17
Use wasm-brotli:
* Use async import to load wasm before loading compression.js
* requires async loading to get the wasm instance
* bubble up all the asyncs
* make the tests pass again

Remove compiled bundle.js from repo

Remove browserify tsconfig
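As an illustration of the "async import, then bubble up the asyncs" approach described in this commit message, here is a minimal sketch. It is not the code from this PR: `fakeImportBrotli` is a hypothetical stand-in for something like `() => import('wasm-brotli')`, and the `compress`/`decompress` names on the fake module, as well as `loadBrotli` and `compressBrotli`, are assumptions made for the example.

```javascript
// Hypothetical stand-in for a dynamic import of the wasm module.
// The real module is not loaded here; the returned shape is an assumption.
let importCalls = 0;
const fakeImportBrotli = async () => {
  importCalls += 1;
  return { compress: (bytes) => bytes, decompress: (bytes) => bytes };
};

// Cache the promise so the module is requested and instantiated only once.
let brotliPromise;
function loadBrotli(importer) {
  if (!brotliPromise) brotliPromise = importer();
  return brotliPromise;
}

// Callers become async because they must wait for the module before using it.
// This is the "bubbling up" effect: every caller of this function is async too.
async function compressBrotli(bytes, importer = fakeImportBrotli) {
  const brotli = await loadBrotli(importer);
  return brotli.compress(bytes);
}

const first = compressBrotli(new Uint8Array([1, 2, 3]));
const second = compressBrotli(new Uint8Array([4, 5, 6]));
Promise.all([first, second]).then(() => console.log('imports:', importCalls));
```

Caching the promise rather than the resolved module means concurrent callers share one in-flight load instead of each starting their own.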
disable LZO completely due to overrun error
274eedb
to
c63f60f
Compare
},
} = columnChunk;
return bloomFilterOffsetBuffer;
if (!columnChunk.column.meta_data.bloom_filter_offset) {
the previous form was causing an error in the browser when meta_data was undefined.
This should cause an error everywhere if columnChunk.column.meta_data is undefined. Did you mean to say when bloom_filter_offset is undefined? Also this should be redundant to the optional chain below unless you're really checking for falsey rather than undefined here. I would drop it.
Now that I think about it, it was actually bloom_filter_offset, and somehow the case I was looking at was getting through the if condition.
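To make the distinction in this thread concrete, here is a small sketch of how a plain-access falsy guard and an optional chain behave differently. The column chunk objects and the helper names (`isMissingOffset`, `readOffset`, `throwsTypeError`) are hypothetical and chosen only for illustration; the real parquetjs metadata may use different value types for the offset.

```javascript
// Hypothetical column chunks; these shapes are assumptions for illustration only.
const noMetaData = { column: {} };
const noOffset = { column: { meta_data: {} } };
const zeroOffset = { column: { meta_data: { bloom_filter_offset: 0 } } };
const withOffset = { column: { meta_data: { bloom_filter_offset: 1024 } } };

// Same shape as the guard in the diff: plain property access on meta_data.
function isMissingOffset(chunk) {
  return !chunk.column.meta_data.bloom_filter_offset;
}

// Optional chaining: yields undefined instead of throwing when meta_data is absent.
function readOffset(chunk) {
  return chunk.column.meta_data?.bloom_filter_offset;
}

// Runs fn and reports whether it threw a TypeError.
function throwsTypeError(fn) {
  try {
    fn();
    return false;
  } catch (err) {
    return err instanceof TypeError;
  }
}

// Plain access throws when meta_data itself is undefined.
console.log(throwsTypeError(() => isMissingOffset(noMetaData)));
// The guard treats both an undefined offset and a falsy 0 as "missing".
console.log(isMissingOffset(noOffset), isMissingOffset(zeroOffset));
// The optional chain does not throw, and it preserves a 0 offset.
console.log(readOffset(noMetaData), readOffset(zeroOffset));
```

So the guard only adds something beyond the optional chain if the intent is to reject every falsy offset, which is the question the reviewer raises.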
This is great work. I left a few non-blocking requests.
webpack.config.js
Outdated
@@ -0,0 +1,99 @@
/**
* Left here in case esbuild stops working for us when we try to re-enable
This comment is good, but this should be renamed webpack.config.js.example or all the code in this file should be commented out if we're not actually using it. It's likely a future developer will land in the middle of this file by searching or just miss the comment and get really confused about how the module is built. It might also be good to name check the actual build system (esbuild?) in this comment.
lib/reader.js
Outdated
@@ -603,13 +601,12 @@ class ParquetEnvelopeReader {
if (colChunk.meta_data.dictionary_page_offset) {
const offset = +colChunk.meta_data.dictionary_page_offset;
const size = Math.min(+this.fileSize - offset, this.default_dictionary_size);
dictionary = this.read(offset, size, colChunk.file_path).then(buffer => decodePage({offset: 0, buffer, size: buffer.length}, opts).dictionary);
const buffer = await this.read(offset, size, colChunk.file_path)
const dict = await decodePage({offset: 0, buffer, size: buffer.length}, opts)
Is it necessary to read and decode the dictionary if opts.dictionary is already set? It'd be nice to avoid it if we're just going to throw it away, but maybe it's required to start the next read in the right place.
It's not absolutely required, it just made it a little more readable.
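The optimization the reviewer asks about could look roughly like the sketch below. It is only an illustration under stated assumptions: `loadDictionary` is a hypothetical helper, `read` and `decodePage` are injected stand-ins rather than the parquetjs functions, and it assumes that skipping the dictionary read has no effect on where later reads start.

```javascript
// Hypothetical helper; `read` and `decodePage` are injected stand-ins,
// not the parquetjs functions of the same name.
async function loadDictionary(colChunk, opts, read, decodePage) {
  // Reuse a dictionary the caller already supplied and skip the I/O entirely.
  if (opts.dictionary) return opts.dictionary;
  const offset = +colChunk.meta_data.dictionary_page_offset;
  const buffer = await read(offset);
  const page = await decodePage({ offset: 0, buffer, size: buffer.length }, opts);
  return page.dictionary;
}

// Fake collaborators that count how often the file is read.
let readCalls = 0;
const fakeRead = async () => {
  readCalls += 1;
  return new Uint8Array(4);
};
const fakeDecodePage = async () => ({ dictionary: ['a', 'b'] });
const colChunk = { meta_data: { dictionary_page_offset: 16 } };

// The first call has no dictionary and must read; the second one reuses opts.dictionary.
const fresh = loadDictionary(colChunk, {}, fakeRead, fakeDecodePage);
const reused = loadDictionary(colChunk, { dictionary: ['x'] }, fakeRead, fakeDecodePage);
reused.then((dictionary) => console.log(dictionary, 'reads so far:', readCalls));
```

Whether this is worth doing depends on the open question in the thread, namely whether anything downstream relies on the dictionary page having been read.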
* in reader.js, don't use throwaway variables, use nested "thens"
* rename webpack.config.js to webpack.config.js.example
…t generated files from being checked in
Problem
Need to be able to use parquetjs in the browser. Fixes #178456668
with @enddynayn, @acruikshank, @aneyzberg
Solution Summary
To verify:
Notes:
Example server working: