Added gzipped WARC support, started TypeScript bindings #13
Added Puppeteer RequestCap/WARCWriter to index.js
@hyl thanks for adding gzip support and the TypeScript definition file! Per Could change the second arg of
const WARCInitOpts = {
  appending: false,
  gzip: process.env.NODEWARC_WRITE_GZIPPED != null
}
initWARC (warcPath, options = {}) {
  // defaults first, so that caller-supplied options override them
  this.opts = { ...WARCInitOpts, ...options }
}
This would be the simplest way to go about this unless this configuration goes into the constructor.
As for the TypeScript definition file, I am currently improving the parsing and refactoring node-warc to make its exports less janky.
👍 to the introduction of I'll see if
@hyl apologies for the delay in my reply, it has been a busy week for me. I changed the merge base from master to dev and broke this PR.
# Conflicts:
#   README.md
#   lib/writers/warcWriterBase.js
@hyl I merged N0taN3rd/node-warc@dev into your branch. Merging this PR, thanks for it!
Hi! Few different themes in this PR - happy to revisit each of them individually, but figured it was worth starting the conversation.
I've added support in the base WARC writer for writing .warc.gz files. The WARC spec recommends that records be compressed individually, so it calls zlib.gzipSync on each buffer that goes into writeRecordBlock if the environment variable NODEWARC_WRITE_GZIPPED exists. Do you want it to be controlled by an envvar, or can we pass in some kind of config? I couldn't see an obvious way to do the latter, so I went for the former.

I've also started working on a TypeScript definition file. A lot of it is auto-generated and a bit rough and ready, but I'll keep adding to it as I go.
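To illustrate the per-record compression described above, here is a minimal sketch using Node's built-in `zlib`. The helper `maybeGzip` and the placeholder record contents are assumptions made for illustration, not the code in this PR. Each record becomes its own gzip member, and the members are concatenated to form the `.warc.gz` content.

```javascript
const zlib = require('zlib')

// Hypothetical helper (not node-warc's API): compress one record buffer
// as a standalone gzip member when gzip output is enabled.
function maybeGzip (recordBuffer, gzip) {
  return gzip ? zlib.gzipSync(recordBuffer) : recordBuffer
}

// Placeholder record contents, not valid WARC records.
const records = [
  Buffer.from('placeholder record one\r\n\r\n'),
  Buffer.from('placeholder record two\r\n\r\n')
]

// One gzip member per record, concatenated back to back.
const warcGz = Buffer.concat(records.map(r => maybeGzip(r, true)))

// Node's gunzip reads concatenated members as a single stream, so the
// round trip yields the original records joined together.
const roundTrip = zlib.gunzipSync(warcGz)
```

Because every record is a complete gzip member, a reader that knows a record's byte offset can decompress that record alone without reading the file from the start.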