- Feature
- Added Request Capturer for chrome-remote-interface-extra
- Added WARC writer for chrome-remote-interface-extra
- Feature
- It is no longer required to include the extension in the name of the WARC file to be created; node-warc will add the correct extension if it is omitted
- The default WARC file option can now be supplied as the only argument to the constructor of all writers or set via `setDefaultOpts`
- Added to all writers the ability to write a Webrecorder Player compatible bookmark list (as WARC info record) via `WARCWriterBase.writeWebrecorderBookmarksInfoRecord` and the `pages` property of the `genOpts` object supplied to `generateWARC`
- Breaking Change
- Creation of the WARC Info record using `writeWarcInfoRecord` or via `generateWARC`'s genOpts winfo property now accepts an object, Buffer, or string. If the info record's content is an object, the object's properties are written as property, property value pairs; otherwise (when it is a Buffer or string) the content is written as is.
- If the supplied filename for the WARC to be written is null, undefined, or not a string, an error is thrown
- CDP Based Changes
- Feature
- WARC parsing via a transform stream
- WARC parsing via async iteration (Node v10+)
- WARC generator for the request library (@hyl, #15)
- Gzipped WARC generation for all WARC generators (@hyl, #13)
- Puppeteer request capturer and WARC generator
- Puppeteer CDP session request capturer and WARC generator
- Added additional method to WARC generators to streamline WARC creation
- Added additional convenience methods to `WARCWriterBase` for record generation
- Ability to add `WARC-Warcinfo-Id` to records that the field is allowed on (@BubuAnabelas, #7)
- Added TypeScript definition file (@hyl, #13)
- Fixes
- WARCs generated by node-warc are no longer considered invalid by openwayback/jwat-tools
- `WARC-Concurrent-To` field is no longer missing on records it is expected to exist on (@BubuAnabelas, #21)
- Breaking Changes
- The default export is no longer `AutoWARCParser` but an object containing all primary classes provided by node-warc
- WARC parsing has been completely reworked:
  - `node-warc/lib/warcRecordBuilder` has been removed
  - `node-warc/lib/warcRecords` has been renamed to `node-warc/lib/warcRecord`
  - Removed all WARC record type specific classes in favor of a generic WARC record class (`node-warc/lib/warcRecord/record`)
  - Consolidated and generified WARC record building (`node-warc/lib/warcRecord/builder`)
- Request capturers no longer accept a `navMan` (Squidwarc remnant)
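Two of the entries above describe rules that are easy to misread in prose: the filename handling (extension added when omitted, error for a null, undefined, or non-string name) and the WARC Info content handling (object written as property, value pairs; Buffer or string written as is). The following is a minimal sketch of those rules only, not node-warc's actual implementation. Both function names are hypothetical, and the specific extensions (`.warc`, `.warc.gz`) and the `property: value` line format with CRLF endings are assumptions made for illustration.

```javascript
// Hypothetical helper, not part of node-warc: mirrors the filename rules above.
function normalizeWarcFilename (filename, gzip = false) {
  if (typeof filename !== 'string') {
    // Covers null, undefined, and every other non-string value.
    throw new TypeError('The WARC filename must be a string')
  }
  // Assumption: ".warc" or ".warc.gz" counts as an extension already present.
  if (/\.warc(\.gz)?$/i.test(filename)) return filename
  // Assumption: gzipped output gets ".warc.gz", plain output gets ".warc".
  return filename + (gzip ? '.warc.gz' : '.warc')
}

// Hypothetical helper, not part of node-warc: mirrors the WARC Info content rule.
function warcInfoContent (winfo) {
  // Strings and Buffers are passed through unchanged ("written as is").
  if (typeof winfo === 'string' || Buffer.isBuffer(winfo)) return winfo
  if (winfo === null || typeof winfo !== 'object') {
    throw new TypeError('winfo must be an object, Buffer, or string')
  }
  // Assumption: one "property: value" pair per CRLF-terminated line.
  return Object.entries(winfo)
    .map(([key, value]) => `${key}: ${value}\r\n`)
    .join('')
}
```

Under these assumptions, `normalizeWarcFilename('crawl')` yields `crawl.warc`, `normalizeWarcFilename(null)` throws, and `warcInfoContent({ software: 'node-warc' })` yields a single `software: node-warc` line. Consult the node-warc API documentation for the behavior of the real writers.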