-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for archive + file comments #44
Conversation
Thanks for the PR. What's the usecase for adding comments? This has been proposed and rejected before in #15. I'm open to adding support for this if there's a real usecase. |
Short version: application-specific metadata as part of the central directory. I’m building out the JavaScript tooling for https://medium.com/@mojodna/tapalcatl-cloud-optimized-tile-archives-1db8d4577d92, particularly the archive generation piece. A preliminary survey of unzip libraries across multiple platforms showed support for comments and they fit the use-case really well, since a single read gets both the central directory and application-specific metadata. The facilitating library is here: https://github.com/mojodna/tapalcatl-js Thanks to |
Sounds like a compelling usecase to me! And thanks for this high-quality PR! I'll work on merging this tonight. |
Awesome, thanks! |
So here's something disappointing about the zipfile spec and ecosystem: technically the zipfile comment is supposed to be encoded in CP437, but in practice I've only seen software that treats it as UTF-8. The overlap between UTF-8 and CP437 is the codepoints Do you need newlines (or hard tabs, or other ascii control characters, or non-ascii unicode characters) in your zipfile comment? For example, the python Note that the file comments on entries in the zipfile can be UTF-8, so there's no problem there. |
oops. accidentally merged. .... this PR probably won't reopen will it 🤔 |
@@ -606,7 +608,7 @@ function dateToDosDateTime(jsDate) { | |||
} | |||
|
|||
function writeUInt64LE(buffer, n, offset) { | |||
// can't use bitshift here, because JavaScript only allows bitshiting on 32-bit integers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💩💩💩
@mojodna I published yazl version 2.5.0 which includes this feature. Please let me know how it works for you. The CP437 situation is pretty silly. I'm going to great lengths to adhere to an obscure part of the spec that no one else seems to care about. I'm seriously considering giving up on cp437 entirely both in yazl and yauzl. 🤔 |
I have a suggestion: Create your own zip file specification. Give it a new catchy name something like "Frenzied Zip File Format Specification" and make the tweaks you want, such as clearing up ambiguity, declaring that cp437 is only used in certain circumstances, whatever. Then make it clear that most zip file readers are actually adhering to Frenzied Zip File Format Specification anyway, not the "official" specification. Make a blog post talking about the problems with the official specs, and promote your fork spec, and point out that most people are using your spec anyway, we should just all admit it and agree this is the new standard. |
The Mac Archive Utility can't read self extracting zip files. So not everyone is on the same page in the wild. I guess I could just have a special section for Archive Utility zipfiles, since they're so different anyway. A blog post is something I've been meaning to do for years though. |
@thejoshwolfe awesome, thanks so much! I did a bit of experimentation to see how various tools / libs handle UTF-8 encoded comments. The short version is that all the ones I've tested just pass the bytes through (especially Python 3, where it comes back as Archive creator: var fs = require("fs");
var yazl = require("yazl");
var zipfile = new yazl.ZipFile();
zipfile.outputStream.pipe(fs.createWriteStream("test.zip"));
zipfile.addFile("package.json", "package.json", {
fileComment: "abcd⎋⌘👋"
});
zipfile.end({ comment: Buffer.from("abcd⎋⌘👋") })
$ unzip -v test.zip
Archive: test.zip
abcd⎋⌘👋
Length Method Size Cmpr Date Time CRC-32 Name
-------- ------ ------- ---- ---------- ----- -------- ----
694 Defl:N 333 52% 11-17-2018 17:26 1dd024d9 package.json
abcd⎋⌘👋
-------- ------- --- -------
694 333 52% 1 file Python 2: >>> import zipfile
>>> z = zipfile.ZipFile("test.zip")
>>> z.comment
'abcd\xe2\x8e\x8b\xe2\x8c\x98\xf0\x9f\x91\x8b'
>>> print z.comment
abcd⎋⌘👋 Python 3: >>> import zipfile
>>> z = zipfile.ZipFile("test.zip")
>>> z.comment
b'abcd\xe2\x8e\x8b\xe2\x8c\x98\xf0\x9f\x91\x8b'
>>> print(z.comment)
b'abcd\xe2\x8e\x8b\xe2\x8c\x98\xf0\x9f\x91\x8b'
>>> print(z.comment.decode("utf-8"))
abcd⎋⌘👋 JDK 8 (via Scala 2.12.7): scala> import java.util.zip.ZipFile
import java.util.zip.ZipFile
scala> val z = new ZipFile("test.zip")
z: java.util.zip.ZipFile = java.util.zip.ZipFile@1d23ff23
scala> z.getComment()
res1: String = abcd⎋⌘👋 |
This adds support (and tests) for archive and file comments. Archive comments are set by providing
comment
in.end()
'soptions
. File comments are set by providingfileComment
when adding files.