new_http_archive can't handle archives containing unicode-encoded filenames #1653
Comments
I just ran into this too.
This is causing problems for me with libvips, which has a filename with Russian characters in it in the release.
Add a test verifying that http_archive can extract a tar archive containing unicode characters. While such files cannot be referred to by labels, it is still important that the archive can be extracted.
Also fix that use case on Darwin, by appropriately reencoding the string, so that the Files java standard library can encode it back to what we had in the first place.
Work-around for #1653, showing that http_archive from @bazel_tools can be used; however, the issue still remains for zip archives.
Change-Id: If944203bf618c21705af676347d8591ab015d559
PiperOrigin-RevId: 183987726
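For readers unfamiliar with the re-encoding trick mentioned in the commit message, here is a minimal Python sketch of how such a round trip can work. This is not Bazel's code. It assumes the archive stores the file name as UTF-8 bytes and that a later consumer encodes strings as Latin-1 (ISO-8859-1); the function names and the sample file name are hypothetical.

```python
def reencode_for_latin1_consumer(raw_name: bytes) -> str:
    """Decode bytes as Latin-1, so each byte becomes the code point of equal value."""
    return raw_name.decode("latin-1")


def consumer_encode(name: str) -> bytes:
    """Hypothetical stand-in for a consumer that encodes strings as Latin-1."""
    return name.encode("latin-1")


# A UTF-8 encoded file name as it might appear in an archive (made-up example).
original_bytes = "файл.txt".encode("utf-8")

# The intermediate string looks like mojibake, but it carries every byte intact.
intermediate = reencode_for_latin1_consumer(original_bytes)

# Encoding it back yields exactly the bytes we started with.
assert consumer_encode(intermediate) == original_bytes
```

Latin-1 maps all 256 byte values to code points one to one, which is why the intermediate string survives the trip even though it is not human-readable.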
*** Reason for rollback ***
Breaks on our CI Linux machines (but works on our work desktop Linux machines); apparently, even our own Linux machines are too different from each other...
Fixes #4557
*** Original change description ***
http_archive: verify that unicode characters are OK in tar archives
Add a test verifying that http_archive can extract a tar archive containing unicode characters. While such files cannot be referred to by labels, it is still important that the archive can be extracted.
Also fix that use case on Darwin, by appropriately reencoding the string, so that the Files java standard library can encode it back to what we had in the first place.
Work-around for #1653, showing that http_archive from @bazel_tools can be used; however, the issue still remains for zip archives.
***
PiperOrigin-RevId: 184132385
This is still an issue as of Bazel 0.13 with
We are deprecating the native versions of http_archive and git_repository. Have you tried the Skylark-based versions? Do they have the same error? To use the Skylark new_git_repository, just add this to your WORKSPACE:
Yes, I did originally try it with the Skylark rules.
On Tue 22 May 2018 at 11:25, katre wrote:
We are deprecating the native versions of http_archive and git_repository.
Have you tried the skylark-based versions, do they have the same error?
To use the Skylark new_git_repository, just add this to your WORKSPACE:
load(
    "@bazel_tools//tools/build_defs/repo:git.bzl",
    "git_repository",
    "new_git_repository",
)
--
twitter.com/steeve
github.com/steeve
linkd.in/smorin
This no longer crashes with the Starlark rules, but it still fails during extraction:
I thought I could work around with
Also affected, using Starlark
Same as above; using May try to make a workaround Python script to run via
Is there any progress on fixing this? It's understandable that this is not an issue inside Google, but it is apparently a huge bummer for many community users. Given that this is a 1-year-old P1 bug, could we provide some workarounds for now, e.g. adding an http_repository attribute to skip non-ASCII-named files or to skip certain directories?
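To make the "skip non-ASCII-named files" idea concrete, here is a rough Python sketch of what such a filter could look like when applied to a tarball before it is handed to Bazel. It is only an illustration under stated assumptions: it is not an existing Bazel attribute or tool, the function name `strip_non_ascii_members` is hypothetical, and it only handles tar archives (not zip).

```python
import io
import tarfile


def strip_non_ascii_members(src: bytes) -> bytes:
    """Return a copy of the tar archive `src` without non-ASCII-named members.

    Hypothetical helper: takes the archive as bytes (any compression that
    tarfile can detect) and returns an uncompressed tar as bytes.
    """
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(src), mode="r:*") as tin, \
            tarfile.open(fileobj=out, mode="w") as tout:
        for member in tin:
            if not member.name.isascii():
                continue  # drop entries whose names may fail to encode
            # Only regular files carry data; directories and links do not.
            data = tin.extractfile(member) if member.isfile() else None
            tout.addfile(member, data)
    return out.getvalue()
```

One would run something like this on the downloaded archive and point the repository rule at the filtered copy. The obvious caveat is that the dropped files are gone, so this only helps when the build does not need them.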
Let's make sure
Interestingly enough, the problem is very sensitive to tiny changes in the environment; https://bazel-review.googlesource.com/c/bazel/+/93754 passes on my corp desktop, but fails on our CI machines.
Yes, see #7757 for a bit of an explanation.
I'm attempting to add libgit2 to a project as a bazel external, like so:
Unfortunately, bazel build @com_github_libgit2//... fails with
The problem seems to be proximally caused by the fact that bazel forces itself into a latin-1 locale:
bazel/src/main/cpp/blaze.cc
Lines 1718 to 1724 in 936c2c2
UnixPath.encode will happily encode utf-8 paths in a utf-8 locale.
I think this is distinct from #374 in that I don't even need to reference the problematic files in a rule; bazel can't even unpack the tarball, even though I don't care about the files in question.
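The encoding failure itself is easy to reproduce outside Bazel. The sketch below is plain Python, not Bazel's `UnixPath.encode`; it only assumes that a path encoder limited to Latin-1 rejects characters outside that range. The helper name `can_encode` and the sample path are made up for illustration.

```python
def can_encode(path: str, encoding: str) -> bool:
    """Report whether `path` can be represented in `encoding`."""
    try:
        path.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical archive entry with Cyrillic characters in its name.
path = "tests/resources/тест.txt"

print(can_encode(path, "utf-8"))    # True
print(can_encode(path, "latin-1"))  # False: Cyrillic is outside Latin-1
```

This matches the observation above: the same path that fails under a Latin-1 locale encodes without complaint under UTF-8.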