-
-
Notifications
You must be signed in to change notification settings - Fork 200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Anchor links are encoded in html-proofer output #757
Comments
Oh I’m sorry I missed this. Was the problem fixed? |
I am seeing this same problem w/ html-proofer 5.0.8
I note the encoding is correct in the debug output as it's checking:
|
There wasn't a specific test for this, and it fails on some setups. The link in the fixture is a direct copy-paste from one that fails on my project. But the test passes on this test suite... So there is something else going on. Relates to gjtorikian#757
I have done some more investigation on this issue. I think it's caused by an upstream dependency rather than something in html-proofer itself. You can see the error in action on this test run : https://github.com/github/opensource.guide/actions/runs/8887912548/job/24403938507?pr=3214 I added one of the failing links from that project as a test fixture, and added a test to exercise it, which you can see on my fork: That test passes on my branch. And I'm using the same ruby version on both html-proofer and opensource.guide. Looking through the Gemfile, one possible difference is that html-proofer is using |
Yo, thanks for the nice write-up and repros. Always a huge help to see. I'll clone the guides repo and see what I can find, if not an error in this project than at least to help someone else pinpoint this down the line. |
I can't seem to repro this. I created some text to get this outputted into the site build: <p><a href="#eigentümerschaft">Hash mit ein umlaut</a></p>
<h2 id="eigentümerschaft">Hier gib's</h2> Running I do note that both the Ruby (2.7.5) and Node (12.14.0) are pretty wildly out of date, so it's possible that some Ruby gem dependency which needs to be updated, hasn't been, because of the old Ruby version. |
OK I was able to make got a reproducible test case on my fork of html-proofer : ahpook@bd8e0d4 As I put in that commit, only links to different page using "../page/#häsh" syntax failed. Failures:
1) HTMLProofer::Check::Links passes for internal links with non-ASCII characters from implicit indexes
Failure/Error: expect(proofer.failed_checks).to(eq([]))
expected: []
got: [#<HTMLProofer::Failure:0x00000001067d3988 @path="spec/html-proofer/fixtures/links/hash_nonascii_dir/...inking to target/#ei
gent%C3%BCmerschaft, which does not exist", @line=3, @status=nil, @content=nil>]
(compared using ==)
Diff:
@@ -1 +1 @@
-[]
+[#<HTMLProofer::Failure:0x00000001067d3988 @path="spec/html-proofer/fixtures/links/hash_nonascii_dir/source/index.html", @check_name=
"Links > Internal", @description="internally linking to target/#eigent%C3%BCmerschaft, which does not exist", @line=3, @status=nil, @content=
nil>]
# ./spec/html-proofer/links_spec.rb:52:in `block (2 levels) in <top (required)>' The second anchor link in the file is found ok. |
I added a little debug line to matched_files.each do |metadata|
url = HTMLProofer::Attribute::Url.new(@runner, link, base_url: metadata[:base_url], source: metadata[:source], filename: metadata[:filename])
@logger.log(:debug, "Internal link #{link}: base_url: #{metadata[:base_url]}, source: #{metadata[:source]}, filename: #{metadata[:filename]} -- resulting url: #{url}") ❯ /opt/homebrew/Cellar/ruby@3.2/3.2.2/bin/ruby exe/htmlproofer --log_level=debug spec/html-proofer/fixtures/links/hash_nonascii_dir
Checking 2 internal links
(1 / 2) Internal link ../target/#eigentümerschaft: Checking 1 reference
Internal link ../target/#eigentümerschaft: base_url: , source: spec/html-proofer/fixtures/links/hash_nonascii_dir, filename: spec/html-proofer/fixtures/links/hash_nonascii_dir/source/index.html -- resulting url: target/#eigent%C3%BCmerschaft
(2 / 2) Internal link ../target/#mitarbeit: Checking 1 reference
Internal link ../target/#mitarbeit: base_url: , source: spec/html-proofer/fixtures/links/hash_nonascii_dir, filename: spec/html-proofer/fixtures/links/hash_nonascii_dir/source/index.html -- resulting url: ../target/#mitarbeit
... |
So, in the If there are non-ascii characters, the URL is passed through to Addressable::URI's https://github.com/gjtorikian/html-proofer/blob/main/lib/html_proofer/attribute/url.rb#L223-L228 Pry session:
|
@gjtorikian Hi, sending a friendly ping on this - do you have a suggestion how to proceed? I think either (a) the regexp check in I'm blocked on working a patch up because not sure what kinds of "bad" urls you've seen and are trying to catch with the regexp. Also, maybe I am misunderstanding this, but where is the previous regexp that the string interpolation |
Sorry but can you create a PR with a reproducible test case? I remember digging through your comments and commits but I got distracted, and having a single reproducible issue would help in my fix resolution. Thank you! |
I'll make a PR with just my test changes to avoid regressions, but the refactor you did in #820 fixed this issue as well! |
There wasn't a specific test for this, and it fails on some setups. The link in the fixture is a direct copy-paste from one that fails on my project. But the test passes on this test suite... So there is something else going on. Relates to gjtorikian#757
Hey there! In our translated files, Japanese anchor links are being encoded in the html-proofer output even though the anchor links are not encoded in our HTML. So, for example, we have the following in an HTML file:
But see the following in the html-proofer output:
internally linking to #authorization-%E3%83%98%E3%83%83%E3%83%80%E3%83%BC%E3%82%92%E4%BD%9C%E6%88%90%E3%81%99%E3%82%8B; the file exists, but the hash ‘authorization-%E3%83%98%E3%83%83%E3%83%80%E3%83%BC%E3%82%92%E4%BD%9C%E6%88%90%E3%81%99%E3%82%8B’ does not
I'm not sure if this is a bug with html-proofer or a configuration problem on our end, but I don't recall seeing this before. Thanks for any help you can provide.
The text was updated successfully, but these errors were encountered: