Inconsistent handling of xml:lang attributes for XHTML pages between html-has-lang and html-lang-valid #3623

Closed
1 task done
thibaudcolas opened this issue Aug 27, 2022 · 4 comments
thibaudcolas commented Aug 27, 2022

Product

axe-core

Product Version

4.4.2

Latest Version

  • I have tested the issue with the latest version of the product

Issue Description

The has-lang-evaluate check used by html-has-lang reports a failure for certain XHTML pages that do not pass its XHTML check, while the valid-lang-evaluate check used by html-lang-valid passes on the same pages.

Expectation

html-has-lang and html-lang-valid should be consistent in their handling of xml:lang attributes, so it’s simpler to understand what the problem might be with a page. They should either both have the same XHTML validation, so they both fail pages with invalid language declarations – or both skip any XHTML validation.

If the discrepancy is intended, then I would expect the documentation of html-lang-valid to explain it also checks xml:lang. As-is, its description of "Ensures the lang attribute of the element has a valid value" is confusing when there is no lang attribute on the element.

Actual

Right now, when a page has an xml:lang attribute but doesn’t pass Axe’s XHTML check,

  • html-has-lang fails with "Ensures every HTML document has a lang attribute […] The xml:lang attribute is not valid on HTML pages, use the lang attribute.".
  • html-lang-valid passes, with its description / help message suggesting everything is fine ("<html> element must have a valid value for the lang attribute").

I find it confusing that lang is reported as valid even when it is absent. The XHTML validation is also confusing: I wasn’t expecting this kind of check from Axe, and the error message doesn’t necessarily suggest an actionable fix (the real issue could be that the page is treated as HTML even though it isn’t HTML, rather than that it uses the wrong attribute).

How to Reproduce

Here are sites where this can be tested with the Axe CLI:

axe -r html-has-lang,html-lang-valid https://www.nyjuror.gov/ --save html-lang-1.json
axe -r html-has-lang,html-lang-valid https://www.keolis-idf.com/ --save html-lang-2.json
axe -r html-has-lang,html-lang-valid https://www.cecil.fr/ --save html-lang-3.json

In each case, html-has-lang is reported as a violation while html-lang-valid is marked as a "pass".
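
To compare the two rules side by side, the outcome of each can be read from an axe results object. The following is a minimal sketch that assumes the standard axe-core results shape (`violations`, `passes`, `incomplete`, and `inapplicable` arrays whose entries carry an `id`). `ruleOutcomes` is a hypothetical helper, the sample object is hand-written to mirror the outcome described above, and how the file written by `--save` wraps the results object is not assumed here.

```javascript
// Hypothetical helper: report which bucket each rule id landed in.
function ruleOutcomes(results, ruleIds) {
  const buckets = ['violations', 'passes', 'incomplete', 'inapplicable'];
  const outcomes = {};
  for (const id of ruleIds) {
    // Find the first bucket that contains a result for this rule.
    const bucket = buckets.find(name =>
      (results[name] || []).some(rule => rule.id === id)
    );
    outcomes[id] = bucket || 'not run';
  }
  return outcomes;
}

// Hand-written sample mirroring the reported outcome, not real CLI output.
const sample = {
  violations: [{ id: 'html-has-lang' }],
  passes: [{ id: 'html-lang-valid' }],
  incomplete: [],
  inapplicable: []
};

console.log(ruleOutcomes(sample, ['html-has-lang', 'html-lang-valid']));
```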

Additional context

Validation

Of the above three sites, the first one uses an invalid namespace and fails validator.w3.org (HTTPS scheme in xmlns="https://www.w3.org/1999/xhtml"). The other two seem to pass validation of the DTD + namespaces, but are nonetheless not detected as XHTML by Axe.
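
The namespace problem on the first site comes down to exact string comparison: XML namespace names are compared character by character, so the `https` variant is a different namespace from the XHTML one. A minimal sketch, where `isXhtmlNamespace` is a hypothetical helper and not part of axe-core or the validator:

```javascript
// The XHTML namespace name; it uses the http scheme.
const XHTML_NAMESPACE = 'http://www.w3.org/1999/xhtml';

// Namespace names match only when the strings are identical.
function isXhtmlNamespace(namespace) {
  return namespace === XHTML_NAMESPACE;
}

console.log(isXhtmlNamespace('http://www.w3.org/1999/xhtml'));  // true
console.log(isXhtmlNamespace('https://www.w3.org/1999/xhtml')); // false
```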

Lighthouse

I was also tempted to file this as an issue with Lighthouse, as their two audits based on Axe’s rules are even more confusing. In the above cases, it reports:

  • An error "<html> element does not have a [lang] attribute". No mention of the xml:lang / XHTML support.
  • A pass "<html> element has a valid value for its [lang] attribute"

Their web.dev resources (https://web.dev/html-has-lang/, https://web.dev/html-lang-valid/) make no mention of XHTML / XML, so it takes a lot of digging to understand the discrepancy.

@WilcoFiers
Contributor

> html-has-lang and html-lang-valid should be consistent in their handling of xml:lang attributes, so it’s simpler to understand what the problem might be with a page. They should either both have the same XHTML validation, so they both fail pages with invalid language declarations – or both skip any XHTML validation.

Axe-core doesn't report the same issue more than once. The rules we write are written to avoid overlap. Keep in mind that just because something passes, doesn't mean there can't be something wrong that's outside the scope of the rule.

> If the discrepancy is intended, then I would expect the documentation of html-lang-valid to explain it also checks xml:lang. As-is, its description of "Ensures the lang attribute of the element has a valid value" is confusing when there is no lang attribute on the element.

Axe-core reports the following suggestion on the issue "The xml:lang attribute is not valid on HTML pages, use the lang attribute." This seems fairly actionable to me. Do you have any suggestions on how to improve on this?

> Of the above three sites, the first one uses an invalid namespace and fails validator.w3.org (HTTPS scheme in xmlns="https://www.w3.org/1999/xhtml"). The other two seem to pass validation of the DTD + namespaces, but are nonetheless not detected as XHTML by Axe.

Axe-core is an accessibility tool, not an HTML validator. As far as I'm aware this wouldn't cause any accessibility problems, so this isn't something axe-core should test.

WilcoFiers added the "question" label and removed the "ungroomed" label ("Ticket needs a maintainer to prioritize and label") on Aug 29, 2022
@thibaudcolas
Contributor Author

👍 it makes sense to me that two rules shouldn’t report the same problem. I still find it problematic that Axe reports:

> The xml:lang attribute is not valid on HTML pages, use the lang attribute.

This would be actionable for sites that are written in HTML and mistakenly using xml:lang, but if a site is written in XHTML, then Axe’s suggestion isn’t very helpful. lang isn’t a valid attribute in XHTML, while xml:lang is, so it would seem more correct to address why the site is getting parsed as HTML by browsers, rather than change the one attribute.
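
For context on why a page authored as XHTML can end up treated as HTML: browsers pick the parser from the media type the document is served with, not from its doctype or `xmlns` attribute. A minimal sketch of that decision, where `parserFor` is a hypothetical helper (not part of axe-core) and the list of XML media types is simplified:

```javascript
// Sketch: which parser a browser would use for a given Content-Type value.
function parserFor(contentType) {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const mime = contentType.split(';')[0].trim().toLowerCase();

  if (mime === 'text/html') {
    return 'html';
  }
  // XML media types, including application/xhtml+xml.
  if (mime === 'application/xml' || mime === 'text/xml' || mime.endsWith('+xml')) {
    return 'xml';
  }
  return 'other';
}

console.log(parserFor('text/html; charset=utf-8')); // "html", even with an XHTML doctype
console.log(parserFor('application/xhtml+xml'));    // "xml"
```

In a browser, `document.contentType` exposes the media type the current page was loaded with, which is one way to see how a page is being treated.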

Perhaps Axe could suggest using a validator in this case, or at least mention the XHTML detection? Something like:

> The xml:lang attribute is only valid on XHTML pages, while this page was detected as HTML. Use the lang attribute.

Or:

> The xml:lang attribute is not valid on HTML pages, use the lang attribute instead or make sure the page parses as XHTML.

It’s lost on me why the XHTML yes/no check is useful, but if it is, it probably should be surfaced like this. Even if Axe isn’t a validator, here it does restrict which attribute is allowed based on what the page parses as, with a message that doesn’t really make this apparent at all.

@straker
Contributor

straker commented Nov 17, 2023

Closing as it appears the question was answered. Please feel free to reopen if you can provide more information.

In response to your last message, the html-has-lang rule checks to see if the page is an XHTML page before displaying that specific message. That error will only display for pages detected as HTML that use the xml:lang attribute with a value and do not provide a lang attribute with a value.
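
The condition described above can be modeled roughly as follows. This is a simplified sketch based only on that description, not axe-core's source: `hasLangOutcome` and `hasValue` are hypothetical names, `attrs` is a plain object standing in for the attributes of the `<html>` element, `isXhtml` stands in for whatever detection axe performs, and the behavior outside the special case (passing when either attribute has a value) is an assumption.

```javascript
// An attribute counts only if it is a non-empty, non-whitespace string.
function hasValue(value) {
  return typeof value === 'string' && value.trim() !== '';
}

// Simplified model of the html-has-lang outcome; not axe-core's implementation.
function hasLangOutcome(attrs, isXhtml) {
  const lang = hasValue(attrs.lang);
  const xmlLang = hasValue(attrs['xml:lang']);

  // Special case: xml:lang has a value, lang does not, page detected as HTML.
  if (xmlLang && !lang && !isXhtml) {
    return { pass: false, reason: 'xml:lang is not valid on HTML pages' };
  }
  // Assumption: with neither attribute set, the rule fails for a missing language.
  if (!lang && !xmlLang) {
    return { pass: false, reason: 'no language attribute' };
  }
  return { pass: true, reason: null };
}

const xmlLangOnly = { 'xml:lang': 'fr' };
console.log(hasLangOutcome(xmlLangOnly, false)); // fails with the xml:lang reason
console.log(hasLangOutcome(xmlLangOnly, true));  // passes
console.log(hasLangOutcome({ lang: 'fr', 'xml:lang': 'fr' }, false)); // passes
```

The same markup produces different outcomes depending only on `isXhtml`, which is why the thread asks for the detection result to be surfaced in the message.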

straker closed this as completed on Nov 17, 2023
@thibaudcolas
Contributor Author

@straker I don’t think I was asking a question here, just reporting a possible issue. There is no more information to be provided, and even if there was, this repository doesn’t allow reopening issues.

And yes, what you just mentioned is exactly what I find confusing. If a page was detected as HTML and that’s why Axe reports a missing lang, then the error message should mention that the page was detected as HTML. The page being detected differently from how it was authored is probably what developers should look at fixing, rather than following Axe’s suggestion to add a lang attribute to a page that already has xml:lang.


3 participants