WCAG 2 and APCA Comparison #131

Myndex · 2021-10-17T17:23:30Z

Myndex
Oct 17, 2021

On the APCA GitHub repo, I posted my portion of a discussion with some of the early adopters of APCA. Here is the key infographic from that post:

WCAG 2 and APCA Comparison

Here is the old WCAG 2 1.4.3 contrast guideline against the new WCAG 3 APCA contrast guideline. As the links below also show, this is not an isolated case. Using thousands of random colors, WCAG 2 incorrectly passed 49% of the colors it tested, meaning it passed colors that were too low in contrast to read.

Comparison Article: The Lighter Side of Dark Backgrounds

Comparison Article: Orange You Wondering About Contrast?

Regarding consistency concerns between the old and new:

It is not possible to create something that works correctly
and is also consistent with something that is incorrect.

Fortunately, there is enough overlap that a "transition" zone can be had, and some of the tools developed by other developers have a "pass all" setting to pass both WCAG 2 and WCAG 3 at once, though the trade-off is a loss of the flexibility that APCA normally adds. I am actually very surprised at the number of early adopters that have made APCA a part of their contrast tools: there are literally dozens of tools out there now!

-Andy

Myndex/SAPC-APCA#21

sdw32 · 2021-10-25T13:00:20Z

sdw32
Oct 25, 2021

Andy, this is a real eye-opener! The 49% incorrect passes figure is really hard-hitting. I wondered if you had published this study with thousands of colours in the public domain? If not, please can you explain how the thousands of colour pairs were generated? Was it just three random numbers, all between 0 and 255, for the foreground and similar again for the background?

Also, do you have a figure for how many colours WCAG 2 fails when it ought to pass? Many thanks.

Myndex · 2021-10-25T22:41:24Z

Myndex
Oct 25, 2021
Author

@sdw32 I have not published the complete model; it was the summer of 2019, and an earlier model. I'm now pretty sure 2.x would do worse, but I'm not going to beat a dead horse. They were pure random tuples. I'd need to dig it up to see what domain they were in, but most likely linear 0.0-1.0 for each channel, then apply gamma and multiply by 255, so: rnd()^0.45 * 255
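
For anyone wanting to reproduce that kind of sampling, here is a minimal Python sketch of the recipe described above (uniform linear value per channel, a 0.45 power curve, then scale to 0-255). The function names, the rounding to integers, and the seeding are assumptions added for illustration; they are not taken from the original 2019 test.

```python
import random


def random_srgb_channel(rng):
    """One 8-bit channel value following rnd()^0.45 * 255."""
    linear = rng.random()        # uniform in [0.0, 1.0)
    encoded = linear ** 0.45     # simple power-curve "gamma" encode
    return round(encoded * 255)  # rounding to an integer is an assumption


def random_srgb_colour(rng):
    """An (R, G, B) tuple with each channel drawn independently."""
    return tuple(random_srgb_channel(rng) for _ in range(3))


def random_pairs(n_pairs, seed=None):
    """n_pairs (foreground, background) colour pairs, i.e. 2 * n_pairs colours."""
    rng = random.Random(seed)
    return [(random_srgb_colour(rng), random_srgb_colour(rng))
            for _ in range(n_pairs)]


pairs = random_pairs(500, seed=2019)  # 500 pairs = 1000 colours
print(pairs[0])
```

Because the power curve is applied to a uniform linear value, the resulting 8-bit values skew toward the lighter end compared with drawing uniform integers in 0-255.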

It is important to point out that no designer would use pure random colors, it was only to point out the deep confirmation bias that surrounds that.

sdw32 · 2021-10-26T08:27:21Z

sdw32
Oct 26, 2021

Thanks for the follow-up. I think this issue remains important to fully investigate, as contrast ratio thresholds are far from dead! I think they are presently used to determine the outcome of ADA lawsuits and Section 508 compliance for procurement contracts, and will continue to do so for the indefinite future!

As a group in the LVTF, we still have a decision to make regarding whether the supplemental guidance uses ratios, or APCA, or both. If we do anything other than using ratios on their own, we will have to write some content justifying why we didn't just use contrast ratios. Understanding the full extent to which the ratios model is broken would helpfully inform this decision I think.

Did you also have a value for the number of colours that failed the ratios model when they ought to have passed? I'm imagining something like an Excel spreadsheet that can calculate both WCAG2 and APCA contrasts from RGB input values for foreground and background, which could be randomly generated in Excel. I wondered if this is something that you already had, or if you would be able to make it? This page looks like it has the VBA custom functions for WCAG contrast, so would just need similar VBA custom functions for APCA [https://ramblings.mcpher.com/color-fiesta/playing-around-with-colors-in-vba/#Calculate_contrast_ratio]

Happy to chat further on this topic during the regular meetings.

Myndex · 2021-10-27T07:13:27Z

Myndex
Oct 27, 2021
Author

Hi @sdw32

contrast ratio thresholds are far from dead! I think they are presently used to determine the outcome of ADA lawsuits and Section 508 compliance for procurement contracts, and will continue to do so for the indefinite future!

Are you talking about the WCAG 2 ratio? or ratios in general? or the ratios of some specific national standard of some nation?

There is no contrast ratio requirement under current ADA unless something changed very recently. As for lawsuits, not sure the context you mean.

The WCAG 2.x contrast ratio is different from other standards, and not directly compatible with any of them, and only exists for use in web-context. It's also not accurate (perceptually or otherwise) as has been shown over and over.

Some other standards (ISO, ANSI, etc) may use some variant of Weber or Michelson, or something completely different. Most of them are NOT ratios. Australia for instance uses something I can only call "bizarre" and resorts to brute force due to its incorrect shape.

The UK uses LRV difference, specifying a 30 point difference. But once again, this is not a perceptual standard. Compared to APCA, a 30 point LRV difference would rate anywhere from a low Lc 19 (if one color was white) up to Lc 47 if one color is black. This is the exact reverse of the WCAG 2.x problem, but as the UK standard relates to physical signage it's possibly a less serious problem than WCAG 2.x, as physical signage is also dependent on ambient illumination, so forcing contrast higher for darker colors has some benefit. Still, it is a brute force solution. (And when I say opposite problem: WCAG 2.x over-reports contrast for darker colors, so dark pairs that pass end up with lower actual contrast.)

As a group in the LVTF, we still have a decision to make regarding whether the supplemental guidance uses ratios, or APCA, or both. If we do anything other than using ratios on their own, we will have to write some content justifying why we didn't just use contrast ratios. Understanding the full extent to which the ratios model is broken would helpfully inform this decision I think.

I've written about 145,000 words on this, so I'm guessing you haven't read much of my work? I won't ask you to read all the in depth material, but here are a few very brief articles with many visual examples:

The Lighter Side of Dark Backgrounds

Orange You Wondering About Contrast?

WCAG 2 vs APCA • A Contrast in Applied Maths

The issue of using contrast ratios to describe perceived contrast has a long history of wavy pseudoscience as described in this paper by Dr. Arditi.

This kind of thing is partly why I am doing this research, to add to the knowledge in this area, and the papers I am preparing focus on these kinds of issues.

Did you also have a value for the number of colours that failed the ratios model when they ought to have passed?

I've posted this and others several times. I'm surprised this is the first you've heard of it, as I've written about it a lot. I did several runs with 1000 colors (500 pairs). There were some more recent test runs where the incorrect pass rate was as high as 75% (depending on criteria such as use case). This was from 2019.

I'm not inclined to continue much of this, as the proactive choice is developing better solutions going forward. The few articles and examples are all in response to specific requests, but I'd rather focus on other positive actions.

I'm imagining something like an Excel spreadsheet that can calculate both WCAG2 and APCA contrasts from RGB input values for foreground and background, which could be randomly generated in Excel. I wondered if this is something that you already had, or if you would be able to make it? This page looks like it has the VBA custom functions for WCAG contrast, so would just need similar VBA custom functions for APCA [https://ramblings.mcpher.com/color-fiesta/playing-around-with-colors-in-vba/#Calculate_contrast_ratio]

Yes I already have of course. There is an APCA spreadsheet pre-built (LibreOffice) at the APCA GitHub repo. And the rest I have written about extensively in thread 695 and others.

As a side note, when writing posts here, enclose URLs in parentheses not square brackets. Use the square brackets for the clickable text. Basically like this:

        [Click here I say!](https://www.MyURL.com)

Note there is no space between the ] and the (

As for the content on the link you shared, there is enough wrong there that I suggest you disregard. Here is an oft cited site for maths for displays and color etc:

http://brucelindbloom.com

Bruce uses the ASTM values, sometimes very slightly different than some other sources. I hear that easyrgb is also good, but have not investigated that site much. I use the CIE Colorimetry (2004 and 2007), ITU, IEC, and SMPTE standards as appropriate, writing my own code. I don't use code snips off internet sites, because apparently very few people actually know the correct math, much less how vision works, and it's rather shocking. Even some commercial software like Matlab got it wrong. On GitHub there is a Python library called ColourScience that's solid, but I can't vouch for many others.

sdw32 · 2021-10-28T15:25:42Z

sdw32
Oct 28, 2021

Thanks for all the additional info. I have now completed my own exploration of APCA Perceived Lightness Contrast versus WCAG2 contrast ratios. I programmed both as custom functions in Excel, then used the spreadsheet to compare passes and fails for white text on coloured backgrounds, and black text on coloured backgrounds.

Perhaps unsurprisingly, I agree with the conclusion that WCAG2 is not fit for purpose. When one of the colours is black, it incorrectly passes a great deal of colour combinations, which can be so dark that they are almost completely unreadable.

When one of the colours is white, it also incorrectly fails a great deal of colour combinations that are actually fine.
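
For reference, the WCAG 2 side of such a comparison is short enough to sketch in Python. This follows the relative luminance and contrast ratio definitions published in WCAG 2.x; the function names are illustrative and are not taken from the Excel workbook described above.

```python
def srgb_to_linear(c8):
    """Linearise one 8-bit sRGB channel as defined in WCAG 2.x."""
    c = c8 / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """WCAG 2.x relative luminance of an (R, G, B) tuple of 0-255 integers."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def wcag2_contrast_ratio(colour_a, colour_b):
    """Contrast ratio (L_lighter + 0.05) / (L_darker + 0.05), from 1 to 21."""
    l_a = relative_luminance(colour_a)
    l_b = relative_luminance(colour_b)
    lighter, darker = max(l_a, l_b), min(l_a, l_b)
    return (lighter + 0.05) / (darker + 0.05)


print(wcag2_contrast_ratio((0, 0, 0), (255, 255, 255)))
print(wcag2_contrast_ratio((118, 118, 118), (255, 255, 255)))
```

Note that the ratio is symmetric: swapping text and background gives the same number, so this calculation cannot distinguish dark-on-light from light-on-dark.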

For my own analysis in Excel, I did completely random integers in the range 0-255 for both foreground and background colours. I ran circa 5000 combinations, obviously lots of these were completely nonsensical, but 1393 colour pairs were considered viable by one algorithm or the other. Of these,

65.8% of the pairs were passed by both algorithms,
27.9% incorrectly passed WCAG2, due to its issue with dark colour pairs.
6.3% of the pairs incorrectly failed WCAG2, as this excessively penalises bright colour pairs.
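
The three-way split above can be sketched as follows. This is a hedged illustration only: it treats APCA as the reference (as this thread does), takes the two thresholds as parameters, and uses invented toy values rather than any data from the spreadsheet. All names are hypothetical.

```python
from collections import Counter


def classify(wcag_ratio, apca_lc, ratio_min, lc_min):
    """Label one colour pair, using APCA as the reference result."""
    wcag_ok = wcag_ratio >= ratio_min
    apca_ok = abs(apca_lc) > lc_min  # Lc is signed by polarity; compare magnitude
    if wcag_ok and apca_ok:
        return "both_pass"
    if wcag_ok:
        return "wcag2_false_pass"   # passes WCAG 2, fails APCA
    if apca_ok:
        return "wcag2_false_fail"   # fails WCAG 2, passes APCA
    return "both_fail"              # not "viable" under either algorithm


def viable_shares(results, ratio_min, lc_min):
    """Share of each label among pairs passed by at least one algorithm."""
    counts = Counter(classify(r, lc, ratio_min, lc_min) for r, lc in results)
    counts.pop("both_fail", None)
    viable = sum(counts.values())
    return {label: n / viable for label, n in counts.items()} if viable else {}


# Invented (ratio, Lc) values purely to exercise the functions.
toy_results = [(7.0, 80.0), (4.6, -30.0), (4.4, -76.0), (1.3, 8.0)]
print(viable_shares(toy_results, ratio_min=4.5, lc_min=60))
```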

Having completed my exploration, I am now of the opinion that the minimum contrast threshold within the supplemental guidance for WCAG2 should be based on APCA Perceived Lightness Contrast. I don't currently believe that it's worth spending effort trying to develop patches to fix the WCAG2 algorithms: it would be difficult to develop a patch for the dark pairs problem that didn't make the bright colour pairs problem even worse (and vice versa). I will look forward to further discussions on this topic.

1 reply

Myndex Oct 29, 2021
Author

I agree with the conclusion that WCAG2 is not fit for purpose. When one of the colours is black, it incorrectly passes a great deal of colour combinations, which can be so dark that they are almost completely unreadable.

Yep.

65.8% of the pairs were passed by both algorithms,

27.9% incorrectly passed WCAG2, due to its issue with dark colour pairs.

6.3% of the pairs incorrectly failed WCAG2, as this excessively penalises bright colour pairs.

This is useful. I am going to guess that the WCAG 2 criterion was AA at 4.5:1 (fonts smaller than 18pt)...

I am now of the opinion that the minimum contrast threshold within the supplemental guidance for WCAG2 should be based on APCA Perceived Lightness Contrast. I don't currently believe that it's worth spending effort trying to develop patches to fix the WCAG2 algorithms: it would be difficult to develop a patch for the dark pairs problem that didn't make the bright colour pairs problem even worse.

Yea, it's not a patchable problem, and right now leadership tells me that there shall be no change that is not backwards compatible, meaning the math literally can't change for WCAG 2.

I was at one point going to promote the following SCs (patches):

At least one of the colors must be lighter than the equivalent luminance of #A0A0A0
No font used for content text may be less than 12pt/16px
Body text should be 10:1

But I was not able to get them in in time as Silver also had a deadline...

sdw32 · 2021-11-01T10:37:40Z

sdw32
Nov 1, 2021

Thanks for the follow-up. Apologies, my previous post was not clear, I was testing 3:1 WCAG2 against perceived lightness contrast>45.

I have just now repeated the analysis based on testing 4.5:1 WCAG2 against perceived lightness contrast>60. On this basis, the number of colours considered viable by either algorithm reduces to 596 colour pairs. Of these:

55.4% of the pairs were passed by both algorithms,
38.1% incorrectly passed WCAG2, due to its issue with dark colour pairs.
6.5% of the pairs incorrectly failed WCAG2, as this excessively penalises bright colour pairs.

This is now pretty close to the toss of a coin, which Andy has previously reported.

Regarding patches, increasing the contrast ratio would of course help to solve the dark colour pairs problem, but it would also make the bright colour pairs problem worse. In order to properly solve it, I think we would need to make the viable threshold depend on the brightness of the brightest colour, but I fear this would end up being too complicated to be practically useful. I will look forward to discussing this topic further in our regular calls.

Myndex · 2021-11-01T21:58:09Z

Myndex
Nov 1, 2021
Author

Thank you Sam, it's good to have independent verification.

Yea, the only "patches" that are practical are the few I mentioned, but they too are somewhat brute force.

The very rapid and wide acceptance and integration into contrast tools caught me by surprise. But this also means if there is a WCAG 2.3, the better bet than a patch is a WCAG 2.3 version of APCA, though this is not without issues, in particular if the old method "has to pass too" that kills the flexibility of a lot of APCA... I'd need to evaluate how much impact that would be.

The reason is that WCAG 2 states 3:1 for "everything" and then 4.5:1 for small text.

Converting to WCAG 2 ratios (roughly, in my head), APCA's spread is much wider: large non-text items that are non-lexical, like a button form, can be Lc 30, which is about 2:1. Moving up the scale, APCA is actually stricter, except when the text is brighter than about #bbb-ish and the BG is darker. APCA is slightly less strict when the background is brighter than #ddd.

But on that point, one of the guidance areas not yet mentioned is an overall page background that is not brighter than about #ddd. Here's a demonstration of that reason:

mraccess77 · 2021-11-02T02:27:15Z

mraccess77
Nov 2, 2021
Maintainer

Regarding page backgrounds - I actually prefer the full white background; it is much easier for me to read than the other one you provided. I'd note that Word, Outlook, Slack, OneNote, etc. all have black on white as the default as well. I know I may be an exception, and so I understand that for most people, including people with some types of visual impairments, it is not best - but for some low vision users black on full white is needed -- so we need to at least acknowledge that some people require that and may want to change settings for that - there are no absolutes in this world and some people need the extremes. The bright background keeps me focused.

I would personally agree that full white text on a totally black background is not good for reading for me and is much more problematic probably for a larger set of users.

1 reply

Myndex Nov 2, 2021
Author

Regarding page backgrounds - I actually prefer the full white background; it is much easier for me to read than the other one you provided.

Hi Jonathan @mraccess77

Yes, there is much personal preference here. The user adjusted monitor brightness for a given environment is one key (and uncontrolled) aspect.

I'd note that Word, Outlook, Slack, OneNote, etc. all have black on white as the default as well. I know I may be an exception, and so I understand that for most people, including people with some types of visual impairments, it is not best - but for some low vision users black on full white is needed

Yes, and part of what I've been saying along with many in the choir is the need for real personalization.

I for one can not tolerate #FFF background for more than a few minutes before my vision literally starts to shut down. And for me, reverse light on black works best.

There is some recent research suggesting that light on dark is healthier for eyesight: a study with younger individuals indicated that those who mainly used a white screen with black text developed myopia, compared with those using light text on dark.

-- so we need to at least acknowledge that some people require that and may want to change settings for that - there are no absolutes in this world and some people need the extremes. The bright background keeps me focused.

I agree here 100%. The absolute is there are no absolutes...

I would personally agree that full white text on a totally black background is not good for reading for me and is much more problematic probably for a larger set of users.

Here, research has not found a fully defined answer for everyone. While some research had claimed there was a benefit to dark on light, the claim was supported by a chart that was heavily biased. The actual difference is very minor other than for user preference.

Some research shows that those (like me) susceptible to glare have a problem with bright full white backgrounds and do better with light text on dark. The environment and ambient light are also significant factors.

sdw32 · 2021-11-02T10:40:10Z

sdw32
Nov 2, 2021

I have now refined my analysis that tests randomly generated colours, and the situation is even worse than previously reported.

This time around, I have focused the analysis on two real-world scenarios that are particularly relevant:

One of the colours is pure black, the other is random
One of the colours is pure white, the other is random

On this basis, the false pass of dark colours only occurs in the first scenario, and the false fail of bright colours only occurs in the second scenario.

Considering the first scenario, I generated 2000 random colours: 1000 colour text on black background and 1000 black text on colour background. Of these 2000 colours, WCAG2 suggested 1286 passed AA 4.5:1. APCA suggested 570 passed perceived lightness contrast>60. This gives a total of 716 colours that passed WCAG2 when they ought to have failed, a whopping 55.7% of the 1286 WCAG2 passes. So, if one of the colours is black and the other is random, WCAG2 appears to be less accurate than tossing a coin.

If we look at the second scenario, I generated 2000 random colours: 1000 colour text on white background and 1000 white text on colour background. Of these 2000 colours, WCAG2 suggested 733 passed AA 4.5:1. APCA suggested 1122 passed perceived lightness contrast>60. This gives a total of 389 colours that failed WCAG2 when they ought to have passed, which is 34.7% of the 1122 colours that ought to have passed.
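
The arithmetic behind those two percentages can be checked in a few lines of Python. This only re-derives the rates from the counts quoted above. It relies on the statement that false passes occur only in the black scenario and false fails only in the white scenario, so that a simple subtraction gives the number of disagreements.

```python
# Counts quoted above (2000 random colours per scenario).
black_wcag_pass, black_apca_pass = 1286, 570
white_wcag_pass, white_apca_pass = 733, 1122

# Scenario 1 (one colour is black): WCAG 2 passes that APCA rejects.
false_pass = black_wcag_pass - black_apca_pass
false_pass_rate = false_pass / black_wcag_pass   # share of WCAG 2 passes

# Scenario 2 (one colour is white): APCA passes that WCAG 2 rejects.
false_fail = white_apca_pass - white_wcag_pass
false_fail_rate = false_fail / white_apca_pass   # share of APCA passes

print(f"{false_pass} false passes = {false_pass_rate:.1%} of WCAG 2 passes")
print(f"{false_fail} false fails = {false_fail_rate:.1%} of APCA passes")
```

Note the two rates use different denominators: the first is a share of what WCAG 2 passed, the second a share of what APCA passed.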

I made a webpage where the left hand column showcases the entire set of WCAG2 false passes (black & random colour). The right-hand column showcases the entire set of WCAG2 false fails (white & random colour). http://www.cedc.tools/pairs.html.

For the sake of this webpage, I set the font-family to Arial and the font-size to 12px.

I sorted the table by APCA perceived lightness contrast, so the most epic incorrect pass for black & random colour is at the top of the left-hand column. This is R56,G130,B88 text on pure black background, which just passes 4.5:1 but is perceived lightness contrast = 30.

The most epic incorrect fail for white & random colour is at the bottom of the right-hand column. This is white text on R43, G133, B113 background, which just fails 4.5:1, but is perceived lightness contrast = 76.
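
The perceived lightness contrast figures for these two pairs can be approximated with a short Python sketch. The constants and structure below are an assumption: they follow the publicly released APCA 0.0.98G-4g reference code as best I understand it, which may not be the exact version used for the table. Treat it as a sketch and check it against the official APCA repository before relying on it.

```python
def apca_luminance(rgb):
    """Estimated screen luminance Y using a simple 2.4 exponent (assumed constants)."""
    r, g, b = ((c / 255) ** 2.4 for c in rgb)
    return 0.2126729 * r + 0.7151522 * g + 0.0721750 * b


def soft_clamp_black(y, threshold=0.022, exponent=1.414):
    """Lift very dark values slightly, modelling flare near black."""
    return y if y > threshold else y + (threshold - y) ** exponent


def apca_lc(text_rgb, bg_rgb):
    """Signed Lc: positive for dark text on light, negative for light on dark."""
    y_txt = soft_clamp_black(apca_luminance(text_rgb))
    y_bg = soft_clamp_black(apca_luminance(bg_rgb))
    if abs(y_bg - y_txt) < 0.0005:
        return 0.0
    if y_bg > y_txt:  # normal polarity: dark text on light background
        sapc = (y_bg ** 0.56 - y_txt ** 0.57) * 1.14
        return 0.0 if sapc < 0.1 else (sapc - 0.027) * 100
    # reverse polarity: light text on dark background
    sapc = (y_bg ** 0.65 - y_txt ** 0.62) * 1.14
    return 0.0 if sapc > -0.1 else (sapc + 0.027) * 100


# The two pairs named above: (text, background).
dark_pair = ((56, 130, 88), (0, 0, 0))
bright_pair = ((255, 255, 255), (43, 133, 113))
for text, bg in (dark_pair, bright_pair):
    print(text, "on", bg, "-> Lc", round(abs(apca_lc(text, bg)), 1))
```

Under these assumptions the magnitudes should land close to the 30 and 76 quoted above. Unlike the WCAG 2 ratio, the result depends on which colour is the text and which is the background.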

The bottom line here is that for display screens in typical lighting conditions, the eye perceives the brightness contrast of text as a difference, not as a ratio. If the model based on ratios was correct, then everything in the left-hand column of http://www.cedc.tools/pairs.html should be easier to read than everything in the right-hand column. Evidently, this is not the case.

I will look forward to discussing the implications of this in due course.

2 replies

bruce-usab Jan 5, 2022

I made a webpage where the left hand column showcases the entire set of WCAG2 false passes (black & random colour). The right-hand column showcases the entire set of WCAG2 false fails (white & random colour). http://www.cedc.tools/pairs.html.

The pairs are very useful @sdw32, thanks so much! Also, I completely agree with the design approach, lots of randomly selected RGB values. Totally valid.

That said, I will greedily take the liberty to ask for more!

I would like to see a version for each column that is ordered somehow. If one goes to just about any large/long list of colors, they will have a visual sorting.
After the sort, see if you cannot compress the results even more! Maybe 16x16 px for each color, each under a single 8 px black or white letter?

sdw32 Jan 5, 2022

Thanks Bruce for the follow-up. Reading individual words, or indeed individual letters, is a different visual task than reading sentences. The contrast algorithms are proposed to work for body paragraphs of text, hence why my demonstration page uses multiple words, and uses random words to force the eye to actually 'read' the words. For real sentences most of the words can be guessed based on context.

Regarding the sorting of the list, each column is currently sorted in order of APCA Perceived Lightness Contrast. Please let me know if you have any further comments on this issue.

mraccess77 · 2021-11-02T14:15:30Z

mraccess77
Nov 2, 2021
Maintainer

@sdw32 I think it is really helpful to have this comparison chart for people to compare and also to get an idea of the colors we are talking about. I generally agree that most of the dark combos on the left are harder to read and cause issues - I agree this is a limitation in the current algorithm. Two of the dark combos that were failed, but that I found no problems with, were:

who but new as now two what come very how up to she use want me on if out their 9.65:1 PLC=59.7
and
be they know do people time then thing go which these one when there make say have many or than 9.73:1 PLC=59.8

In terms of the white text ones - while there are some that WCAG unfairly flags that are really pretty easy to read - there are some on the list of APCA passes that are very hard for me to read. I like that APCA is a sliding scale, because size also does make a difference, and for me which color is the foreground and which is the background makes a big difference. For example, a white text on green may be ok for me but a green text on white background is not. Also some of these that are problems at 12pt for me likely would be ok at 17pt normal weight for me.

1 reply

sdw32 Nov 2, 2021

Thanks for the follow-up, indeed, the dark combos that you found no problem with had plc = 59.7 and 59.8, so these only failed plc>60 by the tiniest fraction. The far bigger concerns arise with the dark pairs towards the top of the table, and the bright pairs towards the bottom of the table.

And yes, of course all of the text would be easier to read if it was 17pt, but the point is that these dark colour combinations are passing WCAG2 4.5:1, which is the level that passes AA for normal-sized body text. I will look forward to further discussions on this topic.

Myndex · 2021-11-03T03:46:32Z

Myndex
Nov 3, 2021
Author

Hi Sam and Jonathan @sdw32 @mraccess77

Independent Validation Thanks

Thank you Sam for your deep analysis and validation here, especially as my paper on APCA is not ready to submit, and I know some have been eager to see some level of peer review. You are one of the few other maths-accomplished individuals here, and it's very good to hear this feedback. I've had to work doubly hard to ensure I am not falling into any confirmation bias which is a very real issue in perception research.

Contrast Thresholds

First, just as an FYI, 59.7 can count as 60. The spec indicates rounding to the nearest integer. But do keep in mind that size is a critical component.

Also, when used in a pass/fail way, the Lc60 and Lc75 levels are part of the "simple" APCA, originally intended as a replacement for WCAG 2 — the "full matrix" APCA does not set those as a pass/fail, but a sliding scale for a given font.

As we move forward, I expect two distinct options for designers: one is a sliding scale for design and testing to accommodate more flexible design needs, and the alternate choice for a site is the simple test version, which is set up more like the WCAG 2 SCs.

Size/Weight Thresholds etc.

APCA contrast is tightly connected to font weight and size for fonts smaller than about 32px. I AM working on the lookup tables, and the current lookup tables are a little off by a px or two.

On Sam's experiment page, a 12px font is used. But per the APCA guidelines, 12px is discouraged for any fluent text, and needs to be Lc 90 if used for fluent purposes. The "standard" is moving to 16px at Lc 75 and 24px at Lc 60, and might relax from these values to 14px and 21px for 400 weight.

Assuming weight 400-500:

For Lc 90: 12px minimum
For Lc 75: 16px minimum (may become 14px)
For Lc 60: 24px minimum (may become 21px)
For Lc 45: 42px @ 400 or 24px bold (700)
For Lc 30: NON FLUENT ONLY 80px @400 or 42px bold
- This is getting to the minimum limit for some non text.

These values are still under study. For a dollars to donuts comparison, at Lc60 I suggest using no less than 21px, and at Lc75 no less than 14px for a 400 weight font.
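
As a rough illustration of how the list above could be used as a lookup, here is a Python sketch for weight 400 only. The table simply copies the current (stricter) draft values from the list, which are stated to be still under study. The names and the table layout are hypothetical, and this is not the official APCA lookup table.

```python
# (minimum Lc, minimum font size in px at weight 400, allowed for fluent text)
DRAFT_MIN_PX_WEIGHT_400 = [
    (90, 12, True),
    (75, 16, True),
    (60, 24, True),
    (45, 42, True),
    (30, 80, False),  # listed above as non-fluent only
]


def min_font_px(lc, fluent=True):
    """Smallest listed font size for a given Lc, or None if no row applies."""
    level = round(abs(lc))  # rounded to the nearest integer, so 59.7 counts as 60
    for min_lc, px, fluent_ok in DRAFT_MIN_PX_WEIGHT_400:
        if level >= min_lc and (fluent_ok or not fluent):
            return px
    return None


print(min_font_px(59.7))              # row for Lc 60
print(min_font_px(30))                # no fluent row at this level
print(min_font_px(30, fluent=False))  # non-fluent row for Lc 30
```

A fuller version would add a weight dimension (for example the bold alternatives listed for Lc 45 and Lc 30), which is left out here to keep the sketch short.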

History and Early Evaluations

Again Sam thank you for the independent verification... I am working on more than one paper regarding this, and I intend on gathering additional empirical data before I do, though there seems to be some impatience regarding me publishing.

I also realize that the volume of my writing has been hard for some to digest. I've been stating these issues since early in thread 695, and your experiments here echo my early experiments. It's useful to hear it in a different voice. Some key experiments are still online. Here are a couple of links if you are curious about some of the background and early development.

The list of experiments that I had made public are here: https://www.myndex.com/WEB/Perception

Some of them have the link "offline" to prevent search engine linking and the resultant confusion, but I can give you a link to any that are listed, and some links are obvious as I outline below. But please keep in mind that I am FAR past these as a source of data. These are more of historical interest.

Historical

These links are a few of the historical investigations. They contain obsolete material. We are far past this initial material, and the most recent material has ventured into a proprietary area and will be released when published.

The "first salvo" April 2019 in issue 695 was the "W3Contrastissue" page. It demonstrates several of the key concepts that you are mentioning above, and that led ultimately to the invention of APCA.
- https://www.myndex.com/WEB/W3Contrastissue
Many of the later demonstrations are numbered and you should be able to determine the URL if you look at one like
- https://www.myndex.com/WEB/WCAG_CE16 is the base series with add ons like
- https://www.myndex.com/WEB/WCAG_CE16cc which is the constant contrast examination of SAPC 06, an early version of what evolved into APCA.
Some of these show the results of multiple other contrast maths including of course WCAG 2.
Some of these are evaluations of various contrast maths, the CEX and CEXI series.
There are various versions from 09 through 17, (13 is skipped), and each with different explorations.
The main list on the Perception page though has live links for what I think is most important.
NONE of these are displaying the current SAPC and APCA, which is at version SAPC 08, for APCA 0.98G04g.
- The above are all the 2019 examples. For various reasons I stopped publicly presenting anything new, except for beta releases and technology demonstrators.
- The current APCA adjusts some of the non-linearities seen in SAPC 06 to the point the limitation now is largely due to sRGB data at 8-bit. At lower contrasts near threshold, the error is 8-bit noise.

TL;DR

Thanked Sam for the independent verification review, listed a few specifics and work in progress RE: the lookup tables.

sdw32 · 2021-11-04T15:20:21Z

sdw32
Nov 4, 2021

Thanks to Andy for the detailed follow-up. I had a quick look at issue 695 and it is unbelievably long! Congratulations for your years of persistence on this topic.

Regarding the size of the text on my test page, I was primarily interested in comparing one contrast calculation algorithm against the other, so I had intentionally set the text to be 12px for everything, which is slightly on the small side, intending to give both algorithms a good workout at something close to the limit of readability. However, I can see why this has caused confusion, as the PLC threshold that I was testing at was not intended to be applied to that text size. I have now changed the text size to 16px. Even though this still isn't quite at the actual intended size threshold for PLC>60, it makes all the text easier to read from both algorithms, while still being small enough that the differences between the algorithms are still apparent.

Furthermore, I've been trying to think about the contextual factors that would influence the severity of the dark-pairs false pass and bright pairs false fail issues with WCAG2 (4.5:1). Lots of these factors have already been mentioned before in 695 and elsewhere, but please can you check if I have summarised these correctly?

The ambient lighting, where I would expect dark pairs to perform better when the entire room has minimal lighting
The dominating colour of all the blank space on the screen, I would expect dark pairs to perform better when the entire screen is black,
the gamma function/quality of the monitor

I had a go at playing with the brightness and contrast settings on my monitor, and this didn't seem to favour either of the columns of my test patches.

In order to consider the significance of the colour of blank space on the screen, I made a version of test page where the entire background is black and made a separate page where the entire background is white. The dark pairs appear to be considerably more readable with the black page background.

I also had a look at my test pages on some different laptops and monitors that I have easy access to. The dark pairs appeared to be more readable on some of these monitors than others.

So, I would be glad to find out more detail regarding the contextual factors for the experiments that were performed in order to calibrate the APCA model, and the extent to which these generalise to real-world usage of laptops, monitors, phones and tablets. Also, I'd be glad to know if I have missed / misunderstood anything critical, or if any of the above effects could be better described. Many thanks.

1 reply

Myndex Nov 5, 2021
Author

Thanks to Andy for the detailed follow-up. I had a quick look at issue 695 and it is unbelievably long! Congratulations for your years of persistence on this topic.

Hi Sam @sdw32

Yes, this has been my primary research focus for years, in part as literally all of my life/work experience is involved, from imaging to visual perception to typography... I stopped adding to thread 695 long ago — the Visual Contrast Wiki and the GitHub repo, and my other articles are where I've continued putting public information.

There is a chunk of proprietary material, more related to other IP, that is not yet public for a variety of reasons, among them pending empirical studies where I don't want undue influence or prior knowledge, and IP protection issues.

Regarding the size of the text on my test page, I was primarily interested in comparing one contrast calculation algorithm against the other, so I had intentionally set the text to be 12 PX for everything, which is slightly on the small side, intending to give both algorithms a good workout at something close to the limit of readability.

I can get behind that, I was just noting it as a factor. Something worth looking at is the relationship between the minimum acuity size and contrast, and the shift that occurs when size and weight are increased. It's not linear. There is also an inflection point where contrast constancy effects take over, and that is also related to spatial frequency.

Furthermore, I've been trying to think about the contextual factors that would influence the severity of the dark-pairs false pass and bright pairs false fail issues with WCAG2 (4.5to1)

The WCAG2 pass/fail issues are really about using some oddly concocted math that lacks empirical support, among other things. Some of the links I posted above contain a lot of test samples that include values from multiple forms of existing contrast maths. WCAG2 is consistently, literally the worst performer among all of them. In fact, you "could" say it is backwards, doing the opposite of what it needs to. Confirmation bias is a related factor.

The ambient lighting, where I would expect dark pairs to perform better when the entire room has minimal lighting

The dominating colour of all the blank space on the screen, I would expect dark pairs to perform better when the entire screen is black,

the gamma function/quality of the monitor

The second of your statements in this list is the most accurate one. Though in the first 695 experiment called W3Contrastissue (linked above) there is an example of just that, and a discussion of ambient lighting, which is a factor, but not a linear relationship. There is a theoretical relationship to the gamma setting of the monitor, but there are some unexpected aspects mixed in.

I had a go at playing with the brightness and contrast settings on my monitor, and this didn't seem to favour either of the columns of my test patches

Okay, so here are the unexpected aspects: which monitor? Which OS? Color management? Calibration type? If you are using anything MacOS based, and there is significant integration such as an iMac or Macbook, the color management system "undoes" a lot of adjustments you might be making, if also viewing in Safari. Firefox will usually perform differently, and you can set Firefox or Chrome to ignore all color management in the hidden settings.

Next, contrast constancy is a real thing, though more for lower spatial frequencies (big and bold), and for evaluations, higher spatial frequency (small and thin) exacerbates issues of color distance. Along the lines of spatial frequency are the local adaptation effects: a glance at a test strip and then at the adjacent one may indicate that one is more contrasty than the other at a similar value. When that far supra-threshold, local adaptation has a big effect, and it takes a few seconds to adjust to a different polarity.

In order to consider the significance of the colour of blank space on the screen, I made a version of test page where the entire background is black and made a separate page where the entire background is white. The dark pairs appear to be considerably more readable with the black page background.

Yes, this is a fairly well described phenomenon. Polarity experiment CE17 has an example of this. For a practical guideline though, we can look more toward a worst case, i.e. in normal room light and with a particular monitor setting. Some preliminary empirical findings (that I hesitate to discuss) are that both bright and dark screens have a similar effect on middle contrast perceptions. This is part of some ongoing work though, so don't read too much into it, but it is not as cut and dried as it might at first seem. Barten's work in the 90s is particularly illustrative here.

I also had a look at my test pages on some different laptops and monitors that I have easy access to. The dark pairs appeared to be more readable on some of these monitors than others.

Yes, some of this can be color management (or lack thereof), others can be related to monitor calibration. Most mobile devices have no color management at all, as it is a big drain on processor and power.

So, I would be glad to find out more detail regarding the contextual factors for the experiments that were performed in order to calibrate the APCA model, and the extent to which these generalise to real-world usage of laptops, monitors, phones and tablets. Also, I'd be glad to know if I have missed / misunderstood anything critical, or if any of the above effects could be better described. Many thanks.

The publicly available stuff is public, and I've provided links in above posts. I've hinted at a few other things, but this is an ongoing area of work. I will tell you that the experiments are being conducted on multiple monitors and multiple device types in multiple environments. The baseline though is described in the SAPC Standard Observer which is listed at the GitHub repo.

One consideration is that users adjust their screen brightness to preference, which is relative to their environmental ambient lighting and their adaptation level.

Some of what you are asking relates to some unreleased IP that I can not discuss at the moment, but will possibly soon.

On the SAPC site there is a "research mode" with some interactive experiments that demonstrate some of the concepts.

sdw32 · 2021-11-05T14:27:01Z

sdw32
Nov 5, 2021

Thanks to Andy for all the additional details. As I understand things, the current state of play is that the contrast calculation algorithm and some experimental results have been published, but the full set of experimental results that supports the latest version of the contrast calculation algorithm has not been published. Additionally, there is a proposed algorithm that gives a minimum font-size for a given perceived lightness contrast and font-weight, but this algorithm has not been published, and neither has the evidence that supports it.

From my limited experience with guidelines and IP, I would imagine that any algorithm or look up table that needs to be used in order to follow the guideline will need to be royalty-free for anybody else to use, and program their own software based on this. I would have thought any evidence that is necessary to support this algorithm ought to be reviewed before the algorithm is incorporated as part of the guidelines. If the supporting evidence has already been published then this is straightforward. If the supporting evidence is not intended to be published, then I wonder if NDA arrangements might be needed to enable such a review process?

Furthermore, I would imagine that if someone creates software that uses the algorithms that eventually become part of WCAG guidelines, this software does not need to be offered freely. I must confess I am a little confused as to where the boundary is in this regard between the WCAG3 (silver) guidelines, and the tool that is here: https://www.myndex.com/APCA/.

Nevertheless, for the supplemental guidance that we are discussing for WCAG2, I believe Andy has proposed contributing an APCA-Lite version that just contains a few of the size/contrast cut-offs, so this may make things a bit simpler.

For the more imminent issue of the supplemental guidance, I'd be glad if someone could confirm if I have understood the above correctly, and/or describe the expected timeline/roadmap/procedures regarding access to review the empirical evidence that supports any algorithm that is proposed to be included in supplemental guidance.

And finally, I'll summarise my current position on the APCA contrast algorithm as follows: my preliminary review indicates that APCA contrast calculation is better than WCAG2 contrast ratios, so I'd be delighted to see the supplemental guidance use this method instead of contrast ratios. However, I'm nevertheless reluctant to support the APCA algorithm, without having had the chance to review the empirical evidence that confirms it is better, for the range of expected real-world use cases.

I would be grateful for any guidance or further help that can be offered on this topic, many thanks.

mraccess77 · 2021-11-06T15:29:38Z

mraccess77
Nov 6, 2021
Maintainer

@sdw32 Any terms related to the use of tools would be worked out at the W3C and Accessibility Guidelines level and not with the co-facilitators of the Low Vision Task Force.

The ability to create tools that use the formula, and access to the research that validates it, are important aspects that they are aware of. In terms of how it impacts supplemental guidance, this is an area where we will need to keep flexibility in how we write the guidance, so that it could stand with or without the formula depending on how the timing of those discussions goes.

They are tracking that issue - but I think it's worthwhile for the group and others seeking to use the tools to be aware of this dependency.

Myndex · 2021-11-08T12:32:44Z

Myndex
Nov 8, 2021
Author

Hi @sdw32 and @mraccess77

At the moment, only my patent attorney and I discuss my IP. Because studies are ongoing, and delayed due to COVID, some things are necessarily still under wraps. I was very open initially, but due to the general atmosphere here I've been a little more careful, which is not my normal open operating method. I have nevertheless provided everything needed to duplicate my efforts.

I have indicated where this information is. For the most recent refinements, there is some empirical data that is not yet collated into charts or graphs. Theory of operation, conformance, uniformity, and comparative superiority have all been demonstrated with the information that is publicly available.

APCA has been integrated into dozens of tools, and people are very receptive. I have shown its efficacy in multiple articles with visual examples.

Publications

That said, I am working toward publication. I am funding this project myself, so there will not be any open access, as the cost to submit to these pay-for-play deals is in the thousands. As a writer, I am used to being paid to write... in the scientific community, scientists are required to PAY to be published in open access journals.

While they are not written in an academic format, the following articles have substantial visual comparisons, and are written in a more casual manner for a general audience.

GENERAL-PUBLIC-ORIENTED & PUBLIC FACING:

Fonts for Accessibility

https://www.myndex.com/PUB/PDF/AccessibleFontsD.pdf

Contrast

Part I: Orange You Wondering About Contrast? Answering some contrast questions, and demonstrating a real solution to the infamous orange conundrum.

Part II: The Lighter Side of Dark Backgrounds An article comparing some parts of APCA with the old WCAG 2 contrast methods.

NEW! Part III: WCAG 2 vs APCA Contrast Shootout

A gist answering some recent questions regarding APCA, and comparisons and examples of the old (WCAG 2 1.4.3) and the new WCAG 3 with APCA.

Determining Luminance Contrast A discussion and code examples for determining luminance, and various methods of determining luminance contrast. (Written prior to the invention of SAPC/APCA)

Links

The Myndex Linktree A bunch of links relating to APCA contrast and color.

SAPC and APCA The WCAG3/Silver Contrast Method and Algorithm repository. This is the canonical source for the latest version of APCA.

SAPC Contrast Research Tools The WCAG3/Silver Contrast Algorithm beta site. Includes interactive experiments that demonstrate the concepts.

perception studies site A list of publicly available experiments. CE17 includes a white paper with theory of operation.

Color

NEW! Let's Flip for Color! If you want your text to be either black or white if the user selects some random color, just where is that inflection point? Hint: It's NOT 18% Y.

Part I: For The Luv of Color An article comparing CIE Lab and Luv colorspaces

Part II: Will Work for Color A follow up article on working spaces and related considerations. Introduces the concept of "Web Working Spacelets".

COLORSPACES - The Primal Frontier A brief look at the math that helps model how we see.

And let's not forget the Visual Contrast Wiki, and try the experiments at the SAPC site.

AS I HAVE ALSO STATED

The following existing research has played a large role:

The Bailey/Lovie-Kitchin studies on critical contrast and critical size
The Barten contrast studies
The Dr. Legge studies on readability
Studies of Maureen Stone (NIST), Dr. Arditi, Dr. Poynton, Larry Ahrend's NASA "designing for display" site, FAA standards, and the reference books of RW Hunt, M Fairchild, etc.
Fairchild's R-LAB was one source of inspiration, as is Maureen Stone's discussion on the subject from her PARC research.
Brettel et al., whose 1997 & 1999 work was used in my clinically accurate CVD simulator.

And obviously I am not listing the several threads such as 695. Below are a number of infographics created in this project.

Thank you,

Andy

Andrew Somers
Invited Expert, AGWG, Silver, LVTF
Myndex Color Science Researcher
Inventor of SAPC & APCA

THE REVOLUTION WILL BE READABLE™

ANNEX: INFOGRAPHICS

These are copyright and not for distribution, not to be used without permission, here for discussion only.

Demonstration: it's not too much contrast, it's too much white.

This is an example of one of the guidelines I am working on relating to excessive contrast or luminance.

WCAG 2 vs APCA — Shootout

Chromaticity diagram

Critical font size, based largely on the Bailey/Lovie Kitchin research

Illustration of the problems with font sizing and font metrics in general

A lookup chart of equivalent pixel density

APCA to WCAG2 comparison

The "famous" Orange issue

sRGB Spectral outputs

Contrast Sensitivity and Font Weight

Key factor in creating the APCA font lookup tables.

User interface of SAPC research software

This is the software I developed for determining the constants (exponents) for APCA. Not seen are the stimulus panels viewed by my test subjects. I am not particularly comfortable showing these more fully as they are part of ongoing studies.

Various studies or experiments

Plot of perceptual error of WCAG 2

Three Equal Contrasts — but are all three equal in readability?

Examples of context sensitivity.

CVD depicted

sdw32 · 2022-01-05T09:33:54Z

sdw32
Jan 5, 2022

I have slightly refined the demo pages that I created. http://www.cedc.tools/pairs1.html shows the whole page with a completely black background, with a white rectangle behind the text in the right-hand column. http://www.cedc.tools/pairs2.html shows exactly the same colour combinations, but this time the whole page has a completely white background, with black rectangles behind the text in the left-hand column. This demonstration shows that dark pairs are considerably more readable when the background of the whole page is black, compared to when the background of the whole page is white. Similarly, I would imagine that the dark pairs would be more readable in a dark room, as compared to a brightly lit room. Nevertheless, I think it's fair to say the model ought to be calibrated to screens with white backgrounds in typical indoor lighting, in which case the dark pairs false pass is still of considerable concern.

1 reply

bruce-usab Jan 6, 2022

Thanks! I think pairs1 and pairs2 very much facilitate human browsing (as compared to the initial version).

I understand/appreciate/agree with the focus on fluid reading.

That said, I still think there is also value to be gained from any single-page (no-scrolling needed) demonstrations we might gin up to highlight the utility of the L^c metric.

Yes, most of these will present as a kaleidoscope. But if the letters are all white and densely packed in, and if the color backgrounds are similar enough in brightness to each neighbor, it might be possible to have a sentence that remains legible enough for fluid reading.

The next challenge after that would be a superficially similar version where the 2.0 ratios are 4.5:1 (or better), but fluid reading is impossible for most people.

sdw32 · 2022-01-06T14:25:59Z

sdw32
Jan 6, 2022

Thanks Bruce for the follow-up.

Inspired by your suggestions, I have now made several more demo pages. I have separated things out so that random colour backgrounds are now on a different page from random colour text. I have also edited my code so that it can produce tables with a variable number of columns, reduced the number of words per line, and tightened up the margins a bit.

So, there are now the following demo pages:
8 random words with random text colours on white/black backgrounds
8 random words with white/black text on random colour backgrounds

Thinking about it some more, presenting these pages completely separately is better than the previous approach as the effect of darkness is now much better counterbalanced.

Additionally, if you're interested in spot reading rather than fluent body reading, I have produced demo pages with exactly the same colour combinations, but each cell only contains the numbers for Perceived Lightness Contrast and contrast ratios, so you get a lot more colours on the screen at once.
Numbers only with random text colours on white/black backgrounds
numbers only with white/black text on random colour backgrounds

I will look forward to further discussions on this topic

1 reply

bruce-usab Jan 6, 2022

These are so great!

if you're interested in spot reading rather than fluent body reading...

My interest is not rather but and also.

I agree that fluent body reading is the more important concern. At this moment in time, from the perspective of someone particularly interested in metrics that might go beyond guidance and best practices — that is, metrics which might be adopted as enforceable requirements all the time and for all digital content — I remain skeptical that we will be able to draw a bright line between spot reading and fluent body reading. (I do remain open to that possibility, I am just saying we are not there yet. So please continue to keep spot reading in scope where you can.)

Also, the distinction between spot reading and fluent body reading seems more ephemeral to me because of my understanding that the distinction loses utility for someone who relies upon screen magnification.

sdw32 · 2022-01-07T10:30:30Z

sdw32
Jan 7, 2022

Thanks Bruce for the comments, I agree that 'and also' is correct, and in fact, thinking about it further, communicating the difference between WCAG2 and APCA is more straightforward for spot reading.

I've also been thinking about this topic further, and have now produced my best ever demonstration page showing how WCAG2 incorrectly predicts the readability of text on colour backgrounds. The latest demo page is here http://www.cedc.tools/rback_numbers_vs.html

Essentially, in every case, if WCAG2 is correct then every instance of black text ought to be more readable than every instance of white text, and all the black text is a pass. If APCA is correct, then every instance of white text ought to be more readable than every instance of black text, every instance of black text is a fail, and every instance of white text is a pass. This page is clearly counterbalanced for darkness adaptation, so I believe it's theoretically sound and also extremely compelling!

For those interested in the details, I generated 4000 random colour backgrounds sRGB(R,G,B), with each channel in the range 0-255. Then, for each background, I calculated the WCAG2 ratio and the APCA value for black text on that colour background, and the same for white text on that colour background. Then I looked for specific instances where all five of the following held: WCAG2 said the black text was better contrast than the white text; WCAG2 said the black text was a pass (>4.5:1); APCA said that the white text was in fact better contrast than the black text; APCA said the black text was a fail (APCA<60); and APCA said the white text was a pass (APCA>60).
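The five-condition filter described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the demo page: the WCAG 2 relative luminance and contrast ratio follow the published formula, while `apca_lc` is a hypothetical callable that the caller must supply (for example from one of the existing APCA tools) and that is assumed to return the absolute APCA value for a text colour on a background colour. The function names and the seed are also assumptions.

```python
import random

BLACK, WHITE = (0, 0, 0), (255, 255, 255)

def wcag2_luminance(rgb):
    """WCAG 2.x relative luminance of an 8-bit sRGB colour."""
    def linearise(c8):
        c = c8 / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = rgb
    return 0.2126 * linearise(r) + 0.7152 * linearise(g) + 0.0722 * linearise(b)

def wcag2_ratio(colour_a, colour_b):
    """WCAG 2.x contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    la, lb = wcag2_luminance(colour_a), wcag2_luminance(colour_b)
    return (max(la, lb) + 0.05) / (min(la, lb) + 0.05)

def is_disagreement(bg, apca_lc, ratio_pass=4.5, lc_pass=60):
    """True when all five conditions from the post hold for this background.

    apca_lc(text, bg) is a caller-supplied (hypothetical) function that
    returns the absolute APCA value for that colour pair.
    """
    ratio_black = wcag2_ratio(BLACK, bg)
    ratio_white = wcag2_ratio(WHITE, bg)
    lc_black = apca_lc(BLACK, bg)
    lc_white = apca_lc(WHITE, bg)
    return (
        ratio_black > ratio_white      # 1. WCAG 2 rates black text higher
        and ratio_black > ratio_pass   # 2. WCAG 2 passes the black text
        and lc_white > lc_black        # 3. APCA rates white text higher
        and lc_black < lc_pass         # 4. APCA fails the black text
        and lc_white > lc_pass         # 5. APCA passes the white text
    )

def random_backgrounds(n, seed=0):
    """n random sRGB backgrounds, each channel uniform in 0-255."""
    rng = random.Random(seed)
    return [tuple(rng.randint(0, 255) for _ in range(3)) for _ in range(n)]
```

With a real APCA implementation passed in as `apca_lc`, the matching backgrounds would be `[bg for bg in random_backgrounds(4000) if is_disagreement(bg, apca_lc)]`.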

Given that I was generating colours completely randomly, and looking for a quite specific outcome with all five of these conditionals being true, I was expecting to find a few edge case colours where WCAG2 performed particularly poorly.

In fact, out of the 4000 random colours, 907 (23%) matched this particular set of conditionals. This is even more shocking if you consider that only 2656 of the random colours passed WCAG2 in the first place. So, if you restrict attention to the colours that WCAG2 passed, this particular epic fail condition occurs for 34% of these colours.

To produce the demo page, I filtered the 907 epic fail conditions to only present the colours that were most different to each other (in Euclidean sRGB coordinates). There's some obvious improvement that can be made in the filtering, but the page is pretty compelling 'as is'!!

alastc · 2022-01-11T11:17:02Z

alastc
Jan 11, 2022

As I read through this thread, particularly one titled "shootout", there are two aspects I'm not getting that I hope someone can help with. These come from various experiences from usability testing, where I've seen quite a variety in which colour combinations cause issues for various people. I get uncomfortable when the visibility of a colour combination to people without an impairment is used as evidence for its visibility from an accessibility point of view.

The two main questions I have are:

When talking about percentages of colour combinations that aren't right with a particular model, is that out of the whole colour space of combinations, or some sub-set?
Is it in comparison to a theoretical model, or the variety of human sight?

I think I need to show my current mental model about this. I am assuming that if you looked at a colour space of hues, and mapped which people (with no visual impairment) could read against white text, it might look something like this:

Where anything under the line didn't have enough contrast against a white background. That line would probably wave up and down a bit with different hues, but just take that as a baseline for regular vision.

Then, if someone had low-vision with less acuity, presumably that line needs to move up?

Then, if we accounted for various colour deficiencies (e.g. protanopia) you'd need to adjust that line in various ways depending on the hue used.

Then, to produce an overall model that accounted for both you'd want to be failing anything under either line:

I'm also assuming that the WCAG 2 formula is similar to the straight line, and the formula behind APCA is more bendy (apologies for the technical language ;-) ) to better match (non-impaired?) perception.

However, when talking about percentages, what is it if you include all the colours that would fail both? Because (from my mental model above) there are a ton of combinations that are terrible and should fail both, so a ~50% incorrect rate just doesn't sound right.

bruce-usab · 2022-01-11T14:44:50Z

bruce-usab
Jan 11, 2022

@alastc the recent experiment (and percentages) @sdw32 did was only for color pairs where WCAG2 said the black text was better contrast than the white text.

@Myndex — the bar graph @alastc describes, I agree, would be very valuable.

I am expecting the APCA line to be a smooth curve, mostly above the 2.x line, but maybe a dip below.

alastc · 2022-01-11T15:05:20Z

alastc
Jan 11, 2022

Thanks Bruce, so it's a sub-set of colour combinations.

I think it's important (when describing percentages of correct/incorrect) to indicate what that is as a proportion of the whole and whether it is for 'normal' visual perception or incorporating impairments.

sdw32 · 2022-01-13T14:33:14Z

sdw32
Jan 13, 2022

Thanks to Alistair and Bruce for the follow-ups, and I also notice Andy sent a detailed reply on a new thread here Myndex/SAPC-APCA#55.

So I'll just add a few additional comments here. In my earlier post on the random colour generator and the percentage of false passes, I said this:

Considering the first scenario, I generated 2000 random colours, 1000 colour text on black background and 1000 black text on colour background. Of these 2000 colours, WCAG2 suggested 1286 passed AA 4.5to1. APCA suggested 570 passed perceived lightness contrast>60. This gives a total of 716 colours that passed WCAG2 when they ought to have failed, a whopping 55.7%. So, if one of the colours is black, and the other is random, WCAG2 appears to be less accurate than tossing a coin.

When generating completely random colours for foreground or background, one has to bear in mind that plenty of these will be completely nonsensical, and no human would ever even attempt to try and make a website with those colours. As such, percentage of the whole isn't a particularly interesting statistic. I addressed this by instead reporting 'if a random colour passes WCAG2, what percentage of these ought to have failed'. I maintain that this scope is much more relevant, as it only examines the subset of things that passed, which means the percentage reflects colour combinations that might actually appear on real-world websites.
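To make the difference between the two denominators concrete, here is the arithmetic from the quoted figures as a small sketch. The counts are the ones given in the quote; the subtraction assumes, as the quote does, that every pair passing APCA also passed WCAG2.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

total = 2000        # random colour pairs generated
wcag2_pass = 1286   # pairs passing WCAG2 AA at 4.5:1
apca_pass = 570     # pairs passing APCA at perceived lightness contrast > 60

# Assumes every APCA pass is also a WCAG2 pass, as the quoted subtraction does.
false_pass = wcag2_pass - apca_pass

print(false_pass)                     # 716
print(share(false_pass, wcag2_pass))  # 55.7 (share of the WCAG2 passes)
print(share(false_pass, total))       # 35.8 (share of all generated pairs)
```

The same 716 pairs are 55.7% of what WCAG2 passed but 35.8% of everything generated, which is why the choice of denominator needs to be stated alongside any percentage.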

Nevertheless, it's also important to point out that my first analysis assumes that APCA gets it right and WCAG2.x gets it wrong. I believe Andy has evidence that this is the case within his controlled research experiments, and it looks to be the case on the screen in my office, but population-based evidence that APCA performs better in real world devices/environments is still pending (and Andy is planning a study for this).

So, in my later article, I reframed things to instead pose a question regarding which algorithm is better, and shows some examples that might help people come to their own conclusions, and to act as a call to action for further research. This latest article is here http://www.cedc.tools/rback_numbers_vs1.html

Regarding the algorithms and colour vision impairments, to the best of my knowledge, both algorithms are models of full colour vision, and neither attempts to model colour impairments. I've heard it said lots of times that WCAG 2.x helps to ensure sufficient contrast for people with colour impairments, but to my knowledge there is no theoretical justification for this statement, because the perceived lightness of different colours gets differently affected when one of the cones is impaired.

That said, WCAG 2.x appears to have issues with false passing when one of the colours is black, and black vs red is a particular issue for protans. APCA tends to give a much lower score when one of the colours is black, so one could say it's less likely to pass colour combinations that would be problematic for protans.

To do the job properly though, to the best of my knowledge, it ought to be possible to adapt either or both APCA and WCAG 2.x in order to model colour impairments. As a first starting point, I would try using a colour impairment simulation algorithm to transform the RGB values, then run the contrast algorithm on the transformed RGB values. I'd be glad to discuss this further, but I think that belongs on a different thread.
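As a rough sketch of that "transform, then measure" idea: the code below applies a 3x3 matrix to linear-light sRGB values and then runs the WCAG 2 contrast ratio on the result. It assumes the simulation can be expressed as such a matrix, which is only one family of simulation methods, and it deliberately supplies no impairment matrix: `IDENTITY` is a placeholder, and real coefficients would have to come from a published simulation model. All function names here are illustrative.

```python
def srgb_to_linear(c8):
    """Decode one 8-bit sRGB channel to linear light (0.0-1.0)."""
    c = c8 / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(v):
    """Encode linear light back to an 8-bit sRGB channel, clipping to range."""
    v = min(max(v, 0.0), 1.0)
    c = 12.92 * v if v <= 0.0031308 else 1.055 * v ** (1 / 2.4) - 0.055
    return round(c * 255)

def simulate(rgb, matrix):
    """Apply a 3x3 matrix in linear RGB; the matrix is supplied by the caller."""
    lin = [srgb_to_linear(c) for c in rgb]
    out = [sum(m * c for m, c in zip(row, lin)) for row in matrix]
    return tuple(linear_to_srgb(v) for v in out)

def wcag2_ratio(colour_a, colour_b):
    """WCAG 2.x contrast ratio. The 0.04045 threshold used in srgb_to_linear
    gives the same result as the spec's 0.03928 for all 8-bit values."""
    def lum(rgb):
        r, g, b = (srgb_to_linear(c) for c in rgb)
        return 0.2126 * r + 0.7152 * g + 0.0722 * b
    la, lb = lum(colour_a), lum(colour_b)
    return (max(la, lb) + 0.05) / (min(la, lb) + 0.05)

def contrast_after_simulation(text, bg, matrix, contrast=wcag2_ratio):
    """Transform both colours first, then run the contrast algorithm."""
    return contrast(simulate(text, matrix), simulate(bg, matrix))

# Placeholder only: leaves colours unchanged. Not an impairment model.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Passing a different `contrast` callable would allow the same pipeline to be tried with APCA instead of the WCAG 2 ratio.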

I will look forward to further discussions on this topic.

alastc · 2022-01-13T18:02:56Z

alastc
Jan 13, 2022

no human would ever even attempt to try and make a website with those colours.

Um, we have tested sites which were using #ddd on #fff, personally I'd include that in the 'no human should attempt this' category! Another source of problems is where people have tried to colour-code their navigation/sections, and they keep adding colours as the site grows. That leads to some poor (and you might consider 'random') colour combinations because they stick with either white-text or black-text across all of them. It's less common these days, but I remember testing sites like this.

Please don't underestimate how much good has been done by having a contrast measure. There are a lot of colour combinations which would fail both that have been prevented by the current guideline. It definitely needs improvement, but unless you're on the front-line working with designers and developers it's hard to see the effect.

It's a tricky area, as I think Bruce mentioned that a lot of the combinations that should fail (but don't in WCAG2) are not particularly attractive and don't tend to get used by designers. I think if we want to focus on a sub-set of colour combinations it would be better to take a sample from what's in use. E.g. There are a couple of projects which have done mass-scale analysis of website code, maybe one of them could provide a list of popular colour combinations?

If we can't create a sub-set based on real usage, then I think it would be worth including a 'ridiculous' set that is random across the colour space, perhaps as a control group compared to a set that focuses on the colours which are different between models.

Another thought I had when looking at the dark & light example pages was that the background colour has quite a big impact. If I switched the background to black, the dark background combinations seemed much better (to me). I'm not sure a guideline can account for that, but I think it would be worth having that as a variable in any experiments.

Nevertheless, it's also important to point out that my first analysis assumes that APCA gets it right and WCAG2.x gets it wrong.

Right, I thought that must be the case but wasn't sure.

and shows some examples that might help people come to their own conclusions

I'm a little wary of that because most people looking will not have a visual impairment, and from observing usability testing with low-vis people, my expectations didn't always match the outcome. For example, pink/white (even glaring, saturated pink) caused issues for a participant a while back. Orange & white, even blue on white that seemed clear to me weren't for participants. That's anecdotal, but a few examples like that caused me to stop assuming visibility to me meant visibility to others.

I've heard it said lots of times that WCAG 2.x helps to ensure sufficient contrast for people with colour impairments

If that has come from me, I think it is based on a logical outcome of having luminance contrast, compared to not. One of the 'old school' tests is to grey-scale the interface and see whether you can still use it. If it works mono-chromatically, it is more likely (but not certain) to work for everyone.

Seeing the examples does make me wonder if we are at a stage where we can start getting help with testing the models?

Probably need to start a new thread, but how about:

Several sets of colour combinations are gathered; each includes a full set of hues but is gathered from:

a sample of common combinations in use;
a random sample from across the colour space (50% against white, 50% against black)
a sample that focuses on the difference between the WCAG2 and APCA models.

A web form is created that includes on page 1:

Name
Email (only for a follow up)
Category of visual impairment (pre-determined set + comments box)
Setup details (e.g. mobile phone, desktop, whether you have a customised colour theme such as high-contrast mode or night-mode)

Page 2:

A dynamic page which shows a sample of standard text in a colour combination, and a rating (1-10?) for ease of reading.
Each time you make a rating, it shows the next combination.

Page 3: Thanks for taking part.

I'm starting off with a strawman method to see if this (simple) approach is reasonable?

If something like this is reasonable, we could write it up and get help with running the testing.
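To make the strawman a little more concrete, one possible shape for the collected data is sketched below. The field names, the sample-set labels and the choice of a simple mean are all assumptions for discussion, not a proposed analysis method.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

@dataclass(frozen=True)
class Rating:
    participant: str   # anonymised id, not the name or email from page 1
    impairment: str    # category of visual impairment chosen on page 1
    setup: str         # e.g. "mobile phone", "desktop, night-mode"
    sample_set: str    # "in_use", "random" or "model_difference"
    text: str          # text colour, e.g. "#ffffff"
    background: str    # background colour, e.g. "#dd6600"
    score: int         # ease-of-reading rating from page 2 (1-10)

def mean_scores(ratings):
    """Average rating per (sample set, text colour, background colour)."""
    grouped = defaultdict(list)
    for r in ratings:
        grouped[(r.sample_set, r.text, r.background)].append(r.score)
    return {combo: mean(scores) for combo, scores in grouped.items()}
```

Keeping the impairment category and setup on every rating means the same records could later be split by group rather than only averaged across everyone.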

1 reply

Myndex Jan 24, 2022
Author

Hi @alastc

Another thought I had when looking at the dark & light example pages was that the background colour has quite a big impact. If I switched the background to black, the dark background combinations seemed much better (to me). I'm not sure a guideline can account for that, but I think it would be worth having that as a variable in any experiments.

Yes, the overall page lightness is a significant impact. This IS part of the SAPC model, and it's part of the technology I've developed, and in the patent filings, etc.

I have not shown the 3, 4, and 5 color versions (though I have mentioned them), mainly due to concerns that people think the model I've shown is already too complicated.

The actual effect on readability contrast, though, is not what you'd expect, and is the subject of the paper I am working on right now. The short answer is that it's "not all in one direction" and actually a bit more parabolic, and so a useful average or approximation can be had.

But for the time being, I'm leaving the public-facing W3-licensed version as a test of only a color pair, and within this, two conformance options:

Basic levels (90 75 60 45) similar to how WCAG 2 conformance works.
Font Lookup Table: more complicated, but more flexible.

I'm a little wary of that because most people looking will not have a visual impairment, and from observing usability testing with low-vis people, my expectations didn't always match the outcome. For example, pink/white (even glaring, saturated pink) caused issues for a participant a while back. Orange & white, even blue on white that seemed clear to me weren't for participants. That's anecdotal, but a few examples like that caused me to stop assuming visibility to me meant visibility to others.

FWIW I have impaired vision, to the point I am starting to use macOS VoiceOver as a screen reader. And there are a lot of issues depending on the kind of impairment; glare is one of the biggest for me, so I am always in dark mode. WCAG 2 contrast math cannot calculate correctly for dark mode.

WCAG 2 contrast does nothing at all "special" in any way for any impairment. The ISO standards are more useful for instance.

When working on what became WCAG 2, Dr. Arditi had proposed the 7:1 as the standard, though his recommendation was not followed. 7:1 was pushed to AAA, and an AA level was formed using the same math, but the math that may "sort of work" for 7:1 breaks down rapidly for lower values for reasons I will happily bore you with if you have half an hour to kill (I discuss at length in thread 695 as I recall).

I've heard it said lots of times that WCAG 2.x helps to ensure sufficient contrast for people with colour impairments
If that has come from me, I think it is based on a logical outcome of having luminance contrast, compared to not. One of the 'old school' tests is to grey-scale the interface and see whether you can still use it. If it works monochromatically, it is more likely (but not certain) to work for everyone.

The "it's for color vision" claim is stated in the Understanding document, but it has taken on a "life of its own" in terms of misunderstanding the actual issues and their importance.

YES, there is no question that LUMINANCE is the KEY for readability; this is academic. The luminance channel of human visual processing is where literally all the details are, in particular for small thin items like fonts. It is all in luminance. Hue and chroma are processed at a far lower spatial frequency, and have a third the strength of luminance contrast.

Here's the problem:

The WCAG 2 contrast math does not maintain adequate luminance contrast unless one of the colors is lighter than #aaa (or equivalent). It's a simple ratio, and does not follow human perception, and that is the crux of that failure.

NEXT: When the discussion comes to things like "pink on white" or other chroma/hue clashes (and there are many) the WCAG 2 contrast math does nothing to mitigate any of them. Because the WCAG 2 contrast does nothing regarding hue.

In fact, it uses the piecewise linearization to relative luminance, which is not an accurate model of an actual monitor. It is only a bit of math used for sRGB for mathematical convenience, to prevent an infinite slope at 0 and also to mimic a 2.2 gamma, which was a common image encoding for Windows machines of that era (Mac was 1.8, and SGI was 1.4 at that time). Using that tends to over-rate red and blue, and is one of the reasons I show how protanopia is not helped by WCAG 2 math but actually degraded.

Regardless, the gamma encoding of an image does not necessarily predict the actual gamma of the display, and does nothing to predict the HVS (human vision system) perception gamma. There is a reason that APCA is not using the piecewise sRGB linearization, and it is wrapped up in emulating actual device display characteristics. In addition, the way APCA linearizes red is set so that red on black is below Lc45, and therefore prohibited per the levels conformance method (and also prohibited on Bridge PCA).
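For readers who have not seen it written out, here is a minimal sketch of the WCAG 2 contrast math being criticized: the piecewise sRGB linearization, the weighted sum to relative luminance, and the simple ratio. The constants follow the published WCAG 2.x definition as I understand it; the function names are mine, and the parser assumes six-digit hex colors only. This is not APCA, which uses a different linearization.

```python
def srgb_channel_to_linear(value_8bit):
    """Piecewise sRGB linearization as defined in WCAG 2.x."""
    c = value_8bit / 255.0
    if c <= 0.03928:  # linear toe, avoids an infinite slope at 0
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour):
    """Relative luminance of a six-digit hex colour such as '#777777'."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return (0.2126 * srgb_channel_to_linear(r)
            + 0.7152 * srgb_channel_to_linear(g)
            + 0.0722 * srgb_channel_to_linear(b))


def wcag2_contrast(colour_a, colour_b):
    """WCAG 2 ratio: (lighter + 0.05) / (darker + 0.05)."""
    la = relative_luminance(colour_a)
    lb = relative_luminance(colour_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    print(round(wcag2_contrast("#ffffff", "#000000"), 2))
    print(round(wcag2_contrast("#777777", "#ffffff"), 2))
```

Note that the ratio is symmetric: swapping text and background gives the same number, so the formula has no notion of polarity and treats light and dark mode identically. It also contains no term for hue or chroma beyond the luminance weights.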

Color

That said, all the way back in 2019 I had developed color modules to specifically address certain specific hue related issues, like protanopia. (Showed them to Chuck and Cybele in the 2019 visual contrast subgroup). But the focus right now has been purely on readability, and that is luminance. And when luminance is done correctly as a perceptual lightness difference then most of the hue related issues fall away as they are ultimately weaker than the strong luminance contrast that is required for good readability.

Seeing the examples does make me wonder if we are at a stage where we can start getting help with testing the models?
Probably need to start a new thread, but how about:
Several sets of colour combinations are gathered; each includes a full set of hues, but is gathered from:

This is the remote study I was developing circa 2019/2020 when COVID (and my personal health crisis) interrupted: https://www.myndex.com/perceptex/MMU/index/JoinStudy

Since then I have developed new and improved experiments, and am working to get the software up and running. Right now the software requires me (or someone) to be with the study participant to run the study, so I need to get it properly automated.

Though if you feel it will get the ball rolling again in a positive way, starting with a basic survey as you outlined might be faster, and could probably be built using existing survey software, whereas I'm building something from scratch.

Myndex · 2022-01-28T04:41:42Z

Myndex
Jan 28, 2022
Author

Hi @alastc @sdw32 @bruce-usab

I've been working on the font lookup tables, and part of the process in terms of font analysis goes along with some of what's being discussed in this thread, and here are some examples for further discussion:

On this link are examples of several fonts, lined up to make clear the variations in weight. There are also a number of presentations where the lightness contrast is adjusted with the font size and weight to maintain perceptual uniformity.
https://www.myndex.com/SAPC/CE301_fontweight

On this link are evaluations of the most recent font lookup table (0.1.5. G).
https://www.myndex.com/SAPC/CE302_fontlut

These are the lookup tables the evaluations reference:

0 replies

sdw32 · 2022-02-03T14:10:28Z

sdw32
Feb 3, 2022

Thanks to Andy @Myndex for providing this extra detail.

Personally, I'm currently skeptical of the accuracy of providing average results for 'sans serif' typefaces, based on body height and font-weight. As has already been discussed, each typeface has its own interpretation of what font-weight means, and the ratio of x-height to body height is very different for different typefaces.

For each of the typefaces shown on the demo page, I would suggest calculating the ratio of x-height to body height, and calculating the ratio of stroke width to x-height. For the font look up table, I would have thought the values in the table ought to reference x-height (instead of body height), and 'the ratio of stroke width / x-height' (instead of font-weight).

Then, another look up table can show, for a variety of typefaces, how x-height relates to body height, and how stroke width / x-height relates to font-weight. This could all be programmed such that a user selects the typeface from a drop-down list, and then gets the values of body height and font-weight, calculated from the generic tables.

It would be great to include Verdana on the demonstration page as it's such an outlier, and perfectly demonstrates why body height and font weight are not useful properties for the purposes of guidelines.

I will look forward to further discussions on this topic, many thanks.

1 reply

Myndex Feb 4, 2022
Author

Hi Sam @sdw32

Personally, I'm currently skeptical of the accuracy of providing average results for 'sans serif' typefaces, based on body height and font-weight. As has already been discussed, each typeface has its own interpretation of what font-weight means, and the ratio of x-height to body height is very different for different typefaces.

I'm not sure what context you are referring to? I'm not "providing average results" this is some misunderstanding...

For each of the typefaces shown on the demo page, I would suggest calculating the ratio of x-height to body height, and calculating the ratio of stroke width to x-height.

That was the entire purpose of those pages, to make it easy to measure those metrics in an as-rendered way, which of course I did, LOL.

For the font look up table, I would have thought the values in the table ought to reference x-height (instead of body height), and 'the ratio of stroke width / x-height' (instead of font-weight).

The font look up tables are all x-height * 2 but in CSS it is not possible to set a font by x-height, therefore a table for general consumption cannot use x-height. I have posted lengthy requests in the CSSWG on this topic, specifically requesting the creation of CSS property font-x-size:

ALSO: there shall be no math required of designers. That's a non-starter. What you and I consider to be trivially simple math is considered by most as intractably complex.

Then, another look up table can show, for a variety of typefaces, how x-height relates to body height, and how stroke width / x-height relates to font-weight. This could all be programmed such that a user selects the typeface from a drop-down list, and then gets the values of body height and font-weight, calculated from the generic tables.

The tables state that the sizes are two times the x-height, and that fonts with an x-height ratio other than 0.5 should be adjusted accordingly.

I am already widely criticized (at least by the troll army) that APCA is too complex. We are looking for ways to simplify, not add complexity. The guidelines already state the use of a comparative reference font, and I do describe the method for comparison in detail here:

Myndex/SAPC-APCA#28 (comment)

It would be great to include Verdana on the demonstration page as it's such an outlier, and perfectly demonstrates why body height and font weight are not useful properties for the purposes of guidelines.

Verdana is one of the best fonts for accessibility, and I'm well aware of its characteristics. But those demo pages are part of our internal research, and I only linked them as a discussion reference of interest.

Importantly, Verdana is not available in all 9 weights, and is commonly only available in 400 and 700, so its use in research is limited. Verdana Pro still has only 5 weights, and is $400 US, and therefore not commonly available.

bruce-usab · 2022-02-03T20:29:43Z

bruce-usab
Feb 3, 2022

I would ask for inclusion of more pedestrian web fonts.

Am I correct to understand that LVTF has consensus that x-height is a key metric? If so, I agree that we should take pains to include it with any of these sorts of tables. (Even if it is not currently a CSS property.)

2 replies

Myndex Feb 4, 2022
Author

Hi Bruce @bruce-usab

I would ask for inclusion of more pedestrian web fonts.

Okay, but I'm not following, inclusion ... where? If you mean experiments CE301 and CE302, those are both "9 weight experiments"... the core web fonts come in only two weights: normal and bold. I do already have evaluations of core web fonts, which I'd sum up as:

Verdana: one of the best and most accessible fonts for screen use.
Georgia: a serif font that complements Verdana

Both Georgia and Verdana were designed by the legendary Matthew Carter. Both are available in normal and bold. Both have an expensive "pro" version that adds a couple of additional weights, but neither is available in all 9 weights.

Core web font samples:

The very bad

Andy's do-not-use list:

Courier New: I list this horrible font as "do not use, ever". It's garbage, and far too light in weight; the normal is about a 100 or 200 weight font.
Times New Roman: this font was originally designed about 100 years ago for the London Times newspaper, hence the name. It was designed specifically as small so that the paper could jam as much text on a page as possible, saving paper. It's smaller than any of the other web fonts, and a terrible font for screen use. If you want a serif font Georgia is a better choice.
Impact: weak readability, ultra-heavy font that is essentially weight 1100 for "normal."

Not Horrible

Andy's "I'll just cringe a little" list.

Arial: this neo-grotesque font is just a clone of Helvetica, and while Helvetica and Arial are listed as reference fonts for size and weight comparisons, that's only due to their common nature. As I wrote in "Evaluating Fonts for Accessibility" there's a lot to not like about these, and they have some specific accessibility problems.
Trebuchet: Verdana's better. Trebuchet has tighter spacing, which is bad for readability.
Comic Sans: I really truly loathe this font. And recently in the readability group's study it scored near the bottom.

Okay

Andale Mono: There are better monospaced fonts, but this one works, though a little smaller and lighter than I'd prefer.
Arial Black: While I don't care for Arial in general, this is a surprisingly good very black display font, that maintains readability despite the extra bold glyphs.

Am I correct to understand that LVTF has consensus that x-height is a key metric? If so, I agree that we should take pains to include it with any of these sorts of tables. (Even if it is not currently a CSS property.)

I'm not certain about the LVTF. But that is consensus in the design community.

The font lookup tables are all based on x-height, and the font sizes listed are all two times x-height.

Hence my attempts to get font-x-size as a CSS property. See: w3c/csswg-drafts#6709

Thank you!

bruce-usab Feb 8, 2022

Okay, but I'm not following, inclusion ... where? If you mean experiments CE301 and CE302

The font face names used for those experiments reminded me that I had meant to suggest more mundane choices for the font lookup tables. I think I have a better-than-average (for the typical web user, so the bar is low) font-or-cheese score, but few of the ones named are meaningful for me. Courier New might be illuminating to use on those tables, because the font is common but fundamentally flawed (as you note).

sdw32 · 2022-02-08T11:57:25Z

sdw32
Feb 8, 2022

Hi @Myndex, thanks for all the follow-up detail. I would just like to add one further comment regarding your text ...

The font look up tables are all x-height * 2 but in CSS it is not possible to set a font by x-height, therefore a table for general consumption cannot use x-height. I have posted lengthy requests in the CSSWG on this topic, specifically requesting the creation of CSS property font-x-size:

ALSO: there shall be no math required of designers. That's a non-starter. What you and I consider to be trivially simple math is considered by most as intractably complex.

I had never imagined designers would use the font lookup table directly. I think the lookup table allows software programmers to make applications that can present this data in a useful way. So, I maintain that the lookup table should be based on x-height, and this table will likely sit as a database back end for a website/software. The front end of the website/software might, for example, have a drop-down list to choose the font-name, and then show the CSS font-size property (i.e. body height) for that particular font-name, which the software/website calculates from its source database of x-heights and the ratio of x-height to body height, which will be known for the list of font-names.

1 reply

Myndex Feb 8, 2022
Author

Hi Sam, I did consider this initially but it's counter to 400 years of typography, so I can't get behind it, especially as there is no x-height property in CSS. It's just a needless cause of confusion in an area that is already not well understood.

The PANOSE font database is not well utilized, but regardless, that does not alter the concern that every single piece of software, every CSS property, and 400 years of typography are all based on body height. I agree with you on philosophical grounds, and I have even started that jihad as mentioned above, but until other things like a CSS property are in place, this (the lookup table) is not the place to draw the line in the sand on this matter.

sdw32 · 2022-02-09T10:02:55Z

sdw32
Feb 9, 2022

Thanks @Myndex for your feedback. Inspired by your comments, I thought of an alternative middle ground, something a bit like the following:

Produce the look up table for fonts where the x-height is 50% of the body height, state this explicitly in the intro to the table, and state a list of common fonts where this ratio is true (Arial etc). After the table, give an example for how the table can be adjusted to account for fonts that have a different ratio, so that the x-height would be the same as one of the reference fonts. So, for example with Verdana, the x-height is 55% of the body height, so all the font-size values in the table would need to be multiplied by 50/55 = 0.91. So Arial 22px is equivalent to Verdana 20px.
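The x-height adjustment described above is a single multiplication, sketched here using the figures from this comment (a 0.50 reference ratio and 0.55 for Verdana). The function name is hypothetical, and the ratios are the measured values quoted above rather than authoritative font metrics.

```python
def equivalent_font_size(reference_size_px, reference_x_ratio, target_x_ratio):
    """Font size for the target font that gives the same x-height.

    reference_x_ratio and target_x_ratio are x-height / body height
    for the reference and target fonts respectively.
    """
    return reference_size_px * reference_x_ratio / target_x_ratio


if __name__ == "__main__":
    # A table value of 22px for a 0.50-ratio font, adjusted for Verdana (0.55).
    print(round(equivalent_font_size(22, 0.50, 0.55), 1))
```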

Furthermore, I wonder if font-weight can be handled in the same way. Arial 400 regular has stroke width = 8.8% of body height. Verdana 400 has stroke width = 9.2% of body height, so is roughly equivalent to Arial 400 (or more exactly, 400*9.2/8.8=418, although this doesn't exist obviously, so 400 is still the closest match).

Similarly, Arial 700 has stroke width = 13.7% of body height. Verdana 700 has stroke width = 17.6% of body height, so is roughly equivalent to Arial 900 (or more exactly, 700*17.6/13.7=899, so 900 is the closest match).

I'm going to take a guess that the stroke width of Arial 900 to Arial 700 isn't exactly in the proportion of 900/700, but I'd also assume the approximation is probably close enough for practical purposes?
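The weight comparison above can be sketched the same way: scale the weight by the ratio of stroke widths, then snap to the nearest CSS weight that exists. This assumes, as the comment itself does, that stroke width is roughly proportional to the weight number, which is an approximation and not a standard. The function name and the stroke percentages are taken from or invented for this example only.

```python
CSS_WEIGHTS = range(100, 1000, 100)  # 100, 200, ... 900


def equivalent_weight(weight, reference_stroke_pct, target_stroke_pct):
    """Return (scaled, nearest) reference-font weight for a target font.

    Stroke widths are given as a percentage of body height. The scaled
    value assumes stroke width is proportional to the weight number.
    """
    scaled = weight * target_stroke_pct / reference_stroke_pct
    nearest = min(CSS_WEIGHTS, key=lambda w: abs(w - scaled))
    return scaled, nearest


if __name__ == "__main__":
    # Verdana 400 (9.2%) against Arial 400 (8.8%)
    print(equivalent_weight(400, 8.8, 9.2))
    # Verdana 700 (17.6%) against Arial 700 (13.7%)
    print(equivalent_weight(700, 13.7, 17.6))
```

Scaled values above 900 snap to 900, since no heavier CSS weight is considered here.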

Any further feedback on this topic would be gratefully received! Many thanks.

4 replies

bruce-usab Feb 9, 2022

Some promotion of common font faces with good x-height seems like a productive interim step. I suspect it will be a while before CSS has x-height.

Myndex Feb 9, 2022
Author

Hi Sam @sdw32

Produce the look up table for fonts where the x-height is 50% of the body height, state this explicitly in the intro to the table, and state a list of common fonts where this ratio is true (Arial etc). After the table, give an example for how the table can be adjusted to account for fonts that have a different ratio, so that the x-height would be the same as one of the reference fonts........

Which is... exactly what we've done (and have been doing). The current reference font is Barlow, with a 0.5 x-height ratio. Helvetica/Arial, BTW, are about 0.5175. Barlow is the current preferred reference font, at least in research here, as it is a well-designed font with a number of fairly common/normalized metrics, such as the way the weights progress, the "approximately median" x-height, glyph aspect, etc.

Furthermore, I wonder if font-weight can be handled in the same way. Arial 400 regular has stroke width = 8.8% of body height. Verdana 400 has stroke width = 9.2% of body height, so is roughly equivalent to Arial 400 (or more exactly, 400*9.2/8.8=418, although this doesn't exist obviously, so 400 is still the closest match).

Ah, it looks like you've gotten into PANOSE. Good; however, from that metric alone you can't really determine the perceptual contrast of the font, and the terminology "font contrast" as it relates to glyph design is not the same as the contrast that we think of in terms of the difference between black and white. It's related much more to the internal dynamics of the glyph structure.

I'm going to take a guess that the stroke width of Arial 900 to Arial 700 isn't exactly in the proportion of 900/700, but I'd also assume the approximation is probably close enough for practical purposes?

Correct, there is no "proportion of 900/700". There is no set standard for any font; even the nomenclature of the weights itself (100, 200, 300…) is not standard. You'll notice it's not even in PANOSE. It's just CSS (with some long backstory tracing roughly to ITC, if memory serves).

Neither the absolute nor relative values of ANY font metric have ANY standardization, other than a higher number means more. LOL.

I am working on, and only really interested in, objective automatable methods with reasonable accuracy toward human perception. I've been hinting at what some of that may look like.

sdw32 Feb 9, 2022

Thanks for the additional detail, I didn't use PANOSE, I just used Illustrator to type the lowercase letters x and l at 100 PX body size, then I used the 'create outlines' function, and then measured the height of the x, and the width of the stroke that made the letter l (with all units set to PX).

Regarding the meaning of 700/900, while I accept that stroke widths within a given font won't be exactly proportional in this regard, I still maintain that it's good enough to give a sense of how to treat a custom font (given that you are only looking for the closest match to an actual font-weight that exists). I'm sure this could also be done more accurately with lookup tables for the particular fonts, but I would imagine it will probably give the same results for the closest match between font-weights that exist.

Myndex Feb 9, 2022
Author

I want to encourage you to spend some time at https://fonts.google.com, as those are all freely downloadable.

But also https://www.fonts.com is a site with a lot of interesting features including font-match. I think you'll find working with those tools illustrative.

Importantly, when you get into less geometric fonts, expect to see that the vertical stroke width of an uppercase T is not the sole determiner of the visual contrast of a given font.


Replies: 28 comments · 17 replies

