Make wpt own the HTML parser test data and remove dependency on html5lib-python, html5lib-tests #27868

Open
zcorpan opened this issue Mar 2, 2021 · 13 comments

Comments

@zcorpan
Member

zcorpan commented Mar 2, 2021

This week I've done the exercise of updating HTML parser tests again, though this time I was a bit more successful in figuring out how to get those changes through to wpt (see #2887). But boy is it painful and also mostly undocumented!

Juggling 3 repos for one change like this doesn't seem ideal for contributors. From wpt's perspective, what I would like instead is:

  • Make the test data change in wpt and run a script to generate tests. No dependency on html5lib.

Then html5lib-python can get the tree-builder test data from wpt instead of from html5lib-tests.

Thoughts? @gsnedders @jgraham @annevk @stephenmcgruer

@gsnedders
Member

this is effectively a dupe of html5lib/html5lib-tests#127 fwiw

@zcorpan
Member Author

zcorpan commented Mar 3, 2021

@gsnedders oh, right, I had forgotten about that! It seems like there isn't objection. Are you still planning to work on this?

@gsnedders
Member

> @gsnedders oh, right, I had forgotten about that! It seems like there isn't objection. Are you still planning to work on this?

It is a long way down my list.

@zcorpan
Member Author

zcorpan commented Feb 7, 2022

A tweak we can make is to depend on html5lib-tests instead of html5lib-python from wpt, which would remove the second step. (I think this was @jgraham's idea, but I don't see it mentioned on GitHub.)

@gsnedders
Member

One obvious (easy) tweak, given it's using git submodules, is to explicitly store a commit hash somewhere in WPT and then, during update, run: cd html5lib-python/html5lib/tests/testdata && git fetch origin && git checkout $REV
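
A minimal Python sketch of that pinning idea, for illustration only. The submodule path is the one quoted in the comment; the pin file name `html5lib-tests.rev` and the function names are hypothetical and do not exist in wpt.

```python
import subprocess
from pathlib import Path

# Submodule path quoted in the comment; the pin file name is an assumption.
TESTDATA_DIR = Path("html5lib-python/html5lib/tests/testdata")
PIN_FILE = Path("html5lib-tests.rev")


def read_pinned_rev(pin_file):
    """Return the commit hash stored in pin_file, skipping blank lines and # comments."""
    for line in Path(pin_file).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            return line
    raise ValueError(f"no revision found in {pin_file}")


def checkout_commands(rev):
    """Build the two git commands from the comment: fetch, then check out the pin."""
    return [
        ["git", "fetch", "origin"],
        ["git", "checkout", rev],
    ]


def update_testdata(testdata_dir=TESTDATA_DIR, pin_file=PIN_FILE):
    """Run the commands inside the submodule checkout; raises if git fails."""
    rev = read_pinned_rev(pin_file)
    for cmd in checkout_commands(rev):
        subprocess.run(cmd, cwd=testdata_dir, check=True)
    return rev
```

Keeping the hash in a tracked file would make a test data update show up as a one-line diff in wpt, which is presumably the point of the suggestion.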

@hsivonen
Member

hsivonen commented May 6, 2022

My main concern is that I want to preserve the file format as the preferred form for making modifications to the tests, since there are non-WPT consumers of those formats.

I'm not a fan of WPT having a build step that transforms the tree builder test format. FWIW, Gecko's mochitest harness stores the original .dat format in the repo and parses it when the tests are run.

@zcorpan
Member Author

zcorpan commented May 6, 2022

Having the source files in the same format in wpt and parsing them with JS when running sounds ideal actually. Can that parser be migrated to wpt?
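
For readers who have not seen the tree-builder .dat format, here is a rough sketch of what parsing it at run time involves. It is written in Python for illustration (the comment above proposes JS), it is not Gecko's or html5lib's parser, and `parse_dat` and `SAMPLE` are hypothetical names. It assumes each test opens with a `#data` line and that any other line starting with `#` names a section, so input data whose own lines start with `#`, or that ends in a newline, is not handled.

```python
def parse_dat(text):
    """Parse tree-builder .dat text into a list of {section name: content} dicts."""
    tests = []
    current = None  # dict for the test being built
    section = None  # name of the section currently being filled
    for line in text.split("\n"):
        if line.startswith("#"):
            name = line[1:]
            if name == "data":
                # A "#data" line opens a new test.
                current = {}
                tests.append(current)
            if current is None:
                raise ValueError("section heading before the first #data")
            section = name
            current[section] = []
        elif current is not None:
            current[section].append(line)
    # Join each section and drop the blank separator line that trails a test.
    return [
        {name: "\n".join(lines).rstrip("\n") for name, lines in test.items()}
        for test in tests
    ]


# Illustrative input in the style of the existing test files.
SAMPLE = """#data
<p>One
#errors
(1,3): expected-doctype-but-got-start-tag
#document
| <html>
|   <head>
|   <body>
|     <p>
|       "One"

#data
<!doctype html>x
#errors
#document
| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     "x"
"""

for test in parse_dat(SAMPLE):
    print(repr(test["data"]), "->", len(test["document"].splitlines()), "tree lines")
```

A harness built this way would compare each test's `document` section against a serialization of the tree the browser actually produced, so the .dat files can stay untouched as the source format.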

@annevk
Member

annevk commented Mar 28, 2023

Having worked on a parser bug in WebKit I now think this would be even more valuable than I previously thought. It looks like Chromium and WebKit both have two sets of parser tests in the tree:

  • Some html5lib-tests fork of unspecified vintage
  • web-platform-tests's import of html5lib-tests

And the former has tests the latter might not contain. I contributed further to this problem in WebKit/WebKit#12019, but am willing to be part of the cleanup crew if we make web-platform-tests the true home of HTML parser tests.

I suspect @mfreed7 might be interested in this from the Chromium side. Copying here to gather interest.

@mfreed7
Contributor

mfreed7 commented Mar 31, 2023

I'm definitely supportive of the effort to clean this up, and make WPT the source of truth for parser tests.

@annevk
Member

annevk commented Apr 1, 2023

Steps taken thus far:

I wonder if @zcorpan is still interested in taking this even further as I think it would definitely be preferable if we didn't have to go via html5lib-tests.

https://github.com/html5lib/html5lib-tests does have a number of actionable issues and stale PRs worth triaging. Help appreciated.

@zcorpan
Member Author

zcorpan commented Apr 3, 2023

Yes. See html5lib/html5lib-tests#127 (comment) and later comments.

@annevk
Member

annevk commented Jun 16, 2023

@zcorpan any progress on this?

@zcorpan
Member Author

zcorpan commented Jun 19, 2023

Not yet but it's on my list.
