FlyingSaucer can cleanup invalid html #279

Open
asolntsev opened this issue Mar 4, 2024 · 2 comments

@asolntsev
Contributor

FlyingSaucer could be more tolerant of minor errors in the HTML source.
Instead of throwing an exception for, say, unclosed HTML5 tags, it could close them automatically.

For example, we could use JSoup for the job.

Triggered in discussion #277
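
A minimal sketch of the failure mode described above, assuming the HTML ends up in a strict XML parser (Flying Saucer expects well-formed XHTML). The class `VoidTagCleanup`, its methods, and the regex are hypothetical and use only the JDK. It handles a handful of void elements and is not what a real cleanup module would do; a real one would delegate to a proper HTML parser such as JSoup.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
final class VoidTagCleanup {
    // A few HTML5 void elements that are often written without a closing slash.
    private static final Pattern VOID_TAG = Pattern.compile(
        "<(br|hr|img|input|meta|link)\\b([^<>]*?)\\s*/?>",
        Pattern.CASE_INSENSITIVE);

    // Naive cleanup: rewrites <br> as <br/>, <img src="x"> as <img src="x"/>, etc.
    static String closeVoidTags(String html) {
        return VOID_TAG.matcher(html).replaceAll("<$1$2/>");
    }

    // Returns true if a strict XML parser accepts the markup.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String dirty =
            "<html><body><p>Line one<br>Line two</p><img src=\"logo.png\"></body></html>";
        String cleaned = closeVoidTags(dirty);
        System.out.println("dirty well-formed:   " + isWellFormed(dirty));
        System.out.println("cleaned well-formed: " + isWellFormed(cleaned));
        System.out.println(cleaned);
    }
}
```

The unclosed `<br>` and `<img>` make the strict parse fail, while the rewritten markup parses. A regex like this breaks on anything non-trivial (comments, scripts, custom elements whose names start with `br-`), which is the argument for doing the cleanup once with a real parser.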

@DarkTyger

DarkTyger commented Dec 31, 2024

I agree with @pbrant: if an application wants to clean up HTML before passing it to FlyingSaucer, that's the application's job. An optional plugin (or module) would be good. I already have a specific version of JSoup included with my application for various reasons (including, but not limited to, cleanup before passing to Flying Saucer). Having potentially two versions of JSoup would unnecessarily bloat my application.

Can we close this issue?

@asolntsev
Contributor Author

@DarkTyger Yes, we could close the issue and leave this problem for FS users to solve by themselves.
But it's a common request, and many users want to solve this problem.
So it's better to solve it once in FS than make people re-invent the wheel every time.

> An optional plugin (or module) would be good.

This is exactly what I was thinking about: a separate plugin / module / something like that.

> Having potentially two versions of JSoup would unnecessarily bloat my application.

You will never have two versions of JSoup. Both Maven and Gradle always leave only one version on the classpath.
