sanitizeHtml throws TypeError on '&' symbol #606

Open
matejfalat opened this issue Jan 28, 2023 · 4 comments

matejfalat commented Jan 28, 2023

To Reproduce

Step-by-step instructions to reproduce the behavior:

sanitizeHtml('<p>&</p>')
// or
sanitizeHtml('<p>&nbsp</p>')

Expected behavior

sanitizeHtml() should not crash.

Describe the bug

When the HTML text contains an ampersand, sanitizeHtml() fails with:

Uncaught TypeError: Cannot read properties of undefined (reading '0')
    at Tokenizer.stateBeforeEntity (Tokenizer.js?6fbd:582:1)
    at Tokenizer.parse (Tokenizer.js?6fbd:818:1)
    at Tokenizer.write (Tokenizer.js?6fbd:158:1)
    at Parser.write (Parser.js?5804:459:1)
    at sanitizeHtml (index.js?5e22:578:1)
    at MaterialPreviewPage (MaterialPreviewPage.tsx?d2f2:41:55)

Details

React: 18.2.0
Webpack: 5.75.0

Version of Node.js:
v18.13.0

Server Operating System:
Windows 11, WSL2, and Docker

Screenshots

error1

error2

matejfalat added the bug label Jan 28, 2023
@BoDonkey
Contributor

Hi @matejfalat,
I tested this on a Mac using the repo's tests and could not reproduce the error. Does this occur only in the browser? I wonder whether this is an htmlparser2 issue rather than a sanitize-html issue, since Tokenizer is part of that package. I think we would need a minimal project setup to replicate the error.
Cheers

@boutell
Member

boutell commented Jan 30, 2023

Yes. We would ask that you contribute a PR with a failing unit test to this repo, so we can see how this is possible in the context of this project and rule out issues that exist only in a larger project with parts that aren't dependencies of this module. htmlparser2 is a dependency, so whether or not the browser is involved, a bug coming from it should be reproducible in a test.

@victorbojica

Probably because of the missing ";" at the end of the entity.

@NewEraCracker

NewEraCracker commented Jul 25, 2024

On Node.js CLI:

> const sanitizeHtml = require('sanitize-html');
undefined
> sanitizeHtml('<p>&</p>')
'<p>&amp;</p>'
> sanitizeHtml('<p>&nbsp</p>')
'<p> </p>'
>

I just tested this, and it works fine on both of these:

  • With the patched version 2.7.3 (and htmlparser2 6.1.0);
  • On another project using 2.13.0 (and htmlparser2 8.0.2).

Might be worth adding a test case. My two cents. Thank you.

PS: Found these related: fb55/htmlparser2#1426 & fb55/htmlparser2#1460
