Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in #914 #991

Closed
papandreou opened this issue Oct 22, 2021 · 2 comments · Fixed by #1010
Closed

Regression in #914 #991

papandreou opened this issue Oct 22, 2021 · 2 comments · Fixed by #1010

Comments

@papandreou
Copy link
Contributor

A project of mine has been using htmlparser2 since 3.7.3, but when I upgrade from 7.0.0 to 7.1.0 or above, every test fails with this error:

     Uncaught TypeError: this.buffer.charCodeAt is not a function
      at Tokenizer.parse (node_modules/htmlparser2/lib/Tokenizer.js:628:33)
      at Tokenizer.write (node_modules/htmlparser2/lib/Tokenizer.js:111:14)
      at Parser.write (node_modules/htmlparser2/lib/Parser.js:382:24)
      at Transform.<anonymous> (lib/expressExtractHeaders.js:165:28)
      at Transform.Readable.read (_stream_readable.js:507:10)
      at flow (_stream_readable.js:1007:34)
      at resume_ (_stream_readable.js:988:3)
      at processTicksAndRejections (internal/process/task_queues.js:84:21)

Here's a failing build: https://github.com/papandreou/express-extractheaders/runs/3980571638?check_suite_focus=true#step:5:1615

I bisected it down to this commit:

a2eae5164a58e8957a6b7d968d81b03ca44cb375 is the first bad commit
commit a2eae5164a58e8957a6b7d968d81b03ca44cb375
Author: Felix Böhm <188768+fb55@users.noreply.github.com>
Date:   Mon Aug 23 13:00:32 2021 +0100

    refactor(tokenizer): Remove unused branches, improve test coverage (#914)
    
    `Tokenizer`'s `option` argument can no longer be `null`, and we no longer check for unknown states.

 src/Tokenizer.ts                                   | 36 +++++++++-------------
 src/__fixtures__/Events/37-entity-in-title.json    |  2 +-
 .../Events/41-trailing-legacy-entity.json          | 10 ++++++
 .../Events/42-trailing-numeric-entity.json         | 10 ++++++
 4 files changed, 36 insertions(+), 22 deletions(-)
 create mode 100644 src/__fixtures__/Events/41-trailing-legacy-entity.json
 create mode 100644 src/__fixtures__/Events/42-trailing-numeric-entity.json
@fb55
Copy link
Owner

fb55 commented Oct 24, 2021

Thanks for the report! #914 removed the ability to pass buffers to the parser, which was accidentally supported before.

There are some edge-cases when passing buffers directly to htmlparser2 — see #128. htmlparser2/lib/WritableStream solves this by using a string decoder.

I will reinstate support for parsing buffers here, but in a deprecated state. This will be dropped in the next major release. Hope this is a good compromise!

papandreou added a commit to papandreou/express-extractheaders that referenced this issue Oct 24, 2021
@papandreou
Copy link
Contributor Author

Thanks a lot for looking into it! I've put out express-extractheaders 4.0.1 that avoids buffers so I'm prepared for the next major release :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants