Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rehype-minify-whitespace: inline HTML content is trimmed #19

Closed
erquhart opened this issue Jul 20, 2017 · 15 comments
Closed

rehype-minify-whitespace: inline HTML content is trimmed #19

erquhart opened this issue Jul 20, 2017 · 15 comments
Labels
🗄 area/interface This affects the public interface 💪 phase/solved Post is done 👶 semver/patch This is a backwards-compatible fix 🐛 type/bug This is a problem

Comments

@erquhart
Copy link

The contents of inline tags are being trimmed. When parsing html to markdown:

Input

a b<span> </span>c d</span><strong> e</strong>

Expected output

a b c d **e**

Actual output

a bc d**e**

Runkit repro: https://runkit.com/erquhart/595d36c4e60f6b0012c00405

@wooorm
Copy link
Member

wooorm commented Jul 20, 2017

If someone wants to work on it, it’s in rehype-minify-whitespace.

Moved from remarkjs/remark#275.

@erquhart
Copy link
Author

@wooorm I'm not using rehype-minify-whitespace in that repro - are you certain that's the problem?

@wooorm
Copy link
Member

wooorm commented Jul 21, 2017

Yup

@erquhart
Copy link
Author

erquhart commented Aug 3, 2017

@wooorm seems like the best approach may be to simply allow whitespace minification to be toggled off in the options for rehype-remark/hast-util-to-mdast. Would you accept a PR for that?

@wooorm
Copy link
Member

wooorm commented Aug 9, 2017

Just dropping this here ’cause it probably has to be something like this: https://github.com/wooorm/remark-lint/blob/master/packages/remark-lint-no-paragraph-content-indent/index.js

@wooorm wooorm changed the title Inline HTML content is trimmed rehype-minify-whitespace: inline HTML content is trimmed Aug 10, 2017
@wooorm wooorm added 🐛 type/bug This is a problem 👶 semver/patch This is a backwards-compatible fix 🗄 area/interface This affects the public interface 🙆 yes/confirmed This is confirmed and ready to be worked on labels Aug 13, 2019
@erezrokah
Copy link

Bumping this as we're still experiencing this issue: decaporg/decap-cms#3727

I added a test that is failing erezrokah@d7cc409#diff-c7e2175747a03e137c1615aecfaa1636R103 but not sure how to go about fixing it.

Can you suggest a place to get started?
Tried looking at the history of https://github.com/wooorm/remark-lint/blob/master/packages/remark-lint-no-paragraph-content-indent/index.js but couldn't figure out the suggested direction.

@kptdobe
Copy link
Contributor

kptdobe commented May 28, 2020

I am facing the same issue. It is coming from those lines:

https://github.com/rehypejs/rehype-minify/blob/master/packages/rehype-minify-whitespace/index.js#L62-L68

I think the problem is that in the case of a <span>, <em>, <strong>, the siblings you want to test are not the text node siblings (almost always non existing) but the siblings of the parent (the span, em, strong siblings).

@kptdobe
Copy link
Contributor

kptdobe commented Jun 2, 2020

Re-thinking about it, I would even argue you should never trim an inline element: there is no way you can know if the leading / trailing space has been introduced on purpose or not:

  • <span>Click here: </span><input text="my button" value="button">
  • <p><strong> Maybe I want to start with a space here...</strong></p>

@wooorm wooorm closed this as completed in 64c6206 Jun 4, 2020
@wooorm wooorm added ⛵️ status/released and removed 🙆 yes/confirmed This is confirmed and ready to be worked on labels Jun 4, 2020
@wooorm
Copy link
Member

wooorm commented Jun 4, 2020

Released!

@erezrokah
Copy link

@wooorm
Copy link
Member

wooorm commented Jun 4, 2020

It's a patch, so reinstalling dependencies should do the trick!

@erezrokah
Copy link

erezrokah commented Jun 4, 2020

Sorry, let me clarify my question.
We're using rehype-remark. Even if I update it to the latest version is still uses "hast-util-to-mdast": "^7.0.0" which uses "rehype-minify-whitespace": "^3.0.0", so it won't get the fix.

I'm using (with yarn) the following workaround:

"resolutions": {
  "rehype-minify-whitespace": "^4.0.2"
}

But that is not ideal

Update

My workaround doesn't work - investigating

@kptdobe
Copy link
Contributor

kptdobe commented Jun 4, 2020

Awesome! Thanks a lot @wooorm.

I tried to test the changes (I manually patched rehype-remark) and I get an error:

TypeError: Cannot read property 'type' of undefined
    at one .../hast-util-to-mdast/lib/one.js:13:12)

I traced to https://github.com/syntax-tree/hast-util-to-mdast/blob/master/index.js#L31 : minify returns undefined for my tree (was working before). Investigating too :)

@kptdobe
Copy link
Contributor

kptdobe commented Jun 4, 2020

@kptdobe
Copy link
Contributor

kptdobe commented Jun 4, 2020

Created a PR to address this new issue: #33
Feel free to use it or not ;)

I tested the new trimming behaviors, works pretty well, especially with "my" complex DOM. Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🗄 area/interface This affects the public interface 💪 phase/solved Post is done 👶 semver/patch This is a backwards-compatible fix 🐛 type/bug This is a problem
Development

Successfully merging a pull request may close this issue.

4 participants