Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Entities to html #5

Merged
merged 6 commits into from
Jan 5, 2015
Merged

Conversation

patrickknaap
Copy link

If there are character entities in your html, they will now be
converted too.

PatrickK added 3 commits December 3, 2014 13:08
When there are character entities in your html code, they will now be
converted too.
If there are character entities in your html, they will now be
converted too.
@yuchi
Copy link
Contributor

yuchi commented Dec 3, 2014

This is great! But looks like you modified the wrong file!
The file nl.fokkezb.html2as.js is a ‘compiled’ one. This one is the ‘source’.

Also you’re replacing the entities in the wrong phase, that is before parsing it. This means that

<b>hi</b>

is interpreted as

<b>hi</b>

@yuchi
Copy link
Contributor

yuchi commented Dec 3, 2014

And (just being pedantic, I know) but probably it should be more solid to use this entities module (on GitHub and npm) instead of a homebrewed entity -> codepoint map.

@FokkeZB
Copy link
Collaborator

FokkeZB commented Dec 3, 2014

I agree that would be better. Could you modify the right file to use entities @patrickknaap? Just add the entities module to the package.json.

PatrickK added 2 commits December 19, 2014 09:32
The module entities converts entities to html
Added the module to the package.son file
@patrickknaap
Copy link
Author

Sorry for the long wait. I was very busy at my job. But I found some time.
It was the first time working on a module for me (and also with Grunt), so please inform me about my work.

@yuchi
Copy link
Contributor

yuchi commented Dec 19, 2014

Way better! But I’m scared by the fact that we’re unescaping the entities after the creation of attributes. You have to know they are character position based

"Hallo <b>world</b>"

   -- becomes --

           ╭ strong [6:11]
         ┌─┴─┐
  "Hallo world"

but when we put escaping in the mix

"Hall&ograve; <b>world</b>"

      -- becomes --

                   ╭ strong [13:18]
                 ┌─┴─┐
   "Hall&ograve; world"
        ┬─────── ┊   ┊
   "Hallò world" ?????

@FokkeZB
Copy link
Collaborator

FokkeZB commented Dec 19, 2014

@yuchi it looks like it is correctly done before the creation of attributers right?

https://github.com/patrickknaap/ti-html2as/blob/entities-to-html/index.js#L135-L138

@yuchi
Copy link
Contributor

yuchi commented Dec 19, 2014

Actually not. The attributes are created after the unescaping, but the ranges are created before.

@yuchi
Copy link
Contributor

yuchi commented Dec 19, 2014

This line is where it should happen, probably…

parameters.text += entities.decodeHTML(node.data);

so when later you use parameters.length you get the right one.

@FokkeZB
Copy link
Collaborator

FokkeZB commented Dec 19, 2014

Yes, you're totally right @yuchi

Could you fix that @patrickknaap? Thx

@yuchi
Copy link
Contributor

yuchi commented Dec 19, 2014

Don’t take my words for granted, @patrickknaap. Could you try if, after the eventual fix, the html Hall&ograve; <b>world</b> outputs «Hallò world» correctly?

@patrickknaap
Copy link
Author

You're right @yuchi . The output wasn't the expected output. The ò was parsed correctly, but the ' world' didn't output in bold. I'll push a new version.

The parsing is done before the attributes are created.
@FokkeZB FokkeZB merged commit 9859d18 into jasonkneen:master Jan 5, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants