Keeps line and indentation on remove() #86
Comments
We use Cheerio in Yeoman to do HTML manipulations in our scaffolder and it's great, but because of this, it leaves a lot of empty lines and trailing whitespace, which is annoying to the end-user. Hopefully this can be fixed soon :) |
I'm not sure I understand the example, but a while ago we removed the tidying features. IMO it was feature creep and should be left to a tidy library. |
If I remove() the div#test:
<div>
    <div id="test">dsf</div>
</div>
The resulting HTML is (with trailing whitespace):
<div>
    
</div>
The resulting HTML should be:
<div>
</div>
Or even <div></div> in this situation. |
This is a div element with three children: a text node containing a newline and 4 spaces, a div element with an id of "test", and a text node containing a newline. Removing the div child element does not (and should not) remove the text nodes. I do understand what you want: you want the HTML resulting from Cheerio to be reformatted according to your preferred style. However, this is not (currently) a functional goal of Cheerio, and it is functionality that can best be achieved by processing the output of Cheerio with another function. I would recommend the mature and stable js-beautify for HTML post-processing. It provides a number of options to format HTML to your standards. |
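To make the three-children explanation concrete, here is a minimal sketch. It uses plain objects as a simplified stand-in for a parsed DOM; the node shape and the `serialize` helper are assumptions for illustration, not Cheerio's actual internals.

```javascript
// Simplified stand-in for a parsed DOM: plain objects, not Cheerio's real node classes.
const parent = {
  type: "tag",
  name: "div",
  children: [
    { type: "text", data: "\n    " }, // newline plus 4 spaces of indentation
    {
      type: "tag",
      name: "div",
      attribs: { id: "test" },
      children: [{ type: "text", data: "dsf" }],
    },
    { type: "text", data: "\n" }, // newline before the closing tag
  ],
};

// Hypothetical minimal serializer for this model (no escaping, no void elements).
function serialize(node) {
  if (node.type === "text") return node.data;
  const attrs = Object.entries(node.attribs || {})
    .map(([key, value]) => ` ${key}="${value}"`)
    .join("");
  const inner = node.children.map(serialize).join("");
  return `<${node.name}${attrs}>${inner}</${node.name}>`;
}

// Removing the element child leaves both text nodes exactly where they were.
parent.children = parent.children.filter((child) => child.type !== "tag");

const afterRemoval = serialize(parent);
console.log(JSON.stringify(afterRemoval)); // "<div>\n    \n</div>"
```

The leftover indentation comes from the first text node, which is why the output keeps a whitespace-only line.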
Ok, I didn't think Cheerio concerned itself with text nodes. I do however think that Cheerio should have an option in $.html() or something to run html-beautify. I can't think of any scenario where I would want trailing spaces left in the source. |
Right, but most node modules use cheerio to do screen scraping, where content, not source, is most important. From a quick look at js-beautify and node-beautifier, it looks to be as simple as:
var beautify = require('js-beautify'); // assuming the js-beautify npm package
var html = $.html(),
    beauty = beautify.html_beautify;
html = beauty(html);
That would be super simple to add to Yeoman, and it would do a better job than something we hack together for cheerio. Closing this issue unless a more compelling reason to add a tidy feature comes up. |
Not saying it would be hard to add to Yeoman, obviously it's not. It just would be a nice convenience in Cheerio, not having to evaluate the options, add it as a dep, import it, look up the API and then finally beautify, but whatever. |
While this is quite an old issue, I just came across it while facing the same situation. Beautification is obviously a possible way, but I didn't like the overhead, so I came up with the following solution, which removes empty (whitespace-only) lines. It does so by iterating over all child nodes, selecting the whitespace-only text nodes that are directly followed by another text node, and removing them. Maybe it is helpful to somebody else:
$("something").contents().filter(function() {
    var ns = this.nextSibling;
    if (ns != null) {
        // nodeType 3 is Node.TEXT_NODE
        return this.nodeType === 3 && ns.nodeType === 3 && /^\s+$/.test(this.nodeValue);
    }
    return false;
}).remove(); |
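As a rough illustration of what that filter selects, the sketch below applies the same predicate to a plain array of sibling nodes. The `dropRedundantWhitespace` helper and the object shape are hypothetical and only mirror the `nodeType` and `nodeValue` properties used above; it does not call Cheerio.

```javascript
const TEXT_NODE = 3;

// Hypothetical helper: keeps every node except a whitespace-only text node
// that is immediately followed by another text node.
function dropRedundantWhitespace(nodes) {
  return nodes.filter((node, i) => {
    const next = nodes[i + 1];
    const redundant =
      next !== undefined &&
      node.nodeType === TEXT_NODE &&
      next.nodeType === TEXT_NODE &&
      /^\s+$/.test(node.nodeValue);
    return !redundant;
  });
}

// Siblings left inside the outer <div> once div#test has been removed.
const leftover = [
  { nodeType: TEXT_NODE, nodeValue: "\n    " },
  { nodeType: TEXT_NODE, nodeValue: "\n" },
];

const cleaned = dropRedundantWhitespace(leftover);
const inner = cleaned.map((n) => n.nodeValue).join("");
console.log(JSON.stringify(`<div>${inner}</div>`)); // "<div>\n</div>"
```

Because the last text node has no following sibling, it is kept, so the closing tag still starts on its own line.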
When I remove an indented element that makes up a whole line, the line and its indentation are preserved. This makes a document with many remove() calls look dirty, since it's filled with empty lines that contain only indentation.
IMO it should remove the line if all of its content is gone.
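For callers who only need this behaviour, one possible workaround is to post-process the serialized string instead of the tree. The sketch below is an assumption-laden example: `stripBlankLines` is a hypothetical helper, not part of Cheerio, and it would also drop intentionally blank lines, including those inside `<pre>` or `<textarea>` content.

```javascript
// Hypothetical post-processing helper: drops every line that is empty
// or contains only whitespace from a serialized HTML string.
function stripBlankLines(html) {
  return html
    .split("\n")
    .filter((line) => line.trim() !== "")
    .join("\n");
}

// Output shape left behind after removing an indented element.
const dirty = "<div>\n    \n</div>";
const tidy = stripBlankLines(dirty);
console.log(JSON.stringify(tidy)); // "<div>\n</div>"
```

In practice the input would be the string returned by `$.html()`.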