Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extending removeComments to keep only some custom comments #73

Open
sebiniemann opened this issue May 28, 2019 · 5 comments
Open

Extending removeComments to keep only some custom comments #73

sebiniemann opened this issue May 28, 2019 · 5 comments

Comments

@sebiniemann
Copy link

sebiniemann commented May 28, 2019

Currently, the removeComments options allows to

  1. keep all comments (using false)
  2. remove all comments (using 'all')
  3. remove all except the conditional ones and <!--/?noindex--> (using 'safe')

This works great for classic/standard projects, but is problematic in more advanced use cases, where some special comments are used in another framework to extend the HTML file later one. In these situations, 'all' and 'safe' are not useful, as either all or at least the custom special comments are removed. However, false is also impractical, as it keeps all kind of comments, not differentiating between development (should be removed) comments or special framework comments.

For example, vue-server-renderer uses the <!--vue-ssr-outlet--> comment to inject the compiled markup. Another example I have seen in the wild is KnockoutJS, which supports a so called containerless control flow, which uses comments like <!-- ko if : isEditable() -->Some markup<!-- /ko -->.


As a solution, I suggest to add another option. Allowing the user to provide a regular expression, working as a whitelist of comments to keep.

removeComments: 'vue-.+?',

This shouldn't conflict with the existing 'all' and 'safe' options, as we can always add <!-- to the regex to be more specific. So if someone wants to keep all comments that include all, it could also be written as <--.*?all.

The actual implementation could then be something like this:

  • As anything besides 'all' and 'safe' can be assumed to be a regular expression, we shouldn't overwrite the option anymore.
export default function removeComments(tree, options, removeType) {
-    if (removeType !== 'all' && removeType !== 'safe') {
-        removeType = 'safe';
-    }
  • To keep a single logic, it would be great to map 'all' and 'safe' to their corresponding regular expression.
  • 'all' can be seen as /^$/ (will only match an empty string/nothing)
  • 'safe' is a combination of:
    • /^<!--\/?noindex-->$/
    • and isConditionalComment, which uses already a regular expression (/<!--\[if/) inside and only invoked/used by this single line.
    • In summary: /^<!--(\/?noindex|\[if.+?)-->$/
function isCommentToRemove(text, removeType) {
    if (typeof text !== 'string') {
        return false;
    }

    if (! isComment(text)) {
        // Not HTML comment
        return false;
    }

+    let keep;
+    if (removeType === 'all') {
+        // Keep nothing
+        keep = /^$/;
+    } else if (removeType === 'safe') {
+        // Keep only noindex and conditional comments
+        // See: https://yandex.com/support/webmaster/controlling-robot/html.xml and https://en.wikipedia.org/wiki/Conditional_comment
+        keep = /^<!--(\/?noindex|\[if.+?)-->$/;
+    } else {
+        // Keep only custom comments
+        keep = new RegExp(removeType);
+    }

-    const isNoindex = text === '<!--noindex-->' || text === '<!--/noindex-->';
-    if (removeType === 'safe' && isNoindex) {
-        // Don't remove noindex comments.
-        // See: https://yandex.com/support/webmaster/controlling-robot/html.xml
-        return false;
-    }
-
-    // https://en.wikipedia.org/wiki/Conditional_comment
-    if (removeType === 'safe' && isConditionalComment(text)) {
-        return false;
-    }
+    if (keep.test(text.trim())) {
+        return false;
+    }

    return true;
}

To keep it intuitive, I would also suggest to rename the option from removeComments to keepComments (and all to none in correspondence). However, I won't suggest to do this before version 1.0, as this project is already a widely spread dependency.

@maltsev
Copy link
Member

maltsev commented Jun 4, 2019

Thanks for the suggestion! It totally makes sense.

It also made me think if such custom rules could come in handy in other modules. I created #75 for discussing that. I'd be happy to hear your thoughts about that.

@SoftCreatR
Copy link

Any progress here? Because I could need this for Server-Side SSI, which is currently being stripped out.

@SukkaW
Copy link
Contributor

SukkaW commented Sep 13, 2021

@SoftCreatR I can create a PR implementing the feature.

@SoftCreatR
Copy link

That would be 👍

@SoftCreatR
Copy link

Just FYI: It works absolutely fine (at least for me and my special case).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants