Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

handle regex special characters #250

Merged
merged 1 commit into from
Oct 22, 2016
Merged

Conversation

zacronos
Copy link
Contributor

Currently, enabling namespaces which contain regex special characters leads to unexpected results. Examples:

var debug = require('debug');
debug.enable("abc[def],x.z,what?");

console.log(debug.enabled('abcf'));  // yields true, should be false
console.log(debug.enabled('abc[def]'));  // yields false, should be true

console.log(debug.enabled('xyz'));  // yields true, should be false

console.log(debug.enabled('what?'));  // yields false, should be true
console.log(debug.enabled('wha'));  // yields true, should be false

debug.enable("oops[");  // throws an error

This PR fixes that by escaping the regex special characters within the namespace strings (while still properly allowing * to be a wildcard). The escape code is based on the proposed ES7 RegEx.escape polyfill found at https://github.com/benjamingr/RegExp.escape/blob/master/polyfill.js

Since this escaping only happens when .enable() is called and not when .enabled() is called or the log function is actually used, it should not negatively impact the performance of apps using this module.

@zacronos
Copy link
Contributor Author

Sorry for the additional commits, right after I pushed up this PR I saw I was being inefficient. The current code should be a winner.

@stephenmathieson
Copy link
Contributor

Sorry for the additional commits, right after I pushed up this PR I saw I was being inefficient. The current code should be a winner.

why not squash the commits then?

@@ -141,7 +141,7 @@ function enable(namespaces) {

for (var i = 0; i < len; i++) {
if (!split[i]) continue; // ignore empty strings
namespaces = split[i].replace(/\*/g, '.*?');
namespaces = split[i].replace(/[\\^$+?.()|[\]{}]/g, '\\$&').replace(/\*/g, '.*?');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at that, but * would be escaped by that (or any such) library -- you'll note I'm not escaping it here so the existing wildcard replace will work as-is. It's possible to work around this issue, but for a 1-liner it seemed like overkill.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should add: it especially complicates the issue since this needs to work in both node and the browser. If this were a node-only library it wouldn't be such a big deal.

If the functionality needed were a big deal and possibly subject to further development, bug fixes, etc, it might be worth it -- but again, in comparison to a 1-liner, it seems like unnecessary complication.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah gotcha

@zacronos
Copy link
Contributor Author

@stephenmathieson: Very well, squashed.

@TooTallNate
Copy link
Contributor

Thanks!

@nunofgs
Copy link

nunofgs commented Nov 21, 2016

Hey guys. This PR introduced a breaking change for us, in version 2.3.0.

In 2.2.0, this was perfectly valid:

DEBUG='(foo|bar)*' node index.js

// This would log both "foo:example:something-else" and "bar:blabla:example".

After applying the previous replace, the process.env.DEBUG variable is now (foo|bar)*.

In 2.3.0,

DEBUG='(foo|bar)*' node index.js

// Does not log anything starting with "foo" or "bar" at all.

After applying the new replace(), the process.env.DEBUG variable is now \(foo|bar\)*.

I understand that parenthesis are a regexp special character, but their usage is reserved for matching groups.

Given the functionality that debug is trying to expose here, I personally believe that parenthesis are an acceptable and natural way to specify exclusive words as per the example above.

EDIT: Also, it goes without saying that the regexp introduced in 2.3.0 also blocks the use of the | character, even if you try to escape it. So there's no way to do something like: DEBUG='foo|bar' node index.js

@TooTallNate: given that this is a BC (we suddenly stopped seeing logs!) would you consider merging a fix (I can submit a PR) and bumping the package asap?

@zacronos
Copy link
Contributor Author

The question @TooTallNate and the visionmedia folks need to figure out is whether they WANT to allow regex-ish enables for DEBUG. It would be a little strange to do so since * wouldn't be treated as a regex character. And then anyone who tries to use parentheses as part of the label would get unexpected behavior, much like this person did when trying to use square brackets: #312.

You are considering it a breaking change, but the other perspective is that you wrote code which relies on buggy behavior. It all depends on whether you consider the regex-ish behavior to be a bug or an undocumented feature.

If the decision is that the regex-ish behavior is desired, then it should be properly documented to prevent users from having problems like #312.

@zacronos
Copy link
Contributor Author

@nunofgs: FYI, the documented way to do something like DEBUG='(foo|bar)*' is DEBUG='foo*,bar*'.

@TooTallNate
Copy link
Contributor

Well, breaking logs is never a good thing, so 👍 for a revert on the 2.x level. Then, if we are going to keep this patch, it should be saved for the v3 branch.

@TooTallNate
Copy link
Contributor

Plus having namespaces with these special characters seems... strange to me as well. So if they had previous utility in specifying the names in DEBUG then that's probably more useful IMO.

@zacronos
Copy link
Contributor Author

FWIW, a revert will wind up breaking logs for at least 1 person (@wenkanglin, the originator of #312).

@nunofgs
Copy link

nunofgs commented Nov 21, 2016

You are considering it a breaking change, but the other perspective is that you wrote code which relies on buggy behavior. It all depends on whether you consider the regex-ish behavior to be a bug or an undocumented feature.

I always assumed that regex-ish behavior was desired and is part of the reason I consider this package to be very useful and powerful.

I still recommend reverting and then immediately bumping to 3.0.0 since, even though it may be considered undesirable behavior but this decision is, of course, your prerogative.

@fixe
Copy link
Contributor

fixe commented Nov 21, 2016

+1 for keeping compatibility with native RegExps and allowing some syntactic sugar on top of it.

TooTallNate added a commit that referenced this pull request Nov 22, 2016
This reverts commit 8dd8345.

We shouldn't have changed the original behavior, which
too many people are relying on at this point.

It's also technically a breaking change, which we shouldn't
have landed on a minor/patch version change. So in the interest
of not breaking people's logs in production, reverting this.

We can discuss in a new issue if we want to restore this patch
for a `v3` release, but at the moment I'm personally leaningo
towards *no*, for historical reasons (i.e. this is reminding
me of Node.js trying to remove `sys` if anybody reading this
remembers those days).

See the discussion in #250 for more backlog.
@zacronos
Copy link
Contributor Author

I strongly recommend documenting the regex-ish behavior, since the current documentation indicates how * and , are treated specially, but nothing is mentioned about regex characters -- which is why I think some people (including myself) did not expect regex characters to be treated as anything but literal characters.

Alternatively, you could consider a v3.0 release that uses 100% regex syntax (with appropriate documentation), since otherwise the non-regex (and currently documented) behavior of * and , seems out of place.

@zacronos
Copy link
Contributor Author

For another example of the types of issues that can result from folks not expecting the regex behavior, see #288

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants