util.stripVTControlCharacters does not strip hyperlinks #53697
Comments
$ node
Welcome to Node.js v22.3.0.
Type ".help" for more information.
> util.stripVTControlCharacters('\x1b]8;;http://example.com\x1b\\This is a link\x1b]8;;\x1b\\ hello')
ttp://example.comThis is a link;; hello
The current RegEx used is: /[\u001B\u009B][[\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\d\/#&.:=?%@~_]+)*|[a-zA-Z\d]+(?:;[-a-zA-Z\d\/#&.:=?%@~_]*)*)?\u0007)|(?:(?:\d{1,4}(?:;\d{0,4})*)?[\dA-PR-TZcf-ntqry=><~]))/g
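To see why the hyperlink leaks through, here is a small sketch that applies the regex quoted above to the string from the report and lists what it actually matches. The variable names are illustrative, and the pattern is copied from this comment rather than from any particular Node.js source file.

```javascript
// Pattern copied verbatim from the comment above (assumed to be the one in use at the time).
const currentRegex = /[\u001B\u009B][[\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\d\/#&.:=?%@~_]+)*|[a-zA-Z\d]+(?:;[-a-zA-Z\d\/#&.:=?%@~_]*)*)?\u0007)|(?:(?:\d{1,4}(?:;\d{0,4})*)?[\dA-PR-TZcf-ntqry=><~]))/g;

// OSC 8 hyperlink terminated with ST (ESC \), as in the report.
const input = '\x1b]8;;http://example.com\x1b\\This is a link\x1b]8;;\x1b\\ hello';

// Every substring the pattern removes, and what is left afterwards.
const matches = input.match(currentRegex);
const stripped = input.replace(currentRegex, '');

// JSON.stringify makes the otherwise invisible escape characters visible.
console.log(JSON.stringify(matches));
console.log(JSON.stringify(stripped));
```

The first match covers the opening `ESC ] 8 ; ;` plus the leading `h` of the URL: the second branch of the pattern reads `8;;` as numeric parameters and `h` as a final byte. The pattern only knows BEL (`\u0007`) as a terminator, so the `ESC \` terminators are never consumed, and the rest of the URL and the `;;` of the closing sequence stay in the output.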
I'm currently building and testing @isaacs's suggested regex.
Oh, hey, looks like you can put other parameters between the
Note the addition of (I don't know of any terminal emulators or programs that do put parameters there, as it's just "reserved for future use", but that's what the spec says it can be.)
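A minimal sketch of a hyperlink pattern that tolerates parameters, assuming the OSC 8 layout `ESC ] 8 ; params ; URI ST` with ST being either `ESC \` or BEL. The helper name `stripOsc8Links` and the constant `OSC8_LINK` are hypothetical and are not part of Node.js or ansi-regex.

```javascript
// Hypothetical helper, not part of Node.js or ansi-regex.
// Assumed layout: ESC ] 8 ; params ; URI ST, where ST is ESC \ or BEL.
// [^;\u0007\u001b]* accepts an optional params field such as "id=1".
const OSC8_LINK = /\u001b\]8;[^;\u0007\u001b]*;.*?(?:\u001b\\|\u0007)(.*?)\u001b\]8;[^;\u0007\u001b]*;(?:\u001b\\|\u0007)/g;

function stripOsc8Links(str) {
  // Keep only the visible link text (capture group 1).
  return str.replace(OSC8_LINK, '$1');
}

const withParams = '\x1b]8;id=1;http://example.com\x1b\\This is a link\x1b]8;;\x1b\\ hello';
const withoutParams = '\x1b]8;;http://example.com\x1b\\This is a link\x1b]8;;\x1b\\ hello';

console.log(stripOsc8Links(withParams));    // This is a link hello
console.log(stripOsc8Links(withoutParams)); // This is a link hello
```

The params field is matched but discarded, so the same pattern covers links with and without parameters.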
I was wrong, the
I'm seeing a number of failures with your RegEx, but I do agree the current one may have a bug. (I would CC a util team but there isn't one :/)
The regex at https://github.com/chalk/ansi-regex/blob/main/index.js supports links. You could maybe find some inspiration there.
Hi @sindresorhus, IIRC, the regex used is from ansi-regex, but an older version. Later today, I'll test if a newer version of the regex fixes the issue.
// From https://github.com/chalk/strip-ansi/blob/main/index.js
function ansiRegex({onlyFirst = false} = {}) {
const pattern = [
'[\\u001B\\u009B][[\\]()#;?]*(?:(?:(?:(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]+)*|[a-zA-Z\\d]+(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]*)*)?\\u0007)',
'(?:(?:\\d{1,4}(?:;\\d{0,4})*)?[\\dA-PR-TZcf-nq-uy=><~]))'
].join('|');
return new RegExp(pattern, onlyFirst ? undefined : 'g');
}
console.log('\x1b]8;;http://example.com\x1b\\This is a link\x1b]8;;\x1b\\ hello'.replace(ansiRegex(), ''));
Outputs:
https://runkit.com/6689b91eebd0a700080cd8b3/668c3336e85de300088f3f39
I've opened an issue in chalk/ansi-regex#56, which is the source for the RegEx used by Node.js. Once resolved, I'll update the RegEx.
Fixed by 5b3f3c5...9416354
Version
22.4.0
Platform
Subsystem
util
What steps will reproduce the bug?
How often does it reproduce? Is there a required condition?
Always
What is the expected behavior? Why is that the expected behavior?
Should output:
(With "This is a link" not hyperlinked.)
What do you see instead?
Additional information
OSC codes are a bit tricky to capture and strip, but this is what I'm using in ansi-to-pre:
Adding
str.replaceAll(/\u001b\]8;;(.*?)(?:\u001b\\|\u0007)(.*?)\u001b]8;;(?:\u001b\\|\u0007)/g, '$2')
should do the right thing, but not sure if there's some better way to do it in context.
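For reference, a small sketch of the suggested replacement wrapped in a function and run against the string from the report. The function name `stripHyperlinks` is hypothetical; the regex is the one suggested above, and `String.prototype.replaceAll` needs Node.js 15 or later.

```javascript
// Hypothetical wrapper around the replacement suggested above.
function stripHyperlinks(str) {
  // $2 is the visible link text; the URL ($1) and both OSC 8 sequences are dropped.
  return str.replaceAll(
    /\u001b\]8;;(.*?)(?:\u001b\\|\u0007)(.*?)\u001b]8;;(?:\u001b\\|\u0007)/g,
    '$2',
  );
}

// The same link, once terminated with ST (ESC \) and once with BEL.
const stTerminated = '\x1b]8;;http://example.com\x1b\\This is a link\x1b]8;;\x1b\\ hello';
const belTerminated = '\x1b]8;;http://example.com\x07This is a link\x1b]8;;\x07 hello';

console.log(stripHyperlinks(stTerminated));  // This is a link hello
console.log(stripHyperlinks(belTerminated)); // This is a link hello
```

This only removes hyperlink sequences. Other escape codes, such as colors, would still need the general pattern, so one option is to run this replacement first and the existing regex afterwards.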