Can't search across lines with .* regex #303

izuzak · 2014-10-24T16:06:28Z

Originally reported by @cydrobolt over at atom/atom#3892

Neither regex nor normal find can search across lines. For example:


something="ex
ists"


<p>blablablabla
blablablabla</p>

For the second example, a regex of .* should have matched the text. However, it does not work, because it is spread across two lines.

burabure · 2014-11-05T15:14:15Z

If you try to do something like

<p>(.*|\n)*?</p>

on the current buffer, it actually crashes (at least if the content is more than a couple lines long)

Ubuntu 14.04
atom 0.141.0

redfellow · 2015-03-18T17:39:35Z

Ubuntu 14.04 crash still exists.

harai · 2015-07-07T01:38:18Z

It still crashes.

Ubuntu 15.04
Atom 1.0.0
find-and-replace 0.174.1

dead-claudia · 2015-07-17T04:27:16Z

To be honest, I don't think that crash is easily fixable. Try running that regex through grep and see what happens. If it doesn't hang on a file of about 30 lines, then there is likely a very difficult perf bug in V8 or (highly unlikely) Atom's text editor. I would be surprised if that's the case, though, considering that is literally "any number of a group of the least number of characters consisting of either a newline or the largest group of non-line-ending characters you can get". That's a lot of work to do, and it's not the easiest to even statically compile that regex to infer that it's matching any set of characters that don't include any line-ending character other than a line feed. That means the regex doesn't match other line-breaking characters, i.e. carriage return, (the obscure line-breaking code points) U+2028 and U+2029, etc. Another thing is that even Sublime, etc. tend to choke a little on regular expressions.

Regular expression engines are extremely slow to begin with, and V8's Irregexp engine is one of the few that isn't atrociously slow. (It's faster than most POSIX-based regex implementations, and it's faster than Perl's highly optimized, highly flexible one.)

I would say one way, probably the best way, to curb the crashes is to add a delay between the last character being typed and the regex finally being executed, even as little as 200 milliseconds. I couldn't tell you how many times I've had Atom crash in the middle of me typing out a regex, simply because the incomplete one happened to match a third of the code. The other thing is that most editors don't run regular expressions as they're typed; they run them via a dialog or similar. Atom is rather unique in having this problem.
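The delay described here is usually called debouncing. A minimal sketch of the idea, assuming a hypothetical `runSearch` callback and an injectable `timers` object (both names are illustrative and not taken from find-and-replace):

```javascript
// Hypothetical helper, not Atom's actual code: wraps `fn` so that it only
// runs once `delayMs` have elapsed without another call.
// `timers` is injectable purely so the behaviour can be exercised without
// real waiting; by default it delegates to the global timer functions.
const defaultTimers = {
  set: (callback, ms) => setTimeout(callback, ms),
  clear: (handle) => clearTimeout(handle),
};

function debounce(fn, delayMs, timers = defaultTimers) {
  let handle = null;
  return (...args) => {
    // A newer keystroke arrived: drop the pending, half-typed pattern.
    if (handle !== null) timers.clear(handle);
    handle = timers.set(() => {
      handle = null;
      fn(...args);
    }, delayMs);
  };
}

// Usage: only the last pattern typed within 200 ms would be executed.
const executed = [];
const runSearch = (pattern) => executed.push(pattern); // stand-in for the real search
const onPatternChanged = debounce(runSearch, 200);
```

With this in place, typing `<p>(`, `<p>(.*` and `<p>(.*)</p>` in quick succession would trigger a single search for the final pattern rather than three searches, two of them on incomplete expressions.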

acusti · 2015-07-20T21:29:52Z

I don’t know if this is worth its own issue, but the general problem of not being able to do a multiline search without converting your search to a regular expression is a painful one. It seems like the need to use “replace in project” to modify every instance of a multiline chunk of code in a project is the kind of thing that comes up frequently enough that it would be great if the editor could handle it. There have been a few times that I just wanted to paste a code chunk into the find field and a different one into the replace field.

dbolton · 2016-01-07T17:25:42Z

@burabure If you delete the asterisk inside the parentheses, you'll get the same matches without the crash ((.|\n)*?). Better yet is the following regular expression which matches newlines regardless of platform (e.g. carriage return or newline)

[\s\S]*?

\s matches white space including line breaks, and \S matches anything that is not a white space. Unlike some languages, JavaScript doesn't have a way to flag that you want dots to match a newline. So maybe Atom can replace dots with [\s\S] under the hood to match newlines.
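A short sketch of the difference between the three patterns, run in plain JavaScript rather than inside Atom. The `expandDots` function is a hypothetical illustration of the "replace dots under the hood" idea; it is deliberately naive and is not how find-and-replace works.

```javascript
const lf = '<p>blablablabla\nblablablabla</p>';
const crlf = '<p>blablablabla\r\nblablablabla</p>';

const dotOnly = /<p>.*<\/p>/;        // `.` stops at every line terminator
const dotOrLf = /<p>(.|\n)*?<\/p>/;  // copes with \n but not with \r
const anyChar = /<p>[\s\S]*?<\/p>/;  // matches every character

const results = {
  dotOnly: [dotOnly.test(lf), dotOnly.test(crlf)], // [false, false]
  dotOrLf: [dotOrLf.test(lf), dotOrLf.test(crlf)], // [true, false]
  anyChar: [anyChar.test(lf), anyChar.test(crlf)], // [true, true]
};

// Hypothetical rewrite: replace each unescaped `.` outside a character
// class with `[\s\S]`. Escapes such as `\.` and classes such as `[.]`
// are copied through unchanged.
function expandDots(source) {
  let out = '';
  let inClass = false;
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === '\\') {
      out += ch + (source[i + 1] || ''); // copy the escape pair verbatim
      i++;
    } else if (ch === '[' && !inClass) {
      inClass = true;
      out += ch;
    } else if (ch === ']' && inClass) {
      inClass = false;
      out += ch;
    } else if (ch === '.' && !inClass) {
      out += '[\\s\\S]';
    } else {
      out += ch;
    }
  }
  return out;
}

const expanded = expandDots('<p>.*?</p>'); // '<p>[\\s\\S]*?</p>'
```

The CRLF row is the reason `[\s\S]` is preferable to `(.|\n)`: in JavaScript `.` does not match a carriage return either, so the second pattern stops at `\r`.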

guillochon · 2016-06-27T21:18:50Z

This bug is really bad, I typed in (.*|\n)*? into my find and it crashed Atom, but find still has that pattern entered so it crashes every time I launch a new search now! How do I clear the search history?

Edit: Looks like it's working now after restarting Atom a few times, not sure what changed.

menocomp · 2016-07-06T02:12:29Z

@dbolton (.|\n)*? only works in one file!!! I tried it in folders and did not work!

dead-claudia · 2016-07-06T23:58:43Z

Should there be a multiline option? I think that's probably the best resolution, since there may be cases when you don't intend to match across lines.

winstliu · 2017-04-19T17:06:33Z

VSCode and Atom are two different projects, so both issues should remain open.

sekmo · 2017-05-18T10:26:45Z

No news after three years? :-)

steviesama · 2017-08-01T21:56:07Z

I don't know if it's a complete solution but I got something working. Pretty strange I thought, and I'm not sure about the limitations because it's hackity, but here's how I refactored a chunk of code I had in more places than I should have.

The first snip shows what I was matching as it always matches in the same window but never across multiple files. While if you hit enter how I have it here it will match across all instances of the text.

Below is a snip of the search matching in all 40 places.

This is pretty strange. But I noticed that the first line is always fine. Then to get to the next as well as every line thereafter, you need to start doing a pattern, at least the way I'm doing it. Shown below:

\s*[text to match]*

\s* for all the upcoming space, though I should mention, I did (\s)* or (\s*) in mine as what I wanted to also do was match whatever indentation was present. Putting your text to search inside a character class, and always terminating it with *, and your search will be found.

I found it strange that the character class worked, but I figured it had something to do with how it was finding them so I tried * after each character on lines after the first...and that worked too. Snip below.

var style = \{*\s*w*i*d*t*h*

Well, I hope that was helpful to someone. I was able to use sub-grouping to change the follow-up matching and everything without a hitch.
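One caveat with the patterns above: `[text to match]*` is a character class, so it matches any run of those characters in any order, and every starred piece may also match nothing at all. A small sketch in plain JavaScript (outside Atom, with made-up sample strings) of how loosely the pattern matches:

```javascript
// The pattern from the comment above, unchanged.
const hack = /var style = \{*\s*w*i*d*t*h*/;

const intended = 'var style = {\n  width: 10';
const unrelated = 'var style = null;';

// Matches the intended text across the line break...
const intendedMatch = hack.exec(intended)[0];   // 'var style = {\n  width'
// ...but also matches here, because each starred piece can be empty.
const unrelatedMatch = hack.exec(unrelated)[0]; // 'var style = '

// A stricter spelling of the same intent, for comparison.
const strict = /var style = \{\s*width/;
const strictResults = [strict.test(intended), strict.test(unrelated)]; // [true, false]
```

Whether the stricter form behaves the same way in Atom's project-wide search is not something this thread establishes, so results from the hack are worth reviewing before replacing.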

steviesama · 2017-08-01T22:06:46Z

@isiahmeadows As for the multiline option, since it doesn't let you search across multiple lines, I think with what I found above, that seems to basically make multiline an explicit option.

dead-claudia · 2017-08-02T04:46:13Z

@steviesama Good point. Maybe better to add an option to, short-term, transform . to [^], and long-term, use /s (which is currently an ES proposal, but V8 has recently started shipping it by default).

dead-claudia · 2017-08-02T04:47:24Z

And maybe make that option ". matches newlines" or something like that.
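A rough sketch of what such an option could look like, assuming an engine that supports the ES2018 `s` (dotAll) flag. The `buildSearchRegex` helper, the `dotMatchesNewlines` option name and the base `gm` flags are all assumptions made for illustration, not find-and-replace internals.

```javascript
// Hypothetical helper: adds the `s` flag only when the option is enabled.
function buildSearchRegex(source, { dotMatchesNewlines = false } = {}) {
  const flags = 'gm' + (dotMatchesNewlines ? 's' : '');
  return new RegExp(source, flags);
}

const text = '<p>blablablabla\nblablablabla</p>';

// Default behaviour: `.` stops at the line break, so nothing matches.
const withoutOption = buildSearchRegex('<p>.*</p>').test(text); // false

// Option enabled: `.` also matches line terminators.
const withOption = buildSearchRegex('<p>.*</p>', { dotMatchesNewlines: true }).test(text); // true

// The short-term alternative mentioned above: `[^]` matches any character.
const shortTerm = new RegExp('<p>[^]*</p>').test(text); // true
```

Keeping the behaviour behind an explicit option preserves the current meaning of `.` for searches that are not meant to cross lines.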

ghost · 2018-01-04T17:57:47Z

What's the status on this?

winstliu · 2018-01-04T21:33:01Z

I'm not aware of any attempts to fix this issue, however we would be interested in reviewing PRs addressing this issue that don't regress in terms of performance. The current library we use for searching files is atom/scandal, where I believe files were intentionally broken up into chunks to improve search performance.
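To illustrate why chunked scanning would interfere with multi-line patterns, here is a sketch that compares matching against a whole buffer with matching one line at a time. It is not scandal's actual code: `countInWholeText` and `countPerLine` are hypothetical names, and line-sized chunks are an assumption made for the example.

```javascript
// Counts matches when the regex sees the entire text at once.
function countInWholeText(text, source) {
  return (text.match(new RegExp(source, 'g')) || []).length;
}

// Counts matches when the text is split into lines and each line is
// searched on its own, so no match can span a line break.
function countPerLine(text, source) {
  return text
    .split('\n')
    .reduce((total, line) => total + countInWholeText(line, source), 0);
}

const text = 'something="ex\nists"\n<p>blablablabla\nblablablabla</p>';
const pattern = '<p>[\\s\\S]*?</p>';

const whole = countInWholeText(text, pattern); // 1
const perLine = countPerLine(text, pattern);   // 0
```

The same pattern finds the paragraph when given the full text and nothing when given separate lines, which is consistent with the reports in this thread that a pattern works in a single open file but not across a project.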

dead-claudia · 2018-01-05T07:41:31Z

Found an issue there, but no PR.

g3ar · 2018-07-19T23:51:26Z

Have the same issue. We need to have "Multiline" find option.

artheus · 2018-07-24T10:15:32Z

👍 I agree that this is something that is needed.

jinglesthula · 2018-07-30T17:23:07Z

Although this has been painful enough for long enough, I think we're nearly out of the woods. The proposal went to Stage 4 seven months ago, and the kangax tables list it as an ES2018 feature http://kangax.github.io/compat-table/es2016plus/. I don't know the guts of Atom to know even what JS engine it's running or what ES features are supported, but I suspect we're either at the point (or will be very soon) where we could just have a button added on the Atom find UI to include the s flag.

dead-claudia · 2018-07-30T18:01:50Z

@jinglesthula Not sure how much the /s flag would help, if my analysis is correct.

jinglesthula · 2018-07-31T15:34:39Z

Mmm.. yeah. We're all probably naively thinking "how hard could it be?", but the realities of performance and scaling when dealing with large files aren't trivial. I wonder if other editors' approaches could be looked at to see how they accomplish it. For now, remembering to use [^] or \s* may be the easiest workaround.

dead-claudia · 2018-07-31T21:07:28Z

@jinglesthula This very issue has prompted me to start an ESDiscuss thread about what would be required to fix this.

But most certainly, the more intuitively simple something is conceptually, the more complex it really becomes behind the scenes to do correctly, ironically enough.

ghost · 2019-04-01T22:54:55Z

I know it has been a long time, but I've been trying to find a solution for this issue for a while. So, here are my 2¢:

blablablabla
blablablabla

Find: (.*\n.+?)
Replace: New content:$1

Result:

New content:blablablabla
blablablabla

Screenshots

Before "Replace":

After "Replace":

Atom: 1.35.1 x64, macOS Mojave 10.14.4

Does it help?

g3ar · 2019-04-01T22:58:57Z

No.

ghost · 2019-04-01T23:02:34Z

@g3ar Could you give more details, please?

g3ar · 2019-04-01T23:10:33Z

I'm not using Atom right now. Your solution works for simple files. I tried it on more complicated sources and it fails. I think the problem is incorrect parsing of \n.

ghost · 2019-04-01T23:22:44Z

@g3ar I understand. I've tested it in a file (html+javascript+json) with 14,448 lines and it worked fine. However, I'm using Atom. I believe that different regex flavors require different regex structures.

I don't know if you already did it but, if not, you could try to identify which flavor/engine you're using and then try another solution.

Here's a list of them: https://en.wikipedia.org/wiki/Comparison_of_regular_expression_engines

Good luck and thank you for the details.

DigitalLeaves · 2020-04-10T10:13:45Z

My two cents. It works for single files, but not across multiple files.

I have plenty of files with this code (sidebar, HTML static):

<li class="nav-item">
   <a class="nav-link" href="./employees.html">
      <i class="ni ni-badge text-primary"></i>
      <span class="nav-link-text" data-i18n="employees_and_salaries"></span>
   </a>
</li>

I want to add a new class (let's call it newclass) to the <li> element, but only when the link links to employees.html, so my regexp:
<li class="nav-item">([.|\n|\s|\t]*)<a class="nav-link" href="\.\/employees\.html">
And replacement:
<li class="nav-item newclass">$1<a class="nav-link" href="./employees.html">

Works for single files (finds the expression), but fails to find a single match if I look for multi-files (Shift+Option+F).
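A note on the pattern above: inside a character class, `.` and `|` are literal characters, so `[.|\n|\s|\t]*` behaves like `[.|\s]*`, and for this markup `\s*` alone is enough. A sketch run in plain JavaScript against an in-memory string (so it says nothing about the multi-file behaviour in Atom):

```javascript
const html = [
  '<li class="nav-item">',
  '   <a class="nav-link" href="./employees.html">',
  '      <i class="ni ni-badge text-primary"></i>',
  '   </a>',
  '</li>',
].join('\n');

// Same intent as the original pattern, with the character class reduced to \s*.
const find = /<li class="nav-item">(\s*)<a class="nav-link" href="\.\/employees\.html">/g;

// `$1` re-inserts the captured line break and indentation unchanged.
const replacement =
  '<li class="nav-item newclass">$1<a class="nav-link" href="./employees.html">';

const updated = html.replace(find, replacement);
```

Only `<li>` elements immediately followed by the employees link are rewritten; other `nav-item` entries do not match because the `href` is part of the pattern.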

svennd · 2022-03-13T12:43:52Z

The "find all" works fine in a single file, but multi-file doesn't work. Is there a workaround available (other than opening hundreds of files to run this manually)?

I want to remove double lines :

thumbnail:(.|\r?\n)*?thumbnail:(.*?)$

with 

thumbnail:$2
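One possible workaround outside the editor is to run the same replacement from a small script. A minimal sketch, assuming the `g` and `m` flags (so that `$` matches at line ends) and a hypothetical `keepLastThumbnail` helper; the sample content is invented:

```javascript
// Applies the find/replace pair from the comment above to one file's text.
// Note that the pattern removes everything between the two `thumbnail:`
// occurrences, not just the first line.
function keepLastThumbnail(text) {
  return text.replace(/thumbnail:(.|\r?\n)*?thumbnail:(.*?)$/gm, 'thumbnail:$2');
}

const sample = [
  'title: a',
  'thumbnail: old.png',
  'thumbnail: new.png',
  'body',
].join('\n');

const cleaned = keepLastThumbnail(sample);
// cleaned === 'title: a\nthumbnail: new.png\nbody'
```

Applying it per file could be done by reading each file's contents with Node's `fs.readFileSync` and writing the result back with `fs.writeFileSync`, ideally on a copy or under version control.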

izuzak mentioned this issue Oct 24, 2014

Can't search across lines atom/atom#3892

Closed

benogle changed the title ~~Can't search across lines~~ Can't search across lines with .* regex Oct 24, 2014

benogle added the bug label Nov 13, 2014

lee-dohm mentioned this issue Jan 4, 2015

Replacing ^ and $ broken with "only in selection" atom/atom#4838

Closed

danwulff mentioned this issue Jan 25, 2017

Multiline search using regex fails to return all results microsoft/vscode#19217

Closed

rafeca mentioned this issue May 22, 2019

Handle multiline results on the find and replace UI #1085

Merged
