-
Notifications
You must be signed in to change notification settings - Fork 446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stackoverflow in 1.4.4 #750
Comments
I need a smaller reproduction please. There is no obvious reason that I see why any recent changes would cause a stack overflow. It also isn't clear whether you're reporting a compilation error or something that happens at regex runtime. If the former, that then sounds like a rustc bug, no? If the latter, a stack trace would be helpful. |
I'm not familiar with windows platform enough to provide a stacetrace, sorry. It's compilation error because bindgen fails to generate bindings. The reason that it fails to generate bindings because of stackoverflow. rust-bindgen is the only one dependency of grpcio-sys that uses regex. I guess there is some code in rust-bindgen that invokes regex heavily and result in stackoverflow. I can confirm wrapping T into Line 121 in 951b8b9
Before the PR, all values are allocated at heap when not enables |
I'm not familiar enough with Windows either. Your patch that fixes the issue is quite weird. I wonder if this is not a case of recursion causing a stack overflow, but rather, too many things on the stack itself. If Windows has smaller stack sizes than, let's say, Linux or macOS, it could explain why Windows specifically is having a problem. But for this to be true, I think you would need to put a lot of regexes on the stack. |
According to the rustc output, the new Pool size becomes 848 byte, is it an expected size? The default stack size of a thread in Rust is 2MiB. Supposing half of the stack is used for other stack frames, then users are expected not to create more than 2473 regex expressions.
I think this is a common problems for all platform. The reason why it always stackoverflow on Windows may be related to different symbols on different platforms. Perhaps rust-bindgen just needs to handle more symbols than other platform. |
Actually it's true. I got the 2MiB from https://doc.rust-lang.org/std/thread/index.html#stack-size, but I missed the bottom line
After referring to the docs and writing small snippets to verify it, the stack sizes of main thread on MacOS and Linux are the same as |
@jdm One thing that would be useful is a pointer to the source code where bindgen uses regexes. If there are a lot of them on the stack, then I think that would explain things here. |
servo/servo#28269 is the tracking bug in servo for this, as they are hitting it too. |
I'm working on a patch for this now. |
This commit fixes a fairly large regression in the stack size of a Regex introduced in regex 1.4.4. When I dropped thread_local and replaced it with Pool, it turned out that Pool inlined a T into its struct and a Regex in turn had Pool inlined into itself. It further turns out that the T=ProgramCache is itself quite large. We fix this by introducing an indirection in the inner regex type. That is, we use a Box<Pool> instead of a Pool. This shrinks the size of a Regex from 856 bytes to 16 bytes. Interestingly, prior to regex 1.4.4, a Regex was still quite substantial in size, coming in at around 552 bytes. So it looks like the 1.4.4 release didn't dramatically increase it, but it increased it enough that folks started experiencing real problems: stack overflows. Fixes #750, Fixes #751 Ref servo/servo#28269
Every use of RegexSet in https://github.com/rust-lang/rust-bindgen/blob/dedbea5bc0317be4e7fd47a49392a6b080f47ac8/src/lib.rs#L1592 stores a regex::RegexSet inline, and BindgenOptions is stored in https://github.com/rust-lang/rust-bindgen/blob/dedbea5bc0317be4e7fd47a49392a6b080f47ac8/src/lib.rs#L2050, and that's stored on the stack. |
This commit fixes a fairly large regression in the stack size of a Regex introduced in regex 1.4.4. When I dropped thread_local and replaced it with Pool, it turned out that Pool inlined a T into its struct and a Regex in turn had Pool inlined into itself. It further turns out that the T=ProgramCache is itself quite large. We fix this by introducing an indirection in the inner regex type. That is, we use a Box<Pool> instead of a Pool. This shrinks the size of a Regex from 856 bytes to 16 bytes. Interestingly, prior to regex 1.4.4, a Regex was still quite substantial in size, coming in at around 552 bytes. So it looks like the 1.4.4 release didn't dramatically increase it, but it increased it enough that folks started experiencing real problems: stack overflows. Since indirection can lead to worse locality and performance loss, I did run the benchmark suite. I couldn't see any measurable difference. This is generally what I would expect. This is an indirection at a fairly high level. There's lots of other indirection already, and this indirection isn't accessed in a hot path. (The regex cache itself is of course used in hot paths, but by the time we get there, we have already followed this particular pointer.) We also include a regression test that asserts a Regex (and company) are 16 bytes in size. While this isn't an API guarantee, it at least means that increasing the size of Regex will be an intentional thing in the future and not an accidental leakage of implementation details. Fixes #750, Fixes #751 Ref servo/servo#28269
Still though, if we say a Regex was taking 1KB on the stack and there are ~couple dozen regexes on the stack, then that's still nowhere near a stack limit of, say, 1MB as found above on Windows. (I only remark on this for the sake of saying that a gap is still missing in my mental model I think.) |
This commit fixes a fairly large regression in the stack size of a Regex introduced in regex 1.4.4. When I dropped thread_local and replaced it with Pool, it turned out that Pool inlined a T into its struct and a Regex in turn had Pool inlined into itself. It further turns out that the T=ProgramCache is itself quite large. We fix this by introducing an indirection in the inner regex type. That is, we use a Box<Pool> instead of a Pool. This shrinks the size of a Regex from 856 bytes to 16 bytes. Interestingly, prior to regex 1.4.4, a Regex was still quite substantial in size, coming in at around 552 bytes. So it looks like the 1.4.4 release didn't dramatically increase it, but it increased it enough that folks started experiencing real problems: stack overflows. Since indirection can lead to worse locality and performance loss, I did run the benchmark suite. I couldn't see any measurable difference. This is generally what I would expect. This is an indirection at a fairly high level. There's lots of other indirection already, and this indirection isn't accessed in a hot path. (The regex cache itself is of course used in hot paths, but by the time we get there, we have already followed this particular pointer.) We also include a regression test that asserts a Regex (and company) are 16 bytes in size. While this isn't an API guarantee, it at least means that increasing the size of Regex will be an intentional thing in the future and not an accidental leakage of implementation details. Fixes #750, Fixes #751 Ref servo/servo#28269
This should be fixed in |
394: Bump regex from 1.4.4 to 1.4.5 r=pitkley a=dependabot[bot] Bumps [regex](https://github.com/rust-lang/regex) from 1.4.4 to 1.4.5. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/rust-lang/regex/blob/master/CHANGELOG.md">regex's changelog</a>.</em></p> <blockquote> <h1>1.4.5 (2021-03-14)</h1> <p>This is a small patch release that fixes a regression in the size of a <code>Regex</code> in the 1.4.4 release. Prior to 1.4.4, a <code>Regex</code> was 552 bytes. In the 1.4.4 release, it was 856 bytes due to internal changes. In this release, a <code>Regex</code> is now 16 bytes. In general, the size of a <code>Regex</code> was never something that was on my radar, but this increased size in the 1.4.4 release seems to have crossed a threshold and resulted in stack overflows in some programs.</p> <ul> <li>[BUG <a href="https://github.com/rust-lang/regex/issues/750">#750</a>](<a href="https://github.com/rust-lang/regex/pull/750">rust-lang/regex#750</a>): Fixes stack overflows seemingly caused by a large <code>Regex</code> size by decreasing its size.</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/rust-lang/regex/commit/ff283badce21dcebd581909d38b81f2c8c9bfb54"><code>ff283ba</code></a> 1.4.5</li> <li><a href="https://github.com/rust-lang/regex/commit/78c7cefbc9c1b06522a8ccbfb72cd630dec724e3"><code>78c7cef</code></a> impl: substantially reduce regex stack size</li> <li>See full diff in <a href="https://github.com/rust-lang/regex/compare/1.4.4...1.4.5">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=regex&package-manager=cargo&previous-version=1.4.4&new-version=1.4.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> 395: Bump paste from 1.0.4 to 1.0.5 r=pitkley a=dependabot[bot] Bumps [paste](https://github.com/dtolnay/paste) from 1.0.4 to 1.0.5. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/dtolnay/paste/releases">paste's releases</a>.</em></p> <blockquote> <h2>1.0.5</h2> <ul> <li>Work around conflict with extended_key_value_attributes (<a href="https://github.com/dtolnay/paste/issues/63">#63</a>)</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/dtolnay/paste/commit/b38636e38862f1df2c7b167073b022fcc8819a05"><code>b38636e</code></a> Release 1.0.5</li> <li><a href="https://github.com/dtolnay/paste/commit/59fcd8a18b363c7b267087df41ac74a2c03b1f3d"><code>59fcd8a</code></a> Merge pull request <a href="https://github.com/dtolnay/paste/issues/64">#64</a> from dtolnay/kv</li> <li><a href="https://github.com/dtolnay/paste/commit/e5e87d382e8e104dd1c090f75368ed49c2489bc6"><code>e5e87d3</code></a> Work around conflict with extended_key_value_attributes</li> <li><a href="https://github.com/dtolnay/paste/commit/f20a2930bc153449a9fae154eab22817f557e7b3"><code>f20a293</code></a> Add regression test for issue 63</li> <li><a href="https://github.com/dtolnay/paste/commit/c613e7acfe0eb8a3c0c0299101996c65012c401c"><code>c613e7a</code></a> Format with rustfmt 1.4.36-nightly</li> <li><a href="https://github.com/dtolnay/paste/commit/5c9f92a9ed9e3b694e4cedfbd5d2bdbe732b52fc"><code>5c9f92a</code></a> Move stuff not related to pasting out of method_new</li> <li><a href="https://github.com/dtolnay/paste/commit/73a54c0c7d493d186fbf3886215db2e5d7bd7cd8"><code>73a54c0</code></a> Make method_new example compilable</li> <li><a href="https://github.com/dtolnay/paste/commit/23f7facec46ece8e730316782a169a20a6946e85"><code>23f7fac</code></a> Fix catching clippy warnings as CI failures</li> <li><a href="https://github.com/dtolnay/paste/commit/c55f34310e74a90539024957d006d456d7912979"><code>c55f343</code></a> Resolve explicit_into_iter_loop clippy lint</li> <li><a href="https://github.com/dtolnay/paste/commit/1b8d824a9abe2275f2ba363c97a2459314ef793e"><code>1b8d824</code></a> Inform clippy of supported compiler version in clippy.toml</li> <li>Additional commits viewable in <a href="https://github.com/dtolnay/paste/compare/1.0.4...1.0.5">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=paste&package-manager=cargo&previous-version=1.0.4&new-version=1.0.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> 396: Bump libc from 0.2.88 to 0.2.89 r=pitkley a=dependabot[bot] Bumps [libc](https://github.com/rust-lang/libc) from 0.2.88 to 0.2.89. <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/rust-lang/libc/commit/508f65a2f87968f14a2b815f1e5537f08c504907"><code>508f65a</code></a> Auto merge of <a href="https://github.com/rust-lang/libc/issues/2112">#2112</a> - joshtriplett:eighty-nine, r=JohnTitor</li> <li><a href="https://github.com/rust-lang/libc/commit/1a359cff89ca2383a1461ae2c571b2e3b5aa3d51"><code>1a359cf</code></a> Bump to 0.2.89</li> <li><a href="https://github.com/rust-lang/libc/commit/44abe4b82dd19d63e7b2c63578b51b6e76842cd1"><code>44abe4b</code></a> Auto merge of <a href="https://github.com/rust-lang/libc/issues/2111">#2111</a> - voidc:clone3-musl, r=JohnTitor</li> <li><a href="https://github.com/rust-lang/libc/commit/3e0e58521acef559cda667bad2880a3854a55c1d"><code>3e0e585</code></a> Ignore SYS_clone3 during musl tests</li> <li><a href="https://github.com/rust-lang/libc/commit/efe0fe9499f5048b7975e806e22ec8079dca7624"><code>efe0fe9</code></a> Add SYS_clone3 for musl targets</li> <li><a href="https://github.com/rust-lang/libc/commit/a91546f20fe9f558df33e2cd8b6b24554cf91463"><code>a91546f</code></a> Auto merge of <a href="https://github.com/rust-lang/libc/issues/2110">#2110</a> - GuillaumeGomez:cpu-state, r=JohnTitor</li> <li><a href="https://github.com/rust-lang/libc/commit/fbba2bb34c6e977acccd5b945e5773257c8166f0"><code>fbba2bb</code></a> Add CPU_STATE_* constants</li> <li><a href="https://github.com/rust-lang/libc/commit/9bd88759e0bfa68e19cd88028f9d7ee422aca615"><code>9bd8875</code></a> Auto merge of <a href="https://github.com/rust-lang/libc/issues/2106">#2106</a> - coolreader18:uinput-structs, r=coolreader18</li> <li><a href="https://github.com/rust-lang/libc/commit/ed45c2649b848bc2df48ba10d68194d45da03b75"><code>ed45c26</code></a> Move some structs to s_no_extra_traits due to large arrays</li> <li><a href="https://github.com/rust-lang/libc/commit/3f6b151eed76dada19da7bcbc3d382a5366dd14e"><code>3f6b151</code></a> Skip uinput tests on musl+mips+ppc64+sparc64</li> <li>Additional commits viewable in <a href="https://github.com/rust-lang/libc/compare/0.2.88...0.2.89">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=libc&package-manager=cargo&previous-version=0.2.88&new-version=0.2.89)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
What version of regex are you using?
If it isn't the latest version, then please upgrade and check whether the bug
is still present.
Describe the bug at a high level.
1.4.4 breaks windows build of grpcio.
What are the steps to reproduce the behavior?
Just run
cargo test --all
on Windows.What is the actual behavior?
Build fails when generating bindings using rust-bindgen, which uses regex to process sources.
When downgrade regex back to before e040c1b, the build can finish successfully, so the bug is probably introduced by #749.
What is the expected behavior?
It compiles successfully.
What do you expect the output to be?
The text was updated successfully, but these errors were encountered: