Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider removing --check-bounds=no? #48245

Open
Keno opened this issue Jan 11, 2023 · 51 comments
Open

Consider removing --check-bounds=no? #48245

Keno opened this issue Jan 11, 2023 · 51 comments

Comments

@Keno
Copy link
Member

Keno commented Jan 11, 2023

I think we should consider removing the --check-bounds=no option.
I can't really think of any situation in which it would be safe or sensible to turn it on.
If you really must have something like it, a Cassette pass that disables bounds checking
in a particular region of code, could achieve any residual benefits without being as massive a footgun.
Worse, turning on --check-bounds=no on current master can actually result in significantly
worse performance because it removes inference's ability to do concrete evaluation (constant folding).
Note that that we can make the option a noop in a minor release since throwing a BoundsError
is allowable undefined behavior.

EDIT: note for future readers that the recommended replacement is to mark @inbounds code that is discovered to benefit from this flag from @profile analysis.

@Keno Keno added the triage This should be discussed on a triage call label Jan 11, 2023
@matthias314
Copy link
Contributor

Here is a case where I regularly use --check-bounds=no: I write a program that does some computation, say based on some integer parameter n. The larger n is, the longer the program runs. I want to get the result for as large a value of n as possible. I run the program for small n with bounds checking turned on to make sure everything works. Then I turn bounds checking off to push n as high as possible. That's quick and easy and does the trick for me.
(This is of course for code that I use myself, not for packages released to the public.)

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jan 14, 2023

Just to be clear, the reason we want to remove it is it requires us to compile the code more conservatively, leading to significant losses in inference accuracy, leading to significant losses of performance when running with check-bounds=no

@JeffBezanson
Copy link
Sponsor Member

@matthias314 Out of curiosity, what kind of speedup do you get?

@gbaraldi
Copy link
Member

Triage discussed this in length and the conclusion was that issues that are present in --check-bounds=no are similarly present with @inbounds meaning that this is probably a bandaid.
One idea @oscardssmith proposed but wasn't sure of the feasibility is to refine the inbounds effect to not taint if we can prove that the acesses are always inbounds.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jan 19, 2023

That sounds easy to prove: just remove --check-bounds=no and the required property is proved exactly when expected, and more often than it could be proven under --check-bounds=no in the current system (e.g. we can eliminate taint on bounds more easily with check-bounds=yes/auto than with check-bounds=no. That is the statement of purpose for this issue.)

@matthias314
Copy link
Contributor

@JeffBezanson The speedup varies; often it is indeed not impressive.

My point was not so much about the effectiveness of the current implementation of --check-bounds=no, but about the general idea: Instead of sprinkling my code with @inbounds, @boundscheck and @propagate_inbounds, I find it easier not to worry about it when I write my code and then turn all checks off once I think my code is correct.

@LilithHafner
Copy link
Member

leading to significant losses of performance when running with check-bounds=no

Could we get a specific example of @inbounds and/or check-bounds=no decreasing performance? I feel like there should be an open issue for this but I'm having trouble finding either a gh issue or an example of this behavior.

@oscardssmith
Copy link
Member

oscardssmith commented Jan 20, 2023

julia> Base.@assume_effects :terminates_locally function f(s)
           t = 0.
           for i in 1:2^20
               for m in 1:length(s)
                   t += 1/(i + s[m])
               end
           end
           t
       end
f (generic function with 1 method)

julia> Base.@assume_effects :terminates_locally function f_inbounds(s)
           t = 0.
           for i in 1:2^20
               for m in 1:length(s)
                   @inbounds t += 1/(i + s[m])
               end
           end
           t
       end
f_inbounds (generic function with 1 method)

julia> g() = f((2,3,4))
g (generic function with 1 method)

julia> g_inbounds() = f_inbounds((2,3,4))
g_inbounds (generic function with 1 method)

julia> @btime g()
  0.861 ns (0 allocations: 0 bytes)
37.903821175206865

julia> @btime g_inbounds()
  2.683 ms (0 allocations: 0 bytes)
37.903821175206865

So a nice example where @inbounds a 3 million times regression.

@LilithHafner
Copy link
Member

Thanks! I figured out the reason I couldn't reproduce this is that I wasn't using the latest master.

@maleadt
Copy link
Member

maleadt commented Jan 23, 2023

Apparently there's GPU users that care about this, because GPUs are very sensitive to the branch-heavy code introduced by bounds checks (typically because it increases register pressure which hurts occupancy) and they are using unoptimized kernels that don't have the necessary @inbounds annotations. At the same time, --check-bounds=no is now (on 1.9) unusable to the because the regression in const-prop breaks static compilation, so that's not great.

@Keno Can you elaborate on the Cassette-like solution? What's a reasonable way to implement this for GPUCompiler's abstract interpreter without the const-prop regressions?

@omlins
Copy link

omlins commented Jan 23, 2023

The feature check-bounds=no is one of the top beautiful things of Julia in my opinion: it allows to develop code with the benefit of bounds checking, and once everything works as it should, with a single switch we can remove the bounds checking not necessary anymore to have it run faster. I think there is a large community, in particular domain scientists (who in the end are the probably largest part of the target users) that will strongly appreciate if this feature can be conserved. :)

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jan 23, 2023

The biggest security flaw in C come from the fact it does not have bounds checking enabled in releases.

@omlins
Copy link

omlins commented Jan 23, 2023

Bounds checking can naturally drastically impact performance as we can see running the following example The performance achieved without bounds checking is here over three times higher than with bounds checking activated (executed on a P100 GPU):

omlins@nid02356:~/tmpwdir/ParallelStencil/examples> julia -O3 --check-bounds=no --math-mode=fast diffusion2D_shmem_novis.jl
time_s=1.1297500133514404 t_it=0.012552777926127115 T_eff=513.2291021090084
Benchmarktools (min): t_it=0.012316079 T_eff=523.0926940302998
omlins@nid02356:~/tmpwdir/ParallelStencil/examples> julia -O3 --math-mode=fast diffusion2D_shmem_novis.jl
time_s=3.5698001384735107 t_it=0.03966444598303901 T_eff=162.42382275438484
Benchmarktools (min): t_it=0.039542728 T_eff=162.9237857337511

@KristofferC
Copy link
Sponsor Member

I think the argument is to use @inbounds where it matters which prevents making your whole application vulnerable to index mistakes with --check-bounds=false.

@omlins
Copy link

omlins commented Jan 23, 2023

Bounds checking can naturally drastically impact performance as we can see running the following example The performance achieved without bounds checking is here over three times higher than with bounds checking activated (executed on a P100 GPU):

omlins@nid02356:~/tmpwdir/ParallelStencil/examples> julia -O3 --check-bounds=no --math-mode=fast diffusion2D_shmem_novis.jl
time_s=1.1297500133514404 t_it=0.012552777926127115 T_eff=513.2291021090084
Benchmarktools (min): t_it=0.012316079 T_eff=523.0926940302998
omlins@nid02356:~/tmpwdir/ParallelStencil/examples> julia -O3 --math-mode=fast diffusion2D_shmem_novis.jl
time_s=3.5698001384735107 t_it=0.03966444598303901 T_eff=162.42382275438484
Benchmarktools (min): t_it=0.039542728 T_eff=162.9237857337511

This example shows that bounds checking drastically impacts performance for high performance (GPU) applications. As a result, when running scientific high performance code, we do not want a single bounds check to happen . In the same time, while developing a scientific high performance code, we would like bounds check everywhere.
Thus, we do not want to add @inbounds statements into our code in order to be able to develop conveniently with bounds checking; at run time however, we need to deactivate the bounds checking everywhere: the global switch --check-bounds=no which has been available until now` has been a perfect solution.

@maleadt
Copy link
Member

maleadt commented Jan 23, 2023

when running scientific high performance code, we do not want to have a single bounds check to happen

The point is that your entire session shouldn't be running under --check-bounds=no, or a single mistake can kill your it. You apparently want GPU code to run without bounds checks (even though it would still be better to properly optimize your kernels using appropriate @inbounds annotations), so the option should be narrowed to GPU code generation, which as I mentioned already is something that could be added to e.g. @cuda calls, or as an env var to GPUCompiler.jl.

@omlins
Copy link

omlins commented Jan 23, 2023

The point is that your entire session shouldn't be running under --check-bounds=no, or a single mistake can kill your it.

At the moment we run the code without bounds checks (meaning for a production run), there will be no errors almost 100% of the time.

so the option should be narrowed to GPU code generation

That could be a good solution indeed, if bounds checking would not impact performance of high performance cpu code. However, running the same code on cpu we can already see that this is not true (problem size: 1024^2):

omlins@nid02061:~/tmpwdir/ParallelStencil/examples> julia --threads=12 -O3 --check-bounds=no --math-mode=fast diffusion2D_shmem_novis.jl
time_s=0.032401084899902344 t_it=0.0003600120544433594 T_eff=69.90272600430198
Benchmarktools (min): t_it=0.000321402 T_eff=78.30014747885825
omlins@nid02061:~/tmpwdir/ParallelStencil/examples> julia --threads=12 -O3 --math-mode=fast diffusion2D_shmem_novis.jl
time_s=0.044291019439697266 t_it=0.0004921224382188586 T_eff=51.137322839988364
Benchmarktools (min): t_it=0.000453692 T_eff=55.46896132177777

@Seelengrab
Copy link
Contributor

Seelengrab commented Jan 24, 2023

Any indiscriminate use of --check-bounds=no is just as dangerous as just throwing @inbounds on a loop "to make things go fast".

Any use of @inbounds should always be accompanied by a matching @boundscheck before the use, as well as a @propagate_inbounds to allow the removal of that @boundscheck. If any of those pieces are missing, you're going to run into either correctness issues or suboptimal performance. The argument that @inbounds and @boundscheck should not be necessary because the compiler should be able to figure it out is in principle good, but falls apart due to the fact that those annotations are precisely used when the compiler can't figure them out on its own (if we had such a perfect compiler, we'd do all computation at compile time!). Thus, what @inbounds communicates from the developer to the compiler is that you have a mathematical proof that all indexing operations in the block are valid (with the caveat that the code is too concise to communicate that proof to the compiler).

Personally, I'd prefer enforcing the use of @boundscheck when using @inbounds, but since that makes currently working code error, it could be a deprecation at best.

In regards to GPU code (or any code, really) - it seems much more safe to me to use @inbounds and @boundscheck and set --check-bounds=yes during development, than use --check-bounds=no in production. The former will still catch accidental out-of-bounds accesses in dependencies in production instead of introducing correctness (or even security) issues, while the latter will not.


In case it's not clear - I'm in favour of removing --check-bounds=no, while also making the compiler better at interacting with explicit @inbounds and (possibly in a future release, if at all) requiring the use of @boundscheck paired with @inbounds. To bring another point of contention over from triage - the observed behavior of constant folded vs. "interpreted" julia code should not differ at all.

@sloede
Copy link
Contributor

sloede commented Jan 24, 2023

Being able to do development with memory safety and then easily switch to faster performance using a simple command line option is literally one of the best and most used features for me as an HPC user. When talking to colleagues from the scientific computing community, maybe the single strongest argument favor of Julia is that it takes away nothing of your performance while making it it much easier to work and prototype with.

Another major points I bring up when advocating Julia is that it makes great strides in solving the two-language problem. By default, everything is memory safe and still reasonably fast. But when necessary, it's as simple as flipping a single switch and you get performance on par with Fortran/C++, and that's for real production code, not just academic setups.
IMHO, this argument would take a serious blow since in the future it means you will have the two languages inside Julia:
The regular, inexperienced-user-friendly and non-optimized code, and the super high performance code with @inbounds and @boundscheck etc. at all the right places. Maybe there will be an array type someone comes up with that will automatically allow switching to @inbounds? And hello numjl, your friendly in-Julia DSL modelled after numpy...

In addition, you now have to annotate (or better yet: first test, then annotate, since no premature optimization) each loop that might be remotely performance critical. I did a quick check; in our code base for Trixi.jl, we have around 2000 for loops. Given that some of them are not performance relevant and some of them are nested, we still need to check ~1000 places and verify if and how they could benefit from manual optimizations. Compare that with the ease of just writing --check-bounds=no and that's it. Yes, it is not a big issue to manually optimize everything for small to medium sized packages, but compare that to production codes with tens to hundreds of thousands of lines.

Finally, here are some numbers. For a production run with Trixi.jl, I see a strong influence of --check-bounds=no on the performance (5 hottest kernels; all numbers per iteration):

--check-bounds=no Regular Impact
calc_volume_integral! 121ms 122ms 0%
calc_interface_flux! 66.9ms 120ms +79%
calc_surface_integral! 23.7ms 46.1ms +95%
prolong2interfaces! 16.4ms 33.9ms +100%
apply_jacobian! 6.66ms 7.78ms +17%
... ... ...
Overall 240ms 336ms +40%

Thus while the effect of disabling bounds checking varies, there is a significant impact on many core algorithms.


TL;DR At the current state, I think removing --check-bounds=no is not the best idea. It would cause many HPC codes written in Julia over the past years to immediately cease to be competitive with C++/Fortran codes, unless hand-optimizing all loops again, taking away one of the major benefits (and selling points) Julia currently has. Finally, at least in our case --check-bounds=no has a significant impact on performance (30% reduction in overall runtime) and is thus always used where performance is critical, e.g., production runs on a large number of processors.

@KristofferC
Copy link
Sponsor Member

Any indiscriminate use of --check-bounds=no is just as dangerous as just throwing @inbounds on a loop "to make things go fast".

Clearly not, since an @inbounds is restricted to the loop you apply it to while the --check-bounds=no is applied to the whole process.

@vchuravy
Copy link
Member

I am sympathetic to the performance argument, but the actual issue here is that --check-bounds=no turns a correct program into an incorrect program.

Local annotations have the benefit that they only require local knowledge to reason about them, whereas global flags require global knowledge (in my opinion impossible to obtain).
As an example we could add a mode to Julia that turns all exceptions off (a generalized check-bounds=no). One sometimes comes across some code that uses exception for control-flow (a style that is frowned upon, but nonetheless legal), we even had code like that in Base... Which would mean that turning exceptions of "globally" would have lead to dead-locking Julia programs.

Now out-of-bounds exceptions are hopefully not used like that, but --check-bounds=no can turn correct programs into incorrect programs.

In Julia we have exposed local options to control unsafe behaviours (like fast-math or opting out of bounds-checking) and as others have pointed out @inbounds applies to blocks, for-loops and even functions and is not to onerous to use.

@sloede
Copy link
Contributor

sloede commented Jan 24, 2023

@vchuravy I see your point about fine-grained control and its benefits. On the other hand, you lose a big part of what makes Julia currently very flexible. Now the same code can be memory safe or fast, without the user having to make a conscious decision in each (potentially) performance critical section. Why would you give up this awesome feature and, one of the strongest selling points of Julia in a world where you compete with established, fast code languages like C++/Fortran or slow, rapid prototyping langues like Python? I feel like this would instead lead Julia more towards Python, where you can be just as fast as Julia if you restrict yourself to loops that are amenable to @njit/Numba (with @inbounds everywhere, it will even look similar 😬).

And again, adding @inbounds whereever it makes sense from a performance perspective is not a true solution to our misgivings (as I believe was pointed out by someone else above). It is exactly that we do need this ability to have bounds checking enabled by default (for rapid prototyping), with the option to quickly switch to "production mode".

If the main argument in favor is that users abuse the --check-bounds=no option and wreaking havoc on the Julia support channels, I would classify this as an educational (or people-related) issue and not a technical issue. Therefore, imho the correct solution should be found in the instructional realm and not by applying a code patch. Otherwise, I strongly believe that the alternative will be a new package that exports a macro such as @dynamic_inbounds, with the ability to switch between the actual @inbounds and a no-op based on, e.g., Preferences.jl, just to get the old behavior back. This seems bound to cause a lot of new issues, i.e., just kicking the can of user confusion down the road 🤷

@vchuravy
Copy link
Member

Why would you give up this awesome feature and, one of the strongest selling points of Julia in a world where you compete with established, fast code languages like C++/Fortran or slow, rapid prototyping langues like Python?

For me --check-bounds=no has never been a selling point of Julia and I have never taught it in my performance engineering workshops. In contrast the local control a programmer has is for me a major advantage of Julia in contrast to C/C++ especially w.r.t fast-math. Over the years we have improved the compiler and we are now at a point where @inbounds is needed less and less, but I would expect a HPC application to use it, instead of relying on a global compiler/runtime flag.

The issue is that --check-bounds=no turns correct code into incorrect code. A second issue for me is the social one where the reliance of --check-bounds=no for performance, leads to a worse experience by default for users of your packages and of Julia. Julia is now "slow by default" and you have to tell your users to run Trixi.jl based applications with --check-bounds=no for best performance, instead of using the local mechanism and have a good user experience by default.

I could see a use for development where I would like to answer the question "would @inbounds help", but the current usage scenario you describe leads to a worse experience for everyone. User don't get fast code by default, and if they use --check-bounds=no they can't be sure that an "untested code-path" doesn't lead to a numerical corruption down the road and their climate simulation is wrong.

@williamfgc
Copy link

My two cents is to document the recommended way for developers to refactor existing code and to write code onwards no matter what the outcome is for the recommended change. Mostly to be sure I'm not giving advice on deprecated functionality as we provide tutorials for the HPC folks new to Julia. Thanks!

@Seelengrab
Copy link
Contributor

Clearly not, since an @inbounds is restricted to the loop you apply it to while the --check-bounds=no is applied to the whole process.

A single erroneous @inbounds can have just as disastrous consequences as whole program --check-bounds=no, the latter just creates vastly more opportunities for mistakes to become (in-)visible. Since all that is needed for an incorrect result is a single misuse, both are equally dangerous (just think of the OffsetArrays kerfuffle, spawned by incorrect use of @inbounds..).

Another major points I bring up when advocating Julia is that it makes great strides in solving the two-language problem. By default, everything is memory safe and still reasonably fast. But when necessary, it's as simple as flipping a single switch and you get performance on par with Fortran/C++, and that's for real production code, not just academic setups.

Now the same code can be memory safe or fast, without the user having to make a conscious decision in each (potentially) performance critical section.

In my experience, well written/idiomatic julia code is as fast (or sometimes beats) equivalent C/C++/Fortran code while retaining the safety features a high level language provides. Turning those safety features off and saying "look, without those safety features we are just as fast!" is, in my opinion, misrepresenting the strength of julia, of being able to be strictly better than the "old guard", further perpetuating the myth that the only way to go fast is to not have safe programs. This is not an either/or thing - we can, should (and ultimately MUST) have both at the same time.

Therefore, imho the correct solution should be found in the instructional realm and not by applying a code patch.

The correct way to deal with a tool that you can do nothing but cut yourself with if not held in exactly the right way (which doesn't exist here, as this is a global flag and I'm pretty sure you're not auditing all your dependencies for correct behavior) is not to tell people to only hold it in the exactly right way, it is to fix the tool so you can't cut yourself in the first place.

@PallHaraldsson
Copy link
Contributor

Could a compromise be made that you could disable --check-bounds=no at a module level (or alternatively modules disallow that by default, and you could opt into an exception, something you do until you know you've added enough @inbounds)? There is plenty of code that doesn't need to be fast (and/or isn't well tested), and such a module could still opt into @inbounds locally.

@omlins
Copy link

omlins commented Jan 26, 2023

--check-bounds=no and a global @inbounds are exactly the same thing

Thanks for pointing that out. This means then that there is, a priori, no problem with the implementation of --check-bounds=no and that the @inbounds implementation would need to be improved ( if possible?) to fix this constant folding and evaluation issue.

At the moment, the situation is therefore that @inbounds and as a consequence also --check-bounds=no can either improve or decrease performance, depending on the test case. So I believe it is pretty simple: if you have a properly tested application which does not rely on bounds checking for correctness (as discussed here), then you can use --check-bounds=no to improve its performance if it does so; else you don't use the flag. As @williamfgc highlighted, the flag is optional... and the Julia community in general appreciates the "we're all adults" philosophy.

use @inbounds locally where you know it to be safe (which is most certainly not true for your whole dependency chain!).

A possibility to disable the removal of bounds checking for modules where it is not needed would certainly be well appreciated.

@williamfgc
Copy link

A possibility to disable the removal of bounds checking for modules where it is not needed would certainly be well appreciated.

Any larger-than-local-array scope would certainly be appreciated. Hope that's what comes out of this discussion. There are scientific schemas with thousands of array variables (any atmospheric or mildly large experimental dataset, or even a multiphysics particle or mesh based simulation) adding @inbounds in every index access operation becomes a tough sell for Julia at these scales.

NASA used rigorous engineering practices to write code that is systemically safe, instead of burning compute power to rely on software testing for correctness of their Apollo Lander.

Tests at small scales actually saves a tremendous amount of compute power before launching at scale, CI is good preventive/predictive maintenance not available at that time. But I'd rather not get out of scope.

@omlins
Copy link

omlins commented Jan 26, 2023

Any larger-than-local-array scope would certainly be appreciated. Hope that's what comes out of this discussion.

To conclude the input from my side: people that currently rely on the feature --check-bounds=no are certainly open to (and see an benefit in) using somewhat less coarse grained mechanisms for the deactivation of bounds checking as long as they don't impede to write high performance code that is close to math notation. However, while such new coarse grain mechanism are not available, I would propose to keep the optional flag --check-bounds=no.

...and at that point I would like to say: thanks to all the people working on the julia compiler and the core language for their awesome work!

@Seelengrab
Copy link
Contributor

This means then that there is, a priori, no problem with the implementation of --check-bounds=no and that the @inbounds implementation would need to be improved ( if possible?) to fix this constant folding and evaluation issue.

No - this means that both are bad and if you have to use it, choose @inbounds over --check-bounds=no. Fixing the issue is vastly more likely to only happen in the context of @inbounds than globally (if it were possible to solve it globally, I'm quite certain Rust would not need unsafe).

A possibility to disable the removal of bounds checking for modules where it is not needed would certainly be well appreciated.

As pointed out above, this is not trivial, at all, due to bounds checking removal requiring inlining in the first place, which is very likely to cross module boundaries. This also comes with the problem of just removing bounds checking not necessarily meaning that the compiler can vectorize effectively, which is what's often needed/what people want to achieve with that in the first place (see e.g. eschnett/SIMD.jl#102).

adding @inbounds in every index access operation becomes a tough sell for Julia at these scales.

That is not what people in this thread are asking and would indeed be too cumbersome to use - instead, use @boundscheck at the start of your function for your bounds checking needs in your kernel/code, then have a big @inbounds begin block or @inbounds for or similar. It's also usually not necessary to use @inbounds in every indexing expression, since the compiler is in fact able to figure the inboundsness out in quite a lot of common patterns & cases (e.g. broadcasting). You can see an example of this pattern e.g. here in the Code dropdown.

@williamfgc
Copy link

williamfgc commented Jan 27, 2023

That is not what people in this thread are asking and would indeed be too cumbersome to use - instead, use...

Updating the documentation here with this info (and deprecation) would be appreciated, in particular for those new to Julia coming from Fortran, Matlab, Numpy, etc. and helps us in trying to convince people to adopt Julia. Thanks everyone for the great work and discussion.

Edit: my motivation is that we are preparing tutorials for HPC crowds so we want to point them in the right direction moving forward @vchuravy

@timholy
Copy link
Sponsor Member

timholy commented Jan 28, 2023

@williamfgc not sure if you're talking about the documentation for existing Julia versions or you're just asking about making sure the documentation gets updated if there are changes, but assuming the former: it's easy to improve the docs yourself (for simple changes you can just edit in your browser), and often it's better if the person who sees the gap is the one who tries to close it. (To me the docs look just fine, I wouldn't know what to change, but since I already know how it works I'm sure I'm not seeing them from the right perspective.)

@williamfgc
Copy link

williamfgc commented Jan 28, 2023

@timholy more like bringing awareness that docs would become soon outdated if this flag is removed (it's already broken) and block-style @inbounds begin...end (edit: and other helpful situations) are not mentioned. I am more thinking about those new to Julia reading the doc for the first time and want to code moving past v1.8 as this is a breaking change. Happy to contribute and thanks for pointing this out.

@chriselrod
Copy link
Contributor

chriselrod commented Apr 24, 2023

Thus, we do not want to add @inbounds statements into our code in order to be able to develop conveniently with bounds checking; at run time however, we need to deactivate the bounds checking everywhere: the global switch --check-bounds=no which has been available until now` has been a perfect solution.

What about the opposite: develop using --check-bounds=yes, and then run with auto?

Discussion above mostly focused on this being less pretty.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Apr 24, 2023

That is recommended

aviatesk added a commit that referenced this issue Jun 8, 2023
From version 1.9 onwards, when `--check-bounds=no` is used,
concrete-eval is completely disabled. However, it appears
`--check-bounds=no` is still being used within the community, causing
issues like the one reported in JuliaArrays/StaticArrays.jl#1155.
Although we should move forward to a direction of eliminating the flag
in the future (#48245), for the time being, there are many requests to
carry out a certain level of compiler optimization, even when this flag
is enabled.

This commit aims to allow concrete-eval "safely" even under
`--check-bounds=no`. Specifically, when the method call being analyzed
is `:nothrow`, it should be predominantly safe to concrete-eval it under
this flag. Technically, however, even `:nothrow` methods could trigger
undefined behavior, since `:nothrow` isn't a strict constraint and it's
possible for users to annotate potentially risky methods with
`Base.@assume_effects :nothrow`. Nonetheless, since this possibility is
acknowledged in `Base.@assume_effects` documentation, I feel it's fair
to relegate it to user responsibility.
vchuravy pushed a commit that referenced this issue Jun 12, 2023
#50107)

From version 1.9 onwards, when `--check-bounds=no` is used,
concrete-eval is completely disabled. However, it appears
`--check-bounds=no` is still being used within the community, causing
issues like the one reported in JuliaArrays/StaticArrays.jl#1155.
Although we should move forward to a direction of eliminating the flag
in the future (#48245), for the time being, there are many requests to
carry out a certain level of compiler optimization, even when this flag
is enabled.

This commit aims to allow concrete-eval "safely" even under
`--check-bounds=no`. Specifically, when the method call being analyzed
is `:nothrow`, it should be predominantly safe to concrete-eval it under
this flag. Technically, however, even `:nothrow` methods could trigger
undefined behavior, since `:nothrow` isn't a strict constraint and it's
possible for users to annotate potentially risky methods with
`Base.@assume_effects :nothrow`. Nonetheless, since this possibility is
acknowledged in `Base.@assume_effects` documentation, I feel it's fair
to relegate it to user responsibility.
aviatesk added a commit that referenced this issue Jun 14, 2023
In the current state of the Julia compiler, bounds checking and its
related optimization code, such as `@boundscheck` and `@inbounds`, pose
a significant handicap for effect analysis. As a result, we're
encountering an ironic situation where the application of `@inbounds`
annotations, which are intended to optimize performance, instead
obstruct the program's optimization, thereby preventing us from
achieving optimal performance.

This PR is designed to resolve this situation. It aims to enhance the
relationship between bounds checking and effect analysis, thereby
correctly improving the performance of programs that have `@inbounds`
annotations.

In the following, I'll first explain the reasons that have led to this
situation for better understanding, and then I'll present potential
improvements to address these issues. This commit is a collection of
various improvement proposals. It's necessary that we incorporate all
of them simultaneously to enhance the situation without introducing
any regressions.

\## Core of the Problem

There are fundamentally two reasons why effect analysis of code
containing bounds checking is difficult:
1. The evaluation value of `Expr(:boundscheck)` is influenced by the
  `@inbounds` macro and the `--check-bounds` flag. Hence, when
  performing a concrete evaluation of a method containing
  `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro
  context and the `--check-bounds` settings, ensuring the method's
  behavior is consistent across the compile time concrete evaluation
  and the runtime execution.
1. If the code, from which bounds checking has been removed due to
  `@inbounds` or `--check-bounds=no`, is unsafe, it may lead to
  undefined behavior due to uncertain memory access.

\## Current State

The current Julia compiler handles these two problems as follows:

\### Current State 1

Regarding the first problem, if a code or method call containing
`Expr(:boundscheck)` is within an `@inbounds` context, a concrete
evaluation is immediately prohibited. For instance, in the following
case, when analyzing `bar()`, if you simply perform concrete evaluation
of `foo()`, it wouldn't properly respect the `@inbounds` context present
in `bar()`. However, since the concrete evaluation of `foo()` is
prohibited, it doesn't pose an issue:
```julia
foo() = (r = 0; @BoundsCheck r += 1; return r)

bar() = @inbounds foo()
```

Conversely, in the following case, there is _no need_ to prohibit the
concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`.
This is because the execution of the `@boundscheck` block is determined
by the presence of local `@inbounds`:
```julia
function A1_inbounds()
    r = 0
    @inbounds begin
        @BoundsCheck r += 1
    end
    return r
end
```

However, currently, we prohibit the concrete evaluation of such code as
well. Moreover, we are not handling such local `@inbounds` contexts
effectively, which results in incorrect execution of `A1_inbounds()`
(even our test is incorrect for this example:
<https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34>).

Furthermore, there is room for improvement when the `--check-bounds`
flag is specified. Specifically, when the `--check-bounds` flag is set,
the evaluation value of `Expr(:boundscheck)` is determined irrespective
of the `@inbounds` context. Hence, there is no need to prohibit concrete
evaluations due to inconsistency in the evaluation value of
`Expr(:boundscheck)`.

\### Current State 2

Next, we've ensured that concrete evaluation isn't performed when
there's potentially unsafe code that may have bounds checking removed,
or when the `--check-bounds=no` flag is set, which could lead to bounds
checking being removed always.
For instance, if you perform concrete evaluation for the function call
`baz((1,2,3), 4)` in the following example, it may return a value
accessed from illicit memory and introduce undefined behaviors into the
program:
```julia
baz(t::Tuple, i::Int) = @inbounds t[i]

baz((1,2,3), 4)
```

However, it's evident that the above code is incorrect and unsafe
program and I believe undefined behavior in such programs is deemed,
as explicitly stated in the `@inbounds` documentation:

> │ Warning
> │
> │  Using @inbounds may return incorrect results/crashes/corruption for
> │  out-of-bounds indices. The user is responsible for checking it
> │  manually. Only use @inbounds when it is certain from the information
> │  locally available that all accesses are in bounds.

Actually, the `@inbounds` macro is primarily an annotation to
"improve performance by removing bounds checks from safe programs".
Therefore, I opine that it would be more reasonable to utilize it to
alleviate potential errors due to bounds checking within `@inbounds`
contexts.

To bring up another associated concern, in the current compiler
implementation, the `:nothrow` modelings for `getfield`/`arrayref`/`arrayset`
is a bit risky, and `:nothrow`-ness is assumed when their bounds checking
is turned off by call argument.
If our intended direction aligns with the removal of bounds checking
based on `@inbounds` as proposed in issue #48245, then assuming
`:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming
`:nothrow`-ness due to bounds checking argument or the `--check-bounds`
flag appears to be risky, especially considering it's not documented.

\## This Commit

This commit implements all proposed improvements against the current
issues as mentioned above. In summary, the enhancements include:
- allowing concrete evaluation within a local `@inbounds` context
- folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set
  (and allow concrete evaluation)
- changing the `:nothrow` effect bit to `UInt8` type, and refining
  `:nothrow` information when in an `@inbounds` context
- removing dangerous assumptions of `:nothrow`-ness for built-in
  functions when bounds checking is turned off
- replacing the `@_safeindex` hack with `@inbounds`
aviatesk added a commit that referenced this issue Jun 14, 2023
In the current state of the Julia compiler, bounds checking and its
related optimization code, such as `@boundscheck` and `@inbounds`, pose
a significant handicap for effect analysis. As a result, we're
encountering an ironic situation where the application of `@inbounds`
annotations, which are intended to optimize performance, instead
obstruct the program's optimization, thereby preventing us from
achieving optimal performance.

This PR is designed to resolve this situation. It aims to enhance the
relationship between bounds checking and effect analysis, thereby
correctly improving the performance of programs that have `@inbounds`
annotations.

In the following, I'll first explain the reasons that have led to this
situation for better understanding, and then I'll present potential
improvements to address these issues. This commit is a collection of
various improvement proposals. It's necessary that we incorporate all
of them simultaneously to enhance the situation without introducing
any regressions.

\## Core of the Problem

There are fundamentally two reasons why effect analysis of code
containing bounds checking is difficult:
1. The evaluation value of `Expr(:boundscheck)` is influenced by the
  `@inbounds` macro and the `--check-bounds` flag. Hence, when
  performing a concrete evaluation of a method containing
  `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro
  context and the `--check-bounds` settings, ensuring the method's
  behavior is consistent across the compile time concrete evaluation
  and the runtime execution.
1. If the code, from which bounds checking has been removed due to
  `@inbounds` or `--check-bounds=no`, is unsafe, it may lead to
  undefined behavior due to uncertain memory access.

\## Current State

The current Julia compiler handles these two problems as follows:

\### Current State 1

Regarding the first problem, if a code or method call containing
`Expr(:boundscheck)` is within an `@inbounds` context, a concrete
evaluation is immediately prohibited. For instance, in the following
case, when analyzing `bar()`, if you simply perform concrete evaluation
of `foo()`, it wouldn't properly respect the `@inbounds` context present
in `bar()`. However, since the concrete evaluation of `foo()` is
prohibited, it doesn't pose an issue:
```julia
foo() = (r = 0; @BoundsCheck r += 1; return r)

bar() = @inbounds foo()
```

Conversely, in the following case, there is _no need_ to prohibit the
concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`.
This is because ~~the execution of the `@boundscheck` block is determined
by the presence of local `@inbounds`~~ `Expr(:boundscheck)` within a
local `@inbounds` context does not need to block concrete evaluation:
```julia
function A1_inbounds()
    r = 0
    @inbounds begin
        @BoundsCheck r += 1
    end
    return r
end
```

However, currently, we prohibit the concrete evaluation of such code as
well. ~~Moreover, we are not handling such local `@inbounds` contexts
effectively, which results in incorrect execution of `A1_inbounds()`
(even our test is incorrect for this example:
<https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34>)~~
EDIT It was an expected behavior as pointed out by Jameson.

Furthermore, there is room for improvement when the `--check-bounds`
flag is specified. Specifically, when the `--check-bounds` flag is set,
the evaluation value of `Expr(:boundscheck)` is determined irrespective
of the `@inbounds` context. Hence, there is no need to prohibit concrete
evaluations due to inconsistency in the evaluation value of
`Expr(:boundscheck)`.

\### Current State 2

Next, we've ensured that concrete evaluation isn't performed when
there's potentially unsafe code that may have bounds checking removed,
or when the `--check-bounds=no` flag is set, which could lead to bounds
checking being removed always.
For instance, if you perform concrete evaluation for the function call
`baz((1,2,3), 4)` in the following example, it may return a value
accessed from illicit memory and introduce undefined behaviors into the
program:
```julia
baz(t::Tuple, i::Int) = @inbounds t[i]

baz((1,2,3), 4)
```

However, it's evident that the above code is incorrect and unsafe
program and I believe undefined behavior in such programs is deemed,
as explicitly stated in the `@inbounds` documentation:

> │ Warning
> │
> │  Using @inbounds may return incorrect results/crashes/corruption for
> │  out-of-bounds indices. The user is responsible for checking it
> │  manually. Only use @inbounds when it is certain from the information
> │  locally available that all accesses are in bounds.

Actually, the `@inbounds` macro is primarily an annotation to
"improve performance by removing bounds checks from safe programs".
Therefore, I opine that it would be more reasonable to utilize it to
alleviate potential errors due to bounds checking within `@inbounds`
contexts.

To bring up another associated concern, in the current compiler
implementation, the `:nothrow` modelings for `getfield`/`arrayref`/`arrayset`
is a bit risky, and `:nothrow`-ness is assumed when their bounds checking
is turned off by call argument.
If our intended direction aligns with the removal of bounds checking
based on `@inbounds` as proposed in issue #48245, then assuming
`:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming
`:nothrow`-ness due to bounds checking argument or the `--check-bounds`
flag appears to be risky, especially considering it's not documented.

\## This Commit

This commit implements all proposed improvements against the current
issues as mentioned above. In summary, the enhancements include:
- making `Expr(:boundscheck)` within a local `@inbounds` context not
  block concrete evaluation
- folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set
  (and allow concrete evaluation)
- changing the `:nothrow` effect bit to `UInt8` type, and refining
  `:nothrow` information when in an `@inbounds` context
- removing dangerous assumptions of `:nothrow`-ness for built-in
  functions when bounds checking is turned off
- replacing the `@_safeindex` hack with `@inbounds`
aviatesk added a commit that referenced this issue Jun 15, 2023
In the current state of the Julia compiler, bounds checking and its
related optimization code, such as `@boundscheck` and `@inbounds`, pose
a significant handicap for effect analysis. As a result, we're
encountering an ironic situation where the application of `@inbounds`
annotations, which are intended to optimize performance, instead
obstruct the program's optimization, thereby preventing us from
achieving optimal performance.

This PR is designed to resolve this situation. It aims to enhance the
relationship between bounds checking and effect analysis, thereby
correctly improving the performance of programs that have `@inbounds`
annotations.

In the following, I'll first explain the reasons that have led to this
situation for better understanding, and then I'll present potential
improvements to address these issues. This commit is a collection of
various improvement proposals. It's necessary that we incorporate all
of them simultaneously to enhance the situation without introducing
any regressions.

\## Core of the Problem

There are fundamentally two reasons why effect analysis of code
containing bounds checking is difficult:
1. The evaluation value of `Expr(:boundscheck)` is influenced by the
  `@inbounds` macro and the `--check-bounds` flag. Hence, when
  performing a concrete evaluation of a method containing
  `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro
  context and the `--check-bounds` settings, ensuring the method's
  behavior is consistent across the compile time concrete evaluation
  and the runtime execution.
1. If the code, from which bounds checking has been removed due to
  `@inbounds` or `--check-bounds=no`, is unsafe, it may lead to
  undefined behavior due to uncertain memory access.

\## Current State

The current Julia compiler handles these two problems as follows:

\### Current State 1

Regarding the first problem, if a code or method call containing
`Expr(:boundscheck)` is within an `@inbounds` context, a concrete
evaluation is immediately prohibited. For instance, in the following
case, when analyzing `bar()`, if you simply perform concrete evaluation
of `foo()`, it wouldn't properly respect the `@inbounds` context present
in `bar()`. However, since the concrete evaluation of `foo()` is
prohibited, it doesn't pose an issue:
```julia
foo() = (r = 0; @BoundsCheck r += 1; return r)

bar() = @inbounds foo()
```

Conversely, in the following case, there is _no need_ to prohibit the
concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`.
This is because ~~the execution of the `@boundscheck` block is determined
by the presence of local `@inbounds`~~ `Expr(:boundscheck)` within a
local `@inbounds` context does not need to block concrete evaluation:
```julia
function A1_inbounds()
    r = 0
    @inbounds begin
        @BoundsCheck r += 1
    end
    return r
end
```

However, currently, we prohibit the concrete evaluation of such code as
well. ~~Moreover, we are not handling such local `@inbounds` contexts
effectively, which results in incorrect execution of `A1_inbounds()`
(even our test is incorrect for this example:
`https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34`)~~
EDIT: It is an expected behavior as pointed out by Jameson.

Furthermore, there is room for improvement when the `--check-bounds`
flag is specified. Specifically, when the `--check-bounds` flag is set,
the evaluation value of `Expr(:boundscheck)` is determined irrespective
of the `@inbounds` context. Hence, there is no need to prohibit concrete
evaluations due to inconsistency in the evaluation value of
`Expr(:boundscheck)`.

\### Current State 2

Next, we've ensured that concrete evaluation isn't performed when
there's potentially unsafe code that may have bounds checking removed,
or when the `--check-bounds=no` flag is set, which could lead to bounds
checking being removed always.
For instance, if you perform concrete evaluation for the function call
`baz((1,2,3), 4)` in the following example, it may return a value
accessed from illicit memory and introduce undefined behaviors into the
program:
```julia
baz(t::Tuple, i::Int) = @inbounds t[i]

baz((1,2,3), 4)
```

However, it's evident that the above code is incorrect and unsafe
program and I believe undefined behavior in such programs is deemed,
as explicitly stated in the `@inbounds` documentation:

> │ Warning
> │
> │  Using @inbounds may return incorrect results/crashes/corruption for
> │  out-of-bounds indices. The user is responsible for checking it
> │  manually. Only use @inbounds when it is certain from the information
> │  locally available that all accesses are in bounds.

Actually, the `@inbounds` macro is primarily an annotation to
"improve performance by removing bounds checks from safe programs".
Therefore, I opine that it would be more reasonable to utilize it to
alleviate potential errors due to bounds checking within `@inbounds`
contexts.

To bring up another associated concern, in the current compiler
implementation, the `:nothrow` modelings for `getfield`/`arrayref`/`arrayset`
is a bit risky, and `:nothrow`-ness is assumed when their bounds checking
is turned off by call argument.
If our intended direction aligns with the removal of bounds checking
based on `@inbounds` as proposed in issue #48245, then assuming
`:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming
`:nothrow`-ness due to bounds checking argument or the `--check-bounds`
flag appears to be risky, especially considering it's not documented.

\## This Commit

This commit implements all proposed improvements against the current
issues as mentioned above. In summary, the enhancements include:
- making `Expr(:boundscheck)` within a local `@inbounds` context not
  block concrete evaluation
- folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set
  (and allow concrete evaluation)
- changing the `:nothrow` effect bit to `UInt8` type, and refining
  `:nothrow` information when in an `@inbounds` context
- removing dangerous assumptions of `:nothrow`-ness for built-in
  functions when bounds checking is turned off
- replacing the `@_safeindex` hack with `@inbounds`
aviatesk added a commit that referenced this issue Jun 15, 2023
In the current state of the Julia compiler, bounds checking and its
related optimization code, such as `@boundscheck` and `@inbounds`, pose
a significant handicap for effect analysis. As a result, we're
encountering an ironic situation where the application of `@inbounds`
annotations, which are intended to optimize performance, instead
obstruct the program's optimization, thereby preventing us from
achieving optimal performance.

This PR is designed to resolve this situation. It aims to enhance the
relationship between bounds checking and effect analysis, thereby
correctly improving the performance of programs that have `@inbounds`
annotations.

In the following, I'll first explain the reasons that have led to this
situation for better understanding, and then I'll present potential
improvements to address these issues. This commit is a collection of
various improvement proposals. It's necessary that we incorporate all
of them simultaneously to enhance the situation without introducing
any regressions.

\## Core of the Problem

There are fundamentally two reasons why effect analysis of code
containing bounds checking is difficult:
1. The evaluation value of `Expr(:boundscheck)` is influenced by the
  `@inbounds` macro and the `--check-bounds` flag. Hence, when
  performing a concrete evaluation of a method containing
  `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro
  context and the `--check-bounds` settings, ensuring the method's
  behavior is consistent across the compile time concrete evaluation
  and the runtime execution.
1. If the code, from which bounds checking has been removed due to
  `@inbounds` or `--check-bounds=no`, is unsafe, it may lead to
  undefined behavior due to uncertain memory access.

\## Current State

The current Julia compiler handles these two problems as follows:

\### Current State 1

Regarding the first problem, if a code or method call containing
`Expr(:boundscheck)` is within an `@inbounds` context, a concrete
evaluation is immediately prohibited. For instance, in the following
case, when analyzing `bar()`, if you simply perform concrete evaluation
of `foo()`, it wouldn't properly respect the `@inbounds` context present
in `bar()`. However, since the concrete evaluation of `foo()` is
prohibited, it doesn't pose an issue:
```julia
foo() = (r = 0; @BoundsCheck r += 1; return r)

bar() = @inbounds foo()
```

Conversely, in the following case, there is _no need_ to prohibit the
concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`.
This is because ~~the execution of the `@boundscheck` block is determined
by the presence of local `@inbounds`~~ `Expr(:boundscheck)` within a
local `@inbounds` context does not need to block concrete evaluation:
```julia
function A1_inbounds()
    r = 0
    @inbounds begin
        @BoundsCheck r += 1
    end
    return r
end
```

However, currently, we prohibit the concrete evaluation of such code as
well. ~~Moreover, we are not handling such local `@inbounds` contexts
effectively, which results in incorrect execution of `A1_inbounds()`
(even our test is incorrect for this example:
`https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34`)~~
EDIT: It is an expected behavior as pointed out by Jameson.

Furthermore, there is room for improvement when the `--check-bounds`
flag is specified. Specifically, when the `--check-bounds` flag is set,
the evaluation value of `Expr(:boundscheck)` is determined irrespective
of the `@inbounds` context. Hence, there is no need to prohibit concrete
evaluations due to inconsistency in the evaluation value of
`Expr(:boundscheck)`.

\### Current State 2

Next, we've ensured that concrete evaluation isn't performed when
there's potentially unsafe code that may have bounds checking removed,
or when the `--check-bounds=no` flag is set, which could lead to bounds
checking being removed always.
For instance, if you perform concrete evaluation for the function call
`baz((1,2,3), 4)` in the following example, it may return a value
accessed from illicit memory and introduce undefined behaviors into the
program:
```julia
baz(t::Tuple, i::Int) = @inbounds t[i]

baz((1,2,3), 4)
```

However, it's evident that the above code is incorrect and unsafe
program and I believe undefined behavior in such programs is deemed,
as explicitly stated in the `@inbounds` documentation:

> │ Warning
> │
> │  Using @inbounds may return incorrect results/crashes/corruption for
> │  out-of-bounds indices. The user is responsible for checking it
> │  manually. Only use @inbounds when it is certain from the information
> │  locally available that all accesses are in bounds.

Actually, the `@inbounds` macro is primarily an annotation to
"improve performance by removing bounds checks from safe programs".
Therefore, I opine that it would be more reasonable to utilize it to
alleviate potential errors due to bounds checking within `@inbounds`
contexts.

To bring up another associated concern, in the current compiler
implementation, the `:nothrow` modelings for `getfield`/`arrayref`/`arrayset`
is a bit risky, and `:nothrow`-ness is assumed when their bounds checking
is turned off by call argument.
If our intended direction aligns with the removal of bounds checking
based on `@inbounds` as proposed in issue #48245, then assuming
`:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming
`:nothrow`-ness due to bounds checking argument or the `--check-bounds`
flag appears to be risky, especially considering it's not documented.

\## This Commit

This commit implements all proposed improvements against the current
issues as mentioned above. In summary, the enhancements include:
- making `Expr(:boundscheck)` within a local `@inbounds` context not
  block concrete evaluation
- folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set
  (and allow concrete evaluation)
- changing the `:nothrow` effect bit to `UInt8` type, and refining
  `:nothrow` information when in an `@inbounds` context
- removing dangerous assumptions of `:nothrow`-ness for built-in
  functions when bounds checking is turned off
- replacing the `@_safeindex` hack with `@inbounds`
aviatesk added a commit that referenced this issue Jun 15, 2023
In the current state of the Julia compiler, bounds checking and its
related optimization code, such as `@boundscheck` and `@inbounds`, pose
a significant handicap for effect analysis. As a result, we're
encountering an ironic situation where the application of `@inbounds`
annotations, which are intended to optimize performance, instead
obstruct the program's optimization, thereby preventing us from
achieving optimal performance.

This PR is designed to resolve this situation. It aims to enhance the
relationship between bounds checking and effect analysis, thereby
correctly improving the performance of programs that have `@inbounds`
annotations.

In the following, I'll first explain the reasons that have led to this
situation for better understanding, and then I'll present potential
improvements to address these issues. This commit is a collection of
various improvement proposals. It's necessary that we incorporate all
of them simultaneously to enhance the situation without introducing
any regressions.

\## Core of the Problem

There are fundamentally two reasons why effect analysis of code
containing bounds checking is difficult:
1. The evaluation value of `Expr(:boundscheck)` is influenced by the
  `@inbounds` macro and the `--check-bounds` flag. Hence, when
  performing a concrete evaluation of a method containing
  `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro
  context and the `--check-bounds` settings, ensuring the method's
  behavior is consistent across the compile time concrete evaluation
  and the runtime execution.
1. If the code, from which bounds checking has been removed due to
  `@inbounds` or `--check-bounds=no`, is unsafe, it may lead to
  undefined behavior due to uncertain memory access.

\## Current State

The current Julia compiler handles these two problems as follows:

\### Current State 1

Regarding the first problem, if a code or method call containing
`Expr(:boundscheck)` is within an `@inbounds` context, a concrete
evaluation is immediately prohibited. For instance, in the following
case, when analyzing `bar()`, if you simply perform concrete evaluation
of `foo()`, it wouldn't properly respect the `@inbounds` context present
in `bar()`. However, since the concrete evaluation of `foo()` is
prohibited, it doesn't pose an issue:
```julia
foo() = (r = 0; @BoundsCheck r += 1; return r)

bar() = @inbounds foo()
```

Conversely, in the following case, there is _no need_ to prohibit the
concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`.
This is because ~~the execution of the `@boundscheck` block is determined
by the presence of local `@inbounds`~~ `Expr(:boundscheck)` within a
local `@inbounds` context does not need to block concrete evaluation:
```julia
function A1_inbounds()
    r = 0
    @inbounds begin
        @BoundsCheck r += 1
    end
    return r
end
```

However, currently, we prohibit the concrete evaluation of such code as
well. ~~Moreover, we are not handling such local `@inbounds` contexts
effectively, which results in incorrect execution of `A1_inbounds()`
(even our test is incorrect for this example:
`https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34`)~~
EDIT: It is an expected behavior as pointed out by Jameson.

Furthermore, there is room for improvement when the `--check-bounds`
flag is specified. Specifically, when the `--check-bounds` flag is set,
the evaluation value of `Expr(:boundscheck)` is determined irrespective
of the `@inbounds` context. Hence, there is no need to prohibit concrete
evaluations due to inconsistency in the evaluation value of
`Expr(:boundscheck)`.

\### Current State 2

Next, we've ensured that concrete evaluation isn't performed when
there's potentially unsafe code that may have bounds checking removed,
or when the `--check-bounds=no` flag is set, which could lead to bounds
checking being removed always.
For instance, if you perform concrete evaluation for the function call
`baz((1,2,3), 4)` in the following example, it may return a value
accessed from illicit memory and introduce undefined behaviors into the
program:
```julia
baz(t::Tuple, i::Int) = @inbounds t[i]

baz((1,2,3), 4)
```

However, it's evident that the above code is incorrect and unsafe
program and I believe undefined behavior in such programs is deemed,
as explicitly stated in the `@inbounds` documentation:

> │ Warning
> │
> │  Using @inbounds may return incorrect results/crashes/corruption for
> │  out-of-bounds indices. The user is responsible for checking it
> │  manually. Only use @inbounds when it is certain from the information
> │  locally available that all accesses are in bounds.

Actually, the `@inbounds` macro is primarily an annotation to
"improve performance by removing bounds checks from safe programs".
Therefore, I opine that it would be more reasonable to utilize it to
alleviate potential errors due to bounds checking within `@inbounds`
contexts.

To bring up another associated concern, in the current compiler
implementation, the `:nothrow` modelings for `getfield`/`arrayref`/`arrayset`
is a bit risky, and `:nothrow`-ness is assumed when their bounds checking
is turned off by call argument.
If our intended direction aligns with the removal of bounds checking
based on `@inbounds` as proposed in issue #48245, then assuming
`:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming
`:nothrow`-ness due to bounds checking argument or the `--check-bounds`
flag appears to be risky, especially considering it's not documented.

\## This Commit

This commit implements all proposed improvements against the current
issues as mentioned above. In summary, the enhancements include:
- making `Expr(:boundscheck)` within a local `@inbounds` context not
  block concrete evaluation
- folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set
  (and allow concrete evaluation)
- changing the `:nothrow` effect bit to `UInt8` type, and refining
  `:nothrow` information when in an `@inbounds` context
- removing dangerous assumptions of `:nothrow`-ness for built-in
  functions when bounds checking is turned off
- replacing the `@_safeindex` hack with `@inbounds`
Keno added a commit that referenced this issue Jun 20, 2023
In 1.9, `--check-bounds=no` has started causing significant performance
regressions (e.g. #50110). This is because we switched a number of functions that
used to be `@pure` to new effects-based infrastructure, which very closely tracks
the the legality conditions for concrete evaluation. Unfortunately, disabling
bounds checking completely invalidates all prior legality analysis, so the only
realistic path we have is to completely disable it.

In general, we are learning that these kinds of global make-things-faster-but-unsafe
flags are highly problematic for a language for several reasons:

- Code is written with the assumption of a particular mode being chosen, so
  it is in general not possible or unsafe to compose libraries (which in a language
  like julia is a huge problem).

- Unsafe semantics are often harder for the compiler to reason about, causing
  unexpected performance issues (although the 1.9 --check-bounds=no issues are
  worse than just disabling concrete eval for things that use bounds checking)

In general, I'd like to remove the `--check-bounds=` option entirely (#48245),
but that proposal has encountered major opposition.

This PR implements an alternative proposal: We introduce a new function
`Core.should_check_bounds(boundscheck::Bool) = boundscheck`. This function is
passed the result of `Expr(:boundscheck)` (which is now purely determined by
the inliner based on `@inbounds`, without regard for the command line flag).

In this proposal, what the command line flag does is simply redefine this
function to either `true` or `false` (unconditionally) depending on the
value of the flag.

Of course, this causes massive amounts of recompilation, but I think this can
be addressed by adding logic to loading that loads a pkgimage with appropriate
definitions to cure the invalidations. The special logic we have now now
to take account of the --check-bounds flag in .ji selection, would be replaced
by automatically injecting the special pkgimage as a dependency to every
loaded image. This part isn't implemented in this PR, but I think it's reasonable
to do.

I think with that, the `--check-bounds` flag remains functional, while having
much more well defined behavior, as it relies on the standard world age
mechanisms.

A major benefit of this approach is that it can be scoped appropriately using
overlay tables. For exmaple:

```
julia> using CassetteOverlay

julia> @MethodTable AssumeInboundsTable;

julia> @overlay AssumeInboundsTable Core.should_check_bounds(b::Bool) = false;

julia> assume_inbounds = @overlaypass AssumeInboundsTable

julia> assume_inbounds(f, args...) # f(args...) with bounds checking disabled dynamically
```

Similar logic applies to GPUCompiler, which already supports overlay tables.
Keno added a commit that referenced this issue Jun 20, 2023
In 1.9, `--check-bounds=no` has started causing significant performance
regressions (e.g. #50110). This is because we switched a number of functions that
used to be `@pure` to new effects-based infrastructure, which very closely tracks
the the legality conditions for concrete evaluation. Unfortunately, disabling
bounds checking completely invalidates all prior legality analysis, so the only
realistic path we have is to completely disable it.

In general, we are learning that these kinds of global make-things-faster-but-unsafe
flags are highly problematic for a language for several reasons:

- Code is written with the assumption of a particular mode being chosen, so
  it is in general not possible or unsafe to compose libraries (which in a language
  like julia is a huge problem).

- Unsafe semantics are often harder for the compiler to reason about, causing
  unexpected performance issues (although the 1.9 --check-bounds=no issues are
  worse than just disabling concrete eval for things that use bounds checking)

In general, I'd like to remove the `--check-bounds=` option entirely (#48245),
but that proposal has encountered major opposition.

This PR implements an alternative proposal: We introduce a new function
`Core.should_check_bounds(boundscheck::Bool) = boundscheck`. This function is
passed the result of `Expr(:boundscheck)` (which is now purely determined by
the inliner based on `@inbounds`, without regard for the command line flag).

In this proposal, what the command line flag does is simply redefine this
function to either `true` or `false` (unconditionally) depending on the
value of the flag.

Of course, this causes massive amounts of recompilation, but I think this can
be addressed by adding logic to loading that loads a pkgimage with appropriate
definitions to cure the invalidations. The special logic we have now now
to take account of the --check-bounds flag in .ji selection, would be replaced
by automatically injecting the special pkgimage as a dependency to every
loaded image. This part isn't implemented in this PR, but I think it's reasonable
to do.

I think with that, the `--check-bounds` flag remains functional, while having
much more well defined behavior, as it relies on the standard world age
mechanisms.

A major benefit of this approach is that it can be scoped appropriately using
overlay tables. For exmaple:

```
julia> using CassetteOverlay

julia> @MethodTable AssumeInboundsTable;

julia> @overlay AssumeInboundsTable Core.should_check_bounds(b::Bool) = false;

julia> assume_inbounds = @overlaypass AssumeInboundsTable

julia> assume_inbounds(f, args...) # f(args...) with bounds checking disabled dynamically
```

Similar logic applies to GPUCompiler, which already supports overlay tables.
@Keno
Copy link
Member Author

Keno commented Jun 20, 2023

See RFC for one possible approach in #50239

Keno added a commit that referenced this issue Jul 18, 2023
In 1.9, `--check-bounds=no` has started causing significant performance
regressions (e.g. #50110). This is because we switched a number of functions that
used to be `@pure` to new effects-based infrastructure, which very closely tracks
the the legality conditions for concrete evaluation. Unfortunately, disabling
bounds checking completely invalidates all prior legality analysis, so the only
realistic path we have is to completely disable it.

In general, we are learning that these kinds of global make-things-faster-but-unsafe
flags are highly problematic for a language for several reasons:

- Code is written with the assumption of a particular mode being chosen, so
  it is in general not possible or unsafe to compose libraries (which in a language
  like julia is a huge problem).

- Unsafe semantics are often harder for the compiler to reason about, causing
  unexpected performance issues (although the 1.9 --check-bounds=no issues are
  worse than just disabling concrete eval for things that use bounds checking)

In general, I'd like to remove the `--check-bounds=` option entirely (#48245),
but that proposal has encountered major opposition.

This PR implements an alternative proposal: We introduce a new function
`Core.should_check_bounds(boundscheck::Bool) = boundscheck`. This function is
passed the result of `Expr(:boundscheck)` (which is now purely determined by
the inliner based on `@inbounds`, without regard for the command line flag).

In this proposal, what the command line flag does is simply redefine this
function to either `true` or `false` (unconditionally) depending on the
value of the flag.

Of course, this causes massive amounts of recompilation, but I think this can
be addressed by adding logic to loading that loads a pkgimage with appropriate
definitions to cure the invalidations. The special logic we have now now
to take account of the --check-bounds flag in .ji selection, would be replaced
by automatically injecting the special pkgimage as a dependency to every
loaded image. This part isn't implemented in this PR, but I think it's reasonable
to do.

I think with that, the `--check-bounds` flag remains functional, while having
much more well defined behavior, as it relies on the standard world age
mechanisms.

A major benefit of this approach is that it can be scoped appropriately using
overlay tables. For exmaple:

```
julia> using CassetteOverlay

julia> @MethodTable AssumeInboundsTable;

julia> @overlay AssumeInboundsTable Core.should_check_bounds(b::Bool) = false;

julia> assume_inbounds = @overlaypass AssumeInboundsTable

julia> assume_inbounds(f, args...) # f(args...) with bounds checking disabled dynamically
```

Similar logic applies to GPUCompiler, which already supports overlay tables.
Keno added a commit that referenced this issue Jul 19, 2023
In 1.9, `--check-bounds=no` has started causing significant performance
regressions (e.g. #50110). This is because we switched a number of functions that
used to be `@pure` to new effects-based infrastructure, which very closely tracks
the the legality conditions for concrete evaluation. Unfortunately, disabling
bounds checking completely invalidates all prior legality analysis, so the only
realistic path we have is to completely disable it.

In general, we are learning that these kinds of global make-things-faster-but-unsafe
flags are highly problematic for a language for several reasons:

- Code is written with the assumption of a particular mode being chosen, so
  it is in general not possible or unsafe to compose libraries (which in a language
  like julia is a huge problem).

- Unsafe semantics are often harder for the compiler to reason about, causing
  unexpected performance issues (although the 1.9 --check-bounds=no issues are
  worse than just disabling concrete eval for things that use bounds checking)

In general, I'd like to remove the `--check-bounds=` option entirely (#48245),
but that proposal has encountered major opposition.

This PR implements an alternative proposal: We introduce a new function
`Core.should_check_bounds(boundscheck::Bool) = boundscheck`. This function is
passed the result of `Expr(:boundscheck)` (which is now purely determined by
the inliner based on `@inbounds`, without regard for the command line flag).

In this proposal, what the command line flag does is simply redefine this
function to either `true` or `false` (unconditionally) depending on the
value of the flag.

Of course, this causes massive amounts of recompilation, but I think this can
be addressed by adding logic to loading that loads a pkgimage with appropriate
definitions to cure the invalidations. The special logic we have now now
to take account of the --check-bounds flag in .ji selection, would be replaced
by automatically injecting the special pkgimage as a dependency to every
loaded image. This part isn't implemented in this PR, but I think it's reasonable
to do.

I think with that, the `--check-bounds` flag remains functional, while having
much more well defined behavior, as it relies on the standard world age
mechanisms.

A major benefit of this approach is that it can be scoped appropriately using
overlay tables. For exmaple:

```
julia> using CassetteOverlay

julia> @MethodTable AssumeInboundsTable;

julia> @overlay AssumeInboundsTable Core.should_check_bounds(b::Bool) = false;

julia> assume_inbounds = @overlaypass AssumeInboundsTable

julia> assume_inbounds(f, args...) # f(args...) with bounds checking disabled dynamically
```

Similar logic applies to GPUCompiler, which already supports overlay tables.
@LilithHafner LilithHafner removed the triage This should be discussed on a triage call label Jan 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests