Reading for 10/17: Alias-Based Optimization #389

keikun555 · 2023-10-02T14:15:44Z

keikun555
Oct 2, 2023

Here's the discussion thread for 10/17's paper discussion on Alias-Based Optimization. The paper is here!

Edit: better paper URL

keikun555 · 2023-10-11T16:44:00Z

keikun555
Oct 11, 2023
Author

Going back to the "Producing wrong data without doing anything obviously wrong!" paper, do you think this paper did a good job in measuring their TBAA optimization?

How did the authors select which benchmarks to run the optimizer on? Is this a fair choice, and if not, how should they choose the benchmarks?

6 replies

obhalerao Oct 17, 2023

I second the fact that they could have included a baseline implementation of RLE without type-based alias analysis for completeness (perhaps with the more trivial alias analysis methods alluded to at the start of the paper). In addition, I'm not that surprised by the fact that the three optimizations had similar runtimes, as the speedup compared to the baseline is not obscenely high to begin with, and my intuition tells me that the more layers of complexity you add to an optimizing analysis like this, the more diminishing returns you get.

I do wish that the data they reported for the runtime, for the above graph especially, could have been a bit more fine-grained; especially since the speedups were not very drastic. At least to me, it seems as though two significant figures could be too imprecise for measuring runtimes of 100% versus 99%, for example.

stephenverderame Oct 17, 2023

I also liked that they didn't just report one metric, and even discussed how maximizing the different metrics they presented could lead to different conclusions. It also would have been nice if they tested TBAA on different optimizations. For example, in their testing, they found that FieldTypeDecl doesn't really give a performance improvement over TypeDecl despite it being more precise. While completely intuitive that a more precise analysis would lead to more performance gains, it kind of felt like they just assumed that for some optimization the greater precision would be useful. Again, completely understandable (and I would probably agree with them), but also a little hand-wavy...

willwng Oct 17, 2023

I'm a little confused on why they seemed to place more emphasis on the optimizations (e.g., runtime, # of redundant loads) when they could directly assess the analysis itself? By that, I mean my initial interpretation of "limit evaluation" was that it would be a comparison between the aliases determined by each analysis against some sort of "truth" (maybe dynamically checking each alias).

They claim that a more precise optimization doesn't always lead to a better runtime, but could that be more indicative of the lack of optimization than the analysis itself? Optimized performance must have been very important to them, since it seemed that they even took extra steps to make benchmarks (going so far as to find gcc bugs).

sampsyo Oct 17, 2023
Maintainer

While completely intuitive that a more precise analysis would lead to more performance gains, it kind of felt like they just assumed that for some optimization the greater precision would be useful.

Yeah, surely it is possible to craft some optimization that exploits any aliasing "fact"—whether it's a useful optimization or not is another story. I think there may be an interesting paper to be written that does a similar study for a very wide range of alias-info-using optimizations to see how their sensitivity differs! I don't think there is much intuition to be had about this, so it may be a very empirical question.

sampsyo Oct 17, 2023
Maintainer

I'm a little confused on why they seemed to place more emphasis on the optimizations (e.g., runtime, # of redundant loads) when they could directly assess the analysis itself?

I think the point is that the more basic eval approach (just measure AA precision) is what most papers on alias analysis do. There are 1,000,000 papers out there about making AA more scalable/precise, but this paper is significant because it asks the question: "What is this all good for?" In other words, increasing AA precision all by itself isn't actually the goal; AA info is always used to do something else, like optimize the program. So this paper is kinda saying, "here is a very simple/fast AA that is much less precise than more sophisticated/slower analyses but nonetheless gets most of the benefit, in terms of optimizability."

keikun555 · 2023-10-11T17:02:49Z

keikun555
Oct 11, 2023
Author

How else can we leverage type systems for compiler optimizations?

5 replies

bennyrubin Oct 16, 2023

What is really interesting to me is the progression of the field from thinking about using types as a way to help with alias analyses to constructing the types themselves to help even more with alias analyses. Another way to think about this is going from leveraging an existing language feature for an analysis to co-designing the language itself for the analysis. I think the best example of this is safe Rust, where you get guarantees from the type system itself about aliasing. To be honest, I had always thought Rust’s power was in memory safety and I had never even considered how powerful its type system could be for optimizations. There exists a whole class of optimizations involving memory accesses that don’t require any alias analyses because of the guarantees from the type system.
I am now very intrigued by this idea of co-designing a language around certain optimizations. My first thought is SSA form and how that decoupled variables and values. I wonder if there are more creative ways to accomplish this, looking back on previous optimizations we’ve studied.

zachary-kent Oct 17, 2023

I think there is also an interesting opportunity here to leverage more advanced type systems for finer-grained optimizations. For example, say you had two pointers to integers and can verify statically that the ranges of their pointees are disjoint. Then, you can conclude these pointers do not alias. You can accomplish this using refinement types, which take the form of { x : t | P(x) } where t is some type and P is a predicate over type t; informally, such a type would express "all values x of type t satisfying predicate P." Such type systems can be found in Liquid Haskell and F*. More generally, if you had pointers to types { x : t | P(x) } and { x : t | Q(x) } where P and Q are disjoint, you can conclude that these pointers do not alias.
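A minimal sketch of the integer-range case, assuming the refinement is a closed interval and that memory always holds a value satisfying it. `IntRange`, `ranges_disjoint`, and `may_alias` are hypothetical names for illustration, not part of Liquid Haskell or F*:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntRange:
    """Hypothetical refinement { x : int | lo <= x <= hi } on a pointee."""
    lo: int
    hi: int


def ranges_disjoint(p: IntRange, q: IntRange) -> bool:
    # No integer satisfies both predicates when one interval ends before the other starts.
    return p.hi < q.lo or q.hi < p.lo


def may_alias(p: IntRange, q: IntRange) -> bool:
    # If the predicates are disjoint, a single memory cell cannot back both pointers.
    return not ranges_disjoint(p, q)


small = IntRange(0, 9)
large = IntRange(10, 20)
overlapping = IntRange(5, 15)
print(may_alias(small, large), may_alias(small, overlapping))
```

The check is conservative in the usual direction: overlapping predicates only mean the pointers might alias, not that they do.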

Enochen Oct 17, 2023

There are many creative ways we can take advantage of well-designed type systems for more general-purpose languages like Rust, but I think there are also a ton of use cases for languages and type systems specially designed for a specific domain. That includes CS research fields, as with Liquid Haskell and F*, but I'm sure there are also more niche fields like weather simulations or something that could benefit from hypertailored languages that enable both ease of programming for the use case and ease of optimization based on the guarantees of the type system. I'm thinking of instructions that can be taken advantage of in certain scenarios, like SIMD instructions. These scenarios would be much easier to detect within the compiler if the language/type system was designed with them in mind, and the more domain-specific you go, the more powerful the type system could be for the use case.

sampsyo Oct 17, 2023
Maintainer

On the subject of Rust and aliasing-based optimizations: one "entertaining" backstory here is that, for a long time, the Rust compiler wanted to emit LLVM's noalias attribute to reflect the semantics of its pointers. But noalias was so under-used by LLVM before that point (in its role as mainly a C/C++ compiler) that adding the attribute pervasively triggered a lot of bugs in LLVM. So it took a long time to first fix LLVM's handling of noalias before rustc could start to safely rely on it.

For example, here's one attempt to change rustc (by no means the first): rust-lang/rust#82834

And one bug that triggered: rust-lang/rust#84958

Anyway, I don't exactly know where things stand (I think noalias emission has been reenabled at last?), and I would love to know what the performance impact was of that change.

sampsyo Oct 17, 2023
Maintainer

FWIW, @zachary-kent, your proposed type system sounds reminiscent of analyses based on separation logic. One reflection of this pattern is this paper about reference immutability, which in retrospect looks a lot like where Rust ended up.

keikun555 · 2023-10-11T17:06:00Z

keikun555
Oct 11, 2023
Author

When can we leverage upper bound analysis for compiler optimizations? When should we leverage them?

5 replies

MelindaFang-code Oct 17, 2023

Also thinking about this question. The author mentions that the key differences between the previous alias analysis algorithms stem from where and how they approximate the unbounded control paths and data, and the approximation in turn determines the precision and efficiency of the algorithm. I wonder at which different points we can apply upper bound analysis during compiler optimization and how that choice could affect the results.

bcarlet Oct 17, 2023

Beyond the simple satisfaction of knowing when we have a good analysis, I see upper bound analysis as being helpful if you want to speculatively bound the performance benefit of a proposed optimization with respect to two concerns: compiler runtime and developer time. That is, if you want to do a cost-benefit analysis for a proposed set of new fancy optimizations, and prioritize thusly, then having an upper bound might give you the ability to prune out many unprofitable optimizations relatively quickly. We don't tend to think about these types of engineering decisions when discussing compilers in an academic context, but presumably they are necessary when building compilers in the real world.

SanjitBasker Oct 17, 2023

I think upper bound analysis would also be relevant if you've identified a "challenging input" and want to measure exactly how much room there is for improvement. It is tempting to simply measure the improvement of a new algorithm on a "bad input" for the previous state of the art, but that figure lacks the context of how much further one can go.

This kind of reasoning is common in theory research, where we might prove that an algorithm $\mathcal A$ is always within, say, $30$% of the optimal solution and simultaneously exhibit an example where it is exactly $30$% away from being optimal. This guides future researchers away from trying to come up with a sharper analysis of $\mathcal A$ to try to prove that it's within $25$%, and it's nice to have a counterexample in hand when thinking about improvements to $\mathcal A$.
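Spelled out for a minimization problem (an assumption on my part, as is the notation $\mathrm{cost}_{\mathcal A}(x)$ and $\mathrm{OPT}(x)$ for the algorithm's cost and the optimal cost on input $x$), the two halves of such a result would look roughly like:

$$
\mathrm{cost}_{\mathcal A}(x) \le 1.3 \cdot \mathrm{OPT}(x) \ \text{ for all } x,
\qquad
\mathrm{cost}_{\mathcal A}(x^\ast) = 1.3 \cdot \mathrm{OPT}(x^\ast) \ \text{ for some } x^\ast .
$$

The first statement is the guarantee; the second is the tight instance, which shows that no sharper analysis of $\mathcal A$ can improve on the $30$% figure.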

collinzrj Oct 17, 2023

I think this paper depicts a good way of applying upper bound analysis for compiler optimization development. Upper bound analysis seems to be expensive to run, so the authors didn't use it directly in their optimization, but instead used it as a benchmark to show how effective their results are. For tasks like optimizations, there will always be a bound on what we can achieve, and upper bound analysis helps us identify how far we are from that bound. If we can achieve good performance with a quite efficient algorithm, then it might be time to stop pursuing this kind of optimization.

NgaiJustin Oct 17, 2023

Second the above points about the practical application of upper bound analysis in compiler optimization development. Using it as a benchmark, as outlined in the paper, not only validates the effectiveness of our optimizations but also provides a tangible measure of our algorithm's efficiency. This approach aligns with the need for practical, real-world solutions in compiler engineering.

stephenverderame · 2023-10-17T00:56:34Z

stephenverderame
Oct 17, 2023

One thing I was kind of impressed with was being within 2.5% of an optimal alias analysis, at least for RLE and the given benchmarks. It seems like there are a lot of more sophisticated things that could be added to improve precision (for example, using a flow-sensitive analysis) yet, at least in the observed executions of the benchmarks with RLE, there isn't much more room to improve.
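To make the "distance from optimal" idea concrete, here is a toy sketch. It is my own illustration, not the paper's algorithm or measurement setup. It counts how many loads a simple redundant load elimination pass could remove on a made-up straight-line trace, given three alias oracles: no information, a type-based check, and a "perfect" oracle that compares made-up concrete addresses. All names, types, and addresses are assumptions.

```python
from typing import Callable, NamedTuple


class Access(NamedTuple):
    kind: str  # "load" or "store"
    ptr: str   # hypothetical pointer name


# Made-up declared pointee types and concrete addresses for the toy trace.
TYPES = {"a": "int", "b": "int", "c": "int", "f": "float"}
ADDRS = {"a": 0x10, "b": 0x14, "c": 0x18, "f": 0x20}
TRACE = [
    Access("load", "a"), Access("store", "f"), Access("load", "a"),
    Access("load", "b"), Access("store", "c"), Access("load", "b"),
]


def no_info(p: str, q: str) -> bool:
    return True                    # assume everything may alias


def type_based(p: str, q: str) -> bool:
    return TYPES[p] == TYPES[q]    # only same-typed pointers may alias


def perfect(p: str, q: str) -> bool:
    return ADDRS[p] == ADDRS[q]    # stand-in for an upper bound


def redundant_loads(trace, may_alias: Callable[[str, str], bool]) -> int:
    """Count loads a simple RLE pass could remove, given an alias oracle."""
    available = set()  # pointers whose current value is already held in a temporary
    removed = 0
    for acc in trace:
        if acc.kind == "load":
            if acc.ptr in available:
                removed += 1        # value already known: the load is redundant
            else:
                available.add(acc.ptr)
        else:
            # A store invalidates every remembered value it may overwrite.
            available = {p for p in available if not may_alias(p, acc.ptr)}
            available.add(acc.ptr)  # the stored value itself is now known
    return removed


counts = {
    oracle.__name__: redundant_loads(TRACE, oracle)
    for oracle in (no_info, type_based, perfect)
}
print(counts)
```

On this trace the type-based oracle recovers the load that is separated from its twin by a `float` store, while only the address-based oracle recovers the one separated by an `int` store to a different location. The gap between those two counts is the kind of headroom an upper bound exposes.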

Another thing I discovered during the LLVM assignment that struck me as odd is that (recent versions of) LLVM only have a single opaque pointer type. Initially, I thought that could be because LLVM was originally designed for C and C++ and pointer types might be less useful for those languages. But I discovered that it used to have typed pointers, and only recently have they been fully removed. Although they mentioned pointee types were causing issues, this decision would definitely hamper the usage of a technique like TBAA. On the surface it seems like more information would be better, but clearly whatever problems it was causing outweighed any benefits.

5 replies

keikun555 Oct 17, 2023
Author

For those interested, here is the section in the article about LLVM's Opaque Pointer project about why pointee types are problematic!

keikun555 Oct 17, 2023
Author

I think this section still allows TBAA, as we have

memory access related analyses and optimizations should use the types encoded in the load and store instructions instead of querying the pointer type

matth2k Oct 17, 2023

The feasibility of implementing TBAA within modern compilers is interesting considering how much higher level semantics is lost before peephole optimizations begin. As the paper concedes, C/C++ is troublesome with its arbitrary casting and as you say LLVM has opaque pointers.

Compiler papers like these reinforce my opinion that the extensibility of MLIR will allow future compilers to have the best of both worlds, because it will seamlessly integrate high-level and low-level optimizations together much more than LLVM ever could.

matth2k Oct 17, 2023

Just as a followup, apparently TBAA is part of clang and llvm consumes that metadata. Here is some info on the metadata format: https://the-ravi-programming-language.readthedocs.io/en/stable/llvm-tbaa.html

sampsyo Oct 17, 2023
Maintainer

Although they mentioned pointee types were causing issues, this decision would definitely hamper the usage of a technique like TBAA.

As @matth2k noted, opaque pointers actually do not interfere with LLVM's TBAA—because TBAA never actually used LLVM pointer types! It used language-level pointer types (i.e., C or C++ pointer types) as encoded in instruction metadata. Here is the metadata spec in the LangRef:
https://llvm.org/docs/LangRef.html#tbaa-metadata

vivianyyd · 2023-10-17T03:21:10Z

vivianyyd
Oct 17, 2023

It's interesting seeing alias analyses which take a different approach than the dataflow analysis we saw in class.
At first glance, they seem weaker than the analysis we saw in class (although I'm not sure if this intuition is actually correct). However, they are pretty impressively close to a perfect alias analysis; I wonder how close the analysis from lecture gets to a perfect analysis in comparison.
We saw how various type-based implementations compare in the paper; how do they compare to what we learned in class and why might we prefer one over another?

I also liked that at each step, the authors tied their implementation to real-world languages, explaining how they would modify it for Java/C/C++.

1 reply

sampsyo Oct 17, 2023
Maintainer

At first glance, they seem weaker than the analysis we saw in class (although I'm not sure if this intuition is actually correct).

I think that's a super interesting question (can we say one is "stronger" than the other?). It's certainly weaker in the sense that it's not flow-sensitive, but it might also be stronger in the sense that it exploits information that our in-class dataflow formulation does not (the types). So perhaps they are "incomparable" in terms of a strict strength property? Not sure!

FWIW, to kinda steal a main point from the paper, the "why we might prefer" question mostly comes down to "maximal optimization bang for minimal analysis buck."

evanmwilliams · 2023-10-17T04:19:18Z

evanmwilliams
Oct 17, 2023

I quite enjoyed this paper. At a first read, it makes total sense - pointers that have different types obviously can't alias. If they did, it must mean something really strange is going on with the program. In fact, the results alone to me indicate that we should actively pursue developing programming languages where this cannot happen. Or at the very least, if a programmer is going to force this behavior, they should be very explicit about what they are doing and ensure that they understand what is going on. Rust programmers probably like this idea :)

In C, this is not necessarily a given. There's all sorts of dark magic we can do with pointers of different types. Here's a small example in C:

    int i = 42;
    int *int_ptr = &i;
    float *float_ptr = (float *) &i;

While this example is probably fine, you can imagine a lot more troublesome examples in other programs. I guess the question this poses, then, is to what extent do papers like these reflect the need to build programming languages that enforce these guarantees? If type-based alias analysis truly does just as well as any other kind of alias analysis, why should we ever build a language that doesn't make TBAA easy?

1 reply

sampsyo Oct 17, 2023
Maintainer

FWIW, that C program is actually invalid! Under a rule called "strict aliasing": https://stackoverflow.com/a/99010

This means that C compilers (including LLVM) are allowed to assume that you never do this (i.e., never create two aliases of the same memory that disagree on the type, except under very specific conditions) and can break your program if you do.
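For intuition about what the punned read would even mean, here is a rough Python sketch that reinterprets the bytes of a 32-bit integer as an IEEE 754 single-precision float using the standard `struct` module. It assumes a 32-bit `int` and a little-endian layout, and `int_bits_as_float` is a made-up helper name. It only shows the bit-level reinterpretation; since the C program has undefined behavior, a real compiler is not obliged to produce this value.

```python
import struct


def int_bits_as_float(i: int) -> float:
    """Reinterpret the 4 bytes of a 32-bit int as a single-precision float."""
    raw = struct.pack("<i", i)       # little-endian 32-bit signed int
    (f,) = struct.unpack("<f", raw)  # the same 4 bytes, read as a 32-bit float
    return f


punned = int_bits_as_float(42)
print(punned)  # a tiny subnormal value, nowhere near 42.0
```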

he-andy · 2023-10-17T05:21:41Z

he-andy
Oct 17, 2023

Type information provides a contractual guarantee about how variables and data structures will be used. The central thesis of type-based alias analysis is simple but interesting: pointers of different types, in many cases, don't alias. This seemingly basic observation has many implications for optimization.

The authors, through empirical studies on real-world programs, found that different typed pointers often don't alias in practice. This finding makes type-based alias analysis an attractive alternative to more traditional, often computationally expensive methods. When implemented in an optimizing compiler, the type-based approach was shown to be competitive, and in some instances superior, to these traditional methods. This outcome was achieved with reduced computational overhead, making it an efficient and scalable solution.

I’m curious, then, how we can use type systems to motivate other kinds of optimizations. For example, decisions about function inlining strongly depend on types, and you can also perform automatic parallelization. If the type system can prove that two data structures are distinct and will not alias, then operations on them can potentially be parallelized since there's a guarantee of no side effects between them. I’d love to keep exploring more of these optimizations that leverage strong types.

1 reply

sampsyo Oct 17, 2023
Maintainer

The authors, through empirical studies on real-world programs, found that different typed pointers often don't alias in practice.

I don't think this is quite right. The empirical evaluation is not asking "how many times do differently-typed variables nonetheless alias"; the rules of the language say that pointers of incompatible types cannot alias. The empirical question is: "how much useful aliasing information does this fact represent?" Or, perhaps put differently, how often do pointers with the same type not alias?

If the type system can prove that two data structures are distinct and will not alias, then operations on them can potentially be parallelized since there's a guarantee of no side effects between them.

FWIW, this mostly sounds like an alias-based optimization. That is, you would have two separate compiler passes:

generic TBAA, having nothing to do with parallelization
generic parallelization, using any aliasing information and not having anything to do with types

So I'm not sure there is much to gain from coupling the type-based analysis and the parallelization. (Unless you're thinking of something more specific.)

emwangs · 2023-10-17T05:45:40Z

emwangs
Oct 17, 2023

There was a super brief section on how some researchers used TBAA for execution-time improvements, and it got me thinking!! It was super interesting to me how this alias analysis could be used for execution-time improvement. I was wondering if it would be possible at all to apply this to a language similar to Python? I don't really understand too much about the underlying memory model of Python, but my understanding is that its type system is duck-typed, which is more about what attributes the object has rather than the name of its type. One thought that came to mind was maybe that would look like classifying types by their attributes -- and then in order to remove copying of objects, we would compare values of attributes? Is that even feasible? And would the tradeoff of making the comparison be better than doing the actual copy?
Lowkey is that what Python already does? 😓

2 replies

sampsyo Oct 17, 2023
Maintainer

It's a very interesting question! The big problem with Python, of course, is that you often don't know what the types of things are. Like, if I write:

def func(a, b):
    return a + b

…the variables a and b might have any type, and we need to account for all possibilities. Even with Python's type annotations: these are not sound (by default they are neither checked at compile time nor checked at run time), so optimizations can't rely on them to be correct.
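As a quick illustration of the annotation point (a minimal sketch; `annotated_add` is a made-up name): the interpreter records the annotations but does not check them at call time, so a call with strings goes straight through.

```python
def annotated_add(a: int, b: int) -> int:
    # The annotations are stored on the function but not enforced at run time.
    return a + b


result = annotated_add("alias", "ing")  # no TypeError: str + str concatenates
print(result)
```

An optimizer that trusted `a: int` here and emitted integer-only code would miscompile this call.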

Of course, there are very specific times when you do know types: for example, when you use variables immediately after initializing them! Like this:

def func():
    a = 1
    b = 2
    return a + b

You can be sure that a and b both have type int on the final line. But it may be hard to find enough of these circumstances to create a lot of optimization opportunities in a heavily dynamic language…

janpaulpl Oct 17, 2023

I want to point out that the closest I could find to a pythonic-like language using TBAA was statically typed Lua (Ravi). While this specific language extends the Lua VM in some pretty interesting ways, in relation to this optimization I found it interesting that they were able to implement it with these conditions:

Ravi extends Lua with static typing for greater performance under JIT compilation. However, the static typing is optional and therefore Lua programs are also valid Ravi programs.

Most likely, a similar extension to Python such as Vyper would make this more feasible. Here's the actual code for the Lua implementation, it's just a measly 819 line cpp file!

I also found the motivation for this implementation pretty funny:

My assumption was that the LLVM optimizer will realise that the base pointer hasn’t changed and so the loads are redundant and can be removed. However to my surprise I found that this is not the case.

20ashah · 2023-10-17T08:32:33Z

20ashah
Oct 17, 2023

I thought this paper offered an interesting look into the optimization of programs, particularly through RLE. What I found particularly intriguing was the paper's focus on not just the effectiveness but also the complexity of the TBAA algorithms. It's fascinating to see how the paper balances the trade-off between precision and computational efficiency, seemingly bringing us closer to the ideal of a "perfect" alias analysis. Given that the TBAA methods were nearly as effective as a perfect alias analysis in the context of RLE, could these methods be generalized to other types of program optimizations?

1 reply

sampsyo Oct 17, 2023
Maintainer

Given that the TBAA methods were nearly as effective as a perfect alias analysis in the context of RLE, could these methods be generalized to other types of program optimizations?

Out of curiosity, by "these methods," do you mean:

exploiting types to do analysis/optimization
"good-enough" tradeoffs where you do something cheap/easy and hope to get most of the benefit of a complex/expensive solution

…or something else entirely?

alifarahbakhsh · 2023-10-17T13:58:22Z

alifarahbakhsh
Oct 17, 2023

I am always amazed to see how far one can go by "just" including types in a program. This paper shows that by including types, and then increasing the granularity to consider the fields within an object, one can achieve a considerably high accuracy in pinpointing potential aliasing pairs. Interestingly, considering explicit assignments does not add a significant benefit in terms of the accuracy of the algorithm, which is evidence in support of how using types - and in this case, fields - orients the developer to write non-aliasing code. This reminds me of separation logic, and I know that it is being used for memory aliasing.

It was helpful to see different metrics, and to realize that sometimes high or optimal analysis accuracy does not lead to a significant performance upgrade. However, perhaps there are use cases in which (bulk) performance is not the only issue of interest, and there are other things to look out for. For instance, perhaps a thorough algorithmic analysis of aliasing would be super helpful for a program including cryptography and secure data. Aliasing might lead to side channels in this case. As another example, perhaps there are paths in the program that are less frequently executed (depending on the nondeterminism of the program), and are therefore less reflected in the performance analysis. However, a developer might be concerned about the aliasing behavior of these paths for reasons exogenous to performance, and therefore might be interested in fine-grained aliasing information.

0 replies

rcplane · 2023-10-17T14:46:08Z

rcplane
Oct 17, 2023

Considering the history of languages evolving to use type safety both for reference correctness checks and for alias information in compiler optimization, it is interesting to compare this with a more recent aliasing model for Rust called Stacked Borrows (https://dl.acm.org/doi/10.1145/3371109), which defines a set of rules that extend this reasoning into unsafe code, so that useful analyses apply to a broader set of the performant code that application developers write.

0 replies
