Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

globals – make their performance less awful #8870

Open
StefanKarpinski opened this issue Nov 1, 2014 · 59 comments
Open

globals – make their performance less awful #8870

StefanKarpinski opened this issue Nov 1, 2014 · 59 comments
Assignees
Labels
performance Must go faster

Comments

@StefanKarpinski
Copy link
Member

This keeps coming up, and rather than direct people to the performance tips, I feel like we should really just try to fix the issue. One relatively non-disruptive approach might be to emit code that accesses globals with a fast path that assumes that they have the same type that they currently have and a slow path that handles arbitrary type. A harder fix would be the compile-time version of that: generate code that assumes that globals retain their current type and invalidate it if the type changes. One might want to let compilation happen a couple of times before giving up and generating pessimistic general code.

Related: #265, #524, #964

@StefanKarpinski StefanKarpinski added the performance Must go faster label Nov 1, 2014
@ivarne
Copy link
Member

ivarne commented Nov 1, 2014

+ a lot

It's somewhat fun to examine a disappointing benchmark for non-const globals, and get a 10-100 times improvement, but it would be much nicer if we wouldn't have to do that.

@johnmyleswhite
Copy link
Member

If this isn't easy to fix quickly, I really like @ihnorton's heavy-handed solution: include a message about not writing code in the global scope in the Julia banner.

@JeffBezanson
Copy link
Member

This is a feature --- it makes performance problems more obvious.

@StefanKarpinski
Copy link
Member Author

I'm not sure if you're serious or not.

@PythonNut
Copy link
Contributor

What about having a way to assert that the type of a variable does not change? Assignments to such variables will only allow you to assign values of that type or values coercible to that type, otherwise the program terminates in error. With this knowledge, the compiler can produce fast paths for those variables.

This might also be helpful for other optimizations, although I don't know exactly what.

@ivarne
Copy link
Member

ivarne commented Nov 10, 2014

We have const which in practice is only an assertion that the global will not change type.

@PythonNut
Copy link
Contributor

Is that a bug or a feature?

@ivarne
Copy link
Member

ivarne commented Nov 10, 2014

Not quite sure. It might also be a signal to the compiler that the value will be constant, and that it can inline the value, if it wants, so it is not necessarily good advice to make global immutables const (if you depend on code picking up the change).

@StefanKarpinski
Copy link
Member Author

Fixing this "feature" would cut down on julia-users traffic by about 5%, e.g.:

https://groups.google.com/forum/#!topic/julia-users/7FUx2flIxac

@StefanKarpinski
Copy link
Member Author

That's a possibility but you would have to then save values back to global scope at the end. During execution, changes to globals would not be reflected elsewhere which could lead to significant confusion.

@Sisyphuss
Copy link

Suppose I write:

param = 1
myfunc(param)

Since now I have a global variable, will I get a relatively poor performance?

Or should I write instead:

function main()
  param = 1
  myfunc(param)
end

main()

@carnaval
Copy link
Contributor

The first version is fine. It is the lookup from inside compiled code which is slow, since the compiler cannot know a priori the type of the result.
If you call func(param) the function will be compiled for an Int parameter, and no global lookup will happen inside.

@Sisyphuss
Copy link

Here's my experiment, and I find no significant performance difference in version 0.3.7

param = 0.001
@time sleep(param)
function main()
  param = 0.001
  @time sleep(param)
end
main()

Both take 2.1 ms.

@JeffBezanson
Copy link
Member

Think of it this way: accessing a global variable is slow. So the existence of a global variable in your code might not matter if you only read it once. You don't want to read or write a global variable repeatedly in a loop. But if you read its value once and pass it to a function, there's no problem.

In this case the overhead of sleep will be much bigger than anything else; you won't get meaningful times.

@Sisyphuss
Copy link

The remark of @JeffBezanson seems reasonable. I did this experiment:

@time begin
  p = 1
  for i in 1:100000
    p += 1
  end
end

elapsed time: 0.002861372 seconds (1591840 bytes allocated)

function main()
  @time begin
    p = 1
    for i in 1:100000
      p += 1
    end
  end
end

main()

elapsed time: 1.676e-6 seconds (0 bytes allocated)

But what remains mysterious to me is that why the compiler can't treat global variables just in the same way as local variables. And how to explain the huge difference of memory allocation in the above example?

@JeffBezanson
Copy link
Member

Globals are different because they can be observed anywhere when they change. To convert a global to a local, the compiler would have to prove that no possible code anywhere in the system can see the changing value of p during the loop.

@timholy
Copy link
Member

timholy commented Apr 24, 2015

The memory allocation is from boxing. Even the type of the global might change; if it did, and you didn't box things, you'd get a segfault.

@Sisyphuss
Copy link

Actually, I don't know how the compiler treats local or global variables. I'm not from the computer science community. But I try my best to guess what you mean. It seems to me that you are worrying about the multi-threading situation.

@timholy
Copy link
Member

timholy commented Apr 27, 2015

No, just this:

julia> globvar = 0
0

julia> function foo()
           s = 0
           for i = 1:5
               bar()
               s += globvar
           end
       end
foo (generic function with 1 method)

julia> function bar()
           global globvar
           globvar = "oops"
           nothing
       end
bar (generic function with 1 method)

julia> foo()
ERROR: MethodError: `+` has no method matching +(::Int64, ::ASCIIString)
Closest candidates are:
  +(::Any, ::Any, ::Any)
  +(::Any, ::Any, ::Any, ::Any...)
  +(::Int64, ::Int64)
  ...
 in foo at none:5

@Sisyphuss
Copy link

@timholy, typically your example only illustrates how dangerous global variables could be and it is not a good programming habit. But I try to understand what you implied. You imply that if the compiler had treated the globvar in the same way as a local variable, this run-time error wouldn't have been detected. Isn't it?

@Sisyphuss
Copy link

If this is the case, I further deduce that the compiler interprets a local variable as a fix address in the memory, and a global variable as a name in a "variable list".

@timholy
Copy link
Member

timholy commented Apr 27, 2015

Exactly. Julia creates optimized code that depends upon how the machine represents information: the code for adding two Float64s is different from the code for adding two Float32s or two Ints. If you're wrong about the representation, you either get complete junk or a segfault, depending on the nature of the code.

"Boxing" just means wrapping the machine representation with some additional information that allows you to ask, "what type of object is this, anyway?" It takes memory, however.

@pao
Copy link
Member

pao commented Apr 27, 2015

Memory for the box, time to handle the indirection (and to ask the "what are you" question), and possibly time to perform a dynamic dispatch, depending on context. It's turtles all the way down when types are unpredictable.

@Sisyphuss
Copy link

Thanks for your reply!

I have just found an interesting phenomenon:

@time begin
  p = 1
  for i in 1:510
    p += 1
  end
end

elapsed time: 2.9961e-5 seconds (0 bytes allocated)

@time begin
  p = 1
  for i in 1:511
    p += 1
  end
end

elapsed time: 2.8285e-5 seconds (16 bytes allocated)

Within 510 iterations, there are no extra memory cost. From the 511th iteration, every extra iteration takes 16 bytes more. Does it mean that there is no boxing within 510 iterations?

@StefanKarpinski
Copy link
Member Author

The first 512 integers (0-511) are preallocated and boxed versions of them are always kept around. After that you need to actually allocate a new object.

@Sisyphuss
Copy link

Now I understand. Let's go back to our original discussion about why to avoid global variables.

I think what you are really against is not the global variable itself, but loop involving global variables, or in other words, using global variables in any loop.

@StefanKarpinski
Copy link
Member Author

StefanKarpinski commented Jul 13, 2017

Potential proposal (needs some work, but it's the most promising thing we could come up with on today's triage call):

  1. Implement type annotations for globals (Type annotations in global scope #964).
  2. Globals without explicit type annotations are implicitly type-const.
  3. Error if an explicit global type annotation is violated anywhere.
  4. Error if an implicit global type annotation is violated in non-top-level code.
  5. Warn if an implicit global type annotation is violated in top-level code.

For 1.0 only the behavioral part of this has to be implemented, but this scheme allows making globals fast with some relatively straightforward optimizations. Most globals don't change type, and this allows efficient code to be generated under that assumption. If those type assumptions are violated at the top-level, methods relying on those assumptions can simply be invalidated. This allows relatively convenient use at the REPL, while still fixing the global performance problem.

@ChrisRackauckas
Copy link
Member

Error if an implicit global type annotation is violated in non-top-level code.

So this is saying that if a function tries to modify a global in a non-type-stable way, it will error. That sounds good. But

Warn if an implicit global type annotation is violated in top-level code.

So then

x = 2
x = 2.0

that will warn every time?

@StefanKarpinski
Copy link
Member Author

x = 2; x = 2.0 would warn, yes. I'm not sure what you mean by "every time" – it would warn every time you evaluate it at the top-level, yes.

@vtjnash
Copy link
Member

vtjnash commented Aug 3, 2017

After discussion on triage, I propose we only implement item 1 (type annotations for globals, #964), and close this issue immediately. There doesn't seem to be any consensus that the other proposals here are necessary, or even uniformly useful.

@StefanKarpinski
Copy link
Member Author

StefanKarpinski commented Aug 3, 2017

The triage winds have turned here and there are serious usability and feasibility concerns about this. Everyone agrees that #964 should be implemented and with that you can conveniently express both that a global is constant and that it is type-constant (e.g. x::Int = 0). Performance could be fixed without changing the current semantics with clever on-stack-replacement since you could just invalidate and replace any method whose code depends on the type that is changed.

@vtjnash vtjnash closed this as completed Aug 3, 2017
@StefanKarpinski
Copy link
Member Author

There's still a performance issue here, we just decided not to make any semantic changes.

@StefanKarpinski StefanKarpinski modified the milestones: 1.x, 1.0 Aug 3, 2017
roseteague referenced this issue in jarvist/TheDancer.jl Jul 6, 2018
@DilumAluthge DilumAluthge removed this from the 1.x milestone Mar 13, 2022
@ViralBShah
Copy link
Member

With typed globals in 1.8, I am hoping it is ok to close this.

https://julialang.org/blog/2022/08/julia-1.8-highlights/#typed_globals

@antoine-levitt
Copy link
Contributor

Can you keep it open? While I understand it's not a priority, it would be awesome if something like

Performance could be fixed without changing the current semantics with clever on-stack-replacement

suggested by @StefanKarpinski could eventually be implemented to make quick&dirty script writing performant.

@ViralBShah ViralBShah reopened this Sep 6, 2022
@StefanKarpinski
Copy link
Member Author

Could we union-split in the current type of globals at the time of method generation? I.e. every global access implicitly looks like this:

if typeof(G) === typeG
    # fast path
else
    # slow path
end

That doesn't involve any code invalidation. It trades slowness for a bit of generated code bloat.

@vtjnash
Copy link
Member

vtjnash commented Dec 1, 2022

We could, but it is probably beyond pointless to the level of actually slowing down the code, since there typically is no actual cost in the dispatch here, but there is usually cost in using the return value due to the boxing and dispatch implied in not being able to ascertain the type of that PhiNode.

@StefanKarpinski
Copy link
Member Author

So is there anything we could realistically do to make globals faster at this point? We have typed globals now, which solves much of the original issue (either make your globals const or typed). The only thing I can think of at this point would be assuming type-constness of all globals initially and invalidating and code generated on that assumption if a global is assigned a different type.

@vtjnash
Copy link
Member

vtjnash commented Dec 1, 2022

I think the main remaining task we were thinking of doing for this issue is to world-age partition const bindings, so that you can use them more frequently, and still be able to rely on Revise to change them later (this is usualaly most especially relevant for struct definitions though, which already have an implied const to avoid this "awful globals performance" issue, but which run into difficulties then with modifying the source code later when doing development)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

No branches or pull requests