
Improving package load times #4373

Closed
amitmurthy opened this issue Sep 26, 2013 · 42 comments
Labels
performance Must go faster

Comments

@amitmurthy
Contributor

Some background first:

  • AWS.jl (https://github.com/amitmurthy/AWS.jl) has 500+ type definitions and 900+ function definitions.
  • The bulk of the code is currently pre-generated from the WSDL for EC2. The S3 code is handwritten, since the S3 WSDL does not map cleanly onto the S3 REST API - but I could write a spec for generating the S3 API too. The generated code is around 11,000 lines.
  • @loladiro has provided a patch which moves the code generation to load time - Reorganization of Code Generation JuliaCloud/AWS.jl#6.
  • However, since we still have to process the huge number of types/functions at load time, AWS.jl still takes 10 seconds to load on my fairly current laptop - and longer on lower-spec machines.
  • One suggestion has been to expose only a higher-level, better-abstracted EC2 API. In my experience, this does not work for real-world apps, where you want the full power of the raw EC2 API to do everything from tagging resources to filtering them to mounting volumes programmatically. It is not just about starting and stopping machines.
  • The other suggestion is a simpler generic API that takes all parameters as a key-value dict and returns a generic XML object for the user to parse. But having specific functions and input/output types for each AWS operation makes user code simpler, more concise, easier to read, and less prone to typos (see the sketch after this list).
  • That said, a typical use of AWS.jl may only touch 2-3% of the APIs - we just do not know in advance which 2-3% of the types/functions are required.
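To make the trade-off concrete, here is a sketch of the two styles - every name below is hypothetical and stubbed out so the snippet stands alone:

```julia
# Sketch of the two API styles; all names are hypothetical stubs.
ec2_generic(action::String, params::Dict) = "<xml>...</xml>"   # stub: raw XML back

struct FilterType
    name::String
    values::Vector{String}
end
struct DescribeInstancesType
    filterSet::Vector{FilterType}
end
describe_instances(req::DescribeInstancesType) = req            # stub: typed response

# Generic style: one entry point, stringly-typed parameters, raw XML out.
xml = ec2_generic("DescribeInstances",
                  Dict("Filter.1.Name" => "tag:env", "Filter.1.Value.1" => "prod"))

# Generated style: a dedicated function and request type per operation;
# a typo in a field name is an error instead of a silently ignored parameter.
req = DescribeInstancesType([FilterType("tag:env", ["prod"])])
resp = describe_instances(req)
```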

To improve the above, I just wanted to sound out whether either of the following approaches is feasible/makes sense:

Approach 1

  • Base provides a function syms_on_demand(syms::Vector{Symbol}, load_sym_cb::Function).
  • At load time, a module file calls syms_on_demand with a list of symbols that it wants defined/loaded only when used. This list is recorded by the julia interpreter.
  • load_sym_cb(s::Symbol) is a callback that executes an appropriate @eval for the specified symbol.
  • Upon encountering an undefined symbol (or a defined symbol that also appears in a syms_on_demand list), the julia interpreter executes the callback to define it, removes it from its internal syms_on_demand list, and then proceeds with multiple dispatch as usual (see the sketch below).
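A minimal sketch of how a package might use the proposed hook - syms_on_demand does not exist, and call_ec2 is a hypothetical helper, so this only illustrates the shape of Approach 1:

```julia
module AWS

function _load_sym(s::Symbol)
    # Invoked the first time `s` is referenced: generate and evaluate
    # just the one definition that was requested.
    if s === :RunInstancesType
        @eval struct RunInstancesType
            instanceType::String
            minCount::Int
            maxCount::Int
        end
    elseif s === :RunInstances
        @eval RunInstances(env, req) = call_ec2(env, "RunInstances", req)
    end
end

# Register the lazy symbols at load time; none of the ~11,000 generated
# lines are evaluated until one of these names is actually used.
syms_on_demand([:RunInstancesType, :RunInstances], _load_sym)

end # module
```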

Approach 2

  • A new macro @eval_on_demand <symbols> <code block> does the same as above: it registers the symbols (and associates them with the particular module), but does not evaluate the code block until required (sketched below).
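The same example under Approach 2 - again purely hypothetical, since the macro does not exist:

```julia
# The macro records the symbols for this module and keeps the block as an
# unevaluated expression; it is @eval'd only when one of the names is used.
@eval_on_demand RunInstancesType RunInstances begin
    struct RunInstancesType
        instanceType::String
        minCount::Int
        maxCount::Int
    end
    RunInstances(env, req) = call_ec2(env, "RunInstances", req)
end
```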

I am not familiar with the intricate details/issues around code generation, but thought I'd put this up for discussion. Any alternative suggestions for improving the load time of AWS.jl are also welcome.

@JeffBezanson
Sponsor Member

I'm against adding features to deal with this. We just have to make it faster.

@JeffBezanson
Sponsor Member

Features like pre-compiling things are reasonable, however.

@Keno
Member

Keno commented Sep 26, 2013

I'm with Jeff on this one. I also disagree that code generation is a good way to go about wrapping APIs, but that's a different can of worms. We should just get to static pre-compilation already ;).

@JeffBezanson
Sponsor Member

At least a couple of seconds (I would guess maybe 3-4 out of the 10) are spent just in the front end. We could create __jlcache__ directories in packages to hold binary pre-processed representations, without opening the full can of worms of static compilation.
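A minimal sketch of the idea, using today's API for illustration - it caches parsed expressions rather than the lowered representation a real __jlcache__ would hold, but the staleness check is the same:

```julia
using Serialization

function load_with_cache(srcfile::AbstractString)
    cachefile = joinpath(dirname(abspath(srcfile)), "__jlcache__",
                         basename(srcfile) * ".jls")
    # Refuse to run from the cache alone (see the .pyc lesson below).
    isfile(srcfile) || error("source file missing: ", srcfile)
    exprs = if isfile(cachefile) && mtime(cachefile) >= mtime(srcfile)
        deserialize(cachefile)                 # cache hit: skip the front end
    else
        src = read(srcfile, String)
        parsed = Any[]
        pos = firstindex(src)
        while pos <= lastindex(src)            # parse one top-level expr at a time
            ex, pos = Meta.parse(src, pos)
            ex === nothing || push!(parsed, ex)
        end
        mkpath(dirname(cachefile))
        serialize(cachefile, parsed)           # write the cache for next time
        parsed
    end
    for ex in exprs
        Core.eval(Main, ex)
    end
end
```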

@ivarne
Sponsor Member

ivarne commented Sep 26, 2013

Similar to Python's .pyc files?

Please don't make a jlcache file usable without the source file unless the cache file has been renamed. I once spent 3 hours figuring out why a Python file caused trouble after it was deleted.

@StefanKarpinski
Sponsor Member

Please don't make a jlcache file usable without the source file unless the cache file has been renamed. I once spent 3 hours figuring out why a Python file caused trouble after it was deleted.

Don't worry – we've all been bitten by that and will not make the same mistake.

@nhodas

nhodas commented Sep 27, 2013

The LLVM blog describes using the MCJIT to cache and pre-compile objects (http://blog.llvm.org/2013/08/object-caching-with-kaleidoscope.html).

"However, MCJIT provides a mechanism for caching generated object images. Once we’ve compiled a module, we can store the image and never have to compile it again. This is not available with the JIT execution engine and gives MCJIT a significant performance advantage when a library is used in multiple invocations of the program."

If the MCJIT and JIT can talk to each other, would this be a promising route?

@ihnorton
Member

Please see #260 #3922 #3892 (among others) for all the previous discussion of this topic.

@Keno
Member

Keno commented Sep 27, 2013

Yes, that's the idea. The problem here is not that we don't know what needs to be done, we do. The problem is that it is a significant amount of work and nobody has done it yet.

@quinnj
Member

quinnj commented Jun 4, 2014

It seems like several of the issues discussed here are already open elsewhere. For package loading times, there is the usable, though fairly undocumented, technique of precompiling packages via userimg.jl. Forum thread.

What else would be needed to close this issue? userimg.jl documentation?
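For reference, the userimg.jl technique amounts to listing packages in a file that is evaluated while the system image is built, then rebuilding Julia from source (package names illustrative; the exact form has varied across versions):

```julia
# base/userimg.jl -- evaluated during the system-image build, so anything
# loaded here is compiled into the sysimg and costs ~nothing at load time.
# (On 0.3-era Julia, require() took the package name as a string.)
require("Gadfly")
require("DataFrames")
```

A `make` from the Julia source tree then rebuilds the system image with those packages baked in.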

@tknopp
Contributor

tknopp commented Jun 4, 2014

I don't think this can be closed yet. The userimg.jl trick requires a from-source build. If I understand correctly, this is because the linking step requires a linker to be available.

IMHO a nice solution would be for any module (or package) to be compilable into its own .so file that is cached somewhere. If this is feasible, one could auto-generate these files either when a package is installed or when it is first used.
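With hindsight: what later shipped is close to this - a per-package cache file (a serialized .ji image rather than a standalone .so), generated automatically on first load. On Julia 1.x it can also be triggered explicitly:

```julia
# Resolve the package name in the active environment and build its cache
# file explicitly ("Example" is just an illustrative package name).
id = Base.identify_package("Example")
id === nothing && error("Example is not installed in this environment")
Base.compilecache(id)   # writes the .ji cache that later `using` calls reuse
```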

@lindahua
Contributor

Requiring users to modify the Julia source (e.g., userimg.jl) is not an acceptable solution.

The system should work out of the box, which means packages should load reasonably fast without any modification to Julia base files. I agree that we should probably cache pre-compiled images and invalidate the cache whenever the source changes in any way.

@tkelman
Contributor

tkelman commented Aug 10, 2014

Can we put our heads together and figure out some form of package caching/precompilation first thing after getting LLVM 3.5 running?

@kmsquire
Member

+1

@IainNZ
Member

IainNZ commented Aug 10, 2014

+100. It's so bad that we are actually hesitant to merge a very large PR into JuMP because of the impact on loading times...

@StefanKarpinski
Sponsor Member

Something like @vtjnash's #6884 change wouldn't solve the problem, but it would help a lot by allowing packages that interact with many other packages to avoid depending on all of them. Gadfly, for example, could be much faster to load.

@IainNZ
Member

IainNZ commented Aug 10, 2014

We (JuliaOpt) are looking forward to that one too, to make our conditional solver loading code less hacky, but we're still bound mostly by our own code.
I'm not sure it'll actually help Gadfly that much, though (see the Gadfly deps graph) - most of that looks fairly necessary to me.
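For the curious, the conditional solver loading being described is roughly this pattern (shown with the Julia 1.x API; the solver name is illustrative):

```julia
# Import an optional heavy dependency only if the user has it installed --
# the fragile arrangement that first-class conditional dependencies would replace.
if Base.find_package("Gurobi") !== nothing
    import Gurobi
end
```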

@StefanKarpinski
Sponsor Member

Gadfly doesn't really need Datetime or DataFrames, which cuts out a big part of that graph. I suspect other parts aren't really required by the core of Gadfly either.

@IainNZ
Member

IainNZ commented Aug 10, 2014

Oh, if it doesn't need DataFrames then that'd be pretty cool.

@timholy
Sponsor Member

timholy commented Aug 10, 2014

As usual, if one @vtjnash PR doesn't help the problem, there's bound to be a second @vtjnash PR that will 😄. See #5061.

@StefanKarpinski
Sponsor Member

precompilation, precompile

@pao
Member

pao commented Sep 8, 2014

#7977, which was on the 0.4-projects milestone, was closed as a dup. Should this be put on the 0.4-projects milestone in its stead?

@StefanKarpinski StefanKarpinski added this to the 0.4-projects milestone Sep 8, 2014
@vtjnash vtjnash modified the milestones: 0.5, 0.4 Mar 7, 2015
@IainNZ
Member

IainNZ commented Mar 17, 2015

Is putting this on the 0.5 milestone up for discussion? I'm not doing the work myself, so I don't want to dictate priorities, but I'm kinda distressed by the idea that package loading will be this slow until... December?

@vtjnash
Sponsor Member

vtjnash commented Mar 17, 2015

sounds like we need a faster 0.4 release then?

there's more than one way to skin a cat :)

but, in fact, i tried to remove a lot of "nice to haves" from the 0.4 list specifically to help with schedule

@JeffBezanson
Sponsor Member

Please do not edit the 0.4 milestone without discussion.

@tkelman
Contributor

tkelman commented Mar 17, 2015

Should we centralize an overall scope discussion somewhere, either as an issue or on julia-dev?

@timholy
Sponsor Member

timholy commented Mar 17, 2015

My own personal opinion: of things that seem "close" for 0.4 (#8745), in terms of importance I'd say there's nothing even remotely in the same league. A substantial improvement in package loading times would be my vote for the very first item mentioned in the announcement for whatever release this makes it into (unless the debugger gets merged in the same release).

If there's stuff we can do to help, please do let us know. I guess I should start checking out #8745 and playing with it.

@timholy
Sponsor Member

timholy commented Mar 17, 2015

(I should add that multithreading might also be a competitor for the top spot...)

@tknopp
Contributor

tknopp commented Mar 17, 2015

My take: if it's 3 more months, we should release 0.4 without precompilation and make precompilation the sole goal of a 0.5 released as soon as possible. Maybe those who have the skills to finish it could comment on a realistic schedule (I can't).

@nalimilan
Member

If it's a non-breaking feature, it could even be introduced in a minor release.

@jiahao
Member

jiahao commented Mar 17, 2015

Faster package loading is no longer just a "nice to have" feature. It's frankly quite difficult to claim that Julia is a fast language when the second thing a user tries (after 1+1) is plotting something and waiting forever for Gadfly to load.

@tknopp
Contributor

tknopp commented Mar 17, 2015

@jiahao: I think we all agree on that. But we still have to do realistic release planning. It does not help to wait for a feature when it is not realistic for it to get in. So it would be good if Jameson, Jeff, and Keno made a clear decision here.

@tkelman
Contributor

tkelman commented Mar 17, 2015

That's just as much the case now as it was a year ago. We're rate-limited on implementation labor (and code review, in the case of already-open PRs) for big core features that everyone knows need to be done.

Decisions and/or plans should probably be made pretty soon on whether the 0.4.0 roadmap is going to be feature-defined or schedule-defined. If the former, by which features (and expect it to take a while); if the latter, by what target date (and expect it not to have as many finished features as everyone would like).

@tknopp
Contributor

tknopp commented Mar 17, 2015

I thought there was agreement on a time-based schedule (e.g. by @StefanKarpinski: https://groups.google.com/d/msg/julia-users/aqGvjGLVaLk/CI7p8R8XZGEJ)

@elcritch

Any updates on the current status of improving package load times? I'm interested to see if I could help with anything. My LLVM skills are getting rusty and it'd be good to have an excuse to work on them again. ;) Would that be a question to ask @vtjnash directly?

@ViralBShah
Member

@elcritch Actually, if I could convince you, a good place for LLVM skills is the Julia debugger - I believe @Keno is on the verge of outlining what remains to be done, and this may be a good reason to do so.

@vtjnash
Sponsor Member

vtjnash commented Apr 12, 2015

agreed. this issue doesn't have much to do with llvm. but see #9336 for a list of llvm36 issues to burn down. helping improve/fix debugging info would also be a huge win (tracking inline functions for line number tables and emitting the debugging symbol tables for seeing the values of local variables in lldb/gdb).

@tkelman
Contributor

tkelman commented Apr 12, 2015

Keno also has a pile of LLVM patches up for review that are moving slowly, not sure if us bumping them will make them go any faster.

@elcritch

@ViralBShah Great, it sounds like some work on the LLVM debugging info would be helpful. I will need to look into how the Julia front end handles debugging info. @vtjnash, it looks like the serialization code only serializes the AST - or did I miss how it serializes the JIT'ed code?

@ihnorton
Member

For debug info, look at "step 5" in emit_function within codegen.cpp. For serialization, start with julia_save in init.c. There are two parts to it: the JIT'd code is written by jl_dump_bitcode, and there are some helpers to serialize global values and pointers - start from jl_save_system_image in dump.c.

(Other questions should probably go to julia-dev.)

@Timmmm

Timmmm commented Jan 4, 2019

Is there an open issue for this? This is still laughably slow - even after precompilation, and even after it is already loaded!

$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.2 (2018-11-08)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> @time using Gadfly
 10.652760 seconds (18.66 M allocations: 1.003 GiB, 6.66% gc time)

julia> @time using Gadfly
  0.653415 seconds (1.09 M allocations: 52.098 MiB, 3.30% gc time)

julia> @time using Gadfly
  0.001039 seconds (283 allocations: 15.063 KiB)

This is on a fairly high-spec MacBook Pro. I never had to wait 10 seconds to make a plot in Matlab...

@JeffBezanson
Sponsor Member

There are several open issues for it; see the "latency" label.
