improve startup time #552
Hi @randallt, thank you for this analysis! We have not yet made a dedicated effort to profile SpecialAgent, but its startup time does need some attention. The crux of the performance hit is going to be the re/transformation operations that ByteBuddy -> ASM ->
One way to reduce the startup time for your own application is to trim down the codebase being packaged into the application. Spring and Spring Boot are notorious for importing everything (and the kitchen sink) into an application, regardless of whether it's just to print "Hello World".
There are a few things that could be done to improve performance:
Unfortunately, delaying the initialization by moving it onto other threads would simply result in the application starting sooner, but with the codebase only partially instrumented.
Yes, you're right! Thank you for the SVG -- it provides a good high-level breakdown of bottlenecks in SpecialAgent. I'll use it to guide my focus.
Hi @randallt, there have been a few updates to SpecialAgent concerning this issue, which include:
These changes address the relatively "low hanging fruit" items to improve startup time. There are a number of more involved items that would also help, which are:
We will release SpecialAgent v1.7.2 this weekend, which will include the former set of updates. If you have any feedback regarding your own observations, it would be welcome!
Hi @safris, only (1) above applies to my build, as I don't include the spring or lettuce rules at all. That said, this change has had a great effect. For a simple service with the tracing agent, startup time went from 20s to 13s (a 35% reduction), with the premain execution time going from 12s down to 7s (a 42% reduction). Great work, and I hope there's more where that came from!
On a larger application, this change dropped the startup time by about 20 sec, or 29%, from about 70 sec down to 50 sec. I haven't seen any change in function.
I see this issue is currently in the queue for milestone v1.7.4. Curious to know if a fix is still forthcoming? Seeing long startup times (approx. 5 minutes) with
Some more context on the startup environment:
Several of our users have complained that the startup time of their services increases severalfold when using SpecialAgent. We already use a trimmed-down version with most of the rule plugins (integrations) removed (synced to v1.7.0 of the SpecialAgent code) and with an integrated Wavefront Tracer (using their SDK, not their bundle; other tracers are removed from our build of SpecialAgent). Even so, services sometimes see a 10x or greater increase in their start time (e.g. from 3 sec to 30 sec).
On a dummy app that simply exits immediately, execution time without the agent was less than 100 ms. With the agent it is over 10 sec. Profiling/tracing the startup of the agent can be tricky (e.g. YourKit can't do this), so I took the poor man's approach: I immediately started a new thread that loops forever, sleeping 13 ms and then printing a thread dump. This was run on a 4-core VM, and top showed basically 100% (1 full core) utilized for the entire 10 sec. I created a flame graph of the resulting thread dumps using https://github.com/brendangregg/FlameGraph.
The result gives a decent view of where the time is spent.
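For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of the sampling idea described above. It is not the code actually used for this issue: the class and method names (`StackSampler`, `fold`, `record`, `start`, `report`) are hypothetical, it samples a single target thread, and instead of printing raw thread dumps it aggregates samples straight into the semicolon-separated "folded" lines that `flamegraph.pl` takes as input. Frames are reduced to `class.method` so that line numbers do not split otherwise identical stacks.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of SpecialAgent: a "poor man's" sampling profiler.
final class StackSampler {
  // Folded stack -> number of samples in which it was observed.
  private final Map<String, Integer> counts = new ConcurrentHashMap<>();

  // Joins frames root-first with ';', e.g. "a.Root.main;a.Leaf.run".
  static String fold(StackTraceElement[] frames) {
    StringBuilder sb = new StringBuilder();
    // getStackTrace() returns the innermost frame first, so walk backwards.
    for (int i = frames.length - 1; i >= 0; i--) {
      if (sb.length() > 0) sb.append(';');
      sb.append(frames[i].getClassName()).append('.').append(frames[i].getMethodName());
    }
    return sb.toString();
  }

  // Counts one sample; empty traces (thread not running) are ignored.
  void record(StackTraceElement[] frames) {
    if (frames.length > 0) counts.merge(fold(frames), 1, Integer::sum);
  }

  // Starts a daemon thread that samples `target` every `intervalMillis`
  // until the target thread dies or the sampler is interrupted.
  Thread start(Thread target, long intervalMillis) {
    Thread sampler = new Thread(() -> {
      try {
        while (target.isAlive()) {
          record(target.getStackTrace());
          Thread.sleep(intervalMillis);
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // stop sampling
      }
    }, "stack-sampler");
    sampler.setDaemon(true);
    sampler.start();
    return sampler;
  }

  // One "stack count" line per distinct stack.
  String report() {
    StringBuilder sb = new StringBuilder();
    counts.forEach((stack, n) -> sb.append(stack).append(' ').append(n).append('\n'));
    return sb.toString();
  }
}
```

Assuming the sampler is started as early as possible (for example `new StackSampler().start(Thread.currentThread(), 13)` at the top of the entry point being measured) and `report()` is written to a file from a shutdown hook, the resulting file could be passed to `flamegraph.pl` to render the SVG. Sampling at a fixed interval like this is coarse and adds some overhead of its own, so the percentages it yields are approximate.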
startup_13ms_samples_main_only.zip
(You'll need to download the file, unzip it, and view it in a browser, since GitHub doesn't support uploading SVG files directly.) But here is a screenshot as well (after clicking into the main thread):
This shows that the majority of time is spent in ByteBuddy code, which makes sense. So anything we can do to optimize ByteBuddy time will be good. But we can see that most of the time is divided among a few main pieces:
BootLoaderAgent.premain (about 25%)
AgentRule$$Access.init (about 40%)
SpecialAgent.loadAdapter (about 20%, mostly loading the Wavefront SDK stuff)
Is there any potential to utilize multiple threads across any of the 3 above?
I am a ByteBuddy novice for sure, but I will do what I can to help with this effort.
It is likely that using a better profiler would help with the data fidelity as well.