App using lots of CPU #854
I think this can be closed now; it's the probe using CPU.
While testing the 0.12 candidate (d0d6cac) with the ECS demo, I see that the app on one of the three hosts is consuming ~50% CPU with peaks of ~75% CPU. The other apps are running at less than 10% CPU. The app showing the problem is the only one whose UI I had open in my browser.

It doesn't happen systematically: I managed to reproduce it once yesterday and spent the rest of the day retrying without luck, until I saw it happen again today.

Here are two CPU profiles: http://filebin.ca/2V8B4tkAgwWP/pprof.ec2-52-28-149-93.eu-central-1.compute.amazonaws.com4040.samples.cpu.001.pb.gz

And here's the png from one of them, which suggests it's the GC.

Here's the memory consumption profile: http://filebin.ca/2V8F07Fc8QZr/pprof.ec2-52-28-149-93.eu-central-1.compute.amazonaws.com4040.inuse_objects.inuse_space.001.pb.gz

Note: this is happening on a node whose Scope probe is experiencing a bad memory leak (#881). At the time I took the measurements the probe was consuming ~176MB, which may be related.
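For context on how profiles like the ones above can be captured, here is a minimal sketch (not Scope's actual wiring, and the Report/handler names are hypothetical) of a Go program exposing the standard net/http/pprof endpoints on port 4040, the port that appears in the profile filenames:

```go
package main

import (
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
)

func main() {
	// Once the server is up, a 30-second CPU profile can be pulled with:
	//   go tool pprof http://<host>:4040/debug/pprof/profile
	// and an in-use memory profile with:
	//   go tool pprof http://<host>:4040/debug/pprof/heap
	http.ListenAndServe(":4040", nil)
}
```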
This is the profile of the App (2ed3968) obtained at 70% CPU while monitoring the 5 probes in the scope.weave.works service: http://filebin.ca/2VkNcTTMcyob/pprof.localhost_4040.samples.cpu.001.pb.gz
Same behavior as above. After taking a closer look at the profile, the app spends most of its time:
I am not super familiar with the App; I guess we moved to gobs for efficiency reasons. Also, I am totally new to the gob encoding (so the following may be stupid and wrong), but I think it's optimized for continuous streamed communication and not for individual REST calls. From http://blog.golang.org/gobs-of-data:
Here's what I think is happening: the type information is included in every report, causing the gob decoder (and maybe the encoder too, I still have to check the probe profile) to be created (compiled) every single time an HTTP request arrives. So I would suggest we either:
If we take (2), maybe we could use a decoder which uses a pool of Reports, also fixing the GC problem.
Maybe there's even a 4th, quick and hacky option: (4) create a reader which concatenates all the HTTP bodies for a cached decoder. But mixing gobs from different probes this way may be problematic, or even make it useless, due to the decoder's state.
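To make the suspected problem concrete, here is a minimal sketch of the per-request gob decoding pattern described above (the Report type and handler are hypothetical stand-ins, not Scope's real code). Each request creates a fresh gob.Decoder, which has to process the type description that the probe sends at the start of every report stream:

```go
package main

import (
	"encoding/gob"
	"net/http"
)

// Report is a hypothetical stand-in for Scope's real report type.
type Report struct {
	Hosts map[string]string
}

// handleReport decodes one report per HTTP request. Because every request
// body is a brand-new gob stream, the type information is transmitted and
// processed again each time, instead of once per long-lived connection.
func handleReport(w http.ResponseWriter, r *http.Request) {
	var rpt Report
	dec := gob.NewDecoder(r.Body) // new decoder per request
	if err := dec.Decode(&rpt); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// ... merge rpt into the app's state ...
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/api/report", handleReport)
	http.ListenAndServe(":4040", nil)
}
```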
This seems to confirm my theory; I should probably have started there :) https://golang.org/pkg/encoding/gob/
I'd suggest trying JSON first, then websockets second. You're right, gob is a bad fit for this.
Using JSON (#916) improves the App's situation considerably without impacting the probes. The app is now at 30% CPU with peaks of 40%. However, decoding completely dominates the execution time: pprof.localhost:4040.samples.cpu.001.pb.gz

Plus (and this is an important point I didn't mention before) this all happens without even opening the UI in a browser; when I do open it, CPU peaks at ~70%. I could try websockets+gob next, but I think we should just aim for a better-suited codec instead of compromising on REST.
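For comparison with the gob sketch above, the JSON path would look roughly like this (again with a hypothetical Report type, not the actual #916 diff); encoding/json caches its per-type field metadata inside the package, so there is nothing comparable to gob's per-stream type setup on each request:

```go
package main

import (
	"encoding/json"
	"net/http"
)

// Report is a hypothetical stand-in for Scope's real report type.
type Report struct {
	Hosts map[string]string `json:"hosts"`
}

// handleReport decodes one JSON report per HTTP request; the reflection-based
// field metadata is computed once per type and reused across requests.
func handleReport(w http.ResponseWriter, r *http.Request) {
	var rpt Report
	if err := json.NewDecoder(r.Body).Decode(&rpt); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// ... merge rpt into the app's state ...
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/api/report", handleReport)
	http.ListenAndServe(":4040", nil)
}
```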
Using ffjson without code generation (see #916) helps a bit. The app is now down to ~25% CPU with peaks of 35%. It also seems to help when connecting the UI, where it peaks at ~60% CPU. I guess this already starts to make Scope usable with the service, but I think we still want a better-performing decoder. I will try code generation with ffjson, or a different codec.
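As an illustration of what "code generation with ffjson" could look like (a sketch with a hypothetical report package, not the actual Scope change), the type gets a go:generate directive and the ffjson tool emits a companion _ffjson.go file with generated marshal/unmarshal methods:

```go
//go:generate ffjson $GOFILE
package report

// Report is a hypothetical stand-in for the real report type. Running
// `go generate` with the ffjson binary on the PATH should produce a
// generated file (e.g. report_ffjson.go) containing MarshalJSON and
// UnmarshalJSON methods that avoid most of encoding/json's reflection work.
type Report struct {
	Hosts map[string]string `json:"hosts"`
}
```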
Apart from optimizing the codecs, @bboreham made a fair point that we need to reduce the report sizes/transfer rates to the apps. I measured the size of the (uncompressed) gob reports and they are around 1MB each.

Probes send regular reports once every 3 seconds (and we also have asynchronous reports once every 10 seconds from the Kubernetes and Docker reporters), so the data-processing requirements of the app very quickly get out of hand.
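To put rough numbers on that (using the ~1MB figure above and the 5 probes mentioned earlier in this thread): 5 probes × ~1MB every 3 seconds is on the order of 1.7MB/s of uncompressed report data for the app to decode and merge, before even counting the asynchronous Kubernetes and Docker reports.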
After the codec improvements, here's the CPU profile of the app running in the service, with two probes attached and its UI connected in a browser: pprof.localhost:4040.samples.cpu.001.pb.gz

The main bottlenecks are similar to the ones in the probe (#812 (comment)):
After #1000 the codec is generating almost no garbage compared to the immutable data structures, copies and merge operations. The GC is clearly the bottleneck.

CPU profile: pprof.localhost:4043.samples.cpu.004.pb.gz

Object allocation profile: pprof.localhost:4043.alloc_objects.alloc_space.001.pb.gz

Note how cons generates 20% of the garbage.
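For reference, here is a minimal sketch of the "pool of Reports" idea floated earlier as option (2), using a hypothetical Report type with a Reset method; whether Scope's real report lifecycle allows safe reuse like this is a separate question:

```go
package report

import "sync"

// Report is a hypothetical stand-in for the real report type.
type Report struct {
	Hosts map[string]string
}

// Reset clears the report in place so it can be reused.
func (r *Report) Reset() {
	for k := range r.Hosts {
		delete(r.Hosts, k)
	}
}

var reportPool = sync.Pool{
	New: func() interface{} { return &Report{Hosts: map[string]string{}} },
}

// Get hands out a recycled Report so each incoming request does not have to
// allocate a fresh one for the garbage collector to reclaim afterwards.
func Get() *Report { return reportPool.Get().(*Report) }

// Put returns a Report to the pool once the app has merged it into its state.
func Put(r *Report) {
	r.Reset()
	reportPool.Put(r)
}
```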
Closing in favor of #1010.