Significant slow down when executed in quick succession #34

Closed
tncks0121 opened this issue Aug 20, 2017 · 12 comments

@tncks0121

tncks0121 commented Aug 20, 2017

I am reopening issue #29 because I am really concerned about it. On my server, the delay is about 0.4-0.5 s, which is significant in contests where the usual time limit is 1 second.

Before I saw that issue, I had decided to use a different sandbox just for C/C++ (while keeping isolate for the other languages, since they are slow anyway), but it would be better for me to use isolate for all languages.

I'm not sure whether this is a kernel issue. We checked the time of successive isolate executions on two machines on Linode, using this bash script:

sumtime=0
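# run 10 isolate cycles, timing only the --run step; box ids cycle through i mod 100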
for ((i=1; i<=10; i++)); do
   isolate --init --box-id=$(echo $i % 100 | bc) >/dev/null
   /usr/bin/time -f %e -o /tmp/memotime isolate --run -b $(echo $i % 100 | bc) -- /bin/echo 1>/dev/null 2>/dev/null
   isolate --cleanup -b $(echo $i % 100 | bc) >/dev/null
   echo iteration $i : $(cat /tmp/memotime)
   sumtime=$(echo $sumtime + $(cat /tmp/memotime) | bc)
done
echo total time = $sumtime

Machine 1: working badly

root@ubuntu:~# uname -a
Linux ubuntu 4.9.36-x86_64-linode85 #1 SMP Thu Jul 6 15:31:23 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
root@ubuntu:~# bash test.sh
iteration 1 : 0.06
iteration 2 : 0.49
iteration 3 : 0.50
iteration 4 : 0.52
iteration 5 : 0.47
iteration 6 : 0.45
iteration 7 : 0.48
iteration 8 : 0.48
iteration 9 : 0.43
iteration 10 : 0.46
total time = 4.34

Machine 2: working nicely

root@ubuntu:~# uname -a
Linux ubuntu 4.9.36-x86_64-linode85 #1 SMP Thu Jul 6 15:31:23 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
root@ubuntu:~# bash test.sh
iteration 1 : 0.01
iteration 2 : 0.04
iteration 3 : 0.06
iteration 4 : 0.04
iteration 5 : 0.06
iteration 6 : 0.06
iteration 7 : 0.04
iteration 8 : 0.07
iteration 9 : 0.06
iteration 10 : 0.06
total time = .50

I also ran the script in a Docker container (on my Mac): working badly

root@8b518e8841a9:~# uname -a
Linux 8b518e8841a9 4.9.36-moby #1 SMP Wed Jul 12 15:29:07 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
root@8b518e8841a9:~# bash test/m.sh
iteration 1 : 0.08
iteration 2 : 0.97
iteration 3 : 0.90
iteration 4 : 0.91
iteration 5 : 0.89
iteration 6 : 0.91
iteration 7 : 0.87
iteration 8 : 0.94
iteration 9 : 0.96
iteration 10 : 0.99
total time = 8.42

Maybe the difference is just Linode's fault, but I think it is notable.

@hermanzdosilovic
Contributor

Hey @tncks0121, why do you care about the run time of isolate itself? Isn't only the run time of the user's program important? 😕

@tncks0121
Author

@hermanzdosilovic, yes, that is correct, but suppose you are running a contest where each submission needs at least 0.4 × 100 = 40 seconds of pure sandbox overhead (say, 100 test cases at ~0.4 s each), even when the solution itself is trivially fast. In that situation the queue grows very long very quickly unless you have a lot of workers.
Also, I was planning to use isolate not only for cms but for judging in general (where users can see how far judging has progressed), so speed was important.

@hermanzdosilovic
Contributor

hermanzdosilovic commented Aug 21, 2017 via email

@bblackham
Contributor

Hi! I can't reproduce this on any of my machines, VMs or physical.

Can you try the following:

  • strace the isolate --run command by prefixing it with sudo strace -u $USER -ttTfo /tmp/strace-$(echo $i % 100 | bc) to generate one strace log per invocation, and see whether any particular syscalls in each invocation take the majority of the time (a sketch of the wrapped loop is at the end of this comment).
  • Assuming what is suggested by this comment is true, namely that wait() or exit() is the culprit, the strace will hopefully show that. The next debugging step would be to use oprofile over maybe 100 runs (with kernel symbols enabled) and see if there are any obvious hotspots in the kernel.

Or if you are able to provide a shell to a machine that can reproduce this, I'd be happy to investigate. (It probably requires root to diagnose though).
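
For concreteness, the loop body from the test script above with the strace prefix applied might look roughly like this (a sketch; the $box helper variable and the output path are just illustrative):

for ((i=1; i<=10; i++)); do
   box=$(echo $i % 100 | bc)
   isolate --init --box-id=$box >/dev/null
   # one log per invocation: -tt timestamps, -T per-syscall time, -f follow children, -o output file
   sudo strace -u $USER -ttTfo /tmp/strace-$box isolate --run -b $box -- /bin/echo >/dev/null 2>&1
   isolate --cleanup -b $box >/dev/null
done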

@tncks0121
Author

@bblackham, I did what you suggested, and it seems this is the bottleneck. Unfortunately I don't know how to read these files, so I'm not sure what the problem is:

16020 06:55:34.505601 clone( <unfinished ...>
16037 06:55:34.864662 getpid( <unfinished ...>
16020 06:55:34.864715 <... clone resumed> child_stack=0x7ffe16ee62b0, flags=CLONE_NEWNS|CLONE_NEWIPC|CLONE_NEWPID|CLONE_NEWNET|SIGCHLD) = 16037 <0.359109>
16037 06:55:34.864733 <... getpid resumed> ) = 1 <0.000027>

strace-1.txt
strace-2.txt

@bblackham
Contributor

@tncks0121, that clone() definitely looks to be the culprit. No user code is executing between the start of the clone() and its completion. This points at the Linux kernel and something about cloning a task into a new namespace. Are you able to run operf on the affected machines? I'm not certain that operf will work in certain types of VM (I think it requires direct hardware access to the MSRs).

That sort of latency would have to be some kind of network traffic (maybe there is a small amount of buffering, which is why the first run is okay?), a lot of memory zeroing (like, gigabytes, which would be strange), evicting something to swap, or dropping some caches. Perhaps it is triggering a call out to a really slow userspace helper. If operf doesn't provide any information, I don't know how to diagnose this further without being able to reproduce it locally. Can you help me reproduce it locally, or provide a shell to somewhere it is reproducible?
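
If it helps, a rough sketch of profiling many back-to-back runs system-wide with operf might look like this (the --vmlinux path is distro-dependent and only an assumption; drop it if you don't have an uncompressed kernel image with symbols, at the cost of kernel symbol resolution):

# start system-wide profiling in the background (requires root)
operf --system-wide --vmlinux /usr/lib/debug/boot/vmlinux-$(uname -r) &
OPERF_PID=$!
# exercise the slow path ~100 times
for ((i=1; i<=100; i++)); do
   isolate --init --box-id=0 >/dev/null
   isolate --run -b 0 -- /bin/echo >/dev/null 2>&1
   isolate --cleanup -b 0 >/dev/null
done
# stop the profiler and show the hottest symbols
kill -INT $OPERF_PID; wait $OPERF_PID
opreport --symbols | head -n 30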

@tncks0121
Author

@bblackham, I tried, but it doesn't seem to give any useful information. Maybe I did it the wrong way, since I don't know much about operf.

root@ubuntu:~# operf  ./isolate --run -- /bin/echo 1>/dev/null 2>/dev/null
root@ubuntu:~# opreport
Using /root/oprofile_data/samples/ for samples directory.


WARNING! Some of the events were throttled. Throttling occurs when
the initial sample rate is too high, causing an excessive number of
interrupts.  Decrease the sampling frequency. Check the directory
/root/oprofile_data/samples/current/stats/throttled
for the throttled event names.


WARNING: Lost samples detected! See /root/oprofile_data/samples/operf.log for details.
CPU: Intel Broadwell microarchitecture, speed 2299.99 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
CPU_CLK_UNHALT...|
  samples|      %|
------------------
        7 53.8462 echo
	CPU_CLK_UNHALT...|
	  samples|      %|
	------------------
	        7 100.000 kallsyms
        6 46.1538 isolate
	CPU_CLK_UNHALT...|
	  samples|      %|
	------------------
	        6 100.000 kallsyms

Anyway, I'll try to find a way to reproduce this on a fresh machine (both machines I tested are already in production use).

@SemaiCZE

I think the absolute execution speed depends on the machine; only the first one or two iterations are significantly faster (about 6×). This can be seen on both the "good" and the "bad" machine: the good one starts at 0.01 and then settles around 0.06, while the bad one starts at 0.06 and then settles around 0.48, which is roughly the same relative slowdown in both cases.

My measurement on CentOS VPS:

iteration 1 : 0.00
iteration 2 : 0.00
iteration 3 : 0.06
iteration 4 : 0.06
iteration 5 : 0.06
iteration 6 : 0.05
iteration 7 : 0.07
iteration 8 : 0.06
iteration 9 : 0.06
iteration 10 : 0.06
total time = .48

@bblackham
Contributor

Right, I can reproduce it under Docker here. The killer is creating a separate network namespace. If you pass --share-net to isolate --run, you should see that it runs significantly faster (but you won't have any network isolation). I suspect it is the same kernel issue reported here and here: the netns cleanup operations are batched and very expensive. The patch in the latter link made it into Linux 4.12, so you might see some speed-up with a newer kernel version.
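
In terms of the test script above, that just means adding the flag to the run line, e.g. (a sketch; box id 0 is arbitrary):

box=0
isolate --init --box-id=$box >/dev/null
# --share-net keeps the box in the parent's network namespace,
# avoiding the expensive netns setup/teardown (but giving up network isolation)
isolate --run --share-net -b $box -- /bin/echo >/dev/null 2>&1
isolate --cleanup -b $box >/dev/null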

@stefano-maggiolo

I confirm that --share-net makes the timing difference disappear completely on my laptop.

@bblackham
Contributor

bblackham commented Aug 21, 2017

Thanks for confirming @stefano-maggiolo. Some extra data points:

I never saw it in my earlier VM tests because I was running a single-CPU VM (where the RCU slowdown issues never occur). On a dual-CPU VM, with no iptables modules loaded, there is no issue. But as soon as I run iptables -t nat -L, the various NAT connection-tracking modules get loaded, and these cause a major slowdown. If I rmmod nf_conntrack (after removing the modules that depend on it), the slowdown disappears again.

For @tncks0121 and anyone else affected, try blacklisting nf_conntrack (add blacklist nf_conntrack to some file in /etc/modprobe.d/ and reboot); a sketch is below. That should be all there is to it, assuming your iptables firewall doesn't require the connection-tracking modules. If it's still a problem, please paste the output of cat /proc/modules and uname -r.
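
Something along these lines (the file name under /etc/modprobe.d/ is arbitrary):

# stop nf_conntrack from being auto-loaded; the file name is arbitrary
echo "blacklist nf_conntrack" > /etc/modprobe.d/blacklist-nf_conntrack.conf
reboot
# after the reboot, confirm the module is gone and note the kernel version
grep conntrack /proc/modules
uname -r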

@bblackham
Contributor

Closing this issue as I believe it is definitely a Linux kernel bug and there's nothing isolate (or any sandbox that uses Linux network namespaces for network isolation) can do about it. A potential workaround is given in my previous comment (blacklisting nf_conntrack).

If this workaround solves the issue for you, please confirm here for posterity. Thanks!
