Performance Issues with Routinator Containers #42

Open

mark-hgb opened this issue May 13, 2024 · 6 comments

@mark-hgb

I am running a 40-AS Mini-Internet here: four regions with 10 ASes each, consisting of 2 tier-1 and 2 stub ASes (fully configured) plus 6 tier-2 ASes (managed by the students). So a "classic" setup, I would suppose. All config files are attached:

40_as_config.zip

Currently the whole intra-domain part is done, the eBGP sessions are configured and running, and the business relationships and IXPs are set up too. The connection matrix shows full connectivity; some paths are still invalid due to route leaks caused by mishandled business relationships. The RPKI part has not been done yet.

Now I have run into some serious trouble. I am observing heavy and rising load on the virtual machine (VM) running the Mini-Internet. The VM has 16 CPU cores, all of them are at between 95 and 100 % load, the load average is between 55 and 70, and memory consumption is at around 44 GB of 64 GB in total. Here is the current output of htop on this VM:

(screenshot: minet_load)
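For anyone who wants to reproduce this measurement, here is a minimal sketch that snapshots per-container CPU usage and prints the heaviest containers. It assumes the Mini-Internet containers are managed by Docker and that the Docker CLI is available on the VM; the cut-off of ten containers is arbitrary.

```python
#!/usr/bin/env python3
"""Snapshot per-container CPU usage and print the heaviest containers.

A minimal sketch, assuming the containers run under Docker and the
Docker CLI is available on the VM.
"""
import subprocess

def top_containers(n=10):
    # One-shot snapshot; --no-stream avoids the interactive refresh loop.
    out = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.Name}} {{.CPUPerc}}"],
        capture_output=True, text=True, check=True,
    ).stdout
    rows = []
    for line in out.splitlines():
        name, cpu = line.rsplit(maxsplit=1)
        rows.append((float(cpu.rstrip("%")), name))
    for cpu, name in sorted(rows, reverse=True)[:n]:
        print(f"{cpu:6.1f}%  {name}")

if __name__ == "__main__":
    top_containers()
```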

Some deeper analysis shows that a big part of this heavy load seems to originate from the routinator processes in the 40 ASes (look at the TIME column in the ps output):

(screenshot: routinator_load)
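To quantify this without screenshots, the following sketch sums up the accumulated CPU time per routinator process on the VM. It only assumes that the string "routinator" appears in the command line of the relevant processes, as in the ps output above.

```python
#!/usr/bin/env python3
"""List routinator processes on the VM, sorted by accumulated CPU time."""
import subprocess

def cputime_to_seconds(value):
    # ps prints TIME as [[dd-]hh:]mm:ss
    days, _, rest = value.partition("-") if "-" in value else ("0", "", value)
    parts = [int(p) for p in rest.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)
    h, m, s = parts
    return int(days) * 86400 + h * 3600 + m * 60 + s

out = subprocess.run(
    ["ps", "-eo", "pid,time,args"],
    capture_output=True, text=True, check=True,
).stdout
rows = []
for line in out.splitlines()[1:]:          # skip the header line
    pid, cputime, args = line.split(maxsplit=2)
    if "routinator" in args:
        rows.append((cputime_to_seconds(cputime), pid, args))
for secs, pid, args in sorted(rows, reverse=True):
    print(f"{secs:>8}s  pid={pid}  {args[:80]}")
```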

The processes with the most CPU time are the ones in the fully configured tier-1 and stub ASes. Looking at one of the affected containers (group 12, tier-1, routinator running on the host at router GRZ), ps shows the following:

(screenshot: g12_routinator)

Using strace on one of the routinator processes on the VM shows that, if I read it correctly, the routinator process is spawning a lot of new child processes "doing things". I have attached a file with the strace output:

g18_grz_host_routinator_trace.txt
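For what it is worth, Routinator launches the system rsync binary as a child process to fetch rsync-based repositories, so many short-lived children in the trace would be consistent with that. Below is a small sketch that summarises the attached trace by counting process-creation syscalls and tallying which binaries get exec'ed; the file name is simply the attachment from this comment.

```python
#!/usr/bin/env python3
"""Count process-creation syscalls in an strace log and tally exec'ed binaries."""
import re
from collections import Counter

TRACE = "g18_grz_host_routinator_trace.txt"   # attachment from this issue

spawn_calls = Counter()
execed = Counter()
with open(TRACE) as fh:
    for line in fh:
        m = re.search(r"\b(clone|clone3|fork|vfork|execve)\(", line)
        if not m:
            continue
        spawn_calls[m.group(1)] += 1
        if m.group(1) == "execve":
            target = re.search(r'execve\("([^"]+)"', line)
            if target:
                execed[target.group(1)] += 1

print("process-creation syscalls:", dict(spawn_calls))
print("most frequently exec'ed binaries:")
for path, count in execed.most_common(10):
    print(f"  {count:5d}  {path}")
```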

Any ideas what is going wrong here?

Thanks for your help in advance!
Markus

@mark-hgb
Author

Since the performance problems were becoming pressing, I fixed them for now by restarting and reconfiguring the routinator containers. At the moment the load is back to the expected level.

I did some further testing by recording pcaps on the routinator and krill containers. The rsync connections between the routinator and the krill containers generate around 7-8 MB (!) of traffic in about one minute.
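In case someone wants to compare numbers, here is a minimal sketch that sums the rsync bytes in such a capture using scapy. The TCP port 873 filter (the default rsync daemon port) and the pcap file name are assumptions on my side; adjust them to match the actual capture.

```python
#!/usr/bin/env python3
"""Sum up rsync traffic in a pcap recorded on the routinator container."""
from scapy.all import rdpcap, TCP

RSYNC_PORT = 873                      # default rsync daemon port (assumed)
PCAP = "routinator_capture.pcap"      # hypothetical file name

total = 0
packets = rdpcap(PCAP)
for pkt in packets:
    if pkt.haslayer(TCP) and RSYNC_PORT in (pkt[TCP].sport, pkt[TCP].dport):
        total += len(pkt)

duration = float(packets[-1].time - packets[0].time) if len(packets) > 1 else 0.0
print(f"rsync bytes: {total} ({total / 2**20:.1f} MiB) over {duration:.0f}s")
```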

@NotSpecial
Contributor

Hey, thanks for investigating this.

Do you know if this traffic between Routinator and Krill always happens, or was that a one-off thing?

@mark-hgb
Author

Hi Alex,
thanks for getting back! It happened in mid-May and I solved it by restarting all the routinator containers (as mentioned above). It happened again around the start of June but then disappeared without me doing anything. And it is happening again at the moment. How can I assist in diagnosing the root cause?
Best regards
Markus

@NotSpecial
Contributor

Unfortunately I am not very familiar with routinator. @KTrel, do you have any idea what could be causing these overheads?

@mark-hgb, do you have any idea whether some specific update commands or anything else might be prompting the overheads, or whether it's just regular routinator operations?

@mark-hgb
Author

I am sorry, but I am not able to shed any more light on this. It all started even before the students had configured sync sessions with the routinator containers on their BGP routers. So at that point the routinator containers were only configured with an IP address and a gateway, and were therefore able to communicate and sync with the krill host. With that in mind, I would say it all happened during regular routinator operations.

@NotSpecial
Contributor

Still, thank you for the report. We'll see if we can find anything.
