The NetBox enrichment code is by far the slowest part of the Logstash pipeline. Here's the end of the output listing all the Logstash filters, with the final column being the duration of that filter in milliseconds:
You can see that the enrichment filter is far and away the most costly. Beyond some caching, I'm not doing much in the way of optimization. We should examine the NetBox enrichment ruby filter code (linked above) and see if we can do some of the following:
- examine cache settings: do they make sense? are we getting cache misses?
- is there any sort of profiling code we can add to find the hot spots in the code?
- are there particular features (autodiscovery, regular lookups, devices, services, etc.) that are more costly than others?
All in all, speeding up that code without sacrificing functionality would probably be the biggest performance benefit we could get for Malcolm.
I wrote some profiling and cache-tracking code for the netbox_enrich.rb script, fed Malcolm 145 PCAPs (11 GB), and got the output below.
Method Performance (cumulative)

| Method | Calls | Avg (ms) | Min (ms) | Max (ms) | Total (ms) | Outliers | Avg Outlier (ms) |
|---|---|---|---|---|---|---|---|
| filter | 25,045,705 | 0.32 | 0.01 | 2,599.61 | 8,061,228.42 | 28,007 (0.1%) | 207.84 |
| netbox_lookup | 758,274 | 1.91 | 0.00 | 1,477.37 | 1,447,055.80 | 5,211 (0.7%) | 216.26 |
| lookup_devices | 5,123 | 201.56 | 26.29 | 793.43 | 1,032,571.86 | 5,011 (97.8%) | 204.80 |
| lookup_prefixes | 5,139 | 64.31 | 29.46 | 565.91 | 330,512.69 | 96 (1.9%) | 278.54 |
| lookup_or_create_site | 25,055,884 | 0.01 | 0.00 | 432.59 | 320,552.84 | 82 (0.0%) | 221.92 |
| create_device_interface | 84 | 262.81 | 159.30 | 625.18 | 22,076.34 | 84 (100.0%) | 262.81 |
| lookup_manuf | 31 | 210.11 | 0.01 | 721.75 | 6,513.29 | 21 (67.7%) | 310.15 |
| autopopulate_prefixes | 14 | 87.76 | 57.76 | 129.51 | 1,228.64 | 4 (28.6%) | 114.07 |
| lookup_or_create_role | 67 | 7.10 | 0.01 | 49.10 | 475.82 | 0 (0.0%) | 0.00 |
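
For anyone wanting to reproduce per-method figures like these, a timing wrapper along the following lines might work. This is only a minimal sketch, not the profiling code that produced the table: the `MethodProfiler` class and its `measure` and `report` methods are hypothetical names, and outlier classification is left out because the threshold used above isn't stated.

```ruby
# Hypothetical helper (not from netbox_enrich.rb): accumulates call count,
# min, max and total wall-clock time per named method.
class MethodProfiler
  Stats = Struct.new(:calls, :total_ms, :min_ms, :max_ms) do
    def avg_ms
      calls.zero? ? 0.0 : total_ms / calls
    end
  end

  def initialize
    @stats = Hash.new { |h, k| h[k] = Stats.new(0, 0.0, Float::INFINITY, 0.0) }
    @lock = Mutex.new # filters may run on several worker threads
  end

  # Time the given block with a monotonic clock and record it under `name`.
  # Returns whatever the block returns; timing is recorded even if it raises.
  def measure(name)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC, :float_millisecond)
    yield
  ensure
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC, :float_millisecond) - started
    @lock.synchronize do
      s = @stats[name]
      s.calls += 1
      s.total_ms += elapsed
      s.min_ms = elapsed if elapsed < s.min_ms
      s.max_ms = elapsed if elapsed > s.max_ms
    end
  end

  # One row per method, most expensive (by total time) first.
  def report
    @lock.synchronize do
      @stats.sort_by { |_, s| -s.total_ms }.map do |name, s|
        { method: name, calls: s.calls, avg_ms: s.avg_ms,
          min_ms: s.min_ms, max_ms: s.max_ms, total_ms: s.total_ms }
      end
    end
  end
end

# Usage: wrap the body of a method of interest in `measure`.
profiler = MethodProfiler.new
profiler.measure(:netbox_lookup) { sleep(0.01) }
profiler.report.each { |row| puts row }
```

The monotonic clock is used so that wall-clock adjustments don't distort the measurements.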
Cache Performance

| Metric | Value | Percentage |
|---|---|---|
| Total Lookups | 25,803,938 | 100% |
| Hits | 25,045,705 | 97.1% |
| Misses | 758,233 | 2.9% |
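
Hit and miss counts like these could be gathered by wrapping the cache in a small counting layer. The sketch below is an illustration only: `CountingCache`, `getset` and `hit_rate` are hypothetical names, and a plain Hash stands in for whatever cache netbox_enrich.rb actually uses.

```ruby
# Hypothetical wrapper (not from netbox_enrich.rb): counts hits and misses
# around a Hash-backed cache. A real cache would also need size/TTL limits.
class CountingCache
  attr_reader :hits, :misses

  def initialize
    @store = {}
    @hits = 0
    @misses = 0
    @lock = Mutex.new
  end

  # Return the cached value for `key`; on a miss, compute it with the block.
  # Note: the lock is held while the block runs, which keeps the sketch simple
  # but would serialize slow lookups across threads.
  def getset(key)
    @lock.synchronize do
      if @store.key?(key)
        @hits += 1
      else
        @misses += 1
        @store[key] = yield
      end
      @store[key]
    end
  end

  # Percentage of lookups served from the cache.
  def hit_rate
    total = @hits + @misses
    total.zero? ? 0.0 : 100.0 * @hits / total
  end
end

# Usage: the block stands in for an expensive lookup.
cache = CountingCache.new
%w[10.0.0.1 10.0.0.1 10.0.0.2 10.0.0.1].each do |ip|
  cache.getset(ip) { "record for #{ip}" }
end
puts format("hits=%d misses=%d hit rate=%.1f%%", cache.hits, cache.misses, cache.hit_rate)
```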
From what I'm seeing, the cache is being hit pretty reliably, and the most frequently called methods (filter and lookup_or_create_site) average well under 1 ms per call.
However, even the relatively low outlier percentage still has a significant performance impact.
The 5,211 calls to netbox_lookup that are classified as outliers, multiplied by the average outlier runtime of 216.26 milliseconds, come to 1,126,930.86 milliseconds, or roughly 18.8 minutes.
The total execution time for netbox_lookup is 1,447,055.80 milliseconds, or roughly 24.1 minutes.
That means the outliers (0.7% of method calls) account for roughly 78% of the execution time.
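
The arithmetic above can be rechecked directly from the netbox_lookup row of the table. The snippet below is just a quick sanity check; the variable names are mine, and the only inputs are the three values copied from that row.

```ruby
# Values copied from the netbox_lookup row of the Method Performance table.
outlier_calls  = 5_211
avg_outlier_ms = 216.26
total_ms       = 1_447_055.80

# Time spent in outlier calls, and its share of the method's total runtime.
outlier_ms    = outlier_calls * avg_outlier_ms
outlier_min   = outlier_ms / 60_000.0 # 60,000 ms per minute
total_min     = total_ms / 60_000.0
outlier_share = 100.0 * outlier_ms / total_ms

puts format("outlier time: %.2f ms (%.1f min)", outlier_ms, outlier_min)
puts format("total time:   %.2f ms (%.1f min)", total_ms, total_min)
puts format("outlier share of runtime: %.1f%%", outlier_share)
```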