improve efficiency of Suricata processing uploaded PCAP files #457

Closed
mmguero opened this issue Nov 5, 2024 · 4 comments

@mmguero
Collaborator

mmguero commented Nov 5, 2024

@mmguero cloned issue idaholab/Malcolm#325 on 2024-01-08:

Currently, as uploaded PCAP files are processed, a new suricata process is started for each PCAP file.

This is the same behavior for Zeek and Arkime capture; however, suricata seems to have more overhead (I often notice that suricata is still running on a batch of uploaded PCAP files long after the others are done).

I came across this thread describing using suricata socket control to send PCAP files to a single long-running suricata process, then output each eve.json to a different directory per-PCAP. This would be an improvement.
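As a rough illustration of the socket-control idea, here is a minimal sketch that talks to a Suricata instance already running in unix-socket mode and queues PCAP files with a per-PCAP output directory. The handshake (`version`) and the `pcap-file` command with `filename` and `output-dir` arguments follow Suricata's documented unix socket protocol; the protocol version string, the helper names (`build_pcap_file_command`, `send_message`, `submit_pcaps`), and the directory layout are assumptions for illustration, not Malcolm's actual implementation.

```python
import json
import socket
from pathlib import Path

# Assumption: the protocol version accepted by the running Suricata build.
PROTOCOL_VERSION = "0.2"


def build_pcap_file_command(pcap_path, output_root):
    """Build a pcap-file command whose output-dir is unique to this PCAP."""
    pcap = Path(pcap_path)
    out_dir = Path(output_root) / pcap.stem  # hypothetical layout: one dir per PCAP
    return {
        "command": "pcap-file",
        "arguments": {"filename": str(pcap), "output-dir": str(out_dir)},
    }


def send_message(sock, message):
    """Send one JSON message and read until a complete JSON reply arrives."""
    sock.sendall((json.dumps(message) + "\n").encode())
    buf = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("socket closed before a full reply was read")
        buf += chunk
        try:
            return json.loads(buf)
        except ValueError:
            continue  # reply is incomplete so far; keep reading


def submit_pcaps(socket_path, pcap_paths, output_root):
    """Queue every PCAP on a single long-running Suricata process."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        if send_message(sock, {"version": PROTOCOL_VERSION}).get("return") != "OK":
            raise RuntimeError("Suricata rejected the protocol version")
        for pcap in pcap_paths:
            command = build_pcap_file_command(pcap, output_root)
            # Create the directory up front in case Suricata expects it to exist.
            Path(command["arguments"]["output-dir"]).mkdir(parents=True, exist_ok=True)
            reply = send_message(sock, command)
            if reply.get("return") != "OK":
                raise RuntimeError(f"pcap-file failed for {pcap}: {reply}")
```

Because the PCAPs are only queued, each `eve.json` appears in its own directory once Suricata gets to that file; the sketch does not wait for processing to finish.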

@mmguero mmguero added performance Related to speed/performance suricata Relating to Malcolm's use of Suricata upload Relating to PCAP and/or Zeek log ingestion labels Nov 5, 2024
@mmguero mmguero added this to Malcolm Nov 5, 2024
@mmguero mmguero moved this to In Progress in Malcolm Nov 5, 2024
@mmguero mmguero added this to the v24.11.0 milestone Nov 5, 2024
@mmguero mmguero modified the milestones: v24.11.0, v24.12.0 Nov 5, 2024
@mmguero mmguero modified the milestones: v24.12.0, v25.01.0 Dec 10, 2024
@mmguero mmguero modified the milestones: v25.01.0, v25.02.0 Jan 16, 2025
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 20, 2025
@mmguero mmguero self-assigned this Jan 21, 2025
@mmguero mmguero moved this from In Progress to Review in Malcolm Jan 21, 2025
mmguero added a commit to idaholab/Malcolm that referenced this issue Jan 21, 2025
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 21, 2025
@mmguero
Collaborator Author

mmguero commented Jan 22, 2025

This is about done, although I'm getting inconsistent results in the dashboards between runs. From the same PCAP set over two runs:

run 1
-----
JSON lines: 32778 in 55 files
suricata start: 14:53:54
suricata end:   14:54:49
ingest start:   14:54:06
ingest end:     14:55:05

run 2
-----
JSON lines: 33726 in 57 files
suricata start: 15:11:13
suricata end:   15:12:10
ingest start:   14:54:06
ingest end:     14:55:05

My guess is some kind of a race condition in processing the results (like the .json file isn't flushed yet or something at the point where we're copying it).

I'm wondering if rather than doing it this way, we can just create the JSON files in place rather than moving them, then let filebeat handle it. We'll need to change where the tags are coming from (from the directory name rather than the file name). I'm going to look into that.
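To make the "tags from the directory name" idea concrete, here is a small sketch. It assumes a hypothetical layout in which each PCAP's `eve.json` sits in a directory whose name carries comma-separated tags; the function name `tags_from_eve_path`, the delimiter, and the example path are all assumptions, not Malcolm's actual naming convention.

```python
from pathlib import PurePosixPath


def tags_from_eve_path(eve_path, delimiter=","):
    """Derive tags from the name of the directory that holds eve.json.

    Hypothetical layout: <output root>/<tag>,<tag>,.../eve.json
    """
    directory_name = PurePosixPath(eve_path).parent.name
    # Drop empty pieces left by doubled or trailing delimiters.
    return [part for part in directory_name.split(delimiter) if part]


print(tags_from_eve_path("/suricata/uploads/capture1,malware,lab/eve.json"))
```

Since the file never moves, whatever tails it can read the tags from the path it is already watching rather than from a renamed copy.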

mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 22, 2025
…ata is done with a PCAP file. just let filebeat handle it and pick up the resultant eve.json files directly
@mmguero
Collaborator Author

mmguero commented Jan 22, 2025

MUCH better. Still not sure why there's a 3-log difference :/ but it's looking a lot better. I still have some tests to do to make sure things recover if processes are restarted or whatever, but this is about done.

run 1: 57,927 logs created, parsed and ingested in 1:13
run 2: 57,924 logs created, parsed and ingested in 1:14

@mmguero mmguero moved this from Review to Testing in Malcolm Jan 22, 2025
mmguero added a commit to mmguero-dev/Malcolm that referenced this issue Jan 22, 2025
@mmguero
Collaborator Author

mmguero commented Jan 22, 2025

Here are my first performance tests, the first since I've enabled the multithreading for the suricata process, using the pcaps from the test suite and SURICATA_AUTO_ANALYZE_PCAP_THREADS=4:

new method: 57,925 logs created, parsed and ingested in 0:57 (4 threads)
old method: 57,436 logs created, parsed and ingested in 16:44 (4 threads)

So that's roughly an 18x speedup.
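For reference, the speedup figure can be checked from the two wall-clock times reported above. This is just the arithmetic; the helper name `to_seconds` is made up for the example.

```python
def to_seconds(mmss):
    """Convert an 'M:SS' or 'MM:SS' duration string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


new_s = to_seconds("0:57")   # new method
old_s = to_seconds("16:44")  # old method
speedup = old_s / new_s

print(f"{old_s} s / {new_s} s = {speedup:.1f}x")  # 1004 s / 57 s = 17.6x
```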

@mmguero
Collaborator Author

mmguero commented Jan 22, 2025

I'm satisfied with this, marking as closed.

@mmguero mmguero closed this as completed Jan 22, 2025
@github-project-automation github-project-automation bot moved this from Testing to Done in Malcolm Jan 22, 2025
piercema added a commit to piercema/Malcolm that referenced this issue Feb 12, 2025
* Bump development for v25.01.0, also update copyright year

* bump netbox to v4.1.10, osd_transform to v2.18.0, and fluent-bit to v3.2.4

* for cisagov#354, work in progress for Malcolm directly accepting syslog

* for cisagov#354, work in progress for Malcolm directly accepting syslog; (dashboard)

* cisagov#543, add navigation pane to non-network dashboards

* bump jinja to 3.1.5

* Documentation for cisagov#354, syslog

* replace old filebeat input for syslog with tcp/udp input and syslog processor, for cisagov#354

* Documentation for cisagov#354, syslog

* install.py tweak for cisagov#354

* minor fix for cisagov#354, set host.name correctly

* bump netbox to v4.11.1 and elasticsearch-dsl to v8.17.1

* start of cisagov#356, normalize winlogbeats

* WIP of cisagov#356, normalize winlogbeats

* WIP of cisagov#356, normalize winlogbeats

* WIP of cisagov#356, fix for a dashboard

* WIP of cisagov#356, normalize winlogbeats

* Work in progress for cisagov#541, making sure conn.log and known_services.log get the ICS protocols assigned to them correctly and tagged appropriately

* Work in progress for cisagov#541

* standardize ICS protocols in network.protocol field, so they all get tagged with 'ics' properly cisagov#541

* fix cisagov#533, allow keystores to be created on startup even in hedgehog mode

* forgot to add file for cisagov#356

* For cisagov#524, handle filenames with spaces in extracted_files_http_server.py

* work for cisagov#542, preserve custom field formatting for index pattern on update of index pattern

* work for cisagov#542, preserve custom field formatting for index pattern on update of index pattern

* bump yq to v4.45.1

* for cisagov#551, URL pivot links from dashboards to arkime

* for cisagov#551, URL pivot links from dashboards to arkime

* fix pivot from arkime to dashboards and vice-versa when using a traefik or other reverse proxy

* for cisagov#551, URL pivot links from dashboards to netbox

* for cisagov#551, URL pivot links from dashboards to netbox

* for cisagov#551, URL pivot links from netbox to arkime/dashboards

* start of cisagov#553, update zeek to v7.1.0

* cisagov#553, handle conn.log for zeek v7.1.0 and documentation update

* cisagov#553, handle postgresql.log

* cisagov#553, handle postgresql.log

* cisagov#553, added PostgreSQL dashboard

* for cisagov#551, URL pivot links in dashboards (ignore date/times)

* start of omron fins integration, cisagov#554

* WIP omron fins integration, cisagov#554

* arkime to v5.6.0

* bump logstash and filebeat to v8.17.0

* Fix nginx filebeat

* WIP omron fins integration, cisagov#554

* WIP omron fins integration, cisagov#554

* WIP omron fins integration, cisagov#554

* WIP omron fins integration, cisagov#554

* WIP omron fins integration, cisagov#554

* dashboards tweaks

* fix links for hh redirect download

* First pass at adding suricata socket optimization

* fix issue with nginx proxy

* Setting debug to false

* Fixing permissions for socket

* html formatting

* documentation for workaround for UFW software firewall for Malcolm ISO should automatically open ports for syslog (cisagov#560)

* Bump for v25.02.0 development

* restore _config.yml

* fix version

* I don't think we need a separate pod for the socket-based suricata, that's what the offline one does now anyway, right?

* restore some comments, black python style

* some tweaks for cisagov#457, pulled jjrush's branch into mine for some fixes

* some tweaks for cisagov#457

* allow suricata to spawn threads

* logging tweaks

* more flexible verbosity for suricata

* some tweaks for cisagov#457, try to wait until PCAP is finished processing before moving on

* First pass at adding suricata socket optimization

* Setting debug to false

* Fixing permissions for socket

* for cisagov#457, a few tweaks of the suricata pcap processing mode after reviewing @jjrush's code

* for cisagov#457, monitor suricata.log to know when PCAP is done processing

* for cisagov#457, monitor suricata.log to know when PCAP is done processing

* for cisagov#457, signal suricata rules to reload after update

* for cisagov#457, signal suricata rules to reload after update

* for cisagov#457, fix processing of other log types

* for cisagov#457, fix processing of other log types

* for cisagov#457, signal suricata rules to reload after update

* decrease verbosity for log

* fix logic for autoarkime/forcearkime

* some tweaks for cisagov#457, don't bother keeping track of when suricata is done with a PCAP file. just let filebeat handle it and pick up the resultant eve.json files directly

* Standardizing healthcheck scripts, updating docker-compose, updating kubernetes

* Adding livenessProbe to htadmin

* cisagov#457, handle multiple Suricata PCAP processing threads

* cisagov#574, clear screen after auth_setup when using Dialog mode

* add the related.user field to the 'nginx Access Logs' table

* bump fluent bit to v3.2.5

* fixed import of ECS templates

* handle ARKIME_PORT value formatted like a URL in the init of the API container

* cisagov#565, warn user about overwriting netbox passwords if they've already been set

* fix cisagov#559, ANSI color codes from croc displayed

* Exception in build triggers

* for cisagov#557, try building dirinit with arm runner

* cisagov#557, use arm-hosted runners for github build actions

* restore _config.yml

* a bit of cleanup for Dockerfiles/health check scripts

* minor fixes for health checks

* Tweaks for health checks

* restore _config.yml

* Tweaks for health checks

* build tweaks for health scripts

* bump capa to v9.0.0

* workaround for issue blocking cisagov#475, integration of sigma rules

* improvements to workaround for issue blocking cisagov#475, integration of sigma rules

* improvements to workaround for issue blocking cisagov#475, integration of sigma rules

* for cisagov#475, automatically apply aliases via index templates

* for cisagov#475, starting on mappings for security analytics

* for cisagov#585, include corelight/zeek-long-connections plugin for long connections (WIP)

* for cisagov#585, include corelight/zeek-long-connections plugin for long connections (WIP)

* for cisagov#585, include corelight/zeek-long-connections plugin for long connections (WIP)

* for cisagov#585, include corelight/zeek-long-connections plugin for long connections (WIP)

* demo fix

* for cisagov#585, show long connection count on connections dashboard

* decouple redis from netbox (cisagov#580)

* one more minor change to cisagov#491, moved all container health scripts into one place to make it easier to keep track of them

* decouple redis from netbox (cisagov#580) and reorganized some of the other netbox password stuff

* updated fluent bit

* fix filebeat health

---------

Co-authored-by: Seth Grover <seth.d.grover@gmail.com>
Co-authored-by: Jason Rush <jjrush-github@proton.me>
piercema added a commit to piercema/Malcolm that referenced this issue Feb 13, 2025