ScanZip - configurable limit for extracted file size, improve events, support empty files, new tests
ryanohoro committed Jan 27, 2024
1 parent 514ff69 commit 85c4e42
Showing 6 changed files with 286 additions and 79 deletions.
2 changes: 2 additions & 0 deletions configs/python/backend/backend.yaml
@@ -694,6 +694,8 @@ scanners:
priority: 5
options:
limit: 1000
size_limit: 250000000
limit_metadata: False
password_file: '/etc/strelka/passwords.dat'
'ScanZlib':
- positive:
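For reference, a complete ScanZip entry in backend.yaml with the two new options might look like the sketch below. The `positive`/`flavors` selector is an assumption shown for illustration; only `size_limit` and `limit_metadata` come from this change.

```yaml
scanners:
  'ScanZip':
    - positive:
        flavors:
          - 'zip_file'               # assumed flavor name, shown for illustration
      priority: 5
      options:
        limit: 1000                  # maximum number of files to extract
        size_limit: 250000000        # new: maximum size (bytes) of an extracted file
        limit_metadata: False        # new: stop logging file metadata once `limit` is reached
        password_file: '/etc/strelka/passwords.dat'
```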
16 changes: 8 additions & 8 deletions docs/README.md
@@ -746,21 +746,21 @@ Navigate to the Jaeger UI at http://localhost:16686/ to view traces.
## Logging

### Local
The historical, and default, means of logging in Strelka is a local log file that is instantiated upon the creation of the Strelka Frontend container. While other logging methodologies have recently been added (see the Kafka section), if an optional logging methodology has been enabled but fails some time after the instance has started running, the instance will always fall back to the local log so that no data is lost when the alternative logging methodology fails.

### Kafka
The Frontend allows a Kafka producer to be created at runtime as an alternative means of logging Strelka output, so that logs can be streamed to a Kafka topic of the user's choice. This logging option is useful when Strelka processes a high volume of data and the delivery of that data to a downstream analysis tool (such as a SIEM) must be highly available for data enrichment purposes.
Currently this is toggled on and off in the Frontend Dockerfile, which is overridden in the build/docker-compose.yaml file. Specifically, to turn the Kafka producer log option on, the locallog command line option must be set to false and the kafkalog option must be set to true. If both command line options are set to true, the Frontend defaults to the local logging option, which is how logging has functioned historically.
The Kafka producer that is created with the above command line options is fully configurable, and placeholder fields have already been added to the frontend.yaml configuration file. This file will need to be updated to point to an existing Kafka topic, as desired. In cases where some fields are not used (e.g., when security has not been enabled on the desired Kafka topic), the unused fields in the broker configuration section of frontend.yaml may simply be set to an empty string.
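As a minimal sketch, assuming hypothetical field names (the authoritative schema is the placeholder section already present in frontend.yaml), a broker configuration for a topic without security might leave the unused fields as empty strings:

```yaml
# Hypothetical frontend.yaml excerpt -- field names are illustrative, not authoritative.
broker:
  bootstrap: 'kafka-broker-1:9092'   # address of an existing Kafka broker
  topic: 'strelka-logs'              # Kafka topic that receives Strelka events
  protocol: ''                       # unused security-related fields can be empty strings
  certlocation: ''
  keylocation: ''
  calocation: ''
```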
#### Optional: S3 Redundancy
Provided a Kafka producer has been created and a boolean in the Kafka config is set to true, S3 redundancy can be toggled on to account for any issues with the Kafka connection. S3, in this case, refers to either an AWS S3 bucket or a Ceph open-source object storage bucket.
Currently, if S3 redundancy is toggled on and the Kafka connection described in the Kafka logging section of this document is interrupted, then, after the local log file is updated, the contents of that log file are uploaded to the configurable S3 location. By default, logs are kept for three hours after the start of the interruption of the Kafka connection, and logs in S3 are rotated on the hour to maintain relevancy in the remote bucket location.
Once the connection to the original Kafka broker is re-established, the stored logs are sent to the Kafka broker in parallel with new logs. If a restart of the Frontend is required to reset the connection, the logs will be sent to the Kafka broker (if they are not stale) at the next start up.
This option is set to false by default.
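A minimal sketch of what enabling this might look like, assuming hypothetical key names (check the Kafka section of frontend.yaml for the actual fields):

```yaml
# Hypothetical S3 redundancy settings -- key names are illustrative, not authoritative.
s3redundancy: true              # upload the local log to S3 when the Kafka connection fails
s3:
  bucket: 'strelka-log-backup'  # AWS S3 or Ceph object storage bucket
  region: 'us-east-1'
  retention_hours: 3            # stored logs rotate hourly and are kept ~3 hours by default
```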
@@ -815,7 +815,7 @@ The table below describes each scanner and its options. Each scanner has the hid
| ScanPhp | Collects metadata from PHP files | N/A |
| ScanPkcs7 | Extracts files from PKCS7 certificate files | N/A |
| ScanPlist | Collects attributes from binary and XML property list files | `keys` -- list of keys to log (defaults to `all`) |
| ScanQr | Collects QR code metadata from image files | `support_inverted` -- Enable/disable image inversion to support inverted QR codes (white on black). Adds some image processing overhead. | [Aaron Herman](https://github.com/aaronherman)
| ScanRar | Extracts files from RAR archives | `limit` -- maximum number of files to extract (defaults to `1000`)<br>`password_file` -- location of passwords file for RAR archives (defaults to `/etc/strelka/passwords.dat`) |
| ScanRpm | Collects metadata and extracts files from RPM files | `tempfile_directory` -- location where `tempfile` will write temporary files (defaults to `/tmp/`) |
| ScanRtf | Extracts embedded files from RTF files | `limit` -- maximum number of files to extract (defaults to `1000`) |
@@ -838,7 +838,7 @@ The table below describes each scanner and its options. Each scanner has the hid
| ScanXL4MA | Analyzes and parses Excel 4 Macros from XLSX files | `type` -- string that determines the type of x509 certificate being scanned (no default, assigned as either "der" or "pem" depending on flavor) | Ryan Borre
| ScanXml | Log metadata and extract files from XML files | `extract_tags` -- list of XML tags that will have their text extracted as child files (defaults to empty list)<br>`metadata_tags` -- list of XML tags that will have their text logged as metadata (defaults to empty list) |
| ScanYara | Scans files with YARA rules | `location` -- location of the YARA rules file or directory (defaults to `/etc/strelka/yara/`)<br>`compiled` -- Enable use of compiled YARA rules, as well as the path.<br>`store_offset` -- Stores file offset for YARA match<br>`offset_meta_key` -- YARA meta key that must exist in the YARA rule for the offset to be stored.<br>`offset_padding` -- Amount of data to be stored before and after offset for additional context. |
| ScanZip | Extracts files from zip archives | `limit` -- maximum number of files to extract (defaults to `1000`)<br>`password_file` -- location of passwords file for zip archives (defaults to `/etc/strelka/passwords.dat`) |
| ScanZip | Extracts files from zip archives | `limit` -- maximum number of files to extract (defaults to `1000`)<br>`size_limit` -- maximum size for extracted files (defaults to `250000000`)<br>`limit_metadata` -- stop adding file metadata when `limit` is reached<br>`password_file` -- location of passwords file for zip archives (defaults to `/etc/strelka/passwords.dat`) |
| ScanZlib | Decompresses gzip files | N/A
## Tests
