Skip to content

Commit

Permalink
Add Lambda function for the Amazon Security Lake integration (#189)
Browse files Browse the repository at this point in the history
* Migrate from #147

* Update amazon-security-lake integration

- Improved documentation.
- Python code has been moved to `wazuh-indexer/integrations/amazon-security-lake/src`.
- Development environment now uses OpenSearch 2.12.0.
- The `wazuh.integration.security.lake` container now displays logs, by watching logstash's log file.
- [**NEEDS FIX**] As a temporary solution, the `INDEXER_USERNAME` and `INDEXER_PASSWORD` values have been added as an environment variable to the `wazuh.integration.security.lake` container. These values should be set at Dockerfile level, but isn't working, probably due to permission denied on invocation of the `setup.sh` script.
- [**NEEDS FIX**] As a temporary solution, the output file of the `indexer-to-file` pipeline as been moved to `/var/log/logstash/indexer-to-file`. Previous path `/usr/share/logstash/pipeline/indexer-to-file.json` results in permission denied.
- [**NEEDS FIX**] As a temporary solution, the input.opensearch.query has been replaced with `match_all`, as the previous one does not return any data, probably to the use of time filters `gt: now-1m`.
- Standard output enable for `/usr/share/logstash/pipeline/indexer-to-file.json`.
- [**NEEDS FIX**] ECS compatibility disabled: `echo "pipeline.ecs_compatibility: disabled" >> /etc/logstash/logstash.yml` -- to be included automatically
- Python3 environment path added to the `indexer-to-integrator` pipeline.

* Disable ECS compatibility (auto)

-  Adds pipeline.ecs_compatibility: disabled at Dockerfile level.
- Removes `INDEXER_USERNAME` and `INDEXER_PASSWORD` as environment variables on the `wazuh.integration.security.lake` container.

* Add @timestamp field to sample alerts

* Fix Logstash pipelines

* Add working indexer-to-s3 pipeline

* Add working Python script up to S3 upload

* Add latest changes

* Remove duplicated line

* Add working environment with minimal AWS lambda function

* Mount src folder to Lambda's workdir

* Add first functional lambda function

Tested on local environment, using S3 Ninja and a Lambda container

* Working state

* Add documentation

* Improve code

* Improve code

* Clean up

* Add instructions to build a deployment package

* Make zip file lighter

* Use default name for aws_region

* Add destination bucket validation

* Add env var validation and full destination S3 path

* Add AWS_ENDPOINT environment variable

* Rename AWS_DEFAULT_REGION

* Remove unused env vars

* Remove unused file and improve documentation a bit.

* Makefile improvements

* Use dummy env variables

---------

Signed-off-by: Álex Ruiz <alejandro.ruiz.becerra@wazuh.com>
  • Loading branch information
AlexRuiz7 committed Aug 20, 2024
1 parent 073a304 commit c616df5
Show file tree
Hide file tree
Showing 24 changed files with 481 additions and 329 deletions.
6 changes: 6 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,5 +1,11 @@
# build files
artifacts/
*.deb
*.rpm
*.zip
*.tar.gz

integrations/amazon-security-lake/package

.java
.m2
Expand Down
2 changes: 1 addition & 1 deletion docker/dev/dev.yml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ services:
image: wi-dev:${VERSION}
container_name: wi-dev_${VERSION}
build:
context: ./../..
context: ${REPO_PATH}
dockerfile: ${REPO_PATH}/docker/dev/images/Dockerfile
ports:
# OpenSearch REST API
Expand Down
81 changes: 58 additions & 23 deletions integrations/README.md
Original file line number Diff line number Diff line change
@@ -1,58 +1,93 @@
## Wazuh indexer integrations

This folder contains integrations with third-party XDR, SIEM and cybersecurity software.
This folder contains integrations with third-party XDR, SIEM and cybersecurity software.
The goal is to transport Wazuh's analysis to the platform that suits your needs.

### Amazon Security Lake

Amazon Security Lake automatically centralizes security data from AWS environments, SaaS providers,
on premises, and cloud sources into a purpose-built data lake stored in your account. With Security Lake,
you can get a more complete understanding of your security data across your entire organization. You can
also improve the protection of your workloads, applications, and data. Security Lake has adopted the
Open Cybersecurity Schema Framework (OCSF), an open standard. With OCSF support, the service normalizes
Amazon Security Lake automatically centralizes security data from AWS environments, SaaS providers,
on premises, and cloud sources into a purpose-built data lake stored in your account. With Security Lake,
you can get a more complete understanding of your security data across your entire organization. You can
also improve the protection of your workloads, applications, and data. Security Lake has adopted the
Open Cybersecurity Schema Framework (OCSF), an open standard. With OCSF support, the service normalizes
and combines security data from AWS and a broad range of enterprise security data sources.

##### Usage
#### Development guide

A demo of the integration can be started using the content of this folder and Docker.

```console
docker compose -f ./docker/amazon-security-lake.yml up -d
```

This docker compose project will bring a *wazuh-indexer* node, a *wazuh-dashboard* node,
a *logstash* node and our event generator. On the one hand, the event generator will push events
constantly to the indexer, on the `wazuh-alerts-4.x-sample` index by default (refer to the [events
generator](./tools/events-generator/README.md) documentation for customization options).
On the other hand, logstash will constantly query for new data and deliver it to the integration
Python program, also present in that node. Finally, the integration module will prepare and send the
data to the Amazon Security Lake's S3 bucket.
This docker compose project will bring a _wazuh-indexer_ node, a _wazuh-dashboard_ node,
a _logstash_ node, our event generator and an AWS Lambda Python container. On the one hand, the event generator will push events
constantly to the indexer, to the `wazuh-alerts-4.x-sample` index by default (refer to the [events
generator](./tools/events-generator/README.md) documentation for customization options).
On the other hand, logstash will constantly query for new data and deliver it to output configured in the
pipeline, which can be one of `indexer-to-s3` or `indexer-to-file`.

The `indexer-to-s3` pipeline is the method used by the integration. This pipeline delivers
the data to an S3 bucket, from which the data is processed using a Lambda function, to finally
be sent to the Amazon Security Lake bucket in Parquet format.

<!-- TODO continue with S3 credentials setup -->

Attach a terminal to the container and start the integration by starting logstash, as follows:

```console
/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-integrator.conf --path.settings /etc/logstash
/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-s3.conf --path.settings /etc/logstash
```

Unprocessed data can be sent to a file or to an S3 bucket.
```console
/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-file.conf --path.settings /etc/logstash
/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-s3.conf --path.settings /etc/logstash
After 5 minutes, the first batch of data will show up in http://localhost:9444/ui/wazuh-indexer-aux-bucket.
You'll need to invoke the Lambda function manually, selecting the log file to process.

```bash
export AUX_BUCKET=wazuh-indexer-aux-bucket

bash amazon-security-lake/src/invoke-lambda.sh <file>
```

All three pipelines are configured to fetch the latest data from the *wazuh-indexer* every minute. In
the case of `indexer-to-file`, the data is written at the same pace, whereas `indexer-to-s3`, data
is uploaded every 5 minutes.
Processed data will be uploaded to http://localhost:9444/ui/wazuh-indexer-amazon-security-lake-bucket. Click on any file to download it,
and check it's content using `parquet-tools`. Just make sure of installing the virtual environment first, through [requirements.txt](./amazon-security-lake/).

For development or debugging purposes, you may want to enable hot-reload, test or debug on these files,
```bash
parquet-tools show <parquet-file>
```

Bucket names can be configured editing the [amazon-security-lake.yml](./docker/amazon-security-lake.yml) file.

For development or debugging purposes, you may want to enable hot-reload, test or debug on these files,
by using the `--config.reload.automatic`, `--config.test_and_exit` or `--debug` flags, respectively.

For production usage, follow the instructions in our documentation page about this matter.
(_when-its-done_)

As a last note, we would like to point out that we also use this Docker environment for development.

#### Deployment guide

- Create one S3 bucket to store the raw events, for example: `wazuh-security-lake-integration`
- Create a new AWS Lambda function
- Create an IAM role with access to the S3 bucket created above.
- Select Python 3.12 as the runtime
- Configure the runtime to have 512 MB of memory and 30 seconds timeout
- Configure an S3 trigger so every created object in the bucket with `.txt` extension invokes the Lambda.
- Run `make` to generate a zip deployment package, or create it manually as per the [AWS Lambda documentation](https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-create-dependencies).
- Upload the zip package to the bucket. Then, upload it to the Lambda from the S3 as per these instructions: https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-package.html#gettingstarted-package-zip
- Create a Custom Source within Security Lake for the Wazuh Parquet files as per the following guide: https://docs.aws.amazon.com/security-lake/latest/userguide/custom-sources.html
- Set the **AWS account ID** for the Custom Source **AWS account with permission to write data**.

<!-- TODO Configure AWS Lambda Environment Variables /-->
<!-- TODO Install and configure Logstash /-->

The instructions on this section have been based on the following AWS tutorials and documentation.

- [Tutorial: Using an Amazon S3 trigger to create thumbnail images](https://docs.aws.amazon.com/lambda/latest/dg/with-s3-tutorial.html)
- [Tutorial: Using an Amazon S3 trigger to invoke a Lambda function](https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html)
- [Working with .zip file archives for Python Lambda functions](https://docs.aws.amazon.com/lambda/latest/dg/python-package.html)
- [Best practices for working with AWS Lambda functions](https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html)

### Other integrations

TBD
28 changes: 28 additions & 0 deletions integrations/amazon-security-lake/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@

ZIP_NAME = wazuh_to_amazon_security_lake
TARGET = package
SRC = src

# Main target
.ONESHELL:
$(ZIP_NAME).zip: $(TARGET) $(SRC)/lambda_function.py $(SRC)/wazuh_ocsf_converter.py
@cd $(TARGET)
@zip -r ../$(ZIP_NAME).zip .
@cd ../$(SRC)
@zip ../$@ lambda_function.py wazuh_ocsf_converter.py
@zip ../$@ models -r

$(TARGET):
docker run -v `pwd`:/src -w /src \
python:3.12 \
pip install \
--platform manylinux2014_x86_64 \
--target=$(TARGET) \
--implementation cp \
--python-version 3.12 \
--only-binary=:all: --upgrade \
-r requirements.aws.txt

clean:
@rm -rf $(TARGET)
@py3clean .
17 changes: 17 additions & 0 deletions integrations/amazon-security-lake/aws-lambda.dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# docker build --platform linux/amd64 --no-cache -f aws-lambda.dockerfile -t docker-image:test .
# docker run --platform linux/amd64 -p 9000:8080 docker-image:test

# FROM public.ecr.aws/lambda/python:3.9
FROM amazon/aws-lambda-python:3.12

# Copy requirements.txt
COPY requirements.aws.txt ${LAMBDA_TASK_ROOT}

# Install the specified packages
RUN pip install -r requirements.aws.txt

# Copy function code
COPY src ${LAMBDA_TASK_ROOT}

# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "lambda_function.lambda_handler" ]
42 changes: 42 additions & 0 deletions integrations/amazon-security-lake/invoke-lambda.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
#!/bin/bash

export AUX_BUCKET=wazuh-indexer-aux-bucket

curl -X POST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{
"Records": [
{
"eventVersion": "2.0",
"eventSource": "aws:s3",
"awsRegion": "us-east-1",
"eventTime": "1970-01-01T00:00:00.000Z",
"eventName": "ObjectCreated:Put",
"userIdentity": {
"principalId": "AIDAJDPLRKLG7UEXAMPLE"
},
"requestParameters":{
"sourceIPAddress":"127.0.0.1"
},
"responseElements":{
"x-amz-request-id":"C3D13FE58DE4C810",
"x-amz-id-2":"FMyUVURIY8/IgAtTv8xRjskZQpcIZ9KG4V5Wp6S7S/JRWeUWerMUE5JgHvANOjpD"
},
"s3": {
"s3SchemaVersion": "1.0",
"configurationId": "testConfigRule",
"bucket": {
"name": "'"${AUX_BUCKET}"'",
"ownerIdentity": {
"principalId":"A3NL1KOZZKExample"
},
"arn": "'"arn:aws:s3:::${AUX_BUCKET}"'"
},
"object": {
"key": "'"${1}"'",
"size": 1024,
"eTag":"d41d8cd98f00b204e9800998ecf8427e",
"versionId":"096fKKXTRTtl3on89fVO.nfljtsv6qko"
}
}
}
]
}'

This file was deleted.

Original file line number Diff line number Diff line change
Expand Up @@ -10,12 +10,12 @@ input {
"query": {
"range": {
"@timestamp": {
"gt": "now-1m"
"gt": "now-5m"
}
}
}
}'
schedule => "5/* * * * *"
schedule => "*/5 * * * *"
}
}

Expand All @@ -26,15 +26,15 @@ output {
}
s3 {
id => "output.s3"
access_key_id => "${AWS_KEY}"
secret_access_key => "${AWS_SECRET}"
access_key_id => "${AWS_ACCESS_KEY_ID}"
secret_access_key => "${AWS_SECRET_ACCESS_KEY}"
region => "${AWS_REGION}"
endpoint => "http://s3.ninja:9000"
bucket => "${AWS_BUCKET}"
codec => "json"
endpoint => "${AWS_ENDPOINT}"
bucket => "${AUX_BUCKET}"
codec => "json_lines"
retry_count => 0
validate_credentials_on_root_bucket => false
prefix => "%{+YYYY}/%{+MM}/%{+dd}"
prefix => "%{+YYYY}%{+MM}%{+dd}"
server_side_encryption => true
server_side_encryption_algorithm => "AES256"
additional_settings => {
Expand Down
2 changes: 2 additions & 0 deletions integrations/amazon-security-lake/requirements.aws.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
pyarrow>=10.0.1
pydantic>=2.6.1
2 changes: 1 addition & 1 deletion integrations/amazon-security-lake/requirements.txt
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
pyarrow>=10.0.1
parquet-tools>=0.2.15
pydantic==2.6.1
pydantic>=2.6.1
boto3==1.34.46
Loading

0 comments on commit c616df5

Please sign in to comment.