Add Lambda function for the Amazon Security Lake integration #189

Merged
merged 29 commits on Apr 24, 2024
Changes from all commits
Commits (29)
0b5716e
Migrate from #147
AlexRuiz7 Mar 12, 2024
1deaba8
Update amazon-security-lake integration
AlexRuiz7 Mar 12, 2024
bdd99fe
Disable ECS compatibility (auto)
AlexRuiz7 Mar 13, 2024
c6482c7
Add @timestamp field to sample alerts
AlexRuiz7 Mar 13, 2024
d0b1573
Fix Logstash pipelines
AlexRuiz7 Mar 13, 2024
522e6e8
Add working indexer-to-s3 pipeline
AlexRuiz7 Mar 13, 2024
50a34c9
Add working Python script up to S3 upload
AlexRuiz7 Mar 14, 2024
a4e37b8
Add latest changes
AlexRuiz7 Mar 19, 2024
32312c6
Remove duplicated line
AlexRuiz7 Mar 19, 2024
4f9a592
Add working environment with minimal AWS lambda function
AlexRuiz7 Mar 19, 2024
b7dafbb
Mount src folder to Lambda's workdir
AlexRuiz7 Mar 20, 2024
5858a23
Merge branch '4.9.0' into 146-amazon-security-lake-dtd-lambda
AlexRuiz7 Apr 11, 2024
0955d1b
Add first functional lambda function
AlexRuiz7 Apr 15, 2024
3f2cf15
Working state
AlexRuiz7 Apr 16, 2024
fda5991
Add documentation
AlexRuiz7 Apr 16, 2024
7aa6c4e
Improve code
AlexRuiz7 Apr 16, 2024
8d6db2e
Improve code
AlexRuiz7 Apr 16, 2024
c1deb42
Clean up
AlexRuiz7 Apr 17, 2024
1c03011
Add instructions to build a deployment package
AlexRuiz7 Apr 17, 2024
9304c41
Make zip file lighter
AlexRuiz7 Apr 18, 2024
eb081b3
Use default name for aws_region
AlexRuiz7 Apr 18, 2024
7996e02
Add destination bucket validation
AlexRuiz7 Apr 18, 2024
cdaf434
Add env var validation and full destination S3 path
AlexRuiz7 Apr 19, 2024
b66ada1
Add AWS_ENDPOINT environment variable
AlexRuiz7 Apr 22, 2024
a3c3cce
Rename AWS_DEFAULT_REGION
AlexRuiz7 Apr 22, 2024
cada908
Remove unused env vars
AlexRuiz7 Apr 23, 2024
dc4f7d4
Remove unused file and improve documentation a bit.
AlexRuiz7 Apr 23, 2024
fe2a906
Makefile improvements
AlexRuiz7 Apr 23, 2024
7f59d9a
Use dummy env variables
AlexRuiz7 Apr 23, 2024
6 changes: 6 additions & 0 deletions .gitignore
@@ -1,5 +1,11 @@
# build files
artifacts/
*.deb
*.rpm
*.zip
*.tar.gz

integrations/amazon-security-lake/package

.java
.m2
2 changes: 1 addition & 1 deletion docker/dev/dev.yml
@@ -5,7 +5,7 @@ services:
image: wi-dev:${VERSION}
container_name: wi-dev_${VERSION}
build:
context: ./../..
context: ${REPO_PATH}
dockerfile: ${REPO_PATH}/docker/dev/images/Dockerfile
ports:
# OpenSearch REST API
81 changes: 58 additions & 23 deletions integrations/README.md
@@ -1,58 +1,93 @@
## Wazuh indexer integrations

This folder contains integrations with third-party XDR, SIEM and cybersecurity software.
The goal is to transport Wazuh's analysis to the platform that suits your needs.

### Amazon Security Lake

Amazon Security Lake automatically centralizes security data from AWS environments, SaaS providers,
on premises, and cloud sources into a purpose-built data lake stored in your account. With Security Lake,
you can get a more complete understanding of your security data across your entire organization. You can
also improve the protection of your workloads, applications, and data. Security Lake has adopted the
Open Cybersecurity Schema Framework (OCSF), an open standard. With OCSF support, the service normalizes
and combines security data from AWS and a broad range of enterprise security data sources.

##### Usage
#### Development guide

A demo of the integration can be started using the content of this folder and Docker.

```console
docker compose -f ./docker/amazon-security-lake.yml up -d
```

This docker compose project will bring a *wazuh-indexer* node, a *wazuh-dashboard* node,
a *logstash* node and our event generator. On the one hand, the event generator will push events
constantly to the indexer, on the `wazuh-alerts-4.x-sample` index by default (refer to the [events
generator](./tools/events-generator/README.md) documentation for customization options).
On the other hand, logstash will constantly query for new data and deliver it to the integration
Python program, also present in that node. Finally, the integration module will prepare and send the
data to the Amazon Security Lake's S3 bucket.
This docker compose project will bring a _wazuh-indexer_ node, a _wazuh-dashboard_ node,
a _logstash_ node, our event generator and an AWS Lambda Python container. On the one hand, the event generator will push events
constantly to the indexer, to the `wazuh-alerts-4.x-sample` index by default (refer to the [events
generator](./tools/events-generator/README.md) documentation for customization options).
On the other hand, logstash will constantly query for new data and deliver it to the output configured in the
pipeline, which can be one of `indexer-to-s3` or `indexer-to-file`.

The `indexer-to-s3` pipeline is the one used by the integration. It delivers the data to an S3 bucket,
from which a Lambda function processes it and sends it to the Amazon Security Lake bucket in Parquet format.
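
The Lambda function (`src/lambda_function.py`, handler `lambda_function.lambda_handler`) implements that last step. The snippet below is only a minimal sketch of that flow, not the shipped code: the `DST_BUCKET` variable name and the `to_ocsf()` helper are illustrative placeholders standing in for the function's actual environment variables and for the mapping done by `wazuh_ocsf_converter.py`.

```python
import json
import os
import urllib.parse

import boto3
import pyarrow as pa
import pyarrow.parquet as pq

# Hypothetical variable name; the real function reads its own set of env vars.
DST_BUCKET = os.environ.get("DST_BUCKET", "wazuh-security-lake-bucket")

s3 = boto3.client("s3", endpoint_url=os.environ.get("AWS_ENDPOINT"))


def to_ocsf(alert: dict) -> dict:
    # Placeholder for wazuh_ocsf_converter.py, which maps a raw Wazuh alert
    # onto an OCSF event dictionary.
    return {
        "time": alert.get("timestamp"),
        "message": alert.get("rule", {}).get("description"),
    }


def lambda_handler(event, context):
    # The S3 trigger passes the bucket and key of the newly created log file.
    record = event["Records"][0]["s3"]
    src_bucket = record["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["object"]["key"])

    # Read the raw alerts: one JSON document per line, as produced by the
    # json_lines codec of the indexer-to-s3 Logstash pipeline.
    body = s3.get_object(Bucket=src_bucket, Key=key)["Body"].read().decode("utf-8")
    alerts = [json.loads(line) for line in body.splitlines() if line.strip()]

    # Convert the batch to OCSF and write it as a single Parquet object
    # into the Security Lake bucket.
    table = pa.Table.from_pylist([to_ocsf(a) for a in alerts])
    pq.write_table(table, "/tmp/events.parquet")
    s3.upload_file("/tmp/events.parquet", DST_BUCKET, key.replace(".txt", ".parquet"))
    return {"status": 200}
```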

<!-- TODO continue with S3 credentials setup -->

Attach a terminal to the container and start the integration by starting logstash, as follows:

```console
/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-integrator.conf --path.settings /etc/logstash
/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-s3.conf --path.settings /etc/logstash
```

Unprocessed data can be sent to a file or to an S3 bucket.

```console
/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-file.conf --path.settings /etc/logstash
/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-s3.conf --path.settings /etc/logstash
```

After 5 minutes, the first batch of data will show up in http://localhost:9444/ui/wazuh-indexer-aux-bucket.
You'll need to invoke the Lambda function manually, selecting the log file to process.

```bash
export AUX_BUCKET=wazuh-indexer-aux-bucket

bash amazon-security-lake/src/invoke-lambda.sh <file>
```

All three pipelines are configured to fetch the latest data from the *wazuh-indexer* every minute. In
the case of `indexer-to-file`, the data is written at the same pace, whereas for `indexer-to-s3`, data
is uploaded every 5 minutes.
Processed data will be uploaded to http://localhost:9444/ui/wazuh-indexer-amazon-security-lake-bucket. Click on any file to download it,
and check its content using `parquet-tools`. Just make sure to install the virtual environment first, using [requirements.txt](./amazon-security-lake/).

```bash
parquet-tools show <parquet-file>
```
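
Alternatively, the same check can be done from Python with `pyarrow` (already listed in the integration's requirements). A short sketch, where `<parquet-file>` is the downloaded file:

```python
import pyarrow.parquet as pq

# Print the OCSF schema and the first few events of the downloaded file.
table = pq.read_table("<parquet-file>")
print(table.schema)
for event in table.to_pylist()[:5]:
    print(event)
```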

Bucket names can be configured by editing the [amazon-security-lake.yml](./docker/amazon-security-lake.yml) file.

For development or debugging purposes, you may want to enable hot-reload, test or debug on these files,
by using the `--config.reload.automatic`, `--config.test_and_exit` or `--debug` flags, respectively.

For production usage, follow the instructions in our documentation page about this matter.
(_when-its-done_)

As a last note, we would like to point out that we also use this Docker environment for development.

#### Deployment guide

- Create one S3 bucket to store the raw events, for example: `wazuh-security-lake-integration`
- Create a new AWS Lambda function
- Create an IAM role with access to the S3 bucket created above.
- Select Python 3.12 as the runtime
- Configure the runtime to have 512 MB of memory and 30 seconds timeout
- Configure an S3 trigger so that every object created in the bucket with the `.txt` extension invokes the Lambda function (a `boto3` sketch of these steps is shown after this list).
- Run `make` to generate a zip deployment package, or create it manually as per the [AWS Lambda documentation](https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-create-dependencies).
- Upload the zip package to the bucket. Then, upload it to the Lambda function from S3 as per these instructions: https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-package.html#gettingstarted-package-zip
- Create a Custom Source within Security Lake for the Wazuh Parquet files as per the following guide: https://docs.aws.amazon.com/security-lake/latest/userguide/custom-sources.html
- Set the **AWS account ID** for the Custom Source **AWS account with permission to write data**.
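
For reference, the same resources can be created programmatically. The snippet below is an illustrative `boto3` sketch under assumed names (function name, role ARN, zip object key), not part of the integration; the console steps above remain the reference procedure.

```python
import boto3

FUNCTION = "wazuh-to-security-lake"                            # assumed name
BUCKET = "wazuh-security-lake-integration"                     # raw-events bucket from the steps above
ROLE_ARN = "arn:aws:iam::123456789012:role/wazuh-lambda-role"  # placeholder

aws_lambda = boto3.client("lambda")
s3 = boto3.client("s3")

# Create the function from the zip produced by `make`, previously uploaded
# to the bucket (Python 3.12, 512 MB, 30 s timeout).
function_arn = aws_lambda.create_function(
    FunctionName=FUNCTION,
    Runtime="python3.12",
    Handler="lambda_function.lambda_handler",
    Role=ROLE_ARN,
    Code={"S3Bucket": BUCKET, "S3Key": "wazuh_to_amazon_security_lake.zip"},
    MemorySize=512,
    Timeout=30,
)["FunctionArn"]

# Allow S3 to invoke the function, then register the .txt suffix trigger.
aws_lambda.add_permission(
    FunctionName=FUNCTION,
    StatementId="s3-invoke",
    Action="lambda:InvokeFunction",
    Principal="s3.amazonaws.com",
    SourceArn=f"arn:aws:s3:::{BUCKET}",
)
s3.put_bucket_notification_configuration(
    Bucket=BUCKET,
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [{
            "LambdaFunctionArn": function_arn,
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {"Key": {"FilterRules": [{"Name": "suffix", "Value": ".txt"}]}},
        }]
    },
)
```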

<!-- TODO Configure AWS Lambda Environment Variables /-->
<!-- TODO Install and configure Logstash /-->

The instructions in this section are based on the following AWS tutorials and documentation.

- [Tutorial: Using an Amazon S3 trigger to create thumbnail images](https://docs.aws.amazon.com/lambda/latest/dg/with-s3-tutorial.html)
- [Tutorial: Using an Amazon S3 trigger to invoke a Lambda function](https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html)
- [Working with .zip file archives for Python Lambda functions](https://docs.aws.amazon.com/lambda/latest/dg/python-package.html)
- [Best practices for working with AWS Lambda functions](https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html)

### Other integrations

TBD
28 changes: 28 additions & 0 deletions integrations/amazon-security-lake/Makefile
@@ -0,0 +1,28 @@

ZIP_NAME = wazuh_to_amazon_security_lake
TARGET = package
SRC = src

# Main target
.ONESHELL:
$(ZIP_NAME).zip: $(TARGET) $(SRC)/lambda_function.py $(SRC)/wazuh_ocsf_converter.py
	@cd $(TARGET)
	@zip -r ../$(ZIP_NAME).zip .
	@cd ../$(SRC)
	@zip ../$@ lambda_function.py wazuh_ocsf_converter.py
	@zip ../$@ models -r

$(TARGET):
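	# Build dependencies as Linux wheels matching the Lambda runtime (CPython 3.12, manylinux2014 x86_64), into ./package.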
	docker run -v `pwd`:/src -w /src \
		python:3.12 \
		pip install \
		--platform manylinux2014_x86_64 \
		--target=$(TARGET) \
		--implementation cp \
		--python-version 3.12 \
		--only-binary=:all: --upgrade \
		-r requirements.aws.txt

clean:
	@rm -rf $(TARGET)
	@py3clean .
17 changes: 17 additions & 0 deletions integrations/amazon-security-lake/aws-lambda.dockerfile
@@ -0,0 +1,17 @@
# docker build --platform linux/amd64 --no-cache -f aws-lambda.dockerfile -t docker-image:test .
# docker run --platform linux/amd64 -p 9000:8080 docker-image:test

# FROM public.ecr.aws/lambda/python:3.9
FROM amazon/aws-lambda-python:3.12

# Copy requirements.txt
COPY requirements.aws.txt ${LAMBDA_TASK_ROOT}

# Install the specified packages
RUN pip install -r requirements.aws.txt

# Copy function code
COPY src ${LAMBDA_TASK_ROOT}

# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "lambda_function.lambda_handler" ]
42 changes: 42 additions & 0 deletions integrations/amazon-security-lake/invoke-lambda.sh
@@ -0,0 +1,42 @@
#!/bin/bash
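# Simulate an S3 ObjectCreated:Put event against the Lambda runtime interface
# emulator exposed on localhost:9000 by the aws-lambda.dockerfile container.
# The first argument is the key of the log file in the aux bucket to process.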

export AUX_BUCKET=wazuh-indexer-aux-bucket

curl -X POST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{
"Records": [
{
"eventVersion": "2.0",
"eventSource": "aws:s3",
"awsRegion": "us-east-1",
"eventTime": "1970-01-01T00:00:00.000Z",
"eventName": "ObjectCreated:Put",
"userIdentity": {
"principalId": "AIDAJDPLRKLG7UEXAMPLE"
},
"requestParameters":{
"sourceIPAddress":"127.0.0.1"
},
"responseElements":{
"x-amz-request-id":"C3D13FE58DE4C810",
"x-amz-id-2":"FMyUVURIY8/IgAtTv8xRjskZQpcIZ9KG4V5Wp6S7S/JRWeUWerMUE5JgHvANOjpD"
},
"s3": {
"s3SchemaVersion": "1.0",
"configurationId": "testConfigRule",
"bucket": {
"name": "'"${AUX_BUCKET}"'",
"ownerIdentity": {
"principalId":"A3NL1KOZZKExample"
},
"arn": "'"arn:aws:s3:::${AUX_BUCKET}"'"
},
"object": {
"key": "'"${1}"'",
"size": 1024,
"eTag":"d41d8cd98f00b204e9800998ecf8427e",
"versionId":"096fKKXTRTtl3on89fVO.nfljtsv6qko"
}
}
}
]
}'

This file was deleted.

@@ -10,12 +10,12 @@ input {
"query": {
"range": {
"@timestamp": {
"gt": "now-1m"
"gt": "now-5m"
}
}
}
}'
schedule => "5/* * * * *"
schedule => "*/5 * * * *"
}
}

@@ -26,15 +26,15 @@ output {
}
s3 {
id => "output.s3"
access_key_id => "${AWS_KEY}"
secret_access_key => "${AWS_SECRET}"
access_key_id => "${AWS_ACCESS_KEY_ID}"
secret_access_key => "${AWS_SECRET_ACCESS_KEY}"
region => "${AWS_REGION}"
endpoint => "http://s3.ninja:9000"
bucket => "${AWS_BUCKET}"
codec => "json"
endpoint => "${AWS_ENDPOINT}"
bucket => "${AUX_BUCKET}"
codec => "json_lines"
retry_count => 0
validate_credentials_on_root_bucket => false
prefix => "%{+YYYY}/%{+MM}/%{+dd}"
prefix => "%{+YYYY}%{+MM}%{+dd}"
server_side_encryption => true
server_side_encryption_algorithm => "AES256"
additional_settings => {
2 changes: 2 additions & 0 deletions integrations/amazon-security-lake/requirements.aws.txt
@@ -0,0 +1,2 @@
pyarrow>=10.0.1
pydantic>=2.6.1
2 changes: 1 addition & 1 deletion integrations/amazon-security-lake/requirements.txt
@@ -1,4 +1,4 @@
pyarrow>=10.0.1
parquet-tools>=0.2.15
pydantic==2.6.1
pydantic>=2.6.1
boto3==1.34.46