Amazon Security Lake integration - Data transform and delivery (DTD) #145
Comments
Has sending the data to a Kinesis Firehose with data transformation to Parquet been considered?
Together with the @wazuh/threat-intel team, we have worked on generating mappings to transform our data to the OCSF schema. To do that, we'll use the Detection Finding (2004) class, added in the v1.1.0 release of OCSF. The first proposal was to use the Security Finding (2001) class, but it was discarded because it is deprecated in the latest version of OCSF. OCSF version: 1.1.0
Originally posted by @IsExec in https://github.com/wazuh/internal-devel-requests/issues/699#issuecomment-1933401673
To verify that these mappings work and that our data is OCSF compliant, we have used the validate tool from amazon-security-lake-ocsf-validation, which had to be updated (the link redirects to the updated version), together with the parquet-tools CLI Python module:
parquet-tools show parquet/wazuh-event.ocsf.parquet
python validate.py -i ../../wazuh-indexer/integrations/amazon-security-lake/parquet/output -version ocsf_schema_1.1.0
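For reference, this is a minimal sketch of how the OCSF events produced by the mapping below could be written to a Parquet file with pyarrow before inspecting it with parquet-tools and checking it with validate.py. The helper name is illustrative and not part of the integration:

# Hypothetical helper: encode a list of OCSF event dictionaries as Parquet
# so the output can be inspected and validated as shown above.
import pyarrow as pa
import pyarrow.parquet as pq

def write_ocsf_parquet(ocsf_events: list, output_path: str) -> None:
    # The Arrow schema is inferred from the dictionary keys and value types.
    table = pa.Table.from_pylist(ocsf_events)
    pq.write_table(table, output_path, compression="zstd")

# e.g. write_ocsf_parquet([ocsf_event], "parquet/wazuh-event.ocsf.parquet")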
Python mappings to OCSF
#!/usr/bin/python

# event comes from Filebeat
event = {}


def normalize(level: int) -> int:
    """
    Normalizes rule level into the 0-6 range, required by OCSF.
    """
    # TODO normalization
    return level


def join(iterable, separator=","):
    """
    Joins an iterable of strings into a single string using the given separator.
    """
    return separator.join(iterable)


def convert(event: dict) -> dict:
    """
    Converts Wazuh events to OCSF's Detection Finding (2004) class.
    """
    ocsf_class_template = \
        {
            "activity_id": 1,
            "category_name": "Findings",
            "category_uid": 2,
            "class_name": "Detection Finding",
            "class_uid": 2004,
            "count": event["_source"]["rule"]["firedtimes"],
            "message": event["_source"]["rule"]["description"],
            "finding_info": {
                "analytic": {
                    "category": join(event["_source"]["rule"]["groups"]),
                    "name": event["_source"]["decoder"]["name"],
                    "type_id": 1,
                    "uid": event["_source"]["rule"]["id"],
                },
                "attacks": {
                    "tactic": {
                        "name": join(event["_source"]["rule"]["mitre"]["tactic"]),
                    },
                    "technique": {
                        "name": join(event["_source"]["rule"]["mitre"]["technique"]),
                        "uid": join(event["_source"]["rule"]["mitre"]["id"]),
                    },
                    "version": "v13.1"
                },
                "title": event["_source"]["rule"]["description"],
                "types": [
                    event["_source"]["input"]["type"]
                ],
                "uid": event["_source"]["id"]
            },
            "metadata": {
                "log_name": "Security events",
                "log_provider": "Wazuh",
                "product": {
                    "name": "Wazuh",
                    "lang": "en",
                    "vendor_name": "Wazuh, Inc."
                },
                "version": "1.1.0",
            },
            "raw_data": event["_source"]["full_log"],
            "resources": [
                {
                    "name": event["_source"]["agent"]["name"],
                    "uid": event["_source"]["agent"]["id"]
                },
            ],
            "risk_score": event["_source"]["rule"]["level"],
            "severity_id": normalize(event["_source"]["rule"]["level"]),
            "status_id": 99,
            "time": event["_source"]["timestamp"],
            "type_uid": 200401,
            "unmapped": {
                "data_sources": [
                    event["_index"],
                    event["_source"]["location"],
                    event["_source"]["manager"]["name"]
                ],
                "nist": event["_source"]["rule"]["nist_800_53"],  # Array
            }
        }
    return ocsf_class_template
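As a quick usage sketch, convert() can be exercised with a trimmed-down alert like the one below. The field values are made up and only meant to show the shape the function expects; real Wazuh events carry many more fields:

# Hypothetical, minimal Wazuh alert used only to exercise convert().
sample_event = {
    "_index": "wazuh-alerts-4.x-sample",
    "_source": {
        "id": "1707400000.123456",
        "timestamp": "2024-02-08T12:00:00.000+0000",
        "full_log": "Feb  8 12:00:00 host sshd[1234]: Failed password for root",
        "location": "/var/log/auth.log",
        "agent": {"id": "001", "name": "agent-01"},
        "manager": {"name": "wazuh-manager"},
        "decoder": {"name": "sshd"},
        "input": {"type": "log"},
        "rule": {
            "id": "5710",
            "level": 5,
            "firedtimes": 3,
            "description": "sshd: Attempt to login using a non-existent user",
            "groups": ["syslog", "sshd", "authentication_failed"],
            "nist_800_53": ["AC.7", "AU.14"],
            "mitre": {
                "id": ["T1110.001"],
                "tactic": ["Credential Access"],
                "technique": ["Password Guessing"],
            },
        },
    },
}

ocsf_event = convert(sample_event)
print(ocsf_event["class_name"], ocsf_event["severity_id"])  # Detection Finding 5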
I'm working on an event generator tool to test the integration and ease its development.
Description
Now that we know how OCSF works, how to encode data in Parquet, and how to implement a Logstash pipeline to send events from the wazuh-indexer indexes to an S3 bucket, we need to bundle it all together and prepare the data before sending it to AWS. As explained in #113, we need to transform the data during the pipeline, so that it is uploaded to the Amazon Security Lake S3 bucket already in OCSF and Parquet.
To transform the data, we'll explore the use of a Lambda function and a Python script. The main difference between these two approaches is the resources they require, as the first one needs an auxiliary S3 bucket.
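As a starting point, here is a minimal sketch of the Lambda-based approach, assuming that raw events land as newline-delimited JSON objects in the auxiliary S3 bucket, that pyarrow is available to the function (e.g. via a layer), and that the converted output goes to the Security Lake bucket as Parquet. The bucket names, object layout, environment variable, and module name are placeholders, not the final design; convert() is the mapping function shown above:

# Hypothetical Lambda handler: reads a raw NDJSON object from the auxiliary
# bucket, maps each event to OCSF with convert(), and uploads a Parquet object
# to the Security Lake bucket. Names and environment variables are placeholders.
import json
import os
import urllib.parse

import boto3
import pyarrow as pa
import pyarrow.parquet as pq

from transform import convert  # assumed module holding the mapping shown above

s3 = boto3.client("s3")


def lambda_handler(event, context):
    # S3 event notification: take the bucket and key of the first record.
    record = event["Records"][0]["s3"]
    src_bucket = record["bucket"]["name"]
    src_key = urllib.parse.unquote_plus(record["object"]["key"])

    # Read the raw events (one JSON document per line).
    body = s3.get_object(Bucket=src_bucket, Key=src_key)["Body"].read()
    raw_events = [json.loads(line) for line in body.splitlines() if line.strip()]

    # Map every Wazuh event to the OCSF Detection Finding class.
    ocsf_events = [convert(e) for e in raw_events]

    # Encode to Parquet in /tmp, the only writable path inside Lambda.
    table = pa.Table.from_pylist(ocsf_events)
    tmp_path = "/tmp/wazuh-event.ocsf.parquet"
    pq.write_table(table, tmp_path, compression="zstd")

    # Upload to the Security Lake bucket (placeholder env var and key scheme).
    dst_bucket = os.environ["SECURITY_LAKE_BUCKET"]
    dst_key = src_key.rsplit(".", 1)[0] + ".ocsf.parquet"
    s3.upload_file(tmp_path, dst_bucket, dst_key)

    return {"converted": len(ocsf_events), "destination": f"s3://{dst_bucket}/{dst_key}"}

The Python-script approach would reuse the same convert() and Parquet-encoding steps, only driven from the Logstash pipeline instead of an S3-triggered Lambda, which is why it does not need the auxiliary bucket.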
Tasks
Subtasks
Definition of done
These two proposals will be worked on in parallel. As soon as we manage to get one of them working, we can consider this issue completed. Once that happens, we'll discuss the next steps.