Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Suricata] Update fields and paths #8550

Merged
merged 16 commits into from
Oct 16, 2018

Conversation

adriansr
Copy link
Contributor

@adriansr adriansr commented Oct 3, 2018

This PR updates the suricata module to:

  • Add top-level source_ecs to compensate for source in 6.x.
  • Use ECS fields where appropriate.
  • Include mappings for src/dst geo points.
  • Use the correct path for eve.json
    • Windows: Path used by msi package install.
    • macOS: Path used by brew package install.
  • Fields from eve.json added to template

x-pack/filebeat/module/suricata/eve/_meta/fields.yml Outdated Show resolved Hide resolved
, "ignore_missing": true
}
}

, { "geoip":
{ "field": "suricata.eve.src_ip"
, "target_field": "ecs.source.geo"
, "target_field": "suricata.eve.src_geo"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ECS doesn't yet have a source.geo or a destination.geo but it will. So for master, where we don't have the source field conflict, we should put the geo data under those fields.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you already have a plan for dealing with the ECS source conflict with the Filebeat source for 6.x?

Perhaps we could just rename all uses of the ECS source field to source_ecs when we backport.

Copy link
Contributor Author

@adriansr adriansr Oct 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea was to have this branch directly mergeable into 6.x to speed up things, so no source usage yet.

I thought it makes more sense to not create an intermediate field (i.e. source_ecs) just to have it replaced with source soon. So I leave this under suricata.eve for now and keep those fields too for 7.x, adding also source.

Makes any sense?

Copy link
Member

@andrewkroh andrewkroh Oct 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I follow your thinking.

My idea with source_ecs is that this would be the only difference between the module in 6.x vs 7.x. My hope is that this would make forward upgrades easier for early adopters of the module. We can even warn users in advance that we will definitely be breaking the source_ecs field in 7.x by renaming it to source.

We can't guarantee that we won't break other fields, but if they build out any security content (dashboards, alerts) it will be easier to update this content. I guess this will even make it easier for us with the dashboards we create for 6.x.

{ "field": "ecs.user_agent.os_minor"
, "target_field": "ecs.user_agent.os.minor"
{ "field": "user_agent.os_minor"
, "target_field": "user_agent.os.minor"
, "ignore_missing": true
}
}

, { "geoip":
{ "field": "suricata.eve.src_ip"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we should "promote" this up to the source.ip.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same answer as above, I was refraining from using source.

Copy link
Contributor

@webmat webmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's looking good overall, but I have a few questions I've added here and there.

filebeat/_meta/fields.common.yml Show resolved Hide resolved

, { "set":
{ "field": "http.request.method"
, "value": "{{suricata.eve.http.http_method}}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The set operations I had done initially were on fields that were always present. How does it behave for missing fields? Does it create an empty http.request.method field? If that's the case, we must find a way that doesn't create empty fields. Because the mappings you just added are for http events, but you still have dns, tls, ftp, smtp and a whole bunch more to map. We can't end up with all of them defined (and empty) on each event :-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these mappings remind me that we should likely add a new processor that can do mass rename/copies with light type coercion (e.g. string to int or float, mostly). I could have sworn I had opened an issue for this. Because this is very painful at the moment.

I wonder whether Beats or Ingest Node would be a better place to have this. Perhaps Ingest Node? @adriansr @andrewkroh @tsg What do you think?

Copy link
Contributor Author

@adriansr adriansr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm having some github issues with my messages disappearing / showing up later, hope you can follow up the discussion)

{ "lang": "painless"
, "source": "if (ctx.suricata.eve.src_ip != null) { if (ctx.source_ecs == null) ctx.source_ecs = new HashMap(); ctx.source_ecs.ip = ctx.suricata.eve.src_ip; }"
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@webmat what do you think of this way to conditionally set a field only when the source field exists? I'm not familiar with ingest pipelines or painless

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it looks like we have the option (not tested) of using the convert processor. Set type to be the original type and use ignore_missing and target_field. It's hacky but slightly cleaner than the above painless.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lie hidden in the name "Painless". I cannot imagine myself doing this for 50+ fields. And that's just Suricata...

I haven't looked at the convert processor. I'll go have a look.

But I do think we need a more declarative and compact way to do it. Imagine this was an Ingest Node plugin:

{ "normalize":
  [ { "source": "suricata.eve.event_type", "target": "event.type", "error_missing": true },
    { "source": "suricata.eve.http.status", "target": "http.response.status_code", "conversion": "integer" },
    { ... }
  ]
},

By default error_missing = false, so the long list of fields we may have to convert would not need to specify that. But we could still force an error to be added to the event when really critical fields are missing, as I'm showing in the first event.

This example is loosely based on a plugin I was proposing we add to Logstash elastic/logstash#9768 (for Suricata, when I started it as a LS module). Although I didn't keep pushing it when we moved Suricata to Beats. So it's still in the idea phase at this time for LS.

Copy link
Contributor

@webmat webmat Oct 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah convert could work, and it's not that hacky. But I feel that having one processor definition per field is pretty heavy. Plus the default is not to ignore missing, which makes the pipeline definition heavier still. And since it's specifically tailored to convert, the type field is mandatory, when in this case I'd say 80%+ of the fields are straight renames that should remain as strings 😂

But I'd say let's go this way for now, it's the simplest approach, I think. The important part is to not have tons of empty fields :-)

{ "convert":
  { "field": "suricata.eve.event_type",
    "target_field": "event.type",
    "type": "string"
  }
}, 
{ "convert":
  { "field": "suricata.eve.http.status",
    "target_field": "http.response.status_code",
    "type": "integer",
    "ignore_missing": true
  }
}, 
...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad news, conversion doesn't seem to work:

Error loading pipeline: Error loading pipeline for fileset suricata/eve: couldn't load pipeline: couldn't load json. Error: 400 Bad Request: {"error":{"root_cause":[{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"},"suppressed":[{"type":"parse_exception","reason":"[type] type [ip] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [ip] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [text] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}}]}],"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"},"suppressed":[{"type":"parse_exception","reason":"[type] type [ip] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [ip] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [text] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}}]},"status":400}. Response body: {"error":{"root_cause":[{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"},"suppressed":[{"type":"parse_exception","reason":"[type] type [ip] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [ip] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [text] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}}]}],"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"},"suppressed":[{"type":"parse_exception","reason":"[type] type [ip] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [ip] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [text] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}},{"type":"parse_exception","reason":"[type] type [keyword] not supported, cannot convert field.","header":{"processor_type":"convert","property_name":"type"}}]},"status":400}```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake, I was using elasticsearch types when it expects JSON types. It works

{ "lang": "painless"
, "source": "void createAndSet(def ctx, List components, def value) { int n = components.getLength(); try { def map = ctx; for (int i = 0; i < n - 1; i++) { map = map.computeIfAbsent(components.get(i), x -> new HashMap()); } map[components.get(n-1)] = value; } catch (Exception e) { } } def getValue(def ctx, List components) { int n = components.getLength(); try { def map = ctx; for (int i = 0; i < n - 1; i++) { map = map.get(components.get(i)); if (map == null) return null; } return map[components.get(n-1)]; }catch(Exception e) { return null; } } List split(String path) { List res = new ArrayList(); int pos = 0; int next = -1; while ( (next=path.indexOf('.',pos))>=0) { if (pos < next - 1) { res.add(path.substring(pos, next)); } pos = next + 1; } if (pos < path.length()) { res.add(path.substring(pos)); } return res; } void copyIfPresent(def ctx, String source, String dest) { def val = getValue(ctx, split(source)); if (val != null) { createAndSet(ctx, split(dest), val); } } int n = params.list.getLength(); for (int i=0; i<n; i+=2) { copyIfPresent(ctx, params.list.get(i), params.list.get(i+1)); }"
, "params": {
"list": [
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ugly painless does the job of creating an ECS copy of a field only if it exists in the event. Too bad it can't be formatted or documented properly, but at least I've been able to parametrize the field names.

@webmat WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I see that convert works, I guess convert is clearer than this ugly script?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative rename on the Beats side could be used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd go for Ingest Node convert.

If we want to keep the original Suricata object around, we can't use Beats rename

webmat
webmat previously approved these changes Oct 5, 2018
Copy link
Contributor

@webmat webmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

filebeat/_meta/fields.common.yml Show resolved Hide resolved
@adriansr
Copy link
Contributor Author

adriansr commented Oct 9, 2018

@webmat @andrewkroh I've pushed some fixes, please have a look

Copy link
Contributor

@webmat webmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the fields that are being copied over to the ECS field names are not being used everywhere. Examples of that are: both the IP and port of the source and destination in both of the saved searches. We're still using the suricata.eve.* fields there.

Some well defined ECS fields are not being copied to their ECS location from the original Suricata object. However I think it's ok for now, as the massive amount of copying required to move most fields will be easier once we have a better way than the current field by field approach with the convert filter. So we'll need to revisit anyway.

Ingestion and dashboards all look fine to me otherwise.

So all in all I'm good with this as is, since we'll need to revisit it again later.

I'd rather use the ECS IP & port fields in both saved searches, but since we need to revisit anyway, perhaps it doesn't matter. You can fix those now or keep them for later, I'm good with both approaches.

I've left a few comments here and there, but since I'm not sure of the team's usual approach, I'm not marking them as blockers for this PR.

{"field": "suricata.eve.http.http_user_agent"
,"target_field": "user_agent.original"
,"type": "string"
,"ignore_missing": true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous was more compact (hadn't noticed the params initially). But the single line of Painless was inscrutable. I much prefer this, even though it's more verbose. It's really hard to make a programming error in this JSON :-)

Plus when we get a massive rename/coerce processor in IN we can make all of this more compact again.

@@ -6,6 +6,54 @@
- name: event_type
type: keyword

- name: vars
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think trying to define the booleans under vars is a losing game. Isn't this super arbitrary depending on the kinds of messages we're getting? And actually, are they detected as booleans without this, or are they interpreted as strings? Does the JSON contain "true" or true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they are true (not a string).

These fields definitions come from a generator, so it outputs all the fields it finds in the provided eve.json.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm happy you didn't have write them all by hand, then :-)

This doesn't change the fact that I get the sense that these fields are very message-specific. So if we didn't happen to generate messages containing other fields from our 2 respective computers, then we're missing an unknown amount of fields here. It's in that sense that I'd rather not declare them at all.

But perhaps this is just my OCD. Perhaps nobody will notice and perhaps this is fine.

- name: software_version
type: keyword

- name: stats
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be inclined to skip all of the integers under stats. They're already numerics in the JSON logs, and will therefore be autodetected as long.

Do we have a policy of trying to define as many fields as possible in modules (even when it's basically a no-op)? I understand that if the defaults change this could cause problems, but this seems like a lot of overhead for something that doesn't change the outcome.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that if they are removed from the template, the fields won't be available in Kibana i.e. for Dashboards?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's incorrect. This is just declaring these field types explicitly. They will all end up in ES the same whether this is there or not. Kibana will let you use them in dashboards etc regardless of this. If they are in an ElasticSearch index, you can use them in Kibana, there's no other dependency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we just discussed, we'll keep stats and get rid of vars. Keeping stats around will ensure the Kibana index pattern is pre-populated with the numeric stats, which could eventually be used in visualizations we ship with the module. So it makes sense to pre-populate those with the index template.

@adriansr adriansr merged commit 4d60173 into elastic:feature-suricata Oct 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants