Source/target and diff modeling #678

Open
andrewstucki opened this issue Dec 5, 2019 · 12 comments
@andrewstucki
Contributor

This is a more generic application of modeling either "source/target" style events or "diff" events that I'm spinning off from #589

As I initially mentioned (#589 (comment)), there are a number of things that ECS should support modeling, including:

  • setuid/setgid operations
  • file modification events (i.e. renames, permissions)
  • IPC calls
  • network requests/flows (current use of source and destination)
  • user modifications
  • process execs
  • registry modifications
  • windows source/target audit log info

Overall these more or less fall into two categories:

  1. Modeling communication between two like things (network connections, IPCs, windows audit log)
  2. Modification events

Currently the way ECS has started to approach this is to make fields that are specific to each of these domains, i.e. source/destination are currently for network modeling only, and then there's also client/server

I'm advocating for adopting a more generic field set that allows you to do generic source/target or diff modeling which would essentially allow you to embed any other field set in it.

For example--something like origin and target (sad that source/destination is already taken):

For file modifications:

origin.file.name = "foo"
target.file.name = "bar"

For process execs:

origin.process.path = "foo"
target.process.path = "bar"

For network requests (slightly difficult because of the lack of port info):

origin.host.ip = "foo"
target.host.ip = "bar"

For user modification (maybe by another user baz who did the modification?):

user.name = "baz"
origin.user.name = "foo"
target.user.name = "bar"
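
As a rough illustration of the examples above, the sketch below wraps existing field sets under `origin` and `target` prefixes. The helper names `nest` and `source_target_event` are hypothetical and not part of ECS; this only shows the shape of the proposal, not an agreed schema.

```python
def nest(prefix, fields):
    """Return a copy of the dotted field names with `prefix` prepended."""
    return {f"{prefix}.{name}": value for name, value in fields.items()}


def source_target_event(origin, target, actor=None):
    """Build a flat event from an origin field set and a target field set.

    `actor` (an assumption for this sketch) holds fields for a third entity
    that performed the change; it stays at the top level of the event.
    """
    event = {}
    event.update(actor or {})
    event.update(nest("origin", origin))
    event.update(nest("target", target))
    return event


# File rename: the same field set is embedded on both sides.
rename = source_target_event({"file.name": "foo"}, {"file.name": "bar"})

# User modification performed by a third user, baz.
user_change = source_target_event(
    {"user.name": "foo"}, {"user.name": "bar"}, actor={"user.name": "baz"}
)
```

Any existing field set could be passed in the same way, which is the point of keeping `origin` and `target` generic.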

Thoughts?

@janniten
Contributor

For user modification (maybe by another user baz who did the modification?):

user.name = "baz"
origin.user.name = "foo"
target.user.name = "bar"

Hi @andrewstucki,
baz is the user performing the modification?
Thank you

@andrewstucki
Contributor Author

@janniten that was the idea in that particular example, but the desire was just to highlight the need to model any sort of diffing or source/destination style events in a sizeable portion of the existing field sets out there.

I'm fairly open to ideas on naming/implementation (would this be at the top-level as I suggested or embedded inside each top-level field set, etc.), but just calling out the need to solve this issue for a number of use-cases so that we don't just solve the same problem a myriad of different ways as they arise. So, any suggestions/opinions on how we should go about this are more than welcome!

This was referenced Dec 13, 2019
@rw-access
Contributor

rw-access commented Dec 18, 2019

Cross-process activity could also be modeled this way. This could include injection, credential access/handle opening, etc.

source.process.name = "totallynotmalware.exe"
target.process.name = "lsass.exe"

I'd prefer that versus nesting source/target under process. Then you have two mega namespaces source and target without needing to pollute everything with a source and target.

You could also argue that source is implicit for the existing namespaces. The above example could be equivalently written as:

process.name = "totallynotmalware.exe"
target.process.name = "lsass.exe"

The big advantage is that you can now build aggregations on those more easily. For instance, "what are the counts of the categories of events for cmd.exe?" You don't want to search both source.process.name and process.name, and doing so would also make it tricky to assign the results to buckets.

process.name == "cmd.exe" // instead of : process.name == "cmd.exe" or source.process.name == "cmd.exe"
| count event.category
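
The same aggregation can be sketched in Python. The sample events and the `count_categories` helper are made up for illustration; the point is only that, with the acting process always in `process.name`, a single predicate is enough.

```python
from collections import Counter

# Hypothetical events: the acting process always lives in process.name,
# and target.process.name is only present for cross-process activity.
events = [
    {"process.name": "cmd.exe", "event.category": "process"},
    {"process.name": "cmd.exe", "target.process.name": "lsass.exe",
     "event.category": "process"},
    {"process.name": "cmd.exe", "event.category": "network"},
    {"process.name": "explorer.exe", "target.process.name": "cmd.exe",
     "event.category": "process"},
]


def count_categories(events, process_name):
    """Count event categories where `process_name` is the acting process."""
    return Counter(
        e["event.category"]
        for e in events
        if e.get("process.name") == process_name
    )


counts = count_categories(events, "cmd.exe")
```

The last event is not counted, because there cmd.exe is the target rather than the actor.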

@andrewstucki
Contributor Author

I totally agree about the implicit nature of some things, and am slightly less concerned with the proposed origin field set than the target field set. The thing that this would preclude us from doing though is using something like the origin field set to model the original state of a diff operation from a third entity (i.e. the example of the user who modifies a different user's name).

At least in our particular use case for security related stuff, I think that there's a ton of utility introducing the target field set even if we weren't able to introduce origin. But I'd like people with other use-cases to chime in on the utility of these constructs as well.

@rw-access
Contributor

rw-access commented Dec 18, 2019

@andrewstucki just to make sure I'm tracking your example well, does this reflect your example?
User Alice renamed Bob to Robert.

{
  "user": {"name": "alice"},
  "origin": {"user": {"name": "bob"}},
  "target": {"user": {"name": "robert"}}
}

This maps most closely to Event ID 4781 in the Windows Security event log:

Subject:

   Security ID:  ACME\Alice
   Account Name:  Alice
   Account Domain:  ACME
   Logon ID:  0x1f40f

Target Account:

   Security ID:  ACME\robert
   Account Domain:  ACME
   Old Account Name: bob
   New Account Name: robert

Additional Information:

   Privileges:  -
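
A minimal sketch of how such an event could be turned into the document above. The input keys (`SubjectUserName`, `OldTargetUserName`, `NewTargetUserName`) are assumed to be the names under which a collector exposes the event data, and `map_rename_event` is a hypothetical helper, not existing code.

```python
def map_rename_event(event_data):
    """Map a 4781-style account rename into user/origin/target field sets.

    The acting account goes to the top-level `user`; the old and new
    names of the renamed account go to `origin` and `target`.
    """
    return {
        "user": {"name": event_data["SubjectUserName"]},
        "origin": {"user": {"name": event_data["OldTargetUserName"]}},
        "target": {"user": {"name": event_data["NewTargetUserName"]}},
    }


doc = map_rename_event({
    "SubjectUserName": "Alice",
    "OldTargetUserName": "bob",
    "NewTargetUserName": "robert",
})
```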

@marshallmain
Contributor

marshallmain commented Dec 18, 2019

Using origin and target to represent the old vs new state of a single user (or object in general) seems like a big semantic change from the process semantics where the origin process is generally taking action on the target process.

If I started out using origin.process and target.process and tried to apply that logic to origin.user and target.user I would expect origin.user to be taking an action on target.user, whereas the suggestion here is origin.user is transforming into target.user.

I definitely want to add the capability to describe an original state that has transformed to a new state but conflating it with an object taking action on a different object seems confusing to me.

andrewkroh added a commit to elastic/beats that referenced this issue Feb 5, 2020
…module (#15217)

Added Audit and Log Management related events, Computer Object Management Events, Distribution Groups Events. Changed user.name field for user management events and related.user mapping.

New Events

Because Windows events are the source of information for Winlogbeat, events 1100, 1102, 1104, 1105, 1108 and 4719 have been added in order to monitor changes in the audit policy configuration, log deletion and other failures in the log subsystem.

For event 4719, a human readable description was added in order to know which setting was modified (winlog.event_data.SubCategory) and to which value (winlog.event_data.AuditPolicyChangesDescription).

Distribution Groups (Security-Disabled) Management Events were added. Those events are processed in the same way and with the same function as Security Groups (#14299). To add information about the nature of the group being managed, the type (Security-Disabled/Security-Enabled) and scope (Local, Global, Universal) were added as winlog.group.type and winlog.group.scope.

ComputerObject Management events were also added.

Changes to ECS mappings

In elastic/ecs#678 and elastic/ecs#589 we have been discussing how n-ary relationships between users in an event should be named and mapped into ECS. In #13530 winlog.event_data.TargetUserName was mapped to user.name, but for the reasons laid out in elastic/ecs#678 and elastic/ecs#589 the mapping winlog.event_data.SubjectUserName -> user.name is more appropriate. This mapping was changed.

Also, with the addition of related fields in ECS 1.3, and specifically the related.user field (elastic/ecs#694), all the user names appearing in an event are mapped to related.user. Every time a SubjectUserName or TargetUserName is copied, it is also added to the related.user field, as are other users appearing in the event.

Event test data were added for all events with the exception of event 1108 which I was not able to reproduce.

Co-authored-by: Lee Hinman <57081003+leehinman@users.noreply.github.com>
Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
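
The mapping described in this commit message can be sketched roughly as follows. This is an illustrative Python sketch only, not the module's actual code; `map_users` is a hypothetical helper, and only the two field names mentioned in the commit message are handled.

```python
def map_users(event_data):
    """Map the acting user to user.name and collect every user name seen
    in the event into related.user, without duplicates."""
    doc = {}
    subject = event_data.get("SubjectUserName")
    target = event_data.get("TargetUserName")

    # The subject (the account performing the action) becomes user.name.
    if subject:
        doc["user"] = {"name": subject}

    # Every user name in the event is also added to related.user.
    related = []
    for name in (subject, target):
        if name and name not in related:
            related.append(name)
    if related:
        doc["related"] = {"user": related}
    return doc


mapped = map_users({"SubjectUserName": "alice", "TargetUserName": "bob"})
```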
andrewkroh pushed a commit to andrewkroh/beats that referenced this issue Mar 18, 2020
…module (elastic#15217)

(cherry picked from commit e624aef)
andrewkroh added a commit to elastic/beats that referenced this issue Mar 18, 2020
…ent Events - ECS related.user field mapping (#17090)

Co-authored-by: Anabella Cristaldi <33020901+janniten@users.noreply.github.com>

(cherry picked from commit e624aef)
@rw-access
Contributor

rw-access commented Mar 3, 2021

I'm thinking that this issue is overdue to make it into ECS. We're also running into this with Endpoint, as part of the development for process injection telemetry and detection. The interim approach is Target.process for now. Capitalizing the first letter is generally what Elastic Endpoint does to avoid collisions with future ECS, like process.Ext for example.

I'm most interested in source/target for process events. I think the most common use cases involve an acting process that "does the thing" and a target process that "has the thing done to it." Do you think it's fair to initially limit the scope to that, @andrewstucki? We can leave this issue open for the generic source/target problem if you like.

@ebeahan Do you think this would be good to make into an RFC, or straight to a PR? I'm thinking that process.* should capture the source information and process.target.* is just the process fieldset reused to capture information about the target process. It's possible process.target.parent would be useful or populated (@gabriellandau do you know?), but I don't think we would nest any further than that.

@andrewstucki @gabriellandau are either of you interested in helping drive this forward?

@ebeahan
Member

ebeahan commented Mar 3, 2021

@rw-access Yes, I do think this topic makes a good RFC candidate. Since this is something that's already been discussed extensively, the initial RFC draft should probably target stage 1.

++ for using process for sources and process.target.* for the destinations/targets. The approach has symmetry with what's already been adopted for multiple users in events.

@andrewstucki
Contributor Author

@rw-access I think it's definitely good to limit this at first, and 👍 on the idea of nesting under the process field. As @ebeahan pointed out, there's new precedent for preferring field subsets like target nested under the entity rather than the other way around.

@gabriellandau

Interesting. In the past, we would put something under something else to indicate that it is a property of that other thing. Would Target.process.thread go into process.thread.target or process.target.thread?

Some other questions that come to mind while we're discussing schema layout.

A Windows token is a security credential containing your username, groups, and related information. This is where we get the username from. Threads can have [impersonation] tokens that differ from their containing process's [primary] token. An event/action occurs in the security context of the impersonation token, but it can still be useful to know the primary token. These tokens can have different users and IDs. Target threads can be impersonating as well, so we can have:

  • acting process token
  • OPTIONAL acting thread token. If present, this contains the effective user.
  • target process token
  • OPTIONAL target thread token

We're not currently returning impersonation token information, but we may want to in the future.
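
To make the token relationship concrete, here is a small sketch of how the effective security context could be resolved. The `Token` type and `effective_token` helper are hypothetical, and the field names only loosely follow the sample data; nothing here reflects what Endpoint actually emits.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    """A pared-down security token (hypothetical shape for this sketch)."""
    user: str
    sid: str
    integrity_level_name: Optional[str] = None


def effective_token(process_token: Token,
                    thread_token: Optional[Token] = None) -> Token:
    """Return the token whose security context an action runs under.

    If the thread carries an impersonation token, that token is the
    effective one; otherwise the process's primary token applies.
    """
    return thread_token if thread_token is not None else process_token


primary = Token(user="user", sid="S-1-5-21-2862132742-1403383571-1346394525-1001",
                integrity_level_name="high")
impersonation = Token(user="SYSTEM", sid="S-1-5-18",
                      integrity_level_name="system")

acting = effective_token(primary, impersonation)  # thread is impersonating
plain = effective_token(primary)                  # no impersonation token
```

The same resolution would apply on the target side, giving up to four tokens per event.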

Each of these tokens can have additional relevant attributes, some of which we like to know, such as Integrity Level. How do you think this should all lay out? Here's some pared-down sample data from a 7.12.0 diagnostic (not user-facing yet) alert. Note that this alert describes an action between two threads in the same process, so everything is the same except process.thread.id != Target.process.thread.id, but that's often not the case.

"Target": {
    "process": {
        "Ext": {
            "token": {
                "domain": "DESKTOP-4S6F4KN",
                "elevation": true,
                "elevation_type": "full",
                "integrity_level_name": "high",
                "sid": "S-1-5-21-2862132742-1403383571-1346394525-1001",
                "user": "user"
            }
        },
        "executable": "C:\\Windows\\System32\\cmd.exe",
        "parent": {
            "Ext": {
                "token": {
                    "domain": "DESKTOP-4S6F4KN",
                    "elevation": true,
                    "elevation_type": "full",
                    "integrity_level_name": "high",
                    "sid": "S-1-5-21-2862132742-1403383571-1346394525-1001",
                    "user": "user"
                }
            },
            "executable": "C:\\Program Files\\Python38\\python.exe",
            "pid": 4880
        },
        "thread": {
            "id": 7712
        }
    }
},
"process": {
    "Ext": {
        "token": {
            "domain": "DESKTOP-4S6F4KN",
            "elevation": true,
            "elevation_type": "full",
            "integrity_level_name": "high",
            "sid": "S-1-5-21-2862132742-1403383571-1346394525-1001",
            "user": "user"
        }
    },
    "executable": "C:\\Windows\\System32\\cmd.exe",
    "parent": {
        "Ext": {
            "token": {
                "domain": "DESKTOP-4S6F4KN",
                "elevation": true,
                "elevation_type": "full",
                "integrity_level_name": "high",
                "sid": "S-1-5-21-2862132742-1403383571-1346394525-1001",
                "user": "user"
            }
        },
        "executable": "C:\\Program Files\\Python38\\python.exe",
        "pid": 4880
    },
    "thread": {
        "id": 7680
    }
}

@rw-access
Contributor

rw-access commented Mar 17, 2021

There is another issue for tokens specifically, #810.

Do you think we would scope those separately or are they terribly intertwined and impossible to decouple? (Honest question)

The meaning of .target, .effective, .new originated in RFC-0007, AFAICT.

Good question about target threads. I don't know the right answer, but that does sound like exactly the right thing to hash out on the RFC for target processes: #1297

Wanna be a SME for that?

Edit: I think it would be process.target.thread.*
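
For illustration, moving the interim `Target.process` layout to the proposed `process.target.*` nesting could look like the sketch below. The `nest_target` helper is hypothetical, and the final layout is still to be settled in the RFC.

```python
import copy


def nest_target(event):
    """Return a copy of `event` with Target.process moved to process.target.

    Other keys under Target (if any) are left in place; the input event
    is not modified.
    """
    result = copy.deepcopy(event)
    target = result.get("Target", {})
    target_process = target.pop("process", None)
    if target_process is not None:
        result.setdefault("process", {})["target"] = target_process
    if not target:
        # Drop the now-empty interim namespace.
        result.pop("Target", None)
    return result


interim = {
    "Target": {"process": {"executable": "C:\\Windows\\System32\\cmd.exe",
                           "thread": {"id": 7712}}},
    "process": {"executable": "C:\\Windows\\System32\\cmd.exe",
                "thread": {"id": 7680}},
}
converted = nest_target(interim)
```

With this layout the target thread ID ends up at `process.target.thread.id`, next to the acting thread at `process.thread.id`.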

@ebeahan
Member

ebeahan commented Mar 23, 2021

Good question about target threads. I don't know the right answer, but that does sound like exactly the right thing to hash out on the RFC for target processes: #1297

Yes absolutely feel free to continue the discussion over in #1297. 😄

++ to process.target.thread.*
