Entity id desc #1

Draft
wants to merge 3 commits into
base: main
Conversation

mjwolf
Owner

@mjwolf mjwolf commented Nov 29, 2023

No description provided.

Update the process.entity_id description with recommended generation methods.

These methods will allow entity_id to be generated reproducibly, while being
unique to a process.

If different data source collectors use the recommended generation method and
observe events from the same process, they will generate the same entity_id, and
it will be possible to correlate the events and identify them as belonging to the
same process later.

StartTimeSeconds:: Process start time in UNIX time seconds

StartTimeMilliseconds:: Millisecond precision of process start time

Windows actually gives us a lot more precision than milliseconds in the form of a FILETIME: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getprocesstimes
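
To illustrate the precision point, here is a minimal sketch of how a FILETIME value could be split into the StartTimeSeconds and StartTimeMilliseconds parts, plus the sub-millisecond remainder that a milliseconds-only field would drop. It relies only on the documented FILETIME definition (100-nanosecond intervals since 1601-01-01 UTC). The function name and return shape are hypothetical and are not part of this PR.

```python
# FILETIME counts 100-nanosecond intervals ("ticks") since 1601-01-01 00:00:00 UTC.
EPOCH_DIFF_SECONDS = 11_644_473_600  # seconds between 1601-01-01 and 1970-01-01
TICKS_PER_SECOND = 10_000_000
TICKS_PER_MILLISECOND = 10_000


def filetime_to_unix_parts(filetime: int) -> tuple:
    """Split a FILETIME into (unix_seconds, milliseconds, leftover_ticks).

    Hypothetical helper: leftover_ticks is the precision lost when only
    seconds and milliseconds are kept.
    """
    seconds_since_1601, sub_second_ticks = divmod(filetime, TICKS_PER_SECOND)
    unix_seconds = seconds_since_1601 - EPOCH_DIFF_SECONDS
    milliseconds, leftover_ticks = divmod(sub_second_ticks, TICKS_PER_MILLISECOND)
    return unix_seconds, milliseconds, leftover_ticks
```

If two collectors truncate the same FILETIME this way, they arrive at the same seconds and milliseconds values; the leftover ticks are only useful if every collector can observe them.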

@intxgo

intxgo commented Dec 6, 2023

Endpoint's entity_id is already used by users (Kibana and API). The Response Actions console lists it alongside processes, and the user can then grab any of them to call kill-process --entity_id.

As a reminder, it's essential to use ids on Windows, where PIDs are reused often.

Therefore the gist of my earlier thought was to make the string more human readable like other cloud providers do, basically some sort of GUID. I guess that before Response Actions, the entity_ids were regarded as something to be used in code, not by humans, so the visual readability of base64 was not taken into account. I'd rather start with unique process information at the front of it, as MachineGUID is constant per Endpoint.

I don't mind documenting it publicly. My original thought was that's not needed, but I can't see any reason not to do it.


A String formatted as: "MachineGUID__PID__StartTimeSeconds__StartTimeMilliseconds"

MachineGUID:: The Windows Machine GUID, which can be read from the Windows registry
I wonder if we would ever have a need to know that the entity was generated by an Endpoint receiving it.
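
For reference, the format string quoted in this review thread ("MachineGUID__PID__StartTimeSeconds__StartTimeMilliseconds") could be assembled roughly as follows. This is a sketch only: the function name is hypothetical, the GUID in the usage note is a made-up placeholder, and it assumes StartTimeMilliseconds is the millisecond component within the second, which the field description does not state explicitly.

```python
def make_entity_id(machine_guid: str, pid: int,
                   start_time_seconds: int, start_time_milliseconds: int) -> str:
    """Join the four fields with a double underscore, in the order quoted above.

    Hypothetical helper; assumes start_time_milliseconds is the 0-999
    millisecond part of the process start time.
    """
    fields = [machine_guid, str(pid), str(start_time_seconds), str(start_time_milliseconds)]
    return "__".join(fields)
```

Two collectors that observe the same process and call `make_entity_id("example-guid", 4242, 1701216000, 137)` would both produce `example-guid__4242__1701216000__137`, which is what makes later correlation possible.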

Re-order the entity_id fields to put the PID and start time first. To
improve human readability and scanning efficiency, the fields that
change most often come first.

Also redefined Windows entity_id to use existing Windows process fields
that are more likely to be unique for the process.
@mjwolf
Owner Author

mjwolf commented Dec 7, 2023

> Endpoint's entity_id is already used by users (Kibana and API). The Response Actions console lists it alongside processes, and the user can then grab any of them to call kill-process --entity_id.
>
> As a reminder, it's essential to use ids on Windows, where PIDs are reused often.
>
> Therefore the gist of my earlier thought was to make the string more human readable like other cloud providers do, basically some sort of GUID. I guess that before Response Actions, the entity_ids were regarded as something to be used in code, not by humans, so the visual readability of base64 was not taken into account. I'd rather start with unique process information at the front of it, as MachineGUID is constant per Endpoint.
>
> I don't mind documenting it publicly. My original thought was that's not needed, but I can't see any reason not to do it.

@intxgo How important do you think it is to keep this human readable? If we change to use fixed-width fields, I think it also makes sense to encode numbers with hex; it would reduce the overall length of the string. But it would also make it a bit less human-readable, as PIDs won't be in the normal decimal format that's used in most places.

(I've made the changes to the PR to re-order the fields and change the Windows definition, but I haven't changed to fixed-width fields yet.)

@intxgo

intxgo commented Dec 7, 2023

> @intxgo How important do you think it is to keep this human readable? If we change to use fixed-width fields, I think it also makes sense to encode numbers with hex; it would reduce the overall length of the string. But it would also make it a bit less human-readable, as PIDs won't be in the normal decimal format that's used in most places.
>
> (I've made the changes to the PR to re-order the fields and change the Windows definition, but I haven't changed to fixed-width fields yet.)

I'd say hex or ASCII, as long as it's nicely formatted and fixed length, is perfectly fine, and the shorter the better. For the purposes of the Response Actions console, the actual meaning of the entity_id content doesn't have to be directly visible, but documenting it in general in ECS is a good idea.
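
As a rough illustration of the fixed-width hex idea discussed above, the sketch below encodes the per-process fields as zero-padded hexadecimal. The field widths (8 hex digits for the PID, 8 for the start time in seconds, 3 for milliseconds) and the function name are assumptions made for this example; the PR does not define a fixed-width layout.

```python
def encode_fixed_hex(pid: int, start_time_seconds: int, start_time_milliseconds: int) -> str:
    """Encode the fields as fixed-width, zero-padded lowercase hex.

    Hypothetical layout: 8 hex digits for the PID, 8 for the start time in
    seconds, 3 for the 0-999 millisecond part, always 19 characters total.
    """
    if not 0 <= pid < 2**32:
        raise ValueError("pid does not fit in 8 hex digits")
    if not 0 <= start_time_seconds < 2**32:
        raise ValueError("start_time_seconds does not fit in 8 hex digits")
    if not 0 <= start_time_milliseconds < 1000:
        raise ValueError("start_time_milliseconds must be in 0..999")
    return f"{pid:08x}{start_time_seconds:08x}{start_time_milliseconds:03x}"
```

With these widths the three fields always take 19 characters, whereas zero-padded decimal fields covering the same ranges would need 10 + 10 + 3 = 23. The trade-off raised in the thread still applies: PID 4242 appears as `00001092` rather than in its familiar decimal form.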


3 participants