Provide additional detail to Linux systemd events to indicate if a systemd service is configured to start automatically #22368

Closed
geekpete opened this issue Nov 3, 2020 · 4 comments
Labels
enhancement Metricbeat Metricbeat monitoring Team:Services (Deprecated) Label for the former Integrations-Services team

Comments

@geekpete
Member

geekpete commented Nov 3, 2020

Describe the enhancement:

Add a field that indicates whether a Linux systemd service is configured to start automatically, similar to the start_type field for Windows services.

Describe a specific use case for the enhancement or feature:

A generic pattern that monitors all services and filters or alerts on those configured to start automatically but not currently running is easy to build from Windows service Metricbeat events, but not from Linux systemd Metricbeat events.
For Windows services, this works as a generic recipe to filter or alert on any service that should be running but currently isn't.

A similar generic pattern for Linux systemd services does not seem possible, because the resulting systemd Metricbeat event lacks the necessary detail.

For Windows this pattern is made easy by the start_type field:
https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-metricset-windows-service.html
https://www.elastic.co/guide/en/beats/metricbeat/current/exported-fields-windows.html

 windows.service.start_type
    The startup type of the service. The possible values are Automatic, Boot, Disabled, Manual, and System.

For example:

    "windows": {
        "service": {
            "display_name": "Servicio de enrutador de AllJoyn",
            "exit_code": "ERROR_SERVICE_NEVER_STARTED",
            "id": "IOQgaoSLJ7",
            "name": "AJRouter",
            "start_type": "Manual (Triggered)",
            "state": "Stopped"
        }
    }

So for Windows services, filtering by windows.service.start_type:Automatic and windows.service.state:Stopped gives you the list of services that should be running but currently aren't.
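The same filter can be sketched in Python over already-decoded event documents. This is only an illustration, assuming each event is a dict shaped like the example above; the helper name should_be_running_but_stopped and the sample events are hypothetical and not part of Metricbeat.

```python
from typing import Any, Dict, Iterable, List


def should_be_running_but_stopped(events: Iterable[Dict[str, Any]]) -> List[str]:
    """Return names of Windows services set to start automatically but stopped.

    Assumes each event carries a nested windows.service object like the
    example in the issue; events without one are skipped.
    """
    names = []
    for event in events:
        service = event.get("windows", {}).get("service", {})
        if service.get("start_type") == "Automatic" and service.get("state") == "Stopped":
            names.append(service.get("name", ""))
    return names


# Hypothetical sample events, for illustration only.
sample_events = [
    {"windows": {"service": {"name": "AJRouter", "start_type": "Manual (Triggered)", "state": "Stopped"}}},
    {"windows": {"service": {"name": "Spooler", "start_type": "Automatic", "state": "Stopped"}}},
    {"windows": {"service": {"name": "Dhcp", "start_type": "Automatic", "state": "Running"}}},
]

print(should_be_running_but_stopped(sample_events))
```

Note that the exact match on "Automatic" deliberately excludes values such as "Manual (Triggered)" seen in the example event.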

On Linux we don't seem to surface whether a service is enabled and set to start on boot.

On Windows, the key piece of information is the "start_type" field.
On Linux, a systemd service can show an "enabled" property in its Loaded state. It gets more complex than that, though: a service whose Loaded state shows "disabled" can still be triggered by another unit that is enabled.

For example, on my system the docker.service systemd service is currently disabled. The "vendor preset" suggests a default install would have it enabled; I think I disabled it myself a while back:

$ systemctl status docker.service
● docker.service - Docker Application Container Engine
    Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)
    Active: active (running) since Tue 2020-11-03 14:08:33 AEST; 1s ago
TriggeredBy: ● docker.socket
      Docs: https://docs.docker.com
  Main PID: 23372 (dockerd)
     Tasks: 21
    Memory: 143.0M
    CGroup: /system.slice/docker.service
            └─23372 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

But docker.socket IS enabled, so it starts on boot and also triggers docker.service, which effectively configures docker.service to start on boot:

$ systemctl status docker.socket
● docker.socket - Docker Socket for the API
    Loaded: loaded (/lib/systemd/system/docker.socket; enabled; vendor preset: enabled)
    Active: active (running) since Tue 2020-11-03 14:08:33 AEST; 6min ago
  Triggers: ● docker.service
    Listen: /run/docker.sock (Stream)
     Tasks: 0 (limit: 19041)
    Memory: 1.4M
    CGroup: /system.slice/docker.socket

and the docker.service does start on boot successfully with this configuration.
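To make the two pieces of information in the status output above concrete, here is a rough Python sketch that pulls the unit file state, the vendor preset, and the TriggeredBy unit out of `systemctl status` text. Parsing human-readable output like this is fragile and is shown only for illustration; the function name parse_status and the regular expressions are my own assumptions, not anything Metricbeat does.

```python
import re
from typing import Dict, Optional

# Matches e.g.
# "Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)"
LOADED_RE = re.compile(
    r"Loaded:\s+\S+\s+\([^;]+;\s*(?P<unit_file_state>[^;)]+)"
    r"(?:;\s*(?:vendor )?preset:\s*(?P<vendor_preset>[^;)]+))?\)"
)
# Matches e.g. "TriggeredBy: ● docker.socket" (the bullet is optional here).
TRIGGERED_BY_RE = re.compile(r"TriggeredBy:\s+(?:●\s+)?(?P<unit>\S+)")


def parse_status(text: str) -> Dict[str, Optional[str]]:
    """Extract enablement details from `systemctl status` output (best effort)."""
    result: Dict[str, Optional[str]] = {
        "unit_file_state": None,
        "vendor_preset": None,
        "triggered_by": None,
    }
    loaded = LOADED_RE.search(text)
    if loaded:
        result["unit_file_state"] = loaded.group("unit_file_state").strip()
        preset = loaded.group("vendor_preset")
        result["vendor_preset"] = preset.strip() if preset else None
    triggered = TRIGGERED_BY_RE.search(text)
    if triggered:
        result["triggered_by"] = triggered.group("unit")
    return result


# Lines copied from the docker.service status output in this issue.
docker_status = """\
    Loaded: loaded (/lib/systemd/system/docker.service; disabled; vendor preset: enabled)
    Active: active (running) since Tue 2020-11-03 14:08:33 AEST; 1s ago
TriggeredBy: ● docker.socket
"""

print(parse_status(docker_status))
```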

So right now, Linux Metricbeat events appear to lack the detail needed for a recipe that generically alerts on services that "should" be running but aren't.
A workaround for now is to check or filter against a known list of services that should be running, or to check each such service individually.
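The known-list workaround might look something like the following Python sketch. It assumes events are dicts carrying system.service.name and system.service.state, and that a running service reports the state "active"; those field names and values are assumptions for this example, as are the function name and the sample data.

```python
from typing import Any, Dict, Iterable, List, Set


def missing_expected_services(
    events: Iterable[Dict[str, Any]], expected: Set[str]
) -> List[str]:
    """Return expected service names that were not reported as active.

    Field names (system.service.name / system.service.state) are assumed
    for illustration and should be checked against real events.
    """
    active = set()
    for event in events:
        service = event.get("system", {}).get("service", {})
        if service.get("state") == "active":
            active.add(service.get("name"))
    return sorted(expected - active)


# Hypothetical events and expectations, for illustration only.
linux_events = [
    {"system": {"service": {"name": "docker.service", "state": "active"}}},
    {"system": {"service": {"name": "sshd.service", "state": "inactive"}}},
]
expected_running = {"docker.service", "sshd.service", "cron.service"}

print(missing_expected_services(linux_events, expected_running))
```

A service that never appears in the events at all is treated the same as one that is reported but not active.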

As to how we might enhance this metricset to include this information, it might not be as simple as looking across all services to see whether one triggers another, because I think multi-level triggering can also be configured: one service triggers another, which triggers another. The last service in the chain is then started on boot by a service that is enabled but doesn't directly reference it.

Not impossible to determine but also not so simple without jumping through a few hoops.

The result might just be a new field on Linux Metricbeat system service events, similar to start_type on the Windows service events, that indicates whether the systemd service starts on boot according to the configuration as checked when the event was generated.
That could be determined directly from an enabled property on the service itself, from any upstream service listed in its TriggeredBy property, or from any chain of triggering services in which an enabled property results in the current service starting on boot.
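The chain-walking idea could be sketched roughly as follows. This is a simplification that only follows the framing in this comment: a unit counts as starting on boot if it is enabled itself or if any unit in its TriggeredBy chain is enabled. The function name starts_on_boot, the two input mappings, and the a.service/b.path/c.timer units are hypothetical, and real systemd has other ways to pull a unit in at boot that this ignores.

```python
from typing import Dict, List, Optional, Set


def starts_on_boot(
    unit: str,
    unit_file_state: Dict[str, str],
    triggered_by: Dict[str, List[str]],
    _seen: Optional[Set[str]] = None,
) -> bool:
    """Return True if `unit` or any unit upstream in its trigger chain is enabled."""
    seen = set() if _seen is None else _seen
    if unit in seen:
        return False  # already visited; also guards against trigger cycles
    seen.add(unit)
    if unit_file_state.get(unit) == "enabled":
        return True
    return any(
        starts_on_boot(parent, unit_file_state, triggered_by, seen)
        for parent in triggered_by.get(unit, [])
    )


# docker.* values come from the issue; the a/b/c units are made up to
# illustrate a multi-level chain.
states = {
    "docker.service": "disabled",
    "docker.socket": "enabled",
    "a.service": "disabled",
    "b.path": "disabled",
    "c.timer": "enabled",
    "lonely.service": "disabled",
}
triggers = {
    "docker.service": ["docker.socket"],
    "a.service": ["b.path"],
    "b.path": ["c.timer"],
}

for name in ("docker.service", "a.service", "lonely.service"):
    print(name, starts_on_boot(name, states, triggers))
```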

@botelastic (bot) added the needs_team label Nov 3, 2020
@geekpete added the Team:Platforms label Nov 3, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@botelastic (bot) removed the needs_team label Nov 3, 2020
@andresrc added the Team:Services (Deprecated) label Nov 3, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-services (Team:Services)

@andresrc added the enhancement label and removed the Team:Platforms label Nov 3, 2020
@fearful-symmetry
Contributor

After some hunting around, I finally found the properties we're looking for in systemd:

      dict entry(
         string "UnitFileState"
         variant             string "enabled"
      )
      dict entry(
         string "UnitFilePreset"
         variant             string "enabled"
      )

Unless I'm missing something, it shouldn't be too hard to add those.

For future reference, here's a helpful command:

dbus-send --system --print-reply --dest=org.freedesktop.systemd1 /org/freedesktop/systemd1/unit/sshd_2eservice org.freedesktop.DBus.Properties.GetAll string:""
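For anyone scripting around that command, here is a small Python sketch of two helpers: one builds the unit's object path, and one pulls string-valued properties out of the reply text. The escaping rule used (every byte that is not an ASCII letter or digit, plus a leading digit, becomes an underscore followed by two hex digits) is my understanding of how systemd forms these paths and matches the sshd_2eservice example, but treat it as an assumption. Both function names are hypothetical.

```python
import re
from typing import Dict

UNIT_PATH_PREFIX = "/org/freedesktop/systemd1/unit/"

# Matches a dict entry whose variant holds a string, as in the reply above.
ENTRY_RE = re.compile(
    r'string "(?P<key>[^"]+)"\s+variant\s+string "(?P<value>[^"]*)"'
)


def unit_object_path(unit_name: str) -> str:
    """Build the D-Bus object path for a unit (assumed escaping rule)."""
    parts = []
    for index, byte in enumerate(unit_name.encode("utf-8")):
        ch = chr(byte)
        plain = ch.isascii() and ch.isalnum() and not (index == 0 and ch.isdigit())
        parts.append(ch if plain else "_%02x" % byte)
    return UNIT_PATH_PREFIX + ("".join(parts) or "_")


def parse_string_properties(reply: str) -> Dict[str, str]:
    """Collect string-valued properties from `dbus-send --print-reply` text."""
    return {m.group("key"): m.group("value") for m in ENTRY_RE.finditer(reply)}


# Reply fragment copied from this comment.
sample_reply = """
      dict entry(
         string "UnitFileState"
         variant             string "enabled"
      )
      dict entry(
         string "UnitFilePreset"
         variant             string "enabled"
      )
"""

print(unit_object_path("sshd.service"))
print(parse_string_properties(sample_reply))
```

Only string variants are captured; properties of other types in the full GetAll reply are skipped by the pattern.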

@fearful-symmetry
Contributor

Commit has been merged, closing this.
