Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logdog: add systemd analyze commands to common.conf #2550

Merged

Conversation

etungsten
Copy link
Contributor

Issue number:
Resolves #2199

Description of changes:

    logdog: add systemd analyze commands to common.conf
    
    This adds 'systemd-analyze blame' and 'systemd-analyze critical-chain'
    for our systemd target units to 'logdog.common.conf'. This information
    is useful for boot time analysis.

Testing done:
Built and launched aws-k8s-1.23 instance, ran logdog on the host and checked the logs in the admin container:

[ec2-user@admin]$ sudo sheltie
bash-5.1# logdog
Running: exec containerd-config containerd --config /etc/containerd/config.toml config dump
Running: exec containerd-config-host containerd --config /etc/host-containerd/config.toml config dump
Running: exec df df -h
Running: exec df-inodes df -hi
Running: exec dmesg dmesg --color=never --nopager
Running: exec iptables-filter iptables -nvL -t filter
Running: exec iptables-nat iptables -nvL -t nat
Running: exec journalctl-boots journalctl --list-boots --no-pager
Running: exec journalctl.errors journalctl -p err -a --no-pager
Running: exec journalctl.log journalctl -a --no-pager
Running: exec systemd-analyze-blame systemd-analyze blame --no-pager
Running: exec systemd-analyze-critical-chain systemd-analyze critical-chain --no-pager preconfigured.target configured.target multi-user.target
Running: exec proc-mounts cat /proc/mounts
Running: exec signpost signpost status
Running: exec wicked wicked show all
Running: file os-release /etc/os-release
Running: glob /var/log/kdump/*
Running: settings settings.json
Running: exec kube-status systemctl status kube* -l --no-pager
Running: file ipamd.log /var/log/aws-routed-eni/ipamd.log
Running: file plugin.log /var/log/aws-routed-eni/plugin.log
logs are at: /var/log/support/bottlerocket-logs.tar.gz

logdog systemd-analyze blame log:

[ec2-user@admin]$ tar xvf /.bottlerocket/support/bottlerocket-logs.tar.gz
bottlerocket-logs/
bottlerocket-logs/logdog.errors
bottlerocket-logs/containerd-config
bottlerocket-logs/containerd-config-host
bottlerocket-logs/df
bottlerocket-logs/df-inodes
bottlerocket-logs/dmesg
bottlerocket-logs/iptables-filter
bottlerocket-logs/iptables-nat
bottlerocket-logs/journalctl-boots
bottlerocket-logs/journalctl.errors
bottlerocket-logs/journalctl.log
bottlerocket-logs/systemd-analyze-blame
bottlerocket-logs/systemd-analyze-critical-chain
bottlerocket-logs/proc-mounts
bottlerocket-logs/signpost
bottlerocket-logs/wicked
bottlerocket-logs/os-release
bottlerocket-logs/settings.json
bottlerocket-logs/kube-status
bottlerocket-logs/ipamd.log
bottlerocket-logs/plugin.log
[ec2-user@admin]$ cat bottlerocket-logs/systemd-analyze-blame
12.850s kubelet.service
 5.249s host-containerd.service
 3.432s sundog.service
 3.370s settings-applier.service
 1.521s wicked.service
 1.094s send-boot-success.service
  742ms apiserver.service
  635ms early-boot-config.service
  475ms ldconfig.service
  398ms repart-local.service
  384ms systemd-modules-load.service
  369ms dev-loop0.device
  365ms var-lib-kernel\x2ddevel-.overlay-lower.mount
  364ms x86_64\x2dbottlerocket\x2dlinux\x2dgnu-sys\x2droot-usr-share-licenses.mount
  356ms prepare-boot.service
  339ms migrator.service
  281ms dev-dm\x2d0.device
  255ms activate-multi-user.service
  254ms selinux-policy-files.service
  200ms activate-configured.service
  141ms wickedd-dhcp4.service
  140ms opt-cni-bin.mount
  139ms wickedd-dhcp6.service
  139ms systemd-udev-trigger.service
  122ms systemd-journald.service
  118ms modprobe@configfs.service
  114ms mark-successful-boot.service
  111ms systemd-sysctl.service
  110ms modprobe@fuse.service
  109ms storewolf.service
  104ms containerd.service
  104ms systemd-sysusers.service
   98ms generate-network-config.service
   90ms prepare-var.service
   90ms dev-hugepages.mount
   86ms x86_64\x2dbottlerocket\x2dlinux\x2dgnu-sys\x2droot-usr-lib-modules.mount
   81ms dev-mqueue.mount
   81ms etc-cni.mount
   80ms sys-kernel-debug.mount
   80ms sys-kernel-tracing.mount
   78ms tmp.mount
   71ms systemd-tmpfiles-setup.service
   70ms dbus-broker.service
   66ms modprobe@ip_vs_sh.service
   66ms systemd-udevd.service
   63ms systemd-remount-fs.service
   63ms systemd-network-generator.service
   63ms systemd-machine-id-commit.service
   60ms modprobe@ip_vs_wrr.service
   59ms audit-rules.service
   57ms disable-kexec-load.service
   50ms modprobe@ip_vs_rr.service
   50ms systemd-journal-flush.service
   37ms var-lib-bottlerocket.mount
   36ms prepare-var-lib-containerd.service
   35ms prepare-var-lib-kubelet.service
   33ms opt.mount
   29ms systemd-random-seed.service
   22ms mask-local-mnt.service
   22ms prepare-opt.service
   21ms mask-local-var.service
   18ms kmod-static-nodes.service
   16ms systemd-journal-catalog-update.service
   14ms reboot-if-required.service
   14ms wickedd.service
   13ms systemd-tmpfiles-setup-dev.service
   11ms wickedd-nanny.service
   10ms etc-host\x2dcontainers.mount
   10ms acpid.service
    9ms mask-local-opt.service
    8ms mnt.mount
    7ms local.mount
    7ms etc-containerd.mount
    6ms prepare-primary-interface.service
    5ms var.mount
    4ms x86_64\x2dbottlerocket\x2dlinux\x2dgnu-sys\x2droot-usr-src-kernels.mount
    4ms root-.aws.mount
    3ms systemd-update-done.service
    3ms sys-fs-fuse-connections.mount
    3ms sys-kernel-config.mount
    3ms dev-loop1.device
    2ms set-hostname.service

logdog systemd-analyze critical-chain log:

[ec2-user@admin]$ cat bottlerocket-logs/systemd-analyze-critical-chain
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

preconfigured.target @12.967s
└─set-hostname.service @12.964s +2ms
  └─settings-applier.service @9.589s +3.370s
    └─sundog.service @6.154s +3.432s
      └─early-boot-config.service @5.517s +635ms
        └─network-online.target @5.515s
          └─network.target @5.514s
            └─wicked.service @3.991s +1.521s
              └─wickedd-nanny.service @3.978s +11ms
                └─wickedd.service @3.961s +14ms
                  └─wickedd-dhcp6.service @3.820s +139ms
                    └─network-pre.target @3.815s
                      └─prepare-primary-interface.service @3.807s +6ms
                        └─generate-network-config.service @3.706s +98ms
                          └─basic.target @3.702s
                            └─dbus-broker.service @3.629s +70ms
                              └─dbus.socket @3.627s
                                └─sysinit.target @3.623s
                                  └─modprobe@ip_vs_sh.service @18.651s +66ms
                                    └─systemd-journald.socket @874ms
                                      └─system.slice @856ms
                                        └─-.slice @856ms
configured.target @18.406s
└─host-containerd.service @13.155s +5.249s
  └─preconfigured.target @12.967s
    └─set-hostname.service @12.964s +2ms
      └─settings-applier.service @9.589s +3.370s
        └─sundog.service @6.154s +3.432s
          └─early-boot-config.service @5.517s +635ms
            └─network-online.target @5.515s
              └─network.target @5.514s
                └─wicked.service @3.991s +1.521s
                  └─wickedd-nanny.service @3.978s +11ms
                    └─wickedd.service @3.961s +14ms
                      └─wickedd-dhcp6.service @3.820s +139ms
                        └─network-pre.target @3.815s
                          └─prepare-primary-interface.service @3.807s +6ms
                            └─generate-network-config.service @3.706s +98ms
                              └─basic.target @3.702s
                                └─dbus-broker.service @3.629s +70ms
                                  └─dbus.socket @3.627s
                                    └─sysinit.target @3.623s
                                      └─modprobe@ip_vs_sh.service @18.651s +66ms
                                        └─systemd-journald.socket @874ms
                                          └─system.slice @856ms
                                            └─-.slice @856ms
multi-user.target @31.592s
└─kubelet.service @18.741s +12.850s
  └─containerd.service @18.634s +104ms
    └─runtime.slice @18.628s
      └─-.slice @856ms

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

This adds 'systemd-analyze blame' and 'systemd-analyze critical-chain'
for our systemd target units to 'logdog.common.conf'. This information
is useful for boot time analysis.
@bcressey
Copy link
Contributor

bcressey commented Nov 3, 2022

kubelet.service @18.741s +12.850s

Ouch!

Copy link
Contributor

@zmrow zmrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️

@etungsten etungsten merged commit eadf55c into bottlerocket-os:develop Nov 4, 2022
@etungsten etungsten deleted the logdog-systemd-analyze branch November 4, 2022 17:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add systemd analyze commands to logdog output
5 participants