-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SONiC core dump utility #3499
base: master
Are you sure you want to change the base?
SONiC core dump utility #3499
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
[Unit] | ||
Description=Update coredump configuration | ||
Requires=updategraph.service | ||
After=updategraph.service | ||
|
||
[Service] | ||
Type=oneshot | ||
ExecStart=/usr/bin/coredump-config.sh | ||
|
||
[Install] | ||
WantedBy=multi-user.target |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
#!/bin/bash | ||
|
||
DISABLE_COREDUMP_CONF="/etc/sysctl.d/50-disable-coredump.conf" | ||
|
||
if [ "$(redis-cli -n 4 HGET "COREDUMP|config" "enabled")" = "false" ] ; then | ||
echo "kernel.core_pattern=" > ${DISABLE_COREDUMP_CONF} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What does this mean ? Can you please explain the impact of disable? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When the coredump admin mode is disabled in config db, core files will not be generated. We are creating a sysctl entry to disable core dump. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you give me a use case, where you would not want core file dump ? Core file dump implies that some unexpected error occurred or user explicitly creating one with kill for a reason. In either case, the dump is required for analysis. If this is the only purpose of coredump-config.service, I don't see a need. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is already another initiative to do core file rotation and limit the count of core files at any time per process. Turning off is definitely not the solution. The limitation on count / size is also from Broadcom only. I need to look for it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In fact you are referring that PR #468, which has the following. " a. Support per-process core file rotation and archiving to optimize disk space " There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am referring to various configurations provided by systemd-coredump. Below is a link to it. The PR I was referring to is PR#729 which enables kernel core dump feature. For this feature, it is desirable that users have an enable/disable knob as kdump requires dedicated 512MB of memory. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let me add Guohan to this thread. What we need is the "requirement/use case"? I don't see one. I find it rather risky to have a DB variable to control, as it could get saved and persist across reboots, transparently, which is a big risk as it is likely to block core dumps unintentionally. I would rather have this ability as a CLI tool, which disables temporarily and all should be back to default/enabled state upon reboot. What we do need is the ability to limit count of cores per process and overall disk size taken by cores. |
||
else | ||
rm -f ${DISABLE_COREDUMP_CONF} | ||
fi | ||
|
||
# Read sysctl conf files again | ||
systemctl restart systemd-sysctl | ||
|
||
exit 0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
[Coredump] | ||
Storage=external | ||
Compress=yes | ||
ProcessSizeMax=2G | ||
ExternalSizeMax=2G |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
[Journal] | ||
Storage=persistent | ||
SystemMaxUse=256M | ||
RuntimeMaxUse=356M | ||
MaxLevelStore=crit |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,7 +11,9 @@ VIM = vim | |
OPENSSH = openssh-client | ||
SSHPASS = sshpass | ||
STRACE = strace | ||
$(DOCKER_BASE_STRETCH)_DBG_IMAGE_PACKAGES += $(GDB) $(GDBSERVER) $(VIM) $(OPENSSH) $(SSHPASS) $(STRACE) | ||
SYSTEMD_COREDUMP = systemd-coredump | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. W/o this package installed, will there be core dumps created? Plus this is already installed in build_debian.sh unconditionally. Can you please explain? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. W/O this package installed inside the container, core dumps will still be created. Core files are always generated on host o/s and stored in the /var/lib/systemd/coredump directory. The application that has crashed may be part of a container. So if you want to run gdb using the coredumpctl gdb command, it will not find the application binary when executed on host o/s. So, we map the /var/lib/systemd/coredump directory to the containers (see change in docker_image_ctl.j2) and also install the coredumpctl tool here. Now, the corefile and the handy coredumpctl tool are ready for debugging the application inside the container. |
||
$(DOCKER_BASE_STRETCH)_DBG_IMAGE_PACKAGES += $(GDB) $(GDBSERVER) $(VIM) $(OPENSSH) $(SSHPASS) $(STRACE) \ | ||
$(SYSTEMD_COREDUMP) | ||
|
||
SONIC_DOCKER_IMAGES += $(DOCKER_BASE_STRETCH) | ||
SONIC_STRETCH_DOCKERS += $(DOCKER_BASE_STRETCH) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If key is present && carry value as false, then we disable. In other words the default behavior is "enabled=true". To disable, one has to create this key explicitly.
Instead, why not require a key for disabling, which would imply default is enabled.
"COREDUMP|disabled" == true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is common place to see configuration knobs with a positive intent. So if we flip this around, it may confuse the user with other usages of true/false kind of configurations.