kdump support (#729)
In the event of a kernel crash, we need to gather as much information as possible to understand and identify the root cause of the crash. Currently, the kernel does not provide much information, which makes kernel crash investigation difficult and time-consuming.

Fortunately, the kernel can provide more information when it crashes. kdump is a feature of the Linux kernel that creates crash dumps in the event of a kernel crash. This PR adds kdump support. Please note that another PR, in sonic-buildimage, is also needed:
sonic-net/sonic-buildimage#3722

An extension to the CLI utilities config and show is provided to configure and manage kdump:

- view kdump status (enabled/disabled, active, configuration, stored crash files)
- enable / disable kdump functionality
- configure kdump (how many kernel crash logs can be saved, memory allocated for capture kernel)
- view kernel crash logs

There is a design document which describes this kdump implementation:
sonic-net/SONiC#510
olivier-singla authored and lguohan committed Jan 25, 2020
1 parent c139a23 commit 6babd1c
Showing 6 changed files with 779 additions and 1 deletion.
45 changes: 45 additions & 0 deletions config/main.py
@@ -1200,6 +1200,51 @@ def shutdown():
    """Shut down BGP session(s)"""
    pass

@config.group()
def kdump():
    """ Configure kdump """
    if os.geteuid() != 0:
        exit("Root privileges are required for this operation")
    pass

@kdump.command()
def disable():
    """Disable kdump operation"""
    config_db = ConfigDBConnector()
    if config_db is not None:
        config_db.connect()
        config_db.mod_entry("KDUMP", "config", {"enabled": "false"})
        run_command("sonic-kdump-config --disable")

@kdump.command()
def enable():
    """Enable kdump operation"""
    config_db = ConfigDBConnector()
    if config_db is not None:
        config_db.connect()
        config_db.mod_entry("KDUMP", "config", {"enabled": "true"})
        run_command("sonic-kdump-config --enable")

@kdump.command()
@click.argument('kdump_memory', metavar='<kdump_memory>', required=True)
def memory(kdump_memory):
    """Set memory allocated for kdump capture kernel"""
    config_db = ConfigDBConnector()
    if config_db is not None:
        config_db.connect()
        config_db.mod_entry("KDUMP", "config", {"memory": kdump_memory})
        run_command("sonic-kdump-config --memory %s" % kdump_memory)

@kdump.command()
@click.argument('kdump_num_dumps', metavar='<kdump_num_dumps>', required=True, type=int)
def num_dumps(kdump_num_dumps):
    """Set max number of dump files for kdump"""
    config_db = ConfigDBConnector()
    if config_db is not None:
        config_db.connect()
        config_db.mod_entry("KDUMP", "config", {"num_dumps": kdump_num_dumps})
        run_command("sonic-kdump-config --num_dumps %d" % kdump_num_dumps)
# 'all' subcommand
@shutdown.command()
@click.option('-v', '--verbose', is_flag=True, help="Enable verbose output")
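
Reviewer sketch (not part of the commit): the four kdump subcommands in config/main.py share one pattern. Each writes a field to the "config" key of the KDUMP table, then forwards the same change to sonic-kdump-config. The Python below models that pattern with in-memory stand-ins so it runs without a switch. FakeConfigDB, fake_run_command and apply_kdump_setting are hypothetical names, the value "512M" is only an illustrative input, and the merge behaviour of mod_entry is an assumption about ConfigDBConnector rather than something this diff shows.

```python
# Sketch only: FakeConfigDB and fake_run_command are hypothetical stand-ins
# for ConfigDBConnector and run_command, so the example runs anywhere.

class FakeConfigDB:
    """In-memory stand-in that merges fields into an existing entry."""

    def __init__(self):
        self.tables = {}

    def mod_entry(self, table, key, data):
        # Assumed semantics: update the given fields, keep the others.
        self.tables.setdefault(table, {}).setdefault(key, {}).update(data)


executed = []


def fake_run_command(command):
    """Record the command line instead of executing it."""
    executed.append(command)


def apply_kdump_setting(db, field, value, cli_args):
    """Persist one KDUMP field, then forward the change to sonic-kdump-config."""
    db.mod_entry("KDUMP", "config", {field: value})
    fake_run_command("sonic-kdump-config " + cli_args)


db = FakeConfigDB()
apply_kdump_setting(db, "enabled", "true", "--enable")
apply_kdump_setting(db, "memory", "512M", "--memory 512M")  # illustrative value
apply_kdump_setting(db, "num_dumps", 3, "--num_dumps %d" % 3)

print(db.tables["KDUMP"]["config"])
print(executed)
```

Under these assumptions the three calls leave a single KDUMP entry holding all three fields, which is why the subcommands can be run independently and in any order.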
12 changes: 12 additions & 0 deletions scripts/generate_dump
@@ -436,6 +436,18 @@ main() {
        fi
    done

    # archive kernel dump files
    for file in $(find_files "/var/crash/"); do
        # don't gzip already-gzipped dmesg files :)
        if [ ! ${file} = "/var/crash/kexec_cmd" -a ! ${file} = "/var/crash/export" ]; then
            if [[ ${file} == *"kdump."* ]]; then
                save_file $file kdump false
            else
                save_file $file kdump true
            fi
        fi
    done

    # clean up working tar dir before compressing
    $RM $V -rf $TARDIR
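
Reviewer sketch (not part of the commit): the archive loop in scripts/generate_dump makes a three-way decision per file under /var/crash/. The Python below restates that decision so it can be checked in isolation. It assumes the third argument of save_file is a gzip flag, which the diff does not show; plan_for and the sample timestamped paths are hypothetical.

```python
# Paths the loop leaves out of the archive entirely.
SKIPPED = {"/var/crash/kexec_cmd", "/var/crash/export"}


def plan_for(path):
    """Return 'skip', 'store' (no gzip) or 'gzip' for one file under /var/crash/."""
    if path in SKIPPED:
        return "skip"
    # Mirrors the bash test [[ ${file} == *"kdump."* ]]: any path that
    # contains the substring "kdump." is saved without compression.
    if "kdump." in path:
        return "store"
    return "gzip"


# Hypothetical file names, used only to exercise the three branches.
for sample in (
    "/var/crash/kexec_cmd",
    "/var/crash/201912011234/kdump.201912011234",
    "/var/crash/201912011234/dmesg.201912011234",
):
    print(sample, "->", plan_for(sample))
```

Note that the substring test applies to the whole path, so a directory name containing "kdump." would also select the no-gzip branch.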

7 changes: 7 additions & 0 deletions scripts/reboot
@@ -1,5 +1,12 @@
#!/bin/bash

# Reboot immediately if we run the kdump capture kernel
VMCORE_FILE=/proc/vmcore
if [ -e $VMCORE_FILE -a -s $VMCORE_FILE ]; then
    debug "We have a /proc/vmcore, then we just kdump'ed"
    /sbin/reboot
fi

REBOOT_USER=$(logname)
REBOOT_TIME=$(date)
PLATFORM=$(sonic-cfggen -H -v DEVICE_METADATA.localhost.platform)
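
Reviewer sketch (not part of the commit): the new guard at the top of scripts/reboot treats an existing, non-empty /proc/vmcore as the sign that the capture kernel is running. The Python below expresses the same `-e` and `-s` test as a function and exercises it against temporary files instead of /proc. The name just_kdumped is hypothetical.

```python
import os
import tempfile


def just_kdumped(vmcore_path="/proc/vmcore"):
    """True when the file exists and is non-empty, like [ -e F -a -s F ]."""
    return os.path.exists(vmcore_path) and os.path.getsize(vmcore_path) > 0


with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "missing")   # never created
    empty = os.path.join(tmp, "empty")       # exists, zero bytes
    filled = os.path.join(tmp, "filled")     # exists, one byte
    open(empty, "w").close()
    with open(filled, "w") as handle:
        handle.write("x")
    print(just_kdumped(missing), just_kdumped(empty), just_kdumped(filled))
    # prints: False False True
```

Only the third case corresponds to the branch that calls /sbin/reboot directly.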