Currently, the makedumpfile option '--message-level' is set to 1 when dumping the vmcore, so only the progress indicator is displayed and neither common messages nor error messages are reported. These additional messages, especially the error messages, are very useful for debugging, so let's change the message level to 7 by default. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>
=================
Kexec/Kdump HOWTO
=================


Introduction
============

Kexec and kdump are new features in the 2.6 mainstream kernel. These features
are included in Red Hat Enterprise Linux 5. The purpose of these features
is to ensure faster boot up and creation of reliable kernel vmcores for
diagnostic purposes.


Overview
========

Kexec
-----

Kexec is a fastboot mechanism which allows booting a Linux kernel from the
context of an already running kernel without going through BIOS. Going through
BIOS can be very time consuming, especially on big servers with lots of
peripherals. This can save a lot of time for developers who end up booting a
machine numerous times.

Kdump
-----

Kdump is a new kernel crash dumping mechanism and is very reliable because
the crash dump is captured from the context of a freshly booted kernel and
not from the context of the crashed kernel. Kdump uses kexec to boot into
a second kernel whenever the system crashes. This second kernel, often called
a capture kernel, boots with very little memory and captures the dump image.

The first kernel reserves a section of memory that the second kernel uses
to boot. Kexec enables booting the capture kernel without going through BIOS,
hence the contents of the first kernel's memory are preserved, which is
essentially the kernel crash dump.

Kdump is supported on the i686, x86_64, ia64 and ppc64 platforms. The
standard kernel and capture kernel are one and the same on i686, x86_64,
ia64 and ppc64.

If you're reading this document, you should already have kexec-tools
installed. If not, you can install it via the following command:

    # yum install kexec-tools

Now load a kernel with kexec:

    # kver=`uname -r`
    # kexec -l /boot/vmlinuz-$kver \
        --initrd=/boot/initrd-$kver.img \
        --command-line="`cat /proc/cmdline`"

NOTE: The above will boot you back into the kernel you're currently running.
If you want to load a different kernel, substitute its version in place of
`uname -r`.

Now reboot your system, taking note that it should bypass the BIOS:

    # reboot


How to configure kdump
======================

Again, we assume if you're reading this document, you should already have
kexec-tools installed. If not, you can install it via the following command:

    # yum install kexec-tools

To be able to do much of anything interesting in the way of debug analysis,
you'll also need to install the kernel-debuginfo package, of the same arch
as your running kernel, and the crash utility:

    # yum --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash

Next up, we need to modify some boot parameters to reserve a chunk of memory for
the capture kernel. With the help of grubby, it's very easy to append
"crashkernel=128M" to the end of your kernel boot parameters. In general the
parameter takes the form "crashkernel=XM", where X is the amount of memory, in
MB, to reserve for the capture kernel. Depending on the arch and system
configuration, more than 128M may be required for kdump. You need to experiment
with and test kdump; if 128M is not sufficient, try reserving more memory.

    # grubby --args="crashkernel=128M" --update-kernel=/boot/vmlinuz-`uname -r`

Note that there is an alternative form in which to specify a crashkernel
memory reservation, in the event that more control is needed over the size and
placement of the reserved memory. The format is:

    crashkernel=range1:size1[,range2:size2,...][@offset]

Where range<n> specifies a range of values that are matched against the amount
of physical RAM present in the system, and the corresponding size<n> value
specifies the amount of kexec memory to reserve. For example:

    crashkernel=512M-2G:64M,2G-:128M

This line tells kexec to reserve 64M of RAM if the system contains between
512M and 2G of physical memory. If the system contains 2G or more of physical
memory, 128M should be reserved.
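
To make the range matching concrete, here is a minimal shell sketch that picks
the size a range-based spec would select for a given amount of RAM. It is not
the kernel's parser: the function names to_mb and crashkernel_size are
hypothetical, only the M and G suffixes are handled, any "@offset" is dropped,
and each range is assumed to include its start and exclude its end.

```shell
# Convert a size such as 512M or 2G to megabytes (M and G suffixes only).
to_mb() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  return 1 ;;
    esac
}

# crashkernel_size SPEC RAM_MB
# Print the size that SPEC (range1:size1[,range2:size2,...]) selects for a
# machine with RAM_MB megabytes of RAM; fail if no range matches.
crashkernel_size() {
    spec=$1 ram=$2
    old_ifs=$IFS; IFS=,
    set -- $spec                 # split the spec on commas
    IFS=$old_ifs
    for entry in "$@"; do
        range=${entry%%:*}
        size=${entry#*:}
        size=${size%%@*}         # ignore an optional @offset
        start=$(to_mb "${range%%-*}") || return 1
        end=${range#*-}          # empty means "no upper bound"
        if [ -n "$end" ]; then
            end=$(to_mb "$end") || return 1
        fi
        if [ "$ram" -ge "$start" ] && { [ -z "$end" ] || [ "$ram" -lt "$end" ]; }; then
            echo "$size"
            return 0
        fi
    done
    return 1
}
```

With the example spec above, 'crashkernel_size 512M-2G:64M,2G-:128M 1024'
prints 64M, while a RAM value of 2048 prints 128M.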

In addition, kdump needs to access /proc/kallsyms while loading a kernel if
KASLR is enabled, so check /proc/sys/kernel/kptr_restrict to make sure that
the content of /proc/kallsyms is exposed correctly. We recommend setting
kptr_restrict to '1'. Otherwise, loading the capture kernel could fail.
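
The check described above could be scripted roughly as follows. This is only a
sketch: the function name check_kptr_restrict and its optional file argument
are illustrative, and the suggested 'sysctl -w' call is one common way to
change the value, not something kdump does for you.

```shell
# check_kptr_restrict [FILE]
# Report whether kptr_restrict has the recommended value 1.
# FILE defaults to /proc/sys/kernel/kptr_restrict.
check_kptr_restrict() {
    file=${1:-/proc/sys/kernel/kptr_restrict}
    value=$(cat "$file") || return 2
    if [ "$value" = "1" ]; then
        echo "kptr_restrict is 1, as recommended"
    else
        echo "kptr_restrict is $value; consider: sysctl -w kernel.kptr_restrict=1"
        return 1
    fi
}
```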

After making said changes, reboot your system, so that the X MB of memory is
left untouched by the normal system, reserved for the capture kernel. Take note
that the output of 'free -m' will show X MB less memory than without this
parameter, which is expected. You may be able to get by with less than 128M, but
testing with only 64M has proven unreliable of late. On ia64, as much as 512M
may be required.

Now that you've got that reserved memory region set up, you want to turn on
the kdump init script:

    # chkconfig kdump on

Then, start up kdump as well:

    # systemctl start kdump.service

This should load your kernel-kdump image via kexec, leaving the system ready
to capture a vmcore upon crashing. To test this out, you can force-crash
your system by echo'ing a c into /proc/sysrq-trigger:

    # echo c > /proc/sysrq-trigger

You should see some panic output, followed by the system restarting into
the kdump kernel. When the boot process gets to the point where it starts
the kdump service, your vmcore should be copied out to disk (by default,
in /var/crash/<YYYY-MM-DD-HH:MM>/vmcore), then the system rebooted back into
your normal kernel.
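
Before forcing a test crash as described above, it can be worth confirming
that a capture kernel is actually loaded. The sketch below assumes the running
kernel exposes /sys/kernel/kexec_crash_loaded, which reads 1 once a crash
kernel has been loaded; the function name kdump_ready and its optional
directory argument are illustrative only.

```shell
# kdump_ready [SYSFS_DIR]
# Succeed only if a capture kernel is currently loaded.
# SYSFS_DIR defaults to /sys/kernel; the argument exists to ease testing.
kdump_ready() {
    sysdir=${1:-/sys/kernel}
    # Return 2 if the file cannot be read at all.
    loaded=$(cat "$sysdir/kexec_crash_loaded" 2>/dev/null) || return 2
    [ "$loaded" = "1" ]
}
```

A possible use is 'kdump_ready && echo c > /proc/sysrq-trigger', so that the
crash is only triggered when a dump can be expected.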

Once back to your normal kernel, you can use the previously installed crash
utility in conjunction with the previously installed kernel-debuginfo to
perform postmortem analysis:

    # crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux \
        /var/crash/2006-08-23-15:34/vmcore

    crash> bt

and so on...


Notes on kdump
==============

When kdump starts, the kdump kernel is loaded together with the kdump
initramfs. To save memory usage and disk space, the kdump initramfs is
generated strictly against the system it will run on, and contains the
minimum set of kernel modules and utilities to boot the machine to a stage
where the dump target can be mounted.

With the kdump service enabled, kdumpctl will try to detect possible system
changes and rebuild the kdump initramfs if needed. However, it cannot be
guaranteed to cover every possible case. So after a hardware change, disk
migration, storage setup update or any similar system level change, it's
highly recommended to rebuild the initramfs manually with the following
command:

    # kdumpctl rebuild


Saving vmcore-dmesg.txt
=======================

The kernel log buffers are among the most important pieces of information
available in a vmcore. Before the vmcore is saved, the kernel log buffers are
extracted from /proc/vmcore and saved into a file named vmcore-dmesg.txt. The
vmcore itself is saved after vmcore-dmesg.txt. The destination disk and
directory for vmcore-dmesg.txt are the same as for the vmcore. Note that the
kernel log buffers will not be available if the dump target is a raw device.


Dump Triggering methods
=======================

This section talks about the various ways, other than a Kernel Panic, in which
Kdump can be triggered. The following methods assume that Kdump is configured
on your system, with the scripts enabled as described in the section above.

1) AltSysRq C

Kdump can be triggered with the combination of the 'Alt', 'SysRq' and 'C'
keyboard keys. Please refer to the following link for more details:

http://kbase.redhat.com/faq/FAQ_43_5559.shtm

In addition, on PowerPC boxes, Kdump can also be triggered via the Hardware
Management Console (HMC) using the 'Ctrl', 'O' and 'C' keyboard keys.

2) NMI_WATCHDOG

In case a machine has a hard hang, it is quite possible that it does not
respond to keyboard interrupts. As a result, the 'Alt-SysRq' keys will not help
trigger a dump. In such scenarios the NMI watchdog feature can prove useful.
The following link has more details on configuring the NMI watchdog option:

http://kbase.redhat.com/faq/FAQ_85_9129.shtm

Once this feature has been enabled in the kernel, any lockup will result in an
OOPs message being generated, followed by Kdump being triggered.

3) Kernel OOPs

If we want to generate a dump every time the Kernel OOPses, we can achieve this
by setting the 'Panic On OOPs' option as follows:

    # echo 1 > /proc/sys/kernel/panic_on_oops

This is enabled by default on RHEL5.

4) NMI (Non-maskable interrupt) button

In cases where the system is in a hung state, and is not accepting keyboard
interrupts, using the NMI button for triggering Kdump can be very useful. The
NMI button is present on most of the newer x86 and x86_64 machines. Please
refer to the user guides/manuals to locate the button, though in most cases it
is not very well documented. It is usually hidden behind a small hole on the
front or back panel of the machine. You could use a toothpick or some other
non-conducting probe to press the button.

For example, on the IBM X series 366 machine, the NMI button is located behind
a small hole on the bottom center of the rear panel.

To enable this method of dump triggering using the NMI button, you will need
to set the 'unknown_nmi_panic' option as follows:

    # echo 1 > /proc/sys/kernel/unknown_nmi_panic

5) PowerPC specific methods:

On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger (if
XMON is configured). To configure XMON, either compile the kernel with the
CONFIG_XMON and CONFIG_XMON_DEFAULT options, or compile it with CONFIG_XMON
and boot the kernel with the xmon=on option.

Following are the ways to remotely issue a soft reset on PowerPC boxes, which
would drop you to XMON. Pressing 'X' (capital letter X) followed by 'Enter'
there will trigger the dump.

5.1) HMC

The Hardware Management Console (HMC) available on Power4 and Power5 machines
allows partitions to be reset remotely. This is especially useful in hang
situations where the system is not accepting any keyboard inputs.

Once you have HMC configured, the following steps will enable you to trigger
Kdump via a soft reset:

On Power4
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Operating System->Reset".
    * Select "Soft Reset".
    * Select "Yes".

  Using HMC Commandline

    # reset_partition -m <machine> -p <partition> -t soft

On Power5
  Using GUI

    * In the right pane, right click on the partition you wish to dump.
    * Select "Restart Partition".
    * Select "Dump".
    * Select "OK".

  Using HMC Commandline

    # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar

5.2) Blade Management Console for Blade Center

To initiate a dump operation, go to the Power/Restart option under "Blade Tasks"
in the Blade Management Console. Select the corresponding blade for which you
want to initiate the dump and then click "Restart blade with NMI". This issues
a system reset and invokes the xmon debugger.


Dump targets
============

In addition to being able to capture a vmcore to your system's local file
system, kdump can be configured to capture a vmcore to a number of other
locations, including a raw disk partition, a dedicated file system, an NFS
mounted file system, or a remote system via ssh/scp. Additional options
exist for specifying the relative path under which the dump is captured,
what to do if the capture fails, and for compressing and filtering the dump
(so as to produce smaller, more manageable vmcore files; see "Advanced Setups"
for more detail on these options).

In theory, dumping to a location other than the local file system should be
safer than kdump's default setup, as it's possible the default setup will try
dumping to a file system that has become corrupted. The raw disk partition and
dedicated file system options allow you to still dump to the local system,
but without having to remount your possibly corrupted file system(s),
thereby decreasing the chance a vmcore won't be captured. Dumping to an
NFS server or remote system via ssh/scp also has this advantage, as well
as allowing for the centralization of vmcore files, should you have several
systems from which you'd like to obtain vmcore files. Of course, note that
these configurations could present problems if your network is unreliable.

Kdump target and advanced setups are configured via modifications to
/etc/kdump.conf, which, out of the box, is fairly well documented itself.
Any alterations to /etc/kdump.conf should be followed by a restart of the
kdump service, so the changes can be incorporated in the kdump initrd.
Restarting the kdump service is as simple as
'/sbin/systemctl restart kdump.service'.

There are two ways to configure the dump target: configuring the dump target
using only "path", and configuring the dump target explicitly. The
interpretation of "path" also differs between the two config styles.

Config dump target only using "path"
------------------------------------

You can change the dump target by setting "path" to a mount point where the
dump target is mounted. When there is no explicitly configured dump target,
"path" in kdump.conf represents the current file system path in which the
vmcore will be saved. Kdump will automatically detect the underlying device of
"path" and use that as the dump target.

In fact, upon dump, kdump creates a directory $hostip-$date within "path"
and saves the vmcore there. So in practice the dump is saved in
$path/$hostip-$date/.

Kdump will only check the current mount status for a mount entry corresponding
to "path". So please ensure the dump target is mounted on "path" before the
kdump service starts.

NOTES:

- It's strongly recommended to put a mount entry for "path" in /etc/fstab
  and have it auto mounted on boot. This makes sure the dump target is
  reachable from the machine and kdump's configuration is stable.

EXAMPLES:

- path /var/crash/

  This is the default configuration. Assuming there is no disk mounted
  on /var/ or on /var/crash, the dump will be saved on the disk backing rootfs
  in directory /var/crash.

- path /var/crash/ (A separate disk mounted on /var/crash)

  Say a disk /dev/sdb is mounted on /var/crash. In this case the dump target
  will become /dev/sdb, path will become "/", and the dump will be saved in
  the top-level directory of sdb, which is visible as /var/crash/ while the
  disk is mounted.

- path /var/crash/ (NFS mounted on /var)

  Say foo.com:/export/tmp is mounted on /var. In this case the dump target is
  the nfs server, path will be adjusted to "/crash", and the dump will be saved
  to the foo.com:/export/tmp/crash/ directory.
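
The path adjustment in these examples amounts to stripping the mount point
from the front of "path". A minimal sketch of that string operation, assuming
the mount point is already known (the function name relative_dump_path is
hypothetical and this is not kdump's own code):

```shell
# relative_dump_path PATH MOUNTPOINT
# Print PATH as seen from inside the file system mounted on MOUNTPOINT.
# Fails if PATH does not live under MOUNTPOINT.
relative_dump_path() {
    path=${1%/}                  # drop one trailing slash, if any
    mnt=${2%/}                   # "/" becomes the empty string
    case "$path" in
        "$mnt")   echo "/" ;;                  # path is the mount point itself
        "$mnt"/*) echo "${path#"$mnt"}" ;;     # strip the mount point prefix
        *)        return 1 ;;
    esac
}
```

For instance, 'relative_dump_path /var/crash/ /var' prints /crash, matching
the NFS example above, and with "/" as the mount point the path is returned
unchanged as /var/crash.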

Config dump target explicitly
-----------------------------

You can set the dump target explicitly in kdump.conf, and "path" will be
the relative path in the specified dump target. For example, if the dump
target is "ext4 /dev/sda", then the dump will be saved in the "path" directory
on /dev/sda.

The same is the case for nfs dumps. If the user specifies
"nfs foo.com:/export/tmp/" as the dump target, then with the default path the
dump will effectively be saved in the "foo.com:/export/tmp/var/crash/"
directory.

If the dump target is "raw", then "path" is ignored.

If it's a filesystem target, kdump will need to know the right mount options.
Kdump will check the current mount status, and then /etc/fstab, for mount
options corresponding to the specified dump target and use them. If special
mount options are required for the dump target, they can be set by putting
an entry in fstab.

If there is no related mount entry, the mount option is set to "defaults".

NOTES:

- It's recommended to put an entry for the dump target in /etc/fstab
  and have it auto mounted on boot. This makes sure the dump target is
  reachable from the machine and kdump won't fail.

- Kdump ignores some mount options, including "noauto" and "ro". This
  makes it possible to keep the dump target unmounted or read-only
  when not used.

EXAMPLES:

- ext4 /dev/sda (mounted)
  path /var/crash/

  In this case the dump target is set to /dev/sda, path is the absolute path
  "/var/crash" in /dev/sda, and the vmcore will be saved in the
  "sda:/var/crash" directory.

- nfs foo.com:/export/tmp (mounted)
  path /var/crash/

  In this case the dump target is the nfs server, path is the absolute path
  "/var/crash", and the vmcore will be saved in the
  "foo.com:/export/tmp/var/crash/" directory.

- nfs foo.com:/export/tmp (not mounted)
  path /var/crash/

  Same as the case above, except that kdump will use "defaults" as the mount
  option for the dump target.

- nfs foo.com:/export/tmp (not mounted, entry with option "noauto,nolock" exists in /etc/fstab)
  path /var/crash/

  In this case the dump target is the nfs server, the vmcore will be saved in
  the "foo.com:/export/tmp/var/crash/" directory, and kdump will inherit the
  "nolock" option.
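
The handling of fstab options in the last example can be pictured as a simple
filter over the option list. The sketch below is an illustration only: the
function name filter_mount_opts is hypothetical, it drops just the two options
named above, and falling back to "defaults" when nothing is left is an
assumption rather than documented kdump behavior.

```shell
# filter_mount_opts OPTS
# Drop "noauto" and "ro" from a comma-separated mount option list and print
# what remains ("defaults" if nothing is left; that fallback is an assumption).
filter_mount_opts() {
    out=
    old_ifs=$IFS; IFS=,
    for opt in $1; do
        case "$opt" in
            noauto|ro) ;;                      # options kdump ignores
            *) out=${out:+$out,}$opt ;;        # keep everything else
        esac
    done
    IFS=$old_ifs
    echo "${out:-defaults}"
}
```

Here 'filter_mount_opts noauto,nolock' prints nolock, which corresponds to the
inherited option in the example above.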

Dump target and mkdumprd
------------------------

Mkdumprd is the tool used to create the kdump initramfs, and it may change
the mount status of the dump target under some conditions.

Usually the dump target should be used only for kdump. If you worry about
someone using the filesystem for something other than dumping vmcores,
you can mount it read-only or make it a noauto mount. Mkdumprd will
mount/remount it read-write to create the dump directory and will
move it back to its original state afterwards.

Supported dump target types and requirements
--------------------------------------------

1) Raw partition

Raw partition dumping requires that a disk partition in the system, at least
as large as the amount of memory in the system, be left unformatted. Assuming
/dev/vg/lv_kdump is left unformatted, kdump.conf can be configured with
'raw /dev/vg/lv_kdump', and the vmcore file will be copied via dd directly
onto partition /dev/vg/lv_kdump. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to your kdump
initrd. The dump target should be a persistent device name, such as an lvm or
device mapper canonical name.

2) Dedicated file system

Similar to raw partition dumping, you can format a partition with the file
system of your choice. Again, it should be at least as large as the amount
of memory in the system. Assuming /dev/vg/lv_kdump has been
formatted ext4, specify 'ext4 /dev/vg/lv_kdump' in kdump.conf, and a
vmcore file will be copied onto the file system after it has been mounted.
Dumping to a dedicated partition has the advantage that you can dump multiple
vmcores to the file system, space permitting, without overwriting previous ones,
as would be the case in a raw partition setup. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to
your kdump initrd. Note that for local file systems ext4 and ext2 are
supported as dumpable targets. Kdump will not prevent you from specifying
other filesystems, and they will most likely work, but their operation
cannot be guaranteed. For instance, specifying a vfat filesystem or msdos
filesystem will result in a successful load of the kdump service, but during
crash recovery, the dump will fail if the system has more than 2GB of memory
(since vfat and msdos filesystems do not support more than 2GB files).
Be careful of your filesystem selection when using this target.

It is recommended to use persistent device names or UUID/LABEL for file system
dumps. One example of a persistent device name is /dev/vg/<devname>.

3) NFS mount

Dumping over NFS requires an NFS server configured to export a file system
with full read/write access for the root user. All operations done within
the kdump initial ramdisk are done as root, and to write out a vmcore file,
we obviously must be able to write to the NFS mount. Configuring an NFS
server is outside the scope of this document, but either the no_root_squash
or anonuid options on the NFS server side are likely of interest to permit
the kdump initrd operations to write to the NFS mount as root.

Assuming you're exporting /dump on the machine nfs-server.example.com,
once the mount is properly configured, specify it in kdump.conf, via
'nfs nfs-server.example.com:/dump'. The server portion can be specified either
by host name or IP address. Following a system crash, the kdump initrd will
mount the NFS mount and copy out the vmcore to your NFS server. Restart the
kdump service via '/sbin/systemctl restart kdump.service' to commit this change
to your kdump initrd.

4) Special mount via "dracut_args"

You can utilize "dracut_args" to pass "--mount" to kdump; see the dracut manpage
for details on the format of "--mount". If any "--mount" is specified
via "dracut_args", kdump will build it as the mount target without doing any
validation (mounting or checking mount options, fs size, save path, etc.),
so you must test it yourself to ensure it is correct. You cannot use other
targets in /etc/kdump.conf if you use "--mount" in "dracut_args". You also
cannot specify multiple "--mount" targets via "dracut_args".

One use case of "--mount" in "dracut_args" is when you do not want to mount the
dump target before the kdump service starts, for example, to reduce the burden
on a shared nfs server. For example:

    dracut_args --mount "192.168.1.1:/share /mnt/test nfs4 defaults"

NOTE:
- <mountpoint> must be specified as an absolute path.

5) Remote system via ssh/scp

Dumping over ssh/scp requires setting up passwordless ssh keys for every
machine you wish to have dump via this method. First up, configure kdump.conf
for ssh/scp dumping, adding a config line of 'ssh user@server', where 'user'
can be any user on the target system you choose, and 'server' is the host
name or IP address of the target system. Using a dedicated, restricted user
account on the target system is recommended, as there will be keyless ssh
access to this account.

Once kdump.conf is appropriately configured, issue the command
'kdumpctl propagate' to automatically set up the ssh host keys and transmit
the necessary bits to the target server. You'll have to type in 'yes'
to accept the host key for your target server if this is the first time
you've connected to it, and then input the target system user's password
to send over the necessary ssh key file. Restart the kdump service via
'/sbin/systemctl restart kdump.service' to commit this change to your kdump initrd.

Advanced Setups
===============

Kdump boot directory
--------------------

Usually the kdump kernel is the same as the 1st kernel, so kdump will try to
find the kdump kernel under /boot according to /proc/cmdline. E.g. if we
execute the command below and get the following output:

    cat /proc/cmdline
    BOOT_IMAGE=/xxx/vmlinuz-3.yyy.zzz root=xxxx .....

then the kdump kernel will be /boot/xxx/vmlinuz-3.yyy.zzz.
However, a variable KDUMP_BOOTDIR in /etc/sysconfig/kdump is provided for
the case where the kdump kernel is put in a different directory.

Kdump Post-Capture Executable
-----------------------------

It is possible to specify a custom script or binary you wish to run following
an attempt to capture a vmcore. The executable is passed an exit code from
the capture process, which can be used to trigger different actions from
within your post-capture executable.
If the /etc/kdump/post.d directory exists, all files in the directory are
collectively sorted and executed in lexical order, before the binary or script
specified by the kdump_post parameter is executed.
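
As an illustration of acting on that exit code, a post-capture script might be
built around a function like the one below. This is a sketch under
assumptions: the function name post_capture and the messages are made up, and
the capture exit code is taken to arrive as the first argument.

```shell
# post_capture STATUS
# Hypothetical body of a post-capture script; STATUS is the exit code of the
# capture process (assumed here to be passed as the first argument).
post_capture() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "kdump: vmcore captured successfully"
    else
        echo "kdump: vmcore capture failed with status $status"
    fi
}
```

A script saved under a path of your choosing and ending with
'post_capture "$1"' could then be named by the kdump_post parameter in
/etc/kdump.conf.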

Kdump Pre-Capture Executable
----------------------------

It is possible to specify a custom script or binary you wish to run before
capturing a vmcore. The exit status of this binary is interpreted as follows:

    0     - continue with dump process as usual
    non 0 - run the final action (reboot/poweroff/halt)

If the /etc/kdump/pre.d directory exists, all files in the directory are
collectively sorted and executed in lexical order, after the binary or script
specified by the kdump_pre parameter is executed.
Even if a binary or script in the /etc/kdump/pre.d directory returns a non 0
exit status, processing continues.
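
The exit status convention can be seen in a small pre-capture check such as
the following sketch. The function name pre_capture_check and the particular
condition tested (that the vmcore source is readable) are illustrative
assumptions, not a recommended policy.

```shell
# pre_capture_check [CORE_FILE]
# Hypothetical pre-capture check. Returns 0 to let the dump proceed, or
# non-zero so that the final action (reboot/poweroff/halt) runs instead.
pre_capture_check() {
    core=${1:-/proc/vmcore}
    if [ -r "$core" ]; then
        return 0          # 0: continue with the dump process as usual
    fi
    echo "kdump pre: $core is not readable, skipping dump" >&2
    return 1              # non 0: run the final action
}
```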

Extra Binaries
--------------

If you have specific binaries or scripts you want to have made available
within your kdump initrd, you can specify them by their full path, and they
will be included in your kdump initrd, along with all dependent libraries.
This may be particularly useful for those running post-capture scripts that
rely on other binaries.

Extra Modules
-------------

By default, only the bare minimum of kernel modules will be included in your
kdump initrd. Should you wish to capture your vmcore files to a non-boot-path
storage device, such as an iscsi target disk or clustered file system, you may
need to manually specify additional kernel modules to load into your kdump
initrd.

Failure action
--------------

The failure action specifies what to do when dumping to the configured dump
target fails. By default, the failure action is "reboot", that is, the system
reboots if the attempt to save the dump to the dump target fails.

There are other failure actions available though.

- dump_to_rootfs
    This option tries to mount root and save the dump on the root filesystem
    in a path specified by "path". This option will generally make
    sense when the dump target is not the root filesystem. For example, if
    the dump is being saved over the network using "ssh" then one can set
    the failure action to "dump_to_rootfs" to try saving the dump to the root
    filesystem if the dump over the network fails.

- shell
    Drop into a shell session inside initramfs.

- halt
    Halt system after failure.

- poweroff
    Poweroff system after failure.

Compression and filtering
-------------------------

The 'core_collector' parameter in kdump.conf allows you to specify a custom
dump capture method. The most common alternate method is makedumpfile, which
is a dump filtering and compression utility provided with kexec-tools. On
some architectures, it can drastically reduce the size of your vmcore files,
which becomes very useful on systems with large amounts of memory.

A typical setup is 'core_collector makedumpfile -F -l --message-level 7 -d 31',
but check the output of '/sbin/makedumpfile --help' for a list of all available
options (-i and -g don't need to be specified, they're automatically taken care
of). Note that use of makedumpfile requires that the kernel-debuginfo package
corresponding with your running kernel be installed.

The core collector command format depends on the dump target type. Typically
for a filesystem (local/remote), core_collector should accept two arguments.
The first one is the source file and the second one is the target file. For
example:

- ex1.

    core_collector "cp --sparse=always"

    Above will effectively be translated to:

    cp --sparse=always /proc/vmcore <dest-path>/vmcore

- ex2.

    core_collector "makedumpfile -l --message-level 7 -d 31"

    Above will effectively be translated to:

    makedumpfile -l --message-level 7 -d 31 /proc/vmcore <dest-path>/vmcore

For dump targets like raw and ssh, in general, the core collector should expect
one argument (the source file) and should output the processed core on standard
output (there is one exception, "scp", discussed later). This standard
output will be saved to the destination using appropriate commands.

raw dumps core_collector examples:

- ex3.

    core_collector "cat"

    Above will effectively be translated to:

    cat /proc/vmcore | dd of=<target-device>

- ex4.

    core_collector "makedumpfile -F -l --message-level 7 -d 31"

    Above will effectively be translated to:

    makedumpfile -F -l --message-level 7 -d 31 /proc/vmcore | dd of=<target-device>

ssh dumps core_collector examples:

- ex5.

    core_collector "cat"

    Above will effectively be translated to:

    cat /proc/vmcore | ssh <options> <remote-location> "dd of=path/vmcore"

- ex6.

    core_collector "makedumpfile -F -l --message-level 7 -d 31"

    Above will effectively be translated to:

    makedumpfile -F -l --message-level 7 -d 31 /proc/vmcore | ssh <options> <remote-location> "dd of=path/vmcore"

There is one exception to the standard output rule for ssh dumps, and that is
scp. As scp can handle ssh destinations for file transfers, one can
specify "scp" as the core collector for ssh targets (no output on stdout).

- ex7.

    core_collector "scp"

    Above will effectively be translated to:

    scp /proc/vmcore <user@host>:path/vmcore
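
The translation rules in ex1 through ex7 can be summarized in one small
function. This is a sketch for illustration, not kdump's implementation: the
function name effective_capture_cmd and the target labels fs, raw and ssh are
made up here, the "<options>" and "path/vmcore" placeholders are copied from
the examples, and /proc/vmcore is assumed to be passed as the source file in
every case.

```shell
# effective_capture_cmd TARGET COLLECTOR DEST
# Print the command line that a core_collector setting is effectively
# translated to for the given TARGET (fs, raw or ssh).
effective_capture_cmd() {
    target=$1 collector=$2 dest=$3
    case "$target" in
        fs)
            # Filesystem targets: collector gets source and target file.
            echo "$collector /proc/vmcore $dest/vmcore" ;;
        raw)
            # Raw targets: collector writes to stdout, dd stores the result.
            echo "$collector /proc/vmcore | dd of=$dest" ;;
        ssh)
            if [ "$collector" = "scp" ]; then
                # scp is the exception: it transfers the file itself.
                echo "scp /proc/vmcore $dest:path/vmcore"
            else
                echo "$collector /proc/vmcore | ssh <options> $dest \"dd of=path/vmcore\""
            fi ;;
        *)
            return 1 ;;
    esac
}
```

For example, 'effective_capture_cmd raw cat /dev/vg/lv_kdump' prints
'cat /proc/vmcore | dd of=/dev/vg/lv_kdump', as in ex3.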

About default core collector
----------------------------

The default core_collector for ssh/raw dumps is:
"makedumpfile -F -l --message-level 7 -d 31".
The default core_collector for other targets is:
"makedumpfile -l --message-level 7 -d 31".

Even if the core_collector option is commented out in kdump.conf, makedumpfile
is the default core collector and kdump uses it internally.

If one does not want makedumpfile as the default core_collector, then one
needs to specify another one using the core_collector option to change the
behavior.

Note: If "makedumpfile -F" is used then you will get a flattened format
vmcore.flat, and you will need to use "makedumpfile -R" to rearrange the
dump data from standard input into a normal dumpfile (readable with analysis
tools).
For example: "makedumpfile -R vmcore < vmcore.flat"


Caveats
=======

Console frame-buffers and X are not properly supported. If you typically run
with something along the lines of "vga=791" in your kernel config line or
have X running, console video will be garbled when a kernel is booted via
kexec. Note that the kdump kernel should still be able to create a dump,
and when the system reboots, video should be restored to normal.


Notes
=====

Notes on resetting video:
-------------------------

Video is a notoriously difficult issue with kexec. Video cards contain ROM code
that controls their initial configuration and setup. This code is nominally
accessed and executed from the BIOS, and otherwise not safely executable. Since
the purpose of kexec is to reboot the system without re-executing the BIOS, it
is rather difficult if not impossible to reset video cards with kexec. The
result is that if a system crashes while running in a graphical mode (i.e.
running X), the screen may appear to become 'frozen' while the dump capture is
taking place. A serial console will of course reveal that the system is
operating and capturing a vmcore image, but a casual observer will see the
system as hung until the dump completes and a true reboot is executed.

There are two possibilities to work around this issue. One is by adding
--reset-vga to the kexec command line options in /etc/sysconfig/kdump. This
tells kdump to write some reasonable default values to the video card register
file, in the hopes of returning it to a text mode such that boot messages are
visible on the screen. It does not work with all video cards however.
Secondly, it may be worth trying to add vga15fb.ko to the extra_modules list in
/etc/kdump.conf. This will attempt to use the video card in framebuffer mode,
which can blank the screen prior to the start of a dump capture.
|
|
|
|
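As a rough sketch of the first workaround, the snippet below appends
--reset-vga to an option string only when it is not already present. It
assumes the kexec options are kept in a variable named KEXEC_ARGS in
/etc/sysconfig/kdump; the starting value "-s" is just an example, and add_opt
is a hypothetical helper, not part of kexec-tools.

```shell
# add_opt ARGS OPT: print ARGS with OPT appended unless it is already there.
# Hypothetical helper for illustration only.
add_opt() {
    case " $1 " in
        *" $2 "*) printf '%s\n' "$1" ;;           # already present
        *)        printf '%s\n' "${1:+$1 }$2" ;;  # append, no leading space
    esac
}

# Assumed variable name and example starting value.
KEXEC_ARGS="-s"
KEXEC_ARGS=$(add_opt "$KEXEC_ARGS" "--reset-vga")
echo "KEXEC_ARGS=\"$KEXEC_ARGS\""
```

The resulting line is what one would place in /etc/sysconfig/kdump before
restarting the kdump service.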
Notes on rootfs mount
---------------------

Dracut is designed to mount rootfs by default. If rootfs mounting fails, it
will refuse to go on. So kdump currently leaves rootfs mounting to dracut.
For the time being, we assume that a proper root= cmdline is passed to the
dracut initramfs. If you need to modify "KDUMP_COMMANDLINE=" in
/etc/sysconfig/kdump, make sure that the appropriate root= options are
copied from /proc/cmdline. In general it is best to append command line
options using "KDUMP_COMMANDLINE_APPEND=" instead of replacing the original
command line completely.

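When KDUMP_COMMANDLINE has to be written by hand, the root= option can be
picked out of the running kernel's command line along these lines. This is a
minimal sketch: root_args is a hypothetical helper, and the sample command
line is made up for illustration.

```shell
# root_args CMDLINE: print every root= option found in a kernel command line.
# Runs in a subshell with globbing disabled so words are split safely.
root_args() (
    set -f
    for arg in $1; do
        case "$arg" in
            root=*) printf '%s\n' "$arg" ;;
        esac
    done
)

# Sample command line; on a live system use "$(cat /proc/cmdline)" instead.
sample_cmdline="BOOT_IMAGE=/vmlinuz root=/dev/mapper/vg-root ro quiet"
root_args "$sample_cmdline"
```

The printed option can then be copied into the KDUMP_COMMANDLINE value.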
Notes on watchdog module handling
---------------------------------

If a watchdog is active in the first kernel, its module must be loaded in
the crash kernel, so that the watchdog is either deactivated or kicked
in the second kernel. Otherwise, the watchdog might reboot the system while
the vmcore is being saved. When the dracut watchdog module is enabled, it
installs the kernel watchdog module of the active watchdog device in the initrd.
kexec-tools always adds "-a watchdog" to the dracut_args if there is at
least one active watchdog and the user has not explicitly added "-o watchdog"
to the dracut_args in kdump.conf. If a watchdog module (such as hp_wdt) has
not been written for the watchdog-core framework, this option will have no
effect and the module will not be added. Please note that only the systemd
watchdog daemon is supported as the watchdog kick application.

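The decision described above can be sketched as follows. This is not the
kexec-tools implementation: want_watchdog_module is a hypothetical helper, and
it assumes that active watchdogs can be recognised by a "state" file containing
"active" under a sysfs-like directory such as /sys/class/watchdog. It also
only recognises the literal spelling "-o watchdog" in dracut_args.

```shell
# want_watchdog_module SYSFS_DIR DRACUT_ARGS
# Succeeds (returns 0) when "-a watchdog" should be added to dracut_args.
want_watchdog_module() {
    dir=$1 args=$2
    # The user explicitly omitted the dracut watchdog module.
    case " $args " in
        *" -o watchdog "*) return 1 ;;
    esac
    # Look for at least one watchdog device reported as active.
    for state in "$dir"/*/state; do
        [ -f "$state" ] || continue
        if [ "$(cat "$state")" = "active" ]; then
            return 0
        fi
    done
    return 1
}

if want_watchdog_module /sys/class/watchdog ""; then
    echo "would add: -a watchdog"
else
    echo "no active watchdog, or module omitted by user"
fi
```

Passing the directory as a parameter keeps the check easy to try against a
scratch directory instead of the real sysfs tree.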
Notes for disk images
---------------------

The kdump initramfs is a critical component for capturing the crash dump.
But it is generated strictly for the machine it will run on and is not
generic. If you install a new machine from a previous disk image
(e.g. VMs created from a disk image or snapshot), kdump can easily break
due to hardware changes or disk ID changes. So it is strongly
recommended not to include the kdump initramfs in the disk image in the
first place. This saves space, and kdumpctl will build the
initramfs automatically if it is missing. If you have already installed
a machine from a disk image that has the kdump initramfs embedded, you
should rebuild the initramfs manually using the "kdumpctl rebuild" command,
or else kdump may not work as expected.

Notes on encrypted dump target
------------------------------

Currently, kdump does not work well with an encrypted dump target.
First, the user has to enter the password manually in the capture kernel,
so a working interactive terminal is required in the capture kernel.
Another major issue is that an OOM problem will occur with certain
encryption setups. For example, the default setup for LUKS2 uses a
memory-hard key derivation function to mitigate brute force attacks, so
it is impossible to reduce the memory usage for mounting the encrypted
target. In such a case, you have to either reserve enough memory for the
crash kernel accordingly, or update your encryption setup.
It is recommended to use a non-encrypted target (e.g. a remote target)
instead.

Notes on device dump
--------------------

Device dump allows drivers to append dump data to the vmcore, so you can
collect driver-specific debug info. Drivers can append data without any
limit, and the data is stored in memory, which may cause significant
memory stress. So device dump is disabled by default
by passing the "novmcoredd" command line option to the kdump capture kernel.
If you want to collect debug data with device dump, you need to modify the
"KDUMP_COMMANDLINE_APPEND=" value in /etc/sysconfig/kdump and remove the
"novmcoredd" option. You also need to increase the "crashkernel=" value
accordingly to avoid OOM issues.

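Removing "novmcoredd" from the appended command line can be sketched like
this. remove_opt is a hypothetical helper, and the starting value of
KDUMP_COMMANDLINE_APPEND is only an example, not necessarily the value shipped
on your system.

```shell
# remove_opt ARGS OPT: print ARGS without any word equal to OPT.
# Runs in a subshell with globbing disabled so words are split safely.
remove_opt() (
    set -f
    out=""
    for word in $1; do
        if [ "$word" != "$2" ]; then
            out="${out:+$out }$word"
        fi
    done
    printf '%s\n' "$out"
)

# Example value only.
KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices novmcoredd"
KDUMP_COMMANDLINE_APPEND=$(remove_opt "$KDUMP_COMMANDLINE_APPEND" novmcoredd)
echo "KDUMP_COMMANDLINE_APPEND=\"$KDUMP_COMMANDLINE_APPEND\""
```

The remaining options are left in their original order.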
Besides, the kdump initramfs won't automatically include the device drivers
that support device dump; only device drivers that are required for
the dump target setup will be included. To ensure the device dump data
is included in the vmcore, you need to force-include the related
device drivers by using the "extra_modules" option in /etc/kdump.conf.


Parallel Dumping Operation
==========================

Kexec allows kdump to use multiple cpus, so the parallel feature can
substantially accelerate dumping, especially compression and filtering.
For example:

    1. "makedumpfile -c --num-threads [THREAD_NUM] /proc/vmcore dumpfile"
    2. "makedumpfile -c /proc/vmcore dumpfile"

1 has better performance than 2 if THREAD_NUM is larger than two
and the number of usable cpus is larger than THREAD_NUM.

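One way to apply the rule above is to derive THREAD_NUM from the number of
online cpus. The following is a sketch under stated assumptions:
pick_num_threads is a hypothetical helper, and "cpus minus one" is just one
simple policy that keeps the number of usable cpus larger than THREAD_NUM.

```shell
# pick_num_threads CPUS: print a THREAD_NUM for --num-threads, or 0 when
# the single-threaded form should be used instead.
pick_num_threads() {
    cpus=$1
    threads=$((cpus - 1))   # keep usable cpus larger than THREAD_NUM
    if [ "$threads" -gt 2 ]; then
        echo "$threads"
    else
        echo 0
    fi
}

threads=$(pick_num_threads "$(getconf _NPROCESSORS_ONLN)")
if [ "$threads" -gt 0 ]; then
    echo "makedumpfile -c --num-threads $threads /proc/vmcore dumpfile"
else
    echo "makedumpfile -c /proc/vmcore dumpfile"
fi
```

The snippet only prints the command it would choose. In a capture kernel the
cpu count is bounded by the nr_cpus option, so the threaded form is selected
only when nr_cpus has been raised.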
Notes on how to use multiple cpus on a capture kernel on an x86 system:

Make sure that the capture kernel supports the disable_cpu_apicid
kernel option, which is needed to avoid an x86-specific
hardware issue (*). The disable_cpu_apicid kernel option is automatically
appended by the kdumpctl script and is ignored if the kernel doesn't support it.

You need to specify how many cpus to use in the capture kernel with the
nr_cpus kernel option in /etc/sysconfig/kdump. nr_cpus is 1 by default.

Use only as many cpus on the capture kernel as necessary.
Warning: Don't use too many cpus on a capture kernel, or the capture kernel
may panic due to Out Of Memory.

(*) Without the disable_cpu_apicid kernel option, the capture kernel may
hang, reset the system or power off at boot, depending on your system and
the runtime situation at the time of the crash.


Debugging Tips
==============

- One can drop into a shell before/after saving the vmcore with the help of
  kdump_pre/kdump_post hooks. Use the following in one of the pre/post
  scripts to drop into a shell.

    #!/bin/bash
    # Console device to attach the interactive shell to
    _ctty=/dev/ttyS0
    setsid /bin/sh -i -l 0<>"$_ctty" 1<>"$_ctty" 2<>"$_ctty"

  One might have to change the terminal depending on what they are using.

- Serial console logging for virtual machines

  I generally use "virsh console <domain-name>" to get to the serial console.
  I noticed that after the dump is saved, the system reboots, and when the
  grub menu shows up, some of the previously logged messages are no longer
  there. That means any important debugging info at the end will be lost.

  One can log the serial console as follows to make sure messages are not lost.

    virsh ttyconsole <domain-name>
    ln -s <name-of-tty> /dev/modem
    minicom -C /tmp/console-logs

  Now minicom should be logging the serial console to the file console-logs.

- Using the logger to output kdump log messages

  Without the logger, kdump messages are printed with the 'echo' command or
  redirected to the console, and cannot be output according to a log level.

  That makes kdump issues inconvenient to debug: we usually need to capture
  additional debugging information by modifying the options or the
  scripts such as kdumpctl, mkdumprd, etc. The lack of complete debugging
  messages can waste valuable time.

  To cope with this challenge, we introduce the logger to output the kdump
  messages according to the log level. It saves logs to journald if the
  journald service is available and then dumps all logs to a file;
  otherwise it dumps the logs to a file with dmesg.

  Logging is controlled by the following global variables:
  - @var kdump_stdloglvl - logging level to standard error (console output)
  - @var kdump_sysloglvl - logging level to syslog (by logger command)
  - @var kdump_kmsgloglvl - logging level to /dev/kmsg (only for boot-time)
  If any of the variables is not set, it is set to its default:
  - @var kdump_stdloglvl=4 (info)
  - @var kdump_sysloglvl=4 (info)
  - @var kdump_kmsgloglvl=0 (no logging)

  Logging levels: fatal(1),error(2),warn(3),info(4),debug(5),trace(6)

  We can easily configure the above variables in /etc/sysconfig/kdump. For
  example:
    kdump_sysloglvl=5
    kdump_stdloglvl=5

  The above configuration indicates that kdump messages will be printed to the
  console and to journald if the journald service is enabled.
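The effect of a log level can be illustrated with a small filter. This is a
hypothetical sketch, not the actual kdump logger code: level_num and log_std
are made-up names, and the sketch assumes a message is shown when its level
number is less than or equal to the configured level.

```shell
# Default from the list above when the variable is not set.
kdump_stdloglvl=${kdump_stdloglvl:-4}

# level_num NAME: map a level name to its number; fails for unknown names.
level_num() {
    case "$1" in
        fatal) echo 1 ;;
        error) echo 2 ;;
        warn)  echo 3 ;;
        info)  echo 4 ;;
        debug) echo 5 ;;
        trace) echo 6 ;;
        *)     return 1 ;;
    esac
}

# log_std LEVEL MESSAGE: print MESSAGE to standard error if LEVEL passes
# the kdump_stdloglvl filter.
log_std() {
    lvl=$(level_num "$1") || return 1
    if [ "$lvl" -le "$kdump_stdloglvl" ]; then
        printf '%s: %s\n' "$1" "$2" >&2
    fi
}

log_std info "saving vmcore"        # shown at the default level 4
log_std debug "core collector set"  # shown only if kdump_stdloglvl >= 5
```

Under that assumption, setting kdump_stdloglvl=5 as in the example above makes
debug messages visible on the console as well.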