Commit Graph

1659 Commits

Author SHA1 Message Date
Sourabh Jain
4b7b7736ee Introduce a function to get reserved memory size
The size of the reserved memory in the functions show_reserved_mem,
check_crash_mem_reserved, and do_estimate is fetched from the sysfs
node `/sys/kernel/kexec_crash_size`. However, in the case of fadump,
the reserved area size is instead present in
`/sys/kernel/fadump/mem_reserved`.

For example:

$ kdumpctl showmem
kdump: Dump mode is fadump
kdump: Reserved 0MB memory for crash kernel

The above command showed 0MB of reserved memory, which is incorrect: the
actual reservation was 2048MB.

To resolve this issue, a new helper function is introduced to fetch the
reserved memory size based on the dump mode. For "fadump" mode,
it looks in `/sys/kernel/fadump/mem_reserved`; otherwise, it uses
`/sys/kernel/kexec_crash_size`. All functions that previously
fetched the reserved memory size directly from the
`/sys/kernel/kexec_crash_size` sysfs node are now updated to use this
new function.

With the fix in place, the `kdumpctl showmem` command now displays the
correct reserved memory size.

$ kdumpctl showmem
kdump: Dump mode is fadump
kdump: Reserved 2048MB memory for crash kernel
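
The mode-dependent lookup described above could be sketched roughly as follows. The function name and the optional sysfs-root argument are assumptions made for illustration (the second argument exists only so the logic can be exercised outside a real system); the helper actually added by this commit may differ.

```shell
# Hypothetical sketch, not the code from this commit.
# $1: dump mode ("fadump", or anything else for kdump)
# $2: sysfs root, defaults to /sys (illustrative parameter for testing)
get_reserved_mem_size_sketch() {
    local mode="$1"
    local sysfs_root="${2:-/sys}"
    local node="$sysfs_root/kernel/kexec_crash_size"

    if [ "$mode" = "fadump" ]; then
        node="$sysfs_root/kernel/fadump/mem_reserved"
    fi

    # Print the node's content, or 0 if the node cannot be read.
    cat "$node" 2> /dev/null || echo 0
}
```

A caller would pass the current dump mode, e.g. `get_reserved_mem_size_sketch fadump`, and convert the printed value to MB for display.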

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reported-by: Sachin P Bappalige <sachinpb@linux.vnet.ibm.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-08-15 13:51:14 +08:00
Fedora Release Engineering
b725cdb45e Rebuilt for https://fedoraproject.org/wiki/Fedora_39_Mass_Rebuild
Signed-off-by: Fedora Release Engineering <releng@fedoraproject.org>
2023-07-20 08:42:47 +00:00
Coiby Xu
52a034eb10 Use SPDX licence
Convert to SPDX license by https://fedoraproject.org/wiki/Changes/SPDX_Licenses_Phase_2,
    # license-fedora2spdx GPLv2
    GPL-2.0-only

Signed-off-by: Coiby Xu <coxu@redhat.com>
2023-07-04 11:56:11 +08:00
Lichen Liu
daa829f79e spec: kdump/ppc64: make servicelog_notify silent when there are no errors
There is a confusing message in /var/log/anaconda/packaging.log when installing
kexec-tools during the system installation on ppc64le:

	Event Notification Registration successful (id: 1)

Make servicelog_notify silent when there are no errors.

Signed-off-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-06-25 10:42:02 +08:00
Pingfan Liu
f3139012f2 kdump-lib: Match 64k debug kernel in prepare_kdump_bootinfo()
The name of the 64k debug kernel variant ends with the substring 64k-debug, e.g.
vmlinuz-5.14.0-327.el9.aarch64+64k-debug.

Provide an extra matching pattern to filter it out.

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-06-20 11:17:43 +08:00
Philipp Rudo
dda81d72c2 kdumpctl: Fix temporary directory location
The temporary directory is currently created under the current working
directory. That alone isn't ideal but works most of the time. However,
it will fail when the current working directory is not writable. So make
sure the directory is created within TMPDIR.
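
A minimal sketch of creating the directory under TMPDIR instead of the current working directory. The function name and the `kdump.XXXXXX` template are illustrative assumptions, not taken from kdumpctl.

```shell
# Hypothetical sketch: create a private temporary directory under TMPDIR
# (falling back to /tmp) and print its path.
make_kdump_tmpdir_sketch() {
    mktemp -d "${TMPDIR:-/tmp}/kdump.XXXXXX"
}
```

Because the template is an absolute path, the result no longer depends on whether the current working directory is writable.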

Fixes: ea00b7d ("kdumpctl: Move temp file in get_kernel_size to global temp dir")
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-06-20 11:17:43 +08:00
Coiby Xu
17c26558d9 tests: use the default crashkernel value
With commit 5b31b099 ("Simplify the management of the kernel
parameter crashkernel"), the default crashkernel value will be
used for the kernel. But the test VM has only 768M of RAM, thus there is no
actual reserved memory for kdump. Even with the old crashkernel=224M,
network dumping tests like nfs-kdump will run out of memory when
running against current Fedora Cloud images (>=F37).

This patch addresses the above two issues by
 1. increasing the RAM of test VM to 1G
 2. installing the kernel-modules which contains the squashfs module in
    order to use the dracut squash module for kdump initrd.

Thanks to the dracut squash module, now even crashkernel=192M (the
default crashkernel value for RAM between 1G and 4G) works for
network dumping. Another benefit brought by this change is the default
crashkernel value can be tested as well.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-06-20 10:24:25 +08:00
Coiby Xu
7cd799462e tests: skip checking if the second partition has the boot label
All the tests failed to run on the Fedora 37 host because the boot
partition failed to be mounted and in turn the key kernel cmdline
parameters like selinux=0 couldn't be added.

The root problem is that running lsblk on the second partition somehow
returns an empty label unless we wait long enough. Until the root cause
is figured out, simply skip the check that the second partition
needs to have the boot label.

Note the root problem can be reproduced by building a test image,
    cd tests
    ./scripts/build-image.sh Fedora-Cloud-Base-37-1.7.x86_64.qcow2 output_image   scripts/build-scripts/base-image_test.sh
    Source image is qcow2, using snapshot...
    Formatting 'build/base-image1.building', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=5368709120 backing_file=Fedora-Cloud-Base-37-1.7.x86_64.qcow2 backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
    It's a image with multiple partitions, using last partition as main partition
    grep: /boot/grub2/grubenv: No such file or directory
    grub2-editenv: error: cannot open `/boot/grub2/grubenv.new': No such file or directory.
    /dev/nbd0 disconnected

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-06-20 10:24:25 +08:00
Coiby Xu
8f243d2ab1 tests: generate correct RPM name
Tests failed to run against Fedora 37 or newer cloud images because of
the following error,
    It's a image with multiple partitions, using last partition as main partition
    'xxx/tests/build/x86_64/kexec-tools-2.0.26-5.fc37.src.rpm' not found
    /dev/nbd0 disconnected
    make: *** [Makefile:73: xxx/tests/output/test-base-image] Error 1

This is because starting with Fedora 37, rpm changes its API,
    # Fedora >= 37
    $ rpm -q --specfile kexec-tools.spec
    kexec-tools-2.0.26-5.fc37.src
    # Fedora 36
    $ rpm -q --specfile kexec-tools.spec
    kexec-tools-2.0.26-5.fc36

The tests depend on rpm to generate the correct RPM name. Fix this issue by
removing the trailing .src from the output of "rpm -q --specfile".
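
The suffix removal can be sketched with plain parameter expansion. The function name is hypothetical and the real test scripts may do this differently (for example with sed).

```shell
# Hypothetical sketch: drop a trailing ".src" from a package name string,
# leaving names without that suffix unchanged.
strip_src_suffix() {
    printf '%s\n' "${1%.src}"
}
```

Both forms from the example above then map to the same style of name: `strip_src_suffix kexec-tools-2.0.26-5.fc37.src` and `strip_src_suffix kexec-tools-2.0.26-5.fc36` each print the name without `.src`.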

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2023-06-20 10:24:08 +08:00
Coiby Xu
471c136481 Release 2.0.26-7
Signed-off-by: Coiby Xu <coxu@redhat.com>
2023-06-14 17:39:41 +08:00
Pingfan Liu
64d93c886f kdumpctl: Fix the matching of plus symbol by grep's EREs
After introducing the 64k variant kernel on aarch64, an example kernel name
looks like "vmlinuz-5.14.0-316.el9.aarch64+64k". To match the plus
symbol, an escape character is required.
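
To illustrate why the escape matters: in an extended regular expression an unescaped `+` means "one or more of the preceding character", so it does not match a literal plus sign. The helper name below is hypothetical and the pattern is only an example, not the one used in kdumpctl.

```shell
# Hypothetical sketch: succeed if the kernel name ends with the literal
# suffix "+64k". The backslash turns '+' into a literal character.
has_64k_suffix() {
    printf '%s\n' "$1" | grep -qE '\+64k$'
}
```

Without the backslash, a pattern such as `aarch64+64k$` would look for `aarch6`, one or more `4`, then `64k`, and would not match the kernel name above.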

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-06-14 17:33:16 +08:00
Pingfan Liu
7a2c4cbc3b kdump-lib: Evaluate the memory consumption by smmu and mlx5 separately
On 4k and 64k kernels, the typical consumption values for SMMU are 36MB
and 384MB, respectively. Hence for the 64k kernel, the consumption by SMMU
should be taken into account carefully.

Do so by adding the extra 384MB value if installing a 64k kernel.
The upper limit value 384MB is calculated according to the formula in
the kernel SMMU driver.

As for mlx5 network cards, the consumption was measured by a practical
test: 200M for the 64k variant, 150M for the 4k variant.

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-06-14 17:33:16 +08:00
Pingfan Liu
05c4861443 kdump-lib: add support for 64K aarch64
On aarch64, both 4K and 64K kernels can be installed, and they demand
different sizes of reserved memory for the kdump kernel.

'get_conf PAGE_SIZE' cannot work when installing a 64K kernel while
running a 4K kernel. Hence resort to the kernel release naming rules.
At present, the 64K kernel has the keyword '64k' in its suffix.

The baseline for 64K is decided based on 4K. The diff of 100M is picked
since on a high-end machine without smmu enabled, the diff of MemFree is
82M.

As for the smmu case, there is a huge difference in memory consumption
between the 64k and 4k drivers, and it should be calculated separately.
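
A rough sketch of deriving the 64K baseline from the kernel release name. The function name is hypothetical and the 4K baseline is supplied by the caller; only the `64k` keyword check and the 100M difference come from the message above.

```shell
# Hypothetical sketch, not the kdump-lib implementation.
# $1: baseline in MB for the 4K kernel (caller-supplied)
# $2: kernel release string
# Prints the baseline in MB for that kernel.
baseline_mb_sketch() {
    local base_4k="$1"
    local kver="$2"

    case "$kver" in
        *64k*) echo $((base_4k + 100)) ;;  # 64K kernel: add the 100M diff
        *)     echo "$base_4k" ;;          # 4K kernel: baseline unchanged
    esac
}
```

This works for a kernel that is installed but not running, which is the case where querying the page size of the running system would give the wrong answer.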

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-06-14 17:33:16 +08:00
Pingfan Liu
d8b961be37 kdump-lib: Introduce parse_kver_from_path() to get kernel version from its path name
kdump_get_arch_recommend_crashkernel() expects the kernel version info,
while _update_kernel() provides the absolute path, which contains the
kernel version info.

This patch introduces a dedicated function parse_kver_from_path() to
extract the kernel info from the path.
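
As an illustration only, extracting the version from a `vmlinuz-<version>` path could look like the sketch below. The function name is hypothetical and it handles just this one naming scheme; the real parse_kver_from_path() may cover more layouts.

```shell
# Hypothetical sketch: print the kernel version encoded in a path of the
# form <dir>/vmlinuz-<version>; return 1 for any other file name.
kver_from_path_sketch() {
    local name="${1##*/}"   # strip the directory part

    case "$name" in
        vmlinuz-?*) printf '%s\n' "${name#vmlinuz-}" ;;
        *) return 1 ;;
    esac
}
```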

Credit to Philipp, who contributes the original code.

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-06-14 17:33:16 +08:00
Pingfan Liu
51efbcf83e kdump-lib: Introduce a help function _crashkernel_add()
This helper function can manipulate the crashkernel cmdline by adding a
number to each item. Also a basic test case for _crashkernel_add() is
provided in this patch.
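
To make "adding a number to each item" concrete, here is a simplified sketch. It is not the real _crashkernel_add(): the name is hypothetical, and it assumes every size is written as plain `<digits>M` and that no `@offset` is present.

```shell
# Hypothetical sketch. $1: crashkernel value, e.g. "1G-4G:192M,4G-64G:256M"
# $2: number of MB to add to every size. Prints the adjusted value.
crashkernel_add_sketch() {
    local rest="$1"
    local delta="$2"
    local out="" item prefix size num

    while [ -n "$rest" ]; do
        # Take the next comma-separated item off the front.
        item="${rest%%,*}"
        if [ "$item" = "$rest" ]; then rest=""; else rest="${rest#*,}"; fi

        # Separate an optional "<range>:" prefix from the size.
        case "$item" in
            *:*) prefix="${item%%:*}:"; size="${item#*:}" ;;
            *)   prefix="";             size="$item" ;;
        esac

        # Only "<digits>M" sizes are handled in this sketch.
        num="${size%M}"
        [ "$num" != "$size" ] || return 1
        case "$num" in
            "" | *[!0-9]*) return 1 ;;
        esac

        out="${out}${out:+,}${prefix}$((num + delta))M"
    done
    printf '%s\n' "$out"
}
```

Sizes in other units are rejected with a non-zero return code rather than guessed at.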

Credit to Philipp, who contributes the original code.

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-06-14 17:33:16 +08:00
Coiby Xu
0471131a16 Merge #14 Make binutils a recommend as it's only needed for UKI support
2023-06-14 09:31:45 +00:00
Lichen Liu
29fe563644 kdump.conf: redirect unknown architecture warning to stderr
The warning messages should not be included in the generated files.
Redirecting the warning for an unknown architecture to stderr.

Signed-off-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-06-09 10:19:04 +08:00
Timothée Ravier
eabbf9d6a0 Whitespace fixes
2023-06-02 13:11:01 +02:00
Timothée Ravier
4da1ffe730 Make binutils a recommend as it's only needed for UKI support
UKIs are not supported on rpm-ostree based Fedora variants, so let's use
recommend for binutils for now to let those variants not include the package
until needed.

See: https://github.com/coreos/fedora-coreos-tracker/issues/1496
See: https://github.com/ostreedev/ostree/issues/2753
See: ea7be0608e
2023-06-02 13:11:01 +02:00
Coiby Xu
e42a823dae mkdumprd: Use the correct syntax to redirect the stderr to null
A space was added by mistake and unfortunately fips-mode-setup refuses
an extra parameter,

    # fips-mode-setup --is-enabled 2 > /dev/null
    # echo $?
    2
    # fips-mode-setup --is-enabled 2
    Check, enable, or disable the system FIPS mode.
    usage: /usr/bin/fips-mode-setup --enable|--disable [--no-bootcfg]
    usage: /usr/bin/fips-mode-setup --check
    usage: /usr/bin/fips-mode-setup --is-enabled

So in this case mkdumprd can never detect if FIPS is enabled. Fix this
mistake.
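
The effect of the stray space can be demonstrated without fips-mode-setup. In the sketch below `count_args` is a hypothetical stand-in that only records how many arguments it received.

```shell
# Hypothetical stand-in for a command that rejects extra parameters:
# it just records its argument count in $argc.
count_args() {
    argc=$#
}

# With the space, "2" is an ordinary argument and only stdout is redirected.
count_args --is-enabled 2 > /dev/null
with_space=$argc

# Without the space, "2>" redirects stderr and no extra argument is passed.
count_args --is-enabled 2> /dev/null
without_space=$argc

echo "with space: $with_space, without space: $without_space"
```

The first call passes two arguments and the second passes one, which is why the real command printed its usage text in the first case.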

Fixes: 443a43e0 ("mkdumprd: call dracut with --add-device to install the drivers needed by /boot partition automatically for FIPS")
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Tao Liu <ltao@redhat.com>
2023-06-01 16:39:12 +08:00
Coiby Xu
4311534c85 Release 2.0.26-5
Signed-off-by: Coiby Xu <coxu@redhat.com>
2023-05-29 17:42:49 +08:00
Coiby Xu
07b99ecab7 Add ShellSpec tests for managing the crashkernel kernel parameter
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-05-29 14:40:57 +08:00
Coiby Xu
5b31b099ae Simplify the management of the kernel parameter crashkernel
Currently, kexec-tools updates the crashkernel to a new default
value only when both of the following conditions are met,
 - auto_reset_crashkernel=yes in kdump.conf
 - existing kernels or current running kernel should use the old default
   value.

To address the corner cases seen so far, the logic to tell if the second
condition is met has become quite complex. Instead of making the logic more complex
to support aarch64-64k, this patch drops the second condition to
simplify the management of the crashkernel kernel parameter.

Another change brought by this simplification is kexec-tools will also
set up the kernel crashkernel parameter for a fresh install (previously
it's limited to osbuild).

Note
1. This patch also stops trying to update /etc/default/grub because
   a) it only affects the static file /boot/grub2/grub.cfg
   b) grubby is recommended to change the kernel command-line parameters
      for both Fedora [1] and RHEL9 [2][3]
   c) For the cases of aarch64 and POWER, different kernels could have
      different default crashkernel value.

2. Starting with Fedora 37, the posttrans rpm scriptlet distinguishes between
   package install and upgrade.

[1] https://fedoraproject.org/wiki/GRUB_2
[2] https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/managing_monitoring_and_updating_the_kernel/configuring-kernel-command-line-parameters_managing-monitoring-and-updating-the-kernel#changing-kernel-command-line-parameters-for-all-boot-entries_configuring-kernel-command-line-parameters
[3] https://access.redhat.com/solutions/1136173

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-05-29 14:40:57 +08:00
Coiby Xu
cdc0253a3c Let _update_kernel_cmdline return the correct return code
Currently, for non-s390x systems, the return code is 1 even when
_update_kernel_cmdline is correctly executed. This makes callers like
reset_crashkernel_after_update fail to print a message if a kernel has
its crashkernel updated. Fix it by putting the code inside an if block for
s390x.
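
One common way such a return code arises in shell is a trailing `test && command` list. The sketch below illustrates that pattern and the if-block fix; it is an assumption about the general shape of the problem, not the actual code of _update_kernel_cmdline, and the function names are hypothetical.

```shell
# Buggy shape: on a non-s390x system the test is the last command and
# fails, so the function returns 1 even though nothing went wrong.
update_cmdline_buggy() {
    local arch="$1"
    [ "$arch" = "s390x" ] && echo "s390x-only step"
}

# Fixed shape: an if statement whose condition is false and which has no
# else branch exits with status 0.
update_cmdline_fixed() {
    local arch="$1"
    if [ "$arch" = "s390x" ]; then
        echo "s390x-only step"
    fi
}
```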

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-05-29 14:40:57 +08:00
Coiby Xu
443a43e075 mkdumprd: call dracut with --add-device to install the drivers needed by /boot partition automatically for FIPS
Currently, kdump doesn't work on many FIPS-enabled systems including
Azure, ESXi, Hyper-V, POWER, etc. When FIPS is enabled, it needs to
access /boot//.vmlinuz-xxx.hmac to verify the integrity of the kernel.
However, on those systems, /boot fails to be mounted due to a lack of
fs and block device drivers and the system just halts after failing to
verify the integrity of the kernel. For example, on Hyper-V, sd_mod, sg,
scsi_transport_fc, hv_storvsc and hv_vmbus need to be installed in order
for /boot to be mounted.

mkdumprd calls dracut with the --no-hostonly-default-device. Following
the documentation (man dracut),
    --no-hostonly-default-device
      Do not generate implicit host devices like root, swap, fstab, etc.
      Use "--mount" or "--add-device" to explicitly add devices as needed

this patch uses "--add-device" to explicitly add the device of /boot.

Note there is already an attempt to fix it in dracut's 01fips module
i.e. via the commit 83651776 ("fips: ensure fs module for /boot is
installed"). Unfortunately it only installs the file system driver e.g.
xfs.

Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
2023-05-29 10:20:11 +08:00
Pingfan Liu
81d3cc344d kdump-lib: fix the matching pattern for debug-kernel
On aarch64, a 64k kernel's name looks like:
vmlinuz-5.14.0-300.el9.aarch64+64k and the corresponding debug kernel's
name looks like: vmlinuz-5.14.0-300.el9.aarch64+64k-debug, which ends
with the suffix -debug instead of +debug.

Fix the matching pattern by [+|-]debug
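
For illustration, matching both suffixes with a shell glob could look like the sketch below. The function name is hypothetical and the real code may use a regular expression instead. Note that inside a bracket expression `|` is a literal character, so `[+-]` is sufficient to accept either `+` or `-`.

```shell
# Hypothetical sketch: succeed if the kernel name ends in "+debug" or
# "-debug".
is_debug_kernel_sketch() {
    case "$1" in
        *[+-]debug) return 0 ;;
        *) return 1 ;;
    esac
}
```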

Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-05-23 14:59:09 +08:00
Coiby Xu
5e3d629da7 Release 2.0.26-4
Signed-off-by: Coiby Xu <coxu@redhat.com>
2023-05-16 09:44:56 +08:00
Jeremy Linton
8af05dc45a kdumpctl: Add support for systemd-boot paths
The default systemd-boot installed kernels on Fedora end up in the form:

/boot/efi/36b54597c46383/6.4.0-0.rc0.20230427git6e98b09da931.5.fc39.aarch64/linux

Where the kernel version is a directory containing the kernel (linux)
and the initrd. Thus _find_kernel_path_by_release needs to be a bit less
strict and allow some further characters in the grubby (really bootctl)
output.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-05-16 09:21:13 +08:00
Kairui Song
c8643af270 mkdumprd: add --aggressive-strip as default dracut args
The new aggressive strip option was added in dracut 058, which tells
dracut to build the initramfs stripping more sections of the ELF
binaries (basically strip .symtab, .strtab).

These sections are only useful for debugging runtime failures, but in
the kdump kernel, the necessary tools for debugging any runtime failure are
absent, so there is no point in keeping these sections.

Stripping these sections can help save some memory with almost no side
effects. So let's enable --aggressive-strip by default.

Comparison of unpacked initramfs before / after enabling aggressive strip:

du -hs image image.aggressive-strip
31M     image
29M     image.aggressive-strip

Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-05-16 09:21:13 +08:00
Philipp Rudo
ea7be0608e kdumpctl: Add basic UKI support
A Unified Kernel Image (UKI) is a single EFI PE executable combining an
EFI stub, a kernel image, an initrd image, and the kernel command line.
They are defined in the Boot Loader Specification [1] as type #2
entries. UKIs have the advantage that all code as well as meta data that
is required to boot the system, not only the kernel image, is combined
in a single PE file and can be signed for EFI SecureBoot. This extends
the coverage of SecureBoot extensively.

For RHEL, support for UKIs was included in kernel-ark with 16c7e3ee836e
("redhat: Add sub-RPM with a EFI unified kernel image for virtual
machines").

There are two problems with UKIs from the kdump point of view at the
moment. First, they cannot be directly loaded via kexec_file_load and
second, the initrd included isn't suitable for kdump. In order to enable
kdump on systems with UKIs, build the kdump initrd as usual and extract
the kernel image before loading the crash kernel.

[1] https://uapi-group.org/specifications/specs/boot_loader_specification/

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-05-16 09:21:13 +08:00
Philipp Rudo
ea00b7db43 kdumpctl: Move temp file in get_kernel_size to global temp dir
Others will need to use temporary files, too. In order to avoid
potential clashes of multiple trap handlers move the local temp file
into a global temp dir.

While at it make sure that the trap handler returns the correct exit
code.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-05-16 09:21:13 +08:00
Philipp Rudo
81d89c885f kdumpctl: Move get_kernel_size to kdumpctl
The function is only used in do_estimate. Move it to kdumpctl to
prevent confusion.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-05-16 09:20:59 +08:00
Philipp Rudo
0ff44ca6e8 kdumpctl: fix is_dracut_mod_omitted
The function is pretty broken right now. To start with the -o/--omit
option allows a quoted, space separated list of modules. But using 'set'
breaks quotation and thus only considers the first element in the list.
Furthermore dracut uses getopt internally. This means that it is also
possible to pass the list via --omit=.

Fix the function by making use of getopt for parsing the dracut_args.
While at it also add test cases to cover the function.
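
A rough sketch of the getopt-based approach, assuming the util-linux `getopt` utility is available. The function name is hypothetical and the real is_dracut_mod_omitted may differ in how it reads dracut_args and which options it lists.

```shell
# Hypothetical sketch, not the kdumpctl implementation.
# $1: dracut module name, $2: raw dracut_args string
is_mod_omitted_sketch() {
    local mod="$1"
    local args="$2"
    local parsed

    # Split the raw string into words while honoring its quotes.
    # eval is only acceptable here because the input is trusted config.
    eval "set -- $args"

    # Let getopt normalize "-o x", "--omit x" and "--omit=x"; -q hides
    # complaints about the other dracut options this sketch does not list.
    parsed=$(getopt -q -o "o:" -l "omit:" -- "$@") || true
    eval "set -- $parsed"

    while [ $# -gt 0 ]; do
        case "$1" in
            -o | --omit)
                # The option value is a space-separated module list.
                case " $2 " in
                    *" $mod "*) return 0 ;;
                esac
                shift 2
                ;;
            --) break ;;
            *) shift ;;
        esac
    done
    return 1
}
```

Because getopt emits each option value as a single quoted word, a quoted list such as `--omit "plymouth resume"` is seen in full rather than only its first element.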

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-17 14:49:51 +08:00
Philipp Rudo
f81e6ca8da kdump-lib: move is_dracut_mod_omitted to kdumpctl
The function is only used in kdumpctl. Thus move it there to keep
kdump-lib small and simple.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-17 14:49:51 +08:00
Philipp Rudo
9eb39cda3c kdump-lib: remove get_nmcli_connection_apath_by_ifname
The function isn't used anywhere. Thus remove it to keep kdump-lib small
and simple.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-17 14:49:51 +08:00
Philipp Rudo
62c41e5343 kdump-lib: remove get_nmcli_field_by_conpath
The function isn't used anywhere. Thus remove it to keep kdump-lib small
and simple.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-17 14:49:51 +08:00
Philipp Rudo
f9d8cabfd1 dracut-module-setup: remove dead source_ifcfg_file
With the NetworkManager rewrite this function is no longer used. This
also allows removing a lot of dead code in kdump-lib.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-17 14:49:51 +08:00
Philipp Rudo
258d285c63 kdump-lib-initramfs: remove is_fs_dump_target
The function isn't used anywhere. Thus remove it to keep
kdump-lib-initramfs small and simple.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-17 14:49:51 +08:00
Philipp Rudo
ca306cd403 kdump-lib-initramfs: harden is_mounted
If the device/mountpoint for findmnt is omitted findmnt will list all
mounted filesystems. In that case it will always return "true". So
explicitly check if an argument was passed to prevent false-positives.
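
The hardening amounts to an explicit non-empty check before calling findmnt. A minimal sketch under a hypothetical name:

```shell
# Hypothetical sketch: report "mounted" only if an argument was given and
# findmnt finds it. Without the first test, an empty argument would make
# findmnt list all mounts and succeed.
is_mounted_sketch() {
    [ -n "$1" ] && findmnt -k -n "$1" > /dev/null 2>&1
}
```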

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-17 14:49:51 +08:00
Coiby Xu
12d9eff9dc Show how much time kdump has waited for the network to be ready
Relates: https://bugzilla.redhat.com/show_bug.cgi?id=2151504

Currently, when the network isn't ready, kdump would repeatedly print
the same info,

    [   29.537230] kdump[671]: Bad kdump network destination: 192.123.1.21
    [   30.559418] kdump[679]: Bad kdump network destination: 192.123.1.21
    [   31.580189] kdump[687]: Bad kdump network destination: 192.123.1.21

This is not user-friendly and users may think kdump has got stuck. So
also show how much time kdump has waited for the network to be ready,

    [   29.546258] kdump[673]: Waiting for network to be ready (50s / 10min)
    ...
    [   32.608967] kdump[697]: Waiting for network to be ready (56s / 10min)

Note kdump_get_ip_route no longer prints an error message and it's up to
the caller to determine the log level and print relevant messages. And
kdump_collect_netif_usage aborts when kdump_get_ip_route fails.

Reported-by: Martin Pitt <mpitt@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-04-15 06:39:17 +08:00
Coiby Xu
df6f25ff20 Tell nmcli to not escape colon when getting the path of connection profile
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2151504

When a NetworkManager connection profile contains a colon in the name,
"nmcli --get-values UUID,FILENAME" by default would escape the colon
because a colon is also used for separating the values. In this case,
99kdumpbase fails to get the correct connection profile path,
	kdumpctl[5439]: cp: cannot stat '/run/NetworkManager/system-connections/static-52\\\:54\\\:01.nmconnection': No such file or directory
	kdumpctl[5440]: sed: can't read /tmp/1977-DRACUT_KDUMP_NM/ifcfg-static-52-54-01: No such file or directory
	kdumpctl[5449]: dracut-install: ERROR: installing '/tmp/1977-DRACUT_KDUMP_NM/ifcfg-static-52-54-01' to '/etc/NetworkManager/system-connections/ifcfg-static-52-54-01'

As a result, dumping vmcore to a remote nfs would fail.

In our case of getting connection profile path, there is no need to escape the
colon so pass "-escape no" to nmcli,

	[root@localhost ~]# nmcli --get-values UUID,FILENAME c show
	659e09c1-a6bd-3549-9be4-a07a1a9a8ffd:/etc/NetworkManager/system-connections/aa\:bb.nmconnection

	[root@localhost ~]# nmcli -escape no --get-values UUID,FILENAME c show
	659e09c1-a6bd-3549-9be4-a07a1a9a8ffd:/etc/NetworkManager/system-connections/aa:bb.nmconnection
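
With unescaped output, the UUID and the path can still be separated reliably by splitting at the first colon only, since the UUID itself contains none. The sketch below illustrates this on the sample line above; the variable names are illustrative and this is not necessarily how 99kdumpbase parses the output.

```shell
# Sample line as printed with escaping disabled (taken from the example).
line='659e09c1-a6bd-3549-9be4-a07a1a9a8ffd:/etc/NetworkManager/system-connections/aa:bb.nmconnection'

uuid="${line%%:*}"   # everything before the first colon
path="${line#*:}"    # everything after the first colon, later colons kept

echo "UUID: $uuid"
echo "FILENAME: $path"
```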

Suggested-by: Beniamino Galvani <bgalvani@redhat.com>
Reported-by: Martin Pitt <mpitt@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-04-14 20:22:49 +08:00
Lichen Liu
d619b6dabe kdumpctl: lower the log level in reset_crashkernel_for_installed_kernel
Although upgrading the kernel with `rpm -Uvh` is not recommended, the
kexec-tools plugin prints confusing error logs when a customer upgrades the
kernel through it.

```
kdump: kernel 5.14.0-80.el9.x86_64 doesn't exist
kdump: Couldn't find current running kernel
```

Not finding the currently running kernel will only make kdump unable to copy the
grub entry parameters to the newly installed kernel, so lower the log level.

Signed-off-by: Lichen Liu <lichliu@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-04-10 12:20:15 +08:00
Coiby Xu
70c7598ef0 Install nfsv4-related drivers when users specify nfs dumping via dracut_args
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2140721

Currently, if users specify dumping to nfsv4 target via
  dracut_args --mount "<NFS-server-ip>:/var/crash /mnt nfs defaults"
it fails with the following errors,
    [    5.159760] mount[446]: mount.nfs: Protocol not supported
    [    5.164502] systemd[1]: mnt.mount: Mount process exited, code=exited, status=32/n/a
    [    5.167616] systemd[1]: mnt.mount: Failed with result 'exit-code'.
    [FAILED] Failed to mount /mnt.

This is because nfsv4-related drivers are not installed to kdump initrd.
mkdumprd calls dracut with "--hostonly-mode strict". If nfsv4-related
drivers aren't loaded before calling dracut, they won't be installed.
When users specify nfs dumping via dracut_args, kexec-tools won't mount
the nfs fs beforehand hence nfsv4-related drivers won't be installed.
Note dracut only installs the nfs driver i.e. nfsv3 driver for "--mount
... nfs". So also install nfsv4-related drivers when users specify nfs
dumping via dracut_args. Since nfs_layout_nfsv41_files depends on nfsv4,
the nfsv4 driver will be installed automatically.

As for the reason why we support nfs dumping via dracut_args instead of
asking user to use the nfs directive, please refer to commit 74c6f464
("Support special mount information via 'dracut_args'").

Fixes: 4eedcae5 ("dracut-module-setup.sh: don't include multipath-hostonly")
Reported-by: rcheerla@redhat.com
Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
2023-03-21 15:55:57 +08:00
Philipp Rudo
d9dfea12da sysconfig: add zfcp.allow_lun_scan to KDUMP_COMMANDLINE_REMOVE on s390
Probing unnecessary I/O devices wastes memory and in extreme cases can
cause the crashkernel to run OOM. That's why the s390-tools maintain
their own module, 95zdev-kdump [1], that disables auto LUN scanning and
only configures zfcp devices that can be used as dump target. So remove
zfcp.allow_lun_scan from the kernel command line so that we don't
accidentally overwrite the default set by the module.

[1] https://github.com/ibm-s390-linux/s390-tools/blob/master/zdev/dracut/95zdev-kdump/module-setup.sh

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-03-13 15:30:00 +08:00
Coiby Xu
12e6cd2b76 Use the correct command to get architecture
`uname -r` was used by mistake. As a result, kexec-tools failed to
update crashkernel=auto during in-place upgrade from RHEL8 to RHEL9.

`uname -m` should be used to get architecture instead.

Fixes: 5951b5e2 ("Don't try to update crashkernel when bootloader is not installed")

Signed-off-by: Coiby Xu <coxu@redhat.com>
Reviewed-by: Lichen Liu <lichliu@redhat.com>
2023-02-21 11:33:06 +08:00
Coiby Xu
b41cab7099 Release 2.0.26-3
Signed-off-by: Coiby Xu <coxu@redhat.com>
2023-01-30 17:39:35 +08:00
Philipp Rudo
d4e877214c kdumpctl: make do_estimate more robust
At the beginning of do_estimate it currently checks whether the
TARGET_INITRD exists and if not fails with an error message. This not
only requires the user to manually trigger the build of the initrd but
also ignores all cases where the TARGET_INITRD exists but needs to be
rebuilt, for example when there were changes to kdump.conf or when the
system switches from kdump to fadump. All these changes will impact the
outcome of do_estimate. Thus properly check whether the initrd needs to
be rebuilt and, if it does, trigger the rebuild automatically.

To do so move the check whether the TARGET_INITRD has fadump enabled to
is_system_modified and call this function. With this force_(no_)rebuild
options in kdump.conf are ignored to avoid unnecessary rebuilds.

While at it clean up check_system_modified and rename it to
is_system_modified. Furthermore move printing the info that the initrd
gets rebuilt to rebuild_initrd so that not every caller needs the same line.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-01-30 17:37:23 +08:00
Philipp Rudo
37577b93ed kdumpctl: refactor check_rebuild
check_rebuild uses a bunch of local variables to store the result of the
different checks performed. At the end of the function it then evaluates
which check failed to print an appropriate info and trigger a rebuild if
needed. This not only makes the function hard to read but also requires
all checks to be executed even if an earlier one already determined that
the initrd needs to be rebuilt. Thus refactor check_rebuild such that
it only checks whether the initrd needs to be rebuilt and let the caller
trigger the rebuild (if needed). While at it rename the function to
need_initrd_rebuild.

Furthermore also move setup_initrd to the caller so it is more consistent
with the other users of the function.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-01-30 17:37:23 +08:00
Philipp Rudo
5eefcf2e94 kdumpctl: cleanup 'stop'
Like for 'start' move the printing of the error message to the calling
function. This not only makes the code more consistent with 'start' but
also prevents 'kdumpctl restart' from calling 'start' in case 'stop' has
failed. This doesn't impact the case when 'kdumpctl restart' is run
without any crash kernel being loaded as kexec will still return success
in that case.

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-01-30 17:37:23 +08:00
Philipp Rudo
33b307af20 kdumpctl: cleanup 'start'
The function has many blocks of the kind

if ! cmd; then
  derror "Starting kdump: [FAILED]"
  return 1
fi

This duplicates code and makes the function hard to read. Thus move the
block to the calling function.
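
The resulting structure can be sketched as follows. The function names are hypothetical, the steps are passed in as arguments purely for illustration, and the success message is an assumption; only the failure message comes from the block quoted above.

```shell
# Hypothetical sketch: the worker only propagates failure...
start_sketch() {
    local step
    for step in "$@"; do
        "$step" || return 1
    done
}

# ...and the caller prints the result message exactly once.
run_start_sketch() {
    if ! start_sketch "$@"; then
        echo "Starting kdump: [FAILED]" >&2
        return 1
    fi
    echo "Starting kdump: [OK]"   # illustrative success message
}
```

Each step inside the worker shrinks to `cmd || return 1`, which keeps the sequence of steps readable.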

Signed-off-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Coiby Xu <coxu@redhat.com>
2023-01-30 17:37:23 +08:00