kexec-tools

Author	SHA1	Message	Date
Coiby Xu	231a75ac1b	Revert "Revert "x86_64: enable the kexec file load by default"" This reverts commit `073c30973c`, i.e. re-enable the kexec file load by default since this dual signature issue no longer bothers Fedora 34. Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-07-14 02:03:10 +08:00
Kairui Song	2603ba7187	Cleanup dead systemd services before start sysroot.mount When kdump failed due to initqueue timeout, the sysroot.mount and other services could be stuck in `start` but `dead` status: Example output of systemctl: dev-disk-by\x2duuid-530830d1\x2df2c7\x2d4c9a\x2d9a82\x2d148609097521.device loaded inactive dead start <... snip ...> squash-root.mount loaded active mounted /squash/root squash.mount loaded active mounted /squash sysroot.mount loaded inactive dead start /sysroot <... snip ...> dracut-cmdline.service loaded active exited dracut cmdline hook dracut-initqueue.service loaded activating start start dracut initqueue hook dracut-mount.service loaded inactive dead start dracut mount hook At this point calling `systemctl start sysroot.mount` will just hang as systemd will just wait for the services that are stuck in `start` status. So call `systemctl cancel` here to cancel all pending jobs and have a clean start for mounting sysroot. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Coiby Xu <coxu@redhat.com>	2021-07-12 16:53:34 +08:00
Kairui Song	7dbbb4bb31	Add a crashkernel-howto.txt doc Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Baoquan He <bhe@redhat.com>	2021-07-08 15:36:36 +08:00
Kairui Song	6463641935	Add a new hook: 92-crashkernel.install To track and manage kernel's crashkernel usage by kernel version, each kernel package will include a crashkernel.default containing the default `crashkernel=` value of that kernel. So we can use a hook to update the kernel cmdline of new installed kernel accordingly. Put it after all other grub boot loader setup hooks, so it can simply call grubby to modify the kernel cmdline. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Baoquan He <bhe@redhat.com>	2021-07-08 15:36:32 +08:00
Kairui Song	86130ec10f	kdumpctl: Add kdumpctl reset-crashkernel In newer kernel, crashkernel.default will contain the default crashkernel value of a kernel build. So introduce a new sub command to help user reset kernel crashkernel size to the default value. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Baoquan He <bhe@redhat.com>	2021-07-08 15:18:45 +08:00
Kairui Song	017903c3c4	Revert "kdump-lib.sh: Remove is_atomic" Now we need this helper again, for `reset-crashkernel` This reverts commit `ff46cfb19e`. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Baoquan He <bhe@redhat.com>	2021-07-08 15:18:00 +08:00
Kairui Song	97930d3cca	fadump-init: clean up mount points properly When running with squash module enabled for both initramfs, /dev and /run are also mounted by squash-init, so move them to newroot as well, else they might leak. Also pass `-d` to umount so loop devices (if used) will be force freed. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com>	2021-06-30 17:28:45 +08:00
Kairui Song	bf6671b60d	fadump: kdumpctl should check the modules used by the fadump initramfs After fadump embedded the fadump initramfs in the normal initramfs, kdumpctl will mistakenly rebuild the initramfs every time. kdumpctl checks the hostonly-kernel-modules.txt file in initramfs to check if required drivers are included, but the normal initramfs is built in non-hostonly mode, so it doesn't have a hostonly-kernel-modules.txt file. The check will always fail. So let mkfadumprd make a copy of the hostonly-kernel-modules.txt in the fadump initramfs and let kdumpctl check that file instead. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com>	2021-06-30 17:27:02 +08:00
Hari Bathini	fa9201b240	fadump: isolate fadump initramfs image within the default one In case of fadump, the initramfs image has to be built to boot into the production environment as well as to offload the active crash dump to the specified dump target (for boot after crash). As the same image would be used for both boot scenarios, it could not be built optimally while accommodating both cases. Use --include to include the initramfs image built for offloading active crash dump to the specified dump target. Also, introduce a new out-of-tree dracut module (99zz-fadumpinit) that installs a customized init program while moving the default /init to /init.dracut. This customized init program is leveraged to isolate fadump image within the default initramfs image by kicking off default boot process (exec /init.dracut) for regular boot scenario and activating fadump initramfs image, if the system is booting after a crash. If squash is available, ensure default initramfs image is also built with squash module to reduce memory consumption in capture kernel. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-29 21:35:58 +08:00
Kairui Song	c4749f9c57	Release 2.0.22-4 Signed-off-by: Kairui Song <kasong@redhat.com>	2021-06-29 21:24:19 +08:00
Coiby Xu	ad6f60d70d	fix format issue in find_online_znet_device Change spaces to tab to fix alignment issue. Fixes: commit `7d47251568` ("Iterate /sys/bus/ccwgroup/devices to tell if we should set up rd.znet") Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-29 17:11:07 +08:00
Coiby Xu	03f9b91351	check the existence of /sys/bus/ccwgroup/devices before trying to find online network device /sys/bus/ccwgroup/devices doesn't exist for non-s390x machines which leads to the warning "find: '/sys/bus/ccwgroup/devices': No such file or directory". This warning can be eliminated by checking the existence of "/sys/bus/ccwgroup/devices" beforehand. Fixes: commit `7d47251568` ("Iterate /sys/bus/ccwgroup/devices to tell if we should set up rd.znet") Reported-by: Ruowen Qin <ruqin@redhat.com> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1974618 Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-29 17:11:00 +08:00
Tao Liu	50bb8b701f	check for invalid physical address of /proc/kcore when making ELF dumpfile Backport from upstream. commit 9a6f589d99dcef114c89fde992157f5467028c8f Author: Tao Liu <ltao@redhat.com> Date: Fri Jun 18 18:28:04 2021 +0800 [PATCH] check for invalid physical address of /proc/kcore when making ELF dumpfile Previously when executing makedumpfile with -E option against /proc/kcore, makedumpfile will fail: # makedumpfile -E -d 31 /proc/kcore kcore.dump ... write_elf_load_segment: Can't convert physaddr(ffffffffffffffff) to an offset. makedumpfile Failed. It's because /proc/kcore contains PT_LOAD program headers which have physaddr (0xffffffffffffffff). With -E option, makedumpfile will try to convert the physaddr to an offset and fails. Skip the PT_LOAD program headers which have such physaddr. Signed-off-by: Tao Liu <ltao@redhat.com> Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com> Signed-off-by: Tao Liu <ltao@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-28 15:52:21 +08:00
Tao Liu	0feb109818	check for invalid physical address of /proc/kcore when finding max_paddr Backport from upstream. commit 38d921a2ef50ebd36258097553626443ffe27496 Author: Coiby Xu <coxu@redhat.com> Date: Tue Jun 15 18:26:31 2021 +0800 [PATCH] check for invalid physical address of /proc/kcore when finding max_paddr Kernel commit 464920104bf7adac12722035bfefb3d772eb04d8 ("/proc/kcore: update physical address for kcore ram and text") sets an invalid paddr (0xffffffffffffffff = -1) for PT_LOAD segments of not direct mapped regions: $ readelf -l /proc/kcore ... Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align NOTE 0x0000000000000120 0x0000000000000000 0x0000000000000000 0x0000000000002320 0x0000000000000000 0x0 LOAD 0x1000000000010000 0xd000000000000000 0xffffffffffffffff ^^^^^^^^^^^^^^^^^^ 0x0001f80000000000 0x0001f80000000000 RWE 0x10000 makedumpfile uses max_paddr to calculate the number of sections for sparse memory model thus wrong number is obtained based on max_paddr (-1). This error could lead to the failure of copying /proc/kcore for RHEL-8.5 on ppc64le machine [1]: $ makedumpfile /proc/kcore vmcore1 get_mem_section: Could not validate mem_section. get_mm_sparsemem: Can't get the address of mem_section. makedumpfile Failed. Let's check if the phys_start of the segment is a valid physical address to fix this problem. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1965267 Reported-by: Xiaoying Yan <yiyan@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com> Signed-off-by: Tao Liu <ltao@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-28 15:52:16 +08:00
Tao Liu	18b9b763de	Increase SECTION_MAP_LAST_BIT to 5 Backport from upstream. commit 646456862df8926ba10dd7330abf3bf0f887e1b6 Author: Kazuhito Hagio <k-hagio-ab@nec.com> Date: Wed May 26 14:31:26 2021 +0900 [PATCH] Increase SECTION_MAP_LAST_BIT to 5 * Required for kernel 5.12 Kernel commit 1f90a3477df3 ("mm: teach pfn_to_online_page() about ZONE_DEVICE section collisions") added a section flag (SECTION_TAINT_ZONE_DEVICE) and causes makedumpfile an error on some machines like this: __vtop4_x86_64: Can't get a valid pmd_pte. readmem: Can't convert a virtual address(ffffe2bdc2000000) to physical address. readmem: type_addr: 0, addr:ffffe2bdc2000000, size:32768 __exclude_unnecessary_pages: Can't read the buffer of struct page. create_2nd_bitmap: Can't exclude unnecessary pages. Increase SECTION_MAP_LAST_BIT to 5 to fix this. The bit had not been used until the change, so we can just increase the value. Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com> Signed-off-by: Tao Liu <ltao@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-28 15:52:02 +08:00
Kairui Song	302be5c34b	Release 2.0.22-3 Signed-off-by: Kairui Song <kasong@redhat.com>	2021-06-20 02:38:04 +08:00
Coiby Xu	62578ace21	selftest: ignore all spaces when compare the dmesg files For the log entry that has multiple lines, "makedumpfile --dump-dmesg" would indent the remaining lines while vmcore-dmesg doesn't. For example, vmcore-dmesg.txt and vmcore-dmesg.txt.2 are the outputs of vmcore-dmesg and "makedumpfile --dump-dmesg" respectively, ``` diff -u vmcore-dmesg.txt vmcore-dmesg.txt.2 --- vmcore-dmesg.txt 2021-03-28 22:13:09.986000000 -0400 +++ vmcore-dmesg.txt.2 2021-03-28 22:13:39.920106131 -0400 @@ -397,9 +397,9 @@ [ 1.710742] vc vcsa: hash matches [ 1.711938] RAS: Correctable Errors collector initialized. [ 1.713736] Unstable clock detected, switching default tracing clock to "global" -If you want to keep using the local clock, then add: - "trace_clock=local" -on the kernel command line + If you want to keep using the local clock, then add: + "trace_clock=local" + on the kernel command line [ 1.750539] ata1.01: NODEV after polling detection [ 1.750973] ata1.00: ATA-7: QEMU HARDDISK, 2.5+, max UDMA/100 [ 1.752885] ata1.00: 8388608 sectors, multi 16: LBA48 ``` Quite often, all three tests could fail because of the above difference. So let's ignore all the spaces. This patch could fix bz1952299 [1]. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1952299 Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-08 22:21:47 +08:00
Coiby Xu	560c0e8a7b	selftest: Make test_base_image depend on EXTRA_RPMS test_base_image should depend on EXTRA_RPMS so it gets rebuilt when EXTRA_RPMS changes. Fixes: commit `bbc064f958` ("selftest: add EXTRA_RPMs so dracut RPMs can be installed onto the image to run the tests") Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-08 22:21:47 +08:00
Coiby Xu	0a15d859bb	selftest: fix the error of misplacing double quotes Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-08 22:21:47 +08:00
Lianbo Jiang	2d9504c4a4	mkdumprd: display the absolute path of dump location in the check_user_configured_target() When kdump service fails, the current errors do not display the absolute path of dump location(marked it as "^"), for example: kdump: kexec: unloaded kdump kernel kdump: Stopping kdump: [OK] kdump: Detected change(s) in the following file(s): /etc/kdump.conf kdump: Rebuilding /boot/initramfs-4.18.0-304.el8.x86_64kdump.img kdump: Dump path "/var1/crash" does not exist in dump target "UUID=c202ef45-3ac3-4adb-85e7-307a916757f0" ^^^^^^^^^^^ kdump: mkdumprd: failed to make kdump initrd kdump: Starting kdump: [FAILED] Here, it should output the absolute path of dump location with this format: "<mount path>/<path>". To fix it, let's extend the relative pathname to the absolute pathname in check_user_configured_target(). Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-08 10:49:24 +08:00
Coiby Xu	7d47251568	Iterate /sys/bus/ccwgroup/devices to tell if we should set up rd.znet This patch fixes bz1941106 and bz1941905 which passed empty rd.znet to the kernel command line in the following cases, - The IBM (Z15) KVM guest uses virtio for all devices including network device, so there is no znet device for IBM KVM guest. So we can't assume an s390x machine always has a znet device. - When a bridged network is used, kexec-tools tries to obtain the znet configuration from the ifcfg script of the bridged network rather than from the ifcfg script of znet device. We can iterate /sys/bus/ccwgroup/devices to tell if there is a znet network device. By getting an ifname from znet, we can also avoid mistaking the slave netdev as a znet network device in a bridged network or bonded network. Note: This patch also assumes there is only one znet device as commit `7148c0a30d` ("add s390x netdev setup") which greatly simplifies the code. According to IBM [1], there could be more than one znet device for a z/VM system and a z/VM system may have a non-znet network device like ConnectX. Since kdump_setup_znet was introduced in 2012 and so far there is no known customer complaint that invalidates this assumption, I think it's safe to assume an IBM z/VM system only has one znet device. Besides, there is no z/VM system found on beaker to test the alternative scenarios. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1941905#c13 Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-08 10:48:34 +08:00
Kairui Song	41980f30d9	Use a customized emergency shell Use a modified and minimized version of emergency shell. The differences of this kdump shell and dracut emergency shell are: - Kdump shell won't generate a rdsosreport automatically - Customized prompts - Never ask root password - Won't tangle with dracut's emergency_action. If emergency_action is set, dracut emergency shell will perform dracut's emergency_action instead of kdump final_action on exit. - If rd.shell=no is set, kdump shell will still work, dracut emergency shell won't, even if kdump failure_action is set to shell. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Coiby Xu <coxu@redhat.com>	2021-06-04 14:26:51 +08:00
Kairui Song	a2306346bc	Remove the kdump error handler isolation wrapper The wrapper is introduced in commit `002337c`, according to the commit message, the only usage of the wrapper is when dracut-initqueue calls "systemctl start emergency" directly. In that case, emergency is started, but not in an isolation mode, which means dracut-initqueue is still running. On the other hand, emergency will call "systemctl start dracut-initqueue" again when default action is dump_to_rootfs. systemd would block on the last dracut-initqueue, waiting for the first instance to exit, which leaves us hanging. In previous commit we added initqueue status detection in dump_to_rootfs, so now even without the wrapper, it will not hang. And actually, previously, with the wrapper, emergency might still hang for like 30s. When dracut called emergency service because initqueue timed out, dump_to_rootfs will try to start initqueue again and time out again. Now with the wrapper removed, we can avoid these two kinds of hangs, because without the isolation we can detect initqueue service status correctly in such case. Also remove the invalid header comments in service file, the service is not part of systemd code. And sync the service spec with dracut. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Coiby Xu <coxu@redhat.com>	2021-06-04 14:26:45 +08:00
Kairui Song	108258139a	Don't try to restart dracut-initqueue if it's already there kdump's dump_to_rootfs will try to start initqueue unconditionally. dump_to_rootfs will run after systemd isolate to emergency target, so this is currently acceptable. But there is a problem when initqueue starts the emergency action because of initqueue timeout. dump_to_rootfs will start initqueue and lead to timeout again. So the following patch will remove the previous isolation wrapper, and detect the service status here. Previous isolation makes the detection impossible. Now this detection will be valid and helpful to prevent double timeout or hang. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Coiby Xu <coxu@redhat.com>	2021-06-04 14:25:37 +08:00
Hari Bathini	39a642b66b	kdump-lib.sh: fix a warning in prepare_kdump_bootinfo() Fix the warning observed when KDUMP_KERNELVER is specified: kdumpctl[10926]: /lib/kdump/kdump-lib.sh: line 697: [: missing `]' Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-06-04 02:53:10 +08:00
Pingfan Liu	45377836b0	kdump-lib.sh: fix the case of not enough total RAM for kdump in get_recommend_size() For crashkernel=auto policy, if total RAM size is under a threshold, there is no memory reserved for kdump. Also correct a trivial bug by correcting the arch name. Signed-off-by: Pingfan Liu <piliu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-05-25 10:07:55 +08:00
Kairui Song	e9e6a2c745	kdumpctl: Add kdumpctl estimate Add rough estimation support. Currently, the following memory usages are checked by this sub command: - System RAM - Kdump Initramfs size - Kdump Kernel image size - Kdump Kernel module size - Kdump userspace user and other runtime allocated memory (currently simply using a fixed value: 64M) - LUKS encryption memory usage The output of kdumpctl estimate looks like this: # kdumpctl estimate Reserved crashkernel: 256M Recommanded crashkernel: 160M Kernel image size: 47M Kernel modules size: 12M Initramfs size: 19M Runtime reservation: 64M Large modules: xfs: 1892352 nouveau: 2318336 And if the kdump target is encrypted: # kdumpctl estimate Encrypted kdump target requires extra memory, assuming using the keyslot with minimun memory requirement Reserved crashkernel: 256M Recommanded crashkernel: 655M Kernel image size: 47M Kernel modules size: 12M Initramfs size: 19M Runtime reservation: 64M LUKS required size: 512M Large modules: xfs: 1892352 nouveau: 2318336 WARNING: Current crashkernel size is lower than recommanded size 655M. The "Recommanded" value is calculated based on memory usages mentioned above, and will be adjusted accordingly to be no less than the value provided by kdump_get_arch_recommend_size. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2021-05-19 15:27:43 +08:00
Kairui Song	85c725813b	mkdumprd: make use of the new get_luks_crypt_dev helper Simplify the code and also improve the performance. udevadm call is heavy. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2021-05-19 15:27:37 +08:00
Kairui Song	1c70cf51c7	kdump-lib.sh: introduce a helper to get all crypt dev used by kdump Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2021-05-19 15:27:19 +08:00
Kairui Song	3423bbc17f	kdump-lib.sh: introduce a helper to get underlying crypt device Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2021-05-19 15:26:58 +08:00
Kairui Song	13796ca93a	Release 2.0.22-2 Signed-off-by: Kairui Song <kasong@redhat.com>	2021-05-13 17:14:38 +08:00
Tao Liu	d5fe96cd7a	Disable CMA in kdump 2nd kernel kexec-tools needs to disable CMA for kdump kernel cmdline, otherwise kdump kernel may run out of memory. This patch strips the inherited cma=, hugetlb_cma= cmd line from 1st kernel, and sets to be 0 for 2nd kernel. Signed-off-by: Tao Liu <ltao@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-05-13 17:13:39 +08:00
Coiby Xu	8178d7a5a1	Warn the user if network scripts are used Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-05-13 17:13:31 +08:00
Coiby Xu	d5f6d38173	Set up bond cmdline by "nmcli --get-values" Now kdumpctl will exit if failing to set up bond cmdline. Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-05-13 17:13:28 +08:00
Coiby Xu	6f1badec78	Set up dns cmdline by parsing "nmcli --get-values" Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-05-13 17:13:25 +08:00
Coiby Xu	8b08b4f17b	Set up s390 znet cmdline by "nmcli --get-values" Now kdumpctl will abort when failing to set up znet. Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-05-13 17:13:15 +08:00
Coiby Xu	0c292f49c7	Add helper to get nmcli connection show cmd by ifname Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-05-13 17:13:08 +08:00
Coiby Xu	c69578ca43	Add helper to get nmcli connection apath by ifname apath (a D-Bus active connection path) is used for nmcli connection operations, e.g. $ nmcli connection show $apath Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-05-13 17:13:01 +08:00
Coiby Xu	10c309b5f7	Add helper to get value by field using "nmcli --get-values" nmcli --get-values <field> connection show /org/freedesktop/NetworkManager/ActiveConnection/1 returns the following value for the corresponding field respectively, Field Value IP4.DNS "10.19.42.41 \| 10.11.5.19 \| 10.5.30.160" 802-3-ethernet.s390-subchannels "" bond.options "mode=balance-rr" Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-05-13 17:11:38 +08:00
Kairui Song	c05d8a16a0	Update makedumpfile to 1.6.9 Signed-off-by: Kairui Song <kasong@redhat.com>	2021-05-13 16:45:36 +08:00
Kairui Song	dece041609	Release 2.0.22-1 Update kexec-tools to 2.0.22 Signed-off-by: Kairui Song <kasong@redhat.com>	2021-05-11 02:12:50 +08:00
Coiby Xu	8a33ffffbc	rd.route should use the name from kdump_setup_ifname This fixes bz1854037 which happens because kexec-tools generates rd.route for eth0 instead of for kdump-eth0, 1. "rd.route=168.63.129.16:10.0.0.1:eth0 rd.route=169.254.169.254:10.0.0.1:eth0" is passed to the dracut cmdline by kexec-tools 2. In the 2nd kernel, dracut/modules.d/35network-manager/nm-config.sh calls /usr/libexec/nm-initrd-generator to generate two .nmconnection files based on the dracut cmdline, i.e. kdump-eth0.nmconnection and eth0.nmconnection, - /run/NetworkManager/system-connections/kdump-eth0.nmconnection [connection] id=kdump-eth0 uuid=3ef53b1b-3908-437e-a15f-cf1f3ea2678b type=ethernet autoconnect-retries=1 interface-name=kdump-eth0 multi-connect=1 permissions= wait-device-timeout=60000 [ethernet] mac-address-blacklist= [ipv4] address1=10.0.0.4/24,10.0.0.1 dhcp-timeout=90 dns=168.63.129.16; dns-search= may-fail=false method=manual [ipv6] addr-gen-mode=eui64 dhcp-timeout=90 dns-search= method=disabled [proxy] - /run/NetworkManager/system-connections/eth0.nmconnection [connection] id=eth0 uuid=f224dc22-2891-4d7b-8f66-745029df4b53 type=ethernet autoconnect-retries=1 interface-name=eth0 multi-connect=1 permissions= [ethernet] mac-address-blacklist= [ipv4] dhcp-timeout=90 dns=168.63.129.16; dns-search= method=auto route1=168.63.129.16/32,10.0.0.1 route2=169.254.169.254/32,10.0.0.1 [ipv6] addr-gen-mode=eui64 dhcp-timeout=90 dns-search= method=auto [proxy] 3. Since there's eth0.nmconnection, NetworkManager will try to get an IP for eth0 regardless of the fact it's a slave NIC and time out ``` $ ip link show 2: kdump-eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 00:0d:3a:11:86:8b brd ff:ff:ff:ff:ff:ff 3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master kdump-eth0 state UP mode DEFAULT group default qlen 1000 ``` Reported-by: Huijing Hei <hhei@redhat.com> Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-05-11 02:11:50 +08:00
Coiby Xu	97ee5dc64c	get kdump ifname once in kdump_install_net Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-05-11 02:11:50 +08:00
Tao Liu	ca05b754af	Fix incorrect file permissions of vmcore-dmesg-incomplete.txt vmcore-dmesg-incomplete.txt is generated by shell redirection, which takes the default umask value. When the dmesg collector exits with non-zero, the file will exist and anyone can have access to it. This patch fixes the issue by running chmod on the file, making it accessible only to its owner. Signed-off-by: Tao Liu <ltao@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-05-11 02:11:50 +08:00
Kairui Song	ee160bf04d	Revert "Always set vm.zone_reclaim_mode = 3 in kdump kernel" This reverts commit `5633e83318`. vm.zone_reclaim_mode may cause thrashing on some machines. And after second thought, vm.zone_reclaim_mode is barely helpful for machines with high mem stress, so just revert it. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2021-04-28 18:05:23 +08:00
Kairui Song	6137956f79	kdumpctl: fix check_config error when kdump.conf is empty The kdump scripts already have default values for core_collector and path in many other places. Empty kdump.conf still works. Fix this corner case and fix the error message. Signed-off-by: Kairui Song <kasong@redhat.com> Acked-by: Pingfan Liu <piliu@redhat.com>	2021-04-28 18:05:12 +08:00
Kairui Song	d0a301aa3a	Release 2.0.21-9 Signed-off-by: Kairui Song <kasong@redhat.com>	2021-04-28 16:51:52 +08:00
Tao Liu	475e33030b	Make dracut-squash required for kexec-tools This patch reverts commit "Make dracut-squash a weak dep". Although kexec-tools can work without dracut-squash, it is essential for kdump to run properly in cases [1][2] where minimal amount of memory consumption is expected. Thus dracut-squash is needed for it. [1] https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org/message/SJX7CW3WLOYSFI2YJKGTUGDBWSCMZXVZ/ [2] https://www.spinics.net/lists/systemd-devel/msg05864.html Signed-off-by: Tao Liu <ltao@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-04-28 16:13:39 +08:00
Tao Liu	0db060c4e2	Show write byte size in report messages Backport from upstream: commit 0ef2ca6c9fa2f61f217a4bf5d7fd70f24e12b2eb Author: Kazuhito Hagio <k-hagio-ab@nec.com> Date: Thu Feb 4 16:29:06 2021 +0900 [PATCH] Show write byte size in report messages Show write byte size in report messages. This value can be different from the size of the actual file because of some holes on dumpfile data structure. $ makedumpfile --show-stats -l -d 1 vmcore dump.ld1 ... Total pages : 0x0000000000080000 Write bytes : 377686445 ... # ls -l dump.ld1 -rw------- 1 root root 377691573 Feb 4 16:28 dump.ld1 Note that this value should not be used with /proc/kcore to determine how much disk space is needed for crash dump, because the real memory usage when a crash occurs can vary widely. Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com> Signed-off-by: Tao Liu <ltao@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-04-28 16:13:23 +08:00
Tao Liu	8973bd7ed0	Add shorthand --show-stats option to show report stats Backport from upstream: commit 6f3e75a558ed50d6ff0b42e3f61c099b2005b7bb Author: Julien Thierry <jthierry@redhat.com> Date: Tue Nov 24 10:45:25 2020 +0000 [PATCH 2/2] Add shorthand --show-stats option to show report stats Provide shorthand --show-stats option to enable report messages without needing to set a particular value for message-level. Signed-off-by: Julien Thierry <jthierry@redhat.com> Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com> Signed-off-by: Tao Liu <ltao@redhat.com> Acked-by: Kairui Song <kasong@redhat.com>	2021-04-28 15:45:25 +08:00
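
Commit `d5fe96cd7a` above describes stripping the inherited `cma=` and `hugetlb_cma=` arguments from the 1st kernel's command line and setting both to 0 for the 2nd kernel. The following is only a rough sketch of that idea, not the code from the commit: the function name `strip_cma_args` and the sample command line are hypothetical, and it assumes a plain space-separated command line with no quoted arguments.

```shell
#!/bin/bash
# Hypothetical helper for illustration; NOT the function used by kexec-tools.
# Drops any inherited cma= / hugetlb_cma= arguments from a kernel command
# line and appends explicit zero values for the kdump (2nd) kernel.
# Assumption: arguments are space-separated and contain no quoted spaces
# or glob characters.
strip_cma_args() {
    local cmdline="$1" arg out=""
    for arg in $cmdline; do
        case "$arg" in
            cma=*|hugetlb_cma=*) continue ;;   # inherited from the 1st kernel
        esac
        out="${out:+$out }$arg"
    done
    # Append the zeroed values so CMA is disabled in the 2nd kernel.
    echo "${out:+$out }cma=0 hugetlb_cma=0"
}

# Made-up sample command line
strip_cma_args "root=/dev/sda1 cma=64M quiet hugetlb_cma=1G"
```

Appending explicit zero values, rather than only removing the arguments, reflects the commit message's wording that the values are set to 0 for the 2nd kernel.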


1368 Commits