kernel-ark/kernel
Srivatsa S. Bhat a219ccf463 smp: print more useful debug info upon receiving IPI on an offline CPU
There is a longstanding problem related to CPU hotplug which causes IPIs
to be delivered to offline CPUs, and the smp-call-function IPI handler
code prints out a warning whenever this is detected.  Every once in a
while this (usually harmless) warning gets reported on LKML, but so far
it has not been completely fixed.  Usually the solution involves finding
out the IPI sender and fixing it by adding appropriate synchronization
with CPU hotplug.

However, while going through one such internal bug reports, I found that
there is a significant bug in the receiver side itself (more
specifically, in stop-machine) that can lead to this problem even when
the sender code is perfectly fine.  This patchset fixes that
synchronization problem in the CPU hotplug stop-machine code.

Patch 1 adds some additional debug code to the smp-call-function
framework, to help debug such issues easily.

Patch 2 modifies the stop-machine code to ensure that any IPIs that were
sent while the target CPU was online, would be noticed and handled by
that CPU without fail before it goes offline.  Thus, this avoids
scenarios where IPIs are received on offline CPUs (as long as the sender
uses proper hotplug synchronization).

In fact, I debugged the problem by using Patch 1, and found that the
payload of the IPI was always the block layer's trigger_softirq()
function.  But I was not able to find anything wrong with the block
layer code.  That's when I started looking at the stop-machine code and
realized that there is a race-window which makes the IPI _receiver_ the
culprit, not the sender.  Patch 2 fixes that race and hence this should
put an end to most of the hard-to-debug IPI-to-offline-CPU issues.

This patch (of 2):

Today the smp-call-function code just prints a warning if we get an IPI
on an offline CPU.  This info is sufficient to let us know that
something went wrong, but often it is very hard to debug exactly who
sent the IPI and why, from this info alone.

In most cases, we get the warning about the IPI to an offline CPU,
immediately after the CPU going offline comes out of the stop-machine
phase and reenables interrupts.  Since all online CPUs participate in
stop-machine, the information regarding the sender of the IPI is already
lost by the time we exit the stop-machine loop.  So even if we dump the
stack on each CPU at this point, we won't find anything useful since all
of them will show the stack-trace of the stopper thread.  So we need a
better way to figure out who sent the IPI and why.

To achieve this, when we detect an IPI targeted to an offline CPU, loop
through the call-single-data linked list and print out the payload
(i.e., the name of the function which was supposed to be executed by the
target CPU).  This would give us an insight as to who might have sent
the IPI and help us debug this further.

[akpm@linux-foundation.org: correctly suppress warning output on second and later occurrences]
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:12 -07:00
..
debug kernel/printk: use symbolic defines for console loglevels 2014-06-04 16:54:17 -07:00
events Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm into next 2014-06-05 15:57:04 -07:00
gcov gcov: reuse kbasename helper 2013-11-13 12:09:34 +09:00
irq genirq: Improve documentation to match current implementation 2014-05-27 10:16:44 +02:00
locking Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next 2014-06-03 14:00:15 -07:00
power Merge branches 'pnp', 'powercap', 'pm-runtime' and 'pm-opp' 2014-06-03 23:13:00 +02:00
printk kernel/printk: use symbolic defines for console loglevels 2014-06-04 16:54:17 -07:00
rcu Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next 2014-06-03 12:57:53 -07:00
sched printk: Add printk_deferred_once 2014-06-04 16:54:17 -07:00
time Merge branch 'akpm' (patchbomb from Andrew) into next 2014-06-04 16:55:13 -07:00
trace tracing: Use rcu_dereference_sched() for trace event triggers 2014-05-02 23:12:42 -04:00
.gitignore Ignore generated file kernel/x509_certificate_list 2013-12-10 18:21:34 +00:00
acct.c
async.c
audit_tree.c inotify: Fix reporting of cookies for inotify events 2014-02-18 11:17:17 +01:00
audit_watch.c inotify: Fix reporting of cookies for inotify events 2014-02-18 11:17:17 +01:00
audit.c net: Use netlink_ns_capable to verify the permisions of netlink messages 2014-04-24 13:44:54 -04:00
audit.h audit: Use struct net not pid_t to remember the network namespce to reply in 2014-03-20 10:10:53 -04:00
auditfilter.c Merge git://git.infradead.org/users/eparis/audit 2014-04-12 12:38:53 -07:00
auditsc.c Merge git://git.infradead.org/users/eparis/audit 2014-04-12 12:38:53 -07:00
backtracetest.c kernel/backtracetest.c: replace no level printk by pr_info() 2014-06-04 16:54:14 -07:00
bounds.c mm: do not allocate page->ptl dynamically, if spinlock_t fits to long 2013-12-20 12:25:45 -08:00
capability.c kernel/capability.c: code clean-up 2014-06-04 16:54:15 -07:00
cgroup_freezer.c cgroup: fix rcu_read_lock() leak in update_if_frozen() 2014-05-13 11:28:30 -04:00
cgroup.c kernfs: move the last knowledge of sysfs out from kernfs 2014-05-27 14:33:17 -07:00
compat.c kernel/compat.c: use sizeof() instead of sizeof 2014-06-04 16:54:19 -07:00
configs.c
context_tracking.c asmlinkage: Add explicit __visible to drivers/*, lib/*, kernel/* 2014-05-05 16:07:46 -07:00
cpu_pm.c
cpu.c kernel/cpu.c: convert printk to pr_foo() 2014-06-04 16:54:14 -07:00
cpuset.c mm: page_alloc: use jump labels to avoid checking number_of_cpusets 2014-06-04 16:54:08 -07:00
crash_dump.c
cred.c
delayacct.c kernel/delayacct.c: remove redundant checking in __delayacct_add_tsk() 2013-11-13 12:09:12 +09:00
dma.c
elfcore.c switch elf_core_write_extra_phdrs() to dump_emit() 2013-11-09 00:16:23 -05:00
exec_domain.c kernel/exec_domain.c: code clean-up 2014-06-04 16:54:15 -07:00
exit.c signals: mv {dis,}allow_signal() from sched.h/exit.c to signal.[ch] 2014-06-06 16:08:11 -07:00
extable.c asmlinkage: Make main_extable_sort_needed visible 2014-02-13 18:13:22 -08:00
fork.c ptrace: fix fork event messages across pid namespaces 2014-06-06 16:08:11 -07:00
freezer.c libata, freezer: avoid block device removal while system is frozen 2013-12-19 13:50:32 -05:00
futex_compat.c compat: Get rid of (get|put)_compat_time(val|spec) 2014-02-02 14:09:12 -08:00
futex.c Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next 2014-06-03 12:57:53 -07:00
groups.c kernel/groups.c: remove return value of set_groups 2014-04-03 16:21:05 -07:00
hrtimer.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next 2014-06-03 13:18:00 -07:00
hung_task.c kernel/hung_task.c: convert simple_strtoul to kstrtouint 2014-06-04 16:54:15 -07:00
irq_work.c perf/x86: Warn to early_printk() in case irq_work is too slow 2014-02-21 21:49:07 +01:00
itimer.c
jump_label.c static_key: WARN on usage before jump_label_init was called 2013-10-19 19:45:35 -04:00
kallsyms.c kernel: use macros from compiler.h instead of __attribute__((...)) 2014-04-07 16:36:11 -07:00
kcmp.c
Kconfig.freezer
Kconfig.hz kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS 2013-11-15 09:32:22 +09:00
Kconfig.locks
Kconfig.preempt
kexec.c powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode 2014-05-28 13:24:26 +10:00
kmod.c signals: change wait_for_helper() to use kernel_sigaction() 2014-06-06 16:08:12 -07:00
kprobes.c kprobes: use KSYM_NAME_LEN to size identifier buffers 2013-11-13 12:09:26 +09:00
ksysfs.c kobject: Make support for uevent_helper optional. 2014-04-25 12:00:49 -07:00
kthread.c kthread: fix return value of kthread_create() upon SIGKILL. 2014-06-04 16:53:51 -07:00
latencytop.c kernel/latencytop.c: convert seq_printf to seq_puts 2014-06-04 16:54:15 -07:00
Makefile Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-03-31 14:13:25 -07:00
module_signing.c keys: change asymmetric keys to use common hash definitions 2013-10-25 17:15:18 -04:00
module-internal.h KEYS: Separate the kernel signature checking keyring from module signing 2013-09-25 17:17:01 +01:00
module.c Fixed one missing place for the new taint flag, and remove a warning 2014-05-01 10:35:01 -07:00
notifier.c notifier: Substitute rcu_access_pointer() for rcu_dereference_raw() 2014-02-26 06:35:13 -08:00
nsproxy.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2013-09-07 14:35:32 -07:00
padata.c padata: Fix wrong usage of rcu_dereference() 2013-12-05 21:28:42 +08:00
panic.c kernel/panic.c: display reason at end + pr_emerg 2014-04-07 16:36:08 -07:00
params.c params: improve standard definitions 2013-12-04 14:09:46 +10:30
pid_namespace.c pid_namespace: pidns_get() should check task_active_pid_ns() != NULL 2014-04-02 16:20:21 -07:00
pid.c pidns: fix free_pid() to handle the first fork failure 2013-09-30 14:31:03 -07:00
posix-cpu-timers.c posix-timers: Convert abuses of BUG_ON to WARN_ON 2013-12-09 16:56:29 +01:00
posix-timers.c
profile.c CPU hotplug notifiers registration fixes for 3.15-rc1 2014-04-07 14:55:46 -07:00
ptrace.c kernel/compat: convert to COMPAT_SYSCALL_DEFINE 2014-03-06 15:35:10 +01:00
range.c
reboot.c kernel/reboot.c: convert simple_strtoul to kstrtoint 2014-06-04 16:54:15 -07:00
relay.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-04-12 14:49:50 -07:00
res_counter.c kernel/res_counter.c: replace simple_strtoull by kstrtoull 2014-06-04 16:54:15 -07:00
resource.c resources: Clarify sanity check message 2014-05-23 10:47:21 -06:00
seccomp.c seccomp: fix memory leak on filter attach 2014-04-16 15:25:53 -04:00
signal.c signals: introduce kernel_sigaction() 2014-06-06 16:08:12 -07:00
smp.c smp: print more useful debug info upon receiving IPI on an offline CPU 2014-06-06 16:08:12 -07:00
smpboot.c
smpboot.h
softirq.c Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu 2014-05-22 11:36:10 +02:00
stacktrace.c
stop_machine.c kernel/stop_machine.c: kernel-doc warning fix 2014-06-04 16:54:15 -07:00
sys_ni.c sys_sgetmask/sys_ssetmask: add CONFIG_SGETMASK_SYSCALL 2014-06-04 16:54:14 -07:00
sys.c sched: Consolidate open coded implementations of nice level frobbing into nice_to_rlimit() and rlimit_to_nice() 2014-05-22 11:16:36 +02:00
sysctl_binary.c kernel/sysctl_binary.c: use scnprintf() instead of snprintf() 2013-11-13 12:09:33 +09:00
sysctl.c Merge branch 'x86/vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next 2014-06-05 08:05:29 -07:00
system_certificates.S KEYS: correct alignment of system_certificate_list content in assembly file 2013-12-10 18:25:28 +00:00
system_keyring.c KEYS: correct alignment of system_certificate_list content in assembly file 2013-12-10 18:25:28 +00:00
task_work.c task_work: documentation 2013-09-11 15:58:27 -07:00
taskstats.c genetlink: only pass array to genl_register_family_with_ops() 2013-11-19 16:39:05 -05:00
test_kprobes.c
time.c
timeconst.bc
timer.c timer: Prevent overflow in apply_slack 2014-04-30 13:46:17 +02:00
torture.c torture: Remove __init from torture_init_begin/end 2014-05-14 09:46:30 -07:00
tracepoint.c kernel/tracepoint.c: kernel-doc fixes 2014-06-04 16:54:15 -07:00
tsacct.c
uid16.c
up.c smp: Rename __smp_call_function_single() to smp_call_function_single_async() 2014-02-24 14:47:15 -08:00
user_namespace.c user namespace: fix incorrect memory barriers 2014-04-14 16:03:02 -07:00
user-return-notifier.c
user.c kernel/user.c: drop unused field 'files' from user_struct 2014-06-04 16:54:16 -07:00
utsname_sysctl.c kernel/utsname_sysctl.c: replace obsolete __initcall by device_initcall 2014-06-04 16:54:15 -07:00
utsname.c
watchdog.c kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write() 2014-04-18 16:40:08 -07:00
workqueue_internal.h
workqueue.c Linux 3.15-rc6 2014-05-22 10:28:56 +02:00