abb5a5cc6b
Fix ABBA deadlock between lock_cpu_hotplug() and the cpuset callback_mutex lock. It only happens on cpu_exclusive cpusets, due to the dynamic sched domain code trying to take the cpu hotplug lock inside the cpuset callback_mutex lock. This bug has apparently been here for several months, but didn't get hit until the right customer load on a large system. This fix appears right from inspection, but it will take a few more days running it on that customers workload to be confident we nailed it. We don't have any other reproducible test case. The cpu_hotplug_lock() tends to cover large runs of code. The other places that hold both that lock and the cpuset callback mutex lock always nest the cpuset lock inside the hotplug lock. This place tries to do the reverse, risking an ABBA deadlock. This is in the cpuset_rmdir() code, where we: * take the callback_mutex lock * mark the cpuset CS_REMOVED * call update_cpu_domains for cpu_exclusive cpusets * in that call, take the cpu_hotplug lock if the cpuset is marked for removal. Thanks to Jack Steiner for identifying this deadlock. The fix is to tear down the dynamic sched domain before we grab the cpuset callback_mutex lock. This way, the two locks are serialized, with the hotplug lock taken and released before trying for the cpuset lock. I suspect that this bug was introduced when I changed the cpuset locking from one lock to two. The dynamic sched domain dependency on cpu_exclusive cpusets and its hotplug hooks were added to this code earlier, when cpusets had only a single lock. It may well have been fine then. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
||
---|---|---|
.. | ||
irq | ||
power | ||
time | ||
.gitignore | ||
acct.c | ||
audit.c | ||
audit.h | ||
auditfilter.c | ||
auditsc.c | ||
capability.c | ||
compat.c | ||
configs.c | ||
cpu.c | ||
cpuset.c | ||
delayacct.c | ||
dma.c | ||
exec_domain.c | ||
exit.c | ||
extable.c | ||
fork.c | ||
futex_compat.c | ||
futex.c | ||
hrtimer.c | ||
itimer.c | ||
kallsyms.c | ||
Kconfig.hz | ||
Kconfig.preempt | ||
kexec.c | ||
kfifo.c | ||
kmod.c | ||
kprobes.c | ||
ksysfs.c | ||
kthread.c | ||
lockdep_internals.h | ||
lockdep_proc.c | ||
lockdep.c | ||
Makefile | ||
module.c | ||
mutex-debug.c | ||
mutex-debug.h | ||
mutex.c | ||
mutex.h | ||
panic.c | ||
params.c | ||
pid.c | ||
posix-cpu-timers.c | ||
posix-timers.c | ||
printk.c | ||
profile.c | ||
ptrace.c | ||
rcupdate.c | ||
rcutorture.c | ||
relay.c | ||
resource.c | ||
rtmutex_common.h | ||
rtmutex-debug.c | ||
rtmutex-debug.h | ||
rtmutex-tester.c | ||
rtmutex.c | ||
rtmutex.h | ||
rwsem.c | ||
sched.c | ||
seccomp.c | ||
signal.c | ||
softirq.c | ||
softlockup.c | ||
spinlock.c | ||
stacktrace.c | ||
stop_machine.c | ||
sys_ni.c | ||
sys.c | ||
sysctl.c | ||
taskstats.c | ||
time.c | ||
timer.c | ||
uid16.c | ||
unwind.c | ||
user.c | ||
wait.c | ||
workqueue.c |