kernel-ark/kernel
Oleg Nesterov d721304dce workqueue: fix flush_workqueue() vs CPU_DEAD race
Many thanks to Srivatsa Vaddagiri for the helpful discussion and for spotting
the bug in my previous attempt.

work->func() (and thus flush_workqueue()) must not use workqueue_mutex,
this leads to deadlock when CPU_DEAD does kthread_stop(). However without
this mutex held we can't detect CPU_DEAD in progress, which can move pending
works to another CPU while the dead one is not on cpu_online_map.

Change flush_workqueue() to use for_each_possible_cpu(). This means that
flush_cpu_workqueue() may hit CPU which is already dead. However in that
case

	!list_empty(&cwq->worklist) || cwq->current_work != NULL

means that CPU_DEAD in progress, it will do kthread_stop() + take_over_work()
so we can proceed and insert a barrier. We hold cwq->lock, so we are safe.

Also, add migrate_sequence incremented by take_over_work() under cwq->lock.
If take_over_work() happened before we checked this CPU, we should see the
new value after spin_unlock().

Further possible changes:

	remove CPU_DEAD handling (along with take_over_work, migrate_sequence)
	from workqueue.c. CPU_DEAD just sets cwq->please_exit_after_flush flag.

	CPU_UP_PREPARE->create_workqueue_thread() clears this flag, and creates
	the new thread if cwq->thread == NULL.

This way the workqueue/cpu-hotplug interaction is almost zero, workqueue_mutex
just protects "workqueues" list, CPU_LOCK_ACQUIRE/CPU_LOCK_RELEASE go away.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
Cc: Gautham shenoy <ego@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:52 -07:00
..
irq Fix Linuxdoc comment 2007-05-09 12:30:48 -07:00
power PM: Separate hibernation code from suspend code 2007-05-09 12:30:48 -07:00
time Fix printk format warnings in timer_list.c 2007-05-09 12:30:50 -07:00
.gitignore
acct.c
audit.c
audit.h
auditfilter.c
auditsc.c
capability.c
compat.c
configs.c use simple_read_from_buffer in kernel/ 2007-05-09 12:30:49 -07:00
cpu.c call cpu_chain with CPU_DOWN_FAILED if CPU_DOWN_PREPARE failed 2007-05-09 12:30:51 -07:00
cpuset.c use simple_read_from_buffer in kernel/ 2007-05-09 12:30:49 -07:00
delayacct.c
die_notifier.c
dma.c
exec_domain.c
exit.c
extable.c
fork.c
futex_compat.c
futex.c
hrtimer.c
itimer.c
kallsyms.c
Kconfig.hz
Kconfig.preempt
kexec.c
kfifo.c
kmod.c
kprobes.c
ksysfs.c
kthread.c
latency.c
lockdep_internals.h
lockdep_proc.c
lockdep.c
Makefile
module.c
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
nsproxy.c
panic.c
params.c
pid.c
posix-cpu-timers.c
posix-timers.c
printk.c
profile.c
ptrace.c
rcupdate.c
rcutorture.c
relay.c relay: use plain timer instead of delayed work 2007-05-09 12:30:51 -07:00
resource.c
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rwsem.c
sched.c Eliminate lock_cpu_hotplug in kernel/schedc 2007-05-09 12:30:51 -07:00
seccomp.c
signal.c Move sig_kernel_* et al macros to linux/signal.h 2007-05-09 12:30:49 -07:00
softirq.c
softlockup.c
spinlock.c
srcu.c
stacktrace.c
stop_machine.c
sys_ni.c
sys.c Extend notifier_call_chain to count nr_calls made 2007-05-09 12:30:51 -07:00
sysctl.c
taskstats.c
time.c
timer.c
tsacct.c
uid16.c
user.c
utsname_sysctl.c
utsname.c
wait.c
workqueue.c workqueue: fix flush_workqueue() vs CPU_DEAD race 2007-05-09 12:30:52 -07:00