kernel-ark/kernel
Linus Torvalds 04e2f1741d Add memory barrier semantics to wake_up() & co
Oleg Nesterov and others have pointed out that on some architectures,
the traditional sequence of

	set_current_state(TASK_INTERRUPTIBLE);
	if (CONDITION)
		return;
	schedule();

is racy wrt another CPU doing

	CONDITION = 1;
	wake_up_process(p);

because while set_current_state() has a memory barrier separating
setting of the TASK_INTERRUPTIBLE state from reading of the CONDITION
variable, there is no such memory barrier on the wakeup side.

Now, wake_up_process() does actually take a spinlock before it reads and
sets the task state on the waking side, and on x86 (and many other
architectures) that spinlock is in fact equivalent to a memory barrier,
but that is not generally guaranteed.  The write that sets CONDITION
could move into the critical region protected by the runqueue spinlock.

However, adding an smp_wmb() before taking the spinlock should now order
the write of CONDITION wrt the lock itself, which in turn is ordered wrt
the accesses within the spinlock (which include the reading of the old
state).

This should thus close the race (which probably has never been seen in
practice, but since smp_wmb() is a no-op on x86, it's not like this will
make anything worse either on the most common architecture where the
spinlock already gave the required protection).
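
The ordering argument above can be sketched in userspace C11. This is a
minimal model, not the kernel code: the names task_state, condition,
sleeper_should_block(), waker_set_and_wake() and reset_model() are
hypothetical, the runqueue spinlock is reduced to a compare-and-swap, and
a sequentially consistent fence stands in for both the barrier in
set_current_state() and the barrier added before the lock. That fence is
stronger than smp_wmb(), so the sketch shows the intended guarantee rather
than the exact instruction the patch emits.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
enum { TASK_RUNNING = 0, TASK_INTERRUPTIBLE = 1 };

atomic_int task_state = TASK_RUNNING;
atomic_int condition = 0;

/* Put the model back into its initial state. */
void reset_model(void)
{
	atomic_store(&task_state, TASK_RUNNING);
	atomic_store(&condition, 0);
}

/*
 * Sleeper side: publish the sleeping state, then test the condition.
 * Returns true if the caller should go on to block in schedule().
 */
bool sleeper_should_block(void)
{
	atomic_store_explicit(&task_state, TASK_INTERRUPTIBLE,
			      memory_order_relaxed);
	/* Models the barrier that set_current_state() already has. */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&condition, memory_order_relaxed)) {
		/* Condition already set: do not sleep. */
		atomic_store_explicit(&task_state, TASK_RUNNING,
				      memory_order_relaxed);
		return false;
	}
	return true;
}

/*
 * Waker side: set the condition, then inspect the task state.
 * Returns true if it found a sleeping task and marked it runnable.
 */
bool waker_set_and_wake(void)
{
	int expected = TASK_INTERRUPTIBLE;

	atomic_store_explicit(&condition, 1, memory_order_relaxed);
	/*
	 * Assumption: a full fence stands in for the barrier placed
	 * before the lock; it keeps the condition store from drifting
	 * past the read of the old task state.
	 */
	atomic_thread_fence(memory_order_seq_cst);
	/* Stand-in for reading and setting the state under the lock. */
	return atomic_compare_exchange_strong(&task_state, &expected,
					      TASK_RUNNING);
}
```

With a fence on both sides, at least one side must observe the other's
store: either the sleeper sees CONDITION set and returns without blocking,
or the waker sees TASK_INTERRUPTIBLE and performs the wakeup. Calling the
two functions back to back in either order exercises both outcomes.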

Acked-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 18:05:03 -08:00
irq genirq: do not leave interupts enabled on free_irq 2008-02-19 10:43:58 +01:00
power PM: Introduce PM_EVENT_HIBERNATE callback state 2008-02-23 10:40:04 -08:00
time timer_list: print relative expiry time signed 2008-02-17 17:29:38 +01:00
.gitignore
acct.c
audit_tree.c Introduce path_put() 2008-02-14 21:13:33 -08:00
audit.c d_path: Make d_path() use a struct path 2008-02-14 21:17:09 -08:00
audit.h
auditfilter.c Introduce path_put() 2008-02-14 21:13:33 -08:00
auditsc.c Audit: use == not = in if statements 2008-02-18 18:46:28 -08:00
backtracetest.c
capability.c
cgroup_debug.c Task Control Groups: simple task cgroup debug info subsystem 2007-10-19 11:53:36 -07:00
cgroup.c cgroup: remove dead code in cgroup_get_rootdir() 2008-02-23 17:13:25 -08:00
compat.c
configs.c
cpu.c
cpuset.c
delayacct.c
dma.c
exec_domain.c
exit.c Use struct path in fs_struct 2008-02-14 21:13:33 -08:00
extable.c module: Don't report discarded init pages as kernel text. 2008-01-29 17:13:18 +11:00
fork.c Use struct path in fs_struct 2008-02-14 21:13:33 -08:00
futex_compat.c futex: runtime enable pi and robust functionality 2008-02-23 17:12:15 -08:00
futex.c futex: runtime enable pi and robust functionality 2008-02-23 17:12:15 -08:00
hrtimer.c hrtimer: catch expired CLOCK_REALTIME timers early 2008-02-14 22:08:30 +01:00
itimer.c
kallsyms.c remove support for un-needed _extratext section 2008-02-06 10:41:01 -08:00
Kconfig.hz
Kconfig.preempt
kexec.c vmcoreinfo: add "VMCOREINFO_" to all the call for vmcoreinfo_append_str() 2008-02-07 08:42:25 -08:00
kfifo.c
kmod.c Dont touch fs_struct in usermodehelper 2008-02-14 21:13:32 -08:00
kprobes.c
ksysfs.c
kthread.c
latencytop.c
lockdep_internals.h
lockdep_proc.c
lockdep.c
Makefile
marker.c markers: fix sparse warnings in markers.c 2008-02-23 17:12:14 -08:00
module.c modules: do not try to add sysfs attributes if !CONFIG_SYSFS 2008-02-21 15:27:08 -08:00
mutex-debug.c
mutex-debug.h
mutex.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
mutex.h
notifier.c
ns_cgroup.c
nsproxy.c
panic.c
params.c
pid_namespace.c namespaces: cleanup the code managed with PID_NS option 2008-02-08 09:22:23 -08:00
pid.c
pm_qos_params.c
posix-cpu-timers.c
posix-timers.c hrtimer: check relative timeouts for overflow 2008-02-14 22:08:30 +01:00
printk.c
profile.c
ptrace.c
rcuclassic.c
rcupdate.c
rcupreempt_trace.c
rcupreempt.c
rcutorture.c
relay.c
res_counter.c Memory controller improve user interface 2008-02-07 08:42:18 -08:00
resource.c
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rwsem.c
sched_debug.c
sched_fair.c
sched_idletask.c
sched_rt.c
sched_stats.h
sched.c Add memory barrier semantics to wake_up() & co 2008-02-23 18:05:03 -08:00
seccomp.c
signal.c remove final fastcall users 2008-02-13 16:21:18 -08:00
softirq.c
softlockup.c
spinlock.c spinlock: lockbreak cleanup 2008-01-30 13:31:20 +01:00
srcu.c
stacktrace.c
stop_machine.c
sys_ni.c
sys.c
sysctl_check.c
sysctl.c hugetlb: fix overcommit locking 2008-02-13 16:21:18 -08:00
taskstats.c
test_kprobes.c
time.c
timeconst.pl
timer.c
tsacct.c
uid16.c
user_namespace.c namespaces: cleanup the code managed with the USER_NS option 2008-02-08 09:22:23 -08:00
user.c sched: rt-group: make rt groups scheduling configurable 2008-02-13 15:45:40 +01:00
utsname_sysctl.c
utsname.c Fix UTS corruption during clone(CLONE_NEWUTS) 2007-09-19 11:24:17 -07:00
wait.c
workqueue.c