kernel-ark/drivers
Mikulas Patocka 7c5f78b9d7 dm snapshot: fix primary_pe race
Fix a race condition with primary_pe ref_count handling.

put_pending_exception runs under dm_snapshot->lock. It calls
atomic_dec_and_test on primary_pe->ref_count and later calls atomic_read on
primary_pe->ref_count.

__origin_write does atomic_dec_and_test on primary_pe->ref_count without holding
dm_snapshot->lock.

This opens the following race condition.
Assume two CPUs: CPU1 is executing put_pending_exception (and holding
dm_snapshot->lock) while CPU2 is executing __origin_write in parallel.
Initially primary_pe->ref_count == 2.

CPU1:
if (primary_pe && atomic_dec_and_test(&primary_pe->ref_count))
	origin_bios = bio_list_get(&primary_pe->origin_bios);
... decrements primary_pe->ref_count to 1, so it does not load origin_bios.

CPU2:
if (first && atomic_dec_and_test(&primary_pe->ref_count)) {
	flush_bios(bio_list_get(&primary_pe->origin_bios));
	free_pending_exception(primary_pe);
	/* If we got here, pe_queue is necessarily empty. */
	return r;
}
... decrements primary_pe->ref_count to 0, submits pending bios, frees
primary_pe.

CPU1:
if (!primary_pe || primary_pe != pe)
	free_pending_exception(pe);
... this has no effect.
if (primary_pe && !atomic_read(&primary_pe->ref_count))
	free_pending_exception(primary_pe);
... sees ref_count == 0 (written by CPU2) and frees primary_pe a second time:
a double free.
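
The interleaving above can be replayed on a single thread to count how often
the free path is reached. This is a minimal userspace sketch using C11 atomics,
not the kernel code: struct pe, dec_and_test, replay_race and the free_calls
counter are hypothetical stand-ins, and the sketch counts free calls instead of
actually freeing memory.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the pending exception; not the kernel struct. */
struct pe {
	atomic_int ref_count;
	int free_calls;	/* counts free attempts instead of really freeing */
};

/* Mimics the semantics of atomic_dec_and_test(): true if the new value is 0. */
static int dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) - 1 == 0;
}

/* Replays the CPU1/CPU2 steps in the order shown; returns the free count. */
static int replay_race(void)
{
	struct pe primary_pe = { .free_calls = 0 };

	atomic_init(&primary_pe.ref_count, 2);

	/* CPU1: 2 -> 1, not the last reference, so nothing is freed yet. */
	if (dec_and_test(&primary_pe.ref_count))
		primary_pe.free_calls++;

	/* CPU2: 1 -> 0, last reference, so CPU2 frees primary_pe. */
	if (dec_and_test(&primary_pe.ref_count))
		primary_pe.free_calls++;

	/* CPU1: re-reads the counter after its decrement, sees 0, frees again. */
	if (!atomic_load(&primary_pe.ref_count))
		primary_pe.free_calls++;

	return primary_pe.free_calls;
}
```

Under these assumptions replay_race() returns 2, which corresponds to the
double free described above.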

This bug can happen only if someone is simultaneously writing to both the
origin and the snapshot.

If someone is writing only to the origin, __origin_write submits the kcopyd
request only after it decrements primary_pe->ref_count, so a finished copy
cannot race with the decrement of primary_pe->ref_count.

If someone is writing only to the snapshot, __origin_write isn't invoked at all
and the race can't happen.

The race happens when someone writes to the snapshot: this creates a
pending_exception with primary_pe == NULL and starts copying. Then someone
writes to the same chunk in the origin, and __origin_write races with the
completion of the already submitted request in pending_complete (which calls
put_pending_exception).

This race may be the reason for these bugs:
  http://bugzilla.kernel.org/show_bug.cgi?id=11636
  https://bugzilla.redhat.com/show_bug.cgi?id=465825

The patch fixes the code to make sure that:
1. If atomic_dec_and_test(&primary_pe->ref_count) returns false, the process
must no longer dereference primary_pe (because someone else may free it under
us).
2. If atomic_dec_and_test(&primary_pe->ref_count) returns true, the process
is responsible for freeing primary_pe.
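
The two rules can be sketched in userspace C as follows. This is an
illustration under stated assumptions, not the patch itself: struct pe,
free_pe, put_ref, drop_two_refs and the frees counter are hypothetical names,
and C11 atomic_fetch_sub stands in for the kernel's atomic_dec_and_test.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the pending exception; not the kernel struct. */
struct pe {
	atomic_int ref_count;
};

static int frees;	/* number of times free_pe() has run */

static void free_pe(struct pe *p)
{
	free(p);
	frees++;
}

/* Drops one reference. The decrement result is the only thing consulted. */
static void put_ref(struct pe *p)
{
	/* atomic_fetch_sub returns the old value; old == 1 means we hit 0. */
	if (atomic_fetch_sub(&p->ref_count, 1) == 1) {
		/* Rule 2: whoever drops the last reference frees the object. */
		free_pe(p);
	}
	/* Rule 1: otherwise p may already be gone; do not touch it again. */
}

/* Takes two references, drops both, and returns how many frees happened. */
static int drop_two_refs(void)
{
	struct pe *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	atomic_init(&p->ref_count, 2);
	frees = 0;
	put_ref(p);	/* 2 -> 1: no free, p not touched afterwards */
	put_ref(p);	/* 1 -> 0: this caller frees p */
	return frees;
}
```

With this shape there is no later re-read of ref_count, so exactly one caller
frees the object; drop_two_refs() returns 1 in this sketch.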

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: stable@kernel.org
2008-10-21 17:44:51 +01:00
accessibility
acpi x86: sysfs: kill owner field from attribute 2008-10-20 08:52:42 -07:00
amba
ata
atm
auxdisplay
base memory_probe: fix wrong sysfs file attribute 2008-10-20 08:52:32 -07:00
block x86: sysfs: kill owner field from attribute 2008-10-20 08:52:42 -07:00
bluetooth
cdrom
char Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6 2008-10-20 14:40:31 -07:00
clocksource Merge branches 'timers/clocksource', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/debug' into v28-timers-for-linus 2008-10-20 13:14:06 +02:00
connector
cpufreq
cpuidle
crypto
dca device create: misc: convert device_create_drvdata to device_create 2008-10-16 09:24:43 -07:00
dio
dma Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx 2008-10-20 12:54:30 -07:00
edac edac cell: fix incorrect edac_mode 2008-10-20 08:52:40 -07:00
eisa
firewire firewire: fix ioctl() return code 2008-10-15 22:21:10 +02:00
firmware x86: sysfs: kill owner field from attribute 2008-10-20 08:52:42 -07:00
gpio Merge branch 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-10-20 13:23:01 -07:00
gpu Export tiny shmem_file_setup for DRM-GEM 2008-10-20 16:17:42 -07:00
hid USB: remove warn macro from HID core 2008-10-17 14:41:09 -07:00
hwmon hwmon: applesmc: lighter wait mechanism, drastic improvement 2008-10-20 08:52:35 -07:00
i2c PCI: Check dynids driver_data value for validity 2008-10-20 10:48:35 -07:00
ide Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 2008-10-20 13:12:39 -07:00
ieee1394 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 2008-10-16 15:02:24 -07:00
infiniband x86: sysfs: kill owner field from attribute 2008-10-20 08:52:42 -07:00
input Merge branch 'for-next' of git://git.o-hand.com/linux-mfd 2008-10-20 09:22:47 -07:00
isdn device create: misc: convert device_create_drvdata to device_create 2008-10-16 09:24:43 -07:00
leds Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 2008-10-20 13:12:39 -07:00
lguest
macintosh device create: misc: convert device_create_drvdata to device_create 2008-10-16 09:24:43 -07:00
mca
md dm snapshot: fix primary_pe race 2008-10-21 17:44:51 +01:00
media byteorder: remove direct includes of linux/byteorder/swab[b].h 2008-10-20 08:52:40 -07:00
memstick x86: sysfs: kill owner field from attribute 2008-10-20 08:52:42 -07:00
message i2o: Fix 32/64bit DMA locking 2008-10-16 11:21:38 -07:00
mfd Merge branch 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-10-20 13:23:01 -07:00
misc HP-WMI: additional keycode (or typo) 2008-10-20 08:52:34 -07:00
mmc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc 2008-10-20 10:17:42 -07:00
mtd Merge git://git.infradead.org/mtd-2.6 2008-10-20 09:03:12 -07:00
net misc: replace remaining __FUNCTION__ with __func__ 2008-10-20 16:17:42 -07:00
nubus nubus: fix mis-indented statement 2008-10-16 11:21:30 -07:00
of
oprofile
parisc Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6 2008-10-20 14:40:31 -07:00
parport parport: remove CVS keywords 2008-10-16 11:21:49 -07:00
pci Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2008-10-20 13:40:47 -07:00
pcmcia Merge branch 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-10-20 13:23:01 -07:00
pnp {pci,pnp} quirks.c: don't use deprecated print_fn_descriptor_symbol() 2008-10-16 16:11:43 -07:00
power Merge git://git.infradead.org/battery-2.6 2008-10-20 09:44:30 -07:00
ps3 ps3: Add passthru support for non-audio streams 2008-10-20 08:05:15 +02:00
rapidio
regulator
rtc Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6 2008-10-20 14:40:31 -07:00
s390 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block 2008-10-17 09:29:55 -07:00
sbus
scsi Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2008-10-20 13:40:47 -07:00
serial Merge branch 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-10-20 13:23:01 -07:00
sh
sn
spi Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 2008-10-16 12:40:26 -07:00
ssb
staging Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 2008-10-20 09:09:56 -07:00
tc
telephony phonedev: remove BKL 2008-10-20 08:52:36 -07:00
thermal
uio Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6 2008-10-20 13:42:14 -07:00
usb USB: Fix unused label warnings in drivers/usb/host/ehci-hcd.c 2008-10-20 14:23:29 -07:00
video Remove empty imacfb.c file 2008-10-20 11:32:09 -07:00
virtio
w1 x86: sysfs: kill owner field from attribute 2008-10-20 08:52:42 -07:00
watchdog
xen genirq: use iterators for irq_desc loops 2008-10-16 16:53:30 +02:00
zorro
Kconfig
Makefile