930d332a23
dm-mirror has a potential data corruption problem: the on-disk log can show that all disk contents are in sync while the actual contents of the disks are not synchronized. The problem occurs if the initial recovery (syncing) is interrupted and resumed. The attached patch fixes this problem.

Background: rh_dec() changes the region state from RH_NOSYNC (out-of-sync) to RH_CLEAN (in-sync), which results in the corresponding bit of clean_bits being set. This is harmful if an on-disk log is used and the map is removed or suspended before the initial sync has completed. clean_bits is written to the on-disk log when the map is removed and, upon resume, it is read back and copied to sync_bits. Since the recovery process consults sync_bits to find regions to recover, a region whose state was changed from RH_NOSYNC to RH_CLEAN is never recovered.

If you haven't applied dm-raid1-read-balancing.patch, proposed on dm-devel some time ago, the contents of the mirrored disk are silently corrupted. If you have, balanced reads may get bogus data from out-of-sync disks.

Description: Keep the RH_NOSYNC state unchanged when I/O on the region completes. The RH_NOSYNC region is changed to RH_RECOVERING when recovery starts on the region and is reclaimed when the recovery completes, so the patch does not leak the region hash entry.

Alasdair said: I've analysed the relevant part of the state machine and I believe that the patch is correct. (Further work on this code is still needed - this patch has the side-effect of holding onto memory unnecessarily for long periods of time under certain workloads - but better that than corrupting data.)

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
||
---|---|---|
raid6test | ||
.gitignore | ||
bitmap.c | ||
dm-bio-list.h | ||
dm-bio-record.h | ||
dm-crypt.c | ||
dm-emc.c | ||
dm-exception-store.c | ||
dm-hw-handler.c | ||
dm-hw-handler.h | ||
dm-io.c | ||
dm-io.h | ||
dm-ioctl.c | ||
dm-linear.c | ||
dm-log.c | ||
dm-log.h | ||
dm-mpath.c | ||
dm-mpath.h | ||
dm-path-selector.c | ||
dm-path-selector.h | ||
dm-raid1.c | ||
dm-round-robin.c | ||
dm-snap.c | ||
dm-snap.h | ||
dm-stripe.c | ||
dm-table.c | ||
dm-target.c | ||
dm-zero.c | ||
dm.c | ||
dm.h | ||
faulty.c | ||
Kconfig | ||
kcopyd.c | ||
kcopyd.h | ||
linear.c | ||
Makefile | ||
md.c | ||
mktables.c | ||
multipath.c | ||
raid0.c | ||
raid1.c | ||
raid5.c | ||
raid6.h | ||
raid6algos.c | ||
raid6altivec.uc | ||
raid6int.uc | ||
raid6main.c | ||
raid6mmx.c | ||
raid6recov.c | ||
raid6sse1.c | ||
raid6sse2.c | ||
raid6x86.h | ||
raid10.c | ||
unroll.pl | ||
xor.c |
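
The failure described in the commit message at the top of this listing can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code in dm-raid1.c: the `RH_*` state names, `clean_bits` and `sync_bits` come from the commit message, while `struct mirror_model`, the `model_*` helpers, `NR_REGIONS` and `stale_regions_after_resume()` are hypothetical names introduced for illustration. The `patched` flag stands in for the behaviour the commit describes (leave RH_NOSYNC unchanged when I/O completes).

```c
#include <stdbool.h>

/* Region states named in the commit message. */
enum region_state { RH_CLEAN, RH_DIRTY, RH_NOSYNC, RH_RECOVERING };

#define NR_REGIONS 4 /* hypothetical, kept small for illustration */

/* Hypothetical userspace model; not the kernel's region hash. */
struct mirror_model {
    enum region_state state[NR_REGIONS];
    int pending[NR_REGIONS];     /* in-flight writes per region */
    bool clean_bits[NR_REGIONS]; /* what gets written to the on-disk log */
    bool sync_bits[NR_REGIONS];  /* what recovery consults after resume */
    bool recovered[NR_REGIONS];  /* ground truth: was the region copied? */
};

static void model_init(struct mirror_model *m)
{
    for (int r = 0; r < NR_REGIONS; r++) {
        m->state[r] = RH_NOSYNC; /* nothing is in sync before initial recovery */
        m->pending[r] = 0;
        m->clean_bits[r] = false;
        m->sync_bits[r] = false;
        m->recovered[r] = false;
    }
}

/* A write starts: a clean region becomes dirty, other states are kept. */
static void model_inc(struct mirror_model *m, int r)
{
    if (m->state[r] == RH_CLEAN) {
        m->state[r] = RH_DIRTY;
        m->clean_bits[r] = false;
    }
    m->pending[r]++;
}

/* A write completes. `patched` selects the behaviour described by the fix. */
static void model_dec(struct mirror_model *m, int r, bool patched)
{
    if (--m->pending[r] > 0)
        return;                  /* other writes are still in flight */
    if (m->state[r] == RH_RECOVERING)
        return;                  /* left to the recovery path */
    if (patched && m->state[r] != RH_DIRTY)
        return;                  /* RH_NOSYNC stays RH_NOSYNC */
    m->state[r] = RH_CLEAN;      /* unpatched: RH_NOSYNC also lands here */
    m->clean_bits[r] = true;
}

/* Recovery copies one region between mirrors and marks it in sync. */
static void model_recover(struct mirror_model *m, int r)
{
    m->state[r] = RH_RECOVERING;
    m->recovered[r] = true;
    m->state[r] = RH_CLEAN;
    m->clean_bits[r] = true;
}

/* Suspend writes clean_bits to the log; resume copies them to sync_bits. */
static void model_suspend_resume(struct mirror_model *m)
{
    for (int r = 0; r < NR_REGIONS; r++) {
        m->sync_bits[r] = m->clean_bits[r];
        m->state[r] = m->sync_bits[r] ? RH_CLEAN : RH_NOSYNC;
        m->pending[r] = 0;
    }
}

/* Returns how many regions recovery would skip although never copied. */
int stale_regions_after_resume(bool patched)
{
    struct mirror_model m;
    int stale = 0;

    model_init(&m);
    model_recover(&m, 0);        /* initial sync gets through region 0 */
    model_inc(&m, 1);            /* write to a not-yet-synced region */
    model_dec(&m, 1, patched);
    model_suspend_resume(&m);    /* initial sync is interrupted here */

    for (int r = 0; r < NR_REGIONS; r++)
        if (m.sync_bits[r] && !m.recovered[r])
            stale++;
    return stale;
}
```

In this model, `stale_regions_after_resume(false)` returns 1 (region 1 is marked in sync without ever being copied) and `stale_regions_after_resume(true)` returns 0. The model collapses the kernel's deferred log update into an immediate `clean_bits` write, which is a simplification chosen to keep the state transitions visible.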