kernel-ark/fs
NeilBrown 0feae5c47a [PATCH] Fix dcache race during umount
The race is that the shrink_dcache_memory shrinker could get called while a
filesystem is being unmounted, and could try to prune a dentry belonging to
that filesystem.

If it does, then it will call into iput on the inode while the dentry is
no longer able to be found by the umounting process.  If iput takes a
while, generic_shutdown_super could get all the way through
shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without
ever waiting on this particular inode.

Eventually the superblock gets freed anyway and if the iput tried to touch
it (which some filesystems certainly do), it will lose.  The promised
"Self-destruct in 5 seconds" doesn't lead to a nice day.

The race is closed by holding s_umount while calling prune_one_dentry on
someone else's dentry.  As a down_read_trylock is used,
shrink_dcache_memory will no longer try to prune the dentry of a filesystem
that is being unmounted, and unmount will not be able to start until any
such active prune_one_dentry completes.
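
The trylock rule described above can be illustrated with a small userspace
model. This is only a sketch, not the kernel code: every toy_* name is
hypothetical, the lock is a simple C11 atomic counter standing in for the
s_umount rw-semaphore, and the "unmount" side uses a trylock here purely so
the example never blocks (the real unmount path waits for readers to finish).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy reader/writer lock (hypothetical): count >= 0 is the number of
 * readers, -1 means a writer (the "unmount") holds it. */
struct toy_rwsem {
	atomic_int count;
};

/* Toy stand-in for a superblock; only the fields the sketch needs. */
struct toy_super {
	struct toy_rwsem s_umount;
	int pruned;		/* how many dentries were pruned */
};

void toy_super_init(struct toy_super *sb)
{
	atomic_init(&sb->s_umount.count, 0);
	sb->pruned = 0;
}

/* Take the lock for reading unless a writer holds it; never blocks. */
bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	int old = atomic_load(&sem->count);

	while (old >= 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&sem->count, &old, old + 1))
			return true;
	}
	return false;
}

void toy_up_read(struct toy_rwsem *sem)
{
	int prev = atomic_fetch_sub(&sem->count, 1);

	assert(prev > 0);	/* must have been read-locked */
}

/* Writer side: succeeds only when there are no readers and no writer. */
bool toy_down_write_trylock(struct toy_rwsem *sem)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&sem->count, &expected, -1);
}

void toy_up_write(struct toy_rwsem *sem)
{
	atomic_store(&sem->count, 0);
}

/* Prune a dentry of someone else's filesystem, but only while holding
 * s_umount for reading.  Returns false if that filesystem is busy being
 * unmounted, in which case nothing is touched. */
bool toy_prune_foreign(struct toy_super *sb)
{
	if (!toy_down_read_trylock(&sb->s_umount))
		return false;
	sb->pruned++;		/* stands in for prune_one_dentry() */
	toy_up_read(&sb->s_umount);
	return true;
}
```

In this model a prune attempt made while the writer side is held simply
returns false and leaves the superblock alone, and the writer cannot take the
lock while a reader still holds it, which mirrors the two guarantees stated
above.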

This requires that prune_dcache *knows* which filesystem (if any) it is
doing the prune on behalf of so that it can be careful of other
filesystems.  shrink_dcache_memory isn't called on behalf of any
filesystem, and so is careful of everything.

shrink_dcache_anon is now passed a super_block rather than the s_anon list
out of the superblock, so it can get the s_anon list itself, and can pass
the superblock down to prune_dcache.

If prune_dcache finds a dentry that it cannot free, it leaves it where it
is (at the tail of the list) and exits, on the assumption that some other
thread will be removing that dentry soon.  To try to make sure that some
work gets done, a limited number of dentries which are untouchable are
skipped over while choosing the dentry to work on.
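
The "skip a limited number" policy can be sketched in isolation as follows.
This is a simplified model based only on the description above, not the code
in fs/dcache.c: the LRU is modelled as a plain array whose last element is the
tail, and toy_dentry, toy_pick_victim and max_skip are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry on the unused list (hypothetical). */
struct toy_dentry {
	bool untouchable;	/* e.g. its filesystem is being unmounted */
};

/* Walk from the tail (index n - 1) towards the head and return the index
 * of the first dentry that may be worked on.  At most max_skip untouchable
 * entries are stepped over; after that the scan gives up and returns -1,
 * leaving every entry where it is. */
int toy_pick_victim(const struct toy_dentry *lru, size_t n, size_t max_skip)
{
	size_t skipped = 0;

	for (size_t i = n; i-- > 0; ) {
		if (!lru[i].untouchable)
			return (int)i;
		if (skipped == max_skip)
			return -1;	/* skip budget exhausted, try later */
		skipped++;
	}
	return -1;			/* list empty or nothing usable */
}
```

Bounding the skip count keeps a single call cheap: if the usable entries are
far from the tail, the scan stops early on the assumption that another thread
will remove the blocking entries soon.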

I believe this race was first found by Kirill Korotaev.

Cc: Jan Blunck <jblunck@suse.de>
Acked-by: Kirill Korotaev <dev@openvz.org>
Cc: Olaf Hering <olh@suse.de>
Acked-by: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:57 -07:00
9p [PATCH] v9fs: signal handling fixes 2006-05-15 11:20:56 -07:00
adfs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
affs [PATCH] affs: possible null pointer dereference in affs_rename() 2006-05-26 11:55:46 -07:00
afs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
autofs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
autofs4 [PATCH] autofs4: NFY_NONE wait race fix 2006-05-15 11:20:54 -07:00
befs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
bfs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
cifs [CIFS] Pass truncate open flag through on file open in case setattr fails 2006-05-30 18:09:31 +00:00
coda [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
configfs configfs: configfs_mkdir() failed to cleanup linkage. 2006-05-17 14:38:51 -07:00
cramfs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
debugfs [PATCH] debugfs inode leak 2006-06-08 15:14:24 -07:00
devfs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
devpts [PATCH] devpts: use lib/parser.c for parsing mount options 2006-03-23 07:38:17 -08:00
efs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
exportfs [PATCH] NFS server subtree_check returns dubious value 2006-05-21 12:59:16 -07:00
ext2 [PATCH] Introduce sys_splice() system call 2006-03-30 12:28:18 -08:00
ext3 Merge git://git.infradead.org/~dwmw2/rbtree-2.6 2006-06-20 14:51:22 -07:00
fat [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
freevxfs BUG_ON() Conversion in fs/freevxfs/ 2006-04-02 13:41:02 +02:00
fuse [fuse] fix race between checking and setting file->private_data 2006-04-26 10:49:16 +02:00
hfs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
hfsplus BUG_ON() Conversion in fs/hfsplus/ 2006-04-01 01:14:43 +02:00
hostfs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
hpfs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
hppfs [PATCH] uml: __user annotations 2006-03-31 12:18:51 -08:00
hugetlbfs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
isofs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
jbd [PATCH] Make address_space_operations->invalidatepage return void 2006-03-26 08:56:55 -08:00
jffs [MTD] Remove silly MTD_WRITE/READ macros 2006-05-29 15:06:50 +02:00
jffs2 Merge git://git.infradead.org/~dwmw2/rbtree-2.6 2006-06-20 14:51:22 -07:00
jfs JFS: Fix multiple errors in metapage_releasepage 2006-05-24 07:43:38 -05:00
lockd NFS: make 2 functions static 2006-04-19 12:43:47 -04:00
minix [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
msdos [PATCH] fat: kill reserved names 2006-03-31 12:18:55 -08:00
ncpfs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
nfs NFS: remove needless check in nfs_opendir() 2006-04-19 13:06:37 -04:00
nfs_common
nfsd [PATCH] knfsd: Fix two problems that can cause rmmod nfsd to die 2006-05-23 10:35:31 -07:00
nls [PATCH] fs: Use ARRAY_SIZE macro 2006-03-24 07:33:19 -08:00
ntfs [PATCH] NTFS: Critical bug fix (affects MIPS and possibly others) 2006-06-22 15:05:55 -07:00
ocfs2 ocfs2: fix gfp mask in some file system paths 2006-05-17 14:38:49 -07:00
openpromfs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
partitions [PATCH] Driver core: add generic "subsystem" link to all devices 2006-06-21 12:40:49 -07:00
proc [PATCH] proc_loginuid_write() uses simple_strtoul() on non-terminated array 2006-06-20 05:25:24 -04:00
qnx4 [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
ramfs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
reiserfs [PATCH] Fix reiserfs deadlock 2006-04-22 09:19:53 -07:00
romfs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
smbfs [PATCH] smbfs: Fix slab corruption in samba error path 2006-05-15 11:20:56 -07:00
sysfs [PATCH] sysfs: Allow sysfs attribute files to be pollable 2006-04-14 11:41:24 -07:00
sysv BUG_ON() Conversion in fs/sysv/ 2006-04-02 13:39:21 +02:00
udf BUG_ON() Conversion in fs/udf/ 2006-04-02 13:40:13 +02:00
ufs [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
vfat [PATCH] fat: kill reserved names 2006-03-31 12:18:55 -08:00
xfs [XFS] Remove files from the build that are now unused. 2006-06-20 14:53:51 +10:00
aio.c [PATCH] use kzalloc and kcalloc in core fs code 2006-03-25 08:23:00 -08:00
attr.c
bad_inode.c [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
binfmt_aout.c
binfmt_elf_fdpic.c BUG_ON() Conversion in fs/binfmt_elf_fdpic.c 2006-03-24 18:38:48 +01:00
binfmt_elf.c [PATCH] remove steal_locks() 2006-06-22 15:05:57 -07:00
binfmt_em86.c
binfmt_flat.c [PATCH] binfmt_flat: don't check for EMFILE 2006-05-21 12:59:17 -07:00
binfmt_misc.c [PATCH] remove steal_locks() 2006-06-22 15:05:57 -07:00
binfmt_script.c
binfmt_som.c
bio.c [PATCH] Fix missing ret assignment in __bio_map_user() error path 2006-06-17 10:52:12 -07:00
block_dev.c [PATCH] Fix a race condition between ->i_mapping and iput() 2006-06-22 15:05:57 -07:00
buffer.c [PATCH] for_each_online_pgdat: renaming for_each_pgdat 2006-03-27 08:44:48 -08:00
char_dev.c [PATCH] Simplify proc/devices and fix early termination regression 2006-03-31 12:18:53 -08:00
compat_ioctl.c [PATCH] fs: Use ARRAY_SIZE macro 2006-03-24 07:33:19 -08:00
compat.c [PATCH] NFS: fix error handling on access_ok in compat_sys_nfsservctl 2006-05-21 12:59:16 -07:00
dcache.c [PATCH] Fix dcache race during umount 2006-06-22 15:05:57 -07:00
dcookies.c [PATCH] Use __read_mostly on some hot fs variables 2006-03-26 08:56:56 -08:00
direct-io.c BUG_ON() Conversion in fs/direct-io.c 2006-04-01 01:10:13 +02:00
dnotify.c [PATCH] Use __read_mostly on some hot fs variables 2006-03-26 08:56:56 -08:00
dquot.c BUG_ON() Conversion in fs/dquot.c 2006-04-02 13:36:13 +02:00
drop_caches.c
eventpoll.c [RBTREE] Update eventpoll.c to use rb_parent() accessor macro. 2006-04-21 13:17:24 +01:00
exec.c [PATCH] remove steal_locks() 2006-06-22 15:05:57 -07:00
fcntl.c BUG_ON() Conversion in fs/fcntl.c 2006-04-02 13:37:19 +02:00
fifo.c [PATCH] pipe.c/fifo.c code cleanups 2006-04-11 13:53:33 +02:00
file_table.c [PATCH] get_empty_filp tweaks, inline epoll_init_file() 2006-03-23 07:38:17 -08:00
file.c [PATCH] for_each_possible_cpu: fixes for generic part 2006-03-28 09:16:05 -08:00
filesystems.c
fs-writeback.c [PATCH] Move cond_resched() after iput() in sync_sb_inodes() 2006-03-25 08:22:56 -08:00
inode.c BUG_ON() Conversion in fs/inode.c 2006-04-02 13:38:18 +02:00
inotify_user.c [PATCH] inotify (3/5): add interfaces to kernel API 2006-06-20 05:25:18 -04:00
inotify.c [PATCH] inotify (4/5): allow watch removal from event handler 2006-06-20 05:25:19 -04:00
ioctl.c
ioprio.c
Kconfig Merge branch 'audit.b21' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current 2006-06-20 15:37:56 -07:00
Kconfig.binfmt
libfs.c [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
locks.c [PATCH] remove steal_locks() 2006-06-22 15:05:57 -07:00
Makefile [PATCH] inotify (1/5): split kernel API from userspace support 2006-06-20 05:25:17 -04:00
mbcache.c [PATCH] Typo fixes 2006-03-28 09:16:08 -08:00
mpage.c [PATCH] map multiple blocks for mpage_readpages() 2006-03-26 08:57:01 -08:00
namei.c [PATCH] log more info for directory entry change events 2006-06-20 05:25:28 -04:00
namespace.c [PATCH] revert "vfs: propagate mnt_flags into do_loopback/vfsmount" 2006-05-15 11:20:57 -07:00
nfsctl.c [PATCH] fs: Use ARRAY_SIZE macro 2006-03-24 07:33:19 -08:00
open.c [PATCH] log more info for directory entry change events 2006-06-20 05:25:28 -04:00
pipe.c [PATCH] vmsplice: restrict stealing a little more 2006-05-02 15:29:57 +02:00
pnode.c [PATCH] s/;;/;/g 2006-03-24 07:33:24 -08:00
pnode.h
posix_acl.c
quota_v1.c
quota_v2.c [PATCH] sem2mutex: quota 2006-03-23 07:38:11 -08:00
quota.c [PATCH] sem2mutex: quota 2006-03-23 07:38:11 -08:00
read_write.c [PATCH] splice: unlikely() optimizations 2006-04-11 13:56:09 +02:00
readdir.c
select.c [PATCH] select: don't overflow if (SELECT_STACK_ALLOC % sizeof(long) != 0) 2006-04-11 06:18:41 -07:00
seq_file.c [PATCH] sem2mutex: fs/seq_file.c 2006-03-23 07:38:12 -08:00
splice.c [PATCH] splice: redo page lookup if add_to_page_cache() returns -EEXIST 2006-05-04 06:55:12 +02:00
stat.c [PATCH] powerpc: Wire up *at syscalls 2006-04-28 21:04:59 +10:00
super.c [PATCH] Fix dcache race during umount 2006-06-22 15:05:57 -07:00
sync.c [PATCH] sync_file_range(): use unsigned for flags 2006-04-11 06:18:40 -07:00
xattr_acl.c
xattr.c [PATCH] log more info for directory entry change events 2006-06-20 05:25:28 -04:00