4120db4719
Bug symptoms
~~~~~~~~~~~~

For the same inode, VFS calls read_inode() twice and doesn't call
clear_inode() between the two read_inode() invocations.

Bug description
~~~~~~~~~~~~~~~

Suppose we have an inode which has zero reference count but is still in the
inode cache. Suppose kswapd invokes shrink_icache_memory() to free some RAM.
In prune_icache() inodes are removed from i_hash. prune_icache() is then
going to call clear_inode(), but drops the inode_lock spinlock before this.
If at this moment another task calls iget() for an inode which was just
removed from i_hash by prune_icache(), then iget() invokes read_inode() for
this inode, because it is *already removed* from i_hash.

The end result is:

1. we call iget(#N) then iput(#N);
2. inode #N has zero i_count now and is in the inode cache;
3. kswapd starts; kswapd removes the inode #N from i_hash and is preempted;
4. we call iget(#N) again;
5. read_inode() is invoked as the result;
6. but we expect clear_inode() before.

Fix
~~~~~~~

To fix the bug I remove inodes from i_hash later, when clear_inode() is
actually called. I remove them from i_hash under spinlock protection. Since
the i_state is set to I_FREEING, it is safe to do this. The others will
sleep waiting for the inode state change.

I also postpone removing inodes from i_sb_list. It is not compulsory to do
so, but I do it for readability reasons. Inodes are added to and removed
from the lists together everywhere in the code, and there is no point in
changing this rule. This is harmless because the only user of i_sb_list
which somehow may interfere with me (invalidate_list()) is excluded by the
iprune_sem mutex.

The same race is possible in invalidate_list(), so I do the same for it.

Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Name | ||
---|---|---|
.. | ||
adfs | ||
affs | ||
afs | ||
autofs | ||
autofs4 | ||
befs | ||
bfs | ||
cifs | ||
coda | ||
cramfs | ||
debugfs | ||
devfs | ||
devpts | ||
efs | ||
exportfs | ||
ext2 | ||
ext3 | ||
fat | ||
freevxfs | ||
hfs | ||
hfsplus | ||
hostfs | ||
hpfs | ||
hppfs | ||
hugetlbfs | ||
isofs | ||
jbd | ||
jffs | ||
jffs2 | ||
jfs | ||
lockd | ||
minix | ||
msdos | ||
ncpfs | ||
nfs | ||
nfs_common | ||
nfsd | ||
nls | ||
ntfs | ||
openpromfs | ||
partitions | ||
proc | ||
qnx4 | ||
ramfs | ||
reiserfs | ||
romfs | ||
smbfs | ||
sysfs | ||
sysv | ||
udf | ||
ufs | ||
umsdos | ||
vfat | ||
xfs | ||
aio.c | ||
attr.c | ||
bad_inode.c | ||
binfmt_aout.c | ||
binfmt_elf_fdpic.c | ||
binfmt_elf.c | ||
binfmt_em86.c | ||
binfmt_flat.c | ||
binfmt_misc.c | ||
binfmt_script.c | ||
binfmt_som.c | ||
bio.c | ||
block_dev.c | ||
buffer.c | ||
char_dev.c | ||
compat_ioctl.c | ||
compat.c | ||
dcache.c | ||
dcookies.c | ||
direct-io.c | ||
dnotify.c | ||
dquot.c | ||
eventpoll.c | ||
exec.c | ||
fcntl.c | ||
fifo.c | ||
file_table.c | ||
file.c | ||
filesystems.c | ||
fs-writeback.c | ||
inode.c | ||
ioctl.c | ||
ioprio.c | ||
Kconfig | ||
Kconfig.binfmt | ||
libfs.c | ||
locks.c | ||
Makefile | ||
mbcache.c | ||
mpage.c | ||
namei.c | ||
namespace.c | ||
nfsctl.c | ||
open.c | ||
pipe.c | ||
posix_acl.c | ||
quota_v1.c | ||
quota_v2.c | ||
quota.c | ||
read_write.c | ||
readdir.c | ||
select.c | ||
seq_file.c | ||
stat.c | ||
super.c | ||
xattr_acl.c | ||
xattr.c |