kernel-ark/fs
Ken Chen dee11c2364 [PATCH] aio: fix buggy put_ioctx call in aio_complete - v2
An AIO bug was reported that sleeping function is being called in softirq
context:

BUG: warning at kernel/mutex.c:132/__mutex_lock_common()
Call Trace:
     [<a000000100577b00>] __mutex_lock_slowpath+0x640/0x6c0
     [<a000000100577ba0>] mutex_lock+0x20/0x40
     [<a0000001000a25b0>] flush_workqueue+0xb0/0x1a0
     [<a00000010018c0c0>] __put_ioctx+0xc0/0x240
     [<a00000010018d470>] aio_complete+0x2f0/0x420
     [<a00000010019cc80>] finished_one_bio+0x200/0x2a0
     [<a00000010019d1c0>] dio_bio_complete+0x1c0/0x200
     [<a00000010019d260>] dio_bio_end_aio+0x60/0x80
     [<a00000010014acd0>] bio_endio+0x110/0x1c0
     [<a0000001002770e0>] __end_that_request_first+0x180/0xba0
     [<a000000100277b90>] end_that_request_chunk+0x30/0x60
     [<a0000002073c0c70>] scsi_end_request+0x50/0x300 [scsi_mod]
     [<a0000002073c1240>] scsi_io_completion+0x200/0x8a0 [scsi_mod]
     [<a0000002074729b0>] sd_rw_intr+0x330/0x860 [sd_mod]
     [<a0000002073b3ac0>] scsi_finish_command+0x100/0x1c0 [scsi_mod]
     [<a0000002073c2910>] scsi_softirq_done+0x230/0x300 [scsi_mod]
     [<a000000100277d20>] blk_done_softirq+0x160/0x1c0
     [<a000000100083e00>] __do_softirq+0x200/0x240
     [<a000000100083eb0>] do_softirq+0x70/0xc0

See report: http://marc.theaimsgroup.com/?l=linux-kernel&m=116599593200888&w=2

flush_workqueue() is not allowed to be called in the softirq context.
However, aio_complete() called from I/O interrupt can potentially call
put_ioctx with last ref count on ioctx and triggers bug.  It is simply
incorrect to perform ioctx freeing from aio_complete.

The bug is trigger-able from a race between io_destroy() and aio_complete().
A possible scenario:

cpu0                               cpu1
io_destroy                         aio_complete
  wait_for_all_aios {                __aio_put_req
     ...                                 ctx->reqs_active--;
     if (!ctx->reqs_active)
        return;
  }
  ...
  put_ioctx(ioctx)

                                     put_ioctx(ctx);
                                        __put_ioctx
                                          bam! Bug trigger!

The real problem is that the condition check of ctx->reqs_active in
wait_for_all_aios() is incorrect that access to reqs_active is not
being properly protected by spin lock.

This patch adds that protective spin lock, and at the same time removes
all duplicate ref counting for each kiocb as reqs_active is already used
as a ref count for each active ioctx.  This also ensures that buggy call
to flush_workqueue() in softirq context is eliminated.

Signed-off-by: "Ken Chen" <kenchen@google.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Suparna Bhattacharya <suparna@in.ibm.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: <stable@kernel.org>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-03 11:26:06 -08:00
..
9p [PATCH] 9p: null terminate error strings for debug print 2007-01-26 13:51:00 -08:00
adfs [PATCH] adfs: fix filename handling 2007-01-05 23:55:22 -08:00
affs [PATCH] affs: change uses of f_{dentry, vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
afs [PATCH] rename struct namespace to struct mnt_namespace 2006-12-08 08:28:51 -08:00
autofs [PATCH] autofs: change uses of f_{dentry, vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
autofs4 [PATCH] getting rid of all casts of k[cmz]alloc() calls 2006-12-13 09:05:58 -08:00
befs [PATCH] getting rid of all casts of k[cmz]alloc() calls 2006-12-13 09:05:58 -08:00
bfs [PATCH] update Tigran's email addresses 2006-12-13 09:05:53 -08:00
cifs Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 2007-01-24 09:46:54 -08:00
coda [PATCH] struct path: convert coda 2006-12-08 08:28:44 -08:00
configfs [PATCH] configfs: change uses of f_{dentry, vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
cramfs [PATCH] struct path: convert cramfs 2006-12-08 08:28:44 -08:00
debugfs DebugFS : file/directory removal fix 2006-12-13 15:38:45 -08:00
devpts [PATCH] inode-diet: Eliminate i_blksize from the inode structure 2006-09-27 08:26:18 -07:00
dlm [DLM] fix compile warning 2006-12-15 12:51:22 -05:00
ecryptfs [PATCH] ecryptfs: change uses of f_{dentry, vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
efs [PATCH] struct path: convert efs 2006-12-08 08:28:45 -08:00
exportfs [PATCH] VFS: Make filldir_t and struct kstat deal in 64-bit inode numbers 2006-10-03 08:03:40 -07:00
ext2 [PATCH] LOG2: Implement a general integer log2 facility in the kernel 2006-12-08 08:28:51 -08:00
ext3 [PATCH] LOG2: Implement a general integer log2 facility in the kernel 2006-12-08 08:28:51 -08:00
ext4 [PATCH] ext4: change uses of f_{dentry, vfsmnt} to use f_path 2006-12-08 08:28:41 -08:00
fat [PATCH] fat: change uses of f_{dentry,vfsmnt} to use f_path 2006-12-08 08:28:41 -08:00
freevxfs [PATCH] struct path: convert freevxfs 2006-12-08 08:28:45 -08:00
fuse [PATCH] fuse: fix bug in control filesystem mount 2007-01-30 08:26:45 -08:00
gfs2 [PATCH] Revert bd_mount_mutex back to a semaphore 2007-01-11 18:18:21 -08:00
hfs [PATCH] struct path: convert hfs 2006-12-08 08:28:45 -08:00
hfsplus [PATCH] struct path: convert hfsplus 2006-12-08 08:28:45 -08:00
hostfs [PATCH] uml: fix mknod 2007-01-30 08:26:44 -08:00
hpfs [PATCH] struct path: convert hpfs 2006-12-08 08:28:45 -08:00
hppfs [PATCH] struct path: convert hppfs 2006-12-08 08:28:45 -08:00
hugetlbfs VM: Remove "clear_page_dirty()" and "test_clear_page_dirty()" functions 2006-12-21 09:19:57 -08:00
isofs [PATCH] isofs: change uses of f_{dentry, vfsmnt} to use f_path 2006-12-08 08:28:41 -08:00
jbd [PATCH] jbd: wait for already submitted t_sync_datalist buffer to complete 2006-12-22 08:55:51 -08:00
jbd2 [PATCH] jbd2: wait for already submitted t_sync_datalist buffer to complete 2006-12-07 08:39:42 -08:00
jffs Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2007-01-18 10:34:51 +11:00
jffs2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2007-01-18 10:34:51 +11:00
jfs [PATCH] Fix JFS after clear_page_dirty() removal 2006-12-21 09:24:03 -08:00
lockd [PATCH] fs/lockd/clntlock.c: add missing newlines to dprintk's 2007-01-30 08:26:45 -08:00
minix [PATCH] struct path: convert minix 2006-12-08 08:28:47 -08:00
msdos [PATCH] fat: add fat_getattr() 2006-11-16 11:43:38 -08:00
ncpfs [PATCH] ncpfs: ensure we free wdog_pid on parse_option or fill_inode failure 2006-12-13 09:05:53 -08:00
nfs [PATCH] Remove warning: VFS is out of sync with lock manager 2007-01-30 16:01:35 -08:00
nfs_common [PATCH] nfs_common endianness annotations 2006-10-20 10:26:41 -07:00
nfsd [PATCH] endianness bug: ntohl() misspelled as >> 24 in fh_verify(). 2007-02-01 16:17:06 -08:00
nls [PATCH] fs: make nls_cp936.c handle some U00XY characters and U20AC correctly 2006-12-07 08:39:46 -08:00
ntfs [PATCH] ntfs: kmap_atomic() atomicity fix 2007-01-30 16:01:35 -08:00
ocfs2 [PATCH] ocfs2: fix thinko in ocfs2_backup_super_blkno() 2007-01-26 14:53:27 -08:00
openpromfs [PATCH] struct path: convert openpromfs 2006-12-08 08:28:48 -08:00
partitions [MIPS] Rename SNI_RM200_PCI to just SNI_RM preparing for more RM machines 2006-12-09 01:03:58 +00:00
proc [PATCH] procfs: Fix listing of /proc/NOT_A_TGID/task 2007-02-01 16:22:41 -08:00
qnx4 [PATCH] struct path: convert qnx4 2006-12-08 08:28:48 -08:00
ramfs [PATCH] ramfs breaks without CONFIG_BLOCK 2006-12-30 10:56:42 -08:00
reiserfs [PATCH] resierfs: avoid tail packing if an inode was ever mmapped 2007-01-23 07:52:06 -08:00
romfs [PATCH] struct path: convert romfs 2006-12-08 08:28:49 -08:00
smbfs [PATCH] smbfs: Make conn_pid a struct pid 2006-12-13 09:05:53 -08:00
sysfs [PATCH] sysfs: change uses of f_{dentry, vfsmnt} to use f_path 2006-12-08 08:28:41 -08:00
sysv [PATCH] fs/sysv/: proper prototypes for 2 functions 2006-12-22 08:55:47 -08:00
udf [PATCH] struct path: convert udf 2006-12-08 08:28:50 -08:00
ufs [PATCH] ufs: reallocation fix 2007-01-30 08:26:45 -08:00
vfat [PATCH] fat: add fat_getattr() 2006-11-16 11:43:38 -08:00
xfs [PATCH] Fix XFS after clear_page_dirty() removal 2006-12-21 10:01:08 -08:00
aio.c [PATCH] aio: fix buggy put_ioctx call in aio_complete - v2 2007-02-03 11:26:06 -08:00
attr.c
bad_inode.c [PATCH] fix memory corruption from misinterpreted bad_inode_ops return values 2007-01-05 23:55:23 -08:00
binfmt_aout.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
binfmt_elf_fdpic.c [PATCH] core-dumping unreadable binaries via PT_INTERP 2007-01-26 13:51:00 -08:00
binfmt_elf.c [PATCH] core-dumping unreadable binaries via PT_INTERP 2007-01-26 13:51:00 -08:00
binfmt_em86.c
binfmt_flat.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
binfmt_misc.c [PATCH] getting rid of all casts of k[cmz]alloc() calls 2006-12-13 09:05:58 -08:00
binfmt_script.c
binfmt_som.c [PARISC] Fix fs/binfmt_som.c 2006-10-04 06:51:26 -06:00
bio.c [PATCH] optimize o_direct on block devices 2006-12-13 09:05:50 -08:00
block_dev.c [PATCH] fix blk_direct_IO bio preparation 2007-01-23 07:52:06 -08:00
buffer.c [PATCH] Fix try_to_free_buffer() locking 2007-01-29 20:20:42 -08:00
char_dev.c [PATCH] BLOCK: Move extern declarations out of fs/*.c into header files [try #6] 2006-09-30 20:52:18 +02:00
compat_ioctl.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
compat.c [PATCH] fdtable: Make fdarray and fdsets equal in size 2006-12-10 09:57:22 -08:00
dcache.c [PATCH] dcache: avoid RCU for never-hashed dentries 2006-12-07 08:39:41 -08:00
dcookies.c [PATCH] slab: remove kmem_cache_t 2006-12-07 08:39:25 -08:00
direct-io.c [PATCH] dio: lock refcount operations 2006-12-10 09:57:21 -08:00
dnotify.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
dquot.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
drop_caches.c
eventpoll.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
exec.c [PATCH] fdtable: Make fdarray and fdsets equal in size 2006-12-10 09:57:22 -08:00
fcntl.c [PATCH] fdtable: Make fdarray and fdsets equal in size 2006-12-10 09:57:22 -08:00
fifo.c
file_table.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
file.c [PATCH] fdtable: Provide free_fdtable() wrapper 2006-12-22 08:55:50 -08:00
filesystems.c [PATCH] Ban register_filesystem(NULL); 2006-09-29 09:18:20 -07:00
fs-writeback.c Write back inode data pages even when the inode itself is locked 2007-01-26 12:53:20 -08:00
generic_acl.c [PATCH] Generic infrastructure for acls 2006-09-29 09:18:24 -07:00
inode.c [PATCH] relative atime 2006-12-13 09:05:50 -08:00
inotify_user.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
inotify.c [PATCH] severing fs.h, radix-tree.h -> sched.h 2006-12-04 02:00:24 -05:00
internal.h [PATCH] CONFIG_BLOCK internal.h cleanups 2006-09-30 20:52:32 +02:00
ioctl.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
ioprio.c [PATCH] block layer: ioprio_best function fix 2006-10-12 15:09:51 +02:00
Kconfig [PATCH] Make JFFS depend on CONFIG_BROKEN 2006-12-22 08:55:48 -08:00
Kconfig.binfmt
libfs.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
locks.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
Makefile [PATCH] fsstack: Introduce fsstack_copy_{attr,inode}_* 2006-12-08 08:28:40 -08:00
mbcache.c [PATCH] slab: remove kmem_cache_t 2006-12-07 08:39:25 -08:00
mpage.c [PATCH] BLOCK: Dissociate generic_writepages() from mpage stuff [try #6] 2006-09-30 20:52:26 +02:00
namei.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
namespace.c [PATCH] relative atime 2006-12-13 09:05:50 -08:00
nfsctl.c Remove obsolete #include <linux/config.h> 2006-06-30 19:25:36 +02:00
no-block.c [PATCH] BLOCK: Make it possible to disable the block layer [try #6] 2006-09-30 20:52:31 +02:00
open.c [PATCH] fdtable: Make fdarray and fdsets equal in size 2006-12-10 09:57:22 -08:00
pipe.c [PATCH] fix leaks on pipe(2) failure exits 2006-12-21 00:16:03 -08:00
pnode.c [PATCH] rename struct namespace to struct mnt_namespace 2006-12-08 08:28:51 -08:00
pnode.h [PATCH] rename struct namespace to struct mnt_namespace 2006-12-08 08:28:51 -08:00
posix_acl.c [PATCH] kmemdup: some users 2006-10-01 00:39:19 -07:00
quota_v1.c
quota_v2.c
quota.c [PATCH] BLOCK: Make it possible to disable the block layer [try #6] 2006-09-30 20:52:31 +02:00
read_write.c [PATCH] one more EXPORT_UNUSED_SYMBOL removal 2006-12-13 09:05:53 -08:00
read_write.h [PATCH] Remove readv/writev methods and use aio_read/aio_write instead 2006-10-01 00:39:28 -07:00
readdir.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
select.c [PATCH] fdtable: Make fdarray and fdsets equal in size 2006-12-10 09:57:22 -08:00
seq_file.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
splice.c [PATCH] constify pipe_buf_operations 2006-12-13 09:05:47 -08:00
stack.c [PATCH] fsstack: Remove inode copy 2006-12-22 08:55:48 -08:00
stat.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
super.c [PATCH] Revert bd_mount_mutex back to a semaphore 2007-01-11 18:18:21 -08:00
sync.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
utimes.c [PATCH] severing fs.h, radix-tree.h -> sched.h 2006-12-04 02:00:24 -05:00
xattr_acl.c
xattr.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00