kernel-ark/fs
Frederic Weisbecker 48f6ba5e69 kill-the-bkl/reiserfs: fix reiserfs lock to cpu_add_remove_lock dependency
While creating the reiserfs workqueue during the journal
initialization, we are holding the reiserfs lock, but
create_workqueue() also holds the cpu_add_remove_lock, creating
then the following dependency:

- reiserfs lock -> cpu_add_remove_lock

But we also have the following existing dependencies:

- mm->mmap_sem -> reiserfs lock
- cpu_add_remove_lock -> cpu_hotplug.lock -> slub_lock -> sysfs_mutex

The merged dependency chain then becomes:

- mm->mmap_sem -> reiserfs lock -> cpu_add_remove_lock ->
	cpu_hotplug.lock -> slub_lock -> sysfs_mutex

But when we fill a dir entry in sysfs_readir(), we are holding the
sysfs_mutex and we also might fault while copying the directory entry
to the user, leading to the following dependency:

- sysfs_mutex -> mm->mmap_sem

The end result is then a lock inversion between sysfs_mutex and
mm->mmap_sem, as reported in the following lockdep warning:

[ INFO: possible circular locking dependency detected ]
2.6.31-07095-g25a3912 #4
-------------------------------------------------------
udevadm/790 is trying to acquire lock:
 (&mm->mmap_sem){++++++}, at: [<c1098942>] might_fault+0x72/0xc0

but task is already holding lock:
 (sysfs_mutex){+.+.+.}, at: [<c110813c>] sysfs_readdir+0x7c/0x260

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #5 (sysfs_mutex){+.+.+.}:
      [...]

-> #4 (slub_lock){+++++.}:
      [...]

-> #3 (cpu_hotplug.lock){+.+.+.}:
      [...]

-> #2 (cpu_add_remove_lock){+.+.+.}:
      [...]

-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
      [...]

-> #0 (&mm->mmap_sem){++++++}:
      [...]

This can be fixed by relaxing the reiserfs lock while creating the
workqueue.
This is fine to relax the lock here, we just keep it around to pass
through reiserfs lock checks and for paranoid reasons.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
2009-10-05 16:31:37 +02:00
..
9p 9p: remove unnecessary v9fses->options which duplicates the mount string 2009-08-17 16:42:28 -05:00
adfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
affs affs: add ->sync_fs 2009-06-11 21:36:14 -04:00
afs AFS: Stop readlink() on AFS crashing due to NULL 'file' ptr 2009-08-27 12:22:08 -07:00
autofs switch follow_down() 2009-06-11 21:36:01 -04:00
autofs4 autofs4 - fix missed case when changing to use struct path 2009-08-31 17:44:05 -10:00
befs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-06-17 08:46:57 -07:00
bfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
btrfs btrfs: fix inode rbtree corruption 2009-08-21 10:09:44 +02:00
cachefiles enforce ->sync_fs is only called for rw superblock 2009-06-11 21:36:06 -04:00
cifs [CIFS] Update readme to reflect forceuid mount parms 2009-08-04 03:53:28 +00:00
coda splice: implement default splice_read method 2009-05-11 14:13:10 +02:00
configfs configfs: Rework configfs_depend_item() locking and make lockdep happy 2009-04-30 10:48:26 -07:00
cramfs
debugfs debugfs: use specified mode to possibly mark files read/write only 2009-06-15 21:30:28 -07:00
devpts devpts: remove module-related code 2009-06-24 08:15:24 -04:00
dlm dlm: free socket in error exit path 2009-07-14 12:28:43 -05:00
ecryptfs eCryptfs: parse_tag_3_packet check tag 3 packet encrypted key size 2009-07-28 14:26:06 -07:00
efs get rid of BKL in fs/efs 2009-06-17 00:36:36 -04:00
exofs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
exportfs
ext2 ext2: fix unbalanced kmap()/kunmap() 2009-09-05 13:41:08 -07:00
ext3 ext3: Improve error message that changing journaling mode on remount is not possible 2009-08-24 16:48:45 +02:00
ext4 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2009-07-13 16:39:25 -07:00
fat headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
freevxfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
fscache FS-Cache: Fixup renamed filenames in comments in internal.h 2009-05-27 10:20:13 -07:00
fuse Revert "fuse: Fix build error" as unnecessary 2009-07-11 11:22:34 -07:00
gfs2 GFS2: Fix permissions on "recover" file 2009-08-14 14:04:46 +01:00
hfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
hfsplus headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
hostfs hostfs: set maximum filesize in superblock for proper LFS support 2009-06-30 18:56:03 -07:00
hpfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
hppfs
hugetlbfs mm: fix hugetlb bug due to user_shm_unlock call 2009-08-24 12:53:01 -07:00
isofs isofs: fix Joliet regression 2009-07-10 19:18:59 -07:00
jbd jbd: fix race between write_metadata_buffer and get_write_access 2009-07-21 11:54:42 +02:00
jbd2 jbd2: fix race between write_metadata_buffer and get_write_access 2009-07-13 17:55:35 -04:00
jffs2 JFFS2: add missing verify buffer allocation/deallocation 2009-09-03 15:01:34 +01:00
jfs jfs: Fix early release of acl in jfs_get_acl 2009-07-23 11:08:36 -05:00
lockd headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
minix Making fs/minix/minix.h double including safe 2009-06-22 11:34:42 -07:00
ncpfs NLS: update handling of Unicode 2009-06-15 21:44:43 -07:00
nfs NFSv4: Fix an infinite looping problem with the nfs4_state_manager 2009-08-24 16:28:42 -07:00
nfs_common
nfsd headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
nilfs2 nilfs2: fix preempt count underflow in nilfs_btnode_prepare_change_key 2009-08-31 12:03:06 +09:00
nls NLS: update handling of Unicode 2009-06-15 21:44:43 -07:00
notify inotify: update the group mask on mark addition 2009-08-28 12:51:14 -04:00
ntfs ntfs: use is_power_of_2() function for clarity. 2009-06-16 19:47:48 -07:00
ocfs2 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 2009-09-05 13:38:37 -07:00
omfs switch omfs to simple_fsync() 2009-06-11 21:36:13 -04:00
openpromfs
partitions partitions: fix broken uevent_suppress conversion 2009-07-12 13:02:09 -07:00
proc mm: revert "oom: move oom_adj value" 2009-08-18 16:31:13 -07:00
qnx4 fs/qnx4: sanitize includes 2009-06-11 21:36:12 -04:00
quota quota: Silence lockdep on quota_on 2009-07-30 17:31:23 +02:00
ramfs fs/ramfs/file-nommu.c needs include/linux/sched.h 2009-07-29 19:10:36 -07:00
reiserfs kill-the-bkl/reiserfs: fix reiserfs lock to cpu_add_remove_lock dependency 2009-10-05 16:31:37 +02:00
romfs ROMFS: romfs_dev_read() error ignored 2009-05-09 10:49:41 -04:00
smbfs push BKL down into ->put_super 2009-06-11 21:36:07 -04:00
squashfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
sysfs sysfs: fix hardlink count on device_move 2009-07-28 13:45:21 -07:00
sysv get rid of BKL in fs/sysv 2009-06-17 00:36:37 -04:00
ubifs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
udf udf: Fix loading of VAT inode when drive wrongly reports number of recorded blocks 2009-07-30 17:28:26 +02:00
ufs ufs: sector_t cannot be negative 2009-06-18 13:03:46 -07:00
xfs xfs: actually enable the swapext compat handler 2009-09-01 17:00:46 -05:00
aio.c eventfd: revised interface and cleanups 2009-06-30 18:55:58 -07:00
anon_inodes.c fs: Provide empty .set_page_dirty() aop for anon inodes 2009-06-18 14:46:10 +02:00
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c elf_core_dump: use rcu_read_lock() to access ->real_parent 2009-06-18 13:03:52 -07:00
binfmt_elf.c elf: fix one check-after-use 2009-07-01 11:14:28 -07:00
binfmt_em86.c
binfmt_flat.c flat: fix uninitialized ptr with shared libs 2009-08-07 10:39:57 -07:00
binfmt_misc.c
binfmt_script.c
binfmt_som.c
bio-integrity.c block: Create bip slabs with embedded integrity vectors 2009-07-01 10:56:25 +02:00
bio.c block: fix sg SG_DXFER_TO_FROM_DEV regression 2009-07-10 20:31:53 +02:00
block_dev.c PM / Hibernate: Replace bdget call with simple atomic_inc of i_count 2009-07-29 21:07:55 +02:00
buffer.c Re-introduce page mapping check in mark_buffer_dirty() 2009-08-21 17:40:08 -07:00
char_dev.c headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
compat_binfmt_elf.c
compat_ioctl.c compat_ioctl: hook up compat handler for FIEMAP ioctl 2009-08-07 10:39:56 -07:00
compat.c exec: do not sleep in TASK_TRACED under ->cred_guard_mutex 2009-09-05 11:30:42 -07:00
dcache.c dcache: extrace and use d_unlinked() 2009-06-11 21:36:06 -04:00
dcookies.c
direct-io.c block: Do away with the notion of hardsect_size 2009-05-22 23:22:54 +02:00
drop_caches.c mm: remove __invalidate_mapping_pages variant 2009-06-16 19:47:43 -07:00
eventfd.c eventfd: revised interface and cleanups 2009-06-30 18:55:58 -07:00
eventpoll.c epoll: fix nested calls support 2009-06-18 13:03:41 -07:00
exec.c exec: do not sleep in TASK_TRACED under ->cred_guard_mutex 2009-09-05 11:30:42 -07:00
fcntl.c headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
fifo.c
file_table.c fs: move mark_files_ro into file_table.c 2009-06-11 21:36:02 -04:00
file.c
filesystems.c fs: Mark get_filesystem_list() as __init function. 2009-04-20 23:02:52 -04:00
fs_struct.c
fs-writeback.c cleanup __writeback_single_inode 2009-06-24 08:15:26 -04:00
generic_acl.c
inode.c vfs: add __destroy_inode 2009-08-07 14:38:29 -03:00
internal.h Trim a bit of crap from fs.h 2009-06-11 21:36:07 -04:00
ioctl.c fs: Add new pre-allocation ioctls to vfs for compatibility with legacy xfs ioctls 2009-06-24 08:15:27 -04:00
ioprio.c
Kconfig fs/Kconfig: move nilfs2 out 2009-07-14 12:34:17 +09:00
Kconfig.binfmt
libfs.c vfs: make get_sb_pseudo set s_maxbytes to value that can be cast to signed 2009-08-18 16:31:12 -07:00
locks.c lockd: call locks_release_private to cleanup per-filesystem state 2009-04-24 16:36:03 -04:00
Makefile
mbcache.c
mpage.c ext4: Properly initialize the buffer_head state 2009-05-13 15:13:42 -04:00
namei.c IMA: update ima_counts_put 2009-09-07 11:54:58 +10:00
namespace.c vfs: mnt_want_write_file(): fix special file handling 2009-08-07 10:39:56 -07:00
nfsctl.c
no-block.c
open.c fs: Add new pre-allocation ioctls to vfs for compatibility with legacy xfs ioctls 2009-06-24 08:15:27 -04:00
pipe.c lockdep: Fix lockdep annotation for pipe_double_lock() 2009-07-22 21:14:14 +02:00
pnode.c
pnode.h
posix_acl.c
read_write.c splice: implement default splice_read method 2009-05-11 14:13:10 +02:00
read_write.h
readdir.c
select.c poll/select: initialize triggered field of struct poll_wqueues 2009-08-15 18:40:11 -07:00
seq_file.c seq_file: add function to write binary data 2009-06-18 13:03:57 -07:00
signalfd.c
splice.c splice: fix kmaps in default_file_splice_write() 2009-05-19 11:37:46 +02:00
stack.c
stat.c kill vfs_stat_fd / vfs_lstat_fd 2009-04-20 23:02:52 -04:00
super.c ... and the same for vfsmount id/mount group id 2009-06-24 08:15:26 -04:00
sync.c sys_sync(): fix 16% performance regression in ffsb create_4k test 2009-07-06 13:57:03 -07:00
timerfd.c
utimes.c
xattr_acl.c
xattr.c fs: introduce mnt_clone_write 2009-06-11 21:36:02 -04:00