Commit Graph

23511 Commits

Author SHA1 Message Date
Al Viro
2830ba7f34 ->permission() sanitizing: don't pass flags to generic_permission()
redundant; all callers get it duplicated in mask & MAY_NOT_BLOCK and none of
them removes that bit.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:22 -04:00
Al Viro
7e40145eb1 ->permission() sanitizing: don't pass flags to ->check_acl()
not used in the instances anymore.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:21 -04:00
Al Viro
9c2c703929 ->permission() sanitizing: pass MAY_NOT_BLOCK to ->check_acl()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:19 -04:00
Al Viro
1fc0f78ca9 ->permission() sanitizing: MAY_NOT_BLOCK
Duplicate the flags argument into mask bitmap.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:18 -04:00
Al Viro
178ea73521 kill check_acl callback of generic_permission()
its value depends only on inode and does not change; we might as
well store it in ->i_op->check_acl and be done with that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:16 -04:00
Al Viro
07b8ce1ee8 lockless get_write_access/deny_write_access
new helpers: atomic_inc_unless_negative()/atomic_dec_unless_positive()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:14 -04:00
Al Viro
f4d6ff89d8 move exec_permission() up to the rest of permission-related functions
... and convert the comment before it into linuxdoc form.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:13 -04:00
Al Viro
3bfa784a65 kill file_permission() completely
convert the last remaining caller to inode_permission()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:11 -04:00
Al Viro
1b5d783c94 consolidate BINPRM_FLAGS_ENFORCE_NONDUMP handling
new helper: would_dump(bprm, file).  Checks if we are allowed to
read the file and if we are not - sets ENFORCE_NODUMP.  Exported,
used in places that previously open-coded the same logics.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:10 -04:00
Al Viro
78f32a9b47 switch path_init() to exec_permission()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:08 -04:00
Al Viro
6f28610974 switch udf_ioctl() to inode_permission()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:07 -04:00
Al Viro
4cf27141cb make exec_permission(dir) really equivalent to inode_permission(dir, MAY_EXEC)
capability overrides apply only to the default case; if fs has ->permission()
that does _not_ call generic_permission(), we have no business doing them.
Moreover, if it has ->permission() that does call generic_permission(), we
have no need to recheck capabilities.

Besides, the capability overrides should apply only if we got EACCES from
acl_permission_check(); any other value (-EIO, etc.) should be returned
to caller, capabilities or not capabilities.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:05 -04:00
Al Viro
43e15cdbef new helper: iterate_supers_type()
Call the given function for all superblocks of given type.  Function
gets a superblock (with s_umount locked shared) and (void *) argument
supplied by caller of iterator.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:04 -04:00
Josef Bacik
44396f4b5c fs: add a DCACHE_NEED_LOOKUP flag for d_flags
Btrfs (and I'd venture most other fs's) stores its indexes in nice disk order
for readdir, but unfortunately in the case of anything that stats the files in
order that readdir spits back (like oh say ls) that means we still have to do
the normal lookup of the file, which means looking up our other index and then
looking up the inode.  What I want is a way to create dummy dentries when we
find them in readdir so that when ls or anything else subsequently does a
stat(), we already have the location information in the dentry and can go
straight to the inode itself.  The lookup stuff just assumes that if it finds a
dentry it is done, it doesn't perform a lookup.  So add a DCACHE_NEED_LOOKUP
flag so that the lookup code knows it still needs to run i_op->lookup() on the
parent to get the inode for the dentry.  I have tested this with btrfs and I
went from something that looks like this

http://people.redhat.com/jwhiter/ls-noreada.png

To this

http://people.redhat.com/jwhiter/ls-good.png

Thats a savings of 1300 seconds, or 22 minutes.  That is a significant savings.
Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:03 -04:00
Akinobu Mita
f7b88631a8 fs/libfs.c: fix simple_attr_write() on 32bit machines
Assume that /sys/kernel/debug/dummy64 is debugfs file created by
debugfs_create_x64().

	# cd /sys/kernel/debug
	# echo 0x1234567812345678 > dummy64
	# cat dummy64
	0x0000000012345678

	# echo 0x80000000 > dummy64
	# cat dummy64
	0xffffffff80000000

A value larger than INT_MAX cannot be written to the debugfs file created
by debugfs_create_u64 or debugfs_create_x64 on 32bit machine.  Because
simple_attr_write() uses simple_strtol() for the conversion.

To fix this, use simple_strtoll() instead.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-19 22:09:30 -07:00
Linus Torvalds
e501f29c72 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  vfs: fix race in rcu lookup of pruned dentry
  Fix cifs_get_root()

[ Edited the last commit to get rid of a 'unused variable "seq"'
  warning due to Al editing the patch.  - Linus ]
2011-07-19 21:50:21 -07:00
Linus Torvalds
5943026240 vfs: fix race in rcu lookup of pruned dentry
Don't update *inode in __follow_mount_rcu() until we'd verified that
there is mountpoint there.  Kudos to Hugh Dickins for catching that
one in the first place and eventually figuring out the solution (and
catching a braino in the earlier version of patch).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-19 21:49:01 -07:00
David Teigland
10d1459faf dlm: don't limit active work items
Allow multiple workqueue items (locks with callbacks) to be
processed concurrently.  There should be no reason not to
take advantage of this workqueue feature.

Signed-off-by: David Teigland <teigland@redhat.com>
2011-07-19 14:22:32 -05:00
Al Viro
fec11dd9a0 Fix cifs_get_root()
Add missing ->i_mutex, convert to lookup_one_len() instead of
(broken) open-coded analog, cope with getting something like
a//b as relative pathname.  Simplify the hell out of it, while
we are there...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
2011-07-18 13:51:58 -04:00
Linus Torvalds
d36c30181c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  hppfs_lookup(): don't open-code lookup_one_len()
  hppfs: fix dentry leak
  cramfs: get_cramfs_inode() returns ERR_PTR() on failure
  ufs should use d_splice_alias()
  fix exofs ->get_parent()
  ceph analog of cifs build_path_from_dentry() race fix
  cifs: build_path_from_dentry() race fix
2011-07-18 09:03:15 -07:00
Al Viro
0916a5e45f hppfs_lookup(): don't open-code lookup_one_len()
... and it's getting it wrong, too - missing ->d_revalidate() calls when
it's dealing with filesystem (procfs) that has non-trivial ->d_revalidate()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-17 23:22:48 -04:00
Al Viro
3cc0658e35 hppfs: fix dentry leak
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-17 23:22:17 -04:00
Al Viro
0577d1ba41 cramfs: get_cramfs_inode() returns ERR_PTR() on failure
... and we want to report these failures in ->lookup() anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-17 23:22:02 -04:00
Al Viro
642c937b4e ufs should use d_splice_alias()
it's NFS-exportable, so...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-17 23:21:35 -04:00
Al Viro
a803b8067e fix exofs ->get_parent()
NULL is not a possible return value for that method, TYVM...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-17 23:20:29 -04:00
Linus Torvalds
f560f6697f Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] update cifs to version 1.74
  [CIFS] update limit for snprintf in cifs_construct_tcon
  cifs: Fix signing failure when server mandates signing for NTLMSSP
2011-07-17 12:49:55 -07:00
Al Viro
1b71fe2efa ceph analog of cifs build_path_from_dentry() race fix
... unfortunately, cifs bug got copied.  Fix is essentially the same.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-16 23:43:58 -04:00
Al Viro
dc137bf553 cifs: build_path_from_dentry() race fix
deal with d_move() races properly; rename_lock read-retry loop,
rcu_read_lock() held while walking to root, d_lock held over
subtraction from namelen and copying the component to stabilize
->d_name.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-16 23:37:20 -04:00
David Teigland
23e8e1aaac dlm: use workqueue for callbacks
Instead of creating our own kthread (dlm_astd) to deliver
callbacks for all lockspaces, use a per-lockspace workqueue
to deliver the callbacks.  This eliminates complications and
slowdowns from many lockspaces sharing the same thread.

Signed-off-by: David Teigland <teigland@redhat.com>
2011-07-15 12:30:43 -05:00
Linus Torvalds
da1b001a2a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  fix loop checks in d_materialise_unique()
  Fix ->d_lock locking order in unlazy_walk()
2011-07-15 09:55:39 -07:00
Eric Sandeen
46fcb2ed29 GFS2: combine duplicated block freeing routines
__gfs2_free_data and __gfs2_free_meta are almost identical, and
can be trivially combined.

[This is as per Eric's original patch minus gfs2_free_data() which had
 no callers left and plus the conversion of the bmap.c calls to these
 functions. All in all, a nice clean up]

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2011-07-15 09:32:52 +01:00
Steven Whitehouse
9964afbb79 GFS2: Add S_NOSEC support
This adds S_NOSEC support to GFS2. We set/reset the flag either when
a user calls setattr or when we have just regained the glock
from another node. The flag is only set if there are no xattrs
on the inode and there is no suid bit set.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
2011-07-15 09:32:35 +01:00
Bob Peterson
7cf8dcd3b6 GFS2: Automatically adjust glock min hold time
This patch is a performance improvement for GFS2 in a clustered
environment. It makes the glock hold time self-adjusting.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2011-07-15 09:32:11 +01:00
Steven Whitehouse
17d539f049 GFS2: Cache dir hash table in a contiguous buffer
This patch adds a cache for the hash table to the directory code
in order to help simplify the way in which the hash table is
accessed. This is intended to be a first step towards introducing
some performance improvements in the directory code.

There are two follow ups that I'm hoping to see fairly shortly. One
is to simplify the hash table reading code now that we always read the
complete hash table, whether we want one entry or all of them. The
other is to introduce readahead on the heads of the hash chains
which are referred to from the table.

The hash table is a maximum of 128k in size, so it is not worth trying
to read it in small chunks.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2011-07-15 09:31:48 +01:00
Al Viro
1836750115 fix loop checks in d_materialise_unique()
Both __d_unalias() and __d_materialise_dentry() need loop prevention.
Grab rename_lock in caller, check for loops there...

As a side benefit, we have dentry_lock_for_move() called only under
rename_lock, which seriously reduces deadlock potential of the
execrable "locking order" used for ->d_lock.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-14 21:33:41 -04:00
David Teigland
883ba74f43 dlm: remove deadlock debug print
gfs2 recently began using this feature heavily,
creating more debug output than we want to see.

Signed-off-by: David Teigland <teigland@redhat.com>
2011-07-14 12:31:49 -05:00
Linus Torvalds
5dcd07b9f3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
  GFS2: Resolve inode eviction and ail list interaction bug
  GFS2: Fix race during filesystem mount
  GFS2: force a log flush when invalidating the rindex glock
2011-07-14 10:20:42 -07:00
Steven Whitehouse
380f7c65a7 GFS2: Resolve inode eviction and ail list interaction bug
This patch contains a few misc fixes which resolve a recently
reported issue. This patch has been a real team effort and has
received a lot of testing.

The first issue is that the ail lock needs to be held over a few
more operations. The lock thats added into gfs2_releasepage() may
possibly be a candidate for replacing with RCU at some future
point, but at this stage we've gone for the obvious fix.

The second issue is that gfs2_write_inode() can end up calling
a glock recursively when called from gfs2_evict_inode() via the
syncing code, so it needs a guard added.

The third issue is that we either need to not truncate the metadata
pages of inodes which have zero link count, but which we cannot
deallocate due to them still being in use by other nodes, or we need
to ensure that those pages have all made it through the journal and
ail lists first. This patch takes the former approach, but the
latter has also been tested and there is nothing to choose between
them performance-wise. So again, we could revise that decision
in the future.

Also, the inode eviction process is now better documented.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Tested-by: Bob Peterson <rpeterso@redhat.com>
Tested-by: Abhijith Das <adas@redhat.com>
Reported-by: Barry J. Marson <bmarson@redhat.com>
Reported-by: David Teigland <teigland@redhat.com>
2011-07-14 08:59:44 +01:00
Linus Torvalds
201f92e2ca Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  SUNRPC: Fix use of static variable in rpcb_getport_async
  NFSv4.1: update nfs4_fattr_bitmap_maxsz
  SUNRPC: Fix a race between work-queue and rpc_killall_tasks
  pnfs: write: Set mds_offset in the generic layer - it is needed by all LDs
2011-07-13 14:34:08 -07:00
Christoph Hellwig
d0f9e8fb4c xfs: remove the dead XFS_DABUF_DEBUG code
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-13 13:43:50 +02:00
Christoph Hellwig
c84470dda7 xfs: remove leftovers of the old btree tracing code
Remove various bits left over from the old kdb-only btree tracing code, but
leave the actual trace point stubs in place to ease adding new event based
btree tracing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-13 13:43:50 +02:00
Christoph Hellwig
ea15ab3cdd xfs: remove the dead QUOTADEBUG code
Remove the dead hash table test rid which has been rotting away under
QUOTADEBUG, including some code that was compiled for normal debug
builds, but not actually called without QUOTADEBUG, and enable a few
cheap debug checks that were hidden under QUOTADEBUG for normal
debug builds.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-13 13:43:50 +02:00
Christoph Hellwig
54244fec67 xfs: remove the unused xfs_buf_delwri_sort function
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-13 13:43:49 +02:00
Christoph Hellwig
cb669ca570 xfs: remove wrappers around b_iodone
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-13 13:43:49 +02:00
Christoph Hellwig
adadbeefb3 xfs: remove wrappers around b_fspriv
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-13 13:43:49 +02:00
Christoph Hellwig
bf9d9013a2 xfs: add a proper transaction pointer to struct xfs_buf
Replace the typeless b_fspriv2 and the ugly macros around it with a properly
typed transaction pointer.  As a fallout the log buffer state debug checks
are also removed.  We could have kept them using casts, but as they do
not have a real purpose we can as well just remove them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-13 13:43:49 +02:00
Christoph Hellwig
77936d0280 xfs: factor out xfs_da_grow_inode_int
xfs_da_grow_inode and xfs_dir2_grow_inode are mostly duplicate code.  Factor
the meat of those two functions into a new common helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-13 13:43:49 +02:00
Christoph Hellwig
a230a1df40 xfs: factor out xfs_dir2_leaf_find_stale
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-13 13:43:48 +02:00
Christoph Hellwig
a00b7745c6 xfs: cleanup struct xfs_dir2_free
Change the bests array to be a proper variable sized entry.  This is done
easily as no one relies on the size of the structure.  Also change
XFS_DIR2_MAX_FREE_BESTS to an inline function while we're at it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-13 13:43:48 +02:00
Christoph Hellwig
5792664070 xfs: reshuffle dir2 headers
Replace the current mess of dir2 headers with just three that have a clear
purpose:

 - xfs_dir2_format.h for all format definitions, including the inline helpers
   to access our variable size structures
 - xfs_dir2_priv.h for all prototypes that are internal to the dir2 code
   and not needed by anything outside of the directory code.  For this
   purpose xfs_da_btree.c, and phase6.c in xfs_repair are considered part
   of the directory code.
 - xfs_dir2.h for the public interface to the directory code

In addition to the reshuffle I have also update the comments to not only
match the new file structure, but also to describe the directory format
better.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-13 13:43:48 +02:00
Christoph Hellwig
2bcf6e970f xfs: start periodic workers later
Start the periodic sync workers only after we have finished xfs_mountfs
and thus fully set up the filesystem structures.  Without this we can
call into xfs_qm_sync before the quotainfo strucute is set up if the
mount takes unusually long, and probably hit other incomplete states
as well.

Also clean up the xfs_fs_fill_super error path by using consistent
label names, and removing an impossible to reach case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Reviewed-by: Alex Elder <aelder@sgi.com>
2011-07-13 13:43:48 +02:00
Al Viro
94c0d4ecbe Fix ->d_lock locking order in unlazy_walk()
Make sure that child is still a child of parent before nested locking
of child->d_lock in unlazy_walk(); otherwise we are risking a violation
of locking order and deadlocks.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-12 21:40:23 -04:00
David Teigland
3881ac04eb dlm: improve rsb searches
By pre-allocating rsb structs before searching the hash
table, they can be inserted immediately.  This avoids
always having to repeat the search when adding the struct
to hash list.

This also adds space to the rsb struct for a max resource
name, so an rsb allocation can be used by any request.
The constant size also allows us to finally use a slab
for the rsb structs.

Signed-off-by: David Teigland <teigland@redhat.com>
2011-07-12 16:02:09 -05:00
Steve French
c2ec9471b5 [CIFS] update cifs to version 1.74
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-12 19:15:02 +00:00
Steve French
ea1be1a3c3 [CIFS] update limit for snprintf in cifs_construct_tcon
In 34c87901e1 "Shrink stack space usage in cifs_construct_tcon" we
change the size of the username name buffer from MAX_USERNAME_SIZE
(256) to 28.  This call to snprintf() needs to be updated as well.
Reported by Dan Carpenter.

Reviewed-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-12 19:14:24 +00:00
Shirish Pargaonkar
62411ab2fe cifs: Fix signing failure when server mandates signing for NTLMSSP
When using NTLMSSP authentication mechanism, if server mandates
signing, keep the flags in type 3 messages of the NTLMSSP exchange
same as in type 1 messages (i.e. keep the indicated capabilities same).

Some of the servers such as Samba, expect the flags such as
Negotiate_Key_Exchange in type 3 message of NTLMSSP exchange as well.
Some servers like Windows do not.

https://bugzilla.samba.org/show_bug.cgi?id=8212

Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-12 19:14:23 +00:00
Steven Whitehouse
3942ae5319 GFS2: Fix race during filesystem mount
There is a potential race during filesystem mounting which has recently
been reported. It occurs when the userland gfs_controld is able to
process requests fast enough that it tries to use the sysfs interface
before the lock module is properly initialised. This is a pretty
unusual case as normally the lock module initialisation is very quick
compared with gfs_controld.

This patch adds an interruptible completion which is used to ensure that
userland will wait for the initialisation of the lock module to
complete.

There are other potential solutions to this problem, but this is the
quickest at this stage and has been tested both with and without
mount.gfs2 present in the system.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: David Booher <dbooher@adams.net>
2011-07-12 09:15:46 +01:00
Benjamin Marzinski
1ce533686c GFS2: force a log flush when invalidating the rindex glock
Right now, there is nothing that forces the log to get flushed when a node
drops its rindex glock so that another node can grow the filesystem. If the
log doesn't get flushed, GFS2 can corrupt the sd_log_le_rg list in the
following way.

A node puts an rgd on the list in rg_lo_add(), and then the rindex glock is
dropped so the other node can grow the filesystem. When the node reacquires the
rindex glock, that rgd gets deleted in clear_rgrpdi() before ever being
removed from the list by gfs2_log_flush().

This code simply forces a log flush when the rindex glock is invalidated,
solving the problem.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2011-07-12 09:15:24 +01:00
Andy Adamson
e5012d1f38 NFSv4.1: update nfs4_fattr_bitmap_maxsz
Attribute IDs assigned in RFC 5661 now require three bitmaps.
Fixes hitting a BUG_ON in xdr_shrink_bufhead when getting ACLs.

Signed-off-by: Andy Adamson <andros@netapp.com>
Cc:stable@kernel.org [2.6.39]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-11 19:14:38 -04:00
Linus Torvalds
71a1b44b03 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: drop spinlock before calling cifs_put_tlink
  cifs: fix expand_dfs_referral
  cifs: move bdi_setup_and_register outside of CONFIG_CIFS_DFS_UPCALL
  cifs: factor smb_vol allocation out of cifs_setup_volume_info
  cifs: have cifs_cleanup_volume_info not take a double pointer
  cifs: fix build_unc_path_to_root to account for a prefixpath
  cifs: remove bogus call to cifs_cleanup_volume_info
2011-07-11 12:48:24 -07:00
Jeff Layton
f484b5d001 cifs: drop spinlock before calling cifs_put_tlink
...as that function can sleep.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-11 18:40:52 +00:00
Alex Elder
b2ce397400 Revert "xfs: fix filesystsem freeze race in xfs_trans_alloc"
This reverts commit 7a249cf83d.

That commit created a situation that could lead to a filesystem
hang.  As Dave Chinner pointed out, xfs_trans_alloc() could hold a
reference to m_active_trans (i.e., keep it non-zero) and then wait
for SB_FREEZE_TRANS to complete.  Meanwhile a filesystem freeze
request could set SB_FREEZE_TRANS and then wait for m_active_trans
to drop to zero.  Nobody benefits from this sequence of events...

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
2011-07-11 10:21:03 -05:00
David Teigland
3d6aa675ff dlm: keep lkbs in idr
This is simpler and quicker than the hash table, and
avoids needing to search the hash list for every new
lkid to check if it's used.

Signed-off-by: David Teigland <teigland@redhat.com>
2011-07-11 08:43:45 -05:00
David Teigland
a22ca48068 dlm: fix kmalloc args
The gfp and size args were switched.

Signed-off-by: David Teigland <teigland@redhat.com>
2011-07-11 08:40:53 -05:00
Jesper Juhl
5d70828a77 dlm: don't do pointless NULL check, use kzalloc and fix order of arguments
In fs/dlm/lock.c in the dlm_scan_waiters() function there are 3 small
issues:

1) There's no need to test the return value of the allocation and do a
memset if is succeedes. Just use kzalloc() to obtain zeroed memory.

2) Since kfree() handles NULL pointers gracefully, the test of
'warned' against NULL before the kfree() after the loop is completely
pointless. Remove it.

3) The arguments to kmalloc() (now kzalloc()) were swapped. Thanks to
Dr. David Alan Gilbert for pointing this out.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2011-07-11 08:39:42 -05:00
Jeff Layton
b9bce2e9f9 cifs: fix expand_dfs_referral
Regression introduced in commit 724d9f1cfb.

Prior to that, expand_dfs_referral would regenerate the mount data string
and then call cifs_parse_mount_options to re-parse it (klunky, but it
worked). The above commit moved cifs_parse_mount_options out of cifs_mount,
so the re-parsing of the new mount options no longer occurred. Fix it by
making expand_dfs_referral re-parse the mount options.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-09 21:25:57 +00:00
Jeff Layton
20547490c1 cifs: move bdi_setup_and_register outside of CONFIG_CIFS_DFS_UPCALL
This needs to be done regardless of whether that KConfig option is set
or not.

Reported-by: Sven-Haegar Koch <haegar@sdinet.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-09 20:29:51 +00:00
Linus Torvalds
1acc9309eb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  btrfs: fix oops when doing space balance
  Btrfs: don't panic if we get an error while balancing V2
  btrfs: add missing options displayed in mount output
2011-07-08 23:25:45 -07:00
Chandra Seetharaman
81463b1ca8 xfs: remove variables that serve no purpose in xfs_alloc_ag_vextent_exact()
Remove two variables that serve no purpose in
xfs_alloc_ag_vextent_exact().

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
2011-07-08 11:32:51 -05:00
Eric Sandeen
c0e090ced2 xfs: consolidate & clarify mount sanity checks
Pavol pointed out that there is one silent error case in the mount
path, and that others are rather uninformative.

I've taken Pavol's suggested patch and extended it a bit to also:

* fix a message which says "turned off" but actually errors out
* consolidate the vaguely differentiated "SB sanity check [12]"
  messages, and hexdump the superblock for analysis

Original-patch-by: Pavol Gono <Pavol.Gono@siemens.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
2011-07-08 11:32:51 -05:00
Linus Torvalds
54af2bd25c Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: unpin stale inodes directly in IOP_COMMITTED
2011-07-08 09:00:51 -07:00
Christoph Hellwig
e163cbde98 xfs: avoid a few disk cache flushes
There is no need for a pre-flush when doing writing the second part of a
split log buffer, and if we are using an external log there is no need
to do a full cache flush of the log device at all given that all writes
to it use the FUA flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:36:36 +02:00
Christoph Hellwig
1d5ae5dfee xfs: cleanup I/O-related buffer flags
Remove the unused and misnamed _XBF_RUN_QUEUES flag, rename XBF_LOG_BUFFER
to the more fitting XBF_SYNCIO, and split XBF_ORDERED into XBF_FUA and
XBF_FLUSH to allow more fine grained control over the bio flags.  Also
cleanup processing of the flags in _xfs_buf_ioapply to make more sense,
and renumber the sparse flag number space to group flags by purpose.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:36:32 +02:00
Christoph Hellwig
c8da0faf6b xfs: return the buffer locked from xfs_buf_get_uncached
All other xfs_buf_get/read-like helpers return the buffer locked, make sure
xfs_buf_get_uncached isn't different for no reason.  Half of the callers
already lock it directly after, and the others probably should also keep
it locked if only for consistency and beeing able to use xfs_buf_rele,
but I'll leave that for later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:36:25 +02:00
Christoph Hellwig
0c842ad46a xfs: clean up buffer locking helpers
Rename xfs_buf_cond_lock and reverse it's return value to fit most other
trylock operations in the Kernel and XFS (with the exception of down_trylock,
after which xfs_buf_cond_lock was modelled), and replace xfs_buf_lock_val
with an xfs_buf_islocked for use in asserts, or and opencoded variant in
tracing.  remove the XFS_BUF_* wrappers for all the locking helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:36:19 +02:00
Christoph Hellwig
bbb4197c73 xfs: remove the unused xfs_bufhash structure
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:36:10 +02:00
Christoph Hellwig
69ef921b55 xfs: byteswap constants instead of variables
Micro-optimize various comparisms by always byteswapping the constant
instead of the variable, which allows to do the swap at compile instead
of runtime.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:36:05 +02:00
Christoph Hellwig
218106a110 xfs: use generic get_unaligned_beXX helpers
Switch the shortform directory code over to use the generic
get_unaligned_beXX helpers instead of reinventing them.  As a result
kill off xfs_arch.h and move the setting of XFS_NATIVE_HOST into
xfs_linux.h.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:35:58 +02:00
Christoph Hellwig
2282396d81 xfs: cleanup struct xfs_dir2_leaf
Simplify the confusing xfs_dir2_leaf structure.  It is supposed to describe
an XFS dir2 leaf format btree block, but due to the variable sized nature
of almost all elements in it it can't actuall do anything close to that
job.   Remove the members that are after the first variable sized array,
given that they could only be used for sizeof expressions that can as well
just use the underlying types directly, and make the ents array a real
C99 variable sized array.

Also factor out the xfs_dir2_leaf_size, to make the sizing of a leaf
entry which already was convoluted somewhat readable after using the
longer type names in the sizeof expressions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:35:53 +02:00
Christoph Hellwig
3ed8638f88 xfs: cleanup the definition of struct xfs_dir2_data_entry
Remove the tag member which is at a variable offset after the actual
name, and make name a real variable sized C99 array instead of the incorrect
one-sized array which confuses (not only) gcc.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:35:50 +02:00
Christoph Hellwig
0ba9cd84ef xfs: kill struct xfs_dir2_data
Remove the confusing xfs_dir2_data structure.  It is supposed to describe
an XFS dir2 data btree block, but due to the variable sized nature of
almost all elements in it it can't actuall do anything close to that
job.  In addition to accessing the fixed offset header structure it was
only used to get a pointer to the first dir or unused entry after it,
which can be trivially replaced by pointer arithmetics on the header
pointer.  For most users that is actually more natural anyway, as they
don't use a typed pointer but rather a character pointer for further
arithmetics.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:35:42 +02:00
Christoph Hellwig
c2066e2662 xfs: avoid usage of struct xfs_dir2_data
In most places we can simply pass around and use the struct xfs_dir2_data_hdr,
which is the first and most important member of struct xfs_dir2_data instead
of the full structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:35:38 +02:00
Christoph Hellwig
a64b041797 xfs: kill struct xfs_dir2_block
Remove the confusing xfs_dir2_block structure.  It is supposed to describe
an XFS dir2 block format btree block, but due to the variable sized nature
of almost all elements in it it can't actuall do anything close to that
job.  In addition to accessing the fixed offset header structure it was
only used to get a pointer to the first dir or unused entry after it,
which can be trivially replaced by pointer arithmetics on the header
pointer.  For most users that is actually more natural anyway, as they
don't use a typed pointer but rather a character pointer for further
arithmetics.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:35:32 +02:00
Christoph Hellwig
4f6ae1a49e xfs: avoid usage of struct xfs_dir2_block
In most places we can simply pass around and use the struct xfs_dir2_data_hdr,
which is the first and most important member of struct xfs_dir2_block instead
of the full structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:35:27 +02:00
Christoph Hellwig
78f70cd7b7 xfs: cleanup the definition of struct xfs_dir2_sf_entry
Remove the inumber member which is at a variable offset after the actual
name, and make name a real variable sized C99 array instead of the incorrect
one-sized array which confuses (not only) gcc.  Based on this clean up
the helpers to calculate the entry size.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:35:19 +02:00
Christoph Hellwig
ac8ba50f6b xfs: kill struct xfs_dir2_sf
The list field of it is never cactually used, so all uses can simply be
replaced with the xfs_dir2_sf_hdr_t type that it has as first member.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:35:13 +02:00
Christoph Hellwig
8bc3878758 xfs: cleanup shortform directory inode number handling
Refactor the shortform directory helpers that deal with the 32-bit vs
64-bit wide inode numbers into more sensible helpers, and kill the
xfs_intino_t typedef that is now superflous.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:35:03 +02:00
Christoph Hellwig
4fb44c8272 xfs: factor out xfs_dir2_leaf_find_entry
Add a new xfs_dir2_leaf_find_entry helper to factor out some duplicate code
from xfs_dir2_leaf_addname xfs_dir2_leafn_add.  Found by Eric Sandeen using
an automated code duplication checker.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:34:59 +02:00
Christoph Hellwig
29d104af0a xfs: kill the unused struct xfs_sync_work
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:34:51 +02:00
Christoph Hellwig
f3ca87389d xfs: remove i_transp
Remove the transaction pointer in the inode.  It's only used to avoid
passing down an argument in the bmap code, and for a few asserts in
the transaction code right now.

Also use the local variable ip in a few more places in xfs_inode_item_unlock,
so that it isn't only used for debug builds after the above change.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:34:47 +02:00
Christoph Hellwig
7a249cf83d xfs: fix filesystsem freeze race in xfs_trans_alloc
As pointed out by Jan xfs_trans_alloc can race with a concurrent filesystem
freeze when it sleeps during the memory allocation.  Fix this by moving the
wait_for_freeze call after the memory allocation.  This means moving the
freeze into the low-level _xfs_trans_alloc helper, which thus grows a new
argument.  Also fix up some comments in that area while at it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2011-07-08 14:34:42 +02:00
Christoph Hellwig
33b8f7c247 xfs: improve sync behaviour in the face of aggressive dirtying
The following script from Wu Fengguang shows very bad behaviour in XFS
when aggressively dirtying data during a sync on XFS, with sync times
up to almost 10 times as long as ext4.

A large part of the issue is that XFS writes data out itself two times
in the ->sync_fs method, overriding the livelock protection in the core
writeback code, and another issue is the lock-less xfs_ioend_wait call,
which doesn't prevent new ioend from being queue up while waiting for
the count to reach zero.

This patch removes the XFS-internal sync calls and relies on the VFS
to do it's work just like all other filesystems do.  Note that the
i_iocount wait which is rather suboptimal is simply removed here.
We already do it in ->write_inode, which keeps the current supoptimal
behaviour.  We'll eventually need to remove that as well, but that's
material for a separate commit.

------------------------------ snip ------------------------------
#!/bin/sh

umount /dev/sda7
mkfs.xfs -f /dev/sda7
# mkfs.ext4 /dev/sda7
# mkfs.btrfs /dev/sda7
mount /dev/sda7 /fs

echo $((50<<20)) > /proc/sys/vm/dirty_bytes

pid=
for i in `seq 10`
do
	dd if=/dev/zero of=/fs/zero-$i bs=1M count=1000 &
	pid="$pid $!"
done

sleep 1

tic=$(date +'%s')
sync
tac=$(date +'%s')

echo
echo sync time: $((tac-tic))
egrep '(Dirty|Writeback|NFS_Unstable)' /proc/meminfo

pidof dd > /dev/null && { kill -9 $pid; echo sync NOT livelocked; }
------------------------------ snip ------------------------------

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:34:39 +02:00
Christoph Hellwig
8f04c47aa9 xfs: split xfs_itruncate_finish
Split the guts of xfs_itruncate_finish that loop over the existing extents
and calls xfs_bunmapi on them into a new helper, xfs_itruncate_externs.
Make xfs_attr_inactive call it directly instead of xfs_itruncate_finish,
which allows to simplify the latter a lot, by only letting it deal with
the data fork.  As a result xfs_itruncate_finish is renamed to
xfs_itruncate_data to make its use case more obvious.

Also remove the sync parameter from xfs_itruncate_data, which has been
unessecary since the introduction of the busy extent list in 2002, and
completely dead code since 2003 when the XFS_BMAPI_ASYNC parameter was
made a no-op.

I can't actually see why the xfs_attr_inactive needs to set the transaction
sync, but let's keep this patch simple and without changes in behaviour.

Also avoid passing a useless argument to xfs_isize_check, and make it
private to xfs_inode.c.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:34:34 +02:00
Christoph Hellwig
857b9778d8 xfs: kill xfs_itruncate_start
xfs_itruncate_start is a rather length wrapper that evaluates to a call
to xfs_ioend_wait and xfs_tosspages, and only has two callers.

Instead of using the complicated checks left over from IRIX where we
can to truncate the pagecache just call xfs_tosspages
(aka truncate_inode_pages) directly as we want to get rid of all data
after i_size, and truncate_inode_pages handles incorrect alignments
and too large offsets just fine.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:34:30 +02:00
Christoph Hellwig
681b120018 xfs: always log timestamp updates in xfs_setattr_size
Get rid of the special case where we use unlogged timestamp updates for
a truncate to the current inode size, and just call xfs_setattr_nonsize
for it to treat it like a utimes calls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:34:26 +02:00
Christoph Hellwig
c4ed4243c4 xfs: split xfs_setattr
Split up xfs_setattr into two functions, one for the complex truncate
handling, and one for the trivial attribute updates.  Also move both
new routines to xfs_iops.c as they are fairly Linux-specific.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:34:23 +02:00
Christoph Hellwig
dec58f1dfd xfs: work around bogus gcc warning in xfs_allocbt_init_cursor
GCC 4.6 complains about an array subscript is above array bounds when
using the btree index to index into the agf_levels array.  The only
two indices passed in are 0 and 1, and we have an assert insuring that.

Replace the trick of using the array index directly with using constants
in the already existing branch for assigning the XFS_BTREE_LASTREC_UPDATE
flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:34:18 +02:00
Christoph Hellwig
dbcdde3e76 xfs: re-enable non-blocking behaviour in xfs_map_blocks
The non-blockig behaviour in xfs_vm_writepage currently is conditional on
having both the WB_SYNC_NONE sync_mode and the nonblocking flag set.
The latter used to be used by both pdflush, kswapd and a few other places
in older kernels, but has been fading out starting with the introduction
of the per-bdi flusher threads.

Enable the non-blocking behaviour for all WB_SYNC_NONE calls to get back
the behaviour we want.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:34:14 +02:00
Christoph Hellwig
680a647b49 xfs: PF_FSTRANS should never be set in ->writepage
Now that we reject direct reclaim in addition to always using GFP_NOFS
allocation there's no chance we'll ever end up in ->writepage with
PF_FSTRANS set.  Add a WARN_ON if we hit this case, and stop checking
if we'd actually need to start a transaction.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2011-07-08 14:34:05 +02:00
Anatolij Gustschin
19495f70d1 UBIFS: fix master node recovery
When the 1st LEB was unmapped and written but 2nd LEB not,
the master node recovery doesn't succeed after power cut.
We see following error when mounting UBIFS partition on NOR
flash:

UBIFS error (pid 1137): ubifs_recover_master_node: failed to recover master node

Correct 2nd master node offset check is needed to fix the
problem. If the 2nd master node is at the end in the 2nd LEB,
first master node is used for recovery. When checking for this
condition we should check whether the master node is exactly at
the end of the LEB (without remaining empty space) or whether
it is followed by an empty space less than the master node size.

Artem: when the error happened, offs2 = 261120, sz = 512, c->leb_size = 262016.

Signed-off-by: Anatolij Gustschin <agust@denx.de>
Signed-off-by: Artem Bityutskiy <dedekind1@gmail.com>
2011-07-08 06:53:18 +03:00
Jeff Layton
04db79b015 cifs: factor smb_vol allocation out of cifs_setup_volume_info
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-08 03:51:23 +00:00
David Howells
c902ce1bfb FS-Cache: Add a helper to bulk uncache pages on an inode
Add an FS-Cache helper to bulk uncache pages on an inode.  This will
only work for the circumstance where the pages in the cache correspond
1:1 with the pages attached to an inode's page cache.

This is required for CIFS and NFS: When disabling inode cookie, we were
returning the cookie and setting cifsi->fscache to NULL but failed to
invalidate any previously mapped pages.  This resulted in "Bad page
state" errors and manifested in other kind of errors when running
fsstress.  Fix it by uncaching mapped pages when we disable the inode
cookie.

This patch should fix the following oops and "Bad page state" errors
seen during fsstress testing.

  ------------[ cut here ]------------
  kernel BUG at fs/cachefiles/namei.c:201!
  invalid opcode: 0000 [#1] SMP
  Pid: 5, comm: kworker/u:0 Not tainted 2.6.38.7-30.fc15.x86_64 #1 Bochs Bochs
  RIP: 0010: cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  RSP: 0018:ffff88002ce6dd00  EFLAGS: 00010282
  RAX: ffff88002ef165f0 RBX: ffff88001811f500 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000282
  RBP: ffff88002ce6dda0 R08: 0000000000000100 R09: ffffffff81b3a300
  R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff88002ae54840
  R13: ffff88002ae54840 R14: ffff880029c29c00 R15: ffff88001811f4b0
  FS:  00007f394dd32720(0000) GS:ffff88002ef00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 00007fffcb62ddf8 CR3: 000000001825f000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process kworker/u:0 (pid: 5, threadinfo ffff88002ce6c000, task ffff88002ce55cc0)
  Stack:
   0000000000000246 ffff88002ce55cc0 ffff88002ce6dd58 ffff88001815dc00
   ffff8800185246c0 ffff88001811f618 ffff880029c29d18 ffff88001811f380
   ffff88002ce6dd50 ffffffff814757e4 ffff88002ce6dda0 ffffffff8106ac56
  Call Trace:
   cachefiles_lookup_object+0x78/0xd4 [cachefiles]
   fscache_lookup_object+0x131/0x16d [fscache]
   fscache_object_work_func+0x1bc/0x669 [fscache]
   process_one_work+0x186/0x298
   worker_thread+0xda/0x15d
   kthread+0x84/0x8c
   kernel_thread_helper+0x4/0x10
  RIP  cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  ---[ end trace 1d481c9af1804caa ]---

I tested the uncaching by the following means:

 (1) Create a big file on my NFS server (104857600 bytes).

 (2) Read the file into the cache with md5sum on the NFS client.  Look in
     /proc/fs/fscache/stats:

	Pages  : mrk=25601 unc=0

 (3) Open the file for read/write ("bash 5<>/warthog/bigfile").  Look in proc
     again:

	Pages  : mrk=25601 unc=25601

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-07 13:21:56 -07:00
Alexey Khoroshilov
dd7f3d5458 hfsplus: Add error propagation for hfsplus_ext_write_extent_locked
Implement error propagation through the callers of
hfsplus_ext_write_extent_locked().

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-07-07 17:45:46 +02:00
Alexey Khoroshilov
5bd9d99d10 hfsplus: add error checking for hfs_find_init()
hfs_find_init() may fail with ENOMEM, but there are places, where
the returned value is not checked. The consequences can be very
unpleasant, e.g. kfree uninitialized pointer and
inappropriate mutex unlocking.

The patch adds checks for errors in hfs_find_init().

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-07-07 17:45:46 +02:00
Miao Xie
149e2d76b4 btrfs: fix oops when doing space balance
We need to make sure the data relocation inode doesn't go through
the delayed metadata updates, otherwise we get an oops during balance:

kernel BUG at fs/btrfs/relocation.c:4303!
[SNIP]
Call Trace:
 [<ffffffffa03143fd>] ? update_ref_for_cow+0x22d/0x330 [btrfs]
 [<ffffffffa0314951>] __btrfs_cow_block+0x451/0x5e0 [btrfs]
 [<ffffffffa031355d>] ? read_block_for_search+0x14d/0x4d0 [btrfs]
 [<ffffffffa0314beb>] btrfs_cow_block+0x10b/0x240 [btrfs]
 [<ffffffffa031acae>] btrfs_search_slot+0x49e/0x7a0 [btrfs]
 [<ffffffffa032d8af>] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
 [<ffffffff8147bf0e>] ? mutex_lock+0x1e/0x50
 [<ffffffffa0380cf1>] btrfs_update_delayed_inode+0x71/0x160 [btrfs]
 [<ffffffffa037ff27>] ? __btrfs_release_delayed_node+0x67/0x190 [btrfs]
 [<ffffffffa0381cf8>] btrfs_run_delayed_items+0xe8/0x120 [btrfs]
 [<ffffffffa03365e0>] btrfs_commit_transaction+0x250/0x850 [btrfs]
 [<ffffffff810f91d9>] ? find_get_pages+0x39/0x130
 [<ffffffffa0336cd5>] ? join_transaction+0x25/0x250 [btrfs]
 [<ffffffff81081de0>] ? wake_up_bit+0x40/0x40
 [<ffffffffa03785fa>] prepare_to_relocate+0xda/0xf0 [btrfs]
 [<ffffffffa037f2bb>] relocate_block_group+0x4b/0x620 [btrfs]
 [<ffffffffa0334cf5>] ? btrfs_clean_old_snapshots+0x35/0x150 [btrfs]
 [<ffffffffa037fa43>] btrfs_relocate_block_group+0x1b3/0x2e0 [btrfs]
 [<ffffffffa0368ec0>] ? btrfs_tree_unlock+0x50/0x50 [btrfs]
 [<ffffffffa035e39b>] btrfs_relocate_chunk+0x8b/0x670 [btrfs]
 [<ffffffffa031303d>] ? btrfs_set_path_blocking+0x3d/0x50 [btrfs]
 [<ffffffffa03577d8>] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
 [<ffffffffa031bea1>] ? btrfs_previous_item+0xb1/0x150 [btrfs]
 [<ffffffffa03577d8>] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
 [<ffffffffa035f5aa>] btrfs_balance+0x21a/0x2b0 [btrfs]
 [<ffffffffa0368898>] btrfs_ioctl+0x798/0xd20 [btrfs]
 [<ffffffff8111e358>] ? handle_mm_fault+0x148/0x270
 [<ffffffff814809e8>] ? do_page_fault+0x1d8/0x4b0
 [<ffffffff81160d6a>] do_vfs_ioctl+0x9a/0x540
 [<ffffffff811612b1>] sys_ioctl+0xa1/0xb0
 [<ffffffff81484ec2>] system_call_fastpath+0x16/0x1b
[SNIP]
RIP  [<ffffffffa037c1cc>] btrfs_reloc_cow_block+0x22c/0x270 [btrfs]

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-07-06 18:51:53 -04:00
Josef Bacik
508794eb5e Btrfs: don't panic if we get an error while balancing V2
A user reported an error where if we try to balance an fs after a device has
been removed it will blow up.  This is because we get an EIO back and this is
where BUG_ON(ret) bites us in the ass.  To fix we just exit.  Thanks,

Reported-by: Anand Jain <Anand.Jain@oracle.com>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-07-06 18:46:43 -04:00
David Sterba
0942caa373 btrfs: add missing options displayed in mount output
There are three missed mount options settable by user which are not
currently displayed in mount output.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-07-06 18:46:43 -04:00
Masatake YAMATO
bcaadf5c1a dlm: dump address of unknown node
When the dlm fails to make a network connection to another
node, include the address of the node in the error message.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2011-07-06 16:37:23 -05:00
Dave Chinner
1316d4da3f xfs: unpin stale inodes directly in IOP_COMMITTED
When inodes are marked stale in a transaction, they are treated
specially when the inode log item is being inserted into the AIL.
It tries to avoid moving the log item forward in the AIL due to a
race condition with the writing the underlying buffer back to disk.
The was "fixed" in commit de25c18 ("xfs: avoid moving stale inodes
in the AIL").

To avoid moving the item forward, we return a LSN smaller than the
commit_lsn of the completing transaction, thereby trying to trick
the commit code into not moving the inode forward at all. I'm not
sure this ever worked as intended - it assumes the inode is already
in the AIL, but I don't think the returned LSN would have been small
enough to prevent moving the inode. It appears that the reason it
worked is that the lower LSN of the inodes meant they were inserted
into the AIL and flushed before the inode buffer (which was moved to
the commit_lsn of the transaction).

The big problem is that with delayed logging, the returning of the
different LSN means insertion takes the slow, non-bulk path.  Worse
yet is that insertion is to a position -before- the commit_lsn so it
is doing a AIL traversal on every insertion, and has to walk over
all the items that have already been inserted into the AIL. It's
expensive.

To compound the matter further, with delayed logging inodes are
likely to go from clean to stale in a single checkpoint, which means
they aren't even in the AIL at all when we come across them at AIL
insertion time. Hence these were all getting inserted into the AIL
when they simply do not need to be as inodes marked XFS_ISTALE are
never written back.

Transactional/recovery integrity is maintained in this case by the
other items in the unlink transaction that were modified (e.g. the
AGI btree blocks) and committed in the same checkpoint.

So to fix this, simply unpin the stale inodes directly in
xfs_inode_item_committed() and return -1 to indicate that the AIL
insertion code does not need to do any further processing of these
inodes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
2011-07-06 15:44:40 -05:00
Jeff Layton
f9e59bcba2 cifs: have cifs_cleanup_volume_info not take a double pointer
...as that makes for a cumbersome interface. Make it take a regular
smb_vol pointer and rely on the caller to zero it out if needed.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-06 20:03:05 +00:00
Jeff Layton
b2a0fa1520 cifs: fix build_unc_path_to_root to account for a prefixpath
Regression introduced by commit f87d39d951.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-06 20:03:05 +00:00
Jeff Layton
677d8537d8 cifs: remove bogus call to cifs_cleanup_volume_info
This call to cifs_cleanup_volume_info is clearly wrong. As soon as it's
called the following call to cifs_get_tcp_session will oops as the
volume_info pointer will then be NULL.

The caller of cifs_mount should clean up this data since it passed it
in. There's no need for us to call this here.

Regression introduced by commit 724d9f1cfb.

Reported-by: Adam Williamson <awilliam@redhat.com>
Cc: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-06 20:03:04 +00:00
Davidlohr Bueso
bcb65a797e FDPIC: Fix memory leak
The shdr4extnum variable isn't being freed in the cleanup process of
elf_fdpic_core_dump().

Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-06 12:15:16 -07:00
Miklos Szeredi
a51cb91d81 fs: fix lock initialization
locks_alloc_lock() assumed that the allocated struct file_lock is
already initialized to zero members.  This is only true for the first
allocation of the structure, after reuse some of the members will have
random values.

This will for example result in passing random fl_start values to
userspace in fuse for FL_FLOCK locks, which is an information leak at
best.

Fix by reinitializing those members which may be non-zero after freeing.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-06 10:41:13 -07:00
Linus Torvalds
121782a248 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: fix sync and dio writes across stripe boundaries
  libceph: fix page calculation for non-page-aligned io
  ceph: fix page alignment corrections
2011-07-05 13:15:57 -07:00
Linus Torvalds
a8728d3554 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/hfsplus
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/hfsplus:
  hfsplus: Fix double iput of the same inode in hfsplus_fill_super()
  hfsplus: add missing call to bio_put()
2011-07-05 10:04:27 -07:00
Artem Bityutskiy
a7fa94a9fe UBIFS: improve power cut emulation testing
This patch cleans-up and improves the power cut testing:

1. Kill custom 'simple_random()' function and use 'random32()' instead.
2. Make timeout larger
3. When cutting the buffer - fill the end with random data sometimes, not
   only with 0xFFs.
4. Some times cut in the middle of the buffer, not always at the end.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:34 +03:00
Artem Bityutskiy
d27462a518 UBIFS: rename recovery testing variables
Since the recovery testing is effectively about emulating power cuts by UBIFS,
use "power cut" as the base term for all the related variables and name them
correspondingly. This is just a minor clean-up for the sake of readability.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:34 +03:00
Artem Bityutskiy
f57cb188cc UBIFS: remove custom list of superblocks
This is a clean-up of the power-cut emulation code - remove the custom list of
superblocks which we maintained to find the superblock by the UBI volume
descriptor. We do not need that crud any longer, because now we can get the
superblock as a function argument.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:33 +03:00
Artem Bityutskiy
0a541b14e8 UBIFS: stop re-defining UBI operations
Now when we use UBIFS helpers for all the I/O, we can remove the horrible hack
of re-defining UBI I/O functions.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:33 +03:00
Artem Bityutskiy
d3b2578f56 UBIFS: switch to I/O helpers
Switch the rest of direct UBI calls to UBIFS helper functions.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:33 +03:00
Artem Bityutskiy
987226a5d3 UBIFS: switch to ubifs_leb_write
Stop using 'ubi_leb_write()' directly and switch to the 'ubifs_leb_write()'
helper.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:33 +03:00
Artem Bityutskiy
d304820a1f UBIFS: switch to ubifs_leb_read
Instead of using 'ubi_read()' function directly, used the 'ubifs_leb_read()'
helper function instead. This allows to get rid of several redundant error
messages and make sure that we always have a stack dump on read errors.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:33 +03:00
Artem Bityutskiy
83cef708c6 UBIFS: introduce more I/O helpers
Introduce the following I/O helper functions: 'ubifs_leb_read()',
'ubifs_leb_write()', 'ubifs_leb_change()', 'ubifs_leb_unmap()',
'ubifs_leb_map()', 'ubifs_is_mapped().

The idea is to wrap all UBI I/O functions in order to encapsulate various
assertions and error path handling (error message, stack dump, switching to R/O
mode). And there are some other benefits of this which will be used in the
following patches.

This patch does not switch whole UBIFS to use these functions yet.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:33 +03:00
Artem Bityutskiy
d033c98b17 UBIFS: always print stacktrace when switching to R/O mode
When switching to R/O mode due to an I/O error, always dump the stack, not only
when debugging is enabled.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:33 +03:00
Artem Bityutskiy
891a54a153 UBIFS: remove unused and unneeded debugging function
This patch contains several minor clean-up and preparational cahnges.

1. Remove 'dbg_read()', 'dbg_write()', 'dbg_change()', and 'dbg_leb_erase()'
   functions as they are not used.
2. Remove 'dbg_leb_read()' and 'dbg_is_mapped()' as they are not really needed,
   it is fine to let reads go through in failure mode.
3. Rename 'offset' argument to 'offs' to be consistent with the rest of UBIFS
   code.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:32 +03:00
Artem Bityutskiy
e7717060dd UBIFS: add global debugfs knobs
Now we have per-FS (superblock) debugfs knobs, but they have one drawback - you
have to first mount the FS and only after this you can switch self-checks
on/off. But often we want to have the checks enabled during the mount.
Introduce global debugging knobs for this purpose.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:32 +03:00
Artem Bityutskiy
28488fc28a UBIFS: introduce debugfs helpers
Separate out pieces of code from the debugfs file read/write functions and
create separate 'interpret_user_input()'/'provide_user_output()' helpers. These
helpers will be needed in one of the following patches, so this is just a
preparational change.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:32 +03:00
Artem Bityutskiy
7dae997de6 UBIFS: re-arrange debugging code a bit
Move 'dbg_debugfs_init()' and 'dbg_debugfs_exit()' functions which initialize
debugfs for whole UBIFS subsystem below the code which initializes debugfs for
a particular UBIFS instance. And do the same for 'ubifs_debugging_init()' and
'ubifs_debugging_exit()' functions. This layout is a bit better for the next
patches, so this is just a preparation.

Also, rename 'open_debugfs_file()' into 'dfs_file_open()' for consistency.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:32 +03:00
Artem Bityutskiy
24a4f8009e UBIFS: be more informative in failure mode
When we are testing UBIFS recovery, it is better to print in which eraseblock
we are going to fail. Currently UBIFS prints it only if recovery debugging
messages are enabled, but this is not very practical. So change 'dbg_rcvry()'
messages to 'ubifs_warn()' messages.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:30 +03:00
Artem Bityutskiy
81e79d38df UBIFS: switch self-check knobs to debugfs
UBIFS has many built-in self-check functions which can be enabled using the
debug_chks module parameter or the corresponding sysfs file
(/sys/module/ubifs/parameters/debug_chks). However, this is not flexible enough
because it is not per-filesystem. This patch moves this to debugfs interfaces.

We already have debugfs support, so this patch just adds more debugfs files.
While looking at debugfs support I've noticed that it is racy WRT file-system
unmount, and added a TODO entry for that. This problem has been there for long
time and it is quite standard debugfs PITA. The plan is to fix this later.

This patch is simple, but it is large because it changes many places where we
check if a particular type of checks is enabled or disabled.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:28 +03:00
Artem Bityutskiy
8d7819b4af UBIFS: lessen amount of debugging check types
We have too many different debugging checks - lessen the amount by merging all
index-related checks into one. At the same time, move the "force in-the-gap"
test to the "index checks" class, because it is too heavy for the "general"
class.

This patch merges TNC, Old index, and Index size check and calles this just
"index checks".

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:28 +03:00
Artem Bityutskiy
2b1844a8c9 UBIFS: introduce helper functions for debugging checks and tests
This patch introduces helper functions for all debugging checks, so instead of
doing

if (!(ubifs_chk_flags & UBIFS_CHK_GEN))

we now do

if (!dbg_is_chk_gen(c))

This is a preparation to further changes where the flags will go away, and
we'll need to only change the helper functions, but the code which utilizes
them won't be touched.

At the same time this patch removes 'dbg_force_in_the_gaps()',
'dbg_force_in_the_gaps_enabled()', and dbg_failure_mode helpers for
consistency.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:28 +03:00
Artem Bityutskiy
d808efb407 UBIFS: amend debugging inode size check function prototype
Add 'const struct ubifs_info *c' parameter to 'dbg_check_synced_i_size()'
function because we'll need it in the next patch when we switch to debugfs.
So this patch is just a preparation.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:27 +03:00
Artem Bityutskiy
bb2615d4d1 UBIFS: amend debugging name check function prototype
Add 'struct ubifs_info *c' parameter to the 'dbg_check_name()' debugging
function - it will be needed in one of the following commits where we switch to
debugfs. So this is just a preparation.

Mark parameters as 'const' while on it.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:27 +03:00
Artem Bityutskiy
06b282a4cc UBIFS: add few commentaries about TNC
Add a couple of comments - while looking into TNC I could not easily figure out
few facts, so it is a good idea to document them in the code.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:27 +03:00
Artem Bityutskiy
3766244769 UBIFS: use correct flags in lprops
The UBIFS lpt tree is in many aspects similar to the TNC tree, and we have
similar flags for these trees. And by mistake we use the COW_ZNODE flag for
LPT in some places, instead of the right flag COW_CNODE. And this works
only because these two constants have the same value.

This patch makes all the LPT code to use COW_CNODE and also changes COW_CNODE
constant value to make sure we do not misuse the flags any more.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:27 +03:00
Artem Bityutskiy
f42eed7cba UBIFS: harmonize znode flag helpers
We have 3 znode flags: cow, obsolete, dirty. For the last flag we have a
'ubifs_zn_dirty()' helper function, but for the other 2 flags we use
'test_bit()' directly.

This patch makes the situation more consistent and introduces helpers for the
other 2 flags: 'ubifs_zn_cow()' and 'ubifs_zn_obsolete()'.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:27 +03:00
Artem Bityutskiy
1f42596ec0 UBIFS: remove dead code
Remove dead pieces of code under "if (c->min_io_size == 1)" statement -
we never execute it because in UBIFS 'c->min_io_size' is always at least 8.
This are leftovers from old pre-mainline prototype.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:27 +03:00
Artem Bityutskiy
12e776a088 UBIFS: remove unnecessary brackets
Remove unnecessary brackets in "inode->i_flags |= (S_NOCMTIME)" statement to
make the code not look silly.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:27 +03:00
Artem Bityutskiy
a29fa9dfa4 UBIFS: minor cleanup: use S_ISREG helper
Instead of using long "(inode->i_mode & S_IFMT) != S_IFREG" expression, use
shorted "!S_ISREG(inode->i_mode)".

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:26 +03:00
Artem Bityutskiy
1b51e98365 UBIFS: rename dbg_check_dir_size function
Since this function is not only about size checking, rename it to
'dbg_check_dir()'.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:26 +03:00
Artem Bityutskiy
4315fb4072 UBIFS: improve inode dumping function
Teach 'dbg_dump_inode()' dump directory entries for directory inodes.
This requires few additional changes:
1. The 'c' argument of 'dbg_dump_inode()' cannot be const any more.
2. Users of 'dbg_dump_inode()' should not have 'tnc_mutex' locked.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:26 +03:00
Artem Bityutskiy
bfcf677dec UBIFS: dump stack when pnode or nnode reading fails
When we fail to read a pnode or nnode - print stacktrace if debugging is enabled.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:26 +03:00
Artem Bityutskiy
ae380ce047 UBIFS: lessen the size of debugging info data structure
This patch lessens the 'struct ubifs_debug_info' size by 90 bytes by
allocating less bytes for the debugfs root directory name. It introduces macros
for the name patter an length instead of hard-coding 100 bytes. It also makes
UBIFS use 'snprintf()' and teaches it to gracefully catch situations when the
name array is too short.

Additionally, this patch makes 2 unrelated changes - I just thought they do not
deserve separate commits: simplifies 'ubifs_assert()' for non-debugging case
and makes 'dbg_debugfs_init()' properly verify debugfs return code which may be
an error code or NULL, so we should you 'IS_ERR_OR_NULL()' instead of
'IS_ERR()'.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:26 +03:00
Artem Bityutskiy
549c999a76 UBIFS: return EROFS in case of broken commit
If commit failed and it is in broken state, UBIFS switches to R/O mode. Most
operations return -EROFS in this case, except of commit which returns -EINVAL.
Make it return -EROFS too for consistency. This is also important for our power
cut emulation testing.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:26 +03:00
Bryn M. Reeves
c282af4990 dlm: use vmalloc for hash tables
Allocate dlm hash tables in the vmalloc area to allow a greater
maximum size without restructuring of the hash table code.

Signed-off-by: Bryn M. Reeves <bmr@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2011-07-01 15:49:23 -05:00
Denys Vlasenko
bb188d7e64 ptrace: make former thread ID available via PTRACE_GETEVENTMSG after PTRACE_EVENT_EXEC stop
When multithreaded program execs under ptrace,
all traced threads report WIFEXITED status, except for
thread group leader and the thread which execs.

Unless tracer tracks thread group relationship between tracees,
which is a nontrivial task, it will not detect that
execed thread no longer exists.

This patch allows tracer to figure out which thread
performed this exec, by requesting PTRACE_GETEVENTMSG
in PTRACE_EVENT_EXEC stop.

Another, samller problem which is solved by this patch
is that tracer now can figure out which of the several
concurrent execs in multithreaded program succeeded.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-07-01 18:51:49 +02:00
Jeff Layton
ee1b3ea9e6 cifs: set socket send and receive timeouts before attempting connect
Benjamin S. reported that he was unable to suspend his machine while
it had a cifs share mounted. The freezer caused this to spew when he
tried it:

-----------------------[snip]------------------
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.01 seconds) done.
Freezing remaining freezable tasks ...
Freezing of tasks failed after 20.01 seconds (1 tasks refusing to freeze, wq_busy=0):
cifsd         S ffff880127f7b1b0     0  1821      2 0x00800000
 ffff880127f7b1b0 0000000000000046 ffff88005fe008a8 ffff8800ffffffff
 ffff880127cee6b0 0000000000011100 ffff880127737fd8 0000000000004000
 ffff880127737fd8 0000000000011100 ffff880127f7b1b0 ffff880127736010
Call Trace:
 [<ffffffff811e85dd>] ? sk_reset_timer+0xf/0x19
 [<ffffffff8122cf3f>] ? tcp_connect+0x43c/0x445
 [<ffffffff8123374e>] ? tcp_v4_connect+0x40d/0x47f
 [<ffffffff8126ce41>] ? schedule_timeout+0x21/0x1ad
 [<ffffffff8126e358>] ? _raw_spin_lock_bh+0x9/0x1f
 [<ffffffff811e81c7>] ? release_sock+0x19/0xef
 [<ffffffff8123e8be>] ? inet_stream_connect+0x14c/0x24a
 [<ffffffff8104485b>] ? autoremove_wake_function+0x0/0x2a
 [<ffffffffa02ccfe2>] ? ipv4_connect+0x39c/0x3b5 [cifs]
 [<ffffffffa02cd7b7>] ? cifs_reconnect+0x1fc/0x28a [cifs]
 [<ffffffffa02cdbdc>] ? cifs_demultiplex_thread+0x397/0xb9f [cifs]
 [<ffffffff81076afc>] ? perf_event_exit_task+0xb9/0x1bf
 [<ffffffffa02cd845>] ? cifs_demultiplex_thread+0x0/0xb9f [cifs]
 [<ffffffffa02cd845>] ? cifs_demultiplex_thread+0x0/0xb9f [cifs]
 [<ffffffff810444a1>] ? kthread+0x7a/0x82
 [<ffffffff81002d14>] ? kernel_thread_helper+0x4/0x10
 [<ffffffff81044427>] ? kthread+0x0/0x82
 [<ffffffff81002d10>] ? kernel_thread_helper+0x0/0x10

Restarting tasks ... done.
-----------------------[snip]------------------

We do attempt to perform a try_to_freeze in cifs_reconnect, but the
connection attempt itself seems to be taking longer than 20s to time
out. The connect timeout is governed by the socket send and receive
timeouts, so we can shorten that period by setting those timeouts
before attempting the connect instead of after.

Adam Williamson tested the patch and said that it seems to have fixed
suspending on his laptop when a cifs share is mounted.

Reported-by: Benjamin S <da_joind@gmx.net>
Tested-by: Adam Williamson <awilliam@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-01 16:15:30 +00:00
Masatake YAMATO
55b3286d3d dlm: show addresses in configfs
Display all addresses the dlm is using for the local node
from the configfs file config/dlm/<cluster>/comms/<comm>/addr_list
Also make the addr file write only.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2011-06-30 14:45:28 -05:00