Commit Graph

13146 Commits

Author SHA1 Message Date
Christoph Hellwig
7d095257e3 xfs: kill xfs_qmops
Kill the quota ops function vector and replace it with direct calls or
stubs in the CONFIG_XFS_QUOTA=n case.

Make sure we check XFS_IS_QUOTA_RUNNING in the right spots.  We can remove
the number of those checks because the XFS_TRANS_DQ_DIRTY flag can't be set
otherwise.

This brings us back closer to the way this code worked in IRIX and earlier
Linux versions, but we keep a lot of the more useful factoring of common
code.

Eventually we should also kill xfs_qm_bhv.c, but that's left for a later
patch.

Reduces the size of the source code by about 250 lines and the size of
XFS module by about 1.5 kilobytes with quotas enabled:

   text	   data	    bss	    dec	    hex	filename
 615957	   2960	   3848	 622765	  980ad	fs/xfs/xfs.o
 617231	   3152	   3848	 624231	  98667	fs/xfs/xfs.o.old

Fallout:

 - xfs_qm_dqattach is split into xfs_qm_dqattach_locked which expects
   the inode locked and xfs_qm_dqattach which does the locking around it,
   thus removing XFS_QMOPT_ILOCKED.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:33:32 +02:00
Christoph Hellwig
0c5e1ce89f xfs: validate quota log items during log recovery
Arkadiusz has seen really strange crashes in xfs_qm_dqcheck that
I can only explain by a log item being too smal to actually fit the
xfs_dqblk_t we're dereferencing all over xfs_qm_dqcheck.  So add
graceful checks for NULL or too small quota items to the log recovery
code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:33:21 +02:00
Christoph Hellwig
e1696834e8 xfs: update max log size
Commit a6634fba3dec4a92f0a2c4e30c80b634c0576ad5 in xfsprogs increased the
maximum log size supported by mkfs.  Merged back the changes to xfs_fs.h
so the growfs enforced the same limit and the headers are in sync.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:32:59 +02:00
Felix Blyakher
4156e735d3 xfs: prevent deadlock in xfs_qm_shake()
It's possible to recurse into filesystem from the memory
allocation, which deadlocks in xfs_qm_shake(). Add check
for __GFP_FS, and bail out if it is not set.

Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Hedi Berriche <hedi@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-06-01 13:13:24 -05:00
Eric Sandeen
096324873f xfs: fix overflow in xfs_growfs_data_private
In the case where growing a filesystem would leave the last AG
too small, the fixup code has an overflow in the calculation
of the new size with one fewer ag, because "nagcount" is a 32
bit number.  If the new filesystem has > 2^32 blocks in it
this causes a problem resulting in an EINVAL return from growfs:

 # xfs_io -f -c "truncate 19998630180864" fsfile
 # mkfs.xfs -f -bsize=4096 -dagsize=76288719b,size=3905982455b fsfile
 # mount -o loop fsfile /mnt
 # xfs_growfs /mnt

meta-data=/dev/loop0             isize=256    agcount=52,
agsize=76288719 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=3905982455, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Invalid argument

Reported-by: richard.ems@cape-horn-eng.com
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-05-26 17:46:37 -05:00
Felix Blyakher
ec91d1335f xfs: fix double unlock in xfs_swap_extents()
Regreesion from commit ef8f7fc, which rearranged the code in
xfs_swap_extents() leading to double unlock of xfs inode ilock.
That resulted in xfs_fsr deadlocking itself on platforms, which
don't handle double unlock of rw_semaphore nicely. It caused the
count go negative, which represents the write holder, without
really having one. ia64 is one of the platforms where deadlock
was easily reproduced and the fix was tested.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-05-08 00:29:44 -05:00
Christoph Hellwig
6321e3ed2a xfs: fix getbmap vs mmap deadlock
xfs_getbmap (or rather the formatters called by it) copy out the getbmap
structures under the ilock, which can deadlock against mmap.  This has
been reported via bugzilla a while ago (#717) and has recently also
shown up via lockdep.

So allocate a temporary buffer to format the kernel getbmap structures
into and then copy them out after dropping the locks.

A little problem with this is that we limit the number of extents we
can copy out by the maximum allocation size, but I see no real way
around that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-29 13:25:29 -05:00
Christoph Hellwig
4be4a00fb5 xfs: a couple getbmap cleanups
- reshuffle various conditionals for data vs attr fork to make the code
   more readable
 - do fine-grainded goto-based error handling
 - exit early from conditionals instead of keeping a long else branch around
 - allow kmem_alloc to fail

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-29 10:00:01 -05:00
Olaf Weber
2ac00af7a6 xfs: add more checks to superblock validation
There had been reports where xfs filesystem was randomly
corrupted with fsfuzzer, and xfs failed to handle it
gracefully. This patch fixes couple of reported problem
by providing additional checks in the superblock
validation routine.

Signed-off-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-29 09:24:29 -05:00
Lachlan McIlroy
f25181f598 xfs_file_last_byte() needs to acquire ilock
We had some systems crash with this stack:

[<a00000010000cb20>] ia64_leave_kernel+0x0/0x280
[<a00000021291ca00>] xfs_bmbt_get_startoff+0x0/0x20 [xfs]
[<a0000002129080b0>] xfs_bmap_last_offset+0x210/0x280 [xfs]
[<a00000021295b010>] xfs_file_last_byte+0x70/0x1a0 [xfs]
[<a00000021295b200>] xfs_itruncate_start+0xc0/0x1a0 [xfs]
[<a0000002129935f0>] xfs_inactive_free_eofblocks+0x290/0x460 [xfs]
[<a000000212998fb0>] xfs_release+0x1b0/0x240 [xfs]
[<a0000002129ad930>] xfs_file_release+0x70/0xa0 [xfs]
[<a000000100162ea0>] __fput+0x1a0/0x420
[<a000000100163160>] fput+0x40/0x60

The problem here is that xfs_file_last_byte() does not acquire the
inode lock and can therefore race with another thread that is modifying
the extext list.  While xfs_bmap_last_offset() is trying to lookup
what was the last extent some extents were merged and the extent list
shrunk so the index we lookup is now beyond the end of the extent list
and potentially in a freed buffer.

Signed-off-by: Lachlan McIlroy <lmcilroy@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-29 09:14:10 -05:00
Dave Chinner
8de2bf937a xfs: remove xfs_flush_space
The only thing we need to do now when we get an ENOSPC condition during delayed
allocation reservation is flush all the other inodes with delalloc blocks on
them and retry without EOF preallocation. Remove the unneeded mess that is
xfs_flush_space() and just call xfs_flush_inodes() directly from
xfs_iomap_write_delay().

Also, change the location of the retry label to avoid trying to do EOF
preallocation because we don't want to do that at ENOSPC. This enables us to
remove the BMAPI_SYNC flag as it is no longer used.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:49:12 +02:00
Dave Chinner
153fec43ce xfs: flush delayed allcoation blocks on ENOSPC in create
If we are creating lots of small files, we can fail to get
a reservation for inode create earlier than we should due to
EOF preallocation done during delayed allocation reservation.
Hence on the first reservation ENOSPC failure flush all the
delayed allocation blocks out of the system and retry.

This fixes the last commonly triggered spurious ENOSPC issue
that has been reported.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:48:30 +02:00
Dave Chinner
e43afd72d2 xfs: block callers of xfs_flush_inodes() correctly
xfs_flush_inodes() currently uses a magic timeout to wait for
some inodes to be flushed before returning. This isn't
really reliable but used to be the best that could be done
due to deadlock potential of waiting for the entire flush.

Now the inode flush is safe to execute while we hold page
and inode locks, we can wait for all the inodes to flush
synchronously. Convert the wait mechanism to a completion
to do this efficiently. This should remove all remaining
spurious ENOSPC errors from the delayed allocation reservation
path.

This is extracted almost line for line from a larger patch
from Mikulas Patocka.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:47:27 +02:00
Dave Chinner
5825294edd xfs: make inode flush at ENOSPC synchronous
When we are writing to a single file and hit ENOSPC, we trigger a background
flush of the inode and try again.  Because we hold page locks and the iolock,
the flush won't proceed until after we release these locks. This occurs once
we've given up and ENOSPC has been reported. Hence if this one is the only
dirty inode in the system, we'll get an ENOSPC prematurely.

To fix this, remove the async flush from the allocation routines and move
it to the top of the write path where we can do a synchronous flush
and retry the write again. Only retry once as a second ENOSPC indicates
that we really are ENOSPC.

This avoids a page cache deadlock when trying to do this flush synchronously
in the allocation layer that was identified by Mikulas Patocka.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:45:44 +02:00
Dave Chinner
a8d770d987 xfs: use xfs_sync_inodes() for device flushing
Currently xfs_device_flush calls sync_blockdev() which is
a no-op for XFS as all it's metadata is held in a different
address to the one sync_blockdev() works on.

Call xfs_sync_inodes() instead to flush all the delayed
allocation blocks out. To do this as efficiently as possible,
do it via two passes - one to do an async flush of all the
dirty blocks and a second to wait for all the IO to complete.
This requires some modification to the xfs-sync_inodes_ag()
flush code to do efficiently.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:44:54 +02:00
Dave Chinner
9d7fef74b2 xfs: inform the xfsaild of the push target before sleeping
When trying to reserve log space, we find the amount of space
we need, then go to sleep waiting for space. When we are
woken, we try to push the tail of the log forward to make
sure we have space available.

Unfortunately, this means that if there is not space available, and
everyone who needs space goes to sleep there is no-one left to push
the tail of the log to make space available. Once we have a thread
waiting for space to become available, the others queue up behind
it in a FIFO, and none of them push the tail of the log.

This can result in everyone going to sleep in xlog_grant_log_space()
if the first sleeper races with the last I/O that moves the tail
of the log forward. With no further I/O tomove the tail of the log,
there is nothing to wake the sleepers and hence all transactions
just stop.

Fix this by making sure the xfsaild will create enough space for the
transaction that is about to sleep by moving the push target far
enough forwards to ensure that that the curent proceeees will have
enough space available when it is woken. That is, we push the
AIL before we go to sleep.

Because we've inserted the log ticket into the queue before we've
pushed and gone to sleep, subsequent transactions will wait behind
this one. Hence we are guaranteed to have space available when we
are woken.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:42:59 +02:00
Dave Chinner
c626d174cf xfs: prevent unwritten extent conversion from blocking I/O completion
Unwritten extent conversion can recurse back into the filesystem due
to memory allocation. Memory reclaim requires I/O completions to be
processed to allow the callers to make progress. If the I/O
completion workqueue thread is doing the recursion, then we have a
deadlock situation.

Move unwritten extent completion into it's own workqueue so it
doesn't block I/O completions for normal delayed allocation or
overwrite data.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:42:11 +02:00
Dave Chinner
705db3fd46 xfs: fix double free of inode
If we fail to initialise the VFS inode in inode_init_always(),
it will call ->delete_inode internally resulting in the inode being
freed. Hence we need to delay the call to inode_init_always()
until after the XFS inode is sufficient set up to handle a
call to ->delete_inode, and then if that fails do not touch
the inode again at all as it has been freed.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:40:17 +02:00
Dave Chinner
a6cb767e24 xfs: validate log feature fields correctly
If the large log sector size feature bit is set in the
superblock by accident (say disk corruption), the then
fields that are now considered valid are not checked on
production kernels. The checks are present as ASSERT
statements so cause a panic on a debug kernel.

Change this so that the fields are validity checked if
the feature bit is set and abort the log mount if the
fields do not contain valid values.

Reported-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:39:27 +02:00
Felix Blyakher
1aacc064e0 Revert "xfs: increase the maximum number of supported ACL entries"
This reverts commit 8b11217173.
Premature unintended commit.

Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-31 00:23:37 -05:00
Felix Blyakher
5123bc35d2 Merge branch 'master' of git://git.kernel.org/pub/scm/fs/xfs/xfs 2009-03-30 22:17:44 -05:00
Felix Blyakher
930861c4e6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2009-03-30 22:08:33 -05:00
Linus Torvalds
d17abcd541 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask:
  oprofile: Thou shalt not call __exit functions from __init functions
  cpumask: remove the now-obsoleted pcibus_to_cpumask(): generic
  cpumask: remove cpumask_t from core
  cpumask: convert rcutorture.c
  cpumask: use new cpumask_ functions in core code.
  cpumask: remove references to struct irqaction's mask field.
  cpumask: use mm_cpumask() wrapper: kernel/fork.c
  cpumask: use set_cpu_active in init/main.c
  cpumask: remove node_to_first_cpu
  cpumask: fix seq_bitmap_*() functions.
  cpumask: remove dangerous CPU_MASK_ALL_PTR, &CPU_MASK_ALL
2009-03-30 18:00:26 -07:00
Linus Torvalds
cf2f7d7c90 Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc
* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc:
  Revert "proc: revert /proc/uptime to ->read_proc hook"
  proc 2/2: remove struct proc_dir_entry::owner
  proc 1/2: do PDE usecounting even for ->read_proc, ->write_proc
  proc: fix sparse warnings in pagemap_read()
  proc: move fs/proc/inode-alloc.txt comment into a source file
2009-03-30 16:06:04 -07:00
Jeff Mahoney
3a355cc61d reiserfs: xattr_create is unused with xattrs disabled
This patch ifdefs xattr_create when xattrs aren't enabled.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 14:28:58 -07:00
Alexey Dobriyan
a9caa3de24 Revert "proc: revert /proc/uptime to ->read_proc hook"
This reverts commit 6c87df37dc.

proc files implemented through seq_file do pread(2) now.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-31 01:14:58 +04:00
Alexey Dobriyan
99b7623380 proc 2/2: remove struct proc_dir_entry::owner
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning enything, and release it later resulting
in module refcount underflow.

We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.

But this leaves ->owner as easy-manipulative field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.

->read_proc/->write_proc were just fixed to not require ->owner for
protection.

rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.

Removing ->owner will also make PDE smaller.

So, let's nuke it.

Kudos to Jeff Layton for reminding about this, let's say, oversight.

http://bugzilla.kernel.org/show_bug.cgi?id=12454

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-31 01:14:44 +04:00
Alexey Dobriyan
3dec7f59c3 proc 1/2: do PDE usecounting even for ->read_proc, ->write_proc
struct proc_dir_entry::owner is going to be removed. Now it's only necessary
to protect PDEs which are using ->read_proc, ->write_proc hooks.

However, ->owner assignments are racy and make it very easy for someone to switch
->owner on live PDE (as some subsystems do) without fixing refcounts and so on.

http://bugzilla.kernel.org/show_bug.cgi?id=12454

So, ->owner is on death row.

Proxy file operations exist already (proc_file_operations), just bump usecount
when necessary.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-31 01:14:27 +04:00
Milind Arun Choudhary
09729a9919 proc: fix sparse warnings in pagemap_read()
fs/proc/task_mmu.c:696:12: warning: cast removes address space of expression
fs/proc/task_mmu.c:696:9: warning: incorrect type in assignment (different address spaces)
fs/proc/task_mmu.c:696:9:    expected unsigned long long [noderef] [usertype] <asn:1>*out
fs/proc/task_mmu.c:696:9:    got unsigned long long [usertype] *<noident>
fs/proc/task_mmu.c:697:12: warning: cast removes address space of expression
fs/proc/task_mmu.c:697:9: warning: incorrect type in assignment (different address spaces)
fs/proc/task_mmu.c:697:9:    expected unsigned long long [noderef] [usertype] <asn:1>*end
fs/proc/task_mmu.c:697:9:    got unsigned long long [usertype] *<noident>
fs/proc/task_mmu.c:723:12: warning: cast removes address space of expression
fs/proc/task_mmu.c:723:26: error: subtraction of different types can't work (different address spaces)
fs/proc/task_mmu.c:725:24: error: subtraction of different types can't work (different address spaces)

Signed-off-by: Milind Arun Choudhary <milindchoudhary@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-31 01:14:22 +04:00
Randy Dunlap
1681bc30f2 proc: move fs/proc/inode-alloc.txt comment into a source file
so that people will realize that it exists and can update it as needed.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-31 01:13:12 +04:00
Linus Torvalds
e1c5024828 Merge branch 'reiserfs-updates' from Jeff Mahoney
* reiserfs-updates: (35 commits)
  reiserfs: rename [cn]_* variables
  reiserfs: rename p_._ variables
  reiserfs: rename p_s_tb to tb
  reiserfs: rename p_s_inode to inode
  reiserfs: rename p_s_bh to bh
  reiserfs: rename p_s_sb to sb
  reiserfs: strip trailing whitespace
  reiserfs: cleanup path functions
  reiserfs: factor out buffer_info initialization
  reiserfs: add atomic addition of selinux attributes during inode creation
  reiserfs: use generic readdir for operations across all xattrs
  reiserfs: journaled xattrs
  reiserfs: use generic xattr handlers
  reiserfs: remove i_has_xattr_dir
  reiserfs: make per-inode xattr locking more fine grained
  reiserfs: eliminate per-super xattr lock
  reiserfs: simplify xattr internal file lookups/opens
  reiserfs: Clean up xattrs when REISERFS_FS_XATTR is unset
  reiserfs: remove IS_PRIVATE helpers
  reiserfs: remove link detection code
  ...

Fixed up conflicts manually due to:
 - quota name cleanups vs variable naming changes:
	fs/reiserfs/inode.c
	fs/reiserfs/namei.c
	fs/reiserfs/stree.c
        fs/reiserfs/xattr.c
 - exported include header cleanups
	include/linux/reiserfs_fs.h
2009-03-30 12:33:01 -07:00
Jeff Mahoney
ee93961be1 reiserfs: rename [cn]_* variables
This patch renames n_, c_, etc variables to something more sane.  This
is the sixth in a series of patches to rip out some of the awful
variable naming in reiserfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:40 -07:00
Jeff Mahoney
d68caa9530 reiserfs: rename p_._ variables
This patch is a simple s/p_._//g to the reiserfs code.  This is the
fifth in a series of patches to rip out some of the awful variable
naming in reiserfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:40 -07:00
Jeff Mahoney
a063ae1792 reiserfs: rename p_s_tb to tb
This patch is a simple s/p_s_tb/tb/g to the reiserfs code.  This is the
fourth in a series of patches to rip out some of the awful variable
naming in reiserfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:40 -07:00
Jeff Mahoney
995c762ea4 reiserfs: rename p_s_inode to inode
This patch is a simple s/p_s_inode/inode/g to the reiserfs code.  This
is the third in a series of patches to rip out some of the awful
variable naming in reiserfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:39 -07:00
Jeff Mahoney
ad31a4fc03 reiserfs: rename p_s_bh to bh
This patch is a simple s/p_s_bh/bh/g to the reiserfs code.  This is the
second in a series of patches to rip out some of the awful variable
naming in reiserfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:39 -07:00
Jeff Mahoney
a9dd364358 reiserfs: rename p_s_sb to sb
This patch is a simple s/p_s_sb/sb/g to the reiserfs code.  This is the
first in a series of patches to rip out some of the awful variable
naming in reiserfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:39 -07:00
Jeff Mahoney
0222e6571c reiserfs: strip trailing whitespace
This patch strips trailing whitespace from the reiserfs code.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:39 -07:00
Jeff Mahoney
3cd6dbe6fe reiserfs: cleanup path functions
This patch cleans up some redundancies in the reiserfs tree path code.

decrement_bcount() is essentially the same function as brelse(), so we use
that instead.

decrement_counters_in_path() is exactly the same function as pathrelse(), so
we kill that and use pathrelse() instead.

There's also a bit of cleanup that makes the code a bit more readable.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:39 -07:00
Jeff Mahoney
fba4ebb5f0 reiserfs: factor out buffer_info initialization
This is the first in a series of patches to make balance_leaf() not
quite so insane.

This patch factors out the open coded initializations of buffer_info
structures and defines a few initializers for the 4 cases they're used.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:39 -07:00
Jeff Mahoney
57fe60df62 reiserfs: add atomic addition of selinux attributes during inode creation
Some time ago, some changes were made to make security inode attributes
be atomically written during inode creation.  ReiserFS fell behind in
this area, but with the reworking of the xattr code, it's now fairly
easy to add.

The following patch adds the ability for security attributes to be added
automatically during inode creation.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:39 -07:00
Jeff Mahoney
a41f1a4715 reiserfs: use generic readdir for operations across all xattrs
The current reiserfs xattr implementation open codes reiserfs_readdir
and frees the path before calling the filldir function.  Typically, the
filldir function is something that modifies the file system, such as a
chown or an inode deletion that also require reading of an inode
associated with each direntry.  Since the file system is modified, the
path retained becomes invalid for the next run.  In addition, it runs
backwards in attempt to minimize activity.

This is clearly suboptimal from a code cleanliness perspective as well
as performance-wise.

This patch implements a generic reiserfs_for_each_xattr that uses the
generic readdir and a specific filldir routine that simply populates an
array of dentries and then performs a specific operation on them.  When
all files have been operated on, it then calls the operation on the
directory itself.

The result is a noticable code reduction and better performance.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:38 -07:00
Jeff Mahoney
0ab2621ebd reiserfs: journaled xattrs
Deadlocks are possible in the xattr code between the journal lock and the
xattr sems.

This patch implements journalling for xattr operations. The benefit is
twofold:
 * It gets rid of the deadlock possibility by always ensuring that xattr
   write operations are initiated inside a transaction.
 * It corrects the problem where xattr backing files aren't considered any
   differently than normal files, despite the fact they are metadata.

I discussed the added journal load with Chris Mason, and we decided that
since xattrs (versus other journal activity) is fairly rare, the introduction
of larger transactions to support journaled xattrs wouldn't be too big a deal.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:38 -07:00
Jeff Mahoney
48b32a3553 reiserfs: use generic xattr handlers
Christoph Hellwig had asked me quite some time ago to port the reiserfs
xattrs to the generic xattr interface.

This patch replaces the reiserfs-specific xattr handling code with the
generic struct xattr_handler.

However, since reiserfs doesn't split the prefix and name when accessing
xattrs, it can't leverage generic_{set,get,list,remove}xattr without
needlessly reconstructing the name on the back end.

Update 7/26/07: Added missing dput() to deletion path.
Update 8/30/07: Added missing mark_inode_dirty when i_mode is used to
                represent an ACL and no previous ACL existed.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:38 -07:00
Jeff Mahoney
8ecbe550a1 reiserfs: remove i_has_xattr_dir
With the changes to xattr root locking, the i_has_xattr_dir flag
is no longer needed. This patch removes it.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:38 -07:00
Jeff Mahoney
8b6dd72a44 reiserfs: make per-inode xattr locking more fine grained
The per-inode locking can be made more fine-grained to surround just the
interaction with the filesystem itself.  This really only applies to
protecting reads during a write, since concurrent writes are barred with
inode->i_mutex at the vfs level.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:38 -07:00
Jeff Mahoney
d984561b32 reiserfs: eliminate per-super xattr lock
With the switch to using inode->i_mutex locking during lookups/creation
in the xattr root, the per-super xattr lock is no longer needed.

This patch removes it.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:38 -07:00
Jeff Mahoney
6c17675e1e reiserfs: simplify xattr internal file lookups/opens
The xattr file open/lookup code is needlessly complex.  We can use
vfs-level operations to perform the same work, and also simplify the
locking constraints.  The locking advantages will be exploited in future
patches.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:37 -07:00
Jeff Mahoney
a72bdb1cd2 reiserfs: Clean up xattrs when REISERFS_FS_XATTR is unset
The current reiserfs xattr implementation will not clean up old xattr
files if files are deleted when REISERFS_FS_XATTR is unset.  This
results in inaccessible lost files, wasting space.

This patch compiles in basic xattr knowledge, such as how to delete them
and change ownership for quota tracking.  If the file system has never
used xattrs, then the operation is quite fast: it returns immediately
when it sees there is no .reiserfs_priv directory.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:37 -07:00
Jeff Mahoney
6dfede6963 reiserfs: remove IS_PRIVATE helpers
There are a number of helper functions for marking a reiserfs inode
private that were leftover from reiserfs did its own thing wrt to
private inodes.  S_PRIVATE has been in the kernel for some time, so this
patch removes the helpers and uses IS_PRIVATE instead.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-30 12:16:37 -07:00