Add a proper prototype for hugetlb_get_unmapped_area() in
include/linux/hugetlb.h.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We attempted to insert new nodes into the tree by just using
rb_replace_node to let them replace an earlier node which they
completely overlapped. However, that could place the new node into the
wrong place in the tree, since its start could be node only before the
start of the victim, but before the node _before_ the victim in the tree
(if that previous node actually ends _after_ the new node, thus isn't
entirely overlapped and wasn't itself chosen to be the victim).
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Add NFS lock support to GFS2.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Rewrite nlmsvc_lock() to use the asynchronous interface.
As with testlock, we answer nlm requests in nlmsvc_lock by first looking up
the block and then using the results we find in the block if B_QUEUED is
set, and calling vfs_lock_file() otherwise.
If this a new lock request and we get -EINPROGRESS return on a non-blocking
request then we defer the request.
Also modify nlmsvc_unlock() to call the filesystem method if appropriate.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Normally we could skip ever having to allocate a block in the case where
the client asks for a non-blocking lock, or asks for a blocking lock that
succeeds immediately.
However we're going to want to always look up a block first in order to
check whether we're revisiting a deferred lock call, and to be prepared to
handle the case where the filesystem returns -EINPROGRESS--in that case we
want to make sure the lock we've given the filesystem is the one embedded
in the block that we'll use to track the deferred request.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Rewrite nlmsvc_testlock() to use the new asynchronous interface: instead of
immediately doing a posix_test_lock(), we first look for a matching block.
If the subsequent test_lock returns anything other than -EINPROGRESS, we
then remove the block we've found and return the results.
If it returns -EINPROGRESS, then we defer the lock request.
In the case where the block we find in the first step has B_QUEUED set,
we bypass the vfs_test_lock entirely, instead using the block to decide how
to respond:
with nlm_lck_denied if B_TIMED_OUT is set.
with nlm_granted if B_GOT_CALLBACK is set.
by dropping if neither B_TIMED_OUT nor B_GOT_CALLBACK is set
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Change NLM internal interface to pass more information for test lock; we
need this to make sure the cookie information is pushed down to the place
where we do request deferral, which is handled for testlock by the
following patch.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Add code to handle file system callback when the lock is finally granted.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
We need to keep some state for a pending asynchronous lock request, so this
patch adds that state to struct nlm_block.
This also adds a function which defers the request, by calling
rqstp->rq_chandle.defer and storing the resulting deferred request in a
nlm_block structure which we insert into lockd's global block list. That
new function isn't called yet, so it's dead code until a later patch.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acquiring a lock on a cluster filesystem may require communication with
remote hosts, and to avoid blocking lockd or nfsd threads during such
communication, we allow the results to be returned asynchronously.
When a ->lock() call needs to block, the file system will return
-EINPROGRESS, and then later return the results with a call to the
routine in the fl_grant field of the lock_manager_operations struct.
This differs from the case when ->lock returns -EAGAIN to a blocking
lock request; in that case, the filesystem calls fl_notify when the lock
is granted, and the caller retries the original lock. So while
fl_notify is merely a hint to the caller that it should retry, fl_grant
actually communicates the final result of the lock operation (with the
lock already acquired in the succesful case).
Therefore fl_grant takes a lock, a status and, for the test lock case, a
conflicting lock. We also allow fl_grant to return an error to the
filesystem, to handle the case where the fl_grant requests arrives after
the lock manager has already given up waiting for it.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Convert NFSv4 to the new lock interface. We don't define any callback for now,
so we're not taking advantage of the asynchronous feature--that's less critical
for the multi-threaded nfsd then it is for the single-threaded lockd. But this
does allow a cluster filesystems to export cluster-coherent locking to NFS.
Note that it's cluster filesystems that are the issue--of the filesystems that
define lock methods (nfs, cifs, etc.), most are not exportable by nfsd.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Lock managers need to be able to cancel pending lock requests. In the case
where the exported filesystem manages its own locks, it's not sufficient just
to call posix_unblock_lock(); we need to let the filesystem know what's
happening too.
We do this by adding a new fcntl lock command: FL_CANCELLK. Some day this
might also be made available to userspace applications that could benefit from
an asynchronous locking api.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
The nfsv4 protocol's lock operation, in the case of a conflict, returns
information about the conflicting lock.
It's unclear how clients can use this, so for now we're not going so far as to
add a filesystem method that can return a conflicting lock, but we may as well
return something in the local case when it's easy to.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Factor out the code that switches between generic and filesystem-specific lock
methods; eventually we want to call this from lock managers (lockd and nfsd)
too; currently they only call the generic methods.
This patch does that for all the setlk code.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Factor out the code that switches between generic and filesystem-specific lock
methods; eventually we want to call this from lock managers (lockd and nfsd)
too; currently they only call the generic methods.
This patch does that for test_lock.
Note that this hasn't been necessary until recently, because the few
filesystems that define ->lock() (nfs, cifs...) aren't exportable via NFS.
However GFS (and, in the future, other cluster filesystems) need to implement
their own locking to get cluster-coherent locking, and also want to be able to
export locking to NFS (lockd and NFSv4).
So we accomplish this by factoring out code such as this and exporting it for
the use of lockd and nfsd.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
posix_test_lock() and ->lock() do the same job but have gratuitously
different interfaces. Modify posix_test_lock() so the two agree,
simplifying some code in the process.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
The file_lock argument to ->lock is used to return the conflicting lock
when found. There's no reason for the filesystem to return any private
information with this conflicting lock, but nfsv4 is.
Fix nfsv4 client, and modify locks.c to stop calling fl_release_private
for it in this case.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Cc: "Trond Myklebust" <Trond.Myklebust@netapp.com>"
The original code would remember, during the first pass over the tree,
a suitable place to start the insertion from when we eventually come
to add a new node.
The optimisation was broken, and we sometimes ended up inserting a new
node in the wrong place because we started the insertion from the wrong
point.
Just ditch the optimisation and start the insertion from the root of the
tree, for now. I'll try it again when I'm feeling cleverer.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] Fix typo in cifs readme from previous commit
[CIFS] Make sec=none force an anonymous mount
[CIFS] Change semaphore to mutex for cifs lock_sem
[CIFS] Fix oops in reset_cifs_unix_caps on reconnect
[CIFS] UID/GID override on CIFS mounts to Samba
[CIFS] prefixpath mounts to servers supporting posix paths used wrong slash
[CIFS] Update cifs version to 1.49
[CIFS] Replace kmalloc/memset combination with kzalloc
[CIFS] Add IPv6 support
[CIFS] New CIFS POSIX mkdir performance improvement (part 2)
[CIFS] New CIFS POSIX mkdir performance improvement
[CIFS] Add write perm for usr to file on windows should remove r/o dos attr
[CIFS] Remove unnecessary parm to cifs_reopen_file
[CIFS] Switch cifsd to kthread_run from kernel_thread
[CIFS] Remove unnecessary checks
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (231 commits)
[PATCH] i386: Don't delete cpu_devs data to identify different x86 types in late_initcall
[PATCH] i386: type may be unused
[PATCH] i386: Some additional chipset register values validation.
[PATCH] i386: Add missing !X86_PAE dependincy to the 2G/2G split.
[PATCH] x86-64: Don't exclude asm-offsets.c in Documentation/dontdiff
[PATCH] i386: avoid redundant preempt_disable in __unlazy_fpu
[PATCH] i386: white space fixes in i387.h
[PATCH] i386: Drop noisy e820 debugging printks
[PATCH] x86-64: Fix allnoconfig error in genapic_flat.c
[PATCH] x86-64: Shut up warnings for vfat compat ioctls on other file systems
[PATCH] x86-64: Share identical video.S between i386 and x86-64
[PATCH] x86-64: Remove CONFIG_REORDER
[PATCH] x86-64: Print type and size correctly for unknown compat ioctls
[PATCH] i386: Remove copy_*_user BUG_ONs for (size < 0)
[PATCH] i386: Little cleanups in smpboot.c
[PATCH] x86-64: Don't enable NUMA for a single node in K8 NUMA scanning
[PATCH] x86: Use RDTSCP for synchronous get_cycles if possible
[PATCH] i386: Add X86_FEATURE_RDTSCP
[PATCH] i386: Implement X86_FEATURE_SYNC_RDTSC on i386
[PATCH] i386: Implement alternative_io for i386
...
Fix up trivial conflict in include/linux/highmem.h manually.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's possible for a journal I/O request to be added to the log_redrive
queue and the jfsIO thread to be awakened after the thread releases
log_redrive_lock but before it sets its state to TASK_INTERRUPTIBLE.
The jfsIO thread should set the state before giving up the spinlock, so
the waking thread will really wake it.
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
This fixes a problem Artem found with the integck test tool -- we
weren't correctly keeping track of the 'overlap' flag in some cases,
which led to the nodes being played back in an incorrect order and file
corruption.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
After flushing the last page of an eraseblock, don't leave the
wbuf 'offset' field pointing at the start of the next physical
eraseblock. This was causing a BUG() on NOR-ECC (Sibley) flash, where
we start writing a little further in, after the cleanmarker.
Debugged by Alexander Belyakov <abelyako@googlemail.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
ocfs2: Force use of GFP_NOFS in ocfs2_write()
ocfs2: fix sparse warnings in fs/ocfs2/cluster
ocfs2: fix sparse warnings in fs/ocfs2/dlm
ocfs2: fix sparse warnings in fs/ocfs2
[PATCH] Copy i_flags to ocfs2 inode flags on write
[PATCH] ocfs2: use __set_current_state()
ocfs2: Wrap access of directory allocations with ip_alloc_sem.
[PATCH] fs/ocfs2/: make 3 functions static
ocfs2: Implement compat_ioctl()
We had a customer report that attempting to make CIFS mount with a null
username (i.e. doing an anonymous mount) doesn't work. Looking through the
code, it looks like CIFS expects a NULL username from userspace in order
to trigger an anonymous mount. The mount.cifs code doesn't seem to ever
pass a null username to the kernel, however.
It looks also like the kernel can take a sec=none option, but it only seems
to look at it if the username is already NULL. This seems redundant and
effectively makes sec=none useless.
The following patch makes sec=none force an anonymous mount.
Signed-off-by: Steve French <sfrench@us.ibm.com>
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (28 commits)
NFS: Fix a compile glitch on 64-bit systems
NFS: Clean up nfs_create_request comments
spkm3: initialize hash
spkm3: remove bad kfree, unnecessary export
spkm3: fix spkm3's use of hmac
NFS4: invalidate cached acl on setacl
NFS: Fix directory caching problem - with test case and patch.
NFS: Set meaningful value for fattr->time_start in readdirplus results.
NFS: Added support to turn off the NFSv3 READDIRPLUS RPC.
SUNRPC: RPC client should retry with different versions of rpcbind
SUNRPC: remove old portmapper
NFS: switch NFSROOT to use new rpcbind client
SUNRPC: switch the RPC server to use the new rpcbind registration API
SUNRPC: switch socket-based RPC transports to use rpcbind
SUNRPC: introduce rpcbind: replacement for in-kernel portmapper
SUNRPC: Eliminate side effects from rpc_malloc
SUNRPC: RPC buffer size estimates are too large
NLM: Shrink the maximum request size of NLM4 requests
NFS: Use pgoff_t in structures and functions that pass page cache offsets
NFS: Clean up nfs_sync_mapping_wait()
...
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (49 commits)
[SCTP]: Set assoc_id correctly during INIT collision.
[SCTP]: Re-order SCTP initializations to avoid race with sctp_rcv()
[SCTP]: Fix the SO_REUSEADDR handling to be similar to TCP.
[SCTP]: Verify all destination ports in sctp_connectx.
[XFRM] SPD info TLV aggregation
[XFRM] SAD info TLV aggregationx
[AF_RXRPC]: Sort out MTU handling.
[AF_IUCV/IUCV] : Add missing section annotations
[AF_IUCV]: Implementation of a skb backlog queue
[NETLINK]: Remove bogus BUG_ON
[IPV6]: Some cleanups in include/net/ipv6.h
[TCP]: zero out rx_opt in tcp_disconnect()
[BNX2]: Fix TSO problem with small MSS.
[NET]: Rework dev_base via list_head (v3)
[TCP] Highspeed: Limited slow-start is nowadays in tcp_slow_start
[BNX2]: Update version and reldate.
[BNX2]: Print bus information for PCIE devices.
[BNX2]: Add 1-shot MSI handler for 5709.
[BNX2]: Restructure PHY event handling.
[BNX2]: Add indirect spinlock.
...
fs/nfs/pagelist.c:226: error: conflicting types for 'nfs_pageio_init'
include/linux/nfs_page.h:80: error: previous declaration of 'nfs_pageio_init' was here
Thanks to Andrew for spotting this...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cleanup of dev_base list use, with the aim to simplify making device
list per-namespace. In almost every occasion, use of dev_base variable
and dev->next pointer could be easily replaced by for_each_netdev
loop. A few most complicated places were converted to using
first_netdev()/next_netdev().
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adjust the new netdevice scanning code provided by Patrick McHardy:
(1) Restore the function banner comments that were dropped.
(2) Rather than using an array size of 6 in some places and an array size of
ETH_ALEN in others, pass a pointer instead and pass the array size
through so that we can actually check it.
(3) Do the buffer fill count check before checking the for_primary_ifa
condition again. This permits us to skip that check should maxbufs be
reached before we run out of interfaces.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the large and complicated rtnetlink client by two simple
functions for getting the MAC address for the first ethernet device
and building a list of IPv4 addresses.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The interface array is not freed on exit.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make miscellaneous fixes to AFS and AF_RXRPC:
(*) Make AF_RXRPC select KEYS rather than RXKAD or AFS_FS in Kconfig.
(*) Don't use FS_BINARY_MOUNTDATA.
(*) Remove a done 'TODO' item in a comemnt on afs_get_sb().
(*) Don't pass a void * as the page pointer argument of kmap_atomic() as this
breaks on m68k. Patch from Geert Uytterhoeven <geert@linux-m68k.org>.
(*) Use match_*() functions rather than doing my own parsing.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Originally at http://lkml.org/lkml/2006/9/2/86
The recent change to "allow Windows blocking locks to be cancelled via a
CANCEL_LOCK call" introduced a new semaphore in struct cifsFileInfo,
lock_sem. However, semaphores used as mutexes are deprecated these days,
and there's no reason to add a new one to the kernel. Therefore, convert
lock_sem to a struct mutex (and also fix one indentation glitch on one of
the lines changed anyway).
Signed-off-by: Roland Dreier <roland@digitalvampire.org>
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
We need to work on cleaning up the relationship between kobjects, ksets and
ktypes. The removal of 'struct subsystem' is the first step of this,
especially as it is not really needed at all.
Thanks to Kay for fixing the bugs in this patch.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix sysfs printk format warning:
fs/sysfs/bin.c:62: warning: format '%d' expects type 'int', but argument 4 has type 'size_t'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Propagate flags such as S_APPEND, S_IMMUTABLE, etc. from i_flags into
ocfs2-specific ip_attr. Hence, when someone sets these flags via a different
interface than ioctl, they are stored correctly.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
use __set_current_state(TASK_*) instead of current->state = TASK_*, in
fs/ocfs2
Signed-off-by: Milind Arun Choudhary <milindchoudhary@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
OCFS2_I(inode)->ip_alloc_sem is a read-write semaphore protecting
local concurrent access of ocfs2 inodes. However, ocfs2 directories were
not taking the semaphore while they accessed or modified the allocation
tree.
ocfs2_extend_dir() needs to take the semaphore in a write mode when it
adds to the allocation. All other directory users get there via
ocfs2_bread(), which takes the semaphore in read mode.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
This patch makes the following needlessly global functions static:
- aops.c: ocfs2_write_data_page()
- dlmglue.c: ocfs2_dump_meta_lvb_info()
- file.c: ocfs2_set_inode_size()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
vfat implements compat handlers for these ioctls, but when they
were executed on other file systems the kernel would still complain
about an unknown compat ioctl. Just declare them as compatible
and let them be rejected when not needed by the normal path.
This makes wine runs a lot quieter
Signed-off-by: Andi Kleen <ak@suse.de>