This patch series is based on top of "Swap-over-NBD without deadlocking
v15" as it depends on the same reservation of PF_MEMALLOC reserves logic.
When a user or administrator requires swap for their application, they
create a swap partition and file, format it with mkswap and activate it
with swapon. In diskless systems this is not an option so if swap if
required then swapping over the network is considered. The two likely
scenarios are when blade servers are used as part of a cluster where the
form factor or maintenance costs do not allow the use of disks and thin
clients.
The Linux Terminal Server Project recommends the use of the Network Block
Device (NBD) for swap but this is not always an option. There is no
guarantee that the network attached storage (NAS) device is running Linux
or supports NBD. However, it is likely that it supports NFS so there are
users that want support for swapping over NFS despite any performance
concern. Some distributions currently carry patches that support swapping
over NFS but it would be preferable to support it in the mainline kernel.
Patch 1 avoids a stream-specific deadlock that potentially affects TCP.
Patch 2 is a small modification to SELinux to avoid using PFMEMALLOC
reserves.
Patch 3 adds three helpers for filesystems to handle swap cache pages.
For example, page_file_mapping() returns page->mapping for
file-backed pages and the address_space of the underlying
swap file for swap cache pages.
Patch 4 adds two address_space_operations to allow a filesystem
to pin all metadata relevant to a swapfile in memory. Upon
successful activation, the swapfile is marked SWP_FILE and
the address space operation ->direct_IO is used for writing
and ->readpage for reading in swap pages.
Patch 5 notes that patch 3 is bolting
filesystem-specific-swapfile-support onto the side and that
the default handlers have different information to what
is available to the filesystem. This patch refactors the
code so that there are generic handlers for each of the new
address_space operations.
Patch 6 adds an API to allow a vector of kernel addresses to be
translated to struct pages and pinned for IO.
Patch 7 adds support for using highmem pages for swap by kmapping
the pages before calling the direct_IO handler.
Patch 8 updates NFS to use the helpers from patch 3 where necessary.
Patch 9 avoids setting PF_private on PG_swapcache pages within NFS.
Patch 10 implements the new swapfile-related address_space operations
for NFS and teaches the direct IO handler how to manage
kernel addresses.
Patch 11 prevents page allocator recursions in NFS by using GFP_NOIO
where appropriate.
Patch 12 fixes a NULL pointer dereference that occurs when using
swap-over-NFS.
With the patches applied, it is possible to mount a swapfile that is on an
NFS filesystem. Swap performance is not great with a swap stress test
taking roughly twice as long to complete than if the swap device was
backed by NBD.
This patch: netvm: prevent a stream-specific deadlock
It could happen that all !SOCK_MEMALLOC sockets have buffered so much data
that we're over the global rmem limit. This will prevent SOCK_MEMALLOC
buffers from receiving data, which will prevent userspace from running,
which is needed to reduce the buffered data.
Fix this by exempting the SOCK_MEMALLOC sockets from the rmem limit. Once
this change it applied, it is important that sockets that set
SOCK_MEMALLOC do not clear the flag until the socket is being torn down.
If this happens, a warning is generated and the tokens reclaimed to avoid
accounting errors until the bug is fixed.
[davem@davemloft.net: Warning about clearing SOCK_MEMALLOC]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
unregister_netdevice_notifier() must be called before
unregister_pernet_subsys() to avoid accessing already freed
pernet memory. This fixes the following oops when doing rmmod:
Call Trace:
[<ffffffffa0f802bd>] caif_device_notify+0x4d/0x5a0 [caif]
[<ffffffff81552ba9>] unregister_netdevice_notifier+0xb9/0x100
[<ffffffffa0f86dcc>] caif_device_exit+0x1c/0x250 [caif]
[<ffffffff810e7734>] sys_delete_module+0x1a4/0x300
[<ffffffff810da82d>] ? trace_hardirqs_on_caller+0x15d/0x1e0
[<ffffffff813517de>] ? trace_hardirqs_on_thunk+0x3a/0x3
[<ffffffff81696bad>] system_call_fastpath+0x1a/0x1f
RIP
[<ffffffffa0f7f561>] caif_get+0x51/0xb0 [caif]
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/caif/caif_hsi.c
drivers/net/usb/qmi_wwan.c
The qmi_wwan merge was trivial.
The caif_hsi.c, on the other hand, was not. It's a conflict between
1c385f1fdf ("caif-hsi: Replace platform
device with ops structure.") in the net-next tree and commit
39abbaef19 ("caif-hsi: Postpone init of
HIS until open()") in the net tree.
I did my best with that one and will ask Sjur to check it out.
Signed-off-by: David S. Miller <davem@davemloft.net>
Rearranged the allocation and packet creations to
avoid potential leaks in error path.
Signed-off-by: Kim Lilliestierna <kim.xx.lilliestierna@stericsson.com>
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericssion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add check on NULL return from caif_get().
Signed-off-by: Kim Lilliestierna <Kim.xx.Lilliestierna@stericsson.com>
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericssion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removed surplus call to caif_device_list() in caif_dev.c
Signed-off-by: Kim Lilliestierna <kim.xx.lilliestierna@stericsson.com>
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clear caif sockets's shutdown mask at (re)connect.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull trivial updates from Jiri Kosina:
"As usual, it's mostly typo fixes, redundant code elimination and some
documentation updates."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (57 commits)
edac, mips: don't change code that has been removed in edac/mips tree
xtensa: Change mail addresses of Hannes Weiner and Oskar Schirmer
lib: Change mail address of Oskar Schirmer
net: Change mail address of Oskar Schirmer
arm/m68k: Change mail address of Sebastian Hess
i2c: Change mail address of Oskar Schirmer
net: Fix tcp_build_and_update_options comment in struct tcp_sock
atomic64_32.h: fix parameter naming mismatch
Kconfig: replace "--- help ---" with "---help---"
c2port: fix bogus Kconfig "default no"
edac: Fix spelling errors.
qla1280: Remove redundant NULL check before release_firmware() call
remoteproc: remove redundant NULL check before release_firmware()
qla2xxx: Remove redundant NULL check before release_firmware() call.
aic94xx: Get rid of redundant NULL check before release_firmware() call
tehuti: delete redundant NULL check before release_firmware()
qlogic: get rid of a redundant test for NULL before call to release_firmware()
bna: remove redundant NULL test before release_firmware()
tg3: remove redundant NULL test before release_firmware() call
typhoon: get rid of redundant conditional before all to release_firmware()
...
Standardize the net core ratelimited logging functions.
Coalesce formats, align arguments.
Change a printk then vprintk sequence to use printf extension %pV.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are three Kconfig entries with "--- help ---" attributes, and over
2000 Kconfig entries with "---help---" attributes. Apparently the three
attributes with embedded spaces are valid. Still, I see little reason
for using this obscure variant. And replacing those three attributes
with the common variant makes grepping Kconfig files for help texts a
bit easier too.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Use of "unsigned int" is preferred to bare "unsigned" in net tree.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set traffic class for CAIF packets, based on socket
priority, CAIF protocol type, or type of message.
Traffic class mapping for different packet types:
- control: TC_PRIO_CONTROL;
- flow control: TC_PRIO_CONTROL;
- at: TC_PRIO_CONTROL;
- rfm: TC_PRIO_INTERACTIVE_BULK;
- other sockets: equals to socket's TC;
- network data: no change.
Signed-off-by: Dmitry Tarnyagin <dmitry.tarnyagin@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull in the 'net' tree to get CAIF bug fixes upon which
the following set of CAIF feature patches depend.
Signed-off-by: David S. Miller <davem@davemloft.net>
Added kfree_skb() calls in the chnk_net.c file on
the error paths.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These macros contain a hidden goto, and are thus extremely error
prone and make code hard to audit.
Signed-off-by: David S. Miller <davem@davemloft.net>
Connection ID configured through RTNL must allow zero as
connection id. If connection-id is not given when creating the
interface, configure a loopback interface using ifindex as
connection-id.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kill faulty checks on flow-off leading to connection drop at race conditions.
caif_socket checks for flow-on before transmitting and goes to sleep or
return -EAGAIN upon flow stop. Remove faulty subsequent checks on flow-off
leading to connection drop. Also fix memory leaks on some of the errors paths.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"priv" is initialized twice. I kept the second one, because it is next
to the check for NULL.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Count dropped packets in CAIF Netdevice.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kill off the debug-fs exposed varaibles from caif_socket.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SKB is freed twice upon send error. The Network stack consumes SKB even
when it returns error code.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Always use cfmuxl_remove_uplayer when removing a up-layer.
cfmuxl_ctrlcmd() can be called independently and in parallel with
cfmuxl_remove_uplayer(). The race between them could cause list_del_rcu
to be called on a node which has been already taken out from the list.
That lead to a (rare) crash on accessing poisoned node->prev inside
list_del_rcu.
This fix ensures that deletion are done holding the same lock.
Reported-by: Dmitry Tarnyagin <dmitry.tarnyagin@stericsson.com>
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
caif is a subsystem and as such it needs to register with
register_pernet_subsys instead of register_pernet_device.
Among other problems using register_pernet_device was resulting in
net_generic being called before the caif_net structure was allocated.
Which has been causing net_generic to fail with either BUG_ON's or by
return NULL pointers.
A more ugly problem that could be caused is packets in flight why the
subsystem is shutting down.
To remove confusion also remove the cruft cause by inappropriately
trying to fix this bug.
With the aid of the previous patch I have tested this patch and
confirmed that using register_pernet_subsys makes the failure go away as
it should.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove WARN_ON and bad handling of SKB without destructor callback
in caif_flow_cb. SKB without destructor cannot be handled as an
error case.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix typo for the Vendor/Product Id for ST-Ericsson CAIF modems.
Discovery is based on fixed USB vendor 0x04cc (ST-Ericsson),
product-id 0x230f (NCM).
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: Remove irqsafe_cpu_xxx variants
Fix up conflict in arch/x86/include/asm/percpu.h due to clash with
cebef5beed ("x86: Fix and improve percpu_cmpxchg{8,16}b_double()")
which edited the (now removed) irqsafe_cpu_cmpxchg*_double code.
We simply say that regular this_cpu use must be safe regardless of
preemption and interrupt state. That has no material change for x86
and s390 implementations of this_cpu operations. However, arches that
do not provide their own implementation for this_cpu operations will
now get code generated that disables interrupts instead of preemption.
-tj: This is part of on-going percpu API cleanup. For detailed
discussion of the subject, please refer to the following thread.
http://thread.gmane.org/gmane.linux.kernel/1222078
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <alpine.DEB.2.00.1112221154380.11787@router.home>
DaveM said:
Please, this kind of stuff rots forever and not using bool properly
drives me crazy.
Joe Perches <joe@perches.com> gave me the spatch script:
@@
bool b;
@@
-b = 0
+b = false
@@
bool b;
@@
-b = 1
+b = true
I merely installed coccinelle, read the documentation and took credit.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
BUG_ON is too strict in a number of circumstances,
use WARN_ON instead. Protocol errors should not halt the system.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix bad assert on fragment size triggering false positive.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds functionality for avoiding orphaning SKB too early.
The original skb is stashed away and the original destructor is called
from the hi-jacked flow-on callback. If CAIF interface goes down and a
hi-jacked SKB exists, the original skb->destructor is restored.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Flow control is implemented by inspecting the qdisc queue length
in order to detect potential overflow on the TX queue. When a threshold
is reached flow-off is sent upwards in the CAIF stack. At the same time
the skb->destructor is hi-jacked by orphaning the SKB and the original
destructor is replaced with a "flow-on" callback. When the "hi-jacked"
SKB is consumed the queue should be empty, and the "flow-on" callback
is called and xon is sent upwards in the CAIF stack.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NCM 1.0 does not support anything but Ethernet framing, hence
CAIF payload will be put into Ethernet frames.
Discovery is based on fixed USB vendor 0x04cc (ST-Ericsson),
product-id 0x230f (NCM). In this variant only CAIF payload is sent over
the NCM interface.
The CAIF stack (cfusbl.c) will when USB interface register first check if
we got a CDC NCM USB interface with the right VID, PID.
It will then read the device's Ethernet address and create a 'template'
Ethernet TX header, using a broadcast address as the destination address,
and EthType 0x88b5 (802.1 Local Experimental - vendor specific).
A protocol handler for 0x88b5 is setup for reception of CAIF frames from
the CDC NCM USB interface.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unused enum cfcnfg_phy_type and the parameter to cfserl_create.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enrolling CAIF link layers are refactored.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow NULL pointer in cfpkt_extr_head in order to
skip past header data.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "tmp" variable here is used to store the result of cpu_to_le16()
so it should be an __le16 instead of an int. We want the high bits
set and the current code works on little endian systems but not on
big endian systems.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These files are non modular, but need to export symbols using
the macros now living in export.h -- call out the include so
that things won't break when we remove the implicit presence
of module.h from everywhere.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
With calls to modular infrastructure, these files really
needs the full module.h header. Call it out so some of the
cleanups of implicit and unrequired includes elsewhere can be
cleaned up.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
The caif code will register its own pernet_operations, and then register
a netdevice_notifier. Each time the netdevice_notifier is triggered,
it'll do some stuff... including a lookup of its own pernet stuff with
net_generic().
If the net_generic() call ever returns NULL, the caif code will BUG().
That doesn't seem *so* unreasonable, I suppose — it does seem like it
should never happen.
However, it *does* happen. When we clone a network namespace,
setup_net() runs through all the pernet_operations one at a time. It
gets to loopback before it gets to caif. And loopback_net_init()
registers a netdevice... while caif hasn't been initialised. So the caif
netdevice notifier triggers, and immediately goes BUG().
We could imagine a complex and overengineered solution to this generic
class of problems, but this patch takes the simple approach. It just
makes caif_device_notify() *not* go looking for its pernet data
structures if the device it's being notified about isn't a caif device
in the first place.
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The allocation of "phyinfo" wasn't checked, and also the allocation
wasn't freed on error paths. Sjur Brændeland pointed out as well
that "phy_driver" should be freed on the error path too.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit bd30ce4bc0 (caif: Use RCU instead of spin-lock in caif_dev.c)
added a potential NULL dereference in case alloc_percpu() fails.
caif_device_alloc() can also use GFP_KERNEL instead of GFP_ATOMIC.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Sjur Brændeland <sjur.brandeland@stericsson.com>
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove per site OOM messages because they duplicate
the generic mm subsystem OOM message.
Use kzalloc instead of kmalloc/memset
when next to the OOM message removals.
Reduces object size (allyesconfig ~2%)
$ size -t drivers/net/caif/built-in.o.old net/caif/built-in.o.old
text data bss dec hex filename
32297 700 8224 41221 a105 drivers/net/caif/built-in.o.old
72159 1317 20552 94028 16f4c net/caif/built-in.o.old
104456 2017 28776 135249 21051 (TOTALS)
$ size -t drivers/net/caif/built-in.o.new net/caif/built-in.o.new
text data bss dec hex filename
31975 700 8184 40859 9f9b drivers/net/caif/built-in.o.new
70748 1317 20152 92217 16839 net/caif/built-in.o.new
102723 2017 28336 133076 207d4 (TOTALS)
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When assigning a NULL value to an RCU protected pointer, no barrier
is needed. The rcu_assign_pointer, used to handle that but will soon
change to not handle the special case.
Convert all rcu_assign_pointer of NULL value.
//smpl
@@ expression P; @@
- rcu_assign_pointer(P, NULL)
+ RCU_INIT_POINTER(P, NULL)
// </smpl>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>