move the tx_bottom() from delayed_work to tasklet. It makes the rx
and tx balanced. If the device is in runtime suspend when getting
the tx packet, wakeup the device before trasmitting.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check tx agg list before spin lock to avoid doing spin lock every
times.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use spin_lock and spin_unlock in interrupt context.
The ndo_start_xmit would not be called in interrupt context, so
replace the relative spin_lock_irqsave and spin_unlock_irqrestore
with spin_lock_bh and spin_unlock_bh.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates
This series contains updates to i40e and i40evf.
Most notable are:
Joseph completes the implementation of the ethtool ntuple rule
management interface by adding the get, update and delete interface
reset.
Akeem provides a fix to prevent a possible overflow due to multiplication
of number and size by using kzalloc, so use kcalloc.
Jesse provides an implementation for skb_set_hash() and adds the L4 type
return when we know it is an L4 hash. He also adds a counter to
statistics for Tx timeouts to help users. Lastly he provides a change
to stay away from the cache line where the done bit may be getting
written back for the transmit ring since the hardware may be writing the
whole cache line for a partial update.
Shannon cleans up code comments.
Anjali removes a firmware workaround for newer firmware since the number
of MSIx vectors are being reported correctly.
v2:
- dropped patch 01 of the series based on feedback from the author
Joe Perches and Shannon Nelson.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIVAwUAUxX4KBOxKuMESys7AQJqkRAAmxU1GEGESCE/F/U3Hm9pRFeg9kj6+7BO
7vPrUzB5KEnu4eXjCc60a+BVmV0fJRiS7zsFWzmcKgjIZzJdobbStD944tdz0kIx
w7RFKOr+D6+zyo0zr1Fj0DwlhvuOpfBsisYtNC5J7OTWc3whgnIkroLt16FNFasu
gxe7oo+KHiTXIJuIf6E/p5iK1te7CUsIIc1v01hgVVD91Udk4odWyl288BzzcXuC
5WFYaMnOG+9ndo+4/mCt5dN0FDYL2a1djyknPdXP6HOSBoqVBjPxFGxuN8O9HFMV
ghWjNG72YGDk0jy6ghiQijdoxE4l2iy0/gYGzSiCMSp59aHHlkq/XIhOoZWD96Nw
TAoo0PbTFrqRzfvOQ8lMULDneB0KP20gI/AUbgnU3CGataALntAKiVOhrOtXaG21
trE77omDA5FtMlQ60DabTl6RveMZXCZ+LwHPuZ0Euxo4uykmDLdOChZADGoqZA+5
VzI1Qa4YK+9B/fkgUyiIzS4syCHGISFIIQ7/Srr26UMlpuK4WDPA9a11hKxrbrkA
JxEkAk9i/KD3/XsWWfxEMPgNarwHOyDQQ4/XYKx20O2N9RDrao+cNw6ghlwZS+mo
bmx/sqWmMDaQoMNqL8Q+bq2W7omG8wNgv1pn6MVnFSfiJHAMDhNZkBzp0uCLGFpA
8WYj9OBcUgY=
=PK2i
-----END PGP SIGNATURE-----
Merge tag 'rxrpc-devel-20140304' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
net-next: AF_RXRPC fixes and development
Here are some AF_RXRPC fixes:
(1) Fix to remove incorrect checksum calculation made during recvmsg(). It's
unnecessary to try to do this there since we check the checksum before
reading the RxRPC header from the packet.
(2) Fix to prevent the sending of an ABORT packet in response to another
ABORT packet and inducing a storm.
(3) Fix UDP MTU calculation from parsing ICMP_FRAG_NEEDED packets where we
don't handle the ICMP packet not specifying an MTU size.
And development patches:
(4) Add sysctls for configuring RxRPC parameters, specifically various delays
pertaining to ACK generation, the time before we resend a packet for
which we don't receive an ACK, the maximum time a call is permitted to
live and the amount of time transport, connection and dead call
information is cached.
(5) Improve ACK packet production by adjusting the handling of ACK_REQUESTED
packets, ignoring the MORE_PACKETS flag, delaying the production of
otherwise immediate ACK_IDLE packets and delaying all ACK_IDLE production
(barring the call termination) to half a second.
(6) Add more sysctl parameters to expose the Rx window size, the maximum
packet size that we're willing to receive and the number of jumbo rxrpc
packets we're willing to handle in a single UDP packet.
(7) Request ACKs on alternate DATA packets so that the other side doesn't
wait till we fill up the Tx window.
(8) Use a RCU hash table to look up the rxrpc_call for an incoming packet
rather than stepping through a hierarchy involving several spinlocks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Zoltan Kiss says:
====================
xen-netback: TX grant mapping with SKBTX_DEV_ZEROCOPY instead of copy
A long known problem of the upstream netback implementation that on the TX
path (from guest to Dom0) it copies the whole packet from guest memory into
Dom0. That simply became a bottleneck with 10Gb NICs, and generally it's a
huge perfomance penalty. The classic kernel version of netback used grant
mapping, and to get notified when the page can be unmapped, it used page
destructors. Unfortunately that destructor is not an upstreamable solution.
Ian Campbell's skb fragment destructor patch series [1] tried to solve this
problem, however it seems to be very invasive on the network stack's code,
and therefore haven't progressed very well.
This patch series use SKBTX_DEV_ZEROCOPY flags to tell the stack it needs to
know when the skb is freed up. That is the way KVM solved the same problem,
and based on my initial tests it can do the same for us. Avoiding the extra
copy boosted up TX throughput from 6.8 Gbps to 7.9 (I used a slower AMD
Interlagos box, both Dom0 and guest on upstream kernel, on the same NUMA node,
running iperf 2.0.5, and the remote end was a bare metal box on the same 10Gb
switch)
Based on my investigations the packet get only copied if it is delivered to
Dom0 IP stack through deliver_skb, which is due to this [2] patch. This affects
DomU->Dom0 IP traffic and when Dom0 does routing/NAT for the guest. That's a bit
unfortunate, but luckily it doesn't cause a major regression for this usecase.
In the future we should try to eliminate that copy somehow.
There are a few spinoff tasks which will be addressed in separate patches:
- grant copy the header directly instead of map and memcpy. This should help
us avoiding TLB flushing
- use something else than ballooned pages
- fix grant map to use page->index properly
I've tried to broke it down to smaller patches, with mixed results, so I
welcome suggestions on that part as well:
1: Use skb->cb to store pending_idx
2: Some refactoring
3: Change RX path for mapped SKB fragments (moved here to keep bisectability,
review it after #4)
4: Introduce TX grant mapping
5: Remove old TX grant copy definitons and fix indentations
6: Add stat counters for zerocopy
7: Handle guests with too many frags
8: Timeout packets in RX path
9: Aggregate TX unmap operations
v2: I've fixed some smaller things, see the individual patches. I've added a
few new stat counters, and handling the important use case when an older guest
sends lots of slots. Instead of delayed copy now we timeout packets on the RX
path, based on the assumption that otherwise packets should get stucked
anywhere else. Finally some unmap batching to avoid too much TLB flush
v3: Apart from fixing a few things mentioned in responses the important change
is the use the hypercall directly for grant [un]mapping, therefore we can
avoid m2p override.
v4: Now we are using a new grant mapping API to avoid m2p_override. The RX queue
timeout logic changed also.
v5: Only minor fixes based on Wei's comments
v6: Important bugfixes for xenvif_poll exit path and zerocopy callback, see
first 2 patches. Also rework of handling packets with too many slots, and
reorder the series a bit.
v7: Small fixes in comments/log messages/error paths, and merging the frag
overflow stats patch into its parent.
[1] http://lwn.net/Articles/491522/
[2] https://lkml.org/lkml/2012/7/20/363
====================
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unmapping causes TLB flushing, therefore we should make it in the largest
possible batches. However we shouldn't starve the guest for too long. So if
the guest has space for at least two big packets and we don't have at least a
quarter ring to unmap, delay it for at most 1 milisec.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A malicious or buggy guest can leave its queue filled indefinitely, in which
case qdisc start to queue packets for that VIF. If those packets came from an
another guest, it can block its slots and prevent shutdown. To avoid that, we
make sure the queue is drained in every 10 seconds.
The QDisc queue in worst case takes 3 round to flush usually.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xen network protocol had implicit dependency on MAX_SKB_FRAGS. Netback has to
handle guests sending up to XEN_NETBK_LEGACY_SLOTS_MAX slots. To achieve that:
- create a new skb
- map the leftover slots to its frags (no linear buffer here!)
- chain it to the previous through skb_shinfo(skb)->frag_list
- map them
- copy and coalesce the frags into a brand new one and send it to the stack
- unmap the 2 old skb's pages
It's also introduces new stat counters, which help determine how often the guest
sends a packet with more than MAX_SKB_FRAGS frags.
NOTE: if bisect brought you here, you should apply the series up until
"xen-netback: Timeout packets in RX path", otherwise malicious guests can block
other guests by not releasing their sent packets.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These counters help determine how often the buffers had to be copied. Also
they help find out if packets are leaked, as if "sent != success + fail",
there are probably packets never freed up properly.
NOTE: if bisect brought you here, you should apply the series up until
"xen-netback: Timeout packets in RX path", otherwise Windows guests can't work
properly and malicious guests can block other guests by not releasing their sent
packets.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These became obsolete with grant mapping. I've left intentionally the
indentations in this way, to improve readability of previous patches.
NOTE: if bisect brought you here, you should apply the series up until
"xen-netback: Timeout packets in RX path", otherwise Windows guests can't work
properly and malicious guests can block other guests by not releasing their sent
packets.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces grant mapping on netback TX path. It replaces grant copy
operations, ditching grant copy coalescing along the way. Another solution for
copy coalescing is introduced in "xen-netback: Handle guests with too many
frags", older guests and Windows can broke before that patch applies.
There is a callback (xenvif_zerocopy_callback) from core stack to release the
slots back to the guests when kfree_skb or skb_orphan_frags called. It feeds a
separate dealloc thread, as scheduling NAPI instance from there is inefficient,
therefore we can't do dealloc from the instance.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RX path need to know if the SKB fragments are stored on pages from another
domain.
Logically this patch should be after introducing the grant mapping itself, as
it makes sense only after that. But to keep bisectability, I moved it here. It
shouldn't change any functionality here. xenvif_zerocopy_callback and
ubuf_to_vif are just stubs here, they will be introduced properly later on.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains a few bits of refactoring before introducing the grant
mapping changes:
- introducing xenvif_tx_pending_slots_available(), as this is used several
times, and will be used more often
- rename the thread to vifX.Y-guest-rx, to signify it does RX work from the
guest point of view
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Storing the pending_idx at the first byte of the linear buffer never looked
good, skb->cb is a more proper place for this. It also prevents the header to
be directly grant copied there, and we don't have the pending_idx after we
copied the header here, so it's time to change it.
It also introduces helpers for the RX side
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 3804fad454.
This commit, together with commit 247bf55727
"xhci 1.0: Limit arbitrarily-aligned scatter gather." were
origially added to get xHCI 1.0 hosts and usb ethernet ax88179_178a devices
working together with scatter gather. xHCI 1.0 hosts pose some requirement on how transfer
buffers are aligned, setting this requirement for 1.0 hosts caused USB 3.0 mass
storage devices to fail more frequently.
USB 3.0 mass storage devices used to work before 3.14-rc1. Theoretically,
the TD fragment rules could have caused an occasional disk glitch.
Now the devices *will* fail, instead of theoretically failing.
>From a user perspective, this looks like a regression; the USB device obviously
fails on 3.14-rc1, and may sometimes silently fail on prior kernels.
The proper soluition is to implement the TD fragment rules for xHCI 1.0 hosts,
but for now, revert this patch until scatter gather can be properly supported.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 247bf55727.
This commit, together with commit 3804fad454
"USBNET: ax88179_178a: enable tso if usb host supports sg dma" were
origially added to get xHCI 1.0 hosts and usb ethernet ax88179_178a devices
working together with scatter gather. xHCI 1.0 hosts pose some requirement on how transfer
buffers are aligned, setting this requirement for 1.0 hosts caused USB 3.0 mass
storage devices to fail more frequently.
USB 3.0 mass storage devices used to work before 3.14-rc1. Theoretically,
the TD fragment rules could have caused an occasional disk glitch.
Now the devices *will* fail, instead of theoretically failing.
>From a user perspective, this looks like a regression; the USB device obviously
fails on 3.14-rc1, and may sometimes silently fail on prior kernels.
The proper soluition is to implement the TD fragment rules required, but for now
this patch needs to be reverted to get USB 3.0 mass storage devices working at the
level they used to.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The DELAY_INIT quirk only reduces the frequency of enumeration failures
with the Logitech HD Pro C920 and C930e webcams, but does not quite
eliminate them. We have found that adding a delay of 100ms between the
first and second Get Configuration request makes the device enumerate
perfectly reliable even after several weeks of extensive testing. The
reasons for that are anyone's guess, but since the DELAY_INIT quirk
already delays enumeration by a whole second, wating for another 10th of
that isn't really a big deal for the one other device that uses it, and
it will resolve the problems with these webcams.
Signed-off-by: Julius Werner <jwerner@chromium.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We've encountered a rare issue when enumerating two Logitech webcams
after a reboot that doesn't power cycle the USB ports. They are spewing
random data (possibly some leftover UVC buffers) on the second
(full-sized) Get Configuration request of the enumeration phase. Since
the data is random this can potentially cause all kinds of odd behavior,
and since it occasionally happens multiple times (after the kernel
issues another reset due to the garbled configuration descriptor), it is
not always recoverable. Set the USB_DELAY_INIT quirk that seems to work
around the issue.
Signed-off-by: Julius Werner <jwerner@chromium.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no reason to orphan skb in l2tp.
This breaks things like per socket memory limits, TCP Small queues...
Fix this before more people copy/paste it.
This is very similar to commit 8f646c922d
("vxlan: keep original skb ownership")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Usage of skb->tstamp should remain private to TCP stack
(only set on packets on write queue, not on cloned ones)
Otherwise, packets given to loopback interface with a non null tstamp
can confuse netif_rx() / net_timestamp_check()
Other possibility would be to clear tstamp in loopback_xmit(),
as done in skb_scrub_packet()
Fixes: 740b0f1841 ("tcp: switch rtt estimations to usec resolution")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Via commit 87809942d3 "libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk
for Seagate Momentus SpinPoint M8" we added a quirk for disks named
"ST1000LM024 HN-M101MBB" with firmware revision "2AR10001".
As reported on https://bugzilla.redhat.com/show_bug.cgi?id=1073901,
we need to also add firmware revision 2BA30001 as it is broken as well.
Reported-by: Nicholas <arealityfarbetween@googlemail.com>
Signed-off-by: Michele Baldessari <michele@acksyn.org>
Tested-by: Guilherme Amadio <guilherme.amadio@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
This fixes a subtle issue with cache flush which could potentially cause
random userspace crashes because of stale icache lines.
This error crept in when consolidating the cache flush code
Fixes: bd12976c36 (ARC: cacheflush refactor #3: Unify the {d,i}cache)
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # 3.13
Cc: arc-linux-dev@synopsys.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Just a few device-specific quirks for HD-audio and USB-audio, most of
which are one-liners.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJTGWaKAAoJEGwxgFQ9KSmkgSAP/iTHMKaIt20O0P8L/WvQvk+A
4qHHagktY2Tg2NfyQsyGsH3+w1+kBQRL7a8Jg+tWjMAWcL76dkqbr3ufwrsssOca
UGT4KUOSb62pHn/lH+g/wiupdLjWbNpbAEtcxzpSJNIshhPRD+FJcTJnMRqEbHPk
COpzC93H3fRdVVVInutAbVluGoE1OhmXS3kgrKBmWn/6W4WCdRJsZ4p8WqPnFajo
P7hkwk4XjWC9nflzKwaV7SRQY7tYoUPtyyrLkfTkXt70ffzpaaJdkoTOJOHXhcsU
kBicFhzBkqalI3g25eKXk9p2K1oDz9lhxscxd1k7zP52vCDqWMO3XsQxx/M1ULYL
5m6c8m3XfCctfrIWfw34ooq6iZb803gcPP5/wlaCh6L8UTfkiYuKXMF0eKUxLAIo
mQEVUwLy/f7cAHeaQmUOoCpRR6Qhk0B6k137ZpU5yGAxNJYNOoZAOwjofVLkuwD/
jukDVb2kTWsgfPcaHfWoIxVxQFHqdv4KeIWKxjOFNSJQ23k8kMRzPLGQZzsBasfA
0rPcpd1oKBoRcmMFHPMYKJZoOc5cdnpyEpDS3vcYwKf0K1TcjlKGorO5hOuUyPCi
EDIOwu5qweV6Sr+JYlZSOtcdCk+M3OFXWwXgSCTrmmKi8ccPPcLE/e4eBRv2ACqw
JMLhfFd61i4w5xWSBYQ6
=FPhL
-----END PGP SIGNATURE-----
Merge tag 'sound-3.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Just a few device-specific quirks for HD-audio and USB-audio, most of
which are one-liners"
* tag 'sound-3.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Add quirk for Logitech Webcam C500
ALSA: hda - Use analog beep for Thinkpads with AD1984 codecs
ALSA: hda - Add missing loopback merge path for AD1884/1984 codecs
ALSA: hda - add automute fix for another dell AIO model
ALSA: hda - Added inverted digital-mic handling for Acer TravelMate 8371
Pull drm fixes from Dave Airlie:
"Mostly intel and radeon fixes, one tda998x, one kconfig dep fix and
two more MAINTAINERS updates,
All pretty run of the mill for this stage"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon/atom: select the proper number of lanes in transmitter setup
MAINTAINERS: add maintainer entry for TDA998x driver
drm: fix bochs kconfig dependencies
drm/radeon/dpm: fix typo in EVERGREEN_SMC_FIRMWARE_HEADER_softRegisters
drm/radeon/cik: fix typo in documentation
drm/radeon: silence GCC warning on 32 bit
drm/radeon: resume old pm late
drm/radeon: TTM must be init with cpu-visible VRAM, v2
DRM: armada: fix use of kfifo_put()
drm/i915: Reject >165MHz modes w/ DVI monitors
drm/i915: fix assert_cursor on BDW
drm/i915: vlv: reserve GT power context early
drm/i915: fix pch pci device enumeration
drm/i915: Resolving the memory region conflict for Stolen area
drm/i915: use backlight legacy combination mode also for i915gm/i945gm
MAINTAINERS: update AGP tree to point at drm tree
Pull block fixes from Jens Axboe:
"Small collection of fixes for 3.14-rc. It contains:
- Three minor update to blk-mq from Christoph.
- Reduce number of unaligned (< 4kb) in-flight writes on mtip32xx to
two. From Micron.
- Make the blk-mq CPU notify spinlock raw, since it can't be a
sleeper spinlock on RT. From Mike Galbraith.
- Drop now bogus BUG_ON() for bio iteration with blk integrity. From
Nic Bellinger.
- Properly propagate the SYNC flag on requests. From Shaohua"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq: add REQ_SYNC early
rt,blk,mq: Make blk_mq_cpu_notify_lock a raw spinlock
bio-integrity: Drop bio_integrity_verify BUG_ON in post bip->bip_iter world
blk-mq: support partial I/O completions
blk-mq: merge blk_mq_insert_request and blk_mq_run_request
blk-mq: remove blk_mq_alloc_rq
mtip32xx: Reduce the number of unaligned writes to 2
- Fix chained interrupts, interrupt masking and register offset
calculation for the sunxi driver.
- Make MSM a bool rather than a tristate to stop build problems
to happen - chained interrupt controllers cannot currently be
defined in modules.
- Fix a clock in the PFC driver.
- Fix a kernel panic in the sirf driver.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTF97HAAoJEEEQszewGV1zdgQP/jFi8k5EDOFL7gYuCjeBrvqF
HLk8vQF3HHizfjY2+hHh38FV0pQOE+tuJOmn/zbFb5ZQniQeXjMOT6uQFwKyZsXv
8o6Ct0hPOc0LVId7NbwaL8oQ4tUd5KqFznOEhHzbQo8t9xsMBE5Q5Efo+9PRFCDW
rdWshQy09nUgwXVi1BQBssk1y8Z2aMfiRFpzV0F6M4wI4oo8SPW2xsmmChn+e5wU
EVeyNykfPdHQmc/yFR1GARhHH4ZqhGOkW2+j5Rp4UnhyfEsgeUI9tLknj7e6Ntp4
AJUE/++dLYYR4fYx0VjcddvvW6QgMSNq98JnRJ5gRNnLAi/c5u3Ic4d00gUO+gOj
HQiQiCqyqifzFkkGOfKaUl5D8RGRZHQg3R4NnXgQ2/UrSQfOBd9Q6HPdPj8w6kQ4
13pyj2kqhmPydjTQLvCWzBp5EayLL3w06mfRdfxM4IluMjS2xv8NoYkKsmPEceFA
MbpGD7ubDxBN0LkyESCg/ihNPbar56bYH4MPp/ycaghKmSXx1tcOl793yNnGHL5u
N1JKQ7ZcOTV7dIvgyzmivtGv7Fet0Rb2aGdfCfweBG2uJcKZE++05s2B6I9ZBknx
WvFnXkA/V/Q+lwXxubFNvst5C98SfZQq9BYwWXzD3Nv/9d/zfDmFDPxXFsGc/3MD
Ist/2lEyd+ef6Nrf3rn4
=gGJ6
-----END PGP SIGNATURE-----
Merge tag 'pinctrl-v3.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"This is a set of pin control fixes I have collected over the last few
days. Some have rotated more than others in linux-next, but they were
rebased on v3.14-rc5 due to sloppy commit messages. I am quite
convinced that they are all good fixes that only hit this or that
individual driver and not the entire subsystem.
- Fix chained interrupts, interrupt masking and register offset
calculation for the sunxi driver
- Make MSM a bool rather than a tristate to stop build problems to
happen - chained interrupt controllers cannot currently be defined
in modules
- Fix a clock in the PFC driver
- Fix a kernel panic in the sirf driver"
* tag 'pinctrl-v3.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: sirf: fix kernel panic in gpio_lock_as_irq
pinctrl: sh-pfc: r8a7791: SD1_CLK fix
pinctrl: msm: make PINCTRL_MSM bool instead of tristate
pinctrl: sunxi: Fix interrupt register offset calculation
pinctrl: sunxi: Fix masking when setting irq type
pinctrl: sunxi: use chained_irq_{enter, exit} for GIC compatibility
- Fix compile dependency on Xen ARM to have MMU.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJTF4o9AAoJEFjIrFwIi8fJmXAIAJGwTQaaRtSeVQC3Yd3gxGQL
dfIdH+MdPaLxLDFRuLmD0GG4tEUkPXD0n1MarS+UX5hF3sJQi1DjEPc2EeG0vpFU
KaaHNJ6mD/r6P16Gsx5gwwuJ0y0tJfwX8F6WHODJZE3ryTUMFP3iuWYNYeNpLhn4
fQooIiwxmdN5B9Q7Q0VMfEYgBnWiq6mKdtCrbzeTj0JjLNx91F0/umupCsgcO73z
7MzF9rPkTVWFZB1JHnCSrCtzgJ2eS9bnLHMUiUdm8pzZZxq7zv3TKoqx6uuJQOQw
vnL1OxKLNekbsM5XUthTIzG2J+LdqHKhLQv/TeuR5msBg2LN3En6OiMhFuu9qHY=
=w5ov
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.14-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen fix from Konrad Rzeszutek Wilk:
"This has exactly one patch for Xen ARM. It sets the dependency to
compile the kernel with MMU enabled - otherwise - the guest won't work
very well"
* tag 'stable/for-linus-3.14-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
ARM: XEN depends on having a MMU
This has been a relatively long-standing issue that wasn't nailed down
until Teng-Feng Yang's meticulous bug report to dm-devel on 3/7/2014,
see: http://www.redhat.com/archives/dm-devel/2014-March/msg00021.html
From that report:
"When decreasing the reference count of a metadata block with its
reference count equals 3, we will call dm_btree_remove() to remove
this enrty from the B+tree which keeps the reference count info in
metadata device.
The B+tree will try to rebalance the entry of the child nodes in each
node it traversed, and the rebalance process contains the following
steps.
(1) Finding the corresponding children in current node (shadow_current(s))
(2) Shadow the children block (issue BOP_INC)
(3) redistribute keys among children, and free children if necessary (issue BOP_DEC)
Since the update of a metadata block's reference count could be
recursive, we will stash these reference count update operations in
smm->uncommitted and then process them in a FILO fashion.
The problem is that step(3) could free the children which is created
in step(2), so the BOP_DEC issued in step(3) will be carried out
before the BOP_INC issued in step(2) since these BOPs will be
processed in FILO fashion. Once the BOP_DEC from step(3) tries to
decrease the reference count of newly shadow block, it will report
failure for its reference equals 0 before decreasing. It looks like we
can solve this issue by processing these BOPs in a FIFO fashion
instead of FILO."
Commit 5b564d80 ("dm space map: disallow decrementing a reference count
below zero") changed the code to report an error for this temporary
refcount decrement below zero. So what was previously a harmless
invalid refcount became a hard failure due to the new error path:
device-mapper: space map common: unable to decrement a reference count below 0
device-mapper: thin: 253:6: dm_thin_insert_block() failed: error = -22
device-mapper: thin: 253:6: switching pool to read-only mode
This bug is in dm persistent-data code that is common to the DM thin and
cache targets. So any users of those targets should apply this fix.
Fix this by applying recursive space map operations in FIFO order rather
than FILO.
Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=68801
Reported-by: Apollon Oikonomopoulos <apoikos@debian.org>
Reported-by: edwillam1007@gmail.com
Reported-by: Teng-Feng Yang <shinrairis@gmail.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.13+
PREPARE_[DELAYED_]WORK() are being phased out. They have few users
and a nasty surprise in terms of reentrancy guarantee as workqueue
considers work items to be different if they don't have the same work
function.
firewire core-device and sbp2 have been been multiplexing work items
with multiple work functions. Introduce fw_device_workfn() and
sbp2_lu_workfn() which invoke fw_device->workfn and
sbp2_logical_unit->workfn respectively and always use the two
functions as the work functions and update the users to set the
->workfn fields instead of overriding work functions using
PREPARE_DELAYED_WORK().
This fixes a variety of possible regressions since a2c1c57be8
"workqueue: consider work function when searching for busy work items"
due to which fw_workqueue lost its required non-reentrancy property.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: linux1394-devel@lists.sourceforge.net
Cc: stable@vger.kernel.org # v3.9+
Cc: stable@vger.kernel.org # v3.8.2+
Cc: stable@vger.kernel.org # v3.4.60+
Cc: stable@vger.kernel.org # v3.2.40+
Add REQ_SYNC early, so rq_dispatched[] in blk_mq_rq_ctx_init
is set correctly.
Signed-off-by: Shaohua Li<shli@fusionio.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The hash set type is very broken and was never meant to be merged in this
state. Missing RCU synchronization on element removal, leaking chain
refcounts when used as a verdict map, races during lookups, a fixed table
size are probably just some of the problems. Luckily it is currently
never chosen by the kernel when the rbtree type is also available.
Rewrite it to be usable.
The new implementation supports automatic hash table resizing using RCU,
based on Paul McKenney's and Josh Triplett's algorithm "Optimized Resizing
For RCU-Protected Hash Tables" described in [1].
Resizing doesn't require a second list head in the elements, it works by
chosing a hash function that remaps elements to a predictable set of buckets,
only resizing by integral factors and
- during expansion: linking new buckets to the old bucket that contains
elements for any of the new buckets, thereby creating imprecise chains,
then incrementally seperating the elements until the new buckets only
contain elements that hash directly to them.
- during shrinking: linking the hash chains of all old buckets that hash
to the same new bucket to form a single chain.
Expansion requires at most the number of elements in the longest hash chain
grace periods, shrinking requires a single grace period.
Due to the requirement of having hash chains/elements linked to multiple
buckets during resizing, homemade single linked lists are used instead of
the existing list helpers, that don't support this in a clean fashion.
As a side effect, the amount of memory required per element is reduced by
one pointer.
Expansion is triggered when the load factors exceeds 75%, shrinking when
the load factor goes below 30%. Both operations are allowed to fail and
will be retried on the next insertion or removal if their respective
conditions still hold.
[1] http://dl.acm.org/citation.cfm?id=2002181.2002192
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
nf_conntrack_lock is a monolithic lock and suffers from huge contention
on current generation servers (8 or more core/threads).
Perf locking congestion is clear on base kernel:
- 72.56% ksoftirqd/6 [kernel.kallsyms] [k] _raw_spin_lock_bh
- _raw_spin_lock_bh
+ 25.33% init_conntrack
+ 24.86% nf_ct_delete_from_lists
+ 24.62% __nf_conntrack_confirm
+ 24.38% destroy_conntrack
+ 0.70% tcp_packet
+ 2.21% ksoftirqd/6 [kernel.kallsyms] [k] fib_table_lookup
+ 1.15% ksoftirqd/6 [kernel.kallsyms] [k] __slab_free
+ 0.77% ksoftirqd/6 [kernel.kallsyms] [k] inet_getpeer
+ 0.70% ksoftirqd/6 [nf_conntrack] [k] nf_ct_delete
+ 0.55% ksoftirqd/6 [ip_tables] [k] ipt_do_table
This patch change conntrack locking and provides a huge performance
improvement. SYN-flood attack tested on a 24-core E5-2695v2(ES) with
10Gbit/s ixgbe (with tool trafgen):
Base kernel: 810.405 new conntrack/sec
After patch: 2.233.876 new conntrack/sec
Notice other floods attack (SYN+ACK or ACK) can easily be deflected using:
# iptables -A INPUT -m state --state INVALID -j DROP
# sysctl -w net/netfilter/nf_conntrack_tcp_loose=0
Use an array of hashed spinlocks to protect insertions/deletions of
conntracks into the hash table. 1024 spinlocks seem to give good
results, at minimal cost (4KB memory). Due to lockdep max depth,
1024 becomes 8 if CONFIG_LOCKDEP=y
The hash resize is a bit tricky, because we need to take all locks in
the array. A seqcount_t is used to synchronize the hash table users
with the resizing process.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Netfilter expectations are protected with the same lock as conntrack
entries (nf_conntrack_lock). This patch split out expectations locking
to use it's own lock (nf_conntrack_expect_lock).
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Preparation for disconnecting the nf_conntrack_lock from the
expectations code. Once the nf_conntrack_lock is lifted, a race
condition is exposed.
The expectations master conntrack exp->master, can race with
delete operations, as the refcnt increment happens too late in
init_conntrack(). Race is against other CPUs invoking
->destroy() (destroy_conntrack()), or nf_ct_delete() (via timeout
or early_drop()).
Avoid this race in nf_ct_find_expectation() by using atomic_inc_not_zero(),
and checking if nf_ct_is_dying() (path via nf_ct_delete()).
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
One spinlock per cpu to protect dying/unconfirmed/template special lists.
(These lists are now per cpu, a bit like the untracked ct)
Add a @cpu field to nf_conn, to make sure we hold the appropriate
spinlock at removal time.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Changes while reading through the netfilter code.
Added hint about how conntrack nf_conn refcnt is accessed.
And renamed repl_hash to reply_hash for readability
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Via Simon Horman:
====================
* Whitespace cleanup spotted by checkpatch.pl from Tingwei Liu.
* Section conflict cleanup, basically removal of one wrong __read_mostly,
from Andi Kleen.
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Additionally to have the second (data) bitrate available the data bitrate
has to be greater or equal to the arbitration bitrate in CAN FD.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The configuration for CAN FD depends on CAN_CTRLMODE_FD enabled in the driver
specific ctrlmode_supported capabilities.
The configuration can be done either with the 'fd { on | off }' option in the
'ip' tool from iproute2 or by setting the CAN netdevice MTU to CAN_MTU (16) or
to CANFD_MTU (72).
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
As CAN FD offers a second bitrate for the data section of the CAN frame the
infrastructure for storing and configuring this second bitrate is introduced.
Improved the readability of the if-statement by inserting some newlines.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
As the bittiming calculation functions are to be used with different
bittiming_const structures for CAN and CAN FD the direct reference to
priv->bittiming_const inside these functions has to be removed.
Also moved the check for existing bittiming const to one place.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch moves a sanity check in order to have a second user for CAN FD.
Also simplify the return value generation in can_get_bittiming() as only
correct return values of can_[calc|fixup]_bittiming() lead to a return value of
zero.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
When setting the bitrate both can_calc_bittiming() and can_fixup_bittiming()
lead to the bitrate variable to be set, when a proper bit timing is available.
Only then the bitrate configuration is stored for the device, so checking for
priv->bittiming.bitrate is always sufficient.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The skbuff protocol value was formerly fixed/sanitized to ETH_P_CAN in
can_put_echo_skb(). With CAN FD this value has to be preserved.
This patch changes the hard assignment of the protocol value to a check of
valid protocol values for CAN and CAN FD.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch converts the dev_<level> printing to netdev_<level>, this makes it
possible to remove the "struct device *dev" pointer from the "struct
ican3_dev".
Cc: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Lenovo IdeaPad 410Y with ALC282 codec makes loud click noises at boot
and shutdown. Also, it wrongly misdetects the acpi_thinkpad hook.
This patch adds a device-specific fixup for disabling the shutup
callback that is the cause of the click noise and also avoiding the
thinpad_helper calls.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=71511
Reported-and-tested-by: Guilherme Amadio <guilherme.amadio@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
iproute2 already defines a structure with that name, let's use another one to
avoid any conflict.
CC: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>