This is a patch for discussion addressing some receive buffer growing issues.
This is partially related to the thread "Possible BUG in IPv4 TCP window
handling..." last week.
Specifically it addresses the problem of an interaction between rcvbuf
moderation (receiver autotuning) and rcv_ssthresh. The problem occurs when
sending small packets to a receiver with a larger MTU. (A very common case I
have is a host with a 1500 byte MTU sending to a host with a 9k MTU.) In
such a case, the rcv_ssthresh code is targeting a window size corresponding
to filling up the current rcvbuf, not taking into account that the new rcvbuf
moderation may increase the rcvbuf size.
One hunk makes rcv_ssthresh use tcp_rmem[2] as the size target rather than
rcvbuf. The other changes the behavior when it overflows its memory bounds
with in-order data so that it tries to grow rcvbuf (the same as with
out-of-order data).
These changes should help my problem of mixed MTUs, and should also help the
case from last week's thread I think. (In both cases though you still need
tcp_rmem[2] to be set much larger than the TCP window.) One question is if
this is too aggressive at trying to increase rcvbuf if it's under memory
stress.
Orignally-from: John Heffner <jheffner@psc.edu>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is an updated version of the RFC3465 ABC patch originally
for Linux 2.6.11-rc4 by Yee-Ting Li. ABC is a way of counting
bytes ack'd rather than packets when updating congestion control.
The orignal ABC described in the RFC applied to a Reno style
algorithm. For advanced congestion control there is little
change after leaving slow start.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move all the code that does linear TCP slowstart to one
inline function to ease later patch to add ABC support.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simplify the code that comuputes microsecond rtt estimate used
by TCP Vegas. Move the callback out of the RTT sampler and into
the end of the ack cleanup.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP peformance with TSO over networks with delay is awful.
On a 100Mbit link with 150ms delay, we get 4Mbits/sec with TSO and
50Mbits/sec without TSO.
The problem is with TSO, we intentionally do not keep the maximum
number of packets in flight to fill the window, we hold out to until
we can send a MSS chunk. But, we also don't update the congestion window
unless we have filled, as per RFC2861.
This patch replaces the check for the congestion window being full
with something smarter that accounts for TSO.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
After much testing and agony, I've discovered that my previous ohci1394
quirk for Toshiba laptops is not 100% reliable. It apparently fails to
do the interrupt line change either correctly or in time, since in about
2 out of 5 boots, the kernel's irqdebug code will *still* disable irq 11
when the ohci1394 driver is loaded (at pci_enable_device time I think).
This patch switches things around a little in the workaround. First, it
removes the mdelay. I didn't see a need for it and my testing has shown
that it's not necessary for the quirk to work.
Secondly, instead of trying to change the interrupt line to what ACPI
tells us it should be, this patch makes the quirk use the value in the
PCI_INTERRUPT_LINE register. On this laptop at least, that seems to be
the right thing to do, though additional testing on other laptops and/or
with actual firewire devices would be appreciated.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
MSI hardcoded delivery mode to use logical delivery mode. Recently
x86_64 moved to use physical mode addressing to support physflat mode.
With this mode enabled noticed that my eth with MSI werent working.
msi_address_init() was hardcoded to use logical mode for i386 and x86_64.
So when we switch to use physical mode, things stopped working.
Since anyway we dont use lowest priority delivery with MSI, its always
directed to just a single CPU. Its safe and simpler to use
physical mode always, even when we use logical delivery mode for IPI's
or other ioapic RTE's.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch contains the following cleanups:
- access.c should #include "pci.h" for getting the prototypes of it's
global functions
- hotplug/shpchp_pci.c: make the needlessly global function
program_fw_provided_values() static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
pci_ids cleanup: fixup bt87x.c: two macro defined IDs missed in prior cleanup.
Caught by Chun-Chung Chen <cjj@u.washington.edu>: "In the patch for bt87x.c,
you seemed have missed the two occurrences of BT_DEVICE on line 897 and
line 898."
Signed-off-by: Grant Coady <gcoady@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch contains the driver bits for enabling DLPAR and PCI Hotplug
for the new OF-based PCI probe. This functionality was regressed when
the new PCI approach was introduced. Please apply if appropriate.
Signed-off-by: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A nice feature of sysfs is that it can create the symlink from the
driver to the module that is contained in it.
It requires that the device_driver.owner is set, what is not the
case for many PCI drivers.
This patch allows pci_register_driver to set automatically the
device_driver.owner for any PCI driver.
Credits to Al Viro who suggested the method.
Signed-off-by: Laurent Riffard <laurent.riffard@free.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
--
drivers/ide/setup-pci.c | 12 +++++++-----
drivers/pci/pci-driver.c | 9 +++++----
include/linux/ide.h | 3 ++-
include/linux/pci.h | 10 ++++++++--
4 files changed, 22 insertions(+), 12 deletions(-)
store_new_id() should not be (and cannot be) inline;
the function pointer is stored in a device_attribute table.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Move the PPC fixup for old NCR 810 controllers to generic quirks -
it's needed for Alpha, x86 and other architectures that use
setup-bus.c.
Thanks to Jay Estabrook for pointing out the issue.
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The quirk names for VIA 686 are mistyped in 2.6.14 (686 vs 868). S3 868
influence? :) Here is a patch to correct them.
Signed-off-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The current pciehp implementation reports a power-fail error
even if the condition has cleared by the time the corresponding
interrupt handling code gets a chance to run. This patch
fixes this problem.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch further tweaks how we request control of hotplug
controller hardware from BIOS. We first search the ACPI namespace
corresponding to a specific hotplug controller looking for an
_OSC or OSHP method. On failure, we successively move to the
ACPI parent object, till we hit the highest level host bridge
in the hierarchy. This allows for different types of BIOS's
which place the _OSC/OSHP methods at various places in the acpi
namespace, while still not encroaching on the namespace of
some other root level host bridge.
This patch also introduces a new load time option (pciehp_force)
that allows us to bypass all _OSC/OSHP checking. Not supporting
these methods seems to be be the most common ACPI firmware problem
we've run into. This will still _not_ allow the pciehp driver to
work correctly if the BIOS really doesn't support pciehp (i.e. if
it doesn't generate a hotplug interrupt). Use this option with
caution. Some BIOS's may deliberately not build any _OSC/OSHP
methods to make sure it retains control the hotplug hardware.
Using the pciehp_force parameter for such systems can lead to
two separate entities trying to control the same hardware.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch tweaks the way pciehp requests control of the hotplug
hardware from BIOS. It now tries to invoke the ACPI _OSC method
for a specific hotplug controller only, rather than walking the
entire acpi namespace invoking all possible _OSC methods under
all host bridges. This allows us to gain control of each hotplug
controller individually, even if BIOS fails to give us control of
some other hotplug controller in the system.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Reduce the number of debug messages generated if pciehp debug is
enabled. I tried to restrict this to removing debug messages that
are either early-driver-debug type messages, or print information
that can be inferred through other debug prints.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
State information is currently stored in per-slot as well as
per-pci-function data structures in pciehp. There's a lot of
overlap in the information kept, and some of it is never used.
This patch consolidates the state information to per-slot and
eliminates unused data structures. The biggest change is to
eliminate the pci_func structure and the code around managing
its lists.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Reduce the PCI Express hotplug driver's dependence on ACPI.
We don't walk the acpi namespace anymore to build a list of
bridges and devices. We go to ACPI only to run the _OSC or
_OSHP methods to transition control of hotplug hardware from
system BIOS to the hotplug driver, and to run the _HPP
method to get hotplug device parameters like cache line size,
latency timer and SERR/PERR enable from BIOS.
Note that one of the side effects of this patch is that pciehp
does not automatically enable the hot-added device or its DMA
bus mastering capability now. It expects the device driver to
do that. This may break some drivers and we will have to fix
them as they are reported.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch converts the pci express hotplug controller driver
to use the PCI core for resource management. This eliminates a
lot of duplicated code and integrates pciehp with the system's
normal PCI handling code.
Signed-off-by: Rajesh Shah <rajesh.shah@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Some devices have more than one capability of the same type. For
example, the PCI header for the PathScale InfiniPath looks like:
04:01.0 InfiniBand: Unknown device 1fc1:000d (rev 02)
Subsystem: Unknown device 1fc1:000d
Flags: bus master, fast devsel, latency 0, IRQ 193
Memory at fea00000 (64-bit, non-prefetchable) [size=2M]
Capabilities: [c0] HyperTransport: Slave or Primary Interface
Capabilities: [f8] HyperTransport: Interrupt Discovery and Configuration
There are _two_ HyperTransport capabilities, and the PathScale driver
wants to look at both of them.
The current pci_find_capability() API doesn't work for this, since it
only allows us to get to the first capability of a given type. The
patch below introduces a new pci_find_next_capability(), which can be
used in a loop like
for (pos = pci_find_capability(pdev, <ID>);
pos;
pos = pci_find_next_capability(pdev, pos, <ID>)) {
/* ... */
}
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I just hit a page allocation error on a kernel configured to support
64 CPUs. It spewed 60 completely useless unnecessary lines of info.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch is a first go at some documentation. Please advise if gmail
has mangled patch and I will revert to an attachment:
Signed-off-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The protocol field in ethernet headers is big-endian and should be
annotated as such. This patch allows detection of missing ntohs() calls
on the ethernet protocol field when sparse is run with __CHECK_ENDIAN__
defined.
This is a revised version that includes <linux/types.h> so that the
userspace programs are not confused by __be16. Thanks to David S.
Miller.
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is the patch that introduces the generic skb_checksum_complete
which also checks for hardware RX checksum faults. If that happens,
it'll call netdev_rx_csum_fault which currently prints out a stack
trace with the device name. In future it can turn off RX checksum.
I've converted every spot under net/ that does RX checksum checks to
use skb_checksum_complete or __skb_checksum_complete with the
exceptions of:
* Those places where checksums are done bit by bit. These will call
netdev_rx_csum_fault directly.
* The following have not been completely checked/converted:
ipmr
ip_vs
netfilter
dccp
This patch is based on patches and suggestions from Stephen Hemminger
and David S. Miller.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the superfluous parameter checking in bnx2_{get,set}_eeprom.
The parameters are already validated in ethtool_{get,set}_eeprom.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check return of dev_alloc_skb in bnx2_test_loopback, and handle
appropriately.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Output driver name as prefix to "Unknown flash/EEPROM type." message.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I have to revert the recent addition of -imacros to the Makefile to get my
tool chain to build. Without the change, below, I get:
Note that this looks entirely like a toolchain bug. Here is the offending command:
[pid 12163] execve("/usr/lib/gcc-lib/i386-redhat-linux/3.2.2/tradcpp0", ["/usr/lib/gcc-lib/i386-redhat-linux/3.2.2/tradcpp0", "-lang-asm", "-nostdinc", "-Iinclude", "-Iinclude/asm-i386/mach-default", "-D__GNUC__=3", "-D__GNUC_MINOR__=2", "-D__GNUC_PATCHLEVEL__=2", "-D__GXX_ABI_VERSION=102", "-D__ELF__", "-Dunix", "-D__gnu_linux__", "-Dlinux", "-D__ELF__", "-D__unix__", "-D__gnu_linux__", "-D__linux__", "-D__unix", "-D__linux", "-Asystem=posix", "-D__NO_INLINE__", "-D__STDC_HOSTED__=1", "-Acpu=i386", "-Amachine=i386", "-Di386", "-D__i386", "-D__i386__", "-D__tune_i386__", "-D__KERNEL__", "-D__ASSEMBLY__", "-isystem", "/usr/lib/gcc-lib/i386-redhat-linux/3.2.2/include", "-imacros", "include/linux/autoconf.h", "-MD", "arch/i386/kernel/.entry.o.d", "arch/i386/kernel/entry.S", "-o", "/tmp/ccOlsFJR.s"]
Which should execute properly, I think. But it does not:
zach-dev:linux-2.6.14-zach-work $ make
CHK include/linux/version.h
CHK include/linux/compile.h
CHK usr/initramfs_list
AS arch/i386/kernel/entry.o
/usr/lib/gcc-lib/i386-redhat-linux/3.2.2/tradcpp0: output filename specified twice
make[1]: *** [arch/i386/kernel/entry.o] Error 1
make: *** [arch/i386/kernel] Error 2
gcc (GCC) 3.2.2 20030222 (Red Hat Linux 3.2.2-5)
Deprecating the -imacros fixes the build for me. It does not appear to be a
simple argument overflow problem in trapcpp0, since deprecating all the defines
reproduces the problem as well. Also, switching -imacros to -include fixes the
problem.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
XPC (as in arch/ia64/sn/kernel/xp*) has a need to notify other partitions
(SGI Altix) whenever a partition is going down in order to get them to
disengage from accessing the halting partition's memory. If this is not
done before the reset of the hardware, the other partitions can find
themselves encountering MCAs that bring them down.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>