Commit Graph

553 Commits

Author SHA1 Message Date
Michael Ellerman
db5e8718ea [PATCH] iseries_veth: Fix bogus counting of TX errors
There's a number of problems with the way iseries_veth counts TX errors.

Firstly it counts conditions which aren't really errors as TX errors. This
includes if we don't have a connection struct for the other LPAR, or if the
other LPAR is currently down (or just doesn't want to talk to us). Neither
of these should count as TX errors.

Secondly, it counts one TX error for each LPAR that fails to accept the packet.
This can lead to TX error counts higher than the total number of packets sent
through the interface. This is confusing for users.

This patch fixes that behaviour. The non-error conditions are no longer
counted, and we introduce a new and I think saner meaning to the TX counts.

If a packet is successfully transmitted to any LPAR then it is transmitted
and tx_packets is incremented by 1.

If there is an error transmitting a packet to any LPAR then that is counted
as one error, ie. tx_errors is incremented by 1.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:42:45 -04:00
Michael Ellerman
e0808494ff [PATCH] iseries_veth: Simplify full-queue handling
The iseries_veth driver often has multiple netdevices sending packets over
a single connection to another LPAR. If the bandwidth to the other LPAR is
exceeded, all the netdevices must have their queues stopped.

The current code achieves this by queueing one incoming skb on the
per-netdevice port structure. When the connection is able to send more packets
we iterate through the port structs and flush any packet that is queued,
as well as restarting the associated netdevice's queue.

This arrangement makes less sense now that we have per-connection TX timers,
rather than the per-netdevice generic TX timer.

The new code simply detects when one of the connections is full, and stops
the queue of all associated netdevices. Then when a packet is acked on that
connection (ie. there is space again) all the queues are woken up.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:42:45 -04:00
Michael Ellerman
24562ffa8b [PATCH] iseries_veth: Add a per-connection ack timer
Currently the iseries_veth driver contravenes the specification in
Documentation/networking/driver.txt, in that if packets are not acked by
the other LPAR they will sit around forever.

This patch adds a per-connection timer which fires if we've had no acks for
five seconds. This is superior to the generic TX timer because it catches
the case of a small number of packets being sent and never acked.

This fixes a bug we were seeing on real systems, where some IPv6 neighbour
discovery packets would not be acked and then prevent the module from being
removed, due to skbs lying around.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:42:45 -04:00
Michael Ellerman
48683d72f8 [PATCH] iseries_veth: Remove TX timeout code
The iseries_veth driver uses the generic TX timeout watchdog, however a better
solution is in the works, so remove this code.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:39:43 -04:00
Michael Ellerman
f0c129caa3 [PATCH] iseries_veth: Use kobjects to track lifecycle of connection structs
The iseries_veth driver can attach to multiple vlans, which correspond to
multiple net devices. However there is only 1 connection between each LPAR,
so the connection structure may be shared by multiple net devices.

This makes module removal messy, because we can't deallocate the connections
until we know there are no net devices still using them. The solution is to
use ref counts on the connections, so we can delete them (actually stop) as
soon as the ref count hits zero.

This patch fixes (part of) a bug we were seeing with IPv6 sending probes to
a dead LPAR, which would then hang us forever due to leftover skbs.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:39:43 -04:00
Michael Ellerman
ec60beebed [PATCH] iseries_veth: Make init_connection() & destroy_connection() symmetrical
This patch makes veth_init_connection() and veth_destroy_connection()
symmetrical in that they allocate/deallocate the same data.

Currently if there's an error while initialising connections (ie. ENOMEM)
we call veth_module_cleanup(), however this will oops because we call
driver_unregister() before we've called driver_register(). I've never seen
this actually happen though.

So instead we explicitly call veth_destroy_connection() for each connection,
any that have been set up will be deallocated.

We also fix a potential leak if vio_register_driver() fails.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:39:43 -04:00
Michael Ellerman
cbf9074cc3 [PATCH] iseries_veth: Only call dma_unmap_single() if dma_map_single() succeeded
The iseries_veth driver unconditionally calls dma_unmap_single() even
when the corresponding dma_map_single() may have failed.

Rework the code a bit to keep the return value from dma_unmap_single()
around, and then check if it's a dma_mapping_error() before we do
the dma_unmap_single().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:39:43 -04:00
Michael Ellerman
b08bd5c0a3 [PATCH] iseries_veth: Replace lock-protected atomic with an ordinary variable
The iseries_veth driver uses atomic ops to manipulate the in_use field of
one of its per-connection structures. However all references to the
flag occur while the connection's lock is held, so the atomic ops aren't
necessary.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:39:43 -04:00
Michael Ellerman
d7893ddd1b [PATCH] iseries_veth: Remove redundant message stack lock
The iseries_veth driver keeps a stack of messages for each connection
and a lock to protect the stack. However there is also a per-connection lock
which makes the message stack lock redundant.

Remove the message stack lock and document the fact that callers of the
stack-manipulation functions must hold the connection's lock.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:37:57 -04:00
Michael Ellerman
2a5391a122 [PATCH] iseries_veth: Fix broken promiscuous handling
Due to a logic bug, once promiscuous mode is enabled in the iseries_veth
driver it is never disabled.

The driver keeps two flags, promiscuous and all_mcast which have exactly the
same effect. This is because we only ever receive packets destined for us,
or multicast packets. So consolidate them into one promiscuous flag for
simplicity.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:37:57 -04:00
Michael Ellerman
58c5900bda [PATCH] iseries_veth: Try to avoid pathological reset behaviour
The iseries_veth driver contains a state machine which is used to manage
how connections are setup and neogotiated between LPARs.

If one side of a connection resets for some reason, the two LPARs can get
stuck in a race to re-setup the connection. This can lead to the connection
being declared dead by one or both ends. In practice the connection is
declared dead by one or both ends approximately 8/10 times a connection is
reset, although it is rare for connections to be reset.

(an example here: http://michael.ellerman.id.au/files/misc/veth-trace.html)

The core of the problem is that the end that resets the connection doesn't
wait for the other end to become aware of the reset. So the resetting end
starts setting the connection back up, and then receives a reset from the
other end (which is the response to the initial reset). And so on.

We're severely limited in what we can do to fix this. The protocol between
LPARs is essentially fixed, as we have to interoperate with both OS/400
and old Linux drivers. Which also means we need a fix that only changes the
code on one end.

The only fix I've found given that, is to just blindly sleep for a bit when
resetting the connection, in the hope that the other end will get itself
sorted.  Needless to say I'd love it if someone has a better idea.

This does work, I've so far been unable to get it to break, whereas without
the fix a reset of one end will lead to a dead connection ~8/10 times.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:37:56 -04:00
Michael Ellerman
abfda4719c [PATCH] iseries_veth: Remove a FIXME WRT deletion of the ack_timer
The iseries_veth driver has a timer which we use to send acks. When the
connection is reset or stopped we need to delete the timer.

Currently we only call del_timer() when resetting a connection, which means
the timer might run again while the connection is being re-setup. As it turns
out that's ok, because the flags the timer consults have been reset.

It's cleaner though to call del_timer_sync() once we've dropped the lock,
although the timer may still run between us dropping the lock and calling
del_timer_sync(), but as above that's ok.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:37:56 -04:00
Michael Ellerman
61a3c69661 [PATCH] iseries_veth: Cleanup error and debug messages
Currently the iseries_veth driver prints the file name and line number in its
error messages. This isn't very useful for most users, so just print
"iseries_veth: message" instead.

 - convert uses of veth_printk() to veth_debug()/veth_error()/veth_info()
 - make terminology consistent, ie. always refer to LPAR not lpar
 - be consistent about printing return codes as %d not %x
 - make format strings fit in 80 columns

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-31 22:37:56 -04:00
Jeff Garzik
afc7097f45 [netdrvr de2104x] store PCI bus addresses in unsigned long
BZ# 4475.
2005-08-31 06:11:16 -04:00
Jeff Garzik
1a44935840 [netdrvr tulip] new PCI ID
Noted in BZ# 2960.
2005-08-31 05:48:59 -04:00
Jeff Garzik
ed735ccbef Merge HEAD from /spare/repo/linux-2.6/.git 2005-08-30 13:32:29 -04:00
Linus Torvalds
1fdab81e67 Merge refs/heads/upstream-fixes from master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-08-29 22:05:21 -07:00
Linus Torvalds
8bc2bee26b Merge HEAD from master.kernel.org:/pub/scm/linux/kernel/git/paulus/ppc64-2.6 2005-08-29 21:44:33 -07:00
Andrew Morton
7ef24b69f9 [PATCH] s2io build fix
Damir Perisa <damir.perisa@solnet.ch> reports:

 drivers/net/s2io.h:765: error: invalid lvalue in assignment
 drivers/net/s2io.h:766: error: invalid lvalue in assignment

That's a gcc4 error.  I don't see why the casts are there anyway..

Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-30 00:37:35 -04:00
Stephen Rothwell
fb120da678 [PATCH] Make MODULE_DEVICE_TABLE work for vio devices
Make MODULE_DEVICE_TABLE work for vio devices.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-08-30 13:31:56 +10:00
Michael Chan
73eef4cddb [BNX2]: update version and minor fixes
Update version and add 4 minor fixes, the last 2 were suggested by
Jeff Garzik:

1. check for a valid ethernet address before setting it
2. zero out bp->regview if init_one encounters an error and unmaps
   the IO address. This prevents remove_one from unmapping again.
3. use netif_rx_schedule() instead of hand coding the same.
4. use IRQ_HANDLED and IRQ_NONE.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:10:48 -07:00
Michael Chan
c770a65cee [BNX2]: change irq locks to bh locks
Change all locks from spin_lock_irqsave() to spin_lock_bh(). All
places that require spinlocks are in BH context.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:10:43 -07:00
Michael Chan
e89bbf1049 [BNX2]: remove atomics in tx
Remove atomic operations in the fast tx path. Expensive atomic
operations were used to keep track of the number of available tx
descriptors. The new code uses the difference between the consumer
and producer index to determine the number of free tx descriptors.

As suggested by Jeff Garzik, the name of the inline function is
changed to all lower case.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:10:38 -07:00
Michael Chan
cd339a0ed6 [BNX2]: speedup serdes linkup
This speeds up link-up time on 5706 SerDes if the link partner does
not autoneg, a rather common scenario in blade servers. Some blade
servers use IPMI for keyboard input and it's important to minimize
link disruptions.

The speedup is achieved by shortening the timer to (HZ / 3) during
the transient period right after initiating a SerDes autoneg. If
autoneg does not complete, parallel detect can be done sooner. After
the transient period is over, the timer goes back to its normal HZ
interval.

As suggested by Jeff Garzik, the timer initialization is moved to
bnx2_init_board() from bnx2_open().

An eeprom bit is also added to allow default forced SerDes speed for
even faster link-up time.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:10:34 -07:00
Michael Chan
afdc08b9f9 [BNX2]: Fix rtnl deadlock in bnx2_close
This fixes an rtnl deadlock problem when flush_scheduled_work() is
called from bnx2_close(). In rare cases, linkwatch_event() may be on
the workqueue from a previous close of a different device and it will
try to get the rtnl lock which is already held by dev_close().

The fix is to set a flag if we are in the reset task which is run
from the workqueue. bnx2_close() will loop until the flag is cleared.
As suggested by Jeff Garzik, the loop is changed to call msleep(1)
instead of yield() in the original patch.

flush_scheduled_work() is also moved to bnx2_remove_one() before the
netdev is freed.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 16:10:29 -07:00
Peter Hagervall
14ab9b867a [BNX2]: Possible sparse fixes, take two
This patch contains the following possible cleanups/fixes:

- use C99 struct initializers
- make a few arrays and structs static
- remove a few uses of literal 0 as NULL pointer
- use convenience function instead of cast+dereference in bnx2_ioctl()
- remove superfluous casts to u8 * in calls to readl/writel

Signed-off-by: Peter Hagervall <hager@cs.umu.se>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:56:43 -07:00
Michael Chan
087fe256f0 [TG3]: Fix bug in setting a tg3_flag
Found a bug while reviewing the patches the second time.

The TG3_FLAG_TXD_MBOX_HWBUG flag is set after the register access
methods have been determined. This patch fixes it by moving it up before
the various access methods are assigned.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:50:49 -07:00
Michael Chan
15f5a585c6 [TG3]: Eliminate one register write in tg3_restart_ints()
The register write to register 0x68 to restart interrupts is unnecessary
as the interrupt wasn't masked in that register by the irq handler. This
will save one register write in the fast path.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:50:45 -07:00
Michael Chan
6892914fb7 [TG3]: Add indirect register method for 5703 behind ICH
This patch adds the new workaround for 5703 A1/A2 if it is behind
certain ICH bridges. The workaround disables memory and uses config.
cycles only to access all registers. The 5702/03 chips can mistakenly
decode the special cycles from the ICH chipsets as memory write cycles,
causing corruption of register and memory space. Only certain ICH
bridges will drive special cycles with non-zero data during the address
phase which can fall within the 5703's address range. This is not an ICH
bug as the PCI spec allows non-zero address during special cycles.
However, only these ICH bridges are known to drive non-zero addresses
during special cycles.

The indirect_lock is also changed to spin_lock_irqsave from spin_lock_bh
because it is used in irq handler when using the indirect method to
disable interrupts.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:50:42 -07:00
Michael Chan
09ee929ccc [TG3]: Add mailbox read method
This patch adds the mailbox read method and also adds an inline function
tw32_mailbox_f() for mailbox writes that require read flush.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:50:38 -07:00
Michael Chan
1ee582d8e4 [TG3]: Add various register methods
This patch adds various dedicated register read/write methods for the
existing workarounds, including PCIX target workaround, write with read
flush, etc. The chips that require these workarounds will use these
dedicated access functions.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:50:15 -07:00
Michael Chan
2009493065 [TG3]: Add basic register access function pointers
This patch adds the basic function pointers to do register accesses in
the fast path. This was suggested by David Miller. The idea is that
various register access methods for different hardware errata can easily
be implemented with these function pointers and performance will not be
degraded on chips that use normal register access methods.

The various register read write macros (e.g. tw32, tr32, tw32_mailbox)
are redefined to call the function pointers.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:50:12 -07:00
David S. Miller
86e65da9c1 [NET]: Remove explicit initializations of skb->input_dev
Instead, set it in one place, namely the beginning of
netif_receive_skb().

Based upon suggestions from Jamal Hadi Salim.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:33:26 -07:00
David S. Miller
f2ccd8fa06 [NET]: Kill skb->real_dev
Bonding just wants the device before the skb_bond()
decapsulation occurs, so simply pass that original
device into packet_type->func() as an argument.

It remains to be seen whether we can use this same
exact thing to get rid of skb->input_dev as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:32:25 -07:00
Stephen Hemminger
6f1cf16582 [NET]: Remove HIPPI private from skbuff.h
This removes the private element from skbuff, that is only used by
HIPPI. Instead it uses skb->cb[] to hold the additional data that is
needed in the output path from hard_header to device driver.

PS: The only qdisc that might potentially corrupt this cb[] is if
netem was used over HIPPI. I will take care of that by fixing netem
to use skb->stamp. I don't expect many users of netem over HIPPI

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 15:31:42 -07:00
David S. Miller
8728b834b2 [NET]: Kill skb->list
Remove the "list" member of struct sk_buff, as it is entirely
redundant.  All SKB list removal callers know which list the
SKB is on, so storing this in sk_buff does nothing other than
taking up some space.

Two tricky bits were SCTP, which I took care of, and two ATM
drivers which Francois Romieu <romieu@fr.zoreil.com> fixed
up.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
2005-08-29 15:31:14 -07:00
Jeff Garzik
39fbe47377 /spare/repo/netdev-2.6 branch 'chelsio' 2005-08-29 16:53:08 -04:00
Jeff Garzik
c1b054d03f Merge /spare/repo/linux-2.6/ 2005-08-29 16:40:27 -04:00
Jeff Garzik
cfc4692825 /spare/repo/netdev-2.6 branch 'e100' 2005-08-29 15:55:26 -04:00
Jeff Garzik
8aaf226a8e /spare/repo/netdev-2.6 branch 'sis190' 2005-08-29 15:52:56 -04:00
Jeff Garzik
1703ecc7e8 /spare/repo/netdev-2.6 branch 'uli-tulip' 2005-08-29 15:52:42 -04:00
Al Viro
bf4e70e54c [PATCH] missing include in smc-ultra
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-29 10:42:40 -07:00
Linus Torvalds
3d963f5bb1 Merge refs/heads/upstream from master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-08-29 10:04:37 -07:00
Andy Fleming
e13934563d [PATCH] PHY Layer fixup
This patch adds back the code that was taken out, thus re-enabling:

* The PHY Layer to initialize without crashing
* Drivers to actually connect to PHYs
* The entire PHY Control Layer

This patch is used by the gianfar driver, and other drivers which are in
development.

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-28 20:28:25 -04:00
Linus Torvalds
3d52acb342 Merge refs/heads/upstream-fixes from master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-08-27 18:05:14 -07:00
Francois Romieu
86f0cd5057 [PATCH] r8169: avoid conflict between revisions 2 and 3 of the Linksys EG1032
Both revisions share the same PCI device ID and vendor ID but revision 2
of the device uses SysKonnect's chipset whereas revision 3 of the device
uses Realtek's 8169 chipset.

Credit goes to Christiaan Lutzer <mythtv.lutzer@gmail.com> for reporting
the issue and giving the actual value for the different revisions.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-27 04:41:01 -04:00
Ralf Baechle
815f62bf74 [PATCH] SMP rewrite of mkiss
Rewrite the mkiss driver to make it SMP-proof following the example of
6pack.c.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-27 04:35:31 -04:00
Ralf Baechle
214838a210 [PATCH] Fix 6pack setting of MAC address
Don't check type of sax25_family; dev_set_mac_address has already done
that before and anyway, the type to check against would have been
ARPHRD_AX25.  We only got away because AF_AX25 and ARPHRD_AX25 both happen
to be defined to the same value.

Don't check sax25_ndigis either; it's value is insignificant for the
purpose of setting the MAC address and the check has shown to break
some application software for no good reason.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-27 04:32:39 -04:00
Ralf Baechle
84a2ea1c2c [PATCH] 6pack Timer initialization
I dropped the timer initialization bits by accident when sending the
p-persistence fix.  This patch gets the driver to work again on halfduplex
links.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-08-27 04:32:39 -04:00
Linus Torvalds
13142341ac Merge HEAD from master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git 2005-08-26 16:32:31 -07:00