kernel-ark

Author	SHA1	Message	Date
David S. Miller	1de9243bbf	ipv4: Pull icmp socket delivery out into a helper function. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 18:32:17 -07:00
Eric Dumazet	46d3ceabd8	tcp: TCP Small Queues This introduces TSQ (TCP Small Queues). TSQ goal is to reduce number of TCP packets in xmit queues (qdisc & device queues), to reduce RTT and cwnd bias, part of the bufferbloat problem. sk->sk_wmem_alloc is not allowed to grow above a given limit, allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a given time. TSO packets are sized/capped to half the limit, so that we have two TSO packets in flight, allowing better bandwidth use. As a side effect, setting the limit to 40000 automatically reduces the standard gso max limit (65536) to 40000/2 : It can help to reduce latencies of high prio packets, having smaller TSO packets. This means we divert sock_wfree() to a tcp_wfree() handler, to queue/send following frames when skb_orphan() [2] is called for the already queued skbs. Results on my dev machines (tg3/ixgbe nics) are really impressive, using standard pfifo_fast, and with or without TSO/GSO. Without reduction of nominal bandwidth, we have reduction of buffering per bulk sender : < 1ms on Gbit (instead of 50ms with TSO) < 8ms on 100Mbit (instead of 132 ms) I no longer have 4 MBytes backlogged in qdisc by a single netperf session, and both side socket autotuning no longer use 4 Mbytes. As skb destructor cannot restart xmit itself ( as qdisc lock might be taken at this point ), we delegate the work to a tasklet. We use one tasklet per cpu for performance reasons. If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag. This flag is tested in a new protocol method called from release_sock(), to eventually send new segments. [1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable [2] skb_orphan() is usually called at TX completion time, but some drivers call it in their start_xmit() handler. These drivers should at least use BQL, or else a single TCP session can still fill the whole NIC TX ring, since TSQ will have no effect. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dave Taht <dave.taht@bufferbloat.net> Cc: Tom Herbert <therbert@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 18:12:59 -07:00
Alexander Duyck	2100844ca9	tcp: Fix out of bounds access to tcpm_vals The recent patch "tcp: Maintain dynamic metrics in local cache." introduced an out of bounds access due to what appears to be a typo. I believe this change should resolve the issue by replacing the access to RTAX_CWND with TCP_METRIC_CWND. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 17:30:41 -07:00
David S. Miller	48ee3569f3	ipv6: Move ipv6 twsk accessors outside of CONFIG_IPV6 ifdefs. Fixes build when ipv6 is disabled. Reported-by: Fengguang Wu <wfg@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 02:39:24 -07:00
Li RongQing	4715213d9c	bridge: fix endian mld->mld_maxdelay is net endian, so we should use ntohs, not htons CC: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 01:31:24 -07:00
Li RongQing	0d653ed891	qlge: fix endian issue commit `6d29b1ef` introduces a bug, ntohs is __be16_to_cpu, not cpu_to_be16. We always use htons on IP_OFFSET and IP_MF, then compare with the network packet. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 01:31:24 -07:00
Li RongQing	5b70ca3599	ksz884x: fix Endian ETH_P_IP is host Endian and skb->protocol is big Endian, so when comparing them, using htons on skb->protocol is wrong. And fix two code style issues: indentation and remove unnecessary parentheses. CC: Tristram Ha <Tristram.Ha@micrel.com> CC: Ben Hutchings <bhutchings@solarflare.com> CC: Joe Perches <joe@perches.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 01:31:23 -07:00
David S. Miller	4e01df28d4	Merge branch 'davem-next.r8169' of git://violet.fr.zoreil.com/romieu/linux	2012-07-11 01:28:36 -07:00
David S. Miller	04c9f416e3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/batman-adv/bridge_loop_avoidance.c net/batman-adv/bridge_loop_avoidance.h net/batman-adv/soft-interface.c net/mac80211/mlme.c With merge help from Antonio Quartulli (batman-adv) and Stephen Rothwell (drivers/net/usb/qmi_wwan.c). The net/mac80211/mlme.c conflict seemed easy enough, accounting for a conversion to some new tracing macros. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:56:33 -07:00
Michael Chan	c1f5163de4	bnx2: Fix bug in bnx2_free_tx_skbs(). In rare cases, bnx2_free_tx_skbs() can unmap the wrong DMA address when it gets to the last entry of the tx ring. We were not using the proper macro to skip the last entry when advancing the tx index. Reported-by: Zongyun Lai <zlai@vmware.com> Reviewed-by: Jeffrey Huang <huangjw@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:33:47 -07:00
Eric Dumazet	b28ba72665	IPoIB: fix skb truesize underestimation Or Gerlitz reported triggering of WARN_ON_ONCE(delta < len); in skb_try_coalesce(). This warning tracks drivers that incorrectly set skb->truesize. IPoIB indeed allocates a full page to store a fragment, but only accounts in skb->truesize the used part of the page (frame length). This patch fixes skb truesize underestimation, and also fixes a performance issue, because RX skbs have not enough tailroom to allow IP and TCP stacks to pull their header in skb linear part without an expensive call to pskb_expand_head() Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Cc: Erez Shitrit <erezsh@mellanox.com> Cc: Shlomo Pongartz <shlomop@mellanox.com> Cc: Roland Dreier <roland@purestorage.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:33:12 -07:00
Amir Hanania	efc73f4bbc	net: Fix memory leak - vlan_info struct In driver reload test there is a memory leak. The structure vlan_info was not freed when the driver was removed. It was not released since the nr_vids var is one after last vlan was removed. The nr_vids is one, since vlan zero is added to the interface when the interface is being set, but the vlan zero is not deleted at unregister. Fix - delete vlan zero when we unregister the device. Signed-off-by: Amir Hanania <amir.hanania@intel.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:32:27 -07:00
David S. Miller	941a46a29c	Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - fix a bug generated by the wrong interaction between the GW feature and the Bridge Loop Avoidance	2012-07-10 23:31:37 -07:00
Jitendra Kalsaria	c278fa53c1	qlge: Bumped driver version to 1.00.00.31 Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:28:34 -07:00
Jitendra Kalsaria	667b9382cf	qlge: Refactoring of ethtool stats. Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:28:34 -07:00
Jitendra Kalsaria	433c88e866	qlge: Moving low level frame error to ethtool statistics. Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:28:33 -07:00
Jitendra Kalsaria	f5c4441cd8	qlge: Fixed double pci free upon tx_ring->q allocation failure. Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:28:33 -07:00
Jitendra Kalsaria	a7db9ad1d4	qlge: Added missing case statement to ethtool get_strings. Missing case was causing ethtool self test to print garbage value in extra info section. Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:28:33 -07:00
Jitendra Kalsaria	849bcaff80	qlge: Clean up ethtool set WOL routine. Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:28:33 -07:00
Jitendra Kalsaria	206d78e0c5	qlge: Fix ethtool WOL calls to operate only on devices that support WOL. Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:28:33 -07:00
Jitendra Kalsaria	d0de73096e	qlge: Cleanup atomic queue threshold check. Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:28:33 -07:00
Jitendra Kalsaria	41812db8e2	qlge: Fix TX queue stoppage due to full condition. TX queue was being stopped at beginning of send path instead of at the end when last descriptor is used. Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:28:33 -07:00
Rob Herring	f62a23a7cb	net: calxedaxgmac: enable rx cut-thru mode Enabling RX cut-thru mode yields better performance as received frames start getting written to memory before a whole frame is received. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:25:47 -07:00
Rob Herring	e36ce6eb2b	net: calxedaxgmac: set outstanding AXI bus transactions to 8 Increase the number of outstanding read and write AXI transactions from 1 to 8 for better performance. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:25:47 -07:00
Rob Herring	7c4009192e	net: calxedaxgmac: fix hang on rx refill Fix intermittent hangs in xgmac_rx_refill. If a ring buffer entry already had an skb allocated, then xgmac_rx_refill would get stuck in a loop. This can happen on a rx error when we just leave the skb allocated to the entry. [ 7884.510000] INFO: rcu_preempt detected stall on CPU 0 (t=727315 jiffies) [ 7884.510000] [<c0010a59>] (unwind_backtrace+0x1/0x98) from [<c006fd93>] (__rcu_pending+0x11b/0x2c4) [ 7884.510000] [<c006fd93>] (__rcu_pending+0x11b/0x2c4) from [<c0070b95>] (rcu_check_callbacks+0xed/0x1a8) [ 7884.510000] [<c0070b95>] (rcu_check_callbacks+0xed/0x1a8) from [<c0036abb>] (update_process_times+0x2b/0x48) [ 7884.510000] [<c0036abb>] (update_process_times+0x2b/0x48) from [<c004e8fd>] (tick_sched_timer+0x51/0x94) [ 7884.510000] [<c004e8fd>] (tick_sched_timer+0x51/0x94) from [<c0045527>] (__run_hrtimer+0x4f/0x1e8) [ 7884.510000] [<c0045527>] (__run_hrtimer+0x4f/0x1e8) from [<c0046003>] (hrtimer_interrupt+0xd7/0x1e4) [ 7884.510000] [<c0046003>] (hrtimer_interrupt+0xd7/0x1e4) from [<c00101d3>] (twd_handler+0x17/0x24) [ 7884.510000] [<c00101d3>] (twd_handler+0x17/0x24) from [<c006be39>] (handle_percpu_devid_irq+0x59/0x114) [ 7884.510000] [<c006be39>] (handle_percpu_devid_irq+0x59/0x114) from [<c0069aab>] (generic_handle_irq+0x17/0x2c) [ 7884.510000] [<c0069aab>] (generic_handle_irq+0x17/0x2c) from [<c000cc8d>] (handle_IRQ+0x35/0x7c) [ 7884.510000] [<c000cc8d>] (handle_IRQ+0x35/0x7c) from [<c033b153>] (__irq_svc+0x33/0xb8) [ 7884.510000] [<c033b153>] (__irq_svc+0x33/0xb8) from [<c0244b06>] (xgmac_rx_refill+0x3a/0x140) [ 7884.510000] [<c0244b06>] (xgmac_rx_refill+0x3a/0x140) from [<c02458ed>] (xgmac_poll+0x265/0x3bc) [ 7884.510000] [<c02458ed>] (xgmac_poll+0x265/0x3bc) from [<c029fcbf>] (net_rx_action+0xc3/0x200) [ 7884.510000] [<c029fcbf>] (net_rx_action+0xc3/0x200) from [<c0030cab>] (__do_softirq+0xa3/0x1bc) Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:25:47 -07:00
Rob Herring	eb5e1b29a5	net: calxedaxgmac: fix net timeout recovery Fix net tx watchdog timeout recovery. The descriptor ring was reset, but the DMA engine was not reset to the beginning of the ring. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:25:47 -07:00
Jon Mason	0b43b9a703	ll_temac: remove unnecessary setting of skb->dev skb->dev is being unnecessarily set by the driver on packet receive. eth_type_trans already sets skb->dev to the proper value and it is not referenced anywhere else in the driver, thus making its setting unnecessary. Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:24:57 -07:00
Jon Mason	d233d70771	sunhme: remove unnecessary setting of skb->dev skb->dev is being unnecessarily set during ring init and skb alloc in rx. It is already being set to the proper value when eth_type_trans is called on packet receive, and the skb->dev is not referenced anywhere else in the code. Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:24:57 -07:00
Jon Mason	8505120e5a	sungem: remove unnecessary setting of skb->dev skb->dev is being unnecessarily set by the driver's skb alloc routine (which is called in init and during rx). It is already being set to the proper value when eth_type_trans is called on packet receive, and the skb->dev is not referenced anywhere else in the code. Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:24:57 -07:00
Jon Mason	eb716c54b1	sunbmac: remove unnecessary setting of skb->dev skb->dev is being unnecessarily set during ring init and skb alloc in rx. It is already being set to the proper value when eth_type_trans is called on packet receive, and the skb->dev is not referenced anywhere else in the code. Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:24:57 -07:00
Jon Mason	c768b681f4	qlge: remove unnecessary setting of skb->dev skb->dev is being unnecessarily set by the driver on packet receive. eth_type_trans already sets skb->dev to the proper value and it is not referenced anywhere else in the driver, thus making its setting unnecessary. Signed-off-by: Jon Mason <jdmason@kudzu.us> Cc: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Cc: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Cc: Ron Mercer <ron.mercer@qlogic.com> Cc: linux-driver@qlogic.com Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:24:57 -07:00
Jon Mason	ad95dfc72a	qlcnic: remove unnecessary setting of skb->dev skb->dev is being unnecessarily set before calling eth_type_trans. eth_type_trans already sets skb->dev to the proper value, thus making this unnecessary. Signed-off-by: Jon Mason <jdmason@kudzu.us> Cc: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Cc: Sony Chacko <sony.chacko@qlogic.com> Cc: linux-driver@qlogic.com Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:24:57 -07:00
Jon Mason	b06b66c05b	ksz884x: remove unnecessary setting of skb->dev skb->dev is being unnecessarily set during ring init. It is already being set to the proper value when eth_type_trans is called on packet receive, and the skb->dev is not referenced anywhere else in the code. Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:24:57 -07:00
Jon Mason	4a4511a019	lantiq_etop: remove unnecessary setting of skb->dev skb->dev is being unnecessarily set before calling eth_type_trans. eth_type_trans already sets skb->dev to the proper value, thus making this unnecessary. Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:24:57 -07:00
Jon Mason	95f2bce55b	netxen: remove unnecessary setting of skb->dev skb->dev is being unnecessarily set by the driver on packet receive. eth_type_trans already sets skb->dev to the proper value and it is not referenced anywhere else in the driver, thus making its setting unnecessary. Signed-off-by: Jon Mason <jdmason@kudzu.us> Cc: Sony Chacko <sony.chacko@qlogic.com> Cc: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:24:56 -07:00
Jon Mason	b6457acfb7	enic: remove unnecessary setting of skb->dev skb->dev is being unnecessarily set after calling eth_type_trans. eth_type_trans already sets skb->dev to the proper value, thus making this unnecessary. Signed-off-by: Jon Mason <jdmason@kudzu.us> Cc: Christian Benvenuti <benve@cisco.com> Cc: Roopa Prabhu <roprabhu@cisco.com> Cc: Neel Patel <neepatel@cisco.com> Cc: Nishank Trivedi <nistrive@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:24:56 -07:00
Jon Mason	5c8b73ca43	lance: remove unnecessary setting of skb->dev skb->dev is being unnecessarily set during ring init. It is already being set to the proper value when eth_type_trans is called on packet receive, and the skb->dev is not referenced anywhere else in the code. Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:24:56 -07:00
Jon Mason	c0589fa78a	vxge/s2io: remove dead URLs URLs to neterion.com and s2io.com no longer resolve. Remove all references to these URLs in the driver source and documentation. Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:24:47 -07:00
Eric Dumazet	1a203cb33a	ipv6: optimize ipv6 addresses compares On 64 bit arches having efficient unaligned accesses (eg x86_64) we can use long words to reduce number of instructions for free. Joe Perches suggested to change ipv6_masked_addr_cmp() to return a bool instead of 'int', to make sure ipv6_masked_addr_cmp() cannot be used in a sorting function. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:13:46 -07:00
Ben Hutchings	1aa8b471e0	drivers/net/ethernet: Fix non-kernel-doc comments with kernel-doc start markers Convert doxygen (or similar) formatted comments to kernel-doc or unformatted comment. Delete a few that are content-free. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:13:46 -07:00
Ben Hutchings	49ce9c2cda	drivers/net/ethernet: Fix (nearly-)kernel-doc comments for various functions Fix incorrect start markers, wrapped summary lines, missing section breaks, incorrect separators, and some name mismatches. Delete a few that are content-free. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:13:46 -07:00
Ben Hutchings	ae86b9e384	net: Fix non-kernel-doc comments with kernel-doc start marker Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:13:45 -07:00
Ben Hutchings	2c53040f01	net: Fix (nearly-)kernel-doc comments for various functions Fix incorrect start markers, wrapped summary lines, missing section breaks, incorrect separators, and some name mismatches. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:13:45 -07:00
Ben Hutchings	a55b138b1d	net: Properly define functions with no parameters Defining a function with no parameters as 'T foo()' is the deprecated K&R style, and is not strictly equivalent to defining it as 'T foo(void)'. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:13:45 -07:00
David S. Miller	fdd28d7328	Merge branch 'metrics_restructure' This patch series works towards the goal of minimizing the amount of things that can change in an ipv4 route. In a regime where the routing cache is removed, route changes will lead to cloning in the FIB tables or similar. The largest trigger of route metrics writes, TCP, now has its own cache of dynamic metric state. The timewait timestamps are stored there now as well. As a result of that, pre-cowing metrics is no longer necessary, and therefore FLOWI_FLAG_PRECOW_METRICS is removed. Redirect and PMTU handling is moved back into the ipv4 routes. I'm sorry for all the headaches trying to do this in the inetpeer has caused, it was the wrong approach for sure. Since metrics become read-only for ipv4 we no longer need the inetpeer hung off of the ipv4 routes either. So those disappear too. Also, timewait sockets no longer need to hold onto an inetpeer either. After this series, we still have some details to resolve wrt. PMTU and redirects for a route-cache-less system: 1) With just the plain route cache removal, PMTU will continue to work mostly fine. This is because of how the local route users call down into the PMTU update code with the route they already hold. However, if we wish to cache pre-computed routes in fib_info nexthops (which we want for performance), then we need to add route cloning for PMTU events. 2) Redirects require more work. First, redirects must be changed to be handled like PMTU. Wherein we call down into the sockets and other entities, and then they call back into the routing code with the route they were using. So we'll be adding an ->update_nexthop() method alongside ->update_pmtu(). And then, like for PMTU, we'll need cloning support once we start caching routes in the fib_info nexthops. But that's it, we can completely pull the trigger and remove the routing cache with minimal disruptions. As it is, this patch series alone helps a lot of things. For one, routing cache entry creation should be a lot faster, because we no longer do inetpeer lookups (even to check if an entry exists). This patch series also opens the door for non-DST_HOST ipv4 routes, because nothing fundamentally cares about rt->rt_dst any more. It can be removed with the base routing cache removal patch. In fact, that was the primary goal of this patch series. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:53:57 -07:00
David S. Miller	f185071ddf	ipv4: Remove inetpeer from routes. No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:18 -07:00
David S. Miller	312487313d	ipv4: Calling ->cow_metrics() now is a bug. Nothing ever writes to ipv4 metrics any longer. PMTU is stored in rt->rt_pmtu. Dynamic TCP metrics are stored in a special TCP metrics cache, completely outside of the routes. Therefore ->cow_metrics() can simply be nothing more than a WARN_ON trigger so we can catch anyone who tries to add new writes to ipv4 route metrics. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:17 -07:00
David S. Miller	2db2d67e4c	ipv4: Kill dst_copy_metrics() call from ipv4_blackhole_route(). Blackhole routes have a COW metrics operation that returns NULL always, therefore this dst_copy_metrics() call did absolutely nothing. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:16 -07:00
David S. Miller	710ab6c031	ipv4: Enforce max MTU metric at route insertion time. Rather than at every struct rtable creation. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:15 -07:00
David S. Miller	5943634fc5	ipv4: Maintain redirect and PMTU info in struct rtable again. Maintaining this in the inetpeer entries was not the right way to do this at all. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:14 -07:00
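
Note on the endian fixes listed above (bridge 4715213d9c, qlge 0d653ed891, ksz884x 5b70ca3599): they share one rule. A host-order constant is converted with htons() before it is compared against an on-wire field, and ntohs() is used when a wire value is needed in host order. The following is only a user-space sketch of that rule, not the kernel code from those commits; every ex_/EX_ name is hypothetical, and the constants are redefined here as stand-ins for ETH_P_IP, IP_MF and IP_OFFSET.

```c
#include <arpa/inet.h>  /* htons(), ntohs() */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins, defined in host byte order. */
#define EX_ETH_P_IP  0x0800u  /* EtherType for IPv4 */
#define EX_IP_MF     0x2000u  /* "more fragments" flag */
#define EX_IP_OFFSET 0x1FFFu  /* fragment offset mask */

/* wire_proto is in network (big-endian) order, as skb->protocol is.
 * Convert the constant, not the wire field. */
static inline bool ex_is_ipv4(uint16_t wire_proto)
{
	return wire_proto == htons(EX_ETH_P_IP);
}

/* wire_frag_off is the raw 16-bit fragment field from an IPv4 header.
 * The mask is converted to network order before the bitwise test. */
static inline bool ex_is_fragment(uint16_t wire_frag_off)
{
	return (wire_frag_off & htons(EX_IP_MF | EX_IP_OFFSET)) != 0;
}

/* A wire value that is used as a number (for example a delay)
 * goes through ntohs() to reach host order. */
static inline uint16_t ex_max_delay(uint16_t wire_maxdelay)
{
	return ntohs(wire_maxdelay);
}
```

On a big-endian host both conversions are no-ops, which is why mixing them up can go unnoticed until the code runs on a little-endian machine.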

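Note on 1a203cb33a (ipv6: optimize ipv6 addresses compares): the message describes comparing 128-bit addresses with long words and returning a bool from the masked compare so it cannot be mistaken for a sort comparator. A rough user-space sketch of that idea, under stated assumptions: the struct and the ex_ function names are hypothetical, and memcpy() is used for the 64-bit loads so the sketch does not depend on unaligned access being efficient or allowed.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit address type for illustration only. */
struct ex_in6_addr {
	uint8_t bytes[16];
};

/* Load 8 bytes as one 64-bit word without alignment assumptions. */
static inline uint64_t ex_load64(const uint8_t *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Two 64-bit XORs replace sixteen byte compares (or four 32-bit ones). */
static inline bool ex_addr_equal(const struct ex_in6_addr *a,
				 const struct ex_in6_addr *b)
{
	return ((ex_load64(a->bytes) ^ ex_load64(b->bytes)) |
		(ex_load64(a->bytes + 8) ^ ex_load64(b->bytes + 8))) == 0;
}

/* Returns true when a and b differ in any bit selected by mask m.
 * The bool result carries no ordering, so it is unusable for sorting. */
static inline bool ex_masked_addr_differs(const struct ex_in6_addr *a,
					  const struct ex_in6_addr *m,
					  const struct ex_in6_addr *b)
{
	uint64_t hi = (ex_load64(a->bytes) ^ ex_load64(b->bytes)) &
		      ex_load64(m->bytes);
	uint64_t lo = (ex_load64(a->bytes + 8) ^ ex_load64(b->bytes + 8)) &
		      ex_load64(m->bytes + 8);

	return (hi | lo) != 0;
}
```

Because only equality of bits is tested, the result does not depend on host byte order.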
312904 Commits