Commit Graph

7850 Commits

Author SHA1 Message Date
Patrick McHardy
c88130bcd5 [NETFILTER]: nf_conntrack: naming unification
Rename all "conntrack" variables to "ct" for more consistency and
avoiding some overly long lines.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:59 -08:00
Patrick McHardy
76eb946040 [NETFILTER]: nf_conntrack: don't inline early_drop()
early_drop() is only called *very* rarely, unfortunately gcc inlines it
into the hotpath because there is only a single caller. Explicitly mark
it noinline.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:58 -08:00
Patrick McHardy
0794935e21 [NETFILTER]: nf_conntrack: optimize hash_conntrack()
Avoid calling jhash three times and hash the entire tuple in one go.

  __hash_conntrack | -485 # 760 -> 275, # inlines: 3 -> 1, size inlines: 717 -> 252
 1 function changed, 485 bytes removed

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:57 -08:00
Patrick McHardy
ba419aff2c [NETFILTER]: nf_conntrack: optimize __nf_conntrack_find()
Ignoring specific entries in __nf_conntrack_find() is only needed by NAT
for nf_conntrack_tuple_taken(). Remove it from __nf_conntrack_find()
and make nf_conntrack_tuple_taken() search the hash itself.

Saves 54 bytes of text in the hotpath on x86_64:

  __nf_conntrack_find      |  -54 # 321 -> 267, # inlines: 3 -> 2, size inlines: 181 -> 127
  nf_conntrack_tuple_taken | +305 # 15 -> 320, lexblocks: 0 -> 3, # inlines: 0 -> 3, size inlines: 0 -> 181
  nf_conntrack_find_get    |   -2 # 90 -> 88
 3 functions changed, 305 bytes added, 56 bytes removed, diff: +249

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:55 -08:00
Patrick McHardy
f8ba1affa1 [NETFILTER]: nf_conntrack: switch rwlock to spinlock
With the RCU conversion only write_lock usages of nf_conntrack_lock are
left (except one read_lock that should actually use write_lock in the
H.323 helper). Switch to a spinlock.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:54 -08:00
Patrick McHardy
76507f69c4 [NETFILTER]: nf_conntrack: use RCU for conntrack hash
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:54 -08:00
Patrick McHardy
7d0742da1c [NETFILTER]: nf_conntrack_expect: use RCU for expectation hash
Use RCU for expectation hash. This doesn't buy much for conntrack
runtime performance, but allows to reduce the use of nf_conntrack_lock
for /proc and nf_netlink_conntrack.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:53 -08:00
Patrick McHardy
c52fbb410b [NETFILTER]: nf_conntrack_core: avoid taking nf_conntrack_lock in nf_conntrack_alter_reply
The conntrack is unconfirmed, so we have an exclusive reference, which
means that the write_lock is definitely unneeded. A read_lock used to
be needed for the helper lookup, but since we're using RCU for helpers
now rcu_read_lock is enough.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:52 -08:00
Patrick McHardy
58a3c9bb0c [NETFILTER]: nf_conntrack: use RCU for conntrack helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:51 -08:00
Patrick McHardy
47d9504543 [NETFILTER]: nf_conntrack: fix accounting with fixed timeouts
Don't skip accounting for conntracks with the FIXED_TIMEOUT bit.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:51 -08:00
Patrick McHardy
1d670fdc8c [NETFILTER]: nf_conntrack_netlink: fix unbalanced locking
Properly drop nf_conntrack_lock on tuple parsing error.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:50 -08:00
Patrick McHardy
99fa5f5397 [NETFILTER]: nf_conntrack_ipv6: fix sparse warnings
CHECK   net/ipv6/netfilter/nf_conntrack_reasm.c
  net/ipv6/netfilter/nf_conntrack_reasm.c:77:18: warning: symbol 'nf_ct_ipv6_sysctl_table' was not declared. Should it be static?
  net/ipv6/netfilter/nf_conntrack_reasm.c:586:16: warning: symbol 'nf_ct_frag6_gather' was not declared. Should it be static?
  net/ipv6/netfilter/nf_conntrack_reasm.c:662:6: warning: symbol 'nf_ct_frag6_output' was not declared. Should it be static?
  net/ipv6/netfilter/nf_conntrack_reasm.c:683:5: warning: symbol 'nf_ct_frag6_kfree_frags' was not declared. Should it be static?
  net/ipv6/netfilter/nf_conntrack_reasm.c:698:5: warning: symbol 'nf_ct_frag6_init' was not declared. Should it be static?
  net/ipv6/netfilter/nf_conntrack_reasm.c:717:6: warning: symbol 'nf_ct_frag6_cleanup' was not declared. Should it be static?

Based on patch by Stephen Hemminger with suggestions by Yasuyuki KOZAKAI.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:49 -08:00
Patrick McHardy
b0a6363c24 [NETFILTER]: {ip,arp,ip6}_tables: fix sparse warnings in compat code
CHECK   net/ipv4/netfilter/ip_tables.c
net/ipv4/netfilter/ip_tables.c:1453:8: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/ip_tables.c:1453:8:    expected int *size
net/ipv4/netfilter/ip_tables.c:1453:8:    got unsigned int [usertype] *size
net/ipv4/netfilter/ip_tables.c:1458:44: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/ip_tables.c:1458:44:    expected int *size
net/ipv4/netfilter/ip_tables.c:1458:44:    got unsigned int [usertype] *size
net/ipv4/netfilter/ip_tables.c:1603:2: warning: incorrect type in argument 2 (different signedness)
net/ipv4/netfilter/ip_tables.c:1603:2:    expected unsigned int *i
net/ipv4/netfilter/ip_tables.c:1603:2:    got int *<noident>
net/ipv4/netfilter/ip_tables.c:1627:8: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/ip_tables.c:1627:8:    expected int *size
net/ipv4/netfilter/ip_tables.c:1627:8:    got unsigned int *size
net/ipv4/netfilter/ip_tables.c:1634:40: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/ip_tables.c:1634:40:    expected int *size
net/ipv4/netfilter/ip_tables.c:1634:40:    got unsigned int *size
net/ipv4/netfilter/ip_tables.c:1653:8: warning: incorrect type in argument 5 (different signedness)
net/ipv4/netfilter/ip_tables.c:1653:8:    expected unsigned int *i
net/ipv4/netfilter/ip_tables.c:1653:8:    got int *<noident>
net/ipv4/netfilter/ip_tables.c:1666:2: warning: incorrect type in argument 2 (different signedness)
net/ipv4/netfilter/ip_tables.c:1666:2:    expected unsigned int *i
net/ipv4/netfilter/ip_tables.c:1666:2:    got int *<noident>
  CHECK   net/ipv4/netfilter/arp_tables.c
net/ipv4/netfilter/arp_tables.c:1285:40: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/arp_tables.c:1285:40:    expected int *size
net/ipv4/netfilter/arp_tables.c:1285:40:    got unsigned int *size
net/ipv4/netfilter/arp_tables.c:1543:44: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/arp_tables.c:1543:44:    expected int *size
net/ipv4/netfilter/arp_tables.c:1543:44:    got unsigned int [usertype] *size
  CHECK   net/ipv6/netfilter/ip6_tables.c
net/ipv6/netfilter/ip6_tables.c:1481:8: warning: incorrect type in argument 3 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1481:8:    expected int *size
net/ipv6/netfilter/ip6_tables.c:1481:8:    got unsigned int [usertype] *size
net/ipv6/netfilter/ip6_tables.c:1486:44: warning: incorrect type in argument 3 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1486:44:    expected int *size
net/ipv6/netfilter/ip6_tables.c:1486:44:    got unsigned int [usertype] *size
net/ipv6/netfilter/ip6_tables.c:1631:2: warning: incorrect type in argument 2 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1631:2:    expected unsigned int *i
net/ipv6/netfilter/ip6_tables.c:1631:2:    got int *<noident>
net/ipv6/netfilter/ip6_tables.c:1655:8: warning: incorrect type in argument 3 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1655:8:    expected int *size
net/ipv6/netfilter/ip6_tables.c:1655:8:    got unsigned int *size
net/ipv6/netfilter/ip6_tables.c:1662:40: warning: incorrect type in argument 3 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1662:40:    expected int *size
net/ipv6/netfilter/ip6_tables.c:1662:40:    got unsigned int *size
net/ipv6/netfilter/ip6_tables.c:1680:8: warning: incorrect type in argument 5 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1680:8:    expected unsigned int *i
net/ipv6/netfilter/ip6_tables.c:1680:8:    got int *<noident>
net/ipv6/netfilter/ip6_tables.c:1693:2: warning: incorrect type in argument 2 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1693:2:    expected unsigned int *i
net/ipv6/netfilter/ip6_tables.c:1693:2:    got int *<noident>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:49 -08:00
Patrick McHardy
855304af29 [NETFILTER]: ipt_recent: fix sparse warnings
net/ipv4/netfilter/ipt_recent.c:215:17: warning: symbol 't' shadows an earlier one
net/ipv4/netfilter/ipt_recent.c:179:22: originally declared here
net/ipv4/netfilter/ipt_recent.c:322:13: warning: context imbalance in 'recent_seq_start' - wrong count at exit
net/ipv4/netfilter/ipt_recent.c:354:13: warning: context imbalance in 'recent_seq_stop' - unexpected unlock

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:48 -08:00
Stephen Hemminger
dc64d02ba8 [NETFILTER]: nf_conntrack_h3223: sparse fixes
Sparse complains when a function is not really static. Putting static
on the function prototype is not enough.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:47 -08:00
Stephen Hemminger
f4f6fb714f [NETFILTER]: more sparse fixes
Some lock annotations, and make initializers static.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:46 -08:00
Stephen Hemminger
2f0d2f1039 [NETFILTER]: conntrack: get rid of sparse warnings
Teach sparse about locking here, and fix signed/unsigned warnings.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:46 -08:00
Stephen Hemminger
4e26fe2681 [NETFILTER]: nfnetlink_log: sparse warning fixes
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:45 -08:00
Stephen Hemminger
96eb24d770 [NETFILTER]: nf_conntrack: sparse warnings
The hashtable size is really unsigned so sparse complains when you pass
a signed integer.  Change all uses to make it consistent.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:44 -08:00
Stephen Hemminger
06aa10728e [NETFILTER]: nf_nat_snmp: sparse warning
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:44 -08:00
Jan Engelhardt
edc26f7aaa [NETFILTER]: xt_owner: allow matching UID/GID ranges
Add support for ranges to the new revision. This doesn't affect
compatibility since the new revision was not released yet.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:43 -08:00
Jan Engelhardt
37c08387fc [NETFILTER]: xt_TCPMSS: consider reverse route's MTU in clamp-to-pmtu
The TCPMSS target in Xtables should consider the MTU of the reverse
route on forwarded packets as part of the path MTU.

Point in case: IN=ppp0, OUT=eth0. MSS set to 1460 in spite of MTU of
ppp0 being 1392.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:42 -08:00
Alexey Dobriyan
df200969b1 [NETFILTER]: netns: put table module on netns stop
When number of entries exceeds number of initial entries, foo-tables code
will pin table module. But during table unregister on netns stop,
that additional pin was forgotten.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:41 -08:00
Alexey Dobriyan
9ea0cb2601 [NETFILTER]: arp_tables: per-netns arp_tables FILTER
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:41 -08:00
Alexey Dobriyan
79df341ab6 [NETFILTER]: arp_tables: netns preparation
* Propagate netns from userspace.
* arpt_register_table() registers table in supplied netns.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:40 -08:00
Alexey Dobriyan
8280aa6182 [NETFILTER]: ip6_tables: per-netns IPv6 FILTER, MANGLE, RAW
Now it's possible to list and manipulate per-netns ip6tables rules.
Filtering decisions are based on init_net's table so far.

P.S.: remove init_net check in inet6_create() to see the effect

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:39 -08:00
Alexey Dobriyan
336b517fdc [NETFILTER]: ip6_tables: netns preparation
* Propagate netns from userspace down to xt_find_table_lock()
* Register ip6 tables in netns (modules still use init_net)

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:39 -08:00
Alexey Dobriyan
9335f047fe [NETFILTER]: ip_tables: per-netns FILTER, MANGLE, RAW
Now, iptables show and configure different set of rules in different
netnss'. Filtering decisions are still made by consulting only
init_net's set.

Changes are identical except naming so no splitting.

P.S.: one need to remove init_net checks in nf_sockopt.c and inet_create()
      to see the effect.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:38 -08:00
Alexey Dobriyan
34bd137ba7 [NETFILTER]: ip_tables: propagate netns from userspace
.. all the way down to table searching functions.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:37 -08:00
Alexey Dobriyan
44d34e721e [NETFILTER]: x_tables: return new table from {arp,ip,ip6}t_register_table()
Typical table module registers xt_table structure (i.e. packet_filter)
and link it to list during it. We can't use one template for it because
corresponding list_head will become corrupted. We also can't unregister
with template because it wasn't changed at all and thus doesn't know in
which list it is.

So, we duplicate template at the very first step of table registration.
Table modules will save it for use during unregistration time and actual
filtering.

Do it at once to not screw bisection.

P.S.: renaming i.e. packet_filter => __packet_filter is temporary until
      full netnsization of table modules is done.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:36 -08:00
Alexey Dobriyan
8d87005207 [NETFILTER]: x_tables: per-netns xt_tables
In fact all we want is per-netns set of rules, however doing that will
unnecessary complicate routines such as ipt_hook()/ipt_do_table, so
make full xt_table array per-netns.

Every user stubbed with init_net for a while.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:35 -08:00
Alexey Dobriyan
a98da11d88 [NETFILTER]: x_tables: change xt_table_register() return value convention
Switch from 0/-E to ptr/PTR_ERR convention.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:35 -08:00
Jan Engelhardt
30083c9500 [NETFILTER]: ebtables: mark matches, targets and watchers __read_mostly
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:34 -08:00
Jan Engelhardt
f776c4cda4 [NETFILTER]: ebtables: Update modules' descriptions
Update the MODULES_DESCRIPTION() tags for all Ebtables modules.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:33 -08:00
Jan Engelhardt
abfdf1c489 [NETFILTER]: ebtables: remove casts, use consts
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:33 -08:00
Jan Engelhardt
b41649989c [NETFILTER]: xt_conntrack: add port and direction matching
Extend the xt_conntrack match revision 1 by port matching (all four
{orig,repl}{src,dst}) and by packet direction matching.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:31 -08:00
Patrick McHardy
41d0cdedd5 [NETFILTER]: nfnetlink_log: fix typo
It should use htonl for the GID, not htons.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:30 -08:00
Patrick McHardy
2fd8e526f4 [NETFILTER]: bridge netfilter: remove nf_bridge_info read-only netoutdev member
Before the removal of the deferred output hooks, netoutdev was used in
case of VLANs on top of a bridge to store the VLAN device, so the
deferred hooks would see the correct output device. This isn't
necessary anymore since we're calling the output hooks for the correct
device directly in the IP stack.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:29 -08:00
Patrick McHardy
d44caf88e8 [NETFILTER]: nf_nat: remove double bysource hash initialization
The hash table is already initialized by nf_ct_alloc_hashtable().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:28 -08:00
Jan Engelhardt
ecb6f85e11 [NETFILTER]: Use const in struct xt_match, xt_target, xt_table
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:28 -08:00
Eric Dumazet
ca7c48ca97 [NETFILTER]: Supress some sparse warnings
CHECK   net/netfilter/nf_conntrack_expect.c
net/netfilter/nf_conntrack_expect.c:429:13: warning: context imbalance in 'exp_seq_start' - wrong count at exit
net/netfilter/nf_conntrack_expect.c:441:13: warning: context imbalance in 'exp_seq_stop' - unexpected unlock
  CHECK   net/netfilter/nf_log.c
net/netfilter/nf_log.c:105:13: warning: context imbalance in 'seq_start' - wrong count at exit
net/netfilter/nf_log.c:125:13: warning: context imbalance in 'seq_stop' - unexpected unlock
  CHECK   net/netfilter/nfnetlink_queue.c
net/netfilter/nfnetlink_queue.c:363:7: warning: symbol 'size' shadows an earlier one
net/netfilter/nfnetlink_queue.c:217:9: originally declared here
net/netfilter/nfnetlink_queue.c:847:13: warning: context imbalance in 'seq_start' - wrong count at exit
net/netfilter/nfnetlink_queue.c:859:13: warning: context imbalance in 'seq_stop' - unexpected unlock

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:27 -08:00
Denis V. Lunev
3046d76746 [RAW]: Wrong content of the /proc/net/raw6.
The address of IPv6 raw sockets was shown in the wrong format, from
IPv4 ones.  The problem has been introduced by the commit
42a73808ed ("[RAW]: Consolidate proc
interface.")

Thanks to Adrian Bunk who originally noticed the problem.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:26 -08:00
Denis V. Lunev
8cd850efa4 [RAW]: Cleanup IPv4 raw_seq_show.
There is no need to use 128 bytes on the stack at all. Clean the code
in the IPv6 style.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:25 -08:00
Denis V. Lunev
377cf82d66 [RAW]: Family check in the /proc/net/raw[6] is extra.
Different hashtables are used for IPv6 and IPv4 raw sockets, so no
need to check the socket family in the iterator over hashtables. Clean
this out.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:24 -08:00
Herbert Xu
b1641064a3 [IPCOMP]: Fix reception of incompressible packets
I made a silly typo by entering IPPROTO_IP (== 0) instead of
IPPROTO_IPIP (== 4).  This broke the reception of incompressible
packets.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:24 -08:00
Eric Dumazet
e242297055 [NET]: should explicitely initialize atomic_t field in struct dst_ops
All but one struct dst_ops static initializations miss explicit
initialization of entries field.

As this field is atomic_t, we should use ATOMIC_INIT(0), and not
rely on atomic_t implementation.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:23 -08:00
Ilpo Järvinen
ad1984e844 [TCP]: NewReno must count every skb while marking losses
NewReno should add cnt per skb (as with FACK) instead of depending on
SACKED_ACKED bits which won't be set with it at all.  Effectively,
NewReno should always exists after the first iteration anyway (or
immediately if there's already head in lost_out.

This was fixed earlier in net-2.6.25 but got reverted among other
stuff and I didn't notice that this is still necessary (actually
wasn't even considering this case while trying to figure out the
reports because I lived with different kind of code than it in reality
was).

This should solve the WARN_ONs in TCP code that as a result of this
triggered multiple times in every place we check for this invariant.

Special thanks to Dave Young <hidave.darkstar@gmail.com> and Krishna
Kumar2 <krkumar2@in.ibm.com> for trying with my debug patches.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Krishna Kumar2 <krkumar2@in.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:22 -08:00
Pavel Emelyanov
23fe18669e [NETNS]: Fix race between put_net() and netlink_kernel_create().
The comment about "race free view of the set of network
namespaces" was a bit hasty. Look (there even can be only
one CPU, as discovered by Alexey Dobriyan and Denis Lunev):

put_net()
  if (atomic_dec_and_test(&net->refcnt))
    /* true */
      __put_net(net);
        queue_work(...);

/*
 * note: the net now has refcnt 0, but still in
 * the global list of net namespaces
 */

== re-schedule ==

register_pernet_subsys(&some_ops);
  register_pernet_operations(&some_ops);
    (*some_ops)->init(net);
      /*
       * we call netlink_kernel_create() here
       * in some places
       */
      netlink_kernel_create();
         sk_alloc();
            get_net(net); /* refcnt = 1 */
         /*
          * now we drop the net refcount not to
          * block the net namespace exit in the
          * future (or this can be done on the
          * error path)
          */
         put_net(sk->sk_net);
             if (atomic_dec_and_test(&...))
                   /*
                    * true. BOOOM! The net is
                    * scheduled for release twice
                    */

When thinking on this problem, I decided, that getting and
putting the net in init callback is wrong. If some init
callback needs to have a refcount-less reference on the struct
net, _it_ has to be careful himself, rather than relying on
the infrastructure to handle this correctly.

In case of netlink_kernel_create(), the problem is that the
sk_alloc() gets the given namespace, but passing the info
that we don't want to get it inside this call is too heavy.

Instead, I propose to crate the socket inside an init_net
namespace and then re-attach it to the desired one right
after the socket is created.

After doing this, we also have to be careful on error paths
not to drop the reference on the namespace, we didn't get
the one on.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:22 -08:00
Eric Dumazet
533cb5b0a6 [XFRM]: constify 'struct xfrm_type'
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:20 -08:00
Benjamin Thery
2216b48376 [NETNS]: Add missing initialization of nl_info.nl_net in rtm_to_fib6_config()
Add missing initialization of the new nl_info.nl_net field in
rtm_to_fib6_config(). This will be needed the store network namespace
associated to the fib6_config struct.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:20 -08:00
Laszlo Attila Toth
4a19ec5800 [NET]: Introducing socket mark socket option.
A userspace program may wish to set the mark for each packets its send
without using the netfilter MARK target. Changing the mark can be used
for mark based routing without netfilter or for packet filtering.

It requires CAP_NET_ADMIN capability.

Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:19 -08:00
Jan Engelhardt
036c2e27bc [AF_RXRPC]: constify function pointer tables
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:18 -08:00
Dave Young
b6c0632105 [BLUETOOTH]: Add conn add/del workqueues to avoid connection fail.
The bluetooth hci_conn sysfs add/del executed in the default
workqueue.  If the del_conn is executed after the new add_conn with
same target, add_conn will failed with warning of "same kobject name".

Here add btaddconn & btdelconn workqueues, flush the btdelconn
workqueue in the add_conn function to avoid the issue.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:12 -08:00
Herbert Xu
2614fa59fa [IPCOMP]: Fetch nexthdr before ipch is destroyed
When I moved the nexthdr setting out of IPComp I accidently moved
the reading of ipch->nexthdr after the decompression.  Unfortunately
this means that we'd be reading from a stale ipch pointer which
doesn't work very well.

This patch moves the reading up so that we get the correct nexthdr
value.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:11 -08:00
Julian Anastasov
936f6f8e1b [IPV4] fib_trie: apply fixes from fib_hash
Update fib_trie with some fib_hash fixes:
- check for duplicate alternative routes for prefix+tos+priority when
replacing route
- properly insert by matching tos together with priority
- fix alias walking to use list_for_each_entry_continue for insertion
and deletion when fa_head is not NULL
- copy state from fa to new_fa on replace (not a problem for now)
- additionally, avoid replacement without error if new route is same,
as Joonwoo Park suggests.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:10 -08:00
Julian Anastasov
c18865f392 [IPV4] fib: fix route replacement, fib_info is shared
fib_info can be shared by many route prefixes but we don't want
duplicate alternative routes for a prefix+tos+priority. Last change
was not correct to check fib_treeref because it accounts usage from
other prefixes. Additionally, avoid replacement without error if new
route is same, as Joonwoo Park suggests.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:10 -08:00
Wei Yongjun
ec9dbb1c3e [SCTP]: Fix miss of report unrecognized HMAC Algorithm parameter
This patch fix miss of check for report unrecognized HMAC Algorithm
parameter.  When AUTH is disabled, goto fall through path to report
unrecognized parameter, else, just break

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:09 -08:00
Arnaldo Carvalho de Melo
8cf8e5a67f [INET_DIAG]: Fix inet_diag_lock_handler error path.
Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=9825

The inet_diag_lock_handler function uses ERR_PTR to encode errors but
its callers were testing against NULL.

This only happens when the only inet_diag modular user, DCCP, is not
built into the kernel or available as a module.

Also there was a problem with not dropping the mutex lock when a handler
was not found, also fixed in this patch.

This caused an OOPS and ss would then hang on subsequent calls, as
&inet_diag_table_mutex was being left locked.

Thanks to spike at ml.yaroslavl.ru for report it after trying 'ss -d'
on a kernel that doesn't have DCCP available.

This bug was introduced in cset
d523a328fb ("Fix inet_diag dead-lock
regression"), after 2.6.24-rc3, so just 2.6.24 seems to be affected.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:08 -08:00
Herbert Xu
29ffe1a5c5 [INET]: Prevent out-of-sync truesize on ip_fragment slow path
When ip_fragment has to hit the slow path the value of skb->truesize
may go out of sync because we would have updated it without changing
the packet length.  This violates the constraints on truesize.

This patch postpones the update of skb->truesize to prevent this.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:07 -08:00
maximilian attems
1987e7b485 [AX25]: Kill ax25_bind() user triggable printk.
on the last run overlooked that sfuzz triggable message.
move the message to the corresponding comment.

Signed-off-by: maximilian attems <max@stro.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:06 -08:00
maximilian attems
a44562e4d3 [AX25]: Beautify x25_init() version printk.
kill ref to old version and dup Linux.

Signed-off-by: maximilian attems <max@stro.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:06 -08:00
Ilpo Järvinen
8674204a49 [NET] 9p: kill dead static inline buf_put_string
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:05 -08:00
Herbert Xu
1a6509d991 [IPSEC]: Add support for combined mode algorithms
This patch adds support for combined mode algorithms with GCM being
the first algorithm supported.

Combined mode algorithms can be added through the xfrm_user interface
using the new algorithm payload type XFRMA_ALG_AEAD.  Each algorithms
is identified by its name and the ICV length.

For the purposes of matching algorithms in xfrm_tmpl structures,
combined mode algorithms occupy the same name space as encryption
algorithms.  This is in line with how they are negotiated using IKE.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:03 -08:00
Herbert Xu
6fbf2cb774 [IPSEC]: Allow async algorithms
Now that ESP uses authenc we can turn on the support for async
algorithms in IPsec.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:02 -08:00
Herbert Xu
38320c70d2 [IPSEC]: Use crypto_aead and authenc in ESP
This patch converts ESP to use the crypto_aead interface and in particular
the authenc algorithm.  This lays the foundations for future support of
combined mode algorithms.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:02 -08:00
Iñaky Pérez-González
303d9bf6bb rfkill: add the WiMAX radio type
Teach rfkill about wimax radios.

Had to define a KEY_WIMAX as a 'key for disabling only wimax radios',
as other radio technologies have. This makes sense as hardware has
specific keys for disabling specific radios.

The RFKILL enabling part is, otherwise, a copy and paste of any other
radio technology.

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:26:46 -08:00
Ron Rindjunsky
8b6bbe7538 mac80211: fixing null qos data frames check for reordering buffer
This patch fixes a wrong condition for null qos data frames, causing us to
drop data frames needed for reordering as well.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:26:38 -08:00
Johannes Berg
691ba2346d mac80211: fix alignment warning
When I introduced the alignment warning I forgot the A-MSDU case which
has a different requirement because each frame contains 14-byte 802.3
headers in front of the IP payload. This patch moves the alignment
warning to a place where we know whether we're dealing with an A-MSDU
frame and adjusts it accordingly.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:26:34 -08:00
Linus Torvalds
75659ca0c1 Merge branch 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc
* 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: (22 commits)
  Remove commented-out code copied from NFS
  NFS: Switch from intr mount option to TASK_KILLABLE
  Add wait_for_completion_killable
  Add wait_event_killable
  Add schedule_timeout_killable
  Use mutex_lock_killable in vfs_readdir
  Add mutex_lock_killable
  Use lock_page_killable
  Add lock_page_killable
  Add fatal_signal_pending
  Add TASK_WAKEKILL
  exit: Use task_is_*
  signal: Use task_is_*
  sched: Use task_contributes_to_load, TASK_ALL and TASK_NORMAL
  ptrace: Use task_is_*
  power: Use task_is_*
  wait: Use TASK_NORMAL
  proc/base.c: Use task_is_*
  proc/array.c: Use TASK_REPORT
  perfmon: Use task_is_*
  ...

Fixed up conflicts in NFS/sunrpc manually..
2008-02-01 11:45:47 +11:00
Linus Torvalds
44c3b59102 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
  security: compile capabilities by default
  selinux: make selinux_set_mnt_opts() static
  SELinux: Add warning messages on network denial due to error
  SELinux: Add network ingress and egress control permission checks
  NetLabel: Add auditing to the static labeling mechanism
  NetLabel: Introduce static network labels for unlabeled connections
  SELinux: Allow NetLabel to directly cache SIDs
  SELinux: Enable dynamic enable/disable of the network access checks
  SELinux: Better integration between peer labeling subsystems
  SELinux: Add a new peer class and permissions to the Flask definitions
  SELinux: Add a capabilities bitmap to SELinux policy version 22
  SELinux: Add a network node caching mechanism similar to the sel_netif_*() functions
  SELinux: Only store the network interface's ifindex
  SELinux: Convert the netif code to use ifindex values
  NetLabel: Add IP address family information to the netlbl_skbuff_getattr() function
  NetLabel: Add secid token support to the NetLabel secattr struct
  NetLabel: Consolidate the LSM domain mapping/hashing locks
  NetLabel: Cleanup the LSM domain hash functions
  NetLabel: Remove unneeded RCU read locks
2008-01-31 09:32:24 +11:00
Linus Torvalds
dd430ca20c Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: (890 commits)
  x86: fix nodemap_size according to nodeid bits
  x86: fix overlap between pagetable with bss section
  x86: add PCI IDs to k8topology_64.c
  x86: fix early_ioremap pagetable ops
  x86: use the same pgd_list for PAE and 64-bit
  x86: defer cr3 reload when doing pud_clear()
  x86: early boot debugging via FireWire (ohci1394_dma=early)
  x86: don't special-case pmd allocations as much
  x86: shrink some ifdefs in fault.c
  x86: ignore spurious faults
  x86: remove nx_enabled from fault.c
  x86: unify fault_32|64.c
  x86: unify fault_32|64.c with ifdefs
  x86: unify fault_32|64.c by ifdef'd function bodies
  x86: arch/x86/mm/init_32.c printk fixes
  x86: arch/x86/mm/init_32.c cleanup
  x86: arch/x86/mm/init_64.c printk fixes
  x86: unify ioremap
  x86: fixes some bugs about EFI memory map handling
  x86: use reboot_type on EFI 32
  ...
2008-01-31 00:40:09 +11:00
Linus Torvalds
44c45eb911 Make !NETFILTER_ADVANCED enable IP6_NF_MATCH_IPV6HEADER
We want IPV6HEADER matching for the non-advanced default netfilter
configuration, since it's part of the standard netfilter setup of at
least some distributions (eg Fedora).

Otherwise NETFILTER_ADVANCED loses much of its point, since even
non-advanced users would have to enable all the advanced options just to
get a working IPv6 netfilter setup.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-31 00:26:10 +11:00
travis@sgi.com
df3825c56d x86: change NR_CPUS arrays in numa_64
Change the following static arrays sized by NR_CPUS to
per_cpu data variables:

	char cpu_to_node_map[NR_CPUS];

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:33:11 +01:00
Trond Myklebust
34f5b4662b SUNRPC: Don't bother changing the sigmask for asynchronous RPC calls
The caller will never sleep in rpc_execute, so don't bother setting the
sigmask.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:10 -05:00
Chuck Lever
afc881124b SUNRPC: rpcb_getport_sync() passes incorrect address size to rpc_create()
The variable "sin" is a pointer, so sizeof(sin) is the size of a pointer,
not the size of thing that sin points to.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:09 -05:00
Chuck Lever
67d6021362 SUNRPC: Clean up block comment preceding rpcb_getport_sync()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:09 -05:00
Chuck Lever
f1ec08cb94 SUNRPC: Use appropriate argument types in rpcb client
Clean up: Follow recommendations of Chapter 5 of Documentation/CodingStyle
and use "u32" instead of "__u32" for types in definitions that are not
shared with user space.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:09 -05:00
Chuck Lever
b91e101fca SUNRPC: rpcb_getport_sync() should use built-in hostname generator
rpc_create() can already fill in the hostname with a string representation
of the server's IP address, so remove redundant logic in in
rpcb_getport_sync() that does that.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:09 -05:00
Chuck Lever
33e01dc7f5 SUNRPC: Clean up functions that free address_strings array
Clean up: document the rule (kfree) and the exceptions
(RPC_DISPLAY_PROTO and RPC_DISPLAY_NETID) when freeing the objects in
a transport's address_strings array.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:08 -05:00
Trond Myklebust
86d61d8638 SUNRPC: Fix up constant string declarations in struct rpcbind_args
...and eliminate an unnecessary cast.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:06 -05:00
Chuck Lever
b454ae9060 SUNRPC: fewer conditionals in the format_ip_address routines
Clean up: have the set up routines explicitly pass the strings to be used
for the transport name and NETID.  This removes a number of conditionals
and dependencies on rpc_xprt.prot, which is overloaded.

Tighten up type checking on the address_strings array while we're at it.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:04 -05:00
Chuck Lever
7df089952f SUNRPC: Fix use of copy_to_user() in gss_pipe_upcall()
The gss_pipe_upcall() function expects the copy_to_user() function to
return a negative error value if the call fails, but copy_to_user()
returns an unsigned long number of bytes that couldn't be copied.

Can rpc_pipefs actually retry a partially completed upcall read?  If
not, then gss_pipe_upcall() should punt any partial read, just like the
upcall logic in net/sunrpc/cache.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:00 -05:00
Trond Myklebust
ba7392bb37 SUNRPC: Add support for per-client timeout values
In order to be able to support setting the timeo and retrans parameters on
a per-mountpoint basis, we move the rpc_timeout structure into the
rpc_clnt.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:59 -05:00
Trond Myklebust
2881ae74e6 SUNRPC: Clean up the transport timeout initialisation
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:58 -05:00
Trond Myklebust
698b6d088e SUNRPC: cleanup for rpc_new_client()
There is no reason why we shouldn't just pass the rpc_create_args.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:58 -05:00
Chuck Lever
0fb2b7e945 SUNRPC: Move universal address definitions to global header
Universal addresses are defined in RFC 1833 and clarified in RFC 3530.  We
need to use them in several places in the NFS and RPC clients, so move the
relevant definition and block comment to an appropriate global include
file.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:50 -05:00
Chuck Lever
0a48f5d70f SUNRPC: RPC version numbers are u32
Clean up: use correct type for RPC version numbers in rpcbind client.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:50 -05:00
Chuck Lever
9f6ad26d2a SUNRPC: Fix socket address handling in rpcb_clnt
Make sure rpcb_clnt passes the correct address length to rpc_create().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:50 -05:00
Chuck Lever
510deb0d70 SUNRPC: rpc_create() default hostname should support AF_INET6 addresses
If the ULP doesn't pass a hostname string to rpc_create(), it manufactures
one based on the passed-in address.  Be smart enough to handle an AF_INET6
address properly in this case.

Move the default servername logic before the xprt_create_transport() call
to simplify error handling in rpc_create().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:49 -05:00
Chuck Lever
9b45b74ce2 SUNRPC: Remove an unneeded implicit type cast when calling rpc_depopulate()
The two arguments of rpc_depopulate() that pass in inode numbers should use
the same type as inode->i_ino: unsigned long.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:43 -05:00
Chuck Lever
322e2efe62 SUNRPC: temp var should match return type of xdr_skb_read_actor
The return type of xdr_skb_read_actor functions is size_t.  This fixes a
nit I unwittingly overlooked in commit dd456471.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:43 -05:00
Chuck Lever
5d40a8a525 SUNRPC: Check a return result
Minor: Replace an empty if statement with a debugging dprintk.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Thomas Talpey <Thomas.Talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:42 -05:00
Chuck Lever
d4b37ff735 SUNRPC: Fix an unnecessary implicit type cast in rpcrdma_count_chunks()
Nit: rl_nchunks is an unsigned integer, so pass it into
rpcrdma_count_chunks() via an unsigned integer argument.  This eliminates
a harmless mixed sign comparison in rpcrdma_count_chunks()

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Thomas Talpey <Thomas.Talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:42 -05:00
Chuck Lever
2a428b2b8f SUNRPC: Prevent mixed sign comparisons in rpcrdma_convert_iovs()
Keep the type of the buffer position the same during iovec conversion to
reduce the likelihood of unexpected results from comparisons and length
computations.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Thomas Talpey <Thomas.Talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:41 -05:00
Trond Myklebust
a4a874990c SUNRPC: Cleanup to remove the last users of the RPC_WAITQ declaration
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:41 -05:00
Trond Myklebust
47fe064831 SUNRPC: Unexport rpc_init_task() and rpc_execute()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:40 -05:00
Trond Myklebust
e8f5d77c80 SUNRPC: allow the caller of rpc_run_task to preallocate the struct rpc_task
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:38 -05:00
Trond Myklebust
b5627943ab SUNRPC: Remove the now unused function rpc_call_setup()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:35 -05:00
Trond Myklebust
5138fde011 NFS/SUNRPC: Convert all users of rpc_call_setup()
Replace use of rpc_call_setup() with rpc_init_task(), and in cases where we
need to initialise task->tk_action, with rpc_call_start().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:32 -05:00
Trond Myklebust
b3ef8b3bb9 SUNRPC: Allow rpc_init_task() to initialise the rpc_task->tk_msg
In preparation for the removal of rpc_call_setup().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:31 -05:00
Trond Myklebust
77de2c590e SUNRPC: Add a helper rpc_call_start() that initialises task->tk_action
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:31 -05:00
Trond Myklebust
5085925902 SUNRPC: Mask signals across the call to rpc_call_setup() in rpc_run_task
To ensure that the RPCSEC_GSS upcall is performed with the correct sigmask.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:31 -05:00
Trond Myklebust
3ff7576dda SUNRPC: Clean up the initialisation of priority queue scheduling info.
We want the default scheduling priority (priority == 0) to remain
RPC_PRIORITY_NORMAL.

Also ensure that the priority wait queue scheduling is per process id
instead of sometimes being per thread, and sometimes being per inode.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:30 -05:00
Trond Myklebust
c970aa85e7 SUNRPC: Clean up rpc_run_task
Make it use the new task initialiser structure instead of acting as a
wrapper.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:30 -05:00
Trond Myklebust
84115e1cd4 SUNRPC: Cleanup of rpc_task initialisation
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:30 -05:00
Trond Myklebust
e8914c65f7 SUNRPC: Restrict sunrpc client exports
The sunrpc client exports are not meant to be part of any official kernel
API: they can change at the drop of a hat. Mark them as internal functions
using EXPORT_SYMBOL_GPL.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:28 -05:00
Trond Myklebust
a6eaf8bdf9 SUNRPC: Move exported declarations to the function declarations
Do this for all RPC client related functions and XDR functions.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:28 -05:00
J. Bruce Fields
93a44a75b9 sunrpc: document the rpc_pipefs kernel api
Add kerneldoc comments for the rpc_pipefs.c functions that are exported.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:27 -05:00
Trond Myklebust
663b8858dd SUNRPC: Reconnect immediately whenever the server isn't refusing it.
If we've disconnected from the server, rather than the other way round,
then it makes little sense to wait 3 seconds before reconnecting.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:27 -05:00
Trond Myklebust
62da3b2488 SUNRPC: Rename xprt_disconnect()
xprt_disconnect() should really only be called when the transport shutdown
is completed, and it is time to wake up any pending tasks. Rename it to
xprt_disconnect_done() in order to reflect the semantical change.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:27 -05:00
Trond Myklebust
3ebb067d92 SUNRPC: Make call_status()/call_decode() call xprt_force_disconnect()
Move the calls to xprt_disconnect() over to xprt_force_disconnect() in
order to enable the transport layer to manage the state of the
XPRT_CONNECTED flag.
Ditto in xs_tcp_read_fraghdr().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:26 -05:00
Trond Myklebust
7272dcd31d SUNRPC: xprt_autoclose() should not call xprt_disconnect()
The transport layer should do that itself whenever appropriate.

Note that the RDMA transport already assumes that it needs to call
xprt_disconnect in xprt_rdma_close().
For TCP sockets, we want to call xprt_disconnect() only after the
connection has been closed by both ends.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:26 -05:00
Trond Myklebust
e06799f958 SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
By using shutdown() rather than close() we allow the RPC client to wait
for the TCP close handshake to complete before we start trying to reconnect
using the same port.
We use shutdown(SHUT_WR) only instead of shutting down both directions,
however we wait until the server has closed the connection on its side.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:26 -05:00
Trond Myklebust
ef80367071 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:25 -05:00
Trond Myklebust
3b948ae5be SUNRPC: Allow the client to detect if the TCP connection is closed
Add an xprt->state bit to enable the TCP ->state_change() method to signal
whether or not the TCP connection is in the process of closing down.
This will to be used by the reconnection logic in a separate patch.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:25 -05:00
Trond Myklebust
67a391d72c SUNRPC: Fix TCP rebinding logic
Currently the TCP rebinding logic assumes that if we're not using a
reserved port, then we don't need to reconnect on the same port if a
disconnection event occurs. This breaks most RPC duplicate reply cache
implementations.

Also take into account the fact that xprt_min_resvport and
xprt_max_resvport may change while we're reconnecting, since the user may
change them at any time via the sysctls. Ensure that we check the port
boundaries every time we loop in xs_bind4/xs_bind6. Also ensure that if the
boundaries change, we only scan the ports a maximum of 2 times.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:25 -05:00
Trond Myklebust
66af1e5585 SUNRPC: Fix a race in xs_tcp_state_change()
When scheduling the autoclose RPC call, we want to ensure that we don't
race against the test_bit() call in xprt_clear_locked().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:24 -05:00
Paul Moore
13541b3ada NetLabel: Add auditing to the static labeling mechanism
This patch adds auditing support to the NetLabel static labeling mechanism.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:29 +11:00
Paul Moore
8cc44579d1 NetLabel: Introduce static network labels for unlabeled connections
Most trusted OSs, with the exception of Linux, have the ability to specify
static security labels for unlabeled networks.  This patch adds this ability to
the NetLabel packet labeling framework.

If the NetLabel subsystem is called to determine the security attributes of an
incoming packet it first checks to see if any recognized NetLabel packet
labeling protocols are in-use on the packet.  If none can be found then the
unlabled connection table is queried and based on the packets incoming
interface and address it is matched with a security label as configured by the
administrator using the netlabel_tools package.  The matching security label is
returned to the caller just as if the packet was explicitly labeled using a
labeling protocol.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:28 +11:00
Paul Moore
d621d35e57 SELinux: Enable dynamic enable/disable of the network access checks
This patch introduces a mechanism for checking when labeled IPsec or SECMARK
are in use by keeping introducing a configuration reference counter for each
subsystem.  In the case of labeled IPsec, whenever a labeled SA or SPD entry
is created the labeled IPsec/XFRM reference count is increased and when the
entry is removed it is decreased.  In the case of SECMARK, when a SECMARK
target is created the reference count is increased and later decreased when the
target is removed.  These reference counters allow SELinux to quickly determine
if either of these subsystems are enabled.

NetLabel already has a similar mechanism which provides the netlbl_enabled()
function.

This patch also renames the selinux_relabel_packet_permission() function to
selinux_secmark_relabel_packet_permission() as the original name and
description were misleading in that they referenced a single packet label which
is not the case.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:26 +11:00
Paul Moore
75e22910cf NetLabel: Add IP address family information to the netlbl_skbuff_getattr() function
In order to do any sort of IP header inspection of incoming packets we need to
know which address family, AF_INET/AF_INET6/etc., it belongs to and since the
sk_buff structure does not store this information we need to pass along the
address family separate from the packet itself.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:20 +11:00
Paul Moore
16efd45435 NetLabel: Add secid token support to the NetLabel secattr struct
This patch adds support to the NetLabel LSM secattr struct for a secid token
and a type field, paving the way for full LSM/SELinux context support and
"static" or "fallback" labels.  In addition, this patch adds a fair amount
of documentation to the core NetLabel structures used as part of the
NetLabel kernel API.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:19 +11:00
Paul Moore
1c3fad936a NetLabel: Consolidate the LSM domain mapping/hashing locks
Currently we use two separate spinlocks to protect both the hash/mapping table
and the default entry.  This could be considered a bit foolish because it adds
complexity without offering any real performance advantage.  This patch
removes the dedicated default spinlock and protects the default entry with the
hash/mapping table spinlock.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:18 +11:00
Paul Moore
b64397e0b4 NetLabel: Cleanup the LSM domain hash functions
The NetLabel/LSM domain hash table search function used an argument to specify
if the default entry should be returned if an exact match couldn't be found in
the hash table.  This is a bit against the kernel's style so make two separate
functions to represent the separate behaviors.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:17 +11:00
Paul Moore
c783f1ce57 NetLabel: Remove unneeded RCU read locks
This patch removes some unneeded RCU read locks as we can treat the reads as
"safe" even without RCU.  It also converts the NetLabel configuration refcount
from a spinlock protected u32 into atomic_t to be more consistent with the rest
of the kernel.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:16 +11:00
YOSHIFUJI Hideaki
85040bcb46 [IPV6] ADDRLABEL: Fix double free on label deletion.
If an entry is being deleted because it has only one reference, 
we immediately delete it and blindly register the rcu handler for it,
This results in oops by double freeing that object.

This patch fixes it by consolidating the code paths for the deletion;
let its rcu handler delete the object if it has no more reference.

Bug was found by Mitsuru Chinen <mitch@linux.vnet.ibm.com>

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:46:02 -08:00
Stephen Hemminger
ac97f75faa [IPV4] fib_trie: remove unneeded NULL check
Since fib_route_seq_show now uses hlist_for_each_entry(), the leaf
info can not be NULL.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:26 -08:00
Stephen Hemminger
f638a2f057 [IPV4] fib_trie: More whitespace cleanup.
Remove extra blank lines.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:25 -08:00
Patrick McHardy
7a9c1bd409 [NET_SCHED]: Use nla_policy for attribute validation in ematches
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:24 -08:00
Patrick McHardy
53b2bf3f8a [NET_SCHED]: Use nla_policy for attribute validation in actions
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:23 -08:00
Patrick McHardy
6fa8c0144b [NET_SCHED]: Use nla_policy for attribute validation in classifiers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:23 -08:00
Patrick McHardy
27a3421e48 [NET_SCHED]: Use nla_policy for attribute validation in packet schedulers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:22 -08:00
Patrick McHardy
5feb5e1aaa [NET_SCHED]: sch_api: introduce constant for rate table size
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:21 -08:00
Patrick McHardy
1587bac49f [NET_SCHED]: Use typeful attribute parsing helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:21 -08:00
Patrick McHardy
24beeab539 [NET_SCHED]: Use typeful attribute construction helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:20 -08:00
Patrick McHardy
57e1c487a4 [NET_SCHED]: Use NLA_PUT_STRING for string dumping
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:19 -08:00
Patrick McHardy
4b3550ef53 [NET_SCHED]: Use nla_nest_start/nla_nest_end
Use nla_nest_start/nla_nest_end for dumping nested attributes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:18 -08:00
Patrick McHardy
cee63723b3 [NET_SCHED]: Propagate nla_parse return value
nla_parse() returns more detailed errno codes, propagate them back on
error.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:18 -08:00
Patrick McHardy
ab27cfb85c [NET_SCHED]: act_api: use PTR_ERR in tcf_action_init/tcf_action_get
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:17 -08:00
Patrick McHardy
c96c9471dd [NET_SCHED]: act_api: use nlmsg_parse
Convert open-coded nlmsg_parse to use the real function.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:16 -08:00
Patrick McHardy
6d834e04e5 [NET_SCHED]: act_api: fix netlink API conversion bug
Fix two invalid attribute accesses, indices start at 1 with the new
netlink API.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:15 -08:00
Patrick McHardy
b03f467200 [NET_SCHED]: sch_netem: use nla_parse_nested_compat
Replace open coded equivalent of nla_parse_nested_compat().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:15 -08:00
Patrick McHardy
f5e5cb7553 [NET_SCHED]: sch_atm: fix format string warning
Fix format string warning introduces by the netlink API conversion:

net/sched/sch_atm.c:250: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'int'.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:14 -08:00
Denis V. Lunev
dde1bc0e6f [NETNS]: Add namespace for ICMP replying code.
All needed API is done, the namespace is available when required from
the device on the DST entry from the incoming packet. So, just replace
init_net with proper namespace.

Other protocols will follow.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:13 -08:00
Denis V. Lunev
b5921910a1 [NETNS]: Routing cache virtualization.
Basically, this piece looks relatively easy. Namespace is already
available on the dst entry via device and the device is safe to
dereferrence. Compare it with one of a searcher and skip entry if
appropriate.

The only exception is ip_rt_frag_needed. So, add namespace parameter to it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:13 -08:00
Patrick McHardy
7ba699c604 [NET_SCHED]: Convert actions from rtnetlink to new netlink API
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:11 -08:00
Patrick McHardy
add93b610a [NET_SCHED]: Convert classifiers from rtnetlink to new netlink API
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:11 -08:00
Patrick McHardy
1e90474c37 [NET_SCHED]: Convert packet schedulers from rtnetlink to new netlink API
Convert packet schedulers to use the netlink API. Unfortunately a gradual
conversion is not possible without breaking compilation in the middle or
adding lots of casts, so this patch converts them all in one step. The
patch has been mostly generated automatically with some minor edits to
at least allow seperate conversion of classifiers and actions.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:10 -08:00
Patrick McHardy
01480e1cf5 [NETLINK]: Add nla_append()
Used to append data to a message without a header or padding.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:09 -08:00
Patrick McHardy
2eb9d75c72 [NET_SCHED]: mark classifier ops __read_mostly
Additionally remove unnecessary NULL initilizations of the next pointer.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:08 -08:00
Patrick McHardy
62e3ba1b55 [NET_SCHED]: Move EXPORT_SYMBOL next to exported symbol
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:07 -08:00
Denis V. Lunev
f206351a50 [NETNS]: Add namespace parameter to ip_route_output_key.
Needed to propagate it down to the ip_route_output_flow.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:07 -08:00
Denis V. Lunev
f1b050bf7a [NETNS]: Add namespace parameter to ip_route_output_flow.
Needed to propagate it down to the __ip_route_output_key.

Signed_off_by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:06 -08:00
Denis V. Lunev
611c183ebc [NETNS]: Add namespace parameter to __ip_route_output_key.
This is only required to propagate it down to the
ip_route_output_slow.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:05 -08:00
Denis V. Lunev
b40afd0e5c [NETNS]: Add namespace parameter to ip_route_output_slow.
This function needs a net namespace to lookup devices, fib tables,
etc. in, so pass it there.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:05 -08:00
Denis V. Lunev
1ab352768f [NETNS]: Add namespace parameter to ip_dev_find.
in_dev_find() need a namespace to pass it to fib_get_table(), so add
an argument.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:04 -08:00
Denis V. Lunev
010278ec4c [NETNS]: Add netns parameter to fib_select_default.
Currently fib_select_default calls fib_get_table() with the
init_net. Prepare it to provide a correct namespace to lookup default
route.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:03 -08:00
Denis V. Lunev
64c2d53829 [IPV4]: Consolidate fib_select_default.
The difference in the implementation of the fib_select_default when
CONFIG_IP_MULTIPLE_TABLES is (not) defined looks
negligible. Consolidate it and place into fib_frontend.c.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:02 -08:00
Stephen Hemminger
d5ce8a0e97 [IPV4] fib_trie: avoid rescan on dump
This converts dumping (and flushing) of large route tables form O(N^2)
to O(N). If the route dump took multiple pages then the dump routine
gets called again. The old code kept track of location by counter, the
new code instead uses the last key.

This is a really big win ( 0.3 sec vs 12 sec) for big route tables.

One side effect is that if the table changes during the dump, then the
last key will not be found, and we will return -EBUSY.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:01 -08:00
Stephen Hemminger
9195bef7fb [IPV4] fib_trie: avoid extra search on delete
Get rid of extra search that made route deletion O(n).

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:00 -08:00
Stephen Hemminger
a88ee22925 [IPV4] fib_trie: dump table in sorted order
It is easier with TRIE to dump the data traversal rather than
interating over every possible prefix. This saves some time and makes
the dump come out in sorted order.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:00 -08:00
Stephen Hemminger
82cfbb0085 [IPV4] fib_trie: iterator recode
Remove the complex loop structure of nextleaf() and replace it with a
simpler tree walker. This improves the performance and is much
cleaner.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:59 -08:00
Stephen Hemminger
64347f786d [IPV4] fib_trie: dump message multiple part flag
Match fib_hash, and set NLM_F_MULTI to handle multiple part messages.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:58 -08:00
Stephen Hemminger
1328042e26 [IPV4] fib_trie: use hash list
The code to dump can use the existing hash chain rather than doing
repeated lookup.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:58 -08:00
Stephen Hemminger
936722922f [IPV4] fib_trie: compute size when needed
Compute the number of prefixes when needed, rather than doing bookeeping.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:57 -08:00
Stephen Hemminger
a07f5f508a [IPV4] fib_trie: style cleanup
Style cleanups:
      * make check_leaf return -1 or plen, rather than by reference
      * Get rid of #ifdef that is always set
      * split out embedded function calls in if statements.
      * checkpatch warnings

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:56 -08:00
Stephen Hemminger
bc3c8c1e02 [IPV4] fib_trie: put leaf nodes in a slab cache
This improves locality for operations that touch all the leaves.  Save
space since these entries don't need to be hardware cache aligned.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:56 -08:00
Jan Engelhardt
a1f6f5a768 [AF_X25]: constify function pointer tables
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:55 -08:00
Ross Burton
91cde6f7d2 [IrDA]: LMP discovery timer not started by default
By default, LMP sets up a 3 seconds timer for discovery.
We don't need it until discovery is set to 1.

Signed-off-by: Ross Burton <ross@openedhand.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:54 -08:00
Masakazu Mokuno
4248d2f811 WEXT: remove unused variable
As event_type_pk_size[] is not used,  Remove it.

Signed-off-by: Masakazu Mokuno <mokuno@sm.sony.co.jp>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:48 -08:00
Ron Rindjunsky
71ebb4aac8 mac80211: fix rx flow sparse errors, make functions static
This patch adds static declarations to functions in the Rx flow in order to
eliminate sparse errors

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:46 -08:00
Stefano Brivio
c09c7237ea rc80211-pid: fix last_sample initialization
Fix last_sample initialization. kzalloc'ing the per-STA data wasn't enough,
as jiffies can assume negative values as well. This fixes a bug which
prevented correct PID operation for a while after booting.

Thanks to Michael Buesch for reporting this.

Reported-and-tested-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:43 -08:00
Johannes Berg
f99b751fca mac80211: fix RCU locking in __ieee80211_rx_handle_packet
Commit c7a51bda ("mac80211: restructure __ieee80211_rx") extracted
__ieee80211_rx_handle_packet out of __ieee80211_rx and hence changed
the locking rules for __ieee80211_rx_handle_packet(), it is now
invoked under RCU lock. There is, however, one instance left where
it contains an rcu_read_unlock() in an error path, which is a bug.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:43 -08:00
Guy Cohen
a8bdf29c6c mac80211: Assign correct TID for local bridged packets
This patch assigns correct TID to frames transmitted between
two stations in the same BSS in AP mode.
The problem is that skb->protocol is not set to ETH_P_IP and it is wrong
to use that field at this stage.
The fix compares the LLC/Protocol headers explicitly to check if the
encapsulated frame is IP frame

Signed-off-by: Guy Cohen <guy.cohen@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:41 -08:00
Eric Dumazet
69a73829db [DST]: shrinks sizeof(struct rtable) by 64 bytes on x86_64
On x86_64, sizeof(struct rtable) is 0x148, which is rounded up to
0x180 bytes by SLAB allocator.

We can reduce this to exactly 0x140 bytes, without alignment overhead,
and store 12 struct rtable per PAGE instead of 10.

rate_tokens is currently defined as an "unsigned long", while its
content should not exceed 6*HZ. It can safely be converted to an
unsigned int.

Moving tclassid right after rate_tokens to fill the 4 bytes hole
permits to save 8 bytes on 'struct dst_entry', which finally permits
to save 8 bytes on 'struct rtable'

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:41 -08:00
Pavel Emelyanov
81566e8322 [NETNS][FRAGS]: Make the pernet subsystem for fragments.
On namespace start we mainly prepare the ctl variables.

When the namespace is stopped we have to kill all the fragments that
point to this namespace.  The inet_frags_exit_net() handles it.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:40 -08:00
Pavel Emelyanov
3140c25c82 [NETNS][FRAGS]: Make the LRU list per namespace.
The inet_frags.lru_list is used for evicting only, so we have
to make it per-namespace, to evict only those fragments, who's
namespace exceeded its high threshold, but not the whole hash.
Besides, this helps to avoid long loops  in evictor.

The spinlock is not per-namespace because it protects the
hash table as well, which is global.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:39 -08:00
Pavel Emelyanov
3b4bc4a2bf [NETNS][FRAGS]: Isolate the secret interval from namespaces.
Since we have one hashtable to lookup the fragment, having
different secret_interval-s for hash rebuild doesn't make
sense, so move this one to inet_frags.

The inet_frags_ctl becomes empty after this, so remove it.
The appropriate ctl table is kept read-only in namespaces.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:39 -08:00
Pavel Emelyanov
e31e0bdc7e [NETNS][FRAGS]: Make thresholds work in namespaces.
This is the same as with the timeout variable.

Currently, after exceeding the high threshold _all_
the fragments are evicted, but it will be fixed in
later patch.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:38 -08:00
Pavel Emelyanov
b2fd5321dd [NETNS][FRAGS]: Make the net.ipv4.ipfrag_timeout work in namespaces.
Move it to the netns_frags, adjust the usage and
make the appropriate ctl table writable.

Now fragment, that live in different namespaces can
live for different times.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:37 -08:00
Pavel Emelyanov
e4a2d5c2bc [NETNS][FRAGS]: Duplicate sysctl tables for new namespaces.
Each namespace has to have own tables to tune their
different parameters, so duplicate the tables and
register them.

All the tables in sub-namespaces are temporarily made
read-only.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:37 -08:00
Pavel Emelyanov
6ddc082223 [NETNS][FRAGS]: Make the mem counter per-namespace.
This is also simple, but introduces more changes, since
then mem counter is altered in more places.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:36 -08:00
Pavel Emelyanov
e5a2bb842c [NETNS][FRAGS]: Make the nqueues counter per-namespace.
This is simple - just move the variable from struct inet_frags
to struct netns_frags and adjust the usage appropriately.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:35 -08:00
Pavel Emelyanov
ac18e7509e [NETNS][FRAGS]: Make the inet_frag_queue lookup work in namespaces.
Since fragment management code is consolidated, we cannot have the
pointer from inet_frag_queue to struct net, since we must know what
king of fragment this is.

So, I introduce the netns_frags structure. This one is currently
empty, but will be eventually filled with per-namespace
attributes. Each inet_frag_queue is tagged with this one.

The conntrack_reasm is not "netns-izated", so it has one static
netns_frags instance to keep working in init namespace.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:34 -08:00
Pavel Emelyanov
8d8354d2fb [NETNS][FRAGS]: Move ctl tables around.
This is a preparation for sysctl netns-ization.
Move the ctl tables to the files, where the tuning
variables reside. Plus make the helpers to register
the tables.

This will simplify the later patches and will keep
similar things closer to each other.

ipv4, ipv6 and conntrack_reasm are patched differently,
but the result is all the tables are in appropriate files.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:34 -08:00
YOSHIFUJI Hideaki
61cf46ad58 [IPV6] NDISC: Sparse: Use different variable name for local use.
Fix the following sparse warnings:
| net/ipv6/ndisc.c:1300:21: warning: symbol 'opt' shadows an earlier one
| net/ipv6/ndisc.c:1078:7: originally declared here

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:28 -08:00
YOSHIFUJI Hideaki
5d5619b40c [IPV6] ADDRCONF: Sparse: Make inet6_dump_addr() code paths more straight-forward.
Fix the following sparse warning:
| net/ipv6/addrconf.c:3384:2: warning: context imbalance in 'inet6_dump_addr' - different lock contexts for basic block

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:28 -08:00
YOSHIFUJI Hideaki
2334ecbdb2 [IPV6]: Sparse: Declare non-static ipv6_{route,icmp,frag}_sysctl_init() in header.
Fix the following sparse warnings:
| net/ipv6/route.c:2491:18: warning: symbol 'ipv6_route_sysctl_init' was not declared. Should it be static?
| net/ipv6/icmp.c:922:18: warning: symbol 'ipv6_icmp_sysctl_init' was not declared. Should it be static?
| net/ipv6/reassembly.c:628:6: warning: symbol 'ipv6_frag_sysctl_init' was not declared. Should it be static?

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:27 -08:00
YOSHIFUJI Hideaki
40fee36e11 [IPV6] ADDRLABEL: Sparse: Make several functions static.
Fix following sparse warnings:
| net/ipv6/addrlabel.c:172:25: warning: symbol 'ip6addrlbl_alloc' was not declared. Should it be static?
| net/ipv6/addrlabel.c:219:5: warning: symbol '__ip6addrlbl_add' was not declared. Should it be static?
| net/ipv6/addrlabel.c:260:5: warning: symbol 'ip6addrlbl_add' was not declared. Should it be static?
| net/ipv6/addrlabel.c:285:5: warning: symbol '__ip6addrlbl_del' was not declared. Should it be static?
| net/ipv6/addrlabel.c:311:5: warning: symbol 'ip6addrlbl_del' was not declared. Should it be static?

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:26 -08:00
YOSHIFUJI Hideaki
5e8b9df6e8 [IPV6] UDPLITE: Sparse: Declare non-static symbols in header.
Fix the following sparse warnings:
| net/ipv6/udplite.c:45:14: warning: symbol 'udplitev6_prot' was not declared. Should it be static?
| net/ipv6/udplite.c:80:12: warning: symbol 'udplitev6_init' was not declared. Should it be static?
| net/ipv6/udplite.c:99:6: warning: symbol 'udplitev6_exit' was not declared. Should it be static?

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:26 -08:00
YOSHIFUJI Hideaki
77d0d350e9 [IPV6] UDP,UDPLITE: Sparse: {__udp6_lib,udp,udplite}_err() are of void.
Fix following sparse warnings:
| net/ipv6/udp.c:262:2: warning: returning void-valued expression
| net/ipv6/udplite.c:29:2: warning: returning void-valued expression

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:25 -08:00
YOSHIFUJI Hideaki
fc80be87dc [IPV4] UDP,UDPLITE: Sparse: {__udp4_lib,udp,udplite}_err() are of void.
Fix following sparse warnings:
| net/ipv4/udp.c:421:2: warning: returning void-valued expression
| net/ipv4/udplite.c:38:2: warning: returning void-valued expression

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:24 -08:00
Denis V. Lunev
ecfdc8c542 [NETNS]: Pass correct namespace in ip_rt_get_source.
ip_rt_get_source is the infamous place for which dst_ifdown kludges
have been implemented. This means that rt->u.dst.dev can be safely
dereferrenced obtain nd_net.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:23 -08:00
Denis V. Lunev
84a885f449 [NETNS]: Pass correct namespace in ip_route_input_slow.
The packet on the input path always has a referrence to an input
network device it is passed from. Extract network namespace from it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:22 -08:00
Denis V. Lunev
86167a377f [NETNS]: Pass correct namespace in context fib_check_nh.
Correct network namespace is already used in fib_check_nh. Re-work its
usage for better readability and pass into fib_lookup &
inetdev_by_index.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:21 -08:00
Denis V. Lunev
5b707aaae4 [NETNS]: Pass correct namespace in fib_validate_source.
Correct network namespace is available inside fib_validate_source. It
can be obtained from the device passed in. The device is not NULL as
in_device is obtained from it just above.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:21 -08:00
Denis V. Lunev
7fee0ca237 [NETNS]: Add netns parameter to inetdev_by_index.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:20 -08:00
Denis V. Lunev
da0e28cb68 [NETNS]: Add netns parameter to fib_lookup.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:19 -08:00
Stephen Hemminger
ba93ef7465 [IPV4]: ipmr sparse warnings
Get rid of some of the sparse warnings.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:18 -08:00
Stephen Hemminger
dd329bfa96 [IPV4]: igmp sparse warnings
Partial sparse warning fix.  The other conditional locking
is too much for sparse to handle.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:18 -08:00
Stephen Hemminger
1402c8519a [VLAN]: sparse warning fix
Minor sparse warning fix.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:17 -08:00
Ron Rindjunsky
66dcb6bdc5 mac80211: A-MPDU Rx remove stop_rx_ba_session warning print
This patch removes a warning print from ieee80211_sta_stop_rx_ba_session
in case the tid is inactive when interface goes down.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:10 -08:00
Ron Rindjunsky
5b3d71d90e mac80211: A-MPDU Rx stop aggregation on proper dev
This patch adds a check to insure that Rx A-MPDU will be stopped only
for the proper device.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:09 -08:00
Ivo van Doorn
0f7054e32f mac80211: Initialize vif pointer
Before calling update_beacon() mac80211 must
initialize the control.vif pointer so it can
be used by the driver to determine which
interface is trying to send the beacon.

v2: ieee80211_beacon_get() should also initialize the
vif pointer since it can be called by mac80211 internally
before calling config_interface().

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:08 -08:00
Johannes Berg
471b3efdfc mac80211: add unified BSS configuration
This patch (based on Ron Rindjunsky's) creates a framework for
a unified way to pass BSS configuration to drivers that require
the information, e.g. for implementing power save mode.

This patch introduces new ieee80211_bss_conf structure that is
passed to the driver via the new bss_info_changed() callback
when the BSS configuration changes.

This new BSS configuration infrastructure adds the following
new features:
 * drivers are notified of their association AID
 * drivers are notified of association status

and replaces the erp_ie_changed() callback. The patch also does
the relevant driver updates for the latter change.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:09:43 -08:00
Michael Wu
2bc454b0b3 mac80211: Fix rate reporting regression
Mattias Nissler's "clean up rate selection" patch incorrectly changes
the behavior of txrate setting in sta_info. This patch backs out parts
of the rate selection consolidation in order to fix this issue for now.

Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:09:42 -08:00
Johannes Berg
4fd6931ebe mac80211: implement cfg80211 station handling
This implements station handling from userspace via cfg80211
in mac80211.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:09:39 -08:00
Johannes Berg
5dfdaf58d6 mac80211: add beacon configuration via cfg80211
This patch implements the cfg80211 hooks for configuring beaconing
on an access point interface in mac80211. While doing so, it fixes
a number of races that could badly crash the machine when the
beacon is changed while being requested by the driver.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:09:38 -08:00
Johannes Berg
51fb61e76d mac80211: move interface type to vif structure
Drivers that support mixed AP/STA operation may well need to
know the type of a virtual interface when iterating over them.
The easiest way to support that is to move the interface type
variable into the vif structure.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:09:37 -08:00
Johannes Berg
32bfd35d4b mac80211: dont use interface indices in drivers
This patch gets rid of the if_id stuff where possible in favour of
a new per-virtual-interface structure "struct ieee80211_vif". This
structure is located at the end of the per-interface structure and
contains a variable length driver-use data area.

This has two advantages:
 * removes the need to look up interfaces by if_id, this is better
   for working with network namespaces and performance
 * allows drivers to store and retrieve per-interface data without
   having to allocate own lists/hash tables

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:09:36 -08:00
Al Viro
8524f59d47 ieee80211: beacon->capability is little-endian
It's only a debugging printk, so it went unnoticed; still, the
fix is trivial, so...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:08:48 -08:00
Al Viro
d9e94d5647 ieee80211: fix misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:08:48 -08:00
Al Viro
c414e84b22 ieee80211softmac_auth_resp() fix
The struct ieee8021_auth * passed to it comes straight from skb->data
without any conversions; members of the struct are little-endian, so
we'd better take that into account when doing switch by auth->algorithm,
etc.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:08:47 -08:00
Al Viro
b16f13d00c several missing cpu_to_le16() in ieee80211softmac_capabilities()
on some codepaths we forgot to convert to little-endian as we do on the
rest of them and as the caller expects from us.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:08:46 -08:00
Al Viro
8fffc15dc7 eliminate byteswapping in struct ieee80211_qos_parameters
Make it match the on-the-wire endianness, eliminate byteswapping.
The only driver that used this sucker (ipw2200) updated.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:08:45 -08:00
Jan Engelhardt
1e637c74b0 [IPV4]: Enable use of 240/4 address space.
This short patch modifies the IPv4 networking to enable use of the
240.0.0.0/4 (aka "class-E") address space as propsed in the internet
draft draft-fuller-240space-00.txt.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:44 -08:00
Jarek Poplawski
96750162b5 [NET] gen_estimator: gen_replace_estimator() cosmetic changes
White spaces etc. are changed in gen_replace_estimator() to make it
similar to others in a file.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:43 -08:00
Stephen Hemminger
72348a424f [PKT_SCHED] net: add sparse annotation to ptype_seq_start/stop
Get rid of some more sparse warnings.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:42 -08:00
Stephen Hemminger
aa767bfea4 [PKT_SCHED] net classifier: style cleanup's
Classifier code cleanup. Get rid of printk wrapper, and fix whitespace
and other style stuff reported by checkpatch

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:42 -08:00
Stephen Hemminger
786a90366f [PKT_SCHED] sch_atm: style cleanup
ATM scheduler clean house:
  * get rid of printk and qdisc_priv() wrapper
  * split some assignment in if() statements
  * whitespace and line breaks.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:41 -08:00
Stephen Hemminger
9d127fbdd2 [PKT_SCHED] dsmark: checkpatch warning cleanup
Get rid of all style things checkpatch warns about, indentation and
whitespace.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:40 -08:00
Stephen Hemminger
4c30719f4f [PKT_SCHED] dsmark: handle cloned and non-linear skb's
Make dsmark work properly with non-linear and cloned skb's
Before modifying the header, it needs to check that skb header is
writeable.

Note: this makes the assumption, that if it queues a good skb
then a good skb will come out of the embedded qdisc.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:40 -08:00
David S. Miller
5b0ac72bc5 [PKT_SCHED] dsmark: Use hweight32() instead of convoluted loop.
Based upon a patch by Stephen Hemminger and suggestions
from Patrick McHardy.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:39 -08:00
Stephen Hemminger
81da99ed71 [PKT_SCHED] dsmark: get rid of wrappers
Remove extraneous macro wrappers for printk and qdisc_priv.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:38 -08:00
Stephen Hemminger
d20b3109e9 [IPV6]: addrconf sparse warnings
Get rid of a couple of sparse warnings in IPV6 addrconf code.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:37 -08:00
Patrick McHardy
13a0a096e5 [NET_SCHED]: kill obsolete NET_CLS_POLICE option
The code is already gone for about half a year, the config option
has been kept around to select the replacement options for easier
upgrades. This seems long enough, people upgrading from older
kernels will have to reconfigure a lot anyway.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:37 -08:00
Pavel Emelyanov
91b4f95475 [VLAN]: Move protocol determination to seperate function
I think, that we can make this code flow easier to understand
by introducing the vlan_set_encap_proto() function (I hope the
name is good) to setup the skb proto and merge the paths calling
netif_rx() together.

[Patrick: Modified to apply on top of my previous patches]

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:35 -08:00
Patrick McHardy
31ffdbcb59 [VLAN]: Clean up vlan_skb_recv()
- remove three instances of identical code
- remove unnecessary NULL initialization
- remove obvious and unnecessary comments

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:35 -08:00
Patrick McHardy
ad712087f7 [VLAN]: Update list address
VLAN related mail should go to netdev.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:34 -08:00
Patrick McHardy
2029cc2c84 [VLAN]: checkpatch cleanups
Checkpatch cleanups, consisting mainly of overly long lines and
missing spaces.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:33 -08:00
Patrick McHardy
9dfebcc647 [VLAN]: Turn VLAN_DEV_INFO into inline function
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:32 -08:00
Patrick McHardy
af30151709 [VLAN]: Simplify vlan unregistration
Keep track of the number of VLAN devices in a vlan group. This allows
to have the caller sense when the group is going to be destroyed and
stop using it, which in turn allows to remove the wrapper around
unregister_vlan_dev for the NETDEV_UNREGISTER notifier and avoid
iterating over all possible VLAN ids whenever a device in unregistered.

Also fix what looks like a use-after-free (but is actually safe since
we're holding the RTNL), the real_dev reference should not be dropped
while we still use it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:31 -08:00
Patrick McHardy
acc5efbcd2 [VLAN]: Clean up unregister_vlan_dev
Save two levels of indentation by aborting on error conditions,
remove unnecessary initialization to NULL and remove two obvious
comments.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:30 -08:00
Patrick McHardy
69ab4b7d6d [VLAN]: Clean up initialization code
- move module init/exit functions to end of file, remove some now unnecessary
  forward declarations
- remove some obvious comments
- clean up proc init function and move a proc-related printk there

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:30 -08:00
Patrick McHardy
198a291ce3 [VLAN]: Remove non-implemented ioctls
The GET_VLAN_INGRESS_PRIORITY_CMD/GET_VLAN_EGRESS_PRIORITY_CMD ioctls are
not implemented and won't be, new functionality will be added to the netlink
interface. Remove the code and make the ioctl handler return -EOPNOTSUPP
for unknown commands instead of -EINVAL.

Also remove a comment about passing unknown commands to the underlying
device, that doesn't make any sense since its a VLAN specific ioctl and
if its not implemented here, its implemented nowhere.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:29 -08:00
Patrick McHardy
40f98e1af4 [VLAN]: Clean up debugging and printks
- use pr_* functions and common prefix for non-device related messages

- remove VLAN_ printk levels

- kill lots of useless debugging statements

- remove a few unnecessary printks like for double VID registration (already
  returns -EEXIST) and kill of a number of unnecessary checks in
  vlan_proc_{add,rem}_dev() that are already performed by the caller

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:28 -08:00
Patrick McHardy
62f99efce6 [VLAN]: Kill useless check
vlan->real_dev is always equal to the device since thats what we used
for the lookup. It doesn't even seem worth a WARN_ON or BUG_ON.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:27 -08:00
Patrick McHardy
ef3eb3e59b [VLAN]: Move device setup to vlan_dev.c
Move device setup to vlan_dev.c and make all the VLAN device methods
static.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:26 -08:00
Patrick McHardy
7bd38d778e [VLAN]: Use dev->stats
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:25 -08:00
Patrick McHardy
b7a4a83629 [VLAN]: Kill useless VLAN_NAME define
The only user already includes __FUNCTION__ (vlan_proto_init) in the
output, which is enough to identify what the message is about.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:25 -08:00
Patrick McHardy
891687649a [NET_SCHED]: sch_ingress: remove useless printk
The printk about ingress qdisc registration error can't be triggered
under normal circumstances. Since register_qdisc only fails for two
identical registrations, the only way to trigger it is by loading the
sch_ingress modules multiple times under different names, in which
case we already return -EEXIST to userspace.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:22 -08:00
Patrick McHardy
1389356735 [NET_SCHED]: sch_ingress: avoid a few #ifdefs
Move the repeating "ifndef CONFIG_NET_CLS_ACT/ifdef CONFIG_NETFILTER"
ifdefs into a single condition.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:22 -08:00
Patrick McHardy
645a1e39e4 [NET_SCHED]: sch_ingress: move dependencies to Kconfig
Instead of complaining at scheduler initialization time, check the
dependencies in Kconfig.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:21 -08:00
Patrick McHardy
c6ee877f2e [NET_SCHED]: sch_ingress: remove unnecessary ops
- ->reset is optional
- sch_api provides identical defaults for ->dequeue/->requeue
- ->drop can't happen since ingress never has a parent qdisc

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:20 -08:00
Patrick McHardy
e037834758 [NET_SCHED]: sch_ingress: return proper error code in ingress_graft()
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:20 -08:00
Patrick McHardy
c21d4d5dd2 [NET_SCHED]: sch_ingress: remove unused inner qdisc
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:19 -08:00
Patrick McHardy
cb53c04891 [NET_SCHED]: sch_ingress: remove qdisc_priv() wrapper
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:18 -08:00
Patrick McHardy
a47812211b [NET_SCHED]: sch_ingress: remove excessive debugging
Remove excessive debugging statements and some "future use" stuff.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:18 -08:00
Patrick McHardy
58f4df423e [NET_SCHED]: sch_ingress: formatting fixes
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:17 -08:00
Stephen Hemminger
6f9e98f7a9 [PKT_SCHED] SFQ: whitespace cleanup
Add whitespace around operators, and add a few blank lines to improve
readability.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:16 -08:00
Stephen Hemminger
d46f8dd87d [PKT_SCHED] SFQ: use net_random
SFQ doesn't need true random numbers, it is only using them to salt a
hash. Therefore it is better to use net_random() and avoid any
possible problems with depleting the entropy pool.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:15 -08:00
Stephen Hemminger
d3e994830d [PKT_SCHED] SFQ: timer is deferrable
The perturbation timer used for re-keying can be deferred, it doesn't
need to be deterministic.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:15 -08:00
Denis V. Lunev
51314a17ba [NETNS]: Process FIB rule action in the context of the namespace.
Save namespace context on the fib rule at the rule creation time and
call routing lookup in the correct namespace.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:14 -08:00
Denis V. Lunev
9e3a548781 [NETNS]: FIB rules API cleanup.
Remove struct net from fib_rules_register(unregister)/notify_change
paths and diet code size a bit.

add/remove: 0/0 grow/shrink: 10/12 up/down: 35/-100 (-65)
function                                     old     new   delta
notify_rule_change                           273     280      +7
trie_show_stats                              471     475      +4
fn_trie_delete                               473     477      +4
fib_rules_unregister                         144     148      +4
fib4_rule_compare                            119     123      +4
resize                                      2842    2845      +3
fn_trie_select_default                       515     518      +3
inet_sk_rebuild_header                       836     838      +2
fib_trie_seq_show                            764     766      +2
__devinet_sysctl_register                    276     278      +2
fn_trie_lookup                              1124    1123      -1
ip_fib_check_default                         133     131      -2
devinet_conf_sysctl                          223     221      -2
snmp_fold_field                              126     123      -3
fn_trie_insert                              2091    2086      -5
inet_create                                  876     870      -6
fib4_rules_init                              197     191      -6
fib_sync_down                                452     444      -8
inet_gso_send_check                          334     325      -9
fib_create_info                             3003    2991     -12
fib_nl_delrule                               568     553     -15
fib_nl_newrule                               883     852     -31

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:13 -08:00
Denis V. Lunev
0359238333 [FIB]: Add netns to fib_rules_ops.
The backward link from FIB rules operations to the network namespace
will allow to simplify the API a bit.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:13 -08:00
Vlad Yasevich
853f4b5055 [SCTP]: Correctly initialize error when parameter validation failed.
When parameter validation fails, there should be error causes that
specify what type of failure we've encountered.  If the causes are not
there, we lacked memory to allocated them.  Thus make that the default
value for the error.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:11 -08:00
Adrian Bunk
e9888f5498 [IrDA]: Irport removal - part 1
This patch removes IrPORT and the old dongle drivers (all off them
have replacement drivers).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:10 -08:00
Robie Basak
5d780cd658 [IrDA]: Frame length validation.
When using a stir4200-based USB adaptor to talk to a device that uses
an mcp2150, the stir4200 sometimes drops an incoming frame causing the
mcp2150 to try and retransmit the lost frame. In this combination, the
next frame received from the mcp2150 is often invalid - either an
empty i:rsp or an IrCOMM i:rsp with an invalid clen. These corner
cases are now checked.

Signed-off-by: Robie Basak <rb-oss-1@justgohome.co.uk>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:09 -08:00
Robie Basak
6d97b53e92 [IrDA]: Resend frames on timeout.
When final timer expires, it might also mean that the i:cmd wasn't
received properly. If we have rejected frames, we can try to resend them.

Signed-off-by: Robie Basak <rb-oss-1@justgohome.co.uk>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:08 -08:00
Denis V. Lunev
775516bfa2 [NETNS]: Namespace stop vs 'ip r l' race.
During network namespace stop process kernel side netlink sockets
belonging to a namespace should be closed. They should not prevent
namespace to stop, so they do not increment namespace usage
counter. Though this counter will be put during last sock_put.

The raplacement of the correct netns for init_ns solves the problem
only partial as socket to be stoped until proper stop is a valid
netlink kernel socket and can be looked up by the user processes. This
is not a problem until it resides in initial namespace (no processes
inside this net), but this is not true for init_net.

So, hold the referrence for a socket, remove it from lookup tables and
only after that change namespace and perform a last put.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Tested-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:08 -08:00
Denis V. Lunev
b7c6ba6eb1 [NETNS]: Consolidate kernel netlink socket destruction.
Create a specific helper for netlink kernel socket disposal. This just
let the code look better and provides a ground for proper disposal
inside a namespace.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Tested-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:07 -08:00
Denis V. Lunev
4f84d82f7a [NETNS]: Memory leak on network namespace stop.
Network namespace allocates 2 kernel netlink sockets, fibnl &
rtnl. These sockets should be disposed properly, i.e. by
sock_release. Plain sock_put is not enough.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Tested-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:06 -08:00
Denis V. Lunev
869e58f870 [NETNS]: Double free in netlink_release.
Netlink protocol table is global for all namespaces. Some netlink
protocols have been virtualized, i.e. they have per/namespace netlink
socket. This difference can easily lead to double free if more than 1
namespace is started. Count the number of kernel netlink sockets to
track that this table is not used any more.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Tested-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:05 -08:00
Daniel Lezcano
7d460db953 [IPV6]: Fix ip6_frag ctl
Alexey Dobriyan reported an oops when unsharing the network
indefinitely inside a loop. This is because the ip6_frag is not per
namespace while the ctls are.

That happens at the fragment timer expiration:
inet_frag_secret_rebuild function is called and this one restarts the
timer using the value stored inside the sysctl field.

        "mod_timer(&f->secret_timer, now + f->ctl->secret_interval);"

When the network is unshared, ip6_frag.ctl is initialized with the new
sysctl instances, but ip6_frag has only one instance. A race in this
case will appear because f->ctl can be modified during the read access
in the timer callback.

Until the ip6_frag is not per namespace, I discard the assignation to
the ctl field of ip6_frags in ip6_frag_sysctl_init when the network
namespace is not the init net.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:04 -08:00
Roel Kluin
f59d978275 wireless: fix '!x & y' typo's
Fix priority mistakes similar to '!x & y'

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:03:35 -08:00
Rami Rosen
191df5737e [BRIDGE]: Remove unused include of a header file in ebtables.c
In net/bridge/netfilter/ebtables.c,
- remove unused include of a header file (linux/tty.h) and remove the
  corresponding comment above it.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:48 -08:00
David S. Miller
66688ea7c8 [SCTP]: Fix build warning in sctp_sf_do_5_1C_ack().
Reported by Andrew Morton.

net/sctp/sm_statefuns.c: In function 'sctp_sf_do_5_1C_ack':
net/sctp/sm_statefuns.c:484: warning: 'error' may be used uninitialized in this function

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:48 -08:00
Daniel Lezcano
569d36452e [NETNS][DST] dst: pass the dst_ops as parameter to the gc functions
The garbage collection function receive the dst_ops structure as
parameter. This is useful for the next incoming patchset because it
will need the dst_ops (there will be several instances) and the
network namespace pointer (contained in the dst_ops).

The protocols which do not take care of the namespaces will not be
impacted by this change (expect for the function signature), they do
just ignore the parameter.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:46 -08:00
Eric Dumazet
a6501e080c [IPV4] FIB_HASH: Reduce memory needs and speedup lookups
Currently, sizeof(struct fib_alias) is 24 or 48 bytes on 32/64 bits
arches.

Because of SLAB_HWCACHE_ALIGN requirement, these are rounded to 32 and
64 bytes respectively.

This patch moves rcu to the end of fib_alias, and conditionally
defines it only for CONFIG_IP_FIB_TRIE.

We also remove SLAB_HWCACHE_ALIGN requirement for fib_alias and
fib_node objects because it is not necessary.

(BTW SLUB currently denies it for objects smaller than
cache_line_size() / 2, but not SLAB)

Finally, sizeof(fib_alias) go back to 16 and 32 bytes.

Then, we can embed one fib_alias on each fib_node, to favor locality.
Most of the time access to the fib_alias will be free because one
cache line contains both the list head (fn_alias) and (one of) the
list element.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:46 -08:00
Eric Dumazet
b59cfbf77d [FIB]: Fix rcu_dereference() abuses in fib_trie.c
node_parent() and tnode_get_child() currently use rcu_dereference().

These functions are called from both
- readers only paths (where rcu_dereference() is needed), and
- writer path (where rcu_dereference() is not needed)

To make explicit where rcu_dereference() is really needed, I
introduced new node_parent_rcu() and tnode_get_child_rcu() functions
which use rcu_dereference(), while node_parent() and tnode_get_child()
dont use it.

Then I changed calling sites where rcu_dereference() was really needed
to call the _rcu() variants.

This should have no impact but for alpha architecture, and may help
future sparse checks.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:45 -08:00
Eric Dumazet
95b7d924a5 [ROSE]: Supress sparse warnings
CHECK   net/rose/af_rose.c
net/rose/af_rose.c:125:11: warning: expensive signed divide
net/rose/af_rose.c:976:46: warning: expensive signed divide
net/rose/af_rose.c:1379:13: warning: context imbalance in 'rose_info_start' - wrong count at exit
net/rose/af_rose.c:1406:13: warning: context imbalance in 'rose_info_stop' - unexpected unlock
  CHECK   net/rose/rose_in.c
net/rose/rose_in.c:185:25: warning: expensive signed divide
  CHECK   net/rose/rose_route.c
net/rose/rose_route.c:997:46: warning: expensive signed divide
net/rose/rose_route.c:1070:13: warning: context imbalance in 'rose_node_start' - wrong count at exit
net/rose/rose_route.c:1093:13: warning: context imbalance in 'rose_node_stop' - unexpected unlock
net/rose/rose_route.c:1146:13: warning: context imbalance in 'rose_neigh_start' - wrong count at exit
net/rose/rose_route.c:1169:13: warning: context imbalance in 'rose_neigh_stop' - unexpected unlock
net/rose/rose_route.c:1229:13: warning: context imbalance in 'rose_route_start' - wrong count at exit
net/rose/rose_route.c:1252:13: warning: context imbalance in 'rose_route_stop' - unexpected unlock

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:44 -08:00
Eric Dumazet
5c17d5f112 [ATM]: Suppress some sparse warnings
CHECK   net/atm/br2684.c
net/atm/br2684.c:665:13: warning: context imbalance in 'br2684_seq_start' - wrong count at exit
net/atm/br2684.c:676:13: warning: context imbalance in 'br2684_seq_stop' - unexpected unlock
  CHECK   net/atm/lec.c
net/atm/lec.c:196:23: warning: expensive signed divide
  CHECK   net/atm/proc.c
net/atm/proc.c:151:14: warning: context imbalance in 'vcc_seq_start' - wrong count at exit
net/atm/proc.c:154:13: warning: context imbalance in 'vcc_seq_stop' - unexpected unlock

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:43 -08:00
Eric Dumazet
ca629f2472 [APPLETALK]: Annotations to clear sparse warnings
CHECK   net/appletalk/aarp.c
net/appletalk/aarp.c:951:14: warning: context imbalance in 'aarp_seq_start' - wrong count at exit
net/appletalk/aarp.c:977:13: warning: context imbalance in 'aarp_seq_stop' - unexpected unlock
  CHECK   net/appletalk/atalk_proc.c
net/appletalk/atalk_proc.c:34:11: warning: context imbalance in 'atalk_seq_interface_start' - wrong count at exit
net/appletalk/atalk_proc.c:54:13: warning: context imbalance in 'atalk_seq_interface_stop' - unexpected unlock
net/appletalk/atalk_proc.c:93:11: warning: context imbalance in 'atalk_seq_route_start' - wrong count at exit
net/appletalk/atalk_proc.c:113:13: warning: context imbalance in 'atalk_seq_route_stop' - unexpected unlock
net/appletalk/atalk_proc.c:161:11: warning: context imbalance in 'atalk_seq_socket_start' - wrong count at exit
net/appletalk/atalk_proc.c:178:13: warning: context imbalance in 'atalk_seq_socket_stop' - unexpected unlock

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:43 -08:00
Patrick McHardy
c71e916708 [NETFILTER]: nf_conntrack: make print_conntrack function optional for l4protos
Allows to remove five empty implementations.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:42 -08:00
Patrick McHardy
c56cc9c07b [NETFILTER]: nf_conntrack: remove print_conntrack function from l3protos
Its unused and unlikely to ever be used.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:41 -08:00
Patrick McHardy
b334aadc3c [NETFILTER]: nf_conntrack: clean up a few header files
- Remove declarations of non-existing variables and functions
- Move helper init/cleanup function declarations to nf_conntrack_helper.h
- Remove unneeded __nf_conntrack_attach declaration and make it static

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:41 -08:00
Patrick McHardy
4f536522da [NETFILTER]: kill nf_sysctl.c
Since there now is generic support for shared sysctl paths, the only
remains are the net/netfilter and net/ipv4/netfilter paths. Move them
to net/netfilter/core.c and net/ipv4/netfilter.c and kill nf_sysctl.c.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:40 -08:00
Patrick McHardy
86c0bf4095 [NETFILTER]: nf_conntrack_sctp: remove timeout indirection
Instead of keeping pointers to the timeout values in a table, simply
put the timeout values in the table directly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:39 -08:00
Patrick McHardy
9b1c2cfd7a [NETFILTER]: nf_conntrack_sctp: replace magic value by symbolic constant
Use SCTP_CHUNK_FLAG_T instead of 0x1.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:39 -08:00
Patrick McHardy
4a64830af0 [NETFILTER]: nf_conntrack_sctp: don't take sctp_lock once per chunk
Don't take and release the lock once per SCTP chunk, simply hold it
the entire time while iterating through the chunks.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:37 -08:00
Patrick McHardy
efe9f68afe [NETFILTER]: nf_conntrack_sctp: rename "newconntrack" variable
The name is misleading, it holds the new connection state, so rename it
to "newstate". Also rename "oldsctpstate" to "oldstate" for consistency.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:36 -08:00
Patrick McHardy
b37e933ac7 [NETFILTER]: nf_conntrack_sctp: consolidate sctp_packet() error paths
Consolidate error paths and use proper symbolic return value instead
of magic values.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:36 -08:00
Patrick McHardy
8528819adc [NETFILTER]: nf_conntrack_sctp: reduce line length further
Eliminate a few lines over 80 characters by using a local variable to
hold the conntrack direction instead of using CTINFO2DIR everywhere.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:35 -08:00
Patrick McHardy
112f35c9c1 [NETFILTER]: nf_conntrack_sctp: reduce line length
Reduce the length of some overly long lines by renaming all
"conntrack" variables to "ct".

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:34 -08:00
Patrick McHardy
35c6d3cbe1 [NETFILTER]: nf_conntrack_sctp: use proper types for bitops
Use unsigned long instead of char for the bitmap and removed lots
of casts.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:34 -08:00
Patrick McHardy
5447d4777c [NETFILTER]: nf_conntrack_sctp: basic cleanups
Reindent switch cases properly, get rid of weird constructs like "!(x == y)",
put logical operations on the end of the line instead of the next line, get
rid of superfluous braces.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:33 -08:00
Patrick McHardy
2d6462869f [NETFILTER]: nf_conntrack_tcp: remove timeout indirection
Instead of keeping pointers to the timeout values in a table, simply
put the timeout values in the table directly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:32 -08:00
Patrick McHardy
a5e73c29d9 [NETFILTER]: nf_conntrack_{tcp,sctp}: shrink state table
The TCP and SCTP conntrack state transition tables only holds
small numbers, but gcc uses 4 byte per entry for the enum. Switching
to an u8 reduces the size from 480 to 120 bytes for TCP and from
576 to 144 bytes for SCTP.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:32 -08:00
Patrick McHardy
77e2420b85 [NETFILTER]: nf_conntrack_{tcp,sctp}: mark state table const
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:31 -08:00
Denys Vlasenko
9ba99b0d3f [NETFILTER]: ipt_REJECT: properly handle IP options
The current TCP RST construction reuses the old packet and can't
deal with IP options as a consequence of that. Construct the
RST from scratch instead.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:30 -08:00
Denys Vlasenko
022748a935 [NETFILTER]: {ip,ip6}_tables: remove some inlines
This patch removes inlines except those which are used
by packet matching code and thus are performance-critical.

Before:

$ size */*/*/ip*tables*.o
   text    data     bss     dec     hex filename
   6402     500      16    6918    1b06 net/ipv4/netfilter/ip_tables.o
   7130     500      16    7646    1dde net/ipv6/netfilter/ip6_tables.o

After:

$ size */*/*/ip*tables*.o
   text    data     bss     dec     hex filename
   6307     500      16    6823    1aa7 net/ipv4/netfilter/ip_tables.o
   7010     500      16    7526    1d66 net/ipv6/netfilter/ip6_tables.o

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:29 -08:00
Jan Engelhardt
1a50c5a1fe [NETFILTER]: xt_iprange match, revision 1
Adds IPv6 support to xt_iprange, making it possible to match on IPv6
address ranges with ip6tables.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:28 -08:00
Jan Engelhardt
f72e25a897 [NETFILTER]: Rename ipt_iprange to xt_iprange
This patch moves ipt_iprange to xt_iprange, in preparation for adding
IPv6 support to xt_iprange.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:27 -08:00
Jan Engelhardt
2ae15b64e6 [NETFILTER]: Update modules' descriptions
Updates the MODULE_DESCRIPTION() tags for all Netfilter modules,
actually describing what the module does and not just
"netfilter XYZ target".

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:26 -08:00
Jan Engelhardt
917b6fbd6e [NETFILTER]: xt_policy: use the new union nf_inet_addr
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:25 -08:00
Jan Engelhardt
57de0abbff [NETFILTER]: xt_pkttype: IPv6 multicast address recognition
Signed-off-by: Jan Engelhart <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:25 -08:00
Jan Engelhardt
13b0e83b5b [NETFILTER]: xt_pkttype: Add explicit check for IPv4
In the PACKET_LOOPBACK case, the skb data was always interpreted as
IPv4, but that is not valid for IPv6, obviously. Fix this by adding an
extra condition to check for AF_INET.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:24 -08:00
Jan Engelhardt
17b0d7ef65 [NETFILTER]: xt_mark match, revision 1
Introduces the xt_mark match revision 1. It uses fixed types,
eventually obsoleting revision 0 some day (uses nonfixed types).

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:23 -08:00
Jan Engelhardt
64eb12f997 [NETFILTER]: xt_conntrack match, revision 1
Introduces the xt_conntrack match revision 1. It uses fixed types, the
new nf_inet_addr and comes with IPv6 support, thereby completely
superseding xt_state.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:23 -08:00
Jan Engelhardt
96e3227265 [NETFILTER]: xt_connmark match, revision 1
Introduces the xt_connmark match revision 1. It uses fixed types,
eventually obsoleting revision 0 some day (uses nonfixed types).
(Unfixed types like "unsigned long" do not play well with mixed
user-/kernelspace "bitness", e.g. 32/64, as is common on SPARC64,
and need extra compat code.)

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:21 -08:00
Jan Engelhardt
e0a812aea5 [NETFILTER]: xt_MARK target, revision 2
Introduces the xt_MARK target revision 2. It uses fixed types, and
also uses the more expressive XOR logic.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:21 -08:00
Jan Engelhardt
0dc8c76029 [NETFILTER]: xt_CONNMARK target, revision 1
Introduces the xt_CONNMARK target revision 1. It uses fixed types, and
also uses the more expressive XOR logic. Futhermore, it allows to
selectively pick bits from both the ctmark and the nfmark in the SAVE
and RESTORE operations.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:20 -08:00
Jan Engelhardt
cdfe8b9797 [NETFILTER]: xt_TOS: Properly set the TOS field
Fix incorrect mask value passed to ipv4_change_dsfield/ipv6_change_dsfield.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:18 -08:00
Jan Engelhardt
9bb268ed7c [NETFILTER]: xt_TOS: Change semantic of mask value
This patch changes the behavior of xt_TOS v1 so that the mask value
the user supplies means "zero out these bits" rather than "keep these
bits". This is more easy on the user, as (I would assume) people keep
more bits than zeroing, so, an example:

	Action:     Set bit 0x01.
    	before (&): iptables -j TOS --set-tos 0x01/0xFE
    	after (&~): iptables -j TOS --set-tos 0x01/0x01

This is not too "tragic" with xt_TOS, but where larger fields are used
(e.g. proposed xt_MARK v2), `--set-xmar 0x01/0x01` vs. `--set-xmark
0x01/0xFFFFFFFE` really makes a difference. Other target(!) modules,
such as xt_TPROXY also use &~ rather than &, so let's get to a common
ground.

(Since xt_TOS has not yet left the development tree en direction to
mainline, the semantic can be changed as proposed without breaking
iptables.)

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:18 -08:00
Jan Engelhardt
11fa2aa362 [NETFILTER]: remove ipt_TOS.c
Commit 88c85d81f74f92371745158aebc5cbf490412002 forgot to remove the
old ipt_TOS file (whose code has been merged into xt_DSCP). Remove
it now.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:17 -08:00
Patrick McHardy
8ce22fcab4 [NETFILTER]: Remove some EXPERIMENTAL dependencies
Most of the netfilter modules are not considered experimental anymore,
the only ones I want to keep marked as EXPERIMENTAL are:

- TCPOPTSTRIP target, which is brand new.

- SANE helper, which is quite new.

- CLUSTERIP target, which I believe hasn't had much testing despite
  being in the kernel for quite a long time.

- SCTP match and conntrack protocol, which are a mess and need to
  be reviewed and cleaned up before I would trust them.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:16 -08:00
Patrick McHardy
b26e76b7ce [NETFILTER]: Hide a few more options under NETFILTER_ADVANCED
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:16 -08:00
Stephen Hemminger
7f9b80529b [IPV4]: fib hash|trie initialization
Initialization of the slab cache's should be done when IP is
initialized to make sure of available memory, and that code can be
marked __init.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:15 -08:00
Stephen Hemminger
d717a9a620 [IPV4] fib_trie: size and statistics
Show number of entries in trie, the size field was being set but never used,
but it only counted leaves, not all entries. Refactor the two cases in
fib_triestat_seq_show into a single routine.

Note: the stat structure was being malloc'd but the stack usage isn't so
high (288 bytes) that it is worth the additional complexity.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:14 -08:00
Eric Dumazet
28d36e3702 [FIB]: Avoid using static variables without proper locking
fib_trie_seq_show() uses two helper functions, rtn_scope() and
rtn_type() that can write to static storage without locking.

Just pass to them a temporary buffer to avoid potential corruption
(probably not triggerable but still...)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:13 -08:00
Denis V. Lunev
39a6d06300 [NETNS]: Process inet_confirm_addr in the correct namespace.
inet_confirm_addr can be called with NULL in_dev from arp_ignore iff
scope is RT_SCOPE_LINK.

Lets always pass the device and check for RT_SCOPE_LINK scope inside
inet_confirm_addr. This let us take network namespace from in_device a
need for an additional argument.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:13 -08:00
Denis V. Lunev
9bd85e3264 [IPV4]: Remove extra argument from arp_ignore.
arp_ignore has two arguments: dev & in_dev. dev is used for
inet_confirm_addr calling only.

inet_confirm_addr, in turn, either gets in_dev from the device passed
or iterates over all network devices if the device passed is NULL. It
seems logical to directly pass in_dev into inet_confirm_addr.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:12 -08:00
Denis V. Lunev
06f0511df1 [ARP]: neigh_parms_put(destroy) are essentially local to core/neighbour.c.
Make them static.

[ Moved the inline before, instead of after, call sites. -DaveM ]

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:11 -08:00
Denis V. Lunev
14db4133d5 [ARP]: Remove forward declaration of neigh_changeaddr.
No need for this. It is declared in the neighbour.h

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:11 -08:00
Denis V. Lunev
486b51d370 [ARP]: Remove overkill checks from neigh_param_alloc.
Valid network device is always passed into neigh_param_alloc, so
remove extra checking for dev == NULL. Additionally, cleanup bogus
netns assignment.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:10 -08:00
Denis V. Lunev
72132c1b6c [IPV4]: fib_rules_unregister is essentially void.
fib_rules_unregister is called only after successful register and the
return code is never checked.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:09 -08:00
Denis V. Lunev
2db82b534b [NETNS]: Make arp code network namespace consistent.
Some calls in the arp.c have network namespace as an argument. Getting
init_net inside these functions is simply inconsistent. Fix this.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:08 -08:00
Denis V. Lunev
a79878f00d [ARP]: Move inet_addr_type call after simple error checks in arp_contructor.
The neighbour entry will be destroyed in the case of error, so it is
pointless to perform constly routing table lookup in this case.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:08 -08:00
Pavel Emelyanov
a308da1627 [NETNS][RAW]: Create the /proc/net/raw(6) in each namespace.
To do so, just register the proper subsystem and create files in
->init callbacks.

No other special per-namespace handling for raw sockets is required.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:07 -08:00
Pavel Emelyanov
e5ba31f11f [NETNS][RAW]: Eliminate explicit init_net references.
Happily, in all the rest places (->bind callbacks only), that require the
struct net, we have a socket, so get the net from it.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:06 -08:00
Pavel Emelyanov
f51d599fbe [NETNS][RAW]: Make /proc/net/raw(6) show per-namespace socket list.
Pull the struct net pointer up to the showing functions
to filter the sockets depending on their namespaces.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:06 -08:00
Pavel Emelyanov
be185884b3 [NETNS][RAW]: Make ipv[46] raw sockets lookup namespaces aware.
This requires just to pass the appropriate struct net pointer
into __raw_v[46]_lookup and skip sockets that do not belong
to a needed namespace.

The proper net is get from skb->dev in all the cases.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:05 -08:00
Eric Dumazet
8d96544475 [FIB]: full_children & empty_children should be uint, not ushort
If declared as unsigned short, these fields can overflow, and whole
trie logic is broken. I could not make the machine crash, but some
tnode can never be freed.

Note for 64 bit arches : By reordering t_key and parent in [node,
leaf, tnode] structures, we can use 32 bits hole after t_key so that
sizeof(struct tnode) doesnt change after this patch.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:04 -08:00
Eric Dumazet
f16f3026db [AX25]: sparse cleanups
net/ax25/ax25_route.c:251:13: warning: context imbalance in
'ax25_rt_seq_start' - wrong count at exit
net/ax25/ax25_route.c:276:13: warning: context imbalance in 'ax25_rt_seq_stop'
- unexpected unlock
net/ax25/ax25_std_timer.c:65:25: warning: expensive signed divide
net/ax25/ax25_uid.c:46:1: warning: symbol 'ax25_uid_list' was not declared.
Should it be static?
net/ax25/ax25_uid.c:146:13: warning: context imbalance in 'ax25_uid_seq_start'
- wrong count at exit
net/ax25/ax25_uid.c:169:13: warning: context imbalance in 'ax25_uid_seq_stop'
- unexpected unlock
net/ax25/af_ax25.c:573:28: warning: expensive signed divide
net/ax25/af_ax25.c:1865:13: warning: context imbalance in 'ax25_info_start' -
wrong count at exit
net/ax25/af_ax25.c:1888:13: warning: context imbalance in 'ax25_info_stop' -
unexpected unlock
net/ax25/ax25_ds_timer.c:133:25: warning: expensive signed divide

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:03 -08:00
Eric Dumazet
6bf1574ee3 [X25]: Avoid divides and sparse warnings
CHECK   net/x25/af_x25.c
net/x25/af_x25.c:117:46: warning: expensive signed divide
   CHECK   net/x25/x25_facilities.c
net/x25/x25_facilities.c:209:30: warning: expensive signed divide
   CHECK   net/x25/x25_in.c
net/x25/x25_in.c:250:26: warning: expensive signed divide
   CHECK   net/x25/x25_proc.c
net/x25/x25_proc.c:48:11: warning: context imbalance in 'x25_seq_route_start'
- wrong count at exit
net/x25/x25_proc.c:72:13: warning: context imbalance in 'x25_seq_route_stop' -
unexpected unlock
net/x25/x25_proc.c:112:11: warning: context imbalance in
'x25_seq_socket_start' - wrong count at exit
net/x25/x25_proc.c:129:13: warning: context imbalance in 'x25_seq_socket_stop'
- unexpected unlock
net/x25/x25_proc.c:190:11: warning: context imbalance in
'x25_seq_forward_start' - wrong count at exit
net/x25/x25_proc.c:215:13: warning: context imbalance in
'x25_seq_forward_stop' - unexpected unlock
   CHECK   net/x25/x25_subr.c
net/x25/x25_subr.c:362:57: warning: expensive signed divide

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:03 -08:00
Eric Dumazet
4dde4610c4 [IPV4] fib_trie: removes a memset() call in tnode_new()
tnode_alloc() already clears allocated memory, using kcalloc() or
alloc_pages(GFP_KERNEL|__GFP_ZERO, ...)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:02 -08:00
David S. Miller
88ebc72f68 [IPV4] FIB: Include nexthop device indexes in fib_info hashfn.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:02:01 -08:00
Ilpo Järvinen
d88c305a03 [PKT_SCHED] HTB: htb_classid is dead static inline
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:59 -08:00
Ilpo Järvinen
8519660b98 [NET] core/utils.c: digit2bin is dead static inline
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:58 -08:00
Eric Dumazet
112d8cfcbf [FIB]: Reduce text size of net/ipv4/fib_trie.o
In struct tnode, we use two fields of 5 bits for 'pos' and 'bits'.
Switching to plain 'unsigned char' (8 bits) take the same space
because of compiler alignments, and reduce text size by 435 bytes
on i386.

On i386 :
$ size net/ipv4/fib_trie.o.before_patch net/ipv4/fib_trie.o
    text    data     bss     dec     hex filename
   13714       4      64   13782    35d6 net/ipv4/fib_trie.o.before
   13279       4      64   13347    3423 net/ipv4/fib_trie.o

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:58 -08:00
Ilpo Järvinen
b9aed45507 [NETFILTER] xt_policy.c: kill some bloat
net/netfilter/xt_policy.c:
  policy_mt | -906
 1 function changed, 906 bytes removed, diff: -906

net/netfilter/xt_policy.c:
  match_xfrm_state | +427
 1 function changed, 427 bytes added, diff: +427

net/netfilter/xt_policy.o:
 2 functions changed, 427 bytes added, 906 bytes removed, diff: -479

Alternatively, this could be done by combining identical
parts of the match_policy_in/out()

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:57 -08:00
Stephen Hemminger
c95aaf9af5 [IPV4] fib_trie: Fix sparse warnings.
Make FIB TRIE go through sparse checker without warnings.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:56 -08:00
Stephen Hemminger
66a2f7fd2f [IPV4] fib_trie: Add statistics.
The FIB TRIE code has a bunch of statistics, but the code is hidden
behind an ifdef that was never implemented. Since it was dead code, it
was broken as well.

This patch fixes that by making it a config option.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:56 -08:00
Stephen Hemminger
a6db901092 [IPV4] FIB: printk related cleanups
printk related cleanups:
 * Get rid of unused printk wrappers.
 * Make bug checks into KERN_WARNING because KERN_DEBUG gets ignored
 * Turn one cryptic old message into something real
 * Make sure all messages have KERN_XXX

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:55 -08:00
Stephen Hemminger
fea86ad812 [IPV4] fib_trie: fib_insert_node cleanup
The only error from fib_insert_node is if memory allocation fails, so
instead of passing by reference, just use the convention of returning
NULL.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:54 -08:00
Stephen Hemminger
187b5188a7 [IPV4] fib_trie: Use %u for unsigned printfs.
Use %u instead of %d when printing unsigned values.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:53 -08:00
Stephen Hemminger
93e4308b3b [IPV4] fib_trie: Get rid of unused revision element.
The revision element must of been part of an earlier design, because
currently it is set but never used.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:53 -08:00
Stephen Hemminger
c28a1cf448 [IPV4] fib_trie: Get rid of trie_init().
trie_init is worthless it is just zeroing stuff that is already zero!
Move the memset() down to make it obvious.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:52 -08:00
Ilpo Järvinen
6db105db95 [PKTGEN]: uninline getCurUs
net/core/pktgen.c:
  pktgen_stop_device   |  -50
  pktgen_run           | -105
  pktgen_if_show       |  -37
  pktgen_thread_worker | -702
 4 functions changed, 894 bytes removed, diff: -894

net/core/pktgen.c:
  getCurUs |  +36
 1 function changed, 36 bytes added, diff: +36

net/core/pktgen.o:
 5 functions changed, 36 bytes added, 894 bytes removed, diff: -858

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:51 -08:00
Ilpo Järvinen
56e252c748 [PKTGEN]: Kill dead static inlines
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:51 -08:00
Ilpo Järvinen
3f25252675 [NETLINK] af_netlink: kill some bloat
net/netlink/af_netlink.c:
  netlink_realloc_groups        |  -46
  netlink_insert                |  -49
  netlink_autobind              |  -94
  netlink_clear_multicast_users |  -48
  netlink_bind                  |  -55
  netlink_setsockopt            |  -54
  netlink_release               |  -86
  netlink_kernel_create         |  -47
  netlink_change_ngroups        |  -56
 9 functions changed, 535 bytes removed, diff: -535

net/netlink/af_netlink.c:
  netlink_table_ungrab |  +53
 1 function changed, 53 bytes added, diff: +53

net/netlink/af_netlink.o:
 10 functions changed, 53 bytes added, 535 bytes removed, diff: -482

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:50 -08:00
Ilpo Järvinen
50eb431d6e [IPV6] route: kill some bloat
net/ipv6/route.c:
  ip6_pkt_prohibit_out | -130
  ip6_pkt_discard      | -261
  ip6_pkt_discard_out  | -130
  ip6_pkt_prohibit     | -261
 4 functions changed, 782 bytes removed, diff: -782

net/ipv6/route.c:
  ip6_pkt_drop | +300
 1 function changed, 300 bytes added, diff: +300

net/ipv6/route.o:
 5 functions changed, 300 bytes added, 782 bytes removed, diff: -482

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:49 -08:00
Ilpo Järvinen
1486cbd777 [XFRM] xfrm_policy: kill some bloat
net/xfrm/xfrm_policy.c:
  xfrm_audit_policy_delete | -692
  xfrm_audit_policy_add    | -692
 2 functions changed, 1384 bytes removed, diff: -1384

net/xfrm/xfrm_policy.c:
  xfrm_audit_common_policyinfo | +704
 1 function changed, 704 bytes added, diff: +704

net/xfrm/xfrm_policy.o:
 3 functions changed, 704 bytes added, 1384 bytes removed, diff: -680

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:49 -08:00
Ilpo Järvinen
cea14e0ed6 [TCP]: Uninline tcp_is_cwnd_limited
net/ipv4/tcp_cong.c:
  tcp_reno_cong_avoid |  -65
 1 function changed, 65 bytes removed, diff: -65

net/ipv4/arp.c:
  arp_ignore |   -5
 1 function changed, 5 bytes removed, diff: -5

net/ipv4/tcp_bic.c:
  bictcp_cong_avoid |  -57
 1 function changed, 57 bytes removed, diff: -57

net/ipv4/tcp_cubic.c:
  bictcp_cong_avoid |  -61
 1 function changed, 61 bytes removed, diff: -61

net/ipv4/tcp_highspeed.c:
  hstcp_cong_avoid |  -63
 1 function changed, 63 bytes removed, diff: -63

net/ipv4/tcp_hybla.c:
  hybla_cong_avoid |  -85
 1 function changed, 85 bytes removed, diff: -85

net/ipv4/tcp_htcp.c:
  htcp_cong_avoid |  -57
 1 function changed, 57 bytes removed, diff: -57

net/ipv4/tcp_veno.c:
  tcp_veno_cong_avoid |  -52
 1 function changed, 52 bytes removed, diff: -52

net/ipv4/tcp_scalable.c:
  tcp_scalable_cong_avoid |  -61
 1 function changed, 61 bytes removed, diff: -61

net/ipv4/tcp_yeah.c:
  tcp_yeah_cong_avoid |  -75
 1 function changed, 75 bytes removed, diff: -75

net/ipv4/tcp_illinois.c:
  tcp_illinois_cong_avoid |  -54
 1 function changed, 54 bytes removed, diff: -54

net/dccp/ccids/ccid3.c:
  ccid3_update_send_interval |   -7
  ccid3_hc_tx_packet_recv    |   +7
 2 functions changed, 7 bytes added, 7 bytes removed, diff: +0

net/ipv4/tcp_cong.c:
  tcp_is_cwnd_limited |  +88
 1 function changed, 88 bytes added, diff: +88

built-in.o:
 14 functions changed, 95 bytes added, 642 bytes removed, diff: -547

...Again some gcc artifacts visible as well.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:48 -08:00
Ilpo Järvinen
490d504693 [TCP]: Uninline tcp_set_state
net/ipv4/tcp.c:
  tcp_close_state | -226
  tcp_done        | -145
  tcp_close       | -564
  tcp_disconnect  | -141
 4 functions changed, 1076 bytes removed, diff: -1076

net/ipv4/tcp_input.c:
  tcp_fin               |  -86
  tcp_rcv_state_process | -164
 2 functions changed, 250 bytes removed, diff: -250

net/ipv4/tcp_ipv4.c:
  tcp_v4_connect | -209
 1 function changed, 209 bytes removed, diff: -209

net/ipv4/arp.c:
  arp_ignore |   +5
 1 function changed, 5 bytes added, diff: +5

net/ipv6/tcp_ipv6.c:
  tcp_v6_connect | -158
 1 function changed, 158 bytes removed, diff: -158

net/sunrpc/xprtsock.c:
  xs_sendpages |   -2
 1 function changed, 2 bytes removed, diff: -2

net/dccp/ccids/ccid3.c:
  ccid3_update_send_interval |   +7
 1 function changed, 7 bytes added, diff: +7

net/ipv4/tcp.c:
  tcp_set_state | +238
 1 function changed, 238 bytes added, diff: +238

built-in.o:
 12 functions changed, 250 bytes added, 1695 bytes removed, diff: -1445

I've no explanation why some unrelated changes seem to occur
consistently as well (arp_ignore, ccid3_update_send_interval;
I checked the arp_ignore asm and it seems to be due to some
reordered of operation order causing some extra opcodes to be
generated). Still, the benefits are pretty obvious from the
codiff's results.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:47 -08:00
Daniel Lezcano
389f661224 [NETNS][IPV6]: inet6_addr - make ipv6_chk_home_addr namespace aware
Looks if the address is belonging to the network namespace, otherwise
discard the address for the check.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:46 -08:00
Daniel Lezcano
1cab3da6be [NETNS][IPV6]: inet6_addr - ipv6_get_ifaddr namespace aware
The inet6_addr_lst is browsed taking into account the network
namespace specified as parameter. If an address does not belong
to the specified namespace, it is ignored.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:45 -08:00
Daniel Lezcano
06bfe655e7 [NETNS][IPV6]: inet6_addr - ipv6_chk_same_addr namespace aware
This patch makes ipv6_chk_same_addr function to be aware of the
network namespace. The addresses not belonging to the network
namespace are discarded.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:45 -08:00
Daniel Lezcano
bfeade0870 [NETNS][IPV6]: inet6_addr - check ipv6 address per namespace
When a new address is added, we must check if the new address does not
already exists.  This patch makes this check to be aware of a network
namespace, so the check will look if the address already exists for
the specified network namespace. While the addresses are browsed, the
addresses which do not belong to the namespace are discarded.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:44 -08:00
Daniel Lezcano
3c40090a0f [NETNS][IPV6]: inet6_addr - isolate inet6 addresses from proc file
Make /proc/net/if_inet6 show only inet6 addresses belonging to the
namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:43 -08:00
David S. Miller
9993e7d313 [TCP]: Do not purge sk_forward_alloc entirely in tcp_delack_timer().
Otherwise we beat heavily on the global tcp_memory atomics
when all of the sockets in the system are slowly sending
perioding packet clumps.

Noticed and suggested by Eric Dumazet.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:42 -08:00
Pavel Emelyanov
e186932b3d [NETNS]: Use the per-net ipv6_devconf(_all) in sysctl handlers
Actually the net->ipv6.devconf_all can be used in a few places,
but to keep the /proc/sys/net/ipv6/conf/ sysctls work consistently
in the namespace we should use the per-net devconf_all in the
sysctl "forwarding" handler.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:41 -08:00
Pavel Emelyanov
441fc2a239 [NETNS]: Use the per-net ipv6_devconf_dflt
All its users are in net/ipv6/addrconf.c's sysctl handlers.
Since they already have the struct net to get from, the
per-net ipv6_devconf_dflt can already be used.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:40 -08:00
Pavel Emelyanov
e0da5a480c [NETNS]: Create ipv6 devconf-s for namespaces
This is the core. Declare and register the pernet subsys for
addrconf. The init callback the will create the devconf-s.

The init_net will reuse the existing statically declared confs,
so that accessing them from inside the ipv6 code will still
work.

The register_pernet_subsys() is moved above the ipv6_add_dev()
call for loopback, because this function will need the
net->devconf_dflt pointer to be already set.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:40 -08:00
Pavel Emelyanov
bff16c2f99 [NETNS]: Make the ctl-tables per-namespace
This includes passing the net to __addrconf_sysctl_register
and saving this on the ctl_table->extra2 to be used in
handlers (those, needing it).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:39 -08:00
Pavel Emelyanov
9589731220 [NETNS]: Make the __addrconf_sysctl_register return an error
This error code will be needed to abort the namespace
creation if needed.

Probably, this is to be checked when a new device is
created (currently it is ignored).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:38 -08:00
Pavel Emelyanov
408c4768cd [NETNS]: Clean out the ipv6-related sysctls creation/destruction
The addrconf sysctls and neigh sysctls are registered and
unregistered always in pairs, so they can be joined into
one (well, two) functions, that accept the struct inet6_dev
and do all the job.

This also get rids of unneeded ifdefs inside the code.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:38 -08:00
Denis V. Lunev
4250846146 [NEIGH]: Make /proc/net/arp opening consistent with seq_net_open semantics
seq_open_net requires that first field of the seq->private data to be
struct seq_net_private. In reality this is a single pointer to a
struct net for now. The patch makes code consistent.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:37 -08:00
Denis V. Lunev
ae22120ad8 [ATM]: Simplify /proc/net/atm/arp opening
The iterator state->ns.neigh_sub_iter initialization is moved from
arp_seq_open to clip_seq_start for convinience. This should not be a
problem as the iterator will be used only after the seq_start
callback.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:36 -08:00
Denis V. Lunev
e5d69b9f4a [ATM]: Oops reading net/atm/arp
cat /proc/net/atm/arp causes the NULL pointer dereference in the
get_proc_net+0xc/0x3a. This happens as proc_get_net believes that the
parent proc dir entry contains struct net.

Fix this assumption for "net/atm" case.

The problem is introduced by the commit c0097b07abf5f92ab135d024dd41bd2aada1512f
from Eric W. Biederman/Daniel Lezcano.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:36 -08:00
Denis V. Lunev
8cced9eff1 [NETNS]: Enable routing configuration in non-initial namespace.
I.e. remove the net != &init_net checks from the places, that now can
handle other-than-init net namespace.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:35 -08:00
Denis V. Lunev
226b0b4a51 [NETNS]: Replace init_net with the correct context in fib_frontend.c
Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:34 -08:00
Denis V. Lunev
1bad118a33 [NETNS]: Pass namespace through ip_rt_ioctl.
... up to rtentry_to_fib_config

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:34 -08:00
Denis V. Lunev
4b5d47d4d3 [NETNS]: Correctly fill fib_config data.
Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:33 -08:00
Denis V. Lunev
6bd48fcf73 [NETNS]: Provide correct namespace for fibnl netlink socket.
This patch makes the netlink socket to be per namespace. That allows
to have each namespace its own socket for routing queries.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:32 -08:00
Denis V. Lunev
e4aef8aea3 [NETNS]: Place fib tables into netns.
The preparatory work has been done. All we need is to substitute
fib_table_hash with net->ipv4.fib_table_hash. Netns context is
available when required.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:31 -08:00
Denis V. Lunev
e4e4971c5f [NETNS]: Namespacing IPv4 fib rules.
The final trick for rules: place fib4_rules_ops into struct net and
modify initialization path for this.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:31 -08:00
Denis V. Lunev
1c340b2fd7 [NETNS]: Show routing information from correct namespace (fib_trie.c)
This is the second part (for the CONFIG_IP_FIB_TRIE case) of the patch
#4, where we have created proc files in namespaces.

Now we can dump correct info in them.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:30 -08:00
Denis V. Lunev
6e04d01dfa [NETNS]: Show routing information from correct namespace (fib_hash.c)
This is the second part (for the CONFIG_IP_FIB_HASH case) of the patch
#4, where we have created proc files in namespaces.

Now we can dump correct info in them.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:29 -08:00
Denis V. Lunev
4d1169c1e7 [NETNS]: Add netns to nl_info structure.
nl_info is used to track the end-user destination of routing change
notification. This is a natural object to hold a namespace on. Place
it there and utilize the context in the appropriate places.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:29 -08:00
Eric W. Biederman
6b175b26c1 [NETNS]: Add netns parameter to inet_(dev_)add_type.
The patch extends the inet_addr_type and inet_dev_addr_type with the
network namespace pointer. That allows to access the different tables
relatively to the network namespace.

The modification of the signature function is reported in all the
callers of the inet_addr_type using the pointer to the well known
init_net.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:27 -08:00
Denis V. Lunev
8ad4942cd5 [NETNS]: Add netns parameter to fib_get_table/fib_new_table.
This patch extends the fib_get_table and the fib_new_table functions
with the network namespace pointer. That will allow to access the
table relatively from the network namespace.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:27 -08:00
Denis V. Lunev
93456b6d77 [IPV4]: Unify access to the routing tables.
Replace the direct pointers to local and main tables with
calls to fib_get_table() with appropriate argument.

This doesn't introduce additional dereferences, but makes the access to fib
tables uniform in any (CONFIG_IP_MULTIPLE_TABLES) case.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:26 -08:00
Denis V. Lunev
7b1a74fdbb [NETNS]: Refactor fib initialization so it can handle multiple namespaces.
This patch makes the fib to be initialized as a subsystem for the
network namespaces. The code does not handle several namespaces yet,
so in case of a creation of a network namespace, the
creation/initialization will not occur.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:25 -08:00
Denis V. Lunev
dbb50165b5 [IPV4]: Check fib4_rules_init failure.
This adds error paths into both versions of fib4_rules_init
(with/without CONFIG_IP_MULTIPLE_TABLES) and returns error code to the
caller.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:25 -08:00
Denis V. Lunev
61a0265344 [NETNS]: Add namespace to API for routing /proc entries creation.
This adds netns parameter to fib_proc_init/exit and replaces __init
specifier with __net_init. After this, we will not yet have these proc
files show info from the specific namespace - this will be done when
these tables become namespaced.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:24 -08:00
Denis V. Lunev
5fd30ee7c4 [NETNS]: Namespacing in the generic fib rules code.
Move static rules_ops & rules_mod_lock to the struct net, register the
pernet subsys to init them and enjoy the fact that the core rules
infrastructure works in the namespace.

Real IPv4 fib rules virtualization requires fib tables support in the
namespace and will be done seriously later in the patchset.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:23 -08:00
Denis V. Lunev
868d13ac81 [NETNS]: Pass fib_rules_ops into default_pref method.
fib_rules_ops contains operations and the list of configured rules. ops will
become per/namespace soon, so we need them to be known in the default_pref
callback.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:22 -08:00
Denis V. Lunev
f8c26b8d58 [NETNS]: Add netns parameter to fib_rules_(un)register.
The patch extends the different fib rules API in order to pass the
network namespace pointer. That will allow to access the different
tables from a namespace relative object. As usual, the pointer to the
init_net variable is passed as parameter so we don't break the
network.

Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:21 -08:00
Daniel Lezcano
41a76906b3 [NETNS][IPV6]: Make icmpv6_time sysctl per namespace.
This patch moves the icmpv6_time sysctl to the network namespace
structure.

Because the ipv6 protocol is not yet per namespace, the variable is
accessed relatively to the initial network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:20 -08:00
Daniel Lezcano
4990509f19 [NETNS][IPV6]: Make sysctls route per namespace.
All the sysctl concerning the routes are moved to the network
namespace structure. A helper function is called to initialize the
variables.

Because the ipv6 protocol is not yet per namespace, the variables are
accessed relatively from the network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:20 -08:00
Daniel Lezcano
7c76509d0d [NETNS][IPV6]: Make mld_max_msf readonly in other namespaces.
The mld_max_msf protects the system with a maximum allowed multicast
source filters. Making this variable per namespace can be potentially
an problem if someone inside a namespace set it to a big value, that
will impact the whole system including other namespaces.

I don't see any benefits to have it per namespace for now, so in order
to keep a directory entry in a newly created namespace, I make it
read-only when we are not in the initial network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:19 -08:00
Daniel Lezcano
e71e0349eb [NETNS][IPV6]: Make ip6_frags per namespace.
The ip6_frags is moved to the network namespace structure.  Because
there can be multiple instances of the network namespaces, and the
ip6_frags is no longer a global static variable, a helper function has
been added to facilitate the initialization of the variables.

Until the ipv6 protocol is not per namespace, the variables are
accessed relatively from the initial network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:18 -08:00
Daniel Lezcano
99bc9c4e45 [NETNS][IPV6]: Make bindv6only sysctl per namespace.
This patch moves the bindv6only sysctl to the network namespace
structure. Until the ipv6 protocol is not per namespace, the sysctl
variable is always from the initial network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:18 -08:00
Daniel Lezcano
760f2d0186 [NETNS][IPV6]: Make multiple instance of sysctl tables.
Each network namespace wants its own set of sysctl value, eg. we
should not be able from a namespace to set a sysctl value for another
namespace , especially for the initial network namespace.

This patch duplicates the sysctl table when we register a new network
namespace for ipv6. The duplicated table are postfixed with the
"template" word to notify the developper the table is cloned.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:17 -08:00
Daniel Lezcano
89918fc270 [NETNS][IPV6]: Make the ipv6 sysctl to be a netns subsystem.
The initialization of the sysctl for the ipv6 protocol is changed to a
network namespace subsystem. That means when a new network namespace
is created the initialization function for the sysctl will be called.

That do not change the behavior of the sysctl in case of the kernel
with the network namespace disabled.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:16 -08:00
Daniel Lezcano
81c1c17804 [NETNS][IPV6]: Make a subsystem for af_inet6.
This patch add a network namespace subsystem for the af_inet6 module.
It does nothing right now, but one of its purpose is to receive the
different variables for sysctl in order to initialize them.

When the sysctl variable will be moved to the network namespace
structure, they will be no longer initialized as global static
variables, so we must find a place to initialize them. Because the
sysctl can be disabled, it has no sense to store them in the
sysctl_net_ipv6 file.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:15 -08:00
Daniel Lezcano
291480c09a [NETNS][IPV6]: Make ipv6_sysctl_register to return a value.
This patch makes the function ipv6_sysctl_register to return a
value. The af_inet6 init function is now able to handle an error and
catch it from the initialization of the sysctl.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:14 -08:00
Sebastian Siewior
50dd79653e [XFRM]: Remove ifdef crypto.
and select the crypto subsystem if neccessary

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:12 -08:00
Rami Rosen
06eaa1a01d [BRIDGE]: Remove unused macros from ebt_vlan.c
Remove two unused macros, INV_FLAG and SET_BITMASK
from net/bridge/netfilter/ebt_vlan.c.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:11 -08:00
Pavel Emelyanov
b3fd3ffe39 [NETFILTER]: Use the ctl paths instead of hand-made analogue
The conntracks subsystem has a similar infrastructure
to maintain ctl_paths, but since we already have it
on the generic level, I think it's OK to switch to
using it.

So, basically, this patch just replaces the ctl_table-s
with ctl_path-s, nf_register_sysctl_table with
register_sysctl_paths() and removes no longer needed code.

After this the net/netfilter/nf_sysctl.c file contains
the paths only.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:11 -08:00
Pavel Emelyanov
3d7cc2ba62 [NETFILTER]: Switch to using ctl_paths in nf_queue and conntrack modules
This includes the most simple cases for netfilter.

The first part is tne queue modules for ipv4 and ipv6,
on which the net/ipv4/ and net/ipv6/ paths are reused
from the appropriate ipv4 and ipv6 code.

The conntrack module is also patched, but this hunk is
very small and simple.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:10 -08:00
Pavel Emelyanov
c6995bdff0 [AX25]: Switch to using ctl_paths.
This one is almost the same as the hunks in the
first patch, but ax25 tables are created dynamically.

So this patch differs a bit to handle this case.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:09 -08:00
Pavel Emelyanov
3151a9ab04 [DECNET]: Switch to using ctl_paths.
The decnet includes two places to patch. The first one is
the net/decnet table itself, and it is patched just like
other subsystems in the first patch in this series.

The second place is a bit more complex - it is the
net/decnet/conf/xxx entries,. similar to those in
ipv4/devinet.c and ipv6/addrconf.c. This code is made similar
to those in ipv[46].

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:09 -08:00
Pavel Emelyanov
90754f8ec0 [IPVS]: Switch to using ctl_paths.
The feature of ipvs ctls is that the net/ipv4/vs path
is common for core ipvs ctls and for two schedulers,
so I make it exported and re-use it in modules.

Two other .c files required linux/sysctl.h to make the
extern declaration of this path compile well.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:08 -08:00
Pavel Emelyanov
b5ccd792fa [NET]: Simple ctl_table to ctl_path conversions.
This patch includes many places, that only required
replacing the ctl_table-s with appropriate ctl_paths
and call register_sysctl_paths().

Nothing special was done with them.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:07 -08:00
Rami Rosen
cb7928a528 [IPV4]: Remove unsupported DNAT (RTCF_NAT and RTCF_NAT) in IPV4
- The DNAT (Destination NAT) is not implemented in IPV4.

- This patch remove the code which checks these flags
in net/ipv4/arp.c and net/ipv4/route.c.

The RTCF_NAT and RTCF_NAT should stay in the header (linux/in_route.h)
because they are used in DECnet.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:07 -08:00
Julia Lawall
4cec72c890 [TIPC]: Use tipc_port_unlock
The file net/tipc/port.c takes a lock using the function tipc_port_lock and
then releases the lock sometimes using tipc_port_unlock and sometimes using
spin_unlock_bh(p_ptr->publ.lock).  tipc_port_unlock simply does the
spin_unlock_bh, but it seems cleaner to use it everywhere.

The problem was fixed using the following semantic patch.
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
struct port *p_ptr;
@@

   p_ptr = tipc_port_lock(...)
   ...
(
   p_ptr = tipc_port_lock(...);
|
?- spin_unlock_bh(p_ptr->publ.lock);
+  tipc_port_unlock(p_ptr);
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Jon Paul Maloy <maloy@donjonn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:05 -08:00
Ivo van Doorn
cdcb006fbe mac80211: Add radio led trigger
Some devices have a seperate LED which indicates if the radio is
enabled or not. This adds a LED trigger to mac80211 where drivers
can hook into when they are interested in radio status changes.

v2: Check hw.conf.radio_enabled when calling start().

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:04 -08:00
Andrew Lutomirski
df26e7ea04 rc80211_pid should respect fixed rates.
I would argue that mac80211 should handle fixed rates outside the rate
control code, which would also allow them to take effect immediately
instead of during the rate control callback, but this is pretty close
to correct.

Signed-Off-By: Andy Lutomirski <luto@myrealbox.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:04 -08:00
Johannes Berg
4b475898ec mac80211: better rate control algorithm selection
This patch changes mac80211's Kconfig/Makefile to:
 * select between the PID and the SIMPLE rate control
   algorithm as default
 * always allow tri-state for the rate control algorithms,
   building those that are selected 'y' into the mac80211
   module (if that is a module, otherwise all into the kernel)
 * force the default rate control algorithm to be built into
   mac80211

It also makes both rate control algorithms proper modules again
with MODULE_LICENSE etc.

Only if EMBEDDED is the user allowed to select "NONE" as default
which will cause no algorithm to be selected, this will work
only when the driver brings one itself (e.g. iwlwifi drivers).

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:02 -08:00
Ron Rindjunsky
688b88a488 mac80211: A-MPDU Rx handling DELBA requests
This patch opens the flow to DELBA management frames, and handles end
of A-MPDU session produced by this event.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:02 -08:00
Ron Rindjunsky
713647169e mac80211: A-MPDU Rx adding BAR handling capability
This patch adds the ability to handle Block Ack Request

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:01 -08:00
Ron Rindjunsky
b580781e03 mac80211: A-MPDU Rx handling aggregation reordering
This patch handles the reordering of the Rx A-MPDU.
This issue occurs when the sequence of the internal MPDUs is not in the
right order. such a case can be encountered for example when some MPDUs from
previous aggregations were recieved, while others failed, so current A-MPDU
will contain a mix of re-transmited MPDUs and new ones.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:00 -08:00
Ron Rindjunsky
16c5f15c73 mac80211: A-MPDU Rx MLME data initialization
This patch initialize A-MPDU MLME data for Rx sessions.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:01:00 -08:00
Ron Rindjunsky
07db218396 mac80211: A-MPDU Rx adding basic functionality
This patch adds the basic needed abilities and functions for A-MPDU Rx session
changed functions:
 - ieee80211_sta_process_addba_request - Rx A-MPDU initialization enabled
 - ieee80211_stop - stops all A-MPDU Rx in case interface goes down
added functions:
 - ieee80211_send_delba - used for sending out Del BA in A-MPDU sessions
 - ieee80211_sta_stop_rx_BA_session - stopping Rx A-MPDU session
 - sta_rx_agg_session_timer_expired - stops A-MPDU Rx use if load is too
low

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:59 -08:00
Ron Rindjunsky
5aae288061 mac80211: A-MPDU Rx add MLME structures
This patch adds the needed structures to describe the Rx aggregation MLME per STA
new:
 - struct tid_ampdu_rx: TID aggregation information (Rx)
 - struct sta_ampdu_mlme: MLME aggregation information for STA
changed:
 - struct sta_info: ampdu_mlme added to describe A-MPDU MLME per STA,
		    and timer_to_tid added to map timer id into TID

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:58 -08:00
Ron Rindjunsky
6368e4b18d mac80211: restructure __ieee80211_rx
This patch makes a separation between Rx frame pre-handling which stays in
__ieee80211_rx and Rx frame handlers, moving to __ieee80211_rx_handle_packet.
Although this separation has no affect in regular mode of operation, this kind
of mechanism will be used in A-MPDU frames reordering as it allows accumulation
of frames during pre-handling, dispatching them to later handling when necessary.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:57 -08:00
Johannes Berg
f704662fb7 mac80211: make rc_pid_fop_events static
No need to not be.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:56 -08:00
Stefano Brivio
1b507e7e53 rc80211-pid: fix definition of rate control interval
Fix the rate control interval definition. Thanks to Mattias Nissler for
spotting this out.

Cc: Mattias Nissler <mattias.nissler@gmx.de>
Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:55 -08:00
Johannes Berg
6d65f5db2f mac80211: remove misleading 'res' variable
When this function returns != CONTINUE, it needs to put the
station struct it has acquired. Hence, having this unused
variable is not just superfluous but also misleading.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:55 -08:00
Stefano Brivio
d439810bda rc80211-pid: pf_target tuning
Set a better value for percentage target for failed frames. The previous value
slowed down too much rate increases in case of permanently low activity. While
at it, increase readability.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:53 -08:00
Stefano Brivio
13e05aa631 rc80211-pid: fix sta_info refcounting
Fix a bug which caused uncorrect refcounting of PHYs in mac80211. Thanks to
Johannes Berg for spotting this out.

Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:53 -08:00
Stefano Brivio
fa44327c06 rc80211-pid: simplify and fix shift_adjust
Simplify and fix rate_control_pid_shift_adjust(). A bug prevented correct
mapping of sorted rates, and readability was seriously flawed.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:52 -08:00
Stefano Brivio
ca5fbca924 rc80211-pid: add kerneldoc for tunable parameters
Add a kerneldoc description for parameters which are tunable through debugfs.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:51 -08:00
Stefano Brivio
426706c079 rc80211-pid: export human-readable target_pf value to debugfs
Export the non-shifted target_pf value to debugfs, so that it's human-readable.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:50 -08:00
Helmut Schaa
69f817b654 mac80211: Restore rx.fc before every invocation of ieee80211_invoke_rx_handlers
This patch fixes a problem with rx handling on multiple interfaces. Especially
when using hardware-scanning and a wireless driver (i.e. iwlwifi) which is
able to receive data while scanning.

The rx handlers can modify the skb and the frame control field (see
ieee80211_rx_h_remove_qos_control) but since every interface gets its own
copy of the skb each should get its own copy of rx.fc too.

In my case the wlan0-interface did not remove the qos-control from the frame
because the corresponding flag in rx.fc was already removed while processing
the frame on the master interface. Therefore somehow corrupted frames were
passed to the userspace.

Signed-off-by: Helmut Schaa <hschaa@suse.de>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:50 -08:00
Eric Dumazet
6666351df9 [XFRM]: xfrm_state_clone() should be static, not exported
xfrm_state_clone() is not used outside of net/xfrm/xfrm_state.c
There is no need to export it.

Spoted by sparse checker.
   CHECK   net/xfrm/xfrm_state.c
net/xfrm/xfrm_state.c:1103:19: warning: symbol 'xfrm_state_clone' was not
declared. Should it be static?

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:49 -08:00
Eric Dumazet
40ccbf525e [PACKET]: Fix sparse warnings in af_packet.c
CHECK   net/packet/af_packet.c
net/packet/af_packet.c:1876:14: warning: context imbalance in 'packet_seq_start' - wrong count at exit
net/packet/af_packet.c:1888:13: warning: context imbalance in 'packet_seq_stop' - unexpected unlock

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:48 -08:00
Julia Lawall
67b23219ce [BLUETOOTH]: Use sockfd_put()
The function sockfd_lookup uses fget on the value that is stored in
the file field of the returned structure, so fput should ultimately be
applied to this value.  This can be done directly, but it seems better
to use the specific macro sockfd_put, which does the same thing.

The problem was fixed using the following semantic patch.
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
expression s;
@@

   s = sockfd_lookup(...)
   ...
+  sockfd_put(s);
?- fput(s->file);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:48 -08:00
WANG Cong
64c31b3f76 [XFRM] xfrm_policy_destroy: Rename and relative fixes.
Since __xfrm_policy_destroy is used to destory the resources
allocated by xfrm_policy_alloc. So using the name
__xfrm_policy_destroy is not correspond with xfrm_policy_alloc.
Rename it to xfrm_policy_destroy.

And along with some instances that call xfrm_policy_alloc
but not using xfrm_policy_destroy to destroy the resource,
fix them.

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:46 -08:00
Masahide NAKAMURA
d66e37a99d [XFRM] Statistics: Add outbound-dropping error.
o Increment PolError counter when flow_cache_lookup() returns
  errored pointer.

o Increment NoStates counter at larval-drop.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:45 -08:00
Ilpo Järvinen
a067d9ac39 [NET]: Remove obsolete comment
It seems that ip_build_xmit is no longer used in here and
ip_append_data is used.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:45 -08:00
Ilpo Järvinen
c4e18dade1 [CCID3]: Kill some bloat
Without a number of CONFIG.*DEBUG:

net/dccp/ccids/ccid3.c:
  ccid3_hc_tx_update_x          | -170
  ccid3_hc_tx_packet_sent       | -175
  ccid3_hc_tx_packet_recv       | -169
  ccid3_hc_tx_no_feedback_timer | -192
  ccid3_hc_tx_send_packet       | -144
 5 functions changed, 850 bytes removed, diff: -850

net/dccp/ccids/ccid3.c:
  ccid3_update_send_interval | +191
 1 function changed, 191 bytes added, diff: +191

net/dccp/ccids/ccid3.o:
 6 functions changed, 191 bytes added, 850 bytes removed, diff: -659

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:44 -08:00
Ilpo Järvinen
cf35f43e6e [XFRM]: Kill some bloat
net/xfrm/xfrm_state.c:
  xfrm_audit_state_delete          | -589
  xfrm_replay_check                | -542
  xfrm_audit_state_icvfail         | -520
  xfrm_audit_state_add             | -589
  xfrm_audit_state_replay_overflow | -523
  xfrm_audit_state_notfound_simple | -509
  xfrm_audit_state_notfound        | -521
 7 functions changed, 3793 bytes removed, diff: -3793

net/xfrm/xfrm_state.c:
  xfrm_audit_helper_pktinfo | +522
  xfrm_audit_helper_sainfo  | +598
 2 functions changed, 1120 bytes added, diff: +1120

net/xfrm/xfrm_state.o:
 9 functions changed, 1120 bytes added, 3793 bytes removed, diff: -2673

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:43 -08:00
Ilpo Järvinen
ad1b30b1c2 [IPVS]: Kill some bloat
net/ipv4/ipvs/ip_vs_xmit.c:
  ip_vs_icmp_xmit   | -638
  ip_vs_tunnel_xmit | -674
  ip_vs_nat_xmit    | -716
  ip_vs_dr_xmit     | -682
 4 functions changed, 2710 bytes removed, diff: -2710

net/ipv4/ipvs/ip_vs_xmit.c:
  __ip_vs_get_out_rt | +595
 1 function changed, 595 bytes added, diff: +595

net/ipv4/ipvs/ip_vs_xmit.o:
 5 functions changed, 595 bytes added, 2710 bytes removed, diff: -2115

Without some CONFIG.*DEBUGs:

net/ipv4/ipvs/ip_vs_xmit.o:
 5 functions changed, 383 bytes added, 1513 bytes removed, diff: -1130

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:43 -08:00
Ilpo Järvinen
bb5cf80e94 [NETFILTER]: Kill some supper dupper bloatry
/me awards the bloatiest-of-all-net/-.c-code award to
nf_conntrack_netlink.c, congratulations to all the authors :-/!

Hall of (unquestionable) fame (measured per inline, top 10 under
net/):
  -4496 ctnetlink_parse_tuple        netfilter/nf_conntrack_netlink.c
  -2165 ctnetlink_dump_tuples        netfilter/nf_conntrack_netlink.c
  -2115 __ip_vs_get_out_rt           ipv4/ipvs/ip_vs_xmit.c
  -1924 xfrm_audit_helper_pktinfo    xfrm/xfrm_state.c
  -1799 ctnetlink_parse_tuple_proto  netfilter/nf_conntrack_netlink.c
  -1268 ctnetlink_parse_tuple_ip     netfilter/nf_conntrack_netlink.c
  -1093 ctnetlink_exp_dump_expect    netfilter/nf_conntrack_netlink.c
  -1060 void ccid3_update_send_interval  dccp/ccids/ccid3.c
  -983  ctnetlink_dump_tuples_proto  netfilter/nf_conntrack_netlink.c
  -827  ctnetlink_exp_dump_tuple     netfilter/nf_conntrack_netlink.c

  (i386 / gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-13) /
   allyesconfig except CONFIG_FORCED_INLINING)

...and I left < 200 byte gains as future work item.

After iterative inline removal, I finally have this:

net/netfilter/nf_conntrack_netlink.c:
  ctnetlink_exp_fill_info   | -1104
  ctnetlink_new_expect      | -1572
  ctnetlink_fill_info       | -1303
  ctnetlink_new_conntrack   | -2230
  ctnetlink_get_expect      | -341
  ctnetlink_del_expect      | -352
  ctnetlink_expect_event    | -1110
  ctnetlink_conntrack_event | -1548
  ctnetlink_del_conntrack   | -729
  ctnetlink_get_conntrack   | -728
 10 functions changed, 11017 bytes removed, diff: -11017

net/netfilter/nf_conntrack_netlink.c:
  ctnetlink_parse_tuple     | +419
  dump_nat_seq_adj          | +183
  ctnetlink_dump_counters   | +166
  ctnetlink_dump_tuples     | +261
  ctnetlink_exp_dump_expect | +633
  ctnetlink_change_status   | +460
 6 functions changed, 2122 bytes added, diff: +2122

net/netfilter/nf_conntrack_netlink.o:
 16 functions changed, 2122 bytes added, 11017 bytes removed, diff: -8895

Without a number of CONFIG.*DEBUGs, I got this:
net/netfilter/nf_conntrack_netlink.o:
 16 functions changed, 2122 bytes added, 11029 bytes removed, diff: -8907

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:41 -08:00
Eric Dumazet
2a75de0c1d [NETNS]: Should build with CONFIG_SYSCTL=n
Previous NETNS patches broke CONFIG_SYSCTL=n case

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:40 -08:00
Eric Dumazet
74feb6e84e [ICMP]: Avoid sparse warnings in net/ipv4/icmp.c
CHECK   net/ipv4/icmp.c
net/ipv4/icmp.c:249:13: warning: context imbalance in 'icmp_xmit_unlock' -
unexpected unlock
net/ipv4/icmp.c:376:13: warning: context imbalance in 'icmp_reply' - different
lock contexts for basic block
net/ipv4/icmp.c:430:6: warning: context imbalance in 'icmp_send' - different
lock contexts for basic block

Solution is to declare both icmp_xmit_lock() and icmp_xmit_unlock() as inline

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:37 -08:00
Eric Dumazet
65f7651788 [NET]: prot_inuse cleanups and optimizations
1) Cleanups (all functions are prefixed by sock_prot_inuse)

sock_prot_inc_use(prot) -> sock_prot_inuse_add(prot,-1)
sock_prot_dec_use(prot) -> sock_prot_inuse_add(prot,-1)
sock_prot_inuse()       -> sock_prot_inuse_get()

New functions :

sock_prot_inuse_init() and sock_prot_inuse_free() to abstract pcounter use.

2) if CONFIG_PROC_FS=n, we can zap 'inuse' member from "struct proto",
since nobody wants to read the inuse value.

This saves 1372 bytes on i386/SMP and some cpu cycles.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:36 -08:00
Eric Dumazet
789675e216 [NET]: Avoid divides in net/core/gen_estimator.c
We can void divides (as seen with CONFIG_CC_OPTIMIZE_FOR_SIZE=y on x86)
changing ((HZ<<idx)/4) to ((HZ/4) << idx)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:35 -08:00
Ilpo Järvinen
e870a8efcd [TCP]: Perform setting of common control fields in one place
In case of segments which are purely for control without any
data (SYN/ACK/FIN/RST), many fields are set to common values
in multiple places.

i386 results:

$ gcc --version
gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-13)

$ codiff tcp_output.o.old tcp_output.o.new
net/ipv4/tcp_output.c:
  tcp_xmit_probe_skb    |  -48
  tcp_send_ack          |  -56
  tcp_retransmit_skb    |  -79
  tcp_connect           |  -43
  tcp_send_active_reset |  -35
  tcp_make_synack       |  -42
  tcp_send_fin          |  -48
 7 functions changed, 351 bytes removed

net/ipv4/tcp_output.c:
  tcp_init_nondata_skb |  +90
 1 function changed, 90 bytes added

tcp_output.o.mid:
 8 functions changed, 90 bytes added, 351 bytes removed, diff: -261

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:34 -08:00
Ilpo Järvinen
19773b4923 [TCP]: Urgent parameter effect can be simplified.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:33 -08:00
Ilpo Järvinen
f038ac8f9b [TCP]: cleanup tcp_parse_options deep indented switch
Removed case indentation level & combined some nested ifs, mostly
within 80 lines now. This is a leftover from indent patch, it
just had to be done manually to avoid messing it up completely.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:33 -08:00
Herbert Xu
dbb1db8b59 [IPSEC]: Return EOVERFLOW when output sequence number overflows
Previously we made it an error on the output path if the sequence number
overflowed.  However we did not set the err variable accordingly.  This
patch sets err to -EOVERFLOW in that case.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:32 -08:00
Eric Dumazet
9a429c4983 [NET]: Add some acquires/releases sparse annotations.
Add __acquires() and __releases() annotations to suppress some sparse
warnings.

example of warnings :

net/ipv4/udp.c:1555:14: warning: context imbalance in 'udp_seq_start' - wrong
count at exit
net/ipv4/udp.c:1571:13: warning: context imbalance in 'udp_seq_stop' -
unexpected unlock

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:31 -08:00
Eric Dumazet
680a5a5086 [PATCH] use SK_MEM_QUANTUM_SHIFT in __sk_mem_reclaim()
Avoid an expensive divide (as done in commit
18030477e70a826b91608aee40a987bbd368fec6 but lost in commit
23821d2653111d20e75472c8c5003df1a55309a8)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:27 -08:00
Ilpo Järvinen
d436d68630 [TCP]: Remove unnecessary local variable
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:26 -08:00
Ilpo Järvinen
409d22b470 [TCP]: Code duplication removal, added tcp_bound_to_half_wnd()
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:26 -08:00
Ilpo Järvinen
056834d9f6 [TCP]: cleanup tcp_{in,out}put.c style
These were manually selected from indent's results which as is
are too noisy to be of any use without human reason. In addition,
some extra newlines between function and its comment were removed
too.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:25 -08:00
Ilpo Järvinen
058dc3342b [TCP]: reduce tcp_output's indentation levels a bit
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:24 -08:00
Ilpo Järvinen
4828e7f49a [TCP]: Remove TCPCB_URG & TCPCB_AT_TAIL as unnecessary
The snd_up check should be enough. I suspect this has been
there to provide a minor optimization in clean_rtx_queue which
used to have a small if (!->sacked) block which could skip
snd_up check among the other work.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:23 -08:00
Ilpo Järvinen
cadbd0313b [TCP]: Dropped unnecessary skb/sacked accessing in reneging
SACK reneging can be precalculated to a FLAG in clean_rtx_queue
which has the right skb looked up. This will help a bit in
future because skb->sacked access will be changed eventually,
changing it already won't hurt any.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:23 -08:00
Ilpo Järvinen
90840defab [TCP]: Introduce tcp_wnd_end() to reduce line lengths
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:22 -08:00
Ilpo Järvinen
66f5fe624f [TCP]: Rename update_send_head & include related increment to it
There's very little need to have the packets_out incrementing in
a separate function. Also name the combined function
appropriately.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:21 -08:00
Ilpo Järvinen
3ccd3130b3 [TCP]: Make invariant check complain about invalid sacked_out
Earlier resolution for NewReno's sacked_out should now keep
it small enough for this to become invariant-like check.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:20 -08:00
Hideo Aoki
95766fff6b [UDP]: Add memory accounting.
Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Hideo Aoki <haoki@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:19 -08:00
Hideo Aoki
3ab224be6d [NET] CORE: Introducing new memory accounting interface.
This patch introduces new memory accounting functions for each network
protocol. Most of them are renamed from memory accounting functions
for stream protocols. At the same time, some stream memory accounting
functions are removed since other functions do same thing.

Renaming:
	sk_stream_free_skb()		->	sk_wmem_free_skb()
	__sk_stream_mem_reclaim()	->	__sk_mem_reclaim()
	sk_stream_mem_reclaim()		->	sk_mem_reclaim()
	sk_stream_mem_schedule 		->    	__sk_mem_schedule()
	sk_stream_pages()      		->	sk_mem_pages()
	sk_stream_rmem_schedule()	->	sk_rmem_schedule()
	sk_stream_wmem_schedule()	->	sk_wmem_schedule()
	sk_charge_skb()			->	sk_mem_charge()

Removeing
	sk_stream_rfree():	consolidates into sock_rfree()
	sk_stream_set_owner_r(): consolidates into skb_set_owner_r()
	sk_stream_mem_schedule()

The following functions are added.
    	sk_has_account(): check if the protocol supports accounting
	sk_mem_uncharge(): do the opposite of sk_mem_charge()

In addition, to achieve consolidation, updating sk_wmem_queued is
removed from sk_mem_charge().

Next, to consolidate memory accounting functions, this patch adds
memory accounting calls to network core functions. Moreover, present
memory accounting call is renamed to new accounting call.

Finally we replace present memory accounting calls with new interface
in TCP and SCTP.

Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Hideo Aoki <haoki@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:18 -08:00
Gui Jianfeng
a06b494b61 [IPV6]: Remove useless code from fib6_del_route().
There are useless codes in fib6_del_route(). The following patch has
been tested, every thing looks fine, as usual.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:17 -08:00
Chas Williams
fb64c735a5 [ATM]: [br2864] whitespace cleanup
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:00:14 -08:00