Commit Graph

20 Commits

Author SHA1 Message Date
Eric W. Biederman
a4ffe319ae act_connmark: Remember the struct net instead of guessing it.
Stop guessing the struct net instead of remember it.  Guessing is just
silly and will be problematic in the future when I implement routes
between network namespaces.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18 21:59:31 +02:00
Alexei Starovoitov
cff82457c5 net_sched: act_bpf: remove spinlock in fast path
Similar to act_gact/act_mirred, act_bpf can be lockless in packet processing
with extra care taken to free bpf programs after rcu grace period.
Replacement of existing act_bpf (very rare) is done with synchronize_rcu()
and final destruction is done from tc_action_ops->cleanup() callback that is
called from tcf_exts_destroy()->tcf_action_destroy()->__tcf_hash_release() when
bind and refcnt reach zero which is only possible when classifier is destroyed.
Previous two patches fixed the last two classifiers (tcindex and rsvp) to
call tcf_exts_destroy() from rcu callback.

Similar to gact/mirred there is a race between prog->filter and
prog->tcf_action. Meaning that the program being replaced may use
previous default action if it happened to return TC_ACT_UNSPEC.
act_mirred race betwen tcf_action and tcfm_dev is similar.
In all cases the race is harmless.
Long term we may want to improve the situation by replacing the whole
tc_action->priv as single pointer instead of updating inner fields one by one.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-26 11:01:45 -07:00
Eric Dumazet
2ee22a90c7 net_sched: act_mirred: remove spinlock in fast path
Like act_gact, act_mirred can be lockless in packet processing

1) Use percpu stats
2) update lastuse only every clock tick to avoid false sharing
3) use rcu to protect tcfm_dev
4) Remove spinlock usage, as it is no longer needed.

Next step : add multi queue capability to ifb device

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08 13:50:42 -07:00
Eric Dumazet
cc6510a950 net_sched: act_gact: use a separate packet counters for gact_determ()
Second step for gact RCU operation :

We want to get rid of the spinlock protecting gact operations.
Stats (packets/bytes) will soon be per cpu.

gact_determ() would not work without a central packet counter,
so lets add it for this mode.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08 13:50:42 -07:00
Daniel Borkmann
a8cb5f556b act_bpf: add initial eBPF support for actions
This work extends the "classic" BPF programmable tc action by extending
its scope also to native eBPF code!

Together with commit e2e9b6541d ("cls_bpf: add initial eBPF support
for programmable classifiers") this adds the facility to implement fully
flexible classifier and actions for tc that can be implemented in a C
subset in user space, "safely" loaded into the kernel, and being run in
native speed when JITed.

Also, since eBPF maps can be shared between eBPF programs, it offers the
possibility that cls_bpf and act_bpf can share data 1) between themselves
and 2) between user space applications. That means that, f.e. customized
runtime statistics can be collected in user space, but also more importantly
classifier and action behaviour could be altered based on map input from
the user space application.

For the remaining details on the workflow and integration, see the cls_bpf
commit e2e9b6541d. Preliminary iproute2 part can be found under [1].

  [1] http://git.breakpoint.cc/cgit/dborkman/iproute2.git/log/?h=ebpf-act

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-20 19:10:44 -04:00
Felix Fietkau
22a5dc0e5e net: sched: Introduce connmark action
This tc action allows you to retrieve the connection tracking mark
This action has been used heavily by openwrt for a few years now.

There are known limitations currently:

doesn't work for initial packets, since we only query the ct table.
  Fine given use case is for returning packets

no implicit defrag.
  frags should be rare so fix later..

won't work for more complex tasks, e.g. lookup of other extensions
  since we have no means to store results

we still have a 2nd lookup later on via normal conntrack path.
This shouldn't break anything though since skb->nfct isn't altered.

V2:
remove unnecessary braces (Jiri)
change the action identifier to 14 (Jiri)
Fix some stylistic issues caught by checkpatch
V3:
Move module params to bottom (Cong)
Get rid of tcf_hashinfo_init and friends and conform to newer API (Cong)

Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-19 16:02:06 -05:00
Jiri Pirko
d23b8ad8ab tc: add BPF based action
This action provides a possibility to exec custom BPF code.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-17 23:51:10 -05:00
Jiri Pirko
08dcf9fd19 tc_vlan: fix type of tcfv_push_vid
Should be u16. So fix it to kill the sparse warning.

Fixes: c7e2b9689e "sched: introduce vlan action"
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24 16:12:03 -05:00
Jiri Pirko
c7e2b9689e sched: introduce vlan action
This tc action allows to work with vlan tagged skbs. Two supported
sub-actions are header pop and header push.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-21 14:20:18 -05:00
WANG Cong
86062033fe net_sched: act: hide struct tcf_common from API
Now we can totally hide it from modules. tcf_hash_*() API's
will operate on struct tc_action, modules don't need to care about
the details.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-12 19:23:32 -05:00
Jeff Kirsher
a6227e26d9 include/net/: Fix FSF address in file headers
Several files refer to an old address for the Free Software Foundation
in the file header comment.  Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06 12:37:56 -05:00
Grégoire Baron
eb4d406545 net/sched: add ACT_CSUM action to update packets checksums
net/sched: add ACT_CSUM action to update packets checksums

ACT_CSUM can be called just after ACT_PEDIT in order to re-compute some
altered checksums in IPv4 and IPv6 packets. The following checksums are
supported by this patch:
 - IPv4: IPv4 header, ICMP, IGMP, TCP, UDP & UDPLite
 - IPv6: ICMPv6, TCP, UDP & UDPLite
It's possible to request in the same action to update different kind of
checksums, if the packets flow mix TCP, UDP and UDPLite, ...

An example of usage is done in the associated iproute2 patch.

Version 3 changes:
 - remove useless goto instructions
 - improve IPv6 hop options decoding

Version 2 changes:
 - coding style correction
 - remove useless arguments of some functions
 - use stack in tcf_csum_dump()
 - add tcf_csum_skb_nextlayer() to factor code

Signed-off-by: Gregoire Baron <baronchon@n7mm.org>
Acked-by: jamal <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-20 01:42:59 -07:00
stephen hemminger
3b87956ea6 net sched: fix race in mirred device removal
This fixes hang when target device of mirred packet classifier
action is removed.

If a mirror or redirection action is configured to cause packets
to go to another device, the classifier holds a ref count, but was assuming
the adminstrator cleaned up all redirections before removing. The fix
is to add a notifier and cleanup during unregister.

The new list is implicitly protected by RTNL mutex because
it is held during filter add/delete as well as notifier.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-24 21:04:20 -07:00
jamal
1c55d62e77 pkt_sched: skbedit add support for setting mark
This adds support for setting the skb mark.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-22 21:56:42 -07:00
Alexander Duyck
ca9b0e27e0 pkt_action: add new action skbedit
This new action will have the ability to change the priority and/or
queue_mapping fields on an sk_buff.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 16:30:20 -07:00
Herbert Xu
b421995235 [PKT_SCHED]: Add stateless NAT
Stateless NAT is useful in controlled environments where restrictions are
placed on through traffic such that we don't need connection tracking to
correctly NAT protocol-specific data.

In particular, this is of interest when the number of flows or the number
of addresses being NATed is large, or if connection tracking information
has to be replicated and where it is not practical to do so.

Previously we had stateless NAT functionality which was integrated into
the IPv4 routing subsystem.  This was a great solution as long as the NAT
worked on a subnet to subnet basis such that the number of NAT rules was
relatively small.  The reason is that for SNAT the routing based system
had to perform a linear scan through the rules.

If the number of rules is large then major renovations would have take
place in the routing subsystem to make this practical.

For the time being, the least intrusive way of achieving this is to use
the u32 classifier written by Alexey Kuznetsov along with the actions
infrastructure implemented by Jamal Hadi Salim.

The following patch is an attempt at this problem by creating a new nat
action that can be invoked from u32 hash tables which would allow large
number of stateless NAT rules that can be used/updated in constant time.

The actual NAT code is mostly based on the previous stateless NAT code
written by Alexey.  In future we might be able to utilise the protocol
NAT code from netfilter to improve support for other protocols.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:53:11 -07:00
David S. Miller
e9ce1cd3cf [PKT_SCHED]: Kill pkt_act.h inlining.
This was simply making templates of functions and mostly causing a lot
of code duplication in the classifier action modules.

We solve this more cleanly by having a common "struct tcf_common" that
hash worker functions contained once in act_api.c can work with.

Callers work with real action objects that have the common struct
plus their module specific struct members.  You go from a common
object to the higher level one using a "to_foo()" macro which makes
use of container_of() to do the dirty work.

This also kills off act_generic.h which was only used by act_simple.c
and keeping it around was more work than the it's value.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:55:10 -07:00
Dmitry Mishin
1e30a014e3 [NETFILTER]: futher {ip,ip6,arp}_tables unification
This patch moves {ip,ip6,arp}t_entry_{match,target} definitions to
x_tables.h. This move simplifies code and future compatibility fixes.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Acked-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 13:56:56 -08:00
Jamal Hadi Salim
db75307979 [PKT_SCHED]: Introduce simple actions.
And provide an example simply action in order to
demonstrate usage.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-24 20:10:16 -07:00
Linus Torvalds
1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00