2005-04-16 22:20:36 +00:00
|
|
|
/*
|
|
|
|
* Linux Socket Filter - Kernel level socket filtering
|
|
|
|
*
|
|
|
|
* Author:
|
|
|
|
* Jay Schulist <jschlst@samba.org>
|
|
|
|
*
|
|
|
|
* Based on the design of:
|
|
|
|
* - The Berkeley Packet Filter
|
|
|
|
*
|
|
|
|
* This program is free software; you can redistribute it and/or
|
|
|
|
* modify it under the terms of the GNU General Public License
|
|
|
|
* as published by the Free Software Foundation; either version
|
|
|
|
* 2 of the License, or (at your option) any later version.
|
|
|
|
*
|
|
|
|
* Andi Kleen - Fix a few bad bugs and races.
|
2006-01-04 21:58:36 +00:00
|
|
|
* Kris Katterjohn - Added many additional checks in sk_chk_filter()
|
2005-04-16 22:20:36 +00:00
|
|
|
*/
|
|
|
|
|
|
|
|
#include <linux/module.h>
|
|
|
|
#include <linux/types.h>
|
|
|
|
#include <linux/mm.h>
|
|
|
|
#include <linux/fcntl.h>
|
|
|
|
#include <linux/socket.h>
|
|
|
|
#include <linux/in.h>
|
|
|
|
#include <linux/inet.h>
|
|
|
|
#include <linux/netdevice.h>
|
|
|
|
#include <linux/if_packet.h>
|
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
The percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surroundings. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 08:04:11 +00:00
|
|
|
#include <linux/gfp.h>
|
2005-04-16 22:20:36 +00:00
|
|
|
#include <net/ip.h>
|
|
|
|
#include <net/protocol.h>
|
[SKFILTER]: Add SKF_AD_NLATTR instruction
SKF_AD_NLATTR searches for a netlink attribute, which avoids manually
parsing and walking attributes. It takes the offset at which to start
searching in the 'A' register and the attribute type in the 'X' register
and returns the offset in the 'A' register. When the attribute is not
found it returns zero.
A top-level attribute can be located using a filter like this
(example for nfnetlink, using struct nfgenmsg):
...
{
/* A = offset of first attribute */
.code = BPF_LD | BPF_IMM,
.k = sizeof(struct nlmsghdr) + sizeof(struct nfgenmsg)
},
{
/* X = CTA_PROTOINFO */
.code = BPF_LDX | BPF_IMM,
.k = CTA_PROTOINFO,
},
{
/* A = netlink attribute offset */
.code = BPF_LD | BPF_B | BPF_ABS,
.k = SKF_AD_OFF + SKF_AD_NLATTR
},
{
/* Exit if not found */
.code = BPF_JMP | BPF_JEQ | BPF_K,
.k = 0,
.jt = <error>
},
...
A nested attribute below the CTA_PROTOINFO attribute would then
be parsed like this:
...
{
/* A += sizeof(struct nlattr) */
.code = BPF_ALU | BPF_ADD | BPF_K,
.k = sizeof(struct nlattr),
},
{
/* X = CTA_PROTOINFO_TCP */
.code = BPF_LDX | BPF_IMM,
.k = CTA_PROTOINFO_TCP,
},
{
/* A = netlink attribute offset */
.code = BPF_LD | BPF_B | BPF_ABS,
.k = SKF_AD_OFF + SKF_AD_NLATTR
},
...
The data of an attribute can be loaded into 'A' like this:
...
{
/* X = A (attribute offset) */
.code = BPF_MISC | BPF_TAX,
},
{
/* A = skb->data[X + k] */
.code = BPF_LD | BPF_B | BPF_IND,
.k = sizeof(struct nlattr),
},
...
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
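
The lookup described in the commit message above can be modelled in user space. The following is only a sketch under stated assumptions, not the kernel implementation: `struct attr_hdr`, `attr_align()` and `find_attr()` are hypothetical names, the header is assumed to be a 16-bit length followed by a 16-bit type, and attributes are assumed to be padded to 4-byte boundaries. It takes the start offset (the 'A' register) and the attribute type (the 'X' register) and returns the offset of the first match, or zero when nothing is found.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a netlink attribute header. */
struct attr_hdr {
	uint16_t len;	/* header plus payload, padding not included */
	uint16_t type;
};

/* Round an attribute length up to the assumed 4-byte alignment. */
static size_t attr_align(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/*
 * Walk the attributes in buf starting at offset start and return the
 * offset of the first one whose type matches, or 0 if none is found
 * or the data is malformed.
 */
static uint32_t find_attr(const uint8_t *buf, size_t buflen,
			  uint32_t start, uint16_t type)
{
	size_t off = start;

	while (off <= buflen && buflen - off >= sizeof(struct attr_hdr)) {
		struct attr_hdr h;

		/* memcpy avoids unaligned access to the header. */
		memcpy(&h, buf + off, sizeof(h));
		if (h.len < sizeof(h) || h.len > buflen - off)
			break;	/* truncated or corrupt attribute */
		if (h.type == type)
			return (uint32_t)off;
		off += attr_align(h.len);
	}
	return 0;
}
```

Because zero doubles as the "not found" value, a filter built on this convention has to start searching at a non-zero offset, which is the case whenever a message header precedes the attributes.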
2008-04-10 09:02:28 +00:00
|
|
|
#include <net/netlink.h>
|
2005-04-16 22:20:36 +00:00
|
|
|
#include <linux/skbuff.h>
|
|
|
|
#include <net/sock.h>
|
|
|
|
#include <linux/errno.h>
|
|
|
|
#include <linux/timer.h>
|
|
|
|
#include <asm/uaccess.h>
|
2006-04-18 21:50:10 +00:00
|
|
|
#include <asm/unaligned.h>
|
2005-04-16 22:20:36 +00:00
|
|
|
#include <linux/filter.h>
|
2010-11-18 22:04:46 +00:00
|
|
|
#include <linux/reciprocal_div.h>
|
2011-05-26 19:00:31 +00:00
|
|
|
#include <linux/ratelimit.h>
|
2012-04-12 21:47:52 +00:00
|
|
|
#include <linux/seccomp.h>
|
2012-10-27 02:26:17 +00:00
|
|
|
#include <linux/if_vlan.h>
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2012-03-30 05:08:19 +00:00
|
|
|
/* No hurry in this branch
|
|
|
|
*
|
|
|
|
* Exported for the bpf jit load helper.
|
|
|
|
*/
|
|
|
|
void *bpf_internal_load_pointer_neg_helper(const struct sk_buff *skb, int k, unsigned int size)
|
2005-04-16 22:20:36 +00:00
|
|
|
{
|
|
|
|
u8 *ptr = NULL;
|
|
|
|
|
|
|
|
if (k >= SKF_NET_OFF)
|
2007-04-11 03:50:43 +00:00
|
|
|
ptr = skb_network_header(skb) + k - SKF_NET_OFF;
|
2005-04-16 22:20:36 +00:00
|
|
|
else if (k >= SKF_LL_OFF)
|
2007-03-19 22:33:04 +00:00
|
|
|
ptr = skb_mac_header(skb) + k - SKF_LL_OFF;
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2010-12-07 22:26:15 +00:00
|
|
|
if (ptr >= skb->head && ptr + size <= skb_tail_pointer(skb))
|
2005-04-16 22:20:36 +00:00
|
|
|
return ptr;
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2010-12-06 20:50:09 +00:00
|
|
|
static inline void *load_pointer(const struct sk_buff *skb, int k,
|
2007-02-09 14:24:36 +00:00
|
|
|
unsigned int size, void *buffer)
|
2005-07-05 21:10:21 +00:00
|
|
|
{
|
|
|
|
if (k >= 0)
|
|
|
|
return skb_header_pointer(skb, k, size, buffer);
|
2012-03-30 05:08:19 +00:00
|
|
|
return bpf_internal_load_pointer_neg_helper(skb, k, size);
|
2005-07-05 21:10:21 +00:00
|
|
|
}
|
|
|
|
|
2008-04-10 08:43:09 +00:00
|
|
|
/**
|
|
|
|
* sk_filter - run a packet through a socket filter
|
|
|
|
* @sk: sock associated with &sk_buff
|
|
|
|
* @skb: buffer to filter
|
|
|
|
*
|
|
|
|
* Run the filter code and then cut skb->data to correct size returned by
|
|
|
|
* sk_run_filter. If pkt_len is 0 we toss packet. If skb->len is smaller
|
|
|
|
* than pkt_len we keep whole skb->data. This is the socket level
|
|
|
|
* wrapper to sk_run_filter. It returns 0 if the packet should
|
|
|
|
* be accepted or -EPERM if the packet should be tossed.
|
|
|
|
*
|
|
|
|
*/
|
|
|
|
int sk_filter(struct sock *sk, struct sk_buff *skb)
|
|
|
|
{
|
|
|
|
int err;
|
|
|
|
struct sk_filter *filter;
|
|
|
|
|
2012-07-31 23:44:19 +00:00
|
|
|
/*
|
|
|
|
* If the skb was allocated from pfmemalloc reserves, only
|
|
|
|
* allow SOCK_MEMALLOC sockets to use it as this socket is
|
|
|
|
* helping free memory
|
|
|
|
*/
|
|
|
|
if (skb_pfmemalloc(skb) && !sock_flag(sk, SOCK_MEMALLOC))
|
|
|
|
return -ENOMEM;
|
|
|
|
|
2008-04-10 08:43:09 +00:00
|
|
|
err = security_sock_rcv_skb(sk, skb);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
2011-01-18 07:46:52 +00:00
|
|
|
rcu_read_lock();
|
|
|
|
filter = rcu_dereference(sk->sk_filter);
|
2008-04-10 08:43:09 +00:00
|
|
|
if (filter) {
|
2011-04-20 09:27:32 +00:00
|
|
|
unsigned int pkt_len = SK_RUN_FILTER(filter, skb);
|
2010-10-25 03:47:05 +00:00
|
|
|
|
2008-04-10 08:43:09 +00:00
|
|
|
err = pkt_len ? pskb_trim(skb, pkt_len) : -EPERM;
|
|
|
|
}
|
2011-01-18 07:46:52 +00:00
|
|
|
rcu_read_unlock();
|
2008-04-10 08:43:09 +00:00
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(sk_filter);
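
The accept/trim/drop rule documented for sk_filter() above can be summarised in a few lines of user-space C. This is a simplified model under stated assumptions: `apply_verdict()` is a hypothetical helper, the packet is represented only by its length, and the failure path of pskb_trim() is ignored.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Model of the verdict handling: pkt_len is the value returned by the
 * filter program, *len is the current packet length.
 * Returns 0 if the packet is kept (possibly trimmed), -EPERM if dropped.
 */
static int apply_verdict(size_t *len, unsigned int pkt_len)
{
	if (pkt_len == 0)
		return -EPERM;		/* filter asked to toss the packet */
	if (pkt_len < *len)
		*len = pkt_len;		/* keep only the first pkt_len bytes */
	return 0;			/* shorter packets are kept whole */
}
```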
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/**
|
2006-01-24 00:26:16 +00:00
|
|
|
* sk_run_filter - run a filter on a socket
|
2005-04-16 22:20:36 +00:00
|
|
|
* @skb: buffer to run the filter on
|
2011-01-08 17:41:42 +00:00
|
|
|
* @fentry: filter to apply
|
2005-04-16 22:20:36 +00:00
|
|
|
*
|
|
|
|
* Decode and apply filter instructions to the skb->data.
|
2010-11-19 17:49:59 +00:00
|
|
|
* Return length to keep, 0 for none. @skb is the data we are
|
|
|
|
* filtering, @fentry is the array of filter instructions.
|
|
|
|
* Because all jumps are guaranteed to be before last instruction,
|
|
|
|
* and last instruction guaranteed to be a RET, we don't need to check
|
|
|
|
* flen. (We used to pass to this function the length of filter)
|
2005-04-16 22:20:36 +00:00
|
|
|
*/
|
2010-12-06 20:50:09 +00:00
|
|
|
unsigned int sk_run_filter(const struct sk_buff *skb,
|
|
|
|
const struct sock_filter *fentry)
|
2005-04-16 22:20:36 +00:00
|
|
|
{
|
2005-07-05 21:10:21 +00:00
|
|
|
void *ptr;
|
2006-01-24 00:26:16 +00:00
|
|
|
u32 A = 0; /* Accumulator */
|
|
|
|
u32 X = 0; /* Index Register */
|
2005-04-16 22:20:36 +00:00
|
|
|
u32 mem[BPF_MEMWORDS]; /* Scratch Memory Store */
|
2005-07-05 21:10:21 +00:00
|
|
|
u32 tmp;
|
2005-04-16 22:20:36 +00:00
|
|
|
int k;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Process array of filter instructions.
|
|
|
|
*/
|
2010-11-19 17:49:59 +00:00
|
|
|
for (;; fentry++) {
|
|
|
|
#if defined(CONFIG_X86_32)
|
|
|
|
#define K (fentry->k)
|
|
|
|
#else
|
|
|
|
const u32 K = fentry->k;
|
|
|
|
#endif
|
2007-02-09 14:24:36 +00:00
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
switch (fentry->code) {
|
net: optimize Berkeley Packet Filter (BPF) processing
GCC is currently unable to optimize the switch statement in
sk_run_filter() because the OR'd case labels are sparse. This patch replaces the
OR'd labels with sequentially ordered case labels. The sk_chk_filter()
function is modified to replace the original opcodes with an
ordered but equivalent form. GCC is now able to transform the
switch statement in sk_run_filter() into a jump table of complexity O(1).
Before this patch GCC generated a sequence of conditional branches (O(n)) with a
.text segment size of 567 bytes (arch x86_64):
7ff: 8b 06 mov (%rsi),%eax
801: 66 83 f8 35 cmp $0x35,%ax
805: 0f 84 d0 02 00 00 je adb <sk_run_filter+0x31d>
80b: 0f 87 07 01 00 00 ja 918 <sk_run_filter+0x15a>
811: 66 83 f8 15 cmp $0x15,%ax
815: 0f 84 c5 02 00 00 je ae0 <sk_run_filter+0x322>
81b: 77 73 ja 890 <sk_run_filter+0xd2>
81d: 66 83 f8 04 cmp $0x4,%ax
821: 0f 84 17 02 00 00 je a3e <sk_run_filter+0x280>
827: 77 29 ja 852 <sk_run_filter+0x94>
829: 66 83 f8 01 cmp $0x1,%ax
[...]
With the modification the compiler translates the switch statement into
the following jump table fragment:
7ff: 66 83 3e 2c cmpw $0x2c,(%rsi)
803: 0f 87 1f 02 00 00 ja a28 <sk_run_filter+0x26a>
809: 0f b7 06 movzwl (%rsi),%eax
80c: ff 24 c5 00 00 00 00 jmpq *0x0(,%rax,8)
813: 44 89 e3 mov %r12d,%ebx
816: e9 43 03 00 00 jmpq b5e <sk_run_filter+0x3a0>
81b: 41 89 dc mov %ebx,%r12d
81e: e9 3b 03 00 00 jmpq b5e <sk_run_filter+0x3a0>
Furthermore, I reordered the instructions to reduce cache line misses by
moving the most common instructions to the start.
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
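
The remapping idea from the commit message above can be illustrated with a small stand-alone sketch. It is not the kernel code: the `dense_op` enum, `to_dense()` and `run_alu()` are hypothetical names, only three ALU operations are covered, and the lookup table is one possible way to do the translation. The `BPF_*` bit values are the classic BPF encoding.

```c
#include <stdint.h>

/* Classic BPF opcode bits (instruction class, operation, source). */
#define BPF_ALU	0x04
#define BPF_ADD	0x00
#define BPF_SUB	0x10
#define BPF_MUL	0x20
#define BPF_K	0x00
#define BPF_X	0x08

/* Hypothetical dense numbering; 0 marks an unsupported opcode. */
enum dense_op {
	DENSE_INVALID = 0,
	DENSE_ALU_ADD_X,
	DENSE_ALU_ADD_K,
	DENSE_ALU_SUB_X,
	DENSE_ALU_SUB_K,
	DENSE_ALU_MUL_X,
	DENSE_ALU_MUL_K,
};

/* Done once at validation time: sparse OR'd opcode -> dense value. */
static enum dense_op to_dense(uint16_t code)
{
	static const uint8_t map[256] = {
		[BPF_ALU | BPF_ADD | BPF_X] = DENSE_ALU_ADD_X,
		[BPF_ALU | BPF_ADD | BPF_K] = DENSE_ALU_ADD_K,
		[BPF_ALU | BPF_SUB | BPF_X] = DENSE_ALU_SUB_X,
		[BPF_ALU | BPF_SUB | BPF_K] = DENSE_ALU_SUB_K,
		[BPF_ALU | BPF_MUL | BPF_X] = DENSE_ALU_MUL_X,
		[BPF_ALU | BPF_MUL | BPF_K] = DENSE_ALU_MUL_K,
	};

	return code < 256 ? (enum dense_op)map[code] : DENSE_INVALID;
}

/* Hot path: consecutive case labels let the compiler emit a jump table. */
static uint32_t run_alu(enum dense_op op, uint32_t a, uint32_t x, uint32_t k)
{
	switch (op) {
	case DENSE_ALU_ADD_X: return a + x;
	case DENSE_ALU_ADD_K: return a + k;
	case DENSE_ALU_SUB_X: return a - x;
	case DENSE_ALU_SUB_K: return a - k;
	case DENSE_ALU_MUL_X: return a * x;
	case DENSE_ALU_MUL_K: return a * k;
	default:              return a;	/* unknown op: leave A unchanged */
	}
}
```

The translation cost is paid once per filter when it is attached, while the interpreter loop runs for every packet, so moving the sparse-to-dense mapping out of the hot path is the point of the design.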
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_ADD_X:
|
2005-04-16 22:20:36 +00:00
|
|
|
A += X;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_ADD_K:
|
2010-11-19 17:49:59 +00:00
|
|
|
A += K;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_SUB_X:
|
2005-04-16 22:20:36 +00:00
|
|
|
A -= X;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_SUB_K:
|
2010-11-19 17:49:59 +00:00
|
|
|
A -= K;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_MUL_X:
|
2005-04-16 22:20:36 +00:00
|
|
|
A *= X;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_MUL_K:
|
2010-11-19 17:49:59 +00:00
|
|
|
A *= K;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_DIV_X:
|
2005-04-16 22:20:36 +00:00
|
|
|
if (X == 0)
|
|
|
|
return 0;
|
|
|
|
A /= X;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_DIV_K:
|
2010-11-18 22:04:46 +00:00
|
|
|
A = reciprocal_divide(A, K);
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2012-09-07 22:03:35 +00:00
|
|
|
case BPF_S_ALU_MOD_X:
|
|
|
|
if (X == 0)
|
|
|
|
return 0;
|
|
|
|
A %= X;
|
|
|
|
continue;
|
|
|
|
case BPF_S_ALU_MOD_K:
|
|
|
|
A %= K;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_AND_X:
|
2005-04-16 22:20:36 +00:00
|
|
|
A &= X;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_AND_K:
|
2010-11-19 17:49:59 +00:00
|
|
|
A &= K;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_OR_X:
|
2005-04-16 22:20:36 +00:00
|
|
|
A |= X;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_OR_K:
|
2010-11-19 17:49:59 +00:00
|
|
|
A |= K;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2012-09-24 02:23:59 +00:00
|
|
|
case BPF_S_ANC_ALU_XOR_X:
|
|
|
|
case BPF_S_ALU_XOR_X:
|
|
|
|
A ^= X;
|
|
|
|
continue;
|
|
|
|
case BPF_S_ALU_XOR_K:
|
|
|
|
A ^= K;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_LSH_X:
|
2005-04-16 22:20:36 +00:00
|
|
|
A <<= X;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_LSH_K:
|
2010-11-19 17:49:59 +00:00
|
|
|
A <<= K;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_RSH_X:
|
2005-04-16 22:20:36 +00:00
|
|
|
A >>= X;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_RSH_K:
|
2010-11-19 17:49:59 +00:00
|
|
|
A >>= K;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ALU_NEG:
|
2005-04-16 22:20:36 +00:00
|
|
|
A = -A;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_JMP_JA:
|
2010-11-19 17:49:59 +00:00
|
|
|
fentry += K;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_JMP_JGT_K:
|
2010-11-19 17:49:59 +00:00
|
|
|
fentry += (A > K) ? fentry->jt : fentry->jf;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_JMP_JGE_K:
|
2010-11-19 17:49:59 +00:00
|
|
|
fentry += (A >= K) ? fentry->jt : fentry->jf;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_JMP_JEQ_K:
|
2010-11-19 17:49:59 +00:00
|
|
|
fentry += (A == K) ? fentry->jt : fentry->jf;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_JMP_JSET_K:
|
2010-11-19 17:49:59 +00:00
|
|
|
fentry += (A & K) ? fentry->jt : fentry->jf;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_JMP_JGT_X:
|
2010-11-19 17:49:59 +00:00
|
|
|
fentry += (A > X) ? fentry->jt : fentry->jf;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_JMP_JGE_X:
|
2010-11-19 17:49:59 +00:00
|
|
|
fentry += (A >= X) ? fentry->jt : fentry->jf;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_JMP_JEQ_X:
|
2010-11-19 17:49:59 +00:00
|
|
|
fentry += (A == X) ? fentry->jt : fentry->jf;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_JMP_JSET_X:
|
2010-11-19 17:49:59 +00:00
|
|
|
fentry += (A & X) ? fentry->jt : fentry->jf;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
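The conditional jump cases above all follow one pattern: evaluate a predicate on A, then advance by jt or jf. A minimal sketch of that arithmetic follows, assuming (as in classic BPF) that both offsets are counted from the instruction after the jump and that A and K are unsigned 32-bit values. struct cond_jump and the next_pc_* helpers are hypothetical names, not kernel symbols.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a conditional jump instruction. */
struct cond_jump {
	uint8_t  jt;	/* instructions to skip when the test is true */
	uint8_t  jf;	/* instructions to skip when the test is false */
	uint32_t k;	/* immediate operand */
};

/* Index of the instruction executed after a "jump if A > K" at pc.
 * The "+ 1" models the interpreter loop stepping past the jump itself. */
size_t next_pc_jgt(const struct cond_jump *j, size_t pc, uint32_t a)
{
	return pc + 1 + ((a > j->k) ? j->jt : j->jf);
}

/* Same for "jump if (A & K) != 0". */
size_t next_pc_jset(const struct cond_jump *j, size_t pc, uint32_t a)
{
	return pc + 1 + ((a & j->k) ? j->jt : j->jf);
}
```

Because jt and jf are unsigned 8-bit offsets in this model, a conditional jump can only move forward, which is one reason a filter of this kind cannot loop.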
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LD_W_ABS:
|
2010-11-19 17:49:59 +00:00
|
|
|
k = K;
|
2006-01-17 10:25:52 +00:00
|
|
|
load_w:
|
2005-07-05 21:10:21 +00:00
|
|
|
ptr = load_pointer(skb, k, 4, &tmp);
|
|
|
|
if (ptr != NULL) {
|
2008-05-02 23:26:16 +00:00
|
|
|
A = get_unaligned_be32(ptr);
|
2005-07-05 21:10:21 +00:00
|
|
|
continue;
|
2005-04-16 22:20:36 +00:00
|
|
|
}
|
2010-12-15 19:45:28 +00:00
|
|
|
return 0;
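The word load above has two parts: obtain a pointer to 4 readable bytes at offset k, then assemble them in big-endian (network) order without relying on alignment; if the bytes are not available, the filter returns 0. A rough user-space analogue over a flat buffer is sketched below. load_be32_checked() is a hypothetical helper, and the real load_pointer() works on an skb rather than a flat buffer, which this sketch does not attempt to model.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value at byte offset off in pkt[0..len).
 * Returns 1 and stores the value in *out on success, 0 if the 4 bytes
 * would run past the end of the buffer. */
int load_be32_checked(const uint8_t *pkt, size_t len, size_t off,
		      uint32_t *out)
{
	if (len < 4 || off > len - 4)
		return 0;

	const uint8_t *p = pkt + off;

	/* Byte-wise assembly is independent of host endianness and of
	 * the alignment of p. */
	*out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
	return 1;
}
```

The bounds test is written as off > len - 4 (after checking len >= 4) rather than off + 4 > len so that a very large offset cannot wrap around and pass the check.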
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LD_H_ABS:
|
2010-11-19 17:49:59 +00:00
|
|
|
k = K;
|
2006-01-17 10:25:52 +00:00
|
|
|
load_h:
|
2005-07-05 21:10:21 +00:00
|
|
|
ptr = load_pointer(skb, k, 2, &tmp);
|
|
|
|
if (ptr != NULL) {
|
2008-05-02 23:26:16 +00:00
|
|
|
A = get_unaligned_be16(ptr);
|
2005-07-05 21:10:21 +00:00
|
|
|
continue;
|
2005-04-16 22:20:36 +00:00
|
|
|
}
|
2010-12-15 19:45:28 +00:00
|
|
|
return 0;
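As a rough user-space illustration of the half-word load above: the sketch below combines a bounds check with a big-endian 16-bit read over a flat buffer. The function name `load_half_abs` is hypothetical. The real `load_pointer()` works on an skb and handles cases this sketch does not, so treat it only as a model of the success and failure paths.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for load_pointer() + get_unaligned_be16():
 * read a big-endian 16-bit value at offset k from a flat buffer.
 * Returns 1 and stores the value in *a on success, 0 when the read
 * would run past the end of the buffer (the filter then returns 0).
 */
int load_half_abs(const uint8_t *pkt, size_t len, size_t k, uint32_t *a)
{
    if (len < 2 || k > len - 2)
        return 0;
    /* Byte-wise assembly works regardless of alignment or host endianness. */
    *a = ((uint32_t)pkt[k] << 8) | pkt[k + 1];
    return 1;
}
```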
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LD_B_ABS:
|
2010-11-19 17:49:59 +00:00
|
|
|
k = K;
|
2005-04-16 22:20:36 +00:00
|
|
|
load_b:
|
2005-07-05 21:10:21 +00:00
|
|
|
ptr = load_pointer(skb, k, 1, &tmp);
|
|
|
|
if (ptr != NULL) {
|
|
|
|
A = *(u8 *)ptr;
|
|
|
|
continue;
|
2005-04-16 22:20:36 +00:00
|
|
|
}
|
2010-12-15 19:45:28 +00:00
|
|
|
return 0;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LD_W_LEN:
|
2005-07-05 21:10:40 +00:00
|
|
|
A = skb->len;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LDX_W_LEN:
|
2005-07-05 21:10:40 +00:00
|
|
|
X = skb->len;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LD_W_IND:
|
2010-11-19 17:49:59 +00:00
|
|
|
k = X + K;
|
2005-04-16 22:20:36 +00:00
|
|
|
goto load_w;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LD_H_IND:
|
2010-11-19 17:49:59 +00:00
|
|
|
k = X + K;
|
2005-04-16 22:20:36 +00:00
|
|
|
goto load_h;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LD_B_IND:
|
2010-11-19 17:49:59 +00:00
|
|
|
k = X + K;
|
2005-04-16 22:20:36 +00:00
|
|
|
goto load_b;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LDX_B_MSH:
|
2010-11-19 17:49:59 +00:00
|
|
|
ptr = load_pointer(skb, K, 1, &tmp);
|
2005-07-05 21:10:21 +00:00
|
|
|
if (ptr != NULL) {
|
|
|
|
X = (*(u8 *)ptr & 0xf) << 2;
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
return 0;
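The MSH load above multiplies the low nibble of one packet byte by four. Applied to the first byte of an IPv4 header, that yields the header length in bytes. A minimal sketch of the arithmetic follows; the function name `ldx_msh` is hypothetical.

```c
#include <stdint.h>

/* X = 4 * (byte & 0xf): low nibble counted in 32-bit words, result in bytes. */
uint32_t ldx_msh(uint8_t byte)
{
    return (uint32_t)(byte & 0xf) << 2;
}
```

For a typical IPv4 first byte of 0x45 (version 4, IHL 5), this gives a 20-byte header.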
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LD_IMM:
|
2010-11-19 17:49:59 +00:00
|
|
|
A = K;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LDX_IMM:
|
2010-11-19 17:49:59 +00:00
|
|
|
X = K;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LD_MEM:
|
2010-12-01 20:46:24 +00:00
|
|
|
A = mem[K];
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_LDX_MEM:
|
2010-12-01 20:46:24 +00:00
|
|
|
X = mem[K];
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_MISC_TAX:
|
2005-04-16 22:20:36 +00:00
|
|
|
X = A;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_MISC_TXA:
|
2005-04-16 22:20:36 +00:00
|
|
|
A = X;
|
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_RET_K:
|
2010-11-19 17:49:59 +00:00
|
|
|
return K;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_RET_A:
|
2006-01-06 21:08:20 +00:00
|
|
|
return A;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_ST:
|
2010-11-19 17:49:59 +00:00
|
|
|
mem[K] = A;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-06-19 17:05:36 +00:00
|
|
|
case BPF_S_STX:
|
2010-11-19 17:49:59 +00:00
|
|
|
mem[K] = X;
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-12-15 19:45:28 +00:00
|
|
|
case BPF_S_ANC_PROTOCOL:
|
2006-11-15 04:48:11 +00:00
|
|
|
A = ntohs(skb->protocol);
|
2005-04-16 22:20:36 +00:00
|
|
|
continue;
|
2010-12-15 19:45:28 +00:00
|
|
|
case BPF_S_ANC_PKTTYPE:
|
2005-04-16 22:20:36 +00:00
|
|
|
A = skb->pkt_type;
|
|
|
|
continue;
|
2010-12-15 19:45:28 +00:00
|
|
|
case BPF_S_ANC_IFINDEX:
|
2010-04-22 03:32:22 +00:00
|
|
|
if (!skb->dev)
|
|
|
|
return 0;
|
2005-04-16 22:20:36 +00:00
|
|
|
A = skb->dev->ifindex;
|
|
|
|
continue;
|
2010-12-15 19:45:28 +00:00
|
|
|
case BPF_S_ANC_MARK:
|
2009-10-19 02:17:56 +00:00
|
|
|
A = skb->mark;
|
|
|
|
continue;
|
2010-12-15 19:45:28 +00:00
|
|
|
case BPF_S_ANC_QUEUE:
|
2009-10-20 08:06:22 +00:00
|
|
|
A = skb->queue_mapping;
|
|
|
|
continue;
|
2010-12-15 19:45:28 +00:00
|
|
|
case BPF_S_ANC_HATYPE:
|
2010-04-22 03:32:22 +00:00
|
|
|
if (!skb->dev)
|
|
|
|
return 0;
|
|
|
|
A = skb->dev->type;
|
|
|
|
continue;
|
2010-12-15 19:45:28 +00:00
|
|
|
case BPF_S_ANC_RXHASH:
|
2010-11-30 21:45:56 +00:00
|
|
|
A = skb->rxhash;
|
|
|
|
continue;
|
2010-12-15 19:45:28 +00:00
|
|
|
case BPF_S_ANC_CPU:
|
2010-11-30 21:45:56 +00:00
|
|
|
A = raw_smp_processor_id();
|
|
|
|
continue;
|
2012-10-27 02:26:17 +00:00
|
|
|
case BPF_S_ANC_VLAN_TAG:
|
|
|
|
A = vlan_tx_tag_get(skb);
|
|
|
|
continue;
|
|
|
|
case BPF_S_ANC_VLAN_TAG_PRESENT:
|
|
|
|
A = !!vlan_tx_tag_present(skb);
|
|
|
|
continue;
|
filter: add ANC_PAY_OFFSET instruction for loading payload start offset
It is very useful to do dynamic truncation of packets. In particular,
we're interested in pushing the necessary header bytes to user space and
cutting off user payload that should probably not be transferred for some reason
(e.g. privacy, speed, or others). With the ancillary extension PAY_OFFSET,
we can load the payload start offset into the accumulator and return it. E.g. in bpfc syntax ...
ld #poff ; { 0x20, 0, 0, 0xfffff034 },
ret a ; { 0x16, 0, 0, 0x00000000 },
... as a filter will accomplish this without resorting to elaborate hackery in
a BPF filter itself. Follow-up JIT implementations are welcome.
Thanks to Eric Dumazet for suggesting and discussing this during the
Netfilter Workshop in Copenhagen.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
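As a minimal sketch, the two bpfc instructions quoted in the message above can be written as a C array. The struct and the EX_-prefixed constants below are local stand-ins that mirror the values found in <linux/filter.h> (struct sock_filter, BPF_LD, SKF_AD_OFF, SKF_AD_PAY_OFFSET and so on); the names `ex_insn` and `ex_poff_filter` are illustrative assumptions, not kernel identifiers.

```c
#include <stdint.h>

/* Local mirror of struct sock_filter so the sketch builds anywhere. */
struct ex_insn {
	uint16_t code;	/* opcode */
	uint8_t  jt;	/* jump offset if the condition is true */
	uint8_t  jf;	/* jump offset if the condition is false */
	uint32_t k;	/* generic operand */
};

/* Opcode bits and ancillary offsets, same values as <linux/filter.h>. */
#define EX_BPF_LD		0x00
#define EX_BPF_W		0x00
#define EX_BPF_ABS		0x20
#define EX_BPF_RET		0x06
#define EX_BPF_A		0x10
#define EX_SKF_AD_OFF		(-0x1000)
#define EX_SKF_AD_PAY_OFFSET	52

static const struct ex_insn ex_poff_filter[] = {
	/* ld #poff: A = offset at which the payload starts */
	{ EX_BPF_LD | EX_BPF_W | EX_BPF_ABS, 0, 0,
	  (uint32_t)(EX_SKF_AD_OFF + EX_SKF_AD_PAY_OFFSET) },
	/* ret a: keep the first A bytes of the packet */
	{ EX_BPF_RET | EX_BPF_A, 0, 0, 0 },
};
```

The encoded values match the hexadecimal tuples in the commit message: the load is opcode 0x20 with k = 0xfffff034, and the return is opcode 0x16.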
2013-03-19 06:39:31 +00:00
|
|
|
case BPF_S_ANC_PAY_OFFSET:
|
|
|
|
A = __skb_get_poff(skb);
|
|
|
|
continue;
|
2010-12-15 19:45:28 +00:00
|
|
|
case BPF_S_ANC_NLATTR: {
|
[SKFILTER]: Add SKF_ADF_NLATTR instruction
SKF_ADF_NLATTR searches for a netlink attribute, which avoids manually
parsing and walking attributes. It takes the offset at which to start
searching in the 'A' register and the attribute type in the 'X' register
and returns the offset in the 'A' register. When the attribute is not
found it returns zero.
A top-level attribute can be located using a filter like this
(example for nfnetlink, using struct nfgenmsg):
...
{
/* A = offset of first attribute */
.code = BPF_LD | BPF_IMM,
.k = sizeof(struct nlmsghdr) + sizeof(struct nfgenmsg)
},
{
/* X = CTA_PROTOINFO */
.code = BPF_LDX | BPF_IMM,
.k = CTA_PROTOINFO,
},
{
/* A = netlink attribute offset */
.code = BPF_LD | BPF_B | BPF_ABS,
.k = SKF_AD_OFF + SKF_AD_NLATTR
},
{
/* Exit if not found */
.code = BPF_JMP | BPF_JEQ | BPF_K,
.k = 0,
.jt = <error>
},
...
A nested attribute below the CTA_PROTOINFO attribute would then
be parsed like this:
...
{
/* A += sizeof(struct nlattr) */
.code = BPF_ALU | BPF_ADD | BPF_K,
.k = sizeof(struct nlattr),
},
{
/* X = CTA_PROTOINFO_TCP */
.code = BPF_LDX | BPF_IMM,
.k = CTA_PROTOINFO_TCP,
},
{
/* A = netlink attribute offset */
.code = BPF_LD | BPF_B | BPF_ABS,
.k = SKF_AD_OFF + SKF_AD_NLATTR
},
...
The data of an attribute can be loaded into 'A' like this:
...
{
/* X = A (attribute offset) */
.code = BPF_MISC | BPF_TAX,
},
{
/* A = skb->data[X + k] */
.code = BPF_LD | BPF_B | BPF_IND,
.k = sizeof(struct nlattr),
},
...
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
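The attribute search described above can be approximated in user space as follows. This is only a sketch under stated assumptions: `ex_nlattr`, `EX_NLA_ALIGN` and `ex_find_attr` are hypothetical local names, the header layout mirrors struct nlattr (16-bit length including the header, 16-bit type), and the type comparison ignores the flag bits that the kernel masks off. Like the instruction, it returns the offset of the attribute, or zero when nothing is found.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct nlattr: length (header included) and type. */
struct ex_nlattr {
	uint16_t nla_len;
	uint16_t nla_type;
};

/* Attributes are padded to a 4-byte boundary. */
#define EX_NLA_ALIGN(len) (((len) + 3u) & ~3u)

/*
 * Search buf[start..len) for an attribute of the given type.
 * Returns its byte offset within buf, or 0 when it is not found.
 */
static uint32_t ex_find_attr(const uint8_t *buf, uint32_t len,
			     uint32_t start, uint16_t type)
{
	uint32_t off = start;

	while (off <= len && len - off >= sizeof(struct ex_nlattr)) {
		struct ex_nlattr hdr;

		memcpy(&hdr, buf + off, sizeof(hdr));
		/* Stop on truncated or malformed attributes. */
		if (hdr.nla_len < sizeof(hdr) || hdr.nla_len > len - off)
			return 0;
		if (hdr.nla_type == type)
			return off;
		off += EX_NLA_ALIGN(hdr.nla_len);
	}
	return 0;
}
```

Because zero doubles as the "not found" value, a caller has to start the search past the message headers, which is what the filter examples above do by loading the header sizes into 'A' first.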
2008-04-10 09:02:28 +00:00
|
|
|
struct nlattr *nla;
|
|
|
|
|
|
|
|
if (skb_is_nonlinear(skb))
|
|
|
|
return 0;
|
|
|
|
if (A > skb->len - sizeof(struct nlattr))
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
nla = nla_find((struct nlattr *)&skb->data[A],
|
|
|
|
skb->len - A, X);
|
|
|
|
if (nla)
|
|
|
|
A = (void *)nla - (void *)skb->data;
|
|
|
|
else
|
|
|
|
A = 0;
|
|
|
|
continue;
|
|
|
|
}
|
2010-12-15 19:45:28 +00:00
|
|
|
case BPF_S_ANC_NLATTR_NEST: {
|
2008-11-20 08:49:27 +00:00
|
|
|
struct nlattr *nla;
|
|
|
|
|
|
|
|
if (skb_is_nonlinear(skb))
|
|
|
|
return 0;
|
|
|
|
if (A > skb->len - sizeof(struct nlattr))
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
nla = (struct nlattr *)&skb->data[A];
|
|
|
|
if (nla->nla_len > skb->len - A)
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
nla = nla_find_nested(nla, X);
|
|
|
|
if (nla)
|
|
|
|
A = (void *)nla - (void *)skb->data;
|
|
|
|
else
|
|
|
|
A = 0;
|
|
|
|
continue;
|
|
|
|
}
|
2012-04-12 21:47:52 +00:00
|
|
|
#ifdef CONFIG_SECCOMP_FILTER
|
|
|
|
case BPF_S_ANC_SECCOMP_LD_W:
|
|
|
|
A = seccomp_bpf_load(fentry->k);
|
|
|
|
continue;
|
|
|
|
#endif
|
2005-04-16 22:20:36 +00:00
|
|
|
default:
|
2011-05-21 07:48:40 +00:00
|
|
|
WARN_RATELIMIT(1, "Unknown code:%u jt:%u tf:%u k:%u\n",
|
|
|
|
fentry->code, fentry->jt,
|
|
|
|
fentry->jf, fentry->k);
|
2005-04-16 22:20:36 +00:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
2008-04-10 08:33:47 +00:00
|
|
|
EXPORT_SYMBOL(sk_run_filter);
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2010-12-01 20:46:24 +00:00
|
|
|
/*
|
|
|
|
* Security :
|
|
|
|
* A BPF program is able to use 16 cells of memory to store intermediate
|
|
|
|
* values (check u32 mem[BPF_MEMWORDS] in sk_run_filter())
|
|
|
|
* As we don't want to clear the mem[] array for each packet going through
|
|
|
|
* sk_run_filter(), we check that a filter loaded by the user never tries to read
|
|
|
|
* a cell that was not previously written, and we check all branches to be sure
|
2011-03-31 01:57:33 +00:00
|
|
|
* a malicious user doesn't try to abuse us.
|
2010-12-01 20:46:24 +00:00
|
|
|
*/
|
|
|
|
static int check_load_and_stores(struct sock_filter *filter, int flen)
|
|
|
|
{
|
|
|
|
u16 *masks, memvalid = 0; /* one bit per cell, 16 cells */
|
|
|
|
int pc, ret = 0;
|
|
|
|
|
|
|
|
BUILD_BUG_ON(BPF_MEMWORDS > 16);
|
|
|
|
masks = kmalloc(flen * sizeof(*masks), GFP_KERNEL);
|
|
|
|
if (!masks)
|
|
|
|
return -ENOMEM;
|
|
|
|
memset(masks, 0xff, flen * sizeof(*masks));
|
|
|
|
|
|
|
|
for (pc = 0; pc < flen; pc++) {
|
|
|
|
memvalid &= masks[pc];
|
|
|
|
|
|
|
|
switch (filter[pc].code) {
|
|
|
|
case BPF_S_ST:
|
|
|
|
case BPF_S_STX:
|
|
|
|
memvalid |= (1 << filter[pc].k);
|
|
|
|
break;
|
|
|
|
case BPF_S_LD_MEM:
|
|
|
|
case BPF_S_LDX_MEM:
|
|
|
|
if (!(memvalid & (1 << filter[pc].k))) {
|
|
|
|
ret = -EINVAL;
|
|
|
|
goto error;
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
case BPF_S_JMP_JA:
|
|
|
|
/* a jump must set masks on target */
|
|
|
|
masks[pc + 1 + filter[pc].k] &= memvalid;
|
|
|
|
memvalid = ~0;
|
|
|
|
break;
|
|
|
|
case BPF_S_JMP_JEQ_K:
|
|
|
|
case BPF_S_JMP_JEQ_X:
|
|
|
|
case BPF_S_JMP_JGE_K:
|
|
|
|
case BPF_S_JMP_JGE_X:
|
|
|
|
case BPF_S_JMP_JGT_K:
|
|
|
|
case BPF_S_JMP_JGT_X:
|
|
|
|
case BPF_S_JMP_JSET_X:
|
|
|
|
case BPF_S_JMP_JSET_K:
|
|
|
|
/* a jump must set masks on targets */
|
|
|
|
masks[pc + 1 + filter[pc].jt] &= memvalid;
|
|
|
|
masks[pc + 1 + filter[pc].jf] &= memvalid;
|
|
|
|
memvalid = ~0;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
error:
|
|
|
|
kfree(masks);
|
|
|
|
return ret;
|
|
|
|
}
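To see the effect of the mask propagation in the function above, here is a simplified user-space sketch of the same idea. It is not the kernel code: `ex_op`, `ex_insn` and `ex_check_scratch` are hypothetical names, the instruction set is reduced to a handful of operations, and it assumes that jump targets were already range-checked and that k is below 16 for scratch accesses.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum ex_op { EX_ST, EX_LD_MEM, EX_JA, EX_JEQ, EX_OTHER, EX_RET };

struct ex_insn {
	enum ex_op op;
	uint8_t jt, jf;	/* conditional jump offsets */
	uint32_t k;	/* scratch cell index or jump offset */
};

/*
 * Returns 0 if every scratch load is preceded by a store on all
 * paths that reach it, -1 otherwise (or on allocation failure).
 */
static int ex_check_scratch(const struct ex_insn *prog, int len)
{
	uint16_t *masks, memvalid = 0;	/* one bit per scratch cell */
	int pc, ret = 0;

	masks = malloc(len * sizeof(*masks));
	if (!masks)
		return -1;
	memset(masks, 0xff, len * sizeof(*masks));

	for (pc = 0; pc < len; pc++) {
		/* Keep only the cells that are valid on every incoming path. */
		memvalid &= masks[pc];

		switch (prog[pc].op) {
		case EX_ST:
			memvalid |= 1u << prog[pc].k;
			break;
		case EX_LD_MEM:
			if (!(memvalid & (1u << prog[pc].k))) {
				ret = -1;
				goto out;
			}
			break;
		case EX_JA:
			/* A jump passes its valid set on to the target. */
			masks[pc + 1 + prog[pc].k] &= memvalid;
			memvalid = (uint16_t)~0u;
			break;
		case EX_JEQ:
			masks[pc + 1 + prog[pc].jt] &= memvalid;
			masks[pc + 1 + prog[pc].jf] &= memvalid;
			memvalid = (uint16_t)~0u;
			break;
		default:
			break;
		}
	}
out:
	free(masks);
	return ret;
}
```

A program that stores to cell 0 and then loads it passes. A program in which one branch of a conditional jump skips the store is rejected, because the jump target inherits an empty valid set from that path.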
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/**
|
|
|
|
* sk_chk_filter - verify socket filter code
|
|
|
|
* @filter: filter to verify
|
|
|
|
* @flen: length of filter
|
|
|
|
*
|
|
|
|
* Check the user's filter code. If we let some ugly
|
|
|
|
* filter code slip through kaboom! The filter must contain
|
2006-01-04 21:58:36 +00:00
|
|
|
* no references or jumps that are out of range, no illegal
|
|
|
|
* instructions, and must end with a RET instruction.
|
2005-04-16 22:20:36 +00:00
|
|
|
*
|
2006-01-13 22:33:06 +00:00
|
|
|
* All jumps are forward as they are not signed.
|
|
|
|
*
|
|
|
|
* Returns 0 if the rule set is legal or -EINVAL if not.
|
2005-04-16 22:20:36 +00:00
|
|
|
*/
|
2011-10-17 21:04:20 +00:00
|
|
|
int sk_chk_filter(struct sock_filter *filter, unsigned int flen)
|
2005-04-16 22:20:36 +00:00
|
|
|
{
|
2010-11-16 15:19:51 +00:00
|
|
|
/*
|
|
|
|
* Valid instructions are initialized to non-0.
|
|
|
|
* Invalid instructions are initialized to 0.
|
|
|
|
*/
|
|
|
|
static const u8 codes[] = {
|
2010-11-18 21:56:38 +00:00
|
|
|
[BPF_ALU|BPF_ADD|BPF_K] = BPF_S_ALU_ADD_K,
|
|
|
|
[BPF_ALU|BPF_ADD|BPF_X] = BPF_S_ALU_ADD_X,
|
|
|
|
[BPF_ALU|BPF_SUB|BPF_K] = BPF_S_ALU_SUB_K,
|
|
|
|
[BPF_ALU|BPF_SUB|BPF_X] = BPF_S_ALU_SUB_X,
|
|
|
|
[BPF_ALU|BPF_MUL|BPF_K] = BPF_S_ALU_MUL_K,
|
|
|
|
[BPF_ALU|BPF_MUL|BPF_X] = BPF_S_ALU_MUL_X,
|
|
|
|
[BPF_ALU|BPF_DIV|BPF_X] = BPF_S_ALU_DIV_X,
|
2012-09-07 22:03:35 +00:00
|
|
|
[BPF_ALU|BPF_MOD|BPF_K] = BPF_S_ALU_MOD_K,
|
|
|
|
[BPF_ALU|BPF_MOD|BPF_X] = BPF_S_ALU_MOD_X,
|
2010-11-18 21:56:38 +00:00
|
|
|
[BPF_ALU|BPF_AND|BPF_K] = BPF_S_ALU_AND_K,
|
|
|
|
[BPF_ALU|BPF_AND|BPF_X] = BPF_S_ALU_AND_X,
|
|
|
|
[BPF_ALU|BPF_OR|BPF_K] = BPF_S_ALU_OR_K,
|
|
|
|
[BPF_ALU|BPF_OR|BPF_X] = BPF_S_ALU_OR_X,
|
2012-09-24 02:23:59 +00:00
|
|
|
[BPF_ALU|BPF_XOR|BPF_K] = BPF_S_ALU_XOR_K,
|
|
|
|
[BPF_ALU|BPF_XOR|BPF_X] = BPF_S_ALU_XOR_X,
|
2010-11-18 21:56:38 +00:00
|
|
|
[BPF_ALU|BPF_LSH|BPF_K] = BPF_S_ALU_LSH_K,
|
|
|
|
[BPF_ALU|BPF_LSH|BPF_X] = BPF_S_ALU_LSH_X,
|
|
|
|
[BPF_ALU|BPF_RSH|BPF_K] = BPF_S_ALU_RSH_K,
|
|
|
|
[BPF_ALU|BPF_RSH|BPF_X] = BPF_S_ALU_RSH_X,
|
|
|
|
[BPF_ALU|BPF_NEG] = BPF_S_ALU_NEG,
|
|
|
|
[BPF_LD|BPF_W|BPF_ABS] = BPF_S_LD_W_ABS,
|
|
|
|
[BPF_LD|BPF_H|BPF_ABS] = BPF_S_LD_H_ABS,
|
|
|
|
[BPF_LD|BPF_B|BPF_ABS] = BPF_S_LD_B_ABS,
|
|
|
|
[BPF_LD|BPF_W|BPF_LEN] = BPF_S_LD_W_LEN,
|
|
|
|
[BPF_LD|BPF_W|BPF_IND] = BPF_S_LD_W_IND,
|
|
|
|
[BPF_LD|BPF_H|BPF_IND] = BPF_S_LD_H_IND,
|
|
|
|
[BPF_LD|BPF_B|BPF_IND] = BPF_S_LD_B_IND,
|
|
|
|
[BPF_LD|BPF_IMM] = BPF_S_LD_IMM,
|
|
|
|
[BPF_LDX|BPF_W|BPF_LEN] = BPF_S_LDX_W_LEN,
|
|
|
|
[BPF_LDX|BPF_B|BPF_MSH] = BPF_S_LDX_B_MSH,
|
|
|
|
[BPF_LDX|BPF_IMM] = BPF_S_LDX_IMM,
|
|
|
|
[BPF_MISC|BPF_TAX] = BPF_S_MISC_TAX,
|
|
|
|
[BPF_MISC|BPF_TXA] = BPF_S_MISC_TXA,
|
|
|
|
[BPF_RET|BPF_K] = BPF_S_RET_K,
|
|
|
|
[BPF_RET|BPF_A] = BPF_S_RET_A,
|
|
|
|
[BPF_ALU|BPF_DIV|BPF_K] = BPF_S_ALU_DIV_K,
|
|
|
|
[BPF_LD|BPF_MEM] = BPF_S_LD_MEM,
|
|
|
|
[BPF_LDX|BPF_MEM] = BPF_S_LDX_MEM,
|
|
|
|
[BPF_ST] = BPF_S_ST,
|
|
|
|
[BPF_STX] = BPF_S_STX,
|
|
|
|
[BPF_JMP|BPF_JA] = BPF_S_JMP_JA,
|
|
|
|
[BPF_JMP|BPF_JEQ|BPF_K] = BPF_S_JMP_JEQ_K,
|
|
|
|
[BPF_JMP|BPF_JEQ|BPF_X] = BPF_S_JMP_JEQ_X,
|
|
|
|
[BPF_JMP|BPF_JGE|BPF_K] = BPF_S_JMP_JGE_K,
|
|
|
|
[BPF_JMP|BPF_JGE|BPF_X] = BPF_S_JMP_JGE_X,
|
|
|
|
[BPF_JMP|BPF_JGT|BPF_K] = BPF_S_JMP_JGT_K,
|
|
|
|
[BPF_JMP|BPF_JGT|BPF_X] = BPF_S_JMP_JGT_X,
|
|
|
|
[BPF_JMP|BPF_JSET|BPF_K] = BPF_S_JMP_JSET_K,
|
|
|
|
[BPF_JMP|BPF_JSET|BPF_X] = BPF_S_JMP_JSET_X,
|
2010-11-16 15:19:51 +00:00
|
|
|
};
|
2005-04-16 22:20:36 +00:00
|
|
|
int pc;
|
net: filter: return -EINVAL if BPF_S_ANC* operation is not supported
Currently, we return -EINVAL for malformed or wrong BPF filters.
However, this is not done for BPF_S_ANC* operations, which makes it
more difficult to detect if it's actually supported or not by the
BPF machine. Therefore, we should also return -EINVAL if K is within
the SKF_AD_OFF universe and the ancillary operation did not match.
Why exactly is it needed? If tools such as libpcap/tcpdump want to
make use of new ancillary operations (like filtering VLAN in kernel
space), there is currently no sane way to test if this feature /
BPF_S_ANC* op is present or not, since no error is returned. This
patch will make life easier for that and allow for a proper usage
for user space applications.
There was concern about whether this patch would break userland. Short answer: Yes
and no. Long answer: It will "break" only code that uses ...
{ BPF_LD | BPF_(W|H|B) | BPF_ABS, 0, 0, <K> },
... where <K> is in [0xfffff000, 0xffffffff] _and_ <K> is *not* an
ancillary. And here comes the BUT: assuming some *old* code has
such an instruction where <K> is in [0xfffff000, 0xffffffff] and
it doesn't know about ancillary operations, then this already gives
unexpected / unwanted behavior (since we do not return from the
BPF machine with 0 after a failed load_pointer(), which was the case
before introducing ancillary operations, but load something into the
accumulator instead, and continue with the next instruction, for
instance). Thus, such user space code was already broken by
the introduction of ancillary operations into the BPF machine per se. Code
that does such a direct load, e.g. "load word at packet offset
0xffffffff into accumulator" ("ld [0xffffffff]") is quite broken,
isn't it? The whole assumption of ancillary operations is that no one
intentionally uses things like "ld [0xffffffff]" and expects this
word to be loaded from such a packet offset. Hence, we can safely
make use of this feature-testing patch and facilitate application
development. Therefore, at least from this patch onwards, we have
*for sure* a check whether current or future BPF_S_ANC*
ops are supported in the kernel. Patch was tested on x86_64.
(Thanks to Eric for the previous review.)
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Ani Sinha <ani@aristanetworks.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
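The rule described in this message can be sketched as a small stand-alone function. This is an illustration under assumptions, not the kernel implementation: `ex_check_abs_load` is a hypothetical name, only three ancillary offsets are listed (with the same values as SKF_AD_PROTOCOL, SKF_AD_PKTTYPE and SKF_AD_IFINDEX in <linux/filter.h>), and a real kernel knows many more.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Same values as in <linux/filter.h>; only a subset is listed here. */
#define EX_SKF_AD_OFF		(-0x1000)
#define EX_SKF_AD_PROTOCOL	0
#define EX_SKF_AD_PKTTYPE	4
#define EX_SKF_AD_IFINDEX	8

/*
 * Validate the operand k of an absolute load.
 * Returns 0 if k is an ordinary packet offset or a known ancillary
 * operation, and -EINVAL if k lies in the ancillary range but is unknown.
 */
static int ex_check_abs_load(uint32_t k)
{
	bool anc_found = false;

	switch (k) {
	case (uint32_t)(EX_SKF_AD_OFF + EX_SKF_AD_PROTOCOL):
	case (uint32_t)(EX_SKF_AD_OFF + EX_SKF_AD_PKTTYPE):
	case (uint32_t)(EX_SKF_AD_OFF + EX_SKF_AD_IFINDEX):
		anc_found = true;
		break;
	default:
		break;
	}

	/* Ancillary operation unknown or unsupported. */
	if (!anc_found && k >= (uint32_t)EX_SKF_AD_OFF)
		return -EINVAL;
	return 0;
}
```

With this rule a user space tool can probe for a feature by submitting a one-load filter and checking for -EINVAL. Offsets below 0xfffff000 are not affected by the check.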
2012-12-28 10:50:17 +00:00
|
|
|
bool anc_found;
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2005-12-27 21:57:59 +00:00
|
|
|
if (flen == 0 || flen > BPF_MAXINSNS)
|
2005-04-16 22:20:36 +00:00
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
/* check the filter code now */
|
|
|
|
for (pc = 0; pc < flen; pc++) {
|
2010-11-16 15:19:51 +00:00
|
|
|
struct sock_filter *ftest = &filter[pc];
|
|
|
|
u16 code = ftest->code;
|
2006-01-04 21:58:36 +00:00
|
|
|
|
2010-11-16 15:19:51 +00:00
|
|
|
if (code >= ARRAY_SIZE(codes))
|
|
|
|
return -EINVAL;
|
|
|
|
code = codes[code];
|
2010-11-18 21:56:38 +00:00
|
|
|
if (!code)
|
2010-11-16 15:19:51 +00:00
|
|
|
return -EINVAL;
|
2006-01-04 21:58:36 +00:00
|
|
|
/* Some instructions need special checks */
|
2010-11-16 15:19:51 +00:00
|
|
|
switch (code) {
|
|
|
|
case BPF_S_ALU_DIV_K:
|
2006-01-04 21:58:36 +00:00
|
|
|
/* check for division by zero */
|
|
|
|
if (ftest->k == 0)
|
2005-04-16 22:20:36 +00:00
|
|
|
return -EINVAL;
|
2010-11-18 22:04:46 +00:00
|
|
|
ftest->k = reciprocal_value(ftest->k);
|
2006-01-04 21:58:36 +00:00
|
|
|
break;
|
2012-09-07 22:03:35 +00:00
|
|
|
case BPF_S_ALU_MOD_K:
|
|
|
|
/* check for division by zero */
|
|
|
|
if (ftest->k == 0)
|
|
|
|
return -EINVAL;
|
|
|
|
break;
|
2010-11-16 15:19:51 +00:00
|
|
|
case BPF_S_LD_MEM:
|
|
|
|
case BPF_S_LDX_MEM:
|
|
|
|
case BPF_S_ST:
|
|
|
|
case BPF_S_STX:
|
|
|
|
/* check for invalid memory addresses */
|
2006-01-04 21:58:36 +00:00
|
|
|
if (ftest->k >= BPF_MEMWORDS)
|
|
|
|
return -EINVAL;
|
|
|
|
break;
|
2010-11-16 15:19:51 +00:00
|
|
|
case BPF_S_JMP_JA:
|
2006-01-04 21:58:36 +00:00
|
|
|
/*
|
|
|
|
* Note, the large ftest->k might cause loops.
|
|
|
|
* Compare this with conditional jumps below,
|
|
|
|
* where offsets are limited. --ANK (981016)
|
|
|
|
*/
|
2012-04-15 05:58:06 +00:00
|
|
|
if (ftest->k >= (unsigned int)(flen-pc-1))
|
2006-01-04 21:58:36 +00:00
|
|
|
return -EINVAL;
|
net: optimize Berkeley Packet Filter (BPF) processing
gcc is currently unable to optimize the switch statement in
sk_run_filter() because the case labels are sparse. This patch replaces the
OR'd labels with sequentially ordered case labels. The sk_chk_filter()
function is modified to patch/replace the original opcodes with an
ordered but equivalent form. gcc is now able to transform the
switch statement in sk_run_filter() into a jump table with O(1) dispatch.
Before this patch gcc generated a sequence of conditional branches (O(n)) with a
.text segment size of 567 bytes (arch x86_64):
7ff: 8b 06 mov (%rsi),%eax
801: 66 83 f8 35 cmp $0x35,%ax
805: 0f 84 d0 02 00 00 je adb <sk_run_filter+0x31d>
80b: 0f 87 07 01 00 00 ja 918 <sk_run_filter+0x15a>
811: 66 83 f8 15 cmp $0x15,%ax
815: 0f 84 c5 02 00 00 je ae0 <sk_run_filter+0x322>
81b: 77 73 ja 890 <sk_run_filter+0xd2>
81d: 66 83 f8 04 cmp $0x4,%ax
821: 0f 84 17 02 00 00 je a3e <sk_run_filter+0x280>
827: 77 29 ja 852 <sk_run_filter+0x94>
829: 66 83 f8 01 cmp $0x1,%ax
[...]
With the modification the compiler translates the switch statement into
the following jump table fragment:
7ff: 66 83 3e 2c cmpw $0x2c,(%rsi)
803: 0f 87 1f 02 00 00 ja a28 <sk_run_filter+0x26a>
809: 0f b7 06 movzwl (%rsi),%eax
80c: ff 24 c5 00 00 00 00 jmpq *0x0(,%rax,8)
813: 44 89 e3 mov %r12d,%ebx
816: e9 43 03 00 00 jmpq b5e <sk_run_filter+0x3a0>
81b: 41 89 dc mov %ebx,%r12d
81e: e9 3b 03 00 00 jmpq b5e <sk_run_filter+0x3a0>
Furthermore, I reordered the instructions to reduce cache line misses by
placing the most common instructions first.
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
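The translation step this message describes can be illustrated with a reduced sketch: a lookup table maps the sparse opcodes supplied by user space to small consecutive values, and the interpreter switches on those. All names here (`ex_codes`, `ex_translate`, `ex_run`, the EX_ constants) are hypothetical; only the four raw opcode values follow the classic BPF encoding, and whether a given compiler emits a jump table for the switch depends on its version and flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Raw opcodes as encoded by user space (classic BPF values). */
#define EX_RAW_LD_IMM	0x00	/* BPF_LD  | BPF_IMM */
#define EX_RAW_ADD_K	0x04	/* BPF_ALU | BPF_ADD | BPF_K */
#define EX_RAW_RET_K	0x06	/* BPF_RET | BPF_K */
#define EX_RAW_RET_A	0x16	/* BPF_RET | BPF_A */

/* Dense internal opcodes; 0 is reserved for "invalid". */
enum ex_dense {
	EX_S_INVALID = 0,
	EX_S_RET_K,
	EX_S_RET_A,
	EX_S_ADD_K,
	EX_S_LD_IMM,
};

/* Entries that are not listed default to 0, i.e. invalid. */
static const uint8_t ex_codes[] = {
	[EX_RAW_LD_IMM]	= EX_S_LD_IMM,
	[EX_RAW_ADD_K]	= EX_S_ADD_K,
	[EX_RAW_RET_K]	= EX_S_RET_K,
	[EX_RAW_RET_A]	= EX_S_RET_A,
};

/* Map a raw opcode to its dense form; returns 0 for unknown opcodes. */
static uint8_t ex_translate(uint16_t raw)
{
	if (raw >= sizeof(ex_codes) / sizeof(ex_codes[0]))
		return EX_S_INVALID;
	return ex_codes[raw];
}

/* Interpret already translated opcodes; the case labels are consecutive. */
static uint32_t ex_run(const uint8_t *dense, const uint32_t *k, size_t n)
{
	uint32_t A = 0;

	for (size_t i = 0; i < n; i++) {
		switch (dense[i]) {
		case EX_S_LD_IMM:
			A = k[i];
			continue;
		case EX_S_ADD_K:
			A += k[i];
			continue;
		case EX_S_RET_K:
			return k[i];
		case EX_S_RET_A:
			return A;
		default:
			return 0;
		}
	}
	return 0;
}
```

Translating once at load time keeps the per-packet loop free of validity checks on the raw encoding, which is the same division of labour as between sk_chk_filter() and sk_run_filter().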
2010-06-19 17:05:36 +00:00
|
|
|
break;
|
|
|
|
case BPF_S_JMP_JEQ_K:
|
|
|
|
case BPF_S_JMP_JEQ_X:
|
|
|
|
case BPF_S_JMP_JGE_K:
|
|
|
|
case BPF_S_JMP_JGE_X:
|
|
|
|
case BPF_S_JMP_JGT_K:
|
|
|
|
case BPF_S_JMP_JGT_X:
|
|
|
|
case BPF_S_JMP_JSET_X:
|
|
|
|
case BPF_S_JMP_JSET_K:
|
2010-11-16 15:19:51 +00:00
|
|
|
/* for conditionals both must be safe */
|
2006-01-17 10:25:52 +00:00
|
|
|
if (pc + ftest->jt + 1 >= flen ||
|
2006-01-04 21:58:36 +00:00
|
|
|
pc + ftest->jf + 1 >= flen)
|
|
|
|
return -EINVAL;
|
2010-11-16 15:19:51 +00:00
|
|
|
break;
|
2010-12-15 19:45:28 +00:00
|
|
|
case BPF_S_LD_W_ABS:
|
|
|
|
case BPF_S_LD_H_ABS:
|
|
|
|
case BPF_S_LD_B_ABS:
|
2012-12-28 10:50:17 +00:00
|
|
|
anc_found = false;
|
2010-12-15 19:45:28 +00:00
|
|
|
#define ANCILLARY(CODE) case SKF_AD_OFF + SKF_AD_##CODE: \
|
|
|
|
code = BPF_S_ANC_##CODE; \
|
2012-12-28 10:50:17 +00:00
|
|
|
anc_found = true; \
|
2010-12-15 19:45:28 +00:00
|
|
|
break
|
|
|
|
switch (ftest->k) {
|
|
|
|
ANCILLARY(PROTOCOL);
|
|
|
|
ANCILLARY(PKTTYPE);
|
|
|
|
ANCILLARY(IFINDEX);
|
|
|
|
ANCILLARY(NLATTR);
|
|
|
|
ANCILLARY(NLATTR_NEST);
|
|
|
|
ANCILLARY(MARK);
|
|
|
|
ANCILLARY(QUEUE);
|
|
|
|
ANCILLARY(HATYPE);
|
|
|
|
ANCILLARY(RXHASH);
|
|
|
|
ANCILLARY(CPU);
|
2012-03-31 11:01:20 +00:00
|
|
|
ANCILLARY(ALU_XOR_X);
|
2012-10-27 02:26:17 +00:00
|
|
|
ANCILLARY(VLAN_TAG);
|
|
|
|
ANCILLARY(VLAN_TAG_PRESENT);
|
2013-03-19 06:39:31 +00:00
|
|
|
ANCILLARY(PAY_OFFSET);
|
2010-12-15 19:45:28 +00:00
|
|
|
}
|
2012-12-28 10:50:17 +00:00
|
|
|
|
|
|
|
/* ancillary operation unknown or unsupported */
|
|
|
|
if (anc_found == false && ftest->k >= SKF_AD_OFF)
|
|
|
|
return -EINVAL;
|
2010-06-19 17:05:36 +00:00
|
|
|
}
|
2010-11-16 15:19:51 +00:00
|
|
|
ftest->code = code;
|
2010-06-19 17:05:36 +00:00
|
|
|
}
|
2006-01-04 21:58:36 +00:00
|
|
|
|
2010-06-19 17:05:36 +00:00
|
|
|
/* last instruction must be a RET code */
|
|
|
|
switch (filter[flen - 1].code) {
|
|
|
|
case BPF_S_RET_K:
|
|
|
|
case BPF_S_RET_A:
|
2010-12-01 20:46:24 +00:00
|
|
|
return check_load_and_stores(filter, flen);
|
2010-11-16 15:19:51 +00:00
|
|
|
}
|
|
|
|
return -EINVAL;
|
2005-04-16 22:20:36 +00:00
|
|
|
}
|
2008-04-10 08:33:47 +00:00
|
|
|
EXPORT_SYMBOL(sk_chk_filter);
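The opcode translation and the final RET check described above can be illustrated in user space. This is a minimal sketch, not the kernel's table: `struct insn`, `enum dense_op`, `dense_codes` and `check_and_translate()` are hypothetical names, and only three opcodes are covered. The user-visible values (0x00, 0x06, 0x16) follow the classic BPF encoding.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Sparse, user-visible classic BPF encodings (class | mode | source). */
#define U_LD_IMM 0x00 /* BPF_LD  | BPF_IMM */
#define U_RET_K  0x06 /* BPF_RET | BPF_K   */
#define U_RET_A  0x16 /* BPF_RET | BPF_A   */

/* Hypothetical dense internal codes; 0 is reserved for "unknown". */
enum dense_op { OP_INVALID = 0, OP_RET_K, OP_RET_A, OP_LD_IMM };

/* Simplified instruction: the jt/jf fields are left out of this sketch. */
struct insn {
	uint16_t code;
	uint32_t k;
};

/* Lookup table indexed by the user-visible code; gaps stay OP_INVALID. */
static const uint8_t dense_codes[] = {
	[U_LD_IMM] = OP_LD_IMM,
	[U_RET_K]  = OP_RET_K,
	[U_RET_A]  = OP_RET_A,
};

/*
 * Rewrite every opcode to its dense form, then require that the last
 * instruction is a return. Returns 0 on success, -EINVAL otherwise.
 * On failure the program may be left partially translated.
 */
int check_and_translate(struct insn *prog, size_t len)
{
	size_t n = sizeof(dense_codes) / sizeof(dense_codes[0]);

	if (prog == NULL || len == 0)
		return -EINVAL;

	for (size_t i = 0; i < len; i++) {
		uint16_t code = prog[i].code;

		if (code >= n || dense_codes[code] == OP_INVALID)
			return -EINVAL;
		prog[i].code = dense_codes[code];
	}

	switch (prog[len - 1].code) {
	case OP_RET_K:
	case OP_RET_A:
		return 0;
	}
	return -EINVAL;
}
```

Because the internal codes are consecutive small integers, a switch over them is a natural candidate for a compiler-generated jump table.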
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2007-10-18 04:22:42 +00:00
|
|
|
/**
|
2010-12-06 17:29:43 +00:00
|
|
|
* sk_filter_release_rcu - Release a socket filter by rcu_head
|
2007-10-18 04:22:42 +00:00
|
|
|
* @rcu: rcu_head that contains the sk_filter to free
|
|
|
|
*/
|
2010-12-06 17:29:43 +00:00
|
|
|
void sk_filter_release_rcu(struct rcu_head *rcu)
|
2007-10-18 04:22:42 +00:00
|
|
|
{
|
|
|
|
struct sk_filter *fp = container_of(rcu, struct sk_filter, rcu);
|
|
|
|
|
2011-04-20 09:27:32 +00:00
|
|
|
bpf_jit_free(fp);
|
2010-12-06 17:29:43 +00:00
|
|
|
kfree(fp);
|
2007-10-18 04:22:42 +00:00
|
|
|
}
|
2010-12-06 17:29:43 +00:00
|
|
|
EXPORT_SYMBOL(sk_filter_release_rcu);
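sk_filter_release_rcu() recovers the enclosing filter from a pointer to its embedded rcu_head. The pointer arithmetic behind that can be sketched in user space as follows. `struct callback_head`, `struct filter` and `filter_from_rcu()` are stand-ins invented for this example, and the macro is a simplified version without the kernel's type checking.

```c
#include <stddef.h>

/* Simplified container_of: step back from a member to its parent struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for struct rcu_head (an assumption for this sketch). */
struct callback_head {
	void (*func)(struct callback_head *head);
};

/* Stand-in for struct sk_filter with an embedded callback head. */
struct filter {
	unsigned int len;
	struct callback_head rcu;
};

/* Map the embedded member back to the object that contains it. */
struct filter *filter_from_rcu(struct callback_head *rcu)
{
	return container_of(rcu, struct filter, rcu);
}
```

The callback only ever receives the embedded member, so the offset subtraction is what lets it reach the whole object and free it.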
|
2007-10-18 04:22:42 +00:00
|
|
|
|
2012-03-31 11:01:19 +00:00
|
|
|
static int __sk_prepare_filter(struct sk_filter *fp)
|
|
|
|
{
|
|
|
|
int err;
|
|
|
|
|
|
|
|
fp->bpf_func = sk_run_filter;
|
|
|
|
|
|
|
|
err = sk_chk_filter(fp->insns, fp->len);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
bpf_jit_compile(fp);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* sk_unattached_filter_create - create an unattached filter
|
|
|
|
* @fprog: the filter program
|
2012-06-08 14:01:44 +00:00
|
|
|
* @pfp: the unattached filter that is created
|
2012-03-31 11:01:19 +00:00
|
|
|
*
|
2012-06-08 14:01:44 +00:00
|
|
|
* Create a filter independent of any socket. We first run some
|
2012-03-31 11:01:19 +00:00
|
|
|
* sanity checks on it to make sure it does not explode on us later.
|
|
|
|
* If an error occurs or there is insufficient memory for the filter
|
|
|
|
* a negative errno code is returned. On success the return is zero.
|
|
|
|
*/
|
|
|
|
int sk_unattached_filter_create(struct sk_filter **pfp,
|
|
|
|
struct sock_fprog *fprog)
|
|
|
|
{
|
|
|
|
struct sk_filter *fp;
|
|
|
|
unsigned int fsize = sizeof(struct sock_filter) * fprog->len;
|
|
|
|
int err;
|
|
|
|
|
|
|
|
/* Make sure new filter is there and in the right amounts. */
|
|
|
|
if (fprog->filter == NULL)
|
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
fp = kmalloc(fsize + sizeof(*fp), GFP_KERNEL);
|
|
|
|
if (!fp)
|
|
|
|
return -ENOMEM;
|
|
|
|
memcpy(fp->insns, fprog->filter, fsize);
|
|
|
|
|
|
|
|
atomic_set(&fp->refcnt, 1);
|
|
|
|
fp->len = fprog->len;
|
|
|
|
|
|
|
|
err = __sk_prepare_filter(fp);
|
|
|
|
if (err)
|
|
|
|
goto free_mem;
|
|
|
|
|
|
|
|
*pfp = fp;
|
|
|
|
return 0;
|
|
|
|
free_mem:
|
|
|
|
kfree(fp);
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(sk_unattached_filter_create);
|
|
|
|
|
|
|
|
void sk_unattached_filter_destroy(struct sk_filter *fp)
|
|
|
|
{
|
|
|
|
sk_filter_release(fp);
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(sk_unattached_filter_destroy);
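sk_unattached_filter_create() allocates the filter header and its instruction array in a single block (fsize + sizeof(*fp)). A user-space sketch of that layout, using a flexible array member, might look like the following. `struct insn`, `struct prog`, `prog_create()` and `prog_destroy()` are hypothetical names, and malloc/free stand in for kmalloc/kfree. Unlike the kernel function, this sketch rejects an empty program up front.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same field layout as a classic BPF instruction. */
struct insn {
	uint16_t code;
	uint8_t jt;
	uint8_t jf;
	uint32_t k;
};

/* Header followed directly by the instructions, in one allocation. */
struct prog {
	unsigned int refcnt;
	unsigned int len;
	struct insn insns[]; /* flexible array member */
};

/* Copy len instructions from src into a freshly allocated program. */
struct prog *prog_create(const struct insn *src, unsigned int len)
{
	size_t fsize = sizeof(struct insn) * (size_t)len;
	struct prog *p;

	if (src == NULL || len == 0)
		return NULL;

	p = malloc(sizeof(*p) + fsize);
	if (p == NULL)
		return NULL;

	memcpy(p->insns, src, fsize);
	p->refcnt = 1;
	p->len = len;
	return p;
}

/* Release a program created by prog_create(). */
void prog_destroy(struct prog *p)
{
	free(p);
}
```

A single allocation means a single free on every error path, which is why the kernel function needs only one free_mem label.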
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/**
|
|
|
|
* sk_attach_filter - attach a socket filter
|
|
|
|
* @fprog: the filter program
|
|
|
|
* @sk: the socket to use
|
|
|
|
*
|
|
|
|
* Attach the user's filter code. We first run some sanity checks on
|
|
|
|
* it to make sure it does not explode on us later. If an error
|
|
|
|
* occurs or there is insufficient memory for the filter a negative
|
|
|
|
* errno code is returned. On success the return is zero.
|
|
|
|
*/
|
|
|
|
int sk_attach_filter(struct sock_fprog *fprog, struct sock *sk)
|
|
|
|
{
|
2007-10-18 04:22:17 +00:00
|
|
|
struct sk_filter *fp, *old_fp;
|
2005-04-16 22:20:36 +00:00
|
|
|
unsigned int fsize = sizeof(struct sock_filter) * fprog->len;
|
|
|
|
int err;
|
|
|
|
|
2013-01-16 21:55:49 +00:00
|
|
|
if (sock_flag(sk, SOCK_FILTER_LOCKED))
|
|
|
|
return -EPERM;
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/* Make sure new filter is there and in the right amounts. */
|
2006-01-17 10:25:52 +00:00
|
|
|
if (fprog->filter == NULL)
|
|
|
|
return -EINVAL;
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
fp = sock_kmalloc(sk, fsize+sizeof(*fp), GFP_KERNEL);
|
|
|
|
if (!fp)
|
|
|
|
return -ENOMEM;
|
|
|
|
if (copy_from_user(fp->insns, fprog->filter, fsize)) {
|
2007-02-09 14:24:36 +00:00
|
|
|
sock_kfree_s(sk, fp, fsize+sizeof(*fp));
|
2005-04-16 22:20:36 +00:00
|
|
|
return -EFAULT;
|
|
|
|
}
|
|
|
|
|
|
|
|
atomic_set(&fp->refcnt, 1);
|
|
|
|
fp->len = fprog->len;
|
|
|
|
|
2012-03-31 11:01:19 +00:00
|
|
|
err = __sk_prepare_filter(fp);
|
2007-10-18 04:22:17 +00:00
|
|
|
if (err) {
|
|
|
|
sk_filter_uncharge(sk, fp);
|
|
|
|
return err;
|
2005-04-16 22:20:36 +00:00
|
|
|
}
|
|
|
|
|
2010-09-27 06:07:30 +00:00
|
|
|
old_fp = rcu_dereference_protected(sk->sk_filter,
|
|
|
|
sock_owned_by_user(sk));
|
2007-10-18 04:22:17 +00:00
|
|
|
rcu_assign_pointer(sk->sk_filter, fp);
|
|
|
|
|
[NET]: Fix bug in sk_filter race cures.
Looks like this might be causing problems, at least for me on ppc. This
happened during a normal boot, right around first interface config/dhcp
run..
cpu 0x0: Vector: 300 (Data Access) at [c00000000147b820]
pc: c000000000435e5c: .sk_filter_delayed_uncharge+0x1c/0x60
lr: c0000000004360d0: .sk_attach_filter+0x170/0x180
sp: c00000000147baa0
msr: 9000000000009032
dar: 4
dsisr: 40000000
current = 0xc000000004780fa0
paca = 0xc000000000650480
pid = 1295, comm = dhclient3
0:mon> t
[c00000000147bb20] c0000000004360d0 .sk_attach_filter+0x170/0x180
[c00000000147bbd0] c000000000418988 .sock_setsockopt+0x788/0x7f0
[c00000000147bcb0] c000000000438a74 .compat_sys_setsockopt+0x4e4/0x5a0
[c00000000147bd90] c00000000043955c .compat_sys_socketcall+0x25c/0x2b0
[c00000000147be30] c000000000007508 syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 000000000ff618d8
SP (fffdf040) is in userspace
0:mon>
I.e. null pointer deref at sk_filter_delayed_uncharge+0x1c:
0:mon> di $.sk_filter_delayed_uncharge
c000000000435e40 7c0802a6 mflr r0
c000000000435e44 fbc1fff0 std r30,-16(r1)
c000000000435e48 7c8b2378 mr r11,r4
c000000000435e4c ebc2cdd0 ld r30,-12848(r2)
c000000000435e50 f8010010 std r0,16(r1)
c000000000435e54 f821ff81 stdu r1,-128(r1)
c000000000435e58 380300a4 addi r0,r3,164
c000000000435e5c 81240004 lwz r9,4(r4)
That's the deref of fp:
static void sk_filter_delayed_uncharge(struct sock *sk, struct sk_filter *fp)
{
unsigned int size = sk_filter_len(fp);
...
That is called from sk_attach_filter():
...
rcu_read_lock_bh();
old_fp = rcu_dereference(sk->sk_filter);
rcu_assign_pointer(sk->sk_filter, fp);
rcu_read_unlock_bh();
sk_filter_delayed_uncharge(sk, old_fp);
return 0;
...
So, looks like rcu_dereference() returned NULL. I don't know the
filter code at all, but it seems like it might be a valid case?
sk_detach_filter() seems to handle a NULL sk_filter, at least.
So, this needs review by someone who knows the filter, but it fixes the
problem for me:
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-19 04:48:39 +00:00
|
|
|
if (old_fp)
|
2010-12-06 17:29:43 +00:00
|
|
|
sk_filter_uncharge(sk, old_fp);
|
2007-10-18 04:22:17 +00:00
|
|
|
return 0;
|
2005-04-16 22:20:36 +00:00
|
|
|
}
|
2010-02-14 01:01:00 +00:00
|
|
|
EXPORT_SYMBOL_GPL(sk_attach_filter);
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2007-10-18 04:21:26 +00:00
|
|
|
int sk_detach_filter(struct sock *sk)
|
|
|
|
{
|
|
|
|
int ret = -ENOENT;
|
|
|
|
struct sk_filter *filter;
|
|
|
|
|
2013-01-16 21:55:49 +00:00
|
|
|
if (sock_flag(sk, SOCK_FILTER_LOCKED))
|
|
|
|
return -EPERM;
|
|
|
|
|
2010-09-27 06:07:30 +00:00
|
|
|
filter = rcu_dereference_protected(sk->sk_filter,
|
|
|
|
sock_owned_by_user(sk));
|
2007-10-18 04:21:26 +00:00
|
|
|
if (filter) {
|
2011-08-01 16:19:00 +00:00
|
|
|
RCU_INIT_POINTER(sk->sk_filter, NULL);
|
2010-12-06 17:29:43 +00:00
|
|
|
sk_filter_uncharge(sk, filter);
|
2007-10-18 04:21:26 +00:00
|
|
|
ret = 0;
|
|
|
|
}
|
|
|
|
return ret;
|
|
|
|
}
|
2010-02-14 01:01:00 +00:00
|
|
|
EXPORT_SYMBOL_GPL(sk_detach_filter);
|
sk-filter: Add ability to get socket filter program (v2)
The SO_ATTACH_FILTER option is set-only. I propose adding the get
ability by using SO_ATTACH_FILTER in getsockopt. To be easier on the
eyes, the SO_GET_FILTER alias for it is declared. This
ability is required by the checkpoint-restore project to be able to
save the full state of a socket.
There are two issues with getting the filter back.
First, the kernel modifies sock_filter->code on filter load, so in
order to return a filter element to the user we have to decode it
into user-visible constants. Fortunately, the modification in question
is reversible.
Second, the BPF_S_ALU_DIV_K code modifies the command argument k to
speed up the run-time division by doing kernel_k = reciprocal(user_k).
The bad news is that different user_k values may result in the same
kernel_k, so we can't get the original user_k back. The good news is
that we don't have to. What we need to do is calculate a user2_k such that
reciprocal(user2_k) == reciprocal(user_k) == kernel_k
i.e. if it is loaded again, the recompiled value will be exactly
the same as it was. That said, user2_k can be calculated like this:
user2_k = reciprocal(kernel_k)
with one exception: if kernel_k == 0, then user2_k == 1.
The optlen argument is treated like this: when zero, the kernel returns
the number of sock_filter elements in the filter; otherwise it should be
large enough for the sock_filter array.
changes since v1:
* Declared SO_GET_FILTER in all arch headers
* Added decode of vlan-tag codes
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-01 02:01:48 +00:00
|
|
|
|
2013-06-05 13:30:55 +00:00
|
|
|
void sk_decode_filter(struct sock_filter *filt, struct sock_filter *to)
|
2012-11-01 02:01:48 +00:00
|
|
|
{
|
|
|
|
static const u16 decodes[] = {
|
|
|
|
[BPF_S_ALU_ADD_K] = BPF_ALU|BPF_ADD|BPF_K,
|
|
|
|
[BPF_S_ALU_ADD_X] = BPF_ALU|BPF_ADD|BPF_X,
|
|
|
|
[BPF_S_ALU_SUB_K] = BPF_ALU|BPF_SUB|BPF_K,
|
|
|
|
[BPF_S_ALU_SUB_X] = BPF_ALU|BPF_SUB|BPF_X,
|
|
|
|
[BPF_S_ALU_MUL_K] = BPF_ALU|BPF_MUL|BPF_K,
|
|
|
|
[BPF_S_ALU_MUL_X] = BPF_ALU|BPF_MUL|BPF_X,
|
|
|
|
[BPF_S_ALU_DIV_X] = BPF_ALU|BPF_DIV|BPF_X,
|
|
|
|
[BPF_S_ALU_MOD_K] = BPF_ALU|BPF_MOD|BPF_K,
|
|
|
|
[BPF_S_ALU_MOD_X] = BPF_ALU|BPF_MOD|BPF_X,
|
|
|
|
[BPF_S_ALU_AND_K] = BPF_ALU|BPF_AND|BPF_K,
|
|
|
|
[BPF_S_ALU_AND_X] = BPF_ALU|BPF_AND|BPF_X,
|
|
|
|
[BPF_S_ALU_OR_K] = BPF_ALU|BPF_OR|BPF_K,
|
|
|
|
[BPF_S_ALU_OR_X] = BPF_ALU|BPF_OR|BPF_X,
|
|
|
|
[BPF_S_ALU_XOR_K] = BPF_ALU|BPF_XOR|BPF_K,
|
|
|
|
[BPF_S_ALU_XOR_X] = BPF_ALU|BPF_XOR|BPF_X,
|
|
|
|
[BPF_S_ALU_LSH_K] = BPF_ALU|BPF_LSH|BPF_K,
|
|
|
|
[BPF_S_ALU_LSH_X] = BPF_ALU|BPF_LSH|BPF_X,
|
|
|
|
[BPF_S_ALU_RSH_K] = BPF_ALU|BPF_RSH|BPF_K,
|
|
|
|
[BPF_S_ALU_RSH_X] = BPF_ALU|BPF_RSH|BPF_X,
|
|
|
|
[BPF_S_ALU_NEG] = BPF_ALU|BPF_NEG,
|
|
|
|
[BPF_S_LD_W_ABS] = BPF_LD|BPF_W|BPF_ABS,
|
|
|
|
[BPF_S_LD_H_ABS] = BPF_LD|BPF_H|BPF_ABS,
|
|
|
|
[BPF_S_LD_B_ABS] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_PROTOCOL] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_PKTTYPE] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_IFINDEX] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_NLATTR] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_NLATTR_NEST] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_MARK] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_QUEUE] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_HATYPE] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_RXHASH] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_CPU] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_ALU_XOR_X] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_SECCOMP_LD_W] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_VLAN_TAG] = BPF_LD|BPF_B|BPF_ABS,
|
|
|
|
[BPF_S_ANC_VLAN_TAG_PRESENT] = BPF_LD|BPF_B|BPF_ABS,
|
filter: add ANC_PAY_OFFSET instruction for loading payload start offset
It is very useful to do dynamic truncation of packets. In particular,
we're interested in pushing the necessary header bytes to user space and
cutting off user payload that should probably not be transferred for some
reason (e.g. privacy, speed, or others). With the ancillary extension
PAY_OFFSET, we can load the payload start offset into the accumulator and
return it. E.g. in bpfc syntax ...
ld #poff ; { 0x20, 0, 0, 0xfffff034 },
ret a ; { 0x16, 0, 0, 0x00000000 },
... as a filter will accomplish this without any big hackery in
the BPF filter itself. Follow-up JIT implementations are welcome.
Thanks to Eric Dumazet for suggesting and discussing this during the
Netfilter Workshop in Copenhagen.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-19 06:39:31 +00:00
|
|
|
[BPF_S_ANC_PAY_OFFSET] = BPF_LD|BPF_B|BPF_ABS,
|
2012-11-01 02:01:48 +00:00
|
|
|
[BPF_S_LD_W_LEN] = BPF_LD|BPF_W|BPF_LEN,
|
|
|
|
[BPF_S_LD_W_IND] = BPF_LD|BPF_W|BPF_IND,
|
|
|
|
[BPF_S_LD_H_IND] = BPF_LD|BPF_H|BPF_IND,
|
|
|
|
[BPF_S_LD_B_IND] = BPF_LD|BPF_B|BPF_IND,
|
|
|
|
[BPF_S_LD_IMM] = BPF_LD|BPF_IMM,
|
|
|
|
[BPF_S_LDX_W_LEN] = BPF_LDX|BPF_W|BPF_LEN,
|
|
|
|
[BPF_S_LDX_B_MSH] = BPF_LDX|BPF_B|BPF_MSH,
|
|
|
|
[BPF_S_LDX_IMM] = BPF_LDX|BPF_IMM,
|
|
|
|
[BPF_S_MISC_TAX] = BPF_MISC|BPF_TAX,
|
|
|
|
[BPF_S_MISC_TXA] = BPF_MISC|BPF_TXA,
|
|
|
|
[BPF_S_RET_K] = BPF_RET|BPF_K,
|
|
|
|
[BPF_S_RET_A] = BPF_RET|BPF_A,
|
|
|
|
[BPF_S_ALU_DIV_K] = BPF_ALU|BPF_DIV|BPF_K,
|
|
|
|
[BPF_S_LD_MEM] = BPF_LD|BPF_MEM,
|
|
|
|
[BPF_S_LDX_MEM] = BPF_LDX|BPF_MEM,
|
|
|
|
[BPF_S_ST] = BPF_ST,
|
|
|
|
[BPF_S_STX] = BPF_STX,
|
|
|
|
[BPF_S_JMP_JA] = BPF_JMP|BPF_JA,
|
|
|
|
[BPF_S_JMP_JEQ_K] = BPF_JMP|BPF_JEQ|BPF_K,
|
|
|
|
[BPF_S_JMP_JEQ_X] = BPF_JMP|BPF_JEQ|BPF_X,
|
|
|
|
[BPF_S_JMP_JGE_K] = BPF_JMP|BPF_JGE|BPF_K,
|
|
|
|
[BPF_S_JMP_JGE_X] = BPF_JMP|BPF_JGE|BPF_X,
|
|
|
|
[BPF_S_JMP_JGT_K] = BPF_JMP|BPF_JGT|BPF_K,
|
|
|
|
[BPF_S_JMP_JGT_X] = BPF_JMP|BPF_JGT|BPF_X,
|
|
|
|
[BPF_S_JMP_JSET_K] = BPF_JMP|BPF_JSET|BPF_K,
|
|
|
|
[BPF_S_JMP_JSET_X] = BPF_JMP|BPF_JSET|BPF_X,
|
|
|
|
};
|
|
|
|
u16 code;
|
|
|
|
|
|
|
|
code = filt->code;
|
|
|
|
|
|
|
|
to->code = decodes[code];
|
|
|
|
to->jt = filt->jt;
|
|
|
|
to->jf = filt->jf;
|
|
|
|
|
|
|
|
if (code == BPF_S_ALU_DIV_K) {
|
|
|
|
/*
|
|
|
|
* When loaded this rule user gave us X, which was
|
|
|
|
* translated into R = r(X). Now we calculate the
|
|
|
|
* RR = r(R) and report it back. If next time this
|
|
|
|
* value is loaded and RRR = r(RR) is calculated
|
|
|
|
* then the R == RRR will be true.
|
|
|
|
*
|
|
|
|
* One exception. X == 1 translates into R == 0 and
|
|
|
|
* we can't calculate RR out of it with r().
|
|
|
|
*/
|
|
|
|
|
|
|
|
if (filt->k == 0)
|
|
|
|
to->k = 1;
|
|
|
|
else
|
|
|
|
to->k = reciprocal_value(filt->k);
|
|
|
|
|
|
|
|
BUG_ON(reciprocal_value(to->k) != filt->k);
|
|
|
|
} else
|
|
|
|
to->k = filt->k;
|
|
|
|
}
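The R == RRR property stated in the comment above can be checked with a small user-space model. This sketch assumes the reciprocal is computed as ceil(2^32 / k) truncated to 32 bits, which is consistent with "X == 1 translates into R == 0". `reciprocal_model()` and `decode_div_k()` are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the 32-bit reciprocal: ceil(2^32 / k), truncated to 32 bits. */
uint32_t reciprocal_model(uint32_t k)
{
	uint64_t val;

	assert(k != 0); /* division by zero is rejected when a filter is loaded */
	val = (1ULL << 32) + (k - 1);
	return (uint32_t)(val / k);
}

/*
 * Value reported back to user space for a stored reciprocal r.
 * r == 0 can only have come from k == 1, so it is special-cased.
 */
uint32_t decode_div_k(uint32_t r)
{
	return r == 0 ? 1 : reciprocal_model(r);
}
```

For example, a user-supplied 3 is stored as 1431655766 and reported back as 3, while any divisor above 2^31 is stored as 2 and reported back as 2^31. Reloading the reported value yields the same stored reciprocal in both cases.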
|
|
|
|
|
|
|
|
int sk_get_filter(struct sock *sk, struct sock_filter __user *ubuf, unsigned int len)
|
|
|
|
{
|
|
|
|
struct sk_filter *filter;
|
|
|
|
int i, ret;
|
|
|
|
|
|
|
|
lock_sock(sk);
|
|
|
|
filter = rcu_dereference_protected(sk->sk_filter,
|
|
|
|
sock_owned_by_user(sk));
|
|
|
|
ret = 0;
|
|
|
|
if (!filter)
|
|
|
|
goto out;
|
|
|
|
ret = filter->len;
|
|
|
|
if (!len)
|
|
|
|
goto out;
|
|
|
|
ret = -EINVAL;
|
|
|
|
if (len < filter->len)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
ret = -EFAULT;
|
|
|
|
for (i = 0; i < filter->len; i++) {
|
|
|
|
struct sock_filter fb;
|
|
|
|
|
|
|
|
sk_decode_filter(&filter->insns[i], &fb);
|
|
|
|
if (copy_to_user(&ubuf[i], &fb, sizeof(fb)))
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
ret = filter->len;
|
|
|
|
out:
|
|
|
|
release_sock(sk);
|
|
|
|
return ret;
|
|
|
|
}
|