4fbae7d83c
In a typical IPvlan L3 setup, the master is in the default namespace (default-ns) and each slave is in a different (slave) namespace. In this setup, egress packet processing for traffic originating from a slave-ns hits all NF_HOOKs in the slave-ns as well as in the default-ns. The same is not true for ingress processing: the NF_HOOKs are hit only in the slave-ns and are skipped in the default-ns. IPvlan in L3 mode is restrictive, and if admins want to deploy iptables rules in the default-ns, this asymmetric data path makes it impossible to do so.

This patch makes use of l3_rcv() (added as part of the l3mdev enhancements) to perform the input route lookup on RX packets without changing skb->dev, and then uses an nf_hook at NF_INET_LOCAL_IN to change skb->dev just before handing the skb over to L4.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
CC: David Ahern <dsa@cumulusnetworks.com>
Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

IPVLAN Driver HOWTO

Initial Release:
    Mahesh Bandewar <maheshb AT google.com>

1. Introduction:
This is conceptually very similar to the macvlan driver with one major
exception: it uses L3 for mux-ing / demux-ing among slaves. This property makes
the master device share the L2 with its slave devices. I have developed this
driver in conjunction with network namespaces and am not sure if there is a use
case outside of it.

2. Building and Installation:
In order to build the driver, please select the config item CONFIG_IPVLAN.
The driver can be built into the kernel (CONFIG_IPVLAN=y) or as a module
(CONFIG_IPVLAN=m).

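As a quick check of the build option described above, the following sketch
reports how CONFIG_IPVLAN is set in a given kernel config file. The function
name ipvlan_config_state is hypothetical, and the location of the config file
(for example /boot/config-$(uname -r)) is an assumption that varies between
distributions.

```shell
#!/bin/sh
# Sketch: report how CONFIG_IPVLAN is set in a kernel config file.
# ipvlan_config_state is a hypothetical helper, not part of any tool.
# Prints "built-in", "module", or "disabled".
ipvlan_config_state() {
    config_file=$1
    # Match only the exact option, not e.g. CONFIG_IPVLAN_L3S.
    case $(grep -E '^CONFIG_IPVLAN=' "$config_file" 2>/dev/null) in
        CONFIG_IPVLAN=y) echo "built-in" ;;
        CONFIG_IPVLAN=m) echo "module" ;;
        *)               echo "disabled" ;;
    esac
}
```

If the result is "module", the driver can usually be loaded with
"modprobe ipvlan" before creating any slave devices.
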
3. Configuration:
There are no module parameters for this driver and it can be configured
using the iproute2 ip utility.

    ip link add link <master-dev> <slave-dev> type ipvlan mode { l2 | l3 | l3s }

    e.g. ip link add link eth0 ipvl0 type ipvlan mode l2

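To make the argument order of the syntax above explicit (master device first,
then the new slave device), here is a minimal sketch that builds the command
line and rejects unknown modes. The helper name ipvlan_add_cmd is hypothetical;
it only prints the command and does not run it.

```shell
#!/bin/sh
# Sketch: print the "ip link add" command for an ipvlan slave.
# Usage: ipvlan_add_cmd MASTER SLAVE MODE
# ipvlan_add_cmd is a hypothetical helper name.
ipvlan_add_cmd() {
    case $3 in
        l2|l3|l3s) ;;   # the modes listed in the syntax line above
        *) echo "unsupported ipvlan mode: $3" >&2; return 1 ;;
    esac
    # Master device comes right after "link", the new slave name follows.
    echo "ip link add link $1 $2 type ipvlan mode $3"
}
```

For example, "ipvlan_add_cmd eth0 ipvl0 l2" prints the same command as the
example above.
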
4. Operating modes:
IPvlan has three modes of operation - L2, L3 and L3S. For a given master
device, you can select one of these modes and all slaves on that master will
operate in the same (selected) mode. RX processing is almost identical across
the modes, except that in L3 mode the slaves won't receive any multicast /
broadcast traffic. L3 mode is more restrictive since routing is controlled from
the other (mostly default) namespace.

4.1 L2 mode:
In this mode TX processing happens on the stack instance attached to the
slave device, and packets are switched and queued to the master device to be
sent out. In this mode the slaves will also RX/TX multicast and broadcast (if
applicable).

4.2 L3 mode:
In this mode TX processing up to L3 happens on the stack instance attached
to the slave device, and packets are then switched to the stack instance of the
master device for L2 processing; routing from that instance is used before
packets are queued on the outbound device. In this mode the slaves can neither
receive nor send multicast / broadcast traffic.

4.3 L3S mode:
This is very similar to the L3 mode except that iptables (conn-tracking)
works in this mode, hence the name L3-symmetric (L3s). This mode has slightly
lower performance, but that shouldn't matter since you are choosing it over
plain L3 mode to make conn-tracking work.

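The trade-offs between the modes described above can be summarized in a small
helper. This is only a sketch: ipvlan_pick_mode and its yes/no arguments are
hypothetical, and preferring L2 whenever multicast or broadcast is needed
reflects only what this document states about each mode.

```shell
#!/bin/sh
# Sketch: pick an ipvlan mode from two requirements.
# Usage: ipvlan_pick_mode NEEDS_MCAST NEEDS_CONNTRACK   (each "yes" or "no")
# ipvlan_pick_mode is a hypothetical helper name.
ipvlan_pick_mode() {
    if [ "$1" = "yes" ]; then
        echo l2     # L2 is the mode in which slaves RX/TX multicast/broadcast
    elif [ "$2" = "yes" ]; then
        echo l3s    # L3S is the L3 variant in which conn-tracking works
    else
        echo l3     # plain L3 otherwise
    fi
}
```

The printed value can be used as the mode argument of "ip link add ... type
ipvlan mode".
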
5. What to choose (macvlan vs. ipvlan)?
These two devices are very similar in many regards and the specific use
case could very well define which device to choose. If one of the following
situations describes your use case, then you can choose to use ipvlan -
(a) The Linux host that is connected to the external switch / router has
    a policy configured that allows only one MAC per port.
(b) The number of virtual devices created on a master exceeds the MAC
    capacity, putting the NIC in promiscuous mode, and degraded performance
    is a concern.
(c) The slave device is to be put into a hostile / untrusted network
    namespace where L2 on the slave could be changed / misused.

6. Example configuration:

+=============================================================+
|  Host: host1                                                |
|                                                             |
|   +----------------------+      +----------------------+    |
|   |   NS:ns0             |      |   NS:ns1             |    |
|   |                      |      |                      |    |
|   |                      |      |                      |    |
|   |        ipvl0         |      |         ipvl1        |    |
|   +----------#-----------+      +-----------#----------+    |
|              #                              #               |
|              ################################               |
|                              # eth0                         |
+==============================#==============================+

(a) Create two network namespaces - ns0, ns1
    ip netns add ns0
    ip netns add ns1

(b) Create two ipvlan slaves on eth0 (master device)
    ip link add link eth0 ipvl0 type ipvlan mode l2
    ip link add link eth0 ipvl1 type ipvlan mode l2

(c) Assign slaves to the respective network namespaces
    ip link set dev ipvl0 netns ns0
    ip link set dev ipvl1 netns ns1

(d) Now switch to the namespace (ns0 or ns1) to configure the slave devices
    - For ns0
        (1) ip netns exec ns0 bash
        (2) ip link set dev ipvl0 up
        (3) ip link set dev lo up
        (4) ip -4 addr add 127.0.0.1 dev lo
        (5) ip -4 addr add $IPADDR dev ipvl0
        (6) ip -4 route add default via $ROUTER dev ipvl0
    - For ns1
        (1) ip netns exec ns1 bash
        (2) ip link set dev ipvl1 up
        (3) ip link set dev lo up
        (4) ip -4 addr add 127.0.0.1 dev lo
        (5) ip -4 addr add $IPADDR dev ipvl1
        (6) ip -4 route add default via $ROUTER dev ipvl1

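Steps (a) through (d) can also be run non-interactively. The sketch below wraps
them in one function per namespace and uses "ip netns exec <ns> ip ..." in
place of the interactive bash session from step (d)(1). The helper names run
and setup_ipvlan_ns and the DRY_RUN variable are hypothetical; the addresses
passed in are whatever you would use for $IPADDR and $ROUTER.

```shell
#!/bin/sh
# Sketch only: wraps steps (a)-(d) above for one namespace.
# run() and setup_ipvlan_ns() are hypothetical helper names.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: setup_ipvlan_ns MASTER NS SLAVE IPADDR ROUTER
# Stops at the first failing command and returns its status.
setup_ipvlan_ns() {
    master=$1; ns=$2; slave=$3; ipaddr=$4; router=$5

    run ip netns add "$ns" &&                                        # (a)
    run ip link add link "$master" "$slave" type ipvlan mode l2 &&   # (b)
    run ip link set dev "$slave" netns "$ns" &&                      # (c)
    run ip netns exec "$ns" ip link set dev "$slave" up &&           # (d)(2)
    run ip netns exec "$ns" ip link set dev lo up &&                 # (d)(3)
    run ip netns exec "$ns" ip -4 addr add 127.0.0.1 dev lo &&       # (d)(4)
    run ip netns exec "$ns" ip -4 addr add "$ipaddr" dev "$slave" && # (d)(5)
    run ip netns exec "$ns" ip -4 route add default via "$router" dev "$slave"  # (d)(6)
}
```

For example, with DRY_RUN=1 set, calling
setup_ipvlan_ns eth0 ns0 ipvl0 "$IPADDR" "$ROUTER" prints the eight commands
without touching the system. Without DRY_RUN the commands are executed, which
typically requires root privileges.
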