From f3a81fae20b792b7905a13514aa072678929460a Mon Sep 17 00:00:00 2001 Message-Id: From: Laine Stump Date: Mon, 15 Dec 2014 10:51:27 -0500 Subject: [PATCH] conf: new network bridge device attribute macTableManager This is part of the fix for: https://bugzilla.redhat.com/show_bug.cgi?id=1099210 The macTableManager attribute of a network's bridge subelement tells libvirt how the bridge's MAC address table (used to determine the egress port for packets) is managed. In the default mode, "kernel", management is left to the kernel, which usually determines entries in part by turning on promiscuous mode on all ports of the bridge, flooding packets to all ports when the correct destination is unknown, and adding/removing entries to the fdb as it sees incoming traffic from particular MAC addresses. In "libvirt" mode, libvirt turns off learning and flooding on all the bridge ports connected to guest domain interfaces, and adds/removes entries according to the MAC addresses in the domain interface configurations. A side effect of turning off learning and unicast_flood on the ports of a bridge is that (with Linux kernel 3.17 and newer), the kernel can automatically turn off promiscuous mode on one or more of the bridge's ports (usually only the one interface that is used to connect the bridge to the physical network). The result is better performance (because packets aren't being flooded to all ports, and can be dropped earlier when they are of no interest) and slightly better security (a guest can still send out packets with a spoofed source MAC address, but will only receive traffic intended for the guest interface's configured MAC address). The attribute looks like this in the configuration: test ... This patch only adds the config knob, documentation, and test cases. The functionality behind this knob is added in later patches. (cherry picked from commit 40961978ee1f498f3a87baf158c7392e1ba48489) Signed-off-by: Jiri Denemark --- docs/formatnetwork.html.in | 50 ++++++++++++++++++--- docs/schemas/network.rng | 9 ++++ src/conf/network_conf.c | 51 +++++++++++++++++----- src/conf/network_conf.h | 11 +++++ src/libvirt_private.syms | 2 + tests/networkxml2xmlin/host-bridge-no-flood.xml | 6 +++ .../nat-network-explicit-flood.xml | 21 +++++++++ tests/networkxml2xmlout/host-bridge-no-flood.xml | 6 +++ .../nat-network-explicit-flood.xml | 23 ++++++++++ tests/networkxml2xmltest.c | 2 + 10 files changed, 164 insertions(+), 17 deletions(-) create mode 100644 tests/networkxml2xmlin/host-bridge-no-flood.xml create mode 100644 tests/networkxml2xmlin/nat-network-explicit-flood.xml create mode 100644 tests/networkxml2xmlout/host-bridge-no-flood.xml create mode 100644 tests/networkxml2xmlout/nat-network-explicit-flood.xml diff --git a/docs/formatnetwork.html.in b/docs/formatnetwork.html.in index dc438ae..0851aa1 100644 --- a/docs/formatnetwork.html.in +++ b/docs/formatnetwork.html.in @@ -81,7 +81,7 @@
         ...
-        <bridge name="virbr0" stp="on" delay="5"/>
+        <bridge name="virbr0" stp="on" delay="5" macTableManager="libvirt"/>
         <domain name="example.com"/>
         <forward mode="nat" dev="eth0"/>
         ...
@@ -92,18 +92,56 @@ defines the name of a bridge device which will be used to construct the virtual network. The virtual machines will be connected to this bridge device allowing them to talk to each other. The bridge device - may also be connected to the LAN. It is recommended that bridge - device names started with the prefix vir, but the name - virbr0 is reserved for the "default" virtual - network. This element should always be provided when defining + may also be connected to the LAN. When defining a new network with a <forward> mode of + "nat" or "route" (or an isolated network with - no <forward> element). + no <forward> element), libvirt will + automatically generate a unique name for the bridge device if + none is given, and this name will be permanently stored in the + network configuration so that that the same name will be used + every time the network is started. For these types of networks + (nat, routed, and isolated), a bridge name beginning with the + prefix "virbr" is recommended (and that is what is + auto-generated), but not enforced. Attribute stp specifies if Spanning Tree Protocol is 'on' or 'off' (default is 'on'). Attribute delay sets the bridge's forward delay value in seconds (default is 0). Since 0.3.0 + +

+ The macTableManager attribute of the bridge + element is used to tell libvirt how the bridge's MAC address + table (used to determine the correct egress port for packets + based on destination MAC address) will be managed. In the + default kernel setting, the kernel + automatically adds and removes entries, typically using + learning, flooding, and promiscuous mode on the bridge's + ports in order to determine the proper egress port for + packets. When macTableManager is set + to libvirt, libvirt disables kernel management + of the MAC table (in the case of the Linux host bridge, this + means enabling vlan_filtering on the bridge, and disabling + learning and unicast_filter for all bridge ports), and + explicitly adds/removes entries to the table according to + the MAC addresses in the domain interface configurations. + Allowing libvirt to manage the MAC table can improve + performance - with a Linux host bridge, for example, turning + off learning and unicast_flood on ports has its own + performance advantage, and can also lead to an additional + boost by permitting the kernel to automatically turn off + promiscuous mode on some ports of the bridge (in particular, + the port attaching the bridge to the physical + network). However, it can also cause some networking setups + to stop working (e.g. vlan tagging, multicast, + guest-initiated changes to MAC address) and is not supported + by older kernels. + Since 1.2.11, requires kernel 3.17 or + newer +

+ +
domain
diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index 2783f86..c0ef554 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -65,6 +65,15 @@ + + + + kernel + libvirt + + + + diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index b253fec..8015bf3 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -55,6 +55,10 @@ VIR_ENUM_IMPL(virNetworkForward, VIR_NETWORK_FORWARD_LAST, "none", "nat", "route", "bridge", "private", "vepa", "passthrough", "hostdev") +VIR_ENUM_IMPL(virNetworkBridgeMACTableManager, + VIR_NETWORK_BRIDGE_MAC_TABLE_MANAGER_LAST, + "default", "kernel", "libvirt") + VIR_ENUM_DECL(virNetworkForwardHostdevDevice) VIR_ENUM_IMPL(virNetworkForwardHostdevDevice, VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_LAST, @@ -2110,6 +2114,18 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) if (virXPathULong("string(./bridge[1]/@delay)", ctxt, &def->delay) < 0) def->delay = 0; + tmp = virXPathString("string(./bridge[1]/@macTableManager)", ctxt); + if (tmp) { + if ((def->macTableManager + = virNetworkBridgeMACTableManagerTypeFromString(tmp)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("Invalid macTableManager setting '%s' " + "in network '%s'"), tmp, def->name); + goto error; + } + VIR_FREE(tmp); + } + tmp = virXPathString("string(./mac[1]/@address)", ctxt); if (tmp) { if (virMacAddrParse(tmp, &def->mac) < 0) { @@ -2292,6 +2308,14 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) def->name); goto error; } + if (def->macTableManager) { + virReportError(VIR_ERR_XML_ERROR, + _("bridge macTableManager setting not allowed " + "in %s mode (network '%s')"), + virNetworkForwardTypeToString(def->forward.type), + def->name); + goto error; + } /* fall through to next case */ case VIR_NETWORK_FORWARD_BRIDGE: if (def->delay || stp) { @@ -2791,22 +2815,27 @@ virNetworkDefFormatBuf(virBufferPtr buf, virBufferAddLit(buf, "\n"); } + if (def->forward.type == VIR_NETWORK_FORWARD_NONE || - def->forward.type == VIR_NETWORK_FORWARD_NAT || - def->forward.type == VIR_NETWORK_FORWARD_ROUTE) { + def->forward.type == VIR_NETWORK_FORWARD_NAT || + def->forward.type == VIR_NETWORK_FORWARD_ROUTE || + def->bridge || def->macTableManager) { virBufferAddLit(buf, "bridge) - virBufferEscapeString(buf, " name='%s'", def->bridge); - virBufferAsprintf(buf, " stp='%s' delay='%ld'/>\n", - def->stp ? "on" : "off", - def->delay); - } else if (def->forward.type == VIR_NETWORK_FORWARD_BRIDGE && - def->bridge) { - virBufferEscapeString(buf, "\n", def->bridge); + virBufferEscapeString(buf, " name='%s'", def->bridge); + if (def->forward.type == VIR_NETWORK_FORWARD_NONE || + def->forward.type == VIR_NETWORK_FORWARD_NAT || + def->forward.type == VIR_NETWORK_FORWARD_ROUTE) { + virBufferAsprintf(buf, " stp='%s' delay='%ld'", + def->stp ? "on" : "off", def->delay); + } + if (def->macTableManager) { + virBufferAsprintf(buf, " macTableManager='%s'", + virNetworkBridgeMACTableManagerTypeToString(def->macTableManager)); + } + virBufferAddLit(buf, "/>\n"); } - if (def->mac_specified) { char macaddr[VIR_MAC_STRING_BUFLEN]; virMacAddrFormat(&def->mac, macaddr); diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index 660cd2d..8110028 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h @@ -54,6 +54,16 @@ typedef enum { } virNetworkForwardType; typedef enum { + VIR_NETWORK_BRIDGE_MAC_TABLE_MANAGER_DEFAULT = 0, + VIR_NETWORK_BRIDGE_MAC_TABLE_MANAGER_KERNEL, + VIR_NETWORK_BRIDGE_MAC_TABLE_MANAGER_LIBVIRT, + + VIR_NETWORK_BRIDGE_MAC_TABLE_MANAGER_LAST, +} virNetworkBridgeMACTableManagerType; + +VIR_ENUM_DECL(virNetworkBridgeMACTableManager) + +typedef enum { VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_NONE = 0, VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_PCI, VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_NETDEV, @@ -231,6 +241,7 @@ struct _virNetworkDef { int connections; /* # of guest interfaces connected to this network */ char *bridge; /* Name of bridge device */ + int macTableManager; /* enum virNetworkBridgeMACTableManager */ char *domain; unsigned long delay; /* Bridge forward delay (ms) */ bool stp; /* Spanning tree protocol */ diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index ec795aa..e4da444 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -525,6 +525,8 @@ virNetDevVPortTypeToString; # conf/network_conf.h virNetworkAssignDef; +virNetworkBridgeMACTableManagerTypeFromString; +virNetworkBridgeMACTableManagerTypeToString; virNetworkConfigChangeSetup; virNetworkConfigFile; virNetworkDefCopy; diff --git a/tests/networkxml2xmlin/host-bridge-no-flood.xml b/tests/networkxml2xmlin/host-bridge-no-flood.xml new file mode 100644 index 0000000..562e697 --- /dev/null +++ b/tests/networkxml2xmlin/host-bridge-no-flood.xml @@ -0,0 +1,6 @@ + + host-bridge-net + 81ff0d90-c91e-6742-64da-4a736edb9a8e + + + diff --git a/tests/networkxml2xmlin/nat-network-explicit-flood.xml b/tests/networkxml2xmlin/nat-network-explicit-flood.xml new file mode 100644 index 0000000..9738b81 --- /dev/null +++ b/tests/networkxml2xmlin/nat-network-explicit-flood.xml @@ -0,0 +1,21 @@ + + default + 81ff0d90-c91e-6742-64da-4a736edb9a9b + + + + + + + + + + + + + + + + + + diff --git a/tests/networkxml2xmlout/host-bridge-no-flood.xml b/tests/networkxml2xmlout/host-bridge-no-flood.xml new file mode 100644 index 0000000..bdbbf3b --- /dev/null +++ b/tests/networkxml2xmlout/host-bridge-no-flood.xml @@ -0,0 +1,6 @@ + + host-bridge-net + 81ff0d90-c91e-6742-64da-4a736edb9a8e + + + diff --git a/tests/networkxml2xmlout/nat-network-explicit-flood.xml b/tests/networkxml2xmlout/nat-network-explicit-flood.xml new file mode 100644 index 0000000..305c3b7 --- /dev/null +++ b/tests/networkxml2xmlout/nat-network-explicit-flood.xml @@ -0,0 +1,23 @@ + + default + 81ff0d90-c91e-6742-64da-4a736edb9a9b + + + + + + + + + + + + + + + + + + + + diff --git a/tests/networkxml2xmltest.c b/tests/networkxml2xmltest.c index 65ac591..34a5211 100644 --- a/tests/networkxml2xmltest.c +++ b/tests/networkxml2xmltest.c @@ -120,6 +120,8 @@ mymain(void) DO_TEST("hostdev"); DO_TEST_FULL("hostdev-pf", VIR_NETWORK_XML_INACTIVE); DO_TEST("passthrough-address-crash"); + DO_TEST("nat-network-explicit-flood"); + DO_TEST("host-bridge-no-flood"); return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE; } -- 2.2.0