
Commit fbdd137

vm networking: add flag vnet_hdr
When segmentation offload is enabled and unsegmented packets are sent to a VM (i.e. when running a container in the root netns), the kernel detects that the packets are larger than expected and proceeds anyway.

That is not the case for containers that have their own netns and a veth pair. There, packets reach the virtio-net interface, are forwarded to the bridge, and then to the appropriate veth. Unsegmented packets with GSO fields unset are dropped by the kernel at either the bridge or the veth level. That may be due to the network topology, where the vnet interface is attached to a bridge.

In that case, we need to tell libkrun that the network backend sends and receives virtio_net_hdr structs with the packets, and the backend needs to preserve GSO fields for VM-to-VM connections, or populate them for host-to-VM connections.

Signed-off-by: Albin Kerouanton <albin.kerouanton@docker.com>
1 parent 78fd29a commit fbdd137

2 files changed: +19 -1 lines changed

docs/vm-networking.md

Lines changed: 2 additions & 0 deletions
@@ -49,6 +49,8 @@ that take the following fields:
   VFKIT magic sequence after connecting to the `socket`. Accept any of `1, t, T,
   TRUE, true, True, 0, f, F, FALSE, false, False`. Any other value is invalid and
   will produce an error.
+- `vnet_hdr` (optional, defaults to false): Indicate whether the VMM includes
+  virtio-net headers along with Ethernet frames.
 
 Note that the first network specified will be used as the default gateway.
 
internal/shim/task/networking_unix.go

Lines changed: 17 additions & 1 deletion
@@ -30,6 +30,11 @@ import (
 	"github.com/containerd/nerdbox/internal/vm"
 )
 
+const (
+	NET_FLAG_VFKIT = 1 << iota // See https://github.com/containers/libkrun/blob/357ec63fee444b973e4fc76d2121fd41631f121e/include/libkrun.h#L271C9-L271C23
+	NET_FLAG_INCLUDE_VNET_HEADER
+)
+
 type networksProvider struct {
 	nws []network
 }
@@ -43,6 +48,7 @@ type network struct {
 	addr6    netip.Prefix // addr6 is the IPv6 address + subnet mask of the network interface
 	features uint32 // features is a bitmask of virtio-net features enabled on this network endpoint
 	vfkit    bool // vfkit is a boolean flag indicating whether libkrun must send the VFKIT magic sequence after connecting to the socket.
+	vnetHdr  bool // vnetHdr is a boolean flag indicating whether libkrun must include virtio-net headers along with Ethernet frames.
 }
 
 const (
@@ -57,6 +63,7 @@ const (
 	addrField     = "addr"
 	featuresField = "features" // features is a bitwise-OR separated list of virtio-net features. See https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html#x1-2370003
 	vfkitField    = "vfkit" // vfkit is a boolean flag indicating whether libkrun must send the VFKIT magic sequence after connecting to the socket.
+	vnetHdrField  = "vnet_hdr"
 
 	nwModeUnixgram   = "unixgram"
 	nwModeUnixstream = "unixstream"
@@ -149,6 +156,12 @@ func parseNetwork(annotation string) (network, error) {
 				return network{}, fmt.Errorf("parsing vfkit field: %w", err)
 			}
 			n.vfkit = vfkit
+		case vnetHdrField:
+			vnetHdr, err := strconv.ParseBool(value)
+			if err != nil {
+				return network{}, fmt.Errorf("parsing vnet_hdr field: %w", err)
+			}
+			n.vnetHdr = vnetHdr
 		default:
 			return network{}, fmt.Errorf("unknown network field: %s", key)
 		}
@@ -180,7 +193,10 @@ func (p *networksProvider) SetupVM(ctx context.Context, vmi vm.Instance) error {
 
 		var flags uint32
 		if nw.vfkit {
-			flags = 1 // See https://github.com/containers/libkrun/blob/357ec63fee444b973e4fc76d2121fd41631f121e/include/libkrun.h#L271C9-L271C23
+			flags = NET_FLAG_VFKIT
+		}
+		if nw.vnetHdr {
+			flags |= NET_FLAG_INCLUDE_VNET_HEADER
 		}
 
 		if err := vmi.AddNIC(ctx, nw.endpoint, nw.mac, nwMode, nw.features, flags); err != nil {
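
The flag handling in this last hunk can be exercised on its own. The sketch below reproduces the `1 << iota` constants under Go-style names and moves the composition into a standalone function; `netFlags` and the renamed constants are illustrative and do not exist in the commit.

```go
package main

import "fmt"

// Same values as NET_FLAG_VFKIT and NET_FLAG_INCLUDE_VNET_HEADER in the diff:
// iota starts at 0, so the constants are 1 and 2.
const (
	netFlagVfkit = 1 << iota
	netFlagIncludeVnetHeader
)

// netFlags is a hypothetical helper that builds the bitmask from the two
// boolean options, as SetupVM does inline.
func netFlags(vfkit, vnetHdr bool) uint32 {
	var flags uint32
	if vfkit {
		flags |= netFlagVfkit
	}
	if vnetHdr {
		flags |= netFlagIncludeVnetHeader
	}
	return flags
}

func main() {
	fmt.Println(netFlags(false, false)) // 0
	fmt.Println(netFlags(true, false))  // 1
	fmt.Println(netFlags(false, true))  // 2
	fmt.Println(netFlags(true, true))   // 3
}
```

Because each option owns one bit, the two settings are independent and can be enabled in any combination.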
