Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Fix
networkRange
race condition and global state corruption (#945)
While the `networkRange` function calculates `lastIP`, it uses `net.IPv4zero` or `net.IPv6zero` as a starting value from which it builds the final last IP. The issue is that these are global variables defined in the standard library, and modifying them affects all future users of those variables across the program. This leads to a not-so-rare race condition when creating multiple libvirt networks. These networks are being created concurrently and they both race modifying that global variable. Their `lastIP` DHCP configuration gets mixed up as result. The manner in which the configuration gets mixed up is unpredictable, but it could lead to a corrupt configuration that cannot be applied. The solution is to copy the zero IP addresses to a new variable rather than use them directly. Instead for simplicity I just instantiate an empty slice with the same length as the net IP, which would be filled with zeroes anyway. This commit also solves another unrelated bug in `getNetworkIPConfig` where the `^` operator was used as if it's a power operator, while in reality it's a bitwise-xor that leads to slightly incorrect results. This makes the validation only slightly wrong and is overly strict (e.g. it will not allow you to create `/28` IPv4 network even though such network has far more available addresses than the condition blocks) A similar race condition can be simply reproduced and visualized with this short go code: ```go package main import ( "fmt" "net" "sync" ) func getNetMaskWithMax16Bits(m net.IPMask) net.IPMask { ones, bits := m.Size() if bits-ones > 16 { if bits == 128 { return net.CIDRMask(128-16, 128) } return net.CIDRMask(32-16, 32) } return m } func networkRange(network *net.IPNet) (net.IP, net.IP) { netIP := network.IP.To4() lastIP := net.IPv4zero.To4() if netIP == nil { netIP = network.IP.To16() lastIP = net.IPv6zero.To16() } firstIP := netIP.Mask(network.Mask) intMask := getNetMaskWithMax16Bits(network.Mask) for i := 0; i < len(lastIP); i++ { lastIP[i] = netIP[i] | ^intMask[i] } return firstIP, lastIP } func update(wg *sync.WaitGroup, cidr string, id int) { address := cidr _, ipNet, _ := net.ParseCIDR(address) start, end := networkRange(ipNet) start[len(start)-1]++ start[len(start)-1]++ end[len(end)-1]-- fmt.Printf("Start %d: %s\n", id, start.String()) fmt.Printf("End %d: %s\n", id, end.String()) wg.Done() } func main() { var wg sync.WaitGroup wg.Add(2) go update(&wg, "192.168.145.0/24", 0) go update(&wg, "192.168.127.0/24", 1) wg.Wait() } ``` Then run: ```bash watch -n0.1 -d go run main.go ``` To see it happen for short moments. Alternatively, create this `main.tf` file: ```terraform terraform { required_providers { libvirt = { source = "dmacvicar/libvirt" version = "0.6.14" } } } provider "libvirt" { uri = "qemu:///system" } resource "libvirt_network" "net" { name = "omer-net" mode = "nat" addresses = ["192.168.127.0/24"] autostart = true } resource "libvirt_network" "secondary_net" { name = "omer-second-net" mode = "nat" addresses = ["192.168.145.0/24"] autostart = true } ``` And run: ```bash while true; do if ! (terraform apply -auto-approve && terraform destroy -auto-approve); then break fi done ``` Until the bug reproduces with the following error: ``` │ Error: error defining libvirt network: internal error: range 192.168.127.2 - 192.168.145.254 is not entirely within network 192.168.127.1/24 - <network> │ <name>omer-net</name> │ <forward mode="nat"></forward> │ <bridge stp="on"></bridge> │ <dns enable="no"></dns> │ <ip address="192.168.127.1" family="ipv4" prefix="24"> │ <dhcp> │ <range start="192.168.127.2" end="192.168.145.254"></range> │ </dhcp> │ </ip> │ <options xmlns="http://libvirt.org/schemas/network/dnsmasq/1.0"></options> │ </network> │ │ with libvirt_network.net, │ on main.tf line 14, in resource "libvirt_network" "net": │ 14: resource "libvirt_network" "net" { ```
- Loading branch information