From 24db39fb2983ca83ab5c6ee37cb57a4f7f6f94e6 Mon Sep 17 00:00:00 2001
From: Michael Brown
Date: Tue, 3 Dec 2024 13:55:18 +0000
Subject: [gve] Run startup process only while device is open

The startup process is scheduled to run when the device is opened and
terminated (if still running) when the device is closed. It assumes
that the resource allocation performed in gve_open() has taken place,
and that the admin and transmit/receive data structure pointers are
therefore valid.

The process initialisation in gve_probe() erroneously calls
process_init() rather than process_init_stopped() and will therefore
schedule the startup process immediately, before the relevant
resources have been allocated.

This bug is masked in the typical use case of a Google Cloud instance
with a single NIC built with the config/cloud/gce.ipxe embedded
script, since the embedded script will immediately open the NIC (and
therefore allocate the required resources) before the scheduled
process is allowed to run for the first time. In a multi-NIC
instance, undefined behaviour will arise as soon as the startup
process for the second NIC is allowed to run.

Fix by using process_init_stopped() to avoid implicitly scheduling the
startup process during gve_probe().

Originally-fixed-by: Kal Cutter Conley
Signed-off-by: Michael Brown
---
 src/drivers/net/gve.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/drivers/net/gve.c b/src/drivers/net/gve.c
index df10a94c6..efc38dd21 100644
--- a/src/drivers/net/gve.c
+++ b/src/drivers/net/gve.c
@@ -1543,7 +1543,8 @@ static int gve_probe ( struct pci_device *pci ) {
 	gve->netdev = netdev;
 	gve->tx.type = &gve_tx_type;
 	gve->rx.type = &gve_rx_type;
-	process_init ( &gve->startup, &gve_startup_desc, &netdev->refcnt );
+	process_init_stopped ( &gve->startup, &gve_startup_desc,
+			       &netdev->refcnt );
 	timer_init ( &gve->watchdog, gve_watchdog, &netdev->refcnt );
 
 	/* Fix up PCI device */
-- 
cgit
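
The failure mode described in the commit message can be illustrated with a small standalone model. This is only a sketch: every name in it (struct toy_nic, toy_init_running(), toy_init_stopped(), toy_open(), toy_run_once()) is hypothetical and is not part of the iPXE process API. It assumes a scheduler that steps every scheduled process once per pass, and it merely counts the steps that would have run before the open-time resources existed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, NOT the iPXE API */
struct toy_nic {
	bool scheduled;		/* startup process is on the run queue */
	void *admin;		/* stands in for resources allocated on open */
	int unsafe_steps;	/* steps taken before resources existed */
	int safe_steps;		/* steps taken with resources present */
};

/* Analogue of initialising a process that is scheduled immediately */
static void toy_init_running ( struct toy_nic *nic ) {
	nic->scheduled = true;
	nic->admin = NULL;
	nic->unsafe_steps = 0;
	nic->safe_steps = 0;
}

/* Analogue of initialising a process that stays stopped until started */
static void toy_init_stopped ( struct toy_nic *nic ) {
	toy_init_running ( nic );
	nic->scheduled = false;
}

/* Analogue of opening the device: allocate resources, then schedule */
static void toy_open ( struct toy_nic *nic, void *resources ) {
	assert ( resources != NULL );
	nic->admin = resources;
	nic->scheduled = true;
}

/* One scheduler pass: step every scheduled startup process once */
static void toy_run_once ( struct toy_nic *nics, size_t count ) {
	size_t i;

	for ( i = 0 ; i < count ; i++ ) {
		if ( ! nics[i].scheduled )
			continue;
		if ( nics[i].admin == NULL ) {
			/* The real driver would use invalid pointers here */
			nics[i].unsafe_steps++;
		} else {
			nics[i].safe_steps++;
		}
	}
}
```

With two NICs initialised via toy_init_running() and only the first one opened before the scheduler pass, the second NIC records an unsafe step, which mirrors the multi-NIC case in the commit message. Initialising both with toy_init_stopped() leaves the second NIC unscheduled, so no unsafe step occurs.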