Have you ever wondered how Linux knows what PCI devices are plugged in? How does Linux know what driver to associate with the device when it detects it?
In short, here’s what happens:
- During the kernel’s init process (init/main.c), various subsystems are brought up according to their “init levels.” Among these early subsystems are the ACPI subsystem and the PCI bus driver.
- The ACPI subsystem probes the system bus. This “probe” is actually a recursive scan since there can be other devices that act as “bridges” from that main system bus.
- Each bus is probed, that is, asked to enumerate the devices that are connected to it. It’s at this point we’ll start seeing their sysfs entries.
- For each device that the bus sees, it will attempt to associate a device driver with it. How does it know to do this? Well, it’s actually up to the device driver. It’s the driver’s responsibility to export a table of devices that it supports when it registers itself with the PCI subsystem. This table is also used by the hotplug system to map modules to the PCI devices they support. It’s basically a phone book of who we need to call when dealing with a given PCI device.
- Assuming a match, the kernel will (eventually) call the driver’s probe() function, and the device driver can decide whether or not it claims the device. Yes, the kernel basically takes the device and walks up to the driver(s) that claim they can handle a certain device and then asks, “is this your kid?”
- Remember, a device driver whose module_init equivalent has been called is included in this roll-call (built-in or module) so long as the device is compatible with its module device table. Built-in drivers are asked first (according to the order that they’re linked into the kernel image).
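The “phone book” lookup from the list above can be modeled in a few lines of plain userspace C. This is only a sketch of the idea, not kernel code: the names `model_pci_id`, `model_match_id`, and `MODEL_ANY_ID` are made up for illustration, and the real kernel table has more fields (subsystem IDs, class, driver data) than the two shown here. The Intel IDs in the table are example values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wildcard for this sketch: matches any vendor or device ID. */
#define MODEL_ANY_ID 0xFFFFu

/* Simplified stand-in for one entry in a driver's device table. */
struct model_pci_id {
    uint16_t vendor;
    uint16_t device;
};

/* Example table, terminated by an all-zero entry.
 * 0x8086 is Intel's vendor ID; the device IDs are illustrative. */
static const struct model_pci_id model_ide_ids[] = {
    { 0x8086, 0x7010 },
    { 0x8086, 0x7111 },
    { 0, 0 }
};

/* Return the first table entry that matches the device, or NULL if the
 * driver does not list this device at all. */
static const struct model_pci_id *
model_match_id(const struct model_pci_id *table,
               uint16_t vendor, uint16_t device)
{
    for (; table->vendor != 0 || table->device != 0; table++) {
        int vendor_ok = table->vendor == MODEL_ANY_ID || table->vendor == vendor;
        int device_ok = table->device == MODEL_ANY_ID || table->device == device;

        if (vendor_ok && device_ok)
            return table;
    }
    return NULL;
}
```

Calling `model_match_id(model_ide_ids, 0x8086, 0x7010)` returns a pointer to the first entry, while an unlisted vendor/device pair returns `NULL`, which corresponds to the driver never being asked about that device.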
Finally, let’s take a look at a stack trace from a kernel running in QEMU. We’ll start from the bottom, and work our way up. (I’ve removed the function addresses in the stack trace to reduce clutter.)
    #28 ?? () at arch/x86/entry/entry_64.S:352
    #27 ret_from_fork () at init/main.c:1087
    #26 kernel_init (unused=<optimized out>) at init/main.c:1169
    #25 kernel_init_freeable () at init/main.c:1009
    #24 do_basic_setup () at init/main.c:991
    #23 do_initcalls () at ./include/linux/compiler.h:305
    #22 do_initcall_level (level=<optimized out>) at init/main.c:915
    #21 do_one_initcall (fn=0xffffffff828d6101 <piix_init>) at drivers/ata/ata_piix.c:1773
As part of initialization, the kernel brings itself up gradually. This gradual bring-up is described by the kernel’s init levels. On this qemu-system-x86_64 system, it looks like piix_init is being invoked. This driver’s entry point is declared using the module_init macro, meaning that it belongs to the device init level. So at this point, the ACPI subsystem has already been brought up and the PCI bus has been scanned for devices.
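To make the ordering concrete, here is a small userspace model of level-ordered init calls. It is a sketch under assumptions: the level names follow the kernel’s initcall levels as I understand them (pure, core, postcore, arch, subsys, fs, device, late) and leave out the finer-grained variants, while `model_do_initcalls`, `model_bus_init`, and `model_driver_init` are hypothetical names invented for this example.

```c
#include <stddef.h>

/* Simplified init levels, lowest runs first. */
enum model_level {
    MODEL_LVL_PURE, MODEL_LVL_CORE, MODEL_LVL_POSTCORE, MODEL_LVL_ARCH,
    MODEL_LVL_SUBSYS, MODEL_LVL_FS, MODEL_LVL_DEVICE, MODEL_LVL_LATE,
    MODEL_LVL_COUNT
};

struct model_initcall {
    enum model_level level;
    int (*fn)(void);
};

/* Records the order in which the example init functions ran. */
static char model_trace[8];
static size_t model_trace_len;

static void model_record(char tag)
{
    if (model_trace_len < sizeof(model_trace) - 1)
        model_trace[model_trace_len++] = tag;
}

/* Hypothetical stand-ins: a bus set up at the subsys level and a driver
 * whose entry point sits at the device level. */
static int model_bus_init(void)    { model_record('B'); return 0; }
static int model_driver_init(void) { model_record('D'); return 0; }

/* Deliberately listed driver-first to show that the level, not the
 * position in the table, decides who runs first. */
static const struct model_initcall model_calls[] = {
    { MODEL_LVL_DEVICE, model_driver_init },
    { MODEL_LVL_SUBSYS, model_bus_init },
};

/* Run every call, one level at a time; returns how many calls ran. */
static size_t model_do_initcalls(const struct model_initcall *calls, size_t n)
{
    size_t ran = 0;

    for (int level = 0; level < MODEL_LVL_COUNT; level++) {
        for (size_t i = 0; i < n; i++) {
            if ((int)calls[i].level == level) {
                calls[i].fn();
                ran++;
            }
        }
    }
    return ran;
}
```

After `model_do_initcalls(model_calls, 2)`, the trace buffer holds the bus tag before the driver tag, which is the same guarantee the driver in the stack trace relies on: by the time a device-level init function runs, the bus it registers with already exists.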
    #20 piix_init () at drivers/pci/pci-driver.c:1402
    #19 __pci_register_driver (drv=<optimized out>, owner=<optimized out>, mod_name=<optimized out>) at drivers/base/driver.c:170
    #18 driver_register (drv=0xffffffff827a33b0 <piix_pci_driver+112>) at drivers/base/bus.c:645
    #17 bus_add_driver (drv=0xffffffff827a33b0 <piix_pci_driver+112>) at drivers/base/dd.c:1037
    #16 driver_attach (drv=<optimized out>) at drivers/base/bus.c:304
    #15 bus_for_each_dev (bus=<optimized out>, start=<optimized out>, data=0x0 <fixed_percpu_data>, fn=0x0 <fixed_percpu_data>) at drivers/base/dd.c:1021
This module registers itself as a PCI driver in its init function. This means the driver is associated with the PCI bus and is now a valid candidate for driving PCI devices. Since it’s being registered now, we can carry on and see if we can find the physical device that it should be paired with. So, we iterate through all of the eligible devices connected to the bus in bus_for_each_dev and begin searching.
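The registration-then-search flow can be sketched in userspace C as a callback-driven walk over the bus’s device list. Everything here is a simplified model with hypothetical names (`model_register_driver`, `model_bus_for_each_dev`, `model_driver_attach_one`, `model_example_probe`); the driver lists a single ID to keep it short, and the “refuse revision 0” rule in the probe is invented purely to show a driver declining a device it matched.

```c
#include <stddef.h>
#include <stdint.h>

struct model_driver;

struct model_device {
    uint16_t vendor;
    uint16_t device;
    uint8_t revision;
    const struct model_driver *driver;   /* NULL while unbound */
};

struct model_driver {
    const char *name;
    uint16_t vendor;                     /* the one ID this driver lists */
    uint16_t device;
    int (*probe)(struct model_device *dev); /* 0 = claim, negative = decline */
};

/* Invented policy for the sketch: decline revision 0 hardware. */
static int model_example_probe(struct model_device *dev)
{
    return dev->revision == 0 ? -19 : 0; /* -19 mirrors -ENODEV */
}

/* Visit every device on the bus; stop early if the callback fails. */
static int model_bus_for_each_dev(struct model_device *devs, size_t n,
                                  void *data,
                                  int (*fn)(struct model_device *, void *))
{
    int ret = 0;

    for (size_t i = 0; i < n && ret == 0; i++)
        ret = fn(&devs[i], data);
    return ret;
}

/* Per-device step: skip bound or non-matching devices, otherwise probe. */
static int model_driver_attach_one(struct model_device *dev, void *data)
{
    const struct model_driver *drv = data;

    if (dev->driver != NULL)
        return 0;                        /* someone else already claimed it */
    if (dev->vendor != drv->vendor || dev->device != drv->device)
        return 0;                        /* not in this driver's table */
    if (drv->probe(dev) == 0)
        dev->driver = drv;               /* probe succeeded: bind */
    return 0;                            /* a failed probe does not stop the walk */
}

/* Registering a driver immediately offers it every device on the bus. */
static int model_register_driver(struct model_driver *drv,
                                 struct model_device *devs, size_t n)
{
    return model_bus_for_each_dev(devs, n, drv, model_driver_attach_one);
}
```

With three devices on the model bus (a matching one, a matching one at revision 0, and an unrelated one), only the first ends up with its `driver` pointer set. The second was offered to the driver but declined, and the third was never offered at all.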
    #14 __driver_attach (dev=0xffff88801b14f0b0, data=0xffffffff827a33b0 <piix_pci_driver+112>) at drivers/base/dd.c:944
    #13 device_driver_attach (drv=0xffffffff827a33b0 <piix_pci_driver+112>, dev=0xffff88801b14f0b0) at drivers/base/dd.c:670
    #12 driver_probe_device (drv=0xffffffff827a33b0 <piix_pci_driver+112>, dev=0xffff88801b14f0b0) at drivers/base/dd.c:509
    #11 really_probe (dev=0xffff88801b14f000, drv=0x1f <fixed_percpu_data+31>) at drivers/pci/pci-driver.c:425
    #10 pci_device_probe (dev=0xffff88801b14f0b0) at drivers/pci/pci-driver.c:385
    #9 __pci_device_probe (pci_dev=<optimized out>, drv=<optimized out>) at drivers/pci/pci-driver.c:360
    #8 pci_call_probe (id=<optimized out>, dev=<optimized out>, drv=<optimized out>) at drivers/pci/pci-driver.c:306
    #7 local_pci_probe (_ddi=0xffffc900000dbc50) at drivers/ata/ata_piix.c:1672
    #6 piix_init_one (pdev=0xffff88801b14f000, ent=0x1f <fixed_percpu_data+31>) at drivers/pci/pci.c:1806
Wow, that looks like a mouthful at first glance. Rest assured, all that’s happening here is that for each eligible device connected to the PCI bus, the bus attempts to attach the driver to the device. This ultimately ends up calling the driver’s probe() callback, which in this case is piix_init_one(). This callback is registered at drivers/ata/ata_piix.c line 1760. We also see that in piix_init_one() the driver checks the PCI device ID information to determine if it should reject the device. The device is not claimed if this probe function returns an error.
    #5 pcim_enable_device (pdev=0xffff88801b14f000) at drivers/pci/pci.c:1806
    #4 pci_enable_device (dev=<optimized out>) at drivers/pci/pci.c:1677
    #3 pci_enable_device_flags (dev=0xffff88801b14f000, flags=768) at drivers/pci/pci.c:1588
    #2 do_pci_enable_device (dev=0xffff88801b14f000, bars=31) at arch/x86/pci/common.c:709
    #1 pcibios_enable_device (dev=0xffff88801b14f000, mask=<optimized out>) at drivers/pci/setup-res.c:456
    #0 pci_enable_resources (dev=0xffff88801b14f000, mask=31)
However, if the device driver remains happy during its probe() function, it will ultimately enable the PCI device and return success.
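As a rough picture of what “enabling” means at the bottom of that trace: my understanding is that the resource-enable step walks the BARs selected by the mask (31 in the trace, i.e. BARs 0 through 4) and turns on the I/O and memory decode bits in the device’s PCI command register as needed. The userspace sketch below models only that bit-twiddling. `model_enable_resources` and `model_bar_type` are hypothetical names, and the real kernel function performs additional checks that are left out here.

```c
#include <stdint.h>

/* Decode-enable bits of the PCI command register. */
#define MODEL_PCI_COMMAND_IO     0x1u  /* respond to I/O space accesses */
#define MODEL_PCI_COMMAND_MEMORY 0x2u  /* respond to memory space accesses */

#define MODEL_NUM_BARS 6

/* What kind of region, if any, each BAR describes (simplified). */
enum model_bar_type { MODEL_BAR_UNUSED, MODEL_BAR_IO, MODEL_BAR_MEM };

/* Compute the new command register value: for every BAR selected by the
 * mask, enable the decode bit that matches the BAR's type. */
static uint16_t model_enable_resources(uint16_t old_cmd,
                                       const enum model_bar_type bars[MODEL_NUM_BARS],
                                       unsigned int mask)
{
    uint16_t cmd = old_cmd;

    for (unsigned int i = 0; i < MODEL_NUM_BARS; i++) {
        if (!(mask & (1u << i)))
            continue;                   /* BAR not selected by the mask */
        if (bars[i] == MODEL_BAR_IO)
            cmd |= MODEL_PCI_COMMAND_IO;
        else if (bars[i] == MODEL_BAR_MEM)
            cmd |= MODEL_PCI_COMMAND_MEMORY;
    }
    return cmd;
}
```

For a model device whose only populated BAR in the selected range is an I/O region, a mask of 31 yields a command value with just the I/O bit set. A memory BAR outside the mask leaves the memory bit untouched.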
And that’s how the Linux kernel detects PCI devices and pairs them with their device driver!