kernel-ark/drivers
Yinghai Lu ac205b7bb7 PCI: make sriov work with hotplug remove
When hot removing a pci express module that has a pcie switch and supports
SRIOV, we got:

[ 5918.610127] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 1
[ 5918.615779] pciehp 0000:80:02.2:pcie04: Attention button interrupt received
[ 5918.622730] pciehp 0000:80:02.2:pcie04: Button pressed on Slot(3)
[ 5918.629002] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 1f9
[ 5918.637416] pciehp 0000:80:02.2:pcie04: PCI slot #3 - powering off due to button press.
[ 5918.647125] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 10
[ 5918.653039] pciehp 0000:80:02.2:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[ 5918.661229] pciehp 0000:80:02.2:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
[ 5924.667627] pciehp 0000:80:02.2:pcie04: Disabling domain🚌device=0000:b0:00
[ 5924.674909] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 2f9
[ 5924.683262] pciehp 0000:80:02.2:pcie04: pciehp_unconfigure_device: domain🚌dev = 0000:b0:00
[ 5924.693976] libfcoe_device_notification: NETDEV_UNREGISTER eth6
[ 5924.764979] libfcoe_device_notification: NETDEV_UNREGISTER eth14
[ 5924.873539] libfcoe_device_notification: NETDEV_UNREGISTER eth15
[ 5924.995209] libfcoe_device_notification: NETDEV_UNREGISTER eth16
[ 5926.114407] sxge 0000:b2:00.0: PCI INT A disabled
[ 5926.119342] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 5926.127189] IP: [<ffffffff81353a3b>] pci_stop_bus_device+0x33/0x83
[ 5926.133377] PGD 0
[ 5926.135402] Oops: 0000 [#1] SMP
[ 5926.138659] CPU 2
[ 5926.140499] Modules linked in:
...
[ 5926.143754]
[ 5926.275823] Call Trace:
[ 5926.278267]  [<ffffffff81353a38>] pci_stop_bus_device+0x30/0x83
[ 5926.284180]  [<ffffffff81353af4>] pci_remove_bus_device+0x1a/0xba
[ 5926.290264]  [<ffffffff81366311>] pciehp_unconfigure_device+0x110/0x17b
[ 5926.296866]  [<ffffffff81365dd9>] ? pciehp_disable_slot+0x188/0x188
[ 5926.303123]  [<ffffffff81365d6f>] pciehp_disable_slot+0x11e/0x188
[ 5926.309206]  [<ffffffff81365e68>] pciehp_power_thread+0x8f/0xe0
...

 +-[0000:80]-+-00.0-[81-8f]--
 |           +-01.0-[90-9f]--
 |           +-02.0-[a0-af]--
 |           +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Device
 |           |                               |            +-00.1 Device
 |           |                               |            +-00.2 Device
 |           |                               |            \-00.3 Device
 |           |                               \-03.0-[b3]--+-00.0 Device
 |           |                                            +-00.1 Device
 |           |                                            +-00.2 Device
 |           |                                            \-00.3 Device

root complex: 80:02.2
pci express modules: have pcie switch and are listed as b0:00.0, b1:02.0 and b1:03.0.
end devices  are b2:00.0 and b3.00.0.
VFs are: b2:00.1,... b2:00.3, and b3:00.1,...,b3:00.3

Root cause: when doing pci_stop_bus_device() with phys fn, it will stop
virt fn and remove the fn, so
	list_for_each_safe(l, n, &bus->devices)
will have problem to refer freed n that is pointed to vf entry.

Solution is just replacing list_for_each_safe() with
list_for_each_prev_safe().  This will make sure we can get valid n pointer
to PF instead of the freed VF pointer (because newly added devices are
inserted to the bus->devices list tail).

During reviewing the patch, Bjorn said:
|   The PCI hot-remove path calls pci_stop_bus_devices() via
|   pci_remove_bus_device().
|
|   pci_stop_bus_devices() traverses the bus->devices list (point A below),
|   stopping each device in turn, which calls the driver remove() method.  When
|   the device is an SR-IOV PF, the driver calls pci_disable_sriov(), which
|   also uses pci_remove_bus_device() to remove the VF devices from the
|   bus->devices list (point B).
|
|       pci_remove_bus_device
|         pci_stop_bus_device
|           pci_stop_bus_devices(subordinate)
|             list_for_each(bus->devices)             <-- A
|               pci_stop_bus_device(PF)
|                 ...
|                   driver->remove
|                     pci_disable_sriov
|                       ...
|                         pci_remove_bus_device(VF)
|                             <remove from bus_list>  <-- B
|
|   At B, we're changing the same list we're iterating through at A, so when
|   the driver remove() method returns, the pci_stop_bus_devices() iterator has
|   a pointer to a list entry that has already been freed.

Discussion thread can be found : https://lkml.org/lkml/2011/10/15/141
				 https://lkml.org/lkml/2012/1/23/360

-v5: According to Linus to make remove more robust, Change to
     list_for_each_prev_safe instead. That is more reasonable, because
     those devices are added to tail of the list before.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:59 -08:00
..
accessibility module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
acpi Merge branches 'atomicio-apei', 'hotplug', 'sony-nvs-nosave' and 'thermal-netlink' into release 2012-01-23 19:47:06 -05:00
amba Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
ata [libata] ata_piix: Add Toshiba Satellite Pro A120 to the quirks list 2012-01-17 20:50:53 -05:00
atm module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
auxdisplay
base A fairly simple bugfix for a WARN_ON() which was triggered in the cache 2012-01-25 15:11:57 -08:00
bcma bcma: connect the bcma bus suspend/resume to the bcma driver suspend/resume 2012-01-17 09:54:08 -05:00
block nvme: fix merge error due to change of 'make_request_fn' fn type 2012-01-18 15:41:27 -08:00
bluetooth module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
cdrom block: add and use scsi_blk_cmd_ioctl 2012-01-14 15:07:24 -08:00
char agp: fix scratch page cleanup 2012-01-26 18:36:48 +00:00
clk
clocksource Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm 2012-01-06 18:15:25 -08:00
connector
cpufreq Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq 2012-01-11 18:53:33 -08:00
cpuidle
crypto Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2012-01-10 22:01:27 -08:00
dca
devfreq
dio
dma Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2012-01-17 18:40:24 -08:00
edac module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
eisa
firewire module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
firmware Merge commit '070680218379e15c1901f4bf21b98e3cbf12b527' into stable/for-linus-fixes-3.3 2012-01-12 11:53:55 -05:00
gpio gpio: tps65910: Use correct offset for gpio initialization 2012-01-18 13:48:43 -07:00
gpu gma500: Fix suspend/resume functions 2012-01-27 11:52:07 +00:00
hid module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
hv Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb 2012-01-09 12:09:47 -08:00
hwmon hwmon: (adm1031) Fix coding style issues 2012-01-16 22:51:48 +01:00
hwspinlock
i2c Merge branches 'for-33/i2c/eg20t' and 'for-33/i2c/omap' into for-linus/i2c-33 2012-01-17 23:30:41 +00:00
ide block: add and use scsi_blk_cmd_ioctl 2012-01-14 15:07:24 -08:00
idle ACPI processor hotplug: Delay acpi_processor_start() call for hotplugged cores 2012-01-19 21:26:32 -05:00
ieee802154
infiniband Merge branch 'for-next-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2012-01-18 16:29:42 -08:00
input Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1 2012-01-14 12:32:16 -08:00
iommu Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu 2012-01-10 11:08:21 -08:00
isdn Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1 2012-01-14 12:32:16 -08:00
leds leds: add led driver for Bachmann's ot200 2012-01-23 08:38:47 -08:00
lguest lguest: Make sure interrupt is allocated ok by lguest_setup_irq 2012-01-12 15:44:47 +10:30
macintosh module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
mca
md Merge branch 'for-3.3/core' of git://git.kernel.dk/linux-block 2012-01-15 12:24:45 -08:00
media Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2012-01-26 17:04:47 -08:00
memstick module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
message SCSI updates for post 3.2 merge window 2012-01-10 10:36:08 -08:00
mfd Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1 2012-01-14 12:32:16 -08:00
misc Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2012-01-17 18:40:24 -08:00
mmc Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2012-01-17 18:40:24 -08:00
mtd Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2012-01-17 18:40:24 -08:00
net team: send only changed options/ports via netlink 2012-01-24 15:51:00 -05:00
nfc Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core 2012-01-07 12:03:30 -08:00
nubus
of 2nd set of device tree changes for v3.3 2012-01-14 13:25:55 -08:00
oprofile Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-01-06 08:02:58 -08:00
parisc Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci 2012-01-11 18:50:26 -08:00
parport Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1 2012-01-14 12:32:16 -08:00
pci PCI: make sriov work with hotplug remove 2012-02-14 08:44:59 -08:00
pcmcia Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2012-01-14 13:05:21 -08:00
pinctrl
platform module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
pnp PNP: work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USB 2012-01-06 12:11:20 -08:00
power module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
pps
ps3
ptp
rapidio
regulator regulator: Fix documentation for of_node parameter of regulator_register() 2012-01-24 10:40:06 -08:00
rtc Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 2012-01-13 20:43:32 -08:00
s390 module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
sbus
scsi Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k 2012-01-26 12:43:57 -08:00
sfi
sh SH/R-Mobile updates for 3.3 merge window. 2012-01-11 23:29:20 -08:00
sn
spi Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2012-01-17 18:40:24 -08:00
ssb
staging Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2012-01-15 12:49:56 -08:00
target
tc
telephony
thermal thermal: Rename generate_netlink_event 2012-01-23 03:15:25 -05:00
tty Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2012-01-17 18:40:24 -08:00
uio Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci 2012-01-11 18:50:26 -08:00
usb Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2012-01-17 18:40:24 -08:00
uwb Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb 2012-01-09 12:09:47 -08:00
vhost vhost-net: add module alias (v2.1) 2012-01-13 10:12:23 -08:00
video Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k 2012-01-26 12:43:57 -08:00
virt
virtio virtio: balloon: Add freeze, restore handlers to support S4 2012-01-12 15:44:47 +10:30
vlynq
w1
watchdog module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
xen xen/pciback: Support pci_reset_function, aka FLR or D3 support. 2012-02-14 08:44:49 -08:00
zorro
Kconfig
Makefile mmc: sdhci-pci: add platform data 2012-01-11 23:58:47 -05:00