64 lines
2.6 KiB
Diff
64 lines
2.6 KiB
Diff
Bugzilla: 1074235
|
|
Upstream-status: 3.15 and CC'd to stable
|
|
|
|
From e39435ce68bb4685288f78b1a7e24311f7ef939f Mon Sep 17 00:00:00 2001
|
|
From: Jens Axboe <axboe@fb.com>
|
|
Date: Tue, 8 Apr 2014 16:04:12 -0700
|
|
Subject: [PATCH] lib/percpu_counter.c: fix bad percpu counter state during
|
|
suspend
|
|
|
|
I got a bug report yesterday from Laszlo Ersek in which he states that
|
|
his kvm instance fails to suspend. Laszlo bisected it down to this
|
|
commit 1cf7e9c68fe8 ("virtio_blk: blk-mq support") where virtio-blk is
|
|
converted to use the blk-mq infrastructure.
|
|
|
|
After digging a bit, it became clear that the issue was with the queue
|
|
drain. blk-mq tracks queue usage in a percpu counter, which is
|
|
incremented on request alloc and decremented when the request is freed.
|
|
The initial hunt was for an inconsistency in blk-mq, but everything
|
|
seemed fine. In fact, the counter only returned crazy values when
|
|
suspend was in progress.
|
|
|
|
When a CPU is unplugged, the percpu counters merges that CPU state with
|
|
the general state. blk-mq takes care to register a hotcpu notifier with
|
|
the appropriate priority, so we know it runs after the percpu counter
|
|
notifier. However, the percpu counter notifier only merges the state
|
|
when the CPU is fully gone. This leaves a state transition where the
|
|
CPU going away is no longer in the online mask, yet it still holds
|
|
private values. This means that in this state, percpu_counter_sum()
|
|
returns invalid results, and the suspend then hangs waiting for
|
|
abs(dead-cpu-value) requests to complete which of course will never
|
|
happen.
|
|
|
|
Fix this by clearing the state earlier, so we never have a case where
|
|
the CPU isn't in online mask but still holds private state. This bug
|
|
has been there since forever, I guess we don't have a lot of users where
|
|
percpu counters needs to be reliable during the suspend cycle.
|
|
|
|
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Reported-by: Laszlo Ersek <lersek@redhat.com>
|
|
Tested-by: Laszlo Ersek <lersek@redhat.com>
|
|
Cc: <stable@vger.kernel.org>
|
|
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
---
|
|
lib/percpu_counter.c | 2 +-
|
|
1 file changed, 1 insertion(+), 1 deletion(-)
|
|
|
|
diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
|
|
index 8280a5dd1727..7dd33577b905 100644
|
|
--- a/lib/percpu_counter.c
|
|
+++ b/lib/percpu_counter.c
|
|
@@ -169,7 +169,7 @@ static int percpu_counter_hotcpu_callback(struct notifier_block *nb,
|
|
struct percpu_counter *fbc;
|
|
|
|
compute_batch_value();
|
|
- if (action != CPU_DEAD)
|
|
+ if (action != CPU_DEAD && action != CPU_DEAD_FROZEN)
|
|
return NOTIFY_OK;
|
|
|
|
cpu = (unsigned long)hcpu;
|
|
--
|
|
1.8.5.3
|
|
|