Eliminate hangs when using frequent high-order allocations

2011-05-23 10:27:34 -04:00 · 2011-05-23 10:27:34 -04:00 · fcfc30ed64
parent 3cc5871fa6
commit fcfc30ed64
3 changed files with 278 additions and 0 deletions
--- a/kernel.spec
+++ b/kernel.spec
@@ -749,6 +749,11 @@ Patch12407: scsi_dh_hp_sw-fix-deadlock-in-start_stop_endio.patch
 # temporary fix for Sempron machines stalling (#704059)
 Patch12408: x86-amd-arat-bug-on-sempron-workaround.patch

+# Eliminate hangs when using frequent high-order allocations V4
+# (will be in 2.6.38.8)
+Patch12410: mm-vmscan-correct-use-of-pgdat_balanced-in-sleeping_prematurely.patch
+Patch12411: mm-vmscan-correctly-check-if-reclaimer-should-schedule-during-shrink_slab.patch
+
 %endif

 BuildRoot: %{_tmppath}/kernel-%{KVERREL}-root
@@ -1391,6 +1396,11 @@ ApplyPatch scsi_dh_hp_sw-fix-deadlock-in-start_stop_endio.patch
 # temporary fix for Sempron machines stalling (#704059)
 ApplyPatch x86-amd-arat-bug-on-sempron-workaround.patch

+# Eliminate hangs when using frequent high-order allocations V4
+# (will be in 2.6.38.8)
+ApplyPatch mm-vmscan-correct-use-of-pgdat_balanced-in-sleeping_prematurely.patch
+ApplyPatch mm-vmscan-correctly-check-if-reclaimer-should-schedule-during-shrink_slab.patch
+
 # END OF PATCH APPLICATIONS

 %endif
@@ -2001,6 +2011,7 @@ fi
 %changelog
 * Mon May 23 2011 Chuck Ebbert <cebbert@redhat.com> 2.6.38.7-29
 - Linux 2.6.38.7
+- Eliminate hangs when using frequent high-order allocations

 * Fri May 20 2011 Chuck Ebbert <cebbert@redhat.com> 2.6.38.7-28.rc1
 - Linux 2.6.38.7-rc1
--- a/mm-vmscan-correct-use-of-pgdat_balanced-in-sleeping_prematurely.patch
+++ b/mm-vmscan-correct-use-of-pgdat_balanced-in-sleeping_prematurely.patch
@@ -0,0 +1,114 @@
+Return-Path: stable-bounces@linux.kernel.org
+Received: from zmta03.collab.prod.int.phx2.redhat.com (LHLO
+ zmta03.collab.prod.int.phx2.redhat.com) (10.5.5.33) by
+ mail02.corp.redhat.com with LMTP; Mon, 23 May 2011 05:54:46 -0400 (EDT)
+Received: from localhost (localhost.localdomain [127.0.0.1])
+	by zmta03.collab.prod.int.phx2.redhat.com (Postfix) with ESMTP id 3139D4E5E6
+	for <cebbert@redhat.com>; Mon, 23 May 2011 05:54:46 -0400 (EDT)
+Received: from zmta03.collab.prod.int.phx2.redhat.com ([127.0.0.1])
+	by localhost (zmta03.collab.prod.int.phx2.redhat.com [127.0.0.1]) (amavisd-new, port 10024)
+	with ESMTP id Xko2+8bJJ7po for <cebbert@redhat.com>;
+	Mon, 23 May 2011 05:54:46 -0400 (EDT)
+Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12])
+	by zmta03.collab.prod.int.phx2.redhat.com (Postfix) with ESMTP id 1A5854D284
+	for <cebbert@mail.corp.redhat.com>; Mon, 23 May 2011 05:54:46 -0400 (EDT)
+Received: from mx1.redhat.com (ext-mx13.extmail.prod.ext.phx2.redhat.com [10.5.110.18])
+	by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id p4N9sjdi005829
+	for <cebbert@redhat.com>; Mon, 23 May 2011 05:54:45 -0400
+Received: from hera.kernel.org (hera.kernel.org [140.211.167.34])
+	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p4N9siLf018408
+	for <cebbert@redhat.com>; Mon, 23 May 2011 05:54:45 -0400
+Received: from hera.kernel.org (localhost [127.0.0.1])
+	by hera.kernel.org (8.14.4/8.14.3) with ESMTP id p4N9s7Yv010104;
+	Mon, 23 May 2011 09:54:09 GMT
+X-Virus-Status: Clean
+X-Virus-Scanned: clamav-milter 0.97 at hera.kernel.org
+Received: from mx2.suse.de (cantor2.suse.de [195.135.220.15])
+	by hera.kernel.org (8.14.4/8.14.3) with ESMTP id p4N9s1LC009736;
+	Mon, 23 May 2011 09:54:02 GMT
+X-Virus-Status: Clean
+X-Virus-Scanned: clamav-milter 0.97 at hera.kernel.org
+Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.221.2])
+	by mx2.suse.de (Postfix) with ESMTP id 98E7590072;
+	Mon, 23 May 2011 11:53:59 +0200 (CEST)
+From: Mel Gorman <mgorman@suse.de>
+To: Andrew Morton <akpm@linux-foundation.org>
+Date: Mon, 23 May 2011 10:53:54 +0100
+Message-Id: <1306144435-2516-2-git-send-email-mgorman@suse.de>
+In-Reply-To: <1306144435-2516-1-git-send-email-mgorman@suse.de>
+References: <1306144435-2516-1-git-send-email-mgorman@suse.de>
+X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED
+	autolearn=unavailable version=3.3.2-r929478
+X-Spam-Checker-Version: SpamAssassin 3.3.2-r929478 (2010-03-31) on
+	hera.kernel.org
+X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.3 (hera.kernel.org [127.0.0.1]); Mon, 23 May 2011 09:54:12 +0000 (UTC)
+X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by
+	milter-greylist-4.2.3 (hera.kernel.org [140.211.167.34]);
+	Mon, 23 May 2011 09:54:04 +0000 (UTC)
+Cc: Pekka Enberg <penberg@kernel.org>, Rik van Riel <riel@redhat.com>,
+        Jan Kara <jack@suse.cz>, linux-kernel <linux-kernel@vger.kernel.org>,
+        James Bottomley <James.Bottomley@HansenPartnership.com>,
+        linux-mm <linux-mm@kvack.org>, Minchan Kim <minchan.kim@gmail.com>,
+        Raghavendra D Prabhu <raghu.prabhu13@gmail.com>,
+        Johannes Weiner <hannes@cmpxchg.org>,
+        linux-fsdevel <linux-fsdevel@vger.kernel.org>,
+        Colin King <colin.king@canonical.com>,
+        Christoph Lameter <cl@linux.com>,
+        linux-ext4 <linux-ext4@vger.kernel.org>, stable <stable@kernel.org>,
+        Chris Mason <chris.mason@oracle.com>, Mel Gorman <mgorman@suse.de>
+Subject: [stable] [PATCH 1/2] mm: vmscan: Correct use of pgdat_balanced in
+	sleeping_prematurely
+X-BeenThere: stable@linux.kernel.org
+X-Mailman-Version: 2.1.12
+Precedence: list
+List-Id: For maintainers of the stable Linux series <stable.linux.kernel.org>
+List-Unsubscribe: <http://linux.kernel.org/mailman/options/stable>,
+	<mailto:stable-request@linux.kernel.org?subject=unsubscribe>
+List-Archive: <http://linux.kernel.org/mailman/private/stable/>
+List-Post: <mailto:stable@linux.kernel.org>
+List-Help: <mailto:stable-request@linux.kernel.org?subject=help>
+List-Subscribe: <http://linux.kernel.org/mailman/listinfo/stable>,
+	<mailto:stable-request@linux.kernel.org?subject=subscribe>
+MIME-Version: 1.0
+Content-Type: text/plain; charset="us-ascii"
+Content-Transfer-Encoding: 7bit
+Sender: stable-bounces@linux.kernel.org
+Errors-To: stable-bounces@linux.kernel.org
+X-RedHat-Spam-Score: -2.31  (RCVD_IN_DNSWL_MED,T_RP_MATCHES_RCVD)
+X-Scanned-By: MIMEDefang 2.67 on 10.5.11.12
+X-Scanned-By: MIMEDefang 2.68 on 10.5.110.18
+
+From: Johannes Weiner <hannes@cmpxchg.org>
+
+Johannes Weiner pointed out that the logic in commit [1741c877: mm:
+kswapd: keep kswapd awake for high-order allocations until a percentage
+of the node is balanced] is backwards. Instead of allowing kswapd to go
+to sleep when balancing for high-order allocations, it keeps kswapd
+running uselessly.
+
+Signed-off-by: Mel Gorman <mgorman@suse.de>
+Reviewed-by: Rik van Riel <riel@redhat.com>
+---
+ mm/vmscan.c |    2 +-
+ 1 files changed, 1 insertions(+), 1 deletions(-)
+
+diff --git a/mm/vmscan.c b/mm/vmscan.c
+index 8bfd450..1aa262b 100644
+--- a/mm/vmscan.c
+++ b/mm/vmscan.c
+@@ -2286,7 +2286,7 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
+ 	 * must be balanced
+ 	 */
+ 	if (order)
+-		return pgdat_balanced(pgdat, balanced, classzone_idx);
++		return !pgdat_balanced(pgdat, balanced, classzone_idx);
+ 	else
+ 		return !all_zones_ok;
+ }
+-- 
+1.7.3.4
+
+_______________________________________________
+stable mailing list
+stable@linux.kernel.org
+http://linux.kernel.org/mailman/listinfo/stable
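
The one-line change in the patch above is easier to follow outside the diff. What follows is a minimal userspace sketch, not kernel code: struct node_state and both function names are hypothetical stand-ins for pgdat_balanced() and the all_zones_ok check, and the sketch assumes only the polarity described in the commit message (sleeping is premature while the node is still unbalanced).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-node state that kswapd inspects. */
struct node_state {
	bool balanced_for_order; /* what pgdat_balanced() would report */
	bool all_zones_ok;       /* result of the order-0 watermark check */
};

/* Before the fix: the high-order branch has the wrong polarity. */
static bool sleeping_prematurely_old(const struct node_state *n, int order)
{
	if (order)
		return n->balanced_for_order;
	return !n->all_zones_ok;
}

/* After the fix: sleeping is premature only while the node is unbalanced. */
static bool sleeping_prematurely_new(const struct node_state *n, int order)
{
	if (order)
		return !n->balanced_for_order;
	return !n->all_zones_ok;
}

/* Self-check for a balanced node that received a high-order wakeup. */
static void check_balanced_high_order(void)
{
	struct node_state n = { .balanced_for_order = true, .all_zones_ok = true };

	assert(sleeping_prematurely_old(&n, 3));  /* old code keeps kswapd awake */
	assert(!sleeping_prematurely_new(&n, 3)); /* fixed code lets it sleep */
}
```

With the old polarity, a node that is already balanced for the requested order reports that sleeping would be premature, so kswapd keeps running with nothing left to reclaim.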
--- a/mm-vmscan-correctly-check-if-reclaimer-should-schedule-during-shrink_slab.patch
+++ b/mm-vmscan-correctly-check-if-reclaimer-should-schedule-during-shrink_slab.patch
@@ -0,0 +1,153 @@
+Return-Path: stable-bounces@linux.kernel.org
+Received: from zmta01.collab.prod.int.phx2.redhat.com (LHLO
+ zmta01.collab.prod.int.phx2.redhat.com) (10.5.5.31) by
+ mail02.corp.redhat.com with LMTP; Mon, 23 May 2011 05:54:49 -0400 (EDT)
+Received: from localhost (localhost.localdomain [127.0.0.1])
+	by zmta01.collab.prod.int.phx2.redhat.com (Postfix) with ESMTP id 443AC9289D
+	for <cebbert@redhat.com>; Mon, 23 May 2011 05:54:49 -0400 (EDT)
+Received: from zmta01.collab.prod.int.phx2.redhat.com ([127.0.0.1])
+	by localhost (zmta01.collab.prod.int.phx2.redhat.com [127.0.0.1]) (amavisd-new, port 10024)
+	with ESMTP id WTG56s2uAm8Z for <cebbert@redhat.com>;
+	Mon, 23 May 2011 05:54:49 -0400 (EDT)
+Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11])
+	by zmta01.collab.prod.int.phx2.redhat.com (Postfix) with ESMTP id 2D1D2906E4
+	for <cebbert@mail.corp.redhat.com>; Mon, 23 May 2011 05:54:49 -0400 (EDT)
+Received: from mx1.redhat.com (ext-mx11.extmail.prod.ext.phx2.redhat.com [10.5.110.16])
+	by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id p4N9snmk002150
+	for <cebbert@redhat.com>; Mon, 23 May 2011 05:54:49 -0400
+Received: from hera.kernel.org (hera.kernel.org [140.211.167.34])
+	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p4N9smZs008302
+	for <cebbert@redhat.com>; Mon, 23 May 2011 05:54:48 -0400
+Received: from hera.kernel.org (localhost [127.0.0.1])
+	by hera.kernel.org (8.14.4/8.14.3) with ESMTP id p4N9sGZo010150;
+	Mon, 23 May 2011 09:54:16 GMT
+X-Virus-Status: Clean
+X-Virus-Scanned: clamav-milter 0.97 at hera.kernel.org
+Received: from mx2.suse.de (cantor2.suse.de [195.135.220.15])
+	by hera.kernel.org (8.14.4/8.14.3) with ESMTP id p4N9s1xm009737;
+	Mon, 23 May 2011 09:54:02 GMT
+X-Virus-Status: Clean
+X-Virus-Scanned: clamav-milter 0.97 at hera.kernel.org
+Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.221.2])
+	by mx2.suse.de (Postfix) with ESMTP id 2B4998FFEB;
+	Mon, 23 May 2011 11:54:01 +0200 (CEST)
+From: Mel Gorman <mgorman@suse.de>
+To: Andrew Morton <akpm@linux-foundation.org>
+Date: Mon, 23 May 2011 10:53:55 +0100
+Message-Id: <1306144435-2516-3-git-send-email-mgorman@suse.de>
+In-Reply-To: <1306144435-2516-1-git-send-email-mgorman@suse.de>
+References: <1306144435-2516-1-git-send-email-mgorman@suse.de>
+X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED
+	autolearn=unavailable version=3.3.2-r929478
+X-Spam-Checker-Version: SpamAssassin 3.3.2-r929478 (2010-03-31) on
+	hera.kernel.org
+X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.3 (hera.kernel.org [127.0.0.1]); Mon, 23 May 2011 09:54:16 +0000 (UTC)
+X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by
+	milter-greylist-4.2.3 (hera.kernel.org [140.211.167.34]);
+	Mon, 23 May 2011 09:54:03 +0000 (UTC)
+Cc: Pekka Enberg <penberg@kernel.org>, Rik van Riel <riel@redhat.com>,
+        Jan Kara <jack@suse.cz>, linux-kernel <linux-kernel@vger.kernel.org>,
+        James Bottomley <James.Bottomley@HansenPartnership.com>,
+        linux-mm <linux-mm@kvack.org>, Minchan Kim <minchan.kim@gmail.com>,
+        Raghavendra D Prabhu <raghu.prabhu13@gmail.com>,
+        Johannes Weiner <hannes@cmpxchg.org>,
+        linux-fsdevel <linux-fsdevel@vger.kernel.org>,
+        Colin King <colin.king@canonical.com>,
+        Christoph Lameter <cl@linux.com>,
+        linux-ext4 <linux-ext4@vger.kernel.org>, stable <stable@kernel.org>,
+        Chris Mason <chris.mason@oracle.com>, Mel Gorman <mgorman@suse.de>
+Subject: [stable] [PATCH 2/2] mm: vmscan: Correctly check if reclaimer
+	should schedule during shrink_slab
+X-BeenThere: stable@linux.kernel.org
+X-Mailman-Version: 2.1.12
+Precedence: list
+List-Id: For maintainers of the stable Linux series <stable.linux.kernel.org>
+List-Unsubscribe: <http://linux.kernel.org/mailman/options/stable>,
+	<mailto:stable-request@linux.kernel.org?subject=unsubscribe>
+List-Archive: <http://linux.kernel.org/mailman/private/stable/>
+List-Post: <mailto:stable@linux.kernel.org>
+List-Help: <mailto:stable-request@linux.kernel.org?subject=help>
+List-Subscribe: <http://linux.kernel.org/mailman/listinfo/stable>,
+	<mailto:stable-request@linux.kernel.org?subject=subscribe>
+MIME-Version: 1.0
+Content-Type: text/plain; charset="us-ascii"
+Content-Transfer-Encoding: 7bit
+Sender: stable-bounces@linux.kernel.org
+Errors-To: stable-bounces@linux.kernel.org
+X-RedHat-Spam-Score: -2.31  (RCVD_IN_DNSWL_MED,T_RP_MATCHES_RCVD)
+X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11
+X-Scanned-By: MIMEDefang 2.68 on 10.5.110.16
+
+It has been reported on some laptops that kswapd is consuming large
+amounts of CPU and not being scheduled when SLUB is enabled during
+large amounts of file copying. It is expected that this is due to
+kswapd missing every cond_resched() point because:
+
+shrink_page_list() calls cond_resched() if inactive pages were isolated
+        which in turn may not happen if all_unreclaimable is set in
+        shrink_zones(). If, for whatever reason, all_unreclaimable is
+        set on all zones, we can miss calling cond_resched().
+
+balance_pgdat() only calls cond_resched if the zones are not
+        balanced. For a high-order allocation that is balanced, it
+        checks order-0 again. During that window, order-0 might have
+        become unbalanced so it loops again for order-0 and returns
+        that it was reclaiming for order-0 to kswapd(). It can then
+        find that a caller has rewoken kswapd for a high-order and
+        re-enters balance_pgdat() without ever calling cond_resched().
+
+shrink_slab only calls cond_resched() if we are reclaiming slab
+	pages. If there are a large number of direct reclaimers, the
+	shrinker_rwsem can be contended and prevent kswapd calling
+	cond_resched().
+
+This patch modifies the shrink_slab() case. If the semaphore is
+contended, the caller will still check cond_resched(). After each
+successful call into a shrinker, the check for cond_resched() remains
+in case one shrinker is particularly slow.
+
+This patch replaces
+mm-vmscan-if-kswapd-has-been-running-too-long-allow-it-to-sleep.patch
+in -mm.
+
+[mgorman@suse.de: Preserve call to cond_resched after each call into shrinker]
+From: Minchan Kim <minchan.kim@gmail.com>
+Signed-off-by: Mel Gorman <mgorman@suse.de>
+---
+ mm/vmscan.c |    9 +++++++--
+ 1 files changed, 7 insertions(+), 2 deletions(-)
+
+diff --git a/mm/vmscan.c b/mm/vmscan.c
+index 1aa262b..cc1470b 100644
+--- a/mm/vmscan.c
+++ b/mm/vmscan.c
+@@ -230,8 +230,11 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
+ 	if (scanned == 0)
+ 		scanned = SWAP_CLUSTER_MAX;
+ 
+-	if (!down_read_trylock(&shrinker_rwsem))
+-		return 1;	/* Assume we'll be able to shrink next time */
++	if (!down_read_trylock(&shrinker_rwsem)) {
++		/* Assume we'll be able to shrink next time */
++		ret = 1;
++		goto out;
++	}
+ 
+ 	list_for_each_entry(shrinker, &shrinker_list, list) {
+ 		unsigned long long delta;
+@@ -282,6 +285,8 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
+ 		shrinker->nr += total_scan;
+ 	}
+ 	up_read(&shrinker_rwsem);
++out:
++	cond_resched();
+ 	return ret;
+ }
+ 
+-- 
+1.7.3.4
+
+_______________________________________________
+stable mailing list
+stable@linux.kernel.org
+http://linux.kernel.org/mailman/listinfo/stable
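
The control-flow change in shrink_slab() can be sketched in userspace C. This is an illustration under stated assumptions, not the kernel implementation: trylock_stub(), cond_resched_stub() and the resched_checks counter are hypothetical stand-ins for down_read_trylock(), cond_resched() and the scheduler, and the shrinker loop is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

static int resched_checks; /* counts calls to the cond_resched() stand-in */

static void cond_resched_stub(void)
{
	resched_checks++;
}

/* Hypothetical trylock: succeeds only when the lock is not contended. */
static bool trylock_stub(bool contended)
{
	return !contended;
}

/* Old shape: a contended lock returns early and skips every resched check. */
static unsigned long shrink_slab_old(bool contended, int nr_shrinkers)
{
	if (!trylock_stub(contended))
		return 1; /* assume we can shrink next time */

	for (int i = 0; i < nr_shrinkers; i++)
		cond_resched_stub(); /* after each call into a shrinker */
	return 0;
}

/* New shape: both paths funnel through one exit that checks for a resched. */
static unsigned long shrink_slab_new(bool contended, int nr_shrinkers)
{
	unsigned long ret = 0;

	if (!trylock_stub(contended)) {
		ret = 1; /* assume we can shrink next time */
		goto out;
	}

	for (int i = 0; i < nr_shrinkers; i++)
		cond_resched_stub(); /* kept in case one shrinker is slow */
out:
	cond_resched_stub();
	return ret;
}
```

The single exit label is the design point: however contended the semaphore is, a caller looping over shrink_slab() reaches a reschedule point on every call.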