mm: Do not stall in synchronous compaction for THP allocations
parent 917075f21e
commit f0fb214f6a
@@ -761,6 +761,7 @@ Patch21001: arm-smsc-support-reading-mac-address-from-device-tree.patch
#rhbz #735946
Patch21020: 0001-mm-vmscan-Limit-direct-reclaim-for-higher-order-allo.patch
Patch21021: 0002-mm-Abort-reclaim-compaction-if-compaction-can-procee.patch
Patch21022: mm-do-not-stall-in-synchronous-compaction-for-THP-allocations.patch

#rhbz 748691
Patch21030: be2net-non-member-vlan-pkts-not-received-in-promisco.patch
@@ -1422,6 +1423,7 @@ ApplyPatch utrace.patch
#rhbz #735946
ApplyPatch 0001-mm-vmscan-Limit-direct-reclaim-for-higher-order-allo.patch
ApplyPatch 0002-mm-Abort-reclaim-compaction-if-compaction-can-procee.patch
ApplyPatch mm-do-not-stall-in-synchronous-compaction-for-THP-allocations.patch

#rhbz 748691
ApplyPatch be2net-non-member-vlan-pkts-not-received-in-promisco.patch
@@ -2145,6 +2147,9 @@ fi
# and build.

%changelog
* Tue Nov 15 2011 Dave Jones <davej@redhat.com>
- mm: Do not stall in synchronous compaction for THP allocations

* Tue Nov 15 2011 Dave Jones <davej@redhat.com>
- Backport asus-laptop changes from 3.2 (rhbz 754214)

@@ -0,0 +1,115 @@
https://lkml.org/lkml/2011/11/10/173

Date Thu, 10 Nov 2011 10:06:16 +0000
From Mel Gorman <>
Subject [PATCH] mm: Do not stall in synchronous compaction for THP allocations

Occasionally during large file copies to slow storage, there are still
reports of user-visible stalls when THP is enabled. Reports on this
have been intermittent and not reliable to reproduce locally, but:

Andy Isaacson reported a problem copying to VFAT on SD Card
    https://lkml.org/lkml/2011/11/7/2

    In this case, it was stuck in munmap for between 20 and 60
    seconds in compaction. It is also possible that khugepaged
    was holding mmap_sem on this process if CONFIG_NUMA was set.

Johannes Weiner reported stalls on USB
    https://lkml.org/lkml/2011/7/25/378

    In this case, there is no stack trace but it looks like the
    same problem. The USB stick may have been using NTFS as a
    filesystem based on other work done related to writing back
    to USB around the same time.

Internally in SUSE, I received a bug report related to stalls in firefox
    when using Java and Flash heavily while copying from NFS
    to VFAT on USB. It has not been confirmed to be the same problem
    but if it looks like a duck and quacks like a duck.....

In the past, commit [11bc82d6: mm: compaction: Use async migration for
__GFP_NO_KSWAPD and enforce no writeback] ensured that sync compaction
would never be used for THP allocations. This was reverted in commit
[c6a140bf: mm/compaction: reverse the change that forbade sync
migraton with __GFP_NO_KSWAPD] on the grounds that it was uncertain
it was beneficial.

While user-visible stalls do not happen for me when writing to USB,
I set up a test running postmark while short-lived processes created
anonymous mappings. The objective was to exercise the paths that
allocate transparent huge pages. I then logged when processes were
stalled for more than 1 second, recorded a stack trace and did some
analysis to aggregate unique "stall events", which revealed:

Time stalled in this event:    47369 ms
Event count:                      20
usemem               sleep_on_page          3690 ms
usemem               sleep_on_page          2148 ms
usemem               sleep_on_page          1534 ms
usemem               sleep_on_page          1518 ms
usemem               sleep_on_page          1225 ms
usemem               sleep_on_page          2205 ms
usemem               sleep_on_page          2399 ms
usemem               sleep_on_page          2398 ms
usemem               sleep_on_page          3760 ms
usemem               sleep_on_page          1861 ms
usemem               sleep_on_page          2948 ms
usemem               sleep_on_page          1515 ms
usemem               sleep_on_page          1386 ms
usemem               sleep_on_page          1882 ms
usemem               sleep_on_page          1850 ms
usemem               sleep_on_page          3715 ms
usemem               sleep_on_page          3716 ms
usemem               sleep_on_page          4846 ms
usemem               sleep_on_page          1306 ms
usemem               sleep_on_page          1467 ms
[<ffffffff810ef30c>] wait_on_page_bit+0x6c/0x80
[<ffffffff8113de9f>] unmap_and_move+0x1bf/0x360
[<ffffffff8113e0e2>] migrate_pages+0xa2/0x1b0
[<ffffffff81134273>] compact_zone+0x1f3/0x2f0
[<ffffffff811345d8>] compact_zone_order+0xa8/0xf0
[<ffffffff811346ff>] try_to_compact_pages+0xdf/0x110
[<ffffffff810f773a>] __alloc_pages_direct_compact+0xda/0x1a0
[<ffffffff810f7d5d>] __alloc_pages_slowpath+0x55d/0x7a0
[<ffffffff810f8151>] __alloc_pages_nodemask+0x1b1/0x1c0
[<ffffffff811331db>] alloc_pages_vma+0x9b/0x160
[<ffffffff81142bb0>] do_huge_pmd_anonymous_page+0x160/0x270
[<ffffffff814410a7>] do_page_fault+0x207/0x4c0
[<ffffffff8143dde5>] page_fault+0x25/0x30

The stall times are approximate at best but the estimates represent 25%
of the worst stalls and even if the estimates are off by a factor of
10, it's severe.

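The 20 per-stall figures listed for the event above add up to the reported
47369 ms. A minimal C sketch of that aggregation follows; stall_ms and
total_stall_ms are names invented here for illustration and are not part of
the patch or of whatever scripts produced the original analysis.

```c
#include <assert.h>
#include <stddef.h>

/* Per-stall durations in milliseconds, copied from the event listing. */
static const unsigned int stall_ms[] = {
	3690, 2148, 1534, 1518, 1225, 2205, 2399, 2398, 3760, 1861,
	2948, 1515, 1386, 1882, 1850, 3715, 3716, 4846, 1306, 1467,
};

/* Number of individual stalls recorded for this event. */
#define STALL_COUNT (sizeof(stall_ms) / sizeof(stall_ms[0]))

/* Sum the durations of all stalls that share one stack trace. */
static unsigned long total_stall_ms(const unsigned int *ms, size_t n)
{
	unsigned long total = 0;

	assert(ms != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += ms[i];
	return total;
}
```

Calling total_stall_ms(stall_ms, STALL_COUNT) reproduces the "Time stalled
in this event" line, and STALL_COUNT matches the "Event count" line.
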
This patch once again prevents sync migration for transparent
hugepage allocations, as it is preferable to fail a THP allocation
than to stall. It was suggested that __GFP_NORETRY be used instead of
__GFP_NO_KSWAPD. This would look less like a special case but would
still cause compaction to run at least once with sync compaction.

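The effect of the one-line change can be illustrated in user space. The
sketch below is a model under stated assumptions, not kernel code:
DEMO_GFP_NO_KSWAPD and DEMO_GFP_OTHER are stand-in flag values chosen for
this example, and allow_sync_migration is a hypothetical helper that mirrors
the expression !(gfp_mask & __GFP_NO_KSWAPD) used by the patch.

```c
#include <stdbool.h>

/* Stand-in flag bits for illustration only; not the kernel's definitions. */
#define DEMO_GFP_OTHER      0x10u
#define DEMO_GFP_NO_KSWAPD  0x400000u

/*
 * Decide whether the second compaction attempt may use sync migration.
 * Before the patch the answer was unconditionally true; with the patch,
 * allocations that carry the NO_KSWAPD flag (THP allocations in this
 * context) stay on async migration and may fail instead of stalling.
 */
static bool allow_sync_migration(unsigned int gfp_mask)
{
	return !(gfp_mask & DEMO_GFP_NO_KSWAPD);
}
```

With these assumed values, a mask containing DEMO_GFP_NO_KSWAPD yields false
regardless of other bits, and any mask without it yields true.
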
If accepted, this is a -stable candidate.

Reported-by: Andy Isaacson <adi@hexapodia.org>
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9dd443d..84bf962 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2168,7 +2168,13 @@ rebalance:
 					sync_migration);
 	if (page)
 		goto got_pg;
-	sync_migration = true;
+
+	/*
+	 * Do not use sync migration for transparent hugepage allocations as
+	 * it could stall writing back pages which is far worse than simply
+	 * failing to promote a page.
+	 */
+	sync_migration = !(gfp_mask & __GFP_NO_KSWAPD);
 
 	/* Try direct reclaim and then allocating */
 	page = __alloc_pages_direct_reclaim(gfp_mask, order,