https://lkml.org/lkml/2011/11/10/173

Date Thu, 10 Nov 2011 10:06:16 +0000
From Mel Gorman <>
Subject [PATCH] mm: Do not stall in synchronous compaction for THP allocations

Occasionally during large file copies to slow storage, there are still
reports of user-visible stalls when THP is enabled. Reports on this
have been intermittent and not reliably reproducible locally, but:

  Andy Isaacson reported a problem copying to VFAT on SD Card
    https://lkml.org/lkml/2011/11/7/2

    In this case, it was stuck in munmap for between 20 and 60
    seconds in compaction. It is also possible that khugepaged
    was holding mmap_sem on this process if CONFIG_NUMA was set.

  Johannes Weiner reported stalls on USB
    https://lkml.org/lkml/2011/7/25/378

    In this case, there is no stack trace but it looks like the
    same problem. The USB stick may have been using NTFS as a
    filesystem based on other work done related to writing back
    to USB around the same time.

  Internally in SUSE, I received a bug report related to stalls in firefox
    when using Java and Flash heavily while copying from NFS
    to VFAT on USB. It has not been confirmed to be the same problem
    but if it looks like a duck and quacks like a duck.....

In the past, commit [11bc82d6: mm: compaction: Use async migration for
__GFP_NO_KSWAPD and enforce no writeback] ensured that sync compaction
would never be used for THP allocations. This was reverted in commit
[c6a140bf: mm/compaction: reverse the change that forbade sync
migraton with __GFP_NO_KSWAPD] on the grounds that it was uncertain
whether it was beneficial.

While user-visible stalls do not happen for me when writing to USB,
I set up a test running postmark while short-lived processes created
anonymous mappings. The objective was to exercise the paths that
allocate transparent huge pages. I then logged when processes were
stalled for more than 1 second, recorded a stack trace and did some
analysis to aggregate unique "stall events", which revealed:

Time stalled in this event: 47369 ms
Event count: 20
usemem sleep_on_page 3690 ms
usemem sleep_on_page 2148 ms
usemem sleep_on_page 1534 ms
usemem sleep_on_page 1518 ms
usemem sleep_on_page 1225 ms
usemem sleep_on_page 2205 ms
usemem sleep_on_page 2399 ms
usemem sleep_on_page 2398 ms
usemem sleep_on_page 3760 ms
usemem sleep_on_page 1861 ms
usemem sleep_on_page 2948 ms
usemem sleep_on_page 1515 ms
usemem sleep_on_page 1386 ms
usemem sleep_on_page 1882 ms
usemem sleep_on_page 1850 ms
usemem sleep_on_page 3715 ms
usemem sleep_on_page 3716 ms
usemem sleep_on_page 4846 ms
usemem sleep_on_page 1306 ms
usemem sleep_on_page 1467 ms
[<ffffffff810ef30c>] wait_on_page_bit+0x6c/0x80
[<ffffffff8113de9f>] unmap_and_move+0x1bf/0x360
[<ffffffff8113e0e2>] migrate_pages+0xa2/0x1b0
[<ffffffff81134273>] compact_zone+0x1f3/0x2f0
[<ffffffff811345d8>] compact_zone_order+0xa8/0xf0
[<ffffffff811346ff>] try_to_compact_pages+0xdf/0x110
[<ffffffff810f773a>] __alloc_pages_direct_compact+0xda/0x1a0
[<ffffffff810f7d5d>] __alloc_pages_slowpath+0x55d/0x7a0
[<ffffffff810f8151>] __alloc_pages_nodemask+0x1b1/0x1c0
[<ffffffff811331db>] alloc_pages_vma+0x9b/0x160
[<ffffffff81142bb0>] do_huge_pmd_anonymous_page+0x160/0x270
[<ffffffff814410a7>] do_page_fault+0x207/0x4c0
[<ffffffff8143dde5>] page_fault+0x25/0x30

The stall times are approximate at best but the estimates represent 25%
of the worst stalls and even if the estimates are off by a factor of
10, it's severe.

This patch once again prevents sync migration for transparent
hugepage allocations as it is preferable to fail a THP allocation
than to stall. It was suggested that __GFP_NORETRY be used instead of
__GFP_NO_KSWAPD. This would look less like a special case but would
still cause compaction to run at least once with sync compaction.

If accepted, this is a -stable candidate.

Reported-by: Andy Isaacson <adi@hexapodia.org>
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9dd443d..84bf962 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2168,7 +2168,13 @@ rebalance:
 					sync_migration);
 	if (page)
 		goto got_pg;
-	sync_migration = true;
+
+	/*
+	 * Do not use sync migration for transparent hugepage allocations as
+	 * it could stall writing back pages which is far worse than simply
+	 * failing to promote a page.
+	 */
+	sync_migration = !(gfp_mask & __GFP_NO_KSWAPD);

 	/* Try direct reclaim and then allocating */
 	page = __alloc_pages_direct_reclaim(gfp_mask, order,