a79311c14e
When the VM is under pressure, it can happen that several direct reclaim
processes are in the pageout code simultaneously. It also happens that
the reclaiming processes run into mostly referenced, mapped and dirty
pages in the first round.
This results in multiple direct reclaim processes having a lower
pageout priority, which corresponds to a higher target of pages to
scan.
This in turn can result in each direct reclaim process freeing
many pages. Together, they can end up freeing way too many pages.
This kicks useful data out of memory (in some cases more than half
of all memory is swapped out). It also impacts performance by
keeping tasks stuck in the pageout code for too long.
A 30% improvement in hackbench has been observed with this patch.
The fix is relatively simple: in shrink_zone() we can check how many
pages we have already freed, direct reclaim tasks break out of the
scanning loop if they have already freed enough pages and have reached
a lower priority level.
We do not break out of shrink_zone() when priority == DEF_PRIORITY,
to ensure that equal pressure is applied to every zone in the common
case.
However, in order to do this we do need to know how many pages we already
freed, so move nr_reclaimed into scan_control.
akpm: a historical interlude...
We tried this in 2004:
:commit e468e46a9bea3297011d5918663ce6d19094cf87
:Author: akpm <akpm>
:Date: Thu Jun 24 15:53:52 2004 +0000
:
:[PATCH] vmscan.c: dont reclaim too many pages
:
: The shrink_zone() logic can, under some circumstances, cause far too many
: pages to be reclaimed. Say, we're scanning at high priority and suddenly hit
: a large number of reclaimable pages on the LRU.
: Change things so we bale out when SWAP_CLUSTER_MAX pages have been reclaimed.
And we reverted it in 2006:
:commit
|
||
---|---|---|
.. | ||
allocpercpu.c | ||
backing-dev.c | ||
bootmem.c | ||
bounce.c | ||
dmapool.c | ||
fadvise.c | ||
failslab.c | ||
filemap_xip.c | ||
filemap.c | ||
fremap.c | ||
highmem.c | ||
hugetlb.c | ||
internal.h | ||
Kconfig | ||
maccess.c | ||
madvise.c | ||
Makefile | ||
memcontrol.c | ||
memory_hotplug.c | ||
memory.c | ||
mempolicy.c | ||
mempool.c | ||
migrate.c | ||
mincore.c | ||
mlock.c | ||
mm_init.c | ||
mmap.c | ||
mmu_notifier.c | ||
mmzone.c | ||
mprotect.c | ||
mremap.c | ||
msync.c | ||
nommu.c | ||
oom_kill.c | ||
page_alloc.c | ||
page_cgroup.c | ||
page_io.c | ||
page_isolation.c | ||
page-writeback.c | ||
pagewalk.c | ||
pdflush.c | ||
prio_tree.c | ||
quicklist.c | ||
readahead.c | ||
rmap.c | ||
shmem_acl.c | ||
shmem.c | ||
slab.c | ||
slob.c | ||
slub.c | ||
sparse-vmemmap.c | ||
sparse.c | ||
swap_state.c | ||
swap.c | ||
swapfile.c | ||
thrash.c | ||
tiny-shmem.c | ||
truncate.c | ||
util.c | ||
vmalloc.c | ||
vmscan.c | ||
vmstat.c |