kernel-ark/mm
Wu Fengguang f862963174 mm: do batched scans for mem_cgroup
For mem_cgroup, shrink_zone() may call shrink_list() with nr_to_scan=1, in
which case shrink_list() _still_ calls isolate_pages() with the much
larger SWAP_CLUSTER_MAX.  This effectively scales up the inactive list
scan rate by up to 32 times.

For example, with 16k inactive pages and DEF_PRIORITY=12, (16k >> 12)=4.
So when shrink_zone() expects to scan 4 pages in each of the active and
inactive lists, 4 pages of the active list are scanned, while the inactive
list is in effect (over)scanned by SWAP_CLUSTER_MAX=32 pages.  That could
break the balance between the two lists.
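
The arithmetic above can be checked with a small user-space sketch.  It is
not kernel code: the helper names requested_scan() and
effective_inactive_scan() are hypothetical, and the round-up-to-a-full-batch
model of the inactive list scan is an assumption drawn from the description
above, not from the patch itself.

```c
#include <assert.h>

#define SWAP_CLUSTER_MAX 32UL	/* batch size quoted in the text above */

/* Pages shrink_zone() expects to scan at a given priority: size >> priority. */
static inline unsigned long requested_scan(unsigned long list_size, int priority)
{
	/* Guard against an undefined shift count. */
	assert(priority >= 0 && priority < (int)(sizeof(unsigned long) * 8));
	return list_size >> priority;
}

/*
 * Assumed model of the unbatched behaviour: any non-zero request on the
 * inactive list ends up scanning whole SWAP_CLUSTER_MAX batches.
 */
static inline unsigned long effective_inactive_scan(unsigned long requested)
{
	if (requested == 0)
		return 0;
	return (requested + SWAP_CLUSTER_MAX - 1) / SWAP_CLUSTER_MAX
		* SWAP_CLUSTER_MAX;
}
```

Under this model, requested_scan(16384, 12) gives 4 and
effective_inactive_scan(4) gives 32, an 8x over-scan; a request of 1 page
gives the worst case of 32x.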

This can further affect scanning of the anon active list, due to the anon
active/inactive ratio rebalance logic in balance_pgdat()/shrink_zone():

inactive anon list over scanned => inactive_anon_is_low() == TRUE
                                => shrink_active_list()
                                => active anon list over scanned

So the end result may be:

- anon inactive  => over scanned
- anon active    => over scanned (maybe not as much)
- file inactive  => over scanned
- file active    => under scanned (relatively)

The accesses to nr_saved_scan are not lock protected and so are not 100%
accurate; however, we can tolerate small errors and the resulting slightly
imbalanced scan rates between zones.
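
The batching idea behind nr_saved_scan can be sketched in user space as
follows.  This is an illustrative sketch, not the code from the patch: the
function name try_batch() and its signature are hypothetical, and it assumes
that small scan requests are accumulated in a saved counter and released only
once a full batch is available.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Accumulate nr_to_scan into *nr_saved_scan.  Return the accumulated count
 * once it reaches `batch` (resetting the counter), otherwise return 0 so
 * the caller skips scanning for now.
 *
 * As in the text above, the counter is updated without locking here; a
 * concurrent caller could lose an update, which the text treats as a
 * tolerable inaccuracy.
 */
static inline unsigned long try_batch(unsigned long nr_to_scan,
				      unsigned long *nr_saved_scan,
				      unsigned long batch)
{
	unsigned long nr;

	assert(nr_saved_scan != NULL);

	*nr_saved_scan += nr_to_scan;
	nr = *nr_saved_scan;

	if (nr >= batch)
		*nr_saved_scan = 0;	/* full batch: hand it all out */
	else
		nr = 0;			/* not enough yet: keep saving */

	return nr;
}
```

With 4-page requests and a batch of 32, the first seven calls return 0 and
the eighth returns 32, so the list is scanned at the requested average rate
instead of 32 pages per call.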

Cc: Rik van Riel <riel@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-22 07:17:39 -07:00
allocpercpu.c percpu: use dynamic percpu allocator as the default percpu allocator 2009-06-24 15:13:35 +09:00
backing-dev.c writeback: splice dirty inode entries to default bdi on bdi_destroy() 2009-09-16 15:18:52 +02:00
bootmem.c kmemleak: Do not report alloc_bootmem blocks as leaks 2009-08-27 14:29:17 +01:00
bounce.c
debug-pagealloc.c
dmapool.c dmapools: protect page_list walk in show_pools() 2009-06-30 18:56:00 -07:00
fadvise.c
failslab.c
filemap_xip.c
filemap.c mm: oom analysis: add shmem vmstat 2009-09-22 07:17:27 -07:00
fremap.c
highmem.c
hugetlb.c hugetlb: restore interleaving of bootmem huge pages 2009-09-22 07:17:26 -07:00
init-mm.c
internal.h vmscan: do not unconditionally treat zones that fail zone_reclaim() as full 2009-06-16 19:47:45 -07:00
Kconfig ksm: add some documentation 2009-09-22 07:17:33 -07:00
Kconfig.debug kmemcheck: enable in the x86 Kconfig 2009-06-15 15:49:15 +02:00
kmemcheck.c
kmemleak-test.c percpu: clean up percpu variable definitions 2009-06-24 15:13:48 +09:00
kmemleak.c kmemleak: Improve the "Early log buffer exceeded" error message 2009-09-11 10:42:09 +01:00
ksm.c ksm: unmerge is an origin of OOMs 2009-09-22 07:17:33 -07:00
maccess.c
madvise.c ksm: the mm interface to ksm 2009-09-22 07:17:31 -07:00
Makefile ksm: the mm interface to ksm 2009-09-22 07:17:31 -07:00
memcontrol.c mm: drop unneeded double negations 2009-09-22 07:17:35 -07:00
memory_hotplug.c memory hotplug: fix updating of num_physpages for hot plugged memory 2009-09-22 07:17:38 -07:00
memory.c mm: drop unneeded double negations 2009-09-22 07:17:35 -07:00
mempolicy.c mm: make set_mempolicy(MPOL_INTERLEAV) N_HIGH_MEMORY aware 2009-08-07 10:39:55 -07:00
mempool.c mm: remove broken 'kzalloc' mempool 2009-09-22 07:17:35 -07:00
migrate.c mm: return boolean from page_has_private() 2009-09-22 07:17:38 -07:00
mincore.c
mlock.c
mm_init.c
mmap.c ksm: clean up obsolete references 2009-09-22 07:17:33 -07:00
mmu_notifier.c ksm: add mmu_notifier set_pte_at_notify() 2009-09-22 07:17:31 -07:00
mmzone.c
mprotect.c perf: Do the big rename: Performance Counters -> Performance Events 2009-09-21 14:28:04 +02:00
mremap.c ksm: mremap use err from ksm_madvise 2009-09-22 07:17:33 -07:00
msync.c
nommu.c mm: includecheck fix for mm/nommu.c 2009-09-22 07:17:35 -07:00
oom_kill.c ksm: unmerge is an origin of OOMs 2009-09-22 07:17:33 -07:00
page_alloc.c mm: do batched scans for mem_cgroup 2009-09-22 07:17:39 -07:00
page_cgroup.c memory hotplug: alloc page from other node in memory online 2009-09-22 07:17:26 -07:00
page_io.c
page_isolation.c
page-writeback.c mm: count only reclaimable lru pages 2009-09-22 07:17:30 -07:00
pagewalk.c
percpu.c Merge branch 'for-next' into for-linus 2009-09-15 09:57:19 +09:00
prio_tree.c
quicklist.c percpu: cleanup percpu array definitions 2009-06-24 15:13:45 +09:00
readahead.c
rmap.c ksm: no debug in page_dup_rmap() 2009-09-22 07:17:31 -07:00
shmem_acl.c shmfs: use 'check_acl' instead of 'permission' 2009-09-08 11:08:46 -07:00
shmem.c mm: includecheck fix for mm/shmem.c 2009-09-22 07:17:35 -07:00
slab.c mm: replace various uses of num_physpages by totalram_pages 2009-09-22 07:17:38 -07:00
slob.c slab: remove duplicate kmem_cache_init_late() declarations 2009-08-06 11:36:25 +03:00
slub.c mm: kmem_cache_create(): make it easier to catch NULL cache names 2009-09-22 07:17:33 -07:00
sparse-vmemmap.c memory hotplug: alloc page from other node in memory online 2009-09-22 07:17:26 -07:00
sparse.c memory hotplug: alloc page from other node in memory online 2009-09-22 07:17:26 -07:00
swap_state.c mm: add_to_swap_cache() does not return -EEXIST 2009-09-22 07:17:35 -07:00
swap.c mm: replace various uses of num_physpages by totalram_pages 2009-09-22 07:17:38 -07:00
swapfile.c ksm: unmerge is an origin of OOMs 2009-09-22 07:17:33 -07:00
thrash.c mm: pass mm to grab_swap_token 2009-06-23 12:50:05 -07:00
truncate.c
util.c Merge branches 'slab/documentation', 'slab/fixes', 'slob/cleanups' and 'slub/fixes' into for-linus 2009-06-17 08:30:15 +03:00
vmalloc.c mm: replace various uses of num_physpages by totalram_pages 2009-09-22 07:17:38 -07:00
vmscan.c mm: do batched scans for mem_cgroup 2009-09-22 07:17:39 -07:00
vmstat.c mm: vmstat: add isolate pages 2009-09-22 07:17:29 -07:00