kernel-ark/mm
KAMEZAWA Hiroyuki ae41be3742 bugfix for memory cgroup controller: migration under memory controller fix
While using memory control cgroup, page-migration under it works as following.
==
 1. uncharge all refs at try to unmap.
 2. charge regs again remove_migration_ptes()
==
This is simple but has following problems.
==
 The page is uncharged and charged back again if *mapped*.
    - This means that cgroup before migration can be different from one after
      migration
    - If page is not mapped but charged as page cache, charge is just ignored
      (because not mapped, it will not be uncharged before migration)
      This is memory leak.
==
This patch tries to keep memory cgroup at page migration by increasing
one refcnt during it. 3 functions are added.

 mem_cgroup_prepare_migration() --- increase refcnt of page->page_cgroup
 mem_cgroup_end_migration()     --- decrease refcnt of page->page_cgroup
 mem_cgroup_page_migration() --- copy page->page_cgroup from old page to
                                 new page.

During migration
  - old page is under PG_locked.
  - new page is under PG_locked, too.
  - both old page and new page is not on LRU.

These 3 facts guarantee that page_cgroup() migration has no race.

Tested and worked well in x86_64/fake-NUMA box.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Cc: Paul Menage <menage@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: David Rientjes <rientjes@google.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:19 -08:00
..
allocpercpu.c PERCPU : __percpu_alloc_mask() can dynamically size percpu_data storage 2008-02-06 10:41:04 -08:00
backing-dev.c
bootmem.c
bounce.c
dmapool.c
fadvise.c check ADVICE of fadvise64_64 even if get_xip_page is given 2008-02-05 09:44:19 -08:00
filemap_xip.c Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user 2008-02-05 09:44:13 -08:00
filemap.c mem-controller gfp-mask fix 2008-02-07 08:42:19 -08:00
fremap.c sys_remap_file_pages: fix ->vm_file accounting 2008-02-05 09:44:07 -08:00
highmem.c mm: remove fastcall from mm/ 2008-02-05 09:44:18 -08:00
hugetlb.c mm: fix PageUptodate data race 2008-02-05 09:44:19 -08:00
internal.h set_page_refcounted() VM_BUG_ON fix 2008-02-05 09:44:19 -08:00
Kconfig sh: Bump number of quicklists for SH-5. 2008-01-28 13:18:55 +09:00
madvise.c
Makefile Memory controller: cgroups setup 2008-02-07 08:42:18 -08:00
memcontrol.c bugfix for memory cgroup controller: migration under memory controller fix 2008-02-07 08:42:19 -08:00
memory_hotplug.c Page allocator: clean up pcp draining functions 2008-02-05 09:44:17 -08:00
memory.c Memory controller: make charging gfp mask aware 2008-02-07 08:42:19 -08:00
mempolicy.c
mempool.c
migrate.c bugfix for memory cgroup controller: migration under memory controller fix 2008-02-07 08:42:19 -08:00
mincore.c
mlock.c
mmap.c brk: check the lower bound properly 2008-02-06 22:39:44 +01:00
mmzone.c
mprotect.c
mremap.c
msync.c
nommu.c nommu: add new vmalloc_user() and remap_vmalloc_range() interfaces. 2008-02-05 09:44:21 -08:00
oom_kill.c oom: add sysctl to enable task memory dump 2008-02-07 08:42:19 -08:00
page_alloc.c Memory controller: memory accounting 2008-02-07 08:42:18 -08:00
page_io.c mm: fix PageUptodate data race 2008-02-05 09:44:19 -08:00
page_isolation.c
page-writeback.c writeback: speed up writeback of big dirty files 2008-02-05 09:44:19 -08:00
pagewalk.c maps4: introduce a generic page walker 2008-02-05 09:44:16 -08:00
pdflush.c
prio_tree.c
quicklist.c
readahead.c
rmap.c Memory controller: make page_referenced() cgroup aware 2008-02-07 08:42:19 -08:00
shmem_acl.c
shmem.c VFS/Security: Rework inode_getsecurity and callers to return resulting buffer 2008-02-05 09:44:20 -08:00
slab.c cpu-hotplug: replace per-subsystem mutexes with get_online_cpus() 2008-01-25 21:08:02 +01:00
slob.c slob: reduce external fragmentation by using three free lists 2008-02-05 09:44:19 -08:00
slub.c SLUB: Do not upset lockdep 2008-02-04 10:56:02 -08:00
sparse-vmemmap.c
sparse.c mm: fix section mismatch warning in sparse.c 2008-02-05 09:44:19 -08:00
swap_state.c memory controller BUG_ON() 2008-02-07 08:42:19 -08:00
swap.c Memory controller: add per cgroup LRU and reclaim 2008-02-07 08:42:18 -08:00
swapfile.c memcgroup: reinstate swapoff mod 2008-02-07 08:42:19 -08:00
thrash.c
tiny-shmem.c Remove unused code from mm/tiny-shmem.c 2008-02-05 09:44:17 -08:00
truncate.c page migraton: handle orphaned pages 2008-02-05 09:44:19 -08:00
util.c
vmalloc.c mm: don't allow ioremapping of ranges larger than vmalloc space 2008-02-05 09:44:18 -08:00
vmscan.c kswapd should only wait on IO if there is IO 2008-02-07 08:42:19 -08:00
vmstat.c vmstat: remove prefetch 2008-02-05 09:44:18 -08:00