kernel-ark/mm
David Rientjes f33261d75b mm: fix deferred congestion timeout if preferred zone is not allowed
Before 0e093d9976 ("writeback: do not sleep on the congestion queue if
there are no congested BDIs or if significant congestion is not being
encountered in the current zone"), preferred_zone was only used for NUMA
statistics, to determine the zoneidx from which to allocate from given
the type requested, and whether to utilize memory compaction.

wait_iff_congested(), though, uses preferred_zone to determine if the
congestion wait should be deferred because its dirty pages are backed by
a congested bdi.  This incorrectly defers the timeout and busy loops in
the page allocator with various cond_resched() calls if preferred_zone
is not allowed in the current context, usually consuming 100% of a cpu.

This patch ensures preferred_zone is an allowed zone in the fastpath
depending on whether current is constrained by its cpuset or nodes in
its mempolicy (when the nodemask passed is non-NULL).  This is correct
since the fastpath allocation always passes ALLOC_CPUSET when trying to
allocate memory.  In the slowpath, this patch resets preferred_zone to
the first zone of the allowed type when the allocation is not
constrained by current's cpuset, i.e.  it does not pass ALLOC_CPUSET.

This patch also ensures preferred_zone is from the set of allowed nodes
when called from within direct reclaim since allocations are always
constrained by cpusets in this context (it is blockable).

Both of these uses of cpuset_current_mems_allowed are protected by
get_mems_allowed().

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-26 10:50:00 +10:00
..
backing-dev.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-10-26 17:58:44 -07:00
bootmem.c
bounce.c
compaction.c mm: compaction: prevent division-by-zero during user-requested compaction 2011-01-20 17:02:05 -08:00
debug-pagealloc.c
dmapool.c mm/dmapool.c: use TASK_UNINTERRUPTIBLE in dma_pool_alloc() 2011-01-13 17:32:48 -08:00
fadvise.c
failslab.c
filemap_xip.c
filemap.c mm: remove likely() from grab_cache_page_write_begin() 2011-01-13 17:32:36 -08:00
fremap.c
highmem.c mm,x86: fix kmap_atomic_push vs ioremap_32.c 2010-10-27 18:03:05 -07:00
huge_memory.c memcg: fix USED bit handling at uncharge in THP 2011-01-20 17:02:06 -08:00
hugetlb.c hugetlb: fix handling of parse errors in sysfs 2011-01-13 17:32:49 -08:00
hwpoison-inject.c
init-mm.c
internal.h Revert "mm: batch activate_page() to reduce lock contention" 2011-01-17 14:42:19 -08:00
Kconfig thp: select CONFIG_COMPACTION if TRANSPARENT_HUGEPAGE enabled 2011-01-13 17:32:45 -08:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c
ksm.c ksm: drain pagevecs to lru 2011-01-13 17:32:49 -08:00
maccess.c MN10300: Save frame pointer in thread_info struct rather than global var 2010-10-27 17:29:01 +01:00
madvise.c thp: khugepaged: make khugepaged aware about madvise 2011-01-13 17:32:47 -08:00
Makefile thp: transparent hugepage core 2011-01-13 17:32:42 -08:00
memblock.c memblock: fix memblock_is_region_memory() 2011-01-20 17:02:05 -08:00
memcontrol.c memcg: correctly order reading PCG_USED and pc->mem_cgroup 2011-01-20 17:02:06 -08:00
memory_hotplug.c Merge branch 'slub/hotplug' into slab/urgent 2011-01-15 13:28:17 +02:00
memory-failure.c thp: compound_trans_order 2011-01-13 17:32:47 -08:00
memory.c thp: add debug checks for mapcount related invariants 2011-01-13 17:32:47 -08:00
mempolicy.c thp: add numa awareness to hugepage allocations 2011-01-13 17:32:45 -08:00
mempool.c
migrate.c memcg: fix memory migration of shmem swapcache 2011-01-13 17:32:51 -08:00
mincore.c thp: mincore transparent hugepage support 2011-01-13 17:32:44 -08:00
mlock.c mlock: do not hold mmap_sem for extended periods of time 2011-01-13 17:32:36 -08:00
mm_init.c
mmap.c brk: fix min_brk lower bound computation for COMPAT_BRK 2011-01-13 17:32:48 -08:00
mmu_context.c
mmu_notifier.c thp: mmu_notifier_test_young 2011-01-13 17:32:46 -08:00
mmzone.c mm: page allocator: adjust the per-cpu counter threshold when memory is low 2011-01-13 17:32:31 -08:00
mprotect.c thp: mprotect: transparent huge page support 2011-01-13 17:32:44 -08:00
mremap.c thp: split_huge_page_mm/vma 2011-01-13 17:32:41 -08:00
msync.c
nommu.c mlock: do not hold mmap_sem for extended periods of time 2011-01-13 17:32:36 -08:00
oom_kill.c oom: kill all threads sharing oom killed task's mm 2010-10-26 16:52:05 -07:00
page_alloc.c mm: fix deferred congestion timeout if preferred zone is not allowed 2011-01-26 10:50:00 +10:00
page_cgroup.c
page_io.c
page_isolation.c mm: page_isolation: codeclean fix comment and rm unneeded val init 2010-10-26 16:52:11 -07:00
page-writeback.c writeback: avoid unnecessary determine_dirtyable_memory call 2011-01-13 17:32:38 -08:00
pagewalk.c thp: split_huge_page_mm/vma 2011-01-13 17:32:41 -08:00
percpu-km.c
percpu-vm.c mm: remove gfp mask from pcpu_get_vm_areas 2011-01-13 17:32:34 -08:00
percpu.c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-01-13 10:05:56 -08:00
pgtable-generic.c mm/pgtable-generic.c: fix CONFIG_SWAP=n build 2011-01-26 10:49:58 +10:00
prio_tree.c
quicklist.c
readahead.c
rmap.c memcg: create extensible page stat update routines 2011-01-13 17:32:50 -08:00
shmem.c fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
slab.c mm/slab.c: make local symbols static 2011-01-15 13:28:36 +02:00
slob.c kernel: kmem_ptr_validate considered harmful 2011-01-07 17:50:16 +11:00
slub.c Merge branch 'slub/hotplug' into slab/urgent 2011-01-15 13:28:17 +02:00
sparse-vmemmap.c tree-wide: fix comment/printk typos 2010-11-01 15:38:34 -04:00
sparse.c thp: remove PG_buddy 2011-01-13 17:32:43 -08:00
swap_state.c thp: split_huge_page paging 2011-01-13 17:32:41 -08:00
swap.c Revert "mm: simplify code of swap.c" 2011-01-17 14:42:34 -08:00
swapfile.c thp: split_huge_page paging 2011-01-13 17:32:41 -08:00
thrash.c
truncate.c mm: fix truncate_setsize() comment 2011-01-20 17:02:06 -08:00
util.c kernel: kmem_ptr_validate considered harmful 2011-01-07 17:50:16 +11:00
vmalloc.c Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 2011-01-13 20:15:35 -08:00
vmscan.c mm: fix deferred congestion timeout if preferred zone is not allowed 2011-01-26 10:50:00 +10:00
vmstat.c thp: transparent hugepage vmstat 2011-01-13 17:32:43 -08:00