kernel-ark/mm
Paul Jackson 02a0e53d82 [PATCH] cpuset: rework cpuset_zone_allowed api
Elaborate the API for calling cpuset_zone_allowed(), so that users have to
explicitly choose between the two variants:

  cpuset_zone_allowed_hardwall()
  cpuset_zone_allowed_softwall()

Until now, whether or not you got the hardwall flavor depended solely on
whether or not you or'd in the __GFP_HARDWALL gfp flag to the gfp_mask
argument.

If you didn't specify __GFP_HARDWALL, you implicitly got the softwall
version.

Unfortunately, this meant that users would end up with the softwall version
without thinking about it.  Since only the softwall version might sleep,
this led to bugs with possible sleeping in interrupt context on more than
one occassion.

The hardwall version requires that the current tasks mems_allowed allows
the node of the specified zone (or that you're in interrupt or that
__GFP_THISNODE is set or that you're on a one cpuset system.)

The softwall version, depending on the gfp_mask, might allow a node if it
was allowed in the nearest enclusing cpuset marked mem_exclusive (which
requires taking the cpuset lock 'callback_mutex' to evaluate.)

This patch removes the cpuset_zone_allowed() call, and forces the caller to
explicitly choose between the hardwall and the softwall case.

If the caller wants the gfp_mask to determine this choice, they should (1)
be sure they can sleep or that __GFP_HARDWALL is set, and (2) invoke the
cpuset_zone_allowed_softwall() routine.

This adds another 100 or 200 bytes to the kernel text space, due to the few
lines of nearly duplicate code at the top of both cpuset_zone_allowed_*
routines.  It should save a few instructions executed for the calls that
turned into calls of cpuset_zone_allowed_hardwall, thanks to not having to
set (before the call) then check (within the call) the __GFP_HARDWALL flag.

For the most critical call, from get_page_from_freelist(), the same
instructions are executed as before -- the old cpuset_zone_allowed()
routine it used to call is the same code as the
cpuset_zone_allowed_softwall() routine that it calls now.

Not a perfect win, but seems worth it, to reduce this chance of hitting a
sleeping with irq off complaint again.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:49 -08:00
..
allocpercpu.c [PATCH] Allow NULL pointers in percpu_free 2006-12-07 08:39:22 -08:00
backing-dev.c [PATCH] separate bdi congestion functions from queue congestion functions 2006-10-20 10:26:35 -07:00
bootmem.c [PATCH] remove EXPORT_UNUSED_SYMBOL'ed symbols 2006-12-07 08:39:44 -08:00
bounce.c [PATCH] BLOCK: Separate the bounce buffering code from the highmem code [try #6] 2006-09-30 20:32:11 +02:00
fadvise.c [PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
filemap_xip.c [PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
filemap.c [PATCH] dio: only call aio_complete() after returning -EIOCBQUEUED 2006-12-10 09:57:21 -08:00
filemap.h Remove all inclusions of <linux/config.h> 2006-10-04 03:38:54 -04:00
fremap.c [PATCH] kill install_file_pte's pte_val 2006-12-07 08:39:23 -08:00
highmem.c
hugetlb.c [PATCH] cpuset: rework cpuset_zone_allowed api 2006-12-13 09:05:49 -08:00
internal.h
Kconfig
madvise.c
Makefile [PATCH] separate bdi congestion functions from queue congestion functions 2006-10-20 10:26:35 -07:00
memory_hotplug.c [PATCH] Get rid of zone_table[] 2006-12-07 08:39:20 -08:00
memory.c [PATCH] read_zero_pagealigned() locking fix 2006-12-10 09:55:39 -08:00
mempolicy.c [PATCH] struct path: convert mm 2006-12-08 08:28:47 -08:00
mempool.c
migrate.c [PATCH] radix-tree: RCU lockless readside 2006-12-07 08:39:25 -08:00
mincore.c
mlock.c [PATCH] mlock cleanup 2006-12-07 08:39:22 -08:00
mmap.c [PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
mmzone.c [PATCH] remove EXPORT_UNUSED_SYMBOL'ed symbols 2006-12-07 08:39:44 -08:00
mprotect.c
mremap.c
msync.c [PATCH] mm: msync() cleanup 2006-09-26 08:48:45 -07:00
nommu.c [PATCH] struct path: convert mm 2006-12-08 08:28:47 -08:00
oom_kill.c [PATCH] cpuset: rework cpuset_zone_allowed api 2006-12-13 09:05:49 -08:00
page_alloc.c [PATCH] cpuset: rework cpuset_zone_allowed api 2006-12-13 09:05:49 -08:00
page_io.c [PATCH] swsusp: use block device offsets to identify swap locations 2006-12-07 08:39:27 -08:00
page-writeback.c [PATCH] io-accounting: write accounting 2006-12-10 09:55:41 -08:00
pdflush.c [PATCH] Add include/linux/freezer.h and move definitions from sched.h 2006-12-07 08:39:27 -08:00
prio_tree.c
readahead.c [PATCH] io-accounting-read-accounting nfs fix 2006-12-10 09:55:41 -08:00
rmap.c [PATCH] mm: more commenting on lock ordering 2006-10-20 10:26:44 -07:00
shmem_acl.c
shmem.c [PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
slab.c [PATCH] cpuset: rework cpuset_zone_allowed api 2006-12-13 09:05:49 -08:00
slob.c [PATCH] More slab.h cleanups 2006-12-13 09:05:49 -08:00
sparse.c [PATCH] numa node ids are int, page_to_nid and zone_to_nid should return int 2006-12-07 08:39:23 -08:00
swap_state.c
swap.c [PATCH] hotplug CPU: clean up hotcpu_notifier() use 2006-12-07 08:39:39 -08:00
swapfile.c [PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
thrash.c [PATCH] make mm/thrash.c:global_faults static 2006-12-07 08:39:22 -08:00
tiny-shmem.c [PATCH] struct path: convert mm 2006-12-08 08:28:47 -08:00
truncate.c [PATCH] io-accounting: write-cancel accounting 2006-12-10 09:55:41 -08:00
util.c [PATCH] slab: clean up leak tracking ifdefs a little bit 2006-10-04 07:55:13 -07:00
vmalloc.c [PATCH] Fix strange size check in __get_vm_area_node() 2006-11-16 11:43:38 -08:00
vmscan.c [PATCH] cpuset: rework cpuset_zone_allowed api 2006-12-13 09:05:49 -08:00
vmstat.c [PATCH] struct seq_operations and struct file_operations constification 2006-12-07 08:39:46 -08:00