kernel-ark/fs
David Rientjes 028fec414d mempolicy: support optional mode flags
With the evolution of mempolicies, it is necessary to support mempolicy mode
flags that specify how the policy shall behave in certain circumstances.  The
most immediate need for mode flag support is to suppress remapping the
nodemask of a policy at the time of rebind.

Both the mempolicy mode and flags are passed by the user in the 'int policy'
formal of either the set_mempolicy() or mbind() syscall.  A new constant,
MPOL_MODE_FLAGS, represents the union of legal optional flags that may be
passed as part of this int.  Mempolicies that include illegal flags as part of
their policy are rejected as invalid.

An additional member to struct mempolicy is added to support the mode flags:

	struct mempolicy {
		...
		unsigned short policy;
		unsigned short flags;
	}

The splitting of the 'int' actual passed by the user is done in
sys_set_mempolicy() and sys_mbind() for their respective syscalls.  This is
done by intersecting the actual with MPOL_MODE_FLAGS, rejecting the syscall of
there are additional flags, and storing it in the new 'flags' member of struct
mempolicy.  The intersection of the actual with ~MPOL_MODE_FLAGS is stored in
the 'policy' member of the struct and all current users of pol->policy remain
unchanged.

The union of the policy mode and optional mode flags is passed back to the
user in get_mempolicy().

This combination of mode and flags within the same actual does not break
userspace code that relies on get_mempolicy(&policy, ...) and either

	switch (policy) {
	case MPOL_BIND:
		...
	case MPOL_INTERLEAVE:
		...
	};

statements or

	if (policy == MPOL_INTERLEAVE) {
		...
	}

statements.  Such applications would need to use optional mode flags when
calling set_mempolicy() or mbind() for these previously implemented statements
to stop working.  If an application does start using optional mode flags, it
will need to mask the optional flags off the policy in switch and conditional
statements that only test mode.

An additional member is also added to struct shmem_sb_info to store the
optional mode flags.

[hugh@veritas.com: shmem mpol: fix build warning]
Cc: Paul Jackson <pj@sgi.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28 08:58:19 -07:00
..
9p [PATCH] restore sane ->umount_begin() API 2008-04-25 09:23:25 -04:00
adfs mount options: fix adfs 2008-02-08 09:22:39 -08:00
affs mount options: fix affs 2008-02-08 09:22:39 -08:00
afs AFS: Do not describe debug parameters with their value 2008-04-16 07:43:48 -07:00
autofs mount options: fix autofs 2008-02-08 09:22:40 -08:00
autofs4 Introduce path_put() 2008-02-14 21:13:33 -08:00
befs mount options: fix befs 2008-02-08 09:22:40 -08:00
bfs iget: stop BFS from using iget() and read_inode() 2008-02-07 08:42:27 -08:00
cifs [PATCH] restore sane ->umount_begin() API 2008-04-25 09:23:25 -04:00
coda Introduce path_put() 2008-02-14 21:13:33 -08:00
configfs Introduce path_put() 2008-02-14 21:13:33 -08:00
cramfs fs: Remove unnecessary inclusions of asm/semaphore.h 2008-04-18 22:16:44 -04:00
debugfs debugfs: fix sparse warnings 2008-03-04 14:47:06 -08:00
devpts mount options: fix devpts 2008-02-08 09:22:40 -08:00
dlm Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm 2008-04-22 13:44:23 -07:00
ecryptfs eCryptfs: Swap dput() and mntput() 2008-03-19 18:53:36 -07:00
efs efs: update error msg to not refer to deleted read_inode() 2008-04-02 15:28:19 -07:00
exportfs
ext2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/juhl/trivial 2008-04-21 16:36:46 -07:00
ext3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/juhl/trivial 2008-04-21 16:36:46 -07:00
ext4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/juhl/trivial 2008-04-21 16:36:46 -07:00
fat [PATCH] r/o bind mounts: elevate write count for ioctls() 2008-04-19 00:29:24 -04:00
freevxfs iget: stop FreeVXFS from using iget() and read_inode() 2008-02-07 08:42:28 -08:00
fuse [PATCH] restore sane ->umount_begin() API 2008-04-25 09:23:25 -04:00
gfs2 mm: remove nopage 2008-04-28 08:58:18 -07:00
hfs hfs_bnode_find() can fail, resulting in hfs_bnode_split() breakage 2008-03-17 09:46:55 -07:00
hfsplus [PATCH] r/o bind mounts: elevate write count for ioctls() 2008-04-19 00:29:24 -04:00
hostfs uml: fix hostfs tv_usec calculations 2008-02-05 09:44:30 -08:00
hpfs mount options: fix hpfs 2008-02-08 09:22:40 -08:00
hppfs [PATCH] sanitize hppfs 2008-03-19 06:42:18 -04:00
hugetlbfs mempolicy: support optional mode flags 2008-04-28 08:58:19 -07:00
isofs zisofs: fix readpage() outside i_size 2008-03-19 18:53:36 -07:00
jbd jbd/jbd2 NULL noise 2008-03-30 14:18:41 -07:00
jbd2 jbd/jbd2 NULL noise 2008-03-30 14:18:41 -07:00
jffs2 [JFFS2] Introduce dbg_readinode2 log level, use it to shut read_dnode() up 2008-04-23 16:43:15 +01:00
jfs [PATCH] r/o bind mounts: elevate write count for ioctls() 2008-04-19 00:29:24 -04:00
lockd locks: don't call ->copy_lock methods on return of conflicting locks 2008-04-25 13:00:11 -04:00
minix iget: stop the MINIX filesystem from using iget() and read_inode() 2008-02-07 08:42:28 -08:00
msdos
ncpfs [PATCH] r/o bind mounts: elevate write count for ncp_ioctl() 2008-04-19 00:29:23 -04:00
nfs [PATCH] restore sane ->umount_begin() API 2008-04-25 09:23:25 -04:00
nfs_common
nfsd nfsd: don't allow setting ctime over v4 2008-04-25 13:00:11 -04:00
nls
ntfs is_vmalloc_addr(): Check if an address is within the vmalloc boundaries 2008-02-05 09:44:14 -08:00
ocfs2 [PATCH] r/o bind mounts: elevate write count for ioctls() 2008-04-19 00:29:24 -04:00
openpromfs iget: stop OPENPROMFS from using iget() and read_inode() 2008-02-07 08:42:29 -08:00
partitions block: send disk "change" event for rescan_partitions() 2008-04-19 19:10:24 -07:00
proc make swap_pte_to_pagemap_entry() static 2008-04-28 08:58:18 -07:00
qnx4 iget: stop QNX4 from using iget() and read_inode() 2008-02-07 08:42:28 -08:00
ramfs
reiserfs Merge branch 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc 2008-04-21 15:41:27 -07:00
romfs ROMFS: Fix up an error in iget removal 2008-03-19 18:53:36 -07:00
smbfs NULL noise: fs/*, mm/*, kernel/* 2008-03-30 14:18:41 -07:00
sysfs [SCSI] sysfs: make group is_valid return a mode_t 2008-04-22 15:16:31 -05:00
sysv iget: stop the SYSV filesystem from using iget() and read_inode() 2008-02-07 08:42:29 -08:00
udf udf: use crc_itu_t from lib instead of udf_crc 2008-04-17 14:29:56 +02:00
ufs fs/ufs/balloc.c: fix sparc64 printk warning 2008-03-19 18:53:37 -07:00
vfat Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p) 2008-02-07 08:42:26 -08:00
xfs Merge branch 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc 2008-04-21 15:41:27 -07:00
aio.c aio: io_getevents() should return if io_destroy() is invoked 2008-04-28 08:58:17 -07:00
anon_inodes.c [PATCH] fix up new filp allocators 2008-03-19 06:54:05 -04:00
attr.c
bad_inode.c iget: introduce a function to register iget failure 2008-02-07 08:42:26 -08:00
binfmt_aout.c aout: suppress A.OUT library support if !CONFIG_ARCH_SUPPORTS_AOUT 2008-02-08 09:22:30 -08:00
binfmt_elf_fdpic.c
binfmt_elf.c [PATCH] sanitize handling of shared descriptor tables in failing execve() 2008-04-25 09:23:53 -04:00
binfmt_em86.c
binfmt_flat.c FLAT binaries: drop BINFMT_FLAT bad header magic warning 2008-02-14 20:58:05 -08:00
binfmt_misc.c [PATCH] sanitize handling of shared descriptor tables in failing execve() 2008-04-25 09:23:53 -04:00
binfmt_script.c
binfmt_som.c [PATCH] sanitize handling of shared descriptor tables in failing execve() 2008-04-25 09:23:53 -04:00
bio.c block: convert bio_copy_user to bio_copy_user_iov 2008-04-21 09:50:08 +02:00
block_dev.c fs/block_dev.c: remove #if 0'ed code 2008-02-19 10:04:00 +01:00
buffer.c mm: filter based on a nodemask as well as a gfp_mask 2008-04-28 08:58:19 -07:00
char_dev.c fs/char_dev.c: chrdev_open marked static and removed from fs.h 2008-02-08 09:22:42 -08:00
compat_binfmt_elf.c x86: compat_binfmt_elf 2008-01-30 13:31:46 +01:00
compat_ioctl.c d_path: Make d_path() use a struct path 2008-02-14 21:17:09 -08:00
compat.c Merge branch 'linus_origin' into hotfixes 2008-02-15 13:36:30 -05:00
dcache.c [patch 2/7] vfs: mountinfo: add seq_file_root() 2008-04-23 00:04:38 -04:00
dcookies.c d_path: Make d_path() use a struct path 2008-02-14 21:17:09 -08:00
direct-io.c Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user 2008-02-05 09:44:13 -08:00
dnotify.c
dquot.c quota: add possibly missing iput() when quotaon and quotaoff races 2008-03-19 18:53:35 -07:00
drop_caches.c
eventfd.c fs/eventfd.c should #include <linux/syscalls.h> 2008-02-06 10:41:03 -08:00
eventpoll.c lockdep: annotate epoll 2008-02-05 09:44:07 -08:00
exec.c [PATCH] sanitize unshare_files/reset_files_struct 2008-04-25 09:23:59 -04:00
fcntl.c [PATCH] sanitize locate_fd() 2008-04-25 09:24:05 -04:00
fifo.c
file_table.c [PATCH] r/o bind mounts: debugging for missed calls 2008-04-19 00:29:28 -04:00
file.c get rid of NR_OPEN and introduce a sysctl_nr_open 2008-02-06 10:41:06 -08:00
filesystems.c
fs-writeback.c fs: fix kernel-doc notation warnings 2008-03-19 18:53:36 -07:00
generic_acl.c
inode.c [PATCH] r/o bind mounts: write count for file_update_time() 2008-04-19 00:29:24 -04:00
inotify_user.c Introduce path_put() 2008-02-14 21:13:33 -08:00
inotify.c inotify: remove debug code 2008-02-06 10:41:07 -08:00
internal.h [PATCH] move a bunch of declarations to fs/internal.h 2008-04-21 23:11:01 -04:00
ioctl.c fix up kerneldoc in fs/ioctl.c a little bit 2008-02-09 11:08:33 -08:00
ioprio.c cfq-iosched: relax IOPRIO_CLASS_IDLE restrictions 2008-01-28 11:38:15 +01:00
Kconfig Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6 2008-04-24 11:46:16 -07:00
Kconfig.binfmt [SPARC]: Remove SunOS and Solaris binary support. 2008-04-21 15:10:15 -07:00
libfs.c Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user 2008-02-05 09:44:13 -08:00
locks.c Export __locks_copy_lock() so modular lockd builds 2008-04-25 15:49:46 -07:00
Makefile x86: compat_binfmt_elf Kconfig 2008-01-30 13:31:46 +01:00
mbcache.c vfs: fix possible deadlock in ext2, ext3, ext4 when using xattrs 2008-04-15 19:35:41 -07:00
mpage.c docbook: fix filesystems.tmpl source files 2008-03-03 10:47:13 -08:00
namei.c [PATCH] r/o bind mounts: elevate write count for open()s 2008-04-19 00:29:25 -04:00
namespace.c [PATCH] restore sane ->umount_begin() API 2008-04-25 09:23:25 -04:00
nfsctl.c Introduce path_put() 2008-02-14 21:13:33 -08:00
no-block.c
open.c [PATCH] r/o bind mounts: debugging for missed calls 2008-04-19 00:29:28 -04:00
pipe.c [PATCH] double-free of inode on alloc_file() failure exit in create_write_pipe() 2008-04-22 19:54:57 -04:00
pnode.c [patch 7/7] vfs: mountinfo: show dominating group id 2008-04-23 00:05:09 -04:00
pnode.h [patch 7/7] vfs: mountinfo: show dominating group id 2008-04-23 00:05:09 -04:00
posix_acl.c
quota_v1.c
quota_v2.c
quota.c Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p) 2008-02-07 08:42:26 -08:00
read_write.c fs: use loff_t type instead of long long 2008-04-22 15:17:11 -07:00
read_write.h
readdir.c
select.c trivial: small cleanups 2008-04-21 22:15:06 +00:00
seq_file.c [patch 2/7] vfs: mountinfo: add seq_file_root() 2008-04-23 00:04:38 -04:00
signalfd.c signalfd: fix for incorrect SI_QUEUE user data reporting 2008-04-11 08:06:44 -07:00
splice.c splice: fix infinite loop in generic_file_splice_read() 2008-04-10 08:24:25 +02:00
stack.c
stat.c Introduce path_put() 2008-02-14 21:13:33 -08:00
super.c [PATCH] move a bunch of declarations to fs/internal.h 2008-04-21 23:11:01 -04:00
sync.c
timerfd.c timerfd: new timerfd API 2008-02-05 09:44:07 -08:00
utimes.c [PATCH] r/o bind mounts: elevate write count for do_utimes() 2008-04-19 00:29:24 -04:00
xattr_acl.c
xattr.c [PATCH] remove unused label in xattr.c (noise from ro-bind) 2008-04-23 00:04:04 -04:00