kernel-ark/kernel
Paul Menage 8bab8dded6 cgroups: add cgroup support for enabling controllers at boot time
The effects of cgroup_disable=foo are:

- foo isn't auto-mounted if you mount all cgroups in a single hierarchy
- foo isn't visible as an individually mountable subsystem

As a result there will only ever be one call to foo->create(), at init time;
all processes will stay in this group, and the group will never be mounted on
a visible hierarchy.  Any additional effects (e.g.  not allocating metadata)
are up to the foo subsystem.

This doesn't handle early_init subsystems (their "disabled" bit isn't set be,
but it could easily be extended to do so if any of the early_init systems
wanted it - I think it would just involve some nastier parameter processing
since it would occur before the command-line argument parser had been run.

Hugh said:

  Ballpark figures, I'm trying to get this question out rather than
  processing the exact numbers: CONFIG_CGROUP_MEM_RES_CTLR adds 15% overhead
  to the affected paths, booting with cgroup_disable=memory cuts that back to
  1% overhead (due to slightly bigger struct page).

  I'm no expert on distros, they may have no interest whatever in
  CONFIG_CGROUP_MEM_RES_CTLR=y; and the rest of us can easily build with or
  without it, or apply the cgroup_disable=memory patches.

Unix bench's execl test result on x86_64 was

== just after boot without mounting any cgroup fs.==
mem_cgorup=off : Execl Throughput       43.0     3150.1      732.6
mem_cgroup=on  : Execl Throughput       43.0     2932.6      682.0
==

[lizf@cn.fujitsu.com: fix boot option parsing]
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04 14:46:26 -07:00
..
irq
power Merge branches 'release' and 'doc' into release 2008-03-13 01:59:53 -04:00
time clocksource: revert: use init_timer_deferrable for clocksource_watchdog 2008-03-25 20:13:25 +01:00
.gitignore
acct.c bsd_acct: using task_struct->tgid is not right in pid-namespaces 2008-03-24 19:22:20 -07:00
audit_tree.c
audit.c audit: silence two kerneldoc warnings in kernel/audit.c 2008-03-28 14:45:21 -07:00
audit.h
auditfilter.c
auditsc.c
backtracetest.c
capability.c
cgroup_debug.c
cgroup.c cgroups: add cgroup support for enabling controllers at boot time 2008-04-04 14:46:26 -07:00
compat.c
configs.c
cpu.c
cpuset.c cpusets: fix obsolete comment 2008-03-05 17:53:33 -08:00
delayacct.c
dma.c
exec_domain.c
exit.c Fix waitid si_code regression 2008-03-08 11:54:00 -08:00
extable.c
fork.c memcgroup: fix spurious EBUSY on memory cgroup removal 2008-03-28 14:45:21 -07:00
futex_compat.c futex_compat __user annotation 2008-03-30 14:18:41 -07:00
futex.c NULL noise: fs/*, mm/*, kernel/* 2008-03-30 14:18:41 -07:00
hrtimer.c
itimer.c
kallsyms.c
Kconfig.hz
Kconfig.preempt rcu: move PREEMPT_RCU config option back under PREEMPT 2008-03-10 18:01:20 -07:00
kexec.c
kfifo.c
kmod.c
kprobes.c
ksysfs.c
kthread.c
latencytop.c
lockdep_internals.h
lockdep_proc.c
lockdep.c
Makefile
marker.c markers: use synchronize_sched() 2008-04-02 15:28:19 -07:00
module.c modules: warn about suspicious return values from module's ->init() hook 2008-03-10 18:01:20 -07:00
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
ns_cgroup.c
nsproxy.c
panic.c
params.c
pid_namespace.c
pid.c
pm_qos_params.c
posix-cpu-timers.c
posix-timers.c
printk.c Make printk() console semaphore accesses sensible 2008-03-24 19:25:08 -07:00
profile.c
ptrace.c
rcuclassic.c
rcupdate.c
rcupreempt_trace.c
rcupreempt.c
rcutorture.c
relay.c relay: set an spd_release() hook for splice 2008-03-26 12:04:09 +01:00
res_counter.c
resource.c
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rwsem.c
sched_debug.c sched: improve affine wakeups 2008-03-19 04:27:53 +01:00
sched_fair.c sched: cleanup old and rarely used 'debug' features. 2008-03-21 16:43:47 +01:00
sched_idletask.c
sched_rt.c sched: balance RT task resched only on runqueue 2008-03-07 16:43:00 +01:00
sched_stats.h
sched.c NOHZ: reevaluate idle sleep length after add_timer_on() 2008-03-26 08:28:55 +01:00
seccomp.c
signal.c
softirq.c
softlockup.c
spinlock.c
srcu.c
stacktrace.c
stop_machine.c
sys_ni.c
sys.c
sysctl_check.c
sysctl.c
taskstats.c
test_kprobes.c
time.c
timeconst.pl
timer.c NOHZ: reevaluate idle sleep length after add_timer_on() 2008-03-26 08:28:55 +01:00
tsacct.c
uid16.c
user_namespace.c
user.c
utsname_sysctl.c
utsname.c
wait.c
workqueue.c