kernel-ark/kernel/sched
Tejun Heo 0d5936344f sched: Implement interface for cgroup unified hierarchy
There are a couple of interface issues which can be addressed in the
cgroup2 interface.

* Stats from cpuacct being reported separately from the cpu stats.

* Use of different time units.  Writable control knobs use
  microseconds, some stat fields use nanoseconds while other cpuacct
  stat fields use centiseconds.

* Control knobs which can't be used in the root cgroup still show up
  in the root.

* Control knob names and semantics aren't consistent with other
  controllers.

This patchset implements the cpu controller's interface on cgroup2, which
adheres to the controller file conventions described in
Documentation/cgroups/cgroup-v2.txt.  Overall, the following changes
are made.

* cpuacct is implicitly enabled and disabled by cpu and its information
  is reported through "cpu.stat" which now uses microseconds for all
  time durations.  All time duration fields now have "_usec" appended
  to them for clarity.

  Note that cpuacct.usage_percpu is currently not included in
  "cpu.stat".  If this information is actually called for, it will be
  added later.

* "cpu.shares" is replaced with "cpu.weight" and operates on the
  standard scale defined by CGROUP_WEIGHT_MIN/DFL/MAX (1, 100, 10000).
  The weight is scaled to scheduler weight so that 100 maps to 1024
  and the ratio relationship is preserved - if weight is W and its
  scaled value is S, W / 100 == S / 1024.  While the mapped range is a
  bit smaller than the original scheduler weight range, the dead zones
  on both sides are relatively small and it covers a wider range than
  the nice value mappings.  This file doesn't make sense in the root
  cgroup and isn't created on root.

* "cpu.weight.nice" is added. When read, it reads back the nice value
  which is closest to the current "cpu.weight".  When written, it sets
  "cpu.weight" to the weight value which matches the nice value.  This
  makes it easy to configure cgroups when they're competing against
  threads in threaded subtrees.

* "cpu.cfs_quota_us" and "cpu.cfs_period_us" are replaced by "cpu.max"
  which contains both quota and period.
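
The weight scaling and the "cpu.weight.nice" mapping described above can
be sketched in userspace C.  This is a minimal illustration under stated
assumptions, not the patch's code: the helper names are hypothetical,
rounding to nearest is assumed, and the table is assumed to match the
scheduler's nice-to-weight table for nice values -20..19.

```c
#include <assert.h>

#define CGROUP_WEIGHT_MIN   1UL
#define CGROUP_WEIGHT_DFL   100UL
#define CGROUP_WEIGHT_MAX   10000UL
#define SCHED_WEIGHT_NICE0  1024UL   /* scheduler weight of a nice-0 task */

/* Scheduler weights for nice -20..19 (assumed copy of the kernel table). */
static const unsigned long nice_weights[40] = {
    88761, 71755, 56483, 46273, 36291,
    29154, 23254, 18705, 14949, 11916,
     9548,  7620,  6100,  4904,  3906,
     3121,  2501,  1991,  1586,  1277,
     1024,   820,   655,   526,   423,
      335,   272,   215,   172,   137,
      110,    87,    70,    56,    45,
       36,    29,    23,    18,    15,
};

/* cpu.weight W -> scheduler weight S so that W / 100 == S / 1024. */
static unsigned long cpu_weight_to_sched(unsigned long w)
{
    assert(w >= CGROUP_WEIGHT_MIN && w <= CGROUP_WEIGHT_MAX);
    return (w * SCHED_WEIGHT_NICE0 + CGROUP_WEIGHT_DFL / 2) / CGROUP_WEIGHT_DFL;
}

/* Scheduler weight S -> cpu.weight W, the inverse of the scaling above. */
static unsigned long sched_to_cpu_weight(unsigned long s)
{
    return (s * CGROUP_WEIGHT_DFL + SCHED_WEIGHT_NICE0 / 2) / SCHED_WEIGHT_NICE0;
}

/* Writing a nice value: pick the cpu.weight that matches it. */
static unsigned long nice_to_cpu_weight(int nice)
{
    assert(nice >= -20 && nice <= 19);
    return sched_to_cpu_weight(nice_weights[nice + 20]);
}

/* Reading back: the nice value whose weight is closest to cpu.weight. */
static int cpu_weight_to_nice(unsigned long w)
{
    unsigned long s = cpu_weight_to_sched(w);
    unsigned long best_delta = (unsigned long)-1;
    int best = 0;

    for (int i = 0; i < 40; i++) {
        unsigned long t = nice_weights[i];
        unsigned long delta = t > s ? t - s : s - t;

        if (delta < best_delta) {
            best_delta = delta;
            best = i;
        }
    }
    return best - 20;
}
```

With these helpers, a weight of 100 scales to 1024 and reads back as
nice 0.  Weights above the value matching nice -20 all read back as
nice -20, which is how the two scales differ at the high end.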

v4: - Use cgroup2 basic usage stat as the information source instead
      of cpuacct.

v3: - Added "cpu.weight.nice" to allow using nice values when
      configuring the weight.  The feature was requested by PeterZ.
    - Merge the patch to enable threaded support on cpu and cpuacct.
    - Dropped the bits about getting rid of cpuacct from patch
      description as there is a pretty strong case for making cpuacct
      an implicit controller so that basic cpu usage stats are always
      available.
    - Documentation updated accordingly.  "cpu.rt.max" section is
      dropped for now.

v2: - cpu_stats_show() was incorrectly using CONFIG_FAIR_GROUP_SCHED
      for CFS bandwidth stats and also using raw division for u64.
      Use CONFIG_CFS_BANDWIDTH and do_div() instead.  "cpu.rt.max" is
      not included yet.
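
The do_div() point in v2 can be illustrated with a small userspace
sketch.  do_div() divides a 64-bit value in place by a 32-bit divisor and
returns the remainder, which avoids relying on a plain u64 "/" on 32-bit
builds.  The names do_div_sketch() and nsec_to_usec() below are
hypothetical stand-ins, not the kernel macro or any function from the
patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for do_div(): divide *n in place by a 32-bit
 * base and return the remainder.
 */
static uint32_t do_div_sketch(uint64_t *n, uint32_t base)
{
    uint32_t rem;

    assert(base != 0);
    rem = (uint32_t)(*n % base);
    *n /= base;
    return rem;
}

/* Convert a nanosecond count to whole microseconds, as "_usec" fields need. */
static uint64_t nsec_to_usec(uint64_t nsec)
{
    do_div_sketch(&nsec, 1000);
    return nsec;
}
```

The remainder is discarded here, so the result is truncated toward zero.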

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
2017-09-29 14:30:37 -07:00
autogroup.c sched/autogroup: Fix error reporting printk text in autogroup_create() 2017-08-10 17:06:03 +02:00
autogroup.h sched/headers: Prepare for new header dependencies before moving code to <linux/sched/autogroup.h> 2017-03-02 08:42:28 +01:00
clock.c sched/clock: Fix early boot preempt assumption in __set_sched_clock_stable() 2017-05-24 09:10:00 +02:00
completion.c Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-04 11:52:29 -07:00
core.c sched: Implement interface for cgroup unified hierarchy 2017-09-29 14:30:37 -07:00
cpuacct.c sched/cputime: Convert kcpustat to nsecs 2017-02-01 09:13:47 +01:00
cpudeadline.c sched/deadline: Change return value of cpudl_find() 2017-08-10 12:18:17 +02:00
cpudeadline.h
cpufreq_schedutil.c Merge branch 'pm-cpufreq-sched' 2017-09-04 00:05:22 +02:00
cpufreq.c
cpupri.c sched/cpupri: Don't re-initialize 'struct cpupri' 2017-08-10 12:18:14 +02:00
cpupri.h
cputime.c sched/cputime: Add dummy cputime_adjust() implementation for CONFIG_VIRT_CPU_ACCOUNTING_NATIVE 2017-09-25 14:27:54 -07:00
deadline.c cpuacct: Introduce cgroup_account_cputime[_field]() 2017-09-25 08:12:04 -07:00
debug.c Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-13 12:22:32 -07:00
fair.c cpuacct: Introduce cgroup_account_cputime[_field]() 2017-09-25 08:12:04 -07:00
features.h sched/core: Implement new approach to scale select_idle_cpu() 2017-06-08 10:25:17 +02:00
idle_task.c sched/core: Add wrappers for lockdep_(un)pin_lock() 2017-01-14 11:29:30 +01:00
idle.c PM / s2idle: Rename ->enter_freeze to ->enter_s2idle 2017-08-11 01:29:56 +02:00
loadavg.c sched/loadavg: Generalize "_idle" naming to "_nohz" 2017-06-22 11:30:01 +02:00
Makefile membarrier: Provide expedited private command 2017-08-17 07:28:05 -07:00
membarrier.c membarrier: Provide expedited private command 2017-08-17 07:28:05 -07:00
rt.c cpuacct: Introduce cgroup_account_cputime[_field]() 2017-09-25 08:12:04 -07:00
sched-pelt.h sched/fair: Move the PELT constants into a generated header 2017-04-14 10:26:37 +02:00
sched.h cpuacct: Introduce cgroup_account_cputime[_field]() 2017-09-25 08:12:04 -07:00
stats.c
stats.h sched/headers: Move cputime functionality from <linux/sched.h> and <linux/cputime.h> into <linux/sched/cputime.h> 2017-03-03 01:45:22 +01:00
stop_task.c cpuacct: Introduce cgroup_account_cputime[_field]() 2017-09-25 08:12:04 -07:00
swait.c sched/wait: Remove the lockless swait_active() check in swake_up*() 2017-08-10 12:28:53 +02:00
topology.c Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-13 12:22:32 -07:00
wait_bit.c sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming 2017-06-20 12:19:14 +02:00
wait.c sched/wait: Introduce wakeup boomark in wake_up_page_bit 2017-09-14 09:56:18 -07:00