Go to file
Rafael J. Wysocki b7eaf1aab9 cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely
The way the schedutil governor uses the PELT metric causes it to
underestimate the CPU utilization in some cases.

That can be easily demonstrated by running kernel compilation on
a Sandy Bridge Intel processor, running turbostat in parallel with
it and looking at the values written to the MSR_IA32_PERF_CTL
register.  Namely, the expected result would be that when all CPUs
were 100% busy, all of them would be requested to run in the maximum
P-state, but observation shows that this clearly isn't the case.
The CPUs run in the maximum P-state for a while and then are
requested to run slower and go back to the maximum P-state after
a while again.  That causes the actual frequency of the processor to
visibly oscillate below the sustainable maximum in a jittery fashion
which clearly is not desirable.

That has been attributed to CPU utilization metric updates on task
migration that cause the total utilization value for the CPU to be
reduced by the utilization of the migrated task.  If that happens,
the schedutil governor may see a CPU utilization reduction and will
attempt to reduce the CPU frequency accordingly right away.  That
may be premature, though, for example if the system is generally
busy and there are other runnable tasks waiting to be run on that
CPU already.

This is unlikely to be an issue on systems where cpufreq policies are
shared between multiple CPUs, because in those cases the policy
utilization is computed as the maximum of the CPU utilization values
over the whole policy and if that turns out to be low, reducing the
frequency for the policy most likely is a good idea anyway.  On
systems with one CPU per policy, however, it may affect performance
adversely and even lead to increased energy consumption in some cases.

On those systems it may be addressed by taking another utilization
metric into consideration, like whether or not the CPU whose
frequency is about to be reduced has been idle recently, because if
that's not the case, the CPU is likely to be busy in the near future
and its frequency should not be reduced.

To that end, use the counter of idle calls in the timekeeping code.
Namely, make the schedutil governor look at that counter for the
current CPU every time before its frequency is about to be reduced.
If the counter has not changed since the previous iteration of the
governor computations for that CPU, the CPU has been busy for all
that time and its frequency should not be decreased, so if the new
frequency would be lower than the one set previously, the governor
will skip the frequency update.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Joel Fernandes <joelaf@google.com>
2017-03-23 02:12:14 +01:00
arch Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2017-03-12 14:22:25 -07:00
block blk: improve order of bio handling in generic_make_request() 2017-03-08 10:55:17 -07:00
certs
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2017-03-04 10:42:53 -08:00
Documentation Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-12 14:11:38 -07:00
drivers Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-12 14:11:38 -07:00
firmware
fs Merge branch 'prep-for-5level' 2017-03-10 08:59:07 -08:00
include cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely 2017-03-23 02:12:14 +01:00
init Change get_random_{int,log} to use the CRNG used by /dev/urandom and 2017-03-11 09:08:47 -08:00
ipc Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-03 10:16:38 -08:00
kernel cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely 2017-03-23 02:12:14 +01:00
lib mm: convert generic code to 5-level paging 2017-03-09 11:48:47 -08:00
mm Merge branch 'prep-for-5level' 2017-03-10 08:59:07 -08:00
net libceph: osd_request_timeout option 2017-03-07 14:30:38 +01:00
samples Merge branch 'rebased-statx' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-03-03 11:38:56 -08:00
scripts Merge branch 'akpm' (patches from Andrew) 2017-03-10 08:34:42 -08:00
security Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-03 10:16:38 -08:00
sound scripts/spelling.txt: add "disble(d)" pattern and fix typo instances 2017-03-09 17:01:09 -08:00
tools userfaultfd: selftest: vm: allow to build in vm/ directory 2017-03-09 17:01:10 -08:00
usr
virt KVM: arm/arm64: VGIC: Fix command handling while ITS being disabled 2017-03-07 15:44:08 +00:00
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore
.mailmap
COPYING
CREDITS MAINTAINERS: Remove old e-mail address 2017-02-13 12:24:56 -05:00
Kbuild
Kconfig
MAINTAINERS MAINTAINERS: usb251xb: remove reference inexistent file 2017-03-09 10:34:16 +01:00
Makefile Linux 4.11-rc2 2017-03-12 14:47:08 -07:00
README

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.