Commit Graph

8380 Commits

Author SHA1 Message Date
Robert Richter
c75841a398 perf/x86-ibs: Fix update of period
The last sw period was not correctly updated on overflow and thus led
to wrong distribution of events. We always need to properly initialize
data.period in struct perf_sample_data.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-2-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:11 +02:00
Ingo Molnar
ad8537cda6 Merge branch 'perf/x86-ibs' into perf/core 2012-05-09 15:22:23 +02:00
Ingo Molnar
ad7687dde8 x86/numa: Check for nonsensical topologies on real hw as well
Instead of only checking nonsensical topologies on numa-emu, do it
on real hardware as well, and print a warning.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/n/tip-re15l0jqjtpz709oxozt2zoh@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 13:32:35 +02:00
Peter Zijlstra
0acbb440f0 x86/numa: Hard partition cpu topology masks on node boundaries
When using numa=fake= you can get weird topologies where LLCs can span
nodes and other such nonsense. Cure this by hard partitioning these
masks on node boundaries.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/n/tip-di5vwjm96q5vrb76opwuflwx@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 13:28:59 +02:00
Jan Beulich
57da8b960b x86: Avoid double stack traces with show_regs()
What was called show_registers() so far already showed a stack
trace for kernel faults, and kernel_stack_pointer() isn't even
valid to be used for faults from user mode, hence it was
pointless for show_regs() to call show_trace() after
show_registers().

Simply rename show_registers() to show_regs() and eliminate
the old definition.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/4FAA3D3902000078000826E1@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 11:44:42 +02:00
Jarkko Sakkinen
cda846f101 x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline
This patch changes 64-bit trampoline so that CR4 and
EFER are provided by the kernel instead of using fixed
values.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-24-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 15:04:27 -07:00
Jarkko Sakkinen
bf8b88e977 x86, realmode: fixes compilation issue in tboot.c
Fixed include path of wakeup.h in tboot.c.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-23-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 15:04:27 -07:00
Jarkko Sakkinen
f37240f16b x86, realmode: header for trampoline code
Added header for trampoline code that can be used to supply
input data to it. This makes interface between real mode code
and kernel cleaner and simpler. Replaced two confusing pointers
to level4 pgt in trampoline_64.S with a single pointer to the
beginning of the page table.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-21-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:48:45 -07:00
Jarkko Sakkinen
c4845474a0 x86, realmode: flattened rm hierachy
Simplified hierarchy under rm directory to a flat
directory because it is not anymore really justified
to have own directory for wakeup code. It only adds
more complexity.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-20-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:48:45 -07:00
Jarkko Sakkinen
b429dbf6e8 x86, realmode: don't copy real_mode_header
Replaced copying of real_mode_header with a pointer
to beginning of RM memory.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-19-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:48:45 -07:00
Jarkko Sakkinen
8e029fcdd8 x86, realmode: fix 64-bit wakeup sequence
There were number of issues in wakeup sequence:

- Wakeup stack was placed in hardcoded address.
- NX bit in EFER was not enabled.
- Initialization incorrectly set physical address
of secondary_startup_64.
- Some alignment issues.

This patch fixes these issues and in addition:

- Unifies coding conventions in .S files.
- Sets alignments of code and data right.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-18-git-send-email-jarkko.sakkinen@intel.com
Originally-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <len.brown@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:48:11 -07:00
Jarkko Sakkinen
f156ffc439 x86, realmode: Set permission for real mode pages
Set proper permissions for rodata, text and data, removing the
realmode trampoline area as a remaining RWX memory mapping in the
kernel.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-8-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:47:08 -07:00
Jarkko Sakkinen
c9b77ccb52 x86, realmode: Move ACPI wakeup to unified realmode code
Migrated ACPI wakeup code to the real-mode blob.
Code existing in .x86_trampoline  can be completely
removed. Static descriptor table in wakeup_asm.S is
courtesy of H. Peter Anvin.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-7-git-send-email-jarkko.sakkinen@intel.com
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <len.brown@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:46:05 -07:00
Jarkko Sakkinen
48927bbb97 x86, realmode: Move SMP trampoline to unified realmode code
Migrated SMP trampoline code to the real mode blob.
SMP trampoline code is not yet removed from
.x86_trampoline because it is needed by the wakeup
code.

[ hpa: always enable compiling startup_32_smp in head_32.S... it is
  only a few instructions which go into .init on UP builds, and it makes
  the rest of the code less #ifdef ugly. ]

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-6-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:41:51 -07:00
Jarkko Sakkinen
5a8c9aebe0 x86, realmode: Move reboot_32.S to unified realmode code
Migrated reboot_32.S from x86_trampoline to the real-mode
blob.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-5-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:41:50 -07:00
Jarkko Sakkinen
084ee1c641 x86, realmode: Relocator for realmode code
Implements relocator for real mode code that is called
as part of setup_arch(). Processes segment relocations
and linear relocations. Real-mode code is relocated to
a free hole below 1 MB.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com>
Link: http://lkml.kernel.org/r/1336501366-28617-4-git-send-email-jarkko.sakkinen@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-08 11:41:49 -07:00
Linus Torvalds
e9b19cd43f Merge branch 'for-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull two percpu fixes from Tejun Heo:
 "One adds missing KERN_CONT on split printk()s and the other makes
  the percpu allocator avoid using PMD_SIZE as atom_size on x86_32.

  Using PMD_SIZE led to vmalloc area exhaustion on certain
  configurations (x86_32 android) and the only cost of using PAGE_SIZE
  instead is static percpu area not being aligned to large page
  mapping."

* 'for-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu, x86: don't use PMD_SIZE as embedded atom_size on 32bit
  percpu: use KERN_CONT in pcpu_dump_alloc_info()
2012-05-08 11:06:07 -07:00
Tejun Heo
d5e28005a1 percpu, x86: don't use PMD_SIZE as embedded atom_size on 32bit
With the embed percpu first chunk allocator, x86 uses either PAGE_SIZE
or PMD_SIZE for atom_size.  PMD_SIZE is used when CPU supports PSE so
that percpu areas are aligned to PMD mappings and possibly allow using
PMD mappings in vmalloc areas in the future.  Using larger atom_size
doesn't waste actual memory; however, it does require larger vmalloc
space allocation later on for !first chunks.

With reasonably sized vmalloc area, PMD_SIZE shouldn't be a problem
but x86_32 at this point is anything but reasonable in terms of
address space and using larger atom_size reportedly leads to frequent
percpu allocation failures on certain setups.

As there is no reason to not use PMD_SIZE on x86_64 as vmalloc space
is aplenty and most x86_64 configurations support PSE, fix the issue
by always using PMD_SIZE on x86_64 and PAGE_SIZE on x86_32.

v2: drop cpu_has_pse test and make x86_64 always use PMD_SIZE and
    x86_32 PAGE_SIZE as suggested by hpa.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Yanmin Zhang <yanmin.zhang@intel.com>
Reported-by: ShuoX Liu <shuox.liu@intel.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <4F97BA98.6010001@intel.com>
Cc: stable@vger.kernel.org
2012-05-08 09:42:18 -07:00
Jan Beulich
b2a3477727 x86: Fix section annotation of acpi_map_cpu2node()
Commit 943bc7e110 ("x86: Fix section warnings") added
__cpuinitdata here, while for functions __cpuinit should be
used.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: <sp@numascale.com>
Link: http://lkml.kernel.org/r/4FA947910200007800082470@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Steffen Persvold <sp@numascale.com>
2012-05-08 16:40:26 +02:00
Thomas Gleixner
38e7c572ce x86: Use common threadinfo allocator
The only difference is the free_thread_info function, which frees
xstate.

Use the new arch_release_task_struct() function instead and switch
over to the core allocator.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20120505150141.559556763@linutronix.de
Cc: x86@kernel.org
2012-05-08 14:08:44 +02:00
Thomas Gleixner
67ba5293f7 Merge branch 'smp/threadalloc' into smp/hotplug
Reason: Pull in the separate branch which was created so arch/tile can
base further work on it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-05-08 14:07:48 +02:00
Thomas Gleixner
85f7f65627 x86: Use kick_all_cpus_sync()
Use kick_all_cpus_sync() and remove cpu_idle_wait().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20120507175652.190382227@linutronix.de
Cc: x86@kernel.org
2012-05-08 12:35:06 +02:00
Márton Németh
d1ecad6eee x86/apic: Only compile local function if used with !CONFIG_GENERIC_PENDING_IRQ
The local function io_apic_level_ack_pending() is only called
from io_apic_level_ack_pending(). The later function is only
compiled if CONFIG_GENERIC_PENDING_IRQ is defined. Move the
io_apic_level_ack_pending() to the existing #ifdef
CONFIG_GENERIC_PENDING_IRQ code block.

This will remove the following warning message during compiling
without CONFIG_GENERIC_PENDING_IRQ defined:

 * arch/x86/kernel/apic/io_apic.c:382: warning: ‘io_apic_level_ack_pending’ defined but not used

Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Link: http://lkml.kernel.org/r/1336461860.2296.3.camel@sbsiddha-mobl2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-08 11:23:15 +02:00
Shuah Khan
e826abd523 x86, microcode: microcode_core.c simple_strtoul cleanup
Change reload_for_cpu() in kernel/microcode_core.c to call kstrtoul()
instead of calling obsoleted simple_strtoul().

Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1336324264.2897.9.camel@lorien2
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-05-07 11:36:49 -07:00
Ingo Molnar
9438ef7f4e x86/apic: Fix UP boot crash
Commit 31b3c9d723 ("xen/x86: Implement x86_apic_ops") implemented
this:

... without considering that on UP the function pointer might be NULL.

Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/n/tip-3pfty0ml4yp62phbkchichh0@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-07 19:19:56 +02:00
Srivatsa S. Bhat
19209bbb86 x86/sched: Make mwait_usable() heed to "idle=" kernel parameters properly
The checks that exist in mwait_usable() for "idle=" kernel
parameters are insufficient. As a result, mwait_usable() can
return 1 even if "idle=nomwait" or "idle=poll" or "idle=halt"
parameters are passed.

Of these cases, incorrect handling of idle=nomwait is a
universal problem since mwait can get used for usual CPU idling.
However the rest of the cases are problematic only during CPU
Hotplug (offline) because, in the CPU offline path, the function
mwait_play_dead() is called, which might result in mwait being
used in the offline CPUs, if mwait_usable() happens to return 1.

Fix these issues by checking for the boot time "idle=" kernel
parameter properly in mwait_usable().

The first issue (usual cpu idling) is demonstrated below:

Before applying the patch (dmesg snippet):

 [    0.000000] Command line: [...] idle=nomwait
 [    0.000000] Kernel command line: [...] idle=nomwait
 [    0.000000]  RCU dyntick-idle grace-period acceleration is enabled.
 [    0.140606] using mwait in idle threads.  <======= mwait being used
 [    4.303986] cpuidle: using governor ladder
 [    4.308232] cpuidle: using governor menu

After applying the patch:

 [    0.000000] Command line: [...] idle=nomwait
 [    0.000000] Kernel command line: [...] idle=nomwait
 [    0.000000]  RCU dyntick-idle grace-period acceleration is enabled.
 [    4.264100] cpuidle: using governor ladder
 [    4.268342] cpuidle: using governor menu

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: venki@google.com
Cc: suresh.b.siddha@intel.com
Cc: Borislav Petkov <bp@amd64.org>
Cc: lenb@kernel.org
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Link: http://lkml.kernel.org/r/4F9E37B8.30001@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-07 16:27:20 +02:00
Shai Fultheim
42fa425043 x86: Conditionally update time when ack-ing pending irqs
On virtual environments, apic_read could take a long time. As a
result, under certain conditions the ack pending loop may exit
without any queued irqs left, but after more than one second. A
warning will be printed needlessly in this case.

If the loop is about to exit regardless of max_loops, don't
update it.

Signed-off-by: Shai Fultheim <shai@scalemp.com>
[ rebased and reworded the commit message]
Signed-off-by: Ido Yariv <ido@wizery.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1334873552-31346-1-git-send-email-ido@wizery.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-07 16:25:28 +02:00
Ingo Molnar
79fec2c557 Interrupt remapping ops for x86
This patchset introduces a generic ops-interface for
 accessing interrupt remapping hardware on x86. It factors
 out the VT-d specific code from io_apic.c and moves it to
 drivers/iommu. These changes will be used to add support for
 AMD interrupt remapping hardware.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPp8ZZAAoJECvwRC2XARrj0o0P/22YaTNWvl1ZGDnO2Z913pEM
 hbE0RPsht/2paCR+oXggu3vv3D0BzY5SkC0Hmjr4po4hiwDNXHj+V1QQKHFEMwuO
 gzWatXCW/dDvIbDNHuCgG76n0kNmDMeps/MgKGrG4seV7zQ2jMfjaWFvJ2UKF/fJ
 D1Jm+7QLr+MLM/cT64WG5S/8/Pi1N6CYbdXwkPTlFTnKRS+NODWGMo9fgc7VIsI/
 9A1haZp+aV/TvGqklDlSh6CgIUBZVt/cffpftJiuH4L3o5mq1sXaFcK4Whu8PxGD
 OwmbLe4PKXl39piPe4x8NGGrj0xnk+WN17JUQpefxFIbBkCjfGakNqLxFMKbxH8k
 AtWeERge3Jd4jcfbmlXQ0GKaL2+BQUzIq0VHadHwsifCKs0KP7qG+q4Ev+02+Xbo
 gfC5UtemoffI4fXznvNIE0Hcp8EYONmtksPJ2iKOxy+t7uDlESDMDspXf1IOLZzO
 BODOdDlnsPCCWxT9RsJziVkD6C/hSZRwQIavZchBeMele4+YMpym+POlTJ5lx9KJ
 TVXYZmRv8sUgzP67s0WuDNlB2SEQq8fHNYIHrCPGqcmp/VHCFzj1nSe5jRu3FdnB
 Yu44BOoTibYss/aVy21hbB99SRMkpUWfImhH2eZLO/dahtlUMiI9ZnnYBJ62Ol7u
 bBAnLHtq8WkVZH6drSCe
 =H+Nt
 -----END PGP SIGNATURE-----

Merge tag 'intr-remapping-ops-for-ingo' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into core/iommu

- This patchset introduces a generic ops-interface for
  accessing interrupt remapping hardware on x86. It factors
  out the VT-d specific code from io_apic.c and moves it to
  drivers/iommu. These changes will be used to add support for
  AMD interrupt remapping hardware.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-07 16:21:35 +02:00
Shai Fultheim
ddc5681ed3 x86/cache_info: Fix setup of l2/l3 ids
On some architectures (such as vSMP), it is possible to have
CPUs with a different number of cores sharing the same cache.

The current implementation implicitly assumes that all CPUs will
have the same number of cores sharing caches, and as a result,
different CPUs can end up with the same l2/l3 ids.

Fix this by masking out the shared cache bits, instead of
shifting the APICID. By doing so, it is guaranteed that the
generated cache ids are always unique.

Signed-off-by: Shai Fultheim <shai@scalemp.com>
[ rebased, simplified, and reworded the commit message]
Signed-off-by: Ido Yariv <ido@wizery.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Dave Jones <davej@redhat.com>
Link: http://lkml.kernel.org/r/1334873351-31142-1-git-send-email-ido@wizery.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-07 15:27:37 +02:00
Jan Beulich
b2d6aba965 x86: Allow multiple values to be specified with "hpet="
This is particularly to be able to specify "hpet=force,verbose",
as "force" ought to be a primary candidate for also wanting to
use "verbose".

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/4F79D120020000780007C031@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-07 15:06:28 +02:00
Jan Beulich
396e2c6fed x86: Clear HPET configuration registers on startup
While Linux itself has been calling hpet_disable() for quite a
while, having e.g. a secondary (kexec) kernel depend on such
behavior of the primary (crashed) environment is fragile. It
particularly broke until very recently when the primary
environment was Xen based, as that hypervisor did not clear any
of the HPET settings it may have used.

Rather than blindly (and incompletely) clearing certain HPET
settings in hpet_disable(), latch the config register settings
during boot and restore then here.

(Note on the hpet_set_mode() change: Now that we're clearing the
level bit upon initialization, there's no need anymore to do so
here.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/4F79D0BB020000780007C02D@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-07 15:06:27 +02:00
Srivatsa S. Bhat
7164b3f5e5 x86/microcode: Ensure that module is only loaded on supported Intel CPUs
Exit early when there's no support for a particular CPU family.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Dave Jones <davej@redhat.com>
Cc: tigran@aivazian.fsnet.co.uk
Cc: Borislav Petkov <borislav.petkov@amd.com>
Link: http://lkml.kernel.org/r/4F8BDB58.6070007@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-07 14:37:14 +02:00
Suresh Siddha
8a8f422d3b iommu: rename intr_remapping.[ch] to irq_remapping.[ch]
Make the file names consistent with the naming conventions of irq subsystem.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-05-07 14:35:00 +02:00
Suresh Siddha
95a02e976c iommu: rename intr_remapping references to irq_remapping
Make the code consistent with the naming conventions of irq subsystem.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-05-07 14:35:00 +02:00
Joerg Roedel
263b5e8629 x86, iommu/vt-d: Clean up interfaces for interrupt remapping
Remove the Intel specific interfaces from dmar.h and remove
asm/irq_remapping.h which is only used for io_apic.c anyway.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-05-07 14:35:00 +02:00
Joerg Roedel
5e2b930b07 iommu/vt-d: Convert MSI remapping setup to remap_ops
This patch introduces remapping-ops for setting ups MSI
interrupts.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-05-07 14:35:00 +02:00
Joerg Roedel
9d619f6572 iommu/vt-d: Convert free_irte into a remap_ops callback
The operation for releasing a remapping entry is iommu
specific too.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-05-07 14:34:59 +02:00
Joerg Roedel
4c1bad6a0a iommu/vt-d: Convert IR set_affinity function to remap_ops
The function to set interrupt affinity with interrupt
remapping enabled is Intel specific too. So move it to the
irq_remap_ops too.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-05-07 14:34:59 +02:00
Joerg Roedel
0c3f173a88 iommu/vt-d: Convert IR ioapic-setup to use remap_ops
The IOAPIC setup routine for interrupt remapping is VT-d
specific. Move it to the irq_remap_ops and add a call helper
function.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-05-07 14:34:59 +02:00
Joerg Roedel
4f3d8b67ad iommu/vt-d: Convert missing apic.c intr-remapping call to remap_ops
Convert these calls too:

	* Disable of remapping hardware
	* Reenable of remapping hardware
	* Enable fault handling

With that all of arch/x86/kernel/apic/apic.c is converted to
use the generic intr-remapping interface.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-05-07 14:34:59 +02:00
Joerg Roedel
736baef447 iommu/vt-d: Make intr-remapping initialization generic
This patch introduces irq_remap_ops to hold implementation
specific function pointer to handle interrupt remapping. As
the first part the initialization functions for VT-d are
converted to these ops.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-05-07 14:34:59 +02:00
Ingo Molnar
22042c086c Merge branch 'stable/for-ingo-v3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into x86/apic 2012-05-07 12:07:51 +02:00
Ingo Molnar
19631cb3d6 Merge branch 'tip/perf/core-4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core 2012-05-07 11:03:52 +02:00
Larry Finger
febb72a6e4 IA32 emulation: Fix build problem for modular ia32 a.out support
Commit ce7e5d2d19 ("x86: fix broken TASK_SIZE for ia32_aout") breaks
kernel builds when "CONFIG_IA32_AOUT=m" with

  ERROR: "set_personality_ia32" [arch/x86/ia32/ia32_aout.ko] undefined!
  make[1]: *** [__modpost] Error 1

The entry point needs to be exported.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Acked-by: Al Viro <viro@zeniv.linux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-06 18:26:20 -07:00
Gleb Natapov
62c49cc976 KVM: Do not take reference to mm during async #PF
It turned to be totally unneeded. The reason the code was introduced is
so that KVM can prefault swapped in page, but prefault can fail even
if mm is pinned since page table can change anyway. KVM handles this
situation correctly though and does not inject spurious page faults.

Fixes:
 "INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected" warning while
 running LTP inside a KVM guest using the recent -next kernel.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-05-06 15:00:02 +03:00
Thomas Gleixner
45046892ef x86: Use generic init_task
Same code. Use the generic version. The special Makefile treatment is
pointless anyway as init_task.o contains only data which is handled by
the linker script. So no point on being treated like head text.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20120503085035.739963562@linutronix.de
Cc: x86@kernel.org
2012-05-05 13:00:26 +02:00
Steven Rostedt
59a094c994 ftrace/x86: Use asm/kprobes.h instead of linux/kprobes.h
If CONFIG_KPROBES is not set, then linux/kprobes.h will not include
asm/kprobes.h needed by x86/ftrace.c for the BREAKPOINT macro.

The x86/ftrace.c file should just include asm/kprobes.h as it does not
need the rest of kprobes.

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-04 09:28:29 -04:00
James Morris
898bfc1d46 Linux 3.4-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJPnb50AAoJEHm+PkMAQRiGAE0H/A4zFZIUGmF3miKPDYmejmrZ
 oVDYxVAu6JHjHWhu8E3VsinvyVscowjV8dr15eSaQzmDmRkSHAnUQ+dB7Di7jLC2
 MNopxsWjwyZ8zvvr3rFR76kjbWKk/1GYytnf7GPZLbJQzd51om2V/TY/6qkwiDSX
 U8Tt7ihSgHAezefqEmWp2X/1pxDCEt+VFyn9vWpkhgdfM1iuzF39MbxSZAgqDQ/9
 JJrBHFXhArqJguhENwL7OdDzkYqkdzlGtS0xgeY7qio2CzSXxZXK4svT6FFGA8Za
 xlAaIvzslDniv3vR2ZKd6wzUwFHuynX222hNim3QMaYdXm012M+Nn1ufKYGFxI0=
 =4d4w
 -----END PGP SIGNATURE-----

Merge tag 'v3.4-rc5' into next

Linux 3.4-rc5

Merge to pull in prerequisite change for Smack:
86812bb0de

Requested by Casey.
2012-05-04 12:46:40 +10:00
Konrad Rzeszutek Wilk
4a8e2a3115 x86/apic: Replace io_apic_ops with x86_io_apic_ops.
Which makes the code fit within the rest of the x86_ops functions.

Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
[v1: Changed x86_apic -> x86_ioapic per Yinghai Lu <yinghai@kernel.org> suggestion]
[v2: Rebased on tip/x86/urgent and redid to match Ingo's syntax style]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-05-01 14:50:09 -04:00
Borislav Petkov
575203b474 x86, MCE, AMD: Disable error thresholding bank 4 on some models
Turn off MC4_MISC thresholding banks on models which have them but that
particular processor implementation does not supply applicable error
sources to be counted.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-04-30 13:22:54 +02:00
Borislav Petkov
d26ecc4894 x86, MCE, AMD: Hide interrupt_enable sysfs node
Depending on whether the box supports the APIC LVT interrupt for
thresholding, we want to show the 'interrupt_enable' sysfs node or not.
Make that the case by adding it to the default sysfs attributes only if
it is supported.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-04-30 13:22:44 +02:00
Borislav Petkov
f227d4306c x86, MCE, AMD: Make APIC LVT thresholding interrupt optional
Currently, the APIC LVT interrupt for error thresholding is implicitly
enabled. However, there are models in the F15h range which do not enable
it. Make the code machinery which sets up the APIC interrupt support
an optional setting and add an ->interrupt_capable member to the bank
representation mirroring that capability and enable the interrupt offset
programming only if it is true.

Simplify code and fixup comment style while at it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-04-30 13:22:22 +02:00
Steven Rostedt
4a6d70c950 ftrace/x86: Remove the complex ftrace NMI handling code
As ftrace function tracing would require modifying code that could
be executed in NMI context, which is not stopped with stop_machine(),
ftrace had to do a complex algorithm with various stages of setup
and memory barriers to make it work.

With the new breakpoint method, this is no longer required. The changes
to the code can be done without any problem in NMI context, as well as
without stop machine altogether. Remove the complex code as it is
no longer needed.

Also, a lot of the notrace annotations could be removed from the
NMI code as it is now safe to trace them. With the exception of
do_nmi itself, which does some special work to handle running in
the debug stack. The breakpoint method can cause NMIs to double
nest the debug stack if it's not setup properly, and that is done
in do_nmi(), thus that function must not be traced.

(Note the arch sh may want to do the same)

Cc: Paul Mundt <lethal@linux-sh.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-04-27 21:11:28 -04:00
Steven Rostedt
08d636b6d4 ftrace/x86: Have arch x86_64 use breakpoints instead of stop machine
This method changes x86 to add a breakpoint to the mcount locations
instead of calling stop machine.

Now that iret can be handled by NMIs, we perform the following to
update code:

1) Add a breakpoint to all locations that will be modified

2) Sync all cores

3) Update all locations to be either a nop or call (except breakpoint
   op)

4) Sync all cores

5) Remove the breakpoint with the new code.

6) Sync all cores

[
  Added updates that Masami suggested:
   Use unlikely(modifying_ftrace_code) in int3 trap to keep kprobes efficient.
   Don't use NOTIFY_* in ftrace handler in int3 as it is not a notifier.
]

Cc: H. Peter Anvin <hpa@zytor.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-04-27 21:10:44 -04:00
Andreas Herrmann
f7f286a910 x86/amd: Re-enable CPU topology extensions in case BIOS has disabled it
BIOS will switch off the corresponding feature flag on family
15h models 10h-1fh non-desktop CPUs.

The topology extension CPUID leafs are required to detect which
cores belong to the same compute unit. (thread siblings mask is
set accordingly and also correct information about L1i and L2
cache sharing depends on this).

W/o this patch we wouldn't see which cores belong to the same
compute unit and also cache sharing information for L1i and L2
would be incorrect on such systems.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-27 16:43:09 +02:00
Robert Richter
5f09fc6889 perf/x86: Fix cmpxchg() usage in amd_put_event_constraints()
Now the return value of cmpxchg() is used to match an event. The
change removes the duplicate event comparison and traverses the list
until an event was removed. This also fixes the following warning:

 arch/x86/kernel/cpu/perf_event_amd.c:170: warning: value computed is not used

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333643084-26776-3-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-26 13:52:51 +02:00
Robert Richter
10c250234c perf: Trivial cleanup of duplicate code
Removing duplicate code.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333643084-26776-2-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-26 13:52:50 +02:00
Thomas Gleixner
7eb43a6d23 x86: Use generic idle thread allocation
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/20120420124557.246929343@linutronix.de
2012-04-26 12:06:10 +02:00
Thomas Gleixner
5cdaf1834f x86: Add task_struct argument to smp_ops.cpu_up
Preparatory patch to use the generic idle thread allocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/20120420124557.176604405@linutronix.de
2012-04-26 12:06:10 +02:00
Ingo Molnar
fab06992de perf/x86: Clean up register_nmi_handler() usage
A function name represents the pointer to it - no need to take the
address of it. (Fixing this helps us introduce some macro magic
around register_nmi_handler() in the future.)

Cc: Robert Richter <robert.richter@amd.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-25 12:56:26 +02:00
Greg Pearson
ea0dcf903e x86/apic: Use x2apic physical mode based on FADT setting
Provide systems that do not support x2apic cluster mode
a mechanism to select x2apic physical mode using the
FADT FORCE_APIC_PHYSICAL_DESTINATION_MODE bit.

Changes from v1: (based on Suresh's comments)
 - removed #ifdef CONFIG_ACPI
 - removed #include <linux/acpi.h>

Signed-off-by: Greg Pearson <greg.pearson@hp.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1335313436-32020-1-git-send-email-greg.pearson@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-25 12:47:08 +02:00
Li Zhong
72b3fb2471 x86/nmi: Fix page faults by nmiaction if kmemcheck is enabled
This patch tries to fix the problem of page fault exception
caused by accessing nmiaction structure in nmi if kmemcheck
is enabled.

If kmemcheck is enabled, the memory allocated through slab are
in pages that are marked non-present, so that some checks could
be done in the page fault handling code ( e.g. whether the
memory is read before written to ).

As nmiaction is allocated in this way, so it resides in a
non-present page. Then there is a page fault while the nmi code
accessing the nmiaction structure, which would then cause a
warning by WARN_ON_ONCE(in_nmi()) in kmemcheck_fault(), called
by do_page_fault().

This significantly simplifies the code as well, as the whole
dynamic allocation dance goes away.

v2: as Peter suggested, changed the nmiaction to use static
    storage.

v3: as Peter suggested, use macro to shorten the codes. Also
    keep the original usage of register_nmi_handler, so users of
    this call doesn't need change.

Tested-by: Seiji Aguchi <seiji.aguchi@hds.com>
Fixes: https://lkml.org/lkml/2012/3/2/356
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
[ simplified the wrappers ]
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: thomas.mingarelli@hp.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1333051877-15755-4-git-send-email-dzickus@redhat.com
[ tidied the patch a bit ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-25 12:44:06 +02:00
Don Zickus
553222f3e8 x86/nmi: Add new NMI queues to deal with IO_CHK and SERR
In discussions with Thomas Mingarelli about hpwdt, he explained
to me some issues they were some when using their virtual NMI
button to test the hpwdt driver.

It turns out the virtual NMI button used on HP's machines do no
send unknown NMIs but instead send IO_CHK NMIs.  The way the
kernel code is written, the hpwdt driver can not register itself
against that type of NMI and therefore can not successfully
capture system information before panic'ing.

To solve this I created two new NMI queues to allow driver to
register against the IO_CHK and SERR NMIs.  Or in the hpwdt all
three (if you include unknown NMIs too).

The change is straightforward and just mimics what the unknown
NMI does.

Reported-and-tested-by: Thomas Mingarelli <thomas.mingarelli@hp.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1333051877-15755-3-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-25 12:43:34 +02:00
Ingo Molnar
cd32b1616b A small L3 cache index disable fix from Srivatsa Bhat which unifies the
way the code checks for already disabled indices.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPkEFNAAoJEBLB8Bhh3lVKEigP/1hrNM0oE3IQHejNYkZ2kbj0
 DL4WkALiHCgWJVcr/lk9PvTaY7aIAo8HFgvbUKF4PcBvnFZvu6rckJPIsWgsE1wD
 +OHDaO27vyYKMaA9ImvqYneB8hcl528EDlZ7ssfTepQXPyx99SoTYc9ToT1+znVp
 qQUC+d67MTLl/eHL+i6rbQfYdVaXGaVTuAIpzQn3oTqUDLrmXKe+60oTgnC0zgSW
 kQ63Vo8MHi9CpzCe0JhZwvK3d8sPzrBktNisaEbdHe+0Gk5zvcEiPzE2nIyvLXjz
 cf0QoQFQHzniCdFQoYjzLafT3aItBsN1v29gOrydPT6LxMZ8wO5k+8Suu1NcyI9R
 RLJR61wKwSzi4JQ/1+LAqoDQHrldATKPCM74BLYiNTi8OGeqda+10COJQmLFITzM
 9mn9fg/ZPg8V+z+0e2zObYcFVRtUCIs+XRWIzjhuPICdRRnwoNT+b9HyrLZ5JY1X
 BH3nHon0W4Bm5jEBhOb02XLdzBMtF3vRM4mNYCzpoS/tDFRszyOuV63FXkPurVbe
 hT9ZKhDZ63YV8ycwx2jqZgZAFuySi719kG3aUMDNMJYbUEi6D0QRKH9ngE6JAvwH
 rw1hX65sNX9jbBOAvcDTq2780SpV3dpO1TywLip9uSWIk6DYyEVHbr8AdWwnLlfb
 fdq9y8V/3+NC9aTNf/e3
 =VwUL
 -----END PGP SIGNATURE-----

Merge tag 'l3-fix-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/urgent

A small L3 cache index disable fix from Srivatsa Bhat which unifies the
way the code checks for already disabled indices.

( Pulling it into v3.4 despite the v3.5 tag - the fix is small and we better
  keep the same code across kernel versions for such user facing interfaces. )

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-25 12:24:16 +02:00
Konrad Rzeszutek Wilk
cd74257b97 x86, acpi: Call acpi_enter_sleep_state via an asmlinkage C function from assembler
With commit a2ef5c4fd4
"ACPI: Move module parameter gts and bfs to sleep.c" the
wake_sleep_flags is required when calling acpi_enter_sleep_state.

The assembler code in wakeup_*.S did not do that. One solution
is to call it from assembler and stick the wake_sleep_flags on
the stack (for 32-bit) or in %esi (for 64-bit). hpa and rafael
both suggested however to create a wrapper function to call
acpi_enter_sleep_state and call said wrapper function
("acpi_enter_s3") from assembler.

For 32-bit, the acpi_enter_s3 ends up looking as so:

  push   %ebp
  mov    %esp,%ebp
  sub    $0x8,%esp
  movzbl 0xc1809314,%eax [wake_sleep_flags]
  movl   $0x3,(%esp)
  mov    %eax,0x4(%esp)
  call   0xc12d1fa0 <acpi_enter_sleep_state>
  leave
  ret

And 64-bit:

  movzbl 0x9afde1(%rip),%esi        [wake_sleep_flags]
  push   %rbp
  mov    $0x3,%edi
  mov    %rsp,%rbp
  callq  0xffffffff812e9800 <acpi_enter_sleep_state>
  leaveq
  retq

Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
[v2: Remove extra assembler operations, per hpa review]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1335150198-21899-3-git-send-email-konrad.wilk@oracle.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-04-23 13:29:18 -07:00
Konrad Rzeszutek Wilk
2a14e541ed ACPI: Convert wake_sleep_flags to a value instead of function
With commit a2ef5c4fd4
"ACPI: Move module parameter gts and bfs to sleep.c" the wake_sleep_flags
is required when calling acpi_enter_sleep_state, which means
that if there are functions outside the sleep.c code they
can't get the wake_sleep_flags values.

This converts the function in to a exported value and converts
the module config operands to a function.

Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Lin Ming <ming.m.lin@intel.com>
[v2: Parameters can be turned on/off dynamically]
[v3: unsigned char -> u8]
[v4: val -> kp->arg]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1335150198-21899-2-git-send-email-konrad.wilk@oracle.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-04-23 13:29:07 -07:00
Chen Gong
8571723a69 x86/mce Add validation check before GHES error is recorded
When GHES error record is logged into mcelog kernel buffer, a validation
check for physical address is necessary, which prevents reporting an
invalid physical address.

[Since physical address is the only useful element in this error record,
we drop generating the record completely if we don't have a valid address]

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-04-20 16:02:05 -07:00
H. Peter Anvin
5d6f8d77ed x86, extable: Remove open-coded exception table entries in arch/x86/kernel/test_rodata.c
Remove open-coded exception table entries in arch/x86/kernel/test_rodata.c,
and replace them with _ASM_EXTABLE() macros; this will allow us to
change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
2012-04-20 13:51:38 -07:00
H. Peter Anvin
d7abc0fa99 x86, extable: Remove open-coded exception table entries in arch/x86/kernel/entry_64.S
Remove open-coded exception table entries in arch/x86/kernel/entry_64.S,
and replace them with _ASM_EXTABLE() macros; this will allow us to
change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
2012-04-20 13:51:38 -07:00
H. Peter Anvin
6837a54dd6 x86, extable: Remove open-coded exception table entries in arch/x86/kernel/entry_32.S
Remove open-coded exception table entries in arch/x86/kernel/entry_32.S,
and replace them with _ASM_EXTABLE() macros; this will allow us to
change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
2012-04-20 13:51:38 -07:00
H. Peter Anvin
4c5023a3fa x86-32: Handle exception table entries during early boot
If we get an exception during early boot, walk the exception table to
see if we should intercept it.  The main use case for this is to allow
rdmsr_safe()/wrmsr_safe() during CPU initialization.

Since the exception table is currently sorted at runtime, and fairly
late in startup, this code walks the exception table linearly.  We
obviously don't need to worry about modules, however: none have been
loaded at this point.

This patch changes the early IDT setup to look a lot more like x86-64:
we now install handlers for all 32 exception vectors.  The output of
the early exception handler has changed somewhat as it directly
reflects the stack frame of the exception handler, and the stack frame
has been somewhat restructured.

Finally, centralize the code that can and should be run only once.

[ v2: Use early_fixup_exception() instead of linear search ]

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/1334794610-5546-6-git-send-email-hpa@zytor.com
2012-04-19 16:45:02 -07:00
H. Peter Anvin
9900aa2f95 x86-64: Handle exception table entries during early boot
If we get an exception during early boot, walk the exception table to
see if we should intercept it.  The main use case for this is to allow
rdmsr_safe()/wrmsr_safe() during CPU initialization.

Since the exception table is currently sorted at runtime, and fairly
late in startup, this code walks the exception table linearly.  We
obviously don't need to worry about modules, however: none have been
loaded at this point.

[ v2: Use early_fixup_exception() instead of linear search ]

Link: http://lkml.kernel.org/r/1334794610-5546-5-git-send-email-hpa@zytor.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-04-19 15:42:45 -07:00
H. Peter Anvin
ffc4bc9c6f x86, paravirt: Replace GET_CR2_INTO_RCX with GET_CR2_INTO_RAX
GET_CR2_INTO_RCX is asinine: it is only used in one place, the actual
paravirt call returns the value in %rax, not %rcx; and the one place
that wants it wants the result in %r9.  We actually generate as a
result of this call:

       call ...
       movq %rax, %rcx
       xorq %rax, %rax		/* this value isn't even used... */
       movq %rcx, %r9

At least make the macro do what the paravirt call does, which is put
the value into %rax.

Nevermind the fact that the macro clobbers all the volatile registers.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/1334794610-5546-4-git-send-email-hpa@zytor.com
Cc: Glauber de Oliveira Costa <glommer@parallels.com>
2012-04-19 15:07:56 -07:00
H. Peter Anvin
84f4fc524e x86: Add symbolic constant for exceptions with error code
Add a symbolic constant for the bitmask which states which exceptions
carry an error code.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/1334794610-5546-3-git-send-email-hpa@zytor.com
2012-04-19 15:07:49 -07:00
Marcelo Tosatti
eac0556750 Merge branch 'linus' into queue
Merge reason: development work has dependency on kvm patches merged
upstream.

Conflicts:
	Documentation/feature-removal-schedule.txt

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-04-19 17:06:26 -03:00
Srivatsa S. Bhat
a720b2dd24 x86, intel_cacheinfo: Fix error return code in amd_set_l3_disable_slot()
If the L3 disable slot is already in use, return -EEXIST instead of
-EINVAL. The caller, store_cache_disable(), checks this return value to
print an appropriate warning.

Also, we want to signal with -EEXIST that the current index we're
disabling has actually been already disabled on the node:

$ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_0
$ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_0
-bash: echo: write error: File exists
$ echo 12 > /sys/devices/system/cpu/cpu3/cache/index3/cache_disable_1
-bash: echo: write error: File exists
$ echo 12 > /sys/devices/system/cpu/cpu5/cache/index3/cache_disable_1
-bash: echo: write error: File exists

The old code would say

-bash: echo: write error: Invalid argument

for disable slot 1 when playing the example above with no output in
dmesg, which is clearly misleading.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20120419070053.GB16645@elgon.mountain
[Boris: add testing for the other index too]
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-04-19 18:30:28 +02:00
Tony Luck
95022b8cf6 x86/mce: Avoid reading every machine check bank register twice.
Reading machine check bank registers is slow. There is a trend of
increasing the number of banks, and the number of cores. The main section
of do_machine_check() is a serialized section where each cpu in turn
checks every bank. Even on a little two socket SandyBridge-EP system
that multiplies out as:

	2 sockets * 8 cores * 2 hyperthreads * 20 banks = 640 MSRs

We already scan the banks in parallel in mce_no_way_out() to see if there
is a fatal error anywhere in the system. If we build a cache of VALID
bits during this scan, we can avoid uselessly re-reading banks that have
no data. Note that this cache is only a hint. If the valid bit is set in a
shared bank, all cpus that share that bank will see it during the parallel
scan, but the first to find it in the sequential scan will (usually) clear
the bank.

Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-04-19 09:12:43 -07:00
Bryan O'Donoghue
cbf2829b61 x86, apic: APIC code touches invalid MSR on P5 class machines
Current APIC code assumes MSR_IA32_APICBASE is present for all systems.
Pentium Classic P5 and friends didn't have this MSR. MSR_IA32_APICBASE
was introduced as an architectural MSR by Intel @ P6.

Code paths that can touch this MSR invalidly are when vendor == Intel &&
cpu-family == 5 and APIC bit is set in CPUID - or when you simply pass
lapic on the kernel command line, on a P5.

The below patch stops Linux incorrectly interfering with the
MSR_IA32_APICBASE for P5 class machines. Other code paths exist that
touch the MSR - however those paths are not currently reachable for a
conformant P5.

Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linux.intel.com>
Link: http://lkml.kernel.org/r/4F8EEDD3.1080404@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
2012-04-18 09:44:31 -07:00
Oleg Nesterov
089f9fba56 i387: ptrace breaks the lazy-fpu-restore logic
Starting from 7e16838d "i387: support lazy restore of FPU state"
we assume that fpu_owner_task doesn't need restore_fpu_checking()
on the context switch, its FPU state should match what we already
have in the FPU on this CPU.

However, debugger can change the tracee's FPU state, in this case
we should reset fpu.last_cpu to ensure fpu_lazy_restore() can't
return true.

Change init_fpu() to do this, it is called by user_regset->set()
methods.

Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20120416204815.GB24884@redhat.com
Cc: <stable@vger.kernel.org> v3.3
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-04-16 16:23:59 -07:00
Andreas Herrmann
68894632af x86/platform: Remove incorrect error message in x86_default_fixup_cpu_id()
It's only called from amd.c:srat_detect_node(). The introduced
condition for calling the fixup code is true for all AMD
multi-node processors, e.g. Magny-Cours and Interlagos. There we
have 2 NUMA nodes on one socket. Thus there are cores having
different numa-node-id but with equal phys_proc_id.

There is no point to print error messages in such a situation.

The confusing/misleading error message was introduced with
commit 64be4c1c24 ("x86: Add
x86_init platform override to fix up NUMA core numbering").

Remove the default fixup function (especially the error message)
and replace it by a NULL pointer check, move the
Numascale-specific condition for calling the fixup into the
fixup-function itself and slightly adapt the comment.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: <stable@kernel.org>
Cc: <sp@numascale.com>
Cc: <bp@amd64.org>
Cc: <daniel@numascale-asia.com>
Link: http://lkml.kernel.org/r/20120402160648.GR27684@alberich.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-16 20:43:43 +02:00
Josh Triplett
a7e0e4e99f x86: Fix typo in MODULE_DEVICE_TABLE example: s/x86_cpu/x86cpu/
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-04-16 14:20:19 +02:00
Andreas Herrmann
d7de8649f3 x86/amd: Remove broken links from comment and kernel message
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Link: http://lkml.kernel.org/r/20120411151238.GA4794@alberich.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-16 08:53:43 +02:00
Ingo Molnar
1c2b443e2a A two-patch fix from Andreas taking care of a sysfs warning when the
microcode driver is loaded on unsupported platforms.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPh/jXAAoJEBLB8Bhh3lVKhFUP/25nGddMi/wl4D76Ef8eJJRu
 lYVJGqatSCJIV+51DFaFnC9JQ2C10bMfs5it6tNNPTaTpTTss3b/opFs9M2CXJF5
 GHY7XlIc7fC7v+Vur9Jgmz4QAldYVYtARDkx6uIuADiD6LvSsug384uFy6Yscdnh
 s02JTa9PeGYOz0RtTPxzrsGPablDukcfXV/Is/dLNygD+mHs8HdLWNRBDEZfDRea
 9nH+zqwa/SbkGPZHbN8wIGLD7V+QMLWbY5YRvOHzlU+L5Hnjfik25+XlsuKKSv+B
 v6/HNV8YluAGNixIFlm9EkrzNXTMe8smhM50+8b0CjOX/n2ldi89VVheOsgoPaNa
 dTwyTcrfY71I/B/nyp4pIswEt4P96c2HiZpGO6b22tGxDP7EYtXPKka1v7reBP/O
 atDmK+3++JOzgaqbKjZkdiLawyzAxHgkLZAr02EsCb0IFF8s6a0HI23s9zOEem3T
 pifrcYE8j8VgUIvGBUwWWzKONqiXpXPAKU/bwqphQIsnOqIaXOTaCuyDI5PjQX9l
 Zri/UxXvoBEq+fuobM3W7dhFQ3AnN5h0Ri3L55Eak9DVJU8vQ3ZfbxAwXGBeVtut
 EDp9LkT4wP0WTY8PfJaZUQoIZ72m0g2sO2eEkcwPuT6ljQb7Rdsr1y5F3KzP4kRF
 o79lhzWZBAv3cBYvocNS
 =ZgVv
 -----END PGP SIGNATURE-----

Merge tag 'microcode-fix-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/urgent

Pull from Borislav Petkov a two-patch fix from Andreas taking care of a sysfs
warning when the microcode driver is loaded on unsupported platforms.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-14 13:30:32 +02:00
Ingo Molnar
6ac1ef482d Merge branch 'perf/core' into perf/uprobes
Merge in latest upstream (and the latest perf development tree),
to prepare for tooling changes, and also to pick up v3.4 MM
changes that the uprobes code needs to take care of.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-14 13:19:04 +02:00
Will Drewry
c6cfbeb402 x86: Enable HAVE_ARCH_SECCOMP_FILTER
Enable support for seccomp filter on x86:
- syscall_get_arch()
- syscall_get_arguments()
- syscall_rollback()
- syscall_set_return_value()
- SIGSYS siginfo_t support
- secure_computing is called from a ptrace_event()-safe context
- secure_computing return value is checked (see below).

SECCOMP_RET_TRACE and SECCOMP_RET_TRAP may result in seccomp needing to
skip a system call without killing the process.  This is done by
returning a non-zero (-1) value from secure_computing.  This change
makes x86 respect that return value.

To ensure that minimal kernel code is exposed, a non-zero return value
results in an immediate return to user space (with an invalid syscall
number).

Signed-off-by: Will Drewry <wad@chromium.org>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>

v18: rebase and tweaked change description, acked-by
v17: added reviewed by and rebased
v..: all rebases since original introduction.
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-04-14 11:13:21 +10:00
Andreas Herrmann
283c1f2558 x86, microcode: Ensure that module is only loaded on supported AMD CPUs
Exit early when there's no support for a particular CPU family. Also,
fixup the "no support for this CPU vendor" to be issued only when the
driver is attempted to be loaded on an unsupported vendor.

Cc: stable@vger.kernel.org
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/20120411163849.GE4794@alberich.amd.com
[Boris: add a commit msg because Andreas is lazy]
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-04-13 11:51:05 +02:00
Andreas Herrmann
a956bd6f85 x86, microcode: Fix sysfs warning during module unload on unsupported CPUs
Loading the microcode driver on an unsupported CPU and subsequently
unloading the driver causes

 WARNING: at fs/sysfs/group.c:138 mc_device_remove+0x5f/0x70 [microcode]()
 Hardware name: 01972NG
 sysfs group ffffffffa00013d0 not found for kobject 'cpu0'
 Modules linked in: snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel btusb snd_hda_codec bluetooth thinkpad_acpi rfkill microcode(-) [last unloaded: cfg80211]
 Pid: 4560, comm: modprobe Not tainted 3.4.0-rc2-00002-g258f742 #5
 Call Trace:
  [<ffffffff8103113b>] ? warn_slowpath_common+0x7b/0xc0
  [<ffffffff81031235>] ? warn_slowpath_fmt+0x45/0x50
  [<ffffffff81120e74>] ? sysfs_remove_group+0x34/0x120
  [<ffffffffa00000ef>] ? mc_device_remove+0x5f/0x70 [microcode]
  [<ffffffff81331eb9>] ? subsys_interface_unregister+0x69/0xa0
  [<ffffffff81563526>] ? mutex_lock+0x16/0x40
  [<ffffffffa0000c3e>] ? microcode_exit+0x50/0x92 [microcode]
  [<ffffffff8107051d>] ? sys_delete_module+0x16d/0x260
  [<ffffffff810a0065>] ? wait_iff_congested+0x45/0x110
  [<ffffffff815656af>] ? page_fault+0x1f/0x30
  [<ffffffff81565ba2>] ? system_call_fastpath+0x16/0x1b

on recent kernels.

This is due to commit 8a25a2fd12 ("cpu: convert 'cpu' and
'machinecheck' sysdev_class to a regular subsystem") which renders
commit 6c53cbfced ("x86, microcode: Correct sysdev_add error path")
useless.

See http://marc.info/?l=linux-kernel&m=133416246406478

Avoid above warning by restoring the old driver behaviour before
6c53cbfced ("x86, microcode: Correct sysdev_add error path").

Cc: stable@vger.kernel.org
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/20120411163849.GE4794@alberich.amd.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-04-13 11:49:06 +02:00
Linus Torvalds
4a1d7544fe Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner.

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Use correct byte-sized register constraint in __add()
  x86: Use correct byte-sized register constraint in __xchg_op()
  x86: vsyscall: Use NULL instead 0 for a pointer argument
2012-04-12 15:06:07 -07:00
Eric B Munson
248997095d kvmclock: remove unneeded EXPORT macro
check_and_clear_guest_paused does not need to be exported as it isn't used
by any modules, remove the export.

Signed-off-by: Eric B Munson <emunson@mgebm.net>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:49:54 +03:00
Eric B Munson
3b5d56b931 kvmclock: Add functions to check if the host has stopped the vm
When a host stops or suspends a VM it will set a flag to show this.  The
watchdog will use these functions to determine if a softlockup is real, or the
result of a suspended VM.

Signed-off-by: Eric B Munson <emunson@mgebm.net>
asm-generic changes Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-08 12:48:59 +03:00
Linus Torvalds
a3fac08085 Merge branch 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull a few KVM fixes from Avi Kivity:
 "A bunch of powerpc KVM fixes, a guest and a host RCU fix (unrelated),
  and a small build fix."

* 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Resolve RCU vs. async page fault problem
  KVM: VMX: vmx_set_cr0 expects kvm->srcu locked
  KVM: PMU: Fix integer constant is too large warning in kvm_pmu_set_msr()
  KVM: PPC: Book3S: PR: Fix preemption
  KVM: PPC: Save/Restore CR over vcpu_run
  KVM: PPC: Book3S HV: Save and restore CR in __kvmppc_vcore_entry
  KVM: PPC: Book3S HV: Fix kvm_alloc_linear in case where no linears exist
  KVM: PPC: Book3S: Compile fix for ppc32 in HIOR access code
2012-04-07 09:53:33 -07:00
Emil Goode
46ed99d1b7 x86: vsyscall: Use NULL instead 0 for a pointer argument
This patch silences the following sparse warning:
arch/x86/kernel/vsyscall_64.c:250:34:
       warning: Using plain integer as NULL pointer

Signed-off-by: Emil Goode <emilgoode@gmail.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: john.stultz@linaro.org
Link: http://lkml.kernel.org/r/1333306084-3776-1-git-send-email-emilgoode@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-04-06 11:49:59 +02:00
Linus Torvalds
5d32c88f0b Merge branch 'akpm' (Andrew's patch-bomb)
Merge batch of fixes from Andrew Morton:
 "The simple_open() cleanup was held back while I wanted for laggards to
  merge things.

  I still need to send a few checkpoint/restore patches.  I've been
  wobbly about merging them because I'm wobbly about the overall
  prospects for success of the project.  But after speaking with Pavel
  at the LSF conference, it sounds like they're further toward
  completion than I feared - apparently davem is at the "has stopped
  complaining" stage regarding the net changes.  So I need to go back
  and re-review those patchs and their (lengthy) discussion."

* emailed from Andrew Morton <akpm@linux-foundation.org>: (16 patches)
  memcg swap: use mem_cgroup_uncharge_swap fix
  backlight: add driver for DA9052/53 PMIC v1
  C6X: use set_current_blocked() and block_sigmask()
  MAINTAINERS: add entry for sparse checker
  MAINTAINERS: fix REMOTEPROC F: typo
  alpha: use set_current_blocked() and block_sigmask()
  simple_open: automatically convert to simple_open()
  scripts/coccinelle/api/simple_open.cocci: semantic patch for simple_open()
  libfs: add simple_open()
  hugetlbfs: remove unregister_filesystem() when initializing module
  drivers/rtc/rtc-88pm860x.c: fix rtc irq enable callback
  fs/xattr.c:setxattr(): improve handling of allocation failures
  fs/xattr.c:listxattr(): fall back to vmalloc() if kmalloc() failed
  fs/xattr.c: suppress page allocation failure warnings from sys_listxattr()
  sysrq: use SEND_SIG_FORCED instead of force_sig()
  proc: fix mount -t proc -o AAA
2012-04-05 15:30:34 -07:00
Stephen Boyd
234e340582 simple_open: automatically convert to simple_open()
Many users of debugfs copy the implementation of default_open() when
they want to support a custom read/write function op.  This leads to a
proliferation of the default_open() implementation across the entire
tree.

Now that the common implementation has been consolidated into libfs we
can replace all the users of this function with simple_open().

This replacement was done with the following semantic patch:

<smpl>
@ open @
identifier open_f != simple_open;
identifier i, f;
@@
-int open_f(struct inode *i, struct file *f)
-{
(
-if (i->i_private)
-f->private_data = i->i_private;
|
-f->private_data = i->i_private;
)
-return 0;
-}

@ has_open depends on open @
identifier fops;
identifier open.open_f;
@@
struct file_operations fops = {
...
-.open = open_f,
+.open = simple_open,
...
};
</smpl>

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05 15:25:50 -07:00
Gleb Natapov
e08759215b KVM: Resolve RCU vs. async page fault problem
"Page ready" async PF can kick vcpu out of idle state much like IRQ.
We need to tell RCU about this.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-05 19:26:14 +03:00
Linus Torvalds
6c216ec636 KGDB/KDB regression fixes
3.4: Fix an an Smatch warning that appeared in the 3.4 merge window
    3.0: Fix kgdb test suite with SMP for all archs without HW single stepping
 2.6.36: Fix kgdb sw breakpoints with CONFIG_DEBUG_RODATA=y limitations on x86
 2.6.26: Fix oops on kgdb test suite with CONFIG_DEBUG_RODATA
         Fix kgdb test suite with SMP for all archs with HW single stepping
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPedocAAoJEIciOldedpOjn7EP/397Rh0zmRlG8oQwMEJcK3E5
 gaRyBNpkGoU3ekHXHx/nzgQ/CS9opzW7nBZDu8weWLjRKMT4RyHfuJcWyu525GvQ
 SnoiX2ZUzP315d8llCYwXmaCEYA7lHQi4T2bGMlDSn1J8kS235EQxllgEfhXDdEC
 DxRWgHABG2UR62C62sGKbPaMMDO9TcNcrAQK27LDLTS7pKLmYqBWBdZKgWzBM/Pr
 AF8vakqSgUw3Aq9qrLge+483uT7uhMoUJofxRppWtm1QgnDcTmri9LOagiazDotz
 RQliRGwVxj9hEo5mLEiQtI0N1kIGCAsK0+9aUJEZRXovRBR9kvqaqHT4c5xdhznr
 VKYvqqTcHBkKLIfNXFvQZnn2cXtNVNqve9CZZwdBJaFYEkaR7ZVQqE6f2xq8KAb2
 RmhvzlEUyLU+89YKkH66uSa22VLSazkeH+4b8AJ4JxYDEab3BHoBCe8axcBQrTsj
 7X5NOs7V3Oj+4J3bS1fbUbxq4t0dfpLLyg8e/lELWtT+Kq7nQRzA2XHRZAMTve8M
 T0cTdrwtUbgY9ZMTpywNB2KlPgTvhWOyfYbH6/Kcks7ecSXlkow3edXoiUbw79iE
 hP8vcMWbT2Rv3IbLkSMFZEQGAG9qL1YyGv4NDmLOoljO1c/Bi3WQIR5aI+di6asV
 Z5q5s/bmGa4+OhFFITSd
 =SW2N
 -----END PGP SIGNATURE-----

Merge tag 'for_linus-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb

Pull KGDB/KDB regression fixes from Jason Wessel:
 - Fix a Smatch warning that appeared in the 3.4 merge window
 - Fix kgdb test suite with SMP for all archs without HW single stepping
 - Fix kgdb sw breakpoints with CONFIG_DEBUG_RODATA=y limitations on x86
 - Fix oops on kgdb test suite with CONFIG_DEBUG_RODATA
 - Fix kgdb test suite with SMP for all archs with HW single stepping

* tag 'for_linus-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
  x86,kgdb: Fix DEBUG_RODATA limitation using text_poke()
  kgdb,debug_core: pass the breakpoint struct instead of address and memory
  kgdbts: (2 of 2) fix single step awareness to work correctly with SMP
  kgdbts: (1 of 2) fix single step awareness to work correctly with SMP
  kgdbts: Fix kernel oops with CONFIG_DEBUG_RODATA
  kdb: Fix smatch warning on dbg_io_ops->is_console
2012-04-04 17:26:08 -07:00
Linus Torvalds
58bca4a8fa Merge branch 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA mapping branch from Marek Szyprowski:
 "Short summary for the whole series:

  A few limitations have been identified in the current dma-mapping
  design and its implementations for various architectures.  There exist
  more than one function for allocating and freeing the buffers:
  currently these 3 are used dma_{alloc, free}_coherent,
  dma_{alloc,free}_writecombine, dma_{alloc,free}_noncoherent.

  For most of the systems these calls are almost equivalent and can be
  interchanged.  For others, especially the truly non-coherent ones
  (like ARM), the difference can be easily noticed in overall driver
  performance.  Sadly not all architectures provide implementations for
  all of them, so the drivers might need to be adapted and cannot be
  easily shared between different architectures.  The provided patches
  unify all these functions and hide the differences under the already
  existing dma attributes concept.  The thread with more references is
  available here:

    http://www.spinics.net/lists/linux-sh/msg09777.html

  These patches are also a prerequisite for unifying DMA-mapping
  implementation on ARM architecture with the common one provided by
  dma_map_ops structure and extending it with IOMMU support.  More
  information is available in the following thread:

    http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819

  More works on dma-mapping framework are planned, especially in the
  area of buffer sharing and managing the shared mappings (together with
  the recently introduced dma_buf interface: commit d15bd7ee44
  "dma-buf: Introduce dma buffer sharing mechanism").

  The patches in the current set introduce a new alloc/free methods
  (with support for memory attributes) in dma_map_ops structure, which
  will later replace dma_alloc_coherent and dma_alloc_writecombine
  functions."

People finally started piping up with support for merging this, so I'm
merging it as the last of the pending stuff from the merge window.
Looks like pohmelfs is going to wait for 3.5 and more external support
for merging.

* 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  common: DMA-mapping: add NON-CONSISTENT attribute
  common: DMA-mapping: add WRITE_COMBINE attribute
  common: dma-mapping: introduce mmap method
  common: dma-mapping: remove old alloc_coherent and free_coherent methods
  Hexagon: adapt for dma_map_ops changes
  Unicore32: adapt for dma_map_ops changes
  Microblaze: adapt for dma_map_ops changes
  SH: adapt for dma_map_ops changes
  Alpha: adapt for dma_map_ops changes
  SPARC: adapt for dma_map_ops changes
  PowerPC: adapt for dma_map_ops changes
  MIPS: adapt for dma_map_ops changes
  X86 & IA64: adapt for dma_map_ops changes
  common: dma-mapping: introduce generic alloc() and free() methods
2012-04-04 17:13:43 -07:00
Linus Torvalds
66cfb32772 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar.

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/p4: Add format attributes
  tracing, sched, vfs: Fix 'old_pid' usage in trace_sched_process_exec()
2012-04-04 10:04:42 -07:00
Linus Torvalds
6742259866 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar.

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, kvm: Call restore_sched_clock_state() only after %gs is initialized
  x86: Use -mno-avx when available
  x86: Remove the ancient and deprecated disable_hlt() and enable_hlt() facility
  x86: Preserve lazy irq disable semantics in fixup_irqs()
2012-04-04 10:04:01 -07:00
Peter Zijlstra
7b8e6da46b perf/x86/p4: Add format attributes
Steven reported his P4 not booting properly, the missing format
attributes cause a NULL ptr deref. Cure this by adding the
missing format specification.

I took the format description out of the comment near
p4_config_pack*() and hope that comment is still relatively
accurate.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Bruno Prémont <bonbons@linux-vserver.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1332859842.16159.227.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-03 08:33:38 +02:00
Linus Torvalds
f187e9fd68 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates and fixes from Ingo Molnar:
 "It's mostly fixes, but there's also two late items:

   - preliminary GTK GUI support for perf report
   - PMU raw event format descriptors in sysfs, to be parsed by tooling

  The raw event format in sysfs is a new ABI.  For example for the 'CPU'
  PMU we have:

    aldebaran:~> ll /sys/bus/event_source/devices/cpu/format/*
    -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/any
    -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/cmask
    -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/edge
    -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/event
    -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/inv
    -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/offcore_rsp
    -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/pc
    -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/umask

  those lists of fields contain a specific format:

    aldebaran:~> cat /sys/bus/event_source/devices/cpu/format/offcore_rsp
    config1:0-63

  So, those who wish to specify raw events can now use the following
  event format:

    -e cpu/cmask=1,event=2,umask=3

  Most people will not want to specify any events (let alone raw
  events), they'll just use whatever default event the tools use.

  But for more obscure PMU events that have no cross-architecture
  generic events the above syntax is more usable and a bit more
  structured than specifying hex numbers."

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  perf tools: Remove auto-generated bison/flex files
  perf annotate: Fix off by one symbol hist size allocation and hit accounting
  perf tools: Add missing ref-cycles event back to event parser
  perf annotate: addr2line wants addresses in same format as objdump
  perf probe: Finder fails to resolve function name to address
  tracing: Fix ent_size in trace output
  perf symbols: Handle NULL dso in dso__name_len
  perf symbols: Do not include libgen.h
  perf tools: Fix bug in raw sample parsing
  perf tools: Fix display of first level of callchains
  perf tools: Switch module.h into export.h
  perf: Move mmap page data_head offset assertion out of header
  perf: Fix mmap_page capabilities and docs
  perf diff: Fix to work with new hists design
  perf tools: Fix modifier to be applied on correct events
  perf tools: Fix various casting issues for 32 bits
  perf tools: Simplify event_read_id exit path
  tracing: Fix ftrace stack trace entries
  tracing: Move the tracing_on/off() declarations into CONFIG_TRACING
  perf report: Add a simple GTK2-based 'perf report' browser
  ...
2012-03-31 13:34:04 -07:00
Linus Torvalds
a335750b9a Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull ACPI & Power Management changes from Len Brown:
 - ACPI 5.0 after-ripples, ACPICA/Linux divergence cleanup
 - cpuidle evolving, more ARM use
 - thermal sub-system evolving, ditto
 - assorted other PM bits

Fix up conflicts in various cpuidle implementations due to ARM cpuidle
cleanups (ARM at91 self-refresh and cpu idle code rewritten into
"standby" in asm conflicting with the consolidation of cpuidle time
keeping), trivial SH include file context conflict and RCU tracing fixes
in generic code.

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (77 commits)
  ACPI throttling: fix endian bug in acpi_read_throttling_status()
  Disable MCP limit exceeded messages from Intel IPS driver
  ACPI video: Don't start video device until its associated input device has been allocated
  ACPI video: Harden video bus adding.
  ACPI: Add support for exposing BGRT data
  ACPI: export acpi_kobj
  ACPI: Fix logic for removing mappings in 'acpi_unmap'
  CPER failed to handle generic error records with multiple sections
  ACPI: Clean redundant codes in scan.c
  ACPI: Fix unprotected smp_processor_id() in acpi_processor_cst_has_changed()
  ACPI: consistently use should_use_kmap()
  PNPACPI: Fix device ref leaking in acpi_pnp_match
  ACPI: Fix use-after-free in acpi_map_lsapic
  ACPI: processor_driver: add missing kfree
  ACPI, APEI: Fix incorrect APEI register bit width check and usage
  Update documentation for parameter *notrigger* in einj.txt
  ACPI, APEI, EINJ, new parameter to control trigger action
  ACPI, APEI, EINJ, limit the range of einj_param
  ACPI, APEI, Fix ERST header length check
  cpuidle: power_usage should be declared signed integer
  ...
2012-03-30 16:45:39 -07:00
Len Brown
d326f44e5f Merge branch 'tboot' into release
Conflicts:
	drivers/acpi/acpica/hwsleep.c

Text conflict between:

2feec47d4c
(ACPICA: ACPI 5: Support for new FADT SleepStatus, SleepControl registers)

which removed #include "actables.h"

and

09f98a825a
(x86, acpi, tboot: Have a ACPI os prepare sleep instead of calling tboot_sleep.)

which removed #include <linux/tboot.h>

The resolution is to remove them both.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 16:38:59 -04:00
Len Brown
1a05e46787 Merge branches 'acpica', 'bgrt', 'bz-11533', 'cpuidle', 'ec', 'hotplug', 'misc', 'red-hat-bz-727865', 'thermal', 'throttling', 'turbostat' and 'video' into release
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 16:10:37 -04:00
Petr Vandrovec
ac909ec308 ACPI: Fix use-after-free in acpi_map_lsapic
When processor is being hot-added to the system, acpi_map_lsapic invokes
ACPI _MAT method to find APIC ID and flags, verifies that returned structure
is indeed ACPI's local APIC structure, and that flags contain MADT_ENABLED
bit.  Then saves APIC ID, frees structure - and accesses structure when
computing arguments for acpi_register_lapic call.  Which sometime leads
to acpi_register_lapic call being made with second argument zero, failing
to bring processor online with error 'Unable to map lapic to logical cpu
number'.

As lapic->lapic_flags & ACPI_MADT_ENABLED was already confirmed to be non-zero
few lines above, we can just pass unconditional ACPI_MADT_ENABLED to the
acpi_register_lapic.

Signed-off-by: Petr Vandrovec <petr@vmware.com>
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:31:58 -04:00
Boris Ostrovsky
1a022e3f1b idle, x86: Allow off-lined CPU to enter deeper C states
Currently when a CPU is off-lined it enters either MWAIT-based idle or,
if MWAIT is not desired or supported, HLT-based idle (which places the
processor in C1 state). This patch allows processors without MWAIT
support to stay in states deeper than C1.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:23:01 -04:00
Len Brown
f6365201d8 x86: Remove the ancient and deprecated disable_hlt() and enable_hlt() facility
The X86_32-only disable_hlt/enable_hlt mechanism was used by the
32-bit floppy driver. Its effect was to replace the use of the
HLT instruction inside default_idle() with cpu_relax() - essentially
it turned off the use of HLT.

This workaround was commented in the code as:

 "disable hlt during certain critical i/o operations"

 "This halt magic was a workaround for ancient floppy DMA
  wreckage. It should be safe to remove."

H. Peter Anvin additionally adds:

 "To the best of my knowledge, no-hlt only existed because of
  flaky power distributions on 386/486 systems which were sold to
  run DOS.  Since DOS did no power management of any kind,
  including HLT, the power draw was fairly uniform; when exposed
  to the much hhigher noise levels you got when Linux used HLT
  caused some of these systems to fail.

  They were by far in the minority even back then."

Alan Cox further says:

 "Also for the Cyrix 5510 which tended to go castors up if a HLT
  occurred during a DMA cycle and on a few other boxes HLT during
  DMA tended to go astray.

  Do we care ? I doubt it. The 5510 was pretty obscure, the 5520
  fixed it, the 5530 is probably the oldest still in any kind of
  use."

So, let's finally drop this.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Stephen Hemminger <shemminger@vyatta.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/n/tip-3rhk9bzf0x9rljkv488tloib@git.kernel.org
[ If anyone cares then alternative instruction patching could be
  used to replace HLT with a one-byte NOP instruction. Much simpler. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-30 08:50:27 +02:00
Ingo Molnar
186e54cbe1 Merge branch 'linus' into x86/urgent
Merge reason: Needed for include file dependencies.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-30 08:50:06 +02:00
Linus Torvalds
eb05df9e7e Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Peter Anvin:
 "The biggest textual change is the cleanup to use symbolic constants
  for x86 trap values.

  The only *functional* change and the reason for the x86/x32 dependency
  is the move of is_ia32_task() into <asm/thread_info.h> so that it can
  be used in other code that needs to understand if a system call comes
  from the compat entry point (and therefore uses i386 system call
  numbers) or not.  One intended user for that is the BPF system call
  filter.  Moving it out of <asm/compat.h> means we can define it
  unconditionally, returning always true on i386."

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Move is_ia32_task to asm/thread_info.h from asm/compat.h
  x86: Rename trap_no to trap_nr in thread_struct
  x86: Use enum instead of literals for trap values
2012-03-29 18:21:35 -07:00
Linus Torvalds
a591afc01d Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x32 support for x86-64 from Ingo Molnar:
 "This tree introduces the X32 binary format and execution mode for x86:
  32-bit data space binaries using 64-bit instructions and 64-bit kernel
  syscalls.

  This allows applications whose working set fits into a 32 bits address
  space to make use of 64-bit instructions while using a 32-bit address
  space with shorter pointers, more compressed data structures, etc."

Fix up trivial context conflicts in arch/x86/{Kconfig,vdso/vma.c}

* 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
  x32: Fix alignment fail in struct compat_siginfo
  x32: Fix stupid ia32/x32 inversion in the siginfo format
  x32: Add ptrace for x32
  x32: Switch to a 64-bit clock_t
  x32: Provide separate is_ia32_task() and is_x32_task() predicates
  x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls
  x86/x32: Fix the binutils auto-detect
  x32: Warn and disable rather than error if binutils too old
  x32: Only clear TIF_X32 flag once
  x32: Make sure TS_COMPAT is cleared for x32 tasks
  fs: Remove missed ->fds_bits from cessation use of fd_set structs internally
  fs: Fix close_on_exec pointer in alloc_fdtable
  x32: Drop non-__vdso weak symbols from the x32 VDSO
  x32: Fix coding style violations in the x32 VDSO code
  x32: Add x32 VDSO support
  x32: Allow x32 to be configured
  x32: If configured, add x32 system calls to system call tables
  x32: Handle process creation
  x32: Signal-related system calls
  x86: Add #ifdef CONFIG_COMPAT to <asm/sys_ia32.h>
  ...
2012-03-29 18:12:23 -07:00
Jason Wessel
3751d3e85c x86,kgdb: Fix DEBUG_RODATA limitation using text_poke()
There has long been a limitation using software breakpoints with a
kernel compiled with CONFIG_DEBUG_RODATA going back to 2.6.26. For
this particular patch, it will apply cleanly and has been tested all
the way back to 2.6.36.

The kprobes code uses the text_poke() function which accommodates
writing a breakpoint into a read-only page.  The x86 kgdb code can
solve the problem similarly by overriding the default breakpoint
set/remove routines and using text_poke() directly.

The x86 kgdb code will first attempt to use the traditional
probe_kernel_write(), and next try using a the text_poke() function.
The break point install method is tracked such that the correct break
point removal routine will get called later on.

Cc: x86@kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: stable@vger.kernel.org # >= 2.6.36
Inspried-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-03-29 17:41:25 -05:00
Linus Torvalds
7fda0412c5 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar.

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpusets: Remove an unused variable
  sched/rt: Improve pick_next_highest_task_rt()
  sched: Fix select_fallback_rq() vs cpu_active/cpu_online
  sched/x86/smp: Do not enable IRQs over calibrate_delay()
  sched: Fix compiler warning about declared inline after use
  MAINTAINERS: Update email address for SCHEDULER and PERF EVENTS
2012-03-29 14:46:05 -07:00
Linus Torvalds
6b8212a313 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 updates from Ingo Molnar.

This touches some non-x86 files due to the sanitized INLINE_SPIN_UNLOCK
config usage.

Fixed up trivial conflicts due to just header include changes (removing
headers due to cpu_idle() merge clashing with the <asm/system.h> split).

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic/amd: Be more verbose about LVT offset assignments
  x86, tls: Off by one limit check
  x86/ioapic: Add io_apic_ops driver layer to allow interception
  x86/olpc: Add debugfs interface for EC commands
  x86: Merge the x86_32 and x86_64 cpu_idle() functions
  x86/kconfig: Remove CONFIG_TR=y from the defconfigs
  x86: Stop recursive fault in print_context_stack after stack overflow
  x86/io_apic: Move and reenable irq only when CONFIG_GENERIC_PENDING_IRQ=y
  x86/apic: Add separate apic_id_valid() functions for selected apic drivers
  locking/kconfig: Simplify INLINE_SPIN_UNLOCK usage
  x86/kconfig: Update defconfigs
  x86: Fix excessive MSR print out when show_msr is not specified
2012-03-29 14:28:26 -07:00
Linus Torvalds
bcd550745f Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core updates from Thomas Gleixner.

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ia64: vsyscall: Add missing paranthesis
  alarmtimer: Don't call rtc_timer_init() when CONFIG_RTC_CLASS=n
  x86: vdso: Put declaration before code
  x86-64: Inline vdso clock_gettime helpers
  x86-64: Simplify and optimize vdso clock_gettime monotonic variants
  kernel-time: fix s/then/than/ spelling errors
  time: remove no_sync_cmos_clock
  time: Avoid scary backtraces when warning of > 11% adj
  alarmtimer: Make sure we initialize the rtctimer
  ntp: Fix leap-second hrtimer livelock
  x86, tsc: Skip refined tsc calibration on systems with reliable TSC
  rtc: Provide flag for rtc devices that don't support UIE
  ia64: vsyscall: Use seqcount instead of seqlock
  x86: vdso: Use seqcount instead of seqlock
  x86: vdso: Remove bogus locking in update_vsyscall_tz()
  time: Remove bogus comments
  time: Fix change_clocksource locking
  time: x86: Fix race switching from vsyscall to non-vsyscall clock
2012-03-29 14:16:48 -07:00
Liu, Chuansheng
99dd5497e5 x86: Preserve lazy irq disable semantics in fixup_irqs()
The default irq_disable() sematics are to mark the interrupt disabled,
but keep it unmasked. If the interrupt is delivered while marked
disabled, the low level interrupt handler masks it and marks it
pending. This is important for detecting wakeup interrupts during
suspend and for edge type interrupts to avoid losing interrupts.

fixup_irqs() moves the interrupts away from an offlined cpu. For
certain interrupt types it needs to mask the interrupt line before
changing the affinity. After affinity has changed the interrupt line
is unmasked again, but only if it is not marked disabled.

This breaks the lazy irq disable semantics and causes problems in
suspend as the interrupt can be lost or wakeup functionality is
broken.

Check irqd_irq_masked() instead of irqd_irq_disabled() because
irqd_irq_masked() is only set, when the core code actually masked the
interrupt line. If it's not set, we unmask the interrupt and let the
lazy irq disable logic deal with an eventually incoming interrupt.

[ tglx: Massaged changelog and added a comment ]

Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/27240C0AC20F114CBF8149A2696CBE4A05DFB3@SHSMSX101.ccr.corp.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-03-29 15:28:47 +02:00
Linus Torvalds
532bfc851a Merge branch 'akpm' (Andrew's patch-bomb)
Merge third batch of patches from Andrew Morton:
 - Some MM stragglers
 - core SMP library cleanups (on_each_cpu_mask)
 - Some IPI optimisations
 - kexec
 - kdump
 - IPMI
 - the radix-tree iterator work
 - various other misc bits.

 "That'll do for -rc1.  I still have ~10 patches for 3.4, will send
  those along when they've baked a little more."

* emailed from Andrew Morton <akpm@linux-foundation.org>: (35 commits)
  backlight: fix typo in tosa_lcd.c
  crc32: add help text for the algorithm select option
  mm: move hugepage test examples to tools/testing/selftests/vm
  mm: move slabinfo.c to tools/vm
  mm: move page-types.c from Documentation to tools/vm
  selftests/Makefile: make `run_tests' depend on `all'
  selftests: launch individual selftests from the main Makefile
  radix-tree: use iterators in find_get_pages* functions
  radix-tree: rewrite gang lookup using iterator
  radix-tree: introduce bit-optimized iterator
  fs/proc/namespaces.c: prevent crash when ns_entries[] is empty
  nbd: rename the nbd_device variable from lo to nbd
  pidns: add reboot_pid_ns() to handle the reboot syscall
  sysctl: use bitmap library functions
  ipmi: use locks on watchdog timeout set on reboot
  ipmi: simplify locking
  ipmi: fix message handling during panics
  ipmi: use a tasklet for handling received messages
  ipmi: increase KCS timeouts
  ipmi: decrease the IPMI message transaction time in interrupt mode
  ...
2012-03-28 17:19:28 -07:00
Dave Young
09c71bfd83 kdump x86: fix total mem size calculation for reservation
crashkernel reservation need know the total memory size.  Current
get_total_mem simply use max_pfn - min_low_pfn.  It is wrong because it
will including memory holes in the middle.

Especially for kvm guest with memory > 0xe0000000, there's below in qemu
code: qemu split memory as below:

    if (ram_size >= 0xe0000000 ) {
        above_4g_mem_size = ram_size - 0xe0000000;
        below_4g_mem_size = 0xe0000000;
    } else {
        below_4g_mem_size = ram_size;
    }

So for 4G mem guest, seabios will insert a 512M usable region beyond of
4G.  Thus in above case max_pfn - min_low_pfn will be more than original
memsize.

Fixing this issue by using memblock_phys_mem_size() to get the total
memsize.

Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28 17:14:36 -07:00
Linus Torvalds
0195c00244 Disintegrate and delete asm/system.h
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIVAwUAT3NKzROxKuMESys7AQKElw/+JyDxJSlj+g+nymkx8IVVuU8CsEwNLgRk
 8KEnRfLhGtkXFLSJYWO6jzGo16F8Uqli1PdMFte/wagSv0285/HZaKlkkBVHdJ/m
 u40oSjgT013bBh6MQ0Oaf8pFezFUiQB5zPOA9QGaLVGDLXCmgqUgd7exaD5wRIwB
 ZmyItjZeAVnDfk1R+ZiNYytHAi8A5wSB+eFDCIQYgyulA1Igd1UnRtx+dRKbvc/m
 rWQ6KWbZHIdvP1ksd8wHHkrlUD2pEeJ8glJLsZUhMm/5oMf/8RmOCvmo8rvE/qwl
 eDQ1h4cGYlfjobxXZMHqAN9m7Jg2bI946HZjdb7/7oCeO6VW3FwPZ/Ic75p+wp45
 HXJTItufERYk6QxShiOKvA+QexnYwY0IT5oRP4DrhdVB/X9cl2MoaZHC+RbYLQy+
 /5VNZKi38iK4F9AbFamS7kd0i5QszA/ZzEzKZ6VMuOp3W/fagpn4ZJT1LIA3m4A9
 Q0cj24mqeyCfjysu0TMbPtaN+Yjeu1o1OFRvM8XffbZsp5bNzuTDEvviJ2NXw4vK
 4qUHulhYSEWcu9YgAZXvEWDEM78FXCkg2v/CrZXH5tyc95kUkMPcgG+QZBB5wElR
 FaOKpiC/BuNIGEf02IZQ4nfDxE90QwnDeoYeV+FvNj9UEOopJ5z5bMPoTHxm4cCD
 NypQthI85pc=
 =G9mT
 -----END PGP SIGNATURE-----

Merge tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system

Pull "Disintegrate and delete asm/system.h" from David Howells:
 "Here are a bunch of patches to disintegrate asm/system.h into a set of
  separate bits to relieve the problem of circular inclusion
  dependencies.

  I've built all the working defconfigs from all the arches that I can
  and made sure that they don't break.

  The reason for these patches is that I recently encountered a circular
  dependency problem that came about when I produced some patches to
  optimise get_order() by rewriting it to use ilog2().

  This uses bitops - and on the SH arch asm/bitops.h drags in
  asm-generic/get_order.h by a circuituous route involving asm/system.h.

  The main difficulty seems to be asm/system.h.  It holds a number of
  low level bits with no/few dependencies that are commonly used (eg.
  memory barriers) and a number of bits with more dependencies that
  aren't used in many places (eg.  switch_to()).

  These patches break asm/system.h up into the following core pieces:

    (1) asm/barrier.h

        Move memory barriers here.  This already done for MIPS and Alpha.

    (2) asm/switch_to.h

        Move switch_to() and related stuff here.

    (3) asm/exec.h

        Move arch_align_stack() here.  Other process execution related bits
        could perhaps go here from asm/processor.h.

    (4) asm/cmpxchg.h

        Move xchg() and cmpxchg() here as they're full word atomic ops and
        frequently used by atomic_xchg() and atomic_cmpxchg().

    (5) asm/bug.h

        Move die() and related bits.

    (6) asm/auxvec.h

        Move AT_VECTOR_SIZE_ARCH here.

  Other arch headers are created as needed on a per-arch basis."

Fixed up some conflicts from other header file cleanups and moving code
around that has happened in the meantime, so David's testing is somewhat
weakened by that.  We'll find out anything that got broken and fix it..

* tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits)
  Delete all instances of asm/system.h
  Remove all #inclusions of asm/system.h
  Add #includes needed to permit the removal of asm/system.h
  Move all declarations of free_initmem() to linux/mm.h
  Disintegrate asm/system.h for OpenRISC
  Split arch_align_stack() out from asm-generic/system.h
  Split the switch_to() wrapper out of asm-generic/system.h
  Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h
  Create asm-generic/barrier.h
  Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h
  Disintegrate asm/system.h for Xtensa
  Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt]
  Disintegrate asm/system.h for Tile
  Disintegrate asm/system.h for Sparc
  Disintegrate asm/system.h for SH
  Disintegrate asm/system.h for Score
  Disintegrate asm/system.h for S390
  Disintegrate asm/system.h for PowerPC
  Disintegrate asm/system.h for PA-RISC
  Disintegrate asm/system.h for MN10300
  ...
2012-03-28 15:58:21 -07:00
Linus Torvalds
2e7580b0e7 Merge branch 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Avi Kivity:
 "Changes include timekeeping improvements, support for assigning host
  PCI devices that share interrupt lines, s390 user-controlled guests, a
  large ppc update, and random fixes."

This is with the sign-off's fixed, hopefully next merge window we won't
have rebased commits.

* 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits)
  KVM: Convert intx_mask_lock to spin lock
  KVM: x86: fix kvm_write_tsc() TSC matching thinko
  x86: kvmclock: abstract save/restore sched_clock_state
  KVM: nVMX: Fix erroneous exception bitmap check
  KVM: Ignore the writes to MSR_K7_HWCR(3)
  KVM: MMU: make use of ->root_level in reset_rsvds_bits_mask
  KVM: PMU: add proper support for fixed counter 2
  KVM: PMU: Fix raw event check
  KVM: PMU: warn when pin control is set in eventsel msr
  KVM: VMX: Fix delayed load of shared MSRs
  KVM: use correct tlbs dirty type in cmpxchg
  KVM: Allow host IRQ sharing for assigned PCI 2.3 devices
  KVM: Ensure all vcpus are consistent with in-kernel irqchip settings
  KVM: x86 emulator: Allow PM/VM86 switch during task switch
  KVM: SVM: Fix CPL updates
  KVM: x86 emulator: VM86 segments must have DPL 3
  KVM: x86 emulator: Fix task switch privilege checks
  arch/powerpc/kvm/book3s_hv.c: included linux/sched.h twice
  KVM: x86 emulator: correctly mask pmc index bits in RDPMC instruction emulation
  KVM: mmu_notifier: Flush TLBs before releasing mmu_lock
  ...
2012-03-28 14:35:31 -07:00
Robert Richter
8abc3122aa x86/apic/amd: Be more verbose about LVT offset assignments
Add information about LVT offset assignments to better debug firmware
bugs related to this. See following examples.

 # dmesg | grep -i 'offset\|ibs'
 LVT offset 0 assigned for vector 0xf9
 [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu
 [Firmware Bug]: cpu 0, IBS interrupt offset 0 not available (MSRC001103A=0x0000000000000100)
 Failed to setup IBS, -22

In this case the BIOS assigns both offsets for MCE (0xf9) and IBS
(0x400) vectors to offset 0, which is why the second APIC setup (IBS)
failed.

With correct setup you get:

 # dmesg | grep -i 'offset\|ibs'
 LVT offset 0 assigned for vector 0xf9
 LVT offset 1 assigned for vector 0x400
 IBS: LVT offset 1 assigned
 perf: AMD IBS detected (0x00000007)
 oprofile: AMD IBS detected (0x00000007)

Note: The vector includes also the message type to handle also NMIs
(0x400). In the firmware bug message the format is the same as of the
APIC500 register and includes the mask bit (bit 16) in addition.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-28 20:02:39 +02:00
David Howells
f05e798ad4 Disintegrate asm/system.h for X86
Disintegrate asm/system.h for X86.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
cc: x86@kernel.org
2012-03-28 18:11:12 +01:00
Dan Carpenter
8f0750f197 x86, tls: Off by one limit check
These are used as offsets into an array of GDT_ENTRY_TLS_ENTRIES members
so GDT_ENTRY_TLS_ENTRIES is one past the end of the array.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: http://lkml.kernel.org/r/20120324075250.GA28258@elgon.mountain
Cc: <stable@vger.kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-03-28 09:39:35 -07:00
Andrzej Pietrasiewicz
baa676fcf8 X86 & IA64: adapt for dma_map_ops changes
Adapt core x86 and IA64 architecture code for dma_map_ops changes: replace
alloc/free_coherent with generic alloc/free methods.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
[removed swiotlb related changes and replaced it with wrappers,
 merged with IA64 patch to avoid inter-patch dependences in intel-iommu code]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Tony Luck <tony.luck@intel.com>
2012-03-28 16:36:31 +02:00
Jeremy Fitzhardinge
136d249ef7 x86/ioapic: Add io_apic_ops driver layer to allow interception
Xen dom0 needs to paravirtualize IO operations to the IO APIC,
so add a io_apic_ops for it to intercept.  Do this as ops
structure because there's at least some chance that another
paravirtualized environment may want to intercept these.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: jwboyer@redhat.com
Cc: yinghai@kernel.org
Link: http://lkml.kernel.org/r/1332385090-18056-2-git-send-email-konrad.wilk@oracle.com
[ Made all the affected code easier on the eyes ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-28 09:49:29 +02:00
Peter Zijlstra
bc758133ed sched/x86/smp: Do not enable IRQs over calibrate_delay()
We should not ever enable IRQs until we're fully set up. This opens up
a window where interrupts can hit the cpu and interrupts can do
wakeups, wakeups need state that isn't set-up yet, in particular this
cpu isn't elegible to run tasks, so if any cpu-affine task that got
created in CPU_UP_PREPARE manages to get a wakeup, its affinity mask
will get broken and we'll run into lots of 'interesting' problems.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-yaezmlbriluh166tfkgni22m@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-27 14:50:10 +02:00
Ingo Molnar
7fd52392c5 Merge branch 'linus' into perf/urgent
Merge reason: we need to fix a non-trivial merge conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-26 17:19:03 +02:00
Richard Weinberger
90e240142b x86: Merge the x86_32 and x86_64 cpu_idle() functions
Both functions are mostly identical.
The differences are:

- x86_32's cpu_idle() makes use of check_pgt_cache(), which is a
  nop on both x86_32 and x86_64.

- x86_64's cpu_idle() uses enter/__exit_idle/(), on x86_32 these
  function are a nop.

- In contrast to x86_32, x86_64 calls rcu_idle_enter/exit() in
  the innermost loop because idle notifications need RCU.
  Calling these function on x86_32 also in the innermost loop
  does not hurt.

So we can merge both functions.

Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: paulmck@linux.vnet.ibm.com
Cc: josh@joshtriplett.org
Cc: tj@kernel.org
Link: http://lkml.kernel.org/r/1332709204-22496-1-git-send-email-richard@nod.at
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-26 03:16:07 +02:00
Linus Torvalds
ed2d265d12 The following text was taken from the original review request:
"[RFC - PATCH 0/7] consolidation of BUG support code."
 		https://lkml.org/lkml/2012/1/26/525
 --
 
 The changes shown here are to unify linux's BUG support under
 the one <linux/bug.h> file.  Due to historical reasons, we have
 some BUG code in bug.h and some in kernel.h -- i.e. the support for
 BUILD_BUG in linux/kernel.h predates the addition of linux/bug.h,
 but old code in kernel.h wasn't moved to bug.h at that time.  As
 a band-aid, kernel.h was including <asm/bug.h> to pseudo link them.
 
 This has caused confusion[1] and general yuck/WTF[2] reactions.
 Here is an example that violates the principle of least surprise:
 
       CC      lib/string.o
       lib/string.c: In function 'strlcat':
       lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON'
       make[2]: *** [lib/string.o] Error 1
       $
       $ grep linux/bug.h lib/string.c
       #include <linux/bug.h>
       $
 
 We've included <linux/bug.h> for the BUG infrastructure and yet we
 still get a compile fail!  [We've not kernel.h for BUILD_BUG_ON.]
 Ugh - very confusing for someone who is new to kernel development.
 
 With the above in mind, the goals of this changeset are:
 
 1) find and fix any include/*.h files that were relying on the
    implicit presence of BUG code.
 2) find and fix any C files that were consuming kernel.h and
    hence relying on implicitly getting some/all BUG code.
 3) Move the BUG related code living in kernel.h to <linux/bug.h>
 4) remove the asm/bug.h from kernel.h to finally break the chain.
 
 During development, the order was more like 3-4, build-test, 1-2.
 But to ensure that git history for bisect doesn't get needless
 build failures introduced, the commits have been reorderd to fix
 the problem areas in advance.
 
 [1]  https://lkml.org/lkml/2012/1/3/90
 [2]  https://lkml.org/lkml/2012/1/17/414
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPbNwpAAoJEOvOhAQsB9HWrqYP/A0t9VB0nK6e42F0OR2P14MZ
 GJFtf1B++wwioIrx+KSWSRfSur1C5FKhDbxLR3I/pvkAYl4+T4JvRdMG6xJwxyip
 CC1kVQQNDjWVVqzjz2x6rYkOffx6dUlw/ERyIyk+OzP+1HzRIsIrugMqbzGLlX0X
 y0v2Tbd0G6xg1DV8lcRdp95eIzcGuUvdb2iY2LGadWZczEOeSXx64Jz3QCFxg3aL
 LFU4oovsg8Nb7MRJmqDvHK/oQf5vaTm9WSrS0pvVte0msSQRn8LStYdWC0G9BPCS
 GwL86h/eLXlUXQlC5GpgWg1QQt5i2QpjBFcVBIG0IT5SgEPMx+gXyiqZva2KwbHu
 LKicjKtfnzPitQnyEV/N6JyV1fb1U6/MsB7ebU5nCCzt9Gr7MYbjZ44peNeprAtu
 HMvJ/BNnRr4Ha6nPQNu952AdASPKkxmeXFUwBL1zUbLkOX/bK/vy1ujlcdkFxCD7
 fP3t7hghYa737IHk0ehUOhrE4H67hvxTSCKioLUAy/YeN1IcfH/iOQiCBQVLWmoS
 AqYV6ou9cqgdYoyila2UeAqegb+8xyubPIHt+lebcaKxs5aGsTg+r3vq5juMDAPs
 iwSVYUDcIw9dHer1lJfo7QCy3QUTRDTxh+LB9VlHXQICgeCK02sLBOi9hbEr4/H8
 Ko9g8J3BMxcMkXLHT9ud
 =PYQT
 -----END PGP SIGNATURE-----

Merge tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux

Pull <linux/bug.h> cleanup from Paul Gortmaker:
 "The changes shown here are to unify linux's BUG support under the one
  <linux/bug.h> file.  Due to historical reasons, we have some BUG code
  in bug.h and some in kernel.h -- i.e.  the support for BUILD_BUG in
  linux/kernel.h predates the addition of linux/bug.h, but old code in
  kernel.h wasn't moved to bug.h at that time.  As a band-aid, kernel.h
  was including <asm/bug.h> to pseudo link them.

  This has caused confusion[1] and general yuck/WTF[2] reactions.  Here
  is an example that violates the principle of least surprise:

      CC      lib/string.o
      lib/string.c: In function 'strlcat':
      lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON'
      make[2]: *** [lib/string.o] Error 1
      $
      $ grep linux/bug.h lib/string.c
      #include <linux/bug.h>
      $

  We've included <linux/bug.h> for the BUG infrastructure and yet we
  still get a compile fail! [We've not kernel.h for BUILD_BUG_ON.] Ugh -
  very confusing for someone who is new to kernel development.

  With the above in mind, the goals of this changeset are:

  1) find and fix any include/*.h files that were relying on the
     implicit presence of BUG code.
  2) find and fix any C files that were consuming kernel.h and hence
     relying on implicitly getting some/all BUG code.
  3) Move the BUG related code living in kernel.h to <linux/bug.h>
  4) remove the asm/bug.h from kernel.h to finally break the chain.

  During development, the order was more like 3-4, build-test, 1-2.  But
  to ensure that git history for bisect doesn't get needless build
  failures introduced, the commits have been reorderd to fix the problem
  areas in advance.

	[1]  https://lkml.org/lkml/2012/1/3/90
	[2]  https://lkml.org/lkml/2012/1/17/414"

Fix up conflicts (new radeon file, reiserfs header cleanups) as per Paul
and linux-next.

* tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
  kernel.h: doesn't explicitly use bug.h, so don't include it.
  bug: consolidate BUILD_BUG_ON with other bug code
  BUG: headers with BUG/BUG_ON etc. need linux/bug.h
  bug.h: add include of it to various implicit C users
  lib: fix implicit users of kernel.h for TAINT_WARN
  spinlock: macroize assert_spin_locked to avoid bug.h dependency
  x86: relocate get/set debugreg fcns to include/asm/debugreg.
2012-03-24 10:08:39 -07:00
Thomas Gleixner
68fe7b23d5 x86: vdso: Put declaration before code
Sigh, warnings are there for a reason.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: John Stultz <john.stultz@linaro.org>
2012-03-24 09:29:22 +01:00
Hugh Dickins
65c0ff4079 x86: Stop recursive fault in print_context_stack after stack overflow
After printing out the first line of a stack backtrace,
print_context_stack() calls print_ftrace_graph_addr() to check
if it's making a graph of function calls, usually not the case.

But unfortunate ordering of assignments causes this to oops if
an earlier stack overflow corrupted threadinfo->task.  Reorder
to avoid that irritation.

( The fact that there was a stack overflow may often be more
  interesting than the stack that can now be shown; but
  integrating that information with this stacktrace is awkward,
  so leave it to overflow reporting. )

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20120323225648.15DD5A033B@akpm.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-24 08:15:04 +01:00
Linus Torvalds
2fb9e96cad Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull additional x86 fixes from Peter Anvin:
 - address a long-standing bug related to when a kernel-spawned process
   gets a signal on an i386 kernel compiled without CONFIG_VM86.

 - fix the newly introduced build warning in arch/x86/boot.

 - fix a typo in the i386 system call table which affects building some
   libcs.

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86-32: Fix endless loop when processing signals for kernel tasks
  x86, boot: Correct CFLAGS for hostprogs
  x86-32: Fix typo for mq_getsetattr in syscall table
2012-03-23 17:10:09 -07:00
Linus Torvalds
8e3ade251b Merge branch 'akpm' (Andrew's patch-bomb)
Merge second batch of patches from Andrew Morton:
 - various misc things
 - core kernel changes to prctl, exit, exec, init, etc.
 - kernel/watchdog.c updates
 - get_maintainer
 - MAINTAINERS
 - the backlight driver queue
 - core bitops code cleanups
 - the led driver queue
 - some core prio_tree work
 - checkpatch udpates
 - largeish crc32 update
 - a new poll() feature for the v4l guys
 - the rtc driver queue
 - fatfs
 - ptrace
 - signals
 - kmod/usermodehelper updates
 - coredump
 - procfs updates

* emailed from Andrew Morton <akpm@linux-foundation.org>: (141 commits)
  seq_file: add seq_set_overflow(), seq_overflow()
  proc-ns: use d_set_d_op() API to set dentry ops in proc_ns_instantiate().
  procfs: speed up /proc/pid/stat, statm
  procfs: add num_to_str() to speed up /proc/stat
  proc: speed up /proc/stat handling
  fs/proc/kcore.c: make get_sparsemem_vmemmap_info() static
  coredump: add VM_NODUMP, MADV_NODUMP, MADV_CLEAR_NODUMP
  coredump: remove VM_ALWAYSDUMP flag
  kmod: make __request_module() killable
  kmod: introduce call_modprobe() helper
  usermodehelper: ____call_usermodehelper() doesn't need do_exit()
  usermodehelper: kill umh_wait, renumber UMH_* constants
  usermodehelper: implement UMH_KILLABLE
  usermodehelper: introduce umh_complete(sub_info)
  usermodehelper: use UMH_WAIT_PROC consistently
  signal: zap_pid_ns_processes: s/SEND_SIG_NOINFO/SEND_SIG_FORCED/
  signal: oom_kill_task: use SEND_SIG_FORCED instead of force_sig()
  signal: cosmetic, s/from_ancestor_ns/force/ in prepare_signal() paths
  signal: give SEND_SIG_FORCED more power to beat SIGNAL_UNKILLABLE
  Hexagon: use set_current_blocked() and block_sigmask()
  ...
2012-03-23 16:59:10 -07:00
Akinobu Mita
0b2f4d4d76 x86: use for_each_clear_bit_from()
Use for_each_clear_bit() to iterate over all the cleared bit in a
memory region.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:34 -07:00
Akinobu Mita
307b1cd7ec bitops: rename for_each_set_bit_cont() in favor of analogous list.h function
This renames for_each_set_bit_cont() to for_each_set_bit_from() because
it is analogous to list_for_each_entry_from() in list.h rather than
list_for_each_entry_continue().

This doesn't remove for_each_set_bit_cont() for now.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:33 -07:00
Andy Lutomirski
91ec87d57f x86-64: Simplify and optimize vdso clock_gettime monotonic variants
We used to store the wall-to-monotonic offset and the realtime base.
It's faster to precompute the monotonic base.

This is about a 3% speedup on Sandy Bridge for CLOCK_MONOTONIC.
It's much more impressive for CLOCK_MONOTONIC_COARSE.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2012-03-23 16:49:33 -07:00
Linus Torvalds
475c77edf8 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
Pull PCI changes (including maintainer change) from Jesse Barnes:
 "This pull has some good cleanups from Bjorn and Yinghai, as well as
  some more code from Yinghai to better handle resource re-allocation
  when enabled.

  There's also a new initcall_debug feature from Arjan which will print
  out quirk timing information to help identify slow quirks for fixing
  or refinement (Yinghai sent in a few patches to do just that once the
  new debug code landed).

  Beyond that, I'm handing off PCI maintainership to Bjorn Helgaas.
  He's been a core PCI and Linux contributor for some time now, and has
  kindly volunteered to take over.  I just don't feel I have the time
  for PCI review and work that it deserves lately (I've taken on some
  other projects), and haven't been as responsive lately as I'd like, so
  I approached Bjorn asking if he'd like to manage things.  He's going
  to give it a try, and I'm confident he'll do at least as well as I
  have in keeping the tree managed, patches flowing, and keeping things
  stable."

Fix up some fairly trivial conflicts due to other cleanups (mips device
resource fixup cleanups clashing with list handling cleanup, ppc iseries
removal clashing with pci_probe_only cleanup etc)

* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (112 commits)
  PCI: Bjorn gets PCI hotplug too
  PCI: hand PCI maintenance over to Bjorn Helgaas
  unicore32/PCI: move <asm-generic/pci-bridge.h> include to asm/pci.h
  sparc/PCI: convert devtree and arch-probed bus addresses to resource
  powerpc/PCI: allow reallocation on PA Semi
  powerpc/PCI: convert devtree bus addresses to resource
  powerpc/PCI: compute I/O space bus-to-resource offset consistently
  arm/PCI: don't export pci_flags
  PCI: fix bridge I/O window bus-to-resource conversion
  x86/PCI: add spinlock held check to 'pcibios_fwaddrmap_lookup()'
  PCI / PCIe: Introduce command line option to disable ARI
  PCI: make acpihp use __pci_remove_bus_device instead
  PCI: export __pci_remove_bus_device
  PCI: Rename pci_remove_behind_bridge to pci_stop_and_remove_behind_bridge
  PCI: Rename pci_remove_bus_device to pci_stop_and_remove_bus_device
  PCI: print out PCI device info along with duration
  PCI: Move "pci reassigndev resource alignment" out of quirks.c
  PCI: Use class for quirk for usb host controller fixup
  PCI: Use class for quirk for ti816x class fixup
  PCI: Use class for quirk for intel e100 interrupt fixup
  ...
2012-03-23 14:02:12 -07:00
Linus Torvalds
a20ae85aba Fixes:
- Fix KDB keyboard repeat scan codes and leaked keyboard events
 - Fix kernel crash with kdb_printf() for users who compile new kdb_printf()'s
   in early code
 - Return all segment registers to gdb on x86_64
 Features:
 - KDB/KGDB hook the reboot notifier and end user can control if it stops,
   detaches or does nothing (updated docs as well)
 - Notify users who use CONFIG_DEBUG_RODATA to use hw breakpoints
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPbJ17AAoJEIciOldedpOjQeIP/AkUxQFJ7O4aLrLYHl62EHnh
 spkgkd+nBzIzcKyV73alkrVBaR2WE2822aAPQmAPBP8/X283DZJJjqgDCUNVI1Mf
 CIZ7g8AQHRnS+bAZmof5Jss4malZn4byLvG/cfpOivrsye+4A8MdrAKKM3pYWNVy
 4xABkcEknVEEamdNEhHrcPd+xehretfw7+9mmU8hfjqHb/5cXFB7JwDcf4tF7ozT
 MDyN4xKtOn1W/ftQl0t6izksCUuPyqKzIfUyAy0j6AwTgsEavXu56S52T1UoB2ZI
 JcwLn/ZpN4eGCWVodY04R3jzaMtKFb6ImY40wsb5wl3CU3Ecy5syMU6z4fg3cvjH
 /KE6xWF61j4yiE5lzjeJVtKyxwalthzrr56qU2uEwrsEVmo3SOnW9sm0cwouqa7j
 /TAMlhZuGgbZGesFwdaUKI5OLGoki+pRQ0Gaq3TsbZwpPC5Bimkq0bIvruruKJCP
 QWVkEvb5TZgxCFS3AvniePOm7Hc2wD9zXB3OfN3o91pCom0ryDBIthcLlwhVeNCo
 Jd67pnJJNVULPF/1qVicZihKHxvG3DUb4E9pUcbgJplBke+isi+3eHOvnQrYFjIG
 iCamE9qvVbsQm/OFV8MOJ5mfPs9R+nb/jNzTO8JDmBc8AL7nRDV3AFGjW68x/KWT
 ERcqNEGJ4QuVAxfejq76
 =SXu9
 -----END PGP SIGNATURE-----

Merge tag 'for_linus-3.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb

Pull KGDB/KDB updates from Jason Wessel:
 "Fixes:
   - Fix KDB keyboard repeat scan codes and leaked keyboard events
   - Fix kernel crash with kdb_printf() for users who compile new
     kdb_printf()'s in early code
   - Return all segment registers to gdb on x86_64

  Features:
   - KDB/KGDB hook the reboot notifier and end user can control if it
     stops, detaches or does nothing (updated docs as well)
   - Notify users who use CONFIG_DEBUG_RODATA to use hw breakpoints"

* tag 'for_linus-3.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
  kdb: Add message about CONFIG_DEBUG_RODATA on failure to install breakpoint
  kdb: Avoid using dbg_io_ops until it is initialized
  kgdb,debug_core: add the ability to control the reboot notifier
  KDB: Fix usability issues relating to the 'enter' key.
  kgdb,debug-core,gdbstub: Hook the reboot notifier for debugger detach
  kgdb: Respect that flush op is optional
  kgdb: x86: Return all segment registers also in 64-bit mode
2012-03-23 09:29:44 -07:00
Alexander Gordeev
4da7072ad6 x86/io_apic: Move and reenable irq only when CONFIG_GENERIC_PENDING_IRQ=y
This patch removes dead code from certain .config variations.

When CONFIG_GENERIC_PENDING_IRQ=n irq move and reenable code is
never get executed, nor do_unmask_irq variable updates its init
value. Move the code under CONFIG_GENERIC_PENDING_IRQ macro.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/20120320141935.GA24806@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-23 13:47:25 +01:00
Steffen Persvold
b7157acf42 x86/apic: Add separate apic_id_valid() functions for selected apic drivers
As suggested by Suresh Siddha and Yinghai Lu:

For x2apic pre-enabled systems, apic driver is set already early
through early_acpi_boot_init()/early_acpi_process_madt()/
acpi_parse_madt()/default_acpi_madt_oem_check() path so that
apic_id_valid() checking will be sufficient during MADT and SRAT
parsing.

For non-x2apic pre-enabled systems, all apic ids should be less
than 255.

This allows us to substitute the checks in
arch/x86/kernel/acpi/boot.c::acpi_parse_x2apic() and
arch/x86/mm/srat.c::acpi_numa_x2apic_affinity_init() with
apic->apic_id_valid().

In addition we can avoid feigning the x2apic cpu feature in the
NumaChip apic code.

The following apic drivers have separate apic_id_valid()
functions which will accept x2apic type IDs :

 x2apic_phys
 x2apic_cluster
 x2apic_uv_x
 apic_numachip

Signed-off-by: Steffen Persvold <sp@numascale.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Daniel J Blueman <daniel@numascale-asia.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Jack Steiner <steiner@sgi.com>
Link: http://lkml.kernel.org/r/1331925935-13372-1-git-send-email-sp@numascale.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-23 13:28:43 +01:00
Yinghai Lu
0b8b8078cb x86: Fix excessive MSR print out when show_msr is not specified
Dave found:

| During bootup, I now have 162 messages like this..
| [    0.227346]  MSR0000001b: 00000000fee00900
| [    0.227465]  MSR00000021: 0000000000000001
| [    0.227584]  MSR0000002a: 00000000c1c81400
|
| commit 21c3fcf3e3 looks suspect.
| It claims that it will only print these out if show_msr= is
| passed, but that doesn't seem to be the case.

Fix it by changing to the version that checks the index.

Reported-and-tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1332477103-4595-1-git-send-email-yinghai@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-23 09:55:00 +01:00
Peter Zijlstra
c7206205d0 perf: Fix mmap_page capabilities and docs
Complete the syscall-less self-profiling feature and address
all complaints, namely:

 - capabilities, so we can detect what is actually available at runtime

     Add a capabilities field to perf_event_mmap_page to indicate
     what is actually available for use.

 - on x86: RDPMC weirdness due to being 40/48 bits and not sign-extending
   properly.

 - ABI documentation as to how all this stuff works.

Also improve the documentation for the new features.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vweaver1@eecs.utk.edu>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1332433596.2487.33.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-23 09:52:16 +01:00
Ingo Molnar
c5bc437702 Cleanups and fixes for perf/core:
. Short term fix for 'diff' tool breakage related to perf.data files
   with multiple events. From Jiri Olsa
 
 . Cleanup for event id tracepoint reading routine, from Borislav Petkov
 
 . 32-bit compilation fixes from Jiri Olsa
 
 . Event parsing modifier assignment fixes from Jiri Olsa
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJPa24hAAoJENZQFvNTUqpARrEQAIXfsXG2X45twEFHJ0oOmvZI
 7F28pMFM0OBJdSq/MKyO86+j5wtbtKtjCcVZfVm1ukkeD7KnKQv+iyBw1rEuE/dx
 IJIIS+zkCEAje7s64diNax/rpvZXp+k92pKKo/JGjalsVcw63VF5n0Ej3JA+n8Qp
 anKyJd2SJFogoljTzRnFnZ+zsbn5CWVwb3NVRhH8t+jfKKOsmw8Nq5ds5ysuRf8x
 +Xpm3YAsyKzAJMd05Uo5no+Ne4A5TukySiarZl3wiQ0TrwM9mVxQVAU5iVkAUZkb
 Bp1dgBOgG1RNudcRSgbq3NuO7TGFby6DqODgXvo1TzpLyixKbWUz3JI0mEya5xCL
 eBxJ2UTQpruBg+XJ3C0ys74TkMcQUeyxhqbwPu1WhMbP7cIJK1W1hJ728qWRG94a
 uxFMFlEfc0d9kxlsXVvxUumiyhEw5gPtwdbQf7fo2xpMPQmHCC3yS7DdMO6FBfUW
 oqPC7gb3aSTmHnKuYvwbT9EcOL27YHU+Y/4sRJ/QfohArgc4h60dUQrE9fWz7dXH
 /eCXDvMnBs2+z52dNLq/mkoQMP4iFzWqboDNn5GM4jI2LmXd/hxBGL6lb2Vzm8CI
 QYi5xC4E6VStSkJG2DTtvaYTclWOjo+ZrsGz7sDXQV3MgZ1wiZIWcGiPjeFBWU25
 mHWkZT3of/MAbewmeklc
 =V7d9
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Cleanups and fixes for perf/core:

. Short term fix for 'diff' tool breakage related to perf.data files
  with multiple events. From Jiri Olsa

. Cleanup for event id tracepoint reading routine, from Borislav Petkov

. 32-bit compilation fixes from Jiri Olsa

. Event parsing modifier assignment fixes from Jiri Olsa

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-23 09:16:03 +01:00
Dmitry Adamushko
29a2e2836f x86-32: Fix endless loop when processing signals for kernel tasks
The problem occurs on !CONFIG_VM86 kernels [1] when a kernel-mode task
returns from a system call with a pending signal.

A real-life scenario is a child of 'khelper' returning from a failed
kernel_execve() in ____call_usermodehelper() [ kernel/kmod.c ].
kernel_execve() fails due to a pending SIGKILL, which is the result of
"kill -9 -1" (at least, busybox's init does it upon reboot).

The loop is as follows:

* syscall_exit_work:
 - work_pending:            // start_of_the_loop
 - work_notify_sig:
   - do_notify_resume()
     - do_signal()
       - if (!user_mode(regs)) return;
 - resume_userspace         // TIF_SIGPENDING is still set
 - work_pending             // so we call work_pending => goto
                            // start_of_the_loop

More information can be found in another LKML thread:
http://www.serverphorums.com/read.php?12,457826

[1] the problem was also seen on MIPS.

Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Link: http://lkml.kernel.org/r/1332448765.2299.68.camel@dimm
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-03-22 13:50:25 -07:00
Jan Kiszka
639077fb69 kgdb: x86: Return all segment registers also in 64-bit mode
Even if the content is always 0, gdb expects us to return also ds,
es, fs, and gs while in x86-64 mode. Do this to avoid ugly errors on
"info registers".

[jason.wessel@windriver.com: adjust NUMREGBYTES for two new regs]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-03-22 15:07:15 -05:00
Linus Torvalds
28f23d1f3b Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 "urgent" leftovers from Ingo Molnar:
 "Pending x86/urgent bits that were not high prio enough to warrant
  -rc-less v3.3-final inclusion."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, efi: Fix pointer math issue in handle_ramdisks()
  x86/ioapic: Add register level checks to detect bogus io-apic entries
  x86, mce: Fix rcu splat in drain_mce_log_buffer()
  x86, memblock: Move mem_hole_size() to .init
2012-03-22 09:44:50 -07:00
Linus Torvalds
2390481546 Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform changes from Ingo Molnar.

Removes the Moorestown platform that nobody ever used.

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/platform: Move APIC ID validity check into platform APIC code
  x86/olpc/xo15/sci: Enable lid close wakeup control
  x86/geode/net5501: Add platform driver for Soekris Engineering net5501
  x86/geode/alix2: Supplement driver to include GPIO button support
  x86/mid/powerbtn: Use MSIC read/write instead of ipc_scu
  x86/mid/thermal: Turn off thermistor
  x86/mid/thermal: Add msic_thermal alias
  x86/mid/thermal: Convert to use Intel MSIC API
  x86/mid/scu_ipc: Remove Moorestown support
  x86/mid: Kill off Moorestown
  x86/mrst: Add msic_thermal platform support
  x86/config: Select MSIC MFD driver on Intel Medfield platform
  x86/mid: Remove Intel Moorestown
  x86/mrst: Set ISA bus type for fake MP IRQs
  x86/ioapic: Use legacy_pic to set correct gsi-irq mapping
2012-03-22 09:43:22 -07:00
Linus Torvalds
754b980077 Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull MCE changes from Ingo Molnar.

* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Fix return value of mce_chrdev_read() when erst is disabled
  x86/mce: Convert static array of pointers to per-cpu variables
  x86/mce: Replace hard coded hex constants with symbolic defines
  x86/mce: Recognise machine check bank signature for data path error
  x86/mce: Handle "action required" errors
  x86/mce: Add mechanism to safely save information in MCE handler
  x86/mce: Create helper function to save addr/misc when needed
  HWPOISON: Add code to handle "action required" errors.
  HWPOISON: Clean up memory_failure() vs. __memory_failure()
2012-03-22 09:42:04 -07:00
Linus Torvalds
35cb8d9e18 Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/fpu changes from Ingo Molnar.

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  i387: Split up <asm/i387.h> into exported and internal interfaces
  i387: Uninline the generic FP helpers that we expose to kernel modules
2012-03-22 09:41:22 -07:00
Linus Torvalds
f06fc0c0de Merge branch 'x86-eficross-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/eficross (booting 32/64-bit kernel from 64/32-bit EFI) from Ingo Molnar

* 'x86-eficross-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, efi: Allow basic init with mixed 32/64-bit efi/kernel
  x86, efi: Add basic error handling
  x86, efi: Cleanup config table walking
  x86, efi: Convert printk to pr_*()
  x86, efi: Refactor efi_init() a bit
2012-03-22 09:31:31 -07:00
Linus Torvalds
4c64616bb5 Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/debug changes from Ingo Molnar.

* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Fix section warnings
  x86-64: Fix CFI data for common_interrupt()
  x86: Properly _init-annotate NMI selftest code
  x86/debug: Fix/improve the show_msr=<cpus> debug print out
2012-03-22 09:30:39 -07:00