This patch adds a field in pv_cpu_ops for a paravirtualized hook
for rdtscp, needed for x86_64.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
write_tsc() does not need to be enclosed in any paravirt closure,
as it uses wrmsr(). So we rip out the duplicate in msr.h
and the definition from paravirt.h.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch adjusts the PVOP_VCALL and PVOP_CALL macros to
work with x86_64. x86_64 has a different calling convention, and
we use auxiliary macros to account for both calling conventions
as cleanly as possible.
Comments are adjusted accordingly.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch renames paravirt_32.c to paravirt.c. The goal
is to have paravirt support in x86_64, so we do it in a common file.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Establish the user API for sending a user-defined signal to the traced task on a BTS buffer overflow.
This should complete the user API for the BTS ptrace extension.
The patches so far implement wrap-around overflow handling as is needed for debugging.
The remaining open item is another overflow handling mechanism that sends a signal to the traced task on a buffer overflow.
This will take some more time on my side.
Since, from a user perspective, this occurs behind the scenes, the patch set should already be useful. More features may/will be added on top of it (overflow signal, pageable back-up buffers, kernel tracing, core file support, profiling, ...).
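The wrap-around overflow handling mentioned above can be pictured as a ring of records in which the newest record overwrites the oldest. The following is only a userspace sketch under that assumption; struct branch_rec_sketch, struct ring_sketch, NREC and both functions are hypothetical names and are not taken from the patch set.

```c
#include <assert.h>
#include <stddef.h>

#define NREC 4	/* illustrative capacity, not the real buffer size */

/* Hypothetical branch record: source and destination address. */
struct branch_rec_sketch {
	unsigned long from;
	unsigned long to;
};

struct ring_sketch {
	struct branch_rec_sketch rec[NREC];
	size_t next;	/* slot the next record is written to */
	size_t count;	/* number of valid records, saturates at NREC */
};

/* Store a record; on overflow wrap around and overwrite the oldest one. */
static void ring_push(struct ring_sketch *r, unsigned long from,
		      unsigned long to)
{
	assert(r != NULL);
	r->rec[r->next].from = from;
	r->rec[r->next].to = to;
	r->next = (r->next + 1) % NREC;
	if (r->count < NREC)
		r->count++;
}

/* Return the i-th most recent record (0 = newest), or NULL if absent. */
static const struct branch_rec_sketch *ring_recent(const struct ring_sketch *r,
						   size_t i)
{
	assert(r != NULL);
	if (i >= r->count)
		return NULL;
	return &r->rec[(r->next + NREC - 1 - i) % NREC];
}
```

With this policy a debugger always sees the most recent NREC branches, which is why wrap-around suits debugging better than stopping on overflow.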
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pass the buffer size for (most) ptrace commands that pass user-allocated buffers and check that size before accessing the buffer. Unfortunately, PTRACE_BTS_GET already uses all 4 parameters.
Commands that access user buffers return the number of bytes or records read or written.
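As a rough illustration of the rule above (check the caller-supplied size before touching the buffer, then report how much was transferred), here is a userspace sketch. The record type, the function name and the choice of -EINVAL for a short buffer are assumptions made for the example, not details of the actual ptrace code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout, for illustration only. */
struct bts_rec_sketch {
	unsigned long from;
	unsigned long to;
};

/*
 * Copy nrec records into a caller-allocated buffer of buf_size bytes.
 * The size is checked before the buffer is accessed. Returns the number
 * of records written, or -EINVAL (an assumed error code) if the buffer
 * is missing or too small.
 */
static long read_records_sketch(void *buf, size_t buf_size,
				const struct bts_rec_sketch *src, size_t nrec)
{
	size_t bytes = nrec * sizeof(*src);

	assert(src != NULL);
	if (buf == NULL || buf_size < bytes)
		return -EINVAL;
	memcpy(buf, src, bytes);
	return (long)nrec;
}
```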
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Support BTS recording of 32-bit and 64-bit tasks from 32-bit or 64-bit tasks.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Check the rlimit of the tracing task for total and locked memory when allocating the BTS buffer.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move some deeply indented code related to re-entrance processing
from kprobe_handler() to reenter_kprobe().
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ mhiramat@redhat.com: updated it to latest x86.git ]
Factor the common X86_32 and X86_64 kprobe reenter logic out of a deeply
indented section into a helper function.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Fix a preemption bug in kprobe_handler(). It has to call preempt_enable()
before returning.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Avoid TLB flush IPIs during C3 states by voluntarily calling leave_mm()
before entering C3.
The performance impact of the TLB flush on C3 should not be significant with
respect to C3 wakeup latency. Also, CPUs tend to flush the TLB in hardware while in
C3 anyway.
On an 8 logical CPU system running make -j2, the number of tlbflush IPIs goes
down from 40 per second to ~0. The total number of interrupts during the run
of this workload was ~1200 per second, which makes it a ~3% saving in wakeups.
There was no measurable performance or power impact, however.
[ akpm@linux-foundation.org: symbol export fixes. ]
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We use a few static mapping rules in our pirq routing functions,
and, for example, regression f3ac84324f was due to the pirq
being out of range of the remapping array. Put in a few
WARN_ON_ONCE() lines so that we get notified about any such
out-of-bounds incidents.
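The kind of check described above can be sketched in userspace as follows. warn_on_once() here is a simplified stand-in for the kernel's WARN_ON_ONCE() macro, and the irqmap table, its values, the fallback return value and pirq_to_irq_sketch() are all made up for the example.

```c
#include <assert.h>
#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative remapping table; the values are invented. */
static const unsigned char irqmap[] = { 0, 9, 3, 10, 4, 5, 7, 6 };

static int warn_count;	/* number of warnings actually printed */

/* Userspace stand-in for WARN_ON_ONCE(): report only the first hit. */
static int warn_on_once(int cond, int *warned, const char *what)
{
	assert(warned != NULL);
	if (cond && !*warned) {
		*warned = 1;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Look up a pirq, refusing to index past the end of the table. */
static int pirq_to_irq_sketch(unsigned int pirq)
{
	static int warned;

	if (warn_on_once(pirq >= ARRAY_SIZE(irqmap), &warned,
			 "pirq out of range of remapping array"))
		return 0;	/* hypothetical fallback: no IRQ */
	return irqmap[pirq];
}
```

Warning once rather than on every call keeps the log readable while still flagging the first out-of-range lookup.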
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
increasing number of PCI slots in large multi-node systems. The kernel
currently attempts by default to allocate memory for all PCI expansion
ROMs so there has also been an increasing number of PCI memory
allocation failures seen on these systems. This occurs because the BIOS
either (1) provides insufficient PCI memory resource for all the
expansion ROMs or (2) provides adequate PCI memory resource for
expansion ROMs but provides the space in BIOS-assigned P2P
non-prefetch windows that the kernel does not expect.
The resulting PCI memory allocation failures may be benign when related
to memory requests for expansion ROMs themselves but in some cases they
can occur when attempting to allocate space for more critical BARs.
This can happen when a successful expansion ROM allocation request
consumes memory resource that was intended for a non-ROM BAR. We have
seen this happen during PCI hotplug of an adapter that contains a P2P
bridge, where a successful memory allocation for an expansion ROM BAR on
a device behind the bridge consumed memory that was intended for a non-ROM
BAR on the P2P bridge. In all cases the allocation failure messages can
be very confusing for users.
This patch addresses the issue by changing the kernel default behavior
so that expansion ROM memory allocations are no longer attempted by
default when the BIOS has not assigned a specific address range to the
expansion ROM BAR. This was done by changing the 'pci=rom' boot option
behavior for BIOS-unassigned expansion ROMs to actually match its
current kernel-parameters.txt description, which already implies "off" by
default. Behavior for BIOS-assigned expansion ROMs implemented in
pcibios_assign_resources() [arch/x86/pci/i386.c] is unchanged.
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jan Beulich <jbeulich@novell.com>
Acked-by: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
dmi_alloc() for CONFIG_X86_64 is defined to allocate from a static array
and it maintains an allocation index which is advanced each time an allocation
is attempted - it gets incremented even if an allocation fails, thereby
depriving any future request that may be small enough to be satisfied from
the array.
Fix this by first testing whether the allocation is going to be possible and
incrementing the alloc index only then.
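The fix described above amounts to testing before advancing. A minimal userspace sketch, assuming a 2048-byte static array; dmi_alloc_sketch() and the variable names are illustrative and not copied from the patch:

```c
#include <assert.h>
#include <stddef.h>

#define DMI_MAX_DATA 2048	/* array size assumed for this sketch */

static char dmi_alloc_data[DMI_MAX_DATA];
static size_t dmi_alloc_index;

/* Hypothetical stand-in for the static-array allocator. */
static void *dmi_alloc_sketch(size_t len)
{
	void *p;

	/* Test first: a request that does not fit must not consume space. */
	if (len > DMI_MAX_DATA - dmi_alloc_index)
		return NULL;

	p = &dmi_alloc_data[dmi_alloc_index];
	dmi_alloc_index += len;	/* advance only after success */
	assert(dmi_alloc_index <= DMI_MAX_DATA);
	return p;
}
```

After a failed oversized request the index is unchanged, so a later, smaller request can still be satisfied from the remaining space.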
Signed-off-by: Parag Warudkar <parag.warudkar@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
People with HP desktops (including me) encounter a couple of DMI errors
during boot - dmi_save_oem_strings_devices: out of memory and
dmi_string: out of memory.
On some HP desktops the DMI data include OEM strings (type 11), of
which only a few are meaningful and most others are empty. The DMI code
religiously creates copies of these 27 strings (65 bytes each in my
case) and goes OOM in dmi_string().
If DMI_MAX_DATA is bumped up a little then it goes and fails in
dmi_save_oem_strings while allocating dmi_devices of sizeof(struct
dmi_device) corresponding to these strings.
On x86_64 since we cannot use alloc_bootmem this early, the code uses a
static array of 2048 bytes (DMI_MAX_DATA) for allocating the memory DMI
needs. It does not survive the creation of empty strings and devices.
Fix this by detecting empty strings and, instead of allocating new copies,
using one statically defined dmi_empty_string.
Also do not create a new struct dmi_device for each empty string - use
one statically defined dmi_device with .name=dmi_empty_string and add
that to the dmi_devices list.
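The string half of this idea can be sketched in userspace as below. The sketch treats a string as empty when it has zero length or contains only spaces, which is an assumption about what "empty" means here; dmi_string_sketch() and is_blank() are hypothetical names, and malloc() stands in for the early-boot allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One shared object handed out for every empty string. */
static const char dmi_empty_string[] = "";

/* Assumed definition of "empty": zero length or spaces only. */
static int is_blank(const char *s)
{
	return s[strspn(s, " ")] == '\0';
}

/* Return a private copy of raw, or the shared empty string. */
static const char *dmi_string_sketch(const char *raw)
{
	size_t len;
	char *copy;

	assert(raw != NULL);
	if (is_blank(raw))
		return dmi_empty_string;	/* no allocation at all */

	len = strlen(raw) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, raw, len);
	return copy;
}
```

Every empty entry then costs nothing beyond the single static object, no matter how many of them the firmware reports.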
On x86_64 this should stop the OOM with the same current size of DMI_MAX_DATA
and on 32-bit x86 this should save a good amount of bootmem (27*65 bytes +
27*sizeof(struct dmi_device)).
Compile and boot tested on both 32-bit and 64-bit x86.
Signed-off-by: Parag Warudkar <parag.warudkar@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
What's left in processor_32.h and processor_64.h cannot be cleanly
integrated. However, it's just a couple of definitions. They are moved
to processor.h inside ifdefs, and the original files are deleted. Note that
far fewer headers are included in the final version.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch removes the __init modifier from an extern function
declaration in acpi.h.
Besides not being strictly needed, it requires the inclusion of
linux/init.h, which is usually not even included directly, increasing
header mess by a lot.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This removes duplicated code by calling the generic ptrace_request and
compat_ptrace_request functions for the things they already handle.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This makes ELF core dumps of 32-bit processes include a new
note type NT_386_TLS (0x200) giving the contents of the TLS
slots in struct user_desc format. This lets post mortem
examination figure out what the segment registers mean like
the debugger does with get_thread_area on a live process.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Remove the old ia32_binfmt.c file, which is no longer used.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This switches x86-64's 32-bit ELF support to use the shared
fs/compat_binfmt_elf.c code instead of our own ia32_binfmt.c.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This removes a bunch of dead code that is no longer needed now
that the user_regset interfaces are being used for all these jobs.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This cleans up the PTRACE_*REGS* request code so each one is just a
simple call to copy_regset_to_user or copy_regset_from_user. The
ptrace layouts already match the user_regset formats (core dump formats).
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This switches x86 to the user_regset-based code for ELF core dumps.
The core dumps come out exactly the same as before.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This defines task_user_regset_view and the tables
describing the x86 user_regset layouts for 32 and 64.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This adds accessor functions in the user_regset style for
the general registers (struct user_regs_struct).
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This adds accessor functions in the user_regset style for the TLS data.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This cleans up the TLS code to use struct desc_struct and to separate the
encoding and installation magic from the interface wrappers.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This removes all the old code that is no longer used after
the i387 unification and cleanup. The i387_64.h is renamed
to i387.h with no changes, but since it replaces the nonempty
one-line stub i387.h it looks like a big diff and not a rename.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This revamps the i387 code to be shared across 32-bit, 64-bit,
and 32-on-64. It does so by consolidating the code in one place
based on the user_regset accessor interfaces. This switches
32-bit to using the i387_64.h header and 64-bit to using the
i387.c that was previously i387_32.c, but that's what took the
least cleanup in each file. Here i387.h is stubbed to always
include i387_64.h rather than renaming the file, to keep this
diff smaller and easier to read.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This renames arch/x86/kernel/{i387_32.c => i387.c}.
This is a pure renaming, but paves the way for merging
the 32-bit and 64-bit versions of this code.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This converts the ptrace/signal accessors for i387 math_emu
state to the user_regset interface style, and calls these
from the old interfaces.
It also cleans up math_emulate's ptrace check to be a
single-step check, which is what it really wants.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This moves some code into asm-x86/i387_64.h in preparation for
unifying this code between 32 and 64. The 32-bit versions of
some things are copied in, and some existing names are changed to match
the 32-bit names and share code. For 64, save_i387 is moved into
an inline from i387_64.c; this matches restore_i387, which is
already an inline, and makes sense since there is exactly one
caller (in signal_64.c). The save_i387 function could use more
cosmetic cleanup, but it is just moved verbatim in this patch.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The i387_fxsave_struct formats really have the same layout
on 32 and 64, with only some slightly different use of a few
fields. The i387_fsave_struct and i387_soft_struct formats
are never used by 64-bit kernels, but it doesn't hurt to
have the unused types in the union and cuts down on the
amount of #ifdef hair required throughout the i387 code.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This adds hard-wired definitions for the remaining cpu_has_* macros
that correspond to flags required-features.h demands are set for
64-bit. Using these can efficiently avoid some #ifdef's when
merging 32-bit and 64-bit code together.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This adds a generic definition of compat_sys_ptrace that calls
compat_arch_ptrace, parallel to sys_ptrace/arch_ptrace. Some
machines needing this already define a function by that name.
The new generic function is defined only on machines that
put #define __ARCH_WANT_COMPAT_SYS_PTRACE into asm/ptrace.h.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This adds a compat_ptrace_request that is the analogue of ptrace_request
for the things that 32-on-64 ptrace implementations can share in common.
So far there are just a couple of requests handled generically.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This makes ptrace_request handle {PEEK,POKE}{TEXT,DATA} directly.
Every arch_ptrace that could call generic_ptrace_peekdata already
has a default case calling ptrace_request, so this keeps things
simpler for the arch code.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This defines two new inlines in linux/regset.h, for use in arch_ptrace
implementations and the like. These provide simplified wrappers for using
the user_regset interfaces to copy thread regset data into the caller's
user-space memory. The inlines are trivial, but make the common uses in
places such as ptrace implementation much more concise, easier to read, and
less prone to code-copying errors.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This adds Kconfig and Makefile bits to build fs/compat_binfmt_elf.c,
just added. Each arch that wants to use this file needs to add a
"select COMPAT_BINFMT_ELF" line in its Kconfig bits that enable COMPAT.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This adds fs/compat_binfmt_elf.c, a wrapper around fs/binfmt_elf.c for
32-bit ELF support on 64-bit kernels. It can replace all the hand-rolled
versions of this that each 32/64 arch has, which are all about the same.
To use this, an arch's asm/elf.h has to define at least a few compat_*
macros that parallel the various macros that fs/binfmt_elf.c uses for
native support.
There is no attempt to deal with compat macros for the core dump format
support. To use this file, the arch has to define compat_gregset_t for
linux/elfcore-compat.h and #define CORE_DUMP_USE_REGSET. The 32-bit
compatible formats should come automatically from task_user_regset_view
called on a 32-bit task.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This adds some inlines to linux/regset.h intended for arch code to use in
its user_regset get and set functions. These make it pretty easy to deal
with the interface's optional kernel-space or user-space pointers and its
generalized access to a part of the register data at a time.
In simple cases where the internal data structure matches the exported
layout (core dump format), a get function can be nothing but a call to
user_regset_copyout, and a set function a call to user_regset_copyin.
In other cases the exported layout is usually made up of a few pieces each
stored contiguously in a different internal data structure. These helpers
make it straightforward to write a get or set function by processing each
contiguous chunk of the data in order. The start_pos and end_pos arguments
are always constants, so these inlines collapse to a small amount of code.
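To make the chunk-by-chunk pattern concrete, here is a simplified userspace sketch that handles only the kernel-space pointer case. copyout_sketch() imitates the idea of the helper but is not its real signature, and the two structures and regs_get_sketch() are hypothetical.

```c
#include <assert.h>
#include <string.h>

/*
 * Copy the part of the chunk [start_pos, end_pos) that overlaps the
 * request [*pos, *pos + *count), then advance the request past it.
 */
static void copyout_sketch(unsigned int *pos, unsigned int *count,
			   char **kbuf, const void *data,
			   unsigned int start_pos, unsigned int end_pos)
{
	unsigned int n;

	if (*count == 0)
		return;
	assert(*pos >= start_pos);	/* chunks must be visited in order */
	if (*pos >= end_pos)
		return;			/* request starts after this chunk */

	n = end_pos - *pos;
	if (n > *count)
		n = *count;
	memcpy(*kbuf, (const char *)data + (*pos - start_pos), n);
	*kbuf += n;
	*pos += n;
	*count -= n;
}

/* Two hypothetical internal structures making up one exported layout. */
struct core_sketch { unsigned int gpr[4]; };
struct extra_sketch { unsigned int seg[2]; };

/* A get function: process each contiguous chunk of the layout in order. */
static void regs_get_sketch(const struct core_sketch *c,
			    const struct extra_sketch *e,
			    unsigned int pos, unsigned int count, void *kbuf)
{
	char *out = kbuf;

	copyout_sketch(&pos, &count, &out, c->gpr, 0, sizeof(c->gpr));
	copyout_sketch(&pos, &count, &out, e->seg, sizeof(c->gpr),
		       sizeof(c->gpr) + sizeof(e->seg));
}
```

A request that straddles the boundary, for example the last word of gpr plus the first word of seg, is split across the two calls without any special-case code in the get function.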
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This modifies the ELF core dump code under #ifdef CORE_DUMP_USE_REGSET.
It changes nothing when this macro is not defined. When it's #define'd
by some arch header (e.g. asm/elf.h), the arch must support the
user_regset (linux/regset.h) interface for reading thread state.
This provides an alternate version of note segment writing that is based
purely on the user_regset interfaces. When CORE_DUMP_USE_REGSET is set,
the arch need not define macros such as ELF_CORE_COPY_REGS and ELF_ARCH.
All that information is taken from the user_regset data structures.
The core dumps come out exactly the same if the arch's definitions for its
user_regset details are correct.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This pulls out the code for writing the notes segment of an ELF core dump
into separate functions. This cleanly isolates into one cluster of
functions everything that deals with the note formats and the hooks into
arch code to fill them. The top-level elf_core_dump function itself now
deals purely with the generic ELF format and the memory segments.
This only moves code around into functions that can be inlined away.
It should not change any behavior at all.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The new header <linux/regset.h> defines the types struct user_regset and
struct user_regset_view, with some associated declarations. This new set
of interfaces will become the standard way for arch code to expose
user-mode machine-specific state. A single set of entry points into arch
code can do all the low-level work in one place to fill the needs of core
dumps, ptrace, and any other user-mode debugging facilities that might come
along in the future.
For existing arch code to adapt to the user_regset interfaces, each arch
can work from the code it already has to support core files and ptrace.
The formats you want for user_regset are the core file formats. The only
wrinkle in adapting old ptrace implementation code as user_regset get and
set functions is that these functions can be called on current as well as
on another task_struct that is stopped and switched out as for ptrace.
For some kinds of machine state, you may have to load it directly from CPU
registers or otherwise differently for current than for another thread.
(Your core dump support already handles this in elf_core_copy_regs for
current and elf_core_copy_task_regs for other tasks, so just check there.)
The set function should also be made to work on current in case that
entails some special cases, though this was never required before for
ptrace. Adding this flexibility covers the arch needs to open the door to
more sophisticated new debugging facilities that don't always need to
context-switch to do every little thing.
The copyin/copyout helper functions (in a later patch) relieve the arch
code of most of the cumbersome details of the flexible get/set interfaces.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This adds one case to the MODULE_PROC_FAMILY block testing
for X86_64. Nothing is defined on X86_64 that was not
defined there before.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Eliminate __always_inline; all of these static functions are
only called once. Minor whitespace cleanup. Eliminate one
superfluous return at the end of a void function. Change the one
#ifndef to #ifdef to match the sense of the rest of the config
tests.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>