kernel-ark/arch/x86/kernel
Steven Rostedt 0c54dd341f ftrace: Remove memory barriers from NMI code when not needed
The code in stop_machine that modifies the kernel text has a bit
of logic to handle the case of NMIs. stop_machine does not prevent
NMIs from executing, and if an NMI were to trigger on another CPU
as the modifying CPU is changing the NMI text, a GPF could result.

To prevent the GPF, the NMI calls ftrace_nmi_enter(), which may
modify the code first. Any other NMIs will then just change the
text to the same content, which does no harm. The code that
stop_machine called must wait for NMIs to finish while it changes
each location in the kernel. That code may also change the text
to what the NMI changed it to. The key is that the text will never
change content while another CPU is executing it.

To make the above work, the call to ftrace_nmi_enter() must do
both an smp_mb() and an atomic_inc(). But for applications like
perf that require a high number of NMIs for profiling, this can have
a dramatic effect on the system. Not only does it execute a full memory
barrier in both nmi_enter() and nmi_exit(), it also
modifies a global variable with an atomic operation. This kills
performance on large SMP machines.

Since the memory barriers are only needed when ftrace is in the
process of modifying the text (which is seldom), this patch
adds a "modifying_code" variable that gets set before stop_machine
is executed and cleared afterwards.

Each NMI will check this variable and store it in a per-CPU
"save_modifying_code" variable, which it will use to decide whether it
needs to do the memory barriers and atomic dec on NMI exit.
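
The fast path described above can be sketched in user-space C11 as
follows. This is only an illustration, not the code from ftrace.c: the
sketch_* function names and the nmi_running counter are assumptions made
for the example, a seq_cst fence stands in for smp_mb(), and a
thread-local variable stands in for the per-CPU "save_modifying_code".

```c
#include <assert.h>
#include <stdatomic.h>

/* Set before the text is modified and cleared afterwards
 * (stand-in for the "modifying_code" variable in the commit message). */
static atomic_int modifying_code;

/* Hypothetical global count of NMIs currently in the slow path. */
static atomic_int nmi_running;

/* Thread-local stand-in for the per-CPU "save_modifying_code" variable. */
static _Thread_local int save_modifying_code;

/* Writer side: bracket the text modification. */
static void sketch_begin_modify(void) { atomic_store(&modifying_code, 1); }
static void sketch_end_modify(void)   { atomic_store(&modifying_code, 0); }

static void sketch_nmi_enter(void)
{
    /* Snapshot the flag so that enter and exit agree with each other. */
    save_modifying_code =
        atomic_load_explicit(&modifying_code, memory_order_relaxed);
    if (!save_modifying_code)
        return; /* fast path: no barrier, no global atomic operation */

    atomic_fetch_add(&nmi_running, 1);
    atomic_thread_fence(memory_order_seq_cst); /* stands in for smp_mb() */
}

static void sketch_nmi_exit(void)
{
    /* Use the saved value, not the live flag, so an increment done on
     * entry is always paired with a decrement on exit. */
    if (!save_modifying_code)
        return;

    atomic_thread_fence(memory_order_seq_cst); /* stands in for smp_mb() */
    int prev = atomic_fetch_sub(&nmi_running, 1);
    assert(prev > 0); /* exit without a matching slow-path enter is a bug */
}
```

Saving the flag on entry is what keeps the counter balanced: if the flag
is cleared while an NMI is in flight, the exit path still sees the value
it saw on entry and undoes its own increment.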

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-02-25 08:42:06 -05:00
..
acpi ACPI: introduce kernel parameter acpi_sleep=sci_force_enable 2009-12-30 18:32:01 -05:00
apic x86_64 SGI UV: Fix writes to led registers on remote uv hubs. 2009-12-30 12:25:26 -08:00
cpu Merge branch 'perf-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-31 11:56:24 -08:00
.gitignore
alternative.c Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-09-14 07:57:32 -07:00
amd_iommu_init.c x86/amd-iommu: Fix initialization failure panic 2009-12-21 15:51:23 +01:00
amd_iommu.c iommu-helper: use bitmap library 2009-12-16 07:20:18 -08:00
aperture_64.c x86: Gart: fix breakage due to IOMMU initialization cleanup 2009-12-14 08:57:40 +01:00
apm_32.c x86: Remove BKL from apm_32 2009-10-14 17:04:48 +02:00
asm-offsets_32.c
asm-offsets_64.c tracing: Define NR_syscalls for x86_64 2009-08-26 21:29:58 +02:00
asm-offsets.c
audit_64.c
bios_uv.c x86: uv: update XPC to handle updated BIOS interface 2009-12-16 07:20:14 -08:00
bootflag.c
check.c
cpuid.c x86, msr/cpuid: Register enough minors for the MSR and CPUID drivers 2009-12-15 15:13:07 -08:00
crash_dump_32.c x86: crash_dump: Fix non-pae kdump kernel memory accesses 2009-10-26 12:38:59 +01:00
crash_dump_64.c
crash.c x86: Use x86_platform for iommu_shutdown 2009-11-08 13:12:26 +01:00
doublefault_32.c x86: Use get_desc_base() 2009-07-19 18:27:51 +02:00
ds_selftest.c
ds_selftest.h
ds.c percpu: make percpu symbols in x86 unique 2009-10-29 22:34:14 +09:00
dumpstack_32.c perf events, x86/stacktrace: Make stack walking optional 2009-12-17 09:56:19 +01:00
dumpstack_64.c perf events, x86/stacktrace: Make stack walking optional 2009-12-17 09:56:19 +01:00
dumpstack.c perf events, x86/stacktrace: Fix performance/softlockup by providing a special frame pointer-only stack walker 2009-12-17 10:42:52 +01:00
dumpstack.h perf events, x86/stacktrace: Make stack walking optional 2009-12-17 09:56:19 +01:00
e820.c x86: Increase MAX_EARLY_RES; insufficient on 32-bit NUMA 2009-12-16 16:46:23 -08:00
early_printk.c x86: earlyprintk: Fix regression to handle serial,ttySn as 1 arg 2009-10-01 10:34:16 +02:00
early-quirks.c
efi_32.c
efi_64.c x86: Make 64-bit efi_ioremap use ioremap on MMIO regions 2009-08-03 13:34:25 -07:00
efi_stub_32.S
efi_stub_64.S
efi.c x86: Make EFI RTC function depend on 32bit again 2009-10-27 12:35:48 +01:00
entry_32.S x86, 32-bit: Use same regs as 64-bit for kernel_thread_helper 2009-12-10 15:55:36 -08:00
entry_64.S Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-16 12:02:37 -08:00
ftrace.c ftrace: Remove memory barriers from NMI code when not needed 2010-02-25 08:42:06 -05:00
head32.c x86: Use find_e820() instead of hard coded trampoline address 2009-12-11 09:28:22 +01:00
head64.c x86: Use find_e820() instead of hard coded trampoline address 2009-12-11 09:28:22 +01:00
head_32.S x86-32: Use symbolic constants, safer CPUID when enabling EFER.NX 2009-11-16 13:44:56 -08:00
head_64.S Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-08 13:27:33 -08:00
head.c
hpet.c x86: hpet: Make WARN_ON understandable 2009-11-27 20:37:41 +01:00
hw_breakpoint.c hw-breakpoints: Use overflow handler instead of the event callback 2009-12-06 08:27:18 +01:00
i386_ksyms_32.c x86: Don't generate cmpxchg8b_emu if CONFIG_X86_CMPXCHG64=y 2009-10-01 08:42:24 +02:00
i387.c
i8237.c
i8253.c x86: Do not unregister PIT clocksource on PIT oneshot setup/shutdown 2009-08-21 21:13:37 +02:00
i8259.c
init_task.c Use new __init_task_data macro in arch init_task.c files. 2009-09-21 06:27:08 +02:00
io_delay.c
ioport.c x86-64, paravirt: Call set_iopl_mask() on 64 bits 2009-12-09 16:54:08 -08:00
irq_32.c x86: Unify fixup_irqs() for 32-bit and 64-bit kernels 2009-11-02 15:56:34 +01:00
irq_64.c x86: Unify fixup_irqs() for 32-bit and 64-bit kernels 2009-11-02 15:56:34 +01:00
irq.c genirq: Convert irq_desc.lock to raw_spinlock 2009-12-14 23:55:33 +01:00
irqinit.c x86: UV RTC: Rename generic_interrupt to x86_platform_ipi 2009-10-14 18:27:11 +02:00
k8.c
kdebugfs.c
kgdb.c kgdb,x86: do not set kgdb_single_step on x86 2009-12-11 08:43:18 -06:00
kprobes.c Merge branch 'for-next' into for-linus 2009-12-07 18:36:35 +01:00
kvm.c KVM guest: do not batch pte updates from interrupt context 2009-09-10 18:10:50 +03:00
kvmclock.c Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-09-18 14:05:47 -07:00
ldt.c cpumask: use mm_cpumask() wrapper: x86 2009-09-24 09:34:52 +09:30
machine_kexec_32.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-08 13:27:33 -08:00
machine_kexec_64.c
Makefile cs5535: drop the Geode-specific MFGPT/GPIO code 2009-12-15 08:53:28 -08:00
mca_32.c
microcode_amd.c arch/x86/kernel/microcode*: Use pr_fmt() and remove duplicated KERN_ERR prefix 2009-12-09 08:25:57 +01:00
microcode_core.c Revert "x86, ucode-amd: Ensure ucode update on suspend/resume after CPU off/online cycle" 2009-12-23 15:04:53 -08:00
microcode_intel.c arch/x86/kernel/microcode*: Use pr_fmt() and remove duplicated KERN_ERR prefix 2009-12-09 08:25:57 +01:00
mmconf-fam10h_64.c
module.c
mpparse.c x86: Use find_e820() instead of hard coded trampoline address 2009-12-11 09:28:22 +01:00
mrst.c x86: Add Moorestown early detection 2009-08-31 11:09:40 +02:00
msr.c x86, msr/cpuid: Register enough minors for the MSR and CPUID drivers 2009-12-15 15:13:07 -08:00
olpc.c cs5535: move VSA2 checks into linux/cs5535.h 2009-12-15 08:53:28 -08:00
paravirt_patch_32.c
paravirt_patch_64.c
paravirt-spinlocks.c locking: Convert __raw_spin* functions to arch_spin* 2009-12-14 23:55:32 +01:00
paravirt.c Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-09-18 14:05:47 -07:00
pci-calgary_64.c iommu-helper: use bitmap library 2009-12-16 07:20:18 -08:00
pci-dma.c x86: Split swiotlb initialization into two stages 2009-12-15 13:01:57 +01:00
pci-gart_64.c iommu-helper: use bitmap library 2009-12-16 07:20:18 -08:00
pci-nommu.c x86: Kill bad_dma_address variable 2009-11-17 07:53:21 +01:00
pci-swiotlb.c x86: Split swiotlb initialization into two stages 2009-12-15 13:01:57 +01:00
pcspeaker.c
pmtimer_64.c
probe_roms_32.c
process_32.c x86: Use KERN_DEFAULT log-level in __show_regs() 2009-12-28 09:40:21 +01:00
process_64.c x86: Use KERN_DEFAULT log-level in __show_regs() 2009-12-28 09:40:21 +01:00
process.c x86: Use KERN_DEFAULT log-level in __show_regs() 2009-12-28 09:40:21 +01:00
ptrace.c x86/ptrace: make genregs[32]_get/set more robust 2009-12-17 07:04:56 -08:00
pvclock.c x86: Fix warning in pvclock.c 2009-07-14 16:25:05 +02:00
quirks.c x86: AMD Northbridge: Verify NB's node is online 2009-11-16 15:43:05 +01:00
reboot_fixups_32.c cs5535: move the DIVIL MSR definition into linux/cs5535.h 2009-12-15 08:53:28 -08:00
reboot.c Merge branch 'linus' into x86/urgent 2009-12-07 13:14:18 +01:00
relocate_kernel_32.S
relocate_kernel_64.S
rtc.c Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-09-18 14:05:47 -07:00
scx200_32.c
setup_percpu.c x86: setup_percpu.c: Use pr_<level> and add pr_fmt(fmt) 2009-12-10 08:57:50 +01:00
setup.c x86: Use find_e820() instead of hard coded trampoline address 2009-12-11 09:28:22 +01:00
sfi.c SFI: remove unneeded includes 2009-09-15 15:08:40 -04:00
signal.c x86: Merge sys_sigaltstack 2009-12-09 16:28:59 -08:00
smp.c Revert "x86, timers: Check for pending timers after (device) interrupts" 2009-10-09 15:58:20 +02:00
smpboot.c x86: Limit the number of processor bootup messages 2009-12-11 15:16:00 -08:00
stacktrace.c perf events, x86/stacktrace: Make stack walking optional 2009-12-17 09:56:19 +01:00
step.c x86: Use get_desc_base() 2009-07-19 18:27:51 +02:00
sys_i386_32.c Unify sys_mmap* 2009-12-11 06:44:29 -05:00
sys_x86_64.c Unify sys_mmap* 2009-12-11 06:44:29 -05:00
syscall_64.c
syscall_table_32.S Unify sys_mmap* 2009-12-11 06:44:29 -05:00
tboot.c x86, intel_txt: clean up the impact on generic code, unbreak non-x86 2009-09-01 18:25:07 -07:00
tce_64.c
test_nx.c
test_rodata.c
time.c x86: fix kernel panic on 32 bits when profiling 2009-10-12 11:53:51 -07:00
tlb_uv.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-08 13:27:33 -08:00
tls.c
tls.h
topology.c
trampoline_32.S x86: cpuinit-annotate SMP boot trampolines properly 2009-09-20 20:23:37 +02:00
trampoline_64.S x86: Fix Suspend to RAM freeze on Acer Aspire 1511Lmi laptop 2009-10-12 18:06:48 +02:00
trampoline.c x86: Use find_e820() instead of hard coded trampoline address 2009-12-11 09:28:22 +01:00
traps.c Merge commit 'perf/core' into perf/hw-breakpoint 2009-10-18 01:12:33 +02:00
tsc_sync.c locking: Convert __raw_spin* functions to arch_spin* 2009-12-14 23:55:32 +01:00
tsc.c x86: Reenable TSC sync check at boot, even with NONSTOP_TSC 2009-12-17 14:44:35 -08:00
uv_irq.c x86, irq: Allow 0xff for /proc/irq/[n]/smp_affinity on an 8-cpu system 2009-12-17 22:03:06 -08:00
uv_sysfs.c
uv_time.c x86: UV RTC: Always enable RTC clocksource 2009-11-23 19:41:30 +01:00
verify_cpu_64.S
visws_quirks.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-08 13:27:33 -08:00
vm86_32.c x86, 32-bit: Convert sys_vm86 & sys_vm86old 2009-12-09 16:29:23 -08:00
vmi_32.c x86, vmi: Mark VMI deprecated and schedule it for removal 2009-10-08 22:27:55 +02:00
vmiclock_32.c x86: vmiclock: Fix printk format 2009-11-18 12:31:06 +01:00
vmlinux.lds.S x86: Regex support and known-movable symbols for relocs, fix _end 2009-12-14 13:55:20 -08:00
vsmp_64.c
vsyscall_64.c Merge branch 'timers-for-linus-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-08 19:28:09 -08:00
x86_init.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-08 13:27:33 -08:00
x8664_ksyms_64.c Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-12-16 12:02:37 -08:00
xsave.c