In a micro-benchmark that stresses the pagefault path, the down_read_trylock
on the mmap_sem showed up quite high on the profile. Turns out this lock is
bouncing between cpus quite a bit and thus is cache-cold a lot. This patch
prefetches the lock (for write) as early as possible (and before some other
somewhat expensive operations). With this patch, the down_read_trylock
basically fell out of the top of profile.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Not for the ioctls so far because I was too lazy.
Cc: bcollins@debian.org
Cc: dan@dennedy.org
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
phys_proc_id[] on AMD boxes is right now populated with the initial
apic id, obtained by the cpuid instruction. But, the initial apic id
need not be the local apic id on clustered APIC systems (see comment at
x86_64/kernel/genapic_cluster.c, line 110). On vSMPowered with AMD
CPUs the cpu_to_node will turn out to be incorrect (as apicid_to_node[] is
indexed by the initial apic id rather than the local apic id).
On vSMPowered boxes with Intel CPUs this is working correctly as
phys_proc_id[] is initialized correctly in detect_ht().
This fixes AMD boot path according to specification, to use the correct
routines for local apic id and socket ids. We use
hard_smp_processor_id() to read the local apic id, and phys_pkg_id() to
determine socket id for phys_proc_id[]
Patch tested on Tyan multicore boxes as well as vSMPowered boxes.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- adjust limits of GDT/IDT pseudo-descriptors (some were off by one)
- move empty_zero_page into .bss.page_aligned
- move cpu_gdt_table into .data.page_aligned
- move idt_table into .bss
- align inital_code and init_rsp
- eliminate pointless (re-)declaration of idt_table in traps.c
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It needs num_physpages, so initialize it early. It's later overwritten
again.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Attached is a small code style cleanup patch that resulted from my
skimming through the arch/x86_64/kernel/traps.c code to figure out what
went haywire.
Signed-off-by: Roberto Nibali <ratz@drugphish.ch>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[description by AK]
Made a cut'n'paste error when adding the entry for the ALI M1695
AGP bridge and added a second entry for the 1689
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
gcc should handle this anyways, and it causes problems when
sprintf is turned into strcpy by gcc behind our backs and
the C fallback version of strcpy is actually defining __builtin_strcpy
Then drop -ffreestanding from the main Makefile because it isn't
needed anymore and implies -fno-builtin, which is wrong now.
(it was only added for x86-64, so dropping it should be safe)
Noticed by Roman Zippel
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We've always had the problem that arguments only did a prefix match,
which resulted e.g. in noapic and noapictimer getting confused.
Fix the early argument parsing code to always check that arguments are
whole words (except for those that take additional arguments of course)
I factored out the checking code for that while also makes the code
easier to maintain.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
While the modular aspect of the respective i386 patch doesn't apply to
x86-64 (as the top level page directory entry is shared between modules
and the base kernel), handlers registered with register_die_notifier()
are still under similar constraints for touching ioremap()ed or
vmalloc()ed memory. The likelihood of this problem becoming visible is
of course significantly lower, as the assigned virtual addresses would
have to cross a 2**39 byte boundary. This is because the callback gets
invoked
(a) in the page fault path before the top level page table propagation
gets carried out (hence a fault to propagate the top level page table
entry/entries mapping to module's code/data would nest infinitly) and
(b) in the NMI path, where nested faults must absolutely not happen,
since otherwise the IRET from the nested fault re-enables NMIs,
potentially resulting in nested NMI occurences.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The code waits for the GART to clear the TLB flush bit. Use cpu_relax
in this time to allow hypervisors to yield the CPU in this time.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The PM timer path through main_timer_handler doesn't need
the delay variable because it figures it out in a different way.
Don't try to read it from the PIT. With stopped PIT timer
it is even useless.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Minor cleanup to lend better for physical CPU hotplug.
Earlier way of using num_processors as index doesnt
fit if CPUs come and go. This makes the code little bit better
to read, and helps physical hotplug use the same functions as boot.
Reserving CPU0 for BSP is too late to be done in smp_prepare_boot_cpu().
Since logical assignments from MADT is already done via
setup_arch()->acpi_boot_init()->parse lapic
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Touching of the floating point state in a kernel debugger must be
NMI-safe, specifically math_state_restore() must be able to deal with
being called out of an NMI context. In order to do that reliably, the
context switch code must take care to not leave a window open where
the current task's TS_USEDFPU flag and CR0.TS could get out of sync.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For consistency and to have only a single place of definition, replace
set_debug() uses with set_debugreg(), and eliminate the definition of
thj former.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
While AMD formally permits multi-byte execution breakpoints, Intel
disallows 8-byte as much as 2- or 4-byte ones.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix one place where the previous change of cpu_pda from being an array
to being a macro was not properly carried out.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It conflicts with the struct node in node.h
Actually the x86-64 version was there first, but ..
Suggested by Jan Beulich
Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
8MB is not really very random, use 1GB (or more with larger page sizes)
instead.
Also use the low bits of the random generator output now instead of
throwing them away.
Only enabled on x86-64 right now. Other architectures need to add
a suitable STACK_RND_MASK
Cc: mingo@elte.hu
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The upcomming gcc 4.2 got a new option -mtune=generic to tune
code for both common AMD and Intel CPUs. Use this option
when available for generic kernels.
On x86-64 it is used with CONFIG_GENERIC_CPU. On i386 it is
enabled with CONFIG_X86_GENERIC. It won't affect the base
line CPU support in any ways and also not the minimum supported CPU.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Memory >39bits has a different PUD.
Cc: "Tolentino, Matthew E" <matthew.e.tolentino@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IPoIB: P_Key change event handling
IB/mthca: Fix modify QP error path
IPoIB: Fix network interface "RUNNING" status
IB/mthca: Fix indentation
IB/mthca: Fix uninitialized variable in mthca_alloc_qp()
IB/mthca: Check SRQ limit in modify SRQ operation
IB/mthca: Check that SRQ WQE size does not exceed device's max value
IB/mthca: Check that sgid_index and path_mtu are valid in modify_qp
IB/srp: Use a fake scatterlist for non-SG SCSI commands
IPoIB: Pass correct pointer when flushing child interfaces
* 'upstream-linus' of git://oss.oracle.com/home/sourcebo/git/ocfs2:
ocfs2: finally remove MLF* macros
ocfs2: don't use MLF* in the file system
ocfs2: don't use MLF* in dlm/ files
ocfs2: don't use MLF* in cluster/ files
[PATCH] ocfs2: dlm recovery fixes
[PATCH] ocfs2: fix hang in dlm lock resource mastery
ocfs2: use __attribute__ format
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] New IA64 core/thread detection patch
[IA64] Increase max node count on SN platforms
[IA64] Increase max node count on SN platforms
[IA64] Increase max node count on SN platforms
[IA64] Increase max node count on SN platforms
[IA64] Tollhouse HP: IA64 arch changes
[IA64] cleanup dig_irq_init
[IA64] MCA recovery: kernel context recovery table
IA64: Use early_parm to handle mvec_name and nomca
[IA64] move patchlist and machvec into init section
[IA64] add init declaration - nolwsys
[IA64] add init declaration - gate page functions
[IA64] add init declaration to memory initialization functions
[IA64] add init declaration to cpu initialization functions
[IA64] add __init declaration to mca functions
[IA64] Ignore disabled Local SAPIC Affinity Structure in SRAT
[IA64] sn_check_intr: use ia64_get_irr()
[IA64] fix ia64 is_hugepage_only_range
* master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild: (46 commits)
kbuild: remove obsoleted scripts/reference_* files
kbuild: fix make help & make *pkg
kconfig: fix time ordering of writes to .kconfig.d and include/linux/autoconf.h
Kconfig: remove the CONFIG_CC_ALIGN_* options
kbuild: add -fverbose-asm to i386 Makefile
kbuild: clean-up genksyms
kbuild: Lindent genksyms.c
kbuild: fix genksyms build error
kbuild: in makefile.txt note that Makefile is preferred name for kbuild files
kbuild: replace PHONY with FORCE
kbuild: Fix bug in crc symbol generating of kernel and modules
kbuild: change kbuild to not rely on incorrect GNU make behavior
kbuild: when warning symbols exported twice now tell user this is the problem
kbuild: fix make dir/file.xx when asm symlink is missing
kbuild: in the section mismatch check try harder to find symbols
kbuild: fix section mismatch check for unwind on IA64
kbuild: kill false positives from section mismatch warnings for powerpc
kbuild: kill trailing whitespace in modpost & friends
kbuild: small update of allnoconfig description
kbuild: make namespace.pl CROSS_COMPILE happy
...
Trivial conflict in arch/ppc/boot/Makefile manually fixed up
Hugh is rightly concerned that the CONFIG_DEBUG_VM coverage has gone too
far in vm_normal_page, considering that we expect production kernels to be
shipped with the option turned off, and that the code has been under some
large changes recently.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix bug identified by Linus Torvalds <torvalds@osdl.org>: the `out'
instruction depends upon the state of memory_data[], so we need to tell gcc
that before executing it. (The opcode, not gcc).
Fixes http://bugzilla.kernel.org/show_bug.cgi?id=5553
Thanks to Antonio Ospite <ospite@studenti.unina.it> for testing.
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (21 commits)
BUG_ON() Conversion in drivers/video/
BUG_ON() Conversion in drivers/parisc/
BUG_ON() Conversion in drivers/block/
BUG_ON() Conversion in sound/sparc/cs4231.c
BUG_ON() Conversion in drivers/s390/block/dasd.c
BUG_ON() Conversion in lib/swiotlb.c
BUG_ON() Conversion in kernel/cpu.c
BUG_ON() Conversion in ipc/msg.c
BUG_ON() Conversion in block/elevator.c
BUG_ON() Conversion in fs/coda/
BUG_ON() Conversion in fs/binfmt_elf_fdpic.c
BUG_ON() Conversion in input/serio/hil_mlc.c
BUG_ON() Conversion in md/dm-hw-handler.c
BUG_ON() Conversion in md/bitmap.c
The comment describing how MS_ASYNC works in msync.c is confusing
rcu: undeclared variable used in documentation
fix typos "wich" -> "which"
typo patch for fs/ufs/super.c
Fix simple typos
tabify drivers/char/Makefile
...
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NETFILTER] x_table.c: sem2mutex
[IPV4]: Aggregate route entries with different TOS values
[TCP]: Mark tcp_*mem[] __read_mostly.
[TCP]: Set default max buffers from memory pool size
[SCTP]: Fix up sctp_rcv return value
[NET]: Take RTNL when unregistering notifier
[WIRELESS]: Fix config dependencies.
[NET]: Fill in a 32-bit hole in struct sock on 64-bit platforms.
[NET]: Ensure device name passed to SO_BINDTODEVICE is NULL terminated.
[MODULES]: Don't allow statically declared exports
[BRIDGE]: Unaligned accesses in the ethernet bridge
drivers/scsi/sd.c: In function `sd_store_cache_type':
drivers/scsi/sd.c:193: warning: comparison of distinct pointer types lacks a cast
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This removes the support for pps. It's completely unused within the kernel
and is basically in the way for further cleanups. It should be easier to
readd proper support for it after the rest has been converted to NTP4
(where the pps mechanisms are quite different from NTP3 anyway).
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Merge mpsc.h into mpsc.c because its the only file that #include's mpsc.h.
Signed-off-by: Mark A. Greer <mgreer@mvista.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The definition of SUPPORT_SYSRQ must come before #include of serial_core.h.
This patch moves the definition of SUPPORT_SYSRQ to be just after the #include
of config.h to make it consistent with 8250.c.
Reported-by: Stephane Chazelas <Stephane@artesyncp.com>
Signed-off-by: Mark A. Greer <mgreer@mvista.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The devname passed to request_irq() contained a '/' which is wrong. At
a minimum, the '/' prevented the devname from showing up in
/proc/irq/<irq>/<devname>. This patch replaces the '/' with a '-' to
fixes that problem.
Reported-by: Stephane Chazelas <Stephane@artesyncp.com>
Signed-off-by: Mark A. Greer <mgreer@mvista.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
'int dev' came out of an 'unsigned char *' - as such, it will not get
a negative value. Thanks Valdis.
Signed-off-by: Eugene Teo <eugene.teo@eugeneteo.net>
Cc: Jaroslav Kysela <perex@suse.cz>
Acked-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In binfmt_flat.c, the flat binary loader should check file descriptor table
and install the fd on the file.
Convert the function to single-exit and fix this bug.
Signed-off-by: "Luke Yang" <luke.adi@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add support for SA_PERCPU_IRQ (only mmtimer.c uses this at this stage).
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
nr_segs is unsigned long and thus cannot be negative. We checked against 0
few lines before.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Local variable i is unsigned int and thus cannot be negative.
(akpm: unsigneds shouldn't be called `i'. This value cannot possibly be
negative anyway).
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix Documentation/firmware_class/ examples so that they will build.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch removes duplicate definitions from include/linux/udf_fs_i.h
which are already defined in fs/udf/ecma_167.h.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch avoids arithmetic on 'signed' types that are slower than
'unsigned'. This saves space and cpu cycles.
size of kernel/sys.o before the patch (gcc-3.4.5)
text data bss dec hex filename
10924 252 4 11180 2bac kernel/sys.o
size of kernel/sys.o after the patch
text data bss dec hex filename
10903 252 4 11159 2b97 kernel/sys.o
I noticed that gcc-4.1.0 (from Fedora Core 5) even uses idiv instruction for
(a+b)/2 if a and b are signed.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>