Commit Graph

560 Commits

Author SHA1 Message Date
Andi Kleen
f0fdabf8bf [PATCH] x86_64: Don't warn for overflow in nommu case when dma_mask is < 32bit
This triggers for b44's 1GB DMA workaround which tries to map
first and then bounces.

The 32bit heuristic is reasonable because the IOMMU doesn't attempt
to handle < 32bit masks anyways.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-16 07:59:31 -07:00
Andi Kleen
ac71d12c99 [PATCH] x86_64: Avoid EBDA area in early boot allocator
Based on analysis&patch from Robert Hentosch

Observed on a Dell PE6850 with 16GB

The problem occurs very early on, when the kernel allocates space for the
temporary memory map called bootmap. The bootmap overlaps the EBDA region.
EBDA region is not historically reserved in the e820 mapping. When the
bootmap is freed it marks the EBDA region as usable.

If you notice in setup.c there is already code to work around the EBDA
in reserve_ebda_region(), this check however occurs after the bootmap
is allocated and doesn't prevent the bootmap from using this range.

AK: I redid the original patch. Thanks also to Jan Beulich for
spotting some mistakes.

Cc: Robert_Hentosch@dell.com
Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 09:34:56 -07:00
Corey Minyard
8b1ffe9550 [PATCH] x86_64: add nmi_exit to die_nmi
Playing with NMI watchdog on x86_64, I discovered that it didn't
do what I expected.  It always panic-ed, even when it didn't
happen from interrupt context.  This patch solves that
problem for me.  Also, in this case, do_exit() will be called
with interrupts disabled, I believe.  Would it be wise to also
call local_irq_enable() after nmi_exit()?
[Yes I added it -AK]

Currently, on x86_64, any NMI watchdog timeout will cause a panic
because the irq count will always be set to be in an interrupt
when do_exit() is called from die_nmi().  If we add nmi_exit() to
the die_nmi() call (since the nmi will never exit "normally")
it seems to solve this problem.  The following small program
can be used to trigger the NMI watchdog to reproduce this:
  main ()
  {
        iopl(3);
        for (;;) asm("cli");
  }

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 09:34:56 -07:00
Corey Minyard
cdc60a4c8e [PATCH] x86_64: fix die_lock nesting
I noticed this when poking around in this area.

The oops_begin() function in x86_64 would only conditionally claim
the die_lock if the call is nested, but oops_end() would always
release the spinlock. This patch adds a nest count for the die lock
so that the release of the lock is only done on the final oops_end().

Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 09:34:56 -07:00
Andi Kleen
5192d84e4c [PATCH] x86_64: Check for too many northbridges in IOMMU code
The IOMMU code can only deal with 8 northbridges. Error out when
more are found.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 09:34:56 -07:00
Kimball Murray
e0c1e9bf81 [PATCH] x86_64: avoid IRQ0 ioapic pin collision
The patch addresses a problem with ACPI SCI interrupt entry, which gets
re-used, and the IRQ is assigned to another unrelated device.  The patch
corrects the code such that SCI IRQ is skipped and duplicate entry is
avoided.  Second issue came up with VIA chipset, the problem was caused by
original patch assigning IRQs starting 16 and up.  The VIA chipset uses
4-bit IRQ register for internal interrupt routing, and therefore cannot
handle IRQ numbers assigned to its devices.  The patch corrects this
problem by allowing PCI IRQs below 16.

Cc: len.brown@intel.com

Signed-off by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 09:34:56 -07:00
Linus Torvalds
532f57da40 Merge branch 'audit.b10' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b10' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] Audit Filter Performance
  [PATCH] Rework of IPC auditing
  [PATCH] More user space subject labels
  [PATCH] Reworked patch for labels on user space messages
  [PATCH] change lspp ipc auditing
  [PATCH] audit inode patch
  [PATCH] support for context based audit filtering, part 2
  [PATCH] support for context based audit filtering
  [PATCH] no need to wank with task_lock() and pinning task down in audit_syscall_exit()
  [PATCH] drop task argument of audit_syscall_{entry,exit}
  [PATCH] drop gfp_mask in audit_log_exit()
  [PATCH] move call of audit_free() into do_exit()
  [PATCH] sockaddr patch
  [PATCH] deal with deadlocks in audit_free()
2006-05-01 21:43:05 -07:00
Mikael Pettersson
160bd18e5e [PATCH] x86_64: make PC Speaker driver work
The PC Speaker driver's ->probe() routine doesn't even get called in the
64-bit kernels.  The reason for that is that the arch code apparently has
to explictly add a "pcspkr" platform device in order for the driver core to
call the ->probe() routine.  arch/i386/kernel/setup.c unconditionally adds
a "pcspkr" device, but the x86_64 kernel has no code at all related to the
PC Speaker.

The patch below copies the relevant code from i386 to x86_64, which makes
the PC Speaker work for me on x86_64.

Cc: Dmitry Torokhov <dtor_core@ameritech.net>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-01 18:17:47 -07:00
Al Viro
5411be59db [PATCH] drop task argument of audit_syscall_{entry,exit}
... it's always current, and that's a good thing - allows simpler locking.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-05-01 06:06:18 -04:00
Chandra Seetharaman
83d722f7e1 [PATCH] Remove __devinit and __cpuinit from notifier_call definitions
Few of the notifier_chain_register() callers use __init in the definition
of notifier_call.  It is incorrect as the function definition should be
available after the initializations (they do not unregister them during
initializations).

This patch fixes all such usages to _not_ have the notifier_call __init
section.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 08:30:03 -07:00
Mike Waychison
5b20192727 [PATCH] x86_64: Fix a race in the free_iommu path
We do this by removing a micro-optimization that tries to avoid grabbing
the iommu_bitmap_lock spinlock and using a bus-locked operation.

This still races with other simultaneous alloc_iommu or free_iommu(size >
1) which both use bus-unlocked operations.

The end result of this race is eventually ending up with an
iommu_gart_bitmap that has bits errornously set all over, making large
contiguous iommu space allocations fail with 'PCI-DMA: Out of IOMMU space'.

Signed-off-by: Mike Waychison <mikew@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-22 09:19:52 -07:00
Andi Kleen
18bd057b14 [PATCH] i386/x86-64: Fix x87 information leak between processes
AMD K7/K8 CPUs only save/restore the FOP/FIP/FDP x87 registers in FXSAVE
when an exception is pending.  This means the value leak through
context switches and allow processes to observe some x87 instruction
state of other processes.

This was actually documented by AMD, but nobody recognized it as
being different from Intel before.

The fix first adds an optimization: instead of unconditionally
calling FNCLEX after each FXSAVE test if ES is pending and skip
it when not needed. Then do a x87 load from a kernel variable to
clear FOP/FIP/FDP.

This means other processes always will only see a constant value
defined by the kernel in their FP state.

I took some pain to make sure to chose a variable that's already
in L1 during context switch to make the overhead of this low.

Also alternative() is used to patch away the new code on CPUs
who don't need it.

Patch for both i386/x86-64.

The problem was discovered originally by Jan Beulich. Richard
Brunner provided the basic code for the workarounds, with contribution
from Jan.

This is CVE-2006-1056

Cc: richard.brunner@amd.com
Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-20 07:58:11 -07:00
Prasanna S Panchamukhi
3b60211c16 [PATCH] Switch Kprobes inline functions to __kprobes for x86_64
Andrew Morton pointed out that compiler might not inline the functions
marked for inline in kprobes.  There-by allowing the insertion of probes
on these kprobes routines, which might cause recursion.

This patch removes all such inline and adds them to kprobes section
there by disallowing probes on all such routines.  Some of the routines
can even still be inlined, since these routines gets executed after the
kprobes had done necessay setup for reentrancy.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-19 09:13:53 -07:00
Vivek Goyal
8bcc5280e6 [PATCH] x86_64: x86_64 add crashdump trigger points
o Start booting into the capture kernel after an Oops if system is in a
  unrecoverable state. System will boot into the capture kernel, if one is
  pre-loaded by the user, and capture the kernel core dump.

o One of the following conditions should be true to trigger the booting of
  capture kernel.
        - panic_on_oops is set.
        - pid of current thread is 0
        - pid of current thread is 1
        - Oops happened inside interrupt context.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-18 10:39:19 -07:00
Bjorn Helgaas
4f705ae3e9 [PATCH] DMI: move dmi_scan.c from arch/i386 to drivers/firmware/
dmi_scan.c is arch-independent and is used by i386, x86_64, and ia64.
Currently all three arches compile it from arch/i386, which means that ia64
and x86_64 depend on things in arch/i386 that they wouldn't otherwise care
about.

This is simply "mv arch/i386/kernel/dmi_scan.c drivers/firmware/" (removing
trailing whitespace) and the associated Makefile changes.  All three
architectures already set CONFIG_DMI in their top-level Kconfig files.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andrey Panin <pazke@orbita1.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-14 11:41:25 -07:00
Andi Kleen
97a4d00388 [PATCH] x86_64: Remove check for canonical RIP
As pointed out by Linus it is useless now because entry.S should
handle it correctly in all cases.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:38:57 -07:00
Kyle McMartin
894b5779ce [PATCH] No arch-specific strpbrk implementations
While cleaning up parisc_ksyms.c earlier, I noticed that strpbrk wasn't
being exported from lib/string.c.  Investigating further, I noticed a
changeset that removed its export and added it to _ksyms.c on a few more
architectures.  The justification was that "other arches do it."

I think this is wrong, since no architecture currently defines
__HAVE_ARCH_STRPBRK, there's no reason for any of them to be exporting it
themselves.  Therefore, consolidate the export to lib/string.c.

Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:40 -07:00
John Blackwood
97c2803c9c [PATCH] x86_64: Plug GS leak in arch_prctl()
In linux-2.6.16, we have noticed a problem where the gs base value
returned from an arch_prtcl(ARCH_GET_GS, ...) call will be incorrect if:

   - the current/calling task has NOT set its own gs base yet to a
     non-zero value,

   - some other task that ran on the same processor previously set their
     own gs base to a non-zero value.

In this situation, the ARCH_GET_GS code will read and return the
MSR_KERNEL_GS_BASE msr register.

However, since the __switch_to() code does NOT load/zero the
MSR_KERNEL_GS_BASE register when the task that is switched IN has a zero
next->gs value, the caller of arch_prctl(ARCH_GET_GS, ...) will get back
the value of some previous tasks's gs base value instead of 0.

    Change the arch_prctl() ARCH_GET_GS code to only read and return
    the MSR_KERNEL_GS_BASE msr register if the 'gs' register of the calling
    task is non-zero.

    Side note: Since in addition to using arch_prctl(ARCH_SET_GS, ...),
    a task can also setup a gs base value by using modify_ldt() and write
    an index value into 'gs' from user space, the patch below reads
    'gs' instead of using thread.gs, since in the modify_ldt() case,
    the thread.gs value will be 0, and incorrect value would be returned
    (the task->thread.gs value).

    When the user has not set its own gs base value and the 'gs'
    register is zero, then the MSR_KERNEL_GS_BASE register will not be
    read and a value of zero will be returned by reading and returning
    'task->thread.gs'.

    The first patch shown below is an attempt at implementing this
    approach.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:53 -07:00
Jordan Hargrave
b20367a6c2 [PATCH] x86_64: Fix drift with HPET timer enabled
If the HPET timer is enabled, the clock can drift by ~3 seconds a day.
This is due to the HPET timer not being initialized with the correct
setting (still using PIT count).

If HZ changes, this drift can become even more pronounced.

HPET patch initializes tick_nsec with correct tick_nsec settings for
HPET timer.

Vojtech comments:

  "It's not entirely correct (it assumes the HPET ticks totally
   exactly), but it's significantly better than assuming the PIT error
   there."

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:53 -07:00
Ravikiran G Thirumalai
e405d06729 [PATCH] x86_64: Fixup read_mostly section on internode cache line size for vSMP
Fixup the read mostly section to start at internode cacheline boundary.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:52 -07:00
Andi Kleen
3d34ee6891 [PATCH] x86_64: Don't return error for HPET initialization in initcall
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:52 -07:00
Andi Kleen
ac04dcaf6f [PATCH] x86_64: Don't export strlen twice
Fix

  WARNING: vmlinux: 'strlen' exported twice. Previous export was in vmlinux

Reported by Mats Johannesson

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:52 -07:00
Andi Kleen
7bf36bbc5e [PATCH] x86_64: When user could have changed RIP always force IRET
Intel EM64T CPUs handle uncanonical return addresses differently
from AMD CPUs.

The exception is reported in the SYSRET, not the next instruction.
This leads to the kernel exception handler running on the user stack
with the wrong GS because the kernel didn't expect exceptions
on this instruction.

This version of the patch has the teething problems that plagued an earlier
version fixed.

This is CVE-2006-0744

Thanks to Ernie Petrides and Asit B. Mallick for analysis and initial
patches.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:52 -07:00
Andi Kleen
553f265fe8 [PATCH] x86_64: Don't run NMI watchdog during machine checks
Machine checks can stall the machine for a long time and
it's not good to trigger the nmi watchdog during that.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:52 -07:00
Andi Kleen
d1530d82e0 [PATCH] x86_64: Clear APIC feature bit when local APIC is disabled
Needed for other checks later in ACPI.

Pointed out by Len Brown

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:51 -07:00
Andi Kleen
fa47dd0ba3 [PATCH] x86_64: Fix compilation with CONFIG_PCI=n / allnoconfig
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:51 -07:00
Arjan van de Ven
952223683e [PATCH] x86_64: Introduce e820_all_mapped
Introduce a e820_all_mapped() function which checks if the entire range
<start,end> is mapped with type.

This is done by moving the local start variable to the end of each
known-good region; if at the end of the function the start address is
still before end, there must be a part that's not of the correct type;
otherwise it's a good region.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:50 -07:00
Arjan van de Ven
eee5a9fa63 [PATCH] x86_64: Rename e820_mapped to e820_any_mapped
Rename e820_mapped to e820_any_mapped since it tests if any part of the
range is mapped according to the type.

Later steps will introduce e820_all_mapped which will check if the
entire range is mapped with the type.  Both have their merit.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:17 -07:00
Andi Kleen
9d99aaa31f [PATCH] x86_64: Support memory hotadd without sparsemem
Memory hotadd doesn't need SPARSEMEM, but can be handled by just preallocating
mem_maps. This only needs some untangling of ifdefs to enable the necessary
code even without SPARSEMEM.

Originally from Keith Mannthey, hacked by AK.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:16 -07:00
Andi Kleen
805e8c03c9 [PATCH] x86_64: Clean up execve path
Just call IRET always, no need for any special cases.

Needed for the next bug fix.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:16 -07:00
Adrian Bunk
0cb3463f04 [PATCH] unexport get_wchan
The only user of get_wchan is the proc fs - and proc can't be built modular.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:01 -08:00
OGAWA Hirofumi
9b41046cd0 [PATCH] Don't pass boot parameters to argv_init[]
The boot cmdline is parsed in parse_early_param() and
parse_args(,unknown_bootoption).

And __setup() is used in obsolete_checksetup().

	start_kernel()
		-> parse_args()
			-> unknown_bootoption()
				-> obsolete_checksetup()

If __setup()'s callback (->setup_func()) returns 1 in
obsolete_checksetup(), obsolete_checksetup() thinks a parameter was
handled.

If ->setup_func() returns 0, obsolete_checksetup() tries other
->setup_func().  If all ->setup_func() that matched a parameter returns 0,
a parameter is seted to argv_init[].

Then, when runing /sbin/init or init=app, argv_init[] is passed to the app.
If the app doesn't ignore those arguments, it will warning and exit.

This patch fixes a wrong usage of it, however fixes obvious one only.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:53 -08:00
Matt Mackall
641f71f5f6 [PATCH] RTC: Remove RTC UIP synchronization on x86_64
Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:00 -08:00
Alan Stern
e041c68341 [PATCH] Notifier chain update: API changes
The kernel's implementation of notifier chains is unsafe.  There is no
protection against entries being added to or removed from a chain while the
chain is in use.  The issues were discussed in this thread:

    http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2

We noticed that notifier chains in the kernel fall into two basic usage
classes:

	"Blocking" chains are always called from a process context
	and the callout routines are allowed to sleep;

	"Atomic" chains can be called from an atomic context and
	the callout routines are not allowed to sleep.

We decided to codify this distinction and make it part of the API.  Therefore
this set of patches introduces three new, parallel APIs: one for blocking
notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
really just the old API under a new name).  New kinds of data structures are
used for the heads of the chains, and new routines are defined for
registration, unregistration, and calling a chain.  The three APIs are
explained in include/linux/notifier.h and their implementation is in
kernel/sys.c.

With atomic and blocking chains, the implementation guarantees that the chain
links will not be corrupted and that chain callers will not get messed up by
entries being added or removed.  For raw chains the implementation provides no
guarantees at all; users of this API must provide their own protections.  (The
idea was that situations may come up where the assumptions of the atomic and
blocking APIs are not appropriate, so it should be possible for users to
handle these things in their own way.)

There are some limitations, which should not be too hard to live with.  For
atomic/blocking chains, registration and unregistration must always be done in
a process context since the chain is protected by a mutex/rwsem.  Also, a
callout routine for a non-raw chain must not try to register or unregister
entries on its own chain.  (This did happen in a couple of places and the code
had to be changed to avoid it.)

Since atomic chains may be called from within an NMI handler, they cannot use
spinlocks for synchronization.  Instead we use RCU.  The overhead falls almost
entirely in the unregister routine, which is okay since unregistration is much
less frequent that calling a chain.

Here is the list of chains that we adjusted and their classifications.  None
of them use the raw API, so for the moment it is only a placeholder.

  ATOMIC CHAINS
  -------------
arch/i386/kernel/traps.c:		i386die_chain
arch/ia64/kernel/traps.c:		ia64die_chain
arch/powerpc/kernel/traps.c:		powerpc_die_chain
arch/sparc64/kernel/traps.c:		sparc64die_chain
arch/x86_64/kernel/traps.c:		die_chain
drivers/char/ipmi/ipmi_si_intf.c:	xaction_notifier_list
kernel/panic.c:				panic_notifier_list
kernel/profile.c:			task_free_notifier
net/bluetooth/hci_core.c:		hci_notifier
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_chain
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_expect_chain
net/ipv6/addrconf.c:			inet6addr_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_expect_chain
net/netlink/af_netlink.c:		netlink_chain

  BLOCKING CHAINS
  ---------------
arch/powerpc/platforms/pseries/reconfig.c:	pSeries_reconfig_chain
arch/s390/kernel/process.c:		idle_chain
arch/x86_64/kernel/process.c		idle_notifier
drivers/base/memory.c:			memory_chain
drivers/cpufreq/cpufreq.c		cpufreq_policy_notifier_list
drivers/cpufreq/cpufreq.c		cpufreq_transition_notifier_list
drivers/macintosh/adb.c:		adb_client_list
drivers/macintosh/via-pmu.c		sleep_notifier_list
drivers/macintosh/via-pmu68k.c		sleep_notifier_list
drivers/macintosh/windfarm_core.c	wf_client_list
drivers/usb/core/notify.c		usb_notifier_list
drivers/video/fbmem.c			fb_notifier_list
kernel/cpu.c				cpu_chain
kernel/module.c				module_notify_list
kernel/profile.c			munmap_notifier
kernel/profile.c			task_exit_notifier
kernel/sys.c				reboot_notifier_list
net/core/dev.c				netdev_chain
net/decnet/dn_dev.c:			dnaddr_chain
net/ipv4/devinet.c:			inetaddr_chain

It's possible that some of these classifications are wrong.  If they are,
please let us know or submit a patch to fix them.  Note that any chain that
gets called very frequently should be atomic, because the rwsem read-locking
used for blocking chains is very likely to incur cache misses on SMP systems.
(However, if the chain's callout routines may sleep then the chain cannot be
atomic.)

The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
material written by Keith Owens and suggestions from Paul McKenney and Andrew
Morton.

[jes@sgi.com: restructure the notifier chain initialization macros]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:50 -08:00
Siddha, Suresh B
1e9f28fa1e [PATCH] sched: new sched domain for representing multi-core
Add a new sched domain for representing multi-core with shared caches
between cores.  Consider a dual package system, each package containing two
cores and with last level cache shared between cores with in a package.  If
there are two runnable processes, with this appended patch those two
processes will be scheduled on different packages.

On such systems, with this patch we have observed 8% perf improvement with
specJBB(2 warehouse) benchmark and 35% improvement with CFP2000 rate(with 2
users).

This new domain will come into play only on multi-core systems with shared
caches.  On other systems, this sched domain will be removed by domain
degeneration code.  This new domain can be also used for implementing power
savings policy (see OLS 2005 CMP kernel scheduler paper for more details..
I will post another patch for power savings policy soon)

Most of the arch/* file changes are for cpu_coregroup_map() implementation.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:43 -08:00
Prasanna S Panchamukhi
c28f896634 [PATCH] kprobes: fix broken fault handling for x86_64
Provide proper kprobes fault handling, if a user-specified pre/post handlers
tries to access user address space, through copy_from_user(), get_user() etc.

The user-specified fault handler gets called only if the fault occurs while
executing user-specified handlers.  In such a case user-specified handler is
allowed to fix it first, later if the user-specifed fault handler does not fix
it, we try to fix it by calling fix_exception().

The user-specified handler will not be called if the fault happens when single
stepping the original instruction, instead we reset the current probe and
allow the system page fault handler to fix it up.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:04 -08:00
bibo,mao
2326c77017 [PATCH] kprobe handler: discard user space trap
Currently kprobe handler traps only happen in kernel space, so function
kprobe_exceptions_notify should skip traps which happen in user space.
This patch modifies this, and it is based on 2.6.16-rc4.

Signed-off-by: bibo mao <bibo.mao@intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>
Cc: <hiramatu@sdl.hitachi.co.jp>
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:04 -08:00
bibo mao
c6fd91f0bd [PATCH] kretprobe instance recycled by parent process
When kretprobe probes the schedule() function, if the probed process exits
then schedule() will never return, so some kretprobe instances will never
be recycled.

In this patch the parent process will recycle retprobe instances of the
probed function and there will be no memory leak of kretprobe instances.

Signed-off-by: bibo mao <bibo.mao@intel.com>
Cc: Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:04 -08:00
Andi Kleen
c36cd16f78 [PATCH] x86_64: Add cpu_relax() to busy loops in PM timer code
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:14:39 -08:00
Andi Kleen
3076a492a5 [PATCH] x86_64: Report SIGSEGV for IRET faults
tcsh is not happy with the -9999 error code.

Suggested by Ernie Petrides

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:14:39 -08:00
Andi Kleen
0085979006 [PATCH] x86_64: Remove bogus special case in AMD core parsing.
No need to restrict to power of two here.

TBD needs more double checking

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:14:39 -08:00
Kevin Winchester
40caa88465 [PATCH] x86_64: Eliminate register_die_notifier symbol exported
register_die_notifier is exported twice, once in traps.c and once in
x8664_ksyms.c.  This results in a warning on build.

Signed-off-by: Kevin Winchester <kwin@ns.sympatico.ca>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:14:38 -08:00
Navin Boppuri
9c01dda02f [PATCH] x86_64: Search K8 devices on more devices.
arch/x86_64/kernel/aperture.c: The search for the AGP bridge has been
extended to search for all the 256 buses instead of the first 32. This
is required since on a some systems, the bridge may be located on a bus
much farther than the first 32. By searching all 256 buses, we guarantee
that the search succeeds on such systems.

arch/x86_64/kernel/pci-gart.c: The search for the Northbridge is not
limited to just bus 0 anymore. This is required because on certain
systems, we may not find one on bus 0.

Signed-off-by: Navin Boppuri <navin.boppuri@newisys.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:14:38 -08:00
Andi Kleen
9d95dd849c [PATCH] i386/x86-64: List Intel LaGrange AKA SMX in /proc/cpuinfo
Spec just got published so we know the CPUID bit.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:57 -08:00
Jon Mason
c912c2db2f [PATCH] x86_64: free_bootmem_node needs __pa in allocate_aperture
free_bootmem_node expects a physical address to be passed in, but
__alloc_bootmem_node returns a virtual one.  That address needs to be
translated to physical.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:57 -08:00
Vivek Goyal
da7ed9f98f [PATCH] x86_64: timer interrupt lockup due to pending interrupt
o check_timer() routine fails while second kernel is booting after a crash
  on an opetron box. Problem happens because timer vector (0x31) seems to be
  locked.

o After a system crash, it is not safe to service interrupts any more, hence
  interrupts are disabled. This leads to pending interrupts at LAPIC. LAPIC
  sends these interrupts to the CPU during early boot of second kernel. Other
  pending interrupts are discarded saying unexpected trap but timer interrupt
  is serviced and CPU does not issue an LAPIC EOI because it think this
  interrupt came from i8259 and sends ack to 8259. This leads to vector 0x31
  locking as LAPIC does not clear respective ISR and keeps on waiting for
  EOI.

o This patch issues extra EOI for the pending interrupts who have ISR set.

o Though today only timer seems to be the special case because in early
  boot it thinks interrupts are coming from i8259 and uses
  mask_and_ack_8259A() as ack handler and does not issue LAPIC EOI. But
  probably doing it in generic manner for all vectors makes sense.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:57 -08:00
Brian Gerst
b1fc513d81 [PATCH] x86_64: Use cpumask bitops for cpu_vm_mask
cpu_vm_mask is of type cpumask_t, so use the proper bitops.

Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:56 -08:00
Andi Kleen
7682968b7d [PATCH] x86_64: Change default setting for noexec32 to match i386 kernel
This means i386 processes compiled with a recent compiler will get non
executable heap by default now.  This is the same default as a 32bit PAE
kernel would use on a NX enabled CPU.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:56 -08:00
Chuck Ebbert
5b922cd429 [PATCH] x86_64: fix orphaned bits of timer init messages
When x86_64 timer init messages were changed to use apic verbosity
levels, two messages were missed and one got the wrong level.  This
causes the last word of a suppressed message to print on a line by
itself.  Fix that so either the entire message prints or none of it
does.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:56 -08:00
Arjan van de Ven
4bdc3b7f1b [PATCH] x86_64: Basic reorder infrastructure
This patch puts the infrastructure in place to allow for a reordering of
functions based inside the vmlinux. The general idea is that it is possible
to put all "common" functions into the first 2Mb of the code, so that they
are covered by one TLB entry. This as opposed to the current situation where
a typical vmlinux covers about 3.5Mb (on x86-64) and thus 2 TLB entries.

This is done by enabling the -ffunction-sections flag in gcc, which puts
each function in its own ELF section, so that the linker can then order them
in a way defined by the linker script.

As per previous discussions, Linus said he wanted a "static" list for this,
eg a list provided by the kernel tarbal, so that most people have the same
ordering at least. A script is provided to create this list based on
readprofile(1) output. The included list is provisional, and entirely biased
on my own testbox and me running a few kernel compiles and some other
things.

I think that to get to a better list we need to invite people to submit
their own profiles, and somehow add those all up and base the final list on
that. I'm willing to do that effort if this is ends up being the prefered
approach. Such an effort probably needs to be repeated like once a year or
so to adopt to the changing nature of the kernel.

Made it a CONFIG with default n because it increases link times
dramatically.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:56 -08:00
Andi Kleen
9b2a13b963 [PATCH] x86_64: Always use IO-APIC routing for timer.
I tested it on a couple of chipsets and it worked everywhere so it
should be ok as default for now.

So far I haven't done the great purge of the useless old check_timer
code yet though.

Can be overwritten with enable_8254_timer in the worst case

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:55 -08:00
Andi Kleen
3056d6be19 [PATCH] x86_64: Don't invoke OOM killer during dma_alloc_coherent()
There is a fallback logic, so it's better to not use the OOM killer
in the allocations.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:55 -08:00
Andi Kleen
28456edeff [PATCH] x86_64: Reename CMOS update warning
Was disabled due to an old bug, long gone.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:55 -08:00
Andi Kleen
7351c0bfe8 [PATCH] x86_64: Fix formatting in time.c
Only white space changes, code should be identical

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:55 -08:00
Andi Kleen
6954bee829 [PATCH] x86_64: Handle years beyond 2100
ACPIv2 has an official but optional way to get a date >2100.  Use it.
But all the platforms I tested didn't seem to support it.  But anyways
the x86-64 kernel should be ready for the 22nd century now.  Actually i
shouldn't care about this because I will be dead by then @)

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:55 -08:00
Arjan van de Ven
eaeae0cc98 [PATCH] x86_64: Patch to make the head.S-must-be-first-in-vmlinux order explicit
This patch puts the code from head.S in a special .bootstrap.text
section.

I'm working on a patch to reorder the functions in the kernel (I'll post
that later), but for x86-64 at least the kernel bootstrap requires that
the head.S functions are on the very first page/pages of the kernel
text.  This is understandable since the bootstrap is complex enough
already and not a problem at all, it just means they aren't allowed to
be reordered.  This patch puts these special functions into a separate
section to document this, and to guarantee this in the light of possibly
reordering the rest later.

(So this patch doesn't fix a bug per se, but makes things more robust by
making the order of these functions explicit)

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:55 -08:00
Andi Kleen
f2d3efedbe [PATCH] x86_64: Implement early DMI scanning
There are more and more cases where we need to know DMI information
early to work around bugs.  i386 already had early DMI scanning, but
x86-64 didn't.  Implement this now.

This required some cleanup in the i386 code.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:55 -08:00
Dave Jones
e6fc99c6ab [PATCH] x86_64: s/Overwrite/Override/ in arch/x86-64
s/Overwrite/Override/

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:54 -08:00
Ravikiran G Thirumalai
60c1bc82d9 [PATCH] x86_64: to use lapic ids instead of initial apic ids
phys_proc_id[] on AMD boxes is right now populated with the initial
apic id, obtained by the cpuid instruction.  But, the initial apic id
need not be the local apic id on clustered APIC systems (see comment at
x86_64/kernel/genapic_cluster.c, line 110).  On vSMPowered with AMD
CPUs the cpu_to_node will turn out to be incorrect (as apicid_to_node[] is
indexed by the initial apic id rather than the local apic id).
On vSMPowered boxes with Intel CPUs this is working correctly as
phys_proc_id[] is initialized correctly in detect_ht().

This fixes AMD boot path according to specification, to use the correct
routines for local apic id and socket ids.  We use
hard_smp_processor_id() to read the local apic id, and phys_pkg_id() to
determine socket id for phys_proc_id[]

Patch tested on Tyan multicore boxes as well as vSMPowered boxes.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:54 -08:00
Jan Beulich
e57113bc1f [PATCH] x86_64: miscellaneous cleanup
- adjust limits of GDT/IDT pseudo-descriptors (some were off by one)
- move empty_zero_page into .bss.page_aligned
- move cpu_gdt_table into .data.page_aligned
- move idt_table into .bss
- align inital_code and init_rsp
- eliminate pointless (re-)declaration of idt_table in traps.c

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:54 -08:00
Andi Kleen
1f50249e94 [PATCH] x86_64: Make pfn_valid work early in boot
It needs num_physpages, so initialize it early. It's later overwritten
again.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:54 -08:00
Roberto Nibali
2b692a872c [PATCH] x86_64: Clean up white space in traps.c
Attached is a small code style cleanup patch that resulted from my
skimming through the arch/x86_64/kernel/traps.c code to figure out what
went haywire.

Signed-off-by: Roberto Nibali <ratz@drugphish.ch>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:53 -08:00
Andi Kleen
681558fdb5 [PATCH] x86_64: Check that early arguments are words on their own
We've always had the problem that arguments only did a prefix match,
which resulted e.g.  in noapic and noapictimer getting confused.

Fix the early argument parsing code to always check that arguments are
whole words (except for those that take additional arguments of course)
I factored out the checking code for that while also makes the code
easier to maintain.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:53 -08:00
Jan Beulich
86ebcea899 [PATCH] x86_64: remove dead do_softirq_thunk
Appearantly a left-over...

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:53 -08:00
Jan Beulich
8c914cb704 [PATCH] x86_64: actively synchronize vmalloc area when registering certain callbacks
While the modular aspect of the respective i386 patch doesn't apply to
x86-64 (as the top level page directory entry is shared between modules
and the base kernel), handlers registered with register_die_notifier()
are still under similar constraints for touching ioremap()ed or
vmalloc()ed memory. The likelihood of this problem becoming visible is
of course significantly lower, as the assigned virtual addresses would
have to cross a 2**39 byte boundary. This is because the callback gets
invoked
(a) in the page fault path before the top level page table propagation
gets carried out (hence a fault to propagate the top level page table
entry/entries mapping to module's code/data would nest infinitly) and
(b) in the NMI path, where nested faults must absolutely not happen,
since otherwise the IRET from the nested fault re-enables NMIs,
potentially resulting in nested NMI occurences.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:53 -08:00
Andi Kleen
85f9eebccd [PATCH] x86_64: Use cpu_relax in poll loop in GART IOMMU
The code waits for the GART to clear the TLB flush bit. Use cpu_relax
in this time to allow hypervisors to yield the CPU in this time.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:53 -08:00
Andi Kleen
77d910f557 [PATCH] x86_64: Report local APIC ID when initializing CPU
Makes some debugging easier.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:53 -08:00
Andi Kleen
9ede6b0945 [PATCH] x86_64: Don't need to read PIT in timer handler when PM timer is used
The PM timer path through main_timer_handler doesn't need
the delay variable because it figures it out in a different way.
Don't try to read it from the PIT. With stopped PIT timer
it is even useless.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:53 -08:00
Ashok Raj
51f62e186b [PATCH] x86_64: cleanup allocating logical cpu numbers in x86_64
Minor cleanup to lend better for physical CPU hotplug.
Earlier way of using num_processors as index doesnt
fit if CPUs come and go. This makes the code little bit better
to read, and helps physical hotplug use the same functions as boot.

Reserving CPU0 for BSP is too late to be done in smp_prepare_boot_cpu().
Since logical assignments from MADT is already done via
setup_arch()->acpi_boot_init()->parse lapic

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:53 -08:00
Jan Beulich
45948d7720 [PATCH] x86_64: save FPU context slightly later
Touching of the floating point state in a kernel debugger must be
NMI-safe, specifically math_state_restore() must be able to deal with
being called out of an NMI context. In order to do that reliably, the
context switch code must take care to not leave a window open where
the current task's TS_USEDFPU flag and CR0.TS could get out of sync.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:52 -08:00
Jan Beulich
2b514e74f4 [PATCH] x86_64: eliminate set_debug()
For consistency and to have only a single place of definition, replace
set_debug() uses with set_debugreg(), and eliminate the definition of
thj former.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:52 -08:00
Jan Beulich
893efca927 [PATCH] x86_64: disallow multi-byte hardware execution breakpoints
While AMD formally permits multi-byte execution breakpoints, Intel
disallows 8-byte as much as 2- or 4-byte ones.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:52 -08:00
Jan Beulich
3240114d23 [PATCH] x86_64: cpu_pda array to macro followup correction
Fix one place where the previous change of cpu_pda from being an array
to being a macro was not properly carried out.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:52 -08:00
Randy Dunlap
a94ddf3ab8 [PATCH] early_printk: cleanup trailiing whitespace
Remove all trailing tabs and spaces.  No other changes.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:25 -08:00
Adrian Bunk
cdb0452789 [PATCH] kill include/linux/platform.h, default_idle() cleanup
include/linux/platform.h contained nothing that was actually used except
the default_idle() prototype, and is therefore removed by this patch.

This patch does the following with the platform specific default_idle()
functions on different architectures:
- remove the unused function:
  - parisc
  - sparc64
- make the needlessly global function static:
  - arm
  - h8300
  - m68k
  - m68knommu
  - s390
  - v850
  - x86_64
- add a prototype in asm/system.h:
  - cris
  - i386
  - ia64

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Patrick Mochel <mochel@digitalimplant.org>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:21 -08:00
Akinobu Mita
3d1712c91d [PATCH] x86_64: {set,clear,test}_bit() related cleanup and pci_mmcfg_init() fix
While working on these patch set, I found several possible cleanup on x86-64
and ia64.

akpm: I stole this from Andi's queue.

Not only does it clean up bitops.  It also unrelatedly changes the prototype
of pci_mmcfg_init() and removes its arch_initcall().  It seems that the wrong
two patches got joined together, but this is the one which has been tested.

This patch fixes the current x86_64 build error (the pci_mmcfg_init()
declaration in arch/i386/pci/pci.h disagrees with the definition in
arch/x86_64/pci/mmconfig.c)

This also means that x86_64's pci_mmcfg_init() gets called in the same (new)
manner as x86's: from arch/i386/pci/init.c:pci_access_init(), rather than via
initcall.

The bitops cleanups came along for free.

All this worked OK in -mm testing (since 2.6.16-rc4-mm1) because x86_64 was
tested with both patches applied.

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Con Kolivas <kernel@kolivas.org>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:15 -08:00
Andrew Morton
394e3902c5 [PATCH] more for_each_cpu() conversions
When we stop allocating percpu memory for not-possible CPUs we must not touch
the percpu data for not-possible CPUs at all.  The correct way of doing this
is to test cpu_possible() or to use for_each_cpu().

This patch is a kernel-wide sweep of all instances of NR_CPUS.  I found very
few instances of this bug, if any.  But the patch converts lots of open-coded
test to use the preferred helper macros.

Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Christian Zankel <chris@zankel.net>
Cc: Philippe Elie <phil.el@wanadoo.fr>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Jens Axboe <axboe@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:17 -08:00
Ingo Molnar
7a7d1cf954 [PATCH] sem2mutex: kprobes
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:12 -08:00
Rafael J. Wysocki
fc558a7496 [PATCH] swsusp: finally solve mysqld problem
This patch from Pavel moves userland freeze signals handling into more logical
place.  It now hits even with mysqld running.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:08 -08:00
Andrew Morton
ffa930ef55 [PATCH] x86: early_printk(): remove MAX_YPOS and MAX_XPOS macros
Expand out these fairly pointless macros.

Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Stas Sergeev <stsp@aknet.ru>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:05 -08:00
Chuck Ebbert
98e7d9b052 [PATCH] x86: start early_printk at sensible screen row
Use boot info to start early_printk() at the current row on VGA console, as
left by the boot loader.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Stas Sergeev <stsp@aknet.ru>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:05 -08:00
Stas Sergeev
99b7de3347 [PATCH] x86: early printk handling fixes
The history is that -mm kernels do not work for me for a few months
already.  The things started from crashing somewhere after starting init,
and for the last month - no boot at all, just "Uncompressing...  OK,
booting kernel", and silence.  Early console didn't work too.  With the
latest releases this degraded into an infinite stream of the "Unknown
interrupt or fault" messages.  So today my patience ran out and I started
to think how can I collect at least some info for the bug-report.  Attached
is the patch that allows to gather some valueable debug info on the problem
by making an early console more useable.  I can't properly test the patch,
as the kernel still doesn't boot, so I'll explain it in details in a hope
someone else can justify the intrusive changes.

arch_hooks.h: added prototypes for setup_early_printk() and early_printk().

setup.c: killed wrong setup_early_printk() prototype.  Moved
setup_early_printk() a bit earlier, as it was not "early enough" to cover
the bug I was fighting with.

early_printk.c: made it to start printing from the bottom of the screen,
otherwise the messages interfere with the ones of the boot-loader, so you
can't read them.

Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Cc: Andi Kleen <ak@muc.de>
Cc: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:05 -08:00
Andrew Morton
f4a641d66c [PATCH] multiple exports of strpbrk
Sam's tree includes a new check, which found that we're exporting strpbrk()
multiple times.

It seems that the convention is that this is exported from the arch files, so
reove the lib/string.c export.

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: David Howells <dhowells@redhat.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22 07:53:56 -08:00
Ravikiran G Thirumalai
68ed0040a8 [PATCH] x86: mark cyc2ns_scale readmostly
This variable is rarely written to.  Mark the variable accordingly.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22 07:53:55 -08:00
Linus Torvalds
cbf0ec6ee0 Revert "[PATCH] x86-64: Fix up handling of non canonical user RIPs"
This reverts commit c33d4568ac.

Andrew Clayton and Hugh Dickins report that it's broken for them and
causes strange page table and slab corruption, and spontaneous reboots.

Let's get it right next time.

Cc: Andrew Clayton <andrew@rootshell.co.uk>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-14 08:01:47 -08:00
Andi Kleen
c33d4568ac [PATCH] x86-64: Fix up handling of non canonical user RIPs
EM64T CPUs have somewhat weird error reporting for non canonical RIPs in
SYSRET.

We can't handle any exceptions there because the exception handler would
end up running on the user stack which is unsafe.

To avoid problems any code that might end up with a user touched pt_regs
should return using int_ret_from_syscall.  int_ret_from_syscall ends up
using IRET, which allows safe exceptions.

Cc: Ernie Petrides <petrides@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-12 22:56:29 -08:00
Michael Matz
2ec5e3a867 [PATCH] fix kexec asm
While testing kexec and kdump we hit problems where the new kernel would
freeze or instantly reboot.  The easiest way to trigger it was to kexec a
kernel compiled for CONFIG_M586 on an athlon cpu.  Compiling for CONFIG_MK7
instead would work fine.

The patch fixes a few problems with the kexec inline asm.

Signed-off-by: Chris Mason <mason@suse.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-08 14:15:04 -08:00
Linus Torvalds
637029c6cb Revert "[PATCH] x86_64: Only do the clustered systems have unsynchronized TSC assumption on IBM systems"
This reverts commit 13a229abc2.

Quoth Andi:
  "After some consideration and feedback from various people it turns
   out this wasn't that good an idea.  It has some problems and needs
   more work.  Since it was only an optimization anyways it's best to
   just back it out again for now."

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-27 20:41:56 -08:00
Brian Magnuson
d51761233d [PATCH] fix build on x86_64 with !CONFIG_HOTPLUG_CPU
The commit e2c0388866 added
setup_additional_cpus to setup.c but this is only defined if
CONFIG_HOTPLUG_CPU is set.  This patch changes the #ifdef to reflect that.

Signed-off-by: Brian Magnuson <magnuson@rcn.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 19:07:42 -08:00
Andi Kleen
ab9b32ee62 [PATCH] x86_64: Better ATI timer fix
The previous experiment for using apicmaintimer on ATI systems didn't
work out very well.  In particular laptops with C2/C3 support often
don't let it tick during idle, which makes it useless.  There were also
some other bugs that made the apicmaintimer often not used at all.

I tried some other experiments - running timer over RTC and some other
things but they didn't really work well neither.

I rechecked the specs now and it turns out this simple change is
actually enough to avoid the double ticks on the ATI systems.  We just
turn off IRQ 0 in the 8254 and only route it directly using the IO-APIC.

I tested it on a few ATI systems and it worked there.  In fact it worked
on all chipsets (NVidia, Intel, AMD, ATI) I tried it on.

According to the ACPI spec routing should always work through the
IO-APIC so I think it's the correct thing to do anyways (and most of the
old gunk in check_timer should be thrown away for x86-64).

But for 2.6.16 it's best to do a fairly minimal change:
 - Use the known to be working everywhere-but-ATI IRQ0 both over 8254
   and IO-APIC setup everywhere
 - Except on ATI disable IRQ0 in the 8254
 - Remove the code to select apicmaintimer on ATI chipsets
 - Add some boot options to allow to override this (just paranoia)

In 2.6.17 I hope to switch the default over to this for everybody.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:31 -08:00
Andi Kleen
e8b917775b [PATCH] x86_64: Move the SMP time selection earlier
SMP time selection originally ran after all CPUs were brought up because
it needed to know the number of CPUs to decide if it needs an MP safe
timer or not.

This is not needed anymore because we know present CPUs early.

This fixes a couple of problems:
 - apicmaintimer didn't always work because it relied on state that was
   set up time_init_gtod too late.
 - The output for the used timer in early kernel log was misleading
   because time_init_gtod could actually change it later.  Now always
   print the final timer choice

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:31 -08:00
Andi Kleen
e2c0388866 [PATCH] x86_64: Fix the additional_cpus=.. option
It didn't set up the CPU possible map early enough, so the
option didn't actually work.

Noticed by Heiko Carstens

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:30 -08:00
Chris McDermott
1f99215392 [PATCH] x86_64: Fix NMI watchdog on x460
[description from AK]

Old check for the IO-APIC watchdog during the timer check was wrong -
it obviously should only drop into this if the IO-APIC watchdog is used.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:30 -08:00
Andi Kleen
13a229abc2 [PATCH] x86_64: Only do the clustered systems have unsynchronized TSC assumption on IBM systems
Big Unisys systems have multiple clusters too, but they have an
synchronized TSC.

I'm using the SMBIOS to check for vendor == IBM.

Cc: Chris McDermott <lcm@us.ibm.com>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@unisys.com>

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:30 -08:00
Jon Mason
60b08c6722 [PATCH] x86_64: no_iommu removal in pci-gart.c
In previous versions of pci-gart.c, no_iommu was used to determine if IOMMU was
disabled in the GART DMA mapping functions.  This changed in 2.6.16 and now
gart_xxx() functions are only called if gart is enabled.  Therefore, uses of
no_iommu in the GART code are no longer necessary and can be removed.

Also, it removes double deceleration of no_iommu and force_iommu in pci.h and
proto.h, by removing the deceleration in pci.h.

Lastly, end_pfn off by one error.

Tested (along with patch 1/2) on dual opteron with gart enabled, iommu=soft,
and iommu=off.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26 09:53:29 -08:00
Dave Jones
a0124d780d [PATCH] x86-64: react to new topology.c location
Commit 9c869edac5 moved the i386 topology.c
file. That change broke x86-64 compiles, as it uses the same file.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-25 11:12:15 -08:00
Tim Hockin
9ff4ced467 [PATCH] Remove KERN_INFO from middle of printk line
Don't print KERN_INFO in the middle of a printk line.
	printk(KERN_INFO "OEM ID: %s ",str);
is just above this. This is already fixed up in i386 copy.

Signed-off-by: Martin J. Bligh <mbligh@google.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 13:59:27 -08:00
Andi Kleen
6574ffd74b [PATCH] x86_64: Resolve the RIP of an early exception using kallsyms
But do it after everything else to risk less from recursive
crashes.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 08:00:40 -08:00
Andi Kleen
7fd67843b9 [PATCH] x86_64: Disable tsc when apicpmtimer is active
Otherwise it has no effect anyways.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 08:00:40 -08:00
Andi Kleen
ab68805955 [PATCH] x86_64: Don't enable ATI apicmaintimer workaround when the machine has C2 or C3
Many laptops have problems with ticking the local APIC timer in C2/C3.
The code added earlier to use it by default on ATI didn't really work
for them. Don't enable it when the system supports C2/C3.

This doesn't fix the problem fully, but at least it's not worse than before.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17 08:00:40 -08:00