Add a function for generic parsing of dma-window properties (ie,
ibm,dma-window and ibm,my-dma-window) of pci and virtual device nodes.
This function will also be used by cell.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch allows the compiler to catch any printf-like mismatches for
udbg_printf(). After some brute force building I've only found issues
with my own code and lparcfg.c It could break some developers, but
IMHO that would be goodness.
Signed-off-by: Jimi Xenidis <jimix@watson.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We need to know the base address of the kdump kernel even when we're not a
kdump kernel, so add a #define for it. Move the logic that sets the kdump
kernelbase into kdump.h instead of page.h.
Rename kdump_setup() to setup_kdump_trampoline() to make it clearer what it's
doing, and add an empty definition for the !CRASH_DUMP case to avoid a
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We currently do mem= handling in three seperate places. And as benh pointed out
I wrote two of them. Now that we parse command line parameters earlier we can
clean this mess up.
Moving the parsing out of prom_init means the device tree might be allocated
above the memory limit. If that happens we'd have to move it. As it happens
we already have logic to do that for kdump, so just genericise it.
This also means we might have reserved regions above the memory limit, if we
do the bootmem allocator will blow up, so we have to modify
lmb_enforce_memory_limit() to truncate the reserves as well.
Tested on P5 LPAR, iSeries, F50, 44p. Tested moving device tree on P5 and
44p and F50.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently we have call parse_early_param() earliyish, but not really very
early. In particular, it's not early enough to do things like mem=x or
crashkernel=blah, which is annoying.
So do it earlier. I've checked all the early param handlers, and none of them
look like they should have any trouble with this. I haven't tested the
booke_wdt ones though.
On 32-bit we were doing the CONFIG_CMDLINE logic twice, so don't.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently early_xmon() calls directly into debugger() if xmon=early is passed.
This ties the invocation of early xmon to the location of parse_early_param(),
which might change.
Tested on P5 LPAR and F50.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Use the existence of RTAS device tree node to determine if
/proc/rtas. /proc/ppc64/rtas are to be created. Using machine type
is not reliable (i.e. Maple-like machines may have RTAS).
Signed-off-by: Michal Ostrowski <mostrows@watson.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Removed the do-nothing routines __setup_cpu_power3 and
__setup_cpu_power4 and replaced them with a null pointer check
in the caller. Also removed the Cell processor specific
routine __setup_cpu_be which improperly accessed the
hypervisor page size configuration at SPR HID6.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The OF trampoline code prom_init.c still needs to identify IBM pSeries
(PAPR) machines in order to run some platform specific code on them like
instanciating the TCE tables. The code doing that detection was changed
recently in 2.6.17 early stages but was done slightly incorrectly. It
should be testing for an exact match of "chrp" and it currently tests
for anything that begins with "chrp". That means it will incorrectly
match with platforms using Maple-like device-trees and have open
firmware. This fixes it by using strcmp instead of strncmp to match what
the actual platform detection code does.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Allow boards to provide a panic callback on ppc32. Moved the code to sets
this up into setup-common.c so its shared between ppc32 & ppc64. Also moved
do_init_bootmem prototype into setup.h.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Forthcoming IBM machines will have a "ibm,pa-features" property on CPU
nodes, that contains bits indicating which optional architecture
features are implemented by the CPU. This adds code to use the
property, if present, to update our CPU feature bitmaps. Note that
this means we can both set and clear feature bits based on what
the firmware tells us.
This is based on a patch by Will Schmidt <willschm@us.ibm.com>.
Signed-off-by: Paul Mackerras <paulus@samba.org>
We currently single-step inline if the instruction on which a kprobe is
inserted is a trap variant.
- variants (such as tdnei, used by BUG()) typically evaluate a condition
and cause a trap only if the condition is satisfied.
- kprobes uses the unconditional "trap" (0x7fe00008) and single-stepping
again on this instruction, resulting in another trap without
evaluating the condition is obviously incorrect.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* 'audit.b10' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
[PATCH] Audit Filter Performance
[PATCH] Rework of IPC auditing
[PATCH] More user space subject labels
[PATCH] Reworked patch for labels on user space messages
[PATCH] change lspp ipc auditing
[PATCH] audit inode patch
[PATCH] support for context based audit filtering, part 2
[PATCH] support for context based audit filtering
[PATCH] no need to wank with task_lock() and pinning task down in audit_syscall_exit()
[PATCH] drop task argument of audit_syscall_{entry,exit}
[PATCH] drop gfp_mask in audit_log_exit()
[PATCH] move call of audit_free() into do_exit()
[PATCH] sockaddr patch
[PATCH] deal with deadlocks in audit_free()
Change of_node_to_nid() to traverse the device tree, looking for a numa id.
Cell uses this to assign ids to SPUs, which are children of the CPU node.
Existing users of of_node_to_nid() are altered to use of_node_to_nid_single(),
which doesn't do the traversal.
Export an attach_sysdev_to_node() function, allowing system devices (eg.
SPUs) to link themselves into the numa topology in sysfs.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a cosmetic fixup. When printing the nvram partition table, the
first couple entries have a shorter 'index' value than the others, so
table is a bit askew. This change makes the table look pretty.
Tested on pseries and g5. Footnote: yes, this table is normally hidden
behind a DEBUG_NVRAM #define.
Signed-off-by: Will Schmidt <willschm@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A couple of minor renames:
* The iommu_table is no longer a part of the device node structure,
so devnode_table is misleading
* Rename struct device *-variables to hwdev
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This requires the compatible properties having vaules that are empty
strings instead of just being empty properties.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
As an added bonus, since every vio_dev now has a device_node
associated with it, hotplug now works.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add a cputable entry for the POWER6 processor.
The SIHV and SIPR bits in the mmcra have moved in POWER6, so disable
support for that until oprofile is fixed.
Also tell firmware that we know about POWER6.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Wire up *at syscalls.
This patch has been tested on ppc64 (using glibc's testsuite, both 32bit
and 64bit), and compile-tested for ppc32 (I have currently no ppc32 system
available, but I expect no problems).
Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Some people report that we die on some Macs when we are expecting to
catch machine checks after poking at some random I/O address. I'd seen
it happen on my dual G4 with serial ports until we fixed those to use
OF, but now other users are reporting it with i8042.
This expands the use of check_legacy_ioport() to avoid that situation
even on 32-bit kernels.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Normally, ppc64 module .ko files contain a table-of-contents (.toc)
section, but if the module doesn't reference any static or external
data or external procedures, it is possible for gcc/binutils to
generate a .ko that doesn't have a .toc. Currently the module
loader refuses to load such a module, since it needs the address
of the .toc section to use in relocations.
This patch fixes the problem by using the address of the .stubs
section instead, which is an acceptable substitute in this situation.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds code to call a new firmware method to tell the firmware
what machines and capabilities (such as VMX/Altivec) we support.
This will be needed on POWER5+ and POWER6 machines, and it has no
effect on past and current machines.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Few of the notifier_chain_register() callers use __init in the definition
of notifier_call. It is incorrect as the function definition should be
available after the initializations (they do not unregister them during
initializations).
This patch fixes all such usages to _not_ have the notifier_call __init
section.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Few of the notifier_chain_register() callers use __devinitdata in the
definition of notifier_block data structure. It is incorrect as the
data structure should be available after the initializations (they do
not unregister them during initializations).
This was leading to an oops when notifier_chain_register() call is
invoked for those callback chains after initialization.
This patch fixes all such usages to _not_ have the notifier_block data
structure in the init data section.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
sys_splice() moves data to/from pipes with a file input/output. sys_vmsplice()
moves data to a pipe, with the input being a user address range instead.
This uses an approach suggested by Linus, where we can hold partial ranges
inside the pages[] map. Hopefully this will be useful for network
receive support as well.
Signed-off-by: Jens Axboe <axboe@suse.de>
Not even the iSeries maintainer seems to have access to this legendary
piranha simulator. It adds a bit of ugliness in the common time init
code, and if it's no longer used we might as well be done with it and
remove the bloat.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Quiet some of the more debug related output from the pci probe routines.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Move time_init console output to KERN_DEBUG prink level. No need to
print it at every boot.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cleanup patch which removes the io_page_mask. It fixes the reset on
some e1000 devices which is needed for clean kexec reboots. The legacy
devices which broke with this patch (parallel port and PC speaker) have
now been fixed in Linus' tree.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We've seen several bugs caused by interrupt weirdness in the kdump kernel.
Panicking from an interrupt handler means we fail to EOI the interrupt, and
so the second kernel never gets that interrupt ever again. We also see hangs
on JS20 where we take interrupts in the second kernel early during boot.
This patch fixes both those problems, and although it adds more code to the
crash path I think it is the best solution.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Every time a new syscall gets added, a BUILD_BUG_ON in
arch/powerpc/platforms/cell/spu_callbacks.c gets triggered.
Since the addition of a new syscall is rather harmless,
the error should just be removed.
While we're here, add sys_tee to the list and add a comment
to systbl.S to remind people that there is another list
on powerpc.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Turn on the DART already at 1GB. This is needed because of crippled
devices in some systems, i.e. Airport Extreme cards, only supporting
30-bit DMA addresses.
Otherwise, users with between 1 and 2GB of memory will need to manually
enable it with iommu=force, and that's no good.
Some simple performance tests show that there's a slight impact of
enabling DART, but it's in the 1-3% range (kernel build with disk I/O
as well as over NFS).
iommu=off can still be used for those who don't want to deal with the
overhead (and don't need it for any devices).
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Some devices don't support full 32-bit DMA address space, which we currently
assume. Add the required mask-passing to the IOMMU allocators.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Andrew Morton pointed out that compiler might not inline the functions
marked for inline in kprobes. There-by allowing the insertion of probes
on these kprobes routines, which might cause recursion.
This patch removes all such inline and adds them to kprobes section
there by disallowing probes on all such routines. Some of the routines
can even still be inlined, since these routines gets executed after the
kprobes had done necessay setup for reentrancy.
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-merge:
powerpc: Use correct sequence for putting CPU into nap mode
[PATCH] spufs: fix context-switch decrementer code
[PATCH] powerpc32: Set cpu explicitly in kernel compiles
[PATCH] powerpc/pseries: bugfix: balance calls to pci_device_put
[PATCH] powerpc: Fix machine detection in prom_init.c
[PATCH] ppc32: Fix string comparing in platform_notify_map
[PATCH] powerpc: Avoid __initcall warnings
[PATCH] powerpc: Ensure runlatch is off in the idle loop
powerpc: Fix CHRP booting - needs a define_machine call
powerpc: iSeries has only 256 IRQs
We weren't using the recommended sequence for putting the CPU into
nap mode. When I changed the idle loop, for some reason 7447A cpus
started hanging when we put them into nap mode. Changing to the
recommended sequence fixes that.
The complexity here is that the recommended sequence is a loop that
keeps putting the cpu back into nap mode. Clearly we need some way
to break out of the loop when an interrupt (external interrupt,
decrementer, performance monitor) occurs. Here we use a bit in
the thread_info struct to indicate that we need this, and the exception
entry code notices this and arranges for the exception to return
to the value in the link register, thus breaking out of the loop.
We use a new `local_flags' field in the thread_info which we can
alter without needing to use an atomic update sequence.
The PPC970 has the same recommended sequence, so we do the same thing
there too.
This also fixes a bug in the kernel stack overflow handling code on
32-bit, since it was causing a value that we needed in a register to
get trashed.
Signed-off-by: Paul Mackerras <paulus@samba.org>
In e8222502ee the detection of machine types
in prom_init broke for some machines. We should be checking /device_type
instead of /model. This should make Power3 and Power4 boot again. Haven't
been able to test this. We also need to relocate before comparing.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fix __initcall return in proc_rtas_init and rtas_init.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since external and decrementer interrupts set the runlatch on, we need
to ensure its set off again in the idle loop. At the moment we dont turn
it off in the inner loop.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Basically an in-kernel implementation of tee, which uses splice and the
pipe buffers as an intelligent way to pass data around by reference.
Where the user space tee consumes the input and produces a stdout and
file output, this syscall merely duplicates the data inside a pipe to
another pipe. No data is copied, the output just grabs a reference to the
input pipe data.
Signed-off-by: Jens Axboe <axboe@suse.de>
The iSeries Hypervisor only allows us to specify IRQ numbers up to 255 (it
has a u8 field to pass it in). This patch allows platforms to specify a
maximum to the virtual IRQ numbers we will use and has iSeries set that
to 255. If not set, the maximum is NR_IRQS - 1 (as before).
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Since the powerpc 64k pages patch went in, systems that have SLBs
(like Power4 iSeries) needed to have slb_initialize called to set up
some variables for the SLB miss handler. This was not being called
on the boot processor on iSeries, so on single cpu iSeries machines,
we would get apparent memory curruption as soon as we entered user mode.
This patch fixes that by calling slb_initialize on the boot cpu if the
processor has an SLB.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes several problems with the lparcfg code. In case
someone gets a sense of deja-vu, part of this was submitted last Sep, I
thought the changes went in, but either got backed out, or just got
lost.
First, change the local_buffer declaration to be unsigned char *. We
had a bad-math problem in a 2.4 tree which was built with a
"-fsigned-char" parm. I dont believe we ever build with that parm
now-a-days, but to be safe, I'd prefer the declaration be explicit.
Second, fix a bad math calculation for splpar_strlen.
Third, on the rtas_call for get-system-parameter, pass in
RTAS_DATA_BUF_SIZE for the rtas_data_buf size, instead of letting random
data determine the size. Until recently, we've had a sufficiently
large 'random data' value get passed in, so the function just happens to
have worked OK. Now it's getting passed a '0', which causes the
rtas_call to return success, but no data shows up in the buffer.
(oops!). This was found by the LTC test org.
This is in a branch of code that only gets run on SPLPAR systems.
Tested on power5 Lpar.
Signed-off-by: Will Schmidt <willschm@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Also cleans up some nearby whitespace problems.
Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The only user of get_wchan is the proc fs - and proc can't be built modular.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The boot cmdline is parsed in parse_early_param() and
parse_args(,unknown_bootoption).
And __setup() is used in obsolete_checksetup().
start_kernel()
-> parse_args()
-> unknown_bootoption()
-> obsolete_checksetup()
If __setup()'s callback (->setup_func()) returns 1 in
obsolete_checksetup(), obsolete_checksetup() thinks a parameter was
handled.
If ->setup_func() returns 0, obsolete_checksetup() tries other
->setup_func(). If all ->setup_func() that matched a parameter returns 0,
a parameter is seted to argv_init[].
Then, when runing /sbin/init or init=app, argv_init[] is passed to the app.
If the app doesn't ignore those arguments, it will warning and exit.
This patch fixes a wrong usage of it, however fixes obvious one only.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Mark unwind info for signal trampolines using the new S augmentation flag
introduced in: http://gcc.gnu.org/PR26208.
GCC 4.2 (or patched earlier GCC) will be able to special case unwinding
through frames right above signal trampolines. As the augmentations start
with z flag and S is at the very end of the augmentation string, older GCCs
will just skip the S flag as unknown (that's why an augmentation flag was
chosen over say a new CFA opcode).
Signed-off-by: Jakub Jelinek <jakub@redhat.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Removed platform_init usage on 83xx and 85xx and use define_machine and
probe(). For now we always return true in the problem since you can only
build for one specific board at a time. This is an artificial constraint.
When we get ride of it we will need to update the Kconfig's for these
sub-arch's and make the board's probe() functions actually do something.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
for_each_cpu() actually iterates across all possible CPUs. We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs. This is inefficient and
possibly buggy.
We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.
This patch replaces for_each_cpu with for_each_possible_cpu.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This removes statically assigned platform numbers and reworks the
powerpc platform probe code to use a better mechanism. With this,
board support files can simply declare a new machine type with a
macro, and implement a probe() function that uses the flattened
device-tree to detect if they apply for a given machine.
We now have a machine_is() macro that replaces the comparisons of
_machine with the various PLATFORM_* constants. This commit also
changes various drivers to use the new macro instead of looking at
_machine.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We used to assume that a DMA mapping request with a NULL dev was for
ISA DMA. This assumption was broken at some point. Now we explicitly
pass the detected ISA PCI device in the floppy setup.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Non zero initcalls (except for -ENODEV) have started warning at boot.
Fix smt_setup and init_ras_IRQ.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A bug in the RTAS services incorrectly interprets some bits in the CR
when called from the OS. Specifically, bits in CR4. The result could
be a firmware crash that also takes down the partition. A firmware
fix is in the works. We have seen this situation when performing DLPAR
operations. As a temporary workaround, clear the CR in enter_rtas().
Note that enter_rtas() will not set any bits in CR4 before calling RTAS.
Also note that the 32 bit version of enter_rtas() should have the same
work around even though the chances of hitting the bug are much smaller
due to the lack of DLPAR on 32 bit kernels. However, my assembly skills
are a bit rusty and the 32 bit code doesn't seem to follow the conventions
for where things should be saved. In addition, I don't have a system
to test 32 bit kernels. Help creating and at least touch testing the
same workaround for 32 bit would be appreciated.
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
*) When setting a sighandler using sigaction() call, if the flag
SA_ONSTACK is set and no alternate stack is provided via sigaltstack(),
the kernel still try to install the alternate stack. This behavior is
the opposite of the one which is documented in Single Unix
Specifications V3.
*) Also when setting an alternate stack using sigaltstack() with the
flag SS_DISABLE, the kernel try to install the alternate stack on
signal delivery.
These two use cases makes the process crash at signal delivery.
This fixes it.
Signed-off-by: Laurent Meyer <meyerlau@fr.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We need to export ppc64_firmware_features for modules. Before we do that
I think we should probably rename it to powerpc_firmware_features.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Export validate_sp so we can use it in the oprofile calltrace code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes a mistake I made when editing these functions - when I
took out the interrupt disabling code (because interrupts are now
disabled by the caller) I left the register that is used for the MSR
value to be used during doze/nap uninitialized. This fixes it.
Also updated some of the comments in idle_power4.S and removed some
code that was copied over from idle_6xx.S but is no longer relevant
(we don't ever clear the CPU_FTR_CAN_NAP bit at runtime for POWER4).
Signed-off-by: Paul Mackerras <paulus@samba.org>
The kernel's implementation of notifier chains is unsafe. There is no
protection against entries being added to or removed from a chain while the
chain is in use. The issues were discussed in this thread:
http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2
We noticed that notifier chains in the kernel fall into two basic usage
classes:
"Blocking" chains are always called from a process context
and the callout routines are allowed to sleep;
"Atomic" chains can be called from an atomic context and
the callout routines are not allowed to sleep.
We decided to codify this distinction and make it part of the API. Therefore
this set of patches introduces three new, parallel APIs: one for blocking
notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
really just the old API under a new name). New kinds of data structures are
used for the heads of the chains, and new routines are defined for
registration, unregistration, and calling a chain. The three APIs are
explained in include/linux/notifier.h and their implementation is in
kernel/sys.c.
With atomic and blocking chains, the implementation guarantees that the chain
links will not be corrupted and that chain callers will not get messed up by
entries being added or removed. For raw chains the implementation provides no
guarantees at all; users of this API must provide their own protections. (The
idea was that situations may come up where the assumptions of the atomic and
blocking APIs are not appropriate, so it should be possible for users to
handle these things in their own way.)
There are some limitations, which should not be too hard to live with. For
atomic/blocking chains, registration and unregistration must always be done in
a process context since the chain is protected by a mutex/rwsem. Also, a
callout routine for a non-raw chain must not try to register or unregister
entries on its own chain. (This did happen in a couple of places and the code
had to be changed to avoid it.)
Since atomic chains may be called from within an NMI handler, they cannot use
spinlocks for synchronization. Instead we use RCU. The overhead falls almost
entirely in the unregister routine, which is okay since unregistration is much
less frequent that calling a chain.
Here is the list of chains that we adjusted and their classifications. None
of them use the raw API, so for the moment it is only a placeholder.
ATOMIC CHAINS
-------------
arch/i386/kernel/traps.c: i386die_chain
arch/ia64/kernel/traps.c: ia64die_chain
arch/powerpc/kernel/traps.c: powerpc_die_chain
arch/sparc64/kernel/traps.c: sparc64die_chain
arch/x86_64/kernel/traps.c: die_chain
drivers/char/ipmi/ipmi_si_intf.c: xaction_notifier_list
kernel/panic.c: panic_notifier_list
kernel/profile.c: task_free_notifier
net/bluetooth/hci_core.c: hci_notifier
net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_chain
net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_expect_chain
net/ipv6/addrconf.c: inet6addr_chain
net/netfilter/nf_conntrack_core.c: nf_conntrack_chain
net/netfilter/nf_conntrack_core.c: nf_conntrack_expect_chain
net/netlink/af_netlink.c: netlink_chain
BLOCKING CHAINS
---------------
arch/powerpc/platforms/pseries/reconfig.c: pSeries_reconfig_chain
arch/s390/kernel/process.c: idle_chain
arch/x86_64/kernel/process.c idle_notifier
drivers/base/memory.c: memory_chain
drivers/cpufreq/cpufreq.c cpufreq_policy_notifier_list
drivers/cpufreq/cpufreq.c cpufreq_transition_notifier_list
drivers/macintosh/adb.c: adb_client_list
drivers/macintosh/via-pmu.c sleep_notifier_list
drivers/macintosh/via-pmu68k.c sleep_notifier_list
drivers/macintosh/windfarm_core.c wf_client_list
drivers/usb/core/notify.c usb_notifier_list
drivers/video/fbmem.c fb_notifier_list
kernel/cpu.c cpu_chain
kernel/module.c module_notify_list
kernel/profile.c munmap_notifier
kernel/profile.c task_exit_notifier
kernel/sys.c reboot_notifier_list
net/core/dev.c netdev_chain
net/decnet/dn_dev.c: dnaddr_chain
net/ipv4/devinet.c: inetaddr_chain
It's possible that some of these classifications are wrong. If they are,
please let us know or submit a patch to fix them. Note that any chain that
gets called very frequently should be atomic, because the rwsem read-locking
used for blocking chains is very likely to incur cache misses on SMP systems.
(However, if the chain's callout routines may sleep then the chain cannot be
atomic.)
The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
material written by Keith Owens and suggestions from Paul McKenney and Andrew
Morton.
[jes@sgi.com: restructure the notifier chain initialization macros]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
... and rename it to module_32.c since it is the 32-bit version.
The 32-bit and 64-bit ABIs are sufficiently different that having
a merged version isn't really practical.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Also renamed temp.c to tau_6xx.c (for thermal assist unit) and updated
the Kconfig option description and help text for CONFIG_TAU.
Signed-off-by: Paul Mackerras <paulus@samba.org>
No functional changes, but call it l2cr_6xx.S since it is specific
to 6xx-family (including G3/750 and G4/74xx) processors.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This unifies the 32-bit (ARCH=ppc and ARCH=powerpc) and 64-bit idle
loops. It brings over the concept of having a ppc_md.power_save
function from 32-bit to ARCH=powerpc, which lets us get rid of
native_idle(). With this we will also be able to simplify the idle
handling for pSeries and cell.
Signed-off-by: Paul Mackerras <paulus@samba.org>
We only ever execute the loop once, so let's move it to a function
making it more readable. Cleanup patch, no functional change.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We currently have a hack to flip the boot cpu and its secondary thread
to logical cpuid 0 and 1. This means the logical - physical mapping will
differ depending on which cpu is boot cpu. This is most apparent on
kexec, where we might kexec on any cpu and therefore change the mapping
from boot to boot.
The patch below does a first pass early on to work out the logical cpuid
of the boot thread. We then fix up some paca structures to match.
Ive also removed the boot_cpuid_phys variable for ppc64, to be
consistent we use get_hard_smp_processor_id(boot_cpuid) everywhere.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Do not call prom exit prom_panic. It clears the screen and the exit
message is lost.
On some (or all?) pmacs it causes another crash when OF tries to print
the date and time in its banner.
Set of_platform earlier to catch more prom_panic() calls.
Signed-off-by: Olaf Hering <olh@suse.de>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The address of variable val in prom_init_stdout is passed to prom_getprop.
prom_getprop casts the pointer to u32 and passes it to call_prom in the hope
that OpenFirmware stores something there.
But the pointer is truncated in the lower bits and the expected value is
stored somewhere else.
In my testing I had a stackpointer of 0x0023e6b4. val was at offset 120,
wich has address 0x0023e72c. But the value passed to OF was 0x0023e728.
c00000000040b710: 3b 01 00 78 addi r24,r1,120
...
c00000000040b754: 57 08 00 38 rlwinm r8,r24,0,0,28
...
c00000000040b784: 80 01 00 78 lwz r0,120(r1)
...
c00000000040b798: 90 1b 00 0c stw r0,12(r27)
...
The stackpointer came from 32bit code.
The chain was yaboot -> zImage -> vmlinux
PowerMac OpenFirmware does appearently not handle the ELF sections
correctly. If yaboot was compiled in
/usr/src/packages/BUILD/lilo-10.1.1/yaboot, then the stackpointer is
unaligned. But the stackpointer is correct if yaboot is compiled in
/tmp/yaboot.
This bug triggered since 2.6.15, now prom_getprop is an inline
function. gcc clears the lower bits, instead of just clearing the
upper 32 bits.
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
powerpc currently declares some of its own system calls
in <asm/unistd.h>, but not all of them. That place also
contains remainders of the now almost unused kernel syscall
hack.
- Add a new <asm/syscalls.h> with clean declarations
- Include that file from every source that implements one
of these
- Get rid of old declarations in <asm/unistd.h>
This patch is required as a base for implementing system
calls from an SPU, but also makes sense as a general
cleanup.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Provide proper kprobes fault handling, if a user-specified pre/post handlers
tries to access user address space, through copy_from_user(), get_user() etc.
The user-specified fault handler gets called only if the fault occurs while
executing user-specified handlers. In such a case user-specified handler is
allowed to fix it first, later if the user-specifed fault handler does not fix
it, we try to fix it by calling fix_exception().
The user-specified handler will not be called if the fault happens when single
stepping the original instruction, instead we reset the current probe and
allow the system page fault handler to fix it up.
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently kprobe handler traps only happen in kernel space, so function
kprobe_exceptions_notify should skip traps which happen in user space.
This patch modifies this, and it is based on 2.6.16-rc4.
Signed-off-by: bibo mao <bibo.mao@intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>
Cc: <hiramatu@sdl.hitachi.co.jp>
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When kretprobe probes the schedule() function, if the probed process exits
then schedule() will never return, so some kretprobe instances will never
be recycled.
In this patch the parent process will recycle retprobe instances of the
probed function and there will be no memory leak of kretprobe instances.
Signed-off-by: bibo mao <bibo.mao@intel.com>
Cc: Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Create compat_sys_adjtimex and use it an all appropriate places.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We had a copy of the compatibility version of struct timex in each 64 bit
architecture. This patch just creates a global one and replaces all the
usages of the old ones.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When we stop allocating percpu memory for not-possible CPUs we must not touch
the percpu data for not-possible CPUs at all. The correct way of doing this
is to test cpu_possible() or to use for_each_cpu().
This patch is a kernel-wide sweep of all instances of NR_CPUS. I found very
few instances of this bug, if any. But the patch converts lots of open-coded
test to use the preferred helper macros.
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Christian Zankel <chris@zankel.net>
Cc: Philippe Elie <phil.el@wanadoo.fr>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Jens Axboe <axboe@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Semaphore to mutex conversion.
The conversion was generated via scripts, and the result was validated
automatically via a script as well.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove the assumption that driver_register() returns the number of devices
bound to the driver. In fact, it returns zero for success or a negative
error value.
Nobody uses the return value of of_register_driver() anyway.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
In mm_init_ppc64() we calculate the location of the "IO hole", but then
no one ever looks at the value. So don't bother.
That's actually all mm_init_ppc64() does, so get rid of it too.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The dynamic add path for PCI Host Bridges can fail to configure children
adapters under P5IOC controllers. It fails to properly fixup bus/device
resources, and it fails to properly enable EEH. Both of these steps
need to occur before any children devices are enabled in
pci_bus_add_devices().
Signed-off-by: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
remove warnings when building a 64bit kernel.
smp_call_function triggers also with 32bit kernel.
WARNING: vmlinux: duplicate symbol 'smp_call_function' previous definition was in vmlinux
arch/powerpc/kernel/ppc_ksyms.c:164:EXPORT_SYMBOL(smp_call_function);
arch/powerpc/kernel/smp.c:300:EXPORT_SYMBOL(smp_call_function);
WARNING: vmlinux: duplicate symbol 'ioremap' previous definition was in vmlinux
arch/powerpc/kernel/ppc_ksyms.c:113:EXPORT_SYMBOL(ioremap);
arch/powerpc/mm/pgtable_64.c:321:EXPORT_SYMBOL(ioremap);
WARNING: vmlinux: duplicate symbol '__ioremap' previous definition was in vmlinux
arch/powerpc/kernel/ppc_ksyms.c:117:EXPORT_SYMBOL(__ioremap);
arch/powerpc/mm/pgtable_64.c:322:EXPORT_SYMBOL(__ioremap);
WARNING: vmlinux: duplicate symbol 'iounmap' previous definition was in vmlinux
arch/powerpc/kernel/ppc_ksyms.c:118:EXPORT_SYMBOL(iounmap);
arch/powerpc/mm/pgtable_64.c:323:EXPORT_SYMBOL(iounmap);
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We should be memset'ing the data we are pointing to, not the pointer
itself. This is in an error path so we probably don't hit it much.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The recent changes to keep gettimeofday in sync with xtime had the side
effect that it was occasionally possible for the time reported by
gettimeofday to go back by a microsecond. There were two reasons:
(1) when we recalculated the offsets used by gettimeofday every 2^31
timebase ticks, we lost an accumulated fractional microsecond, and
(2) because the update is done some time after the notional start of
jiffy, if ntp is slowing the clock, it is possible to see time go backwards
when the timebase factor gets reduced.
This fixes it by (a) slowing the gettimeofday clock by about 1us in
2^31 timebase ticks (a factor of less than 1 in 3.7 million), and (b)
adjusting the timebase offsets in the rare case that the gettimeofday
result could possibly go backwards (i.e. when ntp is slowing the clock
and the timer interrupt is late). In this case the adjustment will
reduce to zero eventually because of (a).
Signed-off-by: Paul Mackerras <paulus@samba.org>
The current pcspkr code combines the device and driver registration.
This patch splits these, putting the device registration in the arch
specific code.
PowerPC and MIPS only have the pcspkr present sometimes.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
A careful reading of the recent changes to the system call entry/exit
paths revealed several problems, plus some things that could be
simplified and improved:
* 32-bit wasn't testing the _TIF_NOERROR bit in the syscall fast exit
path, so it was only doing anything with it once it saw some other
bit being set. In other words, the noerror behaviour would apply to
the next system call where we had to reschedule or deliver a signal,
which is not necessarily the current system call.
* 32-bit wasn't doing the call to ptrace_notify in the syscall exit
path when the _TIF_SINGLESTEP bit was set.
* _TIF_RESTOREALL was in both _TIF_USER_WORK_MASK and
_TIF_PERSYSCALL_MASK, which is odd since _TIF_RESTOREALL is only set
by system calls. I took it out of _TIF_USER_WORK_MASK.
* On 64-bit, _TIF_RESTOREALL wasn't causing the non-volatile registers
to be restored (unless perhaps a signal was delivered or the syscall
was traced or single-stepped). Thus the non-volatile registers
weren't restored on exit from a signal handler. We probably got
away with it mostly because signal handlers written in C wouldn't
alter the non-volatile registers.
* On 32-bit I simplified the code and made it more like 64-bit by
making the syscall exit path jump to ret_from_except to handle
preemption and signal delivery.
* 32-bit was calling do_signal unnecessarily when _TIF_RESTOREALL was
set - but I think because of that 32-bit was actually restoring the
non-volatile registers on exit from a signal handler.
* I changed the order of enabling interrupts and saving the
non-volatile registers before calling do_syscall_trace_leave; now we
enable interrupts first.
Signed-off-by: Paul Mackerras <paulus@samba.org>
yaboot is scrogged and calls us with an invalid stack alignment,
it seems.
Thanks to David Woodhouse to pointing me to the problem.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On Thu, 2006-03-02 at 19:55 +0100, Olaf Hering wrote:
> My iBook1 has 2 memory regions in reg. Depending on how I boot it
> (vmlinux+initrd) or zImage.initrd, it will not boot with current Linus
> tree.
> rmo_top should be 160MB instead of 32MB.
On logically-partitioned machines the first element of the reg
property in the memory node is defined to be the "RMO" region,
i.e. the memory that the processor can access in real mode. On other
machines the first element has no special meaning, so only take it to
be the RMO region on LPAR machines.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch makes userland aware of the icache snoop capability of the
POWER5 (and possibly others in the future) and of SMT capabilities.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On 32-bit, the exception prolog for the program check exception doesn't
enable interrupts early on. If it is an illegal instruction exception,
we read the instruction in order to emulate certain instructions, and
the get_user of the instruction triggers a WARN_ON since interrupts
are still disabled. This adds a local_irq_enable() to enable
interrupts before reading the instruction.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds mm->task_size to keep track of the task size of a given mm
and uses that to fix the powerpc vdso so that it uses the mm task size to
decide what pages to fault in instead of the current thread flags (which
broke when ptracing).
(akpm: I expect that mm_struct.task_size will become the way in which we
finally sort out the confusion between 32-bit processes and 32-bit mm's. It
may need tweaks, but at this stage this patch is powerpc-only.)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A bug in the assembly code of the vdso can cause gettimeofday() to hang
or to return incorrect results. The wrong register was used to test for
pending updates of the calibration variables and to create a dependency
for subsequent loads. This fixes it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The inline cputime_to_foo and foo_to_cputime conversion functions in
include/asm-powerpc/cputime.h refer to 5 variables, which need to be
exported if those functions are to be usable from modules.
Signed-off-by: Paul Mackerras <paulus@samba.org>
mem= command line option was being ignored in arch/powerpc if we were not
a CONFIG_MULTIPLATFORM (which is handled via prom_init stub). The initial
command line extraction and parsing needed to be moved earlier in the boot
process and have code to actual parse mem= and do something about it.
Also, fixed a compile warning in the file.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This implements accurate task and cpu time accounting for 64-bit
powerpc kernels. Instead of accounting a whole jiffy of time to a
task on a timer interrupt because that task happened to be running at
the time, we now account time in units of timebase ticks according to
the actual time spent by the task in user mode and kernel mode. We
also count the time spent processing hardware and software interrupts
accurately. This is conditional on CONFIG_VIRT_CPU_ACCOUNTING. If
that is not set, we do tick-based approximate accounting as before.
To get this accurate information, we read either the PURR (processor
utilization of resources register) on POWER5 machines, or the timebase
on other machines on
* each entry to the kernel from usermode
* each exit to usermode
* transitions between process context, hard irq context and soft irq
context in kernel mode
* context switches.
On POWER5 systems with shared-processor logical partitioning we also
read both the PURR and the timebase at each timer interrupt and
context switch in order to determine how much time has been taken by
the hypervisor to run other partitions ("steal" time). Unfortunately,
since we need values of the PURR on both threads at the same time to
accurately calculate the steal time, and since we can only calculate
steal time on a per-core basis, the apportioning of the steal time
between idle time (time which we ceded to the hypervisor in the idle
loop) and actual stolen time is somewhat approximate at the moment.
This is all based quite heavily on what s390 does, and it uses the
generic interfaces that were added by the s390 developers,
i.e. account_system_time(), account_user_time(), etc.
This patch doesn't add any new interfaces between the kernel and
userspace, and doesn't change the units in which time is reported to
userspace by things such as /proc/stat, /proc/<pid>/stat, getrusage(),
times(), etc. Internally the various task and cpu times are stored in
timebase units, but they are converted to USER_HZ units (1/100th of a
second) when reported to userspace. Some precision is therefore lost
but there should not be any accumulating error, since the internal
accumulation is at full precision.
Signed-off-by: Paul Mackerras <paulus@samba.org>
HMT support is currently broken and needs to be reworked to play nicely
with the SMT scheduler. Remove the bit rotten bits for the time being.
I also updated an incorrect comment, we enter __secondary_hold with the
physical cpu id in r3.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The runlatch SPR can take a lot of time to write. My original runlatch
code would set it on every exception entry even though most of the time
this was not required. It would also continually set it in the idle
loop, which is an issue on an SMT capable processor.
Now we cache the runlatch value in a threadinfo bit, and only check for
it in decrementer and hardware interrupt exceptions as well as the idle
loop. Boot on POWER3, POWER5 and iseries, and compile tested on pmac32.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
altivec_unavailable_exception is called without setting r3... it looks like
the r3 that actually gets passed in as struct pt_regs *regs is the
undisturbed value of r3 at the time the altivec instruction was encountered.
The user actually gets to choose the pt_regs printed in the Oops!
This fixes the oops by passing the correct pt_regs pointer to
altivec_unavailable_exception.
Signed-off-by: Alan Curry <pacman@TheWorld.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The panic CPU is waiting forever due to some large timeout value if some
CPU is not responding to an IPI.
This patch fixes the problem - the maximum waiting period will be
10 seconds and then the kdump boot will go ahead.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
For kexec we need to know the size of the MMU hash table.
Currently we calculate the size once in the htab code, and then twice more in
the kexec code, once using htab_hash_mask and once using ppc64_pft_size.
On some machines the ppc64_pft_size calculation is broken because
ppc64_pft_size is not set.
So we need to fix the second calculation, but better still we should just
calculate the size once and use it everywhere else.
Tested on Power5 LPAR, Power4 non-LPAR and Power3.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
For UP to SMP kexec to work we need to jump into pSeries_secondary_smp_init
event on a UP + KEXEC kernel. The secondary cpus will not find their hw_cpu_id
in the paca and so they'll jump into kexec_wait, ready for a kexec.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Because smp_release_cpus() is built for SMP || KEXEC, it's not safe to
unconditionally call it from setup_system(). On a UP && KEXEC kernel we'll
start up the secondary CPUs which will then go beserk and we die.
Simple fix is to conditionally call smp_release_cpus() in setup_system(). With
that in place we don't need the dummy definition of smp_release_cpus() because
all call sites are #ifdef'ed either SMP or KEXEC.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fallback gracefully when reading /proc/ppc64/lparcfg when the /rtas
device node can't be found.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A few symbols are exported twice, remove them from ppc_ksyms.c
Remove users of sys_ctrler in arch/ppc/
WARNING: vmlinux: duplicate symbol '__delay' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol '__up' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol '__down' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol '__down_interruptible' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'sys_ctrler' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strncat' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strncmp' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strchr' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strrchr' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strnlen' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strpbrk' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'memscan' previous definition was in vmlinux
WARNING: vmlinux: duplicate symbol 'strstr' previous definition was in vmlinux
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes a regression which was introduced by moving ppc32 to use
the same sort of lockless gettimeofday as ppc64 has been using for
some time. This involves getting the timebase and performing some
simple arithmetic to convert it to seconds and microseconds. However,
the factor and offset used there weren't being updated when NTP
varied the tick length using adjtimex. 64-bit didn't notice the
problem because it had a hook in the 32-bit adjtimex compat routine
that attempted to work out what the generic timekeeping code would
do and alter the factor and offset to match. However, that code
was very complex and it wasn't clear that it still matched what the
generic code would do.
Now we use the generic current_tick_length() routine that was recently
added to check that the current tick will be as long as we expect; if
not we recompute the factor and offset. This keeps gettimeofday and
xtime in sync. In addition we check that gettimeofday hasn't got ahead
of xtime on each timer interrupt; if it has, we resync.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch removes all self references and fixes references to files
in the now defunct arch/ppc64 tree. I think this accomplises
everything wanted, though there might be a few references I missed.
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently we have some stuff in firmware.h and kernel/firmware.c that is
#ifdef CONFIG_PPC_PSERIES. Move it all into platforms/pseries.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The e500 core reference manual indicates that isync is required
after mtmsr(DE bit) and mtspr DBCR0. Add isyncs to make the code
conform to the spec.
Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Updated FP unavailable exception to refer to the correct
function in traps.c. head_booke.h was using the old name, KernelFP,
instead of kernel_fp_unavailable_exception.
Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Registers system call for the powerpc architecture.
Signed-off-by: Janak Desai <janak@us.ibm.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
With this, new system calls only have to be wired up in one place
for ARCH=ppc and ARCH=powerpc, rather than 2.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently most callers of lmb_alloc() don't check if it worked or not, if it
ever does weird bad things will probably happen. The few callers who do check
just panic or BUG_ON.
So make lmb_alloc() panic internally, to catch bugs at the source. The few
callers who did check the result no longer need to.
The only caller that did anything interesting with the return result was
careful_allocation(). For it we create __lmb_alloc_base() which _doesn't_ panic
automatically, a little messy, but passable.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
remove extern declarations of pmac_newworld
move pmac_newworld to bss
if there is any "interrupt-controller" device, then it is newworld.
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When a cpu is hotplug-onlined, if we don't set per_cpu(last_jiffy) to
something sane, timer_interrupt will execute its while loop for every
tick missed since the cpu was last online (or since the system was
booted, if we're adding a new cpu). This can cause weird hangs, ssh
sessions dropping, and we can even go xmon if we take a global IPI at
the wrong time.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since 404849bbd2 we've been using
LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked.
This can explode if we take the decrementer interrupt while we're in a module,
because the toc pointer in r2 will be the module's toc pointer.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Checking a pointer for NULL before passing it to kfree is pointless, kfree
does its own NULL checking of input.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
arch/powerpc/kernel/udbg_16550.c: In function `udbg_init_maple_realmode':
arch/powerpc/kernel/udbg_16550.c:162: warning: assignment from incompatible pointer type
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>