Addition to stop_wq needs to happen before adding to the runqeueue and
under the same lock so that we don't have a race window for a lost
wake up in the spu scheduler.
Signed-off-by: Luke Browning <lukebrowning@us.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
For quite a while now spu state is protected by a simple mutex instead
of the old rw_semaphore, and this means we can simplify the locking
around spu_setup_isolated a lot.
Instead of doing an spu_release before entering spu_setup_isolated and
then calling the complicated spu_acquire_exclusive we can now simply
enter the function locked an in guaranteed runnable state, so that the
only bit of spu_acquire_exclusive that's left is the call to
spu_unmap_mappings.
Similarly there's no more need to unlock and reacquire the state_mutex
when spu_setup_isolated is done, but we can always return with the
lock held and only drop it in spu_run_init in the failure case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
A single context should only be woken once, and we should not have
more wakeups for a given priority than the number of contexts on
that runqueue position.
Also add some asserts to trap future problems in this area more
easily.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
set_bit does not guarantee ordering on powerpc, so using it
for communication between threads requires explicit
mb() calls.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
To not lose a spu thread we need to make sure it always gets put back
on the runqueue. In find_victim aswell as in the scheduler tick as done
in the previous patch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
To not lose a spu thread we need to make sure it always gets put back
on the runqueue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Make sure the pointers to various mappings are cleared once the last
user stopped using them. This avoids accessing freed memory when
tearing down the gang directory aswell as optimizing away
pte invalidations if no one uses these.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
The scheduler workqueue may rearm itself and deadlock when we try to stop
it. Put a flag in place to avoid skip the work if we're tearing down
the context.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Commit 79c8541924 introduced code to move
the initrd if it was in a place where it would get overwritten by the
kernel image. Unfortunately this exposed the fact that the code that
checks whether the values passed in r3 and r4 are intended to indicate
the start address and size of an initrd image was not as thorough as the
kernel's checks. The symptom is that on OF-based platforms, the
bootwrapper can cause an exception which causes the system to drop back
into OF.
Previously it didn't matter so much if the code incorrectly thought that
there was an initrd, since the values for start and size were just passed
through to the kernel. Now the bootwrapper needs to apply the same checks
as the kernel since it is now using the initrd data itself (in the process
of copying it if necessary). This adds the code to do that.
Signed-off-by: Paul Mackerras <paulus@samba.org>
In some cases, multiple OFDT nodes might share the same location code, so
the location code is not a unique identifier for an OFDT node. Changed the
ibmebus probe/remove interface to use the DT path of the device node instead
of the location code.
The DT path must be written into probe/remove right as it would appear in
the "devspec" attribute of the ebus device: relative to the DT root, with a
leading slash and without a trailing slash. One trailing newline will not
hurt; multiple newlines will (like perl's chomp()).
Example:
Add a device "/proc/device-tree/foo@12345678" to ibmebus like this:
echo /foo@12345678 > /sys/bus/ibmebus/probe
Remove the device like this:
echo /foo@12345678 > /sys/bus/ibmebus/remove
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Here's an implementation of DEBUG_PAGEALLOC for 64 bits powerpc.
It applies on top of the 32 bits patch.
Unlike Anton's previous attempt, I'm not using updatepp. I'm removing
the hash entries from the bolted mapping (using a map in RAM of all the
slots). Expensive but it doesn't really matter, does it ? :-)
Memory hot-added doesn't benefit from this unless it's added at an
address that is below end_of_DRAM() as calculated at boot time.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch/powerpc/Kconfig.debug | 2
arch/powerpc/mm/hash_utils_64.c | 84 ++++++++++++++++++++++++++++++++++++++--
2 files changed, 82 insertions(+), 4 deletions(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
Here's an implementation of DEBUG_PAGEALLOC for ppc32. It disables BAT
mapping and is only tested with Hash table based processor though it
shouldn't be too hard to adapt it to others.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch/powerpc/Kconfig.debug | 9 ++++++
arch/powerpc/mm/init_32.c | 4 +++
arch/powerpc/mm/pgtable_32.c | 52 +++++++++++++++++++++++++++++++++++++++
arch/powerpc/mm/ppc_mmu_32.c | 4 ++-
include/asm-powerpc/cacheflush.h | 6 ++++
5 files changed, 74 insertions(+), 1 deletion(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
On hash table based 32 bits powerpc's, the hash management code runs with
a big spinlock. It's thus important that it never causes itself a hash
fault. That code is generally safe (it does memory accesses in real mode
among other things) with the exception of the actual access to the code
itself. That is, the kernel text needs to be accessible without taking
a hash miss exceptions.
This is currently guaranteed by having a BAT register mapping part of the
linear mapping permanently, which includes the kernel text. But this is
not true if using the "nobats" kernel command line option (which can be
useful for debugging) and will not be true when using DEBUG_PAGEALLOC
implemented in a subsequent patch.
This patch fixes this by pre-faulting in the hash table pages that hit
the kernel text, and making sure we never evict such a page under hash
pressure.
Signed-off-by: Benjamin Herrenchmidt <benh@kernel.crashing.org>
arch/powerpc/mm/hash_low_32.S | 22 ++++++++++++++++++++--
arch/powerpc/mm/mem.c | 3 ---
arch/powerpc/mm/mmu_decl.h | 4 ++++
arch/powerpc/mm/pgtable_32.c | 11 +++++++----
4 files changed, 31 insertions(+), 9 deletions(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
The 32 bits map_page() function is used internally by the mm code
for early mmu mappings and for ioremap. It should never be called
for an address that already has a valid PTE or hash entry, so we
add a BUG_ON for that and remove the useless flush_HPTE call.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch/powerpc/mm/pgtable_32.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
The current tlb flush code on powerpc 64 bits has a subtle race since we
lost the page table lock due to the possible faulting in of new PTEs
after a previous one has been removed but before the corresponding hash
entry has been evicted, which can leads to all sort of fatal problems.
This patch reworks the batch code completely. It doesn't use the mmu_gather
stuff anymore. Instead, we use the lazy mmu hooks that were added by the
paravirt code. They have the nice property that the enter/leave lazy mmu
mode pair is always fully contained by the PTE lock for a given range
of PTEs. Thus we can guarantee that all batches are flushed on a given
CPU before it drops that lock.
We also generalize batching for any PTE update that require a flush.
Batching is now enabled on a CPU by arch_enter_lazy_mmu_mode() and
disabled by arch_leave_lazy_mmu_mode(). The code epects that this is
always contained within a PTE lock section so no preemption can happen
and no PTE insertion in that range from another CPU. When batching
is enabled on a CPU, every PTE updates that need a hash flush will
use the batch for that flush.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Make the alignment exception handler use the new _inatomic variants
of __get/put_user. This fixes erroneous warnings in the very rare
cases where we manage to have copy_tofrom_user_inatomic() trigger
an alignment exception.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch/powerpc/kernel/align.c | 56 ++++++++++++++++++++++++--------------------
1 file changed, 31 insertions(+), 25 deletions(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
Those are needed by things like alignment exception fixup handlers
since those can now be triggered by copy_tofrom_user_inatomic.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The firmware assigns irq 20/21 to the VIA IDE device on Pegasos.
But the required interrupt is 14/15.
Maybe someone confused decimal vs. hexadecimal values.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This also fixes a bug where a property value was being modified
in place.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Some drivers have resources that they want to be able to map into
userspace that are 4k in size. On a kernel configured with 64k pages
we currently end up mapping the 4k we want plus another 60k of
physical address space, which could contain anything. This can
introduce security problems, for example in the case of an infiniband
adaptor where the other 60k could contain registers that some other
program is using for its communications.
This patch adds a new function, remap_4k_pfn, which drivers can use to
map a single 4k page to userspace regardless of whether the kernel is
using a 4k or a 64k page size. Like remap_pfn_range, it would
typically be called in a driver's mmap function. It only maps a
single 4k page, which on a 64k page kernel appears replicated 16 times
throughout a 64k page. On a 4k page kernel it reduces to a call to
remap_pfn_range.
The way this works on a 64k kernel is that a new bit, _PAGE_4K_PFN,
gets set on the linux PTE. This alters the way that __hash_page_4K
computes the real address to put in the HPTE. The RPN field of the
linux PTE becomes the 4k RPN directly rather than being interpreted as
a 64k RPN. Since the RPN field is 32 bits, this means that physical
addresses being mapped with remap_4k_pfn have to be below 2^44,
i.e. 0x100000000000.
The patch also factors out the code in arch/powerpc/mm/hash_utils_64.c
that deals with demoting a process to use 4k pages into one function
that gets called in the various different places where we need to do
that. There were some discrepancies between exactly what was done in
the various places, such as a call to spu_flush_all_slbs in one case
but not in others.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is more consistent and gets us closer to the Sparc code.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is more consistent and gets us closer to the Sparc code.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is more consistent and gets us closer to the Sparc code.
We add a device_is_compatible define for compatibility during the
change over.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is more consistent and gets us closer to the Sparc code.
We add a get_property define for compatibility during the change over.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently the buf pointer is advanced too far during each iteration.
Also terminate the string with a newline.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Efika boards have to be booted with console=ttyPSC0 unless there is a
graphics card plugged in. Detect if the firmware stdout is the serial
connector.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Our kernels put everything in the first load segment, and we read that.
Instead of decompressing to the end of the gzip stream or supplied image
and hoping we get it all, decompress the expected size and complain if
it is not available.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Commit a9903811bf missed two uses of the
the .gz suffix in the wrapper script and didn't clean the additonal
possibly cached files.
Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
crt0.S had provisions to provide run address relocaton to got2 and
cache flush, but not on the bss clear or stack pointer load. Apply
the same fixup for them.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The ELF parsing routines local to arch/powerpc/boot/main.c are useful
to other callers therefore move them to their own file.
Signed-off-by: Mark A. Greer <mgreer@mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
72486f1f8f inverted the sense for enabling
hotplug CPU controls without reference to any other architecture other than
i386, ia64 and PowerPC. This left everyone else without hotplug CPU control.
Fix powerpc for this brain damage.
(akpm: patch adapted from rmk's ARM fix. Changelog stolen from rmk)
Signed-off-by: Giuliano Pochini <pochini@shiny.it>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Lifted from http://bugzilla.kernel.org/show_bug.cgi?id=8182
Steps to reproduce:
- Boot an Ocotea board with the mainline 2.6.20.1 kernel.
- Create an /etc/ntp.conf file with at least one NTP server and iburst mode set.
- Issue the command "ntpd -g".
- Wait about two minutes.
- Verify ntpd's status via "ntpq -pn" and by looking in /var/log/ntp.
This fixes this problem by adjusting the expected clock frequency.
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
dt_xlate_reg() uses the ranges properties of a node's parentage to find
the absolute physical address of the node's registers.
The ns16550 driver uses this when no virtual-reg property is found.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
There is no reason to yield the CPU in spu_yield - if the backing
thread reenters spu_run it gets added to the end of the runqueue for
it's priority. So the yield is just a slowdown for the case where
we have higher priority contexts waiting.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
I wanted to enable CBE_THERM on PS3. So I had to enable CBE_RAS first.
But the resulting kernel doesn't link, as cbe_regs.c isn't compiled for
non-PPC_CELL_NATIVE.
CBE_RAS should depend on PPC_CELL_NATIVE; this makes it so.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is now inaccurate because we may not have entered prom_init() and
r3 is overwritten immediately anyway.
Signed-off-by: Sonny Rao <sonny@burdell.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Remove unneeded inclusion of linux/ide.h
It does not compile with CONFIG_BLOCK=n.
Remove asm/ide.h from ksyms file, it gets included earlier via
linux/ide.h.
Compile tested with all defconfig files.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This allows the PMU LED on both a PowerMac 7,2 (Dual G5 2.0GHz, June 2003)
and a PowerMac 7,3 (Dual G5 2.0GHz, June 2004) to be controlled.
The physical LED is never off, unlike an iBook/PowerBook LED.
It is rather dim ("off") or very bright ("on").
Signed-off-by: Tony Vroon <chainsaw@gentoo.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fix link errors with CONFIG_EEH=n:
arch/powerpc/platforms/built-in.o: In function `.pcibios_fixup_new_pci_devices':
(.text+0x41c8): undefined reference to `.eeh_add_device_tree_late'
arch/powerpc/platforms/built-in.o: In function `.init_phb_dynamic':
(.text+0x4280): undefined reference to `.eeh_add_device_tree_early'
arch/powerpc/platforms/built-in.o: In function `.pcibios_remove_pci_devices':
(.text+0x42fc): undefined reference to `.eeh_remove_bus_device'
arch/powerpc/platforms/built-in.o: In function `.pcibios_add_pci_devices':
(.text+0x43c0): undefined reference to `.eeh_add_device_tree_early'
arch/powerpc/platforms/built-in.o: In function `.pSeries_final_fixup':
(.init.text+0xb4): undefined reference to `.pci_addr_cache_build'
make[1]: *** [.tmp_vmlinux1] Error 1
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This cleans up how the zImage code manipulates the kernel
command line. Notable improvements from the old handling:
- Command line manipulation is consolidated into a new
prep_cmdline() function, rather than being scattered across start()
and some helper functions
- Less stack space use: we use just a single global command
line buffer, which can be initialized by an external tool as before,
we no longer need another command line sized buffer on the stack.
- Easier to support platforms whose firmware passes a
commandline, but not a device tree. Platform code can now point new
loader_info fields to the firmware's command line, rather than having
to do early manipulation of the /chosen bootargs property which may
then be rewritten again by the core.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds a library of useful device tree manipulation functions
to the zImage library, for use by platform code. These functions are
based on the hooks already in dt_ops, so they're not dependent on a
particular device tree implementation. This patch also slightly
streamlines the code in main.c using these new functions.
This is a consolidation of my work in this area with Scott Wood's
patches to a very similar end.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>