There are several issues with the rtas_ibm_suspend_me code, which
enables platform-assisted suspension of an LPAR as covered in PAPR
2.2.
1.) rtas_ibm_suspend_me uses on_each_cpu() to invoke
rtas_percpu_suspend_me on all cpus via IPI:
if (on_each_cpu(rtas_percpu_suspend_me, &data, 1, 0))
...
'data' is on the calling task's stack, but rtas_ibm_suspend_me takes
no measures to ensure that all instances of rtas_percpu_suspend_me are
finished accessing 'data' before returning. This can result in the
IPI'd cpus accessing random stack data and getting stuck in H_JOIN.
This is addressed by using an atomic count of workers and a completion
on the stack.
2.) rtas_percpu_suspend_me is needlessly calling H_JOIN in a loop.
The only event that can cause a cpu to return from H_JOIN is an H_PROD
from another cpu or a NMI/system reset. Each cpu need call H_JOIN
only once per suspend operation.
Remove the loop and the now unnecessary 'waiting' state variable.
3.) H_JOIN must be called with MSR[EE] off, but lazy interrupt
disabling may cause the caller of rtas_ibm_suspend_me to call H_JOIN
with it on; the local_irq_disable() in on_each_cpu() is not
sufficient.
Fix this by explicitly saving the MSR and clearing the EE bit before
calling H_JOIN.
4.) H_PROD is being called with the Linux logical cpu number as the
parameter, not the platform interrupt server value. (It's also being
called for all possible cpus, which is harmless, but unnecessary.)
This is fixed by calling H_PROD for each online cpu using
get_hard_smp_processor_id(cpu) for the argument.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
vmemmap_populate will printk (with KERN_WARNING) for a lot of pages
if CONFIG_SPARSEMEM_VMEMMAP is enabled (at least it does on iSeries).
Use pr_debug for it instead.
Replace the only other use of DBG in this file with pr_debug as well.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The early btext debug wouldn't work on PowerMac when booted from BootX
due to the code looking for the wrong property name.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
These don't need to be seen by everyone on every boot.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The patch "KVM: fix !SMP build error" change the way smp_call_function()
actually uses the passed in function names on non-SMP builds. So
previously it was never caught that the function passed in was never
actually defined.
This causes a build error on ppc64_defconfig + CONFIG_SMP=n:
arch/powerpc/mm/tlb_64.c: In function 'pgtable_free_now':
arch/powerpc/mm/tlb_64.c:71: error: 'pte_free_smp_sync' undeclared (first use in this function)
arch/powerpc/mm/tlb_64.c:71: error: (Each undeclared identifier is reported only once
arch/powerpc/mm/tlb_64.c:71: error: for each function it appears in.)
So we need to define it even if CONFIG_SMP is off. Either that or ifdef
out the smp_call_function() call, but that's ugly.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The context switch code in the kernel issues a dummy stwcx. to clear the
reservation, as recommended by the architecture. However, some processors
can have issues if this stwcx to address A occurs while the reservation
is already held to a different address B. To avoid this problem, the dummy
stwcx. needs to be paired with a dummy lwarx to the same address.
This adds the dummy lwarx, and creates a cpu feature bit to indicate
which cpus are affected. Tested on mpc8641_hpcn_defconfig in
arch/powerpc; build tested in arch/ppc.
Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the
deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been
broken on powerpc, because we end up counting user time twice: once in
timer_interrupt() and once in update_process_times().
This fixes the problem by pulling the code in update_process_times
that updates utime and stime into a separate function called
account_process_tick. If CONFIG_VIRT_CPU_ACCOUNTING is not defined,
there is a version of account_process_tick in kernel/timer.c that
simply accounts a whole tick to either utime or stime as before. If
CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to
implement account_process_tick.
This also lets us simplify the s390 code a bit; it means that the s390
timer interrupt can now call update_process_times even when
CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a
suitable account_process_tick().
account_process_tick() now takes the task_struct * as an argument.
Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
A debugging printk is removed, and a comment is fixed to match
the code.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Newer GCC's are capable of autovectorization for ISA extensions like
AltiVec and SPE. If we happen to build with one of those compilers we
will get SPE instructions in random kernel code. Today we only allow
basic interger code in the kernel and FP, AltiVec, or SPE in special
explicit locations that have handled the proper saving and restoring of
the register state (since on uniprocessor we lazy context switch the
register state for FP, AltiVec, and SPE).
-mno-spe disables the compiler for automatically generating SPE
instructions without our knowledge.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Fix old buglet; a warning message should have been printed
when a hardware reset takes too long.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This makes the altivec code in swsusp_32.S depend on CONFIG_ALTIVEC to
avoid build failures for systems that don't have altivec. I'm not sure
whether the code will actually work for other systems, but it was merged
for just ppc32 rather than powermac a very long time ago.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
If the low level MMU hash table insertion returns an error (which
can happen in some rare circumstances when the hypervisor refuses
the insertion of a PTE, typically if you try to access junk via
/dev/mem), the generated signal had an incorrect si_addr value due
to a bug in the assembly, which was loading it as a 32 bits quantity
instead of a 64 bits quantity.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Refresh ppc64_defconfig, add PPC_PASEMI and various options that the
common boards there need:
* Chip drivers (iommu, ethernet, IDE, CF, EDAC, MDIO/PHY)
* PCMCIA
* PATA_PCMCIA
* RTC_CLASS
* SATA_MV
* SATA_SIL24
* IP_PNP + NFS_ROOT for diskless booting
+ possibly some other things I might have missed to list
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Update pasemi_defconfig. Add a few missing options for default devices
on electra boards, enable tickless and hrtimers, etc, etc.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The size passing to memset is wrong.
Signed-off-by Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
An allyesconfig build creates a .text section that is so big that the
.text.init.refok and .fixup sections are too far away for the relocations
to be fixed up correctly. This patch fixes that by linking all the
relevent text sections for each file together.
Suggested by Paul Mackerras.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
ppc_md.init_IRQ is not called if it is NULL, so we don't need an empty
routine in the non PCI case.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Bugfix: avoid crash if there's no PCI device for a given
openfirmware node.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Bugfix: if a driver controlling one part of a multi-function PCI card
has asked for a reset, honor that request above all others.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The decrementer in Book E and 4xx processors interrupts on the
transition from 1 to 0, rather than on the 0 to -1 transition as on
64-bit server and 32-bit "classic" (6xx/7xx/7xxx) processors. At the
moment we subtract 1 from the count of how many decrementer ticks are
required before the next interrupt before putting it into the
decrementer, which is correct for server/classic processors, but could
possibly cause the interrupt to happen too early on Book E and 4xx if
the timebase/decrementer frequency is low.
This fixes the problem by making set_dec subtract 1 from the count for
server and classic processors, instead of having the callers subtract
1. Since set_dec already had a bunch of ifdefs to handle different
processor types, there is no net increase in ugliness. :)
Note that calling set_dec(0) may not generate an interrupt on some
processors. To make sure that decrementer_set_next_event always calls
set_dec with an interval of at least 1 tick, we set min_delta_ns of
the decrementer_clockevent to correspond to 2 ticks (2 rather than 1
to compensate for truncations in the conversions between ticks and
ns).
This also removes a redundant call to set the decrementer to
0x7fffffff - it was already set to that earlier in timer_interrupt.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Now that we have 1TB segment size support, we need to be using the
GET_ESID_1T macro when comparing ESID values for pc, stack, and
unmapped_base within switch_slb(). A new helper function called
esids_match() contains the logic for deciding when to call GET_ESID
and GET_ESID_1T.
This fixes a duplicate-slb-entry inspired machine-check exception I
was seeing when trying to run java on a power6 partition.
Tested on power6 and power5.
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Without this patch I get the following build failure
CC arch/powerpc/platforms/celleb/setup.o
arch/powerpc/platforms/celleb/setup.c:151: error: 'generic_calibrate_decr' undeclared here (not in a function)
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes the error
error: implicit declaration of function "udbg_printf"
We have a few spots where we reference udbg_printf() without #including
udbg.h. These are within #ifdef DEBUG blocks, so unnoticed until we do
a #define DEBUG or #define DEBUG_LOW nearby.
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We had an historical confusion in the kernel between cache line
and cache block size. The former is an implementation detail of
the L1 cache which can be useful for performance optimisations,
the later is the actual size on which the cache control
instructions operate, which can be different.
For some reason, we had a weird hack reading the right property
on powermac and the wrong one on any other 64 bits (32 bits is
unaffected as it only uses the cputable for cache block size
infos at this stage).
This fixes the booting-without-of.txt documentation to mention
the right properties, and fixes the 64 bits initialization code
to look for the block size first, with a fallback to the line
size if the property is missing.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fix two build errors on powerpc allyesconfig + CONFIG_SMP=n:
arch/powerpc/platforms/built-in.o: In function `cpu_affinity_set':
arch/powerpc/platforms/cell/spu_priv1_mmio.c:78: undefined reference to `.iic_get_target_id'
arch/powerpc/platforms/built-in.o: In function `iic_init_IRQ':
arch/powerpc/platforms/cell/interrupt.c:397: undefined reference to `.iic_setup_cpu'
Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The ps3 target produces two images, and the binary one is not the
"primary" image that corresponds to the -o flag; thus, it no longer
uses the generic binary flag.
On platforms which do use the binary flag, it no longer produces a
.bin suffix, so that the output file matches what was passed to the -o flag.
This should fix the zImage ln problems for the ps3 target.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Commit 91a69029 introduced an additional parameter to the .read and .write
methods for sysfs binary attributes. Two mv64x60_pci functions
were missed in that patch, resulting in these errors:
/cache/git/linux-2.6/arch/powerpc/sysdev/mv64x60_pci.c:77: warning: initialization from incompatible pointer type
/cache/git/linux-2.6/arch/powerpc/sysdev/mv64x60_pci.c:78: warning: initialization from incompatible pointer type
Add the missing "struct bin_attribute *" parameter.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since commit 76d2160147, the NE2000 card
is not working anymore on PPC and POWERPC and produces WATCHDOG
timeouts.
The patch below fixes that the same way it has been done on x86, x86_64
and MIPS.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
There are plans afoot to use pci_restore_msi_state() to restore MSI
state after a device reset. In order for this to work for the RTAS MSI
backend, we need to read back the MSI message from config space after
it has been setup by firmware.
This should be sufficient for restoring the MSI state after a device
reset, however we will need to revisit this for suspend to disk if that
is ever implemented on pseries.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Re-order the EMAC interrupts in the walnut.dts file so that they are mapped
correctly.
Signed-off-by: Steve Falco <sfalco at harris.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
mmu_mapin_ram() loops over total_lowmem to setup page tables. However, if
total_lowmem is less that 16M, the subtraction rolls over and results in
a number just under 4G (because total_lowmem is an unsigned value).
This patch rejigs the loop from countup to countdown to eliminate the
bug.
Special thanks to Magnus Hjorth who wrote the original patch to fix this
bug. This patch improves on his by making the loop code simpler (which
also eliminates the possibility of another rollover at the high end)
and also applies the change to arch/powerpc.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
The 44x family has an interesting "feature" which is a virtually
tagged instruction cache (yuck !). So far, we haven't dealt with
it properly, which means we've been mostly lucky or people didn't
report the problems, unless people have been running custom patches
in their distro...
This is an attempt at fixing it properly. I chose to do it by
setting a global flag whenever we change a PTE that was previously
marked executable, and flush the entire instruction cache upon
return to user space when that happens.
This is a bit heavy handed, but it's hard to do more fine grained
flushes as the icbi instruction, on those processor, for some very
strange reasons (since the cache is virtually mapped) still requires
a valid TLB entry for reading in the target address space, which
isn't something I want to deal with.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
On 4xx CPUs, the current implementation of flush_tlb_page() uses
a low level _tlbie() assembly function that only works for the
current PID. Thus, invalidations caused by, for example, a COW
fault triggered by get_user_pages() from a different context will
not work properly, causing among other things, gdb breakpoints
to fail.
This patch adds a "pid" argument to _tlbie() on 4xx processors,
and uses it to flush entries in the right context. FSL BookE
also gets the argument but it seems they don't need it (their
tlbivax form ignores the PID when invalidating according to the
document I have).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
PowerPC 440EP(x) 440GR(x) processors have the same PVR values, since
they have identical cores. However, FPU is not supported on GR(x) and
enabling APU instruction broadcast in the CCR0 register (to enable FPU)
may cause unpredictable results. There's no safe way to detect FPU
support at runtime. This patch provides a workarund for the issue.
We use a POWER6 "logical PVR approach". First, we identify all EP(x)
and GR(x) processors as GR(x) ones (which is safe). Then we check
the device tree cpu path. If we have a EP(x) processor entry,
we call identify_cpu again with PVR | 0x8. This bit is always 0
in the real PVR. This way we enable FPU only for 440EP(x).
Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Add the 'set -e' command to the wrapper script so that if any command
fails then the script will automatically exit
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Allow wrapper script to print verbose progress when the V is set in the
environment.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
When demoting a process to use 4K HW pages (instead of 64K), which
happens under various circumstances such as doing cache inhibited
mappings on machines that do not support 64K CI pages, the assembly
hash code calls back into the C function flush_hash_page(). This
function prototype was recently changed to accomodate for 1T segments
but the assembly call site was not updated, causing applications that
do demotion to hang. In addition, when updating the per-CPU PACA for
the new sizes, we didn't properly update the slice "map", thus causing
the SLB miss code to re-insert segments for the wrong size.
This fixes both and adds a warning comment next to the C
implementation to try to avoid problems next time someone changes it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Enable restart support for lite5200 board
[POWERPC] Add restart support for mpc52xx based platforms
[POWERPC] Update device tree binding for mpc5200 gpt
[POWERPC] Add mpc52xx_find_and_map_path(), refactor utility functions
[POWERPC] bestcomm: Restrict bus prefetch bugfix to original mpc5200 silicon.
Use the watchdog timer to implement board restart support.
Signed-off-by: Marian Balakowicz <m8@semihalf.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Add common helper routines: mpc52xx_map_wdt() and mpc52xx_restart().
Signed-off-by: Marian Balakowicz <m8@semihalf.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Add 'fsl,' prefix to 'compatible' property for gpt nodes.
Add 'fsl,' prefix to empty, GPT0 specific 'has-wdt' property.
The fsl, prefix is being added to better match the convention of prefixing
manufacturer specific properties and values with the vendors name.
Signed-off-by: Marian Balakowicz <m8@semihalf.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Add helper routine mpc52xx_find_and_map_path(). Extract common code to
mpc52xx_map_node() and refactor mpc52xx_find_and_map().
Signed-off-by: Jan Wrobel <wrr@semihalf.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Most of these fixes were already submitted for old kernel versions, and were
approved, but for some reason they never made it into the releases.
Because this is a consolidation of a couple old missed patches, it touches both
Kconfigs and documentation texts.
Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
* Convert files to UTF-8.
* Also correct some people's names
(one example is Eißfeldt, which was found in a source file.
Given that the author used an ß at all in a source file
indicates that the real name has in fact a 'ß' and not an 'ss',
which is commonly used as a substitute for 'ß' when limited to
7bit.)
* Correct town names (Goettingen -> Göttingen)
* Update Eberhard Mönkeberg's address (http://lkml.org/lkml/2007/1/8/313)
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Fix some device tree omissions that prevented the new EMAC driver from
setting up ethernet on the Bamboo board correctly and update the Bamboo
defconfig.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This patch enables the ibm_newemac driver for the Walnut board. It fixes the
device tree for the walnut board to order the MAL interrupts correctly and
adds the local-mac-address property to the EMAC node. The bootwrapper is also
updated to extract the MAC address from the OpenBIOS offset where it is stored.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
The current bootwrapper fails to set the timebase clock to the CPU clock
which causes the time to increment incorrectly. This fixes it by using the
correct #define for the CPC0_CR1 register.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Currently there's no way to enable early boot console on PowerPC 44x
not specifying uart's physical address in kernel config, which is used
for very early debug messages. This patch splits very early debug output
(which needs uart physical address in kernel config) and early boot console
(which searches for uarts in the device tree using find_legacy_serial_ports).
We enable early boot console for all 44x processors, while (dangerous)
early debug is user-selectable.
Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This patch enables NEW EMAC support for PowerPC 440EPx Sequoia board.
Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This adds RGMII support to Sequoia DTS and sets correct phy-mode
for EMACs. According to Sequoia datasheet, both ethernet ports
are connected to RGMII interface, while ZMII is used only for MDIO.
Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Fix the various misspellings of "system", controller", "interrupt" and
"[un]necessary".
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Quoting Randy:
"It seems sad that this patch sources Kconfig.marker, a 7-line file,
20-something times. Yes, you (we) don't want to put those 7 lines into
20-something different files, so sourcing is the right thing.
However, what you did for avr32 seems more on the right track to me: make
_one_ Instrumentation support menu that includes PROFILING, OPROFILE, KPROBES,
and MARKERS and then use (source) that in all of the arches."
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adapts the ppc64 code to use the generic parse_crashkernel()
function introduced in the generic patch of that series.
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
One of the easiest things to isolate is the pid printed in kernel log.
There was a patch, that made this for arch-independent code, this one makes
so for arch/xxx files.
It took some time to cross-compile it, but hopefully these are all the
printks in arch code.
Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
remove asm/bitops.h includes
including asm/bitops directly may cause compile errors. don't include it
and include linux/bitops instead. next patch will deny including asm header
directly.
Cc: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
is_init() is an ambiguous name for the pid==1 check. Split it into
is_global_init() and is_container_init().
A cgroup init has it's tsk->pid == 1.
A global init also has it's tsk->pid == 1 and it's active pid namespace
is the init_pid_ns. But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.
Changelog:
2.6.22-rc4-mm2-pidns1:
- Use 'init_pid_ns.child_reaper' to determine if a given task is the
global init (/sbin/init) process. This would improve performance
and remove dependence on the task_pid().
2.6.21-mm2-pidns2:
- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
ppc,avr32}/traps.c for the _exception() call to is_global_init().
This way, we kill only the cgroup if the cgroup's init has a
bug rather than force a kernel panic.
[akpm@linux-foundation.org: fix comment]
[sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
[bunk@stusta.de: kernel/pid.c: remove unused exports]
[sukadev@us.ibm.com: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Acked-by: Pavel Emelianov <xemul@openvz.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Herbert Poetzel <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In pre-cgroup cpusets, a few config files enabled cpusets by default.
Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds POWERPC specific hooks for scaled time accounting.
POWER6 includes a SPURR register. The SPURR is based off the PURR register
but is scaled based on CPU frequency and issue rates. This gives a more
accurate account of the instructions used per task. The PURR and timebase
will be constant relative to the wall clock, irrespective of the CPU
frequency.
This implementation reads the SPURR register in account_system_vtime which
is only call called on context witch and hard and soft irq entry and exit.
The percentage of user and system time is then estimated using the ratio of
these accounted by the PURR. If the SPURR is not present, the PURR read.
An earlier implementation of this patch read the SPURR whenever the PURR
was read, which included the system call entry and exit path.
Unfortunately this showed a performance regression on lmbench runs, so was
re-implemented.
I've included the lmbench results here when run bare metal on POWER6. 1st
column is the unpatch results. 2nd column is the results using the below
patch and the 3rd is the % diff of these results from the base. 4th and
5th columns are the results and % differnce from the base using the older
patch (SPURR read in syscall entry/exit path).
Base Scaled-Acct SPURR-in-syscall
Result Result % diff Result % diff
Simple syscall: 0.3086 0.3086 0.0000 0.3452 11.8600
Simple read: 0.4591 0.4671 1.7425 0.5044 9.86713
Simple write: 0.4364 0.4366 0.0458 0.4731 8.40971
Simple stat: 2.0055 2.0295 1.1967 2.0669 3.06158
Simple fstat: 0.5962 0.5876 -1.442 0.6368 6.80979
Simple open/close: 3.1283 3.1009 -0.875 3.2088 2.57328
Select on 10 fd's: 0.8554 0.8457 -1.133 0.8667 1.32101
Select on 100 fd's: 3.5292 3.6329 2.9383 3.6664 3.88756
Select on 250 fd's: 7.9097 8.1881 3.5197 8.2242 3.97613
Select on 500 fd's: 15.2659 15.836 3.7357 15.873 3.97814
Select on 10 tcp fd's: 0.9576 0.9416 -1.670 0.9752 1.83792
Select on 100 tcp fd's: 7.248 7.2254 -0.311 7.2685 0.28283
Select on 250 tcp fd's: 17.7742 17.707 -0.375 17.749 -0.1406
Select on 500 tcp fd's: 35.4258 35.25 -0.496 35.286 -0.3929
Signal handler installation: 0.6131 0.6075 -0.913 0.647 5.52927
Signal handler overhead: 2.0919 2.1078 0.7600 2.1831 4.35967
Protection fault: 0.7345 0.7478 1.8107 0.8031 9.33968
Pipe latency: 33.006 16.398 -50.31 33.475 1.42368
AF_UNIX sock stream latency: 14.5093 30.910 113.03 30.715 111.692
Process fork+exit: 219.8 222.8 1.3648 229.37 4.35623
Process fork+execve: 876.14 873.28 -0.32 868.66 -0.8533
Process fork+/bin/sh -c: 2830 2876.5 1.6431 2958 4.52296
File /var/tmp/XXX write bw: 1193497 1195536 0.1708 118657 -0.5799
Pagefaults on /var/tmp/XXX: 3.1272 3.2117 2.7020 3.2521 3.99398
Also, kernel compile times show no difference with this patch applied.
[pbadari@us.ibm.com: Avoid unnecessary PURR reading]
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, there's a CONFIG_DISABLE_CONSOLE_SUSPEND that allows one to stop
the serial console from being suspended when the rest of the machine goes
to sleep. This is incredibly useful for debugging power management-related
things; however, having it as a compile-time option has proved to be
incredibly inconvenient for us (OLPC). There are plenty of times that we
want serial console to not suspend, but for the most part we'd like serial
console to be suspended.
This drops CONFIG_DISABLE_CONSOLE_SUSPEND, and replaces it with a kernel
boot parameter (no_console_suspend). By default, the serial console will
be suspended along with the rest of the system; by passing
'no_console_suspend' to the kernel during boot, serial console will remain
alive during suspend.
For now, this is pretty serial console specific; further fixes could be
applied to make this work for things like netconsole.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@suspend2.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is no reason why the .prepare() and .finish() methods in 'struct
platform_suspend_ops' should take any arguments, since architectures don't use
these methods' argument in any practically meaningful way (ie. either the
target system sleep state is conveyed to the platform by .set_target(), or
there is only one suspend state supported and it is indicated to the PM core
by .valid(), or .prepare() and .finish() aren't defined at all). There also
is no reason why .finish() should return any result.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The name of 'struct pm_ops' suggests that it is related to the power
management in general, but in fact it is only related to suspend. Moreover,
its name should indicate what this structure is used for, so it seems
reasonable to change it to 'struct platform_suspend_ops'. In that case, the
name of the global variable of this type used by the PM core and the names of
related functions should be changed accordingly.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the definition of 'struct pm_ops' and related functions from <linux/pm.h>
to <linux/suspend.h> .
There are, at least, the following reasons to do that:
* 'struct pm_ops' is specifically related to suspend and not to the power
management in general.
* As long as 'struct pm_ops' is defined in <linux/pm.h>, any modification of it
causes the entire kernel to be recompiled, which is unnecessary and annoying.
* Some suspend-related features are already defined in <linux/suspend.h>, so it
is logical to move the definition of 'struct pm_ops' into there.
* 'struct hibernation_ops', being the hibernation-related counterpart of
'struct pm_ops', is defined in <linux/suspend.h> .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (24 commits)
[POWERPC] Fix vmemmap warning in init_64.c
[POWERPC] Fix 64 bits vDSO DWARF info for CR register
[POWERPC] Add 1TB workaround for PA6T
[POWERPC] Enable NO_HZ and high res timers for pseries and ppc64 configs
[POWERPC] Quieten cache information at boot
[POWERPC] Quieten clockevent printk
[POWERPC] Enable SLUB in *_defconfig
[POWERPC] Fix 1TB segment detection
[POWERPC] Fix iSeries_hpte_insert prototype
[POWERPC] Fix copyright symbol
[POWERPC] ibmebus: Move to of_device and of_platform_driver, match eHCA and eHEA drivers
[POWERPC] ibmebus: Add device creation and bus probing based on of_device
[POWERPC] ibmebus: Remove bus match/probe/remove functions
[POWERPC] Move of_device allocation into of_device.[ch]
[POWERPC] mpc52xx: device tree changes for FEC and MDIO
[POWERPC] bestcomm: GenBD task support
[POWERPC] bestcomm: FEC task support
[POWERPC] bestcomm: ATA task support
[POWERPC] bestcomm: core bestcomm support for Freescale MPC5200
[POWERPC] mpc52xx: Update mpc52xx_psc structure with B revision changes
...
All asm/ipc.h files do only #include <asm-generic/ipc.h>.
This patch therefore removes all include/asm-*/ipc.h files and moves the
contents of include/asm-generic/ipc.h to include/linux/ipc.h.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This makes powerpc64's compat code use the new linux/elfcore-compat.h,
reducing some hand-copied duplication.
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Driver for the CompactFlash slot on the PA Semi Electra eval board. It's
a simple device sitting on localbus, with interrupts and detect/voltage
control over GPIO.
The driver is implemented as an of_platform driver, and adds localbus
as a bus being probed by the of_platform framework.
[akpm@linux-foundation.org: cleanups]
[olof@lixom.net: fix build]
Signed-off-by: Olof Johansson <olof@lixom.net>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Milton Miller <miltonm@bga.com>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It makes more sense to make instrumentation support experimental on a
case-by-case basis.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Slab constructors currently have a flags parameter that is never used. And
the order of the arguments is opposite to other slab functions. The object
pointer is placed before the kmem_cache pointer.
Convert
ctor(void *object, struct kmem_cache *s, unsigned long flags)
to
ctor(struct kmem_cache *s, void *object)
throughout the kernel
[akpm@linux-foundation.org: coupla fixes]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update dump_task_altivec() (which has so far never been put to use) so that
it dumps the Altivec/VMX registers (VR[0] - VR[31], VSCR and VRSAVE) in the
same format as the ptrace get_vrregs(), and add the appropriate glue
typedef and #defines to make it work.
A new note type of NT_PPC_VMX was chosen to be 0x100 (arbitrarily) because
it allows the low range values to be used for more generic purposes and
0x100 seems an adequate starting point for PowerPC extensions.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the right printk format to silence the following warning.
CC arch/powerpc/mm/init_64.o
arch/powerpc/mm/init_64.c: In function 'vmemmap_populate':
arch/powerpc/mm/init_64.c:243: warning: format '%p' expects type 'void *', but argument 4 has type 'long unsigned int'
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The current DWARF info for CR are incorrect, causing the gcc unwinder to
go to lunch if we take a segfault in the vdso. This fixes it.
Problem identified by Andrew Haley, and fix provided by Jakub Jelinek
(thanks !).
Unfortunately, a bug in gcc cause it to not quite work either, but that
is being fixed separately with something around the lines of:
linux-unwind.h:
fs->regs.reg[R_CR2].loc.offset = (long) ®s->ccr - new_cfa;
+ /* CR? regs are just 32-bit and PPC is big-endian. */
+ fs->regs.reg[R_CR2].loc.offset += sizeof (long) - 4;
(According to Jakub)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
PA6T has a bug where the slbie instruction does not honor the large
segment bit. As a result, we have to always use slbia when switching
context.
We don't have to worry about changing the slbie's during fault processing,
since they should never be replacing one VSID with another using the
same ESID. I.e. there's no risk for inserting duplicate entries due to a
failed slbie of the old entry. So as long as we clear it out on context
switch we should be fine.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Enable NO_HZ and high res timers for the ppc64 and pseries defconfigs.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
After 6 years the ppc64 kernel still thinks its important to tell me my
cache line size is 0x80 bytes. I think most people who care know that by
now. The rest probably cant even understand the hex output.
Since we might have misconfigured firmware or cpus that have a linesize
that isnt 128 bytes, I still print it out for those cases. If people
would prefer to remove it completely, lets do it.
Also for lpar remove the htab_address printout since its not used.
Anton
ppc64 boot log usability expert
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The clockevent bootup message only needs to be KERN_INFO.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When checking out the new NO_HZ support in powerpc, I noticed we never
slept for more than 2 seconds. It turns out SLAB has a 2 second per cpu
timer that causes this.
After switching to SLUB I see some nice 4 second sleeps which is the
limit on this POWER6 box (the decrementer ticks at 512MHz):
slept 4.19 sec
slept 4.19 sec
slept 4.19 sec
slept 4.19 sec
slept 3.96 sec
slept 3.80 sec
slept 2.99 sec
Since SLUB is now the default and some powerpc defconfigs already enable
it, lets enable SLUB across the board for consistency. While doing this
I also noticed that the maple defconfig has SLAB debugging enabled which
is sure to make your box nice and slow. Fix that too.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Buglet in the 1TB detection makes it return after checking the first
property word, even if it's not a match.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Commit 1189be6508 ([POWERPC] Use 1TB
segments) added an argument to hpte_insert.
Also make iSeries_hpte_insert static.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Replace struct ibmebus_dev and struct ibmebus_driver with struct of_device
and struct of_platform_driver, respectively. Match the external ibmebus
interface and drivers using it.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The devtree root is now searched for devices matching a built-in whitelist
during boot, so these devices appear on the bus from the beginning. It is
still possible to manually add/remove devices to/from the bus by using the
probe/remove sysfs interface. Also, when a device driver registers itself,
the devtree is matched against its matchlist.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Remove old code that will be replaced by rewritten and shorter functions in
the next patch. Keep struct ibmebus_dev and struct ibmebus_driver for now,
but replace ibmebus_{,un}register_driver() by dummy functions. This way, the
kernel will still compile and run during the transition and git bisect will
be happy.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Extract generic of_device allocation code from of_platform_device_create()
and move it into of_device.[ch], called of_device_alloc(). Also, there's now
of_device_free() which puts the device node.
This way, bus drivers that build on of_platform (like ibmebus will) can
build upon this code instead of reinventing the wheel.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add device tree entries for lite5200b's FEC's PHY.
Signed-off-by: Domen Puncer <domen.puncer@telargo.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This is the microcode for the GenBD task and the associated
support code. This is a generic task that copy data to/from
a hardware FIFO. This is currently locked to 32bits wide
access but could be extended as needed.
The microcode itself comes directly from the offical
API (v2.2)
Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This is the microcode for the FEC task and the associated
support code.
The microcode itself comes directly from the offical
API (v2.2)
Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>