This IOMMU helper function doesn't work for some architectures:
http://marc.info/?l=linux-kernel&m=121699304403202&w=2
It also breaks POWER and SPARC builds:
http://marc.info/?l=linux-kernel&m=121730388001890&w=2
Currently, only x86 IOMMUs use this so let's move it to x86 for
now.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Synchronize changes to host virtual addresses which are part of
a KVM memory slot to the KVM shadow mmu. This allows pte operations
like swapping, page migration, and madvise() to transparently work
with KVM.
Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This allows reading memslots with only the mmu_lock hold for mmu
notifiers that runs in atomic context and with mmu_lock held.
Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This allows the mmu notifier code to run unalias_gfn with only the
mmu_lock held. Only alias writes need the mmu_lock held. Readers will
either take the slots_lock in read mode or the mmu_lock.
Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
When I compiled Solution Engine, this become compile error
because plaform device of sh_eth device becomes enable.
When sh7710/sh7712 which could use sh_eth was chosen,
revised it so that platform device of sh_eth device became enable.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
lguest: turn Waker into a thread, not a process
lguest: Enlarge virtio rings
lguest: Use GSO/IFF_VNET_HDR extensions on tun/tap
lguest: Remove 'network: no dma buffer!' warning
lguest: Adaptive timeout
lguest: Tell Guest net not to notify us on every packet xmit
lguest: net block unneeded receive queue update notifications
lguest: wrap last_avail accesses.
lguest: use cpu capability accessors
lguest: virtio-rng support
lguest: Support assigning a MAC address
lguest: Don't leak /dev/zero fd
lguest: fix verbose printing of device features.
lguest: fix switcher_page leak on unload
lguest: Guest int3 fix
lguest: set max_pfn_mapped, growl loudly at Yinghai Lu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (21 commits)
x86/PCI: use dev_printk when possible
PCI: add D3 power state avoidance quirk
PCI: fix bogus "'device' may be used uninitialized" warning in pci_slot
PCI: add an option to allow ASPM enabled forcibly
PCI: disable ASPM on pre-1.1 PCIe devices
PCI: disable ASPM per ACPI FADT setting
PCI MSI: Don't disable MSIs if the mask bit isn't supported
PCI: handle 64-bit resources better on 32-bit machines
PCI: rewrite PCI BAR reading code
PCI: document pci_target_state
PCI hotplug: fix typo in pcie hotplug output
x86 gart: replace to_pages macro with iommu_num_pages
x86, AMD IOMMU: replace to_pages macro with iommu_num_pages
iommu: add iommu_num_pages helper function
dma-coherent: add documentation to new interfaces
Cris: convert to using generic dma-coherent mem allocator
Sh: use generic per-device coherent dma allocator
ARM: support generic per-device coherent dma mem
Generic dma-coherent: fix DMA_MEMORY_EXCLUSIVE
x86: use generic per-device dma coherent allocator
...
Alexey Dobriyan reported trouble with LTP with the new fast-gup code,
and Johannes Weiner debugged it to non-page-aligned addresses, where the
new get_user_pages_fast() code would do all the wrong things, including
just traversing past the end of the requested area due to 'addr' never
matching 'end' exactly.
This is not a pretty fix, and we may actually want to move the alignment
into generic code, leaving just the core code per-arch, but Alexey
verified that the vmsplice01 LTP test doesn't crash with this.
Reported-and-tested-by: Alexey Dobriyan <adobriyan@gmail.com>
Debugged-by: Johannes Weiner <hannes@saeurebad.de>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes up the workaround in 2b4b2bb421
and cleans up __put_user_asm() to get the sizing right from the onset.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
6af61a7614 'x86: clean up max_pfn_mapped
usage - 32-bit' makes the following comment:
XEN PV and lguest may need to assign max_pfn_mapped too.
But no CC. Yinghai, wasting fellow developers' time is a VERY bad
habit. If you do it again, I will hunt you down and try to extract
the three hours of my life I just lost :)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
With KVM/GFP/XPMEM there isn't just the primary CPU MMU pointing to pages.
There are secondary MMUs (with secondary sptes and secondary tlbs) too.
sptes in the kvm case are shadow pagetables, but when I say spte in
mmu-notifier context, I mean "secondary pte". In GRU case there's no
actual secondary pte and there's only a secondary tlb because the GRU
secondary MMU has no knowledge about sptes and every secondary tlb miss
event in the MMU always generates a page fault that has to be resolved by
the CPU (this is not the case of KVM where the a secondary tlb miss will
walk sptes in hardware and it will refill the secondary tlb transparently
to software if the corresponding spte is present). The same way
zap_page_range has to invalidate the pte before freeing the page, the spte
(and secondary tlb) must also be invalidated before any page is freed and
reused.
Currently we take a page_count pin on every page mapped by sptes, but that
means the pages can't be swapped whenever they're mapped by any spte
because they're part of the guest working set. Furthermore a spte unmap
event can immediately lead to a page to be freed when the pin is released
(so requiring the same complex and relatively slow tlb_gather smp safe
logic we have in zap_page_range and that can be avoided completely if the
spte unmap event doesn't require an unpin of the page previously mapped in
the secondary MMU).
The mmu notifiers allow kvm/GRU/XPMEM to attach to the tsk->mm and know
when the VM is swapping or freeing or doing anything on the primary MMU so
that the secondary MMU code can drop sptes before the pages are freed,
avoiding all page pinning and allowing 100% reliable swapping of guest
physical address space. Furthermore it avoids the code that teardown the
mappings of the secondary MMU, to implement a logic like tlb_gather in
zap_page_range that would require many IPI to flush other cpu tlbs, for
each fixed number of spte unmapped.
To make an example: if what happens on the primary MMU is a protection
downgrade (from writeable to wrprotect) the secondary MMU mappings will be
invalidated, and the next secondary-mmu-page-fault will call
get_user_pages and trigger a do_wp_page through get_user_pages if it
called get_user_pages with write=1, and it'll re-establishing an updated
spte or secondary-tlb-mapping on the copied page. Or it will setup a
readonly spte or readonly tlb mapping if it's a guest-read, if it calls
get_user_pages with write=0. This is just an example.
This allows to map any page pointed by any pte (and in turn visible in the
primary CPU MMU), into a secondary MMU (be it a pure tlb like GRU, or an
full MMU with both sptes and secondary-tlb like the shadow-pagetable layer
with kvm), or a remote DMA in software like XPMEM (hence needing of
schedule in XPMEM code to send the invalidate to the remote node, while no
need to schedule in kvm/gru as it's an immediate event like invalidating
primary-mmu pte).
At least for KVM without this patch it's impossible to swap guests
reliably. And having this feature and removing the page pin allows
several other optimizations that simplify life considerably.
Dependencies:
1) mm_take_all_locks() to register the mmu notifier when the whole VM
isn't doing anything with "mm". This allows mmu notifier users to keep
track if the VM is in the middle of the invalidate_range_begin/end
critical section with an atomic counter incraese in range_begin and
decreased in range_end. No secondary MMU page fault is allowed to map
any spte or secondary tlb reference, while the VM is in the middle of
range_begin/end as any page returned by get_user_pages in that critical
section could later immediately be freed without any further
->invalidate_page notification (invalidate_range_begin/end works on
ranges and ->invalidate_page isn't called immediately before freeing
the page). To stop all page freeing and pagetable overwrites the
mmap_sem must be taken in write mode and all other anon_vma/i_mmap
locks must be taken too.
2) It'd be a waste to add branches in the VM if nobody could possibly
run KVM/GRU/XPMEM on the kernel, so mmu notifiers will only enabled if
CONFIG_KVM=m/y. In the current kernel kvm won't yet take advantage of
mmu notifiers, but this already allows to compile a KVM external module
against a kernel with mmu notifiers enabled and from the next pull from
kvm.git we'll start using them. And GRU/XPMEM will also be able to
continue the development by enabling KVM=m in their config, until they
submit all GRU/XPMEM GPLv2 code to the mainline kernel. Then they can
also enable MMU_NOTIFIERS in the same way KVM does it (even if KVM=n).
This guarantees nobody selects MMU_NOTIFIER=y if KVM and GRU and XPMEM
are all =n.
The mmu_notifier_register call can fail because mm_take_all_locks may be
interrupted by a signal and return -EINTR. Because mmu_notifier_reigster
is used when a driver startup, a failure can be gracefully handled. Here
an example of the change applied to kvm to register the mmu notifiers.
Usually when a driver startups other allocations are required anyway and
-ENOMEM failure paths exists already.
struct kvm *kvm_arch_create_vm(void)
{
struct kvm *kvm = kzalloc(sizeof(struct kvm), GFP_KERNEL);
+ int err;
if (!kvm)
return ERR_PTR(-ENOMEM);
INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
+ kvm->arch.mmu_notifier.ops = &kvm_mmu_notifier_ops;
+ err = mmu_notifier_register(&kvm->arch.mmu_notifier, current->mm);
+ if (err) {
+ kfree(kvm);
+ return ERR_PTR(err);
+ }
+
return kvm;
}
mmu_notifier_unregister returns void and it's reliable.
The patch also adds a few needed but missing includes that would prevent
kernel to compile after these changes on non-x86 archs (x86 didn't need
them by luck).
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix mm/filemap_xip.c build]
[akpm@linux-foundation.org: fix mm/mmu_notifier.c build]
Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Kanoj Sarcar <kanojsarcar@yahoo.com>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Chris Wright <chrisw@redhat.com>
Cc: Marcelo Tosatti <marcelo@kvack.org>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Izik Eidus <izike@qumranet.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes a merge goof whereby ARCH_EP93XX got the "select HAVE_CLK" line
which belongs instead with ARCH_AT91.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oops, machvec.h is in asm/, it was previously removed due to overzealous
trimming. Fix up the path again.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This follows the sparc changes a439fe51a1.
Most of the moving about was done with Sam's directions at:
http://marc.info/?l=linux-sh&m=121724823706062&w=2
with subsequent hacking and fixups entirely my fault.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Convert printks to use dev_printk().
I converted DBG() to dev_dbg(). This DBG() is from arch/x86/pci/pci.h and
requires source-code modification to enable, so dev_dbg() seems roughly
equivalent.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
(20072fd0c9 lost most of its changes
somehow, came from a mbox archive applied with git-am. No idea
what happened. This puts back the missing bits. --rmk)
The initial patch from Lothar, and Lennert make it into a cleaner
one, modified and tested on PXA320 by Eric Miao.
This patch moves the L2 cache operations out of proc-xsc3.S into
dedicated outer cache support code.
CACHE_XSC3L2 can be deselected so no L2 cache specific code will be
linked in, and that L2 enable bit will not be set, this applies to
the following cases:
a. _only_ PXA300/PXA310 support included and no L2 cache wanted
b. PXA320 support included, but want L2 be disabled
So the enabling of L2 depends on two things:
- CACHE_XSC3L2 is selected
- and L2 cache is present
Where the latter is only a safeguard (previous testing shows it works
OK even when this bit is turned on).
IXP series of processors with XScale3 cannot disable L2 cache for the
moment since they depend on the L2 cache for its coherent memory, so
IXP may always select CACHE_XSC3L2.
Other L2 relevant bits are always turned on (i.e. the original code
enclosed by #if L2_CACHE_ENABLED .. #endif), as they showed no side
effects. Specifically, these bits are:
- OC bits in TTBASE register (table walk outer cache attributes)
- LLR Outer Cache Attributes (OC) in Auxiliary Control Register
Signed-off-by: Lothar WaÃ<9f>mann <LW@KARO-electronics.de>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Clean up and optimize cpumask_of_cpu(), by sharing all the zero words.
Instead of stupidly generating all possible i=0...NR_CPUS 2^i patterns
creating a huge array of constant bitmasks, realize that the zero words
can be shared.
In other words, on a 64-bit architecture, we only ever need 64 of these
arrays - with a different bit set in one single world (with enough zero
words around it so that we can create any bitmask by just offsetting in
that big array). And then we just put enough zeroes around it that we
can point every single cpumask to be one of those things.
So when we have 4k CPU's, instead of having 4k arrays (of 4k bits each,
with one bit set in each array - 2MB memory total), we have exactly 64
arrays instead, each 8k bits in size (64kB total).
And then we just point cpumask(n) to the right position (which we can
calculate dynamically). Once we have the right arrays, getting
"cpumask(n)" ends up being:
static inline const cpumask_t *get_cpu_mask(unsigned int cpu)
{
const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG];
p -= cpu / BITS_PER_LONG;
return (const cpumask_t *)p;
}
This brings other advantages and simplifications as well:
- we are not wasting memory that is just filled with a single bit in
various different places
- we don't need all those games to re-create the arrays in some dense
format, because they're already going to be dense enough.
if we compile a kernel for up to 4k CPU's, "wasting" that 64kB of memory
is a non-issue (especially since by doing this "overlapping" trick we
probably get better cache behaviour anyway).
[ mingo@elte.hu:
Converted Linus's mails into a commit. See:
http://lkml.org/lkml/2008/7/27/156http://lkml.org/lkml/2008/7/28/320
Also applied a family filter - which also has the side-effect of leaving
out the bits where Linus calls me an idio... Oh, never mind ;-)
]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits)
powerpc: Disable 64K hugetlb support when doing 64K SPU mappings
powerpc/powermac: Fixup default serial port device for pmac_zilog
powerpc/powermac: Use sane default baudrate for SCC debugging
powerpc/mm: Implement _PAGE_SPECIAL & pte_special() for 64-bit
powerpc: Show processor cache information in sysfs
powerpc: Make core id information available to userspace
powerpc: Make core sibling information available to userspace
powerpc/vio: More fallout from dma_mapping_error API change
ibmveth: Fix multiple errors with dma_mapping_error conversion
powerpc/pseries: Fix CMO sysdev attribute API change fallout
powerpc: Enable tracehook for the architecture
powerpc: Add TIF_NOTIFY_RESUME support for tracehook
powerpc: Add asm/syscall.h with the tracehook entry points
powerpc: Make syscall tracing use tracehook.h helpers
powerpc: Call tracehook_signal_handler() when setting up signal frames
powerpc: Update cpu_sibling_maps dynamically
powerpc: register_cpu_online should be __cpuinit
powerpc: kill useless SMT code in prom_hold_cpus
powerpc: Fix 8xx build failure
powerpc: Fix vio build warnings
...
struct at91_nand has been renamed atmel_nand. Fix the four boards that
were added since the patch was created.
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Acked-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (72 commits)
sh: SuperH Mobile CEU and camera platform data for AP325RXA
sh: Update smc911x platform data for AP325RXA
sh: SuperH Mobile LCDC platform data for AP325RXA
sh: Add SuperH Mobile CEU platform data for Migo-R
sh: Add SuperH Mobile LCDC platform data for Migo-R
sh: Move asid_cache() out of ifdef to fix SH-3/4 nommu build.
sh: Workaround for __put_user_asm() bug with gcc 4.x on big-endian.
sh: Wire up new syscalls.
sh: fix uImage Entry Point
sh_keysc: remove request_mem_region() and release_mem_region()
sh: Don't miss pending signals returning to user mode after signal processing
sh: Use clk_always_enable() on sh7366
sh: Use clk_always_enable() on sh7343 / SE77343
sh: Use clk_always_enable() on sh7722 / Migo-R / SE7722
sh: Use clk_always_enable() on sh7723 / ap325rxa
sh: Introduce clk_always_enable() function
sh: Show all clocks and their state in /proc/clocks
sh: Merge sh7343 and sh7722 clock code
sh: Add SuperH Mobile MSTPCR bits to clock framework
sh: Use arch_flags to simplify sh7722 siu clock code
...
The CPM2 BRG setup functions cpm_setbrg and cpm2_fastbrg don't support
external clocks. This patch adds a new exported __cpm2_setbrg function
that takes the clock rate and clock source as extra parameters, and moves
cpm_setbrg and cpm2_fastbrg to include/asm-powerpc/cpm2.h where they
become inline wrappers around __cpm2_setbrg.
Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
i8259 PIC is disabled on MPC8610HPCD boards, thus currently rtc-cmos
driver fails to probe.
To fix the issue, we lookup the device tree for "chrp,iic" and
"pnpPNP,000" compatible devices, and if not found we do not assign RTC
IRQ and assuming that i8259 was disabled.
Though this patch fixes RTC on some boards (and surely should not break
any other), the whole approach is still broken. We can't easily fix this
though, because old device trees do not specify i8259 interrupts for the
cmos rtc node.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This patch introduces baudrate setting support via the generic clock API.
When present the optional device tree clock property is used instead of
fsl-cpm-brg. Platforms can then define complex clock schemes, to output
the serial clock on an external pin for instance.
Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This patch implement GPIO LIB support for the CPM1 GPIOs.
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This patch implement GPIO LIB support for the CPM2 GPIOs. The code can
also be used for CPM1 GPIO port E, as both cores are compatible at the
register level.
Based on earlier work by Laurent Pinchart.
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Cc: Laurent Pinchart <laurentp@cse-semaphore.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Add AP325RXA specific platform data for on-chip sh7723 CEU and ncm03j camera.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Pass board specific smc911x parameters using struct smc911x_platdata.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add Migo-R specific platform data for on-chip sh7722 CEU and ov772x camera.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Remove not needed export and fix warning:
WARNING: vmlinux.o(__ksymtab+0x400): Section mismatch in reference from the variable __ksymtab_set_imx_fb_info to the function .init.text:set_imx_fb_info()
The symbol set_imx_fb_info is exported and annotated __init
Fix this by removing the __init annotation of set_imx_fb_info or drop the export.
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
It seems this small label was lost in the last merge. Without it
no CPU type is selected for the MX2 family of processors. And a build
will fail badly...
Signed-off-by: Juergen Beisert <j.beisert@pengutronix.de>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
fix the problem that cannot boot using uImage when PAGE_SIZE is
8kbyte or 64kbyte.
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Without this patch, signals sent during architecture specific signal
handling (typically as a result of the user's stack being inaccessible)
are ignored.
This is the SH version of commit c3ff8ec31c
which was for the i386.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Use clk_always_enable() on the sh7343 processor and in the board code
for Solution Engine 7343. Remove duplicate MSTPCR register definitions.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Use clk_always_enable() on the sh7722 processor and in the board code
for Migo-R and Solution Engine 7722. Remove duplicate MSTPCR register
definitions.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Use clk_always_enable() on the sh7723 processor and in the ap325rxa
board code. Remove duplicate MSTPCR register definitions.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Show all clocks in /proc/clocks, and also show if they are enabled or
disabled. This is useful to show MSTPCR bits on SuperH Mobile processors.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This code makes sh7343 share the sh7722 clock code. Instead of just using
the good and very old sh7343 clock implmentation, switch to the new MSTPCR
enabled clock code. SIU clocks are disabled on sh7343 for now.
With this change all SuperH Mobile devices now use the same clock code.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Handle module stop clock bits in MSTPCRn through the clock framework.
The clocks are named after the bits in the data sheet. The association
between bit number and hardware block is processor specific.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Commit 1ea0704e0d
(mm: add a ptep_modify_prot transaction abstraction)
triggered on sh build errors like the following:
<-- snip -->
...
CC arch/sh/mm/pg-sh4.o
cc1: warnings being treated as errors
include2/asm/pgtable.h:139: error: 'ptep_get_and_clear' declared inline after being called
include2/asm/pgtable.h:139: error: previous declaration of 'ptep_get_and_clear' was here
make[2]: *** [arch/sh/mm/pg-sh4.o] Error 1
<-- snip -->
Since there's no good reason for marking these global functions as
"inline" this patch therefore removes the inline's.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds initial support for the Renesas R0P7785LC0011RL board.
This patch supports 29bit address mode only.
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds physically contiguous memory chunks to the UIO devices.
The same strategy can be used in the future for the CEU as well.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch exports the VPU, VEU(1) and VEU(2) blocks of the sh7366
to user space using the uio_pdrv_genirq platform driver.
While at it, fix up the VEU(2) interrupt vector.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch exports the VPU, VEU2H0 and VEU2H1 blocks of the sh7723
to user space using the uio_pdrv_genirq platform driver.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch exports the VPU and VEU blocks of the sh7722 to user space
using the uio_pdrv_genirq platform driver.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch exports the VPU and VEU blocks of the sh7343 to user space
using the uio_pdrv_genirq platform driver.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch is
By sh2
- Remove duplicate code
- Reduce stack usage
- Cleanup and little optimize
By sh2a
- Add missing handler(256 to 511)
- Use sh2a instructions handler
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
updated the following codes for Solution Endine 7343:
- fix compile error in arch/sh/boards/se/7343/irq.c
- add nor flash physmaps
- update defconfig
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Remove inline from ptep_get_and_clean() to match with header file prototype.
Makes linux-next build.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch enables I2C on the sh7723-based ap325rxa board.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds platform data for the single I2C channel on sh7366.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds platform data for the single I2C channel on sh7723.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds platform data for two I2C channels to the sh7343.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch is based on interrupt acknowledge code for external
interrupt sources on sh3 processors and adds on sh4a processors.
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The current kernel behaviour is to reenable interrupts unconditionally
when taking a page fault. This patch changes this to only enable them
if interrupts were previously enabled.
It also fixes a problem seen with this fix in place: the kernel previously
flushed the vsyscall page when handling a signal, which is not only
unncessary, but caused a possible sleep with interrupts disabled.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add implementation of flush_icache_range() suitable for signal handler
and kprobes. Remove flush_cache_sigtramp() and change signal.c to use
flush_icache_range().
Signed-off-by: Chris Smith <chris.smith@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
sh_pcic_io_xxx function are very old.
In linux-2.4, mrshpc_ss socket driver used this function.
But there is not this driver to the present kernel.
I deleted these cords and checked operation.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The CPU of AP-325RXA is SH7723, but a CPU becomes selectable.
This patch fixes this problem.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Enable SH-Ether support and NFS userland support.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add support SH-Ether for Hitachi Solution Engine.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch contains the following cleanups:
- make the following needlessly global code static:
- cf-enabler.c: cf_init()
- cpu/clock.c: __clk_enable()
- cpu/clock.c: __clk_disable()
- process_32.c: default_idle()
- time_32.c: struct clocksource_sh
- timers/timer-tmu.c: struct tmu_timer_ops
- remove the following unused functions (no CONFIG_BLK_DEV_FD on sh):
- process_{32,64}.c: disable_hlt()
- process_{32,64}.c: enable_hlt()
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch makes the needlessly global pcibios_max_latency static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch makes the needlessly global EARLY_PCI_OP's static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch makes the needlessly global aica_rtc_{get,set}timeofday()
static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds basic support for the SH7763RDP board.
This supports a basic stuff provided in SH7763, like SCIF,
NOR Flash and USB host.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
SH7763 has 3 SCIF device. Current code supports SCIF0 and 1.
SCIF0 and 1 are same register constitution, but only SCIF2 is different.
I added support of SCIF2.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This board is SH7723 base board.
This has SCIF, LCDC, USB Host controler, NOR/NAND Flash, Sound,
Ether and other.
This patch supports SCIF, NOR Flash.
Signed-off-by: Yusuke Goda <goda.yusuke@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
16kB is a useful size on nommu, while 64kB still tends to be too big to
be useful. Newer MMUs are likely to support this as well, so plug it
in in anticipation of those, too.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
PAGE_SIZE doesn't need to be fixed at 4096 on nommu, so stub in a !MMU
case for the various PAGE_SIZE Kconfig options.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Currently this is only linked in for CONFIG_BINFMT_ELF, make it dependent
on CONFIG_ELF_CORE, so it's both selectable there and also linked in for
CONFIG_BINFMT_ELF_FDPIC.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch fixes the following build error:
<-- snip -->
...
MODPOST 1837 modules
ERROR: "board_pci_channels" [drivers/pcmcia/yenta_socket.ko] undefined!
...
make[2]: *** [__modpost] Error 1
<-- snip -->
I freely admit that it's a pathological configuration, but as long as
it is allowed it should build.
Reported-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
When using single_open(), single_release() should be used instead
of seq_release(), otherwise there is a memory leak.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The 64K SPU local store mapping feature is incompatible with the
64K huge pages support due to the inability of some parts of
the memory management to differenciate between them while they
use a different page table format.
For now, disable 64K huge pages when CONFIG_SPU_FS_64K_LS,
in the long run, this can be fixed by making this feature use
the hugetlb page table format.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This removes the non-working code in legacy_serial that tried to handle
the powermac SCC ports, and instead add a (now working) function to the
powermac platform code to find the default serial console if any.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When using the "sccdbg" option to route early kernel messages and
xmon to the SCC serial port on PowerMacs, when this wasn't the
configured output port of Open Firmware, we initialize the baudrate
to 57600bps. This isn't a very good default on some powermacs where
both the FW and pmac_zilog will default to 38400. This fixes it to
use the same logic as pmac_zilog to pick a default speed.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Collect cache information from the OF device tree and display it in
the cpu hierarchy in sysfs. This is intended to be compatible at the
userspace level with x86's implementation[1], hence some of the funny
attribute names. The arrangement of cache info is not immediately
intuitive, but (again) it's for compatibility's sake.
The cache attributes exposed are:
type (Data, Instruction, or Unified)
level (1, 2, 3...)
size
coherency_line_size
number_of_sets
ways_of_associativity
All of these can be derived on platforms that follow the OF PowerPC
Processor binding. The code "publishes" only those attributes for
which it is able to determine values; attributes for values which
cannot be determined are not created at all.
[1] arch/x86/kernel/cpu/intel_cacheinfo.c
BenH: Turned some printk's into pr_debug, added better NULL checking
in a couple of places.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Existing Open Firmware practice is to report each processor core as a
separate node in the device tree. Report the value of the "reg" OF
property corresponding to a logical CPU's device node as the core_id
attribute in /sys/devices/system/cpu/cpu*/topology/core_id.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Implement the notion of "core siblings" for powerpc. This makes
/sys/devices/system/cpu/cpu*/topology/core_siblings present sensible
values, indicating online CPUs which share an L2 cache.
BenH: Made cpu_to_l2cache() use of_find_node_by_phandle() instead
of IBM-specific open coded search
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch/powerpc/kernel/vio.c:533: error: too few arguments to function 'dma_mapping_error'
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Noticed due to these wanings:
arch/powerpc/platforms/pseries/cmm.c:298: warning: initialization from incompatible pointer type
arch/powerpc/platforms/pseries/cmm.c:299: warning: initialization from incompatible pointer type
arch/powerpc/platforms/pseries/cmm.c:320: warning: initialization from incompatible pointer type
arch/powerpc/platforms/pseries/cmm.c:320: warning: initialization from incompatible pointer type
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The powerpc arch code has all the prerequisites, so set HAVE_ARCH_TRACEHOOK.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This adds TIF_NOTIFY_RESUME support for powerpc. When set,
we call tracehook_notify_resume() on the way to user mode.
This overloads do_signal() to do the work, but changes its
arguments to it has the TIF_* bits handy in a register and
drops the useless first argument that was always zero.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This changes powerpc syscall tracing to use the new tracehook.h entry
points. There is no change, only cleanup.
In addition, the assembly changes allow do_syscall_trace_enter() to
abort the syscall without losing the information about the original
r0 value.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This makes the powerpc signal handling code call tracehook_signal_handler()
after a handler is set up. This means that using PTRACE_SINGLESTEP to
enter a signal handler will report to ptrace on the first instruction of
the handler, instead of the second. This is consistent with what x86 and
other machines do, and what users and debuggers want.
BenH: Fixed up the test for the trap value.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rather doing one initialization pass over all the per-cpu
cpu_sibling_maps at boot, update the maps at cpu online/offline time.
This is a behavior change -- the thread_siblings attribute now
reflects only online siblings, whereas it would display offline
siblings before. The new behavior matches that of x86, and is
arguably more useful.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It is called only in cpu online paths.
(caught by CONFIG_DEBUG_SECTION_MISMATCH=y)
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This piece of code is broken for >2 threads, and possibly in some
other subtle ways (such as comparing a value obtained from an
"ibm,ppc-interrupt-server#s" property to a value obtained from a
"reg" property) and doesn't seem to have any useful purpose in the
first place other than a dubious warning in case NR_CPUS is too
small, which probably isn't the right place to do so.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch/powerpc/kernel/vio.c:1034: warning: function declaration isnât a prototype
arch/powerpc/kernel/vio.c:1035: warning: function declaration isnât a prototype
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* CONFIG_BOOKE is selected by CONFIG_44x so we dont need both
* Fixed a few comments
* Go back to only using DBCR0_IDM to determine if we are using
debug resources.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Removed duplicated include file <linux/module.h> in
arch/powerpc/kernel/stacktrace.c.
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Call the standard hook after setting up signal handlers.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds TIF_NOTIFY_RESUME support for sparc64.
When set, we call tracehook_notify_resume() on the way to user mode.
Signed-off-by: Roland McGrath <roland@redhat.com>
This changes sparc64 syscall tracing to use the new tracehook.h entry
points.
[ Add assembly changes to force an immediate -ENOSYS return from
the system call when syscall_trace() returns non-zero at syscall
entry. -DaveM ]
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The majority of this patch was created by the following script:
***
ASM=arch/sparc/include/asm
mkdir -p $ASM
git mv include/asm-sparc64/ftrace.h $ASM
git rm include/asm-sparc64/*
git mv include/asm-sparc/* $ASM
sed -ie 's/asm-sparc64/asm/g' $ASM/*
sed -ie 's/asm-sparc/asm/g' $ASM/*
***
The rest was an update of the top-level Makefile to use sparc
for header files when sparc64 is being build.
And a small fixlet to pick up the correct unistd.h from
sparc64 code.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
commit 3e9704739d ("x86: boot secondary
cpus through initial_code") causes the kernel to crash when a CPU is
brought online after the read only sections have been write
protected. The write to initial_code in do_boot_cpu() fails.
Move inital_code to .cpuinit.data section.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
avr32: some mmc/sd cleanups
include/video/atmel_lcdc.h must #include <linux/workqueue.h>
avr32: allow system timer to share interrupt to make OProfile work
drivers/misc/atmel-ssc.c: Removed duplicated include
avr32: Add platform data for AC97C platform device
avr32: clean up mci platform code
fix avr32 build errors
Minor cleanups for the MMC/SD support on avr32:
- Make at32_add_device_mci() properly initialize "missing"
platform data ... so boards like STK1002 won't try GPIO 0.
- Switch over to gpio_is_valid() instead of testing for only
one designated value.
- Provide STK1002 platform data for the unlikely case that
switches are set so first Ethernet controller isn't in use.
(That's the only way to get card detect and writeprotect
switch sensing on the STK1000.)
And get rid of one "unused variable" warning.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
The shared mmap code works fine for the test case, which only checked
for two shared maps of the same file. However, three shared maps
result in one mapping remaining cached, resulting in stale data being
visible via that mapping. Fix this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When guest invalidates a large tlb map, there may be more than one
corresponding shadow tlb maps that need to be invalidated. Use eaddr and eend
to find these shadow tlb maps.
Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
IRQT_* and __IRQT_* were obsoleted long ago by patch [3692/1].
Remove them completely. Sed script for the reference:
s/__IRQT_RISEDGE/IRQ_TYPE_EDGE_RISING/g
s/__IRQT_FALEDGE/IRQ_TYPE_EDGE_FALLING/g
s/__IRQT_LOWLVL/IRQ_TYPE_LEVEL_LOW/g
s/__IRQT_HIGHLVL/IRQ_TYPE_LEVEL_HIGH/g
s/IRQT_RISING/IRQ_TYPE_EDGE_RISING/g
s/IRQT_FALLING/IRQ_TYPE_EDGE_FALLING/g
s/IRQT_BOTHEDGE/IRQ_TYPE_EDGE_BOTH/g
s/IRQT_LOW/IRQ_TYPE_LEVEL_LOW/g
s/IRQT_HIGH/IRQ_TYPE_LEVEL_HIGH/g
s/IRQT_PROBE/IRQ_TYPE_PROBE/g
s/IRQT_NOEDGE/IRQ_TYPE_NONE/g
Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The lctl(g) instructions require a specific alignment for the parameters.
The architecture requires a specification program check if these alignments
are not used. Enforcing this alignment also removes a possible host BUG,
since the get_guest functions check for proper alignment and emits a BUG.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Lets fix the name for the lctlg instruction...
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
The current interrupt handling on s390 misbehaves on an error case. On s390
each cpu has the prefix area (lowcore) for interrupt delivery. This memory
must always be available. If we fail to access the prefix area for a guest
on interrupt delivery the configuration is completely unusable. There is no
point in sending another program interrupt to an inaccessible lowcore.
Furthermore, we should not bug the host kernel, because this can be triggered
by userspace. I think the guest kernel itself can not trigger the problem, as
SET PREFIX and SIGNAL PROCESSOR SET PREFIX both check that the memory is
available and sane. As this is a userspace bug (e.g. setting the wrong guest
offset, unmapping guest memory) we should kill the userspace process instead
of BUGing the host kernel.
In the long term we probably should notify the userspace process about this
problem.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
All registers are unsigned long types. This patch changes all occurences
of guestaddr in gaccess from u64 to unsigned long.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
KVM_CAP_USER_MEMORY is used by s390, therefore, we should advertise it.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
There is a call to local_irq_restore in the normal exit case, so it would
seem that there should be one on an error return as well.
The semantic patch that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
expression l;
expression E,E1,E2;
@@
local_irq_save(l);
... when != local_irq_restore(l)
when != spin_unlock_irqrestore(E,l)
when any
when strict
(
if (...) { ... when != local_irq_restore(l)
when != spin_unlock_irqrestore(E1,l)
+ local_irq_restore(l);
return ...;
}
|
if (...)
+ {local_irq_restore(l);
return ...;
+ }
|
spin_unlock_irqrestore(E2,l);
|
local_irq_restore(l);
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Avi Kivity <avi@qumranet.com>
When an event (such as an interrupt) is injected, and the stack is
shadowed (and therefore write protected), the guest will exit. The
current code will see that the stack is shadowed and emulate a few
instructions, each time postponing the injection. Eventually the
injection may succeed, but at that time the guest may be unwilling
to accept the interrupt (for example, the TPR may have changed).
This occurs every once in a while during a Windows 2008 boot.
Fix by unshadowing the fault address if the fault was due to an event
injection.
Signed-off-by: Avi Kivity <avi@qumranet.com>
There is no guarantee that the old TSS descriptor in the GDT contains
the proper base address. This is the case for Windows installation's
reboot-via-triplefault.
Use guest registers instead. Also translate the address properly.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
The segment base is always a linear address, so translate before
accessing guest memory.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
If NPT is enabled after loading both KVM modules on AMD and it should be
disabled, both KVM modules must be reloaded. If only the architecture module is
reloaded the behavior is undefined. With this patch it is possible to disable
NPT only by reloading the kvm_amd module.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
* git://git.infradead.org/mtd-2.6: (57 commits)
[MTD] [NAND] subpage read feature as a way to increase performance.
CPUFREQ: S3C24XX NAND driver frequency scaling support.
[MTD][NAND] au1550nd: remove unused variable
[MTD] jedec_probe: Fix SST 16-bit chip detection
[MTD][MTDPART] Fix a division by zero bug
[MTD][MTDPART] Cleanup and document the erase region handling
[MTD][MTDPART] Handle most checkpatch findings
[MTD][MTDPART] Seperate main loop from per-partition code in add_mtd_partition
[MTD] physmap: resume already suspended chips on failure to suspend
[MTD] physmap: Fix suspend/resume/shutdown bugs.
[MTD] [NOR] Fix -ETIMEO errors in CFI driver
[MTD] [NAND] fsl_elbc_nand: fix section mismatch with CONFIG_MTD_OF_PARTS=y
[JFFS2] Use .unlocked_ioctl
[MTD] Fix const assignment in the MTD command line partitioning driver
[MTD] [NOR] gen_probe: No debug message when debugging is disabled
[MTD] [NAND] remove __PPC__ hardcoded address from DiskOnChip drivers
[MTD] [MAPS] Remove the bast-flash driver.
[MTD] [NAND] fsl_elbc_nand: ecclayout cleanups
[MTD] [NAND] fsl_elbc_nand: implement support for flash-based BBT
[MTD] [NAND] fsl_elbc_nand: fix OOB workability for large page NAND chips
...
* do not pass nameidata; struct path is all the callers want.
* switch to new helpers:
user_path_at(dfd, pathname, flags, &path)
user_path(pathname, &path)
user_lpath(pathname, &path)
user_path_dir(pathname, &path) (fail if not a directory)
The last 3 are trivial macro wrappers for the first one.
* remove nameidata in callers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, AMD IOMMU: include amd_iommu_last_bdf in device initialization
x86: fix IBM Summit based systems' phys_cpu_present_map on 32-bit kernels
x86, RDC321x: remove gpio.h complications
x86, RDC321x: add to mach-default
crashdump: fix undefined reference to `elfcorehdr_addr'
flag parameters: fix compile error of sys_epoll_create1
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (30 commits)
Blackfin arch: If we double fault, rather than hang forever, reset
Blackfin arch: When icache is off, make sure people know it
Blackfin arch: Fix bug - skip single step in high priority interrupt handler instead of disabling all interrupts in single step debugging.
Blackfin arch: cache the values of vco/sclk/cclk as the overhead of doing so (~24 bytes) is worth avoiding the software mult/div routines
Blackfin arch: fix bug - IMDMA is not type struct dma_register
Blackfin arch: check the EXTBANKS field of the DDRCTL1 register to see if we are using both memory banks
Blackfin arch: Apply Bluetechnix CM-BF527 board support patch
Blackfin arch: Add unwinding for stack info, and a little more detail on trace buffer
Blackfin arch: Add ISP1760 board resources to BF548-EZKIT
Blackfin arch: fix bug - detect 0.1 silicon revision BF527-EZKIT as 0.0 version
Blackfin arch: add missing IORESOURCE_MEM flags to UART3
Blackfin arch: Add return value check in bfin_sir_probe(), remove SSYNC().
Blackfin arch: Extend sram malloc to handle L2 SRAM.
Blackfin arch: Remove useless config option.
Blackfin arch: change L1 malloc to base on slab cache and lists.
Blackfin arch: use local labels and ENDPROC() markings
Blackfin arch: Do not need this dualcore test module in kernel.
Blackfin arch: Allow ptrace to peek and poke application data in L1 data SRAM.
Blackfin arch: Add ANOMALY_05000368 workaround
Blackfin arch: Functional power management support
...
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free pages, printed by show_free_areas()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Mikael Starvik <starvik@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free pages, printed by show_free_areas()
where show_mem() calls show_free_areas().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free swap pages, printed by show_swap_cache_info()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free pages, printed by show_free_areas()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free pages, printed by show_free_areas()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
And show_mem() does now actually print something on configurations
with multiple nodes.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free pages, printed by show_free_areas()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free pages, printed by show_free_areas()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free pages, printed by show_free_areas()
- pages in slab, printed by show_free_areas()
- free swap pages, printed by show_swap_cache_info()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free swap pages, printed by show_swap_cache_info()
- pages in swapcache, printed by show_swap_cache_info()
- dirty pages, writeback pages, mapped pages, slab pages,
pagetables pages, printed by show_free_areas()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free swap pages, printed by show_swap_cache_info()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- pages in swapcache, printed by show_swap_cache_info()
- dirty pages, writeback pages, mapped pages, slab pages,
pagetable pages, printed by show_free_areas()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free pages, printed by show_free_areas()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free pages, printed by show_free_areas()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free pages, printed by show_free_areas()
- pages in slabs, printed by show_free_areas()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove arch-specific show_mem() in favor of the generic version.
This also removes the following redundant information display:
- free pages, printed by show_free_areas()
- free swap pages, printed by show_swap_cache_info()
- pages in swapcache, printed by show_swap_cache_info()
where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This extends wait_task_inactive() with a new argument so it can be used in
a "soft" mode where it will check for the task changing state unexpectedly
and back off. There is no change to existing callers. This lays the
groundwork to allow robust, noninvasive tracing that can try to sample a
blocked thread but back off safely if it wakes up.
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds the generic HAVE_ARCH_TRACEHOOK kconfig item. Each arch should
add to some Kconfig file:
select HAVE_ARCH_TRACEHOOK
if the arch code uses all the latest hooks to enable newfangled tracing
and debugging code. The comment in arch/Kconfig lists all the
prerequisite arch support. When all these are available, setting
HAVE_ARCH_TRACEHOOK will allow enabling any new features that depend on
the modern arch interfaces.
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This moves all the ptrace hooks related to exec into tracehook.h inlines.
This also lifts the calls for tracing out of the binfmt load_binary hooks
into search_binary_handler() after it calls into the binfmt module. This
change has no effect, since all the binfmt modules' load_binary functions
did the call at the end on success, and now search_binary_handler() does
it immediately after return if successful. We consolidate the repeated
code, and binfmt modules no longer need to import ptrace_notify().
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kmem cache passed to constructor is only needed for constructors that are
themselves multiplexeres. Nobody uses this "feature", nor does anybody uses
passed kmem cache in non-trivial way, so pass only pointer to object.
Non-trivial places are:
arch/powerpc/mm/init_64.c
arch/powerpc/mm/hugetlbpage.c
This is flag day, yes.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Jon Tollefson <kniht@linux.vnet.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Matt Mackall <mpm@selenic.com>
[akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
[akpm@linux-foundation.org: fix mm/slab.c]
[akpm@linux-foundation.org: fix ubifs]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement get_user_pages_fast without locking in the fastpath on x86.
Do an optimistic lockless pagetable walk, without taking mmap_sem or any
page table locks or even mmap_sem. Page table existence is guaranteed by
turning interrupts off (combined with the fact that we're always looking
up the current mm, means we can do the lockless page table walk within the
constraints of the TLB shootdown design). Basically we can do this
lockless pagetable walk in a similar manner to the way the CPU's pagetable
walker does not have to take any locks to find present ptes.
This patch (combined with the subsequent ones to convert direct IO to use
it) was found to give about 10% performance improvement on a 2 socket 8
core Intel Xeon system running an OLTP workload on DB2 v9.5
"To test the effects of the patch, an OLTP workload was run on an IBM
x3850 M2 server with 2 processors (quad-core Intel Xeon processors at
2.93 GHz) using IBM DB2 v9.5 running Linux 2.6.24rc7 kernel. Comparing
runs with and without the patch resulted in an overall performance
benefit of ~9.8%. Correspondingly, oprofiles showed that samples from
__up_read and __down_read routines that is seen during thread contention
for system resources was reduced from 2.8% down to .05%. Monitoring the
/proc/vmstat output from the patched run showed that the counter for
fast_gup contained a very high number while the fast_gup_slow value was
zero."
(fast_gup is the old name for get_user_pages_fast, fast_gup_slow is a
counter we had for the number of times the slowpath was invoked).
The main reason for the improvement is that DB2 has multiple threads each
issuing direct-IO. Direct-IO uses get_user_pages, and thus the threads
contend the mmap_sem cacheline, and can also contend on page table locks.
I would anticipate larger performance gains on larger systems, however I
think DB2 uses an adaptive mix of threads and processes, so it could be
that thread contention remains pretty constant as machine size increases.
In which case, we stuck with "only" a 10% gain.
The downside of using get_user_pages_fast is that if there is not a pte
with the correct permissions for the access, we end up falling back to
get_user_pages and so the get_user_pages_fast is a bit of extra work.
However this should not be the common case in most performance critical
code.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: Kconfig fix]
[akpm@linux-foundation.org: Makefile fix/cleanup]
[akpm@linux-foundation.org: warning fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch implements devices state save/restore before after kexec.
This patch together with features in kexec_jump patch can be used for
following:
- A simple hibernation implementation without ACPI support. You can kexec a
hibernating kernel, save the memory image of original system and shutdown
the system. When resuming, you restore the memory image of original system
via ordinary kexec load then jump back.
- Kernel/system debug through making system snapshot. You can make system
snapshot, jump back, do some thing and make another system snapshot.
- Cooperative multi-kernel/system. With kexec jump, you can switch between
several kernels/systems quickly without boot process except the first time.
This appears like swap a whole kernel/system out/in.
- A general method to call program in physical mode (paging turning
off). This can be used to invoke BIOS code under Linux.
The following user-space tools can be used with kexec jump:
- kexec-tools needs to be patched to support kexec jump. The patches
and the precompiled kexec can be download from the following URL:
source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10
- makedumpfile with patches are used as memory image saving tool, it
can exclude free pages from original kernel memory image file. The
patches and the precompiled makedumpfile can be download from the
following URL:
source: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-src_cvs_kh10.tar.bz2
patches: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-patches_cvs_kh10.tar.bz2
binary: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile_cvs_kh10
- An initramfs image can be used as the root file system of kexeced
kernel. An initramfs image built with "BuildRoot" can be downloaded
from the following URL:
initramfs image: http://khibernation.sourceforge.net/download/release_v10/initramfs/rootfs_cvs_kh10.gz
All user space tools above are included in the initramfs image.
Usage example of simple hibernation:
1. Compile and install patched kernel with following options selected:
CONFIG_X86_32=y
CONFIG_RELOCATABLE=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_PM=y
CONFIG_HIBERNATION=y
CONFIG_KEXEC_JUMP=y
2. Build an initramfs image contains kexec-tool and makedumpfile, or
download the pre-built initramfs image, called rootfs.gz in
following text.
3. Prepare a partition to save memory image of original kernel, called
hibernating partition in following text.
4. Boot kernel compiled in step 1 (kernel A).
5. In the kernel A, load kernel compiled in step 1 (kernel B) with
/sbin/kexec. The shell command line can be as follow:
/sbin/kexec --load-preserve-context /boot/bzImage --mem-min=0x100000
--mem-max=0xffffff --initrd=rootfs.gz
6. Boot the kernel B with following shell command line:
/sbin/kexec -e
7. The kernel B will boot as normal kexec. In kernel B the memory
image of kernel A can be saved into hibernating partition as
follow:
jump_back_entry=`cat /proc/cmdline | tr ' ' '\n' | grep kexec_jump_back_entry | cut -d '='`
echo $jump_back_entry > kexec_jump_back_entry
cp /proc/vmcore dump.elf
Then you can shutdown the machine as normal.
8. Boot kernel compiled in step 1 (kernel C). Use the rootfs.gz as
root file system.
9. In kernel C, load the memory image of kernel A as follow:
/sbin/kexec -l --args-none --entry=`cat kexec_jump_back_entry` dump.elf
10. Jump back to the kernel A as follow:
/sbin/kexec -e
Then, kernel A is resumed.
Implementation point:
To support jumping between two kernels, before jumping to (executing)
the new kernel and jumping back to the original kernel, the devices
are put into quiescent state, and the state of devices and CPU is
saved. After jumping back from kexeced kernel and jumping to the new
kernel, the state of devices and CPU are restored accordingly. The
devices/CPU state save/restore code of software suspend is called to
implement corresponding function.
Known issues:
- Because the segment number supported by sys_kexec_load is limited,
hibernation image with many segments may not be load. This is
planned to be eliminated by adding a new flag to sys_kexec_load to
make a image can be loaded with multiple sys_kexec_load invoking.
Now, only the i386 architecture is supported.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch provides an enhancement to kexec/kdump. It implements the
following features:
- Backup/restore memory used by the original kernel before/after
kexec.
- Save/restore CPU state before/after kexec.
The features of this patch can be used as a general method to call program in
physical mode (paging turning off). This can be used to call BIOS code under
Linux.
kexec-tools needs to be patched to support kexec jump. The patches and
the precompiled kexec can be download from the following URL:
source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10
Usage example of calling some physical mode code and return:
1. Compile and install patched kernel with following options selected:
CONFIG_X86_32=y
CONFIG_KEXEC=y
CONFIG_PM=y
CONFIG_KEXEC_JUMP=y
2. Build patched kexec-tool or download the pre-built one.
3. Build some physical mode executable named such as "phy_mode"
4. Boot kernel compiled in step 1.
5. Load physical mode executable with /sbin/kexec. The shell command
line can be as follow:
/sbin/kexec --load-preserve-context --args-none phy_mode
6. Call physical mode executable with following shell command line:
/sbin/kexec -e
Implementation point:
To support jumping without reserving memory. One shadow backup page (source
page) is allocated for each page used by kexeced code image (destination
page). When do kexec_load, the image of kexeced code is loaded into source
pages, and before executing, the destination pages and the source pages are
swapped, so the contents of destination pages are backupped. Before jumping
to the kexeced code image and after jumping back to the original kernel, the
destination pages and the source pages are swapped too.
C ABI (calling convention) is used as communication protocol between
kernel and called code.
A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to
indicate that the loaded kernel image is used for jumping back.
Now, only the i386 architecture is supported.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The calgary code can give drivers addresses above 4GB which is very bad
for hardware that is only 32bit DMA addressable.
With this patch, the calgary code sets the global dma_ops to swiotlb or
nommu properly, and the dma_ops of devices behind the Calgary/CalIOC2
to calgary_dma_ops. So the calgary code can handle devices safely that
aren't behind the Calgary/CalIOC2.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexis Bruemmer <alexisb@us.ibm.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER
architecture does:
This enables us to cleanly fix the Calgary IOMMU issue that some devices
are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423).
I think that per-device dma_mapping_ops support would be also helpful for
KVM people to support PCI passthrough but Andi thinks that this makes it
difficult to support the PCI passthrough (see the above thread). So I
CC'ed this to KVM camp. Comments are appreciated.
A pointer to dma_mapping_ops to struct dev_archdata is added. If the
pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's
NULL, the system-wide dma_ops pointer is used as before.
If it's useful for KVM people, I plan to implement a mechanism to register
a hook called when a new pci (or dma capable) device is created (it works
with hot plugging). It enables IOMMUs to set up an appropriate
dma_mapping_ops per device.
The major obstacle is that dma_mapping_error doesn't take a pointer to the
device unlike other DMA operations. So x86 can't have dma_mapping_ops per
device. Note all the POWER IOMMUs use the same dma_mapping_error function
so this is not a problem for POWER but x86 IOMMUs use different
dma_mapping_error functions.
The first patch adds the device argument to dma_mapping_error. The patch
is trivial but large since it touches lots of drivers and dma-mapping.h in
all the architecture.
This patch:
dma_mapping_error() doesn't take a pointer to the device unlike other DMA
operations. So we can't have dma_mapping_ops per device.
Note that POWER already has dma_mapping_ops per device but all the POWER
IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device
argument.
[akpm@linux-foundation.org: fix sge]
[akpm@linux-foundation.org: fix svc_rdma]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix bnx2x]
[akpm@linux-foundation.org: fix s2io]
[akpm@linux-foundation.org: fix pasemi_mac]
[akpm@linux-foundation.org: fix sdhci]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix sparc]
[akpm@linux-foundation.org: fix ibmvscsi]
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7444a72eff caused these platforms to lose
their GPIOLIB configuration. Convert the missed Kconfig symbols using:
sed -i s/HAVE_GPIO_LIB/ARCH_REQUIRE_GPIOLIB/ arch/arm/Kconfig arch/arm/plat-s3c24xx/Kconfig
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* Replace previous instances of the cpumask_of_cpu_ptr* macros
with a the new (lvalue capable) generic cpumask_of_cpu().
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>