The architecture specifies that when the processor wakes up from a WFE
or WFI instruction, the instruction is considered complete, however we
currrently return to EL1 (or EL0) at the WFI/WFE instruction itself.
While most guests may not be affected by this because their local
exception handler performs an exception returning setting the event bit
or with an interrupt pending, some guests like UEFI will get wedged due
this little mishap.
Simply skip the instruction when we have completed the emulation.
Cc: <stable@vger.kernel.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
X-Gene u-boot runs in EL2 mode with MMU enabled hence we might
have stale EL2 tlb enteris when we enable EL2 MMU on each host CPU.
This can happen on any ARM/ARM64 board running bootloader in
Hyp-mode (or EL2-mode) with MMU enabled.
This patch ensures that we flush all Hyp-mode (or EL2-mode) TLBs
on each host CPU before enabling Hyp-mode (or EL2-mode) MMU.
Cc: <stable@vger.kernel.org>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Commit f0a3eaff71 (ARM64: KVM: fix big endian issue in
access_vm_reg for 32bit guest) changed the way we handle CP15
VM accesses, so that all 64bit accesses are done via vcpu_sys_reg.
This looks like a good idea as it solves indianness issues in an
elegant way, except for one small detail: the register index is
doesn't refer to the same array! We end up corrupting some random
data structure instead.
Fix this by reverting to the original code, except for the introduction
of a vcpu_cp15_64_high macro that deals with the endianness thing.
Tested on Juno with 32bit SMP guests.
Cc: Victor Kamensky <victor.kamensky@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Commit 72c5839515 (arm64: gicv3: Allow GICv3 compilation with
older binutils) changed the way we express the GICv3 system registers,
but couldn't change the occurences used by KVM as the code wasn't
merged yet.
Just fix the accessors.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Enable trapping of the debug registers, preventing the guests to
mess with the host state (and allowing guests to use the debug
infrastructure as well).
Reviewed-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Implement switching of the debug registers. While the number
of registers is massive, CPUs usually don't implement them all
(A57 has 6 breakpoints and 4 watchpoints, which gives us a total
of 22 registers "only").
Also, we only save/restore them when MDSCR_EL1 has debug enabled,
or when we've flagged the debug registers as dirty. It means that
most of the time, we only save/restore MDSCR_EL1.
Reviewed-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Add handlers for all the AArch32 debug registers that are accessible
from EL0 or EL1. The code follow the same strategy as the AArch64
counterpart with regards to tracking the dirty state of the debug
registers.
Reviewed-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We now have multiple tables for the various system registers
we trap. Make sure we check the order of all of them, as it is
critical that we get the order right (been there, done that...).
Reviewed-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
An interesting "feature" of the CP14 encoding is that there is
an overlap between 32 and 64bit registers, meaning they cannot
live in the same table as we did for CP15.
Create separate tables for 64bit CP14 and CP15 registers, and
let the top level handler use the right one.
Reviewed-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
As we're about to trap a bunch of CP14 registers, let's rework
the CP15 handling so it can be generalized and work with multiple
tables.
Reviewed-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Add handlers for all the AArch64 debug registers that are accessible
from EL0 or EL1. The trapping code keeps track of the state of the
debug registers, allowing for the switch code to implement a lazy
switching strategy.
Reviewed-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
pm_fake doesn't quite describe what the handler does (ignoring writes
and returning 0 for reads).
As we're about to use it (a lot) in a different context, rename it
with a (admitedly cryptic) name that make sense for all users.
Reviewed-by: Anup Patel <anup.patel@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Fix issue with 32bit guests running on top of BE KVM host.
Indexes of high and low words of 64bit cp15 register are
swapped in case of big endian code, since 64bit cp15 state is
restored or saved with double word write or read instruction.
Define helper macro to access low words of 64bit cp15 register.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Since size of all sys registers is always 8 bytes. Current
code is actually endian agnostic. Just clean it up a bit.
Removed comment about little endian. Change type of pointer
from 'void *' to 'u64 *' to enforce stronger type checking.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
esr_el2 field of struct kvm_vcpu_fault_info has u32 type.
It should be stored as word. Current code works in LE case
because existing puts least significant word of x1 into
esr_el2, and it puts most significant work of x1 into next
field, which accidentally is OK because it is updated again
by next instruction. But existing code breaks in BE case.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Introduce the GICv3 world switch code used to save/restore the
GICv3 context.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Introduce the support code for emulating a GICv2 on top of GICv3
hardware.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
GICv3 requires the IMO and FMO bits to be tightly coupled with some
of the interrupt controller's register switch.
In order to have similar code paths, move the manipulation of these
bits to the GICv2 switch code.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Move the GICv2 world switch code into its own file, and add the
necessary indirection to the arm64 switch code.
Also introduce a new type field to the vgic_params structure.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We already have __hyp_text_{start,end} to express the boundaries
of the HYP text section, and __kvm_hyp_code_{start,end} are getting
in the way of a more modular world switch code.
Just turn __kvm_hyp_code_{start,end} into #defines mapping the
linker-emited symbols.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Brutally hack the innocent vgic code, and move the GICv2 specific code
to its own file, using vgic_ops and vgic_params as a way to pass
information between the two blocks.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
In order to make way for the GICv3 registers, move the v2-specific
registers to their own structure.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
For correct guest suspend/resume behaviour we need to ensure we include
the generic timer registers for 64 bit guests. As CONFIG_KVM_ARM_TIMER is
always set for arm64 we don't need to worry about null implementations.
However I have re-jigged the kvm_arm_timer_set/get_reg declarations to
be in the common include/kvm/arm_arch_timer.h headers.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
I suspect this is a -ECUTPASTE fault from the initial implementation. If
we don't declare the register ID to be KVM_REG_ARM64 the KVM_GET_ONE_REG
implementation kvm_arm_get_reg() returns -EINVAL and hilarity ensues.
The kvm/api.txt document describes all arm64 registers as starting with
0x60xx... (i.e KVM_REG_ARM64).
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Strings library contributed to glibc but re-licensed under GPLv2)
- Optimised crypto algorithms making use of the ARMv8 crypto extensions
(together with kernel API for using FPSIMD instructions in interrupt
context)
- Ftrace support
- CPU topology parsing from DT
- ESR_EL1 (Exception Syndrome Register) exposed to user space signal
handlers for SIGSEGV/SIGBUS (useful to emulation tools like Qemu)
- 1GB section linear mapping if applicable
- Barriers usage clean-up
- Default pgprot clean-up
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iQIcBAABAgAGBQJTkb+CAAoJEGvWsS0AyF7xLyEQAJgL8s2SdDyd+R8aukNDu3n9
tCK7yVHO9Kg96dfeXVuSOVEo2jszo6R3nxzUL05FMovr230WBcmoeHvHz8ETGnw1
g0yO8Ltkckjevog4UleCa3wGtYISjvwwrTalzbqoEWzsF2AV8oiqv/yuIn/EdkUr
jaOqfNsnAQa8TIz4vMhi/AVdJWTTU/F6WP80oqCbxqXu/WL2InuBlHtOJMbk1HDI
u1DJUGDQ1B9OgSVRkAOjCjSsEtz8sDY3lXsg3V1qT5+NbZTyomYM2IiBLdgQcX4P
t/rqX9nX4VmRQtzefeP5WhKFks2x80C0BKibWC4teeL++tJHbgbFkyjoZZGcP27o
zued3cYABrjrcAEU6ko/LUiL2Q4ozBOzosClpjpWulCxNPzsOps82UZWo3F3XbAt
xjE3k7WF9WeNBOJdDGrarEaSLdnjjgCLoWVs8cOUYLpOOrtdSw16D29jJ68U0Y5g
31wdwKxoueC8SFt8M9fP9J9Jyau08g+kvW1xQXrRmroppweFxjSpSy90imARyux/
wUFz79HxkQB79ZHpJ0I5TNrw/w+7pBnfVSKGPOzrk+ZUsaH76caNRBoffUCzFMzz
T3Sc8A36TZtOIcGR/Q4DMZNFXlIUXDSzCHP2Iu0QoIjTd5Ex96cqNvy3nswCYWwv
yGe3ZEqUq9+WL7snNW4v
=Jj8U
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into next
Pull arm64 updates from Catalin Marinas:
- Optimised assembly string/memory routines (based on the AArch64
Cortex Strings library contributed to glibc but re-licensed under
GPLv2)
- Optimised crypto algorithms making use of the ARMv8 crypto extensions
(together with kernel API for using FPSIMD instructions in interrupt
context)
- Ftrace support
- CPU topology parsing from DT
- ESR_EL1 (Exception Syndrome Register) exposed to user space signal
handlers for SIGSEGV/SIGBUS (useful to emulation tools like Qemu)
- 1GB section linear mapping if applicable
- Barriers usage clean-up
- Default pgprot clean-up
Conflicts as per Catalin.
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (57 commits)
arm64: kernel: initialize broadcast hrtimer based clock event device
arm64: ftrace: Add system call tracepoint
arm64: ftrace: Add CALLER_ADDRx macros
arm64: ftrace: Add dynamic ftrace support
arm64: Add ftrace support
ftrace: Add arm64 support to recordmcount
arm64: Add 'notrace' attribute to unwind_frame() for ftrace
arm64: add __ASSEMBLY__ in asm/insn.h
arm64: Fix linker script entry point
arm64: lib: Implement optimized string length routines
arm64: lib: Implement optimized string compare routines
arm64: lib: Implement optimized memcmp routine
arm64: lib: Implement optimized memset routine
arm64: lib: Implement optimized memmove routine
arm64: lib: Implement optimized memcpy routine
arm64: defconfig: enable a few more common/useful options in defconfig
ftrace: Make CALLER_ADDRx macros more generic
arm64: Fix deadlock scenario with smp_send_stop()
arm64: Fix machine_shutdown() definition
arm64: Support arch_irq_work_raise() via self IPIs
...
In order to allow KVM to run on Cortex-A53 implementations, wire the
minimal support required.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
In order to ensure completion of inner-shareable maintenance instructions
(cache and TLB) on AArch64, we can use the -ish suffix to the dsb
instruction.
This patch relaxes our dsb sy instructions to dsb ish where possible.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When calling our low-level barrier macros directly, we can often suffice
with more relaxed behaviour than the default "all accesses, full system"
option.
This patch updates the users of dsb() to specify the option which they
actually require.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently, the kvm_psci_call() returns 'true' or 'false' based on whether
the PSCI function call was handled successfully or not. This does not help
us emulate system-level PSCI functions where the actual emulation work will
be done by user space (QEMU or KVMTOOL). Examples of such system-level PSCI
functions are: PSCI v0.2 SYSTEM_OFF and SYSTEM_RESET.
This patch updates kvm_psci_call() to return three types of values:
1) > 0 (success)
2) = 0 (success but exit to user space)
3) < 0 (errors)
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Pull kvm updates from Paolo Bonzini:
"PPC and ARM do not have much going on this time. Most of the cool
stuff, instead, is in s390 and (after a few releases) x86.
ARM has some caching fixes and PPC has transactional memory support in
guests. MIPS has some fixes, with more probably coming in 3.16 as
QEMU will soon get support for MIPS KVM.
For x86 there are optimizations for debug registers, which trigger on
some Windows games, and other important fixes for Windows guests. We
now expose to the guest Broadwell instruction set extensions and also
Intel MPX. There's also a fix/workaround for OS X guests, nested
virtualization features (preemption timer), and a couple kvmclock
refinements.
For s390, the main news is asynchronous page faults, together with
improvements to IRQs (floating irqs and adapter irqs) that speed up
virtio devices"
* tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (96 commits)
KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8
KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offset
KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode
KVM: PPC: Book3S HV: Return ENODEV error rather than EIO
KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code
KVM: PPC: Book3S HV: Add get/set_one_reg for new TM state
KVM: PPC: Book3S HV: Add transactional memory support
KVM: Specify byte order for KVM_EXIT_MMIO
KVM: vmx: fix MPX detection
KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=n
KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCE
KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd write
KVM: s390: clear local interrupts at cpu initial reset
KVM: s390: Fix possible memory leak in SIGP functions
KVM: s390: fix calculation of idle_mask array size
KVM: s390: randomize sca address
KVM: ioapic: reinject pending interrupts on KVM_SET_IRQCHIP
KVM: Bump KVM_MAX_IRQ_ROUTES for s390
KVM: s390: irq routing for adapter interrupts.
KVM: s390: adapter interrupt sources
...
- PCI I/O space extended to 16M (in preparation of PCIe support patches)
- Dropping ZONE_DMA32 in favour of ZONE_DMA (we only need one for the
time being), together with swiotlb late initialisation to correctly
setup the bounce buffer
- DMA API cache maintenance support (not all ARMv8 platforms have
hardware cache coherency)
- Crypto extensions advertising via ELF_HWCAP2 for compat user space
- Perf support for dwarf unwinding in compat mode
- asm/tlb.h converted to the generic mmu_gather code
- asm-generic rwsem implementation
- Code clean-up
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iQIcBAABAgAGBQJTOaqsAAoJEGvWsS0AyF7xYNUP/3/IPySIB+/6pyUG6q7kvIpF
Di93M+VdmnLEOKhhx/tjkiEmEQMp0hFPeOlQRWf/Ugg4ksulP6gRejdDEjIfkmsk
LrRXLjvH79NDJbN0pTUXqGDvLLZ9Qnib+HEOuKABIYUrwhNKySBk+5omGfXFtwLR
Mb5JxPX0kbBXOqbOX4RgANQoRlE8GxJR3V245zlGxA4klcN4IiaDy/99kj+kaeaa
Cl8X9K2I550IZ2YUAWPOut2aee2qRFQtAhIDgVthTYlGRx7Y/rDLM16B8fFY/T0H
7azIpSO5hk5lp8J3giJHYajlJlXNla5FeHQb8XAVnlyqFBmCUn0vvd2VbPvWREJp
UD8t1vZZt/s2he6CVAQIfQghwLyzrpPa19KbnyI+3HtsZ+NS/puBJmcVKZ2PBY/L
28BsRzB7BKAPEVhNmyPwFHNdZTvjaqYUCLhQ0uTp1sSHMcLeSs7+vyMR99f/0u9E
doSYAeF41ZkxHXL5xEevdj4sFkCEY1XFxER1Y8VM1rqHTeGEoeYbdS/u9tEeBgit
jBelvHAlNTBgbur2nW4E9fQpAF2CsvWnRq6lSmDRTkyjzcLUQqA8bsQJ3aUyJtZt
j17kUIzSH1q7x3zAaWQcvMVeawdkv2+HanjuTOdeO2ehvyG71vvxA3RkCv8o5Jhh
da+jAMhkpYQxk8mSKkWm
=8+cB
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull ARM64 updates from Catalin Marinas:
- KGDB support for arm64
- PCI I/O space extended to 16M (in preparation of PCIe support
patches)
- Dropping ZONE_DMA32 in favour of ZONE_DMA (we only need one for the
time being), together with swiotlb late initialisation to correctly
setup the bounce buffer
- DMA API cache maintenance support (not all ARMv8 platforms have
hardware cache coherency)
- Crypto extensions advertising via ELF_HWCAP2 for compat user space
- Perf support for dwarf unwinding in compat mode
- asm/tlb.h converted to the generic mmu_gather code
- asm-generic rwsem implementation
- Code clean-up
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits)
arm64: Remove pgprot_dmacoherent()
arm64: Support DMA_ATTR_WRITE_COMBINE
arm64: Implement custom mmap functions for dma mapping
arm64: Fix __range_ok macro
arm64: Fix duplicated Kconfig entries
arm64: mm: Route pmd thp functions through pte equivalents
arm64: rwsem: use asm-generic rwsem implementation
asm-generic: rwsem: de-PPCify rwsem.h
arm64: enable generic CPU feature modalias matching for this architecture
arm64: smp: make local symbol static
arm64: debug: make local symbols static
ARM64: perf: support dwarf unwinding in compat mode
ARM64: perf: add support for frame pointer unwinding in compat mode
ARM64: perf: add support for perf registers API
arm64: Add boot time configuration of Intermediate Physical Address size
arm64: Do not synchronise I and D caches for special ptes
arm64: Make DMA coherent and strongly ordered mappings not executable
arm64: barriers: add dmb barrier
arm64: topology: Implement basic CPU topology support
arm64: advertise ARMv8 extensions to 32-bit compat ELF binaries
...
ARMv8 supports a range of physical address bit sizes. The PARange bits
from ID_AA64MMFR0_EL1 register are read during boot-time and the
intermediate physical address size bits are written in the translation
control registers (TCR_EL1 and VTCR_EL2).
There is no change in the VA bits and levels of translation.
Signed-off-by: Radha Mohan Chintakuntla <rchintakuntla@cavium.com>
Reviewed-by: Will Deacon <Will.deacon@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When the guest runs with caches disabled (like in an early boot
sequence, for example), all the writes are diectly going to RAM,
bypassing the caches altogether.
Once the MMU and caches are enabled, whatever sits in the cache
becomes suddenly visible, which isn't what the guest expects.
A way to avoid this potential disaster is to invalidate the cache
when the MMU is being turned on. For this, we hook into the SCTLR_EL1
trapping code, and scan the stage-2 page tables, invalidating the
pages/sections that have already been mapped in.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
In order to be able to detect the point where the guest enables
its MMU and caches, trap all the VM related system registers.
Once we see the guest enabling both the MMU and the caches, we
can go back to a saner mode of operation, which is to leave these
registers in complete control of the guest.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
The current handling of AArch32 trapping is slightly less than
perfect, as it is not possible (from a handler point of view)
to distinguish it from an AArch64 access, nor to tell a 32bit
from a 64bit access either.
Fix this by introducing two additional flags:
- is_aarch32: true if the access was made in AArch32 mode
- is_32bit: true if is_aarch32 == true and a MCR/MRC instruction
was used to perform the access (as opposed to MCRR/MRRC).
This allows a handler to cover all the possible conditions in which
a system register gets trapped.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Commit 1fcf7ce0c6 (arm: kvm: implement CPU PM notifier) added
support for CPU power-management, using a cpu_notifier to re-init
KVM on a CPU that entered CPU idle.
The code assumed that a CPU entering idle would actually be powered
off, loosing its state entierely, and would then need to be
reinitialized. It turns out that this is not always the case, and
some HW performs CPU PM without actually killing the core. In this
case, we try to reinitialize KVM while it is still live. It ends up
badly, as reported by Andre Przywara (using a Calxeda Midway):
[ 3.663897] Kernel panic - not syncing: unexpected prefetch abort in Hyp mode at: 0x685760
[ 3.663897] unexpected data abort in Hyp mode at: 0xc067d150
[ 3.663897] unexpected HVC/SVC trap in Hyp mode at: 0xc0901dd0
The trick here is to detect if we've been through a full re-init or
not by looking at HVBAR (VBAR_EL2 on arm64). This involves
implementing the backend for __hyp_get_vectors in the main KVM HYP
code (rather small), and checking the return value against the
default one when the CPU notifier is called on CPU_PM_EXIT.
Reported-by: Andre Przywara <osp@andrep.de>
Tested-by: Andre Przywara <osp@andrep.de>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Rob Herring <rob.herring@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nothing major here, just bugfixes all over the place. The most
interesting part is the ARM guys' virtualized interrupt controller
overhaul, which lets userspace get/set the state and thus enables
migration of ARM VMs.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJS3TVKAAoJEBvWZb6bTYbyIFgP/2cmt4ifCuFMaZv4+G1S8jZU
uC9ZB/+7vzht/p6zAy+4BxurKbHmSBFkC1OKcxYuy7yB4CQkHabzj4V2vRtqFdwH
5lExP9qh3kqaVLuhnvxLTmkktR3EW4PFy6OI53l5kRNktOXSuZ0aN6K3V7tCg/X0
iL7ASo4bJKlxeWcDpmuVrNgAajmZVfXrjKY7robgBQno+yIsgKhRZRBQHjozA6B8
FpCo/k48RZd/EzIbV/PDDRI4hmmry/lgrO9SKjzq56wSqff2bd/k/KYze4dbAPfd
Ps60enPTuHmeEjjb4MMMU4EKHVdTQFUMx/xZCmT4xzoh8s4of6RHphXbfE0SUznQ
dTveyEQAR7E3JNS0k1+3WEX5fWlFesp0hO2NeE0wzUq4TAr9ztgVO9NQ6Si15e7Z
2HysO0T5Ojtt0lY08/PvS6i48eCAuuBomrejJS8hLW4SUZ5adn+yW4Qo7Fp9JeBR
l9a3LsVT8BZMtUWrUuFcVhlM4MbzElUPjDbgWhR8UYU/kpfVZOQu8qWgGKR4UWXy
X7/t9l/tjR99CmfMJBAOzJid+ScSpAfg77BdaKiQrVfVIJmsjEjlO8vUMyj5b1HF
hPX5wNyJjHAOfridLeHSs4Rdm4a8sk8Az5d4h76pLVz8M4jyTi2v0rO3N4/dU/pu
x7N8KR5hAj+mLBoM9/Al
=8sYU
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"First round of KVM updates for 3.14; PPC parts will come next week.
Nothing major here, just bugfixes all over the place. The most
interesting part is the ARM guys' virtualized interrupt controller
overhaul, which lets userspace get/set the state and thus enables
migration of ARM VMs"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits)
kvm: make KVM_MMU_AUDIT help text more readable
KVM: s390: Fix memory access error detection
KVM: nVMX: Update guest activity state field on L2 exits
KVM: nVMX: Fix nested_run_pending on activity state HLT
KVM: nVMX: Clean up handling of VMX-related MSRs
KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject
KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
KVM: nVMX: Leave VMX mode on clearing of feature control MSR
KVM: VMX: Fix DR6 update on #DB exception
KVM: SVM: Fix reading of DR6
KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS
add support for Hyper-V reference time counter
KVM: remove useless write to vcpu->hv_clock.tsc_timestamp
KVM: x86: fix tsc catchup issue with tsc scaling
KVM: x86: limit PIT timer frequency
KVM: x86: handle invalid root_hpa everywhere
kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub
kvm: vfio: silence GCC warning
KVM: ARM: Remove duplicate include
arm/arm64: KVM: relax the requirements of VMA alignment for THP
...
The SMC-based PSCI emulation for Guest is going to be very different
from the in-kernel HVC-based PSCI emulation hence for now just inject
undefined exception when Guest executes SMC instruction.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: marc Zyngier <marc.zyngier@arm.com>
This patch allows us to have X-Gene guest VCPU when using KVM arm64
on APM X-Gene host.
We add KVM_ARM_TARGET_XGENE_POTENZA for X-Gene Potenza compatible
guest VCPU and we return KVM_ARM_TARGET_XGENE_POTENZA in kvm_target_cpu()
when running on X-Gene host with Potenza core.
[maz: sanitized the commit log]
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Current max VCPUs per-Guest is set to 4 which is preventing
us from creating a Guest (or VM) with 8 VCPUs on Host (e.g.
X-Gene Storm SOC) with 8 Host CPUs.
The correct value of max VCPUs per-Guest should be same as
the max CPUs supported by GICv2 which is 8 but, increasing
value of max VCPUs per-Guest can make things slower hence
we add Kconfig option to let KVM users select appropriate
max VCPUs per-Guest.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Correct spelling typo in various part of kernel
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
side: the HV and emulation flavors can now coexist in a single kernel
is probably the most interesting change from a user point of view.
On the x86 side there are nested virtualization improvements and a
few bugfixes. ARM got transparent huge page support, improved
overcommit, and support for big endian guests.
Finally, there is a new interface to connect KVM with VFIO. This
helps with devices that use NoSnoop PCI transactions, letting the
driver in the guest execute WBINVD instructions. This includes
some nVidia cards on Windows, that fail to start without these
patches and the corresponding userspace changes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJShPAhAAoJEBvWZb6bTYbyl48P/297GgmELHAGBgjvb6q7yyGu
L8+eHjKbh4XBAkPwyzbvUjuww5z2hM0N3JQ0BDV9oeXlO+zwwCEns/sg2Q5/NJXq
XxnTeShaKnp9lqVBnE6G9rAOUWKoyLJ2wItlvUL8JlaO9xJ0Vmk0ta4n2Nv5GqDp
db6UD7vju6rHtIAhNpvvAO51kAOwc01xxRixCVb7KUYOnmO9nvpixzoI/S0Rp1gu
w/OWMfCosDzBoT+cOe79Yx1OKcpaVW94X6CH1s+ShCw3wcbCL2f13Ka8/E3FIcuq
vkZaLBxio7vjUAHRjPObw0XBW4InXEbhI1DjzIvm8dmc4VsgmtLQkTCG8fj+jINc
dlHQUq6Do+1F4zy6WMBUj8tNeP1Z9DsABp98rQwR8+BwHoQpGQBpAxW0TE0ZMngC
t1caqyvjZ5pPpFUxSrAV+8Kg4AvobXPYOim0vqV7Qea07KhFcBXLCfF7BWdwq/Jc
0CAOlsLL4mHGIQWZJuVGw0YGP7oATDCyewlBuDObx+szYCoV4fQGZVBEL0KwJx/1
7lrLN7JWzRyw6xTgJ5VVwgYE1tUY4IFQcHu7/5N+dw8/xg9KWA3f4PeMavIKSf+R
qteewbtmQsxUnvuQIBHLs8NRWPnBPy+F3Sc2ckeOLIe4pmfTte6shtTXcLDL+LqH
NTmT/cfmYp2BRkiCfCiS
=rWNf
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM changes from Paolo Bonzini:
"Here are the 3.13 KVM changes. There was a lot of work on the PPC
side: the HV and emulation flavors can now coexist in a single kernel
is probably the most interesting change from a user point of view.
On the x86 side there are nested virtualization improvements and a few
bugfixes.
ARM got transparent huge page support, improved overcommit, and
support for big endian guests.
Finally, there is a new interface to connect KVM with VFIO. This
helps with devices that use NoSnoop PCI transactions, letting the
driver in the guest execute WBINVD instructions. This includes some
nVidia cards on Windows, that fail to start without these patches and
the corresponding userspace changes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits)
kvm, vmx: Fix lazy FPU on nested guest
arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu
arm/arm64: KVM: MMIO support for BE guest
kvm, cpuid: Fix sparse warning
kvm: Delete prototype for non-existent function kvm_check_iopl
kvm: Delete prototype for non-existent function complete_pio
hung_task: add method to reset detector
pvclock: detect watchdog reset at pvclock read
kvm: optimize out smp_mb after srcu_read_unlock
srcu: API for barrier after srcu read unlock
KVM: remove vm mmap method
KVM: IOMMU: hva align mapping page size
KVM: x86: trace cpuid emulation when called from emulator
KVM: emulator: cleanup decode_register_operand() a bit
KVM: emulator: check rex prefix inside decode_register()
KVM: x86: fix emulation of "movzbl %bpl, %eax"
kvm_host: typo fix
KVM: x86: emulate SAHF instruction
MAINTAINERS: add tree for kvm.git
Documentation/kvm: add a 00-INDEX file
...
- A couple a basic fixes for running BE guests on a LE host
- A performance improvement for overcommitted VMs (same as the equivalent
patch for ARM)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABAgAGBQJSfLSVAAoJECPQ0LrRPXpDMG0QAJrTocErN2BQMDoT9DpcQhh6
yoD6KjbS3O4lWz60wJ0BgJ6gcQFg7JiFrPk6JcyT+ykXYf1UuLymUhAkU7Sw+0lP
GVt7sr2SaaQd6ZjGphWyWPXuDbvN1CxyIi7TD4CNe0tTYwSI6Vaf19h2Bkjd+VfJ
o2Sf2zHz4mutTCmPJuqnI255MLveTyQr/VZT1xNS79FiJM3/j3+UxCEi1fwgTkkb
4l3AyW9RN2mmTsS4VE6an2iosCi9pqoAC3y88vnaeBpUHIf/O2O57sT0+8o7jM6z
6uZVesMsKNmDtkMUFRQyj4Mps3yIVcccDpFJr4UcZH+ipbM+5nPY8AlYcKqk+4KY
T7Zys1hITq7xSEK4HiQt+AJnXDXZF5YZnzqUqVQHZFBn5P1GfB4/Bo9E3+QG68oq
AO1ry6SiRRPmTAZVeYqV82DX1YSjbvghvvPXhtPvolNhyzJJooBvhpWfGthVeZds
tuazKvvwDv0pFEWSwiFvWyqGW4FHKz3vWSUfuR1MF2P86fIfT5buJ9/StMxiRRH8
tSwoV3Ksut2kX9l5o8MZqmv1UkH88hwxxxd2J2aHGU+XPLlbQaYSWSK0d6hKTLMZ
ErZmCK/BARdll9UiTdXZ8h7UjfVDfuhTVeW1PvKQJ7sLGCDeT0VfhcrXnTtr41Je
iLWeDea0NJBzQxRZoJHG
=s0ny
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm64/for-3.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into kvm-next
A handful of fixes for KVM/arm64:
- A couple a basic fixes for running BE guests on a LE host
- A performance improvement for overcommitted VMs (same as the equivalent
patch for ARM)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Conflicts:
arch/arm/include/asm/kvm_emulate.h
arch/arm64/include/asm/kvm_emulate.h
Ensure that accesses to the GICH_* registers are byteswapped
when the kernel is compiled as big-endian.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Force SCTLR_EL2.EE to 1 if the kernel is compiled as BE.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
On an (even slightly) oversubscribed system, spinlocks are quickly
becoming a bottleneck, as some vcpus are spinning, waiting for a
lock to be released, while the vcpu holding the lock may not be
running at all.
The solution is to trap blocking WFEs and tell KVM that we're
now spinning. This ensures that other vpus will get a scheduling
boost, allowing the lock to be released more quickly. Also, using
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
when the VM is severely overcommited.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This patch implements kvm_vcpu_preferred_target() function for
KVM ARM64 which will help us implement KVM_ARM_PREFERRED_TARGET
ioctl for user space.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
When performing a Stage-2 TLB invalidation, it is necessary to
make sure the write to the page tables is observable by all CPUs.
For this purpose, add dsb instructions to __kvm_tlb_flush_vmid_ipa
and __kvm_flush_vm_context before doing the TLB invalidation itself.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Not saving PAR_EL1 is an unfortunate oversight. If the guest
performs an AT* operation and gets scheduled out before reading
the result of the translation from PAREL1, it could become
corrupted by another guest or the host.
Saving this register is made slightly more complicated as KVM also
uses it on the permission fault handling path, leading to an ugly
"stash and restore" sequence. Fortunately, this is already a slow
path so we don't really care. Also, Linux doesn't do any AT*
operation, so Linux guests are not impacted by this bug.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>