kernel-ark

Author	SHA1	Message	Date
Christian Borntraeger	8a88ac6183	s390: KVM preparation: address of the 64bit extint parm in lowcore The address 0x11b8 is used by z/VM for pfault and diag 250 I/O to provide a 64 bit extint parameter. virtio uses the same address, so it's time to update the lowcore structure. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:41 +03:00
Christian Borntraeger	5b7baf0578	s390: KVM preparation: host memory management changes for s390 kvm This patch changes the s390 memory management definitions to use the pgste field for dirty and reference bit tracking of host and guest code. Usually on s390, dirty and referenced are tracked in storage keys, which belong to the physical page. This changes with virtualization: The guest and host dirty/reference bits are defined to be the logical OR of the values for the mapping and the physical page. This patch implements the necessary changes in pgtable.h for s390. There is a common code change in mm/rmap.c, the call to page_test_and_clear_young must be moved. This is a no-op for all architectures but s390. page_referenced checks the referenced bits for the physical page and for all mappings: o The physical page is checked with page_test_and_clear_young. o The mappings are checked with ptep_test_and_clear_young and friends. Without pgstes (the current implementation on Linux s390) the physical page check is implemented but the mapping callbacks are no-ops because dirty and referenced are not tracked in the s390 page tables. The pgstes introduce guest and host dirty and reference bits for s390 in the host mapping. These mappings must be checked before page_test_and_clear_young resets the reference bit. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:40 +03:00
Carsten Otte	402b08622d	s390: KVM preparation: provide hook to enable pgstes in user pagetable The SIE instruction on s390 uses the 2nd half of the page table page to virtualize the storage keys of a guest. This patch offers the s390_enable_sie function, which reorganizes the page tables of a single-threaded process to reserve space in the page table: s390_enable_sie makes sure that the process is single threaded and then uses dup_mm to create a new mm with reorganized page tables. The old mm is freed and the process has now a page status extended field after every page table. Code that wants to exploit pgstes should SELECT CONFIG_PGSTE. This patch has a small common code hit, namely making dup_mm non-static. Edit (Carsten): I've modified Martin's patch, following Jeremy Fitzhardinge's review feedback. Now we do have the prototype for dup_mm in include/linux/sched.h. Following Martin's suggestion, s390_enable_sie() does now call task_lock() to prevent race against ptrace modification of mm_users. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:40 +03:00
Izik Eidus	37817f2982	KVM: x86: hardware task switching support This emulates the x86 hardware task switch mechanism in software, as it is unsupported by either vmx or svm. It allows operating systems which use it, like freedos, to run as kvm guests. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:39 +03:00
Izik Eidus	2e4d265349	KVM: x86: add functions to get the cpl of vcpu Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:38 +03:00
Avi Kivity	4c9fc8ef50	KVM: VMX: Add module option to disable flexpriority Useful for debugging. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:37 +03:00
Avi Kivity	268fe02ae0	KVM: no longer EXPERIMENTAL Long overdue. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:36 +03:00
Avi Kivity	0b49ea8659	KVM: MMU: Introduce and use spte_to_page() Encapsulate the pte mask'n'shift in a function. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:35 +03:00
Izik Eidus	855149aaa9	KVM: MMU: fix dirty bit setting when removing write permissions When mmu_set_spte() checks whether a page related to an spte should be released as dirty or clean, it checks whether the shadow pte was writable. But if rmap_write_protect() is called, shadow ptes that were writable can become read-only, and mmu_set_spte() will therefore release the pages as clean. This patch fixes the issue by marking the page as dirty inside rmap_write_protect(). Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:34 +03:00
Avi Kivity	69a9f69bb2	KVM: Move some x86 specific constants and structures to include/asm-x86 Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:34 +03:00
Avi Kivity	947da53830	KVM: MMU: Set the accessed bit on non-speculative shadow ptes If we populate a shadow pte due to a fault (and not speculatively due to a pte write) then we can set the accessed bit on it, as we know it will be set immediately on the next guest instruction. This saves a read-modify-write operation. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:33 +03:00
Christian Borntraeger	97646202bc	KVM: kvm.h: __user requires compiler.h include/linux/kvm.h defines struct kvm_dirty_log to [...] union { void __user *dirty_bitmap; /* one bit per page */ __u64 padding; }; __user requires compiler.h to compile. Currently, this works on x86 only coincidentally due to other include files. This patch makes kvm.h compile in all cases. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:32 +03:00
Glauber Costa	1e977aa12d	x86: KVM guest: disable clock before rebooting. This patch writes 0 (actually, what really matters is that the LSB is cleared) to the system time msr before shutting down the machine for kexec. Without it, we can have a random memory location being written when the guest comes back. It overrides the functions shutdown, used in the path of kernel_kexec() (sys.c), and crash_shutdown, used in the path of crash_kexec() (kexec.c). Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:31 +03:00
Glauber Costa	3c62c62502	x86: make native_machine_shutdown non-static it will allow external users to call it. It is mainly useful for routines that will override its machine_ops field for its own special purposes, but want to call the normal shutdown routine after they're done Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:30 +03:00
Glauber Costa	ed23dc6f5b	x86: allow machine_crash_shutdown to be replaced This patch allows machine_crash_shutdown to be replaced, just like any of the other functions in machine_ops Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:29 +03:00
Marcelo Tosatti	096d14a3b5	x86: KVM guest: hypercall batching Batch pte updates and tlb flushes in lazy MMU mode. [avi: - adjust to mmu_op - helper for getting para_state without debug warnings] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:28 +03:00
Marcelo Tosatti	1da8a77bdc	x86: KVM guest: hypercall based pte updates and TLB flushes Hypercall based pte updates are faster than faults, and also allow use of the lazy MMU mode to batch operations. Don't report the feature if two dimensional paging is enabled. [avi: - guest/host split - fix 32-bit truncation issues - adjust to mmu_op - adjust to ->release_*() renamed - add ->release_pud()] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:28 +03:00
Marcelo Tosatti	2f333bcb4e	KVM: MMU: hypercall based pte updates and TLB flushes Hypercall based pte updates are faster than faults, and also allow use of the lazy MMU mode to batch operations. Don't report the feature if two dimensional paging is enabled. [avi: - one mmu_op hypercall instead of one per op - allow 64-bit gpa on hypercall - don't pass host errors (-ENOMEM) to guest] [akpm: warning fix on i386] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:27 +03:00
Avi Kivity	9f81128591	KVM: Provide unlocked version of emulator_write_phys() Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:26 +03:00
Marcelo Tosatti	0cf1bfd273	x86: KVM guest: add basic paravirt support Add basic KVM paravirt support. Avoid vm-exits on IO delays. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:25 +03:00
Marcelo Tosatti	a28e4f5a62	KVM: add basic paravirt support Add basic KVM paravirt support. Avoid vm-exits on IO delays. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:24 +03:00
Sheng Yang	308b0f239e	KVM: Add reset support for in kernel PIT Separate the reset part and prepare for reset support. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:23 +03:00
Sheng Yang	e0f63cb927	KVM: Add save/restore supporting of in kernel PIT Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:22 +03:00
Sheng Yang	7837699fa6	KVM: In kernel PIT model The patch moves the PIT model from userspace to kernel, and increases the timer accuracy greatly. [marcelo: make last_injected_time per-guest] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Tested-and-Acked-by: Alex Davis <alex14641@yahoo.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:21 +03:00
Avi Kivity	4fcaa98267	KVM: Remove pointless desc_ptr #ifdef The desc_struct changes left an unnecessary #ifdef; remove it. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:27 +03:00
Avi Kivity	019960ae99	KVM: VMX: Don't adjust tsc offset forward Most Intel hosts have a stable tsc, and playing with the offset only reduces accuracy. By limiting tsc offset adjustment only to forward updates, we effectively disable tsc offset adjustment on these hosts. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:27 +03:00
Harvey Harrison	b8688d51bb	KVM: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:27 +03:00
Joerg Roedel	71c4dfafc0	KVM: detect if VCPU triple faults In the current inject_page_fault path KVM only checks if there is another PF pending and injects a DF then. But it has to check for a pending DF too to detect a shutdown condition in the VCPU. If this is not detected the VCPU goes to a PF -> DF -> PF loop when it should triple fault. This patch detects this condition and handles it with an KVM_SHUTDOWN exit to userspace. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:27 +03:00
Xiantao Zhang	3e4bb3ac9e	KVM: Use kzalloc to avoid allocating kvm_regs from kernel stack Since the size of kvm_regs is too big to allocate from kernel stack on ia64, use kzalloc to allocate it. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:26 +03:00
Avi Kivity	2d3ad1f40c	KVM: Prefix control register accessors with kvm_ to avoid namespace pollution Names like 'set_cr3()' look dangerously close to affecting the host. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:26 +03:00
Marcelo Tosatti	05da45583d	KVM: MMU: large page support Create large pages mappings if the guest PTE's are marked as such and the underlying memory is hugetlbfs backed. If the largepage contains write-protected pages, a large pte is not used. Gives a consistent 2% improvement for data copies on ram mounted filesystem, without NPT/EPT. Anthony measures a 4% improvement on 4-way kernbench, with NPT. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:25 +03:00
Marcelo Tosatti	2e53d63acb	KVM: MMU: ignore zapped root pagetables Mark zapped root pagetables as invalid and ignore such pages during lookup. This is a problem with the cr3-target feature, where a zapped root table fools the faulting code into creating a read-only mapping. The result is a lockup if the instruction can't be emulated. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:25 +03:00
Alexander Graf	847f0ad8cb	KVM: Implement dummy values for MSR_PERF_STATUS Darwin relies on this and ceases to work without. Signed-off-by: Alexander Graf <alex@csgraf.de> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:25 +03:00
Harvey Harrison	14af3f3c56	KVM: sparse fixes for kvm/x86.c In two case statements, use the ever popular 'i' instead of index: arch/x86/kvm/x86.c:1063:7: warning: symbol 'index' shadows an earlier one arch/x86/kvm/x86.c:1000:9: originally declared here arch/x86/kvm/x86.c:1079:7: warning: symbol 'index' shadows an earlier one arch/x86/kvm/x86.c:1000:9: originally declared here Make it static. arch/x86/kvm/x86.c:1945:24: warning: symbol 'emulate_ops' was not declared. Should it be static? Drop the return statements. arch/x86/kvm/x86.c:2878:2: warning: returning void-valued expression arch/x86/kvm/x86.c:2944:2: warning: returning void-valued expression Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:24 +03:00
Harvey Harrison	4866d5e3d5	KVM: SVM: make iopm_base static Fixes sparse warning as well. arch/x86/kvm/svm.c:69:15: warning: symbol 'iopm_base' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:24 +03:00
Harvey Harrison	77cd337f22	KVM: x86 emulator: fix sparse warnings in x86_emulate.c Nesting __emulate_2op_nobyte inside __emulate_2op produces many shadowed variable warnings on the internal variable _tmp used by both macros. Change the outer macro to use __tmp. Avoids a sparse warning like the following at every call site of __emulate_2op: arch/x86/kvm/x86_emulate.c:1091:3: warning: symbol '_tmp' shadows an earlier one arch/x86/kvm/x86_emulate.c:1091:3: originally declared here [18 more warnings suppressed] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:24 +03:00
Amit Shah	f11c3a8d84	KVM: Add stat counter for hypercalls Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:24 +03:00
Avi Kivity	a5f61300c4	KVM: Use x86's segment descriptor struct instead of private definition The x86 desc_struct unification allows us to remove segment_descriptor.h. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:24 +03:00
Avi Kivity	ef2979bd98	KVM: Increase the number of user memory slots per vm Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:23 +03:00
Avi Kivity	a988b910ef	KVM: Add API for determining the number of supported memory slots Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:23 +03:00
Avi Kivity	edbe6c325d	KVM: Increase vcpu count to 16 With NPT support, scalability is much improved. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:23 +03:00
Avi Kivity	f725230af9	KVM: Add API to retrieve the number of supported vcpus per vm Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:23 +03:00
Harvey Harrison	7a95727567	KVM: x86 emulator: make register_address_increment and JMP_REL static inlines Change jmp_rel() to a function as well. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:23 +03:00
Harvey Harrison	e4706772ea	KVM: x86 emulator: make register_address, address_mask static inlines Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:22 +03:00
Harvey Harrison	ddcb2885e2	KVM: x86 emulator: add ad_mask static inline Replaces open-coded mask calculation in macros. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:22 +03:00
Glauber de Oliveira Costa	790c73f628	x86: KVM guest: paravirtualized clocksource This is the guest part of kvm clock implementation It does not do tsc-only timing, as tsc can have deltas between cpus, and it did not seem worthy to me to keep adjusting them. We do use it, however, for fine-grained adjustment. Other than that, time comes from the host. [randy dunlap: add missing include] [randy dunlap: disallow on Voyager or Visual WS] Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:22 +03:00
Glauber de Oliveira Costa	18068523d3	KVM: paravirtualized clocksource: host part This is the host part of kvm clocksource implementation. As it does not include clockevents, it is a fairly simple implementation. We only have to register a per-vcpu area, and start writing to it periodically. The area is binary compatible with xen, as we use the same shadow_info structure. [marcelo: fix bad_page on MSR_KVM_SYSTEM_TIME] [avi: save full value of the msr, even if enable bit is clear] [avi: clear previous value of time_page] Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:22 +03:00
Joerg Roedel	24e09cbf48	KVM: SVM: enable LBR virtualization This patch implements the Last Branch Record Virtualization (LBRV) feature of the AMD Barcelona and Phenom processors into the kvm-amd module. It will only be enabled if the guest enables last branch recording in the DEBUG_CTL MSR. So there is no increased world switch overhead when the guest doesn't use these MSRs. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:21 +03:00
Joerg Roedel	f65c229c3e	KVM: SVM: allocate the MSR permission map per VCPU This patch changes the kvm-amd module to allocate the SVM MSR permission map per VCPU instead of a global map for all VCPUs. With this we have more flexibility allowing specific guests to access virtualized MSRs. This is required for LBR virtualization. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:21 +03:00
Joerg Roedel	e6101a96c9	KVM: SVM: let init_vmcb() take struct vcpu_svm as parameter Change the parameter of the init_vmcb() function in the kvm-amd module from struct vmcb to struct vcpu_svm. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:21 +03:00


94102 Commits