kernel-ark

Author	SHA1	Message	Date
Dave Jones	f93576e1ac	xen/pvh: Fix misplaced kfree from xlated_setup_gnttab_pages Passing a freed 'pages' to free_xenballooned_pages will end badly on kernels with slub debug enabled. This looks out of place between the rc assign and the check, but we do want to kfree pages regardless of which path we take. Signed-off-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-31 09:48:58 -05:00
Bob Liu	bc1b0df59e	drivers: xen: deaggressive selfballoon driver Current xen-selfballoon driver is too aggressive which may cause OOM be triggered more often. Eg. this bug reported by James: https://lkml.org/lkml/2013/11/21/158 There are two mainly reasons: 1) The original goal_page didn't consider some pages used by kernel space, like slab pages and pages used by device drivers. 2) The balloon driver may not give back memory to guest OS fast enough when the workload suddenly aquries a lot of physical memory. In both cases, the guest OS will suffer from memory pressure and OOM may be triggered. The fix is make xen-selfballoon driver not that aggressive by adding extra 10% of total ram pages to goal_page. It's more valuable to keep the guest system reliable and response faster than balloon out these 10% pages to XEN. Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-31 09:48:43 -05:00
Zoltan Kiss	08ece5bb23	xen/grant-table: Avoid m2p_override during mapping The grant mapping API does m2p_override unnecessarily: only gntdev needs it, for blkback and future netback patches it just cause a lock contention, as those pages never go to userspace. Therefore this series does the following: - the original functions were renamed to __gnttab_[un]map_refs, with a new parameter m2p_override - based on m2p_override either they follow the original behaviour, or just set the private flag and call set_phys_to_machine - gnttab_[un]map_refs are now a wrapper to call __gnttab_[un]map_refs with m2p_override false - a new function gnttab_[un]map_refs_userspace provides the old behaviour It also removes a stray space from page.h and change ret to 0 if XENFEAT_auto_translated_physmap, as that is the only possible return value there. v2: - move the storing of the old mfn in page->index to gnttab_map_refs - move the function header update to a separate patch v3: - a new approach to retain old behaviour where it needed - squash the patches into one v4: - move out the common bits from m2p* functions, and pass pfn/mfn as parameter - clear page->private before doing anything with the page, so m2p_find_override won't race with this v5: - change return value handling in __gnttab_[un]map_refs - remove a stray space in page.h - add detail why ret = 0 now at some places v6: - don't pass pfn to m2p* functions, just get it locally Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Suggested-by: David Vrabel <david.vrabel@citrix.com> Acked-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-31 09:48:32 -05:00
Julien Grall	47c542050d	xen/gnttab: Use phys_addr_t to describe the grant frame base address On ARM, address size can be 32 bits or 64 bits (if CONFIG_ARCH_PHYS_ADDR_T_64BIT is enabled). We can't assume that the grant frame base address will always fits in an unsigned long. Use phys_addr_t instead of unsigned long as argument for gnttab_setup_auto_xlat_frames. Signed-off-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com>	2014-01-30 12:56:34 +00:00
Ian Campbell	e17b2f114c	xen: swiotlb: handle sizeof(dma_addr_t) != sizeof(phys_addr_t) The use of phys_to_machine and machine_to_phys in the phys<=>bus conversions causes us to lose the top bits of the DMA address if the size of a DMA address is not the same as the size of the phyiscal address. This can happen in practice on ARM where foreign pages can be above 4GB even though the local kernel does not have LPAE page tables enabled (which is totally reasonable if the guest does not itself have >4GB of RAM). In this case the kernel still maps the foreign pages at a phys addr below 4G (as it must) but the resulting DMA address (returned by the grant map operation) is much higher. This is analogous to a hardware device which has its view of RAM mapped up high for some reason. This patch makes I/O to foreign pages (specifically blkif) work on 32-bit ARM systems with more than 4GB of RAM. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-30 12:54:20 +00:00
Julien Grall	8b271d57b5	arm/xen: Initialize event channels earlier Event channels driver needs to be initialized very early. Until now, Xen initialization was done after all CPUs was bring up. We can safely move the initialization to an early initcall. Also use a cpu notifier to: - Register the VCPU when the CPU is prepared - Enable event channel IRQ when the CPU is running Signed-off-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-30 12:52:59 +00:00
Roger Pau Monne	c9f6e9977e	xen/pvh: Set X86_CR0_WP and others in CR0 (v2) otherwise we will get for some user-space applications that use 'clone' with CLONE_CHILD_SETTID \| CLONE_CHILD_CLEARTID end up hitting an assert in glibc manifested by: general protection ip:7f80720d364c sp:7fff98fd8a80 error:0 in libc-2.13.so[7f807209e000+180000] This is due to the nature of said operations which sets and clears the PID. "In the successful one I can see that the page table of the parent process has been updated successfully to use a different physical page, so the write of the tid on that page only affects the child... On the other hand, in the failed case, the write seems to happen before the copy of the original page is done, so both the parent and the child end up with the same value (because the parent copies the page after the write of the child tid has already happened)." (Roger's analysis). The nature of this is due to the Xen's commit of 51e2cac257ec8b4080d89f0855c498cbbd76a5e5 "x86/pvh: set only minimal cr0 and cr4 flags in order to use paging" the CR0_WP was removed so COW features of the Linux kernel were not operating properly. While doing that also update the rest of the CR0 flags to be inline with what a baremetal Linux kernel would set them to. In 'secondary_startup_64' (baremetal Linux) sets: X86_CR0_PE \| X86_CR0_MP \| X86_CR0_ET \| X86_CR0_NE \| X86_CR0_WP \| X86_CR0_AM \| X86_CR0_PG The hypervisor for HVM type guests (which PVH is a bit) sets: X86_CR0_PE \| X86_CR0_ET \| X86_CR0_TS For PVH it specifically sets: X86_CR0_PG Which means we need to set the rest: X86_CR0_MP \| X86_CR0_NE \| X86_CR0_WP \| X86_CR0_AM to have full parity. Signed-off-by: Roger Pau Monne <roger.pau@citrix.com> Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [v1: Took out the cr4 writes to be a seperate patch] [v2: 0-DAY kernel found xen_setup_gdt to be missing a static]	2014-01-21 13:26:05 -05:00
David Vrabel	ea70ba3ab9	MAINTAINERS: add git repository for Xen Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-20 10:17:55 -05:00
Konrad Rzeszutek Wilk	54d44eb3c7	xen/pvh: Use 'depend' instead of 'select'. The usage of 'select' means it will enable the CONFIG options without checking their dependencies. That meant we would inadvertently turn on CONFIG_XEN_PVHM while its core dependency (CONFIG_PCI) was turned off. This patch fixes the warnings and compile failures: warning: (XEN_PVH) selects XEN_PVHVM which has unmet direct dependencies (HYPERVISOR_GUEST && XEN && PCI && X86_LOCAL_APIC) Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-10 10:45:35 -05:00
Paul Gortmaker	0db6991dd2	xen: delete new instances of __cpuinit usage Commit `1fe565517b` ("xen/events: use the FIFO-based ABI if available") added new instances of __cpuinit macro usage. We removed this a couple versions ago; we now want to remove the compat no-op stubs. Introducing new users is not what we want to see at this point in time, as it will break once the stubs are gone. Cc: David Vrabel <david.vrabel@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-10 10:44:43 -05:00
Stefano Stabellini	b1a3b1c8a8	xen/fb: allow xenfb initialization for hvm guests There is no reasons why an HVM guest shouldn't be allowed to use xenfb. As a matter of fact ARM guests, HVM from Linux POV, can use xenfb. Given that no Xen toolstacks configure a xenfb backend for x86 HVM guests, they are not affected. Please note that at this time QEMU needs few outstanding fixes to provide xenfb on ARM: http://marc.info/?l=qemu-devel&m=138739419700837&w=2 Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: David Vrabel <david.vrabel@citrix.com> CC: boris.ostrovsky@oracle.com CC: plagnioj@jcrosoft.com CC: tomi.valkeinen@ti.com CC: linux-fbdev@vger.kernel.org CC: konrad.wilk@oracle.com Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Tomi Valkeinen <tomi.valkeinen@ti.com>	2014-01-07 09:59:54 -05:00
Wei Yongjun	be1403b9e6	xen/evtchn_fifo: fix error return code in evtchn_fifo_setup() Fix to return -ENOMEM from the error handling case instead of 0 (overwrited to 0 by the HYPERVISOR_event_channel_op call), otherwise the error condition cann't be reflected from the return value. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com>	2014-01-07 09:59:52 -05:00
Wei Yongjun	89b9e08f18	xen-platform: fix error return code in platform_pci_init() Fix to return a negative error code from the error handling case instead of 0, otherwise the error condition cann't be reflected from the return value. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-07 09:59:51 -05:00
Wei Yongjun	5602aba808	xen/pvh: remove duplicated include from enlighten.c Remove duplicated include. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-07 09:59:49 -05:00
Konrad Rzeszutek Wilk	0869a64232	xen/pvh: Fix compile issues with xen_pvh_domain() Oddly enough it compiles for my ancient compiler but with the supplied .config it does blow up. Fix is easy enough. Reported-by: kbuild test robot <fengguang.wu@intel.com> Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-07 09:59:28 -05:00
Yijing Wang	89c3cf52c7	xen: Use dev_is_pci() to check whether it is pci device Use PCI standard marco dev_is_pci() instead of directly compare pci_bus_type to check whether it is pci device. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jan Beulich <JBeulich@suse.com>	2014-01-07 09:53:33 -05:00
Konrad Rzeszutek Wilk	11c7ff17c9	xen/grant-table: Force to use v1 of grants. We have the framework to use v2, but there are no backends that actually use it. The end result is that on PV we use v2 grants and on PVHVM v1. The v1 has a capacity of 512 grants per page while the v2 has 256 grants per page. This means we lose about 50% capacity - and if we want more than 16 VIFs (each VIF takes 512 grants), then we are hitting the max per guest of 32. Oracle-bug: 16039922 CC: annie.li@oracle.com CC: msw@amazon.com Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com>	2014-01-06 10:44:39 -05:00
Mukesh Rathor	4e903a20da	xen/pvh: Support ParaVirtualized Hardware extensions (v3). PVH allows PV linux guest to utilize hardware extended capabilities, such as running MMU updates in a HVM container. The Xen side defines PVH as (from docs/misc/pvh-readme.txt, with modifications): "* the guest uses auto translate: - p2m is managed by Xen - pagetables are owned by the guest - mmu_update hypercall not available * it uses event callback and not vlapic emulation, * IDT is native, so set_trap_table hcall is also N/A for a PVH guest. For a full list of hcalls supported for PVH, see pvh_hypercall64_table in arch/x86/hvm/hvm.c in xen. From the ABI prespective, it's mostly a PV guest with auto translate, although it does use hvm_op for setting callback vector." Use .ascii and .asciz to define xen feature string. Note, the PVH string must be in a single line (not multiple lines with \) to keep the assembler from putting null char after each string before \. This patch allows it to be configured and enabled. We also use introduce the 'XEN_ELFNOTE_SUPPORTED_FEATURES' ELF note to tell the hypervisor that 'hvm_callback_vector' is what the kernel needs. We can not put it in 'XEN_ELFNOTE_FEATURES' as older hypervisor parse fields they don't understand as errors and refuse to load the kernel. This work-around fixes the problem. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:44:24 -05:00
Mukesh Rathor	be3e9cf330	xen/pvh: Piggyback on PVHVM XenBus. PVH is a PV guest with a twist - there are certain things that work in it like HVM and some like PV. For the XenBus mechanism we want to use the PVHVM mechanism. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:44:23 -05:00
Konrad Rzeszutek Wilk	6926f6d610	xen/pvh: Piggyback on PVHVM for grant driver (v4) In PVH the shared grant frame is the PFN and not MFN, hence its mapped via the same code path as HVM. The allocation of the grant frame is done differently - we do not use the early platform-pci driver and have an ioremap area - instead we use balloon memory and stitch all of the non-contingous pages in a virtualized area. That means when we call the hypervisor to replace the GMFN with a XENMAPSPACE_grant_table type, we need to lookup the old PFN for every iteration instead of assuming a flat contingous PFN allocation. Lastly, we only use v1 for grants. This is because PVHVM is not able to use v2 due to no XENMEM_add_to_physmap calls on the error status page (see commit `69e8f430e2` xen/granttable: Disable grant v2 for HVM domains.) Until that is implemented this workaround has to be in place. Also per suggestions by Stefano utilize the PVHVM paths as they share common functionality. v2 of this patch moves most of the PVH code out in the arch/x86/xen/grant-table driver and touches only minimally the generic driver. v3, v4: fixes us some of the code due to earlier patches. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:44:21 -05:00
Konrad Rzeszutek Wilk	efaf30a335	xen/grant: Implement an grant frame array struct (v3). The 'xen_hvm_resume_frames' used to be an 'unsigned long' and contain the virtual address of the grants. That was OK for most architectures (PVHVM, ARM) were the grants are contiguous in memory. That however is not the case for PVH - in which case we will have to do a lookup for each virtual address for the PFN. Instead of doing that, lets make it a structure which will contain the array of PFNs, the virtual address and the count of said PFNs. Also provide a generic functions: gnttab_setup_auto_xlat_frames and gnttab_free_auto_xlat_frames to populate said structure with appropriate values for PVHVM and ARM. To round it off, change the name from 'xen_hvm_resume_frames' to a more descriptive one - 'xen_auto_xlat_grant_frames'. For PVH, in patch "xen/pvh: Piggyback on PVHVM for grant driver" we will populate the 'xen_auto_xlat_grant_frames' by ourselves. v2 moves the xen_remap in the gnttab_setup_auto_xlat_frames and also introduces xen_unmap for gnttab_free_auto_xlat_frames. Suggested-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [v3: Based on top of 'asm/xen/page.h: remove redundant semicolon'] Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:44:20 -05:00
Konrad Rzeszutek Wilk	456847533b	xen/grant-table: Refactor gnttab_init We have this odd scenario of where for PV paths we take a shortcut but for the HVM paths we first ioremap xen_hvm_resume_frames, then assign it to gnttab_shared.addr. This is needed because gnttab_map uses gnttab_shared.addr. Instead of having: if (pv) return gnttab_map if (hvm) ... gnttab_map Lets move the HVM part before the gnttab_map and remove the first call to gnttab_map. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:44:18 -05:00
Konrad Rzeszutek Wilk	7f256020cc	xen/grants: Remove gnttab_max_grant_frames dependency on gnttab_init. The function gnttab_max_grant_frames() returns the maximum amount of frames (pages) of grants we can have. Unfortunatly it was dependent on gnttab_init() having been run before to initialize the boot max value (boot_max_nr_grant_frames). This meant that users of gnttab_max_grant_frames would always get a zero value if they called before gnttab_init() - such as 'platform_pci_init' (drivers/xen/platform-pci.c). Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:44:16 -05:00
Mukesh Rathor	2771374d47	xen/pvh: Piggyback on PVHVM for event channels (v2) PVH is a PV guest with a twist - there are certain things that work in it like HVM and some like PV. There is a similar mode - PVHVM where we run in HVM mode with PV code enabled - and this patch explores that. The most notable PV interfaces are the XenBus and event channels. We will piggyback on how the event channel mechanism is used in PVHVM - that is we want the normal native IRQ mechanism and we will install a vector (hvm callback) for which we will call the event channel mechanism. This means that from a pvops perspective, we can use native_irq_ops instead of the Xen PV specific. Albeit in the future we could support pirq_eoi_map. But that is a feature request that can be shared with PVHVM. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:44:15 -05:00
Mukesh Rathor	9103bb0f82	xen/pvh: Update E820 to work with PVH (v2) In xen_add_extra_mem() we can skip updating P2M as it's managed by Xen. PVH maps the entire IO space, but only RAM pages need to be repopulated. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:44:13 -05:00
Mukesh Rathor	5840c84b16	xen/pvh: Secondary VCPU bringup (non-bootup CPUs) The VCPU bringup protocol follows the PV with certain twists. From xen/include/public/arch-x86/xen.h: Also note that when calling DOMCTL_setvcpucontext and VCPU_initialise for HVM and PVH guests, not all information in this structure is updated: - For HVM guests, the structures read include: fpu_ctxt (if VGCT_I387_VALID is set), flags, user_regs, debugreg[*] - PVH guests are the same as HVM guests, but additionally use ctrlreg[3] to set cr3. All other fields not used should be set to 0. This is what we do. We piggyback on the 'xen_setup_gdt' - but modify a bit - we need to call 'load_percpu_segment' so that 'switch_to_new_gdt' can load per-cpu data-structures. It has no effect on the VCPU0. We also piggyback on the %rdi register to pass in the CPU number - so that when we bootup a new CPU, the cpu_bringup_and_idle will have passed as the first parameter the CPU number (via %rdi for 64-bit). Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-06 10:44:12 -05:00
Mukesh Rathor	8d656bbe43	xen/pvh: Load GDT/GS in early PV bootup code for BSP. During early bootup we start life using the Xen provided GDT, which means that we are running with %cs segment set to FLAT_KERNEL_CS (FLAT_RING3_CS64 0xe033, GDT index 261). But for PVH we want to be use HVM type mechanism for segment operations. As such we need to switch to the HVM one and also reload ourselves with the __KERNEL_CS:eip to run in the proper GDT and segment. For HVM this is usually done in 'secondary_startup_64' in (head_64.S) but since we are not taking that bootup path (we start in PV - xen_start_kernel) we need to do that in the early PV bootup paths. For good measure we also zero out the %fs, %ds, and %es (not strictly needed as Xen has already cleared them for us). The %gs is loaded by 'switch_to_new_gdt'. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com>	2014-01-06 10:44:10 -05:00
Mukesh Rathor	4dd322bc3b	xen/pvh: Setup up shared_info. For PVHVM the shared_info structure is provided via the same way as for normal PV guests (see include/xen/interface/xen.h). That is during bootup we get 'xen_start_info' via the %esi register in startup_xen. Then later we extract the 'shared_info' from said structure (in xen_setup_shared_info) and start using it. The 'xen_setup_shared_info' is all setup to work with auto-xlat guests, but there are two functions which it calls that are not: xen_setup_mfn_list_list and xen_setup_vcpu_info_placement. This patch modifies the P2M code (xen_setup_mfn_list_list) while the "Piggyback on PVHVM for event channels" modifies the xen_setup_vcpu_info_placement. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-06 10:44:09 -05:00
Mukesh Rathor	76bcceff0b	xen/pvh/mmu: Use PV TLB instead of native. We also optimize one - the TLB flush. The native operation would needlessly IPI offline VCPUs causing extra wakeups. Using the Xen one avoids that and lets the hypervisor determine which VCPU needs the TLB flush. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-06 10:44:07 -05:00
Mukesh Rathor	4e44e44b0b	xen/pvh: MMU changes for PVH (v2) .. which are surprisingly small compared to the amount for PV code. PVH uses mostly native mmu ops, we leave the generic (native_*) for the majority and just overwrite the baremetal with the ones we need. At startup, we are running with pre-allocated page-tables courtesy of the tool-stack. But we still need to graft them in the Linux initial pagetables. However there is no need to unpin/pin and change them to R/O or R/W. Note that the xen_pagetable_init due to 7836fec9d0994cc9c9150c5a33f0eb0eb08a335a "xen/mmu/p2m: Refactor the xen_pagetable_init code." does not need any changes - we just need to make sure that xen_post_allocator_init does not alter the pvops from the default native one. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:44:05 -05:00
Konrad Rzeszutek Wilk	b621e157ba	xen/mmu: Cleanup xen_pagetable_p2m_copy a bit. Stefano noticed that the code runs only under 64-bit so the comments about 32-bit are pointless. Also we change the condition for xen_revector_p2m_tree returning the same value (because it could not allocate a swath of space to put the new P2M in) or it had been called once already. In such we return early from the function. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:44:04 -05:00
Konrad Rzeszutek Wilk	32df75cd14	xen/mmu/p2m: Refactor the xen_pagetable_init code (v2). The revectoring and copying of the P2M only happens when !auto-xlat and on 64-bit builds. It is not obvious from the code, so lets have seperate 32 and 64-bit functions. We also invert the check for auto-xlat to make the code flow simpler. Suggested-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-06 10:44:02 -05:00
Konrad Rzeszutek Wilk	696fd7c5b2	xen/pvh: Don't setup P2M tree. P2M is not available for PVH. Fortunatly for us the P2M code already has mostly the support for auto-xlat guest thanks to commit `3d24bbd7dd` "grant-table: call set_phys_to_machine after mapping grant refs" which: " introduces set_phys_to_machine calls for auto_translated guests (even on x86) in gnttab_map_refs and gnttab_unmap_refs. translated by swiotlb-xen... " so we don't need to muck much. with above mentioned "commit you'll get set_phys_to_machine calls from gnttab_map_refs and gnttab_unmap_refs but PVH guests won't do anything with them " (Stefano Stabellini) which is OK - we want them to be NOPs. This is because we assume that an "IOMMU is always present on the plaform and Xen is going to make the appropriate IOMMU pagetable changes in the hypercall implementation of GNTTABOP_map_grant_ref and GNTTABOP_unmap_grant_ref, then eveything should be transparent from PVH priviligied point of view and DMA transfers involving foreign pages keep working with no issues[sp] Otherwise we would need a P2M (and an M2P) for PVH priviligied to track these foreign pages .. (see arch/arm/xen/p2m.c)." (Stefano Stabellini). We still have to inhibit the building of the P2M tree. That had been done in the past by not calling xen_build_dynamic_phys_to_machine (which setups the P2M tree and gives us virtual address to access them). But we are missing a check for xen_build_mfn_list_list - which was continuing to setup the P2M tree and would blow up at trying to get the virtual address of p2m_missing (which would have been setup by xen_build_dynamic_phys_to_machine). Hence a check is needed to not call xen_build_mfn_list_list when running in auto-xlat mode. Instead of replicating the check for auto-xlat in enlighten.c do it in the p2m.c code. The reason is that the xen_build_mfn_list_list is called also in xen_arch_post_suspend without any checks for auto-xlat. So for PVH or PV with auto-xlat - we would needlessly allocate space for an P2M tree. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:44:01 -05:00
Mukesh Rathor	d285d68314	xen/pvh: Early bootup changes in PV code (v4). We don't use the filtering that 'xen_cpuid' is doing because the hypervisor treats 'XEN_EMULATE_PREFIX' as an invalid instruction. This means that all of the filtering will have to be done in the hypervisor/toolstack. Without the filtering we expose to the guest the: - cpu topology (sockets, cores, etc); - the APERF (which the generic scheduler likes to use), see `5e62625420` "xen/setup: filter APERFMPERF cpuid feature out" - and the inability to figure out whether MWAIT_LEAF should be exposed or not. See `df88b2d96e` "xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won't be loaded." - x2apic, see `4ea9b9aca9` "xen: mask x2APIC feature in PV" We also check for vector callback early on, as it is a required feature. PVH also runs at default kernel IOPL. Finally, pure PV settings are moved to a separate function that are only called for pure PV, ie, pv with pvmmu. They are also #ifdef with CONFIG_XEN_PVMMU. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:43:59 -05:00
Mukesh Rathor	ddc416cbc4	xen/pvh/x86: Define what an PVH guest is (v3). Which is a PV guest with auto page translation enabled and with vector callback. It is a cross between PVHVM and PV. The Xen side defines PVH as (from docs/misc/pvh-readme.txt, with modifications): "* the guest uses auto translate: - p2m is managed by Xen - pagetables are owned by the guest - mmu_update hypercall not available * it uses event callback and not vlapic emulation, * IDT is native, so set_trap_table hcall is also N/A for a PVH guest. For a full list of hcalls supported for PVH, see pvh_hypercall64_table in arch/x86/hvm/hvm.c in xen. From the ABI prespective, it's mostly a PV guest with auto translate, although it does use hvm_op for setting callback vector." Also we use the PV cpuid, albeit we can use the HVM (native) cpuid. However, we do have a fair bit of filtering in the xen_cpuid and we can piggyback on that until the hypervisor/toolstack filters the appropiate cpuids. Once that is done we can swap over to use the native one. We setup a Kconfig entry that is disabled by default and cannot be enabled. Note that on ARM the concept of PVH is non-existent. As Ian put it: "an ARM guest is neither PV nor HVM nor PVHVM. It's a bit like PVH but is different also (it's further towards the H end of the spectrum than even PVH).". As such these options (PVHVM, PVH) are never enabled nor seen on ARM compilations. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-01-06 10:43:58 -05:00
Mukesh Rathor	fc590efe66	xen/p2m: Check for auto-xlat when doing mfn_to_local_pfn. Most of the functions in page.h are prefaced with if (xen_feature(XENFEAT_auto_translated_physmap)) return mfn; Except the mfn_to_local_pfn. At a first sight, the function should work without this patch - as the 'mfn_to_mfn' has a similar check. But there are no such check in the 'get_phys_to_machine' function - so we would crash in there. This fixes it by following the convention of having the check for auto-xlat in these static functions. Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2014-01-06 10:43:56 -05:00
David Vrabel	1fe565517b	xen/events: use the FIFO-based ABI if available Implement all the event channel port ops for the FIFO-based ABI. If the hypervisor supports the FIFO-based ABI, enable it by initializing the control block for the boot VCPU and subsequent VCPUs as they are brought up and on resume. The event array is expanded as required when event ports are setup. The 'xen.fifo_events=0' command line option may be used to disable use of the FIFO-based ABI. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:57 -05:00
David Vrabel	8785c67663	xen/x86: set VIRQ_TIMER priority to maximum Commit `bee980d9e` (xen/events: Handle VIRQ_TIMER before any other hardirq in event loop) effectively made the VIRQ_TIMER the highest priority event when using the 2-level ABI. Set the VIRQ_TIMER priority to the highest so this behaviour is retained when using the FIFO-based ABI. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:55 -05:00
David Vrabel	6ccecb0fbc	xen/events: allow event channel priority to be set Add xen_irq_set_priority() to set an event channels priority. This function will only work with event channel ABIs that support priority (i.e., the FIFO-based ABI). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:54 -05:00
David Vrabel	bf2bbe07f1	xen/events: Add the hypervisor interface for the FIFO-based event channels Add the hypercall sub-ops and the structures for the shared data used in the FIFO-based event channel ABI. The design document for this new ABI is available here: http://xenbits.xen.org/people/dvrabel/event-channels-H.pdf In summary, events are reported using a per-domain shared event array of event words. Each event word has PENDING, LINKED and MASKED bits and a LINK field for pointing to the next event in the event queue. There are 16 event queues (with different priorities) per-VCPU. Key advantages of this new ABI include: - Support for over 100,000 events (2^17). - 16 different event priorities. - Improved fairness in event latency through the use of FIFOs. The ABI is available in Xen 4.4 and later. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:52 -05:00
David Vrabel	0dc0064add	xen/evtchn: support more than 4096 ports Remove the check during unbind for NR_EVENT_CHANNELS as this limits support to less than 4096 ports. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:50 -05:00
David Vrabel	fd21069dfe	xen/events: add xen_evtchn_mask_all() Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:49 -05:00
David Vrabel	d0b075ffee	xen/events: Refactor evtchn_to_irq array to be dynamically allocated Refactor static array evtchn_to_irq array to be dynamically allocated by implementing get and set functions for accesses to the array. Two new port ops are added: max_channels (maximum supported number of event channels) and nr_channels (number of currently usable event channels). For the 2-level ABI, these numbers are both the same as the shared data structure is a fixed size. For the FIFO ABI, these will be different as the event array is expanded dynamically. This allows more than 65000 event channels so an unsigned short is no longer sufficient for an event channel port number and unsigned int is used instead. Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:47 -05:00
David Vrabel	083858758f	xen/events: add a evtchn_op for port setup Add a hook for port-specific setup and call it from xen_irq_info_common_setup(). The FIFO-based ABIs may need to perform additional setup (expanding the event array) before a bound event channel can start to receive events. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:46 -05:00
David Vrabel	96d4c58818	xen/events: allow setup of irq_info to fail The FIFO-based event ABI requires additional setup of newly bound events (it may need to expand the event array) and this setup may fail. xen_irq_info_common_init() is a useful place to put this setup so allow this call to fail. This call and the other similar calls are renamed to be *_setup() to reflect that they may now fail. This failure can only occur with new event channels not on rebind. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:44 -05:00
David Vrabel	ab9a1cca3d	xen/events: add struct evtchn_ops for the low-level port operations evtchn_ops contains the low-level operations that access the shared data structures. This allows alternate ABIs to be supported. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:43 -05:00
David Vrabel	9a489f45a1	xen/events: move 2-level specific code into its own file In preparation for alternative event channel ABIs, move all the functions accessing the shared data structures into their own file. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:41 -05:00
David Vrabel	d2ba3166f2	xen/events: move drivers/xen/events.c into drivers/xen/events/ events.c will be split into multiple files so move it into its own directory. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:38 -05:00
Wei Liu	76ec8d64ce	xen/events: replace raw bit ops with functions In preparation for adding event channel port ops, use set_evtchn() instead of sync_set_bit(). Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:36 -05:00
Wei Liu	3f70fa8282	xen/events: introduce test_and_set_mask() In preparation for adding event channel port ops, add test_and_set_mask(). Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-01-06 10:07:35 -05:00

1 2 3 4 5 ...

414215 Commits