The Host used to create some page tables for the Guest to use at the
top of Guest memory; it would then tell the Guest where this was. In
particular, it created linear mappings for 0 and 0xC0000000 addresses
because lguest used to switch to its real page tables quite late in
boot.
However, since d50d8fe19 Linux initialized boot page tables in
head_32.S even before the "are we lguest?" boot jump. So, now we can
simplify things: the Host pagetable code assumes 1:1 linear mapping
until it first calls the LHCALL_NEW_PGTABLE hypercall, which we now do
before we reach C code.
This also means that the Host doesn't need to know anything about the
Guest's PAGE_OFFSET. (Non-Linux guests might not even have such a
thing).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I've been doing this for years, and akpm picked me up on it about 12
months ago. lguest partly serves as example code, so let's do it Right.
Also, remove two unused fields in struct vblk_info in the example launcher.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>
I don't really notice it (except to begrudge the extra vertical
space), but Ingo does. And he pointed out that one excuse of lguest
is as a teaching tool, it should set a good example.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>
lguest never checked for pending interrupts when enabling interrupts, and
things still worked. However, it makes a significant difference to TCP
performance, so it's time we fixed it by introducing a pending_irq flag
and checking it on irq_restore and irq_enable.
These two routines are now too big to patch into the 8/10 bytes
patch space, so we drop that code.
Note: The high latency on interrupt delivery had a very curious
effect: once everything else was optimized, networking without GSO was
faster than networking with GSO, since more interrupts were sent and
hence a greater chance of one getting through to the Guest!
Note2: (Almost) Closing the same loophole for iret doesn't have any
measurable effect, so I'm leaving that patch for the moment.
Before:
1GB tcpblast Guest->Host: 30.7 seconds
1GB tcpblast Guest->Host (no GSO): 76.0 seconds
After:
1GB tcpblast Guest->Host: 6.8 seconds
1GB tcpblast Guest->Host (no GSO): 27.8 seconds
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Went through the documentation doing typo and content fixes. This
patch contains only comment and whitespace changes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1) This allows us to get alot closer to booting bzImages.
2) It means we don't have to know page_offset.
3) The Guest needs to modify the boot pagetables to create the
PAGE_OFFSET mapping before jumping to C code.
4) guest_pa() walks the page tables rather than using page_offset.
5) We don't use page_offset to figure out whether to emulate: it was
always kinda quesationable, and won't work for instructions done
before remapping (bzImage unpacking in particular).
6) We still want the kernel address for tlb flushing: have the initial
hypercall give us that, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Based on Ron Minnich's LGUEST_PLAN9_SYSCALL patch).
This patch allows Guests to specify what system call vector they want,
and we try to reserve it. We only allow one non-Linux system call
vector, to try to avoid DoS on the Host.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Clean up the hypercall code to make the code in hypercalls.c
architecture independent. First process the common hypercalls and
then call lguest_arch_do_hcall() if the call hasn't been handled.
Rename struct hcall_ring to hcall_args.
This patch requires the previous patch which reorganize the layout of
struct lguest_regs on i386 so they match the layout of struct
hcall_args.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Back when we had all the Guest state in the switcher, we had a fixed
array of them. This is no longer necessary.
If we switch the network code to using random_ether_addr (46 bits is
enough to avoid clashes), we can get rid of the concept of "guest id"
altogether.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Move architecture specific portion of lg_hcall code to asm-i386/lg_hcall.h
and have it included from linux/lguest.h.
[Changed to asm-i386/lguest_hcall.h so documentation finds it -RR]
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jes Sorensen <jes@sgi.com>
A non-periodic clock_event_device and the "jiffies" clock don't mix well:
tick_handle_periodic() can go into an infinite loop.
Currently lguest guests use the jiffies clock when the TSC is
unusable. Instead, make the Host write the current time into the lguest
page on every interrupt. This doesn't cost much but is more precise
and at least as accurate as the jiffies clock. It also gets rid of
the GET_WALLCLOCK hypercall.
Also, delay setting sched_clock until our clock is set up, otherwise
the early printk timestamps can go backwards (not harmful, just ugly).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Documentation: The Guest
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the code for the "lg.ko" module, which allows lguest guests to
be launched.
[akpm@linux-foundation.org: update for futex-new-private-futexes]
[akpm@linux-foundation.org: build fix]
[jmorris@namei.org: lguest: use hrtimers]
[akpm@linux-foundation.org: x86_64 build fix]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
lguest is a simple hypervisor for Linux on Linux. Unlike kvm it doesn't need
VT/SVM hardware. Unlike Xen it's simply "modprobe and go". Unlike both, it's
5000 lines and self-contained.
Performance is ok, but not great (-30% on kernel compile). But given its
hackability, I expect this to improve, along with the paravirt_ops code which
it supplies a complete example for. There's also a 64-bit version being
worked on and other craziness.
But most of all, lguest is awesome fun! Too much of the kernel is a big ball
of hair. lguest is simple enough to dive into and hack, plus has some warts
which scream "fork me!".
This patch:
This is the code and headers required to make an i386 kernel an lguest guest.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>