Register a panic notifier so that when the guest crashes it can shut
down the domain and indicate it was a crash to the host.
Signed-off-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
When vcpu info placement is supported, we're not limited to MAX_VIRT_CPUS
vcpus. However, if it isn't supported, then ignore any excess vcpus.
Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
xen_sched_clock only counts unstolen time. In principle this should
be useful to the Linux scheduler so that it knows how much time a process
actually consumed. But in practice this doesn't work very well as the
scheduler expects the sched_clock time to be synchronized between
cpus. It also uses sched_clock to measure the time a task spends
sleeping, in which case "unstolen time" isn't meaningful.
So just use plain xen_clocksource_read to return wallclock nanoseconds
for sched_clock.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Currently register_xenstore_notifier notifies the caller during the
registration itself if xenstore is believed to be ready. This behaviour
causes problems to PV on HVM guests, in which case callers should be
notified by xenbus_probe only after the platform pci driver is loaded.
We already make sure xenbus_probe is called at the right time, calling
it either from device_initcall (PV case) or from the platform pci
driver initialization (HVM case) so we don't need this additional
notification.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Scan the set of pages we're freeing and make sure they're actually
owned by the domain before freeing. This generally won't happen on a
domU (since Xen gives us contigious memory), but it could happen if
there are some hardware mappings passed through.
We only bother going up to the highest page Xen actually claimed to
give us, since there's definitely nothing of ours above that.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Scan an e820 table and release any memory which lies between e820 entries,
as it won't be used and would just be wasted. At present this is just to
release any memory beyond the end of the e820 map, but it will also deal
with holes being punched in the map.
Derived from patch by Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
rtnetlink: make SR-IOV VF interface symmetric
sctp: delete active ICMP proto unreachable timer when free transport
tcp: fix MD5 (RFC2385) support
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: Oprofile: Fix Loongson irq handler
MIPS: N32: Use compat version for sys_ppoll.
MIPS FPU emulator: allow Cause bits of FCSR to be writeable by ctc1
Now we have a set of nested attributes:
IFLA_VFINFO_LIST (NESTED)
IFLA_VF_INFO (NESTED)
IFLA_VF_MAC
IFLA_VF_VLAN
IFLA_VF_TX_RATE
This allows a single set to operate on multiple attributes if desired.
Among other things, it means a dump can be replayed to set state.
The current interface has yet to be released, so this seems like
something to consider for 2.6.34.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
transport may be free before ICMP proto unreachable timer expire, so
we should delete active ICMP proto unreachable timer when transport
is going away.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP MD5 support uses percpu data for temporary storage. It currently
disables preemption so that same storage cannot be reclaimed by another
thread on same cpu.
We also have to make sure a softirq handler wont try to use also same
context. Various bug reports demonstrated corruptions.
Fix is to disable preemption and BH.
Reported-by: Bhaskar Dutta <bhaskie@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The interrupt enable bit for the performance counters is in the Control
Register $24, not in the counter register.
loongson2_perfcount_handler(), we need to use
Reported-by: Xu Hengyang <hengyang@mail.ustc.edu.cn>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/1198/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
---
The sys_ppoll() takes struct 'struct timespec'. This is different for the
N32 and N64 ABIs. Use the compat version to do the proper conversions.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/1210/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
---
In the FPU emulator code of the MIPS, the Cause bits of the FCSR register
are not currently writeable by the ctc1 instruction. In odd corner cases,
this can cause problems. For example, a case existed where a divide-by-zero
exception was generated by the FPU, and the signal handler attempted to
restore the FPU registers to their state before the exception occurred. In
this particular setup, writing the old value to the FCSR register would
cause another divide-by-zero exception to occur immediately. The solution
is to change the ctc1 instruction emulator code to allow the Cause bits of
the FCSR register to be writeable. This is the behaviour of the hardware
that the code is emulating.
This problem was found by Shane McDonald, but the credit for the fix goes
to Kevin Kissell. In Kevin's words:
I submit that the bug is indeed in that ctc_op: case of the emulator. The
Cause bits (17:12) are supposed to be writable by that instruction, but the
CTC1 emulation won't let them be updated by the instruction. I think that
actually if you just completely removed lines 387-388 [...] things would
work a good deal better. At least, it would be a more accurate emulation of
the architecturally defined FPU. If I wanted to be really, really pedantic
(which I sometimes do), I'd also protect the reserved bits that aren't
necessarily writable.
Signed-off-by: Shane McDonald <mcdonald.shane@gmail.com>
To: anemo@mba.ocn.ne.jp
To: kevink@paralogos.com
To: sshtylyov@mvista.com
Patchwork: http://patchwork.linux-mips.org/patch/1205/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
---
As we were using an internal dma flushing routine, this patch changes to
the DMA API flush_kernel_dcache_page(). Driver is able to compile now.
[akpm@linux-foundation.org: flush_kernel_dcache_page() comes before kunmap_atomic()]
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
JFS: Free sbi memory in error path
fs/sysv: dereferencing ERR_PTR()
Fix double-free in logfs
Fix the regression created by "set S_DEAD on unlink()..." commit
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf record: Add a fallback to the reference relocation symbol
I spotted the missing kfree() while removing the BKL.
[akpm@linux-foundation.org: avoid multiple returns so it doesn't happen again]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
I moved the dir_put_page() inside the if condition so we don't dereference
"page", if it's an ERR_PTR().
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1) i_flags simply doesn't work for mount/unlink race prevention;
we may have many links to file and rm on one of those obviously
shouldn't prevent bind on top of another later on. To fix it
right way we need to mark _dentry_ as unsuitable for mounting
upon; new flag (DCACHE_CANT_MOUNT) is protected by d_flags and
i_mutex on the inode in question. Set it (with dont_mount(dentry))
in unlink/rmdir/etc., check (with cant_mount(dentry)) in places
in namespace.c that used to check for S_DEAD. Setting S_DEAD
is still needed in places where we used to set it (for directories
getting killed), since we rely on it for readdir/rmdir race
prevention.
2) rename()/mount() protection has another bogosity - we unhash
the target before we'd checked that it's not a mountpoint. Fixed.
3) ancient bogosity in pivot_root() - we locked i_mutex on the
right directory, but checked S_DEAD on the different (and wrong)
one. Noticed and fixed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 6126/1: ARM mpcore_wdt: fix build failure and other fixes
ARM: 6125/1: ARM TWD: move TWD registers to common header
ARM: 6110/1: Fix Thumb-2 kernel builds when UACCESS_WITH_MEMCPY is enabled
ARM: 6112/1: Use the Inner Shareable I-cache and BTB ops on ARMv7 SMP
ARM: 6111/1: Implement read/write for ownership in the ARMv6 DMA cache ops
ARM: 6106/1: Implement copy_to_user_page() for noMMU
ARM: 6105/1: Fix the __arm_ioremap_caller() definition in nommu.c
If the kernel is large or the profiling step small, /proc/profile
leaks data and readprofile shows silly stats, until readprofile -r
has reset the buffer: clear the prof_buffer when it is vmalloc()ed.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
My old address will shut down in a couple of weeks: update the tree.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Do not blindly access extended configuration space unless we actively
know we're on a Moorestown platform. The fixed-size BAR capability
lives in the extended configuration space, and thus is not applicable
if the configuration space isn't appropriately sized.
This fixes booting certain VMware configurations with CONFIG_MRST=y.
Moorestown will add a fake PCI-X 266 capability to advertise the
presence of extended configuration space.
Reported-and-tested-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Jacob Pan <jacob.jun.pan@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
LKML-Reference: <AANLkTiltKUa3TrKR1M51eGw8FLNoQJSLT0k0_K5X3-OJ@mail.gmail.com>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments
x86, k8: Fix build error when K8_NB is disabled
x86, amd: Check X86_FEATURE_OSVW bit before accessing OSVW MSRs
x86: Fix fake apicid to node mapping for numa emulation
K8_NB depends on PCI and when the last is disabled (allnoconfig) we fail
at the final linking stage due to missing exported num_k8_northbridges.
Add a header stub for that.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100503183036.GJ26107@aftab>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org>
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
inotify: don't leak user struct on inotify release
inotify: race use after free/double free in inotify inode marks
inotify: clean up the inotify_add_watch out path
Inotify: undefined reference to `anon_inode_getfd'
Manual merge to remove duplicate "select ANON_INODES" from Kconfig file
DA8xx OHCI driver fails to load due to failing clk_get() call for the USB 2.0
clock. Arrange matching USB 2.0 clock by the clock name instead of the device.
(Adding another CLK() entry for "ohci.0" device won't do -- in the future I'll
also have to enable USB 2.0 clock to configure CPPI 4.1 module, in which case
I won't have any device at all.)
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
inotify_new_group() receives a get_uid-ed user_struct and saves the
reference on group->inotify_data.user. The problem is that free_uid() is
never called on it.
Issue seem to be introduced by 63c882a0 (inotify: reimplement inotify
using fsnotify) after 2.6.30.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Eric Paris <eparis@parisplace.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Paris <eparis@redhat.com>
There is a race in the inotify add/rm watch code. A task can find and
remove a mark which doesn't have all of it's references. This can
result in a use after free/double free situation.
Task A Task B
------------ -----------
inotify_new_watch()
allocate a mark (refcnt == 1)
add it to the idr
inotify_rm_watch()
inotify_remove_from_idr()
fsnotify_put_mark()
refcnt hits 0, free
take reference because we are on idr
[at this point it is a use after free]
[time goes on]
refcnt may hit 0 again, double free
The fix is to take the reference BEFORE the object can be found in the
idr.
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: <stable@kernel.org>
inotify_add_watch explictly frees the unused inode mark, but it can just
use the generic code. Just do that.
Signed-off-by: Eric Paris <eparis@redhat.com>
* 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Fix module loading on system with WB cache
microblaze: export assembly functions used by modules
microblaze: Remove powerpc code from Microblaze port
microblaze: Remove compilation warnings in cache macro
microblaze: export assembly functions used by modules
microblaze: fix get_user/put_user side-effects
microblaze: re-enable interrupts before calling schedule
Redirecting directly to lsm, here's the patch discussed on lkml:
http://lkml.org/lkml/2010/4/22/219
The mmap_min_addr value is useful information for an admin to see without
being root ("is my system vulnerable to kernel NULL pointer attacks?") and
its setting is trivially easy for an attacker to determine by calling
mmap() in PAGE_SIZE increments starting at 0, so trying to keep it private
has no value.
Only require CAP_SYS_RAWIO if changing the value, not reading it.
Comment from Serge :
Me, I like to write my passwords with light blue pen on dark blue
paper, pasted on my window - if you're going to get my password, you're
gonna get a headache.
Signed-off-by: Kees Cook <kees.cook@canonical.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
(cherry picked from commit 822cceec72)
If host CPU is exposed to a guest the OSVW MSRs are not guaranteed
to be present and a GP fault occurs. Thus checking the feature flag is
essential.
Cc: <stable@kernel.org> # .32.x .33.x
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20100427101348.GC4489@alberich.amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* 'kvm-updates/2.6.34' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: PPC: Keep index within boundaries in kvmppc_44x_emul_tlbwe()
KVM: VMX: blocked-by-sti must not defer NMI injections
KVM: x86: Call vcpu_load and vcpu_put in cpuid_update
KVM: SVM: Fix wrong intercept masks on 32 bit
KVM: convert ioapic lock to spinlock
The imx CTS trigger level is left at its reset value that is 32
chars. Since the RX FIFO has 32 entries, when CTS is raised, the
FIFO already is full. However, some serial port devices first empty
their TX FIFO before stopping when CTS is raised, resulting in lost
chars.
This patch sets the trigger level lower so that other chars arrive
after CTS is raised, there is still room for 16 of them.
Signed-off-by: Valentin Longchamp<valentin.longchamp@epfl.ch>
Tested-by: Philippe Rétornaz<philippe.retornaz@epfl.ch>
Acked-by: Wolfram Sang<w.sang@pengutronix.de>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>