* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] IOSAPIC bogus error cleanup
[IA64] Update printing of feature set bits
[IA64] Fix IOSAPIC delivery mode setting
[IA64] XPC heartbeat timer function must run on CPU 0
[IA64] Clean up /proc/interrupts output
[IA64] Disable/re-enable CPE interrupts on Altix
[IA64] Clean-up McKinley Errata message
[IA64] Add gate.lds to list of files ignored by Git
[IA64] Fix section mismatch in contig.c version of per_cpu_init()
[IA64] Wrong args to memset in efi_gettimeofday()
[IA64] Remove duplicate includes from ia32priv.h
[IA64] fix number of bytes zeroed by sys_fw_init() in arch/ia64/hp/sim/boot/fw-emu.c
[IA64] Fix perfmon sysctl directory modes
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6: (26 commits)
sh: remove dead config symbols from SH code
sh: Kill off broken snapgear ds1302 code.
sh: Add a dummy vga.h.
rtc: rtc-sh: Zero out tm value for invalid rtc states.
rtc: sh-rtc: Handle rtc_device_register() failure properly.
sh: Fix heartbeart on Solution Engine series
sh: Remove SCI_NPORTS from sh-sci.h
sh: Fix up PAGE_KERNEL_PCC() for nommu.
sh: hs7751rvoip: Kill off dead IPR IRQ mappings.
sh: hs7751rvoip: irq.c needs linux/interrupt.h.
sh: Kill off __{copy,clear}_user_page().
sh: Optimized copy_{to,from}_user_page() for SH-4.
sh: Wire up clear_user_highpage().
sh: Kill off the remaining ST40 cruft.
superhyway: Handle device_register() retval properly.
sh: kgdb sysrq depends on magic sysrq.
sh: Add -Werror for clean directories.
sh: Fix up kgdb build with modular sh-sci.
sh: Export __{s,u}divsi3_i4i on all CPUs.
sh: Fix up kgdb-on-NMI branch target.
...
When a share is mounted using no username, cifs_mount sets
volume_info.username as a NULL pointer, and the sesInfo userName as an
empty string. The volume_info.username is passed to a couple of other
functions to see if there is an existing unc or tcp connection that can
be used. These functions assume that the username will be a valid
string that can be passed to strncmp. If the pointer is NULL, then the
kernel will oops if there's an existing session to which the string
can be compared.
This patch changes cifs_mount to set volume_info.username to an empty
string in this situation, which prevents the oops and should make it
so that the comparison to other null auth sessions match.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (37 commits)
[POWERPC] EEH: Make sure warning message is printed
[POWERPC] Make altivec code in swsusp_32.S depend on CONFIG_ALTIVEC
[POWERPC] windfarm: Fix windfarm thread freezer interaction
[POWERPC] Fix si_addr value on low level hash failures
[POWERPC] Refresh ppc64_defconfig and enable pasemi-related options
[POWERPC] pasemi: Update defconfig
[POWERPC] iSeries: Fix ref counting in vio setup
[POWERPC] ] Fix memset size error
[POWERPC] Fix link errors for allyesconfig
[POWERPC] iSeries_init_IRQ non-PCI tidy
[POWERPC] Change fallocate to match unistd.h on powerpc
[POWERPC] EEH: Avoid crash on null device
[POWERPC] EEH: Drivers that need reset trump others
[POWERPC] EEH: Clean up comments
[POWERPC] Fix off-by-one error in setting decrementer on Book E/4xx (v2)
[POWERPC] Fix switch_slb handling of 1T ESID values
[POWERPC] Fix build failure when CONFIG_VIRT_CPU_ACCOUNTING is not defined
[POWERPC] Include udbg.h when using udbg_printf
[POWERPC] Fix cache line vs. block size confusion
[POWERPC] Fix sysctl table check failure on PowerMac
...
The old NO_IRQ define some platforms had was long ago declared obsolete
and wrong. FRV should therefore not be re-introducing this, especially as
IRQs are usually unsigned in the kernel. The "no IRQ" case is defined to be
zero and Linus made this rather clear at the time.
arch/frv shows no dependancy on this but it might show up driver fixes
needing doing I guess
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: Use "is_power_of_2" macro for simplicity.
[SPARC]: Remove duplicate includes.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
SELinux: add more validity checks on policy load
SELinux: fix bug in new ebitmap code.
SELinux: suppress a warning for 64k pages.
Remove the section annotation on FRV's free_initmem(). It can't be marked
__init, lest it free itself.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds a proper prototype for migration_init() in
include/linux/sched.h
Since there's no point in always returning 0 to a caller that doesn't check
the return value it also changes the function to return void.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
SMP balancing is done with IRQs disabled and can iterate the full rq.
When rqs are large this can cause large irq-latencies. Limit the nr of
iterations on each run.
This fixes a scheduling latency regression reported by the -rt folks.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Sukadev Bhattiprolu reported a kernel crash with control groups.
There are couple of problems discovered by Suka's test:
- The test requires the cgroup filesystem to be mounted with
atleast the cpu and ns options (i.e both namespace and cpu
controllers are active in the same hierarchy).
# mkdir /dev/cpuctl
# mount -t cgroup -ocpu,ns none cpuctl
(or simply)
# mount -t cgroup none cpuctl -> Will activate all controllers
in same hierarchy.
- The test invokes clone() with CLONE_NEWNS set. This causes a a new child
to be created, also a new group (do_fork->copy_namespaces->ns_cgroup_clone->
cgroup_clone) and the child is attached to the new group (cgroup_clone->
attach_task->sched_move_task). At this point in time, the child's scheduler
related fields are uninitialized (including its on_rq field, which it has
inherited from parent). As a result sched_move_task thinks its on
runqueue, when it isn't.
As a solution to this problem, I moved sched_fork() call, which
initializes scheduler related fields on a new task, before
copy_namespaces(). I am not sure though whether moving up will
cause other side-effects. Do you see any issue?
- The second problem exposed by this test is that task_new_fair()
assumes that parent and child will be part of the same group (which
needn't be as this test shows). As a result, cfs_rq->curr can be NULL
for the child.
The solution is to test for curr pointer being NULL in
task_new_fair().
With the patch below, I could run ns_exec() fine w/o a crash.
Reported-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
clean up the preemption check to not use unnecessary 64-bit
variables. This improves code size:
text data bss dec hex filename
44227 3326 36 47589 b9e5 sched.o.before
44201 3326 36 47563 b9cb sched.o.after
Signed-off-by: Ingo Molnar <mingo@elte.hu>
wakeup preemption fix: do not make it dependent on p->prio.
Preemption purely depends on ->vruntime.
This improves preemption in mixed-nice-level workloads.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
remove PREEMPT_RESTRICT. (this is a separate commit so that any
regression related to the removal itself is bisectable)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
PREEMPT_RESTRICT was a method aimed at reducing the amount of wakeup
related preemption. It has a disadvantage though, it can prevent
legitimate wakeups if a task is 'unlucky' to be hit too early by a tick
that clears peer_preempt.
Now that the wakeup preemption has been cleaned up we dont seem to have
excessive preemptions anymore, so this feature can be turned off. (and
removed in the next patch)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix a !SMP build error:
drivers/kvm/kvm_main.c: In function 'kvm_flush_remote_tlbs':
drivers/kvm/kvm_main.c:220: error: implicit declaration of function 'smp_call_function_mask'
(and also avoid unused function warning related to up_smp_call_function()
not making use of the 'func' parameter.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
prepare for up_smp_call_function() to ensure that the 'func'
pointer is unused. (which is related to a KVM build fix)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
1) hardcoded 1000000000 value is used five times in places where
NSEC_PER_SEC might be more readable.
2) A conversion from nsec to msec uses the hardcoded 1000000 value,
which is a candidate for NSEC_PER_MSEC.
no code changed:
text data bss dec hex filename
44359 3326 36 47721 ba69 sched.o.before
44359 3326 36 47721 ba69 sched.o.after
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yanmin Zhang reported an aim7 regression and bisected it down to:
| commit 38ad464d41
| Author: Ingo Molnar <mingo@elte.hu>
| Date: Mon Oct 15 17:00:02 2007 +0200
|
| sched: uniform tunings
|
| use the same defaults on both UP and SMP.
fix this by reintroducing similar SMP tunings again. This resolves
the regression.
(also update the comments to match the ilog2(nr_cpus) tuning effect)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the
deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been
broken on powerpc, because we end up counting user time twice: once in
timer_interrupt() and once in update_process_times().
This fixes the problem by pulling the code in update_process_times
that updates utime and stime into a separate function called
account_process_tick. If CONFIG_VIRT_CPU_ACCOUNTING is not defined,
there is a version of account_process_tick in kernel/timer.c that
simply accounts a whole tick to either utime or stime as before. If
CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to
implement account_process_tick.
This also lets us simplify the s390 code a bit; it means that the s390
timer interrupt can now call update_process_times even when
CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a
suitable account_process_tick().
account_process_tick() now takes the task_struct * as an argument.
Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix the delay accounting regression introduced by commit
75d4ef16a6. rq no longer has sched_info
data associated with it. task_struct sched_info structure is used by delay
accounting to provide back statistics to user space.
also remove direct use of sched_clock() (which is not a valid thing to
do anymore) and use rq->clock instead.
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
we lost the sched_min_granularity tunable to a clever optimization
that uses the sched_latency/min_granularity ratio - but the ratio
is quite unintuitive to users and can also crash the kernel if the
ratio is set to 0. So reintroduce the min_granularity tunable,
while keeping the ratio maintained internally.
no functionality changed.
[ mingo@elte.hu: some fixlets. ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a few comments to place_entity(). No code changed.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
vslice was missing a factor NICE_0_LOAD, as weight is in
weight*NICE_0_LOAD units.
the effect of this bug was larger initial slices and
thus latency-noisier forks.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On Altix (sn2) machines the "Error parsing MADT" message is
misleading because the lack of IOSAPIC entries is expected.
Since I am sure someone will ask, I have been told that
the chance of this changing anytime soon is close to nil.
Signed-off-by: George Beshers <gbeshers@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Newer Itanium versions have added additional processor feature set
bits. This patch prints all the implemented feature set bits. Some
bit descriptions have not been made public. For those bits, a generic
"Feature set X bit Y" message is printed. Bits that are not implemented
will no longer be printed.
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Fix the problem that redirect hit bit in I/O SAPIC RTE is set even
when it must be disabled (e.g. nointroute boot option is set, CPU
hotplug is enabled or percpu vector is enabled).
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Currently, XPC's heartbeat timer function runs on whatever CPU modprobe/insmod
ran on when XPC was started. To avoid the heartbeat from being delayed for
long periods the timer function must run on CPU 0.
N.B. Altix doesn't currently allow cpu0 to be taken offline, so this is
safe for now. This code must be revised when offline of cpu0 is enabled.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When the code calls uncork, trigger a queue flush, even
if the queue was not corked. Most callers that explicitely
cork the queue will have additinal checks to see if they
corked it. Callers who do not cork the queue expect packets
to flow when they call uncork.
The scneario that showcased this bug happend when we were not
able to bundle DATA with outgoing COOKIE-ECHO. As a result
the data just sat in the outqueue and did not get transmitted.
The application expected a response, but nothing happened.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
There is a small bug when we process a FWD-TSN. We'll deliver
anything upto the current next expected SSN. However, if the
next expected is already in the queue, it will take another
chunk to trigger its delivery. The fix is to simply check
the current queued SSN is the next expected one.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
SCTP-AUTH and future ADD-IP updates have a requirement to
do additional verification of parameters and an ability to
ABORT the association if verification fails. So, introduce
additional return code so that we can clear signal a required
action.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
A SCTP endpoint may have a lot of associations on them and walking
the list is fairly inefficient. Instead, use a hashed lookup,
and filter out the hash list based on the endopoing we already have.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Added blk_unplug interface, allowing all invocations of unplugs to result
in a generated blktrace UNPLUG.
Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Credit goes to juergen.kadidlo@exasol.com for diagnosing this issue
and supplying the initial patch.
blk_queue_invalidate_tags() must use the proper requeueing paths instead
of open coding the re-add of the request, otherwise we bug out in rq
accounting. Just switch to using blk_requeue_request(), that takes care
of end-tag handling as well and also adds the blktrace REQUEUE notify
event that is also appropriate here.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
kmap_atomic calls flush_tlb_page with a NULL VMA and thus we end
up dereferencing a NULL pointer to try and get the context.id.
If the VMA is null use the global pid value of 0.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
A debugging printk is removed, and a comment is fixed to match
the code.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Newer GCC's are capable of autovectorization for ISA extensions like
AltiVec and SPE. If we happen to build with one of those compilers we
will get SPE instructions in random kernel code. Today we only allow
basic interger code in the kernel and FP, AltiVec, or SPE in special
explicit locations that have handled the proper saving and restoring of
the register state (since on uniprocessor we lazy context switch the
register state for FP, AltiVec, and SPE).
-mno-spe disables the compiler for automatically generating SPE
instructions without our knowledge.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
One-shot timer mode on PXA has various bugs which prevent kernels
build with NO_HZ enabled booting. They end up spinning on a
permanently asserted timer interrupt because we don't properly
clear it down - clearing the OIER bit does not stop the pending
interrupt status. Fix this in the set_mode handler as well.
Moreover, the code which sets the next expiry point may race with
the hardware, and we might not set the match register sufficiently
in the future. If we encounter that situation, return -ETIME so
the generic time code retries.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>