Commit Graph

71408 Commits

Author SHA1 Message Date
Rafael J. Wysocki
d158cbdf39 Hibernation: Arbitrary boot kernel support on x86_64
Make it possible to restore a hibernation image on x86_64 with the help of a
kernel different from the one in the image.

The idea is to split the core restoration code into two separate parts and to
place each of them in a different page.   The first part belongs to the boot
kernel and is executed as the last step of the image kernel's memory
restoration procedure.   Before being executed, it is relocated to a safe page
that won't be overwritten while copying the image kernel pages.

The final operation performed by it is a jump to the second part of the core
restoration code that belongs to the image kernel and has just been restored.
This code makes the CPU switch to the image kernel's page tables and restores
the state of general purpose registers (including the stack pointer) from
before the hibernation.

The main issue with this idea is that in order to jump to the second part of
the core restoration code the boot kernel needs to know its address.
 However, this address may be passed to it in the image header.   Namely, the
part of the image header previously used for checking if the version of the
image kernel is correct can be replaced with some architecture specific data
that will allow the boot kernel to jump to the right address within the image
kernel.   These data should also be used for checking if the image kernel is
compatible with the boot kernel (as far as the memory restroration procedure
is concerned).  It can be done, for example, with the help of a "magic" value
that has to be equal in both kernels, so that they can be regarded as
compatible.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Rafael J. Wysocki
d307c4a8e8 Hibernation: Arbitrary boot kernel support - generic code
Add the bits needed for supporting arbitrary boot kernels to the common
hibernation code.

To support arbitrary boot kernels, make it possible to replace the 'struct
new_utsname' and the kernel version in the hibernation image header by some
architecture specific data that will be used to verify if the image is valid
and to restore the image.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Pavel Machek
50a1efe14f s2ram: kill old debugging junk
This removes old debugging stuff, that should be no longer neccessary.  It
accessed VGA hardware (which may not be ready at this point), and used LEDs
at port 80 for debugging.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Andres Salomon
8f4ce8c32f serial: turn serial console suspend a boot rather than compile time option
Currently, there's a CONFIG_DISABLE_CONSOLE_SUSPEND that allows one to stop
the serial console from being suspended when the rest of the machine goes
to sleep.  This is incredibly useful for debugging power management-related
things; however, having it as a compile-time option has proved to be
incredibly inconvenient for us (OLPC).  There are plenty of times that we
want serial console to not suspend, but for the most part we'd like serial
console to be suspended.

This drops CONFIG_DISABLE_CONSOLE_SUSPEND, and replaces it with a kernel
boot parameter (no_console_suspend).  By default, the serial console will
be suspended along with the rest of the system; by passing
'no_console_suspend' to the kernel during boot, serial console will remain
alive during suspend.

For now, this is pretty serial console specific; further fixes could be
applied to make this work for things like netconsole.

Signed-off-by: Andres Salomon <dilinger@debian.org>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@suspend2.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Rafael J. Wysocki
438e2ce68d freezer: measure freezing time
Measure the time of the freezing of tasks, even if it doesn't fail.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Rafael J. Wysocki
b842ee578e freezer: be more verbose
Increase the freezer's verbosity a bit, so that it's easier to read problem
reports related to it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Rafael J. Wysocki
f059bca1c5 pm_trace displays the wrong time from the RTC
The way in which read_magic_time() displays the date read from the RTC is
apparently confusing to the users (cf.
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=250238).  Make it
print dates in the standard way.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Adrian Bunk
c3d42d7527 unexport pm_power_off_prepare
This patch removes the unused EXPORT_SYMBOL(pm_power_off_prepare).

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Rafael J. Wysocki
d5d8c5976d freezer: do not send signals to kernel threads
The freezer should not send signals to kernel threads, since that may lead to
subtle problems.  In particular, commit
b74d0deb96 has changed recalc_sigpending_tsk()
so that it doesn't clear TIF_SIGPENDING.  For this reason, if the freezer
continues to send fake signals to kernel threads and the freezing of kernel
threads fails, some of them may be running with TIF_SIGPENDING set forever.

Accordingly, recalc_sigpending_tsk() shouldn't set the task's TIF_SIGPENDING
flag if TIF_FREEZE is set.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Rafael J. Wysocki
e42837bcd3 freezer: introduce freezer-friendly waiting macros
Introduce freezer-friendly wrappers around wait_event_interruptible() and
wait_event_interruptible_timeout(), originally defined in <linux/wait.h>, to
be used in freezable kernel threads.  Make some of the freezable kernel
threads use them.

This is necessary for the freezer to stop sending signals to kernel threads,
which is implemented in the next patch.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Rafael J. Wysocki
2e1318956c freezer: prevent new tasks from inheriting TIF_FREEZE set
Tasks should go to the refrigerator only if explicitly requested to do that by
the freezer and not as a result of inheriting the TIF_FREEZE flag set from the
parent.  Make it happen.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:18 -07:00
Rafael J. Wysocki
232b143280 freezer: do not sync filesystems from freeze_processes
The syncing of filesystems from within the freezer is generally not needed.
Also, if there's an ext3 filesystem loopback-mounted from a FUSE one, the
syncing results in writes to it and deadlocks.  Similarly, it will deadlock if
FUSE implements sync.

Change freeze_processes() so that it doesn't execute sys_sync() and make the
suspend and hibernation code path sync filesystems independently of the
freezer.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:18 -07:00
Rafael J. Wysocki
2776365370 freezer: document relationship with memory shrinking
One important reason to freeze tasks, which is that we don't want them to
allocate memory after freeing it for the hibernation image, has not been
documented.  Fix it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:18 -07:00
Rafael J. Wysocki
b3dac3b304 PM: Rename hibernation_ops to platform_hibernation_ops
Rename 'struct hibernation_ops' to 'struct platform_hibernation_ops' in
analogy with 'struct platform_suspend_ops'.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:18 -07:00
Rafael J. Wysocki
74f270af0c PM: Rework struct hibernation_ops
During hibernation we also need to tell the ACPI core that we're going to put
the system into the S4 sleep state.  For this reason, an additional method in
'struct hibernation_ops' is needed, playing the role of set_target() in
'struct platform_suspend_operations'.  Moreover, the role of the .prepare()
method is now different, so it's better to introduce another method, that in
general may be different from .prepare(), that will be used to prepare the
platform for creating the hibernation image (.prepare() is used anyway to
notify the platform that we're going to enter the low power state after the
image has been saved).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:18 -07:00
Rafael J. Wysocki
f242d9196f PM: Make suspend_ops static
The variable suspend_ops representing the set of global platform-specific
suspend-related operations, used by the PM core, need not be exported outside
of kernel/power/main.c .   Make it static.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:18 -07:00
Rafael J. Wysocki
e6c5eb9541 PM: Rework struct platform_suspend_ops
There is no reason why the .prepare() and .finish() methods in 'struct
platform_suspend_ops' should take any arguments, since architectures don't use
these methods' argument in any practically meaningful way (ie.  either the
target system sleep state is conveyed to the platform by .set_target(), or
there is only one suspend state supported and it is indicated to the PM core
by .valid(), or .prepare() and .finish() aren't defined at all).   There also
is no reason why .finish() should return any result.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:18 -07:00
Rafael J. Wysocki
26398a70ea PM: Rename struct pm_ops and related things
The name of 'struct pm_ops' suggests that it is related to the power
management in general, but in fact it is only related to suspend.   Moreover,
its name should indicate what this structure is used for, so it seems
reasonable to change it to 'struct platform_suspend_ops'.   In that case, the
name of the global variable of this type used by the PM core and the names of
related functions should be changed accordingly.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:18 -07:00
Rafael J. Wysocki
95d9ffbe01 PM: Move definition of struct pm_ops to suspend.h
Move the definition of 'struct pm_ops' and related functions from <linux/pm.h>
to <linux/suspend.h> .

There are, at least, the following reasons to do that:
* 'struct pm_ops' is specifically related to suspend and not to the power
  management in general.
* As long as 'struct pm_ops' is defined in <linux/pm.h>, any modification of it
  causes the entire kernel to be recompiled, which is unnecessary and annoying.
* Some suspend-related features are already defined in <linux/suspend.h>, so it
  is logical to move the definition of 'struct pm_ops' into there.
* 'struct hibernation_ops', being the hibernation-related counterpart of
  'struct pm_ops', is defined in <linux/suspend.h> .

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:18 -07:00
Adrian Bunk
a065c86e1b make kernel/power/main.c:suspend_enter() static
suspend_enter() can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:17 -07:00
Ralf Baechle
41702d9a4f logo.c: get rid of mips_machgroup
This has not been any serious user of this ill conceived thing since the
original invention in like '95 so I recently deleted this from everywhere
except the last instance in logo.c.  This patch removes the last two
instances in logo.c.  They conditions were not useful anyway as when
compiled in they would always evaluate as true.

Last not least this is necessary to get the SGI IP22 and DECstation kernels
to compile again.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:17 -07:00
Geert Uytterhoeven
c40eea98cd fb modedb: Refactor confusing mode_option assignment
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:17 -07:00
Maciej W. Rozycki
75e8b71d55 tty_ioctl: fix the baud_table check in encode_baud_rate
The tty_termios_encode_baud_rate() function as defined by tty_ioctl.c has a
problem with the baud_table within.  The comparison operators are reversed
and as a result this table's entries never match and BOTHER is always used.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:17 -07:00
Jan Engelhardt
77bf2bab91 Remove CONFIG_VT_UNICODE
Since default_utf8 is already a sysfs attribute, having an extra
CONFIG_VT_UNICODE compile-time option is redundant, since sysfs attributes can
be set at boot and run time.

Also let Linux VCs default to UTF-8 (as per the discussion at
http://lkml.org/lkml/2007/9/6/99).

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Cc: Bill Nottingham <notting@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:17 -07:00
Signed-off-by@vergenet.net":Simon
b7c698f755 Kexec: Update URL in MAINTAINERS file
I'm not sure that the new URL satifies the requirement of status/info, but
it does at least as good a job as the old URL, and contains current
releases of kexec-tools, rather than somewhat ancient versions.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:17 -07:00
Karsten Keil
1ccfd63367 i4l: Fix random hard freeze with AVM c4 card
The patch
- Includes the call to capilib_data_b3_req in the spinlock. This routine
  in turn calls the offending mq_enqueue routine that triggered the
  freeze if not locked.  This should also fix other indicators of
  incosistent capilib_msgidqueue list, that trigger messages like:
  Oct  5 03:05:57 BERL0 kernel: kcapi: msgid 3019 ncci 0x30301 not on queue
  that we saw several times a day (usually several in a row).
- Fixes all occurrences of c4_dispatch_tx to be called with active
  spinlock, there were some instances where no lock was active. Mostly
  these are in very infrequently called routines, so the additional
  performance penalty is minimal.

Signed-off-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Rainer Brestan <rainer.brestan@frequentis.com>
Signed-off-by: Ralf Schlatterbeck <rsc@runtux.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:17 -07:00
Karsten Keil
9713d9e650 i4l: fix random freezes with AVM B1 drivers
This fix the same issue which was debbuged for the C4 controller for the B1
versions.

The capilib_ function modify or traverse a linked list without locking.

This patch extends the existing locking to the calls of these function to
prevent access to a list which is in the middle of a modification.

Signed-off-by: Karsten Keil <kkeil@suse.de>
C: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:17 -07:00
Ralf Baechle
0c42ea3f93 au1100fb: fix modpost warnings
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x170be8): Section mismatch: reference to .init.data:au1100fb_fix (between 'au1100fb_drv_probe' and 'read_null')
WARNING: vmlinux.o(.text+0x170dc4): Section mismatch: reference to .init.data:au1100fb_var (between 'au1100fb_drv_probe' and 'read_null')
WARNING: vmlinux.o(.text+0x170dd0): Section mismatch: reference to .init.data:au1100fb_fix (between 'au1100fb_drv_probe' and 'read_null')
WARNING: vmlinux.o(.text+0x170de0): Section mismatch: reference to .init.data:au1100fb_var (between 'au1100fb_drv_probe' and 'read_null')
WARNING: vmlinux.o(.text+0x170e70): Section mismatch: reference to .init.data:au1100fb_var (between 'au1100fb_drv_probe' and 'read_null')

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:17 -07:00
Ralf Baechle
53da05632b netport_con.c: fix build errors and warnings
Fix build broken by accaa24c49:

  CC      drivers/video/console/newport_con.o
drivers/video/console/newport_con.c: In function 'newport_show_logo':
drivers/video/console/newport_con.c:111: error: assignment of read-only location
drivers/video/console/newport_con.c:111: warning: assignment makes integer from pointer without a cast
drivers/video/console/newport_con.c:112: error: assignment of read-only location
drivers/video/console/newport_con.c:112: warning: assignment makes integer from pointer without a cast

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:17 -07:00
Anton Blanchard
c70878b4e0 hrtimer: hook compat_sys_nanosleep up to high res timer code
Now we have high res timers on ppc64 I thought Id test them. It turns
out compat_sys_nanosleep hasnt been converted to the hrtimer code and so
is limited to HZ resolution.

The follow patch converts compat_sys_nanosleep to use high res timers.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-18 22:54:18 +02:00
Anton Blanchard
04c227140f hrtimer: Rework hrtimer_nanosleep to make sys_compat_nanosleep easier
Pull the copy_to_user out of hrtimer_nanosleep and into the callers
(common_nsleep, sys_nanosleep) in preparation for converting
compat_sys_nanosleep to use hrtimers.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-18 22:54:18 +02:00
Jeff Garzik
3be6cbd73f [libata] kill ata_sg_is_last()
Short term, this works around a bug introduced by early sg-chaining
work.

Long term, removing this function eliminates a branch from a hot
path loop in each scatter/gather table build.  Also, as this code
demonstrates, we don't need to _track_ the end of the s/g list, as
long as we mark it in some way.  And doing so programatically is nice.
So its a useful cleanup, regardless of its short term effects.

Based conceptually on a quick patch by Jens Axboe.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-18 16:21:18 -04:00
Ken Chen
480b9434c5 sched: reduce schedstat variable overhead a bit
schedstat is useful in investigating CPU scheduler behavior.  Ideally,
I think it is beneficial to have it on all the time.  However, the
cost of turning it on in production system is quite high, largely due
to number of events it collects and also due to its large memory
footprint.

Most of the fields probably don't need to be full 64-bit on 64-bit
arch.  Rolling over 4 billion events will most like take a long time
and user space tool can be made to accommodate that.  I'm proposing
kernel to cut back most of variable width on 64-bit system.  (note,
the following patch doesn't affect 32-bit system).

Signed-off-by: Ken Chen <kenchen@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-18 21:32:56 +02:00
Ingo Molnar
cc4ea79588 sched: add KERN_CONT annotation
printk: add the KERN_CONT annotation (which is empty string but via
which checkpatch.pl can notice that the lacking KERN_ level is fine).

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-18 21:32:56 +02:00
Ingo Molnar
d801649162 sched: cleanup, make struct rq comments more consistent
cleanup, make struct rq comments more consistent.

found via scripts/checkpatch.pl.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-18 21:32:55 +02:00
Ingo Molnar
8401f77505 sched: cleanup, fix spacing
cleanup: fix sysctl_sched_features initialization spacing, and
fix sd_alloc_ctl_cpu_table() prototype spacing.

found via scripts/checkpatch.pl.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-18 21:32:55 +02:00
Andi Kleen
51e9799023 sched: fix return value of wait_for_completion_interruptible()
The recent wait_for_completion() cleanups:

    commit 8cbbe86dfc
    Author: Andi Kleen <ak@suse.de>
    Date:   Mon Oct 15 17:00:14 2007 +0200

    sched: cleanup: refactor common code of sleep_on / wait_for_completion

    Refactor common code of sleep_on / wait_for_completion

broke the return value of wait_for_completion_interruptible().
Previously it returned 0 on success, now -1.  Fix that.

Problem found by Geert Uytterhoeven.

[ mingo: fixed whitespace damage ]

Reported-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-18 21:32:55 +02:00
Ralf Baechle
42f77542f4 [MIPS] time: Move R4000 clockevent device code to separate configurable file
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:47 +01:00
Ralf Baechle
2cfa7660db [MIPS] time: Delete dead cycles_per_jiffy, mips_timer_ack and null_timer_ack
cycles_per_jiffy was only ever getting assigned and the function pointer
not being called anymore and mips_timer_ack had gotten similarly stale.  I
leave the remaining assignments unfixed as a lighthouse pointing platform
maintainers to what needs a rewrite.  These changes make null_timer_ack()
unreferenced, so delete that too.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:47 +01:00
Ralf Baechle
76d3a7a54c [MIPS] IP32: Retire use of plat_timer_setup.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:47 +01:00
Ralf Baechle
89742e5376 [MIPS] Jazz: Retire use of plat_timer_setup.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:47 +01:00
Ralf Baechle
e887b24592 [MIPS] IP27: Convert to clock_event_device.
This separates the tick timer stuff from the generic MIPS time.c.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:47 +01:00
Ralf Baechle
832348ff0b [MIPS] JMR3927: Convert to clock_event_device.
This separate the tick timer stuff from the generic MIPS time.c.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:46 +01:00
Thomas Bogendoerfer
15ad838d28 [MIPS] Always do the ARC64_TWIDDLE_PC thing.
Always jump to the place where the kernel is linked to. This helps where
the bootloaders/proms ignores the start address inside the ELF header.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:46 +01:00
Pavel Emelyanov
52f095ee88 [IPV6]: Fix again the fl6_sock_lookup() fixed locking
YOSHIFUJI fairly pointed out, that the users increment should
be done under the ip6_sk_fl_lock not to give IPV6_FL_A_PUT a
chance to put this count to zero and release the flowlabel.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-18 05:38:48 -07:00
Jozsef Kadlecsik
bc34b84155 [NETFILTER]: nf_conntrack_tcp: fix connection reopening fix
If one side aborts an established connection, the entry still lingers
for 10s in conntrack for the late packets. Allow to open up the
connection again for the party which sent the RST packet.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-18 05:20:12 -07:00
Pavel Emelyanov
78c2e50253 [IPV6]: Fix race in ipv6_flowlabel_opt() when inserting two labels
In the IPV6_FL_A_GET case the hash is checked for flowlabels
with the given label. If it is not found, the lock, protecting 
the hash, is dropped to be re-get for writing. After this a
newly allocated entry is inserted, but no checks are performed
to catch a classical SMP race, when the conflicting label may 
be inserted on another cpu.

Use the (currently unused) return value from fl_intern() to
return the conflicting entry (if found) and re-check, whether
we can reuse it (IPV6_FL_F_EXCL) or return -EEXISTS.

Also add the comment, about why not re-lookup the current
sock for conflicting flowlabel entry.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-18 05:18:56 -07:00
Pavel Emelyanov
bd0bf57700 [IPV6]: Lost locking in fl6_sock_lookup
This routine scans the ipv6_fl_list whose update is
protected with the socket lock and the ip6_sk_fl_lock.

Since the socket lock is not taken in the lookup, use
the other one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-18 05:15:57 -07:00
Pavel Emelyanov
04028045a1 [IPV6]: Lost locking when inserting a flowlabel in ipv6_fl_list
The new flowlabels should be inserted into the sock list
under the ip6_sk_fl_lock. This was lost in one place.

This list is naturally protected with the socket lock, but
the fl6_sock_lookup() is called without it, so another
protection is required.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-18 05:14:58 -07:00
Li Zefan
009e8c965f [NETFILTER]: xt_sctp: fix mistake to pass a pointer where array is required
Macros like SCTP_CHUNKMAP_XXX(chukmap) require chukmap to be an array,
but match_packet() passes a pointer to these macros. Also remove the
ELEMCOUNT macro and fix a bug in SCTP_CHUNKMAP_COPY.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-18 05:12:21 -07:00