Commit Graph

8637 Commits

Author SHA1 Message Date
Stephen Hemminger
ecef969e5b [VETH]: move veth.h to include/linux
Move veth.h from net/ to linux/ since it is a user api, and add it to
user header processing Kbuild.

[ Use header-y as suggested by Sam Ravnborg.  -DaveM ]

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-26 19:36:35 -08:00
Stephen Hemminger
75ec533ec3 [NET] tc_nat: header install
iproute2 build needs tc_nat.h header from kernel make install_headers.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-26 19:36:35 -08:00
Christoph Lameter
ed367fc3a7 quicklists: do not release off node pages early
quicklists must keep even off node pages on the quicklists until the TLB
flush has been completed.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-23 12:54:36 -08:00
Neil Brown
91212507f9 dm: merge max_hw_sector
Make sure dm honours max_hw_sectors of underlying devices

  We still have no firm testing evidence in support of this patch but
  believe it may help to resolve some bug reports.  - agk

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-12-20 17:32:12 +00:00
Linus Torvalds
3e3b3916a9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!"
  genirq: revert lazy irq disable for simple irqs
  x86: also define AT_VECTOR_SIZE_ARCH
  x86: kprobes bugfix
  x86: jprobe bugfix
  timer: kernel/timer.c section fixes
  genirq: add unlocked version of set_irq_handler()
  clockevents: fix reprogramming decision in oneshot broadcast
  oprofile: op_model_athlon.c support for AMD family 10h barcelona performance counters
2007-12-18 09:42:44 -08:00
Kevin Hilman
b019e57321 genirq: add unlocked version of set_irq_handler()
Add unlocked version for use by irq_chip.set_type handlers which may
wish to change handler to level or edge handler when IRQ type is
changed.

The normal set_irq_handler() call cannot be used because it tries to
take irq_desc.lock which is already held when the irq_chip.set_type
hook is called.

Signed-off-by: Kevin Hilman <khilman@mvista.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-12-18 18:05:58 +01:00
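The locking constraint described in b019e57321 can be illustrated with a small user-space sketch. Everything in it is hypothetical: `irq_desc_sketch`, the toy boolean lock, and the `sketch_*` functions are stand-ins, not the kernel's genirq API. It only shows why a hook that already runs with the lock held needs a variant that does not take the lock again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*flow_handler_fn)(unsigned int irq);

/* Hypothetical stand-in for an IRQ descriptor. */
struct irq_desc_sketch {
    bool locked;              /* toy replacement for irq_desc.lock */
    flow_handler_fn handler;  /* current flow handler */
};

/* Toy lock: taking it twice trips the assertion, which models the deadlock. */
void sketch_lock(struct irq_desc_sketch *d)   { assert(!d->locked); d->locked = true; }
void sketch_unlock(struct irq_desc_sketch *d) { assert(d->locked);  d->locked = false; }

void sketch_handle_level(unsigned int irq) { (void)irq; }
void sketch_handle_edge(unsigned int irq)  { (void)irq; }

/* Unlocked variant: the caller must already hold the lock. */
void sketch_set_handler_unlocked(struct irq_desc_sketch *d, flow_handler_fn h)
{
    assert(d->locked);
    d->handler = h;
}

/* Locked variant: takes the lock itself, so it must not be called
 * from a context that already holds it. */
void sketch_set_handler(struct irq_desc_sketch *d, flow_handler_fn h)
{
    sketch_lock(d);
    sketch_set_handler_unlocked(d, h);
    sketch_unlock(d);
}

/* set_type-style hook: invoked with the lock held, so it uses the
 * unlocked variant to switch between level and edge handling. */
void sketch_set_type(struct irq_desc_sketch *d, bool level)
{
    sketch_set_handler_unlocked(d, level ? sketch_handle_level
                                         : sketch_handle_edge);
}

/* Caller that holds the lock while invoking the hook. */
void sketch_change_irq_type(struct irq_desc_sketch *d, bool level)
{
    sketch_lock(d);
    sketch_set_type(d, level);
    sketch_unlock(d);
}
```

If `sketch_set_type()` called `sketch_set_handler()` instead, the toy lock's assertion would fire. That stands in for the self-deadlock the commit message describes.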
Linus Torvalds
3c615e19a4 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Cleanup umem driver: fix most checkpatch warnings, conform to kernel
  block: let elv_register() return void
  as-iosched: fix write batch start point
  as-iosched: fix incorrect comments
  block: use jiffies conversion functions in scsi_ioctl.c
2007-12-18 08:04:24 -08:00
Linus Torvalds
d55653377d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
  mmc: remove unused 'mode' from the mmc_host structure
  sdhci: support JMicron JMB38x chips
  sdhci: use PIO when DMA can't satisfy the request
  sdhci: don't warn about sdhci 2.0 controllers
  sdhci: describe quirks
2007-12-18 08:03:01 -08:00
Adrian Bunk
2fdd82bd88 block: let elv_register() return void
elv_register() always returns 0, and there isn't anything it does where
it should return an error (the only error condition is so grave that
it's handled with a BUG_ON).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-12-18 08:29:28 +01:00
Linus Torvalds
ededa4d396 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: fix ATAPI draining
  libata: update atapi_eh_request_sense() such that lbam/lbah contains buffer size
  libata-acpi: implement _GTF command filtering
  libata-acpi: improve _GTF execution error handling and reporting
  libata-acpi: improve ACPI disabling
  libata-acpi: implement dev->gtf_cache and evaluate _GTF right after _STM during resume
  libata-acpi: implement and use ata_acpi_init_gtm()
  libata-acpi: add new hooks ata_acpi_dissociate() and ata_acpi_on_disable()
  libata: ata_dev_disable() should be called from EH context
  libata: add more opcodes to ata.h
  libata: update ata_*_printk() macros such that level can be a variable
  libata-acpi: adjust constness in ata_acpi_gtm/stm() parameters
  sata_mv: improve warnings about Highpoint RocketRAID 23xx cards
  libata: add ST3160023AS / 3.42 to NCQ blacklist
  libata: clear link->eh_info.serror from ata_std_postreset()
  sata_sil: fix spurious IRQ handling
2007-12-17 19:29:32 -08:00
Nishanth Aravamudan
368d2c6358 Revert "hugetlb: Add hugetlb_dynamic_pool sysctl"
This reverts commit 54f9f80d65 ("hugetlb:
Add hugetlb_dynamic_pool sysctl")

Given the new sysctl nr_overcommit_hugepages, the boolean dynamic pool
sysctl is not needed, as its semantics can be expressed by 0 in the
overcommit sysctl (no dynamic pool) and non-0 in the overcommit sysctl
(pool enabled).

(Needed in 2.6.24 since it reverts a post-2.6.23 userspace-visible change)

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17 19:28:17 -08:00
Nishanth Aravamudan
d1c3fb1f8f hugetlb: introduce nr_overcommit_hugepages sysctl
While examining the code to support /proc/sys/vm/hugetlb_dynamic_pool, I
became convinced that having a boolean sysctl was insufficient:

1) To support per-node control of hugepages, I have previously submitted
patches to add a sysfs attribute related to nr_hugepages. However, with
a boolean global value and per-mount quota enforcement constraining the
dynamic pool, adding corresponding control of the dynamic pool on a
per-node basis seems inconsistent to me.

2) Administration of the hugetlb dynamic pool with multiple hugetlbfs
mount points is, arguably, more arduous than it needs to be. Each quota
would need to be set separately, and the sum would need to be monitored.

To ease the administration, and to help make the way for per-node
control of the static & dynamic hugepage pool, I added a separate
sysctl, nr_overcommit_hugepages. This value serves as a high watermark
for the overall hugepage pool, while nr_hugepages serves as a low
watermark. The boolean sysctl can then be removed, as the condition

	nr_overcommit_hugepages > 0

indicates the same administrative setting as

	hugetlb_dynamic_pool == 1

Quotas still serve as local enforcement of the size of the pool on a
per-mount basis.

A few caveats:

1) There is a race whereby the global surplus huge page counter is
incremented before a hugepage has been allocated. Another process could then
try to grow the pool, fail to convert a surplus huge page to a normal
huge page, and instead allocate a fresh huge page. I believe this is
benign, as no memory is leaked (the actual pages are still tracked
correctly) and the counters won't go out of sync.

2) Shrinking the static pool while a surplus is in effect will allow the
number of surplus huge pages to exceed the overcommit value. As long as
this condition holds, however, no more surplus huge pages will be
allowed on the system until one of the two sysctls is increased
sufficiently, or the surplus huge pages go out of use and are freed.

Successfully tested on x86_64 with the current libhugetlbfs snapshot,
modified to use the new sysctl.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17 19:28:17 -08:00
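The relationship between the two sysctls in d1c3fb1f8f can be sketched as a toy model. This is an illustration under stated assumptions, not the kernel's hugetlb code: the struct, its field layout, and the `sketch_*` helpers are hypothetical, and the shrink helper assumes every page removed from the static pool is still in use and therefore becomes surplus.

```c
#include <stdbool.h>

/* Hypothetical model of the pool counters; not the kernel's data layout. */
struct hugepage_pool_sketch {
    unsigned long nr_hugepages;             /* static pool size */
    unsigned long nr_overcommit_hugepages;  /* cap on surplus pages */
    unsigned long surplus_huge_pages;       /* surplus pages currently in use */
};

/* nr_overcommit_hugepages > 0 plays the role of hugetlb_dynamic_pool == 1. */
bool sketch_dynamic_pool_enabled(const struct hugepage_pool_sketch *p)
{
    return p->nr_overcommit_hugepages > 0;
}

/* Try to grow the pool by one surplus page; refuse at or above the cap. */
bool sketch_alloc_surplus(struct hugepage_pool_sketch *p)
{
    if (p->surplus_huge_pages >= p->nr_overcommit_hugepages)
        return false;
    p->surplus_huge_pages++;
    return true;
}

/* Caveat 2: shrinking the static pool while its pages are in use
 * (assumed here) turns them into surplus pages, which may push the
 * surplus count above the overcommit value. */
void sketch_shrink_static(struct hugepage_pool_sketch *p, unsigned long by)
{
    if (by > p->nr_hugepages)
        by = p->nr_hugepages;
    p->nr_hugepages -= by;
    p->surplus_huge_pages += by;
}
```

In this model, once the surplus count exceeds the cap, `sketch_alloc_surplus()` keeps failing until the cap is raised or surplus pages are released, which matches the behaviour the second caveat describes.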
Adam Jackson
8d936626dd apm_event{,info}_t are userspace types
These types define the size of data read from /dev/apm_bios.  They should
not be hidden behind #ifdef __KERNEL__.

This is killing my xserver compile, apm_event_t is used in the xserver
source.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17 19:28:16 -08:00
Andrew Morton
755271358c fix headers_install
make[3]: *** No rule to make target `/usr/src/devel/include/linux/ticable.h', needed by `/usr/src/devel/usr/include/linux/ticable.h'.  Stop.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17 19:28:15 -08:00
Tejun Heo
140b5e5911 libata: fix ATAPI draining
With ATAPI transfer chunk size properly programmed, libata PIO HSM
should be able to handle full spurious data chunks.  Also, it's a good
idea to suppress trailing data warning for misc ATAPI commands as
there can be many of them per command - for example, if the chunk size
is 16 and the drive tries to transfer 510 bytes, there can be 31
trailing data messages.

This patch makes the following updates to libata ATAPI PIO HSM
implementation.

* Make it drain full spurious chunks.

* Suppress trailing data warning message for misc commands.

* Put limit on how many bytes can be drained.

* If odd, round up consumed bytes and the number of bytes to be
  drained.  This gets the number of bytes to drain right for drivers
  which do 16bit PIO.

This patch is a partial backport of the improve-ATAPI-data-xfer patchset
pending for #upstream.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17 20:43:28 -05:00
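The arithmetic in 140b5e5911 can be checked with a few helper functions. This is a minimal sketch, not libata code: the function names and the drain limit parameter are hypothetical, since the commit message does not state the limit's value.

```c
/* Number of full chunks in a transfer of `bytes` bytes. */
unsigned int sketch_full_chunks(unsigned int bytes, unsigned int chunk_size)
{
    return chunk_size ? bytes / chunk_size : 0;
}

/* Round an odd byte count up to the next even value, as needed by
 * drivers that do 16-bit PIO. */
unsigned int sketch_round_up_even(unsigned int bytes)
{
    return bytes + (bytes & 1u);
}

/* Bytes to drain: rounded up to even, then capped at a caller-supplied
 * limit (the limit value is an assumption of this sketch). */
unsigned int sketch_drain_bytes(unsigned int bytes, unsigned int limit)
{
    unsigned int n = sketch_round_up_even(bytes);

    return n > limit ? limit : n;
}
```

With a chunk size of 16 and a 510-byte transfer, `sketch_full_chunks(510, 16)` yields 31, the number of trailing data messages the commit message mentions.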
Tejun Heo
398e07826b libata-acpi: implement dev->gtf_cache and evaluate _GTF right after _STM during resume
On certain implementations, _GTF evaluation depends on preceding _STM
and both can be pretty picky about the configuration.  Using _GTM
result cached during controller initialization satisfies the most
neurotic _STM implementation.  However, libata evaluates _GTF after
reset during device configuration and the hardware state can be
different from what _GTF expects and can cause evaluation failure.

This patch adds dev->gtf_cache and updates ata_dev_get_GTF() such that
it uses the cached value if available.  Cache is cleared with a call
to ata_acpi_clear_gtf().

Because for SATA ACPI nodes _GTF must be evaluated after _SDD which
can't be done till IDENTIFY is complete, _GTF caching from
ata_acpi_on_resume() is used only for IDE ACPI nodes.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17 20:33:14 -05:00
Tejun Heo
c05e6ff035 libata-acpi: implement and use ata_acpi_init_gtm()
_GTM fetches currently configured transfer mode while _STM configures
controller according to _GTM parameter and prepares transfer mode
configuration TFs for _GTF.  In many cases _GTM and _STM
implementations are quite brittle and can't cope with configuration
changed by libata.

libata does not depend on ATA ACPI to configure devices.  The only
reason libata performs _GTM and _STM is to make _GTF evaluation
succeed, and libata also doesn't care about how _GTF TFs configure
transfer mode.  It overrides that configuration anyway, so from
libata's POV, it doesn't matter what value is fed to _STM as long
as evaluation succeeds for _STM and the following _GTF.

This patch adds dev->__acpi_init_gtm and stores the initial _GTM values on
host initialization, before they are modified by reset and mode configuration.
If the field is valid, ata_acpi_init_gtm() returns a pointer to the
saved _GTM structure; otherwise, NULL.

This saved value is used for _STM during resume and to peek at the
BIOS/firmware programmed initial timing for later use.  The accessor
is there to make building w/o ACPI easy, as dev->__acpi_init_gtm doesn't
exist if ACPI is not enabled.

On driver detach, the initial BIOS configuration is restored by
executing _STM with the initial _GTM values such that the next driver
can also use the initial BIOS configured values.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17 20:33:14 -05:00
Tejun Heo
ce2e0abbd3 libata: add more opcodes to ata.h
Add constants for DEVICE CONFIGURATION OVERLAY and SET_MAX to
include/linux/ata.h.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17 20:33:12 -05:00
Tejun Heo
c2e366a107 libata: update ata_*_printk() macros such that level can be a variable
Make printk helpers format @lv together rather than prepending it to the
format string as a constant.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17 20:33:12 -05:00
Tejun Heo
0d02f0b22b libata-acpi: adjust constness in ata_acpi_gtm/stm() parameters
* No internal function uses const ata_port.  Drop const from @ap.

* Make ata_acpi_stm() copy @stm before using it and change @stm to
  const.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-17 20:33:12 -05:00
Linus Torvalds
87d5df6bde Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6:
  HOWTO: update misspelling and word incorrected
  add stable_api_nonsense.txt in korean
  HOWTO: change addresses of maintainer and lxr url for Korean HOWTO
  Add Documentation for FAIR_USER_SCHED sysfs files
  HOWTO: Change man-page maintainer address for Japanese HOWTO
  tipar: remove obsolete module
  kobject: fix the documentation of how kobject_set_name works
2007-12-17 13:33:47 -08:00
Linus Torvalds
4942093e9d Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: revert portions of "UNUSUAL_DEV: Sync up some reported devices from Ubuntu"
  usb: Remove broken optimisation in OHCI IRQ handler
  USB: at91_udc: correct hanging while disconnecting usb cable
  USB: use IRQF_DISABLED for HCD interrupt handlers
  USB: fix locking loop by avoiding flush_scheduled_work
  usb.h: fix kernel-doc warning
  USB: option: Bind to the correct interface of the Huawei E220
  USB: cp2101: new device id
  usb-storage: Fix devices that cannot handle 32k transfers
  USB: sierra: fix product id
2007-12-17 13:33:30 -08:00
Linus Torvalds
07232b9715 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ide: fix ->io_32bit race in set_io_32bit()
  ide: remove stale changelog from ide-probe.c
  ide: remove stale changelog from ide-disk.c
  ide: remove dead code from __ide_dma_test_irq()
  hpt366: fix HPT37x PIO mode timings (take 2)
  pdc202xx_new: fix Promise TX4 support
  ide-cd: remove dead post_transform_command()
  ide: DMA reporting and validity checking fixes (take 3)
  ide: add /sys/bus/ide/devices/*/{model,firmware,serial} sysfs entries
  ide: coding style fixes for drivers/ide/setup-pci.c
  ide: fix ide_scan_pcibus() error message
  ide: deprecate CONFIG_BLK_DEV_OFFBOARD
  ide: add missing checks for control register existence
  ide-scsi: add ide_scsi_hex_dump() helper
2007-12-17 13:32:49 -08:00
Randy Dunlap
f88ed90d86 usb.h: fix kernel-doc warning
Fix kernel-doc warning in usb.h:
Warning(linux-2.6.24-rc3-git7//include/linux/usb.h:166): No description found for parameter 'sysfs_files_created'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-12-17 10:47:15 -08:00
Doug Maxey
33abc04f04 usb-storage: Fix devices that cannot handle 32k transfers
When a device cannot handle the smallest previously limited transfer
size (64 blocks) without stalling, limit the device to the number of
packets that fit in a platform native page.

The lowest possible limit is PAGE_CACHE_SIZE, so if the device is ever
used on a platform that has larger than 8K pages, you lose unless you
can convince the device firmware folks to fix the issue.

Cc: Mathew Dharm <mdharm-scsi@one-eyed-alien.net>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Doug Maxey <dwm@austin.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-12-17 10:47:14 -08:00
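The sizes mentioned in 33abc04f04 can be related with simple arithmetic. The sketch below assumes 512-byte blocks, which the commit message does not state explicitly. The helper names are hypothetical and this is not the usb-storage implementation.

```c
#define SKETCH_SECTOR_SIZE 512u  /* assumed block size in bytes */

/* Transfer size in bytes for a given number of blocks. */
unsigned int sketch_sectors_to_bytes(unsigned int sectors)
{
    return sectors * SKETCH_SECTOR_SIZE;
}

/* Number of blocks that fit in one page of the given size. */
unsigned int sketch_max_sectors_for_page(unsigned int page_size)
{
    return page_size / SKETCH_SECTOR_SIZE;
}
```

Under that assumption, 64 blocks correspond to the 32k transfers named in the subject line, and a 4096-byte page holds 8 blocks.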
Romain Liévin
cb8c9b6de0 tipar: remove obsolete module
The tipar character driver was used to implement bit-banging access
to the Texas Instruments parallel link cable. A user-land method now
exists through PPDEV & PARPORT.

Signed-off-by: Romain Liévin <roms@lpg.ticalc.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-12-17 10:33:18 -08:00
Patrick McHardy
4a9ecd5960 [NETFILTER]: bridge: fix missing link layer headers on outgoing routed packets
As reported by Damien Thebault, the double POSTROUTING hook invocation
fix caused outgoing packets routed between two bridges to appear without
a link-layer header. The reason for this is that we're skipping the
br_nf_post_routing hook for routed packets now and don't save the
original link layer header, but nevertheless try to restore it on
output, causing corruption.

The root cause for this is that skb->nf_bridge has no clearly defined
lifetime and is used to indicate all kinds of things, but that is
quite complicated to fix. For now simply don't touch these packets
and handle them like packets from any other device.

Tested-by: Damien Thebault <damien.thebault@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-14 13:54:39 -08:00
Bartlomiej Zolnierkiewicz
3ab7efe8e2 ide: DMA reporting and validity checking fixes (take 3)
* ide_xfer_verbose() fixups:
  - beautify returned mode names
  - fix PIO5 reporting
  - make it return 'const char *'

* Change printk() level from KERN_DEBUG to KERN_INFO in ide_find_dma_mode().

* Add ide_id_dma_bug() helper based on ide_dma_verbose() to check for invalid
  DMA info in identify block.

* Use ide_id_dma_bug() in ide_tune_dma() and ide_driveid_update().

  As a result DMA won't be tuned or will be disabled after tuning if device
  reports inconsistent info about enabled DMA mode (ide_dma_verbose() does the
  same checks while the IDE device is probed by ide-{cd,disk} device driver).

* Remove no longer needed ide_dma_verbose().

This patch should fix the following problem with out-of-sync IDE messages
reported by Nick Warne:

       hdd: ATAPI 48X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache<7>hdd:
       skipping word 93 validity check
        , UDMA(66)

and later debugged by Mark Lord to be caused by:

        ide_dma_verbose()
                printk( ... "2048kB Cache");
        eighty_ninty_three()
                printk(KERN_DEBUG "%s: skipping word 93 validity check\n");
        ide_dma_verbose()
                printk(", UDMA(66)"

Please note that as a result ide-{cd,disk} device drivers won't report the
DMA speed used but this is intended since now DMA mode being used is always
reported by IDE core code.

v2:
* fixes suggested by Randy:
  - use KERN_CONT for printk()-s in ide-{cd,disk}.c
  - don't remove argument name from ide_xfer_verbose() declaration

v3:
* Remove incorrect check for (id->field_valid & 1) from ide_id_dma_bug()
  (spotted by Sergei).

* "XFER SLOW" -> "PIO SLOW" in ide_xfer_verbose() (suggested by Sergei).

* Fix ide_find_dma_mode() to report the correct mode ('mode' after being
  limited by 'req_mode').

Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Nick Warne <nick@ukfsn.org>
Cc: Mark Lord <lkml@rtr.ca>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-12-12 23:31:58 +01:00
Nicolas Pitre
cc3000e4ef mmc: remove unused 'mode' from the mmc_host structure
This field and corresponding defines are simply never used anywhere
in the code.  But its mere presence is enough to confuse some host
driver authors who attempt to rely on it.  Let's eliminate the
possibility for confusion and remove it entirely.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2007-12-12 20:01:01 +01:00
Pierre Ossman
84c46a53fc sdhci: support JMicron JMB38x chips
The JMicron JMB38x chip doesn't support transfers that aren't 32-bit
aligned (both size and start address). It also doesn't like switching
between PIO and DMA mode, so it needs to be reset after each request.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2007-12-12 20:01:00 +01:00
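The alignment constraint described in 84c46a53fc can be expressed as a small predicate. This is a sketch under assumptions, not the sdhci driver's code: the names are hypothetical, and the fallback to PIO for unaligned requests is suggested only by the neighbouring commit title "sdhci: use PIO when DMA can't satisfy the request", not by this commit message itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_xfer_mode { SKETCH_XFER_PIO, SKETCH_XFER_DMA };

/* True when both the start address and the length are multiples of 4 bytes. */
bool sketch_is_32bit_aligned(uintptr_t addr, size_t len)
{
    return (addr & 3u) == 0 && (len & 3u) == 0;
}

/* Assumed policy: use DMA only for requests the chip can handle,
 * otherwise fall back to PIO. */
enum sketch_xfer_mode sketch_pick_mode(uintptr_t addr, size_t len)
{
    return sketch_is_32bit_aligned(addr, len) ? SKETCH_XFER_DMA
                                              : SKETCH_XFER_PIO;
}
```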
Jay Vosburgh
6f6652be18 bonding: Add new layer2+3 hash for xor/802.3ad modes
Add new hash for balance-xor and 802.3ad modes.  Originally
submitted by "Glenn Griffin" <ggriffin.kernel@gmail.com>; modified by
Jay Vosburgh to move setting of hash policy out of line, tweak the
documentation update and add version update to 3.2.2.

Glenn's original comment follows:

Included is a patch for a new xmit_hash_policy for the bonding driver
that selects slaves based on MAC and IP information.  This is a middle
ground between what currently exists in the layer2 only policy and the
layer3+4 policy.  This policy strives to be fully 802.3ad compliant by
transmitting every packet of any particular flow over the same link.
As documented the layer3+4 policy is not fully compliant for extreme
cases such as ip fragmentation, so this policy is a nice compromise
for environments that require full compliance but desire more than the
layer2 only policy.

Signed-off-by: "Glenn Griffin" <ggriffin.kernel@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-07 15:00:32 -05:00
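The idea behind the layer2+3 policy in 6f6652be18 can be sketched as a hash over MAC and IP information only. This is an illustrative formula, not necessarily the one the bonding driver uses: the struct, the choice of bits, and the function name are assumptions. The property it demonstrates is the one the commit message states: every packet of a flow maps to the same slave, because no layer 4 fields (which are absent from IP fragments) enter the hash.

```c
#include <stdint.h>

/* Hypothetical flow description; IP addresses are in host byte order. */
struct sketch_flow {
    uint8_t  src_mac[6];
    uint8_t  dst_mac[6];
    uint32_t src_ip;
    uint32_t dst_ip;
};

/* Pick a slave index from layer 2 and layer 3 information only. */
unsigned int sketch_l23_hash(const struct sketch_flow *f,
                             unsigned int slave_count)
{
    uint32_t ip_bits  = (f->src_ip ^ f->dst_ip) & 0xffffu;
    uint32_t mac_bits = (uint32_t)(f->src_mac[5] ^ f->dst_mac[5]);

    return slave_count ? (ip_bits ^ mac_bits) % slave_count : 0;
}
```

Because the inputs are identical for every packet of a flow, including fragments, the result is stable per flow, while different IP pairs behind the same MAC pair can still spread across slaves.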
Richard Purdie
dc47206e55 leds: Fix led trigger locking bugs
Convert part of the led trigger core from rw spinlocks to rw
semaphores. We're calling functions which can sleep from invalid
contexts otherwise. Fixes bug #9264.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
2007-12-07 09:06:53 +00:00
Linus Torvalds
7e1fb765c6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  futex: correctly return -EFAULT not -EINVAL
  lockdep: in_range() fix
  lockdep: fix debug_show_all_locks()
  sched: style cleanups
  futex: fix for futex_wait signal stack corruption
2007-12-05 09:27:46 -08:00
Linus Torvalds
ad658cec23 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
  VM/Security: add security hook to do_brk
  Security: round mmap hint address above mmap_min_addr
  security: protect from stack expantion into low vm addresses
  Security: allow capable check to permit mmap or low vm space
  SELinux: detect dead booleans
  SELinux: do not clear f_op when removing entries
2007-12-05 09:26:52 -08:00
Linus Torvalds
2a1292b36b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  [LRO]: fix lro_gen_skb() alignment
  [TCP]: NAGLE_PUSH seems to be a wrong way around
  [TCP]: Move prior_in_flight collect to more robust place
  [TCP] FRTO: Use of existing funcs make code more obvious & robust
  [IRDA]: Move ircomm_tty_line_info() under #ifdef CONFIG_PROC_FS
  [ROSE]: Trivial compilation CONFIG_INET=n case
  [IPVS]: Fix sched registration race when checking for name collision.
  [IPVS]: Don't leak sysctl tables if the scheduler registration fails.
2007-12-05 09:26:13 -08:00
Alexey Dobriyan
5a622f2d0f proc: fix proc_dir_entry refcounting
Creating PDEs with refcount 0 and "deleted" flag has problems (see below).
Switch to usual scheme:
* PDE is created with refcount 1
* every de_get does +1
* every de_put() and remove_proc_entry() do -1
* once refcount reaches 0, PDE is freed.

This elegantly fixes at least the following two races (both observed) without
introducing new locks, without abusing old locks, without spreading
lock_kernel():

1) PDE leak

remove_proc_entry			de_put
-----------------			------
			[refcnt = 1]
if (atomic_read(&de->count) == 0)
					if (atomic_dec_and_test(&de->count))
						if (de->deleted)
							/* also not taken! */
							free_proc_entry(de);
else
	de->deleted = 1;
		[refcount=0, deleted=1]

2) use after free

remove_proc_entry			de_put
-----------------			------
			[refcnt = 1]

					if (atomic_dec_and_test(&de->count))
if (atomic_read(&de->count) == 0)
	free_proc_entry(de);
						/* boom! */
						if (de->deleted)
							free_proc_entry(de);

BUG: unable to handle kernel paging request at virtual address 6b6b6b6b
printing eip: c10acdda *pdpt = 00000000338f8001 *pde = 0000000000000000
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom
Pid: 23161, comm: cat Not tainted (2.6.24-rc2-8c0863403f109a43d7000b4646da4818220d501f #4)
EIP: 0060:[<c10acdda>] EFLAGS: 00210097 CPU: 1
EIP is at strnlen+0x6/0x18
EAX: 6b6b6b6b EBX: 6b6b6b6b ECX: 6b6b6b6b EDX: fffffffe
ESI: c128fa3b EDI: f380bf34 EBP: ffffffff ESP: f380be44
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 23161, ti=f380b000 task=f38f2570 task.ti=f380b000)
Stack: c10ac4f0 00000278 c12ce000 f43cd2a8 00000163 00000000 7da86067 00000400
       c128fa20 00896b18 f38325a8 c128fe20 ffffffff 00000000 c11f291e 00000400
       f75be300 c128fa20 f769c9a0 c10ac779 f380bf34 f7bfee70 c1018e6b f380bf34
Call Trace:
 [<c10ac4f0>] vsnprintf+0x2ad/0x49b
 [<c10ac779>] vscnprintf+0x14/0x1f
 [<c1018e6b>] vprintk+0xc5/0x2f9
 [<c10379f1>] handle_fasteoi_irq+0x0/0xab
 [<c1004f44>] do_IRQ+0x9f/0xb7
 [<c117db3b>] preempt_schedule_irq+0x3f/0x5b
 [<c100264e>] need_resched+0x1f/0x21
 [<c10190ba>] printk+0x1b/0x1f
 [<c107c8ad>] de_put+0x3d/0x50
 [<c107c8f8>] proc_delete_inode+0x38/0x41
 [<c107c8c0>] proc_delete_inode+0x0/0x41
 [<c1066298>] generic_delete_inode+0x5e/0xc6
 [<c1065aa9>] iput+0x60/0x62
 [<c1063c8e>] d_kill+0x2d/0x46
 [<c1063fa9>] dput+0xdc/0xe4
 [<c10571a1>] __fput+0xb0/0xcd
 [<c1054e49>] filp_close+0x48/0x4f
 [<c1055ee9>] sys_close+0x67/0xa5
 [<c10026b6>] sysenter_past_esp+0x5f/0x85
=======================
Code: c9 74 0c f2 ae 74 05 bf 01 00 00 00 4f 89 fa 5f 89 d0 c3 85 c9 57 89 c7 89 d0 74 05 f2 ae 75 01 4f 89 f8 5f c3 89 c1 89 c8 eb 06 <80> 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 c3 90 90 90 57 83 c9
EIP: [<c10acdda>] strnlen+0x6/0x18 SS:ESP 0068:f380be44

Also, remove broken usage of ->deleted from reiserfs: if sget() succeeds,
module is already pinned and remove_proc_entry() can't happen => nobody
can mark PDE deleted.

Dummy proc root in netns code is not marked with refcount 1. AFAICS, we
never get it, it's just for proper /proc/net removal. I double checked
CLONE_NETNS continues to work.

Patch survives many hours of modprobe/rmmod/cat loops without new bugs
which can be attributed to refcounting.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-05 09:21:20 -08:00
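The refcounting scheme listed in 5a622f2d0f can be sketched in user space with C11 atomics. This is a minimal illustration, not the procfs code: `pde_sketch` and the `sketch_*` functions are hypothetical names, and `free()` stands in for free_proc_entry().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a proc_dir_entry. */
struct pde_sketch {
    atomic_int count;
};

/* The entry is created with refcount 1, owned by the creator. */
struct pde_sketch *sketch_pde_create(void)
{
    struct pde_sketch *p = malloc(sizeof(*p));

    if (p)
        atomic_init(&p->count, 1);
    return p;
}

/* Every get adds one reference. */
void sketch_pde_get(struct pde_sketch *p)
{
    atomic_fetch_add(&p->count, 1);
}

/* Every put drops one reference; whoever drops the last one frees.
 * Returns true if this call freed the entry. */
bool sketch_pde_put(struct pde_sketch *p)
{
    if (atomic_fetch_sub(&p->count, 1) == 1) {
        free(p);
        return true;
    }
    return false;
}

/* Removal drops the creator's reference; there is no "deleted" flag. */
bool sketch_pde_remove(struct pde_sketch *p)
{
    return sketch_pde_put(p);
}
```

Since the decision to free is tied to a single atomic decrement reaching zero, exactly one caller frees the entry regardless of whether removal or the last put happens first. That is the property both race diagrams in the commit message rely on.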
Jan Kara
d4beaf4ab5 jbd: Fix assertion failure in fs/jbd/checkpoint.c
Before we start committing a transaction, we call
__journal_clean_checkpoint_list() to clean up the transaction's written-back
buffers.

If this call happens to remove all of them (and there were already some
buffers), __journal_remove_checkpoint() will decide to free the transaction
because it isn't (yet) a committing transaction and soon we fail some
assertion - the transaction really isn't ready to be freed :).

We change the check in __journal_remove_checkpoint() to free only a
transaction in T_FINISHED state.  The locking there is subtle though (as
everywhere in JBD ;().  We use j_list_lock to protect the check and a
subsequent call to __journal_drop_transaction() and do the same in the end
of journal_commit_transaction() which is the only place where a transaction
can get to T_FINISHED state.

Probably I'm too paranoid here and such locking is not really necessary -
checkpoint lists are processed only from log_do_checkpoint() where a
transaction must be already committed to be processed or from
__journal_clean_checkpoint_list() where kjournald itself calls it and thus
transaction cannot change state either.  Better be safe if something
changes in future...

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-05 09:21:20 -08:00
Steven Rostedt
ce6bd420f4 futex: fix for futex_wait signal stack corruption
David Holmes found a bug in the -rt tree with respect to
pthread_cond_timedwait. After trying his test program on the latest git
from mainline, I found the bug was there too.  His test program showed
that if one were to do a "Ctrl-Z" on a process that was in
pthread_cond_timedwait, and then did a "bg" on that process, it would
return with "-ETIMEDOUT", but early. That is, the timer would go off
early.

Looking into this, I found the source of the problem. And it is a rather
nasty bug at that.

Here's the relevant code from kernel/futex.c: (not in order in the file)

[...]
asmlinkage long sys_futex(u32 __user *uaddr, int op, u32 val,
                          struct timespec __user *utime, u32 __user *uaddr2,
                          u32 val3)
{
        struct timespec ts;
        ktime_t t, *tp = NULL;
        u32 val2 = 0;
        int cmd = op & FUTEX_CMD_MASK;

        if (utime && (cmd == FUTEX_WAIT || cmd == FUTEX_LOCK_PI)) {
                if (copy_from_user(&ts, utime, sizeof(ts)) != 0)
                        return -EFAULT;
                if (!timespec_valid(&ts))
                        return -EINVAL;

                t = timespec_to_ktime(ts);
                if (cmd == FUTEX_WAIT)
                        t = ktime_add(ktime_get(), t);
                tp = &t;
        }
[...]
        return do_futex(uaddr, op, val, tp, uaddr2, val2, val3);
}

[...]

long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
                u32 __user *uaddr2, u32 val2, u32 val3)
{
        int ret;
        int cmd = op & FUTEX_CMD_MASK;
        struct rw_semaphore *fshared = NULL;

        if (!(op & FUTEX_PRIVATE_FLAG))
                fshared = &current->mm->mmap_sem;

        switch (cmd) {
        case FUTEX_WAIT:
                ret = futex_wait(uaddr, fshared, val, timeout);

[...]

static int futex_wait(u32 __user *uaddr, struct rw_semaphore *fshared,
                      u32 val, ktime_t *abs_time)
{
[...]
                struct restart_block *restart;
                restart = &current_thread_info()->restart_block;
                restart->fn = futex_wait_restart;
                restart->arg0 = (unsigned long)uaddr;
                restart->arg1 = (unsigned long)val;
                restart->arg2 = (unsigned long)abs_time;
                restart->arg3 = 0;
                if (fshared)
                        restart->arg3 |= ARG3_SHARED;
                return -ERESTART_RESTARTBLOCK;
[...]

static long futex_wait_restart(struct restart_block *restart)
{
        u32 __user *uaddr = (u32 __user *)restart->arg0;
        u32 val = (u32)restart->arg1;
        ktime_t *abs_time = (ktime_t *)restart->arg2;
        struct rw_semaphore *fshared = NULL;

        restart->fn = do_no_restart_syscall;
        if (restart->arg3 & ARG3_SHARED)
                fshared = &current->mm->mmap_sem;
        return (long)futex_wait(uaddr, fshared, val, abs_time);
}

So when futex_wait is interrupted by a signal we break out of the
hrtimer code and set up our return from the signal. This code does not
return back to userspace, so we set up a RESTARTBLOCK.  The bug here is
that we save "abs_time", which is a pointer to the stack variable
"ktime_t t" from sys_futex.

sys_futex returns and unwinds the stack before we get to call our signal
handler. On return from the signal we go to futex_wait_restart, where we
restore all the parameters for futex_wait and call it. But here we have
a problem: abs_time is no longer valid.

I verified this with print statements, and sure enough, what abs_time
was set to ends up being garbage when we get to futex_wait_restart.

My fix (with input from Linus Torvalds) was to add unions to the
restart_block so that system calls can use the restart with
call-specific parameters.  This way the futex code now saves the time
as a 64-bit value in the restart block instead of pointing at a value
on the stack.
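
The idea can be sketched in user space. This is only an illustration
under stated assumptions: the demo_* names and the exact field layout
are hypothetical and are not taken from the patch itself. The point is
that the timeout is copied by value into the restart block, so it stays
valid after the frame that held the original variable is gone.

```c
#include <stdint.h>

/* Hypothetical, simplified restart block: the union lets each system
 * call keep its own typed arguments by value. */
struct demo_restart_block {
    long (*fn)(struct demo_restart_block *);
    union {
        struct {
            unsigned long arg0, arg1, arg2, arg3;
        } generic;
        struct {
            uint32_t *uaddr;
            uint32_t val;
            uint32_t flags;
            uint64_t time;      /* absolute timeout, stored by value */
        } futex;
    } u;
};

/* Copy the timeout into the restart block instead of saving a pointer. */
static void demo_save_timeout(struct demo_restart_block *rb,
                              const int64_t *abs_time)
{
    rb->u.futex.time = (uint64_t)*abs_time;
}

static int64_t demo_restore_timeout(const struct demo_restart_block *rb)
{
    return (int64_t)rb->u.futex.time;
}

/* Stand-in for sys_futex(): the timeout lives on this function's stack. */
static void demo_syscall(struct demo_restart_block *rb)
{
    int64_t t = 123456789;

    demo_save_timeout(rb, &t);
}   /* 't' is gone after return, but the copy in rb is still valid */
```

Calling demo_syscall() and then demo_restore_timeout() on the same block
gives back the saved value, whereas a saved pointer to 't' would dangle.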

Note: I'm a bit nervous about adding "linux/types.h" and using u32 and
u64 in thread_info.h, when there's a #ifdef __KERNEL__ just below that.
Not sure what that is there for.  If this turns out to be a problem, I've
tested this using "unsigned int" for u32 and "unsigned long long" for
u64 and it worked just the same. I'm using u32 and u64 just to be
consistent with what the futex code uses.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-05 15:46:09 +01:00
Andrew Gallatin
621544eb8c [LRO]: fix lro_gen_skb() alignment
Add a field to the lro_mgr struct so that drivers can specify how much
padding is required to align layer 3 headers when a packet is copied
into a freshly allocated skb by inet_lro.c:lro_gen_skb().  Without
padding, skbs generated by LRO will cause alignment warnings on
architectures which require strict alignment (seen on sparc64).

Myri10GE is updated to use this field.
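
As a rough illustration of how such a padding value could be chosen
(the patch leaves the value to the driver; demo_l3_pad() is a
hypothetical helper, not part of inet_lro): for a link-layer header of
l2_len bytes, the padding is whatever moves the following layer 3
header onto the required boundary.

```c
#include <stddef.h>

/* Hypothetical helper: bytes of padding to reserve in front of a frame
 * so that the layer 3 header, which follows a link-layer header of
 * l2_len bytes, starts on an 'align'-byte boundary.
 * 'align' must be a power of two. */
static size_t demo_l3_pad(size_t l2_len, size_t align)
{
    return (align - (l2_len & (align - 1))) & (align - 1);
}
```

With a 14-byte Ethernet header and 4-byte alignment this yields 2 bytes
of padding; a 16-byte header needs none.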

Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-05 05:37:32 -08:00
Eric Paris
7cd94146cd Security: round mmap hint address above mmap_min_addr
If mmap_min_addr is set and a process attempts to mmap (not fixed) with a
non-null hint address less than mmap_min_addr the mapping will fail the
security checks.  Since this is just a hint address this patch will round
such a hint address above mmap_min_addr.

gcj was found to try to be very frugal with vm usage and give hint addresses
in the 8k-32k range.  Without this patch all such programs failed and with
the patch they happily get a higher address.

This patch is wrapped in CONFIG_SECURITY since mmap_min_addr doesn't exist
without it and there would be no security check possible no matter what.  So
we should not bother compiling in this rounding if it is just a waste of
time.
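
The rounding described above might look roughly like this. It is a
user-space sketch: the demo_* names, the 4096-byte page size and the
min_addr parameter are assumptions made for illustration, not the code
from the patch.

```c
#define DEMO_PAGE_SIZE 4096UL

/* Round an address up to the next page boundary. */
static unsigned long demo_page_align(unsigned long addr)
{
    return (addr + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}

/* A non-null hint below min_addr is moved up to the first page-aligned
 * address at or above min_addr; any other hint is returned unchanged. */
static unsigned long demo_round_hint_to_min(unsigned long hint,
                                            unsigned long min_addr)
{
    if (hint != 0 && hint < min_addr)
        return demo_page_align(min_addr);
    return hint;
}
```

A null hint stays null, so the "no preference" case is not affected.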

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2007-12-06 00:25:10 +11:00
Anton Vorontsov
6f4a7f4183 PHY: Add the phy_device_release device method.
Lately I've got this nice badness on mdio bus removal:

Device 'e0103120:06' does not have a release() function, it is broken and must be fixed.
------------[ cut here ]------------
Badness at drivers/base/core.c:107
NIP: c015c1a8 LR: c015c1a8 CTR: c0157488
REGS: c34bdcf0 TRAP: 0700   Not tainted  (2.6.23-rc5-g9ebadfbb-dirty)
MSR: 00029032 <EE,ME,IR,DR>  CR: 24088422  XER: 00000000
...
[c34bdda0] [c015c1a8] device_release+0x78/0x80 (unreliable)
[c34bddb0] [c01354cc] kobject_cleanup+0x80/0xbc
[c34bddd0] [c01365f0] kref_put+0x54/0x6c
[c34bdde0] [c013543c] kobject_put+0x24/0x34
[c34bddf0] [c015c384] put_device+0x1c/0x2c
[c34bde00] [c0180e84] mdiobus_unregister+0x2c/0x58
...

Actually nothing is broken; the device subsystem core just expects
another "pattern" of resource management.

This patch implements the phy device's release function, which gets
rid of this badness.

It also fixes a small hidden bug; hopefully no others were introduced. ;-)

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-04 15:06:33 -05:00
Linus Torvalds
ca6435f188 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  sched: cpu accounting controller (V2)
2007-12-03 08:21:06 -08:00
Linus Torvalds
8002cedc1a Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6: (27 commits)
  [INET]: Fix inet_diag dead-lock regression
  [NETNS]: Fix /proc/net breakage
  [TEXTSEARCH]: Do not allow zero length patterns in the textsearch infrastructure
  [NETFILTER]: fix forgotten module release in xt_CONNMARK and xt_CONNSECMARK
  [NETFILTER]: xt_TCPMSS: remove network triggerable WARN_ON
  [DECNET]: dn_nl_deladdr() almost always returns no error
  [IPV6]: Restore IPv6 when MTU is big enough
  [RXRPC]: Add missing select on CRYPTO
  mac80211: rate limit wep decrypt failed messages
  rfkill: fix double-mutex-locking
  mac80211: drop unencrypted frames if encryption is expected
  mac80211: Fix behavior of ieee80211_open and ieee80211_close
  ieee80211: fix unaligned access in ieee80211_copy_snap
  mac80211: free ifsta->extra_ie and clear IEEE80211_STA_PRIVACY_INVOKED
  SCTP: Fix build issues with SCTP AUTH.
  SCTP: Fix chunk acceptance when no authenticated chunks were listed.
  SCTP: Fix the supported extensions paramter
  SCTP: Fix SCTP-AUTH to correctly add HMACS paramter.
  SCTP: Fix the number of HB transmissions.
  [TCP] illinois: Incorrect beta usage
  ...
2007-12-03 08:15:36 -08:00
Srivatsa Vaddagiri
d842de871c sched: cpu accounting controller (V2)
Commit cfb5285660 removed a useful feature for us, which provided a cpu
accounting resource controller.  This feature would be useful if someone
wants to group tasks only for accounting purposes and doesn't really want
to exercise any control over their cpu consumption.

The patch below reintroduces the feature. It is based on Paul Menage's
original patch (Commit 62d0df6406), with
these differences:

        - Removed load average information. I felt it needs more thought (esp
          to deal with SMP and virtualized platforms) and can be added for
          2.6.25 after more discussions.
        - Convert group cpu usage to be nanosecond accurate (as the rest of
          the cfs stats are) and invoke cpuacct_charge() from the respective
          scheduler classes
        - Make accounting scalable on SMP systems by splitting the usage
          counter to be per-cpu
        - Move the code from kernel/cpu_acct.c to kernel/sched.c (since the
          code is not big enough to warrant a new file and also this rightly
          needs to live inside the scheduler. Also things like accessing
          rq->lock while reading cpu usage become easier if the code lives in
          kernel/sched.c)

The patch also modifies the cpu controller not to provide the same accounting
information.
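
The per-cpu split of the usage counter can be pictured with a small
user-space sketch. The demo_* names and the fixed CPU count are
hypothetical, and locking is left out entirely; in the kernel each CPU
would update only its own slot.

```c
#include <stdint.h>

#define DEMO_NR_CPUS 4

/* One nanosecond counter per CPU, so charging never touches another
 * CPU's slot. */
struct demo_cpuacct {
    uint64_t usage[DEMO_NR_CPUS];
};

/* Charge delta_ns of CPU time to the group on the given CPU. */
static void demo_charge(struct demo_cpuacct *ca, int cpu, uint64_t delta_ns)
{
    ca->usage[cpu] += delta_ns;
}

/* Reading the group total means summing over all per-CPU slots. */
static uint64_t demo_usage(const struct demo_cpuacct *ca)
{
    uint64_t total = 0;

    for (int i = 0; i < DEMO_NR_CPUS; i++)
        total += ca->usage[i];
    return total;
}
```

The trade-off is the usual one for per-cpu counters: charging is cheap
and contention-free, while reading costs one pass over all CPUs.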

Tested-by: Balbir Singh <balbir@linux.vnet.ibm.com>

 Tested the patches on top of 2.6.24-rc3. The patches work fine. Ran
 some simple tests like cpuspin (spin on the cpu), ran several tasks in
 the same group and timed them. Compared their time stamps with
 cpuacct.usage.

Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-12-02 20:04:49 +01:00
Kim Phillips
7d400a4c58 phylib: add PHY interface modes for internal delay for tx and rx only
Allow phylib specification of cases where hardware needs to configure
PHYs for Internal Delay only on either RX or TX (not both).

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Tested-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Li Yang <leoli@freescale.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-01 16:32:30 -05:00
Jeff Garzik
c99da91e7a Merge branch 'master' into upstream-fixes
2007-12-01 16:18:56 -05:00
Eric W. Biederman
2b1e300a9d [NETNS]: Fix /proc/net breakage
Well I clearly goofed when I added the initial network namespace support
for /proc/net.  Currently things work but there are odd details visible to
user space, even when we have a single network namespace.

Since we do not cache proc_dir_entry dentries at the moment we can just
modify ->lookup to return a different directory inode depending on the
network namespace of the process looking at /proc/net, replacing the
current technique of using a magic and fragile follow_link method.

To accomplish that this patch:
- introduces a shadow_proc method to allow different dentries to
  be returned from proc_lookup.
- Removes the old /proc/net follow_link magic
- Fixes a weakness in our not caching of proc generic dentries.

As shadow_proc uses a task struct to decide which dentry to return we can
go back later and fix the proc generic caching without modifying any code
that uses the shadow_proc method.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2007-12-02 00:33:17 +11:00
Jiri Kosina
8853c202b4 RTC: convert mutex to bitfield
The RTC code uses a mutex to ensure exclusive access to /dev/rtc.  This
is, however, wrong usage, as it leaves the mutex locked when returning to
userspace, which is unacceptable.

Convert rtc->char_lock into a bit operation.
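
A user-space analogue of the "busy bit" approach, using C11 atomics in
place of the kernel's bit operations. The demo_* names are
hypothetical; this only shows the pattern, not the driver code.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct demo_rtc {
    atomic_flag busy;   /* set while the device is open */
};

/* Try to claim the device. test-and-set returns the previous value, so
 * a previous value of "set" means somebody else already has it open. */
static bool demo_rtc_open(struct demo_rtc *rtc)
{
    return !atomic_flag_test_and_set(&rtc->busy);
}

/* Release the device so that the next open can succeed. */
static void demo_rtc_release(struct demo_rtc *rtc)
{
    atomic_flag_clear(&rtc->busy);
}
```

Unlike a mutex, nothing is held across the return to the caller: a
second open simply fails (a driver would typically report this as busy)
instead of blocking on a lock owned by a task that has gone back to
userspace.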

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-29 09:24:54 -08:00
Miklos Szeredi
a6643094e7 fuse: pass open flags to read and write
Some open flags (O_APPEND, O_DIRECT) can be changed with fcntl(F_SETFL, ...)
after open, but fuse currently only sends the flags to userspace in open.

To make it possible to correctly handle changing flags, send the
current value to userspace in each read and write.
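
The following user-space sketch shows why the flags seen at open time
can go stale: O_APPEND is switched on with fcntl(F_SETFL) long after
the file was opened. demo_enable_append() is a hypothetical helper name
and the code does not touch fuse itself.

```c
#include <fcntl.h>

/* Turn on O_APPEND for an already open descriptor.
 * Returns 1 if the flag was off before and is visible through F_GETFL
 * afterwards, 0 otherwise. */
static int demo_enable_append(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0 || (flags & O_APPEND))
        return 0;
    if (fcntl(fd, F_SETFL, flags | O_APPEND) < 0)
        return 0;

    flags = fcntl(fd, F_GETFL);
    return flags >= 0 && (flags & O_APPEND) != 0;
}
```

A filesystem daemon that only saw the flags passed at open would never
learn about this change, which is the gap the patch closes.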

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-29 09:24:54 -08:00
Huang, Ying
7c83172b98 x86_64 EFI boot support: EFI frame buffer driver
This patch adds Graphics Output Protocol support to the kernel.  UEFI2.0 spec
deprecates Universal Graphics Adapter (UGA) protocol and only Graphics Output
Protocol (GOP) is produced.  Therefore, the boot loader needs to query the
UEFI firmware with appropriate Output Protocol and pass the video information
to the kernel.  As a result of GOP protocol, an EFI framebuffer driver is
needed for displaying console messages.  The patch adds an EFI framebuffer
driver.  The EFI frame buffer driver in this patch is based on the Intel Mac
framebuffer driver.

The ELILO bootloader takes care of passing the video information as
appropriate for EFI firmware.

The framebuffer driver has been tested in i386 kernel and x86_64 kernel on EFI
platform.

Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-29 09:24:54 -08:00
Tobias Poschwatta
6454d1f903 fix up ext2_fs.h for userspace after reservations backport
In commit a686cd898b:

 "Val's cross-port of the ext3 reservations code into ext2."

include/linux/ext2_fs.h got a new function whose return value is only
defined if __KERNEL__ is defined. Putting #ifdef __KERNEL__ around the
function seems to help, patch below.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-29 09:24:53 -08:00
Thomas Bogendoerfer
68576cf122 IP22ZILOG: fix lockup and sysrq
- fix lockup when switching from early console to real console
- make sysrq reliable
- fix panic, if sysrq is issued before console is opened

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-29 09:24:53 -08:00
David Woodhouse
79288f5e93 Fix <linux/kd.h> usage in userspace
For reasons unclear to me, glibc's <sys/kd.h> deliberately defeats the
attempt we make in <linux/kd.h> to include <linux/types.h>

For now, change the one instance of __u32 to 'unsigned int' instead
because it's breaking userspace. We should probably also remove our
inclusion of <linux/types.h>, since we don't use it -- but that's not a
change to make in -rc.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Cc: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-29 09:24:52 -08:00
Zhao Yakui
a7839e9606 PNP: increase the maximum number of resources
On some systems the number of resources (IO, MEM) returned by a PNP device
is greater than the PNP constant, for example for motherboard devices.  As
a result some resources can't be reserved, which leads to resource
conflicts.  This causes PCI resources to be assigned wrongly on some
systems, and can cause a hang.  This is a regression since we deleted the
ACPI motherboard driver and started using the PNP system driver.

[akpm@linux-foundation.org: fix text and coding-style a bit]
Signed-off-by: Li Shaohua <shaohua.li@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-29 09:24:52 -08:00
Linus Torvalds
bb0851ff9d Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (25 commits)
  USB: s3c2410 gadget: ensure vbus pin in input mode during read
  USB: s3c2410 gadget: allow sharing of vbus irq
  USB: s3c2410 gadget: Header move fixups
  USB: usb-storage: unusual_devs entry for JetFlash TS1GJF2A
  USB: fix up EHCI startup synchronization
  USB: make the microtek driver and HAL cooperate
  USB: uevent environment key fix
  USB: keep track of whether interface sysfs files exist
  USB: sierra: new product id
  USB HCD: avoid duplicate local_irq_disable()
  USB: mailing lists have changed
  USB: remove USB HUB entry from MAINTAINERS
  USB: fix directory references in usb/README
  USB: add support for an older firmware revision for the Nikon D200
  USB: FIx locks and urb->status in adutux (updated)
  USB: power-management documenation update
  USB: Fix signr comment in usbdevice_fs.h
  usbserial: fix inconsistent lock state
  USB: fix usbled disconnect read race #2
  USB: free memory when writing fails in usb/serial/mos7840.c
  ...
2007-11-28 16:03:09 -08:00
Alan Stern
7e61559f61 USB: keep track of whether interface sysfs files exist
This patch (as1009) solves the problem of multiple registrations for
USB sysfs files in a more satisfying way than the existing code.  It
simply adds a flag to keep track of whether or not the files have been
created; that way the files can be created or removed as needed.
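
The flag-based approach amounts to making creation and removal
idempotent. A minimal sketch with hypothetical names (demo_intf and its
counters stand in for the real interface structure and sysfs calls):

```c
#include <stdbool.h>

struct demo_intf {
    bool sysfs_files_created;   /* the tracking flag */
    int create_calls;           /* counts real create operations */
    int remove_calls;           /* counts real remove operations */
};

/* Create the files only if they do not exist yet. */
static void demo_create_sysfs_files(struct demo_intf *intf)
{
    if (intf->sysfs_files_created)
        return;
    intf->create_calls++;       /* stand-in for the actual creation */
    intf->sysfs_files_created = true;
}

/* Remove the files only if they currently exist. */
static void demo_remove_sysfs_files(struct demo_intf *intf)
{
    if (!intf->sysfs_files_created)
        return;
    intf->remove_calls++;       /* stand-in for the actual removal */
    intf->sysfs_files_created = false;
}
```

Callers can then invoke either function whenever it is convenient
without risking a double registration or a removal of files that were
never created.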

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
2007-11-28 13:58:35 -08:00
Phil Endecott
bc59462b80 USB: Fix signr comment in usbdevice_fs.h
This trivial documentation patch corrects a comment in usbdevice_fs.h; it
previously suggested that the signal would only be sent on error, but I am
told that it is sent on both successful and unsuccessful completion, and
that zero indicates that no signal should be sent.

Signed-off-by: Phil Endecott <spam_from_usb_devel@chezphil.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-28 13:58:34 -08:00
Ingo Molnar
deaf2227dd sched: clean up, move __sched_text_start/end to sched.h
move __sched_text_start/end to sched.h. No code changed:

   text    data     bss     dec     hex filename
  26582    2310      28   28920    70f8 sched.o.before
  26582    2310      28   28920    70f8 sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-28 15:52:56 +01:00
Linus Torvalds
9c8ff4f4da Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  scatterlist: add more safeguards
  Revert "ll_rw_blk: temporarily enable max_segments tweaking"
  mmc: Add missing sg_init_table() call
  block: Fix memory leak in alloc_disk_node()
  alpha: fix sg_page breakage
  blktrace: Make sure BLKTRACETEARDOWN does the full cleanup.
2007-11-27 14:21:19 -08:00
Linus Torvalds
febb187761 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: adds the context menu key (HUT GenDesc 0x84)
  Input: add definitions for frame forward and frame back keys
  Input: bf54x-keys - keypad does not exist on BF544 parts
  Input: gpio-keys - request and configure GPIOs
  Input: i8042 - add i8042.noloop quirk for MS Virtual Machine
  Sonypi: use synchronize_irq instead of sycnronize_sched
  sonypi: fit input devices into sysfs tree
  sony-laptop: fit input devices into sysfs tree
2007-11-27 14:20:35 -08:00
Tejun Heo
645a8d9462 scatterlist: add more safeguards
Add more safeguards to protect against misinterpreting a chain entry
as a normal scatterlist and vice-versa.

* Make sure the entry isn't a chain when assigning and reading a
  normal sg.

* Clear offset and length when chaining.
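
The two safeguards can be illustrated with a simplified, user-space
model. It assumes a chain entry is marked by a tag bit in the low bit
of a pointer-sized field; the demo_* names and the layout are
hypothetical and are not the kernel's struct scatterlist.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SG_CHAIN 0x1UL     /* tag bit: entry points to another table */

struct demo_sg {
    uintptr_t link;             /* page pointer, or next table | DEMO_SG_CHAIN */
    unsigned int offset;
    unsigned int length;
};

static int demo_sg_is_chain(const struct demo_sg *sg)
{
    return (sg->link & DEMO_SG_CHAIN) != 0;
}

/* Assigning a page to a chain entry would corrupt the chain, so check. */
static void demo_sg_assign_page(struct demo_sg *sg, void *page)
{
    assert(!demo_sg_is_chain(sg));
    assert(((uintptr_t)page & DEMO_SG_CHAIN) == 0);
    sg->link = (uintptr_t)page;
}

/* Reading a chain entry as a page is equally wrong, so check here too. */
static void *demo_sg_page(const struct demo_sg *sg)
{
    assert(!demo_sg_is_chain(sg));
    return (void *)sg->link;
}

/* Turn an entry into a chain link; offset and length are cleared so a
 * stale value cannot be mistaken for real data. */
static void demo_sg_chain(struct demo_sg *sg, struct demo_sg *next)
{
    sg->offset = 0;
    sg->length = 0;
    sg->link = (uintptr_t)next | DEMO_SG_CHAIN;
}
```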

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-27 09:30:39 +01:00
Aristeu Rozanski
35baef2afb Input: adds the context menu key (HUT GenDesc 0x84)
Signed-off-by: Aristeu Rozanski <aris@ruivo.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2007-11-27 00:47:04 -05:00
Aristeu Rozanski
c23f1f9c40 Input: add definitions for frame forward and frame back keys
Signed-off-by: Aristeu Rozanski <aris@ruivo.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2007-11-27 00:46:57 -05:00
Linus Torvalds
8c27eba549 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6: (41 commits)
  [XFRM]: Fix leak of expired xfrm_states
  [ATM]: [he] initialize lock and tasklet earlier
  [IPV4]: Remove bogus ifdef mess in arp_process
  [SKBUFF]: Free old skb properly in skb_morph
  [IPV4]: Fix memory leak in inet_hashtables.h when NUMA is on
  [IPSEC]: Temporarily remove locks around copying of non-atomic fields
  [TCP] MTUprobe: Cleanup send queue check (no need to loop)
  [TCP]: MTUprobe: receiver window & data available checks fixed
  [MAINTAINERS]: tlan list is subscribers-only
  [SUNRPC]: Remove SPIN_LOCK_UNLOCKED
  [SUNRPC]: Make xprtsock.c:xs_setup_{udp,tcp}() static
  [PFKEY]: Sending an SADB_GET responds with an SADB_GET
  [IRDA]: Compilation for CONFIG_INET=n case
  [IPVS]: Fix compiler warning about unused register_ip_vs_protocol
  [ARP]: Fix arp reply when sender ip 0
  [IPV6] TCPMD5: Fix deleting key operation.
  [IPV6] TCPMD5: Check return value of tcp_alloc_md5sig_pool().
  [IPV4] TCPMD5: Use memmove() instead of memcpy() because we have overlaps.
  [IPV4] TCPMD5: Omit redundant NULL check for kfree() argument.
  ieee80211: Stop net_ratelimit/IEEE80211_DEBUG_DROP log pollution
  ...
2007-11-26 20:09:07 -08:00
Linus Torvalds
423eaf8f00 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6:
  NFS: Clean up new multi-segment direct I/O changes
  NFS: Ensure we return zero if applications attempt to write zero bytes
  NFS: Support multiple segment iovecs in the NFS direct I/O path
  NFS: Introduce iovec I/O helpers to fs/nfs/direct.c
  SUNRPC: Add missing "space" to net/sunrpc/auth_gss.c
  SUNRPC: make sunrpc/xprtsock.c:xs_setup_{udp,tcp}() static
  NFS: fs/nfs/dir.c should #include "internal.h"
  NFS: make nfs_wb_page_priority() static
  NFS: mount failure causes bad page state
  SUNRPC: remove NFS/RDMA client's binary sysctls
  kernel BUG at fs/nfs/namespace.c:108! - can be triggered by bad server
  sunrpc: rpc_pipe_poll may miss available data in some cases
  sunrpc: return error if unsupported enctype or cksumtype is encountered
  sunrpc: gss_pipe_downcall(), don't assume all errors are transient
  NFS: Fix the ustat() regression
2007-11-26 19:42:59 -08:00
Linus Torvalds
ff1ea52fa3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: fix APIC related bootup crash on Athlon XP CPUs
  time: add ADJ_OFFSET_SS_READ
  x86: export the symbol empty_zero_page on the 32-bit x86 architecture
  x86: fix kprobes_64.c inlining borkage
  pci: use pci=bfsort for HP DL385 G2, DL585 G2
  x86: correctly set UTS_MACHINE for "make ARCH=x86"
  lockdep: annotate do_debug() trap handler
  x86: turn off iommu merge by default
  x86: fix ACPI compile for LOCAL_APIC=n
  x86: printk kernel version in WARN_ON and other dump_stack users
  ACPI: Set max_cstate to 1 for early Opterons.
  x86: fix NMI watchdog & 'stopped time' problem
2007-11-26 19:41:28 -08:00
Linus Torvalds
6d27294053 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (21 commits)
  libata: bump transfer chunk size if it's odd
  libata: Return proper ATA INT status in pata_bf54x driver
  pata_ali: trim trailing whitespace (fix checkpatch complaints)
  pata_isapnp: Polled devices
  pata_hpt37x: Fix cable detect bug spotted by Sergei
  pata_ali: Lots of problems still showing up with small ATAPI DMA
  pata_ali: Add Mitac 8317 and derivatives
  libata-core: List more documentation sources for reference
  ata_piix: Invalid use of writel/readl with iomap
  sata_sil24: fix sg table sizing
  pata_jmicron: fix disabled port handling in jmicron_pre_reset()
  pata_sil680: kill bogus reset code (take 2)
  ata_piix: port enable for the first SATA controller of ICH8 is 0xf not 0x3
  ata_piix: only enable the first port on apple macbook pro
  ata_piix: reorganize controller IDs
  pata_sis.c: Add Packard Bell EasyNote K5305 to laptops
  libata-scsi: be tolerant of 12-byte ATAPI commands in 16-byte CDBs
  libata: use ATA_HORKAGE_STUCK_ERR for ATAPI tape drives
  libata: workaround DRQ=1 ERR=1 for ATAPI tape drives
  libata: remove unused functions
  ...
2007-11-26 19:18:22 -08:00
Linus Torvalds
6b41016032 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (39 commits)
  ACPI: EC: Workaround for optimized controllers (version 3)
  ACPI: EC: use printk_ratelimit(), add some DEBUG mode messages
  Revert "ACPI: EC: Workaround for optimized controllers"
  ACPI: fix two IRQ8 issues in IOAPIC mode
  ACPI: Add missing spaces to printk format
  cpuidle: fix HP nx6125 regression
  cpuidle: add sched_clock_idle_[sleep|wakeup]_event() hooks
  cpuidle: fix C3 for no bus-master control case
  ACPI: thinkpad-acpi: fix oops when a module parameter has no value
  Revert "Fix very high interrupt rate for IRQ8 (rtc) unless pnpacpi=off"
  ACPI: EC: Don't init EC early if it has no _INI
  Revert "acpi: make ACPI_PROCFS default to y"
  Revert "ACPI: add documentation for deprecated /proc/acpi/battery in ACPI_PROCFS"
  ACPI: Split out control for /proc/acpi entries from battery, ac, and sbs.
  ACPI: Video: Increase buffer size for writes to brightness proc file.
  ACPI: EC: Workaround for optimized controllers
  ACPI: SBS: Fix retval warning
  ACPI: Enable MSR (FixedHW) support for T-States
  ACPI: Get throttling info from BIOS only after evaluating _PDC
  ACPI: Use _TSS for throttling control, when present. Add error checks.
  ...
2007-11-26 19:09:58 -08:00
Adrian Bunk
483066d62e SUNRPC: make sunrpc/xprtsock.c:xs_setup_{udp,tcp}() static
xs_setup_{udp,tcp}() can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-11-26 16:24:50 -05:00
Adrian Bunk
5334eb13d4 NFS: make nfs_wb_page_priority() static
nfs_wb_page_priority() can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-11-26 16:24:48 -05:00
James Lentini
cfcb43ff7c SUNRPC: remove NFS/RDMA client's binary sysctls
Support for binary sysctls is being deprecated in 2.6.24. Since there
are no applications using the NFS/RDMA client's binary sysctls, it
makes sense to remove them. The patch below does this while leaving
the /proc/sys interface unchanged.

Please consider this for 2.6.24.

Signed-off-by: James Lentini <jlentini@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-11-26 16:21:19 -05:00
John Stultz
52bfb36050 time: add ADJ_OFFSET_SS_READ
Michael Kerrisk reported that a long standing bug in the adjtimex()
system call causes glibc's adjtime(3) function to deliver the wrong
results if 'delta' is NULL.

Add the ADJ_OFFSET_SS_READ API detail, which will be used by glibc
to fix this API compatibility bug.

Also see: http://bugzilla.kernel.org/show_bug.cgi?id=6761

[ mingo@elte.hu: added patch description and made it backwards compatible ]

NOTE: the new flag is defined as 0xa001 so that it returns -EINVAL on
older kernels - this way glibc can use it safely. Suggested by Ulrich
Drepper.
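
Why 0xa001 fails cleanly on older kernels can be sketched as follows.
This assumes (it is not stated in this message) that the existing
single-shot flag is 0x8001 and that older kernels reject any mode word
that contains the single-shot bits together with anything else. The
DEMO_* names and demo_old_validate() are hypothetical.

```c
#include <errno.h>

#define DEMO_ADJ_OFFSET_SINGLESHOT 0x8001   /* assumed existing flag value */
#define DEMO_ADJ_OFFSET_SS_READ    0xa001   /* new flag from this patch */

/* Model of the assumed validation in an older kernel: single-shot mode
 * must not be combined with any other mode bits. */
static int demo_old_validate(unsigned int modes)
{
    if ((modes & DEMO_ADJ_OFFSET_SINGLESHOT) == DEMO_ADJ_OFFSET_SINGLESHOT &&
        modes != DEMO_ADJ_OFFSET_SINGLESHOT)
        return -EINVAL;
    return 0;
}
```

Because 0xa001 contains all bits of 0x8001 plus one more, the old check
sees "single-shot plus something else" and returns -EINVAL, which lets
glibc detect a kernel without the new flag.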

Acked-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-11-26 20:42:19 +01:00
Herbert Xu
2d4baff8da [SKBUFF]: Free old skb properly in skb_morph
The skb_morph function only freed the data part of the dst skb, but leaked
the auxiliary data such as the netfilter fields.  This patch fixes this by
moving the relevant parts from __kfree_skb to skb_release_all and calling
it in skb_morph.

It also makes kfree_skbmem static since it's no longer called anywhere else
and it now no longer does skb_release_data.

Thanks to Yasuyuki KOZAKAI for finding this problem and posting a patch for
it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2007-11-26 23:11:19 +08:00
Ayaz Abdulla
490dde8990 forcedeth: new mcp79 pci ids
This patch adds new device ids and features for mcp79 devices into the
forcedeth driver.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-11-23 20:54:01 -05:00
Adrian Bunk
5fe4a33430 [SUNRPC]: Make xprtsock.c:xs_setup_{udp,tcp}() static
xs_setup_{udp,tcp}() can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2007-11-22 19:38:25 +08:00
Heiko Carstens
37e3a6ac5a [S390] appldata: remove unused binary sysctls.
Remove binary sysctls that never worked due to missing strategy functions.

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Gerald Schaefer <geraldsc@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-11-20 11:13:45 +01:00
Heiko Carstens
43ebbf119a [S390] cmm: remove unused binary sysctls.
Remove binary sysctls that never worked due to missing strategy functions.

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-11-20 11:13:45 +01:00
Len Brown
95b00786f3 Pull cpuidle into release branch
2007-11-20 01:18:37 -05:00
Shaohua Li
61fd47e0c8 ACPI: fix two IRQ8 issues in IOAPIC mode
Use mp_irqs[] to get PNP device's interrupt polarity and trigger.
There are two reasons to do this:
1. BIOS bug for PNP interrupt
2. BIOS explicitly does override
mp_irqs[] should cover all the cases.

http://bugzilla.kernel.org/show_bug.cgi?id=5243
http://bugzilla.kernel.org/show_bug.cgi?id=7679
http://bugzilla.kernel.org/show_bug.cgi?id=9153

[lenb: fixed !IOAPIC and 64-bit !SMP builds]

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-11-20 01:16:29 -05:00
Venkatesh Pallipadi
ddc081a195 cpuidle: fix HP nx6125 regression
Fix for http://bugzilla.kernel.org/show_bug.cgi?id=9355

cpuidle always used to fall back to C2 if there was some bm activity while
entering C3. But the presence of C2 is not always guaranteed. Change the
cpuidle algorithm to detect a safe_state to fall back to in case of bm
activity and use that state instead of C2.
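
A sketch of the fallback idea, under the assumption that a "safe" state
is the deepest state below the requested one that does not depend on a
bus-master check. The selection rule and all demo_* names are
hypothetical; the message does not spell out the real criteria.

```c
#include <stdbool.h>

struct demo_cstate {
    const char *name;
    bool checks_bm;     /* entry must be abandoned on bus-master activity */
};

/* Deepest state shallower than 'wanted' that needs no bus-master check,
 * or -1 if none exists. States are ordered shallowest to deepest. */
static int demo_safe_state(const struct demo_cstate *states, int wanted)
{
    for (int i = wanted - 1; i >= 0; i--)
        if (!states[i].checks_bm)
            return i;
    return -1;
}

/* Pick the state to enter: the wanted one, unless bus-master activity
 * forces a fallback to the safe state. */
static int demo_pick_state(const struct demo_cstate *states, int wanted,
                           bool bm_active)
{
    if (states[wanted].checks_bm && bm_active)
        return demo_safe_state(states, wanted);
    return wanted;
}
```

On a table with only C1 and C3 the fallback lands on C1 rather than on a
C2 entry that does not exist.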

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-11-19 21:43:22 -05:00
Albert Lee
2d3b8eea7f libata: workaround DRQ=1 ERR=1 for ATAPI tape drives
After an error condition, some ATAPI tape drives set DRQ=1 together
with ERR=1 when asking the host to transfer the CDB of the next packet
command (i.e. request sense).  This patch, a revised version of
Alan/Mark's previous patch, adds ATA_HORKAGE_STUCK_ERR to work around
the problem by ignoring the ERR bit and proceeding to send the CDB.

Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Mark Lord <liml@rtr.ca>
Signed-off-by: Tejun Heo <htejun@gmail.com>
2007-11-19 12:28:11 +09:00
Adrian Bunk
21bef6dd2b libata: remove unused functions
This patch removes the following obsolete functions:
- libata-core.c: __sata_phy_reset()
- libata-core.c: sata_phy_reset()
- libata-eh.c: ata_qc_timeout()
- libata-eh.c: ata_eng_timeout()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Tejun Heo <htejun@gmail.com>
2007-11-19 12:28:09 +09:00
Eric Paris
ec41878170 SELinux: return EOPNOTSUPP not ENOTSUPP
ENOTSUPP is not a valid error code in the kernel (it is defined in some
NFS internal error codes and has been improperly used in other places).
In the !CONFIG_SECURITY_SELINUX case though it is possible that we could
return this from selinux_audit_rule_init().  This patch just returns the
userspace-valid EOPNOTSUPP.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2007-11-17 10:38:16 +11:00
Jean Delvare
5e31c2bd3c i2c: Make i2c_check_addr static
i2c_check_addr is only used inside i2c-core now, so we can make it
static and stop exporting it. Thanks to David Brownell for noticing.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
2007-11-15 19:24:02 +01:00
Jan Kara
7c06a8dc64 Fix 64KB blocksize in ext3 directories
With 64KB blocksize, a directory entry can have size 64KB which does not
fit into the 16 bits we have for entry length.  So we store 0xffff instead
and convert the value when read from / written to disk.  The patch also
converts some places to use ext3_next_entry() when we are changing them
anyway.
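
The conversion can be sketched like this. The demo_* names are
hypothetical and endianness handling is deliberately left out, so this
shows only the 64KB special case, not the real on-disk helpers.

```c
#include <stdint.h>

#define DEMO_MAX_REC_LEN 0xffffu    /* on-disk marker for a 64KB entry */

/* In-memory length to 16-bit on-disk value: 65536 does not fit, so it
 * is stored as 0xffff. */
static uint16_t demo_rec_len_to_disk(unsigned int len)
{
    if (len == (1u << 16))
        return DEMO_MAX_REC_LEN;
    return (uint16_t)len;
}

/* 16-bit on-disk value back to the in-memory length. */
static unsigned int demo_rec_len_from_disk(uint16_t dlen)
{
    if (dlen == DEMO_MAX_REC_LEN)
        return 1u << 16;
    return dlen;
}
```

Every other length round-trips unchanged; only the full 64KB entry
takes the detour through 0xffff.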

[akpm@linux-foundation.org: coding-style cleanups]
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:43 -08:00
Eric W. Biederman
57d5f66b86 pidns: Place under CONFIG_EXPERIMENTAL
This is my trivial patch to swat innumerable little bugs with a single
blow.

After some intensive review (my apologies for not having gotten to this
sooner) what we have looks like a good base to build on with the current
pid namespace code but it is not complete, and it is still much too simple
to find issues where the kernel does the wrong thing outside of the initial
pid namespace.

Until the dust settles and we are certain we have the ABI and the
implementation is as correct as humanly possible let's keep process ID
namespaces behind CONFIG_EXPERIMENTAL.

This gives us the option of fixing any ABI or other bugs we find as long as
they are minor.

It also allows users of the kernel to avoid those bugs simply by ensuring their
kernel does not have support for multiple pid namespaces.

[akpm@linux-foundation.org: coding-style cleanups]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Kir Kolyshkin <kir@swsoft.com>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:43 -08:00
Bjorn Helgaas
9626f1f117 rtc: fall back to requesting only the ports we actually use
Firmware like PNPBIOS or ACPI can report the address space consumed by the
RTC.  The actual space consumed may be less than the size (RTC_IO_EXTENT)
assumed by the RTC driver.

The PNP core doesn't request resources yet, but I'd like to make it do so.
If/when it does, the RTC_IO_EXTENT request may fail, which prevents the RTC
driver from loading.

Since we only use the RTC index and data registers at RTC_PORT(0) and
RTC_PORT(1), we can fall back to requesting just enough space for those.

If the PNP core requests resources, this results in typical I/O port usage
like this:

    0070-0073 : 00:06		<-- PNP device 00:06 responds to 70-73
      0070-0071 : rtc		<-- RTC driver uses only 70-71

instead of the current:

    0070-0077 : rtc		<-- RTC_IO_EXTENT == 8
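
The fallback can be sketched in userspace as follows.  This is a simplified
model, assuming a child request succeeds only when it fits entirely inside
the parent window; struct io_window, window_can_hold() and rtc_claim_ports()
are hypothetical names, not the driver's code.

```c
#include <stdbool.h>

#define RTC_BASE       0x70u  /* RTC_PORT(0) on a PC */
#define RTC_IO_EXTENT  8u     /* size the driver historically assumed */
#define RTC_USED_PORTS 2u     /* index and data registers only */

/* Hypothetical stand-in for a parent resource such as PNP device 00:06. */
struct io_window {
	unsigned int start;
	unsigned int end;   /* inclusive */
};

/* A child request succeeds only if it fits wholly inside the parent. */
static bool window_can_hold(const struct io_window *w,
			    unsigned int start, unsigned int len)
{
	return start >= w->start && start + len - 1 <= w->end;
}

/* Return the number of ports claimed, or 0 if even the fallback fails. */
static unsigned int rtc_claim_ports(const struct io_window *parent)
{
	if (window_can_hold(parent, RTC_BASE, RTC_IO_EXTENT))
		return RTC_IO_EXTENT;
	if (window_can_hold(parent, RTC_BASE, RTC_USED_PORTS))
		return RTC_USED_PORTS;
	return 0;
}
```

With a parent window of 70-73 the full eight-port request does not fit, so
the sketch falls back to the two ports that are actually used.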

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:41 -08:00
Shannon Nelson
7bb67c14fd I/OAT: Add support for version 2 of ioatdma device
Add support for version 2 of the ioatdma device.  This device handles
the descriptor chain and DCA services slightly differently:
 - Instead of moving the dma descriptors between a busy and an idle chain,
   this new version uses a single circular chain so that we don't have to
   rewrite the next_descriptor pointers as we add new requests, and the
   device doesn't need to re-read the last descriptor.
 - The new device has the DCA tags defined internally instead of needing
   them defined statically.

Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Cc: "Williams, Dan J" <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:41 -08:00
Andrew Morton
cfb5285660 revert "Task Control Groups: example CPU accounting subsystem"
Revert 62d0df6406.

This was originally intended as a simple initial example of how to create a
control groups subsystem; it wasn't intended for mainline, but I didn't make
this clear enough to Andrew.

The CFS cgroup subsystem now has better functionality for the per-cgroup usage
accounting (based directly on CFS stats) than the "usage" status file in this
patch, and the "load" status file is rather simplistic - although having a
per-cgroup load average report would be a useful feature, I don't believe this
patch actually provides it.  If it gets into the final 2.6.24 we'd probably
have to support this interface for ever.

Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:40 -08:00
Ken Chen
45c682a68a hugetlb: fix i_blocks accounting
For administrative purposes, we want to query the actual block usage of a
hugetlbfs file via fstat.  Currently, hugetlbfs always returns 0.  Fix that
up, since the kernel already has all the information to track it properly.

Signed-off-by: Ken Chen <kenchen@google.com>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:40 -08:00
Adam Litke
9a119c056d hugetlb: allow bulk updating in hugetlb_*_quota()
Add a second parameter 'delta' to hugetlb_get_quota and hugetlb_put_quota to
allow bulk updating of the sbinfo->free_blocks counter.  This will be used by
the next patch in the series.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Ken Chen <kenchen@google.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: David Gibson <hermes@gibson.dropbear.id.au>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:40 -08:00
Adam Litke
5b23dbe817 hugetlb: follow_hugetlb_page() for write access
When calling get_user_pages(), a write flag is passed in by the caller to
indicate if write access is required on the faulted-in pages.  Currently,
follow_hugetlb_page() ignores this flag and always faults pages for
read-only access.  This can cause data corruption because a device driver
that calls get_user_pages() with write set will not expect COW faults to
occur on the returned pages.

This patch passes the write flag down to follow_hugetlb_page() and makes
sure hugetlb_fault() is called with the right write_access parameter.

[ezk@cs.sunysb.edu: build fix]
Signed-off-by: Adam Litke <agl@us.ibm.com>
Reviewed-by: Ken Chen <kenchen@google.com>
Cc: David Gibson <hermes@gibson.dropbear.id.au>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:39 -08:00
Linus Torvalds
9418d5dc9b Merge branch 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6
* 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6:
  hwmon: (i5k_amb) Convert macros to C functions
  hwmon: (w83781d) Add missing curly braces
  hwmon: (abituguru3) Identify ABit IP35 Pro as such
  hwmon: (f75375s) pwmX_mode sysfs files writable for f75375 variant
  hwmon: (f75375s) On n2100 systems, set fans to full speed on boot
  hwmon: (f75375s) Allow setting up fans with platform_data
  hwmon: (f75375s) Add new style bindings
  hwmon: (lm70) Convert semaphore to mutex
  hwmon: (applesmc) Add support for Mac Pro 2 x Quad-Core
  hwmon: (abituguru3) Add support for 2 new motherboards
  hwmon: (ibmpex) Change printk to dev_{info,err} macros
  hwmon: (i5k_amb) New memory temperature sensor driver
  hwmon: (f75375s) fix pwm mode setting
  hwmon: (ibmpex.c) fix NULL dereference
  hwmon: (sis5595) Split sis5595_attributes_opt
  hwmon: (sis5595) Add individual alarm files
  hwmon: (w83627hf) push nr+1 offset into *_REG_FAN macros and simplify
  hwmon: (w83627hf) hoist nr-1 offset out of show-store-temp-X
  hwmon: Add power meter spec to Documentation/hwmon/sysfs-interface
2007-11-13 09:09:36 -08:00
Trond Myklebust
91cf45f02a [NET]: Add the helper kernel_sock_shutdown()
...and fix a couple of bugs in the NBD, CIFS and OCFS2 socket handlers.

Looking at the sock->op->shutdown() handlers, it looks as if all of them
take a SHUT_RD/SHUT_WR/SHUT_RDWR argument instead of the
RCV_SHUTDOWN/SEND_SHUTDOWN arguments.
Add a helper, and then define the SHUT_* enum to ensure that kernel users
of shutdown() don't get confused.
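
To illustrate the confusion, here is a small sketch, assuming a handler that
turns the "how" command into the internal mask by adding one.  The CMD_* and
MASK_* names are made up for this example so they do not clash with system
headers.

```c
/* Values a shutdown() handler expects as its "how" argument. */
enum shutdown_cmd {
	CMD_SHUT_RD   = 0,
	CMD_SHUT_WR   = 1,
	CMD_SHUT_RDWR = 2,
};

/* Internal state bits; these use a different numbering. */
#define MASK_RCV_SHUTDOWN  1
#define MASK_SEND_SHUTDOWN 2

/* Map a "how" command onto the internal mask (0->1, 1->2, 2->3). */
static int how_to_mask(int how)
{
	return how + 1;
}
```

Under that assumption, a caller that passes the receive mask value (1) as
the command ends up shutting down the send side instead.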

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Mark Fasheh <mark.fasheh@oracle.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12 18:10:39 -08:00
Pierre Ynard
dbb2ed2485 [IPV6]: Add ifindex field to ND user option messages.
Userland neighbor discovery options are typically heavily involved with
the interface on which they are received: add a missing ifindex field to
the original struct. Thanks to Rémi Denis-Courmont.

Signed-off-by: Pierre Ynard <linkfanel@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12 17:58:35 -08:00
Linus Torvalds
05f3f41589 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-virtio
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-virtio:
  virtio: Force use of power-of-two for descriptor ring sizes
  lguest: Fix lguest virtio-blk backend size computation
  virtio: Fix used_idx wrap-around
  virtio: more fallout from scatterlist changes.
  virtio: fix vring_init for 64 bits
2007-11-12 11:13:31 -08:00
Rusty Russell
42b36cc0ce virtio: Force use of power-of-two for descriptor ring sizes
The virtio descriptor rings of size N-1 were nicely set up to be
aligned to an N-byte boundary.  But as Anthony Liguori points out, the
free-running indices used by virtio require that the sizes be a power
of 2, otherwise we get problems on wrap (demonstrated with lguest).

So we replace the clever "2^n-1" scheme with a simple "align to page
boundary" scheme: this means that all virtio rings take at least two
pages, but it's safer than guessing cache alignment.
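
The wrap problem can be shown with a short sketch, assuming a 16-bit
free-running index that is reduced modulo the ring size to find a slot.
ring_slot() and wrap_is_seamless() are illustrative names only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Slot in a ring of "num" entries for a free-running 16-bit index. */
static unsigned int ring_slot(uint16_t idx, unsigned int num)
{
	return idx % num;
}

/*
 * When the index wraps from 65535 to 0, the slot must advance by exactly
 * one (modulo num).  That holds only if num divides 65536.
 */
static bool wrap_is_seamless(unsigned int num)
{
	uint16_t last = UINT16_MAX;
	uint16_t next = (uint16_t)(last + 1);   /* wraps to 0 */

	return ring_slot(next, num) == (ring_slot(last, num) + 1) % num;
}
```

A ring of 8 entries wraps cleanly, while a ring of 7 jumps from slot 1 back
to slot 0 when the index overflows.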

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-11-12 13:59:40 +11:00
Anthony Liguori
44332f7167 virtio: fix vring_init for 64 bits
This patch fixes a typo in vring_init().  This happens to work today in lguest
because the sizeof(struct vring_desc) is 16 and struct vring contains 3
pointers and an unsigned int so on 32-bit
sizeof(struct vring_desc) == sizeof(struct vring).  However, this is no longer
true on 64-bit where the bug is exposed.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-11-12 13:55:12 +11:00
Chuck Lever
78608ba032 [NET]: Fix skb_truesize_check() assertion
The intent of the assertion in skb_truesize_check() is to check
for skb->truesize being decremented too much by other code,
resulting in a wraparound below zero.

The type of the right side of the comparison causes the compiler to
promote the left side to an unsigned type, despite the presence of an
explicit type cast.  This defeats the check for negativity.

Ensure both sides of the comparison are a signed type to prevent the
implicit type conversion.
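
A standalone sketch of the pitfall, assuming a stand-in struct hdr in place
of the real buffer header.  The conversion the compiler performs implicitly
is written out as an explicit cast so its effect is visible.

```c
#include <stddef.h>

/* Hypothetical stand-in for the fixed-size part of a buffer header. */
struct hdr {
	char bytes[64];
};

/*
 * Broken: sizeof yields size_t, so the signed value is converted to an
 * unsigned type before the comparison and a negative input looks huge.
 */
static int underflow_check_broken(unsigned int truesize, unsigned int len)
{
	int as_signed = (int)truesize;

	/* The cast below is what the compiler otherwise inserts implicitly. */
	return (size_t)as_signed < sizeof(struct hdr) + len;
}

/* Fixed: both sides are signed, so a wrapped truesize is detected. */
static int underflow_check_fixed(unsigned int truesize, unsigned int len)
{
	return (int)truesize < (int)(sizeof(struct hdr) + len);
}
```

For a truesize that has wrapped below zero, the broken form reports no
problem while the fixed form flags it.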

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10 21:53:30 -08:00
Linus Torvalds
a70a932299 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  sched: proper prototype for kernel/sched.c:migration_init()
  sched: avoid large irq-latencies in smp-balancing
  sched: fix copy_namespace() <-> sched_fork() dependency in do_fork
  sched: clean up the wakeup preempt check, #2
  sched: clean up the wakeup preempt check
  sched: wakeup preemption fix
  sched: remove PREEMPT_RESTRICT
  sched: turn off PREEMPT_RESTRICT
  KVM: fix !SMP build error
  x86: make nmi_cpu_busy() always defined
  x86: make ipi_handler() always defined
  sched: cleanup, use NSEC_PER_MSEC and NSEC_PER_SEC
  sched: reintroduce SMP tunings again
  sched: restore deterministic CPU accounting on powerpc
  sched: fix delay accounting regression
  sched: reintroduce the sched_min_granularity tunable
  sched: documentation: place_entity() comments
  sched: fix vslice
2007-11-09 15:27:54 -08:00
Linus Torvalds
4c31c30302 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Add UNPLUG traces to all appropriate places
  block: fix requeue handling in blk_queue_invalidate_tags()
  mmc: Fix sg helper copy-and-paste error
  pktcdvd: fix BUG caused by sysfs module reference semantics change
  ioprio: allow sys_ioprio_set() value of 0 to reset ioprio setting
  cfq_idle_class_timer: add paranoid checks for jiffies overflow
  cfq: fix IOPRIO_CLASS_IDLE delays
  cfq: fix IOPRIO_CLASS_IDLE accounting
2007-11-09 15:17:49 -08:00
Adrian Bunk
e6fe6649b4 sched: proper prototype for kernel/sched.c:migration_init()
This patch adds a proper prototype for migration_init() in
include/linux/sched.h

Since there's no point in always returning 0 to a caller that doesn't check
the return value it also changes the function to return void.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09 22:39:39 +01:00
Peter Zijlstra
b82d9fdd84 sched: avoid large irq-latencies in smp-balancing
SMP balancing is done with IRQs disabled and can iterate the full rq.
When rqs are large this can cause large irq-latencies. Limit the nr of
iterations on each run.

This fixes a scheduling latency regression reported by the -rt folks.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09 22:39:39 +01:00
Ingo Molnar
3e3e13f399 sched: remove PREEMPT_RESTRICT
remove PREEMPT_RESTRICT. (this is a separate commit so that any
regression related to the removal itself is bisectable)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09 22:39:39 +01:00
Ingo Molnar
a5fbb6d106 KVM: fix !SMP build error
fix a !SMP build error:

drivers/kvm/kvm_main.c: In function 'kvm_flush_remote_tlbs':
drivers/kvm/kvm_main.c:220: error: implicit declaration of function 'smp_call_function_mask'

(and also avoid unused function warning related to up_smp_call_function()
not making use of the 'func' parameter.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09 22:39:38 +01:00
Paul Mackerras
fa13a5a1f2 sched: restore deterministic CPU accounting on powerpc
Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the
deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been
broken on powerpc, because we end up counting user time twice: once in
timer_interrupt() and once in update_process_times().

This fixes the problem by pulling the code in update_process_times
that updates utime and stime into a separate function called
account_process_tick.  If CONFIG_VIRT_CPU_ACCOUNTING is not defined,
there is a version of account_process_tick in kernel/timer.c that
simply accounts a whole tick to either utime or stime as before.  If
CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to
implement account_process_tick.

This also lets us simplify the s390 code a bit; it means that the s390
timer interrupt can now call update_process_times even when
CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a
suitable account_process_tick().

account_process_tick() now takes the task_struct * as an argument.
Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09 22:39:38 +01:00
Peter Zijlstra
b2be5e96dc sched: reintroduce the sched_min_granularity tunable
we lost the sched_min_granularity tunable to a clever optimization
that uses the sched_latency/min_granularity ratio - but the ratio
is quite unintuitive to users and can also crash the kernel if the
ratio is set to 0. So reintroduce the min_granularity tunable,
while keeping the ratio maintained internally.
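
One way to picture the arrangement is the sketch below.  It is an assumption
about the general shape, not the scheduler's code: struct sched_tunables,
set_min_granularity() and sched_period() are hypothetical, and rejecting a
value of 0 is a choice made for this example.

```c
struct sched_tunables {
	unsigned long latency_ns;          /* targeted scheduling period */
	unsigned long min_granularity_ns;  /* user-visible tunable */
	unsigned long nr_latency;          /* derived ratio, internal only */
};

/* Recompute the internal ratio; reject 0 so it can never divide by zero. */
static int set_min_granularity(struct sched_tunables *t, unsigned long ns)
{
	if (ns == 0)
		return -1;
	t->min_granularity_ns = ns;
	/* Round up, and never let the ratio drop to 0. */
	t->nr_latency = (t->latency_ns + ns - 1) / ns;
	if (t->nr_latency == 0)
		t->nr_latency = 1;
	return 0;
}

/* Stretch the period once more tasks run than fit at min granularity. */
static unsigned long sched_period(const struct sched_tunables *t,
				  unsigned long nr_running)
{
	if (nr_running > t->nr_latency)
		return t->latency_ns / t->nr_latency * nr_running;
	return t->latency_ns;
}
```

The user only ever sets a time value, and the ratio that the period
calculation divides by is derived from it and kept non-zero.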

no functionality changed.

[ mingo@elte.hu: some fixlets. ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09 22:39:37 +01:00
Alan D. Brunelle
2ad8b1ef11 Add UNPLUG traces to all appropriate places
Added blk_unplug interface, allowing all invocations of unplugs to result
in a generated blktrace UNPLUG.

Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-09 13:41:32 +01:00
Riku Voipio
ff312d07c2 hwmon: (f75375s) Allow setting up fans with platform_data
Allow initializing fans on systems where BIOS does not do that by
default.

 - define f75375s_platform_data in new file f75375s.h
 - if platform_data was provided, set fans accordingly in f75375_init()
 - split set_pwm_enable() to a sysfs callback and directly usable
   set_pwm_enable_direct()

Signed-off-by: Riku Voipio <riku.voipio@movial.fi>
Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
2007-11-08 08:42:46 -05:00
Darrick J. Wong
298c752491 hwmon: (i5k_amb) New memory temperature sensor driver
New driver to read FB-DIMM temperature sensors on systems with the
Intel 5000 series chipsets.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
2007-11-08 08:42:46 -05:00
Patrick McHardy
c3d8d1e30c [NETLINK]: Fix unicast timeouts
Commit ed6dcf4a in the history.git tree broke netlink_unicast timeouts
by moving the schedule_timeout() call to a new function that doesn't
propagate the remaining timeout back to the caller. This means on each
retry we start with the full timeout again.
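
The difference can be modelled with a small simulation, assuming a wait
primitive that is woken after a fixed slice and reports the time left
through a pointer.  wait_slice() and both total_wait_*() helpers are
hypothetical and only count ticks; nothing actually sleeps.

```c
/*
 * Hypothetical wait primitive: sleeps for at most *timeo ticks, is woken
 * after "slice" ticks, and writes the remaining time back to the caller.
 */
static void wait_slice(long *timeo, long slice)
{
	long slept = slice < *timeo ? slice : *timeo;

	*timeo -= slept;
}

/* Retry until "attempts" wakeups have happened or the budget is used up. */
static long total_wait_fixed(long timeout, long slice, int attempts)
{
	long timeo = timeout;
	long waited = 0;

	while (attempts-- > 0 && timeo > 0) {
		long before = timeo;

		wait_slice(&timeo, slice);   /* remaining time carries over */
		waited += before - timeo;
	}
	return waited;
}

/* Buggy variant: every retry starts again from the full timeout. */
static long total_wait_buggy(long timeout, long slice, int attempts)
{
	long waited = 0;

	while (attempts-- > 0) {
		long timeo = timeout;        /* budget reset on each retry */

		wait_slice(&timeo, slice);
		waited += timeout - timeo;
	}
	return waited;
}
```

With a timeout of 100 ticks and wakeups every 60 ticks, the fixed variant
stops at 100 ticks in total, while the buggy one keeps waiting on every
retry.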

ipc/mqueue.c seems to actually want to wait indefinitely so this
behaviour is retained.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:15:12 -08:00
Alan Cox
0fc00e2440 [TTY]: Fix network driver interactions with TCGET/SET calls.
Dave Miller noted various cases where line disciplines for things like
ppp go poking around in termios themselves in ways that broke with the
new termios code. Rather than have them all learning about termios
internals, provide proper methods for this:

- tty_mode_ioctl()

	This handles all the terminal mode handling for speed/carrier
etc. and none of the methods are ldisc dependent, so they can be called
by any user.

- tty_perform_flush()

	This extracts the flush functionality and enables the ppp
layer to share it cleanly.

The existing n_tty_ioctl code is refactored in this patch to provide
the new functions and to call them itself appropriately. This patch
has no (intended) behaviour changes and simply prepares for the other
fixes.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:14:19 -08:00
David S. Miller
44656ba128 [NET]: Kill proc_net_create()
There are no more users.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:10:52 -08:00
Pavel Emelyanov
6a9fb9479f [IPV4]: Clean the ip_sockglue.c from some ugly ifdefs
The #ifdef CONFIG_IP_MROUTE is sometimes placed inside the if-s,
which looks completely bad. Similar ifdefs inside the functions
look a bit better, but they are also not recommended.

Provide an ifdef-ed ip_mroute_opt() helper to cleanup the code.
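
The pattern looks roughly like the sketch below.  All DEMO_ and demo_ names
are placeholders for this example, and the option range is an assumption;
the point is only that the #ifdef lives in one helper instead of inside
each if.

```c
/* Hypothetical build switch standing in for CONFIG_IP_MROUTE. */
#define DEMO_CONFIG_IP_MROUTE 1

#define DEMO_MRT_BASE 200   /* assumed start of the multicast-routing range */
#define DEMO_MRT_MAX  (DEMO_MRT_BASE + 10)

#if DEMO_CONFIG_IP_MROUTE
/* Real test when the feature is compiled in. */
static inline int demo_mroute_opt(int optname)
{
	return optname >= DEMO_MRT_BASE && optname <= DEMO_MRT_MAX;
}
#else
/* Constant 0 when compiled out, so callers need no #ifdef. */
static inline int demo_mroute_opt(int optname)
{
	(void)optname;
	return 0;
}
#endif

/* Caller stays free of preprocessor conditionals. */
static int demo_setsockopt(int optname)
{
	if (demo_mroute_opt(optname))
		return 1;   /* would be handed to the multicast-routing code */
	return 0;       /* handled as an ordinary IP option */
}
```

When the switch is off the helper is a constant 0, so the compiler can drop
the dead branch without any conditional compilation at the call site.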

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:55 -08:00
Jan Engelhardt
b98e1747ee [NETFILTER]: Sort matches/targets in Kbuild file
Sort matches and targets in the Kbuild file.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:21 -08:00
Linus Torvalds
f8a9efb528 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: handle broken cable reporting
  pata_hpt37x: Fix outstanding bug reports on the HPT374 and 37x cable detect
  ata_piix: Add additional PCI identifier for 40 wire short cable
  pata_serverworks: Fix problem with some drive combinations
  libata: Don't disable dipm with SET FEATURES
  libata and bogus LBA48 drives
2007-11-05 17:43:04 -08:00
Kamalesh Babulal
5a75983eef Missing include file in kallsyms.h
The build with randconfig fails with the following error on
2.6.24-rc4-git9:

include/linux/kallsyms.h:56: error: `NULL' undeclared (first use in this
function)
include/linux/kallsyms.h:56: error: (Each undeclared identifier is
reported only once
include/linux/kallsyms.h:56: error: for each function it appears in.)
make[2]: *** [arch/powerpc/platforms/cell/spu_callbacks.o] Error 1
make[1]: *** [arch/powerpc/platforms/cell] Error 2
make: *** [arch/powerpc/platforms] Error 2

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-05 15:12:32 -08:00
Alan Cox
6bbfd53d47 libata: handle broken cable reporting
One or two ancient drives predated the cable spec and didn't set the
valid bits for the field. I had hoped to leave this out of libata as a
piece of historical annoyance but a recent CD drive shows the same bug so
we have to import support for it.

Same concept as Bartlomiej's changes to old IDE except that as we have
centralised blacklists we can avoid keeping another private table of stuff.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-11-05 18:10:28 -05:00
Linus Torvalds
5d66f151ac Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6:
  PCI: Add Kconfig option to disable deprecated pci_find_* API
  PCI: pciserial_resume_one ignored return value of pci_enable_device
  PCI Hotplug: cpqhp_pushbutton_thread(): remove a pointless if() check
  PCI: make pci_match_device() static
  PCI: Remove 3 incorrect MSI quirks.
  PCI: Add MSI INTX_DISABLE quirks for ATI SB700/800 SATA and IXP SB400 USB
  PCI: Add quirk for devices which disable MSI when INTX_DISABLE is set.
  PCI: Add MSI quirk for ServerWorks HT1000 PCIX bridge.
  PCI: Revert "PCI: disable MSI by default on systems with Serverworks HT1000 chips"
2007-11-05 14:08:00 -08:00
Jeff Garzik
bd3989e006 PCI: Add Kconfig option to disable deprecated pci_find_* API
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-05 13:35:17 -08:00
Adrian Bunk
d73460d79b PCI: make pci_match_device() static
pci_match_device() no longer has any other users.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-05 13:35:17 -08:00
David Miller
5257dca0bd PCI: Remove 3 incorrect MSI quirks.
Now that we have dealt with the real issue, in that some ATI SATA and
USB controllers needed the INTX_DISABLE quirk, we can remove these AMD
chipset global MSI disabling quirks.

This reverts three changesets:

4be8f90643 (PCI: disable MSI on RS690)
aea6a433f5 (PCI: disable MSI on RD580)
f122392f67 (PCI: disable MSI on RX790)

This is based upon testing and feedback from
Shane Huang <Shane.Huang@amd.com>.

Cc: Shane Huang <Shane.Huang@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-05 13:35:17 -08:00
David Miller
ba698ad4b7 PCI: Add quirk for devices which disable MSI when INTX_DISABLE is set.
A reasonably common problem with some devices is that they will
disable MSI generation when the INTX_DISABLE bit is set in the
PCI_COMMAND register.

Quirk this explicitly, guarding the pci_intx() calls in msi.c with
this quirk indication.
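
A userspace model of the guard, assuming a per-device quirk flag that is
checked before the MSI code touches INTX_DISABLE.  The struct, the flag and
the demo_ helpers are invented for illustration and operate on a simulated
command register.

```c
#include <stdint.h>

#define CMD_INTX_DISABLE  0x0400u  /* INTX_DISABLE bit in the command register */
#define FLAG_MSI_INTX_BUG 0x1u     /* hypothetical per-device quirk flag */

struct demo_pci_dev {
	uint16_t command;    /* simulated PCI_COMMAND register */
	unsigned int flags;  /* quirk flags set at enumeration time */
};

/* Unconditionally enable or disable legacy INTx signalling. */
static void demo_intx(struct demo_pci_dev *dev, int enable)
{
	if (enable)
		dev->command &= (uint16_t)~CMD_INTX_DISABLE;
	else
		dev->command |= CMD_INTX_DISABLE;
}

/* MSI path: leave INTX_DISABLE alone on devices with the quirk. */
static void demo_intx_for_msi(struct demo_pci_dev *dev, int enable)
{
	if (!(dev->flags & FLAG_MSI_INTX_BUG))
		demo_intx(dev, enable);
}
```

A device without the flag gets INTX_DISABLE set when MSI is enabled; a
flagged device keeps its command register untouched.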

The first entries for this quirk are for 5714 and 5780 Tigon3 chips,
and thus we can remove the workaround code from the tg3.c driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-05 13:35:16 -08:00
David Miller
1d84b5424e PCI: Add MSI quirk for ServerWorks HT1000 PCIX bridge.
This is the fix for the following problem:

https://bugzilla.redhat.com/show_bug.cgi?id=227657

The bnx2 device 5706 complains about MSI not working behind a
ServerWorks HT1000 PCIX bridge. An earlier commit to fix the problem:

e3008dedff:

"PCI: disable MSI by default on systems with Serverworks HT1000 chips"

was not entirely correct, and has been reverted.

MSI does not work on the PCIX bus because the BIOS did not set the
HT_MSI_FLAGS_ENABLE bit in the HyperTransport MSI capability on the
bridge.  We use the existing quirk_msi_ht_cap() to detect the problem
and disable MSI in all buses behind it.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Cc: Anantha Subramanyam <ananth@broadcom.com>
Cc: Naren Sankar <nsankar@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-05 13:35:16 -08:00
David Miller
2cc31879f8 PCI: Revert "PCI: disable MSI by default on systems with Serverworks HT1000 chips"
This reverts commit e3008dedff.

The real bug was an INTX issue in the tg3 ethernet chip, and
cured by commit c129d962a66c76964954a98b38586ada82cf9381

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-05 13:35:16 -08:00
Bartlomiej Zolnierkiewicz
01745112de ide: move ide_fixstring() documentation to ide-iops.c from ide.h
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-11-05 21:42:29 +01:00
Adrian Bunk
fad23fc78b kernel/futex.c: make 3 functions static
The following functions can now become static again:
- get_futex_key()
- get_futex_key_refs()
- drop_futex_key_refs()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-11-05 21:53:46 +11:00
Geert Uytterhoeven
17bd9a2f4c libata and bogus LBA48 drives
A colleague noticed recent versions of Ubuntu no longer detect his 80 GB
ST380020ACE drive. This drive is special in that it advertises LBA48 support,
but has the lba_capacity_2 field set to zero (cfr.
http://lkml.org/lkml/2004/3/30/163).

Upon closer look, libata indeed doesn't seem to handle this case yet.
Below is an (untested) fix.
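
The patch itself is not reproduced in this log.  As a rough illustration
only, one plausible way to handle such a drive is to trust the LBA48 claim
only when the 48-bit capacity is non-zero; struct demo_identify and the
demo_ helpers below are hypothetical and simplified.

```c
#include <stdint.h>

/* Hypothetical, simplified view of the fields taken from IDENTIFY data. */
struct demo_identify {
	int      claims_lba48;     /* drive advertises LBA48 support */
	uint32_t lba_capacity;     /* 28-bit sector count */
	uint64_t lba_capacity_2;   /* 48-bit sector count */
};

/* Trust the LBA48 claim only when the 48-bit capacity is non-zero. */
static int demo_use_lba48(const struct demo_identify *id)
{
	return id->claims_lba48 && id->lba_capacity_2 != 0;
}

/* Pick the sector count, falling back to the 28-bit field. */
static uint64_t demo_n_sectors(const struct demo_identify *id)
{
	if (demo_use_lba48(id))
		return id->lba_capacity_2;
	return id->lba_capacity;
}
```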

Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-11-04 22:53:15 -05:00
Linus Torvalds
b4f555081f Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  [BLOCK] Don't allow empty barriers to be passed down to queues that don't grok them
  dm: bounce_pfn limit added
  Deadline iosched: Fix batching fairness
  Deadline iosched: Reset batch for ordered requests
  Deadline iosched: Factor out finding latter reques
2007-11-03 12:43:36 -07:00
Linus Torvalds
160acc2e89 Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block
* 'sg' of git://git.kernel.dk/linux-2.6-block:
  [SG] Get rid of __sg_mark_end()
  cleanup asm/scatterlist.h includes
  SG: Make sg_init_one() use general table init functions
2007-11-03 12:43:21 -07:00
Tony Battersby
f8d8e5799b libata: increase 128 KB / cmd limit for ATAPI tape drives
Commands sent to ATAPI tape drives via the SCSI generic (sg) driver are
limited in the amount of data that they can transfer by the max_sectors
value.  The max_sectors value is currently calculated according to the
command set for disk drives, which doesn't apply to tape drives.  The
default max_sectors value of 256 limits ATAPI tape drive commands to
128 KB.  This patch against 2.6.24-rc1 increases the max_sectors value
for tape drives to 65535, which permits tape drive commands to transfer
just under 32 MB.
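
The figures above follow from simple arithmetic, assuming 512-byte sectors.
max_transfer_bytes() is a helper written for this note.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u   /* bytes per sector assumed by max_sectors */

/* Largest transfer, in bytes, that a given max_sectors value allows. */
static uint64_t max_transfer_bytes(uint32_t max_sectors)
{
	return (uint64_t)max_sectors * SECTOR_SIZE;
}
```

256 sectors give 131072 bytes (128 KB), and 65535 sectors give 33553920
bytes, which is 512 bytes short of 32 MB.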

Tested with a SuperMicro PDSME motherboard, AHCI, and a Sony SDX-570V
SATA tape drive.

Note that some of the chipset drivers also set their own max_sectors
value, which may override the value set in libata-core.  I don't have
any of these chipsets to test, so I didn't go messing with them.  Also,
ATAPI devices other than tape drives may benefit from similar changes,
but I have only tape drives and disk drives to test.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-11-03 08:46:54 -04:00
Linus Torvalds
a89b7717a8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: linux-input mailing list moved to vger.kernel.org
  Input: inport, logibm - use KERN_INFO when reporting missing mouse
  Input: appletouch - idle reset logic broke older Fountains
  Input: hp_sdc.c - fix section mismatch
  Input: appletouch - add Johannes Berg as maintainer
  Input: Add Euro and Dollar key codes
  Input: xpad - add more USB IDs
2007-11-02 19:37:41 -07:00
Vasily Averin
5ec140e600 dm: bounce_pfn limit added
Device mapper uses its own bounce_pfn that may differ from the one on the
underlying device. As a result, dm can build incorrect requests that contain
sg elements larger than the underlying device is able to handle.

This is the cause of slab corruption in the i2o layer, which occurred on i386
when very long direct IO requests were addressed to a dm-over-i2o device.

Signed-off-by: Vasily Averin <vvs@sw.ru>
Cc: <stable@kernel.org>
Cc: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-02 08:47:25 +01:00
Jens Axboe
c46f2334c8 [SG] Get rid of __sg_mark_end()
sg_mark_end() overwrites the page_link information, but all users want
__sg_mark_end() behaviour where we just set the end bit. That is the most
natural way to use the sg list, since you'll fill it in and then mark the
end point.

So change sg_mark_end() to only set the termination bit. Add a sg_magic
debug check as well, and clear a chain pointer if it is set.
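
In outline, the new behaviour amounts to the following sketch, assuming the
two low bits of page_link act as the chain and end flags.  struct demo_sg
and demo_sg_mark_end() are simplified stand-ins, and the sg_magic debug
check is omitted.

```c
#define SG_CHAIN_BIT 0x01ul  /* low bit: entry is a chain pointer */
#define SG_END_BIT   0x02ul  /* next bit: entry terminates the list */

/* Hypothetical, stripped-down scatterlist entry. */
struct demo_sg {
	unsigned long page_link;  /* page pointer with flag bits in the low bits */
};

/* Mark the entry as the end without touching the page pointer bits. */
static void demo_sg_mark_end(struct demo_sg *sg)
{
	sg->page_link |= SG_END_BIT;     /* set the termination bit */
	sg->page_link &= ~SG_CHAIN_BIT;  /* an end entry is not a chain link */
}
```

The page information already stored in the entry survives, which is why the
list can be filled in first and terminated afterwards.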

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-02 08:47:06 +01:00
Jens Axboe
013fb33972 SG: Make sg_init_one() use general table init functions
Don't open code sg_init_one(), make it reuse sg_init_table().

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-02 08:47:06 +01:00
Stephen Hemminger
3b582cc14c [NET]: docbook fixes for netif_ functions
Documentation updates for network interfaces.

1. Add doc for netif_napi_add
2. Remove doc for unused returns from netif_rx
3. Add doc for netif_receive_skb

[ Incorporated minor mods from Randy Dunlap -DaveM ]

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 02:21:47 -07:00
Greg Kroah-Hartman
d919fd433b Revert "Driver core: remove class_device_*_bin_file"
This reverts commit fcd239d3d5.

I messed up, ia64 still uses these files in the current tree, and now
can not build the pci code, which all ia64 boxes seem to require :)

This fixes that mistake.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-31 12:51:29 -07:00
Linus Torvalds
dd13810b42 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  [AF_KEY]: suppress a warning for 64k pages.
  [TIPC]: Fix headercheck wrt. tipc_config.h
  [COMPAT]: Fix build on COMPAT platforms when CONFIG_NET is disabled.
  [CONNECTOR]: Fix a spurious kfree_skb() call
  [COMPAT]: Fix new dev_ifname32 returning -EFAULT
  [NET]: Fix incorrect sg_mark_end() calls.
  [IPVS]: Remove /proc/net/ip_vs_lblcr
  [IPV6]: remove duplicate call to proc_net_remove
  [NETNS]: fix net released by rcu callback
  [NET]: Fix free_netdev on register_netdev failure.
  [WAN]: fix drivers/net/wan/lmc/ compilation
2007-10-31 07:46:51 -07:00
Greg Kroah-Hartman
fcd239d3d5 Driver core: remove class_device_*_bin_file
These functions are not used by anyone, so remove them from the tree.

The class_device code will be removed soon anyway, so no future users
will ever be possible.


Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-30 21:52:33 -07:00
David S. Miller
97ef1bb0c8 [TIPC]: Fix headercheck wrt. tipc_config.h
It wants string functions like memcpy() for inline
routines, and these define userland interfaces.

The only clean way to deal with this is to simply
put linux/string.h into unifdef-y and have it
include <string.h> when not-__KERNEL__.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30 21:44:00 -07:00
Linus Torvalds
71d00feca2 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
  ixgb: fix TX hangs under heavy load
  e1000e: Fix typo ! &
  ixgbe: minor sparse fixes
  e1000: sparse warnings fixes
  ixgb: fix sparse warnings
  e1000e: fix sparse warnings
  mv643xx_eth: Fix MV643XX_ETH offsets used by Pegasos 2
  Blackfin EMAC driver: Fix Ethernet communication bug (duplicated and lost packets)
  DM9601: Support for ADMtek ADM8515 NIC
2007-10-30 12:04:29 -07:00
Linus Torvalds
8c1ee54cb3 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: implement and use ATA_QCFLAG_QUIET
  libata: stop being overzealous about non-IO commands
  libata: flush is an IO command
  sata_promise: cleanups
  sata_promise: ASIC PRD table bug workaround, take 2
2007-10-30 11:49:13 -07:00
Dale Farnsworth
3077d78a74 mv643xx_eth: Fix MV643XX_ETH offsets used by Pegasos 2
In the mv643xx_eth driver, we now use offsets from the ethernet
register block within the chip, but the pegasos 2 platform still
needs offsets from the full chip's register base address.

Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-30 14:32:16 -04:00
Linus Torvalds
2d175d438f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  [TIPC]: Add tipc_config.h to include/linux/Kbuild.
  [WAN]: lmc_ioctl: don't return with locks held
  [SUNRPC]: fix rpc debugging
  [TCP]: Saner thash_entries default with much memory.
  [SUNRPC] rpc_rdma: we need to cast u64 to unsigned long long for printing
  [IPv4] SNMP: Refer correct memory location to display ICMP out-going statistics
  [NET]: Fix error reporting in sys_socketpair().
  [NETFILTER]: nf_ct_alloc_hashtable(): use __GFP_NOWARN
  [NET]: Fix race between poll_napi() and net_rx_action()
  [TCP] MD5: Remove some more unnecessary casting.
  [TCP] vegas: Fix a bug in disabling slow start by gamma parameter.
  [IPVS]: use proper timeout instead of fixed value
  [IPV6] NDISC: Fix setting base_reachable_time_ms variable.
2007-10-30 08:08:40 -07:00
Corey Minyard
64e862a579 IPMI: fix comparison in demangle_device_id
Coverity spotted some incorrect code in a recent change to the IPMI driver;
this patch makes sure the data is really long enough to pull the
manufacturer id and product id out of a get device id message.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Stian Jordet <liste@jordet.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-30 08:06:55 -07:00
Tejun Heo
e027bd36c1 libata: implement and use ATA_QCFLAG_QUIET
Implement ATA_QCFLAG_QUIET which indicates that there's no need to
report if the command fails with AC_ERR_DEV and set it for passthrough
commands.

Combined with previous changes, this now makes device errors for all
direct commands reported directly to the issuer without going through
EH actions and reporting.

Note that EH is still invoked after non-IO device errors to determine
the nature of the error and resume command execution (some controllers
require special care after an error to continue).  It just performs
default maintenance after error, examines what's going on, realizes
that it's none of its business and reports the command failure without
logging any error messages.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-30 09:59:43 -04:00
David S. Miller
502ef38da1 [TIPC]: Add tipc_config.h to include/linux/Kbuild.
Needed, as reported in:

http://bugzilla.kernel.org/show_bug.cgi?id=9260

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30 01:19:19 -07:00
Balbir Singh
9301899be7 sched: fix /proc/<PID>/stat stime/utime monotonicity, part 2
Extend Peter's patch to fix accounting issues, by keeping stime
monotonic too.

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Frans Pop <elendil@planet.nl>
2007-10-30 00:26:32 +01:00
Linus Torvalds
db8185360d Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  sched: fix style in kernel/sched.c
  sched: fix style of swap() macro in kernel/sched_fair.c
  sched: report CPU usage in CFS cgroup directories
  sched: move rcu_head to task_group struct
  sched: fix incorrect assumption that cpu 0 exists
  sched: keep utime/stime monotonic
  sched: make kernel/sched.c:account_guest_time() static
2007-10-29 14:06:19 -07:00
Linus Torvalds
6a22c57b8d Revert "x86_64: allocate sparsemem memmap above 4G"
This reverts commit 2e1c49db4c.

First off, testing in Fedora has shown it to cause boot failures,
bisected down by Martin Ebourne, and reported by Dave Jones.  So the
commit will likely be reverted in the 2.6.23 stable kernels.

Secondly, in the 2.6.24 model, x86-64 has now grown support for
SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the
bug is not visible any more, it's become invisible due to the code just
being irrelevant and no longer enabled on the only architecture that
this ever affected.

Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Martin Ebourne <fedora@ebourne.me.uk>
Cc: Zou Nan hai <nanhai.zou@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-29 14:05:37 -07:00
Peter Zijlstra
73a2bcb0ed sched: keep utime/stime monotonic
keep utime/stime monotonic.

cpustats use utime/stime as a ratio against sum_exec_runtime, as a
consequence it can happen - when the ratio changes faster than time
accumulates - that either can appear to go backwards.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-29 21:18:11 +01:00
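
One way to picture the fix in the entry above is to scale the precise runtime by the utime/(utime+stime) ratio and never report less than the previous value. The sketch below is a userspace illustration under stated assumptions: the struct, its field names other than those mentioned in the message, and `report_utime_sketch()` are hypothetical, and overflow of the multiplication is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-task accounting state. */
struct task_times_sketch {
    uint64_t utime;            /* sampled user-time ticks */
    uint64_t stime;            /* sampled system-time ticks */
    uint64_t sum_exec_runtime; /* precise total runtime */
    uint64_t prev_utime;       /* largest user time reported so far */
};

/* Report user time as a share of sum_exec_runtime, clamped to be monotonic. */
static uint64_t report_utime_sketch(struct task_times_sketch *t)
{
    uint64_t total, scaled = 0;

    assert(t != 0);
    total = t->utime + t->stime;
    if (total != 0)
        scaled = t->sum_exec_runtime * t->utime / total;

    /* The ratio may shrink faster than runtime grows; keep the old value. */
    if (scaled > t->prev_utime)
        t->prev_utime = scaled;
    return t->prev_utime;
}
```

When the system-time share jumps between two samples, the scaled value drops, and the clamp is what keeps the reported number from going backwards.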
Linus Torvalds
3529a23342 Merge branch 'alpm' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'alpm' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] AHCI: add hw link power management support
  [libata] Link power management infrastructure
2007-10-29 12:12:34 -07:00
Linus Torvalds
00cda56d39 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] AHCI: fix newly introduced host-reset bug
  [libata] sata_nv: fix SWNCQ enabling
  libata: add MAXTOR 7V300F0/VA111900 to NCQ blacklist
  libata: no need to speed down if already at PIO0
  libata: relocate forcing PIO0 on reset
  pata_ns87415: define SUPERIO_IDE_MAX_RETRIES
  [libata] Address some checkpatch-spotted issues
  [libata] fix 'if(' and similar areas that lack whitespace
  libata: implement ata_wait_after_reset()
  libata: track SLEEP state and issue SRST to wake it up
  libata: relocate and fix post-command processing
2007-10-29 12:11:54 -07:00
Kristen Carlson Accardi
ca77329fb7 [libata] Link power management infrastructure
Device Initiated Power Management, which is defined
in SATA 2.5 can be enabled for disks which support it.
This patch enables DIPM when the user sets the link
power management policy to "min_power".

Additionally, libata drivers can define a function
(enable_pm) that will perform hardware specific actions to
enable whatever power management policy the user set up
for Host Initiated Power management (HIPM).
This power management policy will be activated after all
disks have been enumerated and initialized.  Drivers should
also define disable_pm, which will turn off link power
management, but not change link power management policy.

Documentation/scsi/link_power_management_policy.txt has additional
information.

Signed-off-by:  Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-29 11:00:35 -04:00
Linus Torvalds
cbf67812b2 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  compat_ioctl: fix block device compat ioctl regression
  [BLOCK] Fix bad sharing of tag busy list on queues with shared tag maps
  Fix a build error when BLOCK=n
  block: use lock bitops for the tag map.
  cciss: update copyright notices
  cfq_get_queue: fix possible NULL pointer access
  blk_sync_queue() should cancel request_queue->unplug_work
  cfq_exit_queue() should cancel cfq_data->unplug_work
  block layer: remove a unused argument of drive_stat_acct()
2007-10-29 07:49:28 -07:00
Linus Torvalds
20dc9f01a8 Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block
* 'sg' of git://git.kernel.dk/linux-2.6-block:
  Correction of "Update drivers to use sg helpers" patch for IMXMMC driver
  sg_init_table() should use unsigned loop index variable
  sg_last() should use unsigned loop index variable
  Initialise scatter/gather list in sg driver
  Initialise scatter/gather list in ata_sg_setup
  x86: fix pci-gart failure handling
  SG: s390-scsi: missing size parameter in zfcp_address_to_sg()
  SG: clear termination bit in sg_chain()
2007-10-29 07:49:10 -07:00
Al Viro
142956af52 fix abuses of ptrdiff_t
Use of ptrdiff_t in places like

-                       if (!access_ok(VERIFY_WRITE, u_tmp->rx_buf, u_tmp->len))
+                       if (!access_ok(VERIFY_WRITE, (u8 __user *)
+                                               (ptrdiff_t) u_tmp->rx_buf,
+                                               u_tmp->len))

is wrong; for one thing, it's bad C (it's what uintptr_t is for; in general
we are not even promised that ptrdiff_t is large enough to hold a pointer,
just enough to hold a difference between two pointers within the same object).
For another, it confuses the fsck out of sparse.

Use unsigned long or uintptr_t instead.  There are several places misusing
ptrdiff_t; fixed.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-29 07:41:33 -07:00
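
The point made in the entry above, that an integer-to-pointer round trip belongs to uintptr_t rather than ptrdiff_t, can be shown with a small standalone sketch. The helper names are hypothetical and the 64-bit field stands in for something like the rx_buf member in the quoted diff.

```c
#include <assert.h>
#include <stdint.h>

/* This sketch assumes pointers fit in 64 bits, so the round trip is lossless. */
static_assert(sizeof(uintptr_t) <= sizeof(uint64_t),
              "uintptr_t must fit in the 64-bit field");

/* Hypothetical helper: recover a pointer from a 64-bit field. */
static void *u64_to_ptr_sketch(uint64_t raw)
{
    /* uintptr_t is the integer type meant to hold a converted pointer;
     * ptrdiff_t only has to hold a difference within one object. */
    return (void *)(uintptr_t)raw;
}

/* Hypothetical helper: store a pointer in a 64-bit field. */
static uint64_t ptr_to_u64_sketch(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}
```

In kernel code the same role is usually played by unsigned long, which is the other replacement the message suggests.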
Al Viro
2d8a972661 SUNRPC endianness annotations
rpcrdma stuff lacks endianness annotations for on-the-wire data.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-29 07:41:32 -07:00
Al Viro
ca5cd877ae x86 merge fallout: uml
Don't undef __i386__/__x86_64__ in uml anymore, make sure that (few) places
that required adjusting the ifdefs got those.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-29 07:41:32 -07:00
Jens Axboe
6eca9004df [BLOCK] Fix bad sharing of tag busy list on queues with shared tag maps
For the locking to work, only the tag map and tag bit map may be shared
(incidentally, I was just explaining this to Nick yesterday, but I
apparently didn't review the code well enough myself). But we also share
the busy list!  The busy_list must be queue private, or we need a
block_queue_tag covering lock as well.

So we have to move the busy_list to the queue. This'll work fine, and
it'll actually also fix a problem with blk_queue_invalidate_tags() which
will invalidate tags across all shared queues. This is a bit confusing,
the low level driver should call it for each queue separately since
otherwise you cannot kill tags on just a single queue for eg a hard
drive that stops responding. Since the function has no callers
currently, it's not an issue.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-29 11:33:06 +01:00
Tejun Heo
88ff6eafbb libata: implement ata_wait_after_reset()
On certain device/controller combinations, 0xff status is asserted
after reset and doesn't get cleared during 150ms post-reset wait.  As
0xff status is interpreted as no device (for good reasons), this can
lead to misdetection in such cases.

This patch implements ata_wait_after_reset() which replaces the 150ms
sleep and waits up to ATA_TMOUT_FF_WAIT if status is 0xff.
ATA_TMOUT_FF_WAIT is currently 800ms which is enough for
HHD424020F7SV00 to get detected but not enough for Quantum GoVault
drive which is known to take up to 2s.

Without parallel probing, spending 2s on a 0xff port would incur too
much delay on ata_piix, which uses 0xff to indicate an empty port and
doesn't have an SCR register, so GoVault needs to wait till parallel
probing.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-29 06:15:27 -04:00
Tejun Heo
054a5fbace libata: track SLEEP state and issue SRST to wake it up
ATA devices in SLEEP mode don't respond to any commands.  SRST is
necessary to wake it up.  Till now, when a command is issued to a
device in SLEEP mode, the command times out, which makes EH reset the
device and retry the command after that, causing a long delay.

This patch makes libata track SLEEP state and issue SRST automatically
if a command is about to be issued to a device in SLEEP.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Bruce Allen <ballen@gravity.phys.uwm.edu>
Cc: Andrew Paprocki <andrew@ishiboo.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-29 06:15:25 -04:00
Chuck Lever
513f54b78f sg_init_table() should use unsigned loop index variable
Clean up: fix a mixed sign comparison in sg_init_table() accidentally
introduced by commit d6ec0842.  The sign of the loop index variable
should match the sign of the "nents" argument.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
2007-10-29 09:18:04 +01:00
Chuck Lever
74eb94f7b8 sg_last() should use unsigned loop index variable
Clean up: fix a mixed sign comparison in sg_last() accidentally
introduced by commit 70eb8040.  The sign of the loop index variable
should match the sign of the "nents" argument.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
2007-10-29 09:18:04 +01:00
Jens Axboe
73fd546aa7 SG: clear termination bit in sg_chain()
Since we are using the last entry in the list, clear any possible
termination bit that may have already been set. Pointed out by Rusty.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-29 09:18:03 +01:00
Al Viro
30e69bf4cc fix breakage in pegasos_eth
Fix fallout from commit b45d9147f1
("mv643xx_eth: Remove unused register defines")

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-27 22:23:06 -07:00
Carlos Corbacho
f7852be649 Input: Add Euro and Dollar key codes
Most newer Acer laptops (from 2005 onwards) now ship with an extra Dollar
and Euro key either side of the 'Up' arrow. These cannot be mapped in the
traditional way, since they are not combination keys.

Signed-off-by: Carlos Corbacho <cathectic@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2007-10-27 23:42:32 -04:00
Linus Torvalds
ec3b67c11d Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits)
  [NetLabel]: correct usage of RCU locking
  [TCP]: fix D-SACK cwnd handling
  [NET] napi: use non-interruptible sleep in napi_disable
  [SCTP] net/sctp/auth.c: make 3 functions static
  [TCP]: Add missing I/O AT code to ipv6 side.
  [SCTP]: #if 0 sctp_update_copy_cksum()
  [INET]: Unexport icmpmsg_statistics
  [NET]: Unexport sock_enable_timestamp().
  [TCP]: Make tcp_match_skb_to_sack() static.
  [IRDA]: Make ircomm_tty static.
  [NET] fs/proc/proc_net.c: make a struct static
  [NET] dev_change_name: ignore changes to same name
  [NET]: Document some simple rules for actions
  [NET_CLS_ACT]: Use skb_act_clone
  [NET_CLS_ACT]: Introduce skb_act_clone
  [TCP]: Fix scatterlist handling in MD5 signature support.
  [IPSEC]: Fix scatterlist handling in skb_icv_walk().
  [IPSEC]: Add missing sg_init_table() calls to ESP.
  [CRYPTO]: Initialize TCRYPT on-stack scatterlist objects correctly.
  [CRYPTO]: HMAC needs some more scatterlist fixups.
  ...
2007-10-26 08:43:05 -07:00
Alexey Dobriyan
e868171a94 De-constify sched.h
[PATCH] De-constify sched.h

This reverts commit a8972ccf00 ("sched:
constify sched.h")

 1) Patch doesn't change any code here, so gcc is already smart enough
    to "feel" constness in such simple functions.
 2) There is no such thing as const task_struct.  Anyone who thinks
    otherwise deserves compiler warning.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-26 08:42:24 -07:00
Benjamin Herrenschmidt
43cc7380ec [NET] napi: use non-interruptible sleep in napi_disable
The current napi_disable() uses msleep_interruptible() but doesn't
(and can't) exit in case there's a signal, thus ending up doing a
hot spin without a cpu_relax. Use uninterruptible sleep instead.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 04:23:22 -07:00
David S. Miller
8c56a347c1 Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6 2007-10-26 03:50:02 -07:00
Linus Torvalds
06dbbfef82 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  [IPV4]: Explicitly call fib_get_table() in fib_frontend.c
  [NET]: Use BUILD_BUG_ON in net/core/flowi.c
  [NET]: Remove in-code externs for some functions from net/core/dev.c
  [NET]: Don't declare extern variables in net/core/sysctl_net_core.c
  [TCP]: Remove unneeded implicit type cast when calling tcp_minshall_update()
  [NET]: Treat the sign of the result of skb_headroom() consistently
  [9P]: Fix missing unlock before return in p9_mux_poll_start
  [PKT_SCHED]: Fix sch_prio.c build with CONFIG_NETDEVICES_MULTIQUEUE
  [IPV4] ip_gre: sendto/recvfrom NBMA address
  [SCTP]: Consolidate sctp_ulpq_renege_xxx functions
  [NETLINK]: Fix ACK processing after netlink_dump_start
  [VLAN]: MAINTAINERS update
  [DCCP]: Implement SIOCINQ/FIONREAD
  [NET]: Validate device addr prior to interface-up
2007-10-25 15:50:32 -07:00
Linus Torvalds
7f14957453 Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block
* 'sg' of git://git.kernel.dk/linux-2.6-block:
  fix sg_phys to use dma_addr_t
  ub: add sg_init_table for sense and read capacity commands
  x86: pci-gart fix
  blackfin: fix sg fallout
  xtensa: dma-mapping.h is using linux/scatterlist.h functions, so include it
  SG: audit of drivers that use blk_rq_map_sg()
  arch/um/drivers/ubd_kern.c: fix a building error
  SG: Change sg_set_page() to take length and offset argument
  AVR32: Fix sg_page breakage
  mmc: sg fallout
  m68k: sg fallout
  More SG build fixes
  sg: add missing sg_init_table calls to zfcp
  SG build fix
2007-10-25 15:44:54 -07:00
Linus Torvalds
2c75055703 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest:
  lguest: documentation update
  lguest: Add to maintainers file.
  lguest: build fix
  lguest: clean up lguest_launcher.h
  lguest: remove unused "wake" element from struct lguest
  lguest: use defines from x86 headers instead of magic numbers
  lguest: example launcher header cleanup.
2007-10-25 15:38:19 -07:00
Linus Torvalds
2304c3ac36 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
  [netdrvr] forcedeth: add MCP77 device IDs
  rndis_host: reduce MTU instead of refusing to talk to devices with low max packet size
  cpmac: update to new fixed phy driver interface
  cpmac: convert to napi_struct interface
  cpmac: use print_mac() instead of MAC_FMT
  natsemi: fix oops, link back netdevice from private-struct
  ehea: fix port_napi_disable/enable
  bonding/bond_main.c: fix cut'n'paste error
  make bonding/bond_main.c:bond_deinit() static
  drivers/net/ipg.c: cleanups
  remove Documentation/networking/net-modules.txt
2007-10-25 15:19:59 -07:00
Linus Torvalds
fcd05809e1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  sched: mark CONFIG_FAIR_GROUP_SCHED as !EXPERIMENTAL
  sched: isolate SMP balancing code a bit more
  sched: reduce balance-tasks overhead
  sched: make cpu_shares_{show,store}() static
  sched: clean up some control group code
  sched: constify sched.h
  sched: document profile=sleep requiring CONFIG_SCHEDSTATS
  sched: use show_regs() to improve __schedule_bug() output
  sched: clean up sched_domain_debug()
  sched: fix fastcall mismatch in completion APIs
  sched: fix sched_domain sysctl registration again
2007-10-25 15:19:03 -07:00
Jeff Garzik
de48844398 Permit silencing of __deprecated warnings.
The __deprecated marker is quite useful in highlighting the remnants of
old APIs that want removing.

However, it is quite normal for one or more years to pass, before the
(usually ancient, bitrotten) code in question is either updated or
deleted.

Thus, like __must_check, add a Kconfig option that permits the silencing
of this compiler warning.

This change mimics the ifdef-ery and Kconfig defaults of MUST_CHECK as
closely as possible.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-25 15:10:17 -07:00
Hugh Dickins
85cdffcde0 fix sg_phys to use dma_addr_t
x86_32 CONFIG_HIGHMEM64G with 5GB RAM hung when booting, after issuing
some "request_module: runaway loop modprobe binfmt-0000" messages in
trying to exec /sbin/init.

The binprm buf doesn't see the right ".ELF" header because sg_phys()
is providing the wrong physical addresses for high pages: a 32-bit
unsigned long is too small in this case, we need to use dma_addr_t.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-25 09:55:05 +02:00
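
The truncation described in the entry above is easy to reproduce with plain integer arithmetic. The sketch below assumes 4 KiB pages and uses fixed-width types to stand in for a 32-bit unsigned long and a 64-bit dma_addr_t; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12 /* assumed 4 KiB pages */

static_assert(PAGE_SHIFT_SKETCH < 32, "shift must be below the 32-bit width");

/* What a 32-bit unsigned long yields: high bits are silently lost. */
static uint32_t phys_addr_32_sketch(uint32_t pfn, uint32_t offset)
{
    return (pfn << PAGE_SHIFT_SKETCH) + offset;
}

/* Widening before the shift, as a 64-bit dma_addr_t would allow. */
static uint64_t phys_addr_64_sketch(uint32_t pfn, uint32_t offset)
{
    return ((uint64_t)pfn << PAGE_SHIFT_SKETCH) + offset;
}
```

For a page frame at the 4 GiB boundary (frame number 0x100000) the 32-bit version wraps to a low address, which matches the symptom of reading the wrong memory for high pages.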
Ayaz Abdulla
96fd4cd3e4 [netdrvr] forcedeth: add MCP77 device IDs
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-25 03:36:42 -04:00
Rusty Russell
e1e72965ec lguest: documentation update
Went through the documentation doing typo and content fixes.  This
patch contains only comment and whitespace changes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-25 15:02:50 +10:00
Rusty Russell
7334492b53 lguest: clean up lguest_launcher.h
Remove now-unused defines.
Fix old idempotent #ifndef _ASM_LGUEST_USER name.
Fix comment on use of lguest_req.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-25 14:12:20 +10:00
Peter Williams
681f3e6854 sched: isolate SMP balancing code a bit more
At the moment, a lot of load balancing code that is irrelevant to non
SMP systems gets included during non SMP builds.

This patch addresses this issue and reduces the binary size on non
SMP systems:

   text    data     bss     dec     hex filename
  10983      28    1192   12203    2fab sched.o.before
  10739      28    1192   11959    2eb7 sched.o.after

Signed-off-by: Peter Williams <pwil3058@bigpond.net.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:51 +02:00
Peter Williams
e1d1484f72 sched: reduce balance-tasks overhead
At the moment, balance_tasks() provides low level functionality for both
move_tasks() and move_one_task() (indirectly) via the load_balance()
function (in the sched_class interface) which also provides dual
functionality.  This dual functionality complicates the interfaces and
internal mechanisms and adds to the run time overhead of operations that
are called with two run queue locks held.

This patch addresses this issue and reduces the overhead of these
operations.

Signed-off-by: Peter Williams <pwil3058@bigpond.net.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:51 +02:00
Joe Perches
a8972ccf00 sched: constify sched.h
Add const to some struct task_struct * uses

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:50 +02:00
Ingo Molnar
b15136e949 sched: fix fastcall mismatch in completion APIs
Jeff Dike noticed that wait_for_completion_interruptible()'s prototype
had a mismatched fastcall.

Fix this by removing the fastcall attributes from all the completion APIs.

Found-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-24 18:23:48 +02:00
Gerrit Renker
d8ef2c29a0 [DCCP]: Convert Reset code into socket error number
This adds support for converting the 11 currently defined Reset codes into system
error numbers, which are stored in sk_err for further interpretation.

This makes the externally visible API behaviour similar to TCP, since a client
connecting to a non-existing port will experience ECONNREFUSED.

* Code 0, Unspecified, is interpreted as non-error (0);
* Code 1, Closed (normal termination), also maps into 0;
* Code 2, Aborted, maps into "Connection reset by peer" (ECONNRESET);
* Code 3, No Connection and
  Code 7, Connection Refused, map into "Connection refused" (ECONNREFUSED);
* Code 4, Packet Error, maps into "No message of desired type" (ENOMSG);
* Code 5, Option Error, maps into "Illegal byte sequence" (EILSEQ);
* Code 6, Mandatory Error, maps into "Operation not supported on transport endpoint" (EOPNOTSUPP);
* Code 8, Bad Service Code, maps into "Invalid request code" (EBADRQC);
* Code 9, Too Busy, maps into "Too many users" (EUSERS);
* Code 10, Bad Init Cookie, maps into "Invalid request descriptor" (EBADR);
* Code 11, Aggression Penalty, maps into "Quota exceeded" (EDQUOT)
  which makes sense in terms of using more than the `fair share' of bandwidth.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-10-24 10:27:48 -02:00
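
The Reset code mapping listed in the entry above can be written as a small lookup function. This is only a sketch of the table as the message states it: the function name is hypothetical, returning 0 for codes above 11 is an assumption, and EBADRQC and EBADR are Linux-specific errno values.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: map a DCCP Reset code (0..11) to an errno value. */
static int dccp_reset_errno_sketch(unsigned int code)
{
    switch (code) {
    case 0:  return 0;            /* Unspecified: treated as non-error */
    case 1:  return 0;            /* Closed (normal termination) */
    case 2:  return ECONNRESET;   /* Aborted */
    case 3:  return ECONNREFUSED; /* No Connection */
    case 4:  return ENOMSG;       /* Packet Error */
    case 5:  return EILSEQ;       /* Option Error */
    case 6:  return EOPNOTSUPP;   /* Mandatory Error */
    case 7:  return ECONNREFUSED; /* Connection Refused */
    case 8:  return EBADRQC;      /* Bad Service Code */
    case 9:  return EUSERS;       /* Too Busy */
    case 10: return EBADR;        /* Bad Init Cookie */
    case 11: return EDQUOT;       /* Aggression Penalty */
    default: return 0;            /* assumption: unknown codes are ignored */
    }
}
```

With this mapping, a client that connects to a non-existing port and receives a Reset with code 3 or 7 would see ECONNREFUSED, matching the TCP-like behaviour the message describes.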
Gerrit Renker
fde20105f3 [DCCP]: Retrieve packet sequence number for error reporting
This fixes a problem when analysing erroneous packets in dccp_v{4,6}_err:
* dccp_hdr_seq currently takes an skb
* however, the transport headers in the skb are shifted, due to the
  preceding IPv4/v6 header.
Fixed for v4 and v6 by changing dccp_hdr_seq to take a struct dccp_hdr as
argument. Verified that the correct sequence number is now reported in the
error handler.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-10-24 10:12:09 -02:00
Jens Axboe
642f149031 SG: Change sg_set_page() to take length and offset argument
Most drivers need to set length and offset as well, so may as well fold
those three lines into one.

Add sg_assign_page() for those two locations that only needed to set
the page, where the offset/length is set outside of the function context.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-24 11:20:47 +02:00
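
The interface change in the entry above amounts to folding three assignments into one call, with a page-only variant kept for the few callers that set offset and length elsewhere. The following userspace sketch illustrates that shape; the struct definitions and the `_sketch` names are hypothetical stand-ins, not the kernel's types.

```c
#include <assert.h>

/* Hypothetical stand-ins for struct page and struct scatterlist. */
struct page_sketch { int id; };

struct sg_entry_sketch {
    struct page_sketch *page;
    unsigned int offset;
    unsigned int length;
};

/* Page-only variant, for callers that fill in offset/length separately. */
static void sg_assign_page_sketch(struct sg_entry_sketch *sg,
                                  struct page_sketch *page)
{
    assert(sg && page);
    sg->page = page;
}

/* Combined variant: page, length and offset in a single call. */
static void sg_set_page_sketch(struct sg_entry_sketch *sg,
                               struct page_sketch *page,
                               unsigned int len, unsigned int offset)
{
    sg_assign_page_sketch(sg, page);
    sg->offset = offset;
    sg->length = len;
}
```

Most call sites then shrink from three lines to one, which is the motivation the message gives.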
Chuck Lever
c2636b4d9e [NET]: Treat the sign of the result of skb_headroom() consistently
In some places, the result of skb_headroom() is compared to an unsigned
integer, and in others, the result is compared to a signed integer.  Make
the comparisons consistent and correct.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:55 -07:00
Jeff Garzik
bada339ba2 [NET]: Validate device addr prior to interface-up
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:50 -07:00
Greg Ungerer
f0c15f48bb add port definition for mcf UART driver
Add a port type definition for the Freescale UART driver ports (mcf.c).

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-23 20:45:44 -07:00
Linus Torvalds
25c263542d Merge branch 'irq-upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6
* 'irq-upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
  [SPARC, XEN, NET/CXGB3] use irq_handler_t where appropriate
  drivers/char/riscom8: clean up irq handling
  isdn/sc: irq handler clean
  isdn/act2000: fix major bug. clean irq handler.
  char/pcmcia/synclink_cs: trim trailing whitespace
  drivers/char/ip2: separate polling and irq-driven work entry points
  drivers/char/ip2: split out irq core logic into separate function
  [NETDRVR] lib82596, netxen: delete pointless tests from irq handler
  Eliminate pointless casts from void* in a few driver irq handlers.
  [PARPORT] Remove unused 'irq' argument from parport irq functions
  [PARPORT] Kill useless 'irq' arg from parport_{generic_irq,ieee1284_interrupt}
  [PARPORT] Consolidate code copies into a single generic irq handler
2007-10-23 18:57:39 -07:00
Linus Torvalds
5a0e554b62 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (39 commits)
  Remove Andrew Morton from list of net driver maintainers.
  bonding: Acquire correct locks in alb for promisc change
  bonding: Convert more locks to _bh, acquire rtnl, for new locking
  bonding: Convert locks to _bh, rework alb locking for new locking
  bonding: Convert miimon to new locking
  bonding: Convert balance-rr transmit to new locking
  Convert bonding timers to workqueues
  Update MAINTAINERS to reflect my (jgarzik's) current efforts.
  pasemi_mac: fix typo
  defxx.c: dfx_bus_init() is __devexit not __devinit
  s390 MAINTAINERS
  remove header_ops bug in qeth driver
  sky2: crash on remove
  MIPSnet: Delete all the useless debugging printks.
  AR7 ethernet: small post-merge cleanups and fixes
  mv643xx_eth: Hook up mv643xx_get_sset_count
  mv643xx_eth: Remove obsolete checksum offload comment
  mv643xx_eth: Merge drivers/net/mv643xx_eth.h into mv643xx_eth.c
  mv643xx_eth: Remove unused register defines
  mv643xx_eth: Clean up mv643xx_eth.h
  ...
2007-10-23 18:56:54 -07:00
Jeff Garzik
2dcb407e61 [libata] checkpatch-inspired cleanups
Tackle the relatively sane complaints of checkpatch --file.

The vast majority is indentation and whitespace changes, the rest are

* #include fixes
* printk KERN_xxx prefix addition
* BSS/initializer cleanups

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-23 20:59:42 -04:00
Ursula Braun
f1ecfd5d3b remove header_ops bug in qeth driver
Remove qeth bug caused by commit:
[NET]: Move hardware header operations out of netdevice.

This is the second part of the qeth header_ops patch, since
first patch sent 10/19 has been insufficient.
Nevertheless first patch is still valid and should be kept.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-23 20:18:13 -04:00
Jeff Garzik
02bae21297 Merge branch 'features' of git://farnsworth.org/dale/linux-2.6-mv643xx_eth into upstream
2007-10-23 20:15:54 -04:00
Jeff Garzik
5712cb3d81 [PARPORT] Remove unused 'irq' argument from parport irq functions
None of the drivers with a struct pardevice's ->irq_func() hook ever
used the 'irq' argument passed to it, so remove it.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-23 19:53:16 -04:00
Jeff Garzik
f230d1010a [PARPORT] Kill useful 'irq' arg from parport_{generic_irq,ieee1284_interrupt}
parport_ieee1284_interrupt() was not using its first arg at all.
Delete.

parport_generic_irq()'s second arg makes its first arg completely
redundant.  Delete, and use port->irq in the one place where we actually
need it.

Also, s/__inline__/inline/ to make the code look nicer.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-23 19:53:15 -04:00
Jeff Garzik
3f2e40df0e [PARPORT] Consolidate code copies into a single generic irq handler
Several arches used the exact same code for their parport irq handling.
Make that code generic, in parport_irq_handler().

Also, s/__inline__/inline/ in include/linux/parport.h.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-23 19:53:15 -04:00
Linus Torvalds
01e7ae8c13 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: v9fs_vfs_rename incorrect clunk order
  9p: fix memleak in fs/9p/v9fs.c
  9p: add virtio transport
2007-10-23 12:04:01 -07:00
Eric Van Hensbergen
b530cc7940 9p: add virtio transport
This adds a transport to 9p for communicating between guests and a host
using a virtio based transport.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-10-23 13:47:31 -05:00
Jens Axboe
de26103de5 [SG] Add debug check for page alignment
Suggested by Boaz Harrosh <bharrosh@panasas.com>

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-23 20:35:58 +02:00
Linus Torvalds
0b776eb542 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  mlx4_core: Increase command timeout for INIT_HCA to 10 seconds
  IPoIB/cm: Use common CQ for CM send completions
  IB/uverbs: Fix checking of userspace object ownership
  IB/mlx4: Sanity check userspace send queue sizes
  IPoIB: Rewrite "if (!likely(...))" as "if (unlikely(!(...)))"
  IB/ehca: Enable large page MRs by default
  IB/ehca: Change meaning of hca_cap_mr_pgsize
  IB/ehca: Fix ehca_encode_hwpage_size() and alloc_fmr()
  IB/ehca: Fix masking error in {,re}reg_phys_mr()
  IB/ehca: Supply QP token for SRQ base QPs
  IPoIB: Use round_jiffies() for ah_reap_task
  RDMA/cma: Fix deadlock destroying listen requests
  RDMA/cma: Add locking around QP accesses
  IB/mthca: Avoid alignment traps when writing doorbells
  mlx4_core: Kill mlx4_write64_raw()
2007-10-23 09:56:11 -07:00
Lennert Buytenhek
e2734d6c61 mv643xx_eth: Move ethernet register definitions into private header
Move the mv643xx's ethernet-related register definitions from
include/linux/mv643xx.h into drivers/net/mv643xx_eth.h, since
they aren't of any use outside the ethernet driver.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Acked-by: Tzachi Perelstein <tzachi@marvell.com>
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
2007-10-23 08:23:00 -07:00
Lennert Buytenhek
c4a6a2ab5e mv643xx_eth: Split off mv643xx_eth platform device data
The mv643xx ethernet silicon block is also found in a couple of other
Marvell chips.  As a first step towards splitting off the mv643xx_eth
bits from the rest of the mv643xx bits, this patch splits the mv643xx
ethernet platform device data struct in linux/mv643xx.h off into
linux/mv643xx_eth.h, and includes the latter from the former.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Acked-by: Tzachi Perelstein <tzachi@marvell.com>
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
2007-10-23 08:22:58 -07:00
Rusty Russell
19f1537b7b Lguest support for Virtio
This makes lguest able to use the virtio devices.

We change the device descriptor page from a simple array to a variable
length "type, config_len, status, config data..." format, and
implement virtio_config_ops to read from that config data.

We use the virtio ring implementation for an efficient Guest <-> Host
virtqueue mechanism, and the new LHCALL_NOTIFY hypercall to kick the
host when it changes.

We also use LHCALL_NOTIFY on kernel addresses for very very early
console output.  We could have another hypercall, but this hack works
quite well.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23 15:49:56 +10:00
Rusty Russell
15045275c3 Remove old lguest I/O infrastructure.
This patch gets rid of the old lguest host I/O infrastructure and
replaces it with a single hypercall "LHCALL_NOTIFY" which takes an
address.

The main change is the removal of io.c: that mainly did inter-guest
I/O, which virtio doesn't yet support.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23 15:49:55 +10:00
Rusty Russell
0ca49ca946 Remove old lguest bus and drivers.
This gets rid of the lguest bus, drivers and DMA mechanism, to make
way for a generic virtio mechanism.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23 15:49:55 +10:00
Rusty Russell
0a8a69dd77 Virtio helper routines for a descriptor ringbuffer implementation
These helper routines supply most of the virtqueue_ops for hypervisors
which want to use a ring for virtio.  Unlike the previous lguest
implementation:

1) The rings are variable sized (2^n-1 elements).
2) They have an unfortunate limit of 65535 bytes per sg element.
3) The page numbers are always 64 bit (PAE anyone?)
4) They no longer place used[] on a separate page, just a separate
   cacheline.
5) We do a modulo on a variable.  We could be tricky if we cared.
6) Interrupts and notifies are suppressed using flags within the rings.

Users need only get the ring pages and provide a notify hook (KVM
wants the guest to allocate the rings, lguest does it sanely).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dor Laor <dor.laor@qumranet.com>
2007-10-23 15:49:55 +10:00
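Points 5 and 6 of the "Virtio helper routines" message above (a modulo on a variable ring size, and suppression through flags kept in the ring) can be illustrated with a minimal sketch. The struct, the flag name RING_F_NO_NOTIFY and both helpers are hypothetical and are not taken from the patch itself.

```c
#include <stdint.h>

/* Hypothetical flag: the other side asks not to be notified for now. */
#define RING_F_NO_NOTIFY 0x1u

/* Hypothetical, simplified ring bookkeeping (not the kernel's layout). */
struct ring_state {
    unsigned num;    /* number of slots; need not be a power of two */
    unsigned next;   /* next slot to fill */
    uint16_t flags;  /* written by the other side of the ring */
};

/* Return the slot to use and advance, wrapping with a modulo on num. */
static unsigned ring_advance(struct ring_state *r)
{
    unsigned slot = r->next;

    r->next = (r->next + 1) % r->num;
    return slot;
}

/* A notify ("kick") is only worth sending when it is not suppressed. */
static int ring_should_notify(const struct ring_state *r)
{
    return (r->flags & RING_F_NO_NOTIFY) == 0;
}
```

With num set to 3, successive calls to ring_advance() yield slots 0, 1, 2 and then 0 again.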
Rusty Russell
31610434bc Virtio console driver
This is an hvc-based virtio console driver.  It's suboptimal because
hvc expects to have raw access to interrupts and virtio doesn't assume
that, so it currently polls.

There are two solutions: expose hvc's "kick" interface, or wean off hvc.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23 15:49:55 +10:00
Rusty Russell
e467cde238 Block driver using virtio.
The block driver uses scatter-gather lists with sg[0] being the
request information (struct virtio_blk_outhdr) with the type, sector
and inbuf id.  The next N sg entries are the bio itself, then the last
sg is the status byte.  Whether the N entries are in or out depends on
whether it's a read or a write.

We accept the normal (SCSI) ioctls: they get handed through to the other
side which can then handle it or reply that it's unsupported.  It's
not clear that this actually works in general, since I don't know
if blk_pc_request() requests have an accurate rq_data_dir().

Although we try to reply -ENOTTY on unsupported commands, ioctl(fd,
CDROMEJECT) returns success to userspace.  This needs a separate
patch.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <jens.axboe@oracle.com>
2007-10-23 15:49:54 +10:00
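The scatter-gather layout described in "Block driver using virtio" above (header first, then N data entries, then one status byte) could be assembled roughly as follows. This is a sketch under assumptions: struct blk_outhdr only mirrors the three fields the message names, and struct seg and build_blk_request() are hypothetical stand-ins rather than the driver's real types.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header with the fields the message lists: type, sector, id. */
struct blk_outhdr {
    uint32_t type;
    uint64_t sector;
    uint64_t id;
};

/* Hypothetical stand-in for one scatter-gather entry. */
struct seg {
    void  *addr;
    size_t len;
};

/*
 * Fill segs[] as: [0] = request header, [1..n] = data, [n+1] = status byte.
 * Returns the number of entries used, or 0 if segs[] is too small.
 */
static size_t build_blk_request(struct seg *segs, size_t cap,
                                struct blk_outhdr *hdr,
                                const struct seg *data, size_t n,
                                uint8_t *status)
{
    size_t i;

    if (cap < n + 2)
        return 0;

    segs[0].addr = hdr;
    segs[0].len  = sizeof(*hdr);

    for (i = 0; i < n; i++)
        segs[1 + i] = data[i];

    segs[n + 1].addr = status;
    segs[n + 1].len  = 1;

    return n + 2;
}
```

Whether the N data entries count as "in" or "out" would then depend only on the direction of the request, as the message says.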
Rusty Russell
296f96fcfc Net driver using virtio
The network driver uses two virtqueues: one for input packets and one
for output packets.  This has nice locking properties (ie. we don't do
any for recv vs send).

TODO:
	1) Big packets.
	2) Multi-client devices (maybe separate driver?).
	3) Resolve freeing of old xmit skbs (Christian Borntraeger)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: netdev@vger.kernel.org
2007-10-23 15:49:54 +10:00
Rusty Russell
ec3d41c4db Virtio interface
This attempts to implement a "virtual I/O" layer which should allow
common drivers to be efficiently used across most virtual I/O
mechanisms.  It will no-doubt need further enhancement.

The virtio drivers add buffers to virtio queues; as the buffers are consumed
the driver "interrupt" callbacks are invoked.

There is also a generic implementation of config space which drivers can query
to get setup information from the host.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dor Laor <dor.laor@qumranet.com>
Cc: Arnd Bergmann <arnd@arndb.de>
2007-10-23 15:49:54 +10:00
Rusty Russell
47436aa4ad Boot with virtual == physical to get closer to native Linux.
1) This allows us to get a lot closer to booting bzImages.

2) It means we don't have to know page_offset.

3) The Guest needs to modify the boot pagetables to create the
   PAGE_OFFSET mapping before jumping to C code.

4) guest_pa() walks the page tables rather than using page_offset.

5) We don't use page_offset to figure out whether to emulate: it was
   always kinda questionable, and won't work for instructions done
   before remapping (bzImage unpacking in particular).

6) We still want the kernel address for tlb flushing: have the initial
   hypercall give us that, too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23 15:49:54 +10:00
Rusty Russell
c18acd73ff Allow guest to specify syscall vector to use.
(Based on Ron Minnich's LGUEST_PLAN9_SYSCALL patch).

This patch allows Guests to specify what system call vector they want,
and we try to reserve it.  We only allow one non-Linux system call
vector, to try to avoid DoS on the Host.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23 15:49:53 +10:00
Jes Sorensen
47aee45ae3 lguest.h declares a struct timespec, make it include linux/time.h
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23 15:49:52 +10:00
Jes Sorensen
b410e7b149 Make hypercalls arch-independent.
Clean up the hypercall code to make the code in hypercalls.c
architecture independent. First process the common hypercalls and
then call lguest_arch_do_hcall() if the call hasn't been handled.
Rename struct hcall_ring to hcall_args.

This patch requires the previous patch which reorganizes the layout of
struct lguest_regs on i386 so they match the layout of struct
hcall_args.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23 15:49:52 +10:00
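The control flow described in "Make hypercalls arch-independent" above (handle the common calls first, then fall through to an architecture hook) might look like this in outline. Only the name struct hcall_args comes from the message; the call numbers, return codes and function names are hypothetical.

```c
/* Name taken from the message; the fields here are an assumption. */
struct hcall_args {
    unsigned long arg0, arg1, arg2, arg3;
};

/* Hypothetical call numbers and result codes, for illustration only. */
enum { HCALL_COMMON_A = 0, HCALL_COMMON_B = 1, HCALL_ARCH_ONLY = 2 };
enum { HANDLED_COMMON = 1, HANDLED_ARCH = 2, NOT_HANDLED = -1 };

/* Stand-in for the architecture-specific hook. */
static int arch_do_hcall(const struct hcall_args *args)
{
    if (args->arg0 == HCALL_ARCH_ONLY)
        return HANDLED_ARCH;
    return NOT_HANDLED;
}

/* Common code: try the shared hypercalls, then defer to the arch hook. */
static int do_hcall(const struct hcall_args *args)
{
    switch (args->arg0) {
    case HCALL_COMMON_A:
    case HCALL_COMMON_B:
        return HANDLED_COMMON;
    default:
        return arch_do_hcall(args);
    }
}
```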
Rusty Russell
48245cc070 Remove fixed limit on number of guests, and lguests array.
Back when we had all the Guest state in the switcher, we had a fixed
array of them.  This is no longer necessary.

If we switch the network code to using random_ether_addr (46 bits is
enough to avoid clashes), we can get rid of the concept of "guest id"
altogether.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23 15:49:51 +10:00
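The "46 bits" in the message above follows from a 48-bit Ethernet address with two bits of the first octet fixed: the multicast bit cleared and the locally administered bit set. A minimal userspace sketch, assuming rand() as a placeholder random source (the function name random_mac is hypothetical, and this is not the kernel's random_ether_addr implementation):

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Fill addr with a random unicast, locally administered MAC address.
 * rand() is only a placeholder source of randomness for this sketch.
 */
static void random_mac(uint8_t addr[6])
{
    int i;

    for (i = 0; i < 6; i++)
        addr[i] = (uint8_t)(rand() & 0xff);

    addr[0] &= 0xfe;  /* clear bit 0: not a multicast address */
    addr[0] |= 0x02;  /* set bit 1: locally administered */
}
```

The remaining 46 bits are random, which is the space the message argues is large enough to avoid clashes.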
Jes Sorensen
c37ae93d59 Move lguest hcalls to arch-specific header
Move architecture specific portion of lg_hcall code to asm-i386/lg_hcall.h
and have it included from linux/lguest.h.

[Changed to asm-i386/lguest_hcall.h so documentation finds it -RR]

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jes Sorensen <jes@sgi.com>
2007-10-23 15:49:49 +10:00
Rusty Russell
b45d8cb054 Make lguest_launcher.h types userspace-friendly
lguest_launcher.h uses "u32" not "__u32", which sets a bad example.  Fix that,
and include <linux/types.h>.

This means we need to use -I on the Launcher build line so types.h is found.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23 15:49:49 +10:00
Rusty Russell
ee8e7cfe9d Make asm-x86/bootparam.h includable from userspace.
To actually write a bootloader (or, say, the lguest launcher)
currently requires duplication of these structures.  Making them
includable from userspace is much nicer.

We merge the common userspace-required definitions of e820_32/64.h
into e820.h for export.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23 15:49:47 +10:00
David Miller
e95d9c6b04 Expand hwif->host_flags so that it fits new flags.
Commit 238e4f142c ("ide: add
IDE_HFLAG_NO_LBA48 and IDE_HFLAG_NO_LBA48_DMA host flags") caused a
regression because the host_flags in struct hwif_s wasn't expanded to
cope with the fact that the host flags no longer fit in 16 bits.

Signed-off-by: David S. Miller <davem@davemloft.net>
[ I hate having to add good commit descriptions.  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 19:35:14 -07:00
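The regression described in "Expand hwif->host_flags" above is the usual effect of storing a flag above bit 15 in a 16-bit field. A small sketch, assuming a hypothetical flag in bit 16 and two hypothetical structs in place of the real struct hwif_s:

```c
#include <stdint.h>

/* Hypothetical flag that needs more than 16 bits. */
#define HFLAG_NEW (1u << 16)

struct hwif_narrow { uint16_t host_flags; };  /* too small for HFLAG_NEW */
struct hwif_wide   { uint32_t host_flags; };  /* expanded field */

/* Returns 1 if the flag survives being stored in the 16-bit field. */
static int narrow_keeps_flag(void)
{
    struct hwif_narrow h;

    h.host_flags = (uint16_t)HFLAG_NEW;  /* bit 16 is truncated away */
    return (h.host_flags & HFLAG_NEW) != 0;
}

/* Returns 1 if the flag survives being stored in the 32-bit field. */
static int wide_keeps_flag(void)
{
    struct hwif_wide h;

    h.host_flags = HFLAG_NEW;
    return (h.host_flags & HFLAG_NEW) != 0;
}
```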
Linus Torvalds
81f8320f62 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: appletouch - apply idle reset logic to all touchpads
  Input: usbtouchscreen - add support for GoTop tablet devices
  Input: bf54x-keys - return real error when request_irq() fails
  Input: i8042 - export i8042_command()
2007-10-22 19:29:58 -07:00
Linus Torvalds
f09cc910fe Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits)
  [IPSEC] IPV6: Fix to add tunnel mode SA correctly.
  [NET]: Cut off the queue_mapping field from sk_buff
  [NET]: Hide the queue_mapping field inside netif_subqueue_stopped
  [NET]: Make and use skb_get_queue_mapping
  [NET]: Use the skb_set_queue_mapping where appropriate
  [INET]: Use MODULE_ALIAS_NET_PF_PROTO_TYPE where possible.
  [INET]: Let inet_diag and friends autoload
  [NIU]: Cleanup PAGE_SIZE checks a bit
  [NET]: Fix SKB_WITH_OVERHEAD calculation
  [ATM]: Fix clip module reload crash.
  [TG3]: Update version to 3.85
  [TG3]: PCI command adjustment
  [TG3]: Add management FW version to ethtool report
  [TG3]: Add 5723 support
  [Bluetooth] Convert RFCOMM to use kthread API
  [Bluetooth] Add constant for Bluetooth socket options level
  [Bluetooth] Add support for handling simple eSCO links
  [Bluetooth] Add address and channel attribute to RFCOMM TTY device
  [Bluetooth] Fix wrong argument in debug code of HIDP
  [Bluetooth] Add generic driver for Bluetooth USB devices
  ...
2007-10-22 19:22:33 -07:00
Linus Torvalds
ad792f4f46 Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb
* 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (37 commits)
  V4L/DVB (6382): saa7134: fix NULL dereference at suspend time for cards without IR receiver
  V4L/DVB (6380): ivtvfb: Removal of the 'osd_compat' module option
  V4L/DVB (6379): patch which improves GotView Saa7135 remote control
  V4L/DVB (6378b): Updates info about the removal of V4L1 at feature-removal-schedule.txt
  V4L/DVB (6378a): Removal of VIDIOC_[G|S]_MPEGCOMP from feature-removal-schedule.txt
  V4L/DVB (6378): DiB0700-device: Using 1.10 firmware
  V4L/DVB (6357): pvrusb2: Improve encoder chip health tracking
  V4L/DVB (6356): "while (!ca->wakeup)" breaks the CAM initialisation
  V4L/DVB (6352): ir-kbd-i2c: Missing break statement
  V4L/DVB (6350): V4L: possible leak in em28xx_init_isoc
  V4L/DVB (6348): ivtv: undo video mute when closing the radio
  V4L/DVB (6347): ivtv: fix video mute when radio is used
  V4L/DVB (6346): ivtvfb: YUV output size fix when ivtvfb is not loaded
  V4L/DVB (6345): ivtvfb: YUV handling of an image which is not visible in the display area
  V4L/DVB (6343): ivtvfb: check return value of unregister_framebuffer
  V4L/DVB (6342): ivtv: fix circular locking (bug 9037)
  V4L/DVB (6341): ivtv: fix resizing MPEG1 streams
  V4L/DVB (6340): ivtvfb: screen mode change sometimes goes wrong
  V4L/DVB (6339): ivtv: set the video color to black instead of green when capturing from the radio
  V4L/DVB (6338): ivtv: fix incorrect EBUSY return
  ...
2007-10-22 19:20:22 -07:00
Linus Torvalds
69450bb5eb Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block
* 'sg' of git://git.kernel.dk/linux-2.6-block:
  Add CONFIG_DEBUG_SG sg validation
  Change table chaining layout
  Update arch/ to use sg helpers
  Update swiotlb to use sg helpers
  Update net/ to use sg helpers
  Update fs/ to use sg helpers
  [SG] Update drivers to use sg helpers
  [SG] Update crypto/ to sg helpers
  [SG] Update block layer to use sg helpers
  [SG] Add helpers for manipulating SG entries
2007-10-22 19:11:06 -07:00
Jens Axboe
d6ec084200 Add CONFIG_DEBUG_SG sg validation
Add a Kconfig entry which will toggle some sanity checks on the sg
entry and tables.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-22 21:20:03 +02:00
Jens Axboe
18dabf473e Change table chaining layout
Change the page member of the scatterlist structure to be an unsigned
long, and encode more stuff in the lower bits:

- Bits 0 and 1 zero: this is a normal sg entry. Next sg entry is located
  at sg + 1.
- Bit 0 set: this is a chain entry, the next real entry is at ->page_link
  with the two low bits masked off.
- Bit 1 set: this is the final entry in the sg entry. sg_next() will return
  NULL when passed such an entry.

It's thus important that sg table users use the proper accessors to get
and set the page member.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-22 21:20:01 +02:00
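The low-bit encoding described in "Change table chaining layout" above can be sketched in plain C. This is an illustration under assumptions, not the kernel's scatterlist code: struct sg_entry, the macro names and the helpers are hypothetical, and entries are assumed to be at least 4-byte aligned so the two low bits of a pointer are free.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scatterlist entry. */
struct sg_entry {
    uintptr_t page_link;  /* pointer value with flags in the two low bits */
};

#define SG_CHAIN 0x1UL  /* bit 0: entry is a link to another table */
#define SG_END   0x2UL  /* bit 1: entry is the last one in the list */

static int sg_is_chain(const struct sg_entry *sg)
{
    return (sg->page_link & SG_CHAIN) != 0;
}

static int sg_is_last(const struct sg_entry *sg)
{
    return (sg->page_link & SG_END) != 0;
}

/* Turn slot into a chain entry pointing at next_table. */
static void sg_chain_to(struct sg_entry *slot, struct sg_entry *next_table)
{
    slot->page_link = (uintptr_t)next_table | SG_CHAIN;
}

/* Follow the rules from the message: NULL after the end, hop over chains. */
static struct sg_entry *sg_next_entry(struct sg_entry *sg)
{
    if (sg_is_last(sg))
        return NULL;

    sg++;  /* normal case: the next entry is at sg + 1 */
    if (sg_is_chain(sg))
        sg = (struct sg_entry *)(sg->page_link & ~(uintptr_t)0x3);

    return sg;
}
```

Because the flags share storage with the pointer, code that reads page_link directly would see a corrupted address, which is why the message insists on accessors.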
Christoph Hellwig
e38f981758 exportfs: update documentation
Update documentation to the current state of affairs.  Remove duplicated
method descriptions in exportfs.h and point to Documentation/filesystems/
Exporting instead.  Add a little file header comment in expfs.c describing
what's going on and mentioning Neil's and my copyright [1].

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: David Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:21 -07:00
Christoph Hellwig
3965516440 exportfs: make struct export_operations const
Now that nfsd has stopped writing to the find_exported_dentry member we can
mark the export_operations const.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: David Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:21 -07:00
Christoph Hellwig
cfaea787c0 exportfs: remove old methods
Now that all filesystems are converted remove support for the old methods.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: David Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:21 -07:00
Christoph Hellwig
be55caf177 reiserfs: new export ops
Another nice little cleanup by using the new methods.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:20 -07:00
Christoph Hellwig
05da080482 efs: new export ops
Trivial switch over to the new generic helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:20 -07:00
Christoph Hellwig
2596110a39 exportfs: add new methods
Add the guts for the new filesystem API to exportfs.

There's now a fh_to_dentry method that returns a dentry for the object looked
for given a filehandle fragment, and a fh_to_parent operation that returns the
dentry for the encoded parent directory in case the file handle contains it.

There are default implementations for these methods that only take a callback
for an nfs-enhanced iget variant and implement the rest of the semantics.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: David Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:19 -07:00
Christoph Hellwig
6e91ea2bb0 exportfs: add fid type
This patchset is a medium scale rewrite of the export operations interface.
The goal is to make the interface less complex, and easier to understand from
the filesystem side, as well as preparing generic support for exporting of
64bit inode numbers.

This touches all nfs exporting filesystems, and I've done testing on all of
the filesystems I have here locally (xfs, ext2, ext3, reiserfs, jfs)

This patch:

Add a structured fid type so that we don't have to pass an array of u32 values
around everywhere.  It's a union of possible layouts.

As a start there's only the u32 array and the traditional 32bit inode format,
but there will be more in one of my next patchsets when I start to document the
various filehandle formats we have in lowlevel filesystems better.

Also add an enum that gives the various filehandle types human-readable
names.

Note: Some people might think the struct containing an anonymous union is
ugly, but I didn't want to pass around a raw union type.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: David Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:19 -07:00
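The "struct containing an anonymous union" from the "exportfs: add fid type" message above can be pictured as follows. The field names, the array length and the enum values are assumptions made for illustration (the message only says there is a u32 array and a traditional 32-bit inode layout), and anonymous unions require C11 or a compiler extension.

```c
#include <stdint.h>

/* Hypothetical fid: one struct wrapping a union of possible layouts. */
struct fid {
    union {
        struct {
            uint32_t ino;
            uint32_t gen;
            uint32_t parent_ino;
            uint32_t parent_gen;
        } i32;            /* traditional 32-bit inode format */
        uint32_t raw[4];  /* the same storage viewed as a u32 array */
    };
};

/* Hypothetical human-readable names for filehandle types. */
enum fid_type {
    FID_INO32_GEN = 1,
    FID_INO32_GEN_PARENT = 2
};

/* Encode an inode/generation pair without parent information. */
static enum fid_type fid_encode_ino32(struct fid *fid,
                                      uint32_t ino, uint32_t gen)
{
    fid->i32.ino = ino;
    fid->i32.gen = gen;
    fid->i32.parent_ino = 0;
    fid->i32.parent_gen = 0;
    return FID_INO32_GEN;
}
```

Passing struct fid around gives callers a named type while still allowing raw u32 access where a filesystem needs it.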
Bernhard Walle
00bf4098be kexec: add BSS to resource tree
Add the BSS to the resource tree just as kernel text and kernel data are in
the resource tree.  The main reason behind this is to avoid crashkernel
reservation in that area.

While it's not strictly necessary to have the BSS in the resource tree (the
actual collision detection is done in the reserve_bootmem() function before),
the usage of the BSS resource should be presented to the user in /proc/iomem
just as Kernel data and Kernel code.

Note: The patch currently is only implemented for x86 and ia64 (because
efi_initialize_iomem_resources() has the same signature on i386 and ia64).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: <linux-arch@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:19 -07:00
Keshavamurthy, Anil S
3460a6d9ce Intel IOMMU: DMAR fault handling support
MSI interrupt handler registrations and fault handling support for Intel-IOMMU
hardware.

This patch enables the MSI interrupts for the DMA remapping units; the
interrupt handler reads the fault cause and outputs it to the console.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:19 -07:00
Keshavamurthy, Anil S
ba39592764 Intel IOMMU: Intel IOMMU driver
Actual intel IOMMU driver.  Hardware spec can be found at:
http://www.intel.com/technology/virtualization

This driver sets X86_64 'dma_ops', so hook into standard DMA APIs.  In this
way, PCI driver will get virtual DMA address.  This change is transparent to
PCI drivers.

[akpm@linux-foundation.org: remove unneeded cast]
[akpm@linux-foundation.org: build fix]
[bunk@stusta.de: fix duplicate CONFIG_DMAR Makefile line]
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Keshavamurthy, Anil S
994a65e25d Intel IOMMU: PCI generic helper function
When devices are under a p2p bridge, upstream transactions get replaced by the
device id of the bridge as it owns the PCIE transaction.  Hence it's necessary
to set up translations on behalf of the bridge as well.  Due to this limitation
all devices under a p2p share the same domain in a DMAR.

We just cache the type of device (whether it's a native PCIe device
or not) for later use.

[akpm@linux-foundation.org: BUG_ON -> WARN_ON+recover]
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Keshavamurthy, Anil S
10e5247f40 Intel IOMMU: DMAR detection and parsing logic
This patch supports the upcoming Intel IOMMU hardware a.k.a.  Intel(R)
Virtualization Technology for Directed I/O Architecture and the hardware spec
for the same can be found here
http://www.intel.com/technology/virtualization/index.htm

FAQ! (questions from akpm, answers from ak)

> So...  what's all this code for?
>
> I assume that the intent here is to speed things up under Xen, etc?

Yes in some cases, but not this code.  That would be the Xen version of this
code that could potentially assign whole devices to guests.  I expect this to
be only useful in some special cases though because most hardware is not
virtualizable and you typically want an own instance for each guest.

OK, at some point KVM might implement this too; it would likely use this code
for this.

> Do we
> have any benchmark results to help us to decide whether a merge would be
> justified?

The main advantage for doing it in the normal kernel is not performance, but
more safety.  Broken devices won't be able to corrupt memory by doing random
DMA.

Unfortunately that doesn't work for graphics yet; for that, user space
interfaces for the X server are needed.

There are some potential performance benefits too:

- When you have a device that cannot address the complete address range an
  IOMMU can remap its memory instead of bounce buffering.  Remapping is likely
  cheaper than copying.

- The IOMMU can merge sg lists into a single virtual block.  This could
  potentially speed up SG IO when the device is slow walking SG lists.  [I
  long ago benchmarked 5% on some block benchmark with an old MPT Fusion; but
  it probably depends a lot on the HBA]

And you get better driver debugging because unexpected memory accesses from
the devices will cause a trappable event.

>
> Does it slow anything down?

It adds more overhead to each IO so yes.

This patch:

Add support for early detection and parsing of DMAR's (DMA Remapping) reported
to OS via ACPI tables.

DMA remapping (DMAR) device support enables independent address translations
for Direct Memory Access (DMA) from devices.  These DMA remapping devices are
reported via ACPI tables, which include the PCI device scope covered by each
DMA remapping device.

For detailed info on the specification of "Intel(R) Virtualization Technology
for Directed I/O Architecture" please see
http://www.intel.com/technology/virtualization/index.htm

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Jan Kara
89910cccb8 ext2: avoid rec_len overflow with 64KB block size
With 64KB blocksize, a directory entry can have size 64KB which does not
fit into 16 bits we have for entry length.  So we store 0xffff instead and
convert the value when read from / written to disk.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
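The conversion described in "ext2: avoid rec_len overflow" above amounts to reserving 0xffff as a stand-in for 65536. A minimal sketch, assuming hypothetical helper names and ignoring the little-endian byte-order conversion that real on-disk access would also need:

```c
#include <stdint.h>

#define REC_LEN_64K      (1u << 16)  /* real length that does not fit in 16 bits */
#define REC_LEN_SENTINEL 0xffffu     /* value stored in the 16-bit field instead */

/* Convert the 16-bit stored value to the real record length. */
static unsigned rec_len_from_disk(uint16_t dlen)
{
    if (dlen == REC_LEN_SENTINEL)
        return REC_LEN_64K;
    return dlen;
}

/* Convert a real record length to the 16-bit stored value. */
static uint16_t rec_len_to_disk(unsigned len)
{
    if (len == REC_LEN_64K)
        return REC_LEN_SENTINEL;
    return (uint16_t)len;
}
```

The sentinel is safe only if 0xffff can never be a genuine record length; this sketch assumes record lengths are always multiples of 4, so an odd value like 0xffff is otherwise unused.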
Serge E. Hallyn
b68680e473 capabilities: clean up file capability reading
Simplify the vfs_cap_data structure.

Also fix get_file_caps, which was declaring
__le32 v1caps[XATTR_CAPS_SZ] on the stack, even though
XATTR_CAPS_SZ is already multiplied by sizeof(__le32).
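
To illustrate why that declaration over-allocates, here is a small userspace
sketch.  The word count of 3 and the names CAPS_WORDS and CAPS_SZ are
assumptions made for the example; only the pattern (a size macro that is
already a byte count being used as an element count) comes from the
description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration: CAPS_SZ is already a size in bytes. */
#define CAPS_WORDS 3
#define CAPS_SZ    (CAPS_WORDS * sizeof(uint32_t))

/* Using the byte count as an element count multiplies by the element size twice. */
static size_t oversized_bytes(void)
{
    return sizeof(uint32_t[CAPS_SZ]);
}

/* Using the word count as the element count gives the intended size. */
static size_t intended_bytes(void)
{
    return sizeof(uint32_t[CAPS_WORDS]);
}
```

Under these assumptions the first array takes four times the stack space of
the second.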

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Andrew Morgan <morgan@kernel.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Yasunori Goto
b9049e2344 memory hotplug: make kmem_cache_node for SLUB on memory online avoid panic
Fix a panic caused by dereferencing a NULL kmem_cache_node pointer in
discard_slab() after memory online.

When memory is onlined, kmem_cache_nodes are created in every SLUB cache
for a new node whose memory becomes available.

slab_mem_going_online_callback() is called to create the kmem_cache_node in
the callback for the memory online event.  If it (or another callback) fails,
then slab_mem_offline_callback() is called for rollback.

In memory offline, slab_mem_going_offline_callback() is called to shrink
all SLUB caches, then slab_mem_offline_callback() is called later.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: locking fix]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:17 -07:00
Yasunori Goto
7b78d335ac memory hotplug: rearrange memory hotplug notifier
The current memory notifier still has some defects.  (Fortunately, nothing
uses it.)  This patch fixes them and rearranges the notifier.

  - Pass start_pfn, nr_pages, and the node id (if the node's status
    changes from/to a memoryless node) to callback functions.
    Callbacks can't do anything without that information.
  - Add a going-online notification.
    It is necessary for creating per-node structures before the node's
    pages are available.
  - Move the GOING_OFFLINE notification after page isolation.
    That is a good place for callbacks to return memory such as caches,
    because returned pages are not used again.
  - Add CANCEL events for rolling back when an error occurs.
  - Delete the MEM_MAPPING_INVALID notification.  It will not be used.
  - Fix a compile error in (un)register_memory_notifier().
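
The ordering of the online events listed above can be sketched in userspace
roughly as follows.  This is an illustration under stated assumptions, not the
kernel code: the enum, the struct, and every function name here are
hypothetical stand-ins, and the real notifier chain API is not used.

```c
#include <stddef.h>

/* Event names mirror the statuses in the list above; values are arbitrary. */
enum mem_event { GOING_ONLINE, ONLINE, CANCEL_ONLINE };

/* Hypothetical argument block carrying the information the list mentions. */
struct mem_notify_arg {
    unsigned long start_pfn;
    unsigned long nr_pages;
    int nid;                 /* node id, or -1 if the node's status is unchanged */
};

typedef int (*mem_callback)(enum mem_event ev, const struct mem_notify_arg *arg);

/* Deliver one event to every callback; stop at the first failure. */
static int notify_all(mem_callback *cbs, size_t n, enum mem_event ev,
                      const struct mem_notify_arg *arg)
{
    for (size_t i = 0; i < n; i++) {
        int err = cbs[i](ev, arg);
        if (err)
            return err;
    }
    return 0;
}

/* GOING_ONLINE first; on failure send CANCEL_ONLINE, otherwise ONLINE. */
static int online_pages_sketch(mem_callback *cbs, size_t n,
                               const struct mem_notify_arg *arg)
{
    int err = notify_all(cbs, n, GOING_ONLINE, arg);
    if (err) {
        notify_all(cbs, n, CANCEL_ONLINE, arg);   /* roll back */
        return err;
    }
    /* ... the pages would be made available here ... */
    return notify_all(cbs, n, ONLINE, arg);
}

/* Example callbacks: one records the events it sees, one refuses to go online. */
static enum mem_event seen[8];
static size_t seen_len;

static int record_cb(enum mem_event ev, const struct mem_notify_arg *arg)
{
    (void)arg;
    if (seen_len < 8)
        seen[seen_len++] = ev;
    return 0;
}

static int failing_cb(enum mem_event ev, const struct mem_notify_arg *arg)
{
    (void)arg;
    return ev == GOING_ONLINE ? -1 : 0;
}
```

In this sketch, a callback that allocated per-node structures on GOING_ONLINE
would release them again when it sees CANCEL_ONLINE.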

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:17 -07:00
Rusty Russell
214541d1f3 add WEAK() for creating weak asm labels
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:17 -07:00
Jens Axboe
82f66fbef5 [SG] Add helpers for manipulating SG entries
We can then transition drivers without changing the generated code.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-22 17:07:37 +02:00
Hans Verkuil
3bcc95760c V4L/DVB (6321): Remove obsolete VIDIOC_S/G_MPEGCOMP ioctls
Remove the obsolete VIDIOC_G_MPEGCOMP and VIDIOC_S_MPEGCOMP ioctls from
the V4L2 API as per the removal schedule (October 2007).

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2007-10-22 12:01:30 -02:00
Mauro Carvalho Chehab
22c4a4e98e V4L/DVB (6320): v4l core: remove the unused .hardware V4L1 field
struct video_device used to define a .hardware field. While
initialized in several drivers, this field is never used inside V4L.
However, drivers using it need to include the old V4L1 header.

This seems to cause compilation troubles with some random configs.
Better just to remove it from all drivers.

Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2007-10-22 12:01:24 -02:00
Pavel Emelyanov
e3fa259bcb [NET]: Cut off the queue_mapping field from sk_buff
Just hide it behind the #ifdef, because nobody wants
it now.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-22 02:59:57 -07:00
Pavel Emelyanov
668f895a85 [NET]: Hide the queue_mapping field inside netif_subqueue_stopped
Many places get the queue_mapping field from skb to pass it to the
netif_subqueue_stopped() which will be 0 in any case.

Make a helper that works with the sk_buff.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-22 02:59:56 -07:00