Commit Graph

2064 Commits

Author SHA1 Message Date
Evgeniy Polyakov
7672d0b544 [NET]: Add netlink connector.
Kernel connector - new userspace <-> kernel space easy to use
communication module which implements easy to use bidirectional
message bus using netlink as it's backend.  Connector was created to
eliminate complex skb handling both in send and receive message bus
direction.

Connector driver adds possibility to connect various agents using as
one of it's backends netlink based network.  One must register
callback and identifier. When driver receives special netlink message
with appropriate identifier, appropriate callback will be called.

From the userspace point of view it's quite straightforward:

	socket();
	bind();
	send();
	recv();

But if kernelspace want to use full power of such connections, driver
writer must create special sockets, must know about struct sk_buff
handling...  Connector allows any kernelspace agents to use netlink
based networking for inter-process communication in a significantly
easier way:

int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *));
void cn_netlink_send(struct cn_msg *msg, u32 __groups, int gfp_mask);

struct cb_id
{
	__u32			idx;
	__u32			val;
};

idx and val are unique identifiers which must be registered in
connector.h for in-kernel usage.  void (*callback) (void *) - is a
callback function which will be called when message with above idx.val
will be received by connector core.

Using connector completely hides low-level transport layer from it's
users.

Connector uses new netlink ability to have many groups in one socket.

[ Incorporating many cleanups and fixes by myself and
  Andrew Morton -DaveM ]

Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-11 19:15:07 -07:00
Tony Luck
d67eb16f5d Pull sn-features into release branch 2005-09-11 14:34:23 -07:00
Keith Owens
49a28cc8fd [IA64] MCA/INIT: remove obsolete unwind code
Delete the special case unwind code that was only used by the old
MCA/INIT handler.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-11 14:09:34 -07:00
Keith Owens
7f613c7d22 [PATCH] MCA/INIT: use per cpu stacks
The bulk of the change.  Use per cpu MCA/INIT stacks.  Change the SAL
to OS state (sos) to be per process.  Do all the assembler work on the
MCA/INIT stacks, leaving the original stack alone.  Pass per cpu state
data to the C handlers for MCA and INIT, which also means changing the
mca_drv interfaces slightly.  Lots of verification on whether the
original stack is usable before converting it to a sleeping process.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-11 14:08:41 -07:00
Keith Owens
e619ae0b96 [IA64] MCA/INIT: add an extra thread_info flag
Add an extra thread_info flag to indicate the special MCA/INIT stacks.
Mainly for debuggers.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-11 14:02:10 -07:00
Keith Owens
a2a979821b [PATCH] MCA/INIT: scheduler hooks
Scheduler hooks to see/change which process is deemed to be on a cpu.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-09-11 14:01:30 -07:00
Linus Torvalds
9fe66dfd88 Merge master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband 2005-09-11 10:16:07 -07:00
Linus Torvalds
2f79f458d2 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2005-09-10 17:42:47 -07:00
Al Viro
d3fd4c2d48 [PATCH] uml spinlock breakage
mingo missed that one...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 16:50:01 -07:00
Linus Torvalds
7f93220b62 Merge master.kernel.org:/pub/scm/linux/kernel/git/dtor/input 2005-09-10 15:54:41 -07:00
Paolo 'Blaisorblade' Giarrusso
d99c4022f6 [PATCH] uml: inline mk_pte and various friends
Turns out that, for UML, a *lot* of VM-related trivial functions are not
inlined but rather normal functions.

In other sections of UML code, this is justified by having files which
interact with the host and cannot therefore include kernel headers, but in
this case there's no such justification.

I've had to turn many of them to macros because of missing declarations. While
doing this, I've decided to reuse some already existing macros.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 12:00:18 -07:00
Paolo 'Blaisorblade' Giarrusso
a7d0c21033 [PATCH] i386 / uml: add dwarf sections to static link script
Inside the linker script, insert the code for DWARF debug info sections. This
may help GDB'ing a Uml binary. Actually, it seems that ld is able to guess
what I added correctly, but normal linker scripts include this section so it
should be correct anyway adding it.

On request by Sam Ravnborg <sam@ravnborg.org>, I've added it to
asm-generic/vmlinux.lds.s. I've also moved there the stabs debug section,
used the new macro in i386 linker script and added DWARF debug section to
that.

In the truth, I've not been able to verify the difference in GDB behaviour
after this change (I've seen large improvements with another patch). This
may depend on my binutils version, older one may have worse defaults.

However, this section is present in normal linker script, so add it at
least for the sake of cleanness.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 12:00:17 -07:00
David S. Miller
2a04451581 Merge davem@outer-richmond.davemloft.net:src/GIT/net-2.6/ 2005-09-10 11:01:33 -07:00
Linus Torvalds
abf914208a Merge master.kernel.org:/home/rmk/linux-2.6-arm 2005-09-10 10:16:47 -07:00
viro@ZenIV.linux.org.uk
fe08ac3178 [PATCH] __user annotations (scsi/ch)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:16:27 -07:00
Paul Mackerras
31139971b3 [PATCH] ppc32: support hotplug cpu on powermacs
This allows cpus to be off-lined on 32-bit SMP powermacs.  When a cpu
is off-lined, it is put into sleep mode with interrupts disabled.  It
can be on-lined again by asserting its soft-reset pin, which is
connected to a GPIO pin.

With this I can off-line the second cpu in my dual G4 powermac, which
means that I can then suspend the machine (the suspend/resume code
refuses to suspend if more than one cpu is online, and making it cope
with multiple cpus is surprisingly messy).

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:15:11 -07:00
Paul Mackerras
bb0bb3b659 [PATCH] ppc32: Kill init on unhandled synchronous signals
This is a patch that I have had in my tree for ages.  If init causes
an exception that raises a signal, such as a SIGSEGV, SIGILL or
SIGFPE, and it hasn't registered a handler for it, we don't deliver
the signal, since init doesn't get any signals that it doesn't have a
handler for.  But that means that we just return to userland and
generate the same exception again immediately.  With this patch we
print a message and kill init in this situation.

This is very useful when you have a bug in the kernel that means that
init doesn't get as far as executing its first instruction. :)
Without this patch the system hangs when it gets to starting the
userland init; with it you at least get a message giving you a clue
about what has gone wrong.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:15:11 -07:00
Andrew Morton
373016e9e1 [PATCH] time.h: remove ifdefs
Remove these ifdefs - there's no need to have more than one definition of
these multipliers anywhere.

Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:36 -07:00
Nishanth Aravamudan
84f902c090 [PATCH] include: update jiffies/{m,u}secs conversion functions
Clarify the human-time units to jiffies conversion functions by using the
constants in time.h.  This makes many of the subsequent patches direct
copies of the current code.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:36 -07:00
Nishanth Aravamudan
64ed93a268 [PATCH] add schedule_timeout_{,un}interruptible() interfaces
Add schedule_timeout_{,un}interruptible() interfaces so that
schedule_timeout() callers don't have to worry about forgetting to add the
set_current_state() call beforehand.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:36 -07:00
Adrian Bunk
672289e9fa [PATCH] i386/x86_64: make get_cpu_vendor() static
get_cpu_vendor() no longer has any users in other files.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:35 -07:00
Adrian Bunk
c2d08dade7 [PATCH] include/linux/bio.h: "extern inline" -> "static inline"
"extern inline" doesn't make much sense.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:35 -07:00
Adrian Bunk
9adeb1b409 [PATCH] "extern inline" -> "static inline"
"extern inline" doesn't make much sense.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:35 -07:00
Adrian Bunk
2befb9e36d [PATCH] include/linux/blkdev.h: "extern inline" -> "static inline"
"extern inline" doesn't make much sense.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:34 -07:00
Adrian Bunk
e2afe67453 [PATCH] include/asm-i386/: "extern inline" -> "static inline"
"extern inline" doesn't make much sense.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:34 -07:00
Adrian Bunk
f8e38dde33 [PATCH] include/asm-arm26/hardirq.h: remove #define irq_enter()
This patch removes a #define for irq_enter that is superfluous due to a
similar one in include/linux/hardirq.h.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:30 -07:00
Victor Fusco
3a11ec5e50 [PATCH] dmapool: Fix "nocast type" warnings
Fix the sparse warning "implicit cast to nocast type"

Signed-off-by: Victor Fusco <victor@cetuc.puc-rio.br>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:29 -07:00
Victor Fusco
00b61f5192 [PATCH] lib/radix-tree: Fix "nocast type" warnings
Fix the sparse warning "implicit cast to nocast type"

Signed-off-by: Victor Fusco <victor@cetuc.puc-rio.br>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:28 -07:00
Victor Fusco
b2d550736f [PATCH] mm/slab: fix sparse warnings
Fix the sparse warning "implicit cast to nocast type"

Signed-off-by: Victor Fusco <victor@cetuc.puc-rio.br>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:26 -07:00
Adrian Bunk
5ce7852cdf [PATCH] mm/filemap.c: make two functions static
With Nick Piggin <npiggin@suse.de>

Give some things static scope.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:25 -07:00
Ingo Molnar
d79fc0fc66 [PATCH] sched: TASK_NONINTERACTIVE
This patch implements a task state bit (TASK_NONINTERACTIVE), which can be
used by blocking points to mark the task's wait as "non-interactive".  This
does not mean the task will be considered a CPU-hog - the wait will simply
not have an effect on the waiting task's priority - positive or negative
alike.  Right now only pipe_wait() will make use of it, because it's a
common source of not-so-interactive waits (kernel compilation jobs, etc.).

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:22 -07:00
Paul Jackson
4247bdc600 [PATCH] cpuset semaphore depth check deadlock fix
The cpusets-formalize-intermediate-gfp_kernel-containment patch
has a deadlock problem.

This patch was part of a set of four patches to make more
extensive use of the cpuset 'mem_exclusive' attribute to
manage kernel GFP_KERNEL memory allocations and to constrain
the out-of-memory (oom) killer.

A task that is changing cpusets in particular ways on a system
when it is very short of free memory could double trip over
the global cpuset_sem semaphore (get the lock and then deadlock
trying to get it again).

The second attempt to get cpuset_sem would be in the routine
cpuset_zone_allowed().  This was discovered by code inspection.
I can not reproduce the problem except with an artifically
hacked kernel and a specialized stress test.

In real life you cannot hit this unless you are manipulating
cpusets, and are very unlikely to hit it unless you are rapidly
modifying cpusets on a memory tight system.  Even then it would
be a rare occurence.

If you did hit it, the task double tripping over cpuset_sem
would deadlock in the kernel, and any other task also trying
to manipulate cpusets would deadlock there too, on cpuset_sem.
Your batch manager would be wedged solid (if it was cpuset
savvy), but classic Unix shells and utilities would work well
enough to reboot the system.

The unusual condition that led to this bug is that unlike most
semaphores, cpuset_sem _can_ be acquired while in the page
allocation code, when __alloc_pages() calls cpuset_zone_allowed.
So it easy to mistakenly perform the following sequence:
  1) task makes system call to alter a cpuset
  2) take cpuset_sem
  3) try to allocate memory
  4) memory allocator, via cpuset_zone_allowed, trys to take cpuset_sem
  5) deadlock

The reason that this is not a serious bug for most users
is that almost all calls to allocate memory don't require
taking cpuset_sem.  Only some code paths off the beaten
track require taking cpuset_sem -- which is good.  Taking
a global semaphore on the main code path for allocating
memory would not scale well.

This patch fixes this deadlock by wrapping the up() and down()
calls on cpuset_sem in kernel/cpuset.c with code that tracks
the nesting depth of the current task on that semaphore, and
only does the real down() if the task doesn't hold the lock
already, and only does the real up() if the nesting depth
(number of unmatched downs) is exactly one.

The previous required use of refresh_mems(), anytime that
the cpuset_sem semaphore was acquired and the code executed
while holding that semaphore might try to allocate memory, is
no longer required.  Two refresh_mems() calls were removed
thanks to this.  This is a good change, as failing to get
all the necessary refresh_mems() calls placed was a primary
source of bugs in this cpuset code.  The only remaining call
to refresh_mems() is made while doing a memory allocation,
if certain task memory placement data needs to be updated
from its cpuset, due to the cpuset having been changed behind
the tasks back.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:21 -07:00
Ingo Molnar
fb1c8f93d8 [PATCH] spinlock consolidation
This patch (written by me and also containing many suggestions of Arjan van
de Ven) does a major cleanup of the spinlock code.  It does the following
things:

 - consolidates and enhances the spinlock/rwlock debugging code

 - simplifies the asm/spinlock.h files

 - encapsulates the raw spinlock type and moves generic spinlock
   features (such as ->break_lock) into the generic code.

 - cleans up the spinlock code hierarchy to get rid of the spaghetti.

Most notably there's now only a single variant of the debugging code,
located in lib/spinlock_debug.c.  (previously we had one SMP debugging
variant per architecture, plus a separate generic one for UP builds)

Also, i've enhanced the rwlock debugging facility, it will now track
write-owners.  There is new spinlock-owner/CPU-tracking on SMP builds too.
All locks have lockup detection now, which will work for both soft and hard
spin/rwlock lockups.

The arch-level include files now only contain the minimally necessary
subset of the spinlock code - all the rest that can be generalized now
lives in the generic headers:

 include/asm-i386/spinlock_types.h       |   16
 include/asm-x86_64/spinlock_types.h     |   16

I have also split up the various spinlock variants into separate files,
making it easier to see which does what. The new layout is:

   SMP                         |  UP
   ----------------------------|-----------------------------------
   asm/spinlock_types_smp.h    |  linux/spinlock_types_up.h
   linux/spinlock_types.h      |  linux/spinlock_types.h
   asm/spinlock_smp.h          |  linux/spinlock_up.h
   linux/spinlock_api_smp.h    |  linux/spinlock_api_up.h
   linux/spinlock.h            |  linux/spinlock.h

/*
 * here's the role of the various spinlock/rwlock related include files:
 *
 * on SMP builds:
 *
 *  asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
 *                        initializers
 *
 *  linux/spinlock_types.h:
 *                        defines the generic type and initializers
 *
 *  asm/spinlock.h:       contains the __raw_spin_*()/etc. lowlevel
 *                        implementations, mostly inline assembly code
 *
 *   (also included on UP-debug builds:)
 *
 *  linux/spinlock_api_smp.h:
 *                        contains the prototypes for the _spin_*() APIs.
 *
 *  linux/spinlock.h:     builds the final spin_*() APIs.
 *
 * on UP builds:
 *
 *  linux/spinlock_type_up.h:
 *                        contains the generic, simplified UP spinlock type.
 *                        (which is an empty structure on non-debug builds)
 *
 *  linux/spinlock_types.h:
 *                        defines the generic type and initializers
 *
 *  linux/spinlock_up.h:
 *                        contains the __raw_spin_*()/etc. version of UP
 *                        builds. (which are NOPs on non-debug, non-preempt
 *                        builds)
 *
 *   (included on UP-non-debug builds:)
 *
 *  linux/spinlock_api_up.h:
 *                        builds the _spin_*() APIs.
 *
 *  linux/spinlock.h:     builds the final spin_*() APIs.
 */

All SMP and UP architectures are converted by this patch.

arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via
crosscompilers.  m32r, mips, sh, sparc, have not been tested yet, but should
be mostly fine.

From: Grant Grundler <grundler@parisc-linux.org>

  Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU).
  Builds 32-bit SMP kernel (not booted or tested).  I did not try to build
  non-SMP kernels.  That should be trivial to fix up later if necessary.

  I converted bit ops atomic_hash lock to raw_spinlock_t.  Doing so avoids
  some ugly nesting of linux/*.h and asm/*.h files.  Those particular locks
  are well tested and contained entirely inside arch specific code.  I do NOT
  expect any new issues to arise with them.

 If someone does ever need to use debug/metrics with them, then they will
  need to unravel this hairball between spinlocks, atomic ops, and bit ops
  that exist only because parisc has exactly one atomic instruction: LDCW
  (load and clear word).

From: "Luck, Tony" <tony.luck@intel.com>

   ia64 fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjanv@infradead.org>
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se>
Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:21 -07:00
Brian Haley
e6df439b89 [IPV6]: Bring Type 0 routing header in-line with rfc3542.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-10 00:15:06 -07:00
David S. Miller
3874b98c65 Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6-git-rfc3542 2005-09-10 00:10:49 -07:00
YOSHIFUJI Hideaki
dd27466df9 [IPV6]: Note values allocated for ip6_tables.
To avoid future conflicts, add a note values allocated for ip6_tables.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2005-09-10 11:32:45 +09:00
YOSHIFUJI Hideaki
9928890c1f [IPV6]: rearrange constants for new advanced API to solve conflicts.
64, 65 are already used in ip6_tables.
Pointed out by Patrick McHardy <kaber@trash.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2005-09-10 11:26:34 +09:00
John Kingman
354ba39cf9 [PATCH] IB CM: support CM redir
Changes to CM to support CM and port redirection (REJ reason 24).

Signed-off-by: John Kingman <kingman <at> storagegear.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-09-09 18:23:32 -07:00
Dmitry Torokhov
d344c5e085 Manual merge with Linus 2005-09-09 20:14:47 -05:00
NeilBrown
72626685dc [PATCH] md: add write-intent-bitmap support to raid5
Most awkward part of this is delaying write requests until bitmap updates have
been flushed.

To achieve this, we have a sequence number (seq_flush) which is incremented
each time the raid5 is unplugged.

If the raid thread notices that this has changed, it flushes bitmap changes,
and assigned the value of seq_flush to seq_write.

When a write request arrives, it is given the number from seq_write, and that
write request may not complete until seq_flush is larger than the saved seq
number.

We have a new queue for storing stripes which are waiting for a bitmap flush
and an extra flag for stripes to record if the write was 'degraded' and so
should not clear the a bit in the bitmap.

Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 16:39:12 -07:00
NeilBrown
0002b2718d [PATCH] md: limit size of sb read/written to appropriate amount
version-1 superblocks are not (normally) 4K long, and can be of variable size.
 Writing the full 4K can cause corruption (but only in non-default
configurations).

With this patch the super-block-flavour can choose a size to read, and set a
size to write based on what it finds.

Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 16:39:12 -07:00
NeilBrown
773f783442 [PATCH] md: remove old cruft from md_k.h header file
These inlines haven't been used for ages, they should go.

Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 16:39:12 -07:00
NeilBrown
71c0805cb4 [PATCH] md: allow md to load a superblock with feature-bit '1' set
As this is used to flag an internal bitmap.

Also, introduce symbolic names for feature bits.

Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 16:39:11 -07:00
NeilBrown
15945fee6f [PATCH] md: support md/linear array with components greater than 2 terabytes.
linear currently uses division by the size of the smallest componenet device
to find which device a request goes to.  If that smallest device is larger
than 2 terabytes, then the division will not work on some systems.

So we introduce a pre-shift, and take care not to make the hash table too
large, much like the code in raid0.

Also get rid of conf->nr_zones, which is not needed.

Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 16:39:10 -07:00
NeilBrown
4b6d287f62 [PATCH] md: add write-behind support for md/raid1
If a device is flagged 'WriteMostly' and the array has a bitmap, and the
bitmap superblock indicates that write_behind is allowed, then write_behind is
enabled for WriteMostly devices.

Write requests will be acknowledges as complete to the caller (via b_end_io)
when all non-WriteMostly devices have completed the write, but will not be
cleared from the bitmap until all devices complete.

This requires memory allocation to make a local copy of the data being
written.  If there is insufficient memory, then we fall-back on normal write
semantics.

Signed-Off-By: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 16:39:10 -07:00
NeilBrown
8ddf9efe67 [PATCH] md: support write-mostly device in raid1
This allows a device in a raid1 to be marked as "write mostly".  Read requests
will only be sent if there is no other option.

Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 16:39:10 -07:00
NeilBrown
36fa30636f [PATCH] md: all hot-add and hot-remove of md intent logging bitmaps
Both file-bitmaps and superblock bitmaps are supported.

If you add a bitmap file on the array device, you lose.

This introduces a 'default_bitmap_offset' field in mddev, as the ioctl used
for adding a superblock bitmap doesn't have room for giving an offset.  Later,
this value will be setable via sysfs.

Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 16:39:10 -07:00
Roland Dreier
63aaf64752 Make sure that userspace does not retrieve stale asynchronous or
completion events after destroying a CQ, QP or SRQ.  We do this by
sweeping the event lists before returning from a destroy calls, and
then return the number of events already reported before the destroy
call.  This allows userspace wait until it has processed all events
for an object returned from the kernel before it frees its context for
the object.

The ABI of the destroy CQ, destroy QP and destroy SRQ commands has to
change to return the event count, so bump the ABI version from 1 to 2.
The userspace libibverbs library has already been updated to handle
both the old and new ABI versions.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-09-09 15:55:08 -07:00
Linus Torvalds
486a153f0e Merge master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild 2005-09-09 15:46:49 -07:00
Roland Dreier
2e9f7cb786 [PATCH] IB: Add struct for ClassPortInfo
Add structure definition for ClassPortInfo format.  This is
needed for (at least) handling CM redirects.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-09-09 15:45:57 -07:00