kernel-ark

Author	SHA1	Message	Date
Christoph Hellwig	6b9c7ed848	[PATCH] use ptrace_get_task_struct in various places The ptrace_get_task_struct() helper that I added as part of the ptrace consolidation is useful in variety of places that currently opencode it. Switch them to the common helpers. Add a ptrace_traceme() helper that needs to be explicitly called, and simplify the ptrace_get_task_struct() interface. We don't need the request argument now, and we return the task_struct directly, using ERR_PTR() for error returns. It's a bit more code in the callers, but we have two sane routines that do one thing well now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:51 -08:00
Tom Zanussi	761da5c88a	[PATCH] relayfs: cleanup, change relayfs_file_* to relay_file_* This patch renames relayfs_file_operations to relay_file_operations, and the file operations themselves from relayfs_XXX to relay_file_XXX, to make it more clear that they refer to relay files. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:51 -08:00
Tom Zanussi	e6c08367b8	[PATCH] relayfs: add support for global relay buffers This patch adds the optional is_global outparam to the create_buf_file() callback. This can be used by clients to create a single global relayfs buffer instead of the default per-cpu buffers. This was suggested as being useful for certain debugging applications where it's more convenient to be able to get all the data from a single channel without having to go to the bother of dealing with per-cpu files. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:50 -08:00
Tom Zanussi	08c541a7ad	[PATCH] relayfs: add support for relay files in other filesystems This patch adds a couple of callback functions that allow a client to hook into relay_open()/close() and supply the files that will be used to represent the channel buffers; the default implementation if no callbacks are defined is to create the files in relayfs. This is to support the creation and use of relay files in other filesystems such as debugfs, as implied by the fact that relayfs_file_operations are exported. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:50 -08:00
Tom Zanussi	aaea25d7a6	[PATCH] relayfs: remove unused alloc/destroy_inode() Since we're no longer using relayfs_inode_info, remove relayfs_alloc_inode() and relayfs_destroy_inode() along with the relayfs inode cache. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:50 -08:00
Tom Zanussi	7431733791	[PATCH] relayfs: add relayfs_remove_file() This patch adds and exports relayfs_remove_file(), for API symmetry (with relayfs_create_file()). Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:49 -08:00
Tom Zanussi	907f2c77d1	[PATCH] relayfs: export relayfs_create_file() with fileops param This patch adds a mandatory fileops param to relayfs_create_file() and exports that function so that clients can use it to create files defined by their own set of file operations, in relayfs. The purpose is to allow relayfs applications to create their own set of 'control' files alongside their relay files in relayfs rather than having to create them in /proc or debugfs for instance. relayfs_create_file() is also used by relay_open_buf() to create the relay files for a channel. In this case, a pointer to relayfs_file_operations is passed in, along with a pointer to the buffer associated with the file. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:49 -08:00
Jan Beulich	b3f3d6141f	[PATCH] ELF: symbol table type additions Needed for the Novell kernel debugger and perhaps some per-cpu data on x86_64 in the future. Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:49 -08:00
Nick Piggin	095975da26	[PATCH] rcu file: use atomic primitives Use atomic_inc_not_zero for rcu files instead of special case rcuref. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:48 -08:00
Adrian Bunk	2a10e0b28b	[PATCH] move rtc_interrupt() prototype to rtc.h This patch moves the rtc_interrupt() prototype to rtc.h and removes the prototypes from C files. It also renames static rtc_interrupt() functions in arch/arm/mach-integrator/time.c and arch/sh64/kernel/time.c to avoid compile problems. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Paul Gortmaker <p_gortmaker@yahoo.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:47 -08:00
OGAWA Hirofumi	05eb0b51fb	[PATCH] fat: support a truncate() for expanding size (generic_cont_expand) This patch changes generic_cont_expand(), in order to share the code with fatfs. - Use vmtruncate() if ->prepare_write() returns a error. Even if ->prepare_write() returns an error, it may already have added some blocks. So, this truncates blocks outside of ->i_size by vmtruncate(). - Add generic_cont_expand_simple(). The generic_cont_expand_simple() assumes that ->prepare_write() can handle the block boundary. With this, we don't need to care the extra byte. And for expanding a file size by truncate(), fatfs uses the added generic_cont_expand_simple(). Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:47 -08:00
OGAWA Hirofumi	268fc16e34	[PATCH] export/change sync_page_range/_nolock() This exports/changes the sync_page_range/_nolock(). The fatfs needs sync_page_range/_nolock() for expanding truncate, and changes "size_t count" to "loff_t count". Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:47 -08:00
OGAWA Hirofumi	e5174baaea	[PATCH] fat: support ->direct_IO() This patch add to support of ->direct_IO() for mostly read. The user of this seems to want to use for streaming read. So, current direct I/O has limitation, it can only overwrite. (For write operation, mainly we need to handle the hole etc..) Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:46 -08:00
Russell King	9ded96f24c	[PATCH] IRQ type flags Some ARM platforms have the ability to program the interrupt controller to detect various interrupt edges and/or levels. For some platforms, this is critical to setup correctly, particularly those which the setting is dependent on the device. Currently, ARM drivers do (eg) the following: err = request_irq(irq, ...); set_irq_type(irq, IRQT_RISING); However, if the interrupt has previously been programmed to be level sensitive (for whatever reason) then this will cause an interrupt storm. Hence, if we combine set_irq_type() with request_irq(), we can then safely set the type prior to unmasking the interrupt. The unfortunate problem is that in order to support this, these flags need to be visible outside of the ARM architecture - drivers such as smc91x need these flags and they're cross-architecture. Finally, the SA_TRIGGER_* flag passed to request_irq() should reflect the property that the device would like. The IRQ controller code should do its best to select the most appropriate supported mode. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:46 -08:00
Paul Fulghum	705b6c7b34	[PATCH] new driver synclink_gt New character device driver for the SyncLink GT and SyncLink AC families of synchronous and asynchronous serial adapters Signed-off-by: Paul Fulghum <paulkf@microgate.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:45 -08:00
Tim Schmielau	de25968cc8	[PATCH] fix more missing includes Include fixes for 2.6.14-git11. Should allow to remove sched.h from module.h on i386, x86_64, arm, ia64, ppc, ppc64, and s390. Probably more to come since I haven't yet checked the other archs. Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:45 -08:00
Paul Jackson	c417f0242e	[PATCH] cpuset: remove test for null cpuset from alloc code path Remove a couple of more lines of code from the cpuset hooks in the page allocation code path. There was a check for a NULL cpuset pointer in the routine cpuset_update_task_memory_state() that was only needed during system boot, after the memory subsystem was initialized, before the cpuset subsystem was initialized, to catch a NULL task->cpuset pointer. Add a cpuset_init_early() routine, just before the mem_init() call in init/main.c, that sets up just enough of the init tasks cpuset structure to render cpuset_update_task_memory_state() calls harmless. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:44 -08:00
Paul Jackson	4225399a66	[PATCH] cpuset: rebind vma mempolicies fix Fix more of longstanding bug in cpuset/mempolicy interaction. NUMA mempolicies (mm/mempolicy.c) are constrained by the current tasks cpuset to just the Memory Nodes allowed by that cpuset. The kernel maintains internal state for each mempolicy, tracking what nodes are used for the MPOL_INTERLEAVE, MPOL_BIND or MPOL_PREFERRED policies. When a tasks cpuset memory placement changes, whether because the cpuset changed, or because the task was attached to a different cpuset, then the tasks mempolicies have to be rebound to the new cpuset placement, so as to preserve the cpuset-relative numbering of the nodes in that policy. An earlier fix handled such mempolicy rebinding for mempolicies attached to a task. This fix rebinds mempolicies attached to vma's (address ranges in a tasks address space.) Due to the need to hold the task->mm->mmap_sem semaphore while updating vma's, the rebinding of vma mempolicies has to be done when the cpuset memory placement is changed, at which time mmap_sem can be safely acquired. The tasks mempolicy is rebound later, when the task next attempts to allocate memory and notices that its task->cpuset_mems_generation is out-of-date with its cpusets mems_generation. Because walking the tasklist to find all tasks attached to a changing cpuset requires holding tasklist_lock, a spinlock, one cannot update the vma's of the affected tasks while doing the tasklist scan. In general, one cannot acquire a semaphore (which can sleep) while already holding a spinlock (such as tasklist_lock). So a list of mm references has to be built up during the tasklist scan, then the tasklist lock dropped, then for each mm, its mmap_sem acquired, and the vma's in that mm rebound. Once the tasklist lock is dropped, affected tasks may fork new tasks, before their mm's are rebound. A kernel global 'cpuset_being_rebound' is set to point to the cpuset being rebound (there can only be one; cpuset modifications are done under a global 'manage_sem' semaphore), and the mpol_copy code that is used to copy a tasks mempolicies during fork catches such forking tasks, and ensures their children are also rebound. When a task is moved to a different cpuset, it is easier, as there is only one task involved. It's mm->vma's are scanned, using the same mpol_rebind_policy() as used above. It may happen that both the mpol_copy hook and the update done via the tasklist scan update the same mm twice. This is ok, as the mempolicies of each vma in an mm keep track of what mems_allowed they are relative to, and safely no-op a second request to rebind to the same nodes. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:44 -08:00
Paul Jackson	202f72d5d1	[PATCH] cpuset: number_of_cpusets optimization Easy little optimization hack to avoid actually having to call cpuset_zone_allowed() and check mems_allowed, in the main page allocation routine, __alloc_pages(). This saves several CPU cycles per page allocation on systems not using cpusets. A counter is updated each time a cpuset is created or removed, and whenever there is only one cpuset in the system, it must be the root cpuset, which contains all CPUs and all Memory Nodes. In that case, when the counter is one, all allocations are allowed. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:44 -08:00
Paul Jackson	74cb21553f	[PATCH] cpuset: numa_policy_rebind cleanup Cleanup, reorganize and make more robust the mempolicy.c code to rebind mempolicies relative to the containing cpuset after a tasks memory placement changes. The real motivator for this cleanup patch is to lay more groundwork for the upcoming patch to correctly rebind NUMA mempolicies that are attached to vma's after the containing cpuset memory placement changes. NUMA mempolicies are constrained by the cpuset their task is a member of. When either (1) a task is moved to a different cpuset, or (2) the 'mems' mems_allowed of a cpuset is changed, then the NUMA mempolicies have embedded node numbers (for MPOL_BIND, MPOL_INTERLEAVE and MPOL_PREFERRED) that need to be recalculated, relative to their new cpuset placement. The old code used an unreliable method of determining what was the old mems_allowed constraining the mempolicy. It just looked at the tasks mems_allowed value. This sort of worked with the present code, that just rebinds the -task- mempolicy, and leaves any -vma- mempolicies broken, referring to the old nodes. But in an upcoming patch, the vma mempolicies will be rebound as well. Then the order in which the various task and vma mempolicies are updated will no longer be deterministic, and one can no longer count on the task->mems_allowed holding the old value for as long as needed. It's not even clear if the current code was guaranteed to work reliably for task mempolicies. So I added a mems_allowed field to each mempolicy, stating exactly what mems_allowed the policy is relative to, and updated synchronously and reliably anytime that the mempolicy is rebound. Also removed a useless wrapper routine, numa_policy_rebind(), and had its caller, cpuset_update_task_memory_state(), call directly to the rewritten policy_rebind() routine, and made that rebind routine extern instead of static, and added a "mpol_" prefix to its name, making it mpol_rebind_policy(). Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:44 -08:00
Paul Jackson	909d75a3b7	[PATCH] cpuset: implement cpuset_mems_allowed Provide a cpuset_mems_allowed() method, which the sys_migrate_pages() code needed, to obtain the mems_allowed vector of a cpuset, and replaced the workaround in sys_migrate_pages() to call this new method. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:44 -08:00
Paul Jackson	cf2a473c40	[PATCH] cpuset: combine refresh_mems and update_mems The important code paths through alloc_pages_current() and alloc_page_vma(), by which most kernel page allocations go, both called cpuset_update_current_mems_allowed(), which in turn called refresh_mems(). -Both- of these latter two routines did a tasklock, got the tasks cpuset pointer, and checked for out of date cpuset->mems_generation. That was a silly duplication of code and waste of CPU cycles on an important code path. Consolidated those two routines into a single routine, called cpuset_update_task_memory_state(), since it updates more than just mems_allowed. Changed all callers of either routine to call the new consolidated routine. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:43 -08:00
Paul Jackson	3e0d98b9f1	[PATCH] cpuset: memory pressure meter Provide a simple per-cpuset metric of memory pressure, tracking the -rate- that the tasks in a cpuset call try_to_free_pages(), the synchronous (direct) memory reclaim code. This enables batch managers monitoring jobs running in dedicated cpusets to efficiently detect what level of memory pressure that job is causing. This is useful both on tightly managed systems running a wide mix of submitted jobs, which may choose to terminate or reprioritize jobs that are trying to use more memory than allowed on the nodes assigned them, and with tightly coupled, long running, massively parallel scientific computing jobs that will dramatically fail to meet required performance goals if they start to use more memory than allowed to them. This patch just provides a very economical way for the batch manager to monitor a cpuset for signs of memory pressure. It's up to the batch manager or other user code to decide what to do about it and take action. ==> Unless this feature is enabled by writing "1" to the special file /dev/cpuset/memory_pressure_enabled, the hook in the rebalance code of __alloc_pages() for this metric reduces to simply noticing that the cpuset_memory_pressure_enabled flag is zero. So only systems that enable this feature will compute the metric. Why a per-cpuset, running average: Because this meter is per-cpuset, rather than per-task or mm, the system load imposed by a batch scheduler monitoring this metric is sharply reduced on large systems, because a scan of the tasklist can be avoided on each set of queries. Because this meter is a running average, instead of an accumulating counter, a batch scheduler can detect memory pressure with a single read, instead of having to read and accumulate results for a period of time. Because this meter is per-cpuset rather than per-task or mm, the batch scheduler can obtain the key information, memory pressure in a cpuset, with a single read, rather than having to query and accumulate results over all the (dynamically changing) set of tasks in the cpuset. A per-cpuset simple digital filter (requires a spinlock and 3 words of data per-cpuset) is kept, and updated by any task attached to that cpuset, if it enters the synchronous (direct) page reclaim code. A per-cpuset file provides an integer number representing the recent (half-life of 10 seconds) rate of direct page reclaims caused by the tasks in the cpuset, in units of reclaims attempted per second, times 1000. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:42 -08:00
Paul Jackson	5966514db6	[PATCH] cpuset: mempolicy one more nodemask conversion Finish converting mm/mempolicy.c from bitmaps to nodemasks. The previous conversion had left one routine using bitmaps, since it involved a corresponding change to kernel/cpuset.c Fix that interface by replacing with a simple macro that calls nodes_subset(), or if !CONFIG_CPUSET, returns (1). Signed-off-by: Paul Jackson <pj@sgi.com> Cc: Christoph Lameter <christoph@lameter.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:42 -08:00
Matt Mackall	10cef60295	[PATCH] slob: introduce the SLOB allocator configurable replacement for slab allocator This adds a CONFIG_SLAB option under CONFIG_EMBEDDED. When CONFIG_SLAB is disabled, the kernel falls back to using the 'SLOB' allocator. SLOB is a traditional K&R/UNIX allocator with a SLAB emulation layer, similar to the original Linux kmalloc allocator that SLAB replaced. It's signicantly smaller code and is more memory efficient. But like all similar allocators, it scales poorly and suffers from fragmentation more than SLAB, so it's only appropriate for small systems. It's been tested extensively in the Linux-tiny tree. I've also stress-tested it with make -j 8 compiles on a 3G SMP+PREEMPT box (not recommended). Here's a comparison for otherwise identical builds, showing SLOB saving nearly half a megabyte of RAM: $ size vmlinux* text data bss dec hex filename 3336372 529360 190812 4056544 3de5e0 vmlinux-slab 3323208 527948 190684 4041840 3dac70 vmlinux-slob $ size mm/{slab,slob}.o text data bss dec hex filename 13221 752 48 14021 36c5 mm/slab.o 1896 52 8 1956 7a4 mm/slob.o /proc/meminfo: SLAB SLOB delta MemTotal: 27964 kB 27980 kB +16 kB MemFree: 24596 kB 25092 kB +496 kB Buffers: 36 kB 36 kB 0 kB Cached: 1188 kB 1188 kB 0 kB SwapCached: 0 kB 0 kB 0 kB Active: 608 kB 600 kB -8 kB Inactive: 808 kB 812 kB +4 kB HighTotal: 0 kB 0 kB 0 kB HighFree: 0 kB 0 kB 0 kB LowTotal: 27964 kB 27980 kB +16 kB LowFree: 24596 kB 25092 kB +496 kB SwapTotal: 0 kB 0 kB 0 kB SwapFree: 0 kB 0 kB 0 kB Dirty: 4 kB 12 kB +8 kB Writeback: 0 kB 0 kB 0 kB Mapped: 560 kB 556 kB -4 kB Slab: 1756 kB 0 kB -1756 kB CommitLimit: 13980 kB 13988 kB +8 kB Committed_AS: 4208 kB 4208 kB 0 kB PageTables: 28 kB 28 kB 0 kB VmallocTotal: 1007312 kB 1007312 kB 0 kB VmallocUsed: 48 kB 48 kB 0 kB VmallocChunk: 1007264 kB 1007264 kB 0 kB (this work has been sponsored in part by CELF) From: Ingo Molnar <mingo@elte.hu> Fix 32-bitness bugs in mm/slob.c. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:41 -08:00
Paul E. McKenney	d4829cd5b4	[PATCH] remove get_task_struct_rcu() The latest set of signal-RCU patches does not use get_task_struct_rcu(). Attached is a patch that removes it. Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:40 -08:00
Ingo Molnar	e56d090310	[PATCH] RCU signal handling RCU tasklist_lock and RCU signal handling: send signals RCU-read-locked instead of tasklist_lock read-locked. This is a scalability improvement on SMP and a preemption-latency improvement under PREEMPT_RCU. Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: William Irwin <wli@holomorphy.com> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:40 -08:00
Ravikiran G Thirumalai	22fc6eccbf	[PATCH] Change maxaligned_in_smp alignemnt macros to internodealigned_in_smp macros ____cacheline_maxaligned_in_smp is currently used to align critical structures and avoid false sharing. It uses per-arch L1_CACHE_SHIFT_MAX and people find L1_CACHE_SHIFT_MAX useless. However, we have been using ____cacheline_maxaligned_in_smp to align structures on the internode cacheline size. As per Andi's suggestion, following patch kills ____cacheline_maxaligned_in_smp and introduces INTERNODE_CACHE_SHIFT, which defaults to L1_CACHE_SHIFT for all arches. Arches needing L3/Internode cacheline alignment can define INTERNODE_CACHE_SHIFT in the arch asm/cache.h. Patch replaces ____cacheline_maxaligned_in_smp with ____cacheline_internodealigned_in_smp With this patch, L1_CACHE_SHIFT_MAX can be killed Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:38 -08:00
Christoph Lameter	48fce3429d	[PATCH] mempolicies: unexport get_vma_policy() Since the numa_maps functionality is now in mempolicy.c we no longer need to export get_vma_policy(). Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:44 -08:00
Avishay Traeger	152194aaa6	[PATCH] set_page_count() macro safety Fix set_page_count() macro to handle complex arguments. Signed-off-by: Avishay Traeger <atraeger@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:43 -08:00
Paul Jackson	45b07ef31d	[PATCH] cpusets: swap migration interface Add a boolean "memory_migrate" to each cpuset, represented by a file containing "0" or "1" in each directory below /dev/cpuset. It defaults to false (file contains "0"). It can be set true by writing "1" to the file. If true, then anytime that a task is attached to the cpuset so marked, the pages of that task will be moved to that cpuset, preserving, to the extent practical, the cpuset-relative placement of the pages. Also anytime that a cpuset so marked has its memory placement changed (by writing to its "mems" file), the tasks in that cpuset will have their pages moved to the cpusets new nodes, preserving, to the extent practical, the cpuset-relative placement of the moved pages. Signed-off-by: Paul Jackson <pj@sgi.com> Cc: Christoph Lameter <christoph@lameter.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:43 -08:00
Christoph Lameter	d498471133	[PATCH] SwapMig: Extend parameters for migrate_pages() Extend the parameters of migrate_pages() to allow the caller control over the fate of successfully migrated or impossible to migrate pages. Swap migration and direct migration will have the same interface after this patch so that patches can be independently applied to the policy layer and the core migration code. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:42 -08:00
Christoph Lameter	1480a540c9	[PATCH] SwapMig: add_to_swap() avoid atomic allocations Add gfp_mask to add_to_swap add_to_swap does allocations with GFP_ATOMIC in order not to interfere with swapping. During migration we may have use add_to_swap extensively which may lead to out of memory errors. This patch makes add_to_swap take a parameter that specifies the gfp mask. The page migration code can then make add_to_swap use GFP_KERNEL. Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:42 -08:00
Christoph Lameter	8419c31810	[PATCH] SwapMig: CONFIG_MIGRATION fixes Move move_to_lru, putback_lru_pages and isolate_lru in section surrounded by CONFIG_MIGRATION saving some codesize for single processor kernels. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:42 -08:00
Christoph Lameter	39743889aa	[PATCH] Swap Migration V5: sys_migrate_pages interface sys_migrate_pages implementation using swap based page migration This is the original API proposed by Ray Bryant in his posts during the first half of 2005 on linux-mm@kvack.org and linux-kernel@vger.kernel.org. The intent of sys_migrate is to migrate memory of a process. A process may have migrated to another node. Memory was allocated optimally for the prior context. sys_migrate_pages allows to shift the memory to the new node. sys_migrate_pages is also useful if the processes available memory nodes have changed through cpuset operations to manually move the processes memory. Paul Jackson is working on an automated mechanism that will allow an automatic migration if the cpuset of a process is changed. However, a user may decide to manually control the migration. This implementation is put into the policy layer since it uses concepts and functions that are also needed for mbind and friends. The patch also provides a do_migrate_pages function that may be useful for cpusets to automatically move memory. sys_migrate_pages does not modify policies in contrast to Ray's implementation. The current code here is based on the swap based page migration capability and thus is not able to preserve the physical layout relative to it containing nodeset (which may be a cpuset). When direct page migration becomes available then the implementation needs to be changed to do a isomorphic move of pages between different nodesets. The current implementation simply evicts all pages in source nodeset that are not in the target nodeset. Patch supports ia64, i386 and x86_64. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:42 -08:00
Christoph Lameter	dc9aa5b9d6	[PATCH] Swap Migration V5: MPOL_MF_MOVE interface Add page migration support via swap to the NUMA policy layer This patch adds page migration support to the NUMA policy layer. An additional flag MPOL_MF_MOVE is introduced for mbind. If MPOL_MF_MOVE is specified then pages that do not conform to the memory policy will be evicted from memory. When they get pages back in new pages will be allocated following the numa policy. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:41 -08:00
Christoph Lameter	7cbe34cf86	[PATCH] Swap Migration V5: Add CONFIG_MIGRATION for page migration support Include page migration if the system is NUMA or having a memory model that allows distinct areas of memory (SPARSEMEM, DISCONTIGMEM). And: - Only include lru_add_drain_per_cpu if building for an SMP system. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:41 -08:00
Christoph Lameter	49d2e9cc45	[PATCH] Swap Migration V5: migrate_pages() function This adds the basic page migration function with a minimal implementation that only allows the eviction of pages to swap space. Page eviction and migration may be useful to migrate pages, to suspend programs or for remapping single pages (useful for faulty pages or pages with soft ECC failures) The process is as follows: The function wanting to migrate pages must first build a list of pages to be migrated or evicted and take them off the lru lists via isolate_lru_page(). isolate_lru_page determines that a page is freeable based on the LRU bit set. Then the actual migration or swapout can happen by calling migrate_pages(). migrate_pages does its best to migrate or swapout the pages and does multiple passes over the list. Some pages may only be swappable if they are not dirty. migrate_pages may start writing out dirty pages in the initial passes over the pages. However, migrate_pages may not be able to migrate or evict all pages for a variety of reasons. The remaining pages may be returned to the LRU lists using putback_lru_pages(). Changelog V4->V5: - Use the lru caches to return pages to the LRU Changelog V3->V4: - Restructure code so that applying patches to support full migration does require minimal changes. Rename swapout_pages() to migrate_pages(). Changelog V2->V3: - Extract common code from shrink_list() and swapout_pages() Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: "Michael Kerrisk" <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:41 -08:00
Christoph Lameter	930d915252	[PATCH] Swap Migration V5: PF_SWAPWRITE to allow writing to swap Add PF_SWAPWRITE to control a processes permission to write to swap. - Use PF_SWAPWRITE in may_write_to_queue() instead of checking for kswapd and pdflush - Set PF_SWAPWRITE flag for kswapd and pdflush Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:41 -08:00
Christoph Lameter	21eac81f25	[PATCH] Swap Migration V5: LRU operations This is the start of the `swap migration' patch series. Swap migration allows the moving of the physical location of pages between nodes in a numa system while the process is running. This means that the virtual addresses that the process sees do not change. However, the system rearranges the physical location of those pages. The main intent of page migration patches here is to reduce the latency of memory access by moving pages near to the processor where the process accessing that memory is running. The patchset allows a process to manually relocate the node on which its pages are located through the MF_MOVE and MF_MOVE_ALL options while setting a new memory policy. The pages of process can also be relocated from another process using the sys_migrate_pages() function call. Requires CAP_SYS_ADMIN. The migrate_pages function call takes two sets of nodes and moves pages of a process that are located on the from nodes to the destination nodes. Manual migration is very useful if for example the scheduler has relocated a process to a processor on a distant node. A batch scheduler or an administrator can detect the situation and move the pages of the process nearer to the new processor. sys_migrate_pages() could be used on non-numa machines as well, to force all of a particualr process's pages out to swap, if someone thinks that's useful. Larger installations usually partition the system using cpusets into sections of nodes. Paul has equipped cpusets with the ability to move pages when a task is moved to another cpuset. This allows automatic control over locality of a process. If a task is moved to a new cpuset then also all its pages are moved with it so that the performance of the process does not sink dramatically (as is the case today). Swap migration works by simply evicting the page. The pages must be faulted back in. The pages are then typically reallocated by the system near the node where the process is executing. For swap migration the destination of the move is controlled by the allocation policy. Cpusets set the allocation policy before calling sys_migrate_pages() in order to move the pages as intended. No allocation policy changes are performed for sys_migrate_pages(). This means that the pages may not faulted in to the specified nodes if no allocation policy was set by other means. The pages will just end up near the node where the fault occurred. There's another patch series in the pipeline which implements "direct migration". The direct migration patchset extends the migration functionality to avoid going through swap. The destination node of the relation is controllable during the actual moving of pages. The crutch of using the allocation policy to relocate is not necessary and the pages are moved directly to the target. Its also faster since swap is not used. And sys_migrate_pages() can then move pages directly to the specified node. Implement functions to isolate pages from the LRU and put them back later. This patch: An earlier implementation was provided by Hirokazu Takahashi <taka@valinux.co.jp> and IWAMOTO Toshihiro <iwamoto@valinux.co.jp> for the memory hotplug project. From: Magnus This breaks out isolate_lru_page() and putpack_lru_page(). Needed for swap migration. Signed-off-by: Magnus Damm <magnus.damm@gmail.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:41 -08:00
Christoph Lameter	15316ba81a	[PATCH] add schedule_on_each_cpu() swap migration's isolate_lru_page() currently uses an IPI to notify other processors that the lru caches need to be drained if the page cannot be found on the LRU. The IPI interrupt may interrupt a processor that is just processing lru requests and cause a race condition. This patch introduces a new function run_on_each_cpu() that uses the keventd() to run the LRU draining on each processor. Processors disable preemption when dealing the LRU caches (these are per processor) and thus executing LRU draining from another process is safe. Thanks to Lee Schermerhorn <lee.schermerhorn@hp.com> for finding this race condition. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:40 -08:00
Rohit Seth	8ad4b1fb82	[PATCH] Make high and batch sizes of per_cpu_pagelists configurable As recently there has been lot of traffic on the right values for batch and high water marks for per_cpu_pagelists. This patch makes these two variables configurable through /proc interface. A new tunable /proc/sys/vm/percpu_pagelist_fraction is added. This entry controls the fraction of pages at most in each zone that are allocated for each per cpu page list. The min value for this is 8. It means that we don't allow more than 1/8th of pages in each zone to be allocated in any single per_cpu_pagelist. The batch value of each per cpu pagelist is also updated as a result. It is set to pcp->high/4. The upper limit of batch is (PAGE_SHIFT * 8) Signed-off-by: Rohit Seth <rohit.seth@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:40 -08:00
Andrew Morton	9d0243bca3	[PATCH] drop-pagecache Add /proc/sys/vm/drop_caches. When written to, this will cause the kernel to discard as much pagecache and/or reclaimable slab objects as it can. THis operation requires root permissions. It won't drop dirty data, so the user should run `sync' first. Caveats: a) Holds inode_lock for exorbitant amounts of time. b) Needs to be taught about NUMA nodes: propagate these all the way through so the discarding can be controlled on a per-node basis. This is a debugging feature: useful for getting consistent results between filesystem benchmarks. We could possibly put it under a config option, but it's less than 300 bytes. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:40 -08:00
Pekka Enberg	f9f7500521	[PATCH] slab: remove unused align parameter from alloc_percpu __alloc_percpu and alloc_percpu both take an 'align' argument which is completely ignored. snmp6_mib_init() in net/ipv6/af_inet6.c attempts to use it, but it will be ignored. Therefore, remove the 'align' argument and fixup the lone caller. Signed-off-by: Matthew Dobson <colpatch@us.ibm.com> Acked-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:39 -08:00
Olaf Hering	b792de39d8	[PATCH] Fix compilation with CONFIG_MEMORY_HOTPLUG=y and gcc41. Fix compilation with CONFIG_MEMORY_HOTPLUG=y and gcc41. Also remove unneeded declations, add a public function. drivers/base/memory.c:53: error: static declaration of 'register_memory_notifier' follows non-static declaration include/linux/memory.h:85: error: previous declaration of 'register_memory_notifier' was here drivers/base/memory.c:58: error: static declaration of 'unregister_memory_notifier' follows non-static declaration include/linux/memory.h:86: error: previous declaration of 'unregister_memory_notifier' was here drivers/base/memory.c:68: error: static declaration of 'register_memory' follows non-static declaration include/linux/memory.h:73: error: previous declaration of 'register_memory' was here Signed-off-by: Olaf Hering <olh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:39 -08:00
Benjamin Herrenschmidt	1beb6a7d6c	[PATCH] powerpc: Experimental support for new G5 Macs (#2 ) This adds some very basic support for the new machines, including the Quad G5 (tested), and other new dual core based machines and iMac G5 iSight (untested). This is still experimental ! There is no thermal control yet, there is no proper handing of MSIs, etc.. but it boots, I have all 4 cores up on my machine. Compared to the previous version of this patch, this one adds DART IOMMU support for the U4 chipset and thus should work fine on setups with more than 2Gb of RAM. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-01-09 15:03:17 +11:00
Arnd Bergmann	67207b9664	[PATCH] spufs: The SPU file system, base This is the current version of the spu file system, used for driving SPEs on the Cell Broadband Engine. This release is almost identical to the version for the 2.6.14 kernel posted earlier, which is available as part of the Cell BE Linux distribution from http://www.bsc.es/projects/deepcomputing/linuxoncell/. The first patch provides all the interfaces for running spu application, but does not have any support for debugging SPU tasks or for scheduling. Both these functionalities are added in the subsequent patches. See Documentation/filesystems/spufs.txt on how to use spufs. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-01-09 14:49:12 +11:00
David S. Miller	f53b61d8c3	[NETFILTER]: Add dummy nf_hook{_thresh}() when NETFILTER is disabled. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:42 -08:00
Patrick McHardy	e16a8f0b8c	[NETFILTER]: Add ipt_policy/ip6t_policy matches Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:38 -08:00
Patrick McHardy	eb9c7ebe69	[NETFILTER]: Handle NAT in IPsec policy checks Handle NAT of decapsulated IPsec packets by reconstructing the struct flowi of the original packet from the conntrack information for IPsec policy checks. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:37 -08:00
Patrick McHardy	3e3850e989	[NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder ip_route_me_harder doesn't use the port numbers of the xfrm lookup and uses ip_route_input for non-local addresses which doesn't do a xfrm lookup, ip6_route_me_harder doesn't do a xfrm lookup at all. Use xfrm_decode_session and do the lookup manually, make sure both only do the lookup if the packet hasn't been transformed already. Makeing sure the lookup only happens once needs a new field in the IP6CB, which exceeds the size of skb->cb. The size of skb->cb is increased to 48b. Apparently the IPv6 mobile extensions need some more room anyway. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:33 -08:00
Patrick McHardy	951dbc8ac7	[IPV6]: Move nextheader offset to the IP6CB Move nextheader offset to the IP6CB to make it possible to pass a packet to ip6_input_finish multiple times and have it skip already parsed headers. As a nice side effect this gets rid of the manual hopopts skipping in ip6_input_finish. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:29 -08:00
Patrick McHardy	16a6677fdf	[XFRM]: Netfilter IPsec output hooks Call netfilter hooks before IPsec transforms. Packets visit the FORWARD/LOCAL_OUT and POST_ROUTING hook before the first encapsulation and the LOCAL_OUT and POST_ROUTING hook before each following tunnel mode transform. Patch from Herbert Xu <herbert@gondor.apana.org.au>: Move the loop from dst_output into xfrm4_output/xfrm6_output since they're the only ones who need to it. xfrm{4,6}_output_one() processes the first SA all subsequent transport mode SAs and is called in a loop that calls the netfilter hooks between each two calls. In order to avoid the tail call issue, I've added the inline function nf_hook which is nf_hook_slow plus the empty list check. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:28 -08:00
Linus Torvalds	8995b161eb	Merge master.kernel.org:/home/rmk/linux-2.6-arm	2006-01-07 10:45:22 -08:00
Linus Torvalds	cc918c7ab7	Merge master.kernel.org:/home/rmk/linux-2.6-serial	2006-01-07 10:44:22 -08:00
Linus Torvalds	f9c5d0451b	Merge master.kernel.org:/home/rmk/linux-2.6-mmc	2006-01-07 10:43:40 -08:00
Russell King	f8ce25476d	[ARM] Move asm/hardware/clock.h to linux/clk.h This is needs to be visible to other architectures using the AMBA bus and peripherals. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2006-01-07 16:15:52 +00:00
Russell King	123656d4cc	Merge with Linus' kernel.	2006-01-07 14:40:05 +00:00
Russell King	a62c80e559	[ARM] Move AMBA include files to include/linux/amba/ Since the ARM AMBA bus is used on MIPS as well as ARM, we need to make the bus available for other architectures to use. Move the AMBA include files from include/asm-arm/hardware/ to include/linux/amba/ Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2006-01-07 13:52:45 +00:00
Linus Torvalds	0feb9bfcfa	Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6	2006-01-06 15:25:08 -08:00
Linus Torvalds	d8d8f6a4fd	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2006-01-06 15:24:28 -08:00
Linus Torvalds	57d1c91fa6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild	2006-01-06 15:23:56 -08:00
Alexey Dobriyan	a2167dc62e	[NET]: Endian-annotate in_aton() Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-06 13:24:54 -08:00
Alexey Dobriyan	76ab608d86	[NET]: Endian-annotate struct iphdr And fix trivial warnings that emerged. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-06 13:24:29 -08:00
Kris Katterjohn	4bad4dc919	[NET]: Change sk_run_filter()'s return type in net/core/filter.c It should return an unsigned value, and fix sk_filter() as well. Signed-off-by: Kris Katterjohn <kjak@ispwest.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-06 13:08:20 -08:00
Greg Kroah-Hartman	ccf18968b1	Merge ../torvalds-2.6/	2006-01-06 12:59:59 -08:00
Sam Ravnborg	367cb70421	kbuild: un-stringnify KBUILD_MODNAME Now when kbuild passes KBUILD_MODNAME with "" do not __stringify it when used. Remove __stringnify for all users. This also fixes the output of: $ ls -l /sys/module/ drwxr-xr-x 4 root root 0 2006-01-05 14:24 pcmcia drwxr-xr-x 4 root root 0 2006-01-05 14:24 pcmcia_core drwxr-xr-x 3 root root 0 2006-01-05 14:24 "processor" drwxr-xr-x 3 root root 0 2006-01-05 14:24 "psmouse" The quoting of the module names will be gone again. Thanks to GregKH + Kay Sievers for reproting this. Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2006-01-06 21:17:50 +01:00
J. Bruce Fields	9eed129bbd	SUNRPC: Update the spkm3 code to use the make_checksum interface Also update the tokenlen calculations to accomodate g_token_size(). Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:59 -05:00
Trond Myklebust	58df095b73	NFSv4: Allow entries in the idmap cache to expire If someone changes the uid/gid mapping in userland, then we do eventually want those changes to be propagated to the kernel. Currently the kernel assumes that it may cache entries forever. Add an expiration time + garbage collector for idmap entries. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:58 -05:00
Trond Myklebust	632e3bdc50	SUNRPC: Ensure client closes the socket when server initiates a close If the server decides to close the RPC socket, we currently don't actually respond until either another RPC call is scheduled, or until xprt_autoclose() gets called by the socket expiry timer (which may be up to 5 minutes later). This patch ensures that xprt_autoclose() is called much sooner if the server closes the socket. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:57 -05:00
Chuck Lever	f518e35aec	SUNRPC: get rid of cl_chatty Clean up: Every ULP that uses the in-kernel RPC client, except the NLM client, sets cl_chatty. There's no reason why NLM shouldn't set it, so just get rid of cl_chatty and always be verbose. Test-plan: Compile with CONFIG_NFS enabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:56 -05:00
Chuck Lever	922004120b	SUNRPC: transport switch API for setting port number At some point, transport endpoint addresses will no longer be IPv4. To hide the structure of the rpc_xprt's address field from ULPs and port mappers, add an API for setting the port number during an RPC bind operation. Test-plan: Destructive testing (unplugging the network temporarily). Connectathon with UDP and TCP. NFSv2/3 and NFSv4 mounting should be carefully checked. Probably need to rig a server where certain services aren't running, or that returns an error for some typical operation. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:56 -05:00
Chuck Lever	35f5a422ce	SUNRPC: new interface to force an RPC rebind We'd like to hide fields in rpc_xprt and rpc_clnt from upper layer protocols. Start by creating an API to force RPC rebind, replacing logic that simply sets cl_port to zero. Test-plan: Destructive testing (unplugging the network temporarily). Connectathon with UDP and TCP. NFSv2/3 and NFSv4 mounting should be carefully checked. Probably need to rig a server where certain services aren't running, or that returns an error for some typical operation. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:56 -05:00
Chuck Lever	0210714834	SUNRPC: switchable buffer allocation Add RPC client transport switch support for replacing buffer management on a per-transport basis. In the current IPv4 socket transport implementation, RPC buffers are allocated as needed for each RPC message that is sent. Some transport implementations may choose to use pre-allocated buffers for encoding, sending, receiving, and unmarshalling RPC messages, however. For transports capable of direct data placement, the buffers can be carved out of a pre-registered area of memory rather than from a slab cache. Test-plan: Millions of fsx operations. Performance characterization with "sio" and "iozone". Use oprofile and other tools to look for significant regression in CPU utilization. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:55 -05:00
J. Bruce Fields	64a318ee2a	NLM: Further cancel fixes If the server receives an NLM cancel call and finds no waiting lock to cancel, then chances are the lock has already been applied, and the client just hadn't yet processed the NLM granted callback before it sent the cancel. The Open Group text, for example, perimts a server to return either success (LCK_GRANTED) or failure (LCK_DENIED) in this case. But returning an error seems more helpful; the client may be able to use it to recognize that a race has occurred and to recover from the race. So, modify the relevant functions to return an error in this case. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:54 -05:00
Adrian Bunk	fb459f45f7	SUNRPC: net/sunrpc/xdr.c: remove xdr_decode_string() This patch removes ths unused function xdr_decode_string(). Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Neil Brown <neilb@suse.de> Acked-by: Charles Lever <Charles.Lever@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:53 -05:00
Trond Myklebust	a72b44222d	NFSv4: Allow user to set the port used by the NFSv4 callback channel Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:52 -05:00
Trond Myklebust	fa178f29c0	NFSv4: Ensure DELEGRETURN returns attributes Upon return of a write delegation, the server will almost always bump the change attribute. Ensure that we pick up that change so that we don't invalidate our data cache unnecessarily. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:51 -05:00
Trond Myklebust	70b9ecbdb9	NFS: Make stat() return updated mtimes after a write() The SuS states that a call to write() will cause mtime to be updated on the file. In order to satisfy that requirement, we need to flush out any cached writes in nfs_getattr(). Speed things up slightly by not committing the writes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:50 -05:00
Chuck Lever	40859d7ee6	NFS: support large reads and writes on the wire Most NFS server implementations allow up to 64KB reads and writes on the wire. The Solaris NFS server allows up to a megabyte, for instance. Now the Linux NFS client supports transfer sizes up to 1MB, too. This will help reduce protocol and context switch overhead on read/write intensive NFS workloads, and support larger atomic read and write operations on servers that support them. Test-plan: Connectathon and iozone on mount point with wsize=rsize>32768 over TCP. Tests with NFS over UDP to verify the maximum RPC payload size cap. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:49 -05:00
Chuck Lever	a911fd9a60	NFS: simplify inlined bit ops in nfs_page.h Minor cleanup: inlined bit ops in nfs_page.h can be simpler. Test plan: Write-intensive workload against a server that requires COMMITs. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:48 -05:00
Trond Myklebust	911d1aaf26	NFSv4: locking XDR cleanup Get rid of some unnecessary intermediate structures Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:44 -05:00
Trond Myklebust	cdd4e68b5f	NFSv4: Make open_confirm() asynchronous too Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:42 -05:00
Trond Myklebust	44c288732f	NFSv4: stateful NFSv4 RPC call interface The NFSv4 model requires us to complete all RPC calls that might establish state on the server whether or not the user wants to interrupt it. We may also need to schedule new work (including new RPC calls) in order to cancel the new state. The asynchronous RPC model will allow us to ensure that RPC calls always complete, but in order to allow for "synchronous" RPC, we want to add the ability to wait for completion. The waits are, of course, interruptible. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:40 -05:00
Trond Myklebust	4ce70ada1f	SUNRPC: Further cleanups Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:40 -05:00
Trond Myklebust	963d8fe533	RPC: Clean up RPC task structure Shrink the RPC task structure. Instead of storing separate pointers for task->tk_exit and task->tk_release, put them in a structure. Also pass the user data pointer as a parameter instead of passing it via task->tk_calldata. This enables us to nest callbacks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:39 -05:00
Trond Myklebust	abbcf28f23	SUNRPC: Yet more RPC cleanups Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:39 -05:00
Andrew Morton	22905f775d	identify multipage ->writepages() calls NFS needs to be able to distinguish between single-page ->writepage() calls and multipage ->writepages() calls. For the single-page writepage calls NFS can kick off the I/O within the context of ->writepage(). For multipage ->writepages calls, nfs_writepage() will leave the I/O pending and nfs_writepages() will kick off the I/O when it all has been queued up within NFS. Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:38 -05:00
Linus Torvalds	d99cf9d679	Merge branch 'post-2.6.15' of git://brick.kernel.dk/data/git/linux-2.6-block Manual fixup for merge with Jens' "Suspend support for libata", commit ID `9b84754866`. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 09:01:25 -08:00
Jens Axboe	9b84754866	[PATCH] Suspend support for libata This patch adds suspend patch to libata, and ata_piix in particular. For most low level drivers, they should just need to add the 4 hooks to work. As I can only test ata_piix, I didn't enable it for more though. Suspend support is the single most important feature on a notebook, and most new notebooks have sata drives. It's quite embarrassing that we _still_ do not support this. Right now, it's perfectly possible to suspend the drive in mid-transfer. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:36:09 -08:00
NeilBrown	88202a0c84	[PATCH] md: allow sync-speed to be controlled per-device Also export current (average) speed and status in sysfs. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:10 -08:00
NeilBrown	4dbcdc751c	[PATCH] md: count corrected read errors per drive Store this total in superblock (As appropriate), and make it available to userspace via sysfs. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:09 -08:00
NeilBrown	d9d166c2a9	[PATCH] md: allow array level to be set textually via sysfs Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:09 -08:00
NeilBrown	2989ddbd6e	[PATCH] md: make a couple of names in md.c static .. because they aren't used outside md.c Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:07 -08:00
NeilBrown	1345b1d8ad	[PATCH] md: define and use safe_put_page for md md sometimes call put_page on NULL pointers (treating it like kfree). This is not safe, so define and use a 'safe_put_page' which checks for NULL. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:07 -08:00
NeilBrown	2604b703b6	[PATCH] md: remove personality numbering from md md supports multiple different RAID level, each being implemented by a 'personality' (which is often in a separate module). These personalities have fairly artificial 'numbers'. The numbers are use to: 1- provide an index into an array where the various personalities are recorded 2- identify the module (via an alias) which implements are particular personality. Neither of these uses really justify the existence of personality numbers. The array can be replaced by a linked list which is searched (array lookup only happens very rarely). Module identification can be done using an alias based on level rather than 'personality' number. The current 'raid5' modules support two level (4 and 5) but only one personality. This slight awkwardness (which was handled in the mapping from level to personality) can be better handled by allowing raid5 to register 2 personalities. With this change in place, the core md module does not need to have an exhaustive list of all possible personalities, so other personalities can be added independently. This patch also moves the check for chunksize being non-zero into the ->run routines for the personalities that need it, rather than having it in core-md. This has a side effect of allowing 'faulty' and 'linear' not to have a chunk-size set. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:06 -08:00
NeilBrown	fccddba060	[PATCH] md: tidy up raid5/6 hash table code - replace open-coded hash chain with hlist macros - Fix hash-table size at one page - it is already quite generous, so there will never be a need to use multiple pages, so no need for __get_free_pages No functional change. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:06 -08:00
NeilBrown	0eb3ff12aa	[PATCH] md: raid10 read-error handling - resync and read-only Add in correct read-error handling for resync and read-only situations. When read-only, we don't over-write, so we need to mark the failed drive in the r10_bio so we don't re-try it. During resync, we always read all blocks, so if there is a read error, we simply over-write it with the good block that we found (assuming we found one). Note that the recovery case still isn't handled in an interesting way. There is nothing useful to do for the 2-copies case. If there are 3 or more copies, then we could try reading from one of the non-missing copies, but this is a bit complicated and very rarely would be used, so I'm leaving it for now. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:05 -08:00
NeilBrown	4443ae10ca	[PATCH] md: auto-correct correctable read errors in raid10 Largely just a cross-port from raid1. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:05 -08:00
NeilBrown	9910f16af3	[PATCH] md: fix up some rdev rcu locking in raid5/6 There is this "FIXME" comment with a typo in it!! that been annoying me for days, so I just had to remove it. conf->disks[i].rdev should only be accessed if - we know we hold a reference or - the mddev->reconfig_sem is down or - we have a rcu_readlock handle_stripe was referencing rdev in three places without any of these. For the first two, get an rcu_readlock. For the last, the same access (md_sync_acct call) is made a little later after the rdev has been claimed under and rcu_readlock, if R5_Syncio is set. So just use that access... However R5_Syncio isn't really needed as the 'syncing' variable contains the same information. So use that instead. Issues, comment, and fix are identical in raid5 and raid6. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:04 -08:00
NeilBrown	cf30a473a0	[PATCH] md: handle errors when read-only Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:04 -08:00
NeilBrown	ddaf22abaa	[PATCH] md: attempt to auto-correct read errors in raid1 On a read-error we suspend the array, then synchronously read the block from other arrays until we find one where we can read it. Then we try writing the good data back everywhere and make sure it works. If any write or subsequent read fails, only then do we fail the device out of the array. To be able to suspend the array, we need to also keep track of how many requests are queued for handling by raid1d. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:03 -08:00
NeilBrown	ca65b73bd9	[PATCH] md: fix raid6 resync check/repair code raid6 currently does not check the P/Q syndromes when doing a resync, it just calculates the correct value and writes it. Doing the check can reduce writes (often to 0) for a resync, and it is needed to properly implement the echo check > sync_action operation. This patch implements the appropriate checks and tidies up some related code. It also allows raid6 user-requested resync to bypass the intent bitmap. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:03 -08:00
NeilBrown	6cce3b23f6	[PATCH] md: write intent bitmap support for raid10 Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:03 -08:00
NeilBrown	6ff8d8ec06	[PATCH] md: allow dirty raid[456] arrays to be started at boot See patch to md.txt for more details Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:02 -08:00
NeilBrown	0a27ec96b6	[PATCH] md: improve raid10 "IO Barrier" concept raid10 needs to put up a barrier to new requests while it does resync or other background recovery. The code for this is currently open-coded, slighty obscure by its use of two waitqueues, and not documented. This patch gathers all the related code into 4 functions, and includes a comment which (hopefully) explains what is happening. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:02 -08:00
NeilBrown	17999be4aa	[PATCH] md: improve raid1 "IO Barrier" concept raid1 needs to put up a barrier to new requests while it does resync or other background recovery. The code for this is currently open-coded, slighty obscure by its use of two waitqueues, and not documented. This patch gathers all the related code into 4 functions, and includes a comment which (hopefully) explains what is happening. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:01 -08:00
Alasdair G Kergon	6da487dcc0	[PATCH] device-mapper ioctl: add skip lock_fs flag Add ioctl DM_SKIP_LOCKFS_FLAG for userspace to request that lock_fs is bypassed when suspending a device. There's no change to the behaviour of existing code that doesn't know about the new flag. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:01 -08:00
David Shaw	a334de2866	[PATCH] knfsd: check error status from vfs_getattr and i_op->fsync Both vfs_getattr and i_op->fsync return error statuses which nfsd was largely ignoring. This as noticed when exporting directories using fuse. This patch cleans up most of the offences, which involves moving the call to vfs_getattr out of the xdr encoding routines (where it is too late to report an error) into the main NFS procedure handling routines. There is still a called to vfs_gettattr (related to the ACL code) where the status is ignored, and called to nfsd_sync_dir don't check return status either. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:59 -08:00
Jan Kara	f93ea411b7	[PATCH] jbd: split checkpoint lists Split the checkpoint list of the transaction into two lists. In the first list we keep the buffers that need to be submitted for IO. In the second list are kept buffers that were already submitted and we just have to wait for the IO to complete. This should simplify a handling of checkpoint lists a bit and can eventually be also a performance gain. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:59 -08:00
Adrian Bunk	81684ee645	[PATCH] include/linux/parport_pc.h: "extern inline" -> "static inline" "extern inline" doesn't make much sense. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:58 -08:00
Marko Kohtala	110bee75d2	[PATCH] parport: DEBUG_PARPORT build fix Add missing "struct" keyword preventing compilation with DEBUG_PARPORT defined. Also add some "const". Signed-off-by: Marko Kohtala <marko.kohtala@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:57 -08:00
Marko Kohtala	742ec650e9	[PATCH] parport: phase fixes Did not move the parport interface properly into IEEE1284_PH_REV_IDLE phase at end of data due to comparing bytes with nibbles. Internal phase IEEE1284_PH_HBUSY_DNA became unused, so remove it. Signed-off-by: Marko Kohtala <marko.kohtala@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:56 -08:00
Miklos Szeredi	3ec870d524	[PATCH] fuse: make maximum write data configurable Make the maximum size of write data configurable by the filesystem. The previous fixed 4096 limit only worked on architectures where the page size is less or equal to this. This change make writing work on other architectures too, and also lets the filesystem receive bigger write requests in direct_io mode. Normal writes which go through the page cache are still limited to a page sized chunk per request. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:56 -08:00
Miklos Szeredi	1d3d752b47	[PATCH] fuse: clean up request size limit checking Change the way a too large request is handled. Until now in this case the device read returned -EINVAL and the operation returned -EIO. Make it more flexibible by not returning -EINVAL from the read, but restarting it instead. Also remove the fixed limit on setxattr data and let the filesystem provide as large a read buffer as it needs to handle the extended attribute data. The symbolic link length is already checked by VFS to be less than PATH_MAX, so the extra check against FUSE_SYMLINK_MAX is not needed. The check in fuse_create_open() against FUSE_NAME_MAX is not needed, since the dentry has already been looked up, and hence the name already checked. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:56 -08:00
Miklos Szeredi	de5f120255	[PATCH] fuse: add frsize to statfs reply Add 'frsize' member to the statfs reply. I'm not sure if sending f_fsid will ever be needed, but just in case leave some space at the end of the structure, so less compatibility mess would be required. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:55 -08:00
Miklos Szeredi	45714d6561	[PATCH] fuse: bump interface version Change interface version to 7.4. Following changes will need backward compatibility support, so store the minor version returned by userspace. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:55 -08:00
Markus Lidel	dcceafe25a	[PATCH] I2O: Bugfixes - Removed some kmalloc's with __GFP_ZERO and replace it with memset() because it didn't work properly. - Fixed returned message frame in i2o_cfg_passthru() which caused raidutils to display wrong error message in case a disk was missing. - Fixed size of printk() in i2o_scsi.c. - Fixed get_device() and put_device() in probing of the I2O controller. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:54 -08:00
Markus Lidel	24791bd48f	[PATCH] I2O: Remove wrong I2O device class Removed wrong I2O device class, which was only needed to add sysfs attributes. Signed-off-by: Markus Lidel <Markus.Lidel@shadowconnect.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:54 -08:00
Markus Lidel	a1a5ea70a6	[PATCH] I2O: changed I2O API to create I2O messages in kernel memory Changed the I2O API to create I2O messages first in kernel memory and then transfer it at once over the PCI bus instead of sending each quad-word over the PCI bus. Signed-off-by: Markus Lidel <Markus.Lidel@shadowconnect.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:53 -08:00
Martin Schwidefsky	347a8dc3b8	[PATCH] s390: cleanup Kconfig Sanitize some s390 Kconfig options. We have ARCH_S390, ARCH_S390X, ARCH_S390_31, 64BIT, S390_SUPPORT and COMPAT. Replace these 6 options by S390, 64BIT and COMPAT. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:53 -08:00
Rafael J. Wysocki	3a291a20bd	[PATCH] mm: add a new function (needed for swap suspend) This adds the function get_swap_page_of_type() allowing us to specify an index in swap_info[] and select a swap_info_struct structure to be used for allocating a swap page. This function (or another one of similar functionality) will be necessary for implementing the image-writing part of swsusp in the user space. It can also be used for simplifying the current in-kernel implementation of the image-writing part of swsusp. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:43 -08:00
Rafael J. Wysocki	72a97e0839	[PATCH] swsusp: improve freeing of memory This patch makes swsusp free only as much memory as needed to complete the suspend and not as much as possible. In the most of cases this should speed up the suspend and make the system much more responsive after resume, especially if a GUI (eg. X Windows) is used. If needed, the old behavior (ie to free as much memory as possible during suspend) can be restored by unsetting FAST_FREE in power.h Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:40 -08:00
Rafael J. Wysocki	7088a5c001	[PATCH] swsusp: introduce the swap map structure This patch introduces the swap map structure that can be used by swsusp for keeping tracks of data pages written to the swap. The structure itself is described in a comment within the patch. The overall idea is to reduce the amount of metadata written to the swap and to write and read the image pages sequentially, in a file-alike way. This makes the swap-handling part of swsusp fairly independent of its snapshot-handling part and will hopefully allow us to completely separate these two parts in the future. This patch is needed to remove the suspend image size limit imposed by the limited size of the swsusp_info structure, which is essential for x86-64 systems with more than 512 MB of RAM. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:40 -08:00
Ivan Kokshaysky	eee45269b0	[PATCH] Alpha: convert to generic irq framework (generic part) Thanks to Christoph for doing most of the work. This allows automatic SMP IRQ affinity assignment other than default "all interrupts on all CPUs" which is rather expensive. This might be useful if the hardware can be programmed to distribute interrupts among different CPUs, like Alpha does. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Christoph Hellwig <hch@lst.de> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:40 -08:00
Jordan Crouse	f90b811603	[PATCH] Base support for AMD Geode GX/LX processors Provide basic support for the AMD Geode GX and LX processors. Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:38 -08:00
David Howells	7ee1dd3fee	[PATCH] FRV: Make futex code compilable on nommu [try #2 ] Make the futex code compilable and usable on NOMMU by making the attempt to handle page faults conditional on CONFIG_MMU. If this is not enabled, then we can assume that EFAULT returned from futex_atomic_op_inuser() is not recoverable, and that the address lies outside of valid memory. handle_mm_fault() is made to BUG if called on NOMMU without attempting to invoke the actual handler (__handle_mm_fault). Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:33 -08:00
David Howells	b0e15190ea	[PATCH] NOMMU: Make SYSV IPC SHM use ramfs facilities on NOMMU The attached patch makes the SYSV IPC shared memory facilities use the new ramfs facilities on a no-MMU kernel. The following changes are made: (1) There are now shmem_mmap() and shmem_get_unmapped_area() functions to allow the IPC SHM facilities to commune with the tiny-shmem and shmem code. (2) ramfs files now need resizing using do_truncate() rather than by modifying the inode size directly (see shmem_file_setup()). This causes ramfs to attempt to bind a block of pages of sufficient size to the inode. (3) CONFIG_SYSVIPC is no longer contingent on CONFIG_MMU. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:32 -08:00
David Howells	642fb4d1f1	[PATCH] NOMMU: Provide shared-writable mmap support on ramfs The attached patch makes ramfs support shared-writable mmaps by: (1) Attempting to perform a contiguous block allocation to the requested size when truncate attempts to increase the file from zero size, such as happens when: fd = shm_open("/file/on/ramfs", ...): ftruncate(fd, size_requested); addr = mmap(NULL, subsize, PROT_READ\|PROT_WRITE\|PROT_EXEC, MAP_SHARED, fd, offset); (2) Permitting any shared-writable mapping over any contiguous set of extant pages. get_unmapped_area() will return the address into the actual ramfs pages. The mapping may start anywhere and be of any size, but may not go over the end of file. Multiple mappings may overlap in any way. (3) Not permitting a file to be shrunk if it would truncate any shared mappings (private mappings are copied). Thus this patch provides support for POSIX shared memory on NOMMU kernels, with certain limitations such as there being a large enough block of pages available to support the allocation and it only working on directly mappable filesystems. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:32 -08:00
David Howells	8d9067bda9	[PATCH] Keys: Remove key duplication Remove the key duplication stuff since there's nothing that uses it, no way to get at it and it's awkward to deal with for LSM purposes. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:29 -08:00
Nick Piggin	b09eb1c06a	[PATCH] mm: page_state opt docs Comment the new locking rules for page_state statistics. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:29 -08:00
Nick Piggin	a74609fafa	[PATCH] mm: page_state opt Optimise page_state manipulations by introducing interrupt unsafe accessors to page_state fields. Callers must provide their own locking (either disable interrupts or not update from interrupt context). Switch over the hot callsites that can easily be moved under interrupts off sections. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:29 -08:00
Christoph Lameter	d3cb487149	[PATCH] atomic_long_t & include/asm-generic/atomic.h V2 Several counters already have the need to use 64 atomic variables on 64 bit platforms (see mm_counter_t in sched.h). We have to do ugly ifdefs to fall back to 32 bit atomic on 32 bit platforms. The VM statistics patch that I am working on will also make more extensive use of atomic64. This patch introduces a new type atomic_long_t by providing definitions in asm-generic/atomic.h that works similar to the c "long" type. Its 32 bits on 32 bit platforms and 64 bits on 64 bit platforms. Also cleans up the determination of the mm_counter_t in sched.h. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:29 -08:00
Christoph Lameter	4be38e351c	[PATCH] mm: move determination of policy_zone into page allocator Currently the function to build a zonelist for a BIND policy has the side effect to set the policy_zone. This seems to be a bit strange. policy zone seems to not be initialized elsewhere and therefore 0. Do we police ZONE_DMA if no bind policy has been used yet? This patch moves the determination of the zone to apply policies to into the page allocator. We determine the zone while building the zonelist for nodes. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:28 -08:00
Con Kolivas	f3fe65122d	[PATCH] mm: add populated_zone() helper There are numerous places we check whether a zone is populated or not. Provide a helper function to check for populated zones and convert all checks for zone->present_pages. Signed-off-by: Con Kolivas <kernel@kolivas.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:28 -08:00
Nick Piggin	9617d95e6e	[PATCH] mm: rmap optimisation Optimise rmap functions by minimising atomic operations when we know there will be no concurrent modifications. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:27 -08:00
Nick Piggin	9328b8faae	[PATCH] mm: dma32 zone statistics Add dma32 to zone statistics. Also attempt to arrange struct page_state a bit better (visually). Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:26 -08:00
Andrew Morton	7756b9e4e3	[PATCH] kill last zone_reclaim() bits Remove the last bits of Martin's ill-fated sys_set_zone_reclaim(). Cc: Martin Hicks <mort@wildopensource.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:26 -08:00
Ravikiran G Thirumalai	008857c1a4	[PATCH] Cleanup bootmem allocator and fix alloc_bootmem_low Patch cleans up the alloc_bootmem fix for swiotlb. Patch removes alloc_bootmem__limit api and fixes alloc_boot_low api to do the right thing -- allocate from low32 memory. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:26 -08:00
Nick Piggin	2d92c5c915	[PATCH] mm: remove pcp low struct per_cpu_pages.low is useless. Remove it. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:25 -08:00
Andy Whitcroft	161599ff39	[PATCH] sparsemem: provide pfn_to_nid Before SPARSEMEM is initialised we cannot provide an efficient pfn_to_nid() implmentation; before initialisation is complete we use early_pfn_to_nid() to provide location information. Until recently there was no non-init user of this functionality. Provide a post init pfn_to_nid() implementation. Note that this implmentation assumes that the pfn passed has been validated with pfn_valid(). The current single user of this function already has this check. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:24 -08:00
Andy Whitcroft	2bdaf115b1	[PATCH] flatmem split out memory model There are three places we define pfn_to_nid(). Two in linux/mmzone.h and one in asm/mmzone.h. These in essence represent the three memory models. The definition in linux/mmzone.h under !NEED_MULTIPLE_NODES is both the FLATMEM definition and the optimisation for single NUMA nodes; the one under SPARSEMEM is the NUMA sparsemem one; the one in asm/mmzone.h under DISCONTIGMEM is the discontigmem one. This is not in the least bit obvious, particularly the connection between the non-NUMA optimisations and the memory models. Two patches: flatmem-split-out-memory-model: simplifies the selection of pfn_to_nid() implementations. The selection is based primarily off the memory model selected. Optimisations for non-NUMA are applied where needed. sparse-provide-pfn_to_nid: implement pfn_to_nid() for SPARSEMEM This patch: pfn_to_nid is memory model specific The pfn_to_nid() call is memory model specific. It represents the locality identifier for the memory passed. Classically this would be a NUMA node, but not a chunk of memory under DISCONTIGMEM. The SPARSEMEM and FLATMEM memory model non-NUMA versions of pfn_to_nid() are folded together under NEED_MULTIPLE_NODES, while DISCONTIGMEM has its own optimisation. This is all very confusing. This patch splits out each implementation of pfn_to_nid() so that we can see them and the optimisations to each. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:24 -08:00
Russell King	03b00ebcc8	[PATCH] Shut up warnings in ipc/shm.c Fix two warnings in ipc/shm.c ipc/shm.c:122: warning: statement with no effect ipc/shm.c:560: warning: statement with no effect by converting the macros to empty inline functions. For safety, let's do all three. This also has the advantage that typechecking gets performed even without CONFIG_SHMEM enabled. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:24 -08:00
Mike Kravetz	a94b3ab7ea	[PATCH] mm: remove arch independent NODES_SPAN_OTHER_NODES The NODES_SPAN_OTHER_NODES config option was created so that DISCONTIGMEM could handle pSeries numa layouts. However, support for DISCONTIGMEM has been replaced by SPARSEMEM on powerpc. As a result, this config option and supporting code is no longer needed. I have already sent a patch to Paul that removes the option from powerpc specific code. This removes the arch independent piece. Doesn't really matter which is applied first. Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:24 -08:00
Andy Whitcroft	d5afa6dcf7	[PATCH] mm: pfn_to_pgdat not used in common code pfn_to_pgdat() isn't used in common code. Remove definition. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:24 -08:00
Andy Whitcroft	9f3fd602ae	[PATCH] mm: kvaddr_to_nid not used in common code kvaddr_to_nid() isn't used in common code nor in i386 code. Remove these definitions. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:23 -08:00
Christoph Lameter	21abb1478a	[PATCH] Remove old node based policy interface from mempolicy.c mempolicy.c contains provisional interface for huge page allocation based on node numbers. This is in use in SLES9 but was never used (AFAIK) in upstream versions of Linux. Huge page allocations now use zonelists to figure out where to allocate pages. The use of zonelists allows us to find the closest hugepage which was the consideration of the NUMA distance for huge page allocations. Remove the obsolete functions. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@muc.de> Acked-by: William Lee Irwin III <wli@holomorphy.com> Cc: Adam Litke <agl@us.ibm.com> Acked-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:23 -08:00
Christoph Lameter	5da7ca8607	[PATCH] Add NUMA policy support for huge pages. The huge_zonelist() function in the memory policy layer provides an list of zones ordered by NUMA distance. The hugetlb layer will walk that list looking for a zone that has available huge pages but is also in the nodeset of the current cpuset. This patch does not contain the folding of find_or_alloc_huge_page() that was controversial in the earlier discussion. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@muc.de> Acked-by: William Lee Irwin III <wli@holomorphy.com> Cc: Adam Litke <agl@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:23 -08:00
Badari Pulavarty	f6b3ec238d	[PATCH] madvise(MADV_REMOVE): remove pages from tmpfs shm backing store Here is the patch to implement madvise(MADV_REMOVE) - which frees up a given range of pages & its associated backing store. Current implementation supports only shmfs/tmpfs and other filesystems return -ENOSYS. "Some app allocates large tmpfs files, then when some task quits and some client disconnect, some memory can be released. However the only way to release tmpfs-swap is to MADV_REMOVE". - Andrea Arcangeli Databases want to use this feature to drop a section of their bufferpool (shared memory segments) - without writing back to disk/swap space. This feature is also useful for supporting hot-plug memory on UML. Concerns raised by Andrew Morton: - "We have no plan for holepunching! If we _do_ have such a plan (or might in the future) then what would the API look like? I think sys_holepunch(fd, start, len), so we should start out with that." - Using madvise is very weird, because people will ask "why do I need to mmap my file before I can stick a hole in it?" - None of the other madvise operations call into the filesystem in this manner. A broad question is: is this capability an MM operation or a filesytem operation? truncate, for example, is a filesystem operation which sometimes has MM side-effects. madvise is an mm operation and with this patch, it gains FS side-effects, only they're really, really significant ones." Comments: - Andrea suggested the fs operation too but then it's more efficient to have it as a mm operation with fs side effects, because they don't immediatly know fd and physical offset of the range. It's possible to fixup in userland and to use the fs operation but it's more expensive, the vmas are already in the kernel and we can use them. Short term plan & Future Direction: - We seem to need this interface only for shmfs/tmpfs files in the short term. We have to add hooks into the filesystem for correctness and completeness. This is what this patch does. - In the future, plan is to support both fs and mmap apis also. This also involves (other) filesystem specific functions to be implemented. - Current patch doesn't support VM_NONLINEAR - which can be addressed in the future. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Andrea Arcangeli <andrea@suse.de> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:22 -08:00
Hans Reiser	d7339071f6	[PATCH] reiser4: vfs: add truncate_inode_pages_range() This patch makes truncate_inode_pages_range from truncate_inode_pages. truncate_inode_pages became a one-liner call to truncate_inode_pages_range. Reiser4 needs truncate_inode_pages_ranges because it tries to keep correspondence between existences of metadata pointing to data pages and pages to which those metadata point to. So, when metadata of certain part of file is removed from filesystem tree, only pages of corresponding range are to be truncated. (Needed by the madvise(MADV_REMOVE) patch) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:22 -08:00
Herbert Xu	4b2f0260c7	[PATCH] nbd: fix TX/RX race condition Janos Haar of First NetCenter Bt. reported numerous crashes involving the NBD driver. With his help, this was tracked down to bogus bio vectors which in turn was the result of a race condition between the receive/transmit routines in the NBD driver. The bug manifests itself like this: CPU0 CPU1 do_nbd_request add req to queuelist nbd_send_request send req head for each bio kmap send nbd_read_stat nbd_find_request nbd_end_request kunmap When CPU1 finishes nbd_end_request, the request and all its associated bio's are freed. So when CPU0 calls kunmap whose argument is derived from the last bio, it may crash. Under normal circumstances, the race occurs only on the last bio. However, if an error is encountered on the remote NBD server (such as an incorrect magic number in the request), or if there were a bug in the server, it is possible for the nbd_end_request to occur any time after the request's addition to the queuelist. The following patch fixes this problem by making sure that requests are not added to the queuelist until after they have been completed transmission. In order for the receiving side to be ready for responses involving requests still being transmitted, the patch introduces the concept of the active request. When a response matches the current active request, its processing is delayed until after the tranmission has come to a stop. This has been tested by Janos and it has been successful in curing this race condition. From: Herbert Xu <herbert@gondor.apana.org.au> Here is an updated patch which removes the active_req wait in nbd_clear_queue and the associated memory barrier. I've also clarified this in the comment. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cc: <djani22@dynamicweb.hu> Cc: Paul Clements <Paul.Clements@SteelEye.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:20 -08:00
Jens Axboe	15fc858a00	[BLOCK] Correct blk_execute_rq_nowait() prototype	2006-01-06 10:00:50 +01:00
Tejun Heo	9a3dccc425	[BLOCK] add FUA support to libata Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-06 09:56:18 +01:00
Tejun Heo	797e7dbbee	[BLOCK] reimplement handling of barrier request Reimplement handling of barrier requests. * Flexible handling to deal with various capabilities of target devices. * Retry support for falling back. * Tagged queues which don't support ordered tag can do ordered. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-06 09:51:03 +01:00
Tejun Heo	8ffdc6550c	[BLOCK] add @uptodate to end_that_request_last() and @error to rq_end_io_fn() add @uptodate argument to end_that_request_last() and @error to rq_end_io_fn(). there's no generic way to pass error code to request completion function, making generic error handling of non-fs request difficult (rq->errors is driver-specific and each driver uses it differently). this patch adds @uptodate to end_that_request_last() and @error to rq_end_io_fn(). for fs requests, this doesn't really matter, so just using the same uptodate argument used in the last call to end_that_request_first() should suffice. imho, this can also help the generic command-carrying request jens is working on. Signed-off-by: tejun heo <htejun@gmail.com> Signed-Off-By: Jens Axboe <axboe@suse.de>	2006-01-06 09:49:03 +01:00
Jean Delvare	7c72ccf09b	[PATCH] i2c: i2c-nforce2 add nforce4 MCP-04 device ID One more supported PCI ID for the i2c-nforce2 driver. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-05 22:16:27 -08:00
Jean Delvare	04b4b8434a	[PATCH] i2c: driver ID list cleanups Cleanups to i2c driver ID list: * Remove mostly bogus comments about driver ID ranges. * Drop experimental driver IDs, as the concept is pretty broken. * Drop now unused IDs of non-I2C (ISA) drivers. * Drop a few more IDs which are no more used. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-05 22:16:26 -08:00
Rudolf Marek	734a12a366	[PATCH] hwmon: add VRM/VID support for some VIA CPUs This patch adds the VIA CENTAUR CPUs to detection table. Table was updated to treat future Intel x86 CPUs as VRD10. Stepping field was added, because some VIA CPUs have different VRM specs across stepping. I changed the vrm type to u8 because all drivers use u8 anyway. Signed-off-by: Rudolf Marek <r.marek@sh.cvut.cz> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-05 22:16:26 -08:00
Greg Kroah-Hartman	de59cf9ed4	[PATCH] I2C: Make i2c_add_driver automatically set the proper module owner This prevents i2c drivers from messing up and forgetting to set the module owner of their driver. It also reduces the size of their drivers by one line :) Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Cc: Jean Delvare <khali@linux-fr.org>	2006-01-05 22:16:24 -08:00
Laurent Riffard	35d8b2e6b8	[PATCH] i2c: Drop i2c_driver.{owner,name}, 1 of 11 We should use the i2c_driver.driver's .name and .owner fields instead of the i2c_driver's ones. This patch updates the core of the i2c drivers: it removes .name and .owner fields from the struct i2c_device and modify various functions to use struct device fields instead. Signed-off-by: Laurent Riffard <laurent.riffard@free.fr> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-05 22:16:22 -08:00
Jean Delvare	482c788ded	[PATCH] i2c: i2c_get_client is gone The i2c_get_client function doesn't exist anymore, so we shouldn't have a definition for it in i2c.h. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-05 22:16:22 -08:00
Jean Delvare	cf02df7702	[PATCH] i2c: Rework client usage count, 3 of 3 Do not limit the usage count of i2c clients to 1. In other words, change the client usage count behavior from the old I2C_CLIENT_ALLOW_USE to the old I2C_CLIENT_ALLOW_MULTIPLE_USE. The rationale is that no driver actually needs the limiting behavior, and the unlimiting behavior is slightly easier to implement. Update the documentation to reflect this change. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-05 22:16:22 -08:00
Jean Delvare	cde7859bda	[PATCH] i2c: Rework client usage count, 2 of 3 Make I2C_CLIENT_ALLOW_USE the default for all i2c clients. It doesn't hurt if the usage count is actually never used for any given driver, and allows for nice code simplifications in i2c-core. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-05 22:16:22 -08:00
Jean Delvare	cb748fb201	[PATCH] i2c: Rework client usage count, 1 of 3 No i2c client uses the I2C_CLIENT_ALLOW_MULTIPLE_USE flag, drop it. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-05 22:16:22 -08:00
Jean Delvare	5d7b851dcc	[PATCH] i2c: Drop i2c_driver.flags, 3 of 3 The flags member of the i2c_driver structure is no more used. Drop it. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-05 22:16:22 -08:00
Jean Delvare	8a9947552d	[PATCH] i2c: Drop i2c_driver.flags, 2 of 3 Just about every i2c chip driver sets the I2C_DF_NOTIFY flag, so we can simply make it the default and drop the flag. If any driver really doesn't want to be notified when i2c adapters are added, that driver can simply omit to set .attach_adapter. This approach is also more robust as it prevents accidental NULL pointer dereferences. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-05 22:16:21 -08:00
Jean Delvare	ff179c8cf5	[PATCH] i2c: Drop i2c_driver.flags, 1 of 3 The I2C_DF_DUMMY flag is gone since 2.5.70, it's about time to drop all ifdef'd out references thereto. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-05 22:16:21 -08:00
Linus Torvalds	29552b1462	Merge http://oss.oracle.com/git/ocfs2	2006-01-05 20:43:11 -08:00
Patrick McHardy	22dea562bb	[NETFILTER]: Export ip6_masked_addrcmp, don't pass IPv6 addresses on stack Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-05 12:21:34 -08:00
Patrick McHardy	b777e0ce74	[NETFILTER]: make ipv6_find_hdr() find transport protocol header The original ipv6_find_hdr() finds the specified header in IPv6 packets. This makes it possible to get transport header so that we can kill similar loop in ip6_match_packet(). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-05 12:21:16 -08:00
Patrick McHardy	a9b305c4e5	[NETFILTER]: ctnetlink: Fix dumping of helper name Properly dump the helper name instead of internal kernel data. Based on patch by Marcus Sundberg <marcus@ingate.com>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-05 12:20:02 -08:00
Pablo Neira Ayuso	c1d10adb4a	[NETFILTER]: Add ctnetlink port for nf_conntrack Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-05 12:19:05 -08:00
Linus Torvalds	db9edfd7e3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6 Trivial manual merge fixup for usb_find_interface clashes.	2006-01-04 18:44:12 -08:00
Linus Torvalds	4da5cc2cec	Merge git://git.kernel.org/pub/scm/linux/kernel/git/perex/alsa	2006-01-04 16:38:36 -08:00
Linus Torvalds	25c862cc9e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild	2006-01-04 16:36:52 -08:00
Linus Torvalds	52347f4e81	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial	2006-01-04 16:34:57 -08:00
Linus Torvalds	1cb9e8e01d	Merge branch 'upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev	2006-01-04 16:32:33 -08:00
Linus Torvalds	d779188d2b	Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6	2006-01-04 16:31:56 -08:00
Linus Torvalds	f61ea1b0c8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6	2006-01-04 16:30:12 -08:00
Linus Torvalds	d347da0def	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	2006-01-04 16:27:41 -08:00
Linus Torvalds	c6c88bbde4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6	2006-01-04 16:25:44 -08:00
Linus Torvalds	0356dbb7fe	Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq	2006-01-04 16:21:26 -08:00
Dmitry Torokhov	93ce3061be	[PATCH] Driver Core: Add platform_device_del() Driver core: add platform_device_del function Having platform_device_del90 allows more straightforward error handling code in drivers registering platform devices. Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 16:18:09 -08:00
Rusty Russell	e39b84337b	[PATCH] Input: fix add modalias support build error Fix build when scripts/mod/file2alias.c includes linux/input.h, which tries to include /usr/include/linux/mod_devicetable.h: In file included from scripts/mod/file2alias.c:40: include/linux/input.h:21:35: linux/mod_devicetable.h: No such file or directory make[2]: *** [scripts/mod/file2alias.o] Error 1 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 16:18:09 -08:00
Rusty Russell	1d8f430c15	[PATCH] Input: add modalias support Here's the patch for modalias support for input classes. It uses comma-separated numbers, and doesn't describe all the potential keys (no module currently cares, and that would make the strings huge). The changes to input.h are to move the definitions needed by file2alias outside __KERNEL__. I chose not to move those definitions to mod_devicetable.h, because there are so many that it might break compile of something else in the kernel. The rest is fairly straightforward. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> CC: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 16:18:09 -08:00
akpm@osdl.org	f743ca5e10	[PATCH] kobject_uevent CONFIG_NET=n fix lib/lib.a(kobject_uevent.o)(.text+0x25f): In function `kobject_uevent': : undefined reference to `__alloc_skb' lib/lib.a(kobject_uevent.o)(.text+0x2a1): In function `kobject_uevent': : undefined reference to `skb_over_panic' lib/lib.a(kobject_uevent.o)(.text+0x31d): In function `kobject_uevent': : undefined reference to `skb_over_panic' lib/lib.a(kobject_uevent.o)(.text+0x356): In function `kobject_uevent': : undefined reference to `netlink_broadcast' lib/lib.a(kobject_uevent.o)(.init.text+0x9): In function `kobject_uevent_init': : undefined reference to `netlink_kernel_create' make: *** [.tmp_vmlinux1] Error 1 Netlink is unconditionally enabled if CONFIG_NET, so that's OK. kobject_uevent.o is compiled even if !CONFIG_HOTPLUG, which is lazy. Let's compound the sin. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 16:18:08 -08:00
Kay Sievers	312c004d36	[PATCH] driver core: replace "hotplug" by "uevent" Leave the overloaded "hotplug" word to susbsystems which are handling real devices. The driver core does not "plug" anything, it just exports the state to userspace and generates events. Signed-off-by: Kay Sievers <kay.sievers@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 16:18:08 -08:00
Kay Sievers	5f123fbd80	[PATCH] merge kobject_uevent and kobject_hotplug The distinction between hotplug and uevent does not make sense these days, netlink events are the default. udev depends entirely on netlink uevents. Only during early boot and in initramfs, /sbin/hotplug is needed. So merge the two functions and provide only one interface without all the options. The netlink layer got a nice generic interface with named slots recently, which is probably a better facility to plug events for subsystem specific events. Also the new poll() interface to /proc/mounts is a nicer way to notify about changes than sending events through the core. The uevents should only be used for driver core related requests to userspace now. Signed-off-by: Kay Sievers <kay.sievers@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 16:18:07 -08:00
Kay Sievers	033b96fd30	[PATCH] remove mount/umount uevents from superblock handling The names of these events have been confusing from the beginning on, as they have been more like claim/release events. We needed these events for noticing HAL if storage devices have been mounted. Thanks to Al, we have the proper solution now and can poll() /proc/mounts instead to get notfied about mount tree changes. Signed-off-by: Kay Sievers <kay.sievers@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 16:18:07 -08:00
Kay Sievers	0296b22813	[PATCH] remove CONFIG_KOBJECT_UEVENT option It makes zero sense to have hotplug, but not the netlink events enabled today. Remove this option and merge the kobject_uevent.h header into the kobject.h header file. Signed-off-by: Kay Sievers <kay.sievers@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 16:18:07 -08:00
Matthew Dharm	e80b0fade0	[PATCH] USB Storage: add alauda support This patch adds another usb-storage subdriver, which supports two fairly old dual-XD/SmartMedia reader-writers (USB1.1 devices). This driver was written by Daniel Drake <dsd@gentoo.org> -- he notes that he wrote this driver without specs, however a vendor-supplied GPL driver for the previous generation of products ("sma03") did prove to be quite useful, as did the sddr09 driver which also has to deal with low-level physical block layout on SmartMedia. The original patch has been reformed by me, as it clashed with the libusual patches. We really need to consolidate some of this common SmartMedia code, and get together with the MTD guys to share it with them as well. Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: Matthew Dharm <mdharm-usb@one-eyed-alien.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 13:51:42 -08:00
Alan Stern	12c3da346e	[PATCH] USB: Store port number in usb_device This patch (as610) adds a field to struct usb_device to store the device's port number. This allows us to remove several loops in the hub driver (searching for a particular device among all the entries in the parent's array of children). Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 13:48:35 -08:00
Alan Stern	55c527187c	[PATCH] USB: Consider power budget when choosing configuration This patch (as609) changes the way we keep track of power budgeting for USB hubs and devices, and it updates the choose_configuration routine to take this information into account. (This is something we should have been doing all along.) A new field in struct usb_device holds the amount of bus current available from the upstream port, and the usb_hub structure keeps track of the current available for each downstream port. Two new rules for configuration selection are added: Don't select a self-powered configuration when only bus power is available. Don't select a configuration requiring more bus power than is available. However the first rule is #if-ed out, because I found that the internal hub in my HP USB keyboard claims that its only configuration is self-powered. The rule would prevent the configuration from being chosen, leaving the hub & keyboard unconfigured. Since similar descriptor errors may turn out to be fairly common, it seemed wise not to include a rule that would break automatic configuration unnecessarily for such devices. The second rule may also trigger unnecessarily, although this should be less common. More likely it will annoy people by sometimes failing to accept configurations that should never have been chosen in the first place. The patch also changes usbcore's reaction when no configuration is suitable. Instead of raising an error and rejecting the device, now the core will simply leave the device unconfigured. People can always work around such problems by installing configurations manually through sysfs. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 13:48:34 -08:00
Alan Stern	9ad3d6ccf5	[PATCH] USB: Remove USB private semaphore This patch (as605) removes the private udev->serialize semaphore, relying instead on the locking provided by the embedded struct device's semaphore. The changes are confined to the core, except that the usb_trylock_device routine now uses the return convention of down_trylock rather than down_read_trylock (they return opposite values for no good reason). A couple of other associated changes are included as well: Now that we aren't concerned about HCDs that avoid using the hcd glue layer, usb_disconnect no longer needs to acquire the usb_bus_lock -- that can be done by usb_remove_hcd where it belongs. Devices aren't locked over the same scope of code in usb_new_device and hub_port_connect_change as they used to be. This shouldn't cause any trouble. Along with the preceding driver core patch, this needs a lot of testing. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 13:48:34 -08:00
Greg Kroah-Hartman	75318d2d7c	[PATCH] USB: remove .owner field from struct usb_driver It is no longer needed, so let's remove it, saving a bit of memory. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 13:48:34 -08:00
Greg Kroah-Hartman	2143acc6dc	[PATCH] USB: make registering a usb driver automatically set the module owner This fixes the driver that forgot to set the module owner up. Now we can remove the unneeded pointer from the usb driver structure. The idea for how to do this was from Al Viro, who did this for the PCI drivers. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 13:48:32 -08:00
Greg Kroah-Hartman	ba9dc657af	[PATCH] USB: allow usb drivers to disable dynamic ids This lets drivers, like the usb-serial ones, disable the ability to add ids from sysfs. The usb-serial drivers are "odd" in that they are really usb-serial bus drivers, not usb bus drivers, so the dynamic id logic will have to go into the usb-serial bus core for those drivers to get that ability. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 13:48:32 -08:00
Greg Kroah-Hartman	733260ff9c	[PATCH] USB: add dynamic id functionality to USB core Echo the usb vendor and product id to the "new_id" file in the driver's sysfs directory, and then that driver will be able to bind to a device with those ids if it is present. Example: echo 0557 2008 > /sys/bus/usb/drivers/foo_driver/new_id adds the hex values 0557 and 2008 to the device id table for the foo_driver. Note, usb-serial drivers do not currently work with this capability yet. usb-storage also might have some oddities. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 13:48:32 -08:00
Pete Zaitcev	a00828e9ac	[PATCH] USB: drivers/usb/storage/libusual This patch adds a shim driver libusual, which routes devices between usb-storage and ub according to the common table, based on unusual_devs.h. The help and example syntax is in Kconfig. Signed-off-by: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-01-04 13:48:31 -08:00
Gareth Howlett	26e92861be	[SERIAL] Add support for more Connect Tech PCI serial boards I've also fixed the sort-ordering comments on this naming convention. Signed-off-by: Stuart MacDonald <stuartm@connecttech.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2006-01-04 17:00:42 +00:00
Russell King	ce11a161c1	[MMC] Fix missing ',' Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2006-01-04 12:40:39 +00:00
Stephen Hemminger	88df8ef59a	[NET]: Don't exclude broadcast addresses from is_multicast_ether_addr() The check for multicast shouldn't exclude broadcast type addresses. This reverts the incorrect change done in 2.6.13. The broadcast address is a multicast address and should be excluded from being a valid_ether_address for use in bridging or device address. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 15:25:45 -08:00
Russell King	a6f6c96b65	[MMC] Improve MMC card block size selection Select a block size for IO based on the read and write block size combinations, and whether the card supports partial block reads and/or partial block writes. If we are able to satisfy block reads but not block writes, mark the device read only. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2006-01-03 22:38:44 +00:00
Benjamin LaHaise	4947d3ef8d	[NET]: Speed up __alloc_skb() From: Benjamin LaHaise <bcrl@kvack.org> In __alloc_skb(), the use of skb_shinfo() which casts a u8 * to the shared info structure results in gcc being forced to do a reload of the pointer since it has no information on possible aliasing. Fix this by using a pointer to refer to skb_shared_info. By initializing skb_shared_info sequentially, the write combining buffers can reduce the number of memory transactions to a single write. Reorder the initialization in __alloc_skb() to match the structure definition. There is also an alignment issue on 64 bit systems with skb_shared_info by converting nr_frags to a short everything packs up nicely. Also, pass the slab cache pointer according to the fclone flag instead of using two almost identical function calls. This raises bw_unix performance up to a peak of 707KB/s when combined with the spinlock patch. It should help other networking protocols, too. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 14:06:50 -08:00
David S. Miller	17ba15fb62	[PPPOX]: Fix assignment into const proto_ops. And actually, with this, the whole pppox layer can basically be removed and subsumed into pppoe.c, no other pppox sub-protocol implementation exists and we've had this thing for at least 4 years. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:23 -08:00
Arnaldo Carvalho de Melo	14c850212e	[INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h To help in reducing the number of include dependencies, several files were touched as they were getting needed headers indirectly for stuff they use. Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had linux/dccp.h include twice. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:21 -08:00
Eric Dumazet	90ddc4f047	[NET]: move struct proto_ops to const I noticed that some of 'struct proto_ops' used in the kernel may share a cache line used by locks or other heavily modified data. (default linker alignement is 32 bytes, and L1_CACHE_LINE is 64 or 128 at least) This patch makes sure a 'struct proto_ops' can be declared as const, so that all cpus can share all parts of it without false sharing. This is not mandatory : a driver can still use a read/write structure if it needs to (and eventually a __read_mostly) I made a global stubstitute to change all existing occurences to make them const. This should reduce the possibility of false sharing on SMP, and speedup some socket system calls. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:15 -08:00
Andi Kleen	77d76ea310	[NET]: Small cleanup to socket initialization sock_init can be done as a core_initcall instead of calling it directly in init/main.c Also I removed an out of date #ifdef. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:14 -08:00
Stephen Hemminger	3821af2fe1	[FLS64]: generic version Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:06 -08:00
Stephen Hemminger	c865e5d99e	[PKT_SCHED] netem: packet corruption option Here is a new feature for netem in 2.6.16. It adds the ability to randomly corrupt packets with netem. A version was done by Hagen Paul Pfeifer, but I redid it to handle the cases of backwards compatibility with netlink interface and presence of hardware checksum offload. It is useful for testing hardware offload in devices. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:05 -08:00
Arnaldo Carvalho de Melo	d83d8461f9	[IP_SOCKGLUE]: Remove most of the tcp specific calls As DCCP needs to be called in the same spots. Now we have a member in inet_sock (is_icsk), set at sock creation time from struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and DCCP) to see if a struct sock instance is a inet_connection_sock for places like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if sk_type was SOCK_STREAM, that is insufficient because we now use the same code for DCCP, that has sk_type SOCK_DCCP. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:58 -08:00
Arnaldo Carvalho de Melo	2271281362	[TCP]: Move the TCPF_ enum to tcp_states.h Upcoming patches will make, for instance, ip_sockglue.c need just this enum and not all of tcp.h. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:57 -08:00
Arnaldo Carvalho de Melo	d8313f5ca2	[INET6]: Generalise tcp_v6_hash_connect Renaming it to inet6_hash_connect, making it possible to ditch dccp_v6_hash_connect and share the same code with TCP instead. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:56 -08:00
Arnaldo Carvalho de Melo	a7f5e7f164	[INET]: Generalise tcp_v4_hash_connect Renaming it to inet_hash_connect, making it possible to ditch dccp_v4_hash_connect and share the same code with TCP instead. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:55 -08:00
Arnaldo Carvalho de Melo	6d6ee43e0b	[TWSK]: Introduce struct timewait_sock_ops So that we can share several timewait sockets related functions and make the timewait mini sockets infrastructure closer to the request mini sockets one. Next changesets will take advantage of this, moving more code out of TCP and DCCP v4 and v6 to common infrastructure. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:54 -08:00
Arnaldo Carvalho de Melo	0fa1a53e1f	[IPV6]: Introduce inet6_timewait_sock Out of tcp6_timewait_sock, that now is just an aggregation of inet_timewait_sock and inet6_timewait_sock, using tw_ipv6_offset in struct inet_timewait_sock, that is common to the IPv6 transport protocols that use timewait sockets, like DCCP and TCP. tw_ipv6_offset plays the struct inet_sock pinfo6 role, i.e. for the generic code to find the IPv6 area in a timewait sock. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:47 -08:00
Arnaldo Carvalho de Melo	b9750ce13c	[IPV6]: Generalise some functions Using sk->sk_protocol instead of IPPROTO_TCP. Will be used by DCCPv6 in the next changesets. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:46 -08:00
Herbert Xu	3305b80c21	[IP]: Simplify and consolidate MSG_PEEK error handling When a packet is obtained from skb_recv_datagram with MSG_PEEK enabled it is left on the socket receive queue. This means that when we detect a checksum error we have to be careful when trying to free the packet as someone could have dequeued it in the time being. Currently this delicate logic is duplicated three times between UDPv4, UDPv6 and RAWv6. This patch moves them into a one place and simplifies the code somewhat. This is based on a suggestion by Eric Dumazet. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:41 -08:00
Arnaldo Carvalho de Melo	8292a17a39	[ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_ops And move it to struct inet_connection_sock. DCCP will use it in the upcoming changesets. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:38 -08:00
Arnaldo Carvalho de Melo	ca304b6104	[IPV6]: Introduce inet6_rsk() And inet6_rsk_offset in inet_request_sock, for the same reasons as inet_sock's pinfo6 member. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:37 -08:00
Herbert Xu	89cee8b1cb	[IPV4]: Safer reassembly Another spin of Herbert Xu's "safer ip reassembly" patch for 2.6.16. (The original patch is here: http://marc.theaimsgroup.com/?l=linux-netdev&m=112281936522415&w=2 and my only contribution is to have tested it.) This patch (optionally) does additional checks before accepting IP fragments, which can greatly reduce the possibility of reassembling fragments which originated from different IP datagrams. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Arthur Kepner <akepner@sgi.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:31 -08:00
Trent Jaeger	df71837d50	[LSM-IPSec]: Security association restriction. This patch series implements per packet access control via the extension of the Linux Security Modules (LSM) interface by hooks in the XFRM and pfkey subsystems that leverage IPSec security associations to label packets. Extensions to the SELinux LSM are included that leverage the patch for this purpose. This patch implements the changes necessary to the XFRM subsystem, pfkey interface, ipv4/ipv6, and xfrm_user interface to restrict a socket to use only authorized security associations (or no security association) to send/receive network packets. Patch purpose: The patch is designed to enable access control per packets based on the strongly authenticated IPSec security association. Such access controls augment the existing ones based on network interface and IP address. The former are very coarse-grained, and the latter can be spoofed. By using IPSec, the system can control access to remote hosts based on cryptographic keys generated using the IPSec mechanism. This enables access control on a per-machine basis or per-application if the remote machine is running the same mechanism and trusted to enforce the access control policy. Patch design approach: The overall approach is that policy (xfrm_policy) entries set by user-level programs (e.g., setkey for ipsec-tools) are extended with a security context that is used at policy selection time in the XFRM subsystem to restrict the sockets that can send/receive packets via security associations (xfrm_states) that are built from those policies. A presentation available at www.selinux-symposium.org/2005/presentations/session2/2-3-jaeger.pdf from the SELinux symposium describes the overall approach. Patch implementation details: On output, the policy retrieved (via xfrm_policy_lookup or xfrm_sk_policy_lookup) must be authorized for the security context of the socket and the same security context is required for resultant security association (retrieved or negotiated via racoon in ipsec-tools). This is enforced in xfrm_state_find. On input, the policy retrieved must also be authorized for the socket (at __xfrm_policy_check), and the security context of the policy must also match the security association being used. The patch has virtually no impact on packets that do not use IPSec. The existing Netfilter (outgoing) and LSM rcv_skb hooks are used as before. Also, if IPSec is used without security contexts, the impact is minimal. The LSM must allow such policies to be selected for the combination of socket and remote machine, but subsequent IPSec processing proceeds as in the original case. Testing: The pfkey interface is tested using the ipsec-tools. ipsec-tools have been modified (a separate ipsec-tools patch is available for version 0.5) that supports assignment of xfrm_policy entries and security associations with security contexts via setkey and the negotiation using the security contexts via racoon. The xfrm_user interface is tested via ad hoc programs that set security contexts. These programs are also available from me, and contain programs for setting, getting, and deleting policy for testing this interface. Testing of sa functions was done by tracing kernel behavior. Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:24 -08:00
Zach Brown	994fc28c7b	[PATCH] add AOP_TRUNCATED_PAGE, prepend AOP_ to WRITEPAGE_ACTIVATE readpage(), prepare_write(), and commit_write() callers are updated to understand the special return code AOP_TRUNCATED_PAGE in the style of writepage() and WRITEPAGE_ACTIVATE. AOP_TRUNCATED_PAGE tells the caller that the callee has unlocked the page and that the operation should be tried again with a new page. OCFS2 uses this to detect and work around a lock inversion in its aop methods. There should be no change in behaviour for methods that don't return AOP_TRUNCATED_PAGE. WRITEPAGE_ACTIVATE is also prepended with AOP_ for consistency and they are made enums so that kerneldoc can be used to document their semantics. Signed-off-by: Zach Brown <zach.brown@oracle.com>	2006-01-03 11:45:42 -08:00
Joel Becker	7063fbf226	[PATCH] configfs: User-driven configuration filesystem Configfs, a file system for userspace-driven kernel object configuration. The OCFS2 stack makes extensive use of this for propagation of cluster configuration information into kernel. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2006-01-03 11:45:28 -08:00
Jeff Garzik	a18ceba7b4	Merge branch 'master'	2006-01-03 10:58:53 -05:00
Jeff Garzik	ac67c62473	Merge branch 'master'	2006-01-03 10:49:18 -05:00
Adrian Bunk	4d399cae3f	remove pointers to the defunct UDF mailing list This patch removes pointers to the defunct UDF mailing list. Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-01-03 13:19:13 +01:00
Pierre Ossman	68094e3251	[ALSA] [PATCH] alsa: Improved PnP suspend support Also use the PnP functions to start/stop the devices during the suspend so that drivers will not have to duplicate this code. Cc: Adam Belay <ambx1@neo.rr.com> Cc: Jaroslav Kysela <perex@suse.cz> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2006-01-03 12:31:30 +01:00
Takashi Iwai	4c98cfef2e	[ALSA] PATCH] Add PM support to PnP drivers Add suspend/resume callback to pnp_driver and pnp_card_driver. Signed-off-by: Takashi Iwai <tiwai@suse.de>	2006-01-03 12:31:19 +01:00
Jaya Kumar	9b4ffa48ae	[ALSA] Add support for the CS5535 Audio device Add support for the CS5535 Audio device. I've fixed up some errors as per Takashi's advice from the thread: http://lkml.org/lkml/2005/9/15/119 From: Alan Cox <alan@lxorguk.ukuu.org.uk> cs5535 is a 32bit x86 only device using weird CPU features Signed-off-by: Jaya Kumar <jayakumar.alsa@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2006-01-03 12:16:27 +01:00
Ustyugov Roman	f83b5e323f	kbuild: set correct KBUILD_MODNAME when using well known kernel symbols as module names This patch fixes a problem when we use well known kernel symbols as module names. For example, if module source name is current.c, idle_stack.c or etc., we have a bad KBUILD_MODNAME value. For example, KBUILD_MODNAME will be "get_current()" instead of "current", or "(init_thread_union.stack)" instead of "idle_task". The trick is to define a stringify macro on the commandline - named KBUILD_STR for namespace reasons - and then to stringify the module name. There are a few uses of KBUILD_MODNAME throughout the tree but the usage is for debug and will not be harmed by this change so left untouched for now. While at it KBUILD_BASENAME was changed too. Any spinlock usage in the unix module would have created wrong section names without it. Usage in spinlock.h fixed so it no longer stringify KBUILD_BASENAME. Original patch from Ustyogov Roman - all bugs introduced by me. Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2005-12-26 00:33:41 +01:00
Kurt Huwig	01e33b5a2a	[PATCH] n_r3964: fixed usage of HZ; removed bad include Fix n_r3964 timeouts (hardcoded for 100Hz) Also the include of <asm/termios.h> in 'n_r3964.h' is unnecessary and prevents using the header file in any application that has to include <termios.h> due to duplicate definition of 'struct termio'. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-24 15:37:00 -08:00
Jeff Garzik	aaadff8119	Merge branch 'master'	2005-12-24 09:31:05 -05:00
Jeff Garzik	ebc62fb36c	Merge branch 'master'	2005-12-24 09:28:21 -05:00
Linus Torvalds	c162eeaa21	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2005-12-22 09:41:03 -08:00
Nicolas Pitre	d6f029130f	[PATCH] fix race with preempt_enable() Currently a simple void foo(void) { preempt_enable(); } produces the following code on ARM: foo: bic r3, sp, #8128 bic r3, r3, #63 ldr r2, [r3, #4] ldr r1, [r3, #0] sub r2, r2, #1 tst r1, #4 str r2, [r3, #4] blne preempt_schedule mov pc, lr The problem is that the TIF_NEED_RESCHED flag is loaded _before_ the preemption count is stored back, hence any interrupt coming within that 3 instruction window causing TIF_NEED_RESCHED to be set won't be seen and scheduling won't happen as it should. Nothing currently prevents gcc from performing that reordering. There is already a barrier() before the decrement of the preemption count, but another one is needed between this and the TIF_NEED_RESCHED flag test for proper code ordering. Signed-off-by: Nicolas Pitre <nico@cam.org> Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-22 09:17:39 -08:00
David S. Miller	e6469297d4	Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6.14+git+ipv6-fix-20051221a	2005-12-22 07:41:27 -08:00
Adrian Bunk	23f9b317e0	[PATCH] include/linux/irq.h: #include <linux/smp.h> Jan's crosscompile page [1] shows, that one regression in 2.6.15-rc is that the v850 defconfig does no longer compile. The compile error is: <-- snip --> ... CC arch/v850/kernel/setup.o In file included from /usr/src/ctest/rc/kernel/arch/v850/kernel/setup.c:17: /usr/src/ctest/rc/kernel/include/linux/irq.h:13:43: asm/smp.h: No such file or directory make[2]: *** [arch/v850/kernel/setup.o] Error 1 <-- snip --> The #include <asm/smp.h> in irq.h was intruduced in 2.6.15-rc. Since include/linux/irq.h needs code from asm/smp.h only in the CONFIG_SMP=y case and linux/smp.h #include's asm/smp.h only in the CONFIG_SMP=y case, I'm suggesting this patch to #include <linux/smp.h> in irq.h. I've tested the compilation with both CONFIG_SMP=y and CONFIG_SMP=n on i386. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-21 14:45:25 -08:00
YOSHIFUJI Hideaki	58c4fb86ea	[IPV6]: Flag RTF_ANYCAST for anycast routes. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-12-21 22:56:42 +09:00
Tom Zanussi	fd30fc3256	[PATCH] relayfs: remove warning printk() in relay_switch_subbuf() There's currently a diagnostic printk in relay_switch_subbuf() meant as a warning if you accidentally try to log an event larger than the sub-buffer size. The problem is if this happens while logging from somewhere it's not safe to be doing printks, such as in the scheduler, you can end up with a deadlock. This patch removes the warning from relay_switch_subbuf() and instead prints some diagnostic info when the channel is closed. Thanks to Mathieu Desnoyers for pointing out the problem and suggesting a fix. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-20 17:33:22 -08:00
Trond Myklebust	29884df0d8	NFS: Fix another O_DIRECT race Ensure we call unmap_mapping_range() and sync dirty pages to disk before doing an NFS direct write. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-12-19 23:12:09 -05:00
Kristian Slavov	6b80ebedbe	[RTNETLINK]: Fix RTNLGRP definitions in rtnetlink.h I reported a problem and gave hints to the solution, but nobody seemed to react. So I prepared a patch against 2.6.14.4. Tested on 2.6.14.4 with "ip monitor addr" and with the program attached, while adding and removing IPv6 address. Both programs didn't receive any messages. Tested 2.6.14.4 + this patch, and both programs received add and remove messages. Signed-off-by: Kristian Slavov <kristian.slavov@nomadiclab.com> Acked-by: Jamal Hadi salim <hadi@cyberus.ca> ACKed-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-19 13:54:44 -08:00
Jeff Garzik	418fbfe979	Merge branch 'master'	2005-12-19 00:09:53 -05:00
Kyungmin Park	532a37cf8d	[PATCH] mtd onenand driver: reduce stack usage Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-18 16:28:24 -08:00
Kyungmin Park	37b1cc3910	[PATCH] mtd onenand driver: check correct manufacturer This (and the three subsequent patches) is working well on OMAP H4 with 2.6.15-rc4 kernel and passes the LTP fs test. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-18 16:28:23 -08:00
Linus Torvalds	48ea753075	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6	2005-12-16 14:43:57 -08:00
Christoph Lameter	dc86e88c2b	[IA64] Add __read_mostly support for IA64 sparc64, i386 and x86_64 have support for a special data section dedicated to rarely updated data that is frequently read. The section was created to avoid false sharing of those rarely read data with frequently written kernel data. This patch creates such a data section for ia64 and will group rarely written data into this section. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2005-12-16 10:52:46 -08:00
Linus Torvalds	4d7672b462	Make sure we copy pages inserted with "vm_insert_page()" on fork The logic that decides that a fork() might be able to avoid copying a VM area when it can be re-created by page faults didn't know about the new vm_insert_page() case. Also make some things a bit more anal wrt VM_PFNMAP. Pointed out by Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-16 10:21:23 -08:00
James Bottomley	2a1e1379ba	Merge by hand (conflicts in scsi_lib.c) This merge is pretty extensive. The conflict is over the new req->retries parameter, so I had to change the prototype to scsi_setup_blk_pc_cmnd() and the usage in sd, sr and st. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-12-15 17:35:24 -06:00
Mike Christie	defd94b754	[SCSI] seperate max_sectors from max_hw_sectors - export __blk_put_request and blk_execute_rq_nowait needed for async REQ_BLOCK_PC requests - seperate max_hw_sectors and max_sectors for block/scsi_ioctl.c and SG_IO bio.c helpers per Jens's last comments. Since block/scsi_ioctl.c SG_IO was already testing against max_sectors and SCSI-ml was setting max_sectors and max_hw_sectors to the same value this does not change any scsi SG_IO behavior. It only prepares ll_rw_blk.c, scsi_ioctl.c and bio.c for when SCSI-ml begins to set a valid max_hw_sectors for all LLDs. Today if a LLD does not set it SCSI-ml sets it to a safe default and some LLDs set it to a artificial low value to overcome memory and feedback issues. Note: Since we now cap max_sectors to BLK_DEF_MAX_SECTORS, which is 1024, drivers that used to call blk_queue_max_sectors with a large value of max_sectors will now see the fs requests capped to BLK_DEF_MAX_SECTORS. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-12-15 15:11:40 -08:00

... 3 4 5 6 7 ...

2111 Commits