kernel-ark

Author	SHA1	Message	Date
Vadim Lobanov	bbea9f6966	[PATCH] fdtable: Make fdarray and fdsets equal in size Currently, each fdtable supports three dynamically-sized arrays of data: the fdarray and two fdsets. The code allows the number of fds supported by the fdarray (fdtable->max_fds) to differ from the number of fds supported by each of the fdsets (fdtable->max_fdset). In practice, it is wasteful for these two sizes to differ: whenever we hit a limit on the smaller-capacity structure, we will reallocate the entire fdtable and all the dynamic arrays within it, so any delta in the memory used by the larger-capacity structure will never be touched at all. Rather than hogging this excess, we shouldn't even allocate it in the first place, and keep the capacities of the fdarray and the fdsets equal. This patch removes fdtable->max_fdset. As an added bonus, most of the supporting code becomes simpler. Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:57:22 -08:00
Raz Ben-Jehuda(caro)	46031f9a38	[PATCH] md: allow reads that have bypassed the cache to be retried on failure If a bypass-the-cache read fails, we simply try again through the cache. If it fails again it will trigger normal recovery precedures. update 1: From: NeilBrown <neilb@suse.de> 1/ chunk_aligned_read and retry_aligned_read assume that data_disks == raid_disks - 1 which is not true for raid6. So when an aligned read request bypasses the cache, we can get the wrong data. 2/ The cloned bio is being used-after-free in raid5_align_endio (to test BIO_UPTODATE). 3/ We forgot to add rdev->data_offset when submitting a bio for aligned-read 4/ clone_bio calls blk_recount_segments and then we change bi_bdev, so we need to invalidate the segment counts. 5/ We don't de-reference the rdev when the read completes. This means we need to record the rdev to so it is still available in the end_io routine. Fortunately bi_next in the original bio is unused at this point so we can stuff it in there. 6/ We leak a cloned bio if the target rdev is not usable. From: NeilBrown <neilb@suse.de> update 2: 1/ When aligned requests fail (read error) they need to be retried via the normal method (stripe cache). As we cannot be sure that we can process a single read in one go (we may not be able to allocate all the stripes needed) we store a bio-being-retried and a list of bioes-that-still-need-to-be-retried. When find a bio that needs to be retried, we should add it to the list, not to single-bio... 2/ We were never incrementing 'scnt' when resubmitting failed aligned requests. [akpm@osdl.org: build fix] Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:57:20 -08:00
Alan Cox	ee2f344b33	[PATCH] ide-cd: Handle strange interrupt on the Intel ESB2 The ESB2 appears to emit spurious DMA interrupts when configured for native mode and handling ATAPI devices. Stratus were able to pin this bug down and produce a patch. This is a rework which applies the fixup only to the ESB2 (for now). We can apply it to other chips later if the same problem is found. This code has been tested and confirmed to fix the problem on the tested systems. Signed-off-by: Alan Cox <alan@redhat.com> (Most of the hard work done by Stratus however) Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:57:20 -08:00
Chen, Kenneth W	06066714f6	[PATCH] sched: remove lb_stopbalance counter Remove scheduler stats lb_stopbalance counter. This counter can be calculated by: lb_balanced - lb_nobusyg - lb_nobusyq. There is no need to create gazillion counters while we can derive the value. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:43 -08:00
Siddha, Suresh B	783609c6cb	[PATCH] sched: decrease number of load balances Currently at a particular domain, each cpu in the sched group will do a load balance at the frequency of balance_interval. More the cores and threads, more the cpus will be in each sched group at SMP and NUMA domain. And we endup spending quite a bit of time doing load balancing in those domains. Fix this by making only one cpu(first idle cpu or first cpu in the group if all the cpus are busy) in the sched group do the load balance at that particular sched domain and this load will slowly percolate down to the other cpus with in that group(when they do load balancing at lower domains). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Christoph Lameter <clameter@engr.sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:43 -08:00
Christoph Lameter	08c183f31b	[PATCH] sched: add option to serialize load balancing Large sched domains can be very expensive to scan. Add an option SD_SERIALIZE to the sched domain flags. If that flag is set then we make sure that no other such domain is being balanced. [akpm@osdl.org: build fix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:43 -08:00
Christoph Lameter	c9819f4593	[PATCH] sched: use softirq for load balancing Call rebalance_tick (renamed to run_rebalance_domains) from a newly introduced softirq. We calculate the earliest time for each layer of sched domains to be rescanned (this is the rescan time for idle) and use the earliest of those to schedule the softirq via a new field "next_balance" added to struct rq. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:42 -08:00
Siddha, Suresh B	ac7d550499	[PATCH] sched domain: increase the SMT busy rebalance interval With SMT, if the logical processor is busy, load balance happens for every 8msec(min)-16msec(max). There is no need to do this often, as this is just for fairness(to maintain uniform runqueue lengths) and default time slice anyhow is 100msec. Appended patch increases this interval to 64msec(min)-128msec(max) when the logical processor is busy. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:42 -08:00
Andrew Morton	4a7864ca63	[PATCH] io-accounting: via taskstats Deliver IO accounting via taskstats. Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:41 -08:00
Andrew Morton	f2f1f8a3b8	[PATCH] cleanup taskstats.h Fix weird whitespace mangling in taskstats.h Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:41 -08:00
Andrew Morton	7c3ab7381e	[PATCH] io-accounting: core statistics The present per-task IO accounting isn't very useful. It simply counts the number of bytes passed into read() and write(). So if a process reads 1MB from an already-cached file, it is accused of having performed 1MB of I/O, which is wrong. (David Wright had some comments on the applicability of the present logical IO accounting: For billing purposes it is useless but for workload analysis it is very useful read_bytes/read_calls average read request size write_bytes/write_calls average write request size read_bytes/read_blocks ie logical/physical can indicate hit rate or thrashing write_bytes/write_blocks ie logical/physical guess since pdflush writes can be missed I often look for logical larger than physical to see filesystem cache problems. And the bytes/cpusec can help find applications that are dominating the cache and causing slow interactive response from page cache contention. I want to find the IO intensive applications and make sure they are doing efficient IO. Thus the acctcms(sysV) or csacms command would give the high IO commands). This patchset adds new accounting which tries to be more accurate. We account for three things: reads: attempt to count the number of bytes which this process really did cause to be fetched from the storage layer. Done at the submit_bio() level, so it is accurate for block-backed filesystems. I also attempt to wire up NFS and CIFS. writes: attempt to count the number of bytes which this process caused to be sent to the storage layer. This is done at page-dirtying time. The big inaccuracy here is truncate. If a process writes 1MB to a file and then deletes the file, it will in fact perform no writeout. But it will have been accounted as having caused 1MB of write. So... cancelled_writes: account the number of bytes which this process caused to not happen, by truncating pagecache. We _could_ just subtract this from the process's `write' accounting. But that means that some processes would be reported to have done negative amounts of write IO, which is silly. So we just report the raw number and punt this decision up to userspace. Now, we _could_ account for writes at the physical I/O level. But - This would require that we track memory-dirtying tasks at the per-page level (would require a new pointer in struct page). - It would mean that IO statistics for a process are usually only available long after that process has exitted. Which means that we probably cannot communicate this info via taskstats. This patch: Wire up the kernel-private data structures and the accessor functions to manipulate them. Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:41 -08:00
David Woodhouse	58f64d83c3	[PATCH] Fix noise in futex.h There are some kernel-only bits in the middle of <linux/futex.h> which should be removed in what we export to userspace. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:41 -08:00
Alexey Dobriyan	1f29bcd739	[PATCH] sysctl: remove unused "context" param Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andi Kleen <ak@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:41 -08:00
Scott Wood	884b4aaaa2	[PATCH] rtc: Add rtc_merge_alarm() Add rtc_merge_alarm(), which can be used by rtc drivers to turn a partially specified alarm expiry (i.e. most significant fields set to -1, as with the RTC_ALM_SET ioctl()) into a fully specified expiry. If the most significant specified field is earlier than the current time, the least significant unspecified field is incremented. Signed-off-by: Scott Wood <scottwood@freescale.com> Acked-by: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:40 -08:00
Randy Dunlap	5c543eff6c	[PATCH] freezer.h uses task_struct fields freezer.h uses task_struct fields so it should include sched.h. CC [M] fs/jfs/jfs_txnmgr.o In file included from fs/jfs/jfs_txnmgr.c:49: include/linux/freezer.h: In function 'frozen': include/linux/freezer.h:9: error: dereferencing pointer to incomplete type include/linux/freezer.h:9: error: 'PF_FROZEN' undeclared (first use in this function) include/linux/freezer.h:9: error: (Each undeclared identifier is reported only once include/linux/freezer.h:9: error: for each function it appears in.) include/linux/freezer.h: In function 'freezing': include/linux/freezer.h:17: error: dereferencing pointer to incomplete type include/linux/freezer.h:17: error: 'PF_FREEZE' undeclared (first use in this function) include/linux/freezer.h: In function 'freeze': include/linux/freezer.h:26: error: dereferencing pointer to incomplete type include/linux/freezer.h:26: error: 'PF_FREEZE' undeclared (first use in this function) include/linux/freezer.h: In function 'do_not_freeze': include/linux/freezer.h:34: error: dereferencing pointer to incomplete type include/linux/freezer.h:34: error: 'PF_FREEZE' undeclared (first use in this function) include/linux/freezer.h: In function 'thaw_process': include/linux/freezer.h:43: error: dereferencing pointer to incomplete type include/linux/freezer.h:43: error: 'PF_FROZEN' undeclared (first use in this function) include/linux/freezer.h:44: warning: implicit declaration of function 'wake_up_process' include/linux/freezer.h: In function 'frozen_process': include/linux/freezer.h:55: error: dereferencing pointer to incomplete type include/linux/freezer.h:55: error: dereferencing pointer to incomplete type include/linux/freezer.h:55: error: 'PF_FREEZE' undeclared (first use in this function) include/linux/freezer.h:55: error: 'PF_FROZEN' undeclared (first use in this function) fs/jfs/jfs_txnmgr.c: In function 'freezing': include/linux/freezer.h:18: warning: control reaches end of non-void function make[2]: *** [fs/jfs/jfs_txnmgr.o] Error 1 Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:40 -08:00
David S. Miller	d3dcc077bf	[NETLINK]: Put {IFA,IFLA}_{RTA,PAYLOAD} macros back for userspace. GLIBC uses them etc. They are guarded by ifndef __KERNEL__ so nobody will start accidently using them in the kernel again, it's just for userspace. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-08 17:19:33 -08:00
J Hadi Salim	93366c537b	[XFRM]: Fix XFRMGRP_REPORT to use correct multicast group. XFRMGRP_REPORT uses 0x10 which is a group that belongs to events. The correct value is 0x20. We should really be using xfrm_nlgroups going forward; it was tempting to delete the definition of XFRMGRP_REPORT but it would break at least iproute2. Signed-off-by: J Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-08 17:19:30 -08:00
Eric Dumazet	f0490980a1	[NET]: Force a cache line split in hh_cache in SMP. hh_lock was converted from rwlock to seqlock by Stephen. To have a 100% benefit of this change, I suggest to place read mostly fields of hh_cache in a separate cache line, because hh_refcnt may be changed quite frequently on some busy machines. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-08 17:19:29 -08:00
Thomas Graf	e07bca84cd	[NETLINK]: Restore API compatibility of address and neighbour bits Restore API compatibility due to bits moved from rtnetlink.h to separate headers. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-08 17:19:27 -08:00
Stephen Hemminger	3644f0cee7	[NET]: Convert hh_lock to seqlock. The hard header cache is in the main output path, so using seqlock instead of reader/writer lock should reduce overhead. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-08 17:19:20 -08:00
Jiri Kosina	4c2ae844b5	[PATCH] Generic HID layer - pb_fnmode pb_fnmode parameter has to be passed to usbhid, both for compatibility reasons and also because it logically belongs there. Also removes empty hid-input.c file in drivers/usb/input. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-12-08 10:43:19 -08:00
Jiri Kosina	aa8de2f038	[PATCH] Generic HID layer - input and event reporting hid_input_report() was needlessly USB-specific in USB HID. This patch makes the function independent of HID implementation and fixes all the current users. Bluetooth patches comply with this prototype. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-12-08 10:43:17 -08:00
Jiri Kosina	aa938f7974	[PATCH] Generic HID layer - hiddev - hiddev is USB-only (agreed with Marcel Holtmann that Bluetooth currently doesn't need it, and future planned interface (rawhid) will be more flexible and usable) - both HID and USB-hid can be now compiled as modules (wasn't possible before hiddev was fully separated from generic HID layer) Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-12-08 10:43:15 -08:00
Jiri Kosina	4916b3a57f	[PATCH] Generic HID layer - USB API - 'dev' in struct hid_device changed from struct usb_device to struct device and fixed all the users - renamed functions which are part of USB HID API from 'hid_' to 'usbhid_' - force feedback initialization moved from common part into USB-specific driver - added usbhid.h header for USB HID API users - removed USB-specific fields from struct hid_device and moved them to new usbhid_device, which is pointed to by hid_device->driver_data - fixed all USB users to use this new structure Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-12-08 10:43:14 -08:00
Jiri Kosina	229695e51e	[PATCH] Generic HID layer - API - fixed generic API (added neccessary EXPORT_SYMBOL, fixed hid.h to provide correct prototypes) - extended hid_device with open/close/event function pointers to driver-specific functions - added driver specific driver_data to hid_device Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-12-08 10:43:12 -08:00
Jiri Kosina	dde5845a52	[PATCH] Generic HID layer - code split The "big main" split of USB HID code into generic HID code and USB-transport specific HID handling. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-12-08 10:43:01 -08:00
Kiyoshi Ueda	2e93ccc193	[PATCH] dm: suspend: add noflush pushback In device-mapper I/O is sometimes queued within targets for later processing. For example the multipath target can be configured to store I/O when no paths are available instead of returning it -EIO. This patch allows the device-mapper core to instruct a target to transfer the contents of any such in-target queue back into the core. This frees up the resources used by the target so the core can replace that target with an alternative one and then resend the I/O to it. Without this patch the only way to change the target in such circumstances involves returning the I/O with an error back to the filesystem/application. In the multipath case, this patch will let us add new paths for existing I/O to try after all the existing paths have failed. DMF_NOFLUSH_SUSPENDING ---------------------- If the DM_NOFLUSH_FLAG ioctl option is specified at suspend time, the DMF_NOFLUSH_SUSPENDING flag is set in md->flags during dm_suspend(). It is always cleared before dm_suspend() returns. The flag must be visible while the target is flushing pending I/Os so it is set before presuspend where the flush starts and unset after the wait for md->pending where the flush ends. Target drivers can check this flag by calling dm_noflush_suspending(). DM_MAPIO_REQUEUE / DM_ENDIO_REQUEUE ----------------------------------- A target's map() function can now return DM_MAPIO_REQUEUE to request the device mapper core queue the bio. Similarly, a target's end_io() function can return DM_ENDIO_REQUEUE to request the same. This has been labelled 'pushback'. The __map_bio() and clone_endio() functions in the core treat these return values as errors and call dec_pending() to end the I/O. dec_pending ----------- dec_pending() saves the pushback request in struct dm_io->error. Once all the split clones have ended, dec_pending() will put the original bio on the md->pushback list. Note that this supercedes any I/O errors. It is possible for the suspend with DM_NOFLUSH_FLAG to be aborted while in progress (e.g. by user interrupt). dec_pending() checks for this and returns -EIO if it happened. pushdback list and pushback_lock -------------------------------- The bio is queued on md->pushback temporarily in dec_pending(), and after all pending I/Os return, md->pushback is merged into md->deferred in dm_suspend() for re-issuing at resume time. md->pushback_lock protects md->pushback. The lock should be held with irq disabled because dec_pending() can be called from interrupt context. Queueing bios to md->pushback in dec_pending() must be done atomically with the check for DMF_NOFLUSH_SUSPENDING flag. So md->pushback_lock is held when checking the flag. Otherwise dec_pending() may queue a bio to md->pushback after the interrupted dm_suspend() flushes md->pushback. Then the bio would be left in md->pushback. Flag setting in dm_suspend() can be done without md->pushback_lock because the flag is checked only after presuspend and the set value is already made visible via the target's presuspend function. The flag can be checked without md->pushback_lock (e.g. the first part of the dec_pending() or target drivers), because the flag is checked again with md->pushback_lock held when the bio is really queued to md->pushback as described above. So even if the flag is cleared after the lockless checkings, the bio isn't left in md->pushback but returned to applications with -EIO. Other notes on the current patch -------------------------------- - md->pushback is added to the struct mapped_device instead of using md->deferred directly because md->io_lock which protects md->deferred is rw_semaphore and can't be used in interrupt context like dec_pending(), and md->io_lock protects the DMF_BLOCK_IO flag of md->flags too. - Don't issue lock_fs() in dm_suspend() if the DM_NOFLUSH_FLAG ioctl option is specified, because I/Os generated by lock_fs() would be pushed back and never return if there were no valid devices. - If an error occurs in dm_suspend() after the DMF_NOFLUSH_SUSPENDING flag is set, md->pushback must be flushed because I/Os may be queued to the list already. (flush_and_out label in dm_suspend()) Test results ------------ I have tested using multipath target with the next patch. The following tests are for regression/compatibility: - I/Os succeed when valid paths exist; - I/Os fail when there are no valid paths and queue_if_no_path is not set; - I/Os are queued in the multipath target when there are no valid paths and queue_if_no_path is set; - The queued I/Os above fail when suspend is issued without the DM_NOFLUSH_FLAG ioctl option. I/Os spanning 2 multipath targets also fail. The following tests are for the normal code path of new pushback feature: - Queued I/Os in the multipath target are flushed from the target but don't return when suspend is issued with the DM_NOFLUSH_FLAG ioctl option; - The I/Os above are queued in the multipath target again when resume is issued without path recovery; - The I/Os above succeed when resume is issued after path recovery or table load; - Queued I/Os in the multipath target succeed when resume is issued with the DM_NOFLUSH_FLAG ioctl option after table load. I/Os spanning 2 multipath targets also succeed. The following tests are for the error paths of the new pushback feature: - When the bdget_disk() fails in dm_suspend(), the DMF_NOFLUSH_SUSPENDING flag is cleared and I/Os already queued to the pushback list are flushed properly. - When suspend with the DM_NOFLUSH_FLAG ioctl option is interrupted, o I/Os which had already been queued to the pushback list at the time don't return, and are re-issued at resume time; o I/Os which hadn't been returned at the time return with EIO. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:09 -08:00
Kiyoshi Ueda	81fdb096db	[PATCH] dm: ioctl: add noflush suspend Provide a dm ioctl option to request noflush suspending. (See next patch for what this is for.) As the interface is extended, the version number is incremented. Other than accepting the new option through the interface, There is no change to existing behaviour. Test results: Confirmed the option is given from user-space correctly. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:09 -08:00
Kiyoshi Ueda	45cbcd7983	[PATCH] dm: map and endio return code clarification Tighten the use of return values from the target map and end_io functions. Values of 2 and above are now explictly reserved for future use. There are no existing targets using such values. The patch has no effect on existing behaviour. o Reserve return values of 2 and above from target map functions. Any positive value currently indicates "mapping complete", but all existing drivers use the value 1. We now make that a requirement so we can assign new meaning to higher values in future. The new definition of return values from target map functions is: < 0 : error = 0 : The target will handle the io (DM_MAPIO_SUBMITTED). = 1 : Mapping completed (DM_MAPIO_REMAPPED). > 1 : Reserved (undefined). Previously this was the same as '= 1'. o Reserve return values of 2 and above from target end_io functions for similar reasons. DM_ENDIO_INCOMPLETE is introduced for a return value of 1. Test results: I have tested by using the multipath target. I/Os succeed when valid paths exist. I/Os are queued in the multipath target when there are no valid paths and queue_if_no_path is set. I/Os fail when there are no valid paths and queue_if_no_path is not set. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:09 -08:00
Kiyoshi Ueda	a3d77d35be	[PATCH] dm: suspend: parameter change Change the interface of dm_suspend() so that we can pass several options without increasing the number of parameters. The existing 'do_lockfs' integer parameter is replaced by a flag DM_SUSPEND_LOCKFS_FLAG. There is no functional change to the code. Test results: I have tested 'dmsetup suspend' command with/without the '--nolockfs' option and confirmed the do_lockfs value is correctly set. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:09 -08:00
Helge Deller	adf6b20654	[PATCH] fbcmap.c: mark structs const or __read_mostly - Mark the default colormaps read-only, as nobody should be allowed to modify them - Additionally mark color values as __read_mostly since they will only be modified (very seldom) by fb_invert_cmaps() - Add named C99-initializers in fb_cmap structs and use the ARRAY_SIZE() macro Signed-off-by: Helge Deller <deller@gmx.de> Acked-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:05 -08:00
Don Mullis	6b1b60f41e	[PATCH] fault-injection: defaults likely to please a new user Assign defaults most likely to please a new user: 1) generate some logging output (verbose=2) 2) avoid injecting failures likely to lock up UI (ignore_gfp_wait=1, ignore_gfp_highmem=1) Signed-off-by: Don Mullis <dwm@meer.net> Cc: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:03 -08:00
Don Mullis	08b3df2d16	[PATCH] fault-injection: Use bool-true-false throughout Use bool-true-false throughout. Signed-off-by: Don Mullis <dwm@meer.net> Cc: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:03 -08:00
Akinobu Mita	329409aeda	[PATCH] fault injection: stacktrace filtering This patch provides stacktrace filtering feature. The stacktrace filter allows failing only for the caller you are interested in. For example someone may want to inject kmalloc() failures into only e100 module. they want to inject not only direct kmalloc() call, but also indirect allocation, too. - e100_poll --> netif_receive_skb --> packet_rcv_spkt --> skb_clone --> kmem_cache_alloc This patch enables to detect function calls like this by stacktrace and inject failures. The script Documentaion/fault-injection/failmodule.sh helps it. The range of text section of loaded e100 is expected to be [/sys/module/e100/sections/.text, /sys/module/e100/sections/.exit.text) So failmodule.sh stores these values into /debug/failslab/address-start and /debug/failslab/address-end. The maximum stacktrace depth is specified by /debug/failslab/stacktrace-depth. Please see the example that demonstrates how to inject slab allocation failures only for a specific module in Documentation/fault-injection/fault-injection.txt [dwm@meer.net: reject failure if any caller lies within specified range] Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Don Mullis <dwm@meer.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:03 -08:00
Akinobu Mita	f4f154fd92	[PATCH] fault injection: process filtering for fault-injection capabilities This patch provides process filtering feature. The process filter allows failing only permitted processes by /proc/<pid>/make-it-fail Please see the example that demostrates how to inject slab allocation failures into module init/cleanup code in Documentation/fault-injection/fault-injection.txt Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:02 -08:00
Akinobu Mita	c17bb49517	[PATCH] fault-injection capability for disk IO This patch provides fault-injection capability for disk IO. Boot option: fail_make_request=<probability>,<interval>,<space>,<times> <interval> -- specifies the interval of failures. <probability> -- specifies how often it should fail in percent. <space> -- specifies the size of free space where disk IO can be issued safely in bytes. <times> -- specifies how many times failures may happen at most. Debugfs: /debug/fail_make_request/interval /debug/fail_make_request/probability /debug/fail_make_request/specifies /debug/fail_make_request/times Example: fail_make_request=10,100,0,-1 echo 1 > /sys/blocks/hda/hda1/make-it-fail generic_make_request() on /dev/hda1 fails once per 10 times. Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:02 -08:00
Akinobu Mita	6ff1cb355e	[PATCH] fault-injection capabilities infrastructure This patch provides base functions implement to fault-injection capabilities. - The function should_fail() is taken from failmalloc-1.0 (http://www.nongnu.org/failmalloc/) [akpm@osdl.org: cleanups, comments, add __init] Cc: <okuji@enbug.org> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Don Mullis <dwm@meer.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:02 -08:00
Jiri Slaby	1328d737f5	[PATCH] Char: istallion, variables cleanup - wipe gcc -W warnings by int -> uint conversion - move 2 global variables into their local place Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:00 -08:00
Jiri Slaby	1f8ec435e3	[PATCH] Char: istallion, eliminate typedefs Use only struct <name> instead of defining a new type <name_t>. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:00 -08:00
Jiri Slaby	6b2c9457bb	[PATCH] Char: stallion, variables cleanup - fix `gcc -W' un/signed warnings by converting some ints -> uints. - move 3 global variables into functions, where are they used. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:59 -08:00
Alan Cox	606d099cdd	[PATCH] tty: switch to ktermios This is the grungy swap all the occurrences in the right places patch that goes with the updates. At this point we have the same functionality as before (except that sgttyb() returns speeds not zero) and are ready to begin turning new stuff on providing nobody reports lots of bugs If you are a tty driver author converting an out of tree driver the only impact should be termios->ktermios name changes for the speed/property setting functions from your upper layers. If you are implementing your own TCGETS function before then your driver was broken already and its about to get a whole lot more painful for you so please fix it 8) Also fill in c_ispeed/ospeed on init for most devices, although the current code will do this for you anyway but I'd like eventually to lose that extra paranoia [akpm@osdl.org: bluetooth fix] [mp3@de.ibm.com: sclp fix] [mp3@de.ibm.com: warning fix for tty3270] [hugh@veritas.com: fix tty_ioctl powerpc build] [jdike@addtoit.com: uml: fix ->set_termios declaration] Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Martin Peschke <mp3@de.ibm.com> Acked-by: Peter Oberparleiter <oberpar@de.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:57 -08:00
Alan Cox	edc6afc549	[PATCH] tty: switch to ktermios and new framework This is the core of the switch to the new framework. I've split it from the driver patches which are mostly search/replace and would encourage people to give this one a good hard stare. The references to BOTHER and ISHIFT are the termios values that must be defined by a platform once it wants to turn on "new style" ioctl support. The code patches here ensure that providing 1. The termios overlays the ktermios in memory 2. The only new kernel only fields are c_ispeed/c_ospeed (or none) the existing behaviour is retained. This is true for the patches at this point in time. Future patches will define BOTHER, ISHIFT and enable newer termios structures for each architecture, and once they are all done some of the ifdefs also vanish. [akpm@osdl.org: warning fix] [akpm@osdl.org: IRDA fix] Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:56 -08:00
Jiri Slaby	ca7ed0f22f	[PATCH] Char: stallion, kill typedefs Typedefs are considered ugly in the kernel. Eliminate them. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:55 -08:00
Jiri Slaby	3306ce3d05	[PATCH] Char: mxser_new, upgrade to 1.9.1 Change cloned experimental driver according to original 1.9.1 moxa driver. Some int->ulong conversions, outb ~UART_IER_THRI constant. Remove commented stuff. I also added printk line with info, if somebody wants to test it, he may contact me as I can potentially debug the driver with him or just to confirm it works properly. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:53 -08:00
Sukadev Bhattiprolu	84d737866e	[PATCH] add child reaper to pid_namespace Add a per pid_namespace child-reaper. This is needed so processes are reaped within the same pid space and do not spill over to the parent pid space. Its also needed so containers preserve existing semantic that pid == 1 would reap orphaned children. This is based on Eric Biederman's patch: http://lkml.org/lkml/2006/2/6/285 Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:52 -08:00
Cedric Le Goater	9a575a92db	[PATCH] to nsproxy Add the pid namespace framework to the nsproxy object. The copy of the pid namespace only increases the refcount on the global pid namespace, init_pid_ns, and unshare is not implemented. There is no configuration option to activate or deactivate this feature because this not relevant for the moment. Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:52 -08:00
Sukadev Bhattiprolu	61a58c6c23	[PATCH] rename struct pspace to struct pid_namespace Rename struct pspace to struct pid_namespace for consistency with other namespaces (uts_namespace and ipc_namespace). Also rename include/linux/pspace.h to include/linux/pid_namespace.h and variables from pspace to pid_ns. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:52 -08:00
Cedric Le Goater	373beb35cd	[PATCH] identifier to nsproxy Add an identifier to nsproxy. The default init_ns_proxy has identifier 0 and allocated nsproxies are given -1. This identifier will be used by a new syscall sys_bind_ns. Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:52 -08:00
Kirill Korotaev	6b3286ed11	[PATCH] rename struct namespace to struct mnt_namespace Rename 'struct namespace' to 'struct mnt_namespace' to avoid confusion with other namespaces being developped for the containers : pid, uts, ipc, etc. 'namespace' variables and attributes are also renamed to 'mnt_ns' Signed-off-by: Kirill Korotaev <dev@sw.ru> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:51 -08:00
Cedric Le Goater	1ec320afdc	[PATCH] add process_session() helper routine: deprecate old field Add an anonymous union and ((deprecated)) to catch direct usage of the session field. [akpm@osdl.org: fix various missed conversions] [jdike@addtoit.com: fix UML bug] Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:51 -08:00

1 2 3 4 5 ...

5209 Commits