Added blk_unplug interface, allowing all invocations of unplugs to result
in a generated blktrace UNPLUG.
Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
For the locking to work, only the tag map and tag bit map may be shared
(incidentally, I was just explaining this to Nick yesterday, but I
apparently didn't review the code well enough myself). But we also share
the busy list! The busy_list must be queue private, or we need a
block_queue_tag covering lock as well.
So we have to move the busy_list to the queue. This'll work fine, and
it'll actually also fix a problem with blk_queue_invalidate_tags() which
will invalidate tags across all shared queues. This is a bit confusing,
the low level driver should call it for each queue seperately since
otherwise you cannot kill tags on just a single queue for eg a hard
drive that stops responding. Since the function has no callers
currently, it's not an issue.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This implements functionality to pass down or insert a barrier
in a queue, without having data attached to it. The ->prepare_flush_fn()
infrastructure from data barriers are reused to provide this
functionality.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
We can use this helper in the elevator core for BLKPREP_KILL, and it'll
also be useful for the empty barrier patch.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The entire function of flush_dry_bio_endio is to undo the effects
of bio_endio (when called on a barrier request). So remove the
function and the call to bio_endio.
This allows us to remove "bi_size" from "struct request_queue".
Signed-off-by: Neil Brown <neilb@suse.de>
### Diffstat output
./block/ll_rw_blk.c | 39 ++-------------------------------------
./include/linux/blkdev.h | 1 -
2 files changed, 2 insertions(+), 38 deletions(-)
diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Hide everything in blkdev.h with CONFIG_BLOCK isn't set, and fixup
the (few) files that fail to build because they were relying on blkdev.h
pulling in extra includes for them.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
blk_rq_bio_prep is exported for use in exactly
one place. That place can benefit from using
the new blk_rq_append_bio instead.
So
- change dm-emc to call blk_rq_append_bio
- stop exporting blk_rq_bio_prep, and
- initialise rq_disk in blk_rq_bio_prep,
as dm-emc needs it.
Signed-off-by: Neil Brown <neilb@suse.de>
diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
ll_back_merge_fn is currently exported to SCSI where is it used,
together with blk_rq_bio_prep, in exactly the same way these
functions are used in __blk_rq_map_user.
So move the common code into a new function (blk_rq_append_bio), and
don't export ll_back_merge_fn any longer.
Signed-off-by: Neil Brown <neilb@suse.de>
diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Every usage of rq_for_each_bio wraps a usage of
bio_for_each_segment, so these can be combined into
rq_for_each_segment.
We define "struct req_iterator" to hold the 'bio' and 'index' that
are needed for the double iteration.
Signed-off-by: Neil Brown <neilb@suse.de>
Various compile fixes by me...
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Andrew thinks I should be nice and allow outside code to at least just
compile, so add the request_queue_t typedef back and mark it deprecated.
It'll warn people that this type is going away soonish.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Some of the code has been gradually transitioned to using the proper
struct request_queue, but there's lots left. So do a full sweet of
the kernel and get rid of this typedef and replace its uses with
the proper type.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
blk_fill_sghdr_rq, blk_unmap_sghdr_rq, and blk_complete_sghdr_rq were
exported for bsg, however bsg was changed to support only sg v4.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The bounce buffer logic is included on systems that do not need it. If a
system does not have zones like ZONE_DMA and ZONE_HIGHMEM that can lead to
the use of bounce buffers then there is no need to reserve memory pools etc
etc. This is true f.e. for SGI Altix.
Also nicifies the Makefile and gets rid of the tricky "and" there.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds a struct request pointer to the request structure for the
second data phase (bidi for now). A request queue supporting bidi
requests sets QUEUE_FLAG_BIDI. This prevents sending bidi requests to
a non-bidi queue.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch binds bsg devices to request_queue instead of gendisk. Any
objects (like transport entities) can define own request_handler and
create own bsg device.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
bsg uses scsi_cmd_ioctl() for some SCSI/sg ioctl
commands. scsi_cmd_ioctl() gets a request queue from a gendisk
arguement. This prevents bsg being bound to SCSI devices that don't
have a gendisk (like OSD). This adds a request_queue argument to
scsi_cmd_ioctl(). The SCSI/sg ioctl commands doesn't use a gendisk so
it's safe for any SCSI devices to use scsi_cmd_ioctl().
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
blk_fill_sghdr_rq doesn't work for SG v4 so verify_command needed to
be exported.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
blk_congestion_wait() doesn't exist anymore, but there's still a stub
in blkdev.h for the !CONFIG_BLOCK case. Kill it.
Signed-off-by: Benjamin Gilbert <bgilbert@cs.cmu.edu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Switch the kblockd flushing from a global flush to a more specific
flush_work().
(akpm: bypassed maintainers, sorry. There are other patches which depend on
this)
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Jens Axboe <axboe@suse.de>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The blk_rq_unmap_user() API is not very nice. It expects the caller to
know that rq->bio has to be reset to the original bio, and it will
silently do nothing if that is not done. Instead make it explicit that
we need to pass in the first bio, by expecting a bio argument.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
We have full flexibility of merging parameters now, so we can remove the
hooks that define back/front/request merge strategies. Nobody is using
them anymore.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
While working on bidi support at struct request level
I have found that blk_queue_activity_fn is actually never used.
The only user is in ide-probe.c with this code:
/* enable led activity for disk drives only */
if (drive->media == ide_disk && hwif->led_act)
blk_queue_activity_fn(q, hwif->led_act, drive);
And led_act is never initialized anywhere.
(Looking back at older kernels it was used in the PPC arch, but was removed around 2.6.18)
Unless it is all for future use off course.
(this patch is against linux-2.6-block.git as off 2006/12/4)
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch modifies blk_rq_map/unmap_user() and the cdrom and scsi_ioctl.c
users so that it supports requests larger than bio by chaining them together.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Separate out the concept of "queue congestion" from "backing-dev congestion".
Congestion is a backing-dev concept, not a queue concept.
The blk_* congestion functions are retained, as wrappers around the core
backing-dev congestion functions.
This proper layering is needed so that NFS can cleanly use the congestion
functions, and so that CONFIG_BLOCK=n actually links.
Cc: "Thomas Maier" <balagi@justmail.de>
Cc: "Jens Axboe" <jens.axboe@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: David Howells <dhowells@redhat.com>
Cc: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Export the clear_queue_congested() and set_queue_congested() functions
located in ll_rw_blk.c
The functions are renamed to blk_clear_queue_congested() and
blk_set_queue_congested().
(needed in the pktcdvd driver's bio write congestion control)
Signed-off-by: Thomas Maier <balagi@justmail.de>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We still need to maintain a private PC style command, since it
isn't completely unified with REQ_TYPE_BLOCK_PC yet.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This was necessitated by the need for a function to get back
to a scsi_cmnd, when an hba the posts its (corresponding) completion
interrupt with a block layer tag as its reference.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: David Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Don't just do nothing: it'll cause busywaits all over writeback and page
reclaim.
For now, take a fixed-length nap. Will improve when NFS starts waking up
throttled processes.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Make it possible to disable the block layer. Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.
This patch does the following:
(*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
support.
(*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
an item that uses the block layer. This includes:
(*) Block I/O tracing.
(*) Disk partition code.
(*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.
(*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
block layer to do scheduling. Some drivers that use SCSI facilities -
such as USB storage - end up disabled indirectly from this.
(*) Various block-based device drivers, such as IDE and the old CDROM
drivers.
(*) MTD blockdev handling and FTL.
(*) JFFS - which uses set_bdev_super(), something it could avoid doing by
taking a leaf out of JFFS2's book.
(*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
however, still used in places, and so is still available.
(*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
parts of linux/fs.h.
(*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.
(*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.
(*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
is not enabled.
(*) fs/no-block.c is created to hold out-of-line stubs and things that are
required when CONFIG_BLOCK is not set:
(*) Default blockdev file operations (to give error ENODEV on opening).
(*) Makes some /proc changes:
(*) /proc/devices does not list any blockdevs.
(*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.
(*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.
(*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
given command other than Q_SYNC or if a special device is specified.
(*) In init/do_mounts.c, no reference is made to the blockdev routines if
CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.
(*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
error ENOSYS by way of cond_syscall if so).
(*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
CONFIG_BLOCK is not set, since they can't then happen.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We can use this information for making more intelligent priority
decisions, and it will also be useful for blktrace.
Signed-off-by: Jens Axboe <axboe@suse.de>
CFQ implements this on its own now, but it's really block layer
knowledge. Tells a device queue to start dispatching requests to
the driver, taking care to unplug if needed. Also fixes the issue
where as/cfq will invoke a stopped queue, which we really don't
want.
Signed-off-by: Jens Axboe <axboe@suse.de>
cfq_exit_lock is protecting two things now:
- The per-ioc rbtree of cfq_io_contexts
- The per-cfqd linked list of cfq_io_contexts
The per-cfqd linked list can be protected by the queue lock, as it is (by
definition) per cfqd as the queue lock is.
The per-ioc rbtree is mainly used and updated by the process itself only.
The only outside use is the io priority changing. If we move the
priority changing to not browsing the rbtree, we can remove any locking
from the rbtree updates and lookup completely. Let the sys_ioprio syscall
just mark processes as having the iopriority changed and lazily update
the private cfq io contexts the next time io is queued, and we can
remove this locking as well.
Signed-off-by: Jens Axboe <axboe@suse.de>
Move some members around and unionize completion_data and rb_node since
they cannot ever be used at the same time.
Signed-off-by: Jens Axboe <axboe@suse.de>
After Christophs SCSI change, the only usage left is RQ_ACTIVE
and RQ_INACTIVE. The block layer sets RQ_INACTIVE right before freeing
the request, so any check for RQ_INACTIVE in a driver is a bug and
indicates use-after-free.
So kill/clean the remaining users, straight forward.
Signed-off-by: Jens Axboe <axboe@suse.de>
It is always identical to &q->rq, and we only use it for detecting
whether this request came out of our mempool or not. So replace it
with an additional ->flags bit flag.
Signed-off-by: Jens Axboe <axboe@suse.de>
As the comments indicates in blkdev.h, we can fold it into ->end_io_data
usage as that is really what ->waiting is. Fixup the users of
blk_end_sync_rq().
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The rbtree sort/lookup/reposition logic is mostly duplicated in
cfq/deadline/as, so move it to the elevator core. The io schedulers
still provide the actual rb root, as we don't want to impose any sort
of specific handling on the schedulers.
Introduce the helpers and rb_node in struct request to help migrate the
IO schedulers.
Signed-off-by: Jens Axboe <axboe@suse.de>
Right now, every IO scheduler implements its own backmerging (except for
noop, which does no merging). That results in duplicated code for
essentially the same operation, which is never a good thing. This patch
moves the backmerging out of the io schedulers and into the elevator
core. We save 1.6kb of text and as a bonus get backmerging for noop as
well. Win-win!
Signed-off-by: Jens Axboe <axboe@suse.de>
Right now ->flags is a bit of a mess: some are request types, and
others are just modifiers. Clean this up by splitting it into
->cmd_type and ->cmd_flags. This allows introduction of generic
Linux block message types, useful for sending generic Linux commands
to block devices.
Signed-off-by: Jens Axboe <axboe@suse.de>