vmalloc() returns a void pointer, so casting to (void *) is pretty pointless.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The previous patch which limited the number of sectors in a single request
to a COWed device was correct in concept, but the limit was implemented in
the wrong place.
By putting it in ubd_add, it covered the cases where the COWing was
specified on the command line. However, when the command line only has the
COW file specified, the fact that it's a COW file isn't known until it's
opened, so the limit is missed in these cases.
This patch moves the sector limit from ubd_add to ubd_open_dev.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some of the code has been gradually transitioned to using the proper
struct request_queue, but there's lots left. So do a full sweet of
the kernel and get rid of this typedef and replace its uses with
the proper type.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
COWed devices can't handle more than 32 (64 on x86_64) sectors in one request
due to the size of the bitmap being carried around in the io_thread_req.
Enforce that by telling the block layer not to put too many sectors in
requests to COWed devices.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It is theoretically possible for a request to finish and be freed between
writing it to the I/O thread and updating the sector count. In this case, the
update will dereference a freed pointer.
To avoid this, I delay the update until processing the next sg segment, when
the request pointer is known to be good.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Include linux/kernel.h wherever simple_strtoul is used. This kills a
compile warning in stderr_console.c and potential ones in the other files.
This also fixes a bunch of style violations in exitcode.c.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename os_{read_write}_file_k back to os_{read_write}_file, delete
the originals and their bogus infrastructure, and fix all the callers.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Formatting fixes ahead of renaming os_{read_write}_file_k to
os_{read_write}_file and fixing all the callers.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Convert all remaining os_{read_write}_file users to use the simple
{read,write} wrappers, os_{read_write}_file_k.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sanitise gfp flags; it actually is an atomic context, so drop the
GFP_KERNEL part.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of writing entire structures between UML and the I/O thread, we send
pointers. This cuts down on the amount of data being copied and possibly
allows more requests to be pending between the two.
This requires that the requests be kmalloced and freed instead of living on
the stack.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Send as many I/O requests to the I/O thread as possible, even though it will
still only handle one at a time. This provides an opportunity to reduce
latency by starting one request before the previous one has been finished in
the driver.
Request handling is somewhat modernized by requesting sg pieces of a request
and handling them separately, finishing off the entire request after all the
pieces are done.
When a request queue stalls, normally because its pipe to the I/O thread is
full, it is put on the restart list. This list is processed by starting up
the queues on it whenever there is some indication that progress might be
possible again. Currently, this happens in the driver interrupt routine.
Some requests have been finished, so there is likely to be room in the pipe
again.
This almost doubles throughput when copying data between devices, but made no
noticable difference on anything else I tried.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch starts the removal of a very old, very broken piece of code. This
stems from the problem of passing a userspace buffer into read() or write() on
the host. If that buffer had not yet been faulted in, read and write will
return -EFAULT.
To avoid this problem, the solution was to fault the buffer in before the
system call by touching the pages that hold the buffer by doing a copy-user of
a byte to each page. This is obviously bogus, but it does usually work, in tt
mode, since the kernel and process are in the same address space and userspace
addresses can be accessed directly in the kernel.
In skas mode, where the kernel and process are in separate address spaces, it
is completely bogus because the userspace address, which is invalid in the
kernel, is passed into the system call instead of the corresponding physical
address, which would be valid. Here, it appears that this code, on every host
read() or write(), tries to fault in a random process page. This doesn't seem
to cause any correctness problems, but there is a performance impact. This
patch, and the ones following, result in a 10-15% performance gain on a kernel
build.
This code can't be immediately tossed out because when it is, you can't log
in. Apparently, there is some code in the console driver which depends on
this somehow.
However, we can start removing it by switching the code which does I/O using
kernel addresses to using plain read() and write(). This patch introduces
os_read_file_k and os_write_file_k for use with kernel buffers and converts
all call locations which use obvious kernel buffers to use them. These
include I/O using buffers which are local variables which are on the stack or
kmalloc-ed. Later patches will handle the less obvious cases, followed by a
mass conversion back to the original interface.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Define release methods for the ubd and net drivers. They contain as much of
the remove methods as make sense. All error checking must have already been
done as well as anything else that might be holding a reference on the device
kobject.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
user_util.h isn't needed any more, so delete it and remove all includes of it.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If a disk fails to open, i.e. its host file doesn't exist, it won't be
removable because the hot-unplug code checks the existence of its gendisk.
This won't exist because it is only allocated for successfully opened disks.
Thus, a typo on the command line can result in a unusable and unfixable disk.
This is fixed by freeing the gendisk if it's there, but not letting that
affect the removal.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 62f96cb01e introduced per-devices
queues and locks, which was fine as far as it went, but left in place a
global which controlled access to submitting requests to the host. This
should have been made per-device as well, since it causes I/O hangs when
multiple block devices are in use.
This patch fixes that by replacing the global with an activity flag in the
device structure in order to tell whether the queue is currently being run.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: pnx8550 code creates directory but resets ->nlink to 1.
create_proc_entry() et al will correctly set ->nlink for you.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some small locking and formatting fixes in the ubd driver.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace global queue and lock with per-device queues and locks. Mostly a
straightforward replacement of ubd_io_lock with dev->lock and ubd_queue with
dev->queue.
Complications -
There was no way to get a request struct (and queue) from the
structure sent to the io_thread, so a pointer to the request was
added. This is needed in ubd_handler in order to kick do_ubd_request
to process another request.
Queue initialization is moved from ubd_init to ubd_add.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Locking fixes. Locking was totally lacking for the mconsole_devices, which
got a spin lock, and the unplugged pages data, which got a mutex.
The locking of the mconsole console output code was confused. Now, the
console_lock (renamed to client_lock) protects the clients list.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I noticed that errors happening while hotplugging devices from the host were
never returned back to the mconsole client. In some cases, success was
returned instead of even an information-free error.
This patch cleans that up by having the low-level configuration code pass back
an error string along with an error code. At the top level, which knows
whether it is early boot time or responding to an mconsole request, the string
is printk'd or returned to the mconsole client.
There are also whitespace and trivial code cleanups in the surrounding code.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a small memory leak in ubd_config, and clearify the confusion which lead
to it.
Then, some little changes not affecting operations -
* move init functions together,
* add a comment about a potential problem in case of some evolution in the block layer,
* mark all initcalls as static __init functions
* mark an used once little function as inline
* document that mconsole methods are all called in process context (was
triggered when checking ubd mconsole methods).
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
To simplify error handling, make sure fd is saved into ubd_dev->fd only when
we are sure it is an fd and not an error code.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use bitfields for boolean fields in ubd data structure.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pure whitespace and style fixes split out from subsequent patch. Some changes
(err -> ret) don't make sense now, only later, but I split them out anyway
since they cluttered the patch.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
do_ubd is actually just a boolean variable - the way it is used currently is a
leftover from the old 2.4 block layer, but it is still used; its use is
suspicious, but removing it would be too intrusive for now and needs more
thinking.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add some comments about requirements for ubd_io_lock and expand its use.
When an irq signals that the "controller" (i.e. another thread on the host,
which does the actual requests and is the only one blocked on I/O on the host)
has done some work, we call again the request function ourselves
(do_ubd_request).
We now do that with ubd_io_lock held - that's useful to protect against
concurrent calls to elv_next_request and so on.
XXX: Maybe we shouldn't call at all the request function. Input needed on
this. Are we supposed to plug and unplug the queue? That code "indirectly"
does that by setting a flag, called do_ubd, which makes the request function
return (it's a residual of 2.4 block layer interface).
Meanwhile, however, merge this patch, which improves things.
Cc: Jens Axboe <axboe@suse.de>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This lock protects ubd setup and teardown, so is only used in process context;
beyond that, during such setup memory allocations must be performed and some
generic functions which can sleep must be called (such as add_disk()). So the
only correct solution is to make it a mutex instead of a spin_lock. No other
change is done - this lock must be acquired in different places but it's done
afterwards.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
To rethink locking, I needed to understand well what each function does.
While doing this I renamed some:
* ubd_close -> ubd_close_dev (since it pairs with ubd_open_dev)
* ubd_new_disk -> ubd_disk_register (it handles registration with the block
layer - one hopes this makes clearer the difference with ubd_add())
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Rename the ubd_dev array to ubd_devs and then call any "struct ubd" ubd_dev
instead of dev, which doesn't make clear what we're treating (and no, it's not
hungarian notation - not any more than calling all vm_area_struct vma or all
inodes inode).
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add documentation about some fields in struct ubd, whose meaning is
non-obvious due to struct names (should change names altogether, I agree).
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With 256 minors and 16 minors used per each UBD device, we can allow the use
of up to 16 UBD devices per UML.
Also chnage parse_unit and leave to the caller (which already do it) the check
for excess numbers, since this is just supposed to do raw parsing.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Real fix for UML pt_regs stuff. Note set_irq_regs() logics in there...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
After Christophs SCSI change, the only usage left is RQ_ACTIVE
and RQ_INACTIVE. The block layer sets RQ_INACTIVE right before freeing
the request, so any check for RQ_INACTIVE in a driver is a bug and
indicates use-after-free.
So kill/clean the remaining users, straight forward.
Signed-off-by: Jens Axboe <axboe@suse.de>
Close two file descriptor leaks, one in the ubd driver and one to
/proc/mounts. The ubd driver bug also leaked some vmalloc space. The
/proc/mounts leak was a descriptor that was just never closed.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The dedevfsification of UML left an unused variable behind.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the new IRQF_ constants and remove the SA_INTERRUPT define
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/devfs-2.6: (22 commits)
[PATCH] devfs: Remove it from the feature_removal.txt file
[PATCH] devfs: Last little devfs cleanups throughout the kernel tree.
[PATCH] devfs: Rename TTY_DRIVER_NO_DEVFS to TTY_DRIVER_DYNAMIC_DEV
[PATCH] devfs: Remove the tty_driver devfs_name field as it's no longer needed
[PATCH] devfs: Remove the line_driver devfs_name field as it's no longer needed
[PATCH] devfs: Remove the videodevice devfs_name field as it's no longer needed
[PATCH] devfs: Remove the gendisk devfs_name field as it's no longer needed
[PATCH] devfs: Remove the miscdevice devfs_name field as it's no longer needed
[PATCH] devfs: Remove the devfs_fs_kernel.h file from the tree
[PATCH] devfs: Remove devfs_remove() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_cdev() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_bdev() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_symlink() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_dir() function from the kernel tree
[PATCH] devfs: Remove devfs_*_tape() functions from the kernel tree
[PATCH] devfs: Remove devfs support from the sound subsystem
[PATCH] devfs: Remove devfs support from the ide subsystem.
[PATCH] devfs: Remove devfs support from the serial subsystem
[PATCH] devfs: Remove devfs from the init code
[PATCH] devfs: Remove devfs from the partition code
...
A number of UML initcalls were improperly returning 1. Also removed any
nearby emacs formatting comments.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds a 'c' option to the ubd switch which turns off host file locking so
that the device can be shared, as with a cluster. There's also some
whitespace cleanup while I was in this file.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Improve (especially for coherence) some prototypes, and return code of
init_cow_file in error case - for a short write return -EINVAL, otherwise
return the error we got!
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Correct a bit of whitespace problems while working here.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>