Miquel van Smoorenburg <miquels@cistron.nl> forwarded me this fix to
resolve a deadlock condition that occurs due to the API change in
2.6.13+ kernels dropping the host locking when entering the error
handling. They all end up calling adpt_i2o_post_wait(), which if you
call it unlocked, might return with host_lock locked anyway and that
causes a deadlock.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This follows on from Jens' patch and consolidates all of the ULD
separate handlers for REQ_BLOCK_PC into a single call which has his
fix for our direction bug.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When we got a device only capable of async, we would zero out goal->period
which would cause us to try PPR negotiations. Leave goal->period alone,
and check goal->offset before doing PPR. Kudos to Daniel Forsgren for
figuring this out.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Some hardware does not support the PACKET command at all.
Other hardware supports ATAPI, but the driver does something nasty such
as calling BUG() when an ATAPI command is issued.
For these such cases, we mark them with a new flag, ATA_FLAG_NO_ATAPI.
Initial version contributed by Ben Collins.
There is no user of qc->waiting left after ata_exec_internal()
changes. Kill the field.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
There is no user of ata_qc_wait_err() and ata_qc_complete_noop() after
ata_exec_internal() changes. Remove unused functions.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
This patch converts all users of libata internal commands to use
ata_exec_internal().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
This patch implements ata_exec_internal() function which performs
libata internal command execution. Previously, this was done by each
user by manually initializing a qc, issueing it, waiting for its
completion and handling errors. In addition to obvious code
factoring, using ata_exec_internal() fixes the following bugs.
* qc not freed on issue failure
* ap->qactive clearing could race with the next internal command
* race between timeout handling and irq
* ignoring error condition not represented in tf->status
Also, qc & hardware are not accessed anymore once it's completed,
making internal commands more conformant with general semantics.
ata_exec_internal() also makes it easy to issue internal commands from
multiple threads if that becomes necessary.
This patch only implements ata_exec_internal(). A following patch
will convert all users.
Signed-off-by: Tejun Heo <htejun@gmail.com>
--
Jeff, all patches have been regenerated against upstream branch as of
today. (575ab52a21)
Also, I took out a debug printk from ata_exec_internal (don't know how
that one got left there). Other than that, all patches are identical
to the previous posting.
Thanks. :-)
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Fix incorrect pointer usage on two calls to kunmap_atomic().
This seems to happen a lot, because kunmap() wants the struct page *,
whereas kunmap_atomic() instead wants the mapped virtual address.
Signed-off-by: Mark Lord <liml@rtr.ca>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
There is a double free in the scsi scan code if a LLDD's slave_alloc()
call fails. There is a direct call to scsi_free_queue and then the
following put_device calls the release function, which also frees the
queue.
Remove the redundant scsi_free_queue.
Signed-off-by: Brian King <brking@us.ibm.com>
Tested-by: Nathan Lynch <ntl@pobox.com>
[ Also removed some strange whitespace artifacts in that area ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ok lets start with the 'easy' stuff. This includes my research and
summary of chip errata into the new driver so that people can refer to
it when updating ata_piix.
No code changes
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Current scsi scanning code appears to have a use after free
bug is a LLDD's slave_alloc fails. Remove the redundant
scsi_free_queue.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This reverts commit 1b0997f561, which in
turn reverted 34ea80ec6a (which is thus
re-instated).
Quoth James Bottomley:
"All it's doing is deferring the device_put() from the
scsi_put_command() to after the scsi_run_queue(), which doesn't fix
the sleep while atomic problem of the device release method. In both
cases we still get the semaphore in atomic context problem which is
caused by scsi_reap_target() doing a device_del(), which I assumed
(wrongly) was valid from atomic context."
who also promised to fix scsi_reap_target().
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The scsi_library routines don't correctly set DMA_NONE when
req->data_len is zero (instead they check the command type first, so
if it's write, we end up with req->data_len == 0 and direction as
DMA_TO_DEVICE which confuses some drivers)
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The eh_action semaphore in scsi_eh_send_command is cleared after a
command timeout. The command is subsequently aborted and the abort
will try to call scsi_done() on it. Unfortunately, the scsi_eh_done()
routine unconditinally completes the semaphore (which is now null).
Fix this race by makiong the scsi_eh_done() routine check that the
semaphore is non null before completing it (mirroring the ordinary
command done/timeout logic).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The SCSI megaraid drive goes to great effort to kmap
the scatterlist buffer (if used), but then uses the
wrong pointer when copying to it afterward.
Signed-off-by: Mark Lord <lkml@rtr.ca>
Acked by: Ju, Seokmann <Seokmann.Ju@engenio.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Properly check FC_RESID for any non-transfered bytes
regardless of firmware completion status.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
A regression in a recent change
33135aa2a5 caused the driver
to mistakenly drop handling of AENs. Due to the incorrect
handling, ports would not reappear after RSCNs and LIPs.
Drops unused/incorrect compound #define from qla_def.h.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This makes ibmvscsi work correctly with the recent set of kexec
patches that went in. This is based on work by Michael Ellerman, who
chased this initially. He validated that it works during kexec.
Handle kexec correctly in ibmvscsi. During kexec the adapter
will not get cleaned up correctly, so we may need to reset it
to make it sane again.
Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
1. ata_pio_complete():
It seems unnecessary to wait for the clearing of the DRQ bit.
(Waiting for BSY=0 should be enough.
ata_ok() also checks the correctness of the status bits later.)
2. ata_pio_block():
- added error checking, before transfering data.
- minor comments fix
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
============
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
- set qc->err_mask directly when we found the error
- remove the code to determine err_mask from device status
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
============
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
- move "qc->err_mask |= AC_ERR_ATA_BUS" to where the error is found
- add "assert(qc->err_mask)" to ata_pio_error() to make sure qc->err_mask was available when we enter the error state
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
============
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
- remove err_mask from the parameter list of the complete functions
- move err_mask to ata_queued_cmd
- initialize qc->err_mask when needed
- for each function call to ata_qc_complete(), replace the err_mask parameter with qc->err_mask.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
===============
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
- add qc to ata_pio_poll()
- reorder the initialization of qc in ata_pio_complete()
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
===================
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
This patch makes ata_scsi_pass_thru() properly set result code and
sense data on translation failures.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
This reverts commit 34ea80ec6a.
It does a put_device() from softirq context, which is bad since it gets
a semaphore for reading.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
sg's st_map_user_pages is modelled on an earlier version of st's
sgl_map_user_pages, and has the same bug: if get_user_pages got some but
not all of the pages, then those got were released, but the positive res
code returned implied that they were still to be freed.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2.6.15-rc1 made sg's st_unmap_user_pages and st's sgl_unmap_user_pages
BUG on a PageReserved page. But that's wrong: they could be unmapping
the ZERO_PAGE, which is marked PG_reserved; and perhaps others (while
get_user_pages is still permitted on VM_PFNMAP areas - that may change).
More change is needed here: sg claims to dirty even pages written from,
and st claims not to dirty even pages read into; and SetPageDirty is not
adequate for this nowadays. Fixes to those follow in a later patch: for
the moment just fix the 2.6.15 regression.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Nick and I had already been looking at drivers/scsi/{sg.c,st.c},
brought there by __put_page in sg.c's peculiar sg_rb_correct4mmap,
which we'd like to remove. But that's irrelevant to your pain, except...
One extract from the patches I'd like to send Doug and Kai for 2.6.15
or 2.6.16 is this below: since the incomplete get_user_pages path omits
to reset res, but has already released all the pages, it will result in
premature freeing of user pages, and behaviour just like you've seen.
Though I'd have thought incomplete get_user_pages was an exceptional
case, and a bit surprised you'd encounter it. Perhaps there's some
other premature freeing in the driver, and this instance has nothing
whatever to do with it.
If the problem were easily reproducible, it'd be great if you could
try this patch; but I think you've said it's not :-(
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Enabling these features causes problems with some drives, so disable
them until they're debugged
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn.
scsi_bios_ptable return value is not being checked in aac_biosparm.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Some SCSI devices apparently get very confused if we try to use the
echo buffer on a non-DT negotiated bus (this mirrors the problems of
using PPR on non-LVD for some devices). The fix is to be far more
conservative about when we use an echo buffer. With this patch, we'll
now see what parameters are negotiated by the read only test, and only
look for an echo buffer if DT is negotiated.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Hi,
the patch below marks several libata (and libata-driver) structures
const so that they end up in the .rodata segment and don't false-share
cachelines with things that get dirtied often.
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
This fixes locking in megaraid.c, namely:
(1) make sure megaraid_queue release the adapter lock by changing the
code to have a single return
(2) remove the errornous scsi_assign_lock call
Testing by Burton Windle.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Burton Windle <bwindle@fint.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
To transport scsi reset command to device aic7xxx reset handler looks
at the driver's pending_list and searches any proper command. However
the search condition has been inverted: ahc_match_scb() returns TRUE
if a matched command is found. As a result the reset on required
devices did not turn out well, a correctly working neighbour device
may be surprised by the reset. aic7xxx reset handler reports about the
success, but really the original situation is not corrected yet.
Signed-off-by: Vasily Averin <vvs@sw.ru>
Naturally, there's a corresponding problem in the aic79xx driver, so
I've also added the same fix for that.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_get_command() attempts to write into a structure that may not have
been successfully allocated. Move this write inside the if statement that
ensures we won't panic the kernel with a NULL pointer dereference.
Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>