The zfcp_port might have been removed, while the FC fast_io_fail timer
is still running and could trigger the terminate_rport_io callback.
Set the pointer to the zfcp_port to NULL and check accordingly
before using it.
Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When the abort handler cannot find a pending FSF request, the request
completion could just be running. This means we cannot return SUCCESS,
since this would lead to call to scsi_done after exiting the SCSI
error handler which is not allowed.
Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When running the scsi_scan from the zfcp workqueue and the target
device does not respond, the zfcp workqueue can block until the
scsi_scan hits a timeout. Move the work to the scsi host workqueue,
since this one is also used for the scan from the SCSI midlayer.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
It will not be necessary to set the erp failed status bit
in case a SCSI device is removed by the SCSI mid layer.
In the case a SCSI device is unavailable for a short time
(15 to 20 seconds) a FCP unit will not get on-line again.
Signed-off-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Use the I/O blocking mechanism in the FC transport class to allow
faster failovers for multipathing:
- Call fc_remote_port_delete early to set the rport to BLOCKED.
- Check the rport status in queuecommand with fc_remote_portchkready
to no longer accept new I/O for this port and fail the I/O with the
appropriate scsi_cmnd result.
- Implement the terminate_rport_io handler to abort all pending I/O
requests
- Return SCSI commands with DID_TRANSPORT_DISRUPTED while erp is
running.
- When updating the remote port status, check for late changes and
update the remote ports status accordingly.
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The current number based id ERP logging is replaced by a string
based tag version. The benefit is an easier location of the code in
question and the removal of the lengthy array referencing the
individual messages.
The string (7 bytes) based version does not use more space since those
bytes were "used" anyway due to the alignment of the structure.
The encoding of the 7 byte string is as follows
[0-1] = filename
[2-5] = task/function
[6] = section
Due to the character of this string (fixed length) a string
termination is not required here.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When the SCSI midlayer is running error recovery, the low-level error
recovery in zfcp could be running and preventing the SCSI midlayer to
issue error recovery requests. To avoid unnecessary error recovery
escalation, wait for the zfcp erp to finish and retry if necessary.
While reworking the SCSI eh handlers, alsa cleanup the code and
simplify the interface from zfcp_scsi to the fsf layer.
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Use the device pointer in zfcp_unit for tracking if we have a
registered SCSI device. With this approach, the flag
ZFCP_STATUS_UNIT_REGISTERED is only redundant and can be removed.
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The zfcp_scsi_queuecommand was not acting according to the standard
when the respective unit was not available. In this case an -EBUSY was
returned, which is not valid in itself, and in addition scsi_done
was called. This combination is not allowed and was leading to a
double finish of the request and therefor double decrement of the
host_busy counter.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
It is possible that a remote port has a problem, the SCSI device gets
deleted after the rport timeout and then the timeout for pending SCSI
commands trigger an abort. For this case, don't delete the reference
from the SCSI device to the zfcp unit, so that we can still have the
reference to issue an abort request.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reduce the size of zfcp data structures by removing unused and
redundant members. scsi_lun is only the mangled version of the
fcp_lun. So, remove the redundant field and use the fcp_lun instead.
Since the queue lock and the pci_batch indicator are only used in the
request queue, move them from the common queue struct to the adapter
struct.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
- Remove unused references and declarations, including one instance
of the FC ls_adisc struct that has been defined twice.
- Also remove the flags COMMON_OPENING, COMMON_CLOSING,
ADAPTER_REGISTERED and XPORT_OK that are only set and cleared, but
not checked anywhere.
- Remove the zfcp specific atomic_test_mask makro. Simply use
atomic_read directly instead.
- Remove the zfcp internal sg helper functions and switch the places
where it is still used to call sg_virt directly.
- With the update of the QDIO code, the QDIO data structures no
longer use the volatile type qualifier. Now we can also remove the
volatile qualifiers from the zfcp code.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Update the kernel messages in zfcp with input from the message review
and remove some messages that have been identified as redundant.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
zfcp was using three files to deal with sysfs representation
for adapters, ports and units. The consolidation into one file
prevents code-duplication and eases maintainability.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cleanup code in zfcp_scsi.c, fix coding style issues and simplify the
code.
Signed-off-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Move the accessor functions for the scsi_cmnd status from zfcp to the
SCSI include file. Change the interface to the functions to pass the
scsi_cmnd pointer instead of the status pointer.
Signed-off-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The sysfs attribute /sys/class/fc_host/hostX/port_state was not set by
zfcp so far.
Now, the appropriate members of the fc_function_template struct are
set during its initialziation. The first is a boolean to show the port
state. The second is a function pointer to the function
zfcp_get_host_port_state, which reads the port state from our adapter
status bits and calls fc_host_port_state with the approriate port
state afterwards.
Signed-off-by: Sven Schuetz <sven@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cleanup the messages used in the zfcp driver: Remove unnecessary debug
and trace message and convert the remaining messages to standard
kernel macros. Remove the zfcp message macros and while updating the
whole flie also update the copyright headers.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The latency information is provided on a SCSI device level (LUN)
which can be found at the following location
/sys/class/scsi_device/<H:C:T:L>/device/cmd_latency
/sys/class/scsi_device/<H:C:T:L>/device/read_latency
/sys/class/scsi_device/<H:C:T:L>/device/write_latency
Each sysfs attribute provides the available data: min, max and sum for
fabric and channel latency and the number of requests processed.
An overrun of the variables is neither detected nor treated. The file
has to be read twice to make a meaningful statement, because only the
differences of the values between the two reads can be used. A reset
of the values can be achieved by writing to the attribute.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The new FCP adapter statistics provide a variety of information about
the virtual adapter (subchannel). In order to collect this information
the zfcp driver is extended to query this information.
The information provided by the new FCP adapter statistics can be
fetched by reading from the following files in the sysfs filesystem
/sys/class/scsi_host/host<n>/seconds_active
/sys/class/scsi_host/host<n>/requests
/sys/class/scsi_host/host<n>/megabytes
/sys/class/scsi_host/host<n>/utilization
These are the statistics on a virtual adapter (subchannel) level.
The information provided is raw and not modified or interpreted by any
means. No interpretation or modification of the values is done by the
zfcp driver.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_fsf_incoming_els_rscn’:
drivers/s390/scsi/zfcp_aux.c:1379: warning: cast from pointer to integer of
different size
drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_fsf_incoming_els_plogi’:
drivers/s390/scsi/zfcp_aux.c:1432: warning: cast from pointer to integer of
different size
drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_fsf_incoming_els_logo’:
drivers/s390/scsi/zfcp_aux.c:1457: warning: cast from pointer to integer of
different size
..
Just passing pointers rids us of these warnings and improves readability.
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch allows any recovery event to be traced back to an exact
cause, e.g. a particular request identified by an id (address).
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch writes a trace record which provides information about state
changes for adapters, ports and units, e.g. target failure, targets becoming
online, targets being temporarily blocked due to pending recovery, targets
which have been recovered successfully etc.
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
[based on proposal from Mike Christie <michaelc@cs.wisc.edu>, this
patch adds some simplifications to the handler functions]
With the new target reset handler callback in the SCSI midlayer, the
device reset handler in zfcp can be split in two parts. Now, zfcp does
not have to track anymore whether the device supports LUN resets, so
remove this flag and let the SCSI midlayer decide what to do.
The device reset handler simply issues a LUN reset and the target
reset handler a target reset.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We need to hold the queue-lock when checking whether we still have a valid
unit/port handle for the FCP command, i.e whether we can issue this request for
this unit/port. If the error recovery is about to close this unit/port, then it
competes for the queue-lock. If the close request issued by the error recovery
wins, then it is guaranteed that this unit/port has been blocked for other
requests.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When adding an invalid LUN, there is a deadlock between the add
via scsi_scan_target and the slave_destroy handler: The handler
waits for the scan to complete, but for an invalid unit,
scsi_scan_target directly calls the slave_destroy handler.
Fix the deadlock by removing the wait in the slave_destroy
handler, it was not necessary anyway.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The callback function used by zfcp always returns success,
which is an indication for the SCSI midlayer to stop error
handling. Remove the bus_reset callback, since the same
function will be called via the host_reset callback.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cleanup the whitepace from the entire zfcp driver to prevent
to have those changes in future feature or function patches.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
cleanup, using ERP request mempool for all ERP versions of
the exchange functions (exchange_config (ECD), exchange_port (EPD) )
providing individual versions of the ECD, EPD functions for ERP
and other purposes (_sync).
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Remove braces for only one statement
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
IO stall after deleting and path checker changes after reenabling zfcp device
Setting one zfcp device offline using chccwdev in a multipath
environment and waiting will lead to IO stall on all paths.
After setting the zfcp device back online using chccwdev,
the devices with io stall will have a different path checker.
Devices corresponding to the deleted units are never freed.
This has the effect that 'slave_destroy' is never called and zfcp
still thinks that this unit is registered
(ZFCP_STATUS_UNIT_REGISTERED is still set). Hence the erp
routine is not called correctly and the unit is not enabled properly.
Do not delete rport and the sdev. Just set the host to block on
'offline'. Setting host online again will then remove the blocked status
and everything is fine again.
Signed-off-by: Michael Loehr <mloehr2@linux.vnet.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Simplify request ID management and make sure that frequently used
functions are inlined. Also fix a memory leak in zfcp_adapter_enqueue()
which only gets hit in error handling.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The SCSI stack requires low level drivers to register and
unregister devices. For zfcp this leads to the situation where
zfcp calls the SCSI stack, the SCSI tries to scan the new device
and the scan SCSI command fails. This would require the zfcp erp,
but the erp thread is already blocked in the register call.
The fix is to make sure that the calls from the ERP thread to
the SCSI stack do not block the ERP thread. In detail:
1) Use a workqueue to avoid blocking of the scsi_scan_target calls.
2) When removing a unit make sure that no scsi_scan_target call is
pending.
3) Replace scsi_flush_work with scsi_target_unblock. This avoids
blocking and has the same result.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
All on stack DECLARE_COMPLETIONs should be replaced by:
DECLARE_COMPLETION_ONSTACK
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix the fix ... One of my previous fixes introduced removal of all fsf
requests in zfcp's eh_host_reset_handler. But this must not happen
before qdio queues are shut down. So, I revert the changes of
zfcp_scsi_eh_host_reset_handler.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This instance will be used whenever a timer is needed for
a request by zfcp.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
zfcp's eh_abort_handler used the wrong request ID to
identify the request to be aborted. The bug was introduced
with commit fea9d6c7bc
for improved management of request IDs. The bug is
fixed with this patch.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Create private slab caches in order to guarantee proper alignment of
data structures that get passed to hardware.
Sidenote: with this patch slab cache debugging will finally work on s390
(at least no known problems left).
Furthermore this patch does some minor cleanups:
- store ptr for transport template in struct zfcp_data
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Compile fix ups and
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Bug fixes for zfcp's erp:
- trigger adapter reopen if do_QDIO fails
- avoid erp deadlock if registration of scsi target or remote port hang
- do not treat as error if exchange port data fails
- decrease timeout for target reset and aborts
- mark unit failed if slave_destroy is called
Additionally some code cleanup was done:
- made some functions void when retval is not of interest
- shortened initialization of zfcp's host_template
- corrected some comments
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If zfcp's port erp fails we now call fc_remote_port_delete. This helps
to avoid offlined scsi devices if scsi commands time out due to path
failures. When an adapter erp fails we call fc_remote_port_delete for
all ports on that adapter.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Ralph Wuerthner <rwuerthn@de.ibm.com>
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>