The SCSI midlayer retries commands based on the remote port state and
the command status reported by the driver. Returning
DID_TRANSPORT_DISRUPTED is a better approach, use this for reporting
FSF errors back to the SCSI midlayer. See
http://marc.info/?l=linux-scsi&m=125668044215051&w=2 as reference.
There is also no need in special treatment of ABORTED commands, so
remove the ZFCP_STATUS_FSFREQ_ABORTED, the commands are then returned
with DID_TRANSPORT_DISRUPTED.
Also remove the ZFCP_STATUS_FSFREQ_RETRY: It is useless, no retry is
happening in the FSF layer and nobody checks the state of this flag.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Introduce kmem_cache for ELS ADISC data to guarantee the required
hardware alignment and free the allocated memory in case the send
failes.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Remove some redundancies in FC related code and trace:
- drop redundant data from SAN trace (local s_id that only changes
during link down, ls_code that is already part of payload, d_id in
ct response trace that is always the same as in ct request trace)
- use one common fsf struct to hold zfcp data for ct and els requests
- leverage common fsf struct for FC passthrough job data, allocate it
with dd_bsg_data for passthrough requests and unify common code for
ct and els passthrough request
- simplify callback handling in zfcp_fc
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Instead of assigning 4 bytes with the highest byte masked out, use a 3
byte array with the ntoh24 and h24ton helper functions, thus
eliminating the need for the ZFCP_DID_MASK.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The well-known-address (WKA) port handling code is part of the FC code
in zfcp. Move everything WKA related to the zfcp_fc files and use the
common zfcp_fc prefix for structs and functions. Drop the unused key
management service while renaming the struct, no request could ever
reach this service in zfcp and it is obsolete anyway.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Use common code definitions for FC GPN_FT and GID_PN
instead of inventing private ones. Move the private structs still
required inside zfcp to zfcp_fc header file.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Use common code definitions for FC plogi, logo, rscn and adisc structs
instead of inventing private ones. Move the private struct for issuing
ELS ADISC inside zfcp to zfcp_fc header file.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Use common data structures for FCP CMND, FCP RSP and related
definitions and remove zfcp private definitions. Split the FCP CMND
setup and FCP RSP evaluation code in seperate functions. Use inline
functions to not negatively impact the I/O path.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
If an error occurs that triggers the call to fc_remote_port_delete,
ideally this call would happen before any I/O is passed back to the
SCSI midlayer through scsi_done. The SCSI midlayer will retry the
commands and fc_remote_port_chkready will return the correct status
code. But with the delay between calling scsi_done in softirq context
and the call to fc_remote_port_delete from the workqueue, there is a
window where zfcp returns DID_ERROR. This leads to SCSI error recovery
which then leads to offline SCSI devices since all recovery actions
will fail with the rport now being blocked.
In this window, zfcp has to return DID_IMM_RETRY just as the FC
transport class would do in fc_remote_port_chkready for the blocked
fc_rport. As soon as the fc_rport is BLOCKED, fc_remote_port_chkready
will do the right thing.
Additionally, there are two more cases to catch in zfcp_scsi_queuecommand:
- After the port has been opened, the unit has to be opened. During
this period I/O has to be retried. This can also be handled with
DID_IMM_RETRY.
- If the access to the unit fails, but the port is good, then
this single unit cannot be accessed and I/O to this unit has to fail
without involving the FC transport class.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The port_scan work was scheduled to the work_queue provided by the
kernel. This resulted on SMP systems to a likely situation that more
than one scan_work were processed in parallel. This is not required
and openes the possibility of race conditions between the removal of
invalid ports and the enqueue of just scanned ports. This patch
synchronizes the scan_work tasks by scheduling them to adapter local
work_queue.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The flag ZFCP_STATUS_COMMON_REMOVE was used to indicate that a
resource is not ready to be used or about to be removed from the
system. This is now better done by an improved list handling
and therefore the additional indicator is not required anymore.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
With the reference counting for zfcp data structures, it is now
possible to implement module unloading again. Module unloading
requires to free all data structures in the module exit function. This
is done by unregistering zfcp from s390 cio and the SCSI midlayer
first in the module exit function.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The latencies traced per fsf request are traced for sysfs output and
for blktrace, each in one function. Simplify the tracing code by
merging both tracing functions into one.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When accessing port and unit attributes, use container_of instead of
dev_get_drvdata. This eliminates some code checker warnings about
aliased access of data structures.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The callback for suspend is not required because it contains exactly
the same functionality as the _set_offline routine does.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The global config_mutex was required for the serialization of a
configuration change within the zfcp driver. This global locking is
now obsolete and can be removed. The requirement of serializing the
access to a zfcp_adapter reference via a ccw_device is realized wth a
static spinlock.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Replace the local reference counting by already available mechanisms
offered by kref. Where possible existing device structures were used,
including the same functionality.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The global config_lock was used to protect the configuration organized
in independent lists. It is not necessary to have a lock on driver
level for this purpose. This patch replaces the global config_lock
with a set of local list locks.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Adapt the change_queue_depth callback in zfcp for the new reason
parameter. Simply pass each call back to the SCSI midlayer, there are
no resource adjustments necessary for zfcp.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Removes check for (depth <= default_depth) in case of
SCSI_QDEPTH_RAMP_UP call back, not needed after added
max_queue_depth per sdev.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch modifies scsi_host_template->change_queue_depth so that
it takes an argument indicating why it is being called. This will be
used so that if a LLD needs to do some extra processing when
handling queue fulls or later ramp ups, it can do so.
This is a simple port of the drivers setting a change_queue_depth
callback. In the patch I just have these LLDs adjust the queue depth
if the user was requesting it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
[Vasu.Dev: v2
Also converted pmcraid_change_queue_depth and then verified
all modules compile using "make allmodconfig" for any new build
warnings on X86_64.
Updated original description after combing two original
patches from Mike to make this patch git bisectable.]
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
[jejb: fixed up 53c700]
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] zfcp: Flush SCSI registration work when adding unit
[SCSI] zfcp: Fix timer initialization for ct and els requests
[SCSI] zfcp: Warn about storage devices with broken PLOGI data
[SCSI] zfcp: Handle WWPN mismatch in PLOGI payload
[SCSI] zfcp: fix kfree handling in zfcp_init_device_setup
[SCSI] fix memory leak in initialization
When configuring a LUN for use in zfcp, flush the SCSI work to ensure
the SCSI device has been created before returning. This means that a
configuration procedure can run these commands in a script and the
SCSI device is available immediately after the unit_add:
echo 1 > /sys/bus/ccw/drivers/zfcp/0.0.181d/online
echo 0x401040C300000000 > \
/sys/bus/ccw/drivers/zfcp/0.0.181d/0x500507630313c562/unit_add
lsscsi
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add HZ since the start_timer function expects jiffies, not seconds.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
After opening a remote port zfcp checks if the WWPN returned in the
PLOGI maches the WWPN of the port that should have been opened. On a
mismatch zfcp assumes that the DID just changed, queries the FC
nameserver and tries again. If the situation persists the erp will
give up.
With this strategy, if the remote port always returns the wrong PLOGI
data, the remote port will not be opened. Introduce a warning, so that
the system administrator knows why the remote port is not being opened
and to have a pointer to investigate the problem on the storage
system.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
For ports, zfcp gets the DID from the FC nameserver and tries to open
the port. If the open succeeds, zfcp compares the WWPN from the
nameserver with the WWPN in the PLOGI payload. In case of a mismatch,
zfcp assumes that the DID of the port just changed and we opened the
wrong port. This means that zfcp has to forget the DID, lookup the DID
again and retry.
This error case had a problem that zfcp forgets the DID, but never
looks up a new one, stalling the ERP in this case. Fix this by
triggering the DID lookup and properly exit from the ERP. The DID
lookup will trigger a new ERP action.
Also ensure when trying to open the port again with the new DID, first
close the open port, even in the NOESC case.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The pointer that is allocated with kmalloc() is passed to strsep()
which modifies it. Later on the modified pointer value will be passed
to kfree. Save the original pointer and pass that one to kfree
instead.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fix this build error:
next-20091013 randconfig build on s390x build breaks with
drivers/s390/built-in.o:(.data+0x3354): undefined reference to `sclp_vt220_pm_event_fn'
Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <michael.holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use cio_is_console() in io_subchannel_probe to indicate that the
special handling is console specific. As long as there is no other
subchannel for which this might be true, it is misleading to speak
of "early devices". Should more of these devices be introduced,
a cleanup of all console special handling is in order anyway.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
8d65af78 "sysctl: remove "struct file *" argument of ->proc_handler"
removed the struct file argument from all proc_handlers but didn't
change the call home proc handler (or call home was merged later).
So fix this now.
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Hans-Joachim Picht <hans@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If the rdc_buffer is above 2G we need indirect addresssing so we have
to use an idaw to give the rdc_buffer to the ccw.
If the rdc_buffer is under 2G nothing changes.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Replace spin_lock with spin_lock_irqsave in dasd_eckd_restore_device.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When setting a channel attached tape online under Linux 2.6.31, the
"vol_id" process from udev hangs in sync_page():
2 sync_page+144 [0x1dfaac]
3 __wait_on_bit_lock+194 [0x58c23e]
4 __lock_page+116 [0x1df9dc]
5 truncate_inode_pages_range+728 [0x1ed7cc]
6 __blkdev_put+244 [0x25f738]
7 __fput+300 [0x229c4c]
8 filp_close+122 [0x225a3a]
The reason for that is an error in the request queue handling. It can
happen that we fetch a request, but do not process it further because
the number of queued requests exceeds TAPEBLOCK_MIN_REQUEUE.
To fix this, we should call blk_peek_request() instead of
blk_fetch_request() in the while condition and fetch the request in
the loop body afterwards.
This bug was introduced with the patch "block: implement and enforce
request peek/start/fetch" (9934c8c045)
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (34 commits)
[SCSI] qla2xxx: Fix NULL ptr deref bug in fail path during queue create
[SCSI] st: fix possible memory use after free after MTSETBLK ioctl
[SCSI] be2iscsi: Moving to pci_pools v3
[SCSI] libiscsi: iscsi_session_setup to allow for private space
[SCSI] be2iscsi: add 10Gbps iSCSI - BladeEngine 2 driver
[SCSI] zfcp: Fix hang when offlining device with offline chpid
[SCSI] zfcp: Fix lockdep warning when offlining device with offline chpid
[SCSI] zfcp: Fix oops during shutdown of offline device
[SCSI] zfcp: Fix initial device and cfdc for delayed adapter allocation
[SCSI] zfcp: correctly initialize unchained requests
[SCSI] mpt2sas: Bump version 02.100.03.00
[SCSI] mpt2sas: Support dev remove when phy status is MPI2_EVENT_SAS_TOPO_PHYSTATUS_VACANT
[SCSI] mpt2sas: Timeout occurred within the HANDSHAKE logic while waiting on firmware to ACK.
[SCSI] mpt2sas: Call init_completion on a per request basis.
[SCSI] mpt2sas: Target Reset will be issued from Interrupt context.
[SCSI] mpt2sas: Added SCSIIO, Internal and high priority memory pools to support multiple TM
[SCSI] mpt2sas: Copyright change to 2009.
[SCSI] mpt2sas: Added mpi2_history.txt for MPI2 headers.
[SCSI] mpt2sas: Update driver to MPI2 REV K headers.
[SCSI] bfa: Brocade BFA FC SCSI driver
...
There is a race while re-reading the device characteristics. After
cleaning the memory area a cqr is build which reads the device
characteristics. This may take a rather long time and the device
characteristics structure is zero during this. Now it could be
possible that the block tasklet starts working and a new cqr will be
build. The build_cp command refers to the device characteristics
structure and this may lead into a divide by zero exception.
Fix this by re-reading the device characteristics into a temporary
structur and copy the data to the original structure. Also take the
ccwdev_lock.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Improve the comments for switch cases without a break. This fixes
some warnings of a code checker tool.
Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Allow users to set boxed devices offline. After setting them
offline, the device state will still be boxed.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When a ccw device appears not operational, inform the associated
device driver and act according to the response: if the driver
wants to keep the device, put it into the disconnected state.
If not, or if there is no driver or if the device is not online,
unregister it. This approach is consistent with no-path event
handling.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When there is no path left to a ccw device, inform the associated
device driver and act according to the response: if the driver
wants to keep the device, put it into the disconnected state.
If not, or if there is no driver or if the device is not online,
unregister it.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There is a memory leak in /proc/cio_ignore. The iterator is allocated
in cio_ignore_proc_seq_start, but never freed in
cio_ignore_proc_seq_stop, because we cannot use the iterator
that was passed by seqfile. The seqfile interface passes the last
seen iterator to the stop function and not the first one. Since our
next function will return NULL at the end, the iter passed to
cio_ignore_proc_seq_stop is NULL. The original iter has leaked.
The solution is to use seq_open_private.
Found with kmemleak:
unreferenced object 0x1c720580 (size 32):
comm "head", pid 973, jiffies 4294958302
hex dump (first 32 bytes):
00 00 00 00 00 00 00 04 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<0000000000203154>] kmem_cache_alloc+0x190/0x19c
[<00000000003fb462>] cio_ignore_proc_seq_start+0x5e/0x128
[<0000000000231018>] seq_read+0xc8/0x4bc
[<0000000000273954>] proc_reg_read+0xa8/0xf4
[<000000000020e3d8>] vfs_read+0xac/0x1a4
[<000000000020e5c6>] SyS_read+0x52/0xa8
[<000000000011836e>] sysc_noemu+0x10/0x16
[<0000004690b7936c>] 0x4690b7936c
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Move dev_set_name to when we know that the device will actually be
registered in order to avoid a memory leak if the allocated memory
for the channel path has to be freed.
Signed-off-by: Michael Ernst <mernst@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix this build failure:
drivers/s390/built-in.o: In function `raw3270_pm_unfreeze':
(.text+0x3ac04): undefined reference to `ccw_device_force_console'
with:
CONFIG_TN3270=y
CONFIG_TN3270_CONSOLE=n
CONFIG_TN3215_CONSOLE=n
Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Running chchp --vary 0 and chccwdev -d on a FCP device with scsi
devices attached can lead to this thread hanging:
================================================================
STACK TRACE FOR TASK: 0x2fbfcc00 (kslowcrw)
STACK:
0 schedule+1136 [0x45f99c]
1 schedule_timeout+534 [0x46054e]
2 wait_for_common+374 [0x45f442]
3 blk_execute_rq+160 [0x217a2c]
4 scsi_execute+278 [0x26daf2]
5 scsi_execute_req+150 [0x26dc86]
6 sd_sync_cache+138 [0x28460a]
7 sd_shutdown+130 [0x28486a]
8 sd_remove+104 [0x284c84]
9 __device_release_driver+152 [0x257430]
10 device_release_driver+56 [0x2575c8]
11 bus_remove_device+214 [0x25672a]
12 device_del+352 [0x25456c]
13 __scsi_remove_device+108 [0x272630]
14 scsi_remove_device+66 [0x2726ba]
15 zfcp_ccw_remove+824 [0x335558]
16 ccw_device_remove+62 [0x2b3f2a]
17 __device_release_driver+152 [0x257430]
18 device_release_driver+56 [0x2575c8]
19 bus_remove_device+214 [0x25672a]
20 device_del+352 [0x25456c]
21 ccw_device_unregister+92 [0x2b48c4]
22 io_subchannel_remove+108 [0x2b4950]
23 css_remove+62 [0x2af7ee]
24 __device_release_driver+152 [0x257430]
25 device_release_driver+56 [0x2575c8]
26 bus_remove_device+214 [0x25672a]
27 device_del+352 [0x25456c]
28 device_unregister+38 [0x25464a]
29 css_sch_device_unregister+68 [0x2af97c]
30 ccw_device_call_sch_unregister+78 [0x2b581e]
31 worker_thread+604 [0x69eb0]
32 kthread+154 [0x6ff42]
33 kernel_thread_starter+6 [0x1c952]
================================================================
The problem is that the chchp --vary 0 leads to zfcp first calling
fc_remote_port_delete which blocks all scsi devices on the remote
port. Calling scsi_remove_device later lets the sd driver issue a
SYNCHRONIZE_CACHE command. This command stays on the "stopped" request
requeue because the SCSI device is blocked. Fix this by first removing
the scsi and fc hosts which removes all scsi devices and do not use
scsi_remove_device.
Reviewed-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.31-39.x.20090917-s390xdefault #1
-------------------------------------------------------
kslowcrw/83 is trying to acquire lock:
(&adapter->scan_work){+.+.+.}, at: [<0000000000169c5c>] __cancel_work_timer+0x64/0x3d4
but task is already holding lock:
(&zfcp_data.config_mutex){+.+.+.}, at: [<00000000004671ea>] zfcp_ccw_remove+0x66/0x384
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&zfcp_data.config_mutex){+.+.+.}:
[<0000000000189962>] __lock_acquire+0xe26/0x1834
[<000000000018a4b6>] lock_acquire+0x146/0x178
[<000000000058cb5a>] mutex_lock_nested+0x82/0x3ec
[<0000000000477170>] zfcp_fc_scan_ports+0x3ec/0x728
[<0000000000168e34>] worker_thread+0x278/0x3a8
[<000000000016ff08>] kthread+0x9c/0xa4
[<0000000000109ebe>] kernel_thread_starter+0x6/0xc
[<0000000000109eb8>] kernel_thread_starter+0x0/0xc
-> #0 (&adapter->scan_work){+.+.+.}:
[<0000000000189e60>] __lock_acquire+0x1324/0x1834
[<000000000018a4b6>] lock_acquire+0x146/0x178
[<0000000000169c9a>] __cancel_work_timer+0xa2/0x3d4
[<0000000000465cb2>] zfcp_adapter_dequeue+0x32/0x14c
[<00000000004673e4>] zfcp_ccw_remove+0x260/0x384
[<00000000004250f6>] ccw_device_remove+0x42/0x1ac
[<00000000003cb6be>] __device_release_driver+0x9a/0x10c
[<00000000003cb856>] device_release_driver+0x3a/0x4c
[<00000000003ca94c>] bus_remove_device+0xcc/0x114
[<00000000003c8506>] device_del+0x162/0x21c
[<0000000000425ff2>] ccw_device_unregister+0x5e/0x7c
[<000000000042607e>] io_subchannel_remove+0x6e/0x9c
[<000000000041ff9a>] css_remove+0x3e/0x7c
[<00000000003cb6be>] __device_release_driver+0x9a/0x10c
[<00000000003cb856>] device_release_driver+0x3a/0x4c
[<00000000003ca94c>] bus_remove_device+0xcc/0x114
[<00000000003c8506>] device_del+0x162/0x21c
[<00000000003c85e8>] device_unregister+0x28/0x38
[<0000000000420152>] css_sch_device_unregister+0x46/0x58
[<00000000004276a6>] io_subchannel_sch_event+0x28e/0x794
[<0000000000420442>] css_evaluate_known_subchannel+0x46/0xd0
[<0000000000420ebc>] slow_eval_known_fn+0x88/0xa0
[<00000000003caffa>] bus_for_each_dev+0x7e/0xd0
[<000000000042188c>] for_each_subchannel_staged+0x6c/0xd4
[<0000000000421a00>] css_slow_path_func+0x54/0xd8
[<0000000000168e34>] worker_thread+0x278/0x3a8
[<000000000016ff08>] kthread+0x9c/0xa4
[<0000000000109ebe>] kernel_thread_starter+0x6/0xc
[<0000000000109eb8>] kernel_thread_starter+0x0/0xc
cancel_work_sync is called while holding the config_mutex. But the
work that is being cancelled or flushed also uses the config_mutex.
Fix the resulting deadlock possibility by calling cancel_work_sync
earlier without holding the mutex. The best place to do is is after
offlining the device. No new port scan work will be scheduled for the
offline device, so this is a safe place to call cancel_work_sync.
Reviewed-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
With the change that the zfcp_adapter struct is only allocated when
the device is set online, the shutdown handler has to check for a
non-existing zfcp_adapter struct. On the other hand, this check is not
necessary in the offline callback, since an online device has the
zfcp_adapter allocated and we go through the offline callback before
removing the ccw device.
Reviewed-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
With the change for delaying the allocation of zfcp_adapter, the
initial device parameter function has to first call
ccw_device_set_online which allocates the zfcp_adapter structure.
Change this and adapt the cfdc part accordingly.
Reviewed-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The common initialization of ct/gs and els requests missed the
initialization of unchained requests. Fix this by moving the common
parts to a place that is called for all ct/gs and els requests.
Reviewed-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* remove asm/atomic.h inclusion from linux/utsname.h --
not needed after kref conversion
* remove linux/utsname.h inclusion from files which do not need it
NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
due to some personality stuff it _is_ needed -- cowardly leave ELF-related
headers and files alone.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>