Blob Blame History Raw
From 5d69aadca6a34dbd28749ba285ed57b42e63253c Mon Sep 17 00:00:00 2001
From: Himanshu Madhani <hmadhani@redhat.com>
Date: Thu, 21 Nov 2019 16:36:41 -0500
Subject: [PATCH 131/155] [scsi] scsi: qla2xxx: Really fix qla2xxx_eh_abort()

Message-id: <20191121163701.43688-7-hmadhani@redhat.com>
Patchwork-id: 287855
O-Subject: [RHLE 7.8 e-stor PATCH v3 06/26] scsi: qla2xxx: Really fix qla2xxx_eh_abort()
Bugzilla: 1731581
RH-Acked-by: Jarod Wilson <jarod@redhat.com>
RH-Acked-by: Ewan Milne <emilne@redhat.com>
RH-Acked-by: Tony Camuso <tcamuso@redhat.com>

From: Bart Van Assche <bvanassche@acm.org>

Bugzilla 1731581

I'm not sure how this happened but the patch that was intended to fix abort
handling was incomplete. This patch fixes that patch as follows:

 - If aborting the SCSI command failed, wait until the SCSI command
   completes.

 - Return SUCCESS instead of FAILED if an abort attempt races with SCSI
   command completion.

 - Since qla2xxx_eh_abort() increments the sp reference count by calling
   sp_get(), decrement the sp reference count before returning.

Cc: Himanshu Madhani <hmadhani@marvell.com>
Fixes: 219d27d7147e ("scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 8dd9593cc07ad7d999bef81b06789ef873a94881)
Signed-off-by: Himanshu Madhani <hmadhani@redhat.com>
Signed-off-by: Jan Stancek <jstancek@redhat.com>
---
 drivers/scsi/qla2xxx/qla_os.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 85243e30eb9e..9e0a724f08f6 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1271,6 +1271,7 @@ static int
 qla2xxx_eh_abort(struct scsi_cmnd *cmd)
 {
 	scsi_qla_host_t *vha = shost_priv(cmd->device->host);
+	DECLARE_COMPLETION_ONSTACK(comp);
 	srb_t *sp;
 	int ret;
 	unsigned int id, lun;
@@ -1308,6 +1309,7 @@ qla2xxx_eh_abort(struct scsi_cmnd *cmd)
 		return SUCCESS;
 	}
 
+	/* Get a reference to the sp and drop the lock. */
 	if (sp_get(sp)){
 		/* ref_count is already 0 */
 		spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
@@ -1335,6 +1337,23 @@ qla2xxx_eh_abort(struct scsi_cmnd *cmd)
 		sp->done(sp, DID_ABORT << 16);
 		ret = SUCCESS;
 		break;
+	case QLA_FUNCTION_PARAMETER_ERROR: {
+		/* Wait for the command completion. */
+		uint32_t ratov = ha->r_a_tov/10;
+		uint32_t ratov_j = msecs_to_jiffies(4 * ratov * 1000);
+
+		WARN_ON_ONCE(sp->comp);
+		sp->comp = &comp;
+		if (!wait_for_completion_timeout(&comp, ratov_j)) {
+			ql_dbg(ql_dbg_taskm, vha, 0xffff,
+			    "%s: Abort wait timer (4 * R_A_TOV[%d]) expired\n",
+			    __func__, ha->r_a_tov);
+			ret = FAILED;
+		} else {
+			ret = SUCCESS;
+		}
+		break;
+	}
 	default:
 		/*
 		 * Either abort failed or abort and completion raced. Let
@@ -1344,6 +1363,8 @@ qla2xxx_eh_abort(struct scsi_cmnd *cmd)
 		break;
 	}
 
+	sp->comp = NULL;
+	atomic_dec(&sp->ref_count);
 	ql_log(ql_log_info, vha, 0x801c,
 	    "Abort command issued nexus=%ld:%d:%d -- %x.\n",
 	    vha->host_no, id, lun, ret);
-- 
2.13.6