author Tomer Tayar <ttayar@habana.ai> 2022-07-20 20:02:20 +0300
committer Oded Gabbay <ogabbay@kernel.org> 2022-09-18 13:29:50 +0300
commit 21fc79336b9587fcc251e77246b68b6e20340146
tree 864221e079aa6ee381b824bf98adcded51eec7c8 /drivers/misc/habanalabs/gaudi2
parent f018c54e3de6619c46e33ab1c613761e9fba21d0
habanalabs/gaudi2: mark PCIE access error as fatal
F/W events are enabled in a late phase of the device init, so an event
for a PCIE access error during the init, can be received after the init
is already done and considered as successful.
A resulting device reset, which does the same H/W init, can end
similarly with this event right after the reset is done and considered
as successful, and a loop of this sequence can continue.

To avoid it mark the PCIE access error as a fatal event, so after 2
consecutive events no more resets will be done.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
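
The loop-breaking idea in the message above can be sketched as a small standalone C model. This is only an illustration, not the driver's code: `sketch_dev`, `sketch_request_reset`, `SKETCH_RESET_FW_FATAL_ERR` and `SKETCH_MAX_CONSECUTIVE_FATAL` are hypothetical names, the flag value is made up, and it assumes that the second consecutive fatal request is the one that gets refused and that any non-fatal reset request clears the streak.

```c
#include <stdbool.h>

/* Hypothetical stand-in for HL_DRV_RESET_FW_FATAL_ERR; the real value is not shown in the diff. */
#define SKETCH_RESET_FW_FATAL_ERR	(1u << 0)

/* Assumption: resets stop once this many fatal requests arrive back to back. */
#define SKETCH_MAX_CONSECUTIVE_FATAL	2u

/* Hypothetical per-device bookkeeping for the sketch. */
struct sketch_dev {
	unsigned int consecutive_fatal;	/* fatal reset requests seen in a row */
};

/*
 * Decide whether a reset request should be honored.
 * Returns true if the reset should run, false if it is refused
 * because the fatal event keeps recurring after each reset.
 */
static bool sketch_request_reset(struct sketch_dev *dev, unsigned int flags)
{
	if (!(flags & SKETCH_RESET_FW_FATAL_ERR)) {
		/* A non-fatal request breaks the streak and is always honored. */
		dev->consecutive_fatal = 0;
		return true;
	}

	dev->consecutive_fatal++;

	/* The first fatal request is honored; repeats are refused to end the reset loop. */
	return dev->consecutive_fatal < SKETCH_MAX_CONSECUTIVE_FATAL;
}
```

In this model, an event that is not flagged as fatal would be honored every time, which corresponds to the endless reset sequence the message describes. Setting the fatal flag on the PCIE access error is what lets the repeat check stop it.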
Diffstat (limited to 'drivers/misc/habanalabs/gaudi2')
-rw-r--r-- drivers/misc/habanalabs/gaudi2/gaudi2.c 1
1 files changed, 1 insertions, 0 deletions
diff --git a/drivers/misc/habanalabs/gaudi2/gaudi2.c b/drivers/misc/habanalabs/gaudi2/gaudi2.c
index 2c43ed403509..68ab407fa6ba 100644
--- a/drivers/misc/habanalabs/gaudi2/gaudi2.c
+++ b/drivers/misc/habanalabs/gaudi2/gaudi2.c
@@ -8532,6 +8532,7 @@ static void gaudi2_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_ent
 	case GAUDI2_EVENT_PCIE_ADDR_DEC_ERR:
 		gaudi2_print_pcie_addr_dec_info(hdev,
 				le64_to_cpu(eq_entry->intr_cause.intr_cause_data));
+		reset_flags |= HL_DRV_RESET_FW_FATAL_ERR;
 		break;
 
 	case GAUDI2_EVENT_HMMU0_PAGE_FAULT_OR_WR_PERM ... GAUDI2_EVENT_HMMU12_SECURITY_ERROR: