* RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq (Jack Wang, 2020-07-29; 2 files, -2/+2)

lockdep triggers a warning from time to time when running a regression test:

  rnbd_client L685: </dev/nullb0@bla> Device disconnected.
  rnbd_client L1756: Unloading module
  workqueue: WQ_MEM_RECLAIM rtrs_client_wq:rtrs_clt_reconnect_work [rtrs_client] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
  WARNING: CPU: 2 PID: 18824 at kernel/workqueue.c:2517 check_flush_dependency+0xad/0x130

The root cause is that the workqueue core expects a WQ_MEM_RECLAIM workqueue never to flush a !WQ_MEM_RECLAIM one. In the case above the ib_addr workqueue is created without WQ_MEM_RECLAIM, while rtrs_wq has it set. To avoid the warning, remove the WQ_MEM_RECLAIM flag.

Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200724111508.15734-4-haris.iqbal@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
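A minimal sketch of the shape of the fix (allocation-site names simplified; the actual rtrs call sites differ):

  /* Before: a WQ_MEM_RECLAIM wq whose work items flush !WQ_MEM_RECLAIM
   * workqueues (ib_addr) trips check_flush_dependency(). */
  rtrs_wq = alloc_workqueue("rtrs_client_wq", WQ_MEM_RECLAIM, 0);

  /* After: drop the flag; rtrs is not on a memory-reclaim path that
   * would need a rescuer thread. */
  rtrs_wq = alloc_workqueue("rtrs_client_wq", 0, 0);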
* RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting (Danil Kipnis, 2020-07-29; 1 file, -2/+12)

In order to avoid all the clients starting to reconnect at the same time, schedule the reconnect dwork with a random jitter of +[0,8] seconds.

Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200724111508.15734-2-haris.iqbal@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
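A sketch of the jitter idea, with the field and constant names assumed for illustration rather than taken from rtrs-clt.c:

  /* Spread reconnect attempts over an extra 0..8 s window so that a
   * server restart does not make every client reconnect at once. */
  unsigned int delay_ms = clt->reconnect_delay_sec * 1000;

  delay_ms += prandom_u32() % 8000;  /* random jitter, up to 8 s */
  queue_delayed_work(rtrs_wq, &clt->reconnect_dwork,
                     msecs_to_jiffies(delay_ms));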
* RDMA/cma: Execute rdma_cm destruction from a handler properly (Jason Gunthorpe, 2020-07-29; 1 file, -90/+84)

When a rdma_cm_id needs to be destroyed after a handler callback fails, part of the destruction pattern is open coded into each call site. Unfortunately, the blind assignment to state discards important information needed to do cma_cancel_operation(). This results in active operations being left running after rdma_destroy_id() completes, and in the use-after-free bugs reported by KASAN.

Consolidate this entire pattern into destroy_id_handler_unlock() and manage the locking correctly. The state should be set to RDMA_CM_DESTROYING under the handler_lock to atomically ensure no further handlers are called.

Link: https://lore.kernel.org/r/20200723070707.1771101-5-leon@kernel.org
Reported-by: syzbot+08092148130652a6faae@syzkaller.appspotmail.com
Reported-by: syzbot+a929647172775e335941@syzkaller.appspotmail.com
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
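A simplified sketch of the consolidated helper; the real cma.c version carries more state handling:

  static void destroy_id_handler_unlock(struct rdma_id_private *id_priv)
  {
          enum rdma_cm_state state;

          /* Flip the state under handler_mutex so no further handler
           * callbacks can run, but keep the old state: it tells
           * cma_cancel_operation() what is still in flight. */
          lockdep_assert_held(&id_priv->handler_mutex);
          state = id_priv->state;
          id_priv->state = RDMA_CM_DESTROYING;
          mutex_unlock(&id_priv->handler_mutex);

          cma_cancel_operation(id_priv, state);
          _destroy_id(id_priv, state);
  }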
* RDMA/cma: Remove unneeded locking for req paths (Jason Gunthorpe, 2020-07-29; 1 file, -25/+6)

The REQ flows are concerned that once the handler is called on the new cm_id, the ULP can choose to trigger rdma_destroy_id() concurrently at any time. However, this is not true: while the ULP can call rdma_destroy_id(), it immediately blocks on the handler_mutex, which prevents anything harmful from running concurrently.

Remove the confusing extra locking and refcounts, and make the handler_mutex protecting state during destroy clearer.

Link: https://lore.kernel.org/r/20200723070707.1771101-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/cma: Using the standard locking pattern when delivering the removal event (Jason Gunthorpe, 2020-07-29; 1 file, -26/+36)

Whenever an event is delivered to the handler it should be done under the handler_mutex, and any non-zero return from the handler should trigger destruction of the cm_id. cma_process_remove() skips some steps here; this is not necessarily wrong, since the state change should prevent any races, but it is confusing and unnecessary.

Follow the standard pattern here, with the slight twist that the transition to RDMA_CM_DEVICE_REMOVAL includes a cma_cancel_operation().

Link: https://lore.kernel.org/r/20200723070707.1771101-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
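The standard delivery pattern referred to above, as a simplified sketch:

  mutex_lock(&id_priv->handler_mutex);
  ret = cma_cm_event_handler(id_priv, &event);
  if (ret) {
          /* Non-zero return from the ULP handler: destroy the id.
           * The helper releases handler_mutex itself. */
          destroy_id_handler_unlock(id_priv);
          return;
  }
  mutex_unlock(&id_priv->handler_mutex);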
* RDMA/cma: Simplify DEVICE_REMOVAL for internal_id (Jason Gunthorpe, 2020-07-29; 1 file, -1/+5)

cma_process_remove() triggers an unconditional rdma_destroy_id() for internal_ids and skips the event delivery and the transition through RDMA_CM_DEVICE_REMOVAL. This is confusing and unnecessary. internal_id always has cma_listen_handler() as the handler; have it catch the RDMA_CM_DEVICE_REMOVAL event, directly consume it, and signal removal.

This way the FSM sequence never skips the DEVICE_REMOVAL case and the logic in this hard-to-test area is simplified.

Link: https://lore.kernel.org/r/20200723070707.1771101-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/efa: Add EFA 0xefa1 PCI ID (Gal Pressman, 2020-07-29; 1 file, -2/+4)

Add support for 0xefa1 devices.

Link: https://lore.kernel.org/r/20200722140312.3651-5-galpress@amazon.com
Reviewed-by: Shadi Ammouri <sammouri@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
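A new device ID of this kind is normally a one-line addition to the driver's PCI ID table; a sketch, with macro names assumed rather than copied from efa_main.c:

  #define PCI_DEV_ID_EFA0_VF 0xefa0
  #define PCI_DEV_ID_EFA1_VF 0xefa1

  static const struct pci_device_id efa_pci_tbl[] = {
          { PCI_VDEVICE(AMAZON, PCI_DEV_ID_EFA0_VF) },
          { PCI_VDEVICE(AMAZON, PCI_DEV_ID_EFA1_VF) },  /* new device */
          { }
  };
  MODULE_DEVICE_TABLE(pci, efa_pci_tbl);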
* RDMA/efa: User/kernel compatibility handshake mechanism (Gal Pressman, 2020-07-29; 2 files, -0/+50)

Introduce a mechanism that performs a handshake between the userspace provider and the kernel driver, verifying that the user supports all the features required to operate correctly. The handshake compares the reported device caps and the provider caps: if the device reports a non-zero capability, the appropriate comp mask is required from the userspace provider in order to allocate the context.

Link: https://lore.kernel.org/r/20200722140312.3651-4-galpress@amazon.com
Reviewed-by: Shadi Ammouri <sammouri@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
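The gist of such a handshake, sketched with assumed field and bit names:

  /* Each capability the device exposes must be acknowledged by the
   * userspace provider through its comp_mask; otherwise the provider
   * is too old to drive this device safely. */
  if (dev->dev_attr.max_tx_batch &&
      !(cmd.comp_mask & EFA_ALLOC_UCONTEXT_CMD_COMP_TX_BATCH))
          return -EOPNOTSUPP;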
* RDMA/efa: Expose minimum SQ size (Gal Pressman, 2020-07-29; 5 files, -3/+7)

The device reports the minimum SQ size required for creation. This patch queries the min SQ size and reports it back to the userspace library.

Link: https://lore.kernel.org/r/20200722140312.3651-3-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Shadi Ammouri <sammouri@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/efa: Expose maximum TX doorbell batch (Gal Pressman, 2020-07-29; 5 files, -1/+17)

The device reports the maximum number of bytes to be written before ringing the doorbell (zero means unlimited). This patch queries the max batch size and reports it back to the userspace library.

Link: https://lore.kernel.org/r/20200722140312.3651-2-galpress@amazon.com
Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Firas JahJah <firasj@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
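This and the previous EFA patch share the same shape: query a device attribute once and hand it to userspace in the alloc_ucontext response. A sketch, with the response field names assumed:

  /* Forward device limits to the userspace provider; 0 for
   * max_tx_batch means "no batch limit". */
  resp.max_tx_batch = dev->dev_attr.max_tx_batch;
  resp.min_sq_wr = dev->dev_attr.min_sq_depth;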
* IB/srpt: use new shared CQ mechanism (Yamin Friedman, 2020-07-29; 2 files, -8/+10)

Have the driver use the shared CQs provided by the rdma core driver. This gives the advantage of more efficient interrupt handling.

Link: https://lore.kernel.org/r/20200722135629.49467-3-maxg@mellanox.com
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
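The shared CQ mechanism is the CQ pool API in the RDMA core; a minimal usage sketch (error handling trimmed):

  /* Take a CQ from the device-wide pool instead of allocating a
   * private one, so CQs and their interrupt vectors are shared. */
  cq = ib_cq_pool_get(ibdev, nr_cqe, -1 /* any vector */, IB_POLL_WORKQUEUE);
  if (IS_ERR(cq))
          return PTR_ERR(cq);

  /* ... create QPs against cq, run traffic ... */

  ib_cq_pool_put(cq, nr_cqe);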
* IB/isert: use new shared CQ mechanism (Yamin Friedman, 2020-07-29; 2 files, -153/+34)

Have the driver use the shared CQs provided by the rdma core driver. Since these provide functionality similar to iser_comp, iser_comp has been removed. There is now no reason to allocate very large CQs when the driver is loaded, while still gaining the advantage of shared CQs. Previously, when a single connection was opened, a CQ was opened for every core with enough space for eight connections; this is a very large overhead that in most cases goes unused.

Link: https://lore.kernel.org/r/20200722135629.49467-2-maxg@mellanox.com
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* IB/iser: use new shared CQ mechanism (Yamin Friedman, 2020-07-29; 2 files, -106/+29)

Have the driver use the shared CQs provided by the rdma core driver. Since these provide functionality similar to iser_comp, iser_comp has been removed. There is now no reason to allocate very large CQs when the driver is loaded, while still gaining the advantage of shared CQs.

Link: https://lore.kernel.org/r/20200722135629.49467-1-maxg@mellanox.com
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Acked-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/mlx5: Delete unreachable code (Leon Romanovsky, 2020-07-28; 2 files, -9/+4)

Delete two occurrences of unreachable code discovered by Coverity.

Link: https://lore.kernel.org/r/20200727095746.495915-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/core: Fix return error value in _ib_modify_qp() to negative (Li Heng, 2020-07-27; 1 file, -1/+1)

The error codes in _ib_modify_qp() are supposed to be negative errno.

Fixes: 7a5c938b9ed0 ("IB/core: Check for rdma_protocol_ib only after validating port_num")
Link: https://lore.kernel.org/r/1595645787-20375-1-git-send-email-liheng40@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Li Heng <liheng40@huawei.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
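The shape of this class of fix, illustratively:

  /* Positive errno values are easy to miss: callers check ret < 0.
   * Kernel convention is to return a negative errno. */
  ret = -EINVAL;   /* was: ret = EINVAL; */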
* Merge branch 'mlx5_uar' into rdma.git /for-next (Jason Gunthorpe, 2020-07-27; 7 files, -49/+180)

Meir Lichtinger says:

====================
ConnectX-7 supports setting relaxed ordering read/write mkey attribute by UMR, indicated by new HCA capabilities, so extend mlx5_ib driver to configure UMR control segment
====================

Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux due to dependencies.

* branch 'mlx5_uar':
  RDMA/mlx5: Set mkey relaxed ordering by UMR with ConnectX-7
  RDMA/mlx5: Use MLX5_SET macro instead of local structure
  RDMA/mlx5: ConnectX-7 new capabilities to set relaxed ordering by UMR
| * RDMA/mlx5: Set mkey relaxed ordering by UMR with ConnectX-7 (Meir Lichtinger, 2020-07-27; 3 files, -12/+45)

Up to ConnectX-7, UMR is not used when the user passes the relaxed ordering access flag. ConnectX-7 supports setting the relaxed ordering read/write mkey attribute by UMR, indicated by new HCA capabilities. With ConnectX-7 the driver uses UMR when the user sets the relaxed ordering access flag, in contrast to previous silicon models. Specifically, this includes setting the relevant flags of the mkey context mask in the UMR control segment, and the relaxed ordering write and read flags in the UMR mkey context segment.

Link: https://lore.kernel.org/r/20200716105248.1423452-4-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
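Roughly what the mkey context programming looks like; a sketch using the capability and field names this series introduces in mlx5_ifc.h, with the surrounding UMR WQE construction omitted:

  if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write_umr))
          MLX5_SET(mkc, mkc, relaxed_ordering_write, 1);
  if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_umr))
          MLX5_SET(mkc, mkc, relaxed_ordering_read, 1);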
| * RDMA/mlx5: Use MLX5_SET macro instead of local structure (Meir Lichtinger, 2020-07-27; 3 files, -20/+16)

Use the generic mlx5 structure defined in mlx5_ifc.h to represent ConnectX device data structures, instead of a structure defined specifically for the mlx5_ib module.

Link: https://lore.kernel.org/r/20200716105248.1423452-3-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
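For readers unfamiliar with the idiom: MLX5_SET programs fields of a firmware object through the auto-generated mlx5_ifc.h layouts, replacing hand-rolled structs and manual endianness handling. An illustrative fragment:

  u32 in[MLX5_ST_SZ_DW(create_mkey_in)] = {};
  void *mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry);

  MLX5_SET(mkc, mkc, lw, 1);           /* enable local write */
  MLX5_SET(mkc, mkc, qpn, 0xffffff);   /* not bound to a QP */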
| * RDMA/mlx5: ConnectX-7 new capabilities to set relaxed ordering by UMR (Meir Lichtinger, 2020-07-24; 1 file, -1/+3)

Up to ConnectX-7, setting the mkey relaxed ordering read/write attributes by UMR is not supported. ConnectX-7 supports this option, which is indicated by two new HCA capabilities: relaxed_ordering_write_umr and relaxed_ordering_read_umr.

Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| * net/mlx5: Enable count action for rules with allow action (Michael Guralnik, 2020-07-15; 1 file, -0/+1)

Enable the creation of rules with allow and count actions. This enables using counters on egress flow tables.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5: Add interface changes required for VDPA (Eli Cohen, 2020-07-15; 2 files, -16/+100)

Rename mlx5_ifc_device_virtio_emulation_cap_bits to mlx5_ifc_virtio_emulation_cap_bits to match the names produced by the tools that generate these files. In addition, add missing capabilities that will be required by the VDPA implementation.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5: Add VDPA interface type to supported enumerations (Eli Cohen, 2020-07-15; 1 file, -0/+1)

VDPA is a new interface that will be added in subsequent patches. It uses mlx5 core devices and resources. Add an interface type for it.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * net/mlx5: Support setting access rights of dma addresses (Eli Cohen, 2020-07-15; 3 files, -2/+16)

mlx5_fill_page_frag_array() is used to populate dma addresses for resources that require it, such as QPs, RQs, etc. When the resource is used, PA list permissions are ignored. For resources that use an MTT list, the user is required to provide the access rights. Subsequent patches use resources that require MTT lists, so modify the API and implementation to support that.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
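A sketch of how the extended API might be called; the permission flag names here are assumptions, not the exact mlx5 definitions:

  /* Populate the DMA address array and request read/write access
   * rights for resources whose MTT permissions are honored. */
  mlx5_fill_page_frag_array_perm(&res->frag_buf, pas,
                                 MLX5_MTT_PERM_READ | MLX5_MTT_PERM_WRITE);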
* | RDMA/mlx5: Fix typo in enum name (Pavel Machek, 2020-07-24; 1 file, -1/+1)

Nothing uses the enum name, so this is harmless.

Fixes: 322694412400 ("IB/mlx5: Introduce driver create and destroy flow methods")
Link: https://lore.kernel.org/r/20200724084112.GC31930@amd
Signed-off-by: Pavel Machek (CIP) <pavel@denx.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | IB/hfi1: Use fallthrough pseudo-keyword (Gustavo A. R. Silva, 2020-07-24; 11 files, -60/+41)

Replace the existing /* fall through */ comments and their variants with the new pseudo-keyword macro fallthrough [1]. Also, remove fall-through markings where they are unnecessary.

[1] https://www.kernel.org/doc/html/v5.7-rc7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Link: https://lore.kernel.org/r/20200721133455.GA14363@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
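An illustrative conversion; the switch body below is made up, not taken from hfi1:

  switch (wr->opcode) {
  case IB_WR_SEND_WITH_IMM:
          wqe->imm_data = wr->ex.imm_data;
          /* The comment form is invisible to the compiler; the
           * pseudo-keyword is checked by -Wimplicit-fallthrough. */
          fallthrough;
  case IB_WR_SEND:
          build_send_wqe(qp, wqe, wr);
          break;
  default:
          return -EINVAL;
  }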
* | RDMA/uverbs: Silence shiftTooManyBitsSigned warning (Leon Romanovsky, 2020-07-24; 1 file, -1/+1)

Fix a warning reported by the kbuild robot:

  drivers/infiniband/core/uverbs_cmd.c:1897:47: warning: Shifting signed 32-bit value by 31 bits is undefined behaviour [shiftTooManyBitsSigned]
   BUILD_BUG_ON(IB_USER_LAST_QP_ATTR_MASK == (1 << 31));
                                                ^

Link: https://lore.kernel.org/r/20200720175627.1273096-3-leon@kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
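The usual remedy, shown as a hedged sketch (the committed constant spelling may differ):

  /* 1 << 31 shifts into the sign bit of a signed int, which is
   * undefined behaviour; an unsigned literal is well defined. */
  BUILD_BUG_ON(IB_USER_LAST_QP_ATTR_MASK == (1u << 31));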
* | RDMA/uverbs: Remove redundant assignments (Leon Romanovsky, 2020-07-24; 1 file, -5/+5)

kbuild reported the following warning, so clean up the whole uverbs_cmd.c file:

  drivers/infiniband/core/uverbs_cmd.c:1066:6: warning: Variable 'ret' is reassigned a value before the old one has been used. [redundantAssignment]
   ret = uverbs_request(attrs, &cmd, sizeof(cmd));
   ^
  drivers/infiniband/core/uverbs_cmd.c:1064:0: note: Variable 'ret' is reassigned a value before the old one has been used.
   int ret = -EINVAL;
   ^

Link: https://lore.kernel.org/r/20200720175627.1273096-2-leon@kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/mlx5: Add missing srcu_read_lock in ODP implicit flow (Maor Gottlieb, 2020-07-24; 3 files, -1/+13)

According to the locking scheme, mlx5_ib_update_xlt() should be called with srcu_read_lock(dev->odp->srcu). Prefetch missed this. This fixes the below WARN from lockdep_assert_held():

  WARNING: CPU: 1 PID: 1130 at drivers/infiniband/hw/mlx5/odp.c:132 mlx5_odp_populate_xlt+0x175/0x180 [mlx5_ib]
  Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay ib_srp scsi_transport_srp rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm mlx5_ib ib_uverbs ib_core mlx5_core mlxfw ptp pps_core
  CPU: 1 PID: 1130 Comm: kworker/u16:11 Tainted: G W 5.8.0-rc5_for_upstream_debug_2020_07_13_11_04 #1
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
  Workqueue: events_unbound mlx5_ib_prefetch_mr_work [mlx5_ib]
  RIP: 0010:mlx5_odp_populate_xlt+0x175/0x180 [mlx5_ib]
  Code: 08 e2 85 c0 0f 84 65 ff ff ff 49 8b 87 60 01 00 00 be ff ff ff ff 48 8d b8 b0 39 00 00 e8 93 e0 50 e1 85 c0 0f 85 45 ff ff ff <0f> 0b e9 3e ff ff ff 0f 0b eb c7 0f 1f 44 00 00 48 8b 87 98 0f 00
  RSP: 0018:ffff88840f44fc68 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff88840cc9d000 RCX: ffff88840efcd940
  RDX: 0000000000000000 RSI: ffff88844871b9b0 RDI: ffff88840efce100
  RBP: ffff88840cc9d040 R08: 0000000000000040 R09: 0000000000000001
  R10: ffff88846ced3068 R11: 0000000000000000 R12: 00000000000156ec
  R13: 0000000000000004 R14: 0000000000000004 R15: ffff888439941000
  FS: 0000000000000000(0000) GS:ffff88846fa80000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f8536d12430 CR3: 0000000437a5e006 CR4: 0000000000360ea0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   mlx5_ib_update_xlt+0x37c/0x7c0 [mlx5_ib]
   pagefault_mr+0x315/0x440 [mlx5_ib]
   mlx5_ib_prefetch_mr_work+0x56/0xa0 [mlx5_ib]
   process_one_work+0x215/0x5c0
   worker_thread+0x3c/0x380
   ? process_one_work+0x5c0/0x5c0
   kthread+0x133/0x150
   ? kthread_park+0x90/0x90
   ret_from_fork+0x1f/0x30

Hold the SRCU during prefetch. Even though it strictly isn't needed, since prefetch holds the num_deferred_work, it does make things easier to reason about.

Fixes: 5256edcb98a1 ("RDMA/mlx5: Rework implicit ODP destroy")
Link: https://lore.kernel.org/r/20200719065747.131157-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
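A sketch of the fix in the prefetch work path; the names follow the mlx5 ODP code but are simplified:

  int srcu_key;

  /* pagefault_mr() -> mlx5_ib_update_xlt() asserts it runs under the
   * ODP SRCU read lock, so take it around the prefetch. */
  srcu_key = srcu_read_lock(&dev->odp_srcu);
  ret = pagefault_mr(mr, io_virt, length, NULL, flags);
  srcu_read_unlock(&dev->odp_srcu, srcu_key);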
* | RDMA/core: Update write interface to use automatic object lifetime (Leon Romanovsky, 2020-07-24; 1 file, -207/+86)

The automatic object lifetime model allows us to change the write() interface to have the same logic as the ioctl() path. Update the create/alloc functions to the following format, so the code flow will be the same (see the sketch below):

 * Allocate objects
 * Initialize them
 * Call the drivers; this is the last step that is allowed to fail
 * Finalize the object
 * Return the response and allow the core code to handle abort/commit respectively

Link: https://lore.kernel.org/r/20200719052223.75245-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
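That flow as generic pseudocode in C; none of these helper names are the real uverbs ones:

  obj = alloc_obj(attrs);                /* 1. allocate              */
  if (IS_ERR(obj))
          return PTR_ERR(obj);
  init_obj(obj, &cmd);                   /* 2. initialize            */
  ret = dev->ops.create_obj(obj, &cmd);  /* 3. driver call: the only */
  if (ret) {                             /*    step allowed to fail  */
          abort_obj(obj);                /*    core handles rollback */
          return ret;
  }
  finalize_obj(obj);                     /* 4. finalize              */
  return copy_response(attrs, obj);      /* 5. respond               */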
* | RDMA/core: Align abort/commit object scheme for write() and ioctl() paths (Leon Romanovsky, 2020-07-24; 4 files, -1/+25)

Create the same logic flow for the write() interface as we have for the ioctl() path, by making sure that the object is committed or aborted automatically after HW object creation.

Link: https://lore.kernel.org/r/20200719052223.75245-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/mlx5: Allow SQ modification (Maor Gottlieb, 2020-07-24; 1 file, -10/+8)

Currently the SQ is set to a ready state when the RAW QP is modified to INIT. When the TIS is modified, e.g. to change lag_tx_affinity, SQs that are already in the ready state are not affected. Open a window to modify the SQ behavior by setting the SQ as ready only when the QP is modified to RTS.

Link: https://lore.kernel.org/r/20200716105416.1423826-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA: rdma_user_ioctl.h: fix a duplicated word + clarify (Randy Dunlap, 2020-07-20; 1 file, -1/+1)

Change the repeated word "it" in a comment to "it to". Also insert a dash in the sentence to add clarity.

Link: https://lore.kernel.org/r/20200719003220.21250-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/bnxt_re: Update maintainers for Broadcom rdma driver (Devesh Sharma, 2020-07-20; 1 file, -0/+1)

Add a new co-maintainer for Broadcom's RDMA driver.

Link: https://lore.kernel.org/r/1594822619-4098-7-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/bnxt_re: Change wr posting logic to accommodate variable wqes (Devesh Sharma, 2020-07-20; 5 files, -173/+398)

Modify post-send and post-recv to initialize the wqes slot by slot dynamically, depending on the maximum number of SGEs requested by the consumer at QP creation time. Change the QP creation logic to determine the size of the SQ and RQ in 16B slots, based on the number of wqes and the number of SGEs requested by the consumer.

Link: https://lore.kernel.org/r/1594822619-4098-6-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/bnxt_re: Add helper data structures (Devesh Sharma, 2020-07-20; 2 files, -0/+46)

Add a few helper data structures that are useful for initializing hardware send wqes in variable wqe mode. Add a qp flag in the HSI to indicate that variable wqe mode is enabled for this qp.

Link: https://lore.kernel.org/r/1594822619-4098-5-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/bnxt_re: Pull psn buffer dynamically based on prod (Devesh Sharma, 2020-07-20; 2 files, -58/+74)

Change the PSN management memory buffers from statically initialized to a dynamic pull scheme. During create qp only the start pointers are initialized; during post-send the psn buffer is pulled based on the current producer index. Adjust the post_send code to accommodate the dynamic psn-pull, and change the post_recv code to match the post-send code wrt pseudo flush wqe generation.

Link: https://lore.kernel.org/r/1594822619-4098-4-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/bnxt_re: introduce a function to allocate swq (Devesh Sharma, 2020-07-20; 3 files, -171/+207)

The bnxt_re driver now allocates shadow sq and rq to maintain per-wqe wr_id and a few other flags required to support variable wqes. Segregate the allocation of the shadow queue into a separate function and adjust the cqe polling logic. The new polling logic is based on shadow queue indices.

Link: https://lore.kernel.org/r/1594822619-4098-3-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/bnxt_re: introduce wqe mode to select execution path (Devesh Sharma, 2020-07-20; 4 files, -17/+42)

The bnxt_re driver needs to decide how much SQ and RQ memory to allocate and which wqe posting/polling algorithm to use. Set the wqe-mode to a default value during the device registration sequence. The wqe-mode is passed to the lower layer driver as well; going forward, the lower layer driver will use the wqe-mode to decide the execution path. Initialize the wqe-mode to the static wqe type for now.

Link: https://lore.kernel.org/r/1594822619-4098-2-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/qedr: Remove the query_pkey callback (Kamal Heib, 2020-07-20; 2 files, -3/+1)

Now that query_pkey() isn't mandated by the RDMA core for iwarp providers, this callback can be removed from the common ops and moved to the RoCE-only ops within the qedr driver.

Link: https://lore.kernel.org/r/20200714183414.61069-8-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/i40iw: Remove the query_pkey callback (Kamal Heib, 2020-07-20; 1 file, -19/+0)

Now that query_pkey() isn't mandated by the RDMA core for iwarp providers, this callback can be removed.

Link: https://lore.kernel.org/r/20200714183414.61069-7-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/cxgb4: Remove the query_pkey callback (Kamal Heib, 2020-07-20; 1 file, -11/+0)

Now that query_pkey() isn't mandated by the RDMA core for iwarp providers, this callback can be removed.

Link: https://lore.kernel.org/r/20200714183414.61069-6-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/siw: Remove the query_pkey callback (Kamal Heib, 2020-07-20; 3 files, -11/+0)

Now that query_pkey() isn't mandated by the RDMA core for iwarp providers, this callback can be removed.

Link: https://lore.kernel.org/r/20200714183414.61069-5-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/core: Remove query_pkey from the mandatory ops (Kamal Heib, 2020-07-20; 1 file, -1/+3)

query_pkey() isn't mandatory for the iwarp providers, so remove this requirement from the RDMA core.

Link: https://lore.kernel.org/r/20200714183414.61069-4-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/core: Allocate the pkey cache only if the pkey_tbl_len is set (Kamal Heib, 2020-07-20; 1 file, -16/+29)

Allocate the pkey cache only if the pkey_tbl_len is set by the provider; also add checks to avoid accessing the pkey cache when it is not initialized.

Link: https://lore.kernel.org/r/20200714183414.61069-3-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
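An illustrative guard for the cache setup path, simplified from the real ib_cache_update() logic:

  /* iWARP ports report pkey_tbl_len == 0; only build a cache when
   * the provider actually has a pkey table. */
  if (tprops->pkey_tbl_len) {
          pkey_cache = kmalloc(struct_size(pkey_cache, table,
                                           tprops->pkey_tbl_len),
                               GFP_KERNEL);
          if (!pkey_cache)
                  goto err;
          pkey_cache->table_len = tprops->pkey_tbl_len;
  }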
* | RDMA/core: Expose pkeys sysfs files only if pkey_tbl_len is set (Kamal Heib, 2020-07-20; 1 file, -20/+41)

Expose the pkeys sysfs files only if the pkey_tbl_len is set by the provider.

Link: https://lore.kernel.org/r/20200714183414.61069-2-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/rxe: Prevent access to wr->next ptr after wr is posted to send queue (Mikhail Malygin, 2020-07-16; 1 file, -1/+4)

rxe_post_send_kernel() iterates over the linked list of wrs until the wr->next ptr is NULL. However, if we get an interrupt after the last wr is posted, control may be returned to the code only after the send completion callback has executed and the wr memory has been freed. As a result, the wr->next pointer may contain an incorrect value, leading to a panic. Store wr->next on the stack before posting it.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20200716190340.23453-1-m.malygin@yadro.com
Signed-off-by: Mikhail Malygin <m.malygin@yadro.com>
Signed-off-by: Sergey Kojushev <s.kojushev@yadro.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
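The essence of the fix as a simplified loop; the real rxe_post_send_kernel() carries more arguments and error handling:

  while (wr) {
          struct ib_send_wr *next = wr->next;  /* save before posting */

          err = post_one_send(qp, wr, mask);   /* wr may be freed once
                                                  its completion fires */
          if (err)
                  break;
          wr = next;  /* never dereference wr->next after posting */
  }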
* | RDMA/qedr: Add EDPM max size to alloc ucontext response (Michal Kalderon, 2020-07-16; 2 files, -4/+11)

User space should receive the maximum edpm size from the kernel driver, similar to other edpm/ldpm related limits. Add an additional parameter to the alloc_ucontext_resp structure for the edpm maximum size. In addition, pass an indication from user space to kernel (and not just kernel to user) that the DPM sizes are supported. This is for supporting backward/forward compatibility between driver and lib for everything related to DPM transaction and limit sizes. This should have been part of the commit mentioned in the Fixes tag.

Link: https://lore.kernel.org/r/20200707063100.3811-3-michal.kalderon@marvell.com
Fixes: 93a3d05f9d68 ("RDMA/qedr: Add kernel capability flags for dpm enabled mode")
Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/qedr: Add EDPM mode type for user-fw compatibility (Michal Kalderon, 2020-07-16; 3 files, -5/+11)

In older FW versions the completion flag was treated as the ack flag in edpm messages. Commit ff937b916eb6 ("qed: Add EDPM mode type for user-fw compatibility") exposed the FW option of setting which mode the QP is in by adding a flag to the qedr <-> qed API. This patch adds the qedr <-> libqedr interface so that libqedr can set the flag appropriately and qedr can pass it down to the FW. The flag is added for backward compatibility with libqedr: for older libs this flag didn't exist and is therefore set to zero.

Fixes: ac1b36e55a51 ("qedr: Add support for user context verbs")
Link: https://lore.kernel.org/r/20200707063100.3811-2-michal.kalderon@marvell.com
Signed-off-by: Yuval Bason <yuval.bason@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/usnic: switch from 'pci_' to 'dma_' API (Christophe JAILLET, 2020-07-16; 1 file, -2/+2)

The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below. It has been compile tested. When memory is allocated, GFP_ATOMIC should be used to be consistent with the surrounding code.

  @@ @@
  - PCI_DMA_BIDIRECTIONAL
  + DMA_BIDIRECTIONAL

  @@ @@
  - PCI_DMA_TODEVICE
  + DMA_TO_DEVICE

  @@ @@
  - PCI_DMA_FROMDEVICE
  + DMA_FROM_DEVICE

  @@ @@
  - PCI_DMA_NONE
  + DMA_NONE

  @@ expression e1, e2, e3; @@
  - pci_alloc_consistent(e1, e2, e3)
  + dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

  @@ expression e1, e2, e3; @@
  - pci_zalloc_consistent(e1, e2, e3)
  + dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

  @@ expression e1, e2, e3, e4; @@
  - pci_free_consistent(e1, e2, e3, e4)
  + dma_free_coherent(&e1->dev, e2, e3, e4)

  @@ expression e1, e2, e3, e4; @@
  - pci_map_single(e1, e2, e3, e4)
  + dma_map_single(&e1->dev, e2, e3, e4)

  @@ expression e1, e2, e3, e4; @@
  - pci_unmap_single(e1, e2, e3, e4)
  + dma_unmap_single(&e1->dev, e2, e3, e4)

  @@ expression e1, e2, e3, e4, e5; @@
  - pci_map_page(e1, e2, e3, e4, e5)
  + dma_map_page(&e1->dev, e2, e3, e4, e5)

  @@ expression e1, e2, e3, e4; @@
  - pci_unmap_page(e1, e2, e3, e4)
  + dma_unmap_page(&e1->dev, e2, e3, e4)

  @@ expression e1, e2, e3, e4; @@
  - pci_map_sg(e1, e2, e3, e4)
  + dma_map_sg(&e1->dev, e2, e3, e4)

  @@ expression e1, e2, e3, e4; @@
  - pci_unmap_sg(e1, e2, e3, e4)
  + dma_unmap_sg(&e1->dev, e2, e3, e4)

  @@ expression e1, e2, e3, e4; @@
  - pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
  + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

  @@ expression e1, e2, e3, e4; @@
  - pci_dma_sync_single_for_device(e1, e2, e3, e4)
  + dma_sync_single_for_device(&e1->dev, e2, e3, e4)

  @@ expression e1, e2, e3, e4; @@
  - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
  + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

  @@ expression e1, e2, e3, e4; @@
  - pci_dma_sync_sg_for_device(e1, e2, e3, e4)
  + dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

  @@ expression e1, e2; @@
  - pci_dma_mapping_error(e1, e2)
  + dma_mapping_error(&e1->dev, e2)

  @@ expression e1, e2; @@
  - pci_set_dma_mask(e1, e2)
  + dma_set_mask(&e1->dev, e2)

  @@ expression e1, e2; @@
  - pci_set_consistent_dma_mask(e1, e2)
  + dma_set_coherent_mask(&e1->dev, e2)

Link: https://lore.kernel.org/r/20200711073120.249146-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
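What the conversion looks like at a typical call site; an illustrative fragment, not usnic's actual code:

  /* before */
  dma = pci_map_single(pdev, buf, len, PCI_DMA_TODEVICE);

  /* after: the same mapping through the generic DMA API */
  dma = dma_map_single(&pdev->dev, buf, len, DMA_TO_DEVICE);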
* | IB/hfi1: Remove unnecessary fall-through markings (Gustavo A. R. Silva, 2020-07-16; 1 file, -13/+14)

Reorganize the code a bit in a more standard way [1] and remove unnecessary fall-through markings.

[1] https://lore.kernel.org/lkml/20200708054703.GR207186@unreal/

Link: https://lore.kernel.org/r/20200709235250.GA26678@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>