diff options
34 files changed, 2917 insertions, 47 deletions
diff --git a/Documentation/driver-api/auxiliary_bus.rst b/Documentation/driver-api/auxiliary_bus.rst index 2312506b0674..fff96c7ba7a8 100644 --- a/Documentation/driver-api/auxiliary_bus.rst +++ b/Documentation/driver-api/auxiliary_bus.rst @@ -1,5 +1,7 @@ .. SPDX-License-Identifier: GPL-2.0-only +.. _auxiliary_bus: + ============= Auxiliary Bus ============= diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst index e9b65035cd47..a1b32fcd0d76 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst @@ -12,6 +12,8 @@ Contents - `Enabling the driver and kconfig options`_ - `Devlink info`_ - `Devlink parameters`_ +- `mlx5 subfunction`_ +- `mlx5 port function`_ - `Devlink health reporters`_ - `mlx5 tracepoints`_ @@ -97,6 +99,11 @@ Enabling the driver and kconfig options | Provides low-level InfiniBand/RDMA and `RoCE <https://community.mellanox.com/s/article/recommended-network-configuration-examples-for-roce-deployment>`_ support. +**CONFIG_MLX5_SF=(y/n)** + +| Build support for subfunction. +| Subfunctons are more light weight than PCI SRIOV VFs. Choosing this option +| will enable support for creating subfunction devices. **External options** ( Choose if the corresponding mlx5 feature is required ) @@ -176,6 +183,214 @@ User command examples: values: cmode driverinit value true +mlx5 subfunction +================ +mlx5 supports subfunction management using devlink port (see :ref:`Documentation/networking/devlink/devlink-port.rst <devlink_port>`) interface. + +A Subfunction has its own function capabilities and its own resources. This +means a subfunction has its own dedicated queues (txq, rxq, cq, eq). These +queues are neither shared nor stolen from the parent PCI function. + +When a subfunction is RDMA capable, it has its own QP1, GID table and rdma +resources neither shared nor stolen from the parent PCI function. + +A subfunction has a dedicated window in PCI BAR space that is not shared +with ther other subfunctions or the parent PCI function. This ensures that all +devices (netdev, rdma, vdpa etc.) of the subfunction accesses only assigned +PCI BAR space. + +A Subfunction supports eswitch representation through which it supports tc +offloads. The user configures eswitch to send/receive packets from/to +the subfunction port. + +Subfunctions share PCI level resources such as PCI MSI-X IRQs with +other subfunctions and/or with its parent PCI function. + +Example mlx5 software, system and device view:: + + _______ + | admin | + | user |---------- + |_______| | + | | + ____|____ __|______ _________________ + | | | | | | + | devlink | | tc tool | | user | + | tool | |_________| | applications | + |_________| | |_________________| + | | | | + | | | | Userspace + +---------|-------------|-------------------|----------|--------------------+ + | | +----------+ +----------+ Kernel + | | | netdev | | rdma dev | + | | +----------+ +----------+ + (devlink port add/del | ^ ^ + port function set) | | | + | | +---------------| + _____|___ | | _______|_______ + | | | | | mlx5 class | + | devlink | +------------+ | | drivers | + | kernel | | rep netdev | | |(mlx5_core,ib) | + |_________| +------------+ | |_______________| + | | | ^ + (devlink ops) | | (probe/remove) + _________|________ | | ____|________ + | subfunction | | +---------------+ | subfunction | + | management driver|----- | subfunction |---| driver | + | (mlx5_core) | | auxiliary dev | | (mlx5_core) | + |__________________| +---------------+ |_____________| + | ^ + (sf add/del, vhca events) | + | (device add/del) + _____|____ ____|________ + | | | subfunction | + | PCI NIC |---- activate/deactive events---->| host driver | + |__________| | (mlx5_core) | + |_____________| + +Subfunction is created using devlink port interface. + +- Change device to switchdev mode:: + + $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev + +- Add a devlink port of subfunction flaovur:: + + $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88 + pci/0000:06:00.0/32768: type eth netdev eth6 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false + function: + hw_addr 00:00:00:00:00:00 state inactive opstate detached + +- Show a devlink port of the subfunction:: + + $ devlink port show pci/0000:06:00.0/32768 + pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcisf pfnum 0 sfnum 88 + function: + hw_addr 00:00:00:00:00:00 state inactive opstate detached + +- Delete a devlink port of subfunction after use:: + + $ devlink port del pci/0000:06:00.0/32768 + +mlx5 function attributes +======================== +The mlx5 driver provides a mechanism to setup PCI VF/SF function attributes in +a unified way for SmartNIC and non-SmartNIC. + +This is supported only when the eswitch mode is set to switchdev. Port function +configuration of the PCI VF/SF is supported through devlink eswitch port. + +Port function attributes should be set before PCI VF/SF is enumerated by the +driver. + +MAC address setup +----------------- +mlx5 driver provides mechanism to setup the MAC address of the PCI VF/SF. + +The configured MAC address of the PCI VF/SF will be used by netdevice and rdma +device created for the PCI VF/SF. + +- Get the MAC address of the VF identified by its unique devlink port index:: + + $ devlink port show pci/0000:06:00.0/2 + pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 + function: + hw_addr 00:00:00:00:00:00 + +- Set the MAC address of the VF identified by its unique devlink port index:: + + $ devlink port function set pci/0000:06:00.0/2 hw_addr 00:11:22:33:44:55 + + $ devlink port show pci/0000:06:00.0/2 + pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 + function: + hw_addr 00:11:22:33:44:55 + +- Get the MAC address of the SF identified by its unique devlink port index:: + + $ devlink port show pci/0000:06:00.0/32768 + pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcisf pfnum 0 sfnum 88 + function: + hw_addr 00:00:00:00:00:00 + +- Set the MAC address of the VF identified by its unique devlink port index:: + + $ devlink port function set pci/0000:06:00.0/32768 hw_addr 00:00:00:00:88:88 + + $ devlink port show pci/0000:06:00.0/32768 + pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcivf pfnum 0 sfnum 88 + function: + hw_addr 00:00:00:00:88:88 + +SF state setup +-------------- +To use the SF, the user must active the SF using the SF function state +attribute. + +- Get the state of the SF identified by its unique devlink port index:: + + $ devlink port show ens2f0npf0sf88 + pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false + function: + hw_addr 00:00:00:00:88:88 state inactive opstate detached + +- Activate the function and verify its state is active:: + + $ devlink port function set ens2f0npf0sf88 state active + + $ devlink port show ens2f0npf0sf88 + pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false + function: + hw_addr 00:00:00:00:88:88 state active opstate detached + +Upon function activation, the PF driver instance gets the event from the device +that a particular SF was activated. It's the cue to put the device on bus, probe +it and instantiate the devlink instance and class specific auxiliary devices +for it. + +- Show the auxiliary device and port of the subfunction:: + + $ devlink dev show + devlink dev show auxiliary/mlx5_core.sf.4 + + $ devlink port show auxiliary/mlx5_core.sf.4/1 + auxiliary/mlx5_core.sf.4/1: type eth netdev p0sf88 flavour virtual port 0 splittable false + + $ rdma link show mlx5_0/1 + link mlx5_0/1 state ACTIVE physical_state LINK_UP netdev p0sf88 + + $ rdma dev show + 8: rocep6s0f1: node_type ca fw 16.29.0550 node_guid 248a:0703:00b3:d113 sys_image_guid 248a:0703:00b3:d112 + 13: mlx5_0: node_type ca fw 16.29.0550 node_guid 0000:00ff:fe00:8888 sys_image_guid 248a:0703:00b3:d112 + +- Subfunction auxiliary device and class device hierarchy:: + + mlx5_core.sf.4 + (subfunction auxiliary device) + /\ + / \ + / \ + / \ + / \ + mlx5_core.eth.4 mlx5_core.rdma.4 + (sf eth aux dev) (sf rdma aux dev) + | | + | | + p0sf88 mlx5_0 + (sf netdev) (sf rdma device) + +Additionally, the SF port also gets the event when the driver attaches to the +auxiliary device of the subfunction. This results in changing the operational +state of the function. This provides visiblity to the user to decide when is it +safe to delete the SF port for graceful termination of the subfunction. + +- Show the SF port operational state:: + + $ devlink port show ens2f0npf0sf88 + pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false + function: + hw_addr 00:00:00:00:88:88 state active opstate attached + Devlink health reporters ======================== diff --git a/Documentation/networking/devlink/devlink-port.rst b/Documentation/networking/devlink/devlink-port.rst new file mode 100644 index 000000000000..e99b41599465 --- /dev/null +++ b/Documentation/networking/devlink/devlink-port.rst @@ -0,0 +1,199 @@ +.. SPDX-License-Identifier: GPL-2.0 + +.. _devlink_port: + +============ +Devlink Port +============ + +``devlink-port`` is a port that exists on the device. It has a logically +separate ingress/egress point of the device. A devlink port can be any one +of many flavours. A devlink port flavour along with port attributes +describe what a port represents. + +A device driver that intends to publish a devlink port sets the +devlink port attributes and registers the devlink port. + +Devlink port flavours are described below. + +.. list-table:: List of devlink port flavours + :widths: 33 90 + + * - Flavour + - Description + * - ``DEVLINK_PORT_FLAVOUR_PHYSICAL`` + - Any kind of physical port. This can be an eswitch physical port or any + other physical port on the device. + * - ``DEVLINK_PORT_FLAVOUR_DSA`` + - This indicates a DSA interconnect port. + * - ``DEVLINK_PORT_FLAVOUR_CPU`` + - This indicates a CPU port applicable only to DSA. + * - ``DEVLINK_PORT_FLAVOUR_PCI_PF`` + - This indicates an eswitch port representing a port of PCI + physical function (PF). + * - ``DEVLINK_PORT_FLAVOUR_PCI_VF`` + - This indicates an eswitch port representing a port of PCI + virtual function (VF). + * - ``DEVLINK_PORT_FLAVOUR_PCI_SF`` + - This indicates an eswitch port representing a port of PCI + subfunction (SF). + * - ``DEVLINK_PORT_FLAVOUR_VIRTUAL`` + - This indicates a virtual port for the PCI virtual function. + +Devlink port can have a different type based on the link layer described below. + +.. list-table:: List of devlink port types + :widths: 23 90 + + * - Type + - Description + * - ``DEVLINK_PORT_TYPE_ETH`` + - Driver should set this port type when a link layer of the port is + Ethernet. + * - ``DEVLINK_PORT_TYPE_IB`` + - Driver should set this port type when a link layer of the port is + InfiniBand. + * - ``DEVLINK_PORT_TYPE_AUTO`` + - This type is indicated by the user when driver should detect the port + type automatically. + +PCI controllers +--------------- +In most cases a PCI device has only one controller. A controller consists of +potentially multiple physical, virtual functions and subfunctions. A function +consists of one or more ports. This port is represented by the devlink eswitch +port. + +A PCI device connected to multiple CPUs or multiple PCI root complexes or a +SmartNIC, however, may have multiple controllers. For a device with multiple +controllers, each controller is distinguished by a unique controller number. +An eswitch is on the PCI device which supports ports of multiple controllers. + +An example view of a system with two controllers:: + + --------------------------------------------------------- + | | + | --------- --------- ------- ------- | + ----------- | | vf(s) | | sf(s) | |vf(s)| |sf(s)| | + | server | | ------- ----/---- ---/----- ------- ---/--- ---/--- | + | pci rc |=== | pf0 |______/________/ | pf1 |___/_______/ | + | connect | | ------- ------- | + ----------- | | controller_num=1 (no eswitch) | + ------|-------------------------------------------------- + (internal wire) + | + --------------------------------------------------------- + | devlink eswitch ports and reps | + | ----------------------------------------------------- | + | |ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 |ctrl-0 | | + | |pf0 | pf0vfN | pf0sfN | pf1 | pf1vfN |pf1sfN | | + | ----------------------------------------------------- | + | |ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 |ctrl-1 | | + | |pf0 | pf0vfN | pf0sfN | pf1 | pf1vfN |pf1sfN | | + | ----------------------------------------------------- | + | | + | | + ----------- | --------- --------- ------- ------- | + | smartNIC| | | vf(s) | | sf(s) | |vf(s)| |sf(s)| | + | pci rc |==| ------- ----/---- ---/----- ------- ---/--- ---/--- | + | connect | | | pf0 |______/________/ | pf1 |___/_______/ | + ----------- | ------- ------- | + | | + | local controller_num=0 (eswitch) | + --------------------------------------------------------- + +In the above example, the external controller (identified by controller number = 1) +doesn't have the eswitch. Local controller (identified by controller number = 0) +has the eswitch. The Devlink instance on the local controller has eswitch +devlink ports for both the controllers. + +Function configuration +====================== + +A user can configure the function attribute before enumerating the PCI +function. Usually it means, user should configure function attribute +before a bus specific device for the function is created. However, when +SRIOV is enabled, virtual function devices are created on the PCI bus. +Hence, function attribute should be configured before binding virtual +function device to the driver. For subfunctions, this means user should +configure port function attribute before activating the port function. + +A user may set the hardware address of the function using +'devlink port function set hw_addr' command. For Ethernet port function +this means a MAC address. + +Subfunction +============ + +Subfunction is a lightweight function that has a parent PCI function on which +it is deployed. Subfunction is created and deployed in unit of 1. Unlike +SRIOV VFs, a subfunction doesn't require its own PCI virtual function. +A subfunction communicates with the hardware through the parent PCI function. + +To use a subfunction, 3 steps setup sequence is followed. +(1) create - create a subfunction; +(2) configure - configure subfunction attributes; +(3) deploy - deploy the subfunction; + +Subfunction management is done using devlink port user interface. +User performs setup on the subfunction management device. + +(1) Create +---------- +A subfunction is created using a devlink port interface. A user adds the +subfunction by adding a devlink port of subfunction flavour. The devlink +kernel code calls down to subfunction management driver (devlink ops) and asks +it to create a subfunction devlink port. Driver then instantiates the +subfunction port and any associated objects such as health reporters and +representor netdevice. + +(2) Configure +------------- +A subfunction devlink port is created but it is not active yet. That means the +entities are created on devlink side, the e-switch port representor is created, +but the subfunction device itself it not created. A user might use e-switch port +representor to do settings, putting it into bridge, adding TC rules, etc. A user +might as well configure the hardware address (such as MAC address) of the +subfunction while subfunction is inactive. + +(3) Deploy +---------- +Once a subfunction is configured, user must activate it to use it. Upon +activation, subfunction management driver asks the subfunction management +device to instantiate the subfunction device on particular PCI function. +A subfunction device is created on the :ref:`Documentation/driver-api/auxiliary_bus.rst <auxiliary_bus>`. +At this point a matching subfunction driver binds to the subfunction's auxiliary device. + +Terms and Definitions +===================== + +.. list-table:: Terms and Definitions + :widths: 22 90 + + * - Term + - Definitions + * - ``PCI device`` + - A physical PCI device having one or more PCI bus consists of one or + more PCI controllers. + * - ``PCI controller`` + - A controller consists of potentially multiple physical functions, + virtual functions and subfunctions. + * - ``Port function`` + - An object to manage the function of a port. + * - ``Subfunction`` + - A lightweight function that has parent PCI function on which it is + deployed. + * - ``Subfunction device`` + - A bus device of the subfunction, usually on a auxiliary bus. + * - ``Subfunction driver`` + - A device driver for the subfunction auxiliary device. + * - ``Subfunction management device`` + - A PCI physical function that supports subfunction management. + * - ``Subfunction management driver`` + - A device driver for PCI physical function that supports + subfunction management using devlink port interface. + * - ``Subfunction host driver`` + - A device driver for PCI physical function that hosts subfunction + devices. In most cases it is same as subfunction management driver. When + subfunction is used on external controller, subfunction management and + host drivers are different. diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst index d82874760ae2..aab79667f97b 100644 --- a/Documentation/networking/devlink/index.rst +++ b/Documentation/networking/devlink/index.rst @@ -18,6 +18,7 @@ general. devlink-info devlink-flash devlink-params + devlink-port devlink-region devlink-resource devlink-reload diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig index 6e4d7bb7fea2..ad45d20f9d44 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig @@ -203,3 +203,22 @@ config MLX5_SW_STEERING default y help Build support for software-managed steering in the NIC. + +config MLX5_SF + bool "Mellanox Technologies subfunction device support using auxiliary device" + depends on MLX5_CORE && MLX5_CORE_EN + default n + help + Build support for subfuction device in the NIC. A Mellanox subfunction + device can support RDMA, netdevice and vdpa device. + It is similar to a SRIOV VF but it doesn't require SRIOV support. + +config MLX5_SF_MANAGER + bool + depends on MLX5_SF && MLX5_ESWITCH + default y + help + Build support for subfuction port in the NIC. A Mellanox subfunction + port is managed through devlink. A subfunction supports RDMA, netdevice + and vdpa device. It is similar to a SRIOV VF but it doesn't require + SRIOV support. diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index d44f5f6ee449..f61ee7110243 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -88,3 +88,12 @@ mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o steering/dr_ste_v0.o \ steering/dr_cmd.o steering/dr_fw.o \ steering/dr_action.o steering/fs_dr.o +# +# SF device +# +mlx5_core-$(CONFIG_MLX5_SF) += sf/vhca_event.o sf/dev/dev.o sf/dev/driver.o + +# +# SF manager +# +mlx5_core-$(CONFIG_MLX5_SF_MANAGER) += sf/cmd.o sf/hw_table.o sf/devlink.o diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c index 50c7b9ee80c3..e8cecd50558d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -333,6 +333,7 @@ static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op, case MLX5_CMD_OP_DEALLOC_MEMIC: case MLX5_CMD_OP_PAGE_FAULT_RESUME: case MLX5_CMD_OP_QUERY_ESW_FUNCTIONS: + case MLX5_CMD_OP_DEALLOC_SF: return MLX5_CMD_STAT_OK; case MLX5_CMD_OP_QUERY_HCA_CAP: @@ -464,6 +465,9 @@ static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op, case MLX5_CMD_OP_ALLOC_MEMIC: case MLX5_CMD_OP_MODIFY_XRQ: case MLX5_CMD_OP_RELEASE_XRQ_ERROR: + case MLX5_CMD_OP_QUERY_VHCA_STATE: + case MLX5_CMD_OP_MODIFY_VHCA_STATE: + case MLX5_CMD_OP_ALLOC_SF: *status = MLX5_DRIVER_STATUS_ABORTED; *synd = MLX5_DRIVER_SYND; return -EIO; @@ -657,6 +661,10 @@ const char *mlx5_command_str(int command) MLX5_COMMAND_STR_CASE(DESTROY_UMEM); MLX5_COMMAND_STR_CASE(RELEASE_XRQ_ERROR); MLX5_COMMAND_STR_CASE(MODIFY_XRQ); + MLX5_COMMAND_STR_CASE(QUERY_VHCA_STATE); + MLX5_COMMAND_STR_CASE(MODIFY_VHCA_STATE); + MLX5_COMMAND_STR_CASE(ALLOC_SF); + MLX5_COMMAND_STR_CASE(DEALLOC_SF); default: return "unknown command opcode"; } } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c index b23b54814356..aa76a6e0dae8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c @@ -7,6 +7,8 @@ #include "fw_reset.h" #include "fs_core.h" #include "eswitch.h" +#include "sf/dev/dev.h" +#include "sf/sf.h" static int mlx5_devlink_flash_update(struct devlink *devlink, struct devlink_flash_update_params *params, @@ -127,6 +129,17 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change, struct netlink_ext_ack *extack) { struct mlx5_core_dev *dev = devlink_priv(devlink); + bool sf_dev_allocated; + + sf_dev_allocated = mlx5_sf_dev_allocated(dev); + if (sf_dev_allocated) { + /* Reload results in deleting SF device which further results in + * unregistering devlink instance while holding devlink_mutext. + * Hence, do not support reload. + */ + NL_SET_ERR_MSG_MOD(extack, "reload is unsupported when SFs are allocated\n"); + return -EOPNOTSUPP; + } switch (action) { case DEVLINK_RELOAD_ACTION_DRIVER_REINIT: @@ -264,6 +277,12 @@ static const struct devlink_ops mlx5_devlink_ops = { .port_function_hw_addr_get = mlx5_devlink_port_function_hw_addr_get, .port_function_hw_addr_set = mlx5_devlink_port_function_hw_addr_set, #endif +#ifdef CONFIG_MLX5_SF_MANAGER + .port_new = mlx5_devlink_sf_port_new, + .port_del = mlx5_devlink_sf_port_del, + .port_fn_state_get = mlx5_devlink_sf_port_fn_state_get, + .port_fn_state_set = mlx5_devlink_sf_port_fn_state_set, +#endif .flash_update = mlx5_devlink_flash_update, .info_get = mlx5_devlink_info_get, .reload_actions = BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) | diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c index fc0afa03d407..174dfbc996c6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c @@ -467,7 +467,7 @@ int mlx5_eq_table_init(struct mlx5_core_dev *dev) for (i = 0; i < MLX5_EVENT_TYPE_MAX; i++) ATOMIC_INIT_NOTIFIER_HEAD(&eq_table->nh[i]); - eq_table->irq_table = dev->priv.irq_table; + eq_table->irq_table = mlx5_irq_table_get(dev); return 0; } @@ -595,6 +595,9 @@ static void gather_async_events_mask(struct mlx5_core_dev *dev, u64 mask[4]) async_event_mask |= (1ull << MLX5_EVENT_TYPE_ESW_FUNCTIONS_CHANGED); + if (MLX5_CAP_GEN_MAX(dev, vhca_state)) + async_event_mask |= (1ull << MLX5_EVENT_TYPE_VHCA_STATE_CHANGE); + mask[0] = async_event_mask; if (MLX5_CAP_GEN(dev, event_cap)) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c index 4c74e2690d57..26b37a0f8762 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c @@ -150,7 +150,7 @@ static void esw_acl_egress_ofld_groups_destroy(struct mlx5_vport *vport) static bool esw_acl_egress_needed(const struct mlx5_eswitch *esw, u16 vport_num) { - return mlx5_eswitch_is_vf_vport(esw, vport_num); + return mlx5_eswitch_is_vf_vport(esw, vport_num) || mlx5_esw_is_sf_vport(esw, vport_num); } int esw_acl_egress_ofld_setup(struct mlx5_eswitch *esw, struct mlx5_vport *vport) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c index ffff11baa3d0..cb1e181f4c6a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c @@ -122,3 +122,44 @@ struct devlink_port *mlx5_esw_offloads_devlink_port(struct mlx5_eswitch *esw, u1 vport = mlx5_eswitch_get_vport(esw, vport_num); return vport->dl_port; } + +int mlx5_esw_devlink_sf_port_register(struct mlx5_eswitch *esw, struct devlink_port *dl_port, + u16 vport_num, u32 sfnum) +{ + struct mlx5_core_dev *dev = esw->dev; + struct netdev_phys_item_id ppid = {}; + unsigned int dl_port_index; + struct mlx5_vport *vport; + struct devlink *devlink; + u16 pfnum; + int err; + + vport = mlx5_eswitch_get_vport(esw, vport_num); + if (IS_ERR(vport)) + return PTR_ERR(vport); + + pfnum = PCI_FUNC(dev->pdev->devfn); + mlx5_esw_get_port_parent_id(dev, &ppid); + memcpy(dl_port->attrs.switch_id.id, &ppid.id[0], ppid.id_len); + dl_port->attrs.switch_id.id_len = ppid.id_len; + devlink_port_attrs_pci_sf_set(dl_port, 0, pfnum, sfnum); + devlink = priv_to_devlink(dev); + dl_port_index = mlx5_esw_vport_to_devlink_port_index(dev, vport_num); + err = devlink_port_register(devlink, dl_port, dl_port_index); + if (err) + return err; + + vport->dl_port = dl_port; + return 0; +} + +void mlx5_esw_devlink_sf_port_unregister(struct mlx5_eswitch *esw, u16 vport_num) +{ + struct mlx5_vport *vport; + + vport = mlx5_eswitch_get_vport(esw, vport_num); + if (IS_ERR(vport)) + return; + devlink_port_unregister(vport->dl_port); + vport->dl_port = NULL; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index 876e6449edb3..820305b1664e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -1272,8 +1272,8 @@ static void esw_vport_cleanup(struct mlx5_eswitch *esw, struct mlx5_vport *vport esw_vport_cleanup_acl(esw, vport); } -static int esw_enable_vport(struct mlx5_eswitch *esw, u16 vport_num, - enum mlx5_eswitch_vport_event enabled_events) +int mlx5_esw_vport_enable(struct mlx5_eswitch *esw, u16 vport_num, + enum mlx5_eswitch_vport_event enabled_events) { struct mlx5_vport *vport; int ret; @@ -1309,7 +1309,7 @@ done: return ret; } -static void esw_disable_vport(struct mlx5_eswitch *esw, u16 vport_num) +void mlx5_esw_vport_disable(struct mlx5_eswitch *esw, u16 vport_num) { struct mlx5_vport *vport; @@ -1365,9 +1365,15 @@ const u32 *mlx5_esw_query_functions(struct mlx5_core_dev *dev) { int outlen = MLX5_ST_SZ_BYTES(query_esw_functions_out); u32 in[MLX5_ST_SZ_DW(query_esw_functions_in)] = {}; + u16 max_sf_vports; u32 *out; int err; + max_sf_vports = mlx5_sf_max_functions(dev); + /* Device interface is array of 64-bits */ + if (max_sf_vports) + outlen += DIV_ROUND_UP(max_sf_vports, BITS_PER_TYPE(__be64)) * sizeof(__be64); + out = kvzalloc(outlen, GFP_KERNEL); if (!out) return ERR_PTR(-ENOMEM); @@ -1375,7 +1381,7 @@ const u32 *mlx5_esw_query_functions(struct mlx5_core_dev *dev) MLX5_SET(query_esw_functions_in, in, opcode, MLX5_CMD_OP_QUERY_ESW_FUNCTIONS); - err = mlx5_cmd_exec_inout(dev, query_esw_functions, in, out); + err = mlx5_cmd_exec(dev, in, sizeof(in), out, outlen); if (!err) return out; @@ -1425,7 +1431,7 @@ int mlx5_eswitch_load_vport(struct mlx5_eswitch *esw, u16 vport_num, { int err; - err = esw_enable_vport(esw, vport_num, enabled_events); + err = mlx5_esw_vport_enable(esw, vport_num, enabled_events); if (err) return err; @@ -1436,14 +1442,14 @@ int mlx5_eswitch_load_vport(struct mlx5_eswitch *esw, u16 vport_num, return err; err_rep: - esw_disable_vport(esw, vport_num); + mlx5_esw_vport_disable(esw, vport_num); return err; } void mlx5_eswitch_unload_vport(struct mlx5_eswitch *esw, u16 vport_num) { esw_offloads_unload_rep(esw, vport_num); - esw_disable_vport(esw, vport_num); + mlx5_esw_vport_disable(esw, vport_num); } void mlx5_eswitch_unload_vf_vports(struct mlx5_eswitch *esw, u16 num_vfs) @@ -1593,6 +1599,15 @@ mlx5_eswitch_update_num_of_vfs(struct mlx5_eswitch *esw, int num_vfs) kvfree(out); } +static void mlx5_esw_mode_change_notify(struct mlx5_eswitch *esw, u16 mode) +{ + struct mlx5_esw_event_info info = {}; + + info.new_mode = mode; + + blocking_notifier_call_chain(&esw->n_head, 0, &info); +} + /** * mlx5_eswitch_enable_locked - Enable eswitch * @esw: Pointer to eswitch @@ -1653,6 +1668,8 @@ int mlx5_eswitch_enable_locked(struct mlx5_eswitch *esw, int mode, int num_vfs) mode == MLX5_ESWITCH_LEGACY ? "LEGACY" : "OFFLOADS", esw->esw_funcs.num_vfs, esw->enabled_vports); + mlx5_esw_mode_change_notify(esw, mode); + return 0; abort: @@ -1709,6 +1726,11 @@ void mlx5_eswitch_disable_locked(struct mlx5_eswitch *esw, bool clear_vf) esw->mode == MLX5_ESWITCH_LEGACY ? "LEGACY" : "OFFLOADS", esw->esw_funcs.num_vfs, esw->enabled_vports); + /* Notify eswitch users that it is exiting from current mode. + * So that it can do necessary cleanup before the eswitch is disabled. + */ + mlx5_esw_mode_change_notify(esw, MLX5_ESWITCH_NONE); + mlx5_eswitch_event_handlers_unregister(esw); if (esw->mode == MLX5_ESWITCH_LEGACY) @@ -1809,6 +1831,7 @@ int mlx5_eswitch_init(struct mlx5_core_dev *dev) esw->offloads.inline_mode = MLX5_INLINE_MODE_NONE; dev->priv.eswitch = esw; + BLOCKING_INIT_NOTIFIER_HEAD(&esw->n_head); return 0; abort: if (esw->work_queue) @@ -1898,7 +1921,8 @@ static bool is_port_function_supported(const struct mlx5_eswitch *esw, u16 vport_num) { return vport_num == MLX5_VPORT_PF || - mlx5_eswitch_is_vf_vport(esw, vport_num); + mlx5_eswitch_is_vf_vport(esw, vport_num) || + mlx5_esw_is_sf_vport(esw, vport_num); } int mlx5_devlink_port_function_hw_addr_get(struct devlink *devlink, @@ -2499,4 +2523,12 @@ bool mlx5_esw_multipath_prereq(struct mlx5_core_dev *dev0, dev1->priv.eswitch->mode == MLX5_ESWITCH_OFFLOADS); } +int mlx5_esw_event_notifier_register(struct mlx5_eswitch *esw, struct notifier_block *nb) +{ + return blocking_notifier_chain_register(&esw->n_head, nb); +} +void mlx5_esw_event_notifier_unregister(struct mlx5_eswitch *esw, struct notifier_block *nb) +{ + blocking_notifier_chain_unregister(&esw->n_head, nb); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index cf87de94418f..479d2ac2cd85 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -43,6 +43,7 @@ #include <linux/mlx5/fs.h> #include "lib/mpfs.h" #include "lib/fs_chains.h" +#include "sf/sf.h" #include "en/tc_ct.h" #ifdef CONFIG_MLX5_ESWITCH @@ -277,6 +278,7 @@ struct mlx5_eswitch { struct { u32 large_group_num; } params; + struct blocking_notifier_head n_head; }; void esw_offloads_disable(struct mlx5_eswitch *esw); @@ -499,6 +501,40 @@ static inline u16 mlx5_eswitch_first_host_vport_num(struct mlx5_core_dev *dev) MLX5_VPORT_PF : MLX5_VPORT_FIRST_VF; } +static inline int mlx5_esw_sf_start_idx(const struct mlx5_eswitch *esw) +{ + /* PF and VF vports indices start from 0 to max_vfs */ + return MLX5_VPORT_PF_PLACEHOLDER + mlx5_core_max_vfs(esw->dev); +} + +static inline int mlx5_esw_sf_end_idx(const struct mlx5_eswitch *esw) +{ + return mlx5_esw_sf_start_idx(esw) + mlx5_sf_max_functions(esw->dev); +} + +static inline int +mlx5_esw_sf_vport_num_to_index(const struct mlx5_eswitch *esw, u16 vport_num) +{ + return vport_num - mlx5_sf_start_function_id(esw->dev) + + MLX5_VPORT_PF_PLACEHOLDER + mlx5_core_max_vfs(esw->dev); +} + +static inline u16 +mlx5_esw_sf_vport_index_to_num(const struct mlx5_eswitch *esw, int idx) +{ + return mlx5_sf_start_function_id(esw->dev) + idx - + (MLX5_VPORT_PF_PLACEHOLDER + mlx5_core_max_vfs(esw->dev)); +} + +static inline bool +mlx5_esw_is_sf_vport(const struct mlx5_eswitch *esw, u16 vport_num) +{ + return mlx5_sf_supported(esw->dev) && + vport_num >= mlx5_sf_start_function_id(esw->dev) && + (vport_num < (mlx5_sf_start_function_id(esw->dev) + + mlx5_sf_max_functions(esw->dev))); +} + static inline bool mlx5_eswitch_is_funcs_handler(const struct mlx5_core_dev *dev) { return mlx5_core_is_ecpf_esw_manager(dev); @@ -527,6 +563,10 @@ static inline int mlx5_eswitch_vport_num_to_index(struct mlx5_eswitch *esw, if (vport_num == MLX5_VPORT_UPLINK) return mlx5_eswitch_uplink_idx(esw); + if (mlx5_esw_is_sf_vport(esw, vport_num)) + return mlx5_esw_sf_vport_num_to_index(esw, vport_num); + + /* PF and VF vports start from 0 to max_vfs */ return vport_num; } @@ -540,6 +580,12 @@ static inline u16 mlx5_eswitch_index_to_vport_num(struct mlx5_eswitch *esw, if (index == mlx5_eswitch_uplink_idx(esw)) return MLX5_VPORT_UPLINK; + /* SF vports indices are after VFs and before ECPF */ + if (mlx5_sf_supported(esw->dev) && + index > mlx5_core_max_vfs(esw->dev)) + return mlx5_esw_sf_vport_index_to_num(esw, index); + + /* PF and VF vports start from 0 to max_vfs */ return index; } @@ -625,6 +671,11 @@ void mlx5e_tc_clean_fdb_peer_flows(struct mlx5_eswitch *esw); for ((vport) = (nvfs); \ (vport) >= (esw)->first_host_vport; (vport)--) +#define mlx5_esw_for_each_sf_rep(esw, i, rep) \ + for ((i) = mlx5_esw_sf_start_idx(esw); \ + (rep) = &(esw)->offloads.vport_reps[(i)], \ + (i) < mlx5_esw_sf_end_idx(esw); (i++)) + struct mlx5_eswitch *mlx5_devlink_eswitch_get(struct devlink *devlink); struct mlx5_vport *__must_check mlx5_eswitch_get_vport(struct mlx5_eswitch *esw, u16 vport_num); @@ -638,6 +689,10 @@ mlx5_eswitch_enable_pf_vf_vports(struct mlx5_eswitch *esw, enum mlx5_eswitch_vport_event enabled_events); void mlx5_eswitch_disable_pf_vf_vports(struct mlx5_eswitch *esw); +int mlx5_esw_vport_enable(struct mlx5_eswitch *esw, u16 vport_num, + enum mlx5_eswitch_vport_event enabled_events); +void mlx5_esw_vport_disable(struct mlx5_eswitch *esw, u16 vport_num); + int esw_vport_create_offloads_acl_tables(struct mlx5_eswitch *esw, struct mlx5_vport *vport); @@ -656,6 +711,9 @@ esw_get_max_restore_tag(struct mlx5_eswitch *esw); int esw_offloads_load_rep(struct mlx5_eswitch *esw, u16 vport_num); void esw_offloads_unload_rep(struct mlx5_eswitch *esw, u16 vport_num); +int mlx5_esw_offloads_rep_load(struct mlx5_eswitch *esw, u16 vport_num); +void mlx5_esw_offloads_rep_unload(struct mlx5_eswitch *esw, u16 vport_num); + int mlx5_eswitch_load_vport(struct mlx5_eswitch *esw, u16 vport_num, enum mlx5_eswitch_vport_event enabled_events); void mlx5_eswitch_unload_vport(struct mlx5_eswitch *esw, u16 vport_num); @@ -667,6 +725,26 @@ void mlx5_eswitch_unload_vf_vports(struct mlx5_eswitch *esw, u16 num_vfs); int mlx5_esw_offloads_devlink_port_register(struct mlx5_eswitch *esw, u16 vport_num); void mlx5_esw_offloads_devlink_port_unregister(struct mlx5_eswitch *esw, u16 vport_num); struct devlink_port *mlx5_esw_offloads_devlink_port(struct mlx5_eswitch *esw, u16 vport_num); + +int mlx5_esw_devlink_sf_port_register(struct mlx5_eswitch *esw, struct devlink_port *dl_port, + u16 vport_num, u32 sfnum); +void mlx5_esw_devlink_sf_port_unregister(struct mlx5_eswitch *esw, u16 vport_num); + +int mlx5_esw_offloads_sf_vport_enable(struct mlx5_eswitch *esw, struct devlink_port *dl_port, + u16 vport_num, u32 sfnum); +void mlx5_esw_offloads_sf_vport_disable(struct mlx5_eswitch *esw, u16 vport_num); + +/** + * mlx5_esw_event_info - Indicates eswitch mode changed/changing. + * + * @new_mode: New mode of eswitch. + */ +struct mlx5_esw_event_info { + u16 new_mode; +}; + +int mlx5_esw_event_notifier_register(struct mlx5_eswitch *esw, struct notifier_block *n); +void mlx5_esw_event_notifier_unregister(struct mlx5_eswitch *esw, struct notifier_block *n); #else /* CONFIG_MLX5_ESWITCH */ /* eswitch API stubs */ static inline int mlx5_eswitch_init(struct mlx5_core_dev *dev) { return 0; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 2f6a0ae20650..7f09f2bbf7c1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -1800,11 +1800,22 @@ static void __esw_offloads_unload_rep(struct mlx5_eswitch *esw, esw->offloads.rep_ops[rep_type]->unload(rep); } +static void __unload_reps_sf_vport(struct mlx5_eswitch *esw, u8 rep_type) +{ + struct mlx5_eswitch_rep *rep; + int i; + + mlx5_esw_for_each_sf_rep(esw, i, rep) + __esw_offloads_unload_rep(esw, rep, rep_type); +} + static void __unload_reps_all_vport(struct mlx5_eswitch *esw, u8 rep_type) { struct mlx5_eswitch_rep *rep; int i; + __unload_reps_sf_vport(esw, rep_type); + mlx5_esw_for_each_vf_rep_reverse(esw, i, rep, esw->esw_funcs.num_vfs) __esw_offloads_unload_rep(esw, rep, rep_type); @@ -1822,7 +1833,7 @@ static void __unload_reps_all_vport(struct mlx5_eswitch *esw, u8 rep_type) __esw_offloads_unload_rep(esw, rep, rep_type); } -static int mlx5_esw_offloads_rep_load(struct mlx5_eswitch *esw, u16 vport_num) +int mlx5_esw_offloads_rep_load(struct mlx5_eswitch *esw, u16 vport_num) { struct mlx5_eswitch_rep *rep; int rep_type; @@ -1846,7 +1857,7 @@ err_reps: return err; } -static void mlx5_esw_offloads_rep_unload(struct mlx5_eswitch *esw, u16 vport_num) +void mlx5_esw_offloads_rep_unload(struct mlx5_eswitch *esw, u16 vport_num) { struct mlx5_eswitch_rep *rep; int rep_type; @@ -2824,3 +2835,35 @@ u32 mlx5_eswitch_get_vport_metadata_for_match(struct mlx5_eswitch *esw, return vport->metadata << (32 - ESW_SOURCE_PORT_METADATA_BITS); } EXPORT_SYMBOL(mlx5_eswitch_get_vport_metadata_for_match); + +int mlx5_esw_offloads_sf_vport_enable(struct mlx5_eswitch *esw, struct devlink_port *dl_port, + u16 vport_num, u32 sfnum) +{ + int err; + + err = mlx5_esw_vport_enable(esw, vport_num, MLX5_VPORT_UC_ADDR_CHANGE); + if (err) + return err; + + err = mlx5_esw_devlink_sf_port_register(esw, dl_port, vport_num, sfnum); + if (err) + goto devlink_err; + + err = mlx5_esw_offloads_rep_load(esw, vport_num); + if (err) + goto rep_err; + return 0; + +rep_err: + mlx5_esw_devlink_sf_port_unregister(esw, vport_num); +devlink_err: + mlx5_esw_vport_disable(esw, vport_num); + return err; +} + +void mlx5_esw_offloads_sf_vport_disable(struct mlx5_eswitch *esw, u16 vport_num) +{ + mlx5_esw_offloads_rep_unload(esw, vport_num); + mlx5_esw_devlink_sf_port_unregister(esw, vport_num); + mlx5_esw_vport_disable(esw, vport_num); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/events.c b/drivers/net/ethernet/mellanox/mlx5/core/events.c index 670f25f5ffd1..d713ae24d6b6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/events.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/events.c @@ -112,6 +112,8 @@ static const char *eqe_type_str(u8 type) return "MLX5_EVENT_TYPE_CMD"; case MLX5_EVENT_TYPE_ESW_FUNCTIONS_CHANGED: return "MLX5_EVENT_TYPE_ESW_FUNCTIONS_CHANGED"; + case MLX5_EVENT_TYPE_VHCA_STATE_CHANGE: + return "MLX5_EVENT_TYPE_VHCA_STATE_CHANGE"; case MLX5_EVENT_TYPE_PAGE_REQUEST: return "MLX5_EVENT_TYPE_PAGE_REQUEST"; case MLX5_EVENT_TYPE_PAGE_FAULT: @@ -434,3 +436,8 @@ int mlx5_blocking_notifier_call_chain(struct mlx5_core_dev *dev, unsigned int ev return blocking_notifier_call_chain(&events->sw_nh, event, data); } + +void mlx5_events_work_enqueue(struct mlx5_core_dev *dev, struct work_struct *work) +{ + queue_work(dev->priv.events->wq, work); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index bfedf064db1a..e4c9627485aa 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -73,6 +73,9 @@ #include "ecpf.h" #include "lib/hv_vhca.h" #include "diag/rsc_dump.h" +#include "sf/vhca_event.h" +#include "sf/dev/dev.h" +#include "sf/sf.h" MODULE_AUTHOR("Eli Cohen <eli@mellanox.com>"); MODULE_DESCRIPTION("Mellanox 5th generation network adapters (ConnectX series) core driver"); @@ -82,7 +85,6 @@ unsigned int mlx5_core_debug_mask; module_param_named(debug_mask, mlx5_core_debug_mask, uint, 0644); MODULE_PARM_DESC(debug_mask, "debug mask: 1 = dump cmd data, 2 = dump cmd exec time, 3 = both. Default=0"); -#define MLX5_DEFAULT_PROF 2 static unsigned int prof_sel = MLX5_DEFAULT_PROF; module_param_named(prof_sel, prof_sel, uint, 0444); MODULE_PARM_DESC(prof_sel, "profile selector. Valid range 0 - 2"); @@ -567,6 +569,8 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx) if (MLX5_CAP_GEN_MAX(dev, mkey_by_name)) MLX5_SET(cmd_hca_cap, set_hca_cap, mkey_by_name, 1); + mlx5_vhca_state_cap_handle(dev, set_hca_cap); + return set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE); } @@ -884,6 +888,24 @@ static int mlx5_init_once(struct mlx5_core_dev *dev) goto err_eswitch_cleanup; } + err = mlx5_vhca_event_init(dev); + if (err) { + mlx5_core_err(dev, "Failed to init vhca event notifier %d\n", err); + goto err_fpga_cleanup; + } + + err = mlx5_sf_hw_table_init(dev); + if (err) { + mlx5_core_err(dev, "Failed to init SF HW table %d\n", err); + goto err_sf_hw_table_cleanup; + } + + err = mlx5_sf_table_init(dev); + if (err) { + mlx5_core_err(dev, "Failed to init SF table %d\n", err); + goto err_sf_table_cleanup; + } + dev->dm = mlx5_dm_create(dev); if (IS_ERR(dev->dm)) mlx5_core_warn(dev, "Failed to init device memory%d\n", err); @@ -894,6 +916,12 @@ static int mlx5_init_once(struct mlx5_core_dev *dev) return 0; +err_sf_table_cleanup: + mlx5_sf_hw_table_cleanup(dev); +err_sf_hw_table_cleanup: + mlx5_vhca_event_cleanup(dev); +err_fpga_cleanup: + mlx5_fpga_cleanup(dev); err_eswitch_cleanup: mlx5_eswitch_cleanup(dev->priv.eswitch); err_sriov_cleanup: @@ -925,6 +953,9 @@ static void mlx5_cleanup_once(struct mlx5_core_dev *dev) mlx5_hv_vhca_destroy(dev->hv_vhca); mlx5_fw_tracer_destroy(dev->tracer); mlx5_dm_cleanup(dev); + mlx5_sf_table_cleanup(dev); + mlx5_sf_hw_table_cleanup(dev); + mlx5_vhca_event_cleanup(dev); mlx5_fpga_cleanup(dev); mlx5_eswitch_cleanup(dev->priv.eswitch); mlx5_sriov_cleanup(dev); @@ -1129,6 +1160,14 @@ static int mlx5_load(struct mlx5_core_dev *dev) goto err_sriov; } + mlx5_vhca_event_start(dev); + + err = mlx5_sf_hw_table_create(dev); + if (err) { + mlx5_core_err(dev, "sf table create failed %d\n", err); + goto err_vhca; + } + err = mlx5_ec_init(dev); if (err) { mlx5_core_err(dev, "Failed to init embedded CPU\n"); @@ -1141,11 +1180,16 @@ static int mlx5_load(struct mlx5_core_dev *dev) goto err_sriov; } + mlx5_sf_dev_table_create(dev); + return 0; err_sriov: mlx5_ec_cleanup(dev); err_ec: + mlx5_sf_hw_table_destroy(dev); +err_vhca: + mlx5_vhca_event_stop(dev); mlx5_cleanup_fs(dev); err_fs: mlx5_accel_tls_cleanup(dev); @@ -1171,8 +1215,11 @@ err_irq_table: static void mlx5_unload(struct mlx5_core_dev *dev) { + mlx5_sf_dev_table_destroy(dev); mlx5_sriov_detach(dev); mlx5_ec_cleanup(dev); + mlx5_sf_hw_table_destroy(dev); + mlx5_vhca_event_stop(dev); mlx5_cleanup_fs(dev); mlx5_accel_ipsec_cleanup(dev); mlx5_accel_tls_cleanup(dev); @@ -1283,7 +1330,7 @@ out: mutex_unlock(&dev->intf_state_mutex); } -static int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx) +int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx) { struct mlx5_priv *priv = &dev->priv; int err; @@ -1335,7 +1382,7 @@ err_health_init: return err; } -static void mlx5_mdev_uninit(struct mlx5_core_dev *dev) +void mlx5_mdev_uninit(struct mlx5_core_dev *dev) { struct mlx5_priv *priv = &dev->priv; @@ -1678,6 +1725,10 @@ static int __init init(void) if (err) goto err_debug; + err = mlx5_sf_driver_register(); + if (err) + goto err_sf; + #ifdef CONFIG_MLX5_CORE_EN err = mlx5e_init(); if (err) { @@ -1688,6 +1739,8 @@ static int __init init(void) return 0; +err_sf: + pci_unregister_driver(&mlx5_core_driver); err_debug: mlx5_unregister_debugfs(); return err; @@ -1698,6 +1751,7 @@ static void __exit cleanup(void) #ifdef CONFIG_MLX5_CORE_EN mlx5e_cleanup(); #endif + mlx5_sf_driver_unregister(); pci_unregister_driver(&mlx5_core_driver); mlx5_unregister_debugfs(); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h index 0a0302ce7144..3754ef98554f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h @@ -117,6 +117,8 @@ enum mlx5_semaphore_space_address { MLX5_SEMAPHORE_SW_RESET = 0x20, }; +#define MLX5_DEFAULT_PROF 2 + int mlx5_query_hca_caps(struct mlx5_core_dev *dev); int mlx5_query_board_id(struct mlx5_core_dev *dev); int mlx5_cmd_init(struct mlx5_core_dev *dev); @@ -176,6 +178,7 @@ struct cpumask * mlx5_irq_get_affinity_mask(struct mlx5_irq_table *irq_table, int vecidx); struct cpu_rmap *mlx5_irq_get_rmap(struct mlx5_irq_table *table); int mlx5_irq_get_num_comp(struct mlx5_irq_table *table); +struct mlx5_irq_table *mlx5_irq_table_get(struct mlx5_core_dev *dev); int mlx5_events_init(struct mlx5_core_dev *dev); void mlx5_events_cleanup(struct mlx5_core_dev *dev); @@ -257,6 +260,15 @@ enum { u8 mlx5_get_nic_state(struct mlx5_core_dev *dev); void mlx5_set_nic_state(struct mlx5_core_dev *dev, u8 state); +static inline bool mlx5_core_is_sf(const struct mlx5_core_dev *dev) +{ + return dev->coredev_type == MLX5_COREDEV_SF; +} + +int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx); +void mlx5_mdev_uninit(struct mlx5_core_dev *dev); void mlx5_unload_one(struct mlx5_core_dev *dev, bool cleanup); int mlx5_load_one(struct mlx5_core_dev *dev, bool boot); + +void mlx5_events_work_enqueue(struct mlx5_core_dev *dev, struct work_struct *work); #endif /* __MLX5_CORE_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c index 6fd974920394..a61e09aff152 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c @@ -30,6 +30,9 @@ int mlx5_irq_table_init(struct mlx5_core_dev *dev) { struct mlx5_irq_table *irq_table; + if (mlx5_core_is_sf(dev)) + return 0; + irq_table = kvzalloc(sizeof(*irq_table), GFP_KERNEL); if (!irq_table) return -ENOMEM; @@ -40,6 +43,9 @@ int mlx5_irq_table_init(struct mlx5_core_dev *dev) void mlx5_irq_table_cleanup(struct mlx5_core_dev *dev) { + if (mlx5_core_is_sf(dev)) + return; + kvfree(dev->priv.irq_table); } @@ -268,6 +274,9 @@ int mlx5_irq_table_create(struct mlx5_core_dev *dev) int nvec; int err; + if (mlx5_core_is_sf(dev)) + return 0; + nvec = MLX5_CAP_GEN(dev, num_ports) * num_online_cpus() + MLX5_IRQ_VEC_COMP_BASE; nvec = min_t(int, nvec, num_eqs); @@ -319,6 +328,9 @@ void mlx5_irq_table_destroy(struct mlx5_core_dev *dev) struct mlx5_irq_table *table = dev->priv.irq_table; int i; + if (mlx5_core_is_sf(dev)) + return; + /* free_irq requires that affinity and rmap will be cleared * before calling it. This is why there is asymmetry with set_rmap * which should be called after alloc_irq but before request_irq. @@ -332,3 +344,11 @@ void mlx5_irq_table_destroy(struct mlx5_core_dev *dev) kfree(table->irq); } +struct mlx5_irq_table *mlx5_irq_table_get(struct mlx5_core_dev *dev) +{ +#ifdef CONFIG_MLX5_SF + if (mlx5_core_is_sf(dev)) + return dev->priv.parent_mdev->priv.irq_table; +#endif + return dev->priv.irq_table; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c new file mode 100644 index 000000000000..a8d75c2f0275 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#include <linux/mlx5/driver.h> +#include "priv.h" + +int mlx5_cmd_alloc_sf(struct mlx5_core_dev *dev, u16 function_id) +{ + u32 out[MLX5_ST_SZ_DW(alloc_sf_out)] = {}; + u32 in[MLX5_ST_SZ_DW(alloc_sf_in)] = {}; + + MLX5_SET(alloc_sf_in, in, opcode, MLX5_CMD_OP_ALLOC_SF); + MLX5_SET(alloc_sf_in, in, function_id, function_id); + + return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); +} + +int mlx5_cmd_dealloc_sf(struct mlx5_core_dev *dev, u16 function_id) +{ + u32 out[MLX5_ST_SZ_DW(dealloc_sf_out)] = {}; + u32 in[MLX5_ST_SZ_DW(dealloc_sf_in)] = {}; + + MLX5_SET(dealloc_sf_in, in, opcode, MLX5_CMD_OP_DEALLOC_SF); + MLX5_SET(dealloc_sf_in, in, function_id, function_id); + + return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); +} + +int mlx5_cmd_sf_enable_hca(struct mlx5_core_dev *dev, u16 func_id) +{ + u32 out[MLX5_ST_SZ_DW(enable_hca_out)] = {}; + u32 in[MLX5_ST_SZ_DW(enable_hca_in)] = {}; + + MLX5_SET(enable_hca_in, in, opcode, MLX5_CMD_OP_ENABLE_HCA); + MLX5_SET(enable_hca_in, in, function_id, func_id); + MLX5_SET(enable_hca_in, in, embedded_cpu_function, 0); + return mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out)); +} + +int mlx5_cmd_sf_disable_hca(struct mlx5_core_dev *dev, u16 func_id) +{ + u32 out[MLX5_ST_SZ_DW(disable_hca_out)] = {}; + u32 in[MLX5_ST_SZ_DW(disable_hca_in)] = {}; + + MLX5_SET(disable_hca_in, in, opcode, MLX5_CMD_OP_DISABLE_HCA); + MLX5_SET(disable_hca_in, in, function_id, func_id); + MLX5_SET(enable_hca_in, in, embedded_cpu_function, 0); + return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c new file mode 100644 index 000000000000..b265f27b2166 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c @@ -0,0 +1,275 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#include <linux/mlx5/driver.h> +#include <linux/mlx5/device.h> +#include "mlx5_core.h" +#include "dev.h" +#include "sf/vhca_event.h" +#include "sf/sf.h" +#include "sf/mlx5_ifc_vhca_event.h" +#include "ecpf.h" + +struct mlx5_sf_dev_table { + struct xarray devices; + unsigned int max_sfs; + phys_addr_t base_address; + u64 sf_bar_length; + struct notifier_block nb; + struct mlx5_core_dev *dev; +}; + +static bool mlx5_sf_dev_supported(const struct mlx5_core_dev *dev) +{ + return MLX5_CAP_GEN(dev, sf) && mlx5_vhca_event_supported(dev); +} + +bool mlx5_sf_dev_allocated(const struct mlx5_core_dev *dev) +{ + struct mlx5_sf_dev_table *table = dev->priv.sf_dev_table; + + if (!mlx5_sf_dev_supported(dev)) + return false; + + return !xa_empty(&table->devices); +} + +static ssize_t sfnum_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct auxiliary_device *adev = container_of(dev, struct auxiliary_device, dev); + struct mlx5_sf_dev *sf_dev = container_of(adev, struct mlx5_sf_dev, adev); + + return scnprintf(buf, PAGE_SIZE, "%u\n", sf_dev->sfnum); +} +static DEVICE_ATTR_RO(sfnum); + +static struct attribute *sf_device_attrs[] = { + &dev_attr_sfnum.attr, + NULL, +}; + +static const struct attribute_group sf_attr_group = { + .attrs = sf_device_attrs, +}; + +static const struct attribute_group *sf_attr_groups[2] = { + &sf_attr_group, + NULL +}; + +static void mlx5_sf_dev_release(struct device *device) +{ + struct auxiliary_device *adev = container_of(device, struct auxiliary_device, dev); + struct mlx5_sf_dev *sf_dev = container_of(adev, struct mlx5_sf_dev, adev); + + mlx5_adev_idx_free(adev->id); + kfree(sf_dev); +} + +static void mlx5_sf_dev_remove(struct mlx5_sf_dev *sf_dev) +{ + auxiliary_device_delete(&sf_dev->adev); + auxiliary_device_uninit(&sf_dev->adev); +} + +static void mlx5_sf_dev_add(struct mlx5_core_dev *dev, u16 sf_index, u32 sfnum) +{ + struct mlx5_sf_dev_table *table = dev->priv.sf_dev_table; + struct mlx5_sf_dev *sf_dev; + struct pci_dev *pdev; + int err; + int id; + + id = mlx5_adev_idx_alloc(); + if (id < 0) { + err = id; + goto add_err; + } + + sf_dev = kzalloc(sizeof(*sf_dev), GFP_KERNEL); + if (!sf_dev) { + mlx5_adev_idx_free(id); + err = -ENOMEM; + goto add_err; + } + pdev = dev->pdev; + sf_dev->adev.id = id; + sf_dev->adev.name = MLX5_SF_DEV_ID_NAME; + sf_dev->adev.dev.release = mlx5_sf_dev_release; + sf_dev->adev.dev.parent = &pdev->dev; + sf_dev->adev.dev.groups = sf_attr_groups; + sf_dev->sfnum = sfnum; + sf_dev->parent_mdev = dev; + + if (!table->max_sfs) { + mlx5_adev_idx_free(id); + kfree(sf_dev); + err = -EOPNOTSUPP; + goto add_err; + } + sf_dev->bar_base_addr = table->base_address + (sf_index * table->sf_bar_length); + + err = auxiliary_device_init(&sf_dev->adev); + if (err) { + mlx5_adev_idx_free(id); + kfree(sf_dev); + goto add_err; + } + + err = auxiliary_device_add(&sf_dev->adev); + if (err) { + put_device(&sf_dev->adev.dev); + goto add_err; + } + + err = xa_insert(&table->devices, sf_index, sf_dev, GFP_KERNEL); + if (err) + goto xa_err; + return; + +xa_err: + mlx5_sf_dev_remove(sf_dev); +add_err: + mlx5_core_err(dev, "SF DEV: fail device add for index=%d sfnum=%d err=%d\n", + sf_index, sfnum, err); +} + +static void mlx5_sf_dev_del(struct mlx5_core_dev *dev, struct mlx5_sf_dev *sf_dev, u16 sf_index) +{ + struct mlx5_sf_dev_table *table = dev->priv.sf_dev_table; + + xa_erase(&table->devices, sf_index); + mlx5_sf_dev_remove(sf_dev); +} + +static int +mlx5_sf_dev_state_change_handler(struct notifier_block *nb, unsigned long event_code, void *data) +{ + struct mlx5_sf_dev_table *table = container_of(nb, struct mlx5_sf_dev_table, nb); + const struct mlx5_vhca_state_event *event = data; + struct mlx5_sf_dev *sf_dev; + u16 sf_index; + + sf_index = event->function_id - MLX5_CAP_GEN(table->dev, sf_base_id); + sf_dev = xa_load(&table->devices, sf_index); + switch (event->new_vhca_state) { + case MLX5_VHCA_STATE_ALLOCATED: + if (sf_dev) + mlx5_sf_dev_del(table->dev, sf_dev, sf_index); + break; + case MLX5_VHCA_STATE_TEARDOWN_REQUEST: + if (sf_dev) + mlx5_sf_dev_del(table->dev, sf_dev, sf_index); + else + mlx5_core_err(table->dev, + "SF DEV: teardown state for invalid dev index=%d fn_id=0x%x\n", + sf_index, event->sw_function_id); + break; + case MLX5_VHCA_STATE_ACTIVE: + if (!sf_dev) + mlx5_sf_dev_add(table->dev, sf_index, event->sw_function_id); + break; + default: + break; + } + return 0; +} + +static int mlx5_sf_dev_vhca_arm_all(struct mlx5_sf_dev_table *table) +{ + struct mlx5_core_dev *dev = table->dev; + u16 max_functions; + u16 function_id; + int err = 0; + bool ecpu; + int i; + + max_functions = mlx5_sf_max_functions(dev); + function_id = MLX5_CAP_GEN(dev, sf_base_id); + ecpu = mlx5_read_embedded_cpu(dev); + /* Arm the vhca context as the vhca event notifier */ + for (i = 0; i < max_functions; i++) { + err = mlx5_vhca_event_arm(dev, function_id, ecpu); + if (err) + return err; + + function_id++; + } + return 0; +} + +void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_dev_table *table; + unsigned int max_sfs; + int err; + + if (!mlx5_sf_dev_supported(dev) || !mlx5_vhca_event_supported(dev)) + return; + + table = kzalloc(sizeof(*table), GFP_KERNEL); + if (!table) { + err = -ENOMEM; + goto table_err; + } + + table->nb.notifier_call = mlx5_sf_dev_state_change_handler; + table->dev = dev; + if (MLX5_CAP_GEN(dev, max_num_sf)) + max_sfs = MLX5_CAP_GEN(dev, max_num_sf); + else + max_sfs = 1 << MLX5_CAP_GEN(dev, log_max_sf); + table->sf_bar_length = 1 << (MLX5_CAP_GEN(dev, log_min_sf_size) + 12); + table->base_address = pci_resource_start(dev->pdev, 2); + table->max_sfs = max_sfs; + xa_init(&table->devices); + dev->priv.sf_dev_table = table; + + err = mlx5_vhca_event_notifier_register(dev, &table->nb); + if (err) + goto vhca_err; + err = mlx5_sf_dev_vhca_arm_all(table); + if (err) + goto arm_err; + mlx5_core_dbg(dev, "SF DEV: max sf devices=%d\n", max_sfs); + return; + +arm_err: + mlx5_vhca_event_notifier_unregister(dev, &table->nb); +vhca_err: + table->max_sfs = 0; + kfree(table); + dev->priv.sf_dev_table = NULL; +table_err: + mlx5_core_err(dev, "SF DEV table create err = %d\n", err); +} + +static void mlx5_sf_dev_destroy_all(struct mlx5_sf_dev_table *table) +{ + struct mlx5_sf_dev *sf_dev; + unsigned long index; + + xa_for_each(&table->devices, index, sf_dev) { + xa_erase(&table->devices, index); + mlx5_sf_dev_remove(sf_dev); + } +} + +void mlx5_sf_dev_table_destroy(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_dev_table *table = dev->priv.sf_dev_table; + + if (!table) + return; + + mlx5_vhca_event_notifier_unregister(dev, &table->nb); + + /* Now that event handler is not running, it is safe to destroy + * the sf device without race. + */ + mlx5_sf_dev_destroy_all(table); + + WARN_ON(!xa_empty(&table->devices)); + kfree(table); + dev->priv.sf_dev_table = NULL; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h new file mode 100644 index 000000000000..4de02902aef1 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#ifndef __MLX5_SF_DEV_H__ +#define __MLX5_SF_DEV_H__ + +#ifdef CONFIG_MLX5_SF + +#include <linux/auxiliary_bus.h> + +#define MLX5_SF_DEV_ID_NAME "sf" + +struct mlx5_sf_dev { + struct auxiliary_device adev; + struct mlx5_core_dev *parent_mdev; + struct mlx5_core_dev *mdev; + phys_addr_t bar_base_addr; + u32 sfnum; +}; + +void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev); +void mlx5_sf_dev_table_destroy(struct mlx5_core_dev *dev); + +int mlx5_sf_driver_register(void); +void mlx5_sf_driver_unregister(void); + +bool mlx5_sf_dev_allocated(const struct mlx5_core_dev *dev); + +#else + +static inline void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev) +{ +} + +static inline void mlx5_sf_dev_table_destroy(struct mlx5_core_dev *dev) +{ +} + +static inline int mlx5_sf_driver_register(void) +{ + return 0; +} + +static inline void mlx5_sf_driver_unregister(void) +{ +} + +static inline bool mlx5_sf_dev_allocated(const struct mlx5_core_dev *dev) +{ + return 0; +} + +#endif + +#endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c new file mode 100644 index 000000000000..daf63a8115e0 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c @@ -0,0 +1,101 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#include <linux/mlx5/driver.h> +#include <linux/mlx5/device.h> +#include "mlx5_core.h" +#include "dev.h" +#include "devlink.h" + +static int mlx5_sf_dev_probe(struct auxiliary_device *adev, const struct auxiliary_device_id *id) +{ + struct mlx5_sf_dev *sf_dev = container_of(adev, struct mlx5_sf_dev, adev); + struct mlx5_core_dev *mdev; + struct devlink *devlink; + int err; + + devlink = mlx5_devlink_alloc(); + if (!devlink) + return -ENOMEM; + + mdev = devlink_priv(devlink); + mdev->device = &adev->dev; + mdev->pdev = sf_dev->parent_mdev->pdev; + mdev->bar_addr = sf_dev->bar_base_addr; + mdev->iseg_base = sf_dev->bar_base_addr; + mdev->coredev_type = MLX5_COREDEV_SF; + mdev->priv.parent_mdev = sf_dev->parent_mdev; + mdev->priv.adev_idx = adev->id; + sf_dev->mdev = mdev; + + err = mlx5_mdev_init(mdev, MLX5_DEFAULT_PROF); + if (err) { + mlx5_core_warn(mdev, "mlx5_mdev_init on err=%d\n", err); + goto mdev_err; + } + + mdev->iseg = ioremap(mdev->iseg_base, sizeof(*mdev->iseg)); + if (!mdev->iseg) { + mlx5_core_warn(mdev, "remap error\n"); + goto remap_err; + } + + err = mlx5_load_one(mdev, true); + if (err) { + mlx5_core_warn(mdev, "mlx5_load_one err=%d\n", err); + goto load_one_err; + } + return 0; + +load_one_err: + iounmap(mdev->iseg); +remap_err: + mlx5_mdev_uninit(mdev); +mdev_err: + mlx5_devlink_free(devlink); + return err; +} + +static void mlx5_sf_dev_remove(struct auxiliary_device *adev) +{ + struct mlx5_sf_dev *sf_dev = container_of(adev, struct mlx5_sf_dev, adev); + struct devlink *devlink; + + devlink = priv_to_devlink(sf_dev->mdev); + mlx5_unload_one(sf_dev->mdev, true); + iounmap(sf_dev->mdev->iseg); + mlx5_mdev_uninit(sf_dev->mdev); + mlx5_devlink_free(devlink); +} + +static void mlx5_sf_dev_shutdown(struct auxiliary_device *adev) +{ + struct mlx5_sf_dev *sf_dev = container_of(adev, struct mlx5_sf_dev, adev); + + mlx5_unload_one(sf_dev->mdev, false); +} + +static const struct auxiliary_device_id mlx5_sf_dev_id_table[] = { + { .name = MLX5_ADEV_NAME "." MLX5_SF_DEV_ID_NAME, }, + { }, +}; + +MODULE_DEVICE_TABLE(auxiliary, mlx5_sf_dev_id_table); + +static struct auxiliary_driver mlx5_sf_driver = { + .name = MLX5_SF_DEV_ID_NAME, + .probe = mlx5_sf_dev_probe, + .remove = mlx5_sf_dev_remove, + .shutdown = mlx5_sf_dev_shutdown, + .id_table = mlx5_sf_dev_id_table, +}; + +int mlx5_sf_driver_register(void) +{ + return auxiliary_driver_register(&mlx5_sf_driver); +} + +void mlx5_sf_driver_unregister(void) +{ + auxiliary_driver_unregister(&mlx5_sf_driver); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c new file mode 100644 index 000000000000..c2ba41bb7a70 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c @@ -0,0 +1,556 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#include <linux/mlx5/driver.h> +#include "eswitch.h" +#include "priv.h" +#include "sf/dev/dev.h" +#include "mlx5_ifc_vhca_event.h" +#include "vhca_event.h" +#include "ecpf.h" + +struct mlx5_sf { + struct devlink_port dl_port; + unsigned int port_index; + u16 id; + u16 hw_fn_id; + u16 hw_state; +}; + +struct mlx5_sf_table { + struct mlx5_core_dev *dev; /* To refer from notifier context. */ + struct xarray port_indices; /* port index based lookup. */ + refcount_t refcount; + struct completion disable_complete; + struct mutex sf_state_lock; /* Serializes sf state among user cmds & vhca event handler. */ + struct notifier_block esw_nb; + struct notifier_block vhca_nb; + u8 ecpu: 1; +}; + +static struct mlx5_sf * +mlx5_sf_lookup_by_index(struct mlx5_sf_table *table, unsigned int port_index) +{ + return xa_load(&table->port_indices, port_index); +} + +static struct mlx5_sf * +mlx5_sf_lookup_by_function_id(struct mlx5_sf_table *table, unsigned int fn_id) +{ + unsigned long index; + struct mlx5_sf *sf; + + xa_for_each(&table->port_indices, index, sf) { + if (sf->hw_fn_id == fn_id) + return sf; + } + return NULL; +} + +static int mlx5_sf_id_insert(struct mlx5_sf_table *table, struct mlx5_sf *sf) +{ + return xa_insert(&table->port_indices, sf->port_index, sf, GFP_KERNEL); +} + +static void mlx5_sf_id_erase(struct mlx5_sf_table *table, struct mlx5_sf *sf) +{ + xa_erase(&table->port_indices, sf->port_index); +} + +static struct mlx5_sf * +mlx5_sf_alloc(struct mlx5_sf_table *table, u32 sfnum, struct netlink_ext_ack *extack) +{ + unsigned int dl_port_index; + struct mlx5_sf *sf; + u16 hw_fn_id; + int id_err; + int err; + + id_err = mlx5_sf_hw_table_sf_alloc(table->dev, sfnum); + if (id_err < 0) { + err = id_err; + goto id_err; + } + + sf = kzalloc(sizeof(*sf), GFP_KERNEL); + if (!sf) { + err = -ENOMEM; + goto alloc_err; + } + sf->id = id_err; + hw_fn_id = mlx5_sf_sw_to_hw_id(table->dev, sf->id); + dl_port_index = mlx5_esw_vport_to_devlink_port_index(table->dev, hw_fn_id); + sf->port_index = dl_port_index; + sf->hw_fn_id = hw_fn_id; + sf->hw_state = MLX5_VHCA_STATE_ALLOCATED; + + err = mlx5_sf_id_insert(table, sf); + if (err) + goto insert_err; + + return sf; + +insert_err: + kfree(sf); +alloc_err: + mlx5_sf_hw_table_sf_free(table->dev, id_err); +id_err: + if (err == -EEXIST) + NL_SET_ERR_MSG_MOD(extack, "SF already exist. Choose different sfnum"); + return ERR_PTR(err); +} + +static void mlx5_sf_free(struct mlx5_sf_table *table, struct mlx5_sf *sf) +{ + mlx5_sf_id_erase(table, sf); + mlx5_sf_hw_table_sf_free(table->dev, sf->id); + kfree(sf); +} + +static struct mlx5_sf_table *mlx5_sf_table_try_get(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_table *table = dev->priv.sf_table; + + if (!table) + return NULL; + + return refcount_inc_not_zero(&table->refcount) ? table : NULL; +} + +static void mlx5_sf_table_put(struct mlx5_sf_table *table) +{ + if (refcount_dec_and_test(&table->refcount)) + complete(&table->disable_complete); +} + +static enum devlink_port_fn_state mlx5_sf_to_devlink_state(u8 hw_state) +{ + switch (hw_state) { + case MLX5_VHCA_STATE_ACTIVE: + case MLX5_VHCA_STATE_IN_USE: + case MLX5_VHCA_STATE_TEARDOWN_REQUEST: + return DEVLINK_PORT_FN_STATE_ACTIVE; + case MLX5_VHCA_STATE_INVALID: + case MLX5_VHCA_STATE_ALLOCATED: + default: + return DEVLINK_PORT_FN_STATE_INACTIVE; + } +} + +static enum devlink_port_fn_opstate mlx5_sf_to_devlink_opstate(u8 hw_state) +{ + switch (hw_state) { + case MLX5_VHCA_STATE_IN_USE: + case MLX5_VHCA_STATE_TEARDOWN_REQUEST: + return DEVLINK_PORT_FN_OPSTATE_ATTACHED; + case MLX5_VHCA_STATE_INVALID: + case MLX5_VHCA_STATE_ALLOCATED: + case MLX5_VHCA_STATE_ACTIVE: + default: + return DEVLINK_PORT_FN_OPSTATE_DETACHED; + } +} + +static bool mlx5_sf_is_active(const struct mlx5_sf *sf) +{ + return sf->hw_state == MLX5_VHCA_STATE_ACTIVE || sf->hw_state == MLX5_VHCA_STATE_IN_USE; +} + +int mlx5_devlink_sf_port_fn_state_get(struct devlink *devlink, struct devlink_port *dl_port, + enum devlink_port_fn_state *state, + enum devlink_port_fn_opstate *opstate, + struct netlink_ext_ack *extack) +{ + struct mlx5_core_dev *dev = devlink_priv(devlink); + struct mlx5_sf_table *table; + struct mlx5_sf *sf; + int err = 0; + + table = mlx5_sf_table_try_get(dev); + if (!table) + return -EOPNOTSUPP; + + sf = mlx5_sf_lookup_by_index(table, dl_port->index); + if (!sf) { + err = -EOPNOTSUPP; + goto sf_err; + } + mutex_lock(&table->sf_state_lock); + *state = mlx5_sf_to_devlink_state(sf->hw_state); + *opstate = mlx5_sf_to_devlink_opstate(sf->hw_state); + mutex_unlock(&table->sf_state_lock); +sf_err: + mlx5_sf_table_put(table); + return err; +} + +static int mlx5_sf_activate(struct mlx5_core_dev *dev, struct mlx5_sf *sf) +{ + int err; + + if (mlx5_sf_is_active(sf)) + return 0; + if (sf->hw_state != MLX5_VHCA_STATE_ALLOCATED) + return -EINVAL; + + err = mlx5_cmd_sf_enable_hca(dev, sf->hw_fn_id); + if (err) + return err; + + sf->hw_state = MLX5_VHCA_STATE_ACTIVE; + return 0; +} + +static int mlx5_sf_deactivate(struct mlx5_core_dev *dev, struct mlx5_sf *sf) +{ + int err; + + if (!mlx5_sf_is_active(sf)) + return 0; + + err = mlx5_cmd_sf_disable_hca(dev, sf->hw_fn_id); + if (err) + return err; + + sf->hw_state = MLX5_VHCA_STATE_TEARDOWN_REQUEST; + return 0; +} + +static int mlx5_sf_state_set(struct mlx5_core_dev *dev, struct mlx5_sf_table *table, + struct mlx5_sf *sf, + enum devlink_port_fn_state state) +{ + int err = 0; + + mutex_lock(&table->sf_state_lock); + if (state == mlx5_sf_to_devlink_state(sf->hw_state)) + goto out; + if (state == DEVLINK_PORT_FN_STATE_ACTIVE) + err = mlx5_sf_activate(dev, sf); + else if (state == DEVLINK_PORT_FN_STATE_INACTIVE) + err = mlx5_sf_deactivate(dev, sf); + else + err = -EINVAL; +out: + mutex_unlock(&table->sf_state_lock); + return err; +} + +int mlx5_devlink_sf_port_fn_state_set(struct devlink *devlink, struct devlink_port *dl_port, + enum devlink_port_fn_state state, + struct netlink_ext_ack *extack) +{ + struct mlx5_core_dev *dev = devlink_priv(devlink); + struct mlx5_sf_table *table; + struct mlx5_sf *sf; + int err; + + table = mlx5_sf_table_try_get(dev); + if (!table) { + NL_SET_ERR_MSG_MOD(extack, + "Port state set is only supported in eswitch switchdev mode or SF ports are disabled."); + return -EOPNOTSUPP; + } + sf = mlx5_sf_lookup_by_index(table, dl_port->index); + if (!sf) { + err = -ENODEV; + goto out; + } + + err = mlx5_sf_state_set(dev, table, sf, state); +out: + mlx5_sf_table_put(table); + return err; +} + +static int mlx5_sf_add(struct mlx5_core_dev *dev, struct mlx5_sf_table *table, + const struct devlink_port_new_attrs *new_attr, + struct netlink_ext_ack *extack, + unsigned int *new_port_index) +{ + struct mlx5_eswitch *esw = dev->priv.eswitch; + struct mlx5_sf *sf; + u16 hw_fn_id; + int err; + + sf = mlx5_sf_alloc(table, new_attr->sfnum, extack); + if (IS_ERR(sf)) + return PTR_ERR(sf); + + hw_fn_id = mlx5_sf_sw_to_hw_id(dev, sf->id); + err = mlx5_esw_offloads_sf_vport_enable(esw, &sf->dl_port, hw_fn_id, new_attr->sfnum); + if (err) + goto esw_err; + *new_port_index = sf->port_index; + return 0; + +esw_err: + mlx5_sf_free(table, sf); + return err; +} + +static int +mlx5_sf_new_check_attr(struct mlx5_core_dev *dev, const struct devlink_port_new_attrs *new_attr, + struct netlink_ext_ack *extack) +{ + if (new_attr->flavour != DEVLINK_PORT_FLAVOUR_PCI_SF) { + NL_SET_ERR_MSG_MOD(extack, "Driver supports only SF port addition"); + return -EOPNOTSUPP; + } + if (new_attr->port_index_valid) { + NL_SET_ERR_MSG_MOD(extack, + "Driver does not support user defined port index assignment"); + return -EOPNOTSUPP; + } + if (!new_attr->sfnum_valid) { + NL_SET_ERR_MSG_MOD(extack, + "User must provide unique sfnum. Driver does not support auto assignment"); + return -EOPNOTSUPP; + } + if (new_attr->controller_valid && new_attr->controller) { + NL_SET_ERR_MSG_MOD(extack, "External controller is unsupported"); + return -EOPNOTSUPP; + } + if (new_attr->pfnum != PCI_FUNC(dev->pdev->devfn)) { + NL_SET_ERR_MSG_MOD(extack, "Invalid pfnum supplied"); + return -EOPNOTSUPP; + } + return 0; +} + +int mlx5_devlink_sf_port_new(struct devlink *devlink, + const struct devlink_port_new_attrs *new_attr, + struct netlink_ext_ack *extack, + unsigned int *new_port_index) +{ + struct mlx5_core_dev *dev = devlink_priv(devlink); + struct mlx5_sf_table *table; + int err; + + err = mlx5_sf_new_check_attr(dev, new_attr, extack); + if (err) + return err; + + table = mlx5_sf_table_try_get(dev); + if (!table) { + NL_SET_ERR_MSG_MOD(extack, + "Port add is only supported in eswitch switchdev mode or SF ports are disabled."); + return -EOPNOTSUPP; + } + err = mlx5_sf_add(dev, table, new_attr, extack, new_port_index); + mlx5_sf_table_put(table); + return err; +} + +static void mlx5_sf_dealloc(struct mlx5_sf_table *table, struct mlx5_sf *sf) +{ + if (sf->hw_state == MLX5_VHCA_STATE_ALLOCATED) { + mlx5_sf_free(table, sf); + } else if (mlx5_sf_is_active(sf)) { + /* Even if its active, it is treated as in_use because by the time, + * it is disabled here, it may getting used. So it is safe to + * always look for the event to ensure that it is recycled only after + * firmware gives confirmation that it is detached by the driver. + */ + mlx5_cmd_sf_disable_hca(table->dev, sf->hw_fn_id); + mlx5_sf_hw_table_sf_deferred_free(table->dev, sf->id); + kfree(sf); + } else { + mlx5_sf_hw_table_sf_deferred_free(table->dev, sf->id); + kfree(sf); + } +} + +int mlx5_devlink_sf_port_del(struct devlink *devlink, unsigned int port_index, + struct netlink_ext_ack *extack) +{ + struct mlx5_core_dev *dev = devlink_priv(devlink); + struct mlx5_eswitch *esw = dev->priv.eswitch; + struct mlx5_sf_table *table; + struct mlx5_sf *sf; + int err = 0; + + table = mlx5_sf_table_try_get(dev); + if (!table) { + NL_SET_ERR_MSG_MOD(extack, + "Port del is only supported in eswitch switchdev mode or SF ports are disabled."); + return -EOPNOTSUPP; + } + sf = mlx5_sf_lookup_by_index(table, port_index); + if (!sf) { + err = -ENODEV; + goto sf_err; + } + + mlx5_esw_offloads_sf_vport_disable(esw, sf->hw_fn_id); + mlx5_sf_id_erase(table, sf); + + mutex_lock(&table->sf_state_lock); + mlx5_sf_dealloc(table, sf); + mutex_unlock(&table->sf_state_lock); +sf_err: + mlx5_sf_table_put(table); + return err; +} + +static bool mlx5_sf_state_update_check(const struct mlx5_sf *sf, u8 new_state) +{ + if (sf->hw_state == MLX5_VHCA_STATE_ACTIVE && new_state == MLX5_VHCA_STATE_IN_USE) + return true; + + if (sf->hw_state == MLX5_VHCA_STATE_IN_USE && new_state == MLX5_VHCA_STATE_ACTIVE) + return true; + + if (sf->hw_state == MLX5_VHCA_STATE_TEARDOWN_REQUEST && + new_state == MLX5_VHCA_STATE_ALLOCATED) + return true; + + return false; +} + +static int mlx5_sf_vhca_event(struct notifier_block *nb, unsigned long opcode, void *data) +{ + struct mlx5_sf_table *table = container_of(nb, struct mlx5_sf_table, vhca_nb); + const struct mlx5_vhca_state_event *event = data; + bool update = false; + struct mlx5_sf *sf; + + table = mlx5_sf_table_try_get(table->dev); + if (!table) + return 0; + + mutex_lock(&table->sf_state_lock); + sf = mlx5_sf_lookup_by_function_id(table, event->function_id); + if (!sf) + goto sf_err; + + /* When driver is attached or detached to a function, an event + * notifies such state change. + */ + update = mlx5_sf_state_update_check(sf, event->new_vhca_state); + if (update) + sf->hw_state = event->new_vhca_state; +sf_err: + mutex_unlock(&table->sf_state_lock); + mlx5_sf_table_put(table); + return 0; +} + +static void mlx5_sf_table_enable(struct mlx5_sf_table *table) +{ + if (!mlx5_sf_max_functions(table->dev)) + return; + + init_completion(&table->disable_complete); + refcount_set(&table->refcount, 1); +} + +static void mlx5_sf_deactivate_all(struct mlx5_sf_table *table) +{ + struct mlx5_eswitch *esw = table->dev->priv.eswitch; + unsigned long index; + struct mlx5_sf *sf; + + /* At this point, no new user commands can start and no vhca event can + * arrive. It is safe to destroy all user created SFs. + */ + xa_for_each(&table->port_indices, index, sf) { + mlx5_esw_offloads_sf_vport_disable(esw, sf->hw_fn_id); + mlx5_sf_id_erase(table, sf); + mlx5_sf_dealloc(table, sf); + } +} + +static void mlx5_sf_table_disable(struct mlx5_sf_table *table) +{ + if (!mlx5_sf_max_functions(table->dev)) + return; + + if (!refcount_read(&table->refcount)) + return; + + /* Balances with refcount_set; drop the reference so that new user cmd cannot start + * and new vhca event handler cannnot run. + */ + mlx5_sf_table_put(table); + wait_for_completion(&table->disable_complete); + + mlx5_sf_deactivate_all(table); +} + +static int mlx5_sf_esw_event(struct notifier_block *nb, unsigned long event, void *data) +{ + struct mlx5_sf_table *table = container_of(nb, struct mlx5_sf_table, esw_nb); + const struct mlx5_esw_event_info *mode = data; + + switch (mode->new_mode) { + case MLX5_ESWITCH_OFFLOADS: + mlx5_sf_table_enable(table); + break; + case MLX5_ESWITCH_NONE: + mlx5_sf_table_disable(table); + break; + default: + break; + }; + + return 0; +} + +static bool mlx5_sf_table_supported(const struct mlx5_core_dev *dev) +{ + return dev->priv.eswitch && MLX5_ESWITCH_MANAGER(dev) && mlx5_sf_supported(dev); +} + +int mlx5_sf_table_init(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_table *table; + int err; + + if (!mlx5_sf_table_supported(dev) || !mlx5_vhca_event_supported(dev)) + return 0; + + table = kzalloc(sizeof(*table), GFP_KERNEL); + if (!table) + return -ENOMEM; + + mutex_init(&table->sf_state_lock); + table->dev = dev; + xa_init(&table->port_indices); + dev->priv.sf_table = table; + refcount_set(&table->refcount, 0); + table->esw_nb.notifier_call = mlx5_sf_esw_event; + err = mlx5_esw_event_notifier_register(dev->priv.eswitch, &table->esw_nb); + if (err) + goto reg_err; + + table->vhca_nb.notifier_call = mlx5_sf_vhca_event; + err = mlx5_vhca_event_notifier_register(table->dev, &table->vhca_nb); + if (err) + goto vhca_err; + + return 0; + +vhca_err: + mlx5_esw_event_notifier_unregister(dev->priv.eswitch, &table->esw_nb); +reg_err: + mutex_destroy(&table->sf_state_lock); + kfree(table); + dev->priv.sf_table = NULL; + return err; +} + +void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_table *table = dev->priv.sf_table; + + if (!table) + return; + + mlx5_vhca_event_notifier_unregister(table->dev, &table->vhca_nb); + mlx5_esw_event_notifier_unregister(dev->priv.eswitch, &table->esw_nb); + WARN_ON(refcount_read(&table->refcount)); + mutex_destroy(&table->sf_state_lock); + WARN_ON(!xa_empty(&table->port_indices)); + kfree(table); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c new file mode 100644 index 000000000000..58b6be0b03d7 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c @@ -0,0 +1,233 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2020 Mellanox Technologies Ltd */ +#include <linux/mlx5/driver.h> +#include "vhca_event.h" +#include "priv.h" +#include "sf.h" +#include "mlx5_ifc_vhca_event.h" +#include "vhca_event.h" +#include "ecpf.h" + +struct mlx5_sf_hw { + u32 usr_sfnum; + u8 allocated: 1; + u8 pending_delete: 1; +}; + +struct mlx5_sf_hw_table { + struct mlx5_core_dev *dev; + struct mlx5_sf_hw *sfs; + int max_local_functions; + u8 ecpu: 1; + struct mutex table_lock; /* Serializes sf deletion and vhca state change handler. */ + struct notifier_block vhca_nb; +}; + +u16 mlx5_sf_sw_to_hw_id(const struct mlx5_core_dev *dev, u16 sw_id) +{ + return sw_id + mlx5_sf_start_function_id(dev); +} + +static u16 mlx5_sf_hw_to_sw_id(const struct mlx5_core_dev *dev, u16 hw_id) +{ + return hw_id - mlx5_sf_start_function_id(dev); +} + +int mlx5_sf_hw_table_sf_alloc(struct mlx5_core_dev *dev, u32 usr_sfnum) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + int sw_id = -ENOSPC; + u16 hw_fn_id; + int err; + int i; + + if (!table->max_local_functions) + return -EOPNOTSUPP; + + mutex_lock(&table->table_lock); + /* Check if sf with same sfnum already exists or not. */ + for (i = 0; i < table->max_local_functions; i++) { + if (table->sfs[i].allocated && table->sfs[i].usr_sfnum == usr_sfnum) { + err = -EEXIST; + goto exist_err; + } + } + + /* Find the free entry and allocate the entry from the array */ + for (i = 0; i < table->max_local_functions; i++) { + if (!table->sfs[i].allocated) { + table->sfs[i].usr_sfnum = usr_sfnum; + table->sfs[i].allocated = true; + sw_id = i; + break; + } + } + if (sw_id == -ENOSPC) { + err = -ENOSPC; + goto err; + } + + hw_fn_id = mlx5_sf_sw_to_hw_id(table->dev, sw_id); + err = mlx5_cmd_alloc_sf(table->dev, hw_fn_id); + if (err) + goto err; + + err = mlx5_modify_vhca_sw_id(dev, hw_fn_id, table->ecpu, usr_sfnum); + if (err) + goto vhca_err; + + mutex_unlock(&table->table_lock); + return sw_id; + +vhca_err: + mlx5_cmd_dealloc_sf(table->dev, hw_fn_id); +err: + table->sfs[i].allocated = false; +exist_err: + mutex_unlock(&table->table_lock); + return err; +} + +static void _mlx5_sf_hw_id_free(struct mlx5_core_dev *dev, u16 id) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + u16 hw_fn_id; + + hw_fn_id = mlx5_sf_sw_to_hw_id(table->dev, id); + mlx5_cmd_dealloc_sf(table->dev, hw_fn_id); + table->sfs[id].allocated = false; + table->sfs[id].pending_delete = false; +} + +void mlx5_sf_hw_table_sf_free(struct mlx5_core_dev *dev, u16 id) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + + mutex_lock(&table->table_lock); + _mlx5_sf_hw_id_free(dev, id); + mutex_unlock(&table->table_lock); +} + +void mlx5_sf_hw_table_sf_deferred_free(struct mlx5_core_dev *dev, u16 id) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + u32 out[MLX5_ST_SZ_DW(query_vhca_state_out)] = {}; + u16 hw_fn_id; + u8 state; + int err; + + hw_fn_id = mlx5_sf_sw_to_hw_id(dev, id); + mutex_lock(&table->table_lock); + err = mlx5_cmd_query_vhca_state(dev, hw_fn_id, table->ecpu, out, sizeof(out)); + if (err) + goto err; + state = MLX5_GET(query_vhca_state_out, out, vhca_state_context.vhca_state); + if (state == MLX5_VHCA_STATE_ALLOCATED) { + mlx5_cmd_dealloc_sf(table->dev, hw_fn_id); + table->sfs[id].allocated = false; + } else { + table->sfs[id].pending_delete = true; + } +err: + mutex_unlock(&table->table_lock); +} + +static void mlx5_sf_hw_dealloc_all(struct mlx5_sf_hw_table *table) +{ + int i; + + for (i = 0; i < table->max_local_functions; i++) { + if (table->sfs[i].allocated) + _mlx5_sf_hw_id_free(table->dev, i); + } +} + +int mlx5_sf_hw_table_init(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_hw_table *table; + struct mlx5_sf_hw *sfs; + int max_functions; + + if (!mlx5_sf_supported(dev) || !mlx5_vhca_event_supported(dev)) + return 0; + + max_functions = mlx5_sf_max_functions(dev); + table = kzalloc(sizeof(*table), GFP_KERNEL); + if (!table) + return -ENOMEM; + + sfs = kcalloc(max_functions, sizeof(*sfs), GFP_KERNEL); + if (!sfs) + goto table_err; + + mutex_init(&table->table_lock); + table->dev = dev; + table->sfs = sfs; + table->max_local_functions = max_functions; + table->ecpu = mlx5_read_embedded_cpu(dev); + dev->priv.sf_hw_table = table; + mlx5_core_dbg(dev, "SF HW table: max sfs = %d\n", max_functions); + return 0; + +table_err: + kfree(table); + return -ENOMEM; +} + +void mlx5_sf_hw_table_cleanup(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + + if (!table) + return; + + mutex_destroy(&table->table_lock); + kfree(table->sfs); + kfree(table); +} + +static int mlx5_sf_hw_vhca_event(struct notifier_block *nb, unsigned long opcode, void *data) +{ + struct mlx5_sf_hw_table *table = container_of(nb, struct mlx5_sf_hw_table, vhca_nb); + const struct mlx5_vhca_state_event *event = data; + struct mlx5_sf_hw *sf_hw; + u16 sw_id; + + if (event->new_vhca_state != MLX5_VHCA_STATE_ALLOCATED) + return 0; + + sw_id = mlx5_sf_hw_to_sw_id(table->dev, event->function_id); + sf_hw = &table->sfs[sw_id]; + + mutex_lock(&table->table_lock); + /* SF driver notified through firmware that SF is finally detached. + * Hence recycle the sf hardware id for reuse. + */ + if (sf_hw->allocated && sf_hw->pending_delete) + _mlx5_sf_hw_id_free(table->dev, sw_id); + mutex_unlock(&table->table_lock); + return 0; +} + +int mlx5_sf_hw_table_create(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + + if (!table) + return 0; + + table->vhca_nb.notifier_call = mlx5_sf_hw_vhca_event; + return mlx5_vhca_event_notifier_register(table->dev, &table->vhca_nb); +} + +void mlx5_sf_hw_table_destroy(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + + if (!table) + return; + + mlx5_vhca_event_notifier_unregister(table->dev, &table->vhca_nb); + /* Dealloc SFs whose firmware event has been missed. */ + mlx5_sf_hw_dealloc_all(table); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/mlx5_ifc_vhca_event.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/mlx5_ifc_vhca_event.h new file mode 100644 index 000000000000..1daf5a122ba3 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/mlx5_ifc_vhca_event.h @@ -0,0 +1,82 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#ifndef __MLX5_IFC_VHCA_EVENT_H__ +#define __MLX5_IFC_VHCA_EVENT_H__ + +enum mlx5_ifc_vhca_state { + MLX5_VHCA_STATE_INVALID = 0x0, + MLX5_VHCA_STATE_ALLOCATED = 0x1, + MLX5_VHCA_STATE_ACTIVE = 0x2, + MLX5_VHCA_STATE_IN_USE = 0x3, + MLX5_VHCA_STATE_TEARDOWN_REQUEST = 0x4, +}; + +struct mlx5_ifc_vhca_state_context_bits { + u8 arm_change_event[0x1]; + u8 reserved_at_1[0xb]; + u8 vhca_state[0x4]; + u8 reserved_at_10[0x10]; + + u8 sw_function_id[0x20]; + + u8 reserved_at_40[0x80]; +}; + +struct mlx5_ifc_query_vhca_state_out_bits { + u8 status[0x8]; + u8 reserved_at_8[0x18]; + + u8 syndrome[0x20]; + + u8 reserved_at_40[0x40]; + + struct mlx5_ifc_vhca_state_context_bits vhca_state_context; +}; + +struct mlx5_ifc_query_vhca_state_in_bits { + u8 opcode[0x10]; + u8 uid[0x10]; + + u8 reserved_at_20[0x10]; + u8 op_mod[0x10]; + + u8 embedded_cpu_function[0x1]; + u8 reserved_at_41[0xf]; + u8 function_id[0x10]; + + u8 reserved_at_60[0x20]; +}; + +struct mlx5_ifc_vhca_state_field_select_bits { + u8 reserved_at_0[0x1e]; + u8 sw_function_id[0x1]; + u8 arm_change_event[0x1]; +}; + +struct mlx5_ifc_modify_vhca_state_out_bits { + u8 status[0x8]; + u8 reserved_at_8[0x18]; + + u8 syndrome[0x20]; + + u8 reserved_at_40[0x40]; +}; + +struct mlx5_ifc_modify_vhca_state_in_bits { + u8 opcode[0x10]; + u8 uid[0x10]; + + u8 reserved_at_20[0x10]; + u8 op_mod[0x10]; + + u8 embedded_cpu_function[0x1]; + u8 reserved_at_41[0xf]; + u8 function_id[0x10]; + + struct mlx5_ifc_vhca_state_field_select_bits vhca_state_field_select; + + struct mlx5_ifc_vhca_state_context_bits vhca_state_context; +}; + +#endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h new file mode 100644 index 000000000000..cb02a51d0986 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#ifndef __MLX5_SF_PRIV_H__ +#define __MLX5_SF_PRIV_H__ + +#include <linux/mlx5/driver.h> + +int mlx5_cmd_alloc_sf(struct mlx5_core_dev *dev, u16 function_id); +int mlx5_cmd_dealloc_sf(struct mlx5_core_dev *dev, u16 function_id); + +int mlx5_cmd_sf_enable_hca(struct mlx5_core_dev *dev, u16 func_id); +int mlx5_cmd_sf_disable_hca(struct mlx5_core_dev *dev, u16 func_id); + +u16 mlx5_sf_sw_to_hw_id(const struct mlx5_core_dev *dev, u16 sw_id); + +int mlx5_sf_hw_table_sf_alloc(struct mlx5_core_dev *dev, u32 usr_sfnum); +void mlx5_sf_hw_table_sf_free(struct mlx5_core_dev *dev, u16 id); +void mlx5_sf_hw_table_sf_deferred_free(struct mlx5_core_dev *dev, u16 id); + +#endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h new file mode 100644 index 000000000000..0b6aea1e6a94 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h @@ -0,0 +1,100 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#ifndef __MLX5_SF_H__ +#define __MLX5_SF_H__ + +#include <linux/mlx5/driver.h> + +static inline u16 mlx5_sf_start_function_id(const struct mlx5_core_dev *dev) +{ + return MLX5_CAP_GEN(dev, sf_base_id); +} + +#ifdef CONFIG_MLX5_SF + +static inline bool mlx5_sf_supported(const struct mlx5_core_dev *dev) +{ + return MLX5_CAP_GEN(dev, sf); +} + +static inline u16 mlx5_sf_max_functions(const struct mlx5_core_dev *dev) +{ + if (!mlx5_sf_supported(dev)) + return 0; + if (MLX5_CAP_GEN(dev, max_num_sf)) + return MLX5_CAP_GEN(dev, max_num_sf); + else + return 1 << MLX5_CAP_GEN(dev, log_max_sf); +} + +#else + +static inline bool mlx5_sf_supported(const struct mlx5_core_dev *dev) +{ + return false; +} + +static inline u16 mlx5_sf_max_functions(const struct mlx5_core_dev *dev) +{ + return 0; +} + +#endif + +#ifdef CONFIG_MLX5_SF_MANAGER + +int mlx5_sf_hw_table_init(struct mlx5_core_dev *dev); +void mlx5_sf_hw_table_cleanup(struct mlx5_core_dev *dev); + +int mlx5_sf_hw_table_create(struct mlx5_core_dev *dev); +void mlx5_sf_hw_table_destroy(struct mlx5_core_dev *dev); + +int mlx5_sf_table_init(struct mlx5_core_dev *dev); +void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev); + +int mlx5_devlink_sf_port_new(struct devlink *devlink, + const struct devlink_port_new_attrs *add_attr, + struct netlink_ext_ack *extack, + unsigned int *new_port_index); +int mlx5_devlink_sf_port_del(struct devlink *devlink, unsigned int port_index, + struct netlink_ext_ack *extack); +int mlx5_devlink_sf_port_fn_state_get(struct devlink *devlink, struct devlink_port *dl_port, + enum devlink_port_fn_state *state, + enum devlink_port_fn_opstate *opstate, + struct netlink_ext_ack *extack); +int mlx5_devlink_sf_port_fn_state_set(struct devlink *devlink, struct devlink_port *dl_port, + enum devlink_port_fn_state state, + struct netlink_ext_ack *extack); +#else + +static inline int mlx5_sf_hw_table_init(struct mlx5_core_dev *dev) +{ + return 0; +} + +static inline void mlx5_sf_hw_table_cleanup(struct mlx5_core_dev *dev) +{ +} + +static inline int mlx5_sf_hw_table_create(struct mlx5_core_dev *dev) +{ + return 0; +} + +static inline void mlx5_sf_hw_table_destroy(struct mlx5_core_dev *dev) +{ +} + +static inline int mlx5_sf_table_init(struct mlx5_core_dev *dev) +{ + return 0; +} + +static inline void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev) +{ +} + +#endif + +#endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c new file mode 100644 index 000000000000..af2f2dd9db25 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c @@ -0,0 +1,189 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#include <linux/mlx5/driver.h> +#include "mlx5_ifc_vhca_event.h" +#include "mlx5_core.h" +#include "vhca_event.h" +#include "ecpf.h" + +struct mlx5_vhca_state_notifier { + struct mlx5_core_dev *dev; + struct mlx5_nb nb; + struct blocking_notifier_head n_head; +}; + +struct mlx5_vhca_event_work { + struct work_struct work; + struct mlx5_vhca_state_notifier *notifier; + struct mlx5_vhca_state_event event; +}; + +int mlx5_cmd_query_vhca_state(struct mlx5_core_dev *dev, u16 function_id, + bool ecpu, u32 *out, u32 outlen) +{ + u32 in[MLX5_ST_SZ_DW(query_vhca_state_in)] = {}; + + MLX5_SET(query_vhca_state_in, in, opcode, MLX5_CMD_OP_QUERY_VHCA_STATE); + MLX5_SET(query_vhca_state_in, in, function_id, function_id); + MLX5_SET(query_vhca_state_in, in, embedded_cpu_function, ecpu); + + return mlx5_cmd_exec(dev, in, sizeof(in), out, outlen); +} + +static int mlx5_cmd_modify_vhca_state(struct mlx5_core_dev *dev, u16 function_id, + bool ecpu, u32 *in, u32 inlen) +{ + u32 out[MLX5_ST_SZ_DW(modify_vhca_state_out)] = {}; + + MLX5_SET(modify_vhca_state_in, in, opcode, MLX5_CMD_OP_MODIFY_VHCA_STATE); + MLX5_SET(modify_vhca_state_in, in, function_id, function_id); + MLX5_SET(modify_vhca_state_in, in, embedded_cpu_function, ecpu); + + return mlx5_cmd_exec(dev, in, inlen, out, sizeof(out)); +} + +int mlx5_modify_vhca_sw_id(struct mlx5_core_dev *dev, u16 function_id, bool ecpu, u32 sw_fn_id) +{ + u32 out[MLX5_ST_SZ_DW(modify_vhca_state_out)] = {}; + u32 in[MLX5_ST_SZ_DW(modify_vhca_state_in)] = {}; + + MLX5_SET(modify_vhca_state_in, in, opcode, MLX5_CMD_OP_MODIFY_VHCA_STATE); + MLX5_SET(modify_vhca_state_in, in, function_id, function_id); + MLX5_SET(modify_vhca_state_in, in, embedded_cpu_function, ecpu); + MLX5_SET(modify_vhca_state_in, in, vhca_state_field_select.sw_function_id, 1); + MLX5_SET(modify_vhca_state_in, in, vhca_state_context.sw_function_id, sw_fn_id); + + return mlx5_cmd_exec_inout(dev, modify_vhca_state, in, out); +} + +int mlx5_vhca_event_arm(struct mlx5_core_dev *dev, u16 function_id, bool ecpu) +{ + u32 in[MLX5_ST_SZ_DW(modify_vhca_state_in)] = {}; + + MLX5_SET(modify_vhca_state_in, in, vhca_state_context.arm_change_event, 1); + MLX5_SET(modify_vhca_state_in, in, vhca_state_field_select.arm_change_event, 1); + + return mlx5_cmd_modify_vhca_state(dev, function_id, ecpu, in, sizeof(in)); +} + +static void +mlx5_vhca_event_notify(struct mlx5_core_dev *dev, struct mlx5_vhca_state_event *event) +{ + u32 out[MLX5_ST_SZ_DW(query_vhca_state_out)] = {}; + int err; + + err = mlx5_cmd_query_vhca_state(dev, event->function_id, event->ecpu, out, sizeof(out)); + if (err) + return; + + event->sw_function_id = MLX5_GET(query_vhca_state_out, out, + vhca_state_context.sw_function_id); + event->new_vhca_state = MLX5_GET(query_vhca_state_out, out, + vhca_state_context.vhca_state); + + mlx5_vhca_event_arm(dev, event->function_id, event->ecpu); + + blocking_notifier_call_chain(&dev->priv.vhca_state_notifier->n_head, 0, event); +} + +static void mlx5_vhca_state_work_handler(struct work_struct *_work) +{ + struct mlx5_vhca_event_work *work = container_of(_work, struct mlx5_vhca_event_work, work); + struct mlx5_vhca_state_notifier *notifier = work->notifier; + struct mlx5_core_dev *dev = notifier->dev; + + mlx5_vhca_event_notify(dev, &work->event); +} + +static int +mlx5_vhca_state_change_notifier(struct notifier_block *nb, unsigned long type, void *data) +{ + struct mlx5_vhca_state_notifier *notifier = + mlx5_nb_cof(nb, struct mlx5_vhca_state_notifier, nb); + struct mlx5_vhca_event_work *work; + struct mlx5_eqe *eqe = data; + + work = kzalloc(sizeof(*work), GFP_ATOMIC); + if (!work) + return NOTIFY_DONE; + INIT_WORK(&work->work, &mlx5_vhca_state_work_handler); + work->notifier = notifier; + work->event.function_id = be16_to_cpu(eqe->data.vhca_state.function_id); + work->event.ecpu = be16_to_cpu(eqe->data.vhca_state.ec_function); + mlx5_events_work_enqueue(notifier->dev, &work->work); + return NOTIFY_OK; +} + +void mlx5_vhca_state_cap_handle(struct mlx5_core_dev *dev, void *set_hca_cap) +{ + if (!mlx5_vhca_event_supported(dev)) + return; + + MLX5_SET(cmd_hca_cap, set_hca_cap, vhca_state, 1); + MLX5_SET(cmd_hca_cap, set_hca_cap, event_on_vhca_state_allocated, 1); + MLX5_SET(cmd_hca_cap, set_hca_cap, event_on_vhca_state_active, 1); + MLX5_SET(cmd_hca_cap, set_hca_cap, event_on_vhca_state_in_use, 1); + MLX5_SET(cmd_hca_cap, set_hca_cap, event_on_vhca_state_teardown_request, 1); +} + +int mlx5_vhca_event_init(struct mlx5_core_dev *dev) +{ + struct mlx5_vhca_state_notifier *notifier; + + if (!mlx5_vhca_event_supported(dev)) + return 0; + + notifier = kzalloc(sizeof(*notifier), GFP_KERNEL); + if (!notifier) + return -ENOMEM; + + dev->priv.vhca_state_notifier = notifier; + notifier->dev = dev; + BLOCKING_INIT_NOTIFIER_HEAD(¬ifier->n_head); + MLX5_NB_INIT(¬ifier->nb, mlx5_vhca_state_change_notifier, VHCA_STATE_CHANGE); + return 0; +} + +void mlx5_vhca_event_cleanup(struct mlx5_core_dev *dev) +{ + if (!mlx5_vhca_event_supported(dev)) + return; + + kfree(dev->priv.vhca_state_notifier); + dev->priv.vhca_state_notifier = NULL; +} + +void mlx5_vhca_event_start(struct mlx5_core_dev *dev) +{ + struct mlx5_vhca_state_notifier *notifier; + + if (!dev->priv.vhca_state_notifier) + return; + + notifier = dev->priv.vhca_state_notifier; + mlx5_eq_notifier_register(dev, ¬ifier->nb); +} + +void mlx5_vhca_event_stop(struct mlx5_core_dev *dev) +{ + struct mlx5_vhca_state_notifier *notifier; + + if (!dev->priv.vhca_state_notifier) + return; + + notifier = dev->priv.vhca_state_notifier; + mlx5_eq_notifier_unregister(dev, ¬ifier->nb); +} + +int mlx5_vhca_event_notifier_register(struct mlx5_core_dev *dev, struct notifier_block *nb) +{ + if (!dev->priv.vhca_state_notifier) + return -EOPNOTSUPP; + return blocking_notifier_chain_register(&dev->priv.vhca_state_notifier->n_head, nb); +} + +void mlx5_vhca_event_notifier_unregister(struct mlx5_core_dev *dev, struct notifier_block *nb) +{ + blocking_notifier_chain_unregister(&dev->priv.vhca_state_notifier->n_head, nb); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h new file mode 100644 index 000000000000..1fe1ec6f4d4b --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h @@ -0,0 +1,57 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#ifndef __MLX5_VHCA_EVENT_H__ +#define __MLX5_VHCA_EVENT_H__ + +#ifdef CONFIG_MLX5_SF + +struct mlx5_vhca_state_event { + u16 function_id; + u16 sw_function_id; + u8 new_vhca_state; + bool ecpu; +}; + +static inline bool mlx5_vhca_event_supported(const struct mlx5_core_dev *dev) +{ + return MLX5_CAP_GEN_MAX(dev, vhca_state); +} + +void mlx5_vhca_state_cap_handle(struct mlx5_core_dev *dev, void *set_hca_cap); +int mlx5_vhca_event_init(struct mlx5_core_dev *dev); +void mlx5_vhca_event_cleanup(struct mlx5_core_dev *dev); +void mlx5_vhca_event_start(struct mlx5_core_dev *dev); +void mlx5_vhca_event_stop(struct mlx5_core_dev *dev); +int mlx5_vhca_event_notifier_register(struct mlx5_core_dev *dev, struct notifier_block *nb); +void mlx5_vhca_event_notifier_unregister(struct mlx5_core_dev *dev, struct notifier_block *nb); +int mlx5_modify_vhca_sw_id(struct mlx5_core_dev *dev, u16 function_id, bool ecpu, u32 sw_fn_id); +int mlx5_vhca_event_arm(struct mlx5_core_dev *dev, u16 function_id, bool ecpu); +int mlx5_cmd_query_vhca_state(struct mlx5_core_dev *dev, u16 function_id, + bool ecpu, u32 *out, u32 outlen); +#else + +static inline void mlx5_vhca_state_cap_handle(struct mlx5_core_dev *dev, void *set_hca_cap) +{ +} + +static inline int mlx5_vhca_event_init(struct mlx5_core_dev *dev) +{ + return 0; +} + +static inline void mlx5_vhca_event_cleanup(struct mlx5_core_dev *dev) +{ +} + +static inline void mlx5_vhca_event_start(struct mlx5_core_dev *dev) +{ +} + +static inline void mlx5_vhca_event_stop(struct mlx5_core_dev *dev) +{ +} + +#endif + +#endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vport.c b/drivers/net/ethernet/mellanox/mlx5/core/vport.c index bdafc85fd874..ba78e0660523 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/vport.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/vport.c @@ -36,6 +36,7 @@ #include <linux/mlx5/vport.h> #include <linux/mlx5/eswitch.h> #include "mlx5_core.h" +#include "sf/sf.h" /* Mutex to hold while enabling or disabling RoCE */ static DEFINE_MUTEX(mlx5_roce_en_lock); @@ -1160,6 +1161,6 @@ EXPORT_SYMBOL_GPL(mlx5_query_nic_system_image_guid); */ u16 mlx5_eswitch_get_total_vports(const struct mlx5_core_dev *dev) { - return MLX5_SPECIAL_VPORTS(dev) + mlx5_core_max_vfs(dev); + return MLX5_SPECIAL_VPORTS(dev) + mlx5_core_max_vfs(dev) + mlx5_sf_max_functions(dev); } EXPORT_SYMBOL_GPL(mlx5_eswitch_get_total_vports); diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index 45df5b465ba8..b955ed3450fd 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -193,7 +193,8 @@ enum port_state_policy { enum mlx5_coredev_type { MLX5_COREDEV_PF, - MLX5_COREDEV_VF + MLX5_COREDEV_VF, + MLX5_COREDEV_SF, }; struct mlx5_field_desc { @@ -507,6 +508,10 @@ struct mlx5_devcom; struct mlx5_fw_reset; struct mlx5_eq_table; struct mlx5_irq_table; +struct mlx5_vhca_state_notifier; +struct mlx5_sf_dev_table; +struct mlx5_sf_hw_table; +struct mlx5_sf_table; struct mlx5_rate_limit { u32 rate; @@ -604,6 +609,15 @@ struct mlx5_priv { struct mlx5_bfreg_data bfregs; struct mlx5_uars_page *uar; +#ifdef CONFIG_MLX5_SF + struct mlx5_vhca_state_notifier *vhca_state_notifier; + struct mlx5_sf_dev_table *sf_dev_table; + struct mlx5_core_dev *parent_mdev; +#endif +#ifdef CONFIG_MLX5_SF_MANAGER + struct mlx5_sf_hw_table *sf_hw_table; + struct mlx5_sf_table *sf_table; +#endif }; enum mlx5_device_state { diff --git a/include/net/devlink.h b/include/net/devlink.h index 426b98e74b6e..47b4b063401b 100644 --- a/include/net/devlink.h +++ b/include/net/devlink.h @@ -94,6 +94,18 @@ struct devlink_port_pci_vf_attrs { }; /** + * struct devlink_port_pci_sf_attrs - devlink port's PCI SF attributes + * @controller: Associated controller number + * @sf: Associated PCI SF for of the PCI PF for this port. + * @pf: Associated PCI PF number for this port. + */ +struct devlink_port_pci_sf_attrs { + u32 controller; + u32 sf; + u16 pf; +}; + +/** * struct devlink_port_attrs - devlink port object * @flavour: flavour of the port * @split: indicates if this is split port @@ -103,6 +115,7 @@ struct devlink_port_pci_vf_attrs { * @phys: physical port attributes * @pci_pf: PCI PF port attributes * @pci_vf: PCI VF port attributes + * @pci_sf: PCI SF port attributes */ struct devlink_port_attrs { u8 split:1, @@ -114,6 +127,7 @@ struct devlink_port_attrs { struct devlink_port_phys_attrs phys; struct devlink_port_pci_pf_attrs pci_pf; struct devlink_port_pci_vf_attrs pci_vf; + struct devlink_port_pci_sf_attrs pci_sf; }; }; @@ -138,6 +152,17 @@ struct devlink_port { struct mutex reporters_lock; /* Protects reporter_list */ }; +struct devlink_port_new_attrs { + enum devlink_port_flavour flavour; + unsigned int port_index; + u32 controller; + u32 sfnum; + u16 pfnum; + u8 port_index_valid:1, + controller_valid:1, + sfnum_valid:1; +}; + struct devlink_sb_pool_info { enum devlink_sb_pool_type pool_type; u32 size; @@ -1353,6 +1378,79 @@ struct devlink_ops { int (*port_function_hw_addr_set)(struct devlink *devlink, struct devlink_port *port, const u8 *hw_addr, int hw_addr_len, struct netlink_ext_ack *extack); + /** + * port_new() - Add a new port function of a specified flavor + * @devlink: Devlink instance + * @attrs: attributes of the new port + * @extack: extack for reporting error messages + * @new_port_index: index of the new port + * + * Devlink core will call this device driver function upon user request + * to create a new port function of a specified flavor and optional + * attributes + * + * Notes: + * - Called without devlink instance lock being held. Drivers must + * implement own means of synchronization + * - On success, drivers must register a port with devlink core + * + * Return: 0 on success, negative value otherwise. + */ + int (*port_new)(struct devlink *devlink, + const struct devlink_port_new_attrs *attrs, + struct netlink_ext_ack *extack, + unsigned int *new_port_index); + /** + * port_del() - Delete a port function + * @devlink: Devlink instance + * @port_index: port function index to delete + * @extack: extack for reporting error messages + * + * Devlink core will call this device driver function upon user request + * to delete a previously created port function + * + * Notes: + * - Called without devlink instance lock being held. Drivers must + * implement own means of synchronization + * - On success, drivers must unregister the corresponding devlink + * port + * + * Return: 0 on success, negative value otherwise. + */ + int (*port_del)(struct devlink *devlink, unsigned int port_index, + struct netlink_ext_ack *extack); + /** + * port_fn_state_get() - Get the state of a port function + * @devlink: Devlink instance + * @port: The devlink port + * @state: Admin configured state + * @opstate: Current operational state + * @extack: extack for reporting error messages + * + * Reports the admin and operational state of a devlink port function + * + * Return: 0 on success, negative value otherwise. + */ + int (*port_fn_state_get)(struct devlink *devlink, + struct devlink_port *port, + enum devlink_port_fn_state *state, + enum devlink_port_fn_opstate *opstate, + struct netlink_ext_ack *extack); + /** + * port_fn_state_set() - Set the admin state of a port function + * @devlink: Devlink instance + * @port: The devlink port + * @state: Admin state + * @extack: extack for reporting error messages + * + * Set the admin state of a devlink port function + * + * Return: 0 on success, negative value otherwise. + */ + int (*port_fn_state_set)(struct devlink *devlink, + struct devlink_port *port, + enum devlink_port_fn_state state, + struct netlink_ext_ack *extack); }; static inline void *devlink_priv(struct devlink *devlink) @@ -1409,6 +1507,8 @@ void devlink_port_attrs_pci_pf_set(struct devlink_port *devlink_port, u32 contro u16 pf, bool external); void devlink_port_attrs_pci_vf_set(struct devlink_port *devlink_port, u32 controller, u16 pf, u16 vf, bool external); +void devlink_port_attrs_pci_sf_set(struct devlink_port *devlink_port, + u32 controller, u16 pf, u32 sf); int devlink_sb_register(struct devlink *devlink, unsigned int sb_index, u32 size, u16 ingress_pools_count, u16 egress_pools_count, u16 ingress_tc_count, diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h index cf89c318f2ac..f6008b2fa60f 100644 --- a/include/uapi/linux/devlink.h +++ b/include/uapi/linux/devlink.h @@ -200,6 +200,10 @@ enum devlink_port_flavour { DEVLINK_PORT_FLAVOUR_UNUSED, /* Port which exists in the switch, but * is not used in any way. */ + DEVLINK_PORT_FLAVOUR_PCI_SF, /* Represents eswitch port + * for the PCI SF. It is an internal + * port that faces the PCI SF. + */ }; enum devlink_param_cmode { @@ -529,6 +533,7 @@ enum devlink_attr { DEVLINK_ATTR_RELOAD_ACTION_INFO, /* nested */ DEVLINK_ATTR_RELOAD_ACTION_STATS, /* nested */ + DEVLINK_ATTR_PORT_PCI_SF_NUMBER, /* u32 */ /* add new attributes above here, update the policy in devlink.c */ __DEVLINK_ATTR_MAX, @@ -578,9 +583,29 @@ enum devlink_resource_unit { enum devlink_port_function_attr { DEVLINK_PORT_FUNCTION_ATTR_UNSPEC, DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR, /* binary */ + DEVLINK_PORT_FN_ATTR_STATE, /* u8 */ + DEVLINK_PORT_FN_ATTR_OPSTATE, /* u8 */ __DEVLINK_PORT_FUNCTION_ATTR_MAX, DEVLINK_PORT_FUNCTION_ATTR_MAX = __DEVLINK_PORT_FUNCTION_ATTR_MAX - 1 }; +enum devlink_port_fn_state { + DEVLINK_PORT_FN_STATE_INACTIVE, + DEVLINK_PORT_FN_STATE_ACTIVE, +}; + +/** + * enum devlink_port_fn_opstate - indicates operational state of the function + * @DEVLINK_PORT_FN_OPSTATE_ATTACHED: Driver is attached to the function. + * For graceful tear down of the function, after inactivation of the + * function, user should wait for operational state to turn DETACHED. + * @DEVLINK_PORT_FN_OPSTATE_DETACHED: Driver is detached from the function. + * It is safe to delete the port. + */ +enum devlink_port_fn_opstate { + DEVLINK_PORT_FN_OPSTATE_DETACHED, + DEVLINK_PORT_FN_OPSTATE_ATTACHED, +}; + #endif /* _UAPI_LINUX_DEVLINK_H_ */ diff --git a/net/core/devlink.c b/net/core/devlink.c index f6e6445b9801..737b61c2976e 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -87,6 +87,9 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(devlink_trap_report); static const struct nla_policy devlink_function_nl_policy[DEVLINK_PORT_FUNCTION_ATTR_MAX + 1] = { [DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR] = { .type = NLA_BINARY }, + [DEVLINK_PORT_FN_ATTR_STATE] = + NLA_POLICY_RANGE(NLA_U8, DEVLINK_PORT_FN_STATE_INACTIVE, + DEVLINK_PORT_FN_STATE_ACTIVE), }; static LIST_HEAD(devlink_list); @@ -690,6 +693,15 @@ static int devlink_nl_port_attrs_put(struct sk_buff *msg, if (nla_put_u8(msg, DEVLINK_ATTR_PORT_EXTERNAL, attrs->pci_vf.external)) return -EMSGSIZE; break; + case DEVLINK_PORT_FLAVOUR_PCI_SF: + if (nla_put_u32(msg, DEVLINK_ATTR_PORT_CONTROLLER_NUMBER, + attrs->pci_sf.controller) || + nla_put_u16(msg, DEVLINK_ATTR_PORT_PCI_PF_NUMBER, + attrs->pci_sf.pf) || + nla_put_u32(msg, DEVLINK_ATTR_PORT_PCI_SF_NUMBER, + attrs->pci_sf.sf)) + return -EMSGSIZE; + break; case DEVLINK_PORT_FLAVOUR_PHYSICAL: case DEVLINK_PORT_FLAVOUR_CPU: case DEVLINK_PORT_FLAVOUR_DSA: @@ -713,42 +725,105 @@ static int devlink_nl_port_attrs_put(struct sk_buff *msg, } static int +devlink_port_fn_hw_addr_fill(struct devlink *devlink, const struct devlink_ops *ops, + struct devlink_port *port, struct sk_buff *msg, + struct netlink_ext_ack *extack, bool *msg_updated) +{ + u8 hw_addr[MAX_ADDR_LEN]; + int hw_addr_len; + int err; + + if (!ops->port_function_hw_addr_get) + return 0; + + err = ops->port_function_hw_addr_get(devlink, port, hw_addr, &hw_addr_len, extack); + if (err) { + if (err == -EOPNOTSUPP) + return 0; + return err; + } + err = nla_put(msg, DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR, hw_addr_len, hw_addr); + if (err) + return err; + *msg_updated = true; + return 0; +} + +static bool +devlink_port_fn_state_valid(enum devlink_port_fn_state state) +{ + return state == DEVLINK_PORT_FN_STATE_INACTIVE || + state == DEVLINK_PORT_FN_STATE_ACTIVE; +} + +static bool +devlink_port_fn_opstate_valid(enum devlink_port_fn_opstate opstate) +{ + return opstate == DEVLINK_PORT_FN_OPSTATE_DETACHED || + opstate == DEVLINK_PORT_FN_OPSTATE_ATTACHED; +} + +static int +devlink_port_fn_state_fill(struct devlink *devlink, + const struct devlink_ops *ops, + struct devlink_port *port, struct sk_buff *msg, + struct netlink_ext_ack *extack, + bool *msg_updated) +{ + enum devlink_port_fn_opstate opstate; + enum devlink_port_fn_state state; + int err; + + if (!ops->port_fn_state_get) + return 0; + + err = ops->port_fn_state_get(devlink, port, &state, &opstate, extack); + if (err) { + if (err == -EOPNOTSUPP) + return 0; + return err; + } + if (!devlink_port_fn_state_valid(state)) { + WARN_ON_ONCE(1); + NL_SET_ERR_MSG_MOD(extack, "Invalid state read from driver"); + return -EINVAL; + } + if (!devlink_port_fn_opstate_valid(opstate)) { + WARN_ON_ONCE(1); + NL_SET_ERR_MSG_MOD(extack, + "Invalid operational state read from driver"); + return -EINVAL; + } + if (nla_put_u8(msg, DEVLINK_PORT_FN_ATTR_STATE, state) || + nla_put_u8(msg, DEVLINK_PORT_FN_ATTR_OPSTATE, opstate)) + return -EMSGSIZE; + *msg_updated = true; + return 0; +} + +static int devlink_nl_port_function_attrs_put(struct sk_buff *msg, struct devlink_port *port, struct netlink_ext_ack *extack) { struct devlink *devlink = port->devlink; const struct devlink_ops *ops; struct nlattr *function_attr; - bool empty_nest = true; - int err = 0; + bool msg_updated = false; + int err; function_attr = nla_nest_start_noflag(msg, DEVLINK_ATTR_PORT_FUNCTION); if (!function_attr) return -EMSGSIZE; ops = devlink->ops; - if (ops->port_function_hw_addr_get) { - int hw_addr_len; - u8 hw_addr[MAX_ADDR_LEN]; - - err = ops->port_function_hw_addr_get(devlink, port, hw_addr, &hw_addr_len, extack); - if (err == -EOPNOTSUPP) { - /* Port function attributes are optional for a port. If port doesn't - * support function attribute, returning -EOPNOTSUPP is not an error. - */ - err = 0; - goto out; - } else if (err) { - goto out; - } - err = nla_put(msg, DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR, hw_addr_len, hw_addr); - if (err) - goto out; - empty_nest = false; - } - + err = devlink_port_fn_hw_addr_fill(devlink, ops, port, msg, + extack, &msg_updated); + if (err) + goto out; + err = devlink_port_fn_state_fill(devlink, ops, port, msg, extack, + &msg_updated); out: - if (err || empty_nest) + if (err || !msg_updated) nla_nest_cancel(msg, function_attr); else nla_nest_end(msg, function_attr); @@ -986,7 +1061,6 @@ devlink_port_function_hw_addr_set(struct devlink *devlink, struct devlink_port * const struct devlink_ops *ops; const u8 *hw_addr; int hw_addr_len; - int err; hw_addr = nla_data(attr); hw_addr_len = nla_len(attr); @@ -1011,12 +1085,25 @@ devlink_port_function_hw_addr_set(struct devlink *devlink, struct devlink_port * return -EOPNOTSUPP; } - err = ops->port_function_hw_addr_set(devlink, port, hw_addr, hw_addr_len, extack); - if (err) - return err; + return ops->port_function_hw_addr_set(devlink, port, hw_addr, hw_addr_len, extack); +} - devlink_port_notify(port, DEVLINK_CMD_PORT_NEW); - return 0; +static int devlink_port_fn_state_set(struct devlink *devlink, + struct devlink_port *port, + const struct nlattr *attr, + struct netlink_ext_ack *extack) +{ + enum devlink_port_fn_state state; + const struct devlink_ops *ops; + + state = nla_get_u8(attr); + ops = devlink->ops; + if (!ops->port_fn_state_set) { + NL_SET_ERR_MSG_MOD(extack, + "Function does not support state setting"); + return -EOPNOTSUPP; + } + return ops->port_fn_state_set(devlink, port, state, extack); } static int @@ -1034,9 +1121,21 @@ devlink_port_function_set(struct devlink *devlink, struct devlink_port *port, } attr = tb[DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR]; - if (attr) + if (attr) { err = devlink_port_function_hw_addr_set(devlink, port, attr, extack); + if (err) + return err; + } + /* Keep this as the last function attribute set, so that when + * multiple port function attributes are set along with state, + * Those can be applied first before activating the state. + */ + attr = tb[DEVLINK_PORT_FN_ATTR_STATE]; + if (attr) + err = devlink_port_fn_state_set(devlink, port, attr, extack); + if (!err) + devlink_port_notify(port, DEVLINK_CMD_PORT_NEW); return err; } @@ -1136,6 +1235,111 @@ static int devlink_nl_cmd_port_unsplit_doit(struct sk_buff *skb, return devlink_port_unsplit(devlink, port_index, info->extack); } +static int devlink_port_new_notifiy(struct devlink *devlink, + unsigned int port_index, + struct genl_info *info) +{ + struct devlink_port *devlink_port; + struct sk_buff *msg; + int err; + + msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); + if (!msg) + return -ENOMEM; + + mutex_lock(&devlink->lock); + devlink_port = devlink_port_get_by_index(devlink, port_index); + if (!devlink_port) { + err = -ENODEV; + goto out; + } + + err = devlink_nl_port_fill(msg, devlink, devlink_port, + DEVLINK_CMD_NEW, info->snd_portid, + info->snd_seq, 0, NULL); + if (err) + goto out; + + err = genlmsg_reply(msg, info); + mutex_unlock(&devlink->lock); + return err; + +out: + mutex_unlock(&devlink->lock); + nlmsg_free(msg); + return err; +} + +static int devlink_nl_cmd_port_new_doit(struct sk_buff *skb, + struct genl_info *info) +{ + struct netlink_ext_ack *extack = info->extack; + struct devlink_port_new_attrs new_attrs = {}; + struct devlink *devlink = info->user_ptr[0]; + unsigned int new_port_index; + int err; + + if (!devlink->ops->port_new || !devlink->ops->port_del) + return -EOPNOTSUPP; + + if (!info->attrs[DEVLINK_ATTR_PORT_FLAVOUR] || + !info->attrs[DEVLINK_ATTR_PORT_PCI_PF_NUMBER]) { + NL_SET_ERR_MSG_MOD(extack, "Port flavour or PCI PF are not specified"); + return -EINVAL; + } + new_attrs.flavour = nla_get_u16(info->attrs[DEVLINK_ATTR_PORT_FLAVOUR]); + new_attrs.pfnum = + nla_get_u16(info->attrs[DEVLINK_ATTR_PORT_PCI_PF_NUMBER]); + + if (info->attrs[DEVLINK_ATTR_PORT_INDEX]) { + /* Port index of the new port being created by driver. */ + new_attrs.port_index = + nla_get_u32(info->attrs[DEVLINK_ATTR_PORT_INDEX]); + new_attrs.port_index_valid = true; + } + if (info->attrs[DEVLINK_ATTR_PORT_CONTROLLER_NUMBER]) { + new_attrs.controller = + nla_get_u16(info->attrs[DEVLINK_ATTR_PORT_CONTROLLER_NUMBER]); + new_attrs.controller_valid = true; + } + if (new_attrs.flavour == DEVLINK_PORT_FLAVOUR_PCI_SF && + info->attrs[DEVLINK_ATTR_PORT_PCI_SF_NUMBER]) { + new_attrs.sfnum = nla_get_u32(info->attrs[DEVLINK_ATTR_PORT_PCI_SF_NUMBER]); + new_attrs.sfnum_valid = true; + } + + err = devlink->ops->port_new(devlink, &new_attrs, extack, + &new_port_index); + if (err) + return err; + + err = devlink_port_new_notifiy(devlink, new_port_index, info); + if (err && err != -ENODEV) { + /* Fail to send the response; destroy newly created port. */ + devlink->ops->port_del(devlink, new_port_index, extack); + } + return err; +} + +static int devlink_nl_cmd_port_del_doit(struct sk_buff *skb, + struct genl_info *info) +{ + struct netlink_ext_ack *extack = info->extack; + struct devlink *devlink = info->user_ptr[0]; + unsigned int port_index; + + if (!devlink->ops->port_del) + return -EOPNOTSUPP; + + if (!info->attrs[DEVLINK_ATTR_PORT_INDEX]) { + NL_SET_ERR_MSG_MOD(extack, "Port index is not specified"); + return -EINVAL; + } + port_index = nla_get_u32(info->attrs[DEVLINK_ATTR_PORT_INDEX]); + + return devlink->ops->port_del(devlink, port_index, extack); +} + static int devlink_nl_sb_fill(struct sk_buff *msg, struct devlink *devlink, struct devlink_sb *devlink_sb, enum devlink_command cmd, u32 portid, @@ -7594,6 +7798,10 @@ static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = { [DEVLINK_ATTR_RELOAD_ACTION] = NLA_POLICY_RANGE(NLA_U8, DEVLINK_RELOAD_ACTION_DRIVER_REINIT, DEVLINK_RELOAD_ACTION_MAX), [DEVLINK_ATTR_RELOAD_LIMITS] = NLA_POLICY_BITFIELD32(DEVLINK_RELOAD_LIMITS_VALID_MASK), + [DEVLINK_ATTR_PORT_FLAVOUR] = { .type = NLA_U16 }, + [DEVLINK_ATTR_PORT_PCI_PF_NUMBER] = { .type = NLA_U16 }, + [DEVLINK_ATTR_PORT_PCI_SF_NUMBER] = { .type = NLA_U32 }, + [DEVLINK_ATTR_PORT_CONTROLLER_NUMBER] = { .type = NLA_U32 }, }; static const struct genl_small_ops devlink_nl_ops[] = { @@ -7634,6 +7842,18 @@ static const struct genl_small_ops devlink_nl_ops[] = { .internal_flags = DEVLINK_NL_FLAG_NO_LOCK, }, { + .cmd = DEVLINK_CMD_PORT_NEW, + .doit = devlink_nl_cmd_port_new_doit, + .flags = GENL_ADMIN_PERM, + .internal_flags = DEVLINK_NL_FLAG_NO_LOCK, + }, + { + .cmd = DEVLINK_CMD_PORT_DEL, + .doit = devlink_nl_cmd_port_del_doit, + .flags = GENL_ADMIN_PERM, + .internal_flags = DEVLINK_NL_FLAG_NO_LOCK, + }, + { .cmd = DEVLINK_CMD_SB_GET, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = devlink_nl_cmd_sb_get_doit, @@ -8372,6 +8592,32 @@ void devlink_port_attrs_pci_vf_set(struct devlink_port *devlink_port, u32 contro } EXPORT_SYMBOL_GPL(devlink_port_attrs_pci_vf_set); +/** + * devlink_port_attrs_pci_sf_set - Set PCI SF port attributes + * + * @devlink_port: devlink port + * @controller: associated controller number for the devlink port instance + * @pf: associated PF for the devlink port instance + * @sf: associated SF of a PF for the devlink port instance + */ +void devlink_port_attrs_pci_sf_set(struct devlink_port *devlink_port, u32 controller, + u16 pf, u32 sf) +{ + struct devlink_port_attrs *attrs = &devlink_port->attrs; + int ret; + + if (WARN_ON(devlink_port->registered)) + return; + ret = __devlink_port_attrs_set(devlink_port, + DEVLINK_PORT_FLAVOUR_PCI_SF); + if (ret) + return; + attrs->pci_sf.controller = controller; + attrs->pci_sf.pf = pf; + attrs->pci_sf.sf = sf; +} +EXPORT_SYMBOL_GPL(devlink_port_attrs_pci_sf_set); + static int __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, char *name, size_t len) { @@ -8420,6 +8666,10 @@ static int __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, n = snprintf(name, len, "pf%uvf%u", attrs->pci_vf.pf, attrs->pci_vf.vf); break; + case DEVLINK_PORT_FLAVOUR_PCI_SF: + n = snprintf(name, len, "pf%usf%u", attrs->pci_sf.pf, + attrs->pci_sf.sf); + break; } if (n >= len) |