Commit message [Author, Age, Files, Lines]
* openvswitch: Optimize updating for OvS flow_stats. [Tonghao Zhang, 2017-07-19, 1 file, -2/+1]
| In ovs_flow_stats_update(), the node variable is used only to allocate the flow_stats struct. That is not the common case, so it is unnecessary to call numa_node_id() every time. This patch is not a bugfix, but there may be a small performance increase.
| Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
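The idea in the commit above can be modelled outside the kernel. The following is a minimal Python sketch, not the OvS code: `lookup_node()` is a hypothetical stand-in for numa_node_id(), and the dictionary layout is invented for illustration. It only shows that moving the lookup into the rare allocation branch reduces the number of lookups.

```python
# Illustrative model only; names here are hypothetical, not kernel APIs.
node_lookups = 0


def lookup_node():
    """Stand-in for numa_node_id(): counts how often it is called."""
    global node_lookups
    node_lookups += 1
    return 0


def update_stats(stats, pkt_len):
    """Update per-flow counters; resolve the node only when allocating."""
    entry = stats.get("entry")
    if entry is None:
        # Rare slow path: the only place the node id is needed.
        node = lookup_node()
        entry = stats["entry"] = {"node": node, "packets": 0, "bytes": 0}
    entry["packets"] += 1
    entry["bytes"] += pkt_len


stats = {}
for length in (64, 128, 256):
    update_stats(stats, length)
```

After three updates the lookup has run once, on the first (allocating) call.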
* Merge branch 'liquidio-lowmem-fixes' [David S. Miller, 2017-07-19, 2 files, -6/+8]
|\ Rick Farrington says:
| | ====================
| | liquidio: avoid vm low memory crashes
| | This patchset addresses issues brought about by low memory conditions in a VM. These conditions were not seen when the driver was exercised normally. Rather, they were brought about through manual fault injection. They are being included in the interest of hardening the driver against unforeseen circumstances.
| | 1. Fix GPF in octeon_init_droq(); zero the allocated block 'recv_buf_list'. This prevents a GPF trying to access an invalid 'recv_buf_list[i]' entry in octeon_droq_destroy_ring_buffers() if init didn't alloc all entries.
| | 2. Don't dereference a NULL ptr in octeon_droq_destroy_ring_buffers().
| | 3. For defensive programming, zero the allocated block 'oct->droq' in octeon_setup_output_queues() and 'oct->instr_queue' in octeon_setup_instr_queues().
| | change log: V1 -> V2:
| | 1. Corrected syntax in 'Subject' lines; no functional or code changes.
| | ====================
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * liquidio: lowmem: init allocated memory to 0 [Rick Farrington, 2017-07-19, 1 file, -4/+4]
| | For defensive programming, zero the allocated block 'oct->droq[0]' in octeon_setup_output_queues() and 'oct->instr_queue[0]' in octeon_setup_instr_queues().
| | Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
| | Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
| | Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
| | Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * liquidio: lowmem: do not dereference null ptr [Rick Farrington, 2017-07-19, 1 file, -0/+2]
| | Don't dereference a NULL ptr in octeon_droq_destroy_ring_buffers().
| | Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
| | Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
| | Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
| | Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * liquidio: lowmem: init allocated memory to 0 [Rick Farrington, 2017-07-19, 1 file, -2/+2]
|/ Fix GPF in octeon_init_droq(); zero the allocated block 'recv_buf_list'. This prevents a GPF trying to access an invalid 'recv_buf_list[i]' entry in octeon_droq_destroy_ring_buffers() if init didn't alloc all entries.
| Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
| Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
| Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
| Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
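The two fixes in this series work together: a zeroed block makes "never allocated" distinguishable, and the teardown path then skips such entries. A rough Python analogue follows, assuming `None` plays the role of a zeroed (NULL) slot; `init_ring`, `destroy_ring` and `flaky_alloc` are hypothetical names, not driver functions.

```python
def init_ring(size, alloc):
    """Pre-fill every slot with None (the analogue of zeroing the block),
    then allocate buffers; stop at the first allocation failure."""
    ring = [None] * size
    for i in range(size):
        buf = alloc(i)
        if buf is None:
            return ring, False  # partially initialised ring
        ring[i] = buf
    return ring, True


def destroy_ring(ring):
    """Release buffers, skipping slots that were never filled."""
    freed = 0
    for i, buf in enumerate(ring):
        if buf is None:  # the NULL check added by the second patch
            continue
        ring[i] = None
        freed += 1
    return freed


def flaky_alloc(i):
    """Simulated fault injection: allocation fails from the fourth slot on."""
    return bytearray(16) if i < 3 else None


ring, ok = init_ring(8, flaky_alloc)
freed = destroy_ring(ring)
```

Without the pre-fill, the slots past the failure point would hold arbitrary values, which is the situation the commit messages describe as leading to a GPF.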
* liquidio: support new firmware statistic fw_err_pki [Rick Farrington, 2017-07-19, 2 files, -0/+5]
| Added support for new firmware statistic 'tx_err_pki'.
| Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
| Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
| Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'net-attribute_group-const' [David S. Miller, 2017-07-18, 10 files, -11/+15]
|\ Arvind Yadav says:
| | ====================
| | constify net attribute_group structures.
| | attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const.
| | ====================
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: chelsio: cxgb3: constify attribute_group structures. [Arvind Yadav, 2017-07-18, 1 file, -2/+6]
| | attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const.
| | File size before:
| |    text    data     bss     dec     hex filename
| |   28720     985      12   29717    7415 net/.../cxgb3/cxgb3_main.o
| | File size After adding 'const':
| |    text    data     bss     dec     hex filename
| |   28848     857      12   29717    7415 net/.../cxgb3/cxgb3_main.o
| | Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: bonding: constify attribute_group structures. [Arvind Yadav, 2017-07-18, 1 file, -1/+1]
| | attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/netdevice.h> work with const attribute_group. So mark the non-const structs as const.
| | File size before:
| |    text    data     bss     dec     hex filename
| |    4512    1472       0    5984    1760 drivers/net/bonding/bond_sysfs.o
| | File size After adding 'const':
| |    text    data     bss     dec     hex filename
| |    4576    1408       0    5984    1760 drivers/net/bonding/bond_sysfs.o
| | Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * arcnet: com20020-pci: constify attribute_group structures. [Arvind Yadav, 2017-07-18, 1 file, -1/+1]
| | attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/netdevice.h> work with const attribute_group. So mark the non-const structs as const.
| | File size before:
| |    text    data     bss     dec     hex filename
| |    3409     948      28    4385    1121 drivers/net/arcnet/com20020-pci.o
| | File size After adding 'const':
| |    text    data     bss     dec     hex filename
| |    3473     884      28    4385    1121 drivers/net/arcnet/com20020-pci.o
| | Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * wireless: iwlegacy: Constify attribute_group structures. [Arvind Yadav, 2017-07-18, 1 file, -1/+1]
| | attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const.
| | Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * wireless: iwlegacy: constify attribute_group structures. [Arvind Yadav, 2017-07-18, 1 file, -1/+1]
| | attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const.
| | Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * wireless: ipw2100: constify attribute_group structures. [Arvind Yadav, 2017-07-18, 1 file, -1/+1]
| | attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const.
| | Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * wireless: ipw2200: constify attribute_group structures. [Arvind Yadav, 2017-07-18, 1 file, -1/+1]
| | attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const.
| | Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: can: janz-ican3: constify attribute_group structures. [Arvind Yadav, 2017-07-18, 1 file, -1/+1]
| | attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/netdevice.h> work with const attribute_group. So mark the non-const structs as const.
| | File size before:
| |    text    data     bss     dec     hex filename
| |   11800     368       0   12168    2f88 drivers/net/can/janz-ican3.o
| | File size After adding 'const':
| |    text    data     bss     dec     hex filename
| |   11864     304       0   12168    2f88 drivers/net/can/janz-ican3.o
| | Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: can: at91_can: constify attribute_group structures. [Arvind Yadav, 2017-07-18, 1 file, -1/+1]
| | attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/netdevice.h> work with const attribute_group. So mark the non-const structs as const.
| | File size before:
| |    text    data     bss     dec     hex filename
| |    6164     304       0    6468    1944 drivers/net/can/at91_can.o
| | File size After adding 'const':
| |    text    data     bss     dec     hex filename
| |    6228     240       0    6468    1944 drivers/net/can/at91_can.o
| | Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: cdc_ncm: constify attribute_group structures. [Arvind Yadav, 2017-07-18, 1 file, -1/+1]
|/ attribute_group are not supposed to change at runtime. All functions working with attribute_group provided by <linux/netdevice.h> work with const attribute_group. So mark the non-const structs as const.
| File size before:
|    text    data     bss     dec     hex filename
|   13275     928       1   14204    377c drivers/net/usb/cdc_ncm.o
| File size After adding 'const':
|    text    data     bss     dec     hex filename
|   13339     864       1   14204    377c drivers/net/usb/cdc_ncm.o
| Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
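The size tables in this series all show the same pattern: the total (dec) is unchanged while bytes move from the data column to the text column, which is consistent with read-only data being counted under text in this output format. The Python sketch below just re-checks the arithmetic using the numbers quoted in the commit messages; the helper name `bytes_moved` is made up for this illustration.

```python
# (text, data, bss, dec) before and after, copied from the commit messages.
SIZES = {
    "cxgb3_main.o": ((28720, 985, 12, 29717), (28848, 857, 12, 29717)),
    "bond_sysfs.o": ((4512, 1472, 0, 5984), (4576, 1408, 0, 5984)),
    "com20020-pci.o": ((3409, 948, 28, 4385), (3473, 884, 28, 4385)),
    "janz-ican3.o": ((11800, 368, 0, 12168), (11864, 304, 0, 12168)),
    "at91_can.o": ((6164, 304, 0, 6468), (6228, 240, 0, 6468)),
    "cdc_ncm.o": ((13275, 928, 1, 14204), (13339, 864, 1, 14204)),
}


def bytes_moved(before, after):
    """Return how many bytes moved from data to text, checking the totals."""
    for text, data, bss, dec in (before, after):
        if text + data + bss != dec:
            raise ValueError("columns do not add up to dec")
    if before[3] != after[3]:
        raise ValueError("total size changed")
    gained_text = after[0] - before[0]
    lost_data = before[1] - after[1]
    if gained_text != lost_data:
        raise ValueError("text gain does not match data loss")
    return gained_text


moved = {name: bytes_moved(b, a) for name, (b, a) in SIZES.items()}
```

For these six objects the shift is 128 bytes for cxgb3_main.o and 64 bytes for each of the others.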
* Merge branch 'mlxsw-Preparations-for-IPv6-UC-router' [David S. Miller, 2017-07-18, 5 files, -257/+622]
|\ Jiri Pirko says:
| | ====================
| | mlxsw: Preparations for IPv6 UC router
| | Ido says:
| | The purpose of this set is to prepare the driver for the introduction of IPv6 FIB offload. It's mainly composed of small and non-functional changes that either add the IPv6 equivalent of existing IPv4 code or are aimed at making the introduction of IPv6-specific code easier.
| | The first five patches enable IPv6 forwarding in the device and allow us to configure router interfaces (RIFs) based on inet6addr notifications.
| | The next six patches add support for programming IPv6 neighbours into the device's table as well as dumping their activity and updating the kernel accordingly.
| | The last 11 patches extend current infrastructure to allow us to program IPv6 routes, set catch-all IPv6 trap in case of abort and make the code more receptive towards upcoming changes.
| | ====================
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Update prefix count for IPv6 [Ido Schimmel, 2017-07-18, 1 file, -1/+1]
| | The number of possible prefix lengths for IPv6 is 129 and not 128. Fixes following warning from UBSAN when /128 routes are offloaded:
| | UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:2510:27
| | index 128 is out of range for type 'long unsigned int [128]'
| | Fixes: 5e9c16cc83a7 ("mlxsw: spectrum_router: Implement private fib")
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
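The off-by-one is easy to reproduce: prefix lengths run from /0 to /128 inclusive, so a per-length table needs 129 slots. A minimal Python illustration, assuming a plain list as the per-length usage table (the driver's actual data structure is not shown in this log):

```python
import ipaddress

IPV6_BITS = 128
PREFIX_COUNT = IPV6_BITS + 1  # lengths 0..128 inclusive


def mark_prefix_used(usage, network):
    """Count one route under its prefix length."""
    plen = ipaddress.ip_network(network).prefixlen
    usage[plen] += 1


usage = [0] * PREFIX_COUNT
mark_prefix_used(usage, "2001:db8::1/128")
mark_prefix_used(usage, "::/0")

# A table sized by the bit width alone cannot hold a /128 entry.
too_small = [0] * IPV6_BITS
try:
    mark_prefix_used(too_small, "2001:db8::1/128")
    overflowed = False
except IndexError:
    overflowed = True
```

Python raises IndexError where C silently indexes past the array, which is what UBSAN flagged in the report quoted above.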
| * mlxsw: spectrum_router: Rename functions to add / delete a FIB entry [Ido Schimmel, 2017-07-18, 1 file, -8/+8]
| | These functions aren't specific to IPv4 and can be re-used for IPv6. Drop the '4' designation from their name.
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Drop unnecessary parameter [Ido Schimmel, 2017-07-18, 1 file, -19/+13]
| | Functions that take as argument a FIB entry don't need to take FIB node as well, as it can be extracted from the entry. Remove unnecessary FIB node parameter.
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Mark IPv4 specific function accordingly [Ido Schimmel, 2017-07-18, 1 file, -29/+29]
| | The functions to create and destroy a nexthop group are IPv4 specific and should be renamed accordingly, so that they won't be confused with the IPv6 specific functions in follow-up patches.
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Create IPv4 specific entry struct [Ido Schimmel, 2017-07-18, 1 file, -100/+123]
| | Some of the parameters stored in the FIB entry structure are specific to IPv4 and therefore better placed in an IPv4 specific structure. Create an IPv4 specific structure that encapsulates the common FIB entry structure and contains IPv4 specific parameters. In a follow-up patchset an IPv6 specific structure will be introduced.
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Set abort trap for IPv6 [Ido Schimmel, 2017-07-18, 1 file, -13/+26]
| | When we fail to insert a route we invoke the abort mechanism which flushes all the tables and inserts a default route in each, so that all packets incoming to the router will be trapped to the CPU. Upon abort, add an IPv6 default route to the IPv6 tables.
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
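The abort behaviour described above (flush every table, then install a catch-all route that traps to the CPU) can be sketched as follows. This is a toy model in Python; the table layout, the `"trap"` action string and the function name are assumptions made for illustration only.

```python
TRAP_TO_CPU = "trap"

# Default (catch-all) prefix per address family.
DEFAULT_ROUTE = {"ipv4": "0.0.0.0/0", "ipv6": "::/0"}


def abort(tables):
    """Flush all tables and leave only a default route that traps to the CPU.

    tables maps (virtual_router_id, proto) -> {prefix: action}.
    """
    for (_vr, proto), routes in tables.items():
        routes.clear()
        routes[DEFAULT_ROUTE[proto]] = TRAP_TO_CPU


tables = {
    (0, "ipv4"): {"10.0.0.0/8": "forward"},
    (0, "ipv6"): {"2001:db8::/32": "forward"},
    (1, "ipv4"): {},
}
abort(tables)
```

After the call every table, including the IPv6 one, holds exactly one entry, so any lookup matches the default route and is handed to software.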
| * mlxsw: spectrum_router: Allow IPv6 routes to be programmed [Ido Schimmel, 2017-07-18, 1 file, -41/+46]
| | Take advantage of previous patch and allow the RALUE register to be called with IPv6 routes. In order to re-use as much code as possible between IPv4 and IPv6, only the lowest-level function that actually does the register packing is demuxed based on the passed protocol.
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
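The design choice here is to keep callers protocol-agnostic and branch on the address family in one place only, at the packing step. A small Python sketch of that shape follows; the 16-byte field layout and both function names are hypothetical and do not describe the real RALUE register format.

```python
import ipaddress


def pack_dip(network):
    """Lowest-level packing step: the only place that branches on protocol.

    Assumed layout for illustration: a 16-byte field, IPv4 right-aligned.
    """
    if network.version == 4:
        return network.network_address.packed.rjust(16, b"\x00")
    if network.version == 6:
        return network.network_address.packed
    raise ValueError("unsupported protocol")


def write_route(table, prefix, action):
    """Protocol-agnostic caller shared by both address families."""
    network = ipaddress.ip_network(prefix)
    table[(network.prefixlen, pack_dip(network))] = action


table = {}
write_route(table, "192.0.2.0/24", "forward")
write_route(table, "2001:db8::/32", "forward")
```

Everything above `pack_dip` is identical for IPv4 and IPv6, which is the code re-use the commit message is after.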
| * mlxsw: reg: Update RALUE register with IPv6 support [Ido Schimmel, 2017-07-18, 1 file, -0/+11]
| | Update the register so that IPv6 LPM entries could be programmed to the device's table.
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Extend virtual routers with IPv6 support [Ido Schimmel, 2017-07-18, 1 file, -3/+25]
| | A Virtual Router (VR) is an entity which corresponds to a VRF and performs FIB lookup in an LPM tree according to the {VR, IP Proto} -> Tree binding. Extend the virtual router data structure towards IPv6 FIB offload.
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Make FIB node retrieval family agnostic [Ido Schimmel, 2017-07-18, 1 file, -17/+17]
| | A FIB node is an entity which stores routes sharing the same prefix and length. The data structure itself is already family agnostic, but we make some of its operations agnostic as well and thus re-use them for IPv6 offload. Instead of passing an IPv4-specific structure to fib4_node_get(), pass general routing parameters and rename the function accordingly.
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Don't create FIB node during lookup [Ido Schimmel, 2017-07-18, 1 file, -4/+13]
| | When looking up a FIB entry we shouldn't create the FIB node where it's supposed to be linked in case the node doesn't already exist. Instead, lookup the node and fail if it doesn't exist.
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
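The three commits above describe a {VR, IP Proto} -> tree binding, FIB nodes keyed by prefix and length, and a lookup that must not create nodes as a side effect. The Python sketch below models those ideas with plain dictionaries; `fib_node_get`, `fib_node_lookup` and `lpm_lookup` are names chosen for this illustration and the linear scan is a simplification, not how an LPM tree works in hardware.

```python
import ipaddress

# (vr_id, ip_version) -> {network: [routes]}; one "tree" per binding.
trees = {}


def fib_node_get(vr_id, prefix):
    """Get-or-create: used when adding a route."""
    net = ipaddress.ip_network(prefix)
    tree = trees.setdefault((vr_id, net.version), {})
    return tree.setdefault(net, [])


def fib_node_lookup(vr_id, prefix):
    """Pure lookup: returns None and creates nothing if the node is absent."""
    net = ipaddress.ip_network(prefix)
    return trees.get((vr_id, net.version), {}).get(net)


def lpm_lookup(vr_id, address):
    """Longest-prefix match within the tree bound to (vr_id, family)."""
    addr = ipaddress.ip_address(address)
    tree = trees.get((vr_id, addr.version), {})
    matches = [net for net in tree if addr in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)
```

Keeping the lookup free of side effects means a failed search leaves the table exactly as it was, while routes that share a prefix and length land in the same node regardless of family.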
| * mlxsw: spectrum_router: Don't assume neighbour type [Ido Schimmel, 2017-07-18, 1 file, -3/+8]
| | Thankfully, the neighbour subsystem is agnostic to the upper protocol and used by both IPv4 and IPv6. By removing assumptions regarding the neighbour type we can thus re-use much of the neighbour-related code for both IPv4 and IPv6. For each nexthop, store its gateway IP and for nexthop group store the neighbour table used by its nexthops. Use this information throughout the code and remove assumption about the neighbour type.
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Set activity interval according to both neighbour tables [Arkadi Sharshevsky, 2017-07-18, 1 file, -2/+5]
| | The neighbours' activity is currently dumped according to the ARP table's DELAY_PROBE time, but with the introduction of IPv6 offload we should set the interval according to the minimum between the ARP and ndisc tables.
| | Signed-off-by: Arkadi Sharshvesky <arkadis@mellanox.com>
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Periodically dump active IPv6 neighbours [Arkadi Sharshevsky, 2017-07-18, 1 file, -10/+69]
| | In addition to IPv4, periodically dump IPv6 neighbours and update the kernel about them.
| | Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: reg: Update RAUHTD register with IPv6 support [Arkadi Sharshevsky, 2017-07-18, 1 file, -0/+32]
| | Update the register so that the active IPv6 neighbours could be dumped from the device's neighbour table.
| | Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Reflect IPv6 neighbours to the device [Arkadi Sharshevsky, 2017-07-18, 1 file, -3/+37]
| | As with IPv4, listen to NEIGH_UPDATE events from the ndisc table and program relevant neighbours to the device's neighbour table. Note that neighbours with a link-local IP address aren't programmed, as packets with a link-local destination IP are trapped after LPM lookup and never reach the neighbour table.
| | Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
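The link-local rule from the last commit can be expressed as a simple filter. This is a hedged Python sketch: `should_program_neighbour` is a hypothetical helper, and applying the rule only to IPv6 is an assumption based on the commit message, which speaks about the ndisc table.

```python
import ipaddress


def should_program_neighbour(address):
    """Skip IPv6 link-local neighbours (fe80::/10); program everything else.

    Packets to link-local destinations are trapped after the LPM lookup,
    so such entries would never be consulted in the neighbour table.
    """
    addr = ipaddress.ip_address(address)
    return not (addr.version == 6 and addr.is_link_local)


candidates = ["fe80::1", "2001:db8::1", "192.0.2.1"]
programmed = [a for a in candidates if should_program_neighbour(a)]
```

Only the global IPv6 address and the IPv4 address pass the filter in this example.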
| * mlxsw: reg: Update RAUHT register with IPv6 support [Arkadi Sharshevsky, 2017-07-18, 1 file, -0/+10]
| | Update the register, so the IPv6 neighbours could be programmed to the device's neighbour table.
| | Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Configure RIFs based on IPv6 addresses [Arkadi Sharshevsky, 2017-07-18, 3 files, -5/+84]
| | When a netdev is configured with an IP address a router interface (RIF) should be configured for it in the device. Allow configuration of RIFs based on IPv6 address notifications as well as IPv4. Note that the RIF exists as long as an IP address is configured on the netdev, regardless of the address family.
| | Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
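The RIF lifetime rule (the RIF exists while any address of either family remains) can be modelled as below. This is a Python sketch under stated assumptions: the `Netdev` class, the `"up"`/`"down"` event strings and the `"rif"` placeholder are invented for illustration and do not mirror the driver's notifier code.

```python
class Netdev:
    """Toy netdev: tracks configured addresses and whether a RIF exists."""

    def __init__(self):
        self.addresses = set()
        self.rif = None


def address_event(dev, event, address):
    """Handle an address notification for either address family."""
    if event == "up":
        dev.addresses.add(address)
        if dev.rif is None:
            dev.rif = "rif"  # created on the first address, any family
    elif event == "down":
        dev.addresses.discard(address)
        if not dev.addresses:
            dev.rif = None  # destroyed only when the last address is gone


dev = Netdev()
address_event(dev, "up", "192.0.2.1")
address_event(dev, "up", "2001:db8::1")
address_event(dev, "down", "192.0.2.1")
rif_after_ipv4_removed = dev.rif
address_event(dev, "down", "2001:db8::1")
```

Removing the IPv4 address alone leaves the RIF in place because the IPv6 address still references it.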
| * mlxsw: spectrum_router: Flood unregistered multicast packets to router [Ido Schimmel, 2017-07-18, 1 file, -0/+20]
| | Up until now we only flooded broadcast packets to the router when an L3 interface was configured on top of a bridge. However, IPv6 Neighbour Discovery packets are trapped to the CPU inside the router and these can be sent with a multicast address. Flood unregistered multicast packets to the router port, so that relevant packets could be trapped there.
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum: Add support for IPv6 traps [Arkadi Sharshevsky, 2017-07-18, 3 files, -11/+53]
| | Before we can start using IPv6, we need to trap certain control packets to the CPU. Among others, these include Neighbour Discovery, DHCP and neighbour misses.
| | Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: reg: Enable IPv6 on router interfaces [Arkadi Sharshevsky, 2017-07-18, 1 file, -0/+2]
| | Enable IPv6 and IPv6 forwarding on router interfaces (RIFs), so that they will be able to receive and forward IPv6 traffic.
| | Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
| | Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| | Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Enable IPv6 router [Arkadi Sharshevsky, 2017-07-18, 2 files, -3/+5]
|/ Before we add IPv6 constructs like traps and router interfaces, we first need to enable IPv6 routing in the device.
| Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
| Signed-off-by: Ido Schimmel <idosch@mellanox.com>
| Signed-off-by: Jiri Pirko <jiri@mellanox.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'xfrm-remove-flow-cache' [David S. Miller, 2017-07-18, 21 files, -957/+173]
|\ Florian Westphal says:
| | ====================
| | xfrm: remove flow cache
| | After RCU-ification of ipsec packet path there are no major scalability issues anymore without flow cache. We still incur a performance hit, which comes mostly from the extra xfrm dst allocation/freeing. The last patch in the series adds a simple percpu cache to avoid the extra allocation if a packet matched the same policies as last one.
| | The main concern with this is that we will see performance drops, especially with large numbers of policies/SAs. However, during hallway discussions at nfws 2017 it seemed the issues with flow caching outweigh the removal downsides, and that it might be best to just 'remove it' and see where the practical issues (if any) will appear.
| | It should now be possible to also remove the genid member in the policies as we don't hold bundles for prolonged time anymore, but I think this change is controversial (and intrusive) enough as-is, so defer that to a later point in time.
| | Changes since last rfc:
| | - fix build failures due to implicit interrupt.h includes
| | - rework last patch (pcpu cache):
| |   * avoid xchg()
| |   * check policies for walk.dead = 1 instead of more costly bundle_ok().
| |   * flush pcpu bundles when sa/policies get removed, to allow module references to go away (suggested by Ilan Tayari)
| | ====================
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm: add xdst pcpu cache [Florian Westphal, 2017-07-18, 4 files, -3/+132]
| | retain last used xfrm_dst in a pcpu cache. On next request, reuse this dst if the policies are the same. The cache will not help with strict RR workloads as there is no hit.
| | The cache packet-path part is reasonably small, the notifier part is needed so we do not add long hangs when a device is dismantled but some pcpu xdst still holds a reference, there are also calls to the flush operation when userspace deletes SAs so modules can be removed (there is no hit.
| | We need to run the dst_release on the correct cpu to avoid races with packet path. This is done by adding a work_struct for each cpu and then doing the actual test/release on each affected cpu via schedule_work_on().
| | Test results using 4 network namespaces and null encryption:
| |     ns1            ns2            ->  ns3            ->  ns4
| | netperf  ->  xfrm/null enc  ->  xfrm/null dec  ->  netserver
| | what            TCP_STREAM  UDP_STREAM  UDP_RR
| | Flow cache:     14644.61    294.35      327231.64
| | No flow cache:  14349.81    242.64      202301.72
| | Pcpu cache:     14629.70    292.21      205595.22
| | UDP tests used 64byte packets, tests ran for one minute each, value is average over ten iterations. 'Flow cache' is 'net-next', 'No flow cache' is net-next plus this series but without this patch.
| | Signed-off-by: Florian Westphal <fw@strlen.de>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
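The caching idea in this commit (one remembered entry per CPU, reused only when the policies match, flushed when state is removed) can be sketched in a few lines. The Python model below is illustrative only: the class, its fields and the tuple-of-strings policy representation are hypothetical, and the per-CPU work scheduling described in the commit message is not modelled.

```python
class PcpuXdstCache:
    """One-entry-per-CPU cache of the last bundle, keyed by its policies."""

    def __init__(self, ncpus):
        self.last = [None] * ncpus  # per-CPU (policies, xdst) or None
        self.allocs = 0

    def lookup(self, cpu, policies):
        cached = self.last[cpu]
        if cached is not None and cached[0] == policies:
            return cached[1]  # hit: same policies as the previous packet
        # Miss: allocate a new bundle and remember it for this CPU.
        self.allocs += 1
        xdst = {"policies": policies, "id": self.allocs}
        self.last[cpu] = (policies, xdst)
        return xdst

    def flush(self):
        """Drop all cached entries, e.g. when SAs or policies are removed."""
        self.last = [None] * len(self.last)


cache = PcpuXdstCache(ncpus=2)
first = cache.lookup(0, ("pol-a",))
second = cache.lookup(0, ("pol-a",))  # reuses the cached entry
```

A single remembered entry helps a steady stream that keeps matching the same policies; traffic that alternates between two policy sets on one CPU misses every time, since each lookup evicts the other entry.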
| * xfrm: remove flow cache [Florian Westphal, 2017-07-18, 13 files, -734/+2]
| | After rcu conversions performance degradation in forward tests isn't that noticeable anymore. See next patch for some numbers. A followup patch could then also remove genid from the policies as we do not cache bundles anymore.
| | Signed-off-by: Florian Westphal <fw@strlen.de>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm_policy: make xfrm_bundle_lookup return xfrm dst object [Florian Westphal, 2017-07-18, 1 file, -16/+12]
| | This allows to remove flow cache object embedded in struct xfrm_dst.
| | Signed-off-by: Florian Westphal <fw@strlen.de>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm_policy: remove xfrm_policy_lookup [Florian Westphal, 2017-07-18, 1 file, -32/+4]
| | This removes the wrapper and renames the __xfrm_policy_lookup variant to get rid of another place that used flow cache objects.
| | Signed-off-by: Florian Westphal <fw@strlen.de>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm_policy: kill flow to policy dir conversion [Florian Westphal, 2017-07-18, 1 file, -42/+4]
| | XFRM_POLICY_IN/OUT/FWD are identical to FLOW_DIR_*, so gcc already removed this function as it just returns the argument. Again, no code change.
| | Signed-off-by: Florian Westphal <fw@strlen.de>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm_policy: remove always true/false branches [Florian Westphal, 2017-07-18, 1 file, -60/+14]
| | after previous change oldflo and xdst are always NULL. These branches were already removed by gcc, this doesn't change code.
| | Signed-off-by: Florian Westphal <fw@strlen.de>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm_policy: bypass flow_cache_lookup [Florian Westphal, 2017-07-18, 1 file, -9/+5]
| | Instead of consulting flow cache, call the xfrm bundle/policy lookup functions directly. This pretends the flow cache had no entry. This helps to gradually remove flow cache integration, followup commit will remove the dead code that this change adds.
| | Signed-off-by: Florian Westphal <fw@strlen.de>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: xfrm: revert to lower xfrm dst gc limit [Florian Westphal, 2017-07-18, 3 files, -6/+4]
| | revert c386578f1cdb4dac230395 ("xfrm: Let the flowcache handle its size by default."). Once we remove flow cache, we don't have a flow cache limit anymore. We must not allow (virtually) unlimited allocations of xfrm dst entries. Revert back to the old xfrm dst gc limits.
| | Signed-off-by: Florian Westphal <fw@strlen.de>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * vti: revert flush x-netns xfrm cache when vti interface is removed [Florian Westphal, 2017-07-18, 2 files, -62/+0]
| | flow cache is removed in next commit.
| | Signed-off-by: Florian Westphal <fw@strlen.de>
| | Signed-off-by: David S. Miller <davem@davemloft.net>