aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* net: sched: Add the invalid handle check in qdisc_class_findGao Feng2017-08-211-0/+3
| | | | | | | | | Add the invalid handle "0" check to avoid unnecessary search, because the qdisc uses the skb->priority as the handle value to look up, and it is "0" usually. Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tipc: don't reset stale broadcast send linkJon Paul Maloy2017-08-214-45/+17
| | | | | | | | | | | | | | | | | | | | | | | | | When the broadcast send link after 100 attempts has failed to transfer a packet to all peers, we consider it stale, and reset it. Thereafter it needs to re-synchronize with the peers, something currently done by just resetting and re-establishing all links to all peers. This has turned out to be overkill, with potentially unwanted consequences for the remaining cluster. A closer analysis reveals that this can be done much simpler. When this kind of failure happens, for reasons that may lie outside the TIPC protocol, it is typically only one peer which is failing to receive and acknowledge packets. It is hence sufficient to identify and reset the links only to that peer to resolve the situation, without having to reset the broadcast link at all. This solution entails a much lower risk of negative consequences for the own node as well as for the overall cluster. We implement this change in this commit. Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: check type when freeing metadata dstDavid Lamparter2017-08-211-1/+2
| | | | | | | | | | | | | | | | | | | Commit 3fcece12bc1b ("net: store port/representator id in metadata_dst") added a new type field to metadata_dst, but metadata_dst_free() wasn't updated to check it before freeing the METADATA_IP_TUNNEL specific dst cache entry. This is not currently causing problems since it's far enough back in the struct to be zeroed for the only other type currently in existance (METADATA_HW_PORT_MUX), but nevertheless it's not correct. Fixes: 3fcece12bc1b ("net: store port/representator id in metadata_dst") Signed-off-by: David Lamparter <equinox@diac24.net> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Sridhar Samudrala <sridhar.samudrala@intel.com> Cc: Simon Horman <horms@verge.net.au> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ipv6: put host and anycast routes on device with addressDavid Ahern2017-08-213-56/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | One nagging difference between ipv4 and ipv6 is host routes for ipv6 addresses are installed using the loopback device or VRF / L3 Master device. e.g., 2001:db8:1::/120 dev veth0 proto kernel metric 256 pref medium local 2001:db8:1::1 dev lo table local proto kernel metric 0 pref medium Using the loopback device is convenient -- necessary for local tx, but has some nasty side effects, most notably setting the 'lo' device down causes all host routes for all local IPv6 address to be removed from the FIB and completely breaks IPv6 networking across all interfaces. This patch puts FIB entries for IPv6 routes against the device. This simplifies the routes in the FIB, for example by making dst->dev and rt6i_idev->dev the same (a future patch can look at removing the device reference taken for rt6i_idev for FIB entries). When copies are made on FIB lookups, the cloned route has dst->dev set to loopback (or the L3 master device). This is needed for the local Tx of packets to local addresses. With fib entries allocated against the real network device, the addrconf code that reinserts host routes on admin up of 'lo' is no longer needed. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* dsa: remove unused net_device arg from handlersFlorian Westphal2017-08-2110-21/+12
| | | | | | | | compile tested only, but saw no warnings/errors with allmodconfig build. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'bpf-mips-jit-improvements'David S. Miller2017-08-211-59/+103
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Daney says: ==================== MIPS,bpf: Improvements for MIPS eBPF JIT Here are several improvements and bug fixes for the MIPS eBPF JIT. The main change is the addition of support for JLT, JLE, JSLT and JSLE ops, that were recently added. Also fix WARN output when used with preemptable kernel, and a small cleanup/optimization in the use of BPF_OP(insn->code). I suggest that the whole thing go via the BPF/net-next path as there are dependencies on code that is not yet merged to Linus' tree. Still pending are changes to reduce stack usage when the verifier can determine the maximum stack size. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * MIPS,bpf: Cache value of BPF_OP(insn->code) in eBPF JIT.David Daney2017-08-211-33/+34
| | | | | | | | | | | | | | | | | | The code looks a little cleaner if we replace BPF_OP(insn->code) with the local variable bpf_op. Caching the value this way also saves 300 bytes (about 1%) in the code size of the JIT. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * MIPS, bpf: Implement JLT, JLE, JSLT and JSLE ops in the eBPF JIT.David Daney2017-08-211-29/+72
| | | | | | | | | | Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * MIPS,bpf: Fix using smp_processor_id() in preemptible splat.David Daney2017-08-211-14/+14
|/ | | | | | | | | | | | If the kernel is configured with preemption enabled we were getting warning stack traces for use of current_cpu_type(). Fix by moving the test between preempt_disable()/preempt_enable() and caching the results of the CPU type tests for use during code generation. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* qlogic: make device_attribute constBhumika Goyal2017-08-212-5/+5
| | | | | | | | | | Make these const as they are only passed as an argument to the function device_create_file and device_remove_file and the corresponding arguments are of type const. Done using Coccinelle Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2017-08-2116-37/+121
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2017-08-21 1) Support RX checksum with IPsec crypto offload for esp4/esp6. From Ilan Tayari. 2) Fixup IPv6 checksums when doing IPsec crypto offload. From Yossi Kuperman. 3) Auto load the xfrom offload modules if a user installs a SA that requests IPsec offload. From Ilan Tayari. 4) Clear RX offload informations in xfrm_input to not confuse the TX path with stale offload informations. From Ilan Tayari. 5) Allow IPsec GSO for local sockets if the crypto operation will be offloaded. 6) Support setting of an output mark to the xfrm_state. This mark can be used to to do the tunnel route lookup. From Lorenzo Colitti. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: xfrm: support setting an output mark.Lorenzo Colitti2017-08-118-20/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On systems that use mark-based routing it may be necessary for routing lookups to use marks in order for packets to be routed correctly. An example of such a system is Android, which uses socket marks to route packets via different networks. Currently, routing lookups in tunnel mode always use a mark of zero, making routing incorrect on such systems. This patch adds a new output_mark element to the xfrm state and a corresponding XFRMA_OUTPUT_MARK netlink attribute. The output mark differs from the existing xfrm mark in two ways: 1. The xfrm mark is used to match xfrm policies and states, while the xfrm output mark is used to set the mark (and influence the routing) of the packets emitted by those states. 2. The existing mark is constrained to be a subset of the bits of the originating socket or transformed packet, but the output mark is arbitrary and depends only on the state. The use of a separate mark provides additional flexibility. For example: - A packet subject to two transforms (e.g., transport mode inside tunnel mode) can have two different output marks applied to it, one for the transport mode SA and one for the tunnel mode SA. - On a system where socket marks determine routing, the packets emitted by an IPsec tunnel can be routed based on a mark that is determined by the tunnel, not by the marks of the unencrypted packets. - Support for setting the output marks can be introduced without breaking any existing setups that employ both mark-based routing and xfrm tunnel mode. Simply changing the code to use the xfrm mark for routing output packets could xfrm mark could change behaviour in a way that breaks these setups. If the output mark is unspecified or set to zero, the mark is not set or changed. Tested: make allyesconfig; make -j64 Tested: https://android-review.googlesource.com/452776 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * net: Allow IPsec GSO for local socketsSteffen Klassert2017-08-022-1/+20
| | | | | | | | | | | | | | This patch allows local sockets to make use of XFRM GSO code path. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com>
| * xfrm: Clear RX SKB secpath xfrm_offloadIlan Tayari2017-08-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an incoming packet undergoes XFRM crypto-offload, its secpath is filled with xfrm_offload struct denoting offload information. If the SKB is then forwarded to a device which supports crypto- offload, the stack wrongfully attempts to offload it (even though the output SA may not exist on the device) due to the leftover secpath xo. Clear the ingress xo by zeroizing secpath->olen just before delivering the decapsulated packet to the network stack. Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: Auto-load xfrm offload modulesIlan Tayari2017-08-026-7/+19
| | | | | | | | | | | | | | | | | | | | | | | | IPSec crypto offload depends on the protocol-specific offload module (such as esp_offload.ko). When the user installs an SA with crypto-offload, load the offload module automatically, in the same way that the protocol module is loaded (such as esp.ko) Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * esp6: Fix RX checksum after header pullYossi Kuperman2017-08-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both ip6_input_finish (non-GRO) and esp6_gro_receive (GRO) strip the IPv6 header without adjusting skb->csum accordingly. As a result CHECKSUM_COMPLETE breaks and "hw csum failure" is written to the kernel log by netdev_rx_csum_fault (dev.c). Fix skb->csum by substracting the checksum value of the pulled IPv6 header using a call to skb_postpull_rcsum. This affects both transport and tunnel modes. Note that the fix occurs far from the place that the header was pulled. This is based on existing code, see: ipv6_srh_rcv() in exthdrs.c and rawv6_rcv() in raw.c Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm6: Fix CHECKSUM_COMPLETE after IPv6 header pushYossi Kuperman2017-08-021-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xfrm6_transport_finish rebuilds the IPv6 header based on the original one and pushes it back without fixing skb->csum. Therefore, CHECKSUM_COMPLETE is no longer valid and the packet gets dropped. Fix skb->csum by calling skb_postpush_rcsum. Note: A valid IPv4 header has checksum 0, unlike IPv6. Thus, the change is not needed in the sibling xfrm4_transport_finish function. Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * esp6: Support RX checksum with crypto offloadIlan Tayari2017-08-022-4/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Keep the device's reported ip_summed indication in case crypto was offloaded by the device. Subtract the csum values of the stripped parts (esp header+iv, esp trailer+auth_data) to keep value correct. Note: CHECKSUM_COMPLETE should be indicated only if skb->csum has the post-decryption offload csum value. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * esp4: Support RX checksum with crypto offloadIlan Tayari2017-08-022-4/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Keep the device's reported ip_summed indication in case crypto was offloaded by the device. Subtract the csum values of the stripped parts (esp header+iv, esp trailer+auth_data) to keep value correct. Note: CHECKSUM_COMPLETE should be indicated only if skb->csum has the post-decryption offload csum value. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | liquidio: fix use of pf in pass-through mode in a virtual machineRick Farrington2017-08-202-5/+44
| | | | | | | | | | | | | | | | | | | | | | Fix problem when PF is used in pass-through mode in a VM (w/embedded f/w). If host error reading PF num from CN23XX_PCIE_SRIOV_FDL reg, try to retrieve PF num from SLI_PKT(0)_INPUT_CONTROL (initialized by f/w). Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: dsa: mv88e6xxx: make irq_chip constBhumika Goyal2017-08-202-2/+2
| | | | | | | | | | | | | | | | | | Make this const as it is only used in a copy operation. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: ibm: emac: Fix some error handling path in 'emac_probe()'Christophe Jaillet2017-08-202-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If 'irq_of_parse_and_map()' or 'of_address_to_resource()' fail, 'err' is known to be 0 at this point. So return -ENODEV instead in the first case and use 'of_iomap()' instead of the equivalent 'of_address_to_resource()/ioremap()' combinaison in the 2nd case. Doing so, the 'rsrc_regs' field of the 'emac_instance struct' becomes redundant and is removed. While at it, turn a 'err != 0' test into an equivalent 'err' to be more consistent. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
* | cxgb4/cxgbvf: Handle 32-bit fw port capabilitiesGanesh Goudar2017-08-208-356/+1220
| | | | | | | | | | | | | | | | | | | | | | Implement new 32-bit Firmware Port Capabilities in order to handle new speeds which couldn't be represented in the old 16-bit Firmware Port Capabilities values. Based on the original work of Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bpf: fix double free from dev_map_notification()Daniel Borkmann2017-08-201-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the current code, dev_map_free() can still race with dev_map_notification(). In dev_map_free(), we remove dtab from the list of dtabs after we purged all entries from it. However, we don't do xchg() with NULL or the like, so the entry at that point is still pointing to the device. If a unregister notification comes in at the same time, we therefore risk a double-free, since the pointer is still present in the map, and then pushed again to __dev_map_entry_free(). All this is completely unnecessary. Just remove the dtab from the list right before the synchronize_rcu(), so all outstanding readers from the notifier list have finished by then, thus we don't need to deal with this corner case anymore and also wouldn't need to nullify dev entires. This is fine because we iterate over the map releasing all entries and therefore dev references anyway. Fixes: 4cc7b9544b9a ("bpf: devmap fix mutex in rcu critical section") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge tag 'mlx5-updates-2017-08-17-V2' of ↵David S. Miller2017-08-2012-54/+267
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-08-17 Some updates for mlx5 ethernet and IPoIB device driver. Eran added the support for manage physical link state from netdevice upon interface open/close requests. Feras fixed the driver name showed in ethtool for IPoIB interfaces. Shalom Added the support for IPoIB netdevice ethtool get link settings. Gal and Eran exposed new diagnostic counters for outbound PCIe stalls and overflow and RX buffer fullness statistics. Code cleanups from Or Gerlitz. Variable types cleanup from Gal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5e: Use size_t to store byte offset in statistics descriptorsGal Pressman2017-08-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The byte offset of counter descriptors should be stored in size_t variable instead of an integer. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5e: Use kernel types instead of uint*_t in ethtool callbacksGal Pressman2017-08-202-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | Fix checkpatch errors: CHECK:PREFER_KERNEL_TYPES: Prefer kernel type 'u32' over 'uint32_t' Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5e: Place constants on the right side of comparisonsOr Gerlitz2017-08-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | To fix these checkpatch complaints: WARNING: Comparisons should place the constant on the right side of the test Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5e: Avoid using multiple blank linesOr Gerlitz2017-08-202-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | To fix these checkpatch complaints: CHECK: Please don't use multiple blank lines Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5e: Properly indent within conditional statementsOr Gerlitz2017-08-201-14/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To fix these checkpatch complaints: WARNING: suspect code indent for conditional statements (8, 24) + if (eth_proto & (MLX5E_PROT_MASK(MLX5E_10GBASE_SR) [...] + return PORT_FIBRE; Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5: Add a blank line after declarationsOr Gerlitz2017-08-202-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | To fix these checkpatch complaints: WARNING: Missing a blank line after declarations Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5: Avoid blank lines after/before open/close braceOr Gerlitz2017-08-202-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To fix these checkpatch complaints: CHECK: Blank lines aren't necessary after an open brace '{' CHECK: Blank lines aren't necessary before a close brace '}' Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5e: Add outbound PCI buffer overflow counterEran Ben Elisha2017-08-203-4/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add outbound_pci_buffer_overflow to ethtool output for monitoring the number of packets that were dropped due to lack of PCIe buffers on receive path from NIC port toward the host(s). This counter is valid only in case that tx_overflow_buffer_pkt is supported in MCAM enhanced features. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5e: Add RX buffer fullness countersGal Pressman2017-08-203-1/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rx_buffer_passed_thres_phy - The number of events where the port RX buffer has passed a fullness threshold. rx_buffer_full_phy - The number of events where the port RX buffer has reached 100% fullness. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5: Add RX buffer fullness counters infrastructureGal Pressman2017-08-201-2/+13
| | | | | | | | | | | | | | | | | | | | | Add capability bit in PCAM register and counters to PPCNT register. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5e: Add PCIe outbound stalls countersGal Pressman2017-08-202-1/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | outbound_pci_stalled_rd - The percentage of time within the last second that the NIC had outbound non-posted read requests but could not perform the operation due to insufficient non-posted credits. outbound_pci_stalled_wr - The percentage of time within the last second that the NIC had outbound posted writes requests but could not perform the operation due to insufficient posted credits. outbound_pci_stalled_rd_events - The number of events where outbound_pci_stalled_rd was above the threshold. outbound_pci_stalled_wr_events - The number of events where outbound_pci_stalled_wr was above the threshold. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5: Add PCIe outbound stalls counters infrastructureGal Pressman2017-08-201-3/+14
| | | | | | | | | | | | | | | | | | | | | Add capability bit in MCAM register and counters to MPCNT register. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5e: IPoIB, Add support for get_link_ksettings in ethtoolShalom Lagziel2017-08-201-12/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for "ethtool DEVNAME" over ipoib ports, Display standard port information for IPoIB netdevices using ethtool For example: $ ethtool ib2 > Settings for ib2: Supported ports: [ ] Supported link modes: Not reported Supported pause frame use: No Supports auto-negotiation: No Advertised link modes: Not reported Advertised pause frame use: No Advertised auto-negotiation: No Speed: 100000Mb/s Duplex: Full Port: Other PHYAD: 0 Transceiver: internal Auto-negotiation: off Link detected: yes Signed-off-by: Shalom Lagziel <shaloml@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5e: IPoIB, Fix driver name retrieved by ethtoolFeras Daoud2017-08-201-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Printing an enhanced IPoIB device information using "ethtool -i DEVNAME", prints the low level driver name: mlx5_core. This commit changes the name to mlx5_core [ib_ipoib], to include the ipoib device driver infromation. Fixes: 076b0936e5fb ("net/mlx5e: IPoIB, Add ethtool support") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | net/mlx5e: Send PAOS command on interface up/downEran Ben Elisha2017-08-202-7/+20
|/ / | | | | | | | | | | | | | | | | | | | | | | | | Upon interface up/down, driver will send PAOS (Ports Administrative and Operational Status Register) in order to inform the Firmware on the desired status of the port by the driver. Since now we might change physical link status on mlx5e_open/close, logical VF representor should not use mlx5e_open/close ndos as is, and should call the logical version mlx5e_open/closed_locked. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | bpf: linux/bpf.h needs linux/numa.hDavid S. Miller2017-08-191-0/+1
| | | | | | | | | | Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'BPF-inline-improvements'David S. Miller2017-08-193-1/+48
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== BPF inline improvements First one makes htab inlining more robust wrt future jits and second one inlines map in map lookups through map_gen_lookup() callback. v1 -> v2: - BITS_PER_LONG guard in patch 1 - BPF_EMIT_CALL is on __htab_map_lookup_elem ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: inline map in map lookup functions for array and htabDaniel Borkmann2017-08-192-0/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid two successive functions calls for the map in map lookup, first is the bpf_map_lookup_elem() helper call, and second the callback via map->ops->map_lookup_elem() to get to the map in map implementation. Implementation inlines array and htab flavor for map in map lookups. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: make htab inlining more robust wrt assumptionsDaniel Borkmann2017-08-191-1/+5
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 9015d2f59535 ("bpf: inline htab_map_lookup_elem()") was making the assumption that a direct call emission to the function __htab_map_lookup_elem() will always work out for JITs. This is currently true since all JITs we have are for 64 bit archs, but in case of 32 bit JITs like upcoming arm32, we get a NULL pointer dereference when executing the call to __htab_map_lookup_elem() since passed arguments are of a different size (due to pointer args) than what we do out of BPF. Guard and thus limit this for now for the current 64 bit JITs only. Reported-by: Shubham Bansal <illusionist.neo@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'bpf-Allow-selecting-numa-node-during-map-creation'David S. Miller2017-08-1917-39/+142
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Martin KaFai Lau says: ==================== bpf: Allow selecting numa node during map creation This series allows user to pick the numa node during map creation. The first patch has the details ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: Allow numa selection in INNER_LRU_HASH_PREALLOC test of map_perf_testMartin KaFai Lau2017-08-198-16/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes the needed changes to allow each process of the INNER_LRU_HASH_PREALLOC test to provide its numa node id when creating the lru map. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: Allow selecting numa node during map creationMartin KaFai Lau2017-08-199-23/+73
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current map creation API does not allow to provide the numa-node preference. The memory usually comes from where the map-creation-process is running. The performance is not ideal if the bpf_prog is known to always run in a numa node different from the map-creation-process. One of the use case is sharding on CPU to different LRU maps (i.e. an array of LRU maps). Here is the test result of map_perf_test on the INNER_LRU_HASH_PREALLOC test if we force the lru map used by CPU0 to be allocated from a remote numa node: [ The machine has 20 cores. CPU0-9 at node 0. CPU10-19 at node 1 ] ># taskset -c 10 ./map_perf_test 512 8 1260000 8000000 5:inner_lru_hash_map_perf pre-alloc 1628380 events per sec 4:inner_lru_hash_map_perf pre-alloc 1626396 events per sec 3:inner_lru_hash_map_perf pre-alloc 1626144 events per sec 6:inner_lru_hash_map_perf pre-alloc 1621657 events per sec 2:inner_lru_hash_map_perf pre-alloc 1621534 events per sec 1:inner_lru_hash_map_perf pre-alloc 1620292 events per sec 7:inner_lru_hash_map_perf pre-alloc 1613305 events per sec 0:inner_lru_hash_map_perf pre-alloc 1239150 events per sec #<<< After specifying numa node: ># taskset -c 10 ./map_perf_test 512 8 1260000 8000000 5:inner_lru_hash_map_perf pre-alloc 1629627 events per sec 3:inner_lru_hash_map_perf pre-alloc 1628057 events per sec 1:inner_lru_hash_map_perf pre-alloc 1623054 events per sec 6:inner_lru_hash_map_perf pre-alloc 1616033 events per sec 2:inner_lru_hash_map_perf pre-alloc 1614630 events per sec 4:inner_lru_hash_map_perf pre-alloc 1612651 events per sec 7:inner_lru_hash_map_perf pre-alloc 1609337 events per sec 0:inner_lru_hash_map_perf pre-alloc 1619340 events per sec #<<< This patch adds one field, numa_node, to the bpf_attr. Since numa node 0 is a valid node, a new flag BPF_F_NUMA_NODE is also added. The numa_node field is honored if and only if the BPF_F_NUMA_NODE flag is set. Numa node selection is not supported for percpu map. This patch does not change all the kmalloc. F.e. 'htab = kzalloc()' is not changed since the object is small enough to stay in the cache. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bnxt_en: fix spelling mistake: "swtichdev" -> "switchdev"Colin Ian King2017-08-191-1/+1
| | | | | | | | | | | | | | Trivial fix to spelling mistake in a netdev_info message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: hns3: fix a handful of spelling mistakesColin Ian King2017-08-192-3/+3
| | | | | | | | | | | | | | | | | | | | | | Trival fix to spelling mistakes: firware -> firmware invald -> invalid mutilcast -> multicast Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'net-const-eisa_device_id'David S. Miller2017-08-195-5/+5
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Arvind Yadav says: ==================== constify net eisa_device_id eisa_device_id are not supposed to change at runtime. All functions working with eisa_device_id provided by <linux/eisa.h> work with const eisa_device_id. So mark the non-const structs as const. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>