]> asedeno.scripts.mit.edu Git - linux.git/log
linux.git
4 years agoxen-netback: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:31:08 +0000 (12:31 +0200)]
xen-netback: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Wei Liu <wei.liu@kernel.org>
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'net-dsa-mv88e6xxx-prepare-Wait-Bit-operation'
David S. Miller [Mon, 12 Aug 2019 04:27:15 +0000 (21:27 -0700)]
Merge branch 'net-dsa-mv88e6xxx-prepare-Wait-Bit-operation'

Vivien Didelot says:

====================
net: dsa: mv88e6xxx: prepare Wait Bit operation

The Remote Management Interface has its own implementation of a Wait
Bit operation, which requires a bit number and a value to wait for.

In order to prepare the introduction of this implementation, rework the
code waiting for bits and masks in mv88e6xxx to match this signature.

This has the benefit to unify the implementation of wait routines while
removing obsolete wait and update functions and also reducing the code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: mv88e6xxx: add delay in direct SMI wait
Vivien Didelot [Fri, 9 Aug 2019 22:47:59 +0000 (18:47 -0400)]
net: dsa: mv88e6xxx: add delay in direct SMI wait

The mv88e6xxx_smi_direct_wait routine is used to wait on indirect
registers access. It is of no exception and must delay between read
attempts, like other wait routines.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: mv88e6xxx: fix SMI bit checking
Vivien Didelot [Fri, 9 Aug 2019 22:47:58 +0000 (18:47 -0400)]
net: dsa: mv88e6xxx: fix SMI bit checking

The current mv88e6xxx_smi_direct_wait function is only used to check
the 16th bit of the (16-bit) SMI Command register. But the bit shift
operation is not enough if we eventually use this function to check
other bits, thus replace it with a mask.

Fixes: e7ba0fad9c53 ("net: dsa: mv88e6xxx: refine SMI support")
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: mv88e6xxx: remove wait and update routines
Vivien Didelot [Fri, 9 Aug 2019 22:47:57 +0000 (18:47 -0400)]
net: dsa: mv88e6xxx: remove wait and update routines

Now that we have proper Wait Bit and Wait Mask routines, remove the
unused mv88e6xxx_wait routine and its Global 1 and Global 2 variants.

The indirect tables such as the Device Mapping Table or Priority
Override Table make use of an Update bit to distinguish reading (0)
from writing (1) operations. After a write operation occurs, the bit
self clears right away so there's no need to wait on it. Thus keep
things simple and remove the mv88e6xxx_update helper as well.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: mv88e6xxx: wait for AVB Busy bit
Vivien Didelot [Fri, 9 Aug 2019 22:47:56 +0000 (18:47 -0400)]
net: dsa: mv88e6xxx: wait for AVB Busy bit

The AVB is not an indirect table using an Update bit, but a unit using
a Busy bit. This means that we must ensure that this bit is cleared
before setting it and wait until it gets cleared again after writing
an operation. Reflect that.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: mv88e6xxx: introduce wait bit routine
Vivien Didelot [Fri, 9 Aug 2019 22:47:55 +0000 (18:47 -0400)]
net: dsa: mv88e6xxx: introduce wait bit routine

Many portions of the driver need to wait until a given bit is set
or cleared. Some busses even have a specific implementation for this
operation. In preparation for such variant, implement a generic Wait
Bit routine that can be used by the driver core functions.

This allows us to get rid of the custom implementations we may find
in the driver. Note that for the EEPROM bits, BUSY and RUNNING bits
are independent, thus it is more efficient to wait independently for
each bit instead of waiting for their mask.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: mv88e6xxx: introduce wait mask routine
Vivien Didelot [Fri, 9 Aug 2019 22:47:54 +0000 (18:47 -0400)]
net: dsa: mv88e6xxx: introduce wait mask routine

The current mv88e6xxx_wait routine is used to wait for a given mask
to be cleared to zero. However in some cases, the driver may have
to wait for a given mask to be of a certain non-zero value.

Thus provide a generic wait mask routine that will be used to implement
the current mv88e6xxx_wait function, and use it to wait for 88E6185
PPU states.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: mv88e6xxx: wait for 88E6185 PPU disabled
Vivien Didelot [Fri, 9 Aug 2019 22:47:53 +0000 (18:47 -0400)]
net: dsa: mv88e6xxx: wait for 88E6185 PPU disabled

The PPU state of 88E6185 can be either "Disabled at Reset" or
"Disabled after Initialization". Because we intentionally clear the
PPU Enabled bit before checking its state, it is safe to wait for the
MV88E6185_G1_STS_PPU_STATE_DISABLED state explicitly instead of waiting
for any state different than MV88E6185_G1_STS_PPU_STATE_POLLING.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agor8169: inline rtl8169_free_rx_databuff
Heiner Kallweit [Fri, 9 Aug 2019 20:59:07 +0000 (22:59 +0200)]
r8169: inline rtl8169_free_rx_databuff

rtl8169_free_rx_databuff is used in only one place, so let's inline it.
We can improve the loop because rtl8169_init_ring zero's RX_databuff
before calling rtl8169_rx_fill, and rtl8169_rx_fill fills
Rx_databuff starting from index 0.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'realtek-phy-next'
David S. Miller [Mon, 12 Aug 2019 04:24:32 +0000 (21:24 -0700)]
Merge branch 'realtek-phy-next'

Heiner Kallweit says:

====================
net: phy: realtek: add support for integrated 2.5Gbps PHY in RTL8125

This series adds support for the integrated 2.5Gbps PHY in RTL8125.
First three patches add necessary functionality to phylib.

Changes in v2:
- added patch 1
- changed patch 4 to use a fake PHY ID that is injected by the
  network driver. This allows to use a dedicated PHY driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: realtek: add support for the 2.5Gbps PHY in RTL8125
Heiner Kallweit [Fri, 9 Aug 2019 18:45:14 +0000 (20:45 +0200)]
net: phy: realtek: add support for the 2.5Gbps PHY in RTL8125

This adds support for the integrated 2.5Gbps PHY in Realtek RTL8125.
Advertisement of 2.5Gbps mode is done via a vendor-specific register.
Same applies to reading NBase-T link partner advertisement.
Unfortunately this 2.5Gbps PHY shares the PHY ID with the integrated
1Gbps PHY's in other Realtek network chips and so far no method is
known to differentiate them. As a workaround use a dedicated fake PHY ID
that is set by the network driver by intercepting the MDIO PHY ID read.

v2:
- Create dedicated PHY driver and use a fake PHY ID that is injected by
  the network driver. Suggested by Andrew Lunn.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: add phy_modify_paged_changed
Heiner Kallweit [Fri, 9 Aug 2019 18:44:22 +0000 (20:44 +0200)]
net: phy: add phy_modify_paged_changed

Add helper function phy_modify_paged_changed, behavios is the same
as for phy_modify_changed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: prepare phylib to deal with PHY's extending Clause 22
Heiner Kallweit [Fri, 9 Aug 2019 18:43:50 +0000 (20:43 +0200)]
net: phy: prepare phylib to deal with PHY's extending Clause 22

The integrated PHY in 2.5Gbps chip RTL8125 is the first (known to me)
PHY that uses standard Clause 22 for all modes up to 1Gbps and adds
2.5Gbps control using vendor-specific registers. To use phylib for
the standard part little extensions are needed:
- Move most of genphy_config_aneg to a new function
  __genphy_config_aneg that takes a parameter whether restarting
  auto-negotiation is needed (depending on whether content of
  vendor-specific advertisement register changed).
- Don't clear phydev->lp_advertising in genphy_read_status so that
  we can set non-C22 mode flags before.

Basically both changes mimic the behavior of the equivalent Clause 45
functions.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: simplify genphy_config_advert by using the linkmode_adv_to_xxx_t functions
Heiner Kallweit [Fri, 9 Aug 2019 18:43:04 +0000 (20:43 +0200)]
net: phy: simplify genphy_config_advert by using the linkmode_adv_to_xxx_t functions

Using linkmode_adv_to_mii_adv_t and linkmode_adv_to_mii_ctrl1000_t
allows to simplify the code. In addition avoiding the conversion to
the legacy u32 advertisement format allows to remove the warning.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonetdevsim: register couple of devlink params
Jiri Pirko [Fri, 9 Aug 2019 11:05:12 +0000 (13:05 +0200)]
netdevsim: register couple of devlink params

Register couple of devlink params, one generic, one driver-specific.
Make the values available over debugfs.

Example:
$ echo "111" > /sys/bus/netdevsim/new_device
$ devlink dev param
netdevsim/netdevsim111:
  name max_macs type generic
    values:
      cmode driverinit value 32
  name test1 type driver-specific
    values:
      cmode driverinit value true
$ cat /sys/kernel/debug/netdevsim/netdevsim111/max_macs
32
$ cat /sys/kernel/debug/netdevsim/netdevsim111/test1
Y
$ devlink dev param set netdevsim/netdevsim111 name max_macs cmode driverinit value 16
$ devlink dev param set netdevsim/netdevsim111 name test1 cmode driverinit value false
$ devlink dev reload netdevsim/netdevsim111
$ cat /sys/kernel/debug/netdevsim/netdevsim111/max_macs
16
$ cat /sys/kernel/debug/netdevsim/netdevsim111/test1

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'drop_monitor-Capture-dropped-packets-and-metadata'
David S. Miller [Sun, 11 Aug 2019 17:53:31 +0000 (10:53 -0700)]
Merge branch 'drop_monitor-Capture-dropped-packets-and-metadata'

Ido Schimmel says:

====================
drop_monitor: Capture dropped packets and metadata

So far drop monitor supported only one mode of operation in which a
summary of recent packet drops is periodically sent to user space as a
netlink event. The event only includes the drop location (program
counter) and number of drops in the last interval.

While this mode of operation allows one to understand if the system is
dropping packets, it is not sufficient if a more detailed analysis is
required. Both the packet itself and related metadata are missing.

This patchset extends drop monitor with another mode of operation where
the packet - potentially truncated - and metadata (e.g., drop location,
timestamp, netdev) are sent to user space as a netlink event. Thanks to
the extensible nature of netlink, more metadata can be added in the
future.

To avoid performing expensive operations in the context in which
kfree_skb() is called, the dropped skbs are cloned and queued on per-CPU
skb drop list. The list is then processed in process context (using a
workqueue), where the netlink messages are allocated, prepared and
finally sent to user space.

A follow-up patchset will integrate drop monitor with devlink and allow
the latter to call into drop monitor to report hardware drops. In the
future, XDP drops can be added as well, thereby making drop monitor the
go-to netlink channel for diagnosing all packet drops.

Example usage with patched dropwatch [1] can be found here [2]. Example
dissection of drop monitor netlink events with patched wireshark [3] can
be found here [4]. I will submit both changes upstream after the kernel
changes are accepted. Another change worth making is adding a dropmon
pseudo interface to libpcap, similar to the nflog interface [5]. This
will allow users to specifically listen on dropmon traffic instead of
capturing all netlink packets via the nlmon netdev.

Patches #1-#5 prepare the code towards the actual changes in later
patches.

Patch #6 adds another mode of operation to drop monitor in which the
dropped packet itself is notified to user space along with metadata.

Patch #7 allows users to truncate reported packets to a specific length,
in case only the headers are of interest. The original length of the
packet is added as metadata to the netlink notification.

Patch #8 allows user to query the current configuration of drop monitor
(e.g., alert mode, truncation length).

Patches #9-#10 allow users to tune the length of the per-CPU skb drop
list according to their needs.

Changes since v1 [6]:
* Add skb protocol as metadata. This allows user space to correctly
  dissect the packet instead of blindly assuming it is an Ethernet
  packet

Changes since RFC [7]:
* Limit the length of the per-CPU skb drop list and make it configurable
* Do not use the hysteresis timer in packet alert mode
* Introduce alert mode operations in a separate patch and only then
  introduce the new alert mode
* Use 'skb->skb_iif' instead of 'skb->dev' because the latter is inside
  a union with 'dev_scratch' and therefore not guaranteed to point to a
  valid netdev
* Return '-EBUSY' instead of '-EOPNOTSUPP' when trying to configure drop
  monitor while it is monitoring
* Did not change schedule_work() in favor of schedule_work_on() as I did
  not observe a change in number of tail drops

[1] https://github.com/idosch/dropwatch/tree/packet-mode
[2] https://gist.github.com/idosch/3d524b887e16bc11b4b19e25c23dcc23#file-gistfile1-txt
[3] https://github.com/idosch/wireshark/tree/drop-monitor-v2
[4] https://gist.github.com/idosch/3d524b887e16bc11b4b19e25c23dcc23#file-gistfile2-txt
[5] https://github.com/the-tcpdump-group/libpcap/blob/master/pcap-netfilter-linux.c
[6] https://patchwork.ozlabs.org/cover/1143443/
[7] https://patchwork.ozlabs.org/cover/1135226/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodrop_monitor: Expose tail drop counter
Ido Schimmel [Sun, 11 Aug 2019 07:35:55 +0000 (10:35 +0300)]
drop_monitor: Expose tail drop counter

Previous patch made the length of the per-CPU skb drop list
configurable. Expose a counter that shows how many packets could not be
enqueued to this list.

This allows users determine the desired queue length.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodrop_monitor: Make drop queue length configurable
Ido Schimmel [Sun, 11 Aug 2019 07:35:54 +0000 (10:35 +0300)]
drop_monitor: Make drop queue length configurable

In packet alert mode, each CPU holds a list of dropped skbs that need to
be processed in process context and sent to user space. To avoid
exhausting the system's memory the maximum length of this queue is
currently set to 1000.

Allow users to tune the length of this queue according to their needs.
The configured length is reported to user space when drop monitor
configuration is queried.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodrop_monitor: Add a command to query current configuration
Ido Schimmel [Sun, 11 Aug 2019 07:35:53 +0000 (10:35 +0300)]
drop_monitor: Add a command to query current configuration

Users should be able to query the current configuration of drop monitor
before they start using it. Add a command to query the existing
configuration which currently consists of alert mode and packet
truncation length.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodrop_monitor: Allow truncation of dropped packets
Ido Schimmel [Sun, 11 Aug 2019 07:35:52 +0000 (10:35 +0300)]
drop_monitor: Allow truncation of dropped packets

When sending dropped packets to user space it is not always necessary to
copy the entire packet as usually only the headers are of interest.

Allow user to specify the truncation length and add the original length
of the packet as additional metadata to the netlink message.

By default no truncation is performed.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodrop_monitor: Add packet alert mode
Ido Schimmel [Sun, 11 Aug 2019 07:35:51 +0000 (10:35 +0300)]
drop_monitor: Add packet alert mode

So far drop monitor supported only one alert mode in which a summary of
locations in which packets were recently dropped was sent to user space.

This alert mode is sufficient in order to understand that packets were
dropped, but lacks information to perform a more detailed analysis.

Add a new alert mode in which the dropped packet itself is passed to
user space along with metadata: The drop location (as program counter
and resolved symbol), ingress netdevice and drop timestamp. More
metadata can be added in the future.

To avoid performing expensive operations in the context in which
kfree_skb() is invoked (can be hard IRQ), the dropped skb is cloned and
queued on per-CPU skb drop list. Then, in process context the netlink
message is allocated, prepared and finally sent to user space.

The per-CPU skb drop list is limited to 1000 skbs to prevent exhausting
the system's memory. Subsequent patches will make this limit
configurable and also add a counter that indicates how many skbs were
tail dropped.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodrop_monitor: Add alert mode operations
Ido Schimmel [Sun, 11 Aug 2019 07:35:50 +0000 (10:35 +0300)]
drop_monitor: Add alert mode operations

The next patch is going to add another alert mode in which the dropped
packet is notified to user space, instead of only a summary of recent
drops.

Abstract the differences between the modes by adding alert mode
operations. The operations are selected based on the currently
configured mode and associated with the probes and the work item just
before tracing starts.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodrop_monitor: Require CAP_NET_ADMIN for drop monitor configuration
Ido Schimmel [Sun, 11 Aug 2019 07:35:49 +0000 (10:35 +0300)]
drop_monitor: Require CAP_NET_ADMIN for drop monitor configuration

Currently, the configure command does not do anything but return an
error. Subsequent patches will enable the command to change various
configuration options such as alert mode and packet truncation.

Similar to other netlink-based configuration channels, make sure only
users with the CAP_NET_ADMIN capability set can execute this command.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodrop_monitor: Reset per-CPU data before starting to trace
Ido Schimmel [Sun, 11 Aug 2019 07:35:48 +0000 (10:35 +0300)]
drop_monitor: Reset per-CPU data before starting to trace

The function reset_per_cpu_data() allocates and prepares a new skb for
the summary netlink alert message ('NET_DM_CMD_ALERT'). The new skb is
stored in the per-CPU 'data' variable and the old is returned.

The function is invoked during module initialization and from the
workqueue, before an alert is sent. This means that it is possible to
receive an alert with stale data, if we stopped tracing when the
hysteresis timer ('data->send_timer') was pending.

Instead of invoking the function during module initialization, invoke it
just before we start tracing and ensure we get a fresh skb.

This also allows us to remove the calls to initialize the timer and the
work item from the module initialization path, since both could have
been triggered by the error paths of reset_per_cpu_data().

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodrop_monitor: Initialize timer and work item upon tracing enable
Ido Schimmel [Sun, 11 Aug 2019 07:35:47 +0000 (10:35 +0300)]
drop_monitor: Initialize timer and work item upon tracing enable

The timer and work item are currently initialized once during module
init, but subsequent patches will need to associate different functions
with the work item, based on the configured alert mode.

Allow subsequent patches to make that change by initializing and
de-initializing these objects during tracing enable and disable.

This also guarantees that once the request to disable tracing returns,
no more netlink notifications will be generated.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodrop_monitor: Split tracing enable / disable to different functions
Ido Schimmel [Sun, 11 Aug 2019 07:35:46 +0000 (10:35 +0300)]
drop_monitor: Split tracing enable / disable to different functions

Subsequent patches will need to enable / disable tracing based on the
configured alerting mode.

Reduce the nesting level and prepare for the introduction of this
functionality by splitting the tracing enable / disable operations into
two different functions.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'Networking-driver-debugfs-cleanups'
David S. Miller [Sat, 10 Aug 2019 22:25:49 +0000 (15:25 -0700)]
Merge branch 'Networking-driver-debugfs-cleanups'

Greg Kroah-Hartman says:

====================
Networking driver debugfs cleanups

There is no need to test the result of any debugfs call anymore.  The
debugfs core warns the user if something fails, and the return value of
a debugfs call can always be fed back into another debugfs call with no
problems.

Also, debugfs is for debugging, so if there are problems with debugfs
(i.e. the system is out of memory) the rest of the kernel should not
change behavior, so testing for debugfs calls is pointless and not the
goal of debugfs at all.

This series cleans up a lot of networking drivers and some wimax code
that was calling debugfs and trying to do something with the return
value that it didn't need to.  Removing this logic makes the code
smaller, easier to understand, and use less run-time memory in some
cases, all good things.

The series is against net-next, and have no dependancies between any of
them if they want to go through any random tree/order.  Or, if wanted,
I can take them through my driver-core tree where other debugfs cleanups
are being slowly fed during major merge windows.

v3: fix build warning in i2400m, I thought I had caught them all :(
    add acks from some reviewers

v2: fix up build warnings, it's as if I never even built these.  Ugh, so
    sorry for wasting people's time with the v1 series.  I need to stop
    relying on 0-day as it isn't working well anymore :(
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoieee802154: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:32 +0000 (12:17 +0200)]
ieee802154: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Alexander Aring <alex.aring@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Harry Morris <h.morris@cascoda.com>
Cc: linux-wpan@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Acked-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoixgbe: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:31 +0000 (12:17 +0200)]
ixgbe: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoi40e: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:30 +0000 (12:17 +0200)]
i40e: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agofm10k: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:29 +0000 (12:17 +0200)]
fm10k: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomvpp2: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:28 +0000 (12:17 +0200)]
mvpp2: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maxime Chevallier <maxime.chevallier@bootlin.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nathan Huckleberry <nhuck@google.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoskge: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:27 +0000 (12:17 +0200)]
skge: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoqca: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:26 +0000 (12:17 +0200)]
qca: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stefan Wahren <stefan.wahren@i2se.com>
Cc: Michael Heimpold <michael.heimpold@i2se.com>
Cc: Yangtao Li <tiny.windzz@gmail.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodpaa2: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:25 +0000 (12:17 +0200)]
dpaa2: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Because we don't care about the individual files, we can remove the
stored dentry for the files, as they are not needed to be kept track of
at all.

Cc: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agostmmac: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:24 +0000 (12:17 +0200)]
stmmac: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Because we don't care about the individual files, we can remove the
stored dentry for the files, as they are not needed to be kept track of
at all.

Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Jose Abreu <joabreu@synopsys.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: netdev@vger.kernel.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonfp: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:23 +0000 (12:17 +0200)]
nfp: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Edwin Peer <edwin.peer@netronome.com>
Cc: Yangtao Li <tiny.windzz@gmail.com>
Cc: Simon Horman <simon.horman@netronome.com>
Cc: oss-drivers@netronome.com
Cc: netdev@vger.kernel.org
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agohns3: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:22 +0000 (12:17 +0200)]
hns3: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agocxgb4: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:21 +0000 (12:17 +0200)]
cxgb4: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

If a debugfs call fails, it will properly warn in the syslog, there's no
need for all individual drivers to also print a message, so that is one
more reason to not care about checking the return values.

Cc: Vishal Kulkarni <vishal@chelsio.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Casey Leedom <leedom@chelsio.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobnxt: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:20 +0000 (12:17 +0200)]
bnxt: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

This cleans up a lot of unneeded code and logic around the debugfs
files, making all of this much simpler and easier to understand.

Cc: Michael Chan <michael.chan@broadcom.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoxgbe: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:19 +0000 (12:17 +0200)]
xgbe: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

This cleans up a lot of unneeded code and logic around the debugfs
files, making all of this much simpler and easier to understand.

Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlx5: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:18 +0000 (12:17 +0200)]
mlx5: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

This cleans up a lot of unneeded code and logic around the debugfs
files, making all of this much simpler and easier to understand as we
don't need to keep the dentries saved anymore.

Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agobonding: no need to print a message if debugfs_create_dir() fails
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:17 +0000 (12:17 +0200)]
bonding: no need to print a message if debugfs_create_dir() fails

The debugfs core now will print a message if this function fails, so
don't duplicate that logic.  Also, no need to change the code logic if
the call fails either, as no debugfs calls should interrupt normal
kernel code for any reason.

Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agowimax: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sat, 10 Aug 2019 10:17:16 +0000 (12:17 +0200)]
wimax: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

This cleans up a lot of unneeded code and logic around the debugfs wimax
files, making all of this much simpler and easier to understand.

Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: linux-wimax@intel.com
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge tag 'mlx5-updates-2019-08-09' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Sat, 10 Aug 2019 03:11:19 +0000 (20:11 -0700)]
Merge tag 'mlx5-updates-2019-08-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2019-08-09

This series includes update to mlx5 ethernet and core driver:

In first #11 patches, Vlad submits part 2 of 3 part series to allow
TC flow handling for concurrent execution.

1) TC flow handling for concurrent execution (part 2)

Vald Says:
==========

Refactor data structures that are shared between flows in tc.
Currently, all cls API hardware offloads driver callbacks require caller
to hold rtnl lock when calling them. Cls API has already been updated to
update software filters in parallel (on classifiers that support
unlocked execution), however hardware offloads code still obtains rtnl
lock before calling driver tc callbacks. This set implements support for
unlocked execution of tc hairpin, mod_hdr and encap subsystem. The
changed implemented in these subsystems are very similar in general.

The main difference is that hairpin is accessed through mlx5e_tc_table
(legacy mode), mod_hdr is accessed through both mlx5e_tc_table and
mlx5_esw_offload (legacy and switchdev modes) and encap is only accessed
through mlx5_esw_offload (switchdev mode).

1.1) Hairpin handling and structure mlx5e_hairpin_entry refactored in
following way:

- Hairpin structure is extended with atomic reference counter. This
  approach allows to lookup of hairpin entry and obtain reference to it
  with hairpin_tbl_lock protection and then continue using the entry
  unlocked (including provisioning to hardware).

- To support unlocked provisioning of hairpin entry to hardware, the entry
  is extended with 'res_ready' completion and is inserted to hairpin_tbl
  before calling the firmware. With this approach any concurrent users that
  attempt to use the same hairpin entry wait for completion first to
  prevent access to entries that are not fully initialized.

- Hairpin entry is extended with new flows_lock spinlock to protect the
  list when multiple concurrent tc instances update flows attached to
  the same hairpin entry.

1.2) Modify header handling code and structure mlx5e_mod_hdr_entry
are refactored in the following way:

- Mod_hdr structure is extended with atomic reference counter. This
  approach allows to lookup of mod_hdr entry and obtain reference to it
  with mod_hdr_tbl_lock protection and then continue using the entry
  unlocked (including provisioning to hardware).

- To support unlocked provisioning of mod_hdr entry to hardware, the entry
  is extended with 'res_ready' completion and is inserted to mod_hdr_tbl
  before calling the firmware. With this approach any concurrent users that
  attempt to use the same mod_hdr entry wait for completion first to
  prevent access to entries that are not fully initialized.

- Mod_Hdr entry is extended with new flows_lock spinlock to protect the
  list when multiple concurrent tc instances update flows attached to
  the same mod_hdr entry.

1.3) Encapsulation handling code and Structure mlx5e_encap_entry
are refactored in the following way:

- encap structure is extended with atomic reference counter. This
  approach allows to lookup of encap entry and obtain reference to it
  with encap_tbl_lock protection and then continue using the entry
  unlocked (including provisioning to hardware).

- To support unlocked provisioning of encap entry to hardware, the entry is
  extended with 'res_ready' completion and is inserted to encap_tbl before
  calling the firmware. With this approach any concurrent users that
  attempt to use the same encap entry wait for completion first to prevent
  access to entries that are not fully initialized.

- As a difference from approach used to refactor hairpin and mod_hdr,
  encap entry is not extended with any per-entry fine-grained lock.
  Instead, encap_table_lock is used to synchronize all operations on
  encap table and instances of mlx5e_encap_entry. This is necessary
  because single flow can be attached to multiple encap entries
  simultaneously. During new flow creation or neigh update event all of
  encaps that flow is attached to must be accessed together as in atomic
  manner, which makes usage of per-entry lock infeasible.

- Encap entry is extended with new flows_lock spinlock to protect the
  list when multiple concurrent tc instances update flows attached to
  the same encap entry.

==========

3) Parav improves the way port representors report their parent ID and
port index.

4) Use refcount_t for refcount in vxlan data base from  Chuhong Yuan
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotc-testing: added tdc tests for matchall filter
Roman Mashak [Fri, 9 Aug 2019 22:46:40 +0000 (18:46 -0400)]
tc-testing: added tdc tests for matchall filter

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoselftests: Fix detection of nettest command in fcnal-test
David Ahern [Fri, 9 Aug 2019 23:13:38 +0000 (16:13 -0700)]
selftests: Fix detection of nettest command in fcnal-test

Most of the tests run by fcnal-test.sh relies on the nettest command.
Rather than trying to cover all of the individual tests, check for the
binary only at the beginning.

Also removes the need for log_error which is undefined.

Fixes: 6f9d5cacfe07 ("selftests: Setup for functional tests for fib and socket lookups")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5e: Use refcount_t for refcount
Chuhong Yuan [Fri, 2 Aug 2019 16:48:28 +0000 (00:48 +0800)]
net/mlx5e: Use refcount_t for refcount

refcount_t is better for reference counters since its
implementation can prevent overflows.
So convert atomic_t ref counters to refcount_t.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Use vhca_id in generating representor port_index
Parav Pandit [Sun, 21 Jul 2019 03:20:58 +0000 (22:20 -0500)]
net/mlx5e: Use vhca_id in generating representor port_index

It is desired to use unique port indices when multiple pci devices'
devlink instance have the same switch-id.

Make use of vhca-id to generate such unique devlink port indices.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Simplify querying port representor parent id
Parav Pandit [Fri, 26 Jul 2019 18:42:04 +0000 (13:42 -0500)]
net/mlx5e: Simplify querying port representor parent id

System image GUID doesn't depend on eswitch switchdev mode.

Hence, remove the check which simplifies the code.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: E-switch, Removed unused hwid
Parav Pandit [Fri, 26 Jul 2019 13:26:52 +0000 (08:26 -0500)]
net/mlx5: E-switch, Removed unused hwid

Currently mlx5_eswitch_rep stores same hw ID for all representors.
However it is never used from this structure.
It is always used from mlx5_vport.

Hence, remove unused field.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Allow concurrent creation of encap entries
Vlad Buslov [Thu, 8 Aug 2019 14:01:33 +0000 (17:01 +0300)]
net/mlx5e: Allow concurrent creation of encap entries

Encap entries creation is fully synchronized by encap_tbl_lock. In order to
allow concurrent allocation of hardware resources used to offload
encapsulation, extend mlx5e_encap_entry with 'res_ready' completion. Move
call to mlx5e_tc_tun_create_header_ipv{4|6}() out of encap_tbl_lock
critical section. Modify code that attaches new flows to existing encap to
wait for 'res_ready' completion before using the entry. Insert encap entry
to table before provisioning it to hardware and modify all users of the
encap table to verify that encap was fully initialized by checking
completion result for non-zero value (and to wait for 'res_ready'
completion, if necessary).

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Protect encap hash table with mutex
Vlad Buslov [Fri, 2 Aug 2019 19:21:56 +0000 (22:21 +0300)]
net/mlx5e: Protect encap hash table with mutex

To remove dependency on rtnl lock, protect encap hash table from concurrent
modifications with new "encap_tbl_lock" mutex. Use the mutex to protect
internal encap entry state from concurrent modification. This is necessary
because a flow can be attached to multiple encap entries simultaneously,
which significantly complicates using finer grained per-entry lock.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Extend encap entry with reference counter
Vlad Buslov [Sun, 3 Jun 2018 17:31:47 +0000 (20:31 +0300)]
net/mlx5e: Extend encap entry with reference counter

List of flows attached to encap entry is used as implicit reference
counter (encap entry is deallocated when list becomes free) and as a
mechanism to obtain encap entry that flow is attached to (through list
head). This is not safe when concurrent modification of list of flows
attached to encap entry is possible. Proper atomic reference counter is
required to support concurrent access.

As a preparation for extending encap with reference counting, extract code
that lookups and deletes encap entry into standalone put/get helpers. In
order to remove this dependency on external locking, extend encap entry
with reference counter to manage its lifetime and extend flow structure
with direct pointer to encap entry that flow is attached to.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Allow concurrent creation of mod_hdr entries
Vlad Buslov [Thu, 8 Aug 2019 13:53:15 +0000 (16:53 +0300)]
net/mlx5e: Allow concurrent creation of mod_hdr entries

Mod_hdr entries creation is fully synchronized by mod_hdr_tbl->lock. In
order to allow concurrent allocation of hardware resources used to offload
header rewrite, extend mlx5e_mod_hdr_entry with 'res_ready' completion.
Move call to mlx5_modify_header_alloc() out of mod_hdr_tbl->lock critical
section. Modify code that attaches new flows to existing mh to wait for
'res_ready' completion before using the entry. Insert mh to mod_hdr table
before provisioning it to hardware and modify all users of mod_hdr table to
verify that mh was fully initialized by checking completion result for
negative value (and to wait for 'res_ready' completion, if necessary).

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Protect mod_hdr hash table with mutex
Vlad Buslov [Fri, 9 Aug 2019 10:20:48 +0000 (13:20 +0300)]
net/mlx5e: Protect mod_hdr hash table with mutex

To remove dependency on rtnl lock, protect mod_hdr hash table from
concurrent modifications with new mutex.

Implement helper function to get flow namespace to prevent code
duplication.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Protect mod header entry flows list with spinlock
Vlad Buslov [Fri, 8 Jun 2018 19:10:09 +0000 (22:10 +0300)]
net/mlx5e: Protect mod header entry flows list with spinlock

To remove dependency on rtnl lock, extend mod header entry with spinlock
and use it to protect list of flows attached to mod header entry from
concurrent modifications.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Extend mod header entry with reference counter
Vlad Buslov [Fri, 1 Jun 2018 18:47:43 +0000 (21:47 +0300)]
net/mlx5e: Extend mod header entry with reference counter

List of flows attached to mod header entry is used as implicit reference
counter (mod header entry is deallocated when list becomes free) and as a
mechanism to obtain mod header entry that flow is attached to (through list
head). This is not safe when concurrent modification of list of flows
attached to mod header entry is possible. Proper atomic reference counter
is required to support concurrent access.

As a preparation for extending mod header with reference counting, extract
code that lookups and deletes mod header entry into standalone put/get
helpers. In order to remove this dependency on external locking, extend mod
header entry with reference counter to manage its lifetime and extend flow
structure with direct pointer to mod header entry that flow is attached to.

To remove code duplication between legacy and switchdev mode
implementations that both support mod_hdr functionality, store mod_hdr
table in dedicated structure used by both fdb and kernel namespaces. New
table structure is extended with table lock by one of the following patches
in this series. Implement helper function to get correct mod_hdr table
depending on flow namespace.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Allow concurrent creation of hairpin entries
Vlad Buslov [Thu, 1 Aug 2019 13:54:54 +0000 (16:54 +0300)]
net/mlx5e: Allow concurrent creation of hairpin entries

Hairpin entries creation is fully synchronized by hairpin_tbl_lock. In
order to allow concurrent initialization of mlx5e_hairpin structure
instances and provisioning of hairpin entries to hardware, extend
mlx5e_hairpin_entry with 'res_ready' completion. Move call to
mlx5e_hairpin_create() out of hairpin_tbl_lock critical section. Modify
code that attaches new flows to existing hpe to wait for 'res_ready'
completion before using the hpe. Insert hpe to hairpin table before
provisioning it to hardware and modify all users of hairpin table to verify
that hpe was fully initialized by checking hpe->hp pointer (and to wait for
'res_ready' completion, if necessary).

Modify dead peer update event handling function to save hpe's to temporary
list with their reference counter incremented. Wait for completion of hpe's
in temporary list and update their 'peer_gone' flag outside of
hairpin_tbl_lock critical section.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Protect hairpin hash table with mutex
Vlad Buslov [Wed, 31 Jul 2019 15:19:06 +0000 (18:19 +0300)]
net/mlx5e: Protect hairpin hash table with mutex

To remove dependency on rtnl lock, protect hairpin hash table from
concurrent modifications with new "hairpin_tbl_lock" mutex.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Protect hairpin entry flows list with spinlock
Vlad Buslov [Thu, 7 Jun 2018 20:01:40 +0000 (23:01 +0300)]
net/mlx5e: Protect hairpin entry flows list with spinlock

To remove dependency on rtnl lock, extend hairpin entry with spinlock and
use it to protect list of flows attached to hairpin entry from concurrent
modifications.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Extend hairpin entry with reference counter
Vlad Buslov [Fri, 8 Jun 2018 16:26:50 +0000 (19:26 +0300)]
net/mlx5e: Extend hairpin entry with reference counter

List of flows attached to hairpin entry is used as implicit reference
counter (hairpin entry is deallocated when list becomes free) and as a
mechanism to obtain hairpin entry that flow is attached to (through list
head). This is not safe when concurrent modification of list of flows
attached to hairpin entry is possible. Proper atomic reference counter is
required to support concurrent access.

As a preparation for extending hairpin with reference counting, extract
code that deletes hairpin entry into standalone function. In order to
remove this dependency on external locking, extend hairpin entry with
reference counter to manage its lifetime and extend flow structure with
direct pointer to hairpin entry that flow is attached to.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agoMerge branch 'hns3-next'
David S. Miller [Fri, 9 Aug 2019 20:44:33 +0000 (13:44 -0700)]
Merge branch 'hns3-next'

Huazhong Tan says:

====================
net: hns3: add some bugfixes & optimizations & cleanups for HNS3 driver

This patch-set includes code optimizations, bugfixes and cleanups for
the HNS3 ethernet controller driver.

[patch 01/12] fixes a GFP flag error.

[patch 02/12] fixes a VF interrupt error.

[patch 03/12] adds a cleanup for VLAN handling.

[patch 04/12] fixes a bug in debugfs.

[patch 05/12] modifies pause displaying format.

[patch 06/12] adds more DFX information for ethtool -d.

[patch 07/12] adds more TX statistics information.

[patch 08/12] adds a check for TX BD number.

[patch 09/12] adds a cleanup for dumping NCL_CONFIG.

[patch 10/12] refines function for querying MAC pause statistics.

[patch 11/12] adds a handshake with VF when doing PF reset.

[patch 12/12] refines some macro definitions.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: refine some macro definitions
Guojia Liao [Fri, 9 Aug 2019 02:31:18 +0000 (10:31 +0800)]
net: hns3: refine some macro definitions

Macro arguments should be enclosed in parentheses, in case of
expression argument, but parentheses of pure number in macro
definition should be removed for simplicity.

Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: add handshake with VF for PF reset
Huazhong Tan [Fri, 9 Aug 2019 02:31:17 +0000 (10:31 +0800)]
net: hns3: add handshake with VF for PF reset

Before PF asserting function reset, it should make sure
that all its VFs have been ready, otherwise, it will cause
some hardware errors.

So this patch adds function hclge_func_reset_sync_vf() to
synchronize VF before asserting PF function reset. For new
firmware which supports command HCLGE_OPC_QUERY_VF_RST_RDY,
we will try to query VFs' ready status within 30 seconds.
And keep the old implementation for compatible with firmware
which does not support this command.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: refine MAC pause statistics querying function
Yufeng Mo [Fri, 9 Aug 2019 02:31:16 +0000 (10:31 +0800)]
net: hns3: refine MAC pause statistics querying function

This patch refines the interface for querying MAC pause
statistics, and adds structure hns3_mac_stats to keep the
count of TX & RX.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: add function display NCL_CONFIG info
Yufeng Mo [Fri, 9 Aug 2019 02:31:15 +0000 (10:31 +0800)]
net: hns3: add function display NCL_CONFIG info

This adds a new function hclge_ncl_config_data_print()
to print the data of NCL_CONFIG, to make the code more
readable. Also, using macro replaces some magic number.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: add check for max TX BD num for tso and non-tso case
Yunsheng Lin [Fri, 9 Aug 2019 02:31:14 +0000 (10:31 +0800)]
net: hns3: add check for max TX BD num for tso and non-tso case

Hardware supports up to 8 TX BD for non-TSO skb and 63 TX
BD for TSO skb. Currently hns3 driver does not check the max
BD num that required by a skb before filling desc, which may
cause the hardware to issue a RAS error throug PCIe AER.

This patch adds the max BD num check before filling desc,
if the bd num is not within the hardware limit, it will
record the error by ring->stats.sw_err_cnt counter and
free the skb.

This patch also cleans up the hns3_nic_bd_num function by
changing the return type and removing an unnecessary check.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: add some statitics info to tx process
Yunsheng Lin [Fri, 9 Aug 2019 02:31:13 +0000 (10:31 +0800)]
net: hns3: add some statitics info to tx process

This patch adds tx_vlan_err, tx_l4_proto_err, tx_l2l3l4_err
and tx_tso_err counter to tx process, in order to better
debug the desc filling error.

This patch also adds a missing u64_stats_update_* around
ring->stats.sw_err_cnt and adds hns3_rl_err to limit the
error printing in the IO patch.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: add DFX registers information for ethtool -d
Guangbin Huang [Fri, 9 Aug 2019 02:31:12 +0000 (10:31 +0800)]
net: hns3: add DFX registers information for ethtool -d

Now we can use ethtool -d command to dump some registers. However,
these registers information is not enough to find out where the problem is.

This patch adds DFX registers information after original registers
when use ethtool -d commmand to dump registers. Also, using macro
replaces some related magic number.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: modify how pause options is displayed
Yonglong Liu [Fri, 9 Aug 2019 02:31:11 +0000 (10:31 +0800)]
net: hns3: modify how pause options is displayed

Currently, the pause options of HNS3 shown like this:
"RX/TX" is always the same with "RX negotiated/TX negotiated".
Because of the driver covered the value of "RX/TX" with the value
of "RX negotiated/TX negotiated" after adjust link.

This patch records the pause configurations of the user, and never
covered them in adjust link.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: add input length check for debugfs write function
Yufeng Mo [Fri, 9 Aug 2019 02:31:10 +0000 (10:31 +0800)]
net: hns3: add input length check for debugfs write function

If the input length reaches the maximum value of size_t, the reverse is
triggered when 1 is added. In addition, there is no need to have such a
large length. Therefore, the input length should be checked and the value
should be less than or equal to 1024.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: clean up for vlan handling in hns3_fill_desc_vtags
Yunsheng Lin [Fri, 9 Aug 2019 02:31:09 +0000 (10:31 +0800)]
net: hns3: clean up for vlan handling in hns3_fill_desc_vtags

This patch refactors the hns3_fill_desc_vtags function
by avoiding passing too many parameters, reducing indent
level and some other clean up.

This patch also adds the hns3_fill_skb_desc function to
fill the first desc of a skb.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: fix interrupt clearing error for VF
Huazhong Tan [Fri, 9 Aug 2019 02:31:08 +0000 (10:31 +0800)]
net: hns3: fix interrupt clearing error for VF

Currently, VF driver has two kinds of interrupts, reset & CMDQ RX.
For revision 0x21, according to the UM, each interrupt should be
cleared by write 0 to the corresponding bit, but the implementation
writes 0 to the whole register in fact, it will clear other
interrupt at the same time, then the VF will loss the interrupt.
But for revision 0x20, this interrupt clear register is a read &
write register, for compatible, we just keep the old implementation
for 0x20.

This patch fixes it, also, adds a new register for reading the interrupt
status according to hardware user manual.

Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Fixes: b90fcc5bd904 ("net: hns3: add reset handling for VF when doing Core/Global/IMP reset")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hns3: fix GFP flag error in hclge_mac_update_stats()
Zhongzhu Liu [Fri, 9 Aug 2019 02:31:07 +0000 (10:31 +0800)]
net: hns3: fix GFP flag error in hclge_mac_update_stats()

When CONFIG_DEBUG_ATOMIC_SLEEP on, calling kzalloc with
GFP_KERNEL in hclge_mac_update_stats() will get below warning:

[   52.514677] BUG: sleeping function called from invalid context at mm/slab.h:501
[   52.522051] in_atomic(): 0, irqs_disabled(): 0, pid: 1015, name: ifconfig
[   52.528827] 2 locks held by ifconfig/1015:
[   52.532921]  #0: (____ptrval____) (&p->lock){....}, at: seq_read+0x54/0x748
[   52.539878]  #1: (____ptrval____) (rcu_read_lock){....}, at: dev_seq_start+0x0/0x140
[   52.547610] CPU: 16 PID: 1015 Comm: ifconfig Not tainted 5.3.0-rc3-00697-g20b80be #98
[   52.555408] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B050.01 08/08/2019
[   52.564242] Call trace:
[   52.566687]  dump_backtrace+0x0/0x1f8
[   52.570338]  show_stack+0x14/0x20
[   52.573646]  dump_stack+0xb4/0xec
[   52.576950]  ___might_sleep+0x178/0x198
[   52.580773]  __might_sleep+0x74/0xe0
[   52.584338]  __kmalloc+0x244/0x2d8
[   52.587744]  hclge_mac_update_stats+0xc8/0x1f8 [hclge]
[   52.592870]  hclge_update_stats+0xe0/0x170 [hclge]
[   52.597651]  hns3_nic_get_stats64+0xa0/0x458 [hns3]
[   52.602514]  dev_get_stats+0x58/0x138
[   52.606165]  dev_seq_printf_stats+0x8c/0x280
[   52.610420]  dev_seq_show+0x14/0x40
[   52.613898]  seq_read+0x574/0x748
[   52.617205]  proc_reg_read+0xb4/0x108
[   52.620857]  __vfs_read+0x54/0xa8
[   52.624162]  vfs_read+0xa0/0x190
[   52.627380]  ksys_read+0xc8/0x178
[   52.630685]  __arm64_sys_read+0x40/0x50
[   52.634509]  el0_svc_common.constprop.0+0x120/0x1e0
[   52.639369]  el0_svc_handler+0x50/0x90
[   52.643106]  el0_svc+0x8/0xc

So this patch uses GFP_ATOMIC instead of GFP_KERNEL to fix it.

Fixes: d174ea75c96a ("net: hns3: add statistics for PFC frames and MAC control frames")
Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotaprio: remove unused variable 'entry_list_policy'
YueHaibing [Fri, 9 Aug 2019 01:49:23 +0000 (09:49 +0800)]
taprio: remove unused variable 'entry_list_policy'

net/sched/sch_taprio.c:680:32: warning:
 entry_list_policy defined but not used [-Wunused-const-variable=]

One of the points of commit a3d43c0d56f1 ("taprio: Add support adding
an admin schedule") is that it removes support (it now returns "not
supported") for schedules using the TCA_TAPRIO_ATTR_SCHED_SINGLE_ENTRY
attribute (which were never used), the parsing of those types of schedules
was the only user of this policy. So removing this policy should be fine.

Reported-by: Hulk Robot <hulkci@huawei.com>
Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agor8169: fix performance issue on RTL8168evl
Holger Hoffstätte [Thu, 8 Aug 2019 22:02:40 +0000 (00:02 +0200)]
r8169: fix performance issue on RTL8168evl

Disabling TSO but leaving SG active results is a significant
performance drop. Therefore disable also SG on RTL8168evl.
This restores the original performance.

Fixes: 93681cd7d94f ("r8169: enable HW csum and TSO")
Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotcp: Update TCP_BASE_MSS comment
Josh Hunt [Wed, 7 Aug 2019 23:52:30 +0000 (19:52 -0400)]
tcp: Update TCP_BASE_MSS comment

TCP_BASE_MSS is used as the default initial MSS value when MTU probing is
enabled. Update the comment to reflect this.

Suggested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotcp: add new tcp_mtu_probe_floor sysctl
Josh Hunt [Wed, 7 Aug 2019 23:52:29 +0000 (19:52 -0400)]
tcp: add new tcp_mtu_probe_floor sysctl

The current implementation of TCP MTU probing can considerably
underestimate the MTU on lossy connections allowing the MSS to get down to
48. We have found that in almost all of these cases on our networks these
paths can handle much larger MTUs meaning the connections are being
artificially limited. Even though TCP MTU probing can raise the MSS back up
we have seen this not to be the case causing connections to be "stuck" with
an MSS of 48 when heavy loss is present.

Prior to pushing out this change we could not keep TCP MTU probing enabled
b/c of the above reasons. Now with a reasonble floor set we've had it
enabled for the past 6 months.

The new sysctl will still default to TCP_MIN_SND_MSS (48), but gives
administrators the ability to control the floor of MSS probing.

Signed-off-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: remove pointless data_len arg from region snapshot create
Jiri Pirko [Fri, 9 Aug 2019 13:27:15 +0000 (15:27 +0200)]
devlink: remove pointless data_len arg from region snapshot create

The size of the snapshot has to be the same as the size of the region,
therefore no need to pass it again during snapshot creation. Remove the
arg and use region->size instead.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotcp: batch calls to sk_flush_backlog()
Eric Dumazet [Fri, 9 Aug 2019 12:04:47 +0000 (05:04 -0700)]
tcp: batch calls to sk_flush_backlog()

Starting from commit d41a69f1d390 ("tcp: make tcp_sendmsg() aware of socket backlog")
loopback flows got hurt, because for each skb sent, the socket receives an
immediate ACK and sk_flush_backlog() causes extra work.

Intent was to not let the backlog grow too much, but we went a bit too far.

We can check the backlog every 16 skbs (about 1MB chunks)
to increase TCP over loopback performance by about 15 %

Note that the call to sk_flush_backlog() handles a single ACK,
thanks to coalescing done on backlog, but cleans the 16 skbs
found in rtx rb-tree.

Reported-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoliquidio: Use pcie_flr() instead of reimplementing it
Denis Efremov [Thu, 8 Aug 2019 04:57:53 +0000 (07:57 +0300)]
liquidio: Use pcie_flr() instead of reimplementing it

octeon_mbox_process_cmd() directly writes the PCI_EXP_DEVCTL_BCR_FLR
bit, which bypasses timing requirements imposed by the PCIe spec.
This patch fixes the function to use the pcie_flr() interface instead.

Signed-off-by: Denis Efremov <efremov@linux.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agor8169: allocate rx buffers using alloc_pages_node
Heiner Kallweit [Wed, 7 Aug 2019 19:38:22 +0000 (21:38 +0200)]
r8169: allocate rx buffers using alloc_pages_node

We allocate 16kb per rx buffer, so we can avoid some overhead by using
alloc_pages_node directly instead of bothering kmalloc_node. Due to
this change buffers are page-aligned now, therefore the alignment check
can be removed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agofq_codel: remove set but not used variables 'prev_ecn_mark' and 'prev_drop_count'
YueHaibing [Wed, 7 Aug 2019 13:10:55 +0000 (21:10 +0800)]
fq_codel: remove set but not used variables 'prev_ecn_mark' and 'prev_drop_count'

Fixes gcc '-Wunused-but-set-variable' warning:

net/sched/sch_fq_codel.c: In function fq_codel_dequeue:
net/sched/sch_fq_codel.c:288:23: warning: variable prev_ecn_mark set but not used [-Wunused-but-set-variable]
net/sched/sch_fq_codel.c:288:6: warning: variable prev_drop_count set but not used [-Wunused-but-set-variable]

They are not used since commit 77ddaff218fc ("fq_codel: Kill
useless per-flow dropped statistic")

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlxsw: spectrum: Extend to support Spectrum-3 ASIC
Jiri Pirko [Wed, 7 Aug 2019 10:42:31 +0000 (13:42 +0300)]
mlxsw: spectrum: Extend to support Spectrum-3 ASIC

Extend existing driver for Spectrum and Spectrum-2 ASICs
to support Spectrum-3 ASIC as well.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'stmmac-next'
David S. Miller [Fri, 9 Aug 2019 05:20:19 +0000 (22:20 -0700)]
Merge branch 'stmmac-next'

Jose Abreu says:

====================
net: stmmac: Improvements for -next

[ This is just a rebase of v2 into latest -next in order to avoid a merge
conflict ]

Couple of improvements for -next tree. More info in commit logs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: stmmac: selftests: Add a selftest for Flexible RX Parser
Jose Abreu [Wed, 7 Aug 2019 08:03:18 +0000 (10:03 +0200)]
net: stmmac: selftests: Add a selftest for Flexible RX Parser

Add a selftest for the Flexible RX Parser feature.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: stmmac: Add Flexible RX Parser support in XGMAC
Jose Abreu [Wed, 7 Aug 2019 08:03:17 +0000 (10:03 +0200)]
net: stmmac: Add Flexible RX Parser support in XGMAC

XGMAC cores also support the Flexible RX Parser feature. Add the support
for it in the XGMAC core.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: stmmac: Implement Safety Features in XGMAC core
Jose Abreu [Wed, 7 Aug 2019 08:03:16 +0000 (10:03 +0200)]
net: stmmac: Implement Safety Features in XGMAC core

XGMAC also supports Safety Features. This patch implements the
configuration and handling of this feature in XGMAC core.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: stmmac: selftests: Add test for VLAN and Double VLAN Filtering
Jose Abreu [Wed, 7 Aug 2019 08:03:15 +0000 (10:03 +0200)]
net: stmmac: selftests: Add test for VLAN and Double VLAN Filtering

Add a selftest for VLAN and Double VLAN Filtering in stmmac.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: stmmac: Implement VLAN Hash Filtering in XGMAC
Jose Abreu [Wed, 7 Aug 2019 08:03:14 +0000 (10:03 +0200)]
net: stmmac: Implement VLAN Hash Filtering in XGMAC

Implement the VLAN Hash Filtering feature in XGMAC core.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: stmmac: selftests: Add RSS test
Jose Abreu [Wed, 7 Aug 2019 08:03:13 +0000 (10:03 +0200)]
net: stmmac: selftests: Add RSS test

Add a test for RSS in the stmmac selftests.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: stmmac: Implement RSS and enable it in XGMAC core
Jose Abreu [Wed, 7 Aug 2019 08:03:12 +0000 (10:03 +0200)]
net: stmmac: Implement RSS and enable it in XGMAC core

Implement the RSS functionality and add the corresponding callbacks in
XGMAC core.

Changes from v1:
- Do not use magic constants (Jakub)
- Use ethtool_rxfh_indir_default() (Jakub)

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: stmmac: xgmac: Implement tx_queue_prio()
Jose Abreu [Wed, 7 Aug 2019 08:03:11 +0000 (10:03 +0200)]
net: stmmac: xgmac: Implement tx_queue_prio()

Implement the TX Queue Priority callback in XGMAC core.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: stmmac: xgmac: Implement set_mtl_tx_queue_weight()
Jose Abreu [Wed, 7 Aug 2019 08:03:10 +0000 (10:03 +0200)]
net: stmmac: xgmac: Implement set_mtl_tx_queue_weight()

Implement the TX Queue Weight callback. In order for this to be active
we also need to set ETS algorithm when configuring Queue.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: stmmac: xgmac: Implement MMC counters
Jose Abreu [Wed, 7 Aug 2019 08:03:09 +0000 (10:03 +0200)]
net: stmmac: xgmac: Implement MMC counters

Implement the MMC counters feature in XGMAC core.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotipc: add loopback device tracking
John Rutherford [Wed, 7 Aug 2019 02:52:29 +0000 (12:52 +1000)]
tipc: add loopback device tracking

Since node internal messages are passed directly to the socket, it is not
possible to observe those messages via tcpdump or wireshark.

We now remedy this by making it possible to clone such messages and send
the clones to the loopback interface.  The clones are dropped at reception
and have no functional role except making the traffic visible.

The feature is enabled if network taps are active for the loopback device.
pcap filtering restrictions require the messages to be presented to the
receiving side of the loopback device.

v3 - Function dev_nit_active used to check for network taps.
   - Procedure netif_rx_ni used to send cloned messages to loopback device.

Signed-off-by: John Rutherford <john.rutherford@dektech.com.au>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'flow_offload-add-indr-block-in-nf_table_offload'
David S. Miller [Fri, 9 Aug 2019 01:44:31 +0000 (18:44 -0700)]
Merge branch 'flow_offload-add-indr-block-in-nf_table_offload'

wenxu says:

====================
flow_offload: add indr-block in nf_table_offload

This series patch make nftables offload support the vlan and
tunnel device offload through indr-block architecture.

The first four patches mv tc indr block to flow offload and
rename to flow-indr-block.
Because the new flow-indr-block can't get the tcf_block
directly. The fifth patch provide a callback list to get
flow_block of each subsystem immediately when the device
register and contain a block.
The last patch make nf_tables_offload support flow-indr-block.

This version add a mutex lock for add/del flow_indr_block_ing_cb
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonetfilter: nf_tables_offload: support indr block call
wenxu [Wed, 7 Aug 2019 01:13:54 +0000 (09:13 +0800)]
netfilter: nf_tables_offload: support indr block call

nftable support indr-block call. It makes nftable an offload vlan
and tunnel device.

nft add table netdev firewall
nft add chain netdev firewall aclout { type filter hook ingress offload device mlx_pf0vf0 priority - 300 \; }
nft add rule netdev firewall aclout ip daddr 10.0.0.1 fwd to vlan0
nft add chain netdev firewall aclin { type filter hook ingress device vlan0 priority - 300 \; }
nft add rule netdev firewall aclin ip daddr 10.0.0.7 fwd to mlx_pf0vf0

Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>