]> asedeno.scripts.mit.edu Git - linux.git/log
linux.git
5 years agoRDMA/hns: Add mtr support for mixed multihop addressing
Lijun Ou [Sat, 8 Jun 2019 06:46:08 +0000 (14:46 +0800)]
RDMA/hns: Add mtr support for mixed multihop addressing

Currently, the MTT(memory translate table) design required a buffer
space must has the same hopnum, but the hip08 hw can support mixed
hopnum config in a buffer space.

This patch adds the MTR(memory translate region) design for supporting
mixed multihop.

Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/netlink: Resort policy array
Doug Ledford [Wed, 19 Jun 2019 13:20:49 +0000 (09:20 -0400)]
RDMA/netlink: Resort policy array

Sort the netlink policy array by netlink attribute name.  This will make
it easier in the future to find the entry you are looking for when you
need to make changes, or to make sure you don't add the same entry
twice.

Fix the whitespace while we are there.

Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/mlx5: Enable decap and packet reformat on FDB
Maor Gottlieb [Wed, 12 Jun 2019 12:20:14 +0000 (15:20 +0300)]
RDMA/mlx5: Enable decap and packet reformat on FDB

If FDB flow tables support decap operation, enable it on creation,
This allows to perform decapsulation of tunnelled packets by steering
rules. If FDB flow tables support reformat operation, enable it on
creation as well.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/mlx5: Consider eswitch encap mode
Maor Gottlieb [Wed, 12 Jun 2019 12:20:13 +0000 (15:20 +0300)]
RDMA/mlx5: Consider eswitch encap mode

When flow steering is created, then the encap support should
consider the eswitch encap mode. If the eswitch flow table (FDB)
supports encap then it shouldn't be supported on NIC RX flow tables.

Fixes: 4adda1122c490 ('RDMA/mlx5: Enable decap and packet reformat on flow tables')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoMerge remote-tracking branch 'mlx5-next/mlx5-next' into HEAD
Doug Ledford [Wed, 19 Jun 2019 02:44:36 +0000 (22:44 -0400)]
Merge remote-tracking branch 'mlx5-next/mlx5-next' into HEAD

Take mlx5-next so we can take a dependent two patch series next.

Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/odp: Fix missed unlock in non-blocking invalidate_start
Jason Gunthorpe [Tue, 11 Jun 2019 16:09:51 +0000 (13:09 -0300)]
RDMA/odp: Fix missed unlock in non-blocking invalidate_start

If invalidate_start returns with EAGAIN then the umem_rwsem needs to be
unlocked as no invalidate_end will be called.

Cc: <stable@vger.kernel.org>
Fixes: ca748c39ea3f ("RDMA/umem: Get rid of per_mm->notifier_count")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoIB/hfi1: Spelling s/statisfied/satisfied/
Geert Uytterhoeven [Mon, 17 Jun 2019 14:01:38 +0000 (16:01 +0200)]
IB/hfi1: Spelling s/statisfied/satisfied/

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA: Report available cdevs through RDMA_NLDEV_CMD_GET_CHARDEV
Jason Gunthorpe [Fri, 14 Jun 2019 00:38:19 +0000 (21:38 -0300)]
RDMA: Report available cdevs through RDMA_NLDEV_CMD_GET_CHARDEV

Update the struct ib_client for all modules exporting cdevs related to the
ibdevice to also implement RDMA_NLDEV_CMD_GET_CHARDEV. All cdevs are now
autoloadable and discoverable by userspace over netlink instead of relying
on sysfs.

uverbs also exposes the DRIVER_ID for drivers that are able to support
driver id binding in rdma-core.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA: Add NLDEV_GET_CHARDEV to allow char dev discovery and autoload
Jason Gunthorpe [Fri, 14 Jun 2019 00:38:18 +0000 (21:38 -0300)]
RDMA: Add NLDEV_GET_CHARDEV to allow char dev discovery and autoload

Allow userspace to issue a netlink query against the ib_device for
something like "uverbs" and get back the char dev name, inode major/minor,
and interface ABI information for "uverbs0".

Since we are now in netlink this can also trigger a module autoload to
make the uverbs device come into existence.

Largely this will let us replace searching and reading inside sysfs to
setup devices, and provides an alternative (using driver_id) to device
name based provider binding for things like rxe.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA: Move rdma_node_type to uapi/
Jason Gunthorpe [Fri, 14 Jun 2019 00:38:17 +0000 (21:38 -0300)]
RDMA: Move rdma_node_type to uapi/

This enum is exposed over the sysfs file 'node_type' and over netlink via
RDMA_NLDEV_ATTR_DEV_NODE_TYPE, so declare it in the uapi headers.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agonet/mlx5: Expose eswitch encap mode
Maor Gottlieb [Wed, 12 Jun 2019 12:20:12 +0000 (15:20 +0300)]
net/mlx5: Expose eswitch encap mode

Add API to get the current Eswitch encap mode.
It will be used in downstream patches to check if
flow table can be created with encap support or not.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
5 years agonet/mlx5: Declare more strictly devlink encap mode
Leon Romanovsky [Wed, 12 Jun 2019 12:20:11 +0000 (15:20 +0300)]
net/mlx5: Declare more strictly devlink encap mode

Devlink has UAPI declaration for encap mode, so there is no
need to be loose on the data get/set by drivers.

Update call sites to use enum devlink_eswitch_encap_mode
instead of plain u8.

Suggested-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
5 years agonet/mlx5: Add EQ enable/disable API
Yuval Avnery [Mon, 10 Jun 2019 23:38:42 +0000 (23:38 +0000)]
net/mlx5: Add EQ enable/disable API

Previously, EQ joined the chain notifier on creation.
This forced the caller to be ready to handle events before creating
the EQ through eq_create_generic interface.

To help the caller control when the created EQ will be attached to the
IRQ, add enable/disable API.

Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Use a single IRQ for all async EQs
Ariel Levkovich [Mon, 10 Jun 2019 23:38:41 +0000 (23:38 +0000)]
net/mlx5: Use a single IRQ for all async EQs

The patch modifies the IRQ allocation so that all async EQs are
assigned to the same IRQ resulting in more available IRQs for
completion EQs.

The changes are using the support for IRQ sharing and EQ polling budget
that was introduced in previous patches so when the shared interrupt is
triggered, the kernel will serially call the handler of each of the
sharing EQs with a certain budget of EQEs to poll in order to prevent
starvation.

Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Rename mlx5_irq_info to mlx5_irq
Yuval Avnery [Mon, 10 Jun 2019 23:38:39 +0000 (23:38 +0000)]
net/mlx5: Rename mlx5_irq_info to mlx5_irq

struct mlx5_irq_info is an active object and not just info.

Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Move all IRQ logic to pci_irq.c
Yuval Avnery [Mon, 10 Jun 2019 23:38:37 +0000 (23:38 +0000)]
net/mlx5: Move all IRQ logic to pci_irq.c

Finalize IRQ separation and expose irq interface.

Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Generalize IRQ interface to work with irq_table
Yuval Avnery [Mon, 10 Jun 2019 23:38:34 +0000 (23:38 +0000)]
net/mlx5: Generalize IRQ interface to work with irq_table

IRQ interface should operate within the irq_table context.
It should be independent of any EQ data structure.

The interface that will be exposed:
init/clenup, create/destroy, attach/detach

Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Separate IRQ table creation from EQ table creation
Yuval Avnery [Mon, 10 Jun 2019 23:38:32 +0000 (23:38 +0000)]
net/mlx5: Separate IRQ table creation from EQ table creation

IRQ allocation should be part of the IRQ table life-cycle.

Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Move IRQ affinity set to IRQ allocation phase
Yuval Avnery [Mon, 10 Jun 2019 23:38:30 +0000 (23:38 +0000)]
net/mlx5: Move IRQ affinity set to IRQ allocation phase

Affinity set/clear is part of the IRQ life-cycle.

Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Move IRQ rmap creation to IRQ allocation phase
Yuval Avnery [Mon, 10 Jun 2019 23:38:28 +0000 (23:38 +0000)]
net/mlx5: Move IRQ rmap creation to IRQ allocation phase

Rmap creation/deletion is part of the IRQ life-cycle.

Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Separate IRQ data from EQ table data
Yuval Avnery [Mon, 10 Jun 2019 23:38:27 +0000 (23:38 +0000)]
net/mlx5: Separate IRQ data from EQ table data

IRQ table should only exist for mlx5_core_dev for PF and VF only.
EQ table of mediated devices should hold a pointer to the IRQ table
of the parent PCI device.

Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Separate IRQ request/free from EQ life cycle
Yuval Avnery [Mon, 10 Jun 2019 23:38:25 +0000 (23:38 +0000)]
net/mlx5: Separate IRQ request/free from EQ life cycle

Instead of requesting IRQ with eq creation, IRQs will be requested
before EQ table creation.
Instead of freeing the IRQs after EQ destroy, free IRQs after eq
table destroy.

Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Change interrupt handler to call chain notifier
Yuval Avnery [Mon, 10 Jun 2019 23:38:23 +0000 (23:38 +0000)]
net/mlx5: Change interrupt handler to call chain notifier

Multiple EQs may share the same IRQ in subsequent patches.

Instead of calling the IRQ handler directly, the EQ will register
to an atomic chain notfier.

The Linux built-in shared IRQ is not used because it forces the caller
to disable the IRQ and clear affinity before free_irq() can be called.

This patch is the first step in the separation of IRQ and EQ logic.

Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Introduce EQ polling budget
Yuval Avnery [Mon, 10 Jun 2019 23:38:21 +0000 (23:38 +0000)]
net/mlx5: Introduce EQ polling budget

Multiple EQs may share the same irq in subsequent patches.
To avoid starvation, a budget is set per EQ's interrupt handler.

Because of this change, it is no longer required to check that
MLX5_NUM_SPARE_EQE eqes were polled (to detect that arm is required).
It is guaranteed that MLX5_NUM_SPARE_EQE > budget, therefore the
handler will arm and exit the handler before all the entries in the
eq are polled.

In the scenario where the handler is out of budget and there are more
EQEs to poll, arming the EQ guarantees that the HW will send another
interrupt and the handler will be called again.

Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Support querying max VFs from device
Bodong Wang [Mon, 10 Jun 2019 23:38:19 +0000 (23:38 +0000)]
net/mlx5: Support querying max VFs from device

For ECPF with eswitch manager privilege, query the host max VF count
by querying the device using query_functions command.

With this enhancement:
1. flow steering entries are created only for valid vports based on
   the max VF count of the PF.
2. Driver only queries cap of valid vport.

Eswitch requires the max VFs when doing initialization, so do sr-iov
init before eswitch init.

Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: E-Switch, Return raw output for query esw functions
Bodong Wang [Mon, 10 Jun 2019 23:38:18 +0000 (23:38 +0000)]
net/mlx5: E-Switch, Return raw output for query esw functions

Current function only returns host num of VFs, later patch requires
other params such as host maximum num of VFs.

Return the raw output so that caller can extract info as needed.

Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: E-Switch, Handle representors creation in handler context
Vu Pham [Mon, 10 Jun 2019 23:38:16 +0000 (23:38 +0000)]
net/mlx5: E-Switch, Handle representors creation in handler context

Unified representors creation in esw_functions_changed context
handler. Emulate the esw_function_changed event for FW/HW that
does not support this event.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Increase wait time for fw initialization
Daniel Jurgens [Mon, 10 Jun 2019 23:38:14 +0000 (23:38 +0000)]
net/mlx5: Increase wait time for fw initialization

Firmware FLR happens sequentially, in some cases, like when destroying
a VM that had many VFs, may require waiting much longer than 10 seconds.
Increase the timeout to 2 minutes, and print a wait countdown status
every 20 seconds.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agordma: Remove nes
Jason Gunthorpe [Mon, 10 Jun 2019 19:49:11 +0000 (16:49 -0300)]
rdma: Remove nes

This driver was first merged over 10 years ago and has not seen major
activity by the authors in the last 7 years. However, in that time it has
been patched 150 times to adapt it to changing kernel APIs.

Further, the hardware has several issues, like not supporting 64 bit DMA,
that make it rather uninteresting for use with modern systems and RDMA.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/ipoib: Remove check for ETH_SS_TEST
Kamal Heib [Thu, 30 May 2019 13:18:17 +0000 (16:18 +0300)]
RDMA/ipoib: Remove check for ETH_SS_TEST

The default action for unlisted tests is "not-supported", so given that
ipoib doesn't support ETH_SS_TEST, there is no need to check for it
in the case statements, just let it get caught by the default: case.

Fixes: e3614bc9dc44 ("IB/ipoib: Add readout of statistics using ethtool")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA: Convert CQ allocations to be under core responsibility
Leon Romanovsky [Tue, 28 May 2019 11:37:29 +0000 (14:37 +0300)]
RDMA: Convert CQ allocations to be under core responsibility

Ensure that CQ is allocated and freed by IB/core and not by drivers.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Tested-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA: Clean destroy CQ in drivers do not return errors
Leon Romanovsky [Tue, 28 May 2019 11:37:28 +0000 (14:37 +0300)]
RDMA: Clean destroy CQ in drivers do not return errors

Like all other destroy commands, .destroy_cq() call is not supposed
to fail. In all flows, the attempt to return earlier caused to memory
leaks.

This patch converts .destroy_cq() to do not return any errors.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/nes: Avoid memory allocation during CQ destroy
Leon Romanovsky [Tue, 28 May 2019 11:37:27 +0000 (14:37 +0300)]
RDMA/nes: Avoid memory allocation during CQ destroy

The memory allocation call can fail and cause to early return
from nes_desotroy_cq() function. This situation will cause to
memory leak of struct nes_cq. Rewrite function to avoid memory
allocation.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA: Move owner into struct ib_device_ops
Jason Gunthorpe [Wed, 5 Jun 2019 17:39:26 +0000 (14:39 -0300)]
RDMA: Move owner into struct ib_device_ops

This more closely follows how other subsytems work, with owner being a
member of the structure containing the function pointers.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA: Move uverbs_abi_ver into struct ib_device_ops
Jason Gunthorpe [Wed, 5 Jun 2019 17:39:25 +0000 (14:39 -0300)]
RDMA: Move uverbs_abi_ver into struct ib_device_ops

No reason for every driver to emit code to set this, just make it part of
the driver's existing static const ops structure.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA: Move driver_id into struct ib_device_ops
Jason Gunthorpe [Wed, 5 Jun 2019 17:39:24 +0000 (14:39 -0300)]
RDMA: Move driver_id into struct ib_device_ops

No reason for every driver to emit code to set this, just make it part of
the driver's existing static const ops structure.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agordma: Delete the ib_ucm module
Jason Gunthorpe [Mon, 10 Jun 2019 18:02:01 +0000 (15:02 -0300)]
rdma: Delete the ib_ucm module

This has been marked CONFIG_BROKEN for over a year now with no complaints.
Delete the whole thing for good.

The module provided the /dev/infiniband/ucmX interface.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoucma: Convert ctx_idr to XArray
Matthew Wilcox [Thu, 21 Feb 2019 00:21:05 +0000 (16:21 -0800)]
ucma: Convert ctx_idr to XArray

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoucma: Convert multicast_idr to XArray
Matthew Wilcox [Thu, 21 Feb 2019 00:21:04 +0000 (16:21 -0800)]
ucma: Convert multicast_idr to XArray

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/ucma: Use struct_size() helper
Gustavo A. R. Silva [Tue, 4 Jun 2019 15:42:22 +0000 (10:42 -0500)]
RDMA/ucma: Use struct_size() helper

Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Bugfix for filling the sge of srq
Lijun Ou [Fri, 31 May 2019 10:28:03 +0000 (18:28 +0800)]
RDMA/hns: Bugfix for filling the sge of srq

When user post recv a srq with multiple sges, the hardware will get the
last correct sge and count the sge numbers according to the specific
identifier with lkey. For example, when the driver fills the sges with
every wr less than the max sge that the user configured when creating srq,
the hardware will stop getting the sge according to the specific lkey in
the sge. However, it will always end with the first sge in the current
post srq recv interface implementation.

Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: fix inverted logic of readl read and shift
Colin Ian King [Fri, 31 May 2019 09:21:01 +0000 (10:21 +0100)]
RDMA/hns: fix inverted logic of readl read and shift

A previous change incorrectly changed the inverted logic and logically
negated the readl rather than the shifted readl result. Fix this by
adding in missing parentheses around the expression that needs to be
logically negated.

Addresses-Coverity: ("Logically dead code")
Fixes: 669cefb654cb ("RDMA/hns: Remove jiffies operation in disable interrupt context")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/srp: Accept again source addresses that do not have a port number
Bart Van Assche [Wed, 29 May 2019 16:38:31 +0000 (09:38 -0700)]
RDMA/srp: Accept again source addresses that do not have a port number

The function srp_parse_in() is used both for parsing source address
specifications and for target address specifications. Target addresses
must have a port number. Having to specify a port number for source
addresses is inconvenient. Make sure that srp_parse_in() supports again
parsing addresses with no port number.

Cc: <stable@vger.kernel.org>
Fixes: c62adb7def71 ("IB/srp: Fix IPv6 address parsing")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/ipoib: implement ethtool .get_link() callback
Kamal Heib [Wed, 29 May 2019 13:55:45 +0000 (16:55 +0300)]
RDMA/ipoib: implement ethtool .get_link() callback

Add support for reporting link state for ipoib net devices.

$ ip l set dev mlx4_ib0 up
$ ethtool mlx4_ib0 | grep Link
Link detected: yes
$ ip l set dev mlx4_ib0 down
$ ethtool mlx4_ib0 | grep Link
Link detected: no

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years ago{IB,net}/mlx5: Constify rep ops functions pointers
Parav Pandit [Wed, 29 May 2019 22:50:41 +0000 (22:50 +0000)]
{IB,net}/mlx5: Constify rep ops functions pointers

Currently for every representor type and for every single vport,
representer function pointers copy is stored even though they don't
change from one to other vport.

Additionally priv data entry for the rep is not passed during
registration, but its copied. It is used (set and cleared) by the user
of the reps.

As we want to scale vports, to simplify and also to split constants
from data,

1. Rename mlx5_eswitch_rep_if to mlx5_eswitch_rep_ops as to match _ops
prefix with other standard netdev, ibdev ops.
2. Constify the IB and Ethernet rep ops structure.
3. Instead of storing copy of all rep function pointers, store copy
per eswitch rep type.
4. Split data and function pointers to mlx5_eswitch_rep_ops and
mlx5_eswitch_rep_data.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years ago{IB, net}/mlx5: No need to typecast from void* to mlx5_ib_dev*
Parav Pandit [Wed, 29 May 2019 22:50:39 +0000 (22:50 +0000)]
{IB, net}/mlx5: No need to typecast from void* to mlx5_ib_dev*

Avoid typecasting from void* to mlx5_ib_dev* or mlx5e_rep_priv*
as it is not needed.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: E-Switch, Honor eswitch functions changed event cap
Vu Pham [Wed, 29 May 2019 22:50:37 +0000 (22:50 +0000)]
net/mlx5: E-Switch, Honor eswitch functions changed event cap

Whenever device supports eswitch functions changed event, honor
such device setting. Do not limit it to ECPF.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: E-Switch, Replace host_params event with functions_changed event
Vu Pham [Wed, 29 May 2019 22:50:34 +0000 (22:50 +0000)]
net/mlx5: E-Switch, Replace host_params event with functions_changed event

To support sriov on a E-Switch manager, num_vfs are queried
to the firmware whenever E-Switch manager is notified by
esw_functions_changed event.

Replace host_params event with esw_functions_changed event that reflects
more appropriate naming.

While at it, also correct num_vfs type from int to u16 as expected by
the function mlx5_esw_query_functions().

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Introduce termination table bits
Eli Britstein [Wed, 29 May 2019 22:50:29 +0000 (22:50 +0000)]
net/mlx5: Introduce termination table bits

Termination table is a flow table with a termination flag. The flag
allows the firmware to assume that the the specified actions are the last
actions list. This assumption allows the FW to safely perform potential
looping logic (e.g. hairpin). Introduce the bits for this attribute.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Add core dump register access HW bits
Moshe Shemesh [Wed, 29 May 2019 22:50:24 +0000 (22:50 +0000)]
net/mlx5: Add core dump register access HW bits

Add Firmware core dump registers and HW definitions.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agoRDMA/hns: Bugfix for posting multiple srq work request
Lijun Ou [Thu, 30 May 2019 15:55:53 +0000 (23:55 +0800)]
RDMA/hns: Bugfix for posting multiple srq work request

When the user submits more than 32 work request to a srq queue
at a time, it needs to find the corresponding number of entries
in the bitmap in the idx queue. However, the original lookup
function named ffs only processes 32 bits of the array element,
When the number of srq wqe issued exceeds 32, the ffs will only
process the lower 32 bits of the elements, it will not be able
to get the correct wqe index for srq wqe.

Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/uverbs: check for allocation failure in uapi_add_elm()
Dan Carpenter [Thu, 30 May 2019 08:20:24 +0000 (11:20 +0300)]
RDMA/uverbs: check for allocation failure in uapi_add_elm()

If the kzalloc() fails then we should return ERR_PTR(-ENOMEM).  In the
current code it's possible that the kzalloc() fails and the
radix_tree_insert() inserts the NULL pointer successfully and we return
the NULL "elm" pointer to the caller.  That results in a NULL pointer
dereference.

Fixes: 9ed3e5f44772 ("IB/uverbs: Build the specs into a radix tree at runtime")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/hfi1: Use struct_size() helper
Gustavo A. R. Silva [Wed, 29 May 2019 15:15:28 +0000 (10:15 -0500)]
IB/hfi1: Use struct_size() helper

Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes, in particular in the
context in which this code is being used.

So, replace the following form:

sizeof(struct opa_port_status_rsp) + num_vls * sizeof(struct _vls_pctrs)

with:

struct_size(rsp, vls, num_vls)

and so on...

Also, notice that variable size is unnecessary, hence it is removed.

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/qib: Use struct_size() helper
Gustavo A. R. Silva [Wed, 29 May 2019 15:13:26 +0000 (10:13 -0500)]
IB/qib: Use struct_size() helper

Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes, in particular in the
context in which this code is being used.

So, replace the following form:

sizeof(*pkt) + sizeof(pkt->addr[0])*n

with:

struct_size(pkt, addr, n)

Also, notice that variable size is unnecessary, hence it is removed.

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/rdmavt: Use struct_size() helper
Gustavo A. R. Silva [Wed, 29 May 2019 15:12:48 +0000 (10:12 -0500)]
IB/rdmavt: Use struct_size() helper

Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes, in particular in the
context in which this code is being used.

So, replace the following form:

sizeof(struct rvt_sge) * init_attr->cap.max_send_sge + sizeof(struct rvt_swqe)

with:

struct_size(swq, sg_list, init_attr->cap.max_send_sge)

and so on...

Also, notice that variable size is unnecessary, hence it is removed.

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/efa: Remove unused includes
Gal Pressman [Tue, 28 May 2019 12:46:18 +0000 (15:46 +0300)]
RDMA/efa: Remove unused includes

Remove leftover includes that are no longer used from the driver.

Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/efa: Use rdma block iterator in chunk list creation
Gal Pressman [Tue, 28 May 2019 12:46:17 +0000 (15:46 +0300)]
RDMA/efa: Use rdma block iterator in chunk list creation

When creating the chunks list the rdma_for_each_block() iterator is used
in order to iterate over the payload in EFA_CHUNK_PAYLOAD_SIZE (device
defined) strides.

Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/efa: Remove unneeded admin commands abort flow
Gal Pressman [Tue, 28 May 2019 12:46:15 +0000 (15:46 +0300)]
RDMA/efa: Remove unneeded admin commands abort flow

The admin commands abort flow is buggy (use-after-free) and not really
necessary as it is guaranteed that after ib_unregister_device() is called
there are no user verbs threads running in parallel, delete it.

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/efa: Use kvzalloc instead of kzalloc with fallback
Gal Pressman [Tue, 28 May 2019 12:46:14 +0000 (15:46 +0300)]
RDMA/efa: Use kvzalloc instead of kzalloc with fallback

Use kvzalloc which attempts to allocate a physically continuous buffer and
fallbacks to virtually continuous on failure instead of open coding it in
the driver.

The is_vmalloc_addr function is used to determine whether the buffer is
physically continuous or not (which determines direct vs indirect MR
registration mode).

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/hfi1: Remove extra brackets from an if
Dennis Dalessandro [Fri, 24 May 2019 15:44:58 +0000 (11:44 -0400)]
IB/hfi1: Remove extra brackets from an if

A recent patch to hfi1 left behind a checkpatch error.

Fixes: fb24ea52f78e ("drivers: Remove explicit invocations of mmiowb()")
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agonet/mlx5: potential error pointer dereference in error handling
Dan Carpenter [Fri, 3 May 2019 12:28:39 +0000 (15:28 +0300)]
net/mlx5: potential error pointer dereference in error handling

The error handling was a bit flipped around.  If the mlx5_create_flow_group()
function failed then it would have resulted in dereferencing "fg" when
it was an error pointer.

Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA: Convert put_page() to put_user_page*()
John Hubbard [Sat, 25 May 2019 01:45:22 +0000 (18:45 -0700)]
RDMA: Convert put_page() to put_user_page*()

For infiniband code that retains pages via get_user_pages*(), release
those pages via the new put_user_page(), or put_user_pages*(), instead of
put_page()

This is a tiny part of the second step of fixing the problem described in
[1]. The steps are:

1) Provide put_user_page*() routines, intended to be used for releasing
   pages that were pinned via get_user_pages*().

2) Convert all of the call sites for get_user_pages*(), to invoke
   put_user_page*(), instead of put_page(). This involves dozens of call
   sites, and will take some time.

3) After (2) is complete, use get_user_pages*() and put_user_page*() to
   implement tracking of these pages. This tracking will be separate from
   the existing struct page refcounting.

4) Use the tracking and identification of these pages, to implement
   special handling (especially in writeback paths) when the pages are
   backed by a filesystem. Again, [1] provides details as to why that is
   desirable.

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/hfi1: Remove set but not used variables 'offset' and 'fspsn'
YueHaibing [Sat, 25 May 2019 12:57:37 +0000 (20:57 +0800)]
IB/hfi1: Remove set but not used variables 'offset' and 'fspsn'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/infiniband/hw/hfi1/tid_rdma.c: In function tid_rdma_rcv_error:
drivers/infiniband/hw/hfi1/tid_rdma.c:2029:7: warning: variable offset set but not used [-Wunused-but-set-variable]
drivers/infiniband/hw/hfi1/tid_rdma.c: In function hfi1_rc_rcv_tid_rdma_ack:
drivers/infiniband/hw/hfi1/tid_rdma.c:4555:35: warning: variable fspsn set but not used [-Wunused-but-set-variable]

'offset' is never used since introduction in
commit d0d564a1caac ("IB/hfi1: Add functions to receive TID RDMA READ request")

'fspsn' is never used since introduciotn in
commit 9e93e967f7b4 ("IB/hfi1: Add a function to receive TID RDMA ACK packet")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Replace magic numbers with #defines
Lijun Ou [Fri, 24 May 2019 15:29:36 +0000 (23:29 +0800)]
RDMA/hns: Replace magic numbers with #defines

This patch makes the code more readable by removing magic numbers.

Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Remove jiffies operation in disable interrupt context
Lang Cheng [Fri, 24 May 2019 07:31:23 +0000 (15:31 +0800)]
RDMA/hns: Remove jiffies operation in disable interrupt context

In some functions, the jiffies operation is unnecessary, and we can
control delay using mdelay and udelay functions only.  Especially, in
hns_roce_v1_clear_hem, the function calls spin_lock_irqsave, the context
disables interrupt, so we can not use jiffies and msleep functions.

Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Move spin_lock_irqsave to the correct place
Lang Cheng [Fri, 24 May 2019 07:31:22 +0000 (15:31 +0800)]
RDMA/hns: Move spin_lock_irqsave to the correct place

When hip08 set gid, it will call spin_unlock_bh when send cmq.  if main.ko
call spin_lock_irqsave firstly, and the kernel is before commit
f71b74bca637 ("irq/softirqs: Use lockdep to assert IRQs are
disabled/enabled"), it will cause WARN_ON_ONCE because of calling
spin_unlock_bh in disable context.

In fact, the spin_lock_irqsave in main.ko is only used for hip06, and
should be placed in hns_roce_hw_v1.c. hns_roce_hw_v2.c uses its own
spin_unlock_bh and does not need main.ko manage spin_lock.

Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Update CQE specifications
Lijun Ou [Fri, 24 May 2019 07:31:21 +0000 (15:31 +0800)]
RDMA/hns: Update CQE specifications

According to hip08 UM, the maximum number of CQEs supported by each CQ is
4M.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Remove unnecessary print message in aeq
Yixian Liu [Fri, 24 May 2019 07:31:20 +0000 (15:31 +0800)]
RDMA/hns: Remove unnecessary print message in aeq

There is no need to print when communication is established, especially
while lots of qp used by application.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoiw_cxgb4: Fix qpid leak
Nirranjan Kirubaharan [Thu, 23 May 2019 07:05:39 +0000 (00:05 -0700)]
iw_cxgb4: Fix qpid leak

Add await in destroy_qp() so that all references to qp are dereferenced
and qp is freed in destroy_qp() itself.  This ensures freeing of all QPs
before invocation of dealloc_ucontext(), which prevents loss of in use
qpids stored in the ucontext.

Signed-off-by: Nirranjan Kirubaharan <nirranjan@chelsio.com>
Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/cxgb4: Don't expose DMA addresses
Leon Romanovsky [Mon, 20 May 2019 06:54:32 +0000 (09:54 +0300)]
RDMA/cxgb4: Don't expose DMA addresses

Change unconditional print of DMA address to be printed with special
printk format type specifier.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/cxgb4: Use sizeof() notation
Leon Romanovsky [Mon, 20 May 2019 06:54:31 +0000 (09:54 +0300)]
RDMA/cxgb4: Use sizeof() notation

Convert various sizeof call sites to be written in standard format
sizeof().

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/cxgb3: Delete and properly mark unimplemented resize CQ function
Leon Romanovsky [Mon, 20 May 2019 06:54:30 +0000 (09:54 +0300)]
RDMA/cxgb3: Delete and properly mark unimplemented resize CQ function

Resize CQ implementation was guarded by undeclared "notyet" define while
cxgb3 was added to the kernel. Twelve years later, this call is still
unimplemented, so safely delete it and fix improper return error code when
.resize_cq() is not implemented.

Fixes: b038ced7b370 ("RDMA/cxgb3: Add driver for Chelsio T3 RNIC")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/cxgb3: Don't expose DMA addresses
Leon Romanovsky [Mon, 20 May 2019 06:54:29 +0000 (09:54 +0300)]
RDMA/cxgb3: Don't expose DMA addresses

DMA addresses like all other kernel addresses should be printed with
special %p* formatter. It is needed to allow control of exposure of such
information through a dedicated knob.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/cxgb3: Use sizeof() notation instead of plain sizeof
Leon Romanovsky [Mon, 20 May 2019 06:54:28 +0000 (09:54 +0300)]
RDMA/cxgb3: Use sizeof() notation instead of plain sizeof

sizeof(a), sizeof a and sizeof (a) are all valid notations, but first is
more readable format recommended by checkpatch.pl.

Let's canonize it in cxgb3 drivers, so latter patches won't emit
checkpatch warnings. As part of this change, a redundant memset() was
removed.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/efa: Remove check that prevents destroy of resources in error flows
Leon Romanovsky [Mon, 20 May 2019 06:54:22 +0000 (09:54 +0300)]
RDMA/efa: Remove check that prevents destroy of resources in error flows

Drivers cannot check the udata for validity when doing destroy as there
will be no way to report this error back to the uverbs.

Since udata is new for destroy no driver should start to use it - instead
drivers should opt for the ioctl interface and define it in a way where it
cannot fail due to incorrect data.

Remove the checks on udata construction so EFA is consistent with
everything else.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/nes: Remove second wait queue initialization call
Leon Romanovsky [Mon, 20 May 2019 06:54:25 +0000 (09:54 +0300)]
RDMA/nes: Remove second wait queue initialization call

The same wait queue is initialized a couple of lines above.

Fixes: 3c2d774cad5b ("RDMA/nes: Add a driver for NetEffect RNICs")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/i40iw: Remove useless NULL checks
Leon Romanovsky [Mon, 20 May 2019 06:54:24 +0000 (09:54 +0300)]
RDMA/i40iw: Remove useless NULL checks

There is no need to check existence of structures to be destroyed.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/nes: Remove useless NULL checks
Leon Romanovsky [Mon, 20 May 2019 06:54:23 +0000 (09:54 +0300)]
RDMA/nes: Remove useless NULL checks

The destroy functions are always called with relevant structs, there is no
need to check their existence.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/core: Make ib_destroy_cq() void
Leon Romanovsky [Mon, 20 May 2019 06:54:21 +0000 (09:54 +0300)]
RDMA/core: Make ib_destroy_cq() void

Kernel destroy CQ flows can't fail and the returned value of
ib_destroy_cq() is not interested in those flows.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/ipoib: Remove check of destroy CQ
Leon Romanovsky [Mon, 20 May 2019 06:54:20 +0000 (09:54 +0300)]
RDMA/ipoib: Remove check of destroy CQ

There are nothing to do from user side with knowledge that destroy CQ
fails.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agords: Don't check return value from destroy CQ
Leon Romanovsky [Mon, 20 May 2019 06:54:19 +0000 (09:54 +0300)]
rds: Don't check return value from destroy CQ

There is no value in checking ib_destroy_cq() result and skipping to clear
struct ic fields. This connection needs to be reinitialized anyway.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/core: Return void from ib_device_check_mandatory()
Kamal Heib [Tue, 21 May 2019 07:05:07 +0000 (10:05 +0300)]
RDMA/core: Return void from ib_device_check_mandatory()

The return value from ib_device_check_mandatory() is always 0 - change it
to be void.

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/mlx4: Delete unused func arg
Yuval Shaia [Sun, 19 May 2019 15:31:27 +0000 (18:31 +0300)]
IB/mlx4: Delete unused func arg

The function argument virt_addr is not in use - delete it.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/umem: Move page_shift from ib_umem to ib_odp_umem
Jason Gunthorpe [Mon, 20 May 2019 06:05:25 +0000 (09:05 +0300)]
RDMA/umem: Move page_shift from ib_umem to ib_odp_umem

This value has always been set to PAGE_SHIFT in the core code, the only
thing that does differently was the ODP path. Move the value into the ODP
struct and still use it for ODP, but change all the non-ODP things to just
use PAGE_SHIFT/PAGE_SIZE/PAGE_MASK directly.

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/qedr: Fix incorrect device rate.
Sagiv Ozeri [Mon, 20 May 2019 09:33:20 +0000 (12:33 +0300)]
RDMA/qedr: Fix incorrect device rate.

Use the correct enum value introduced in commit 12113a35ada6 ("IB/core:
Add HDR speed enum") Prior to this change a 50Gbps port would show 40Gbps.

This patch also cleaned up the redundant redefiniton of ib speeds for
qedr.

Fixes: 12113a35ada6 ("IB/core: Add HDR speed enum")
Signed-off-by: Sagiv Ozeri <sagiv.ozeri@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/core: Fix doc typo
Israel Rukshin [Wed, 15 May 2019 10:49:31 +0000 (13:49 +0300)]
RDMA/core: Fix doc typo

Use the correct function names.

Fixes: c4367a26357b ("IB: Pass uverbs_attr_bundle down ib_x destroy path")
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/rw: Add info regarding SG count failure
Max Gurtovoy [Wed, 15 May 2019 10:49:30 +0000 (13:49 +0300)]
RDMA/rw: Add info regarding SG count failure

Print the supported and wanted values for SG count during signature
operation.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/rw: Print the correct number of sig MRs
Israel Rukshin [Wed, 15 May 2019 10:49:29 +0000 (13:49 +0300)]
RDMA/rw: Print the correct number of sig MRs

A wrong value was printed in case of sig MR pool initialization failure.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/rw: Fix doc typo
Israel Rukshin [Wed, 15 May 2019 10:49:28 +0000 (13:49 +0300)]
RDMA/rw: Fix doc typo

Use the correct function name.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/isert: Remove unused sig_attrs argument
Israel Rukshin [Wed, 15 May 2019 10:49:27 +0000 (13:49 +0300)]
IB/isert: Remove unused sig_attrs argument

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/iser: Remove unused sig_attrs argument
Israel Rukshin [Wed, 15 May 2019 10:49:26 +0000 (13:49 +0300)]
IB/iser: Remove unused sig_attrs argument

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/iser: Refactor iscsi_iser_check_protection function
Israel Rukshin [Wed, 15 May 2019 10:49:25 +0000 (13:49 +0300)]
IB/iser: Refactor iscsi_iser_check_protection function

Reduce lines of code by using local variable.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoLinux 5.2-rc1 v5.2-rc1
Linus Torvalds [Sun, 19 May 2019 22:47:09 +0000 (15:47 -0700)]
Linux 5.2-rc1

5 years agoMerge tag 'upstream-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs
Linus Torvalds [Sun, 19 May 2019 22:22:03 +0000 (15:22 -0700)]
Merge tag 'upstream-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs

Pull UBIFS fixes from Richard Weinberger:

 - build errors wrt xattrs

 - mismerge which lead to a wrong Kconfig ifdef

 - missing endianness conversion

* tag 'upstream-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
  ubifs: Convert xattr inum to host order
  ubifs: Use correct config name for encryption
  ubifs: Fix build error without CONFIG_UBIFS_FS_XATTR

5 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sun, 19 May 2019 19:15:32 +0000 (12:15 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge yet more updates from Andrew Morton:
 "A few final bits:

   - large changes to vmalloc, yielding large performance benefits

   - tweak the console-flush-on-panic code

   - a few fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  panic: add an option to replay all the printk message in buffer
  initramfs: don't free a non-existent initrd
  fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount
  mm/compaction.c: correct zone boundary handling when isolating pages from a pageblock
  mm/vmap: add DEBUG_AUGMENT_LOWEST_MATCH_CHECK macro
  mm/vmap: add DEBUG_AUGMENT_PROPAGATE_CHECK macro
  mm/vmalloc.c: keep track of free blocks for vmap allocation

5 years agoMerge tag 'kbuild-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy...
Linus Torvalds [Sun, 19 May 2019 18:53:58 +0000 (11:53 -0700)]
Merge tag 'kbuild-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull more Kbuild updates from Masahiro Yamada:

 - remove unneeded use of cc-option, cc-disable-warning, cc-ldoption

 - exclude tracked files from .gitignore

 - re-enable -Wint-in-bool-context warning

 - refactor samples/Makefile

 - stop building immediately if syncconfig fails

 - do not sprinkle error messages when $(CC) does not exist

 - move arch/alpha/defconfig to the configs subdirectory

 - remove crappy header search path manipulation

 - add comment lines to .config to clarify the end of menu blocks

 - check uniqueness of module names (adding new warnings intentionally)

* tag 'kbuild-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (24 commits)
  kconfig: use 'else ifneq' for Makefile to improve readability
  kbuild: check uniqueness of module names
  kconfig: Terminate menu blocks with a comment in the generated config
  kbuild: add LICENSES to KBUILD_ALLDIRS
  kbuild: remove 'addtree' and 'flags' magic for header search paths
  treewide: prefix header search paths with $(srctree)/
  media: prefix header search paths with $(srctree)/
  media: remove unneeded header search paths
  alpha: move arch/alpha/defconfig to arch/alpha/configs/defconfig
  kbuild: terminate Kconfig when $(CC) or $(LD) is missing
  kbuild: turn auto.conf.cmd into a mandatory include file
  .gitignore: exclude .get_maintainer.ignore and .gitattributes
  kbuild: add all Clang-specific flags unconditionally
  kbuild: Don't try to add '-fcatch-undefined-behavior' flag
  kbuild: add some extra warning flags unconditionally
  kbuild: add -Wvla flag unconditionally
  arch: remove dangling asm-generic wrappers
  samples: guard sub-directories with CONFIG options
  kbuild: re-enable int-in-bool-context warning
  MAINTAINERS: kbuild: Add pattern for scripts/*vmlinux*
  ...

5 years agoMerge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Linus Torvalds [Sun, 19 May 2019 18:47:03 +0000 (11:47 -0700)]
Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c updates from Wolfram Sang:
 "Some I2C core API additions which are kind of simple but enhance error
  checking for users a lot, especially by returning errno now.

  There are wrappers to still support the old API but it will be removed
  once all users are converted"

* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: core: add device-managed version of i2c_new_dummy
  i2c: core: improve return value handling of i2c_new_device and i2c_new_dummy

5 years agoMerge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 19 May 2019 18:43:16 +0000 (11:43 -0700)]
Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Some bug fixes, and an update to the URL's for the final version of
  Unicode 12.1.0"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: avoid panic during forced reboot due to aborted journal
  ext4: fix block validity checks for journal inodes using indirect blocks
  unicode: update to Unicode 12.1.0 final
  unicode: add missing check for an error return from utf8lookup()
  ext4: fix miscellaneous sparse warnings
  ext4: unsigned int compared against zero
  ext4: fix use-after-free in dx_release()
  ext4: fix data corruption caused by overlapping unaligned and aligned IO
  jbd2: fix potential double free
  ext4: zero out the unused memory region in the extent tree block

5 years agoMerge tag '5.2-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 19 May 2019 18:38:18 +0000 (11:38 -0700)]
Merge tag '5.2-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Minor cleanup and fixes, one for stable, four rdma (smbdirect)
  related. Also adds SEEK_HOLE support"

* tag '5.2-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: add support for SEEK_DATA and SEEK_HOLE
  Fixed https://bugzilla.kernel.org/show_bug.cgi?id=202935 allow write on the same file
  cifs: Allocate memory for all iovs in smb2_ioctl
  cifs: Don't match port on SMBDirect transport
  cifs:smbd Use the correct DMA direction when sending data
  cifs:smbd When reconnecting to server, call smbd_destroy() after all MIDs have been called
  cifs: use the right include for signal_pending()
  smb3: trivial cleanup to smb2ops.c
  cifs: cleanup smb2ops.c and normalize strings
  smb3: display session id in debug data

5 years agoMerge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 19 May 2019 18:20:22 +0000 (11:20 -0700)]
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf tooling updates from Ingo Molnar:
 "perf.data:

   - Streaming compression of perf ring buffer into
     PERF_RECORD_COMPRESSED user space records, resulting in ~3-5x
     perf.data file size reduction on variety of tested workloads what
     saves storage space on larger server systems where perf.data size
     can easily reach several tens or even hundreds of GiBs, especially
     when profiling with DWARF-based stacks and tracing of context
     switches.

  perf record:

   - Improve -user-regs/intr-regs suggestions to overcome errors

  perf annotate:

   - Remove hist__account_cycles() from callback, speeding up branch
     processing (perf record -b)

  perf stat:

   - Add a 'percore' event qualifier, e.g.: -e
     cpu/event=0,umask=0x3,percore=1/, that sums up the event counts for
     both hardware threads in a core.

     We can already do this with --per-core, but it's often useful to do
     this together with other metrics that are collected per hardware
     thread.

     I.e. now its possible to do this per-event, and have it mixed with
     other events not aggregated by core.

  arm64:

   - Map Brahma-B53 CPUID to cortex-a53 events.

   - Add Cortex-A57 and Cortex-A72 events.

  csky:

   - Add DWARF register mappings for libdw, allowing --call-graph=dwarf
     to work on the C-SKY arch.

  x86:

   - Add support for recording and printing XMM registers, available,
     for instance, on Icelake.

   - Add uncore_upi (Intel's "Ultra Path Interconnect" events) JSON
     support. UPI replaced the Intel QuickPath Interconnect (QPI) in
     Xeon Skylake-SP.

  Intel PT:

   - Fix instructions sampling rate.

   - Timestamp fixes.

   - Improve exported-sql-viewer GUI, allowing, for instance, to
     copy'n'paste the trees, useful for e-mailing"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits)
  perf stat: Support 'percore' event qualifier
  perf stat: Factor out aggregate counts printing
  perf tools: Add a 'percore' event qualifier
  perf docs: Add description for stderr
  perf intel-pt: Fix sample timestamp wrt non-taken branches
  perf intel-pt: Fix improved sample timestamp
  perf intel-pt: Fix instructions sampling rate
  perf regs x86: Add X86 specific arch__intr_reg_mask()
  perf parse-regs: Add generic support for arch__intr/user_reg_mask()
  perf parse-regs: Split parse_regs
  perf vendor events arm64: Add Cortex-A57 and Cortex-A72 events
  perf vendor events arm64: Map Brahma-B53 CPUID to cortex-a53 events
  perf vendor events arm64: Remove [[:xdigit:]] wildcard
  perf jevents: Remove unused variable
  perf test zstd: Fixup verbose mode output
  perf tests: Implement Zstd comp/decomp integration test
  perf inject: Enable COMPRESSED record decompression
  perf report: Implement perf.data record decompression
  perf record: Implement -z,--compression_level[=<n>] option
  perf report: Add stub processing of compressed events for -D
  ...