]> asedeno.scripts.mit.edu Git - linux.git/log
linux.git
4 years agoRDMA/efa: Mask access flags with the correct optional range
Gal Pressman [Wed, 29 Jan 2020 07:18:03 +0000 (09:18 +0200)]
RDMA/efa: Mask access flags with the correct optional range

The uapi value IB_UVERBS_ACCESS_OPTIONAL_RANGE shouldn't be used inside
the driver, use IB_ACCESS_OPTIONAL instead.

Fixes: 86dd738cf20c ("RDMA/efa: Allow passing of optional access flags for MR registration")
Link: https://lore.kernel.org/r/20200129071803.40117-1-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cma: Fix unbalanced cm_id reference count during address resolve
Parav Pandit [Sun, 26 Jan 2020 14:26:46 +0000 (16:26 +0200)]
RDMA/cma: Fix unbalanced cm_id reference count during address resolve

Below commit missed the AF_IB and loopback code flow in
rdma_resolve_addr().  This leads to an unbalanced cm_id refcount in
cma_work_handler() which puts the refcount which was not incremented prior
to queuing the work.

A call trace is observed with such code flow:

 BUG: unable to handle kernel NULL pointer dereference at (null)
 [<ffffffff96b67e16>] __mutex_lock_slowpath+0x166/0x1d0
 [<ffffffff96b6715f>] mutex_lock+0x1f/0x2f
 [<ffffffffc0beabb5>] cma_work_handler+0x25/0xa0
 [<ffffffff964b9ebf>] process_one_work+0x17f/0x440
 [<ffffffff964baf56>] worker_thread+0x126/0x3c0

Hence, hold the cm_id reference when scheduling the resolve work item.

Fixes: 722c7b2bfead ("RDMA/{cma, core}: Avoid callback on rdma_addr_cancel()")
Link: https://lore.kernel.org/r/20200126142652.104803-2-leon@kernel.org
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/umem: Fix ib_umem_find_best_pgsz()
Artemy Kovalyov [Tue, 28 Jan 2020 13:56:12 +0000 (15:56 +0200)]
RDMA/umem: Fix ib_umem_find_best_pgsz()

Except for the last entry, the ending iova alignment sets the maximum
possible page size as the low bits of the iova must be zero when starting
the next chunk.

Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR")
Link: https://lore.kernel.org/r/20200128135612.174820-1-leon@kernel.org
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Tested-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/mlx4: Fix leak in id_map_find_del
Håkon Bugge [Thu, 23 Jan 2020 15:55:21 +0000 (16:55 +0100)]
IB/mlx4: Fix leak in id_map_find_del

Using CX-3 virtual functions, either from a bare-metal machine or
pass-through from a VM, MAD packets are proxied through the PF driver.

Since the VF drivers have separate name spaces for MAD Transaction Ids
(TIDs), the PF driver has to re-map the TIDs and keep the book keeping in
a cache.

Following the RDMA Connection Manager (CM) protocol, it is clear when an
entry has to evicted from the cache. When a DREP is sent from
mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similar when
a REJ is received by the mlx4_ib_demux_cm_handler(), id_map_find_del() is
called.

This function wipes out the TID in use from the IDR or XArray and removes
the id_map_entry from the table.

In short, it does everything except the topping of the cake, which is to
remove the entry from the list and free it. In other words, for the REJ
case enumerated above, one id_map_entry will be leaked.

For the other case above, a DREQ has been received first. The reception of
the DREQ will trigger queuing of a delayed work to delete the
id_map_entry, for the case where the VM doesn't send back a DREP.

In the normal case, the VM _will_ send back a DREP, and id_map_find_del()
will be called.

But this scenario introduces a secondary leak. First, when the DREQ is
received, a delayed work is queued. The VM will then return a DREP, which
will call id_map_find_del(). As stated above, this will free the TID used
from the XArray or IDR. Now, there is window where that particular TID can
be re-allocated, lets say by an outgoing REQ. This TID will later be wiped
out by the delayed work, when the function id_map_ent_timeout() is
called. But the id_map_entry allocated by the outgoing REQ will not be
de-allocated, and we have a leak.

Both leaks are fixed by removing the id_map_find_del() function and only
using schedule_delayed(). Of course, a check in schedule_delayed() to see
if the work already has been queued, has been added.

Another benefit of always using the delayed version for deleting entries,
is that we do get a TimeWait effect; a TID no longer in use, will occupy
the XArray or IDR for CM_CLEANUP_CACHE_TIMEOUT time, without any ability
of being re-used for that time period.

Fixes: 3cf69cc8dbeb ("IB/mlx4: Add CM paravirtualization")
Link: https://lore.kernel.org/r/20200123155521.1212288-1-haakon.bugge@oracle.com
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
Reviewed-by: Rama Nichanamatlu <rama.nichanamatlu@oracle.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/opa_vnic: Spelling correction of 'erorr' to 'error'
Dillon Brock [Sat, 18 Jan 2020 16:25:42 +0000 (11:25 -0500)]
IB/opa_vnic: Spelling correction of 'erorr' to 'error'

Correcting a minor spelling mistake in the comments.

Link: https://lore.kernel.org/r/20200118162542.15188-1-dab9861@gmail.com
Signed-off-by: Dillon Brock <dab9861@gmail.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Fix logical condition in msix_request_irq
Nathan Chancellor [Thu, 16 Jan 2020 22:26:58 +0000 (15:26 -0700)]
IB/hfi1: Fix logical condition in msix_request_irq

Clang warns:

drivers/infiniband/hw/hfi1/msix.c:136:22: warning: overlapping
comparisons always evaluate to false [-Wtautological-overlap-compare]
        if (type < IRQ_SDMA && type >= IRQ_OTHER)
            ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
1 warning generated.

It is impossible for something to be less than 0 (IRQ_SDMA) and greater
than or equal to 3 (IRQ_OTHER) at the same time. A logical OR should
have been used to keep the same logic as before.

Link: https://lore.kernel.org/r/20200116222658.5285-1-natechancellor@gmail.com
Link: https://github.com/ClangBuiltLinux/linux/issues/841
Fixes: 13d2a8384bd9 ("IB/hfi1: Decouple IRQ name from type")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Remove CM message structs
Jason Gunthorpe [Thu, 16 Jan 2020 17:00:37 +0000 (13:00 -0400)]
RDMA/cm: Remove CM message structs

All accesses now use the new IBA acessor scheme, so delete the structs
entirely and generate the structures from the schema file.

Link: https://lore.kernel.org/r/20200116170037.30109-8-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Use IBA functions for complex structure members
Jason Gunthorpe [Thu, 16 Jan 2020 17:00:36 +0000 (13:00 -0400)]
RDMA/cm: Use IBA functions for complex structure members

Use a Coccinelle spatch to replace CM structure members used as
structures, arrays, or pointers with IBA_GET/SET versions. Applied with

$ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c

The spatch file was generated using the template pattern:

@@
expression src;
expression len;
{struct} *msg;
@@
- memcpy(msg->{old_name}, src, len)
+ IBA_SET_MEM({new_name}, msg, src, len)
@@
{struct} *msg;
identifier x;
@@
- msg->{old_name}.x
+ IBA_GET_MEM_PTR({new_name}, msg)->x
@@
{struct} *msg;
@@
- &msg->{old_name}
+ IBA_GET_MEM_PTR({new_name}, msg)

For GIDs:
@@
{struct} *msg;
@@
- msg->{old_name}
+ *IBA_GET_MEM_PTR({new_name}, msg)

For non-GIDs:
@@
{struct} *msg;
@@
- msg->{old_name}
+ IBA_GET_MEM_PTR({new_name}, msg)

Iterated for every remaining IBA_CHECK_OFF()/IBA_CHECK_GET()
pairing. Touched up with clang-format after.

Link: https://lore.kernel.org/r/20200116170037.30109-7-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Use IBA functions for simple structure members
Jason Gunthorpe [Thu, 16 Jan 2020 17:00:35 +0000 (13:00 -0400)]
RDMA/cm: Use IBA functions for simple structure members

Use a Coccinelle spatch script to replace use of simple CM structure
members with IBA_GET/SET versions. Applied with

$ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c

The spatch file was generated using the template pattern:

@@
expression val;
{struct} *msg;
@@
- msg->{old_name} = val
+ IBA_SET({new_name}, msg, be{bits}_to_cpu(val))
@@
{struct} *msg;
@@
- msg->{old_name}
+ cpu_to_be{bits}(IBA_GET({new_name}, msg))

Iterated for every IBA_CHECK_OFF that isn't a CM_FIELD_MLOC.

And the below iterated over all byte sizes to remove doubled byte swaps:

@@
expression val;
@@
-be{bits}_to_cpu(cpu_to_be{bits}(val))
+val

(and __be_to_cpu and ntoh varients)

Touched up with clang-format after.

Link: https://lore.kernel.org/r/20200116170037.30109-6-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Use IBA functions for swapping get/set acessors
Jason Gunthorpe [Thu, 16 Jan 2020 17:00:34 +0000 (13:00 -0400)]
RDMA/cm: Use IBA functions for swapping get/set acessors

Use a Coccinelle spatch script to replace CM helper functions that
return/accept BE values with IBA_GET/SET versions. Applied with

$ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c

The spatch file was generated using the template pattern:

@@
expression val;
{struct} *msg;
@@
- {old_setter}(msg, val)
+ IBA_SET({new_name}, msg, be{bits}_to_cpu(val))
@@
{struct} *msg;
@@
- {old_getter}(msg)
+ cpu_to_be{bits}(IBA_GET({new_name}, msg))

Iterated for every IBA_CHECK_GET_BE()/IBA_CHECK_SET_BE() pairing.

And the below iterated over all byte sizes to remove doubled byte swaps:

@@
expression val;
@@
-be{bits}_to_cpu(cpu_to_be{bits}(val))
+val

(and __be_to_cpu and ntoh varients)

Touched up with clang-format after.

Link: https://lore.kernel.org/r/20200116170037.30109-5-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Use IBA functions for simple get/set acessors
Jason Gunthorpe [Thu, 16 Jan 2020 17:00:33 +0000 (13:00 -0400)]
RDMA/cm: Use IBA functions for simple get/set acessors

Use a Coccinelle spatch to replace CM helper functions with IBA_GET/SET
versions. Applied with

$ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c

The spatch file was generated using the template pattern:

@@
expression val;
{struct} *msg;
@@
- {old_setter}
+ IBA_SET({new_name}, msg, val)
@@
{struct} *msg;
@@
- {old_getter}
+ IBA_GET({new_name}, msg)

Iterated for every IBA_CHECK_GET()/IBA_CHECK_GET() pairing. Touched up
with clang-format after.

Link: https://lore.kernel.org/r/20200116170037.30109-4-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Add SET/GET implementations to hide IBA wire format
Leon Romanovsky [Thu, 16 Jan 2020 17:00:32 +0000 (13:00 -0400)]
RDMA/cm: Add SET/GET implementations to hide IBA wire format

There is no separation between RDMA-CM wire format as it is declared in
IBTA and kernel logic which implements needed support. Such situation
causes to many mistakes in conversion between big-endian (wire format)
and CPU format used by kernel. It also mixes RDMA core code with
combination of uXX and beXX variables.

The idea that all accesses to IBA definitions will go through special
GET/SET macros to ensure that no conversion mistakes are made. The
shifting and masking required to read the value is automatically deduced
using the field offset description from the tables in the IBA
specification.

This starts with the CM MADs described in IBTA release 1.3 volume 1.

To confirm that the new macros behave the same as the old accessors a
self-test is included in this patch.

Each macro replacing a straightforward struct field compile-time tests
that the new field has the same offsetof() and width as the old field.

For the fields with accessor functions a runtime test, the 'all ones'
value is placed in a dummy message and read back in several ways to
confirm that both approaches give identical results.

Later patches in this series delete the self test.

This creates a tested table of new field name, old field name(s) and some
meta information like BE coding for the functions which will be used in
the next patches.

Link: https://lore.kernel.org/r/20200116170037.30109-3-jgg@ziepe.ca
Link: https://lore.kernel.org/r/20191212093830.316934-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/cm: Add accessors for CM_REQ transport_type
Jason Gunthorpe [Thu, 16 Jan 2020 17:00:31 +0000 (13:00 -0400)]
RDMA/cm: Add accessors for CM_REQ transport_type

Access the two fields through wrappers, like all other fields, to make it
clearer what is happening.

Link: https://lore.kernel.org/r/20200116170037.30109-2-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/mlx5: Return the administrative GUID if exists
Danit Goldberg [Thu, 16 Jan 2020 12:00:48 +0000 (14:00 +0200)]
IB/mlx5: Return the administrative GUID if exists

A user can change the operational GUID (a.k.a affective GUID) through
link/infiniband. Therefore it is preferred to return the currently set
GUID if it exists instead of the operational.

This way the PF can query which VF GUID will be set in the next bind.  In
order to align with MAC address, zero is returned if administrative GUID
is not set.

For example, before setting administrative GUID:
 $ ip link show
 ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 qdisc mq state UP mode DEFAULT group default qlen 256
 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
 vf 0     link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff,
 spoof checking off, NODE_GUID 00:00:00:00:00:00:00:00, PORT_GUID 00:00:00:00:00:00:00:00, link-state auto, trust off, query_rss off

Then:

 $ ip link set ib0 vf 0 node_guid 11:00:af:21:cb:05:11:00
 $ ip link set ib0 vf 0 port_guid 22:11:af:21:cb:05:11:00

After setting administrative GUID:
 $ ip link show
 ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 qdisc mq state UP mode DEFAULT group default qlen 256
 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
 vf 0     link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff,
 spoof checking off, NODE_GUID 11:00:af:21:cb:05:11:00, PORT_GUID 22:11:af:21:cb:05:11:00, link-state auto, trust off, query_rss off

Fixes: 9c0015ef0928 ("IB/mlx5: Implement callbacks for getting VFs GUID attributes")
Link: https://lore.kernel.org/r/20200116120048.12744-1-leon@kernel.org
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Ensure that rdma_user_mmap_entry_remove() is a fence
Jason Gunthorpe [Wed, 15 Jan 2020 20:20:44 +0000 (20:20 +0000)]
RDMA/core: Ensure that rdma_user_mmap_entry_remove() is a fence

The set of entry->driver_removed is missing locking, protect it with
xa_lock() which is held by the only reader.

Otherwise readers may continue to see driver_removed = false after
rdma_user_mmap_entry_remove() returns and may continue to try and
establish new mmaps.

Fixes: 3411f9f01b76 ("RDMA/core: Create mmap database and cookie helper functions")
Link: https://lore.kernel.org/r/20200115202041.GA17199@ziepe.ca
Reviewed-by: Gal Pressman <galpress@amazon.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoMerge tag 'rds-odp-for-5.5' into rdma.git for-next
Jason Gunthorpe [Tue, 21 Jan 2020 13:55:04 +0000 (09:55 -0400)]
Merge tag 'rds-odp-for-5.5' into rdma.git for-next

From https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma

Leon Romanovsky says:

====================
Use ODP MRs for kernel ULPs

The following series extends MR creation routines to allow creation of
user MRs through kernel ULPs as a proxy. The immediate use case is to
allow RDS to work over FS-DAX, which requires ODP (on-demand-paging)
MRs to be created and such MRs were not possible to create prior this
series.

The first part of this patchset extends RDMA to have special verb
ib_reg_user_mr(). The common use case that uses this function is a
userspace application that allocates memory for HCA access but the
responsibility to register the memory at the HCA is on an kernel ULP.
This ULP acts as an agent for the userspace application.

The second part provides advise MR functionality for ULPs. This is
integral part of ODP flows and used to trigger pagefaults in advance
to prepare memory before running working set.

The third part is actual user of those in-kernel APIs.
====================

* tag 'rds-odp-for-5.5':
  net/rds: Use prefetch for On-Demand-Paging MR
  net/rds: Handle ODP mr registration/unregistration
  net/rds: Detect need of On-Demand-Paging memory registration
  RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths
  IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs
  RDMA/mlx5: Don't fake udata for kernel path
  IB/mlx5: Add ODP WQE handlers for kernel QPs
  IB/core: Add interface to advise_mr for kernel users
  IB/core: Introduce ib_reg_user_mr
  IB: Allow calls to ib_umem_get from kernel ULPs

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agonet/rds: Use prefetch for On-Demand-Paging MR
Hans Westgaard Ry [Wed, 15 Jan 2020 12:43:40 +0000 (14:43 +0200)]
net/rds: Use prefetch for On-Demand-Paging MR

Try prefetching pages when using On-Demand-Paging MR using
ib_advise_mr.

Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agonet/rds: Handle ODP mr registration/unregistration
Hans Westgaard Ry [Wed, 15 Jan 2020 12:43:39 +0000 (14:43 +0200)]
net/rds: Handle ODP mr registration/unregistration

On-Demand-Paging MRs are registered using ib_reg_user_mr and
unregistered with ib_dereg_mr.

Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoIB/mlx4: Fix memory leak in add_gid error flow
Jack Morgenstein [Wed, 15 Jan 2020 08:50:50 +0000 (10:50 +0200)]
IB/mlx4: Fix memory leak in add_gid error flow

In procedure mlx4_ib_add_gid(), if the driver is unable to update the FW
gid table, there is a memory leak in the driver's copy of the gid table:
the gid entry's context buffer is not freed.

If such an error occurs, free the entry's context buffer, and mark the
entry as available (by setting its context pointer to NULL).

Fixes: e26be1bfef81 ("IB/mlx4: Implement ib_device callbacks")
Link: https://lore.kernel.org/r/20200115085050.73746-1-leon@kernel.org
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/mlx5: Expose RoCE accelerator counters
Avihai Horon [Wed, 15 Jan 2020 14:54:59 +0000 (16:54 +0200)]
IB/mlx5: Expose RoCE accelerator counters

Introduce the following RoCE accelerator counters:
* roce_adp_retrans - number of adaptive retransmission for RoCE traffic.
* roce_adp_retrans_to - number of times RoCE traffic reached time out
  due to adaptive retransmission.
* roce_slow_restart - number of times RoCE slow restart was used.
* roce_slow_restart_cnps - number of times RoCE slow restart
  generate CNP packets.
* roce_slow_restart_trans - number of times RoCE slow restart change
  state to slow restart.

Link: https://lore.kernel.org/r/20200115145459.83280-3-leon@kernel.org
Signed-off-by: Avihai Horon <avihaih@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Set relaxed ordering when requested
Michael Guralnik [Wed, 8 Jan 2020 18:05:40 +0000 (20:05 +0200)]
RDMA/mlx5: Set relaxed ordering when requested

Enable relaxed ordering in the mkey context when requested. As relaxed
ordering is not currently supported in UMR, disable UMR usage for relaxed
ordering MRs.

Link: https://lore.kernel.org/r/1578506740-22188-11-git-send-email-yishaih@mellanox.com
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Add the core support field to METHOD_GET_CONTEXT
Michael Guralnik [Wed, 8 Jan 2020 18:05:39 +0000 (20:05 +0200)]
RDMA/core: Add the core support field to METHOD_GET_CONTEXT

Add the core support field to METHOD_GET_CONTEXT, this field should
represent capabilities that are not device-specific.

Return support for optional access flags for memory regions. User-space
will use this capability to mask the optional access flags for
unsupporting kernels.

Link: https://lore.kernel.org/r/1578506740-22188-10-git-send-email-yishaih@mellanox.com
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/uverbs: Add new relaxed ordering memory region access flag
Michael Guralnik [Wed, 8 Jan 2020 18:05:38 +0000 (20:05 +0200)]
RDMA/uverbs: Add new relaxed ordering memory region access flag

Add a new relaxed ordering access flag for memory regions.  Using memory
regions with relaxed ordeing set can enhance performance.

This access flag is handled in a best-effort manner, drivers should ignore
if they don't support setting relaxed ordering.

Link: https://lore.kernel.org/r/1578506740-22188-9-git-send-email-yishaih@mellanox.com
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/efa: Allow passing of optional access flags for MR registration
Michael Guralnik [Wed, 8 Jan 2020 18:05:37 +0000 (20:05 +0200)]
RDMA/efa: Allow passing of optional access flags for MR registration

As part of adding a range of optional access flags that drivers need to be
able to accept, mask this range inside efa driver.  This will prevent the
driver from failing when an access flag from that range is passed.

Link: https://lore.kernel.org/r/1578506740-22188-8-git-send-email-yishaih@mellanox.com
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Add optional access flags range
Michael Guralnik [Wed, 8 Jan 2020 18:05:36 +0000 (20:05 +0200)]
RDMA/core: Add optional access flags range

Define a range of access flags that are defined to be optional, both
uverbs and drivers should enable getting them and use if they are
applicable

This will be used, for example, for the relaxed ordering access flag which
unsupporting drivers can ignore.

Link: https://lore.kernel.org/r/1578506740-22188-7-git-send-email-yishaih@mellanox.com
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/uverbs: Verify MR access flags
Michael Guralnik [Wed, 8 Jan 2020 18:05:35 +0000 (20:05 +0200)]
RDMA/uverbs: Verify MR access flags

Verify that MR access flags that are passed from user are all supported
ones, otherwise an error is returned.

Fixes: 4fca03778351 ("IB/uverbs: Move ib_access_flags and ib_read_counters_flags to uapi")
Link: https://lore.kernel.org/r/1578506740-22188-6-git-send-email-yishaih@mellanox.com
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/uverbs: Add ioctl command to get a device context
Jason Gunthorpe [Wed, 8 Jan 2020 18:05:34 +0000 (20:05 +0200)]
RDMA/uverbs: Add ioctl command to get a device context

Allow future extensions of the get context command through the uverbs
ioctl kabi.

Unlike the uverbs version this does not return an async_fd as well, that
has to be done with another command.

Link: https://lore.kernel.org/r/1578506740-22188-5-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Remove ucontext_lock from the uverbs_destry_ufile_hw() path
Jason Gunthorpe [Wed, 8 Jan 2020 18:05:33 +0000 (20:05 +0200)]
RDMA/core: Remove ucontext_lock from the uverbs_destry_ufile_hw() path

This lock only serializes ucontext creation. Instead of checking the
ucontext_lock during destruction hold the existing hw_destroy_rwsem during
creation, which is the standard pattern for object creation.

The simplification of locking is needed for the next patch.

Link: https://lore.kernel.org/r/1578506740-22188-4-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Add UVERBS_METHOD_ASYNC_EVENT_ALLOC
Jason Gunthorpe [Wed, 8 Jan 2020 18:05:32 +0000 (20:05 +0200)]
RDMA/core: Add UVERBS_METHOD_ASYNC_EVENT_ALLOC

Allow the async FD to be allocated separately from the context.

This is necessary to introduce the ioctl to create a context, as an ioctl
should only ever create a single uobject at a time.

If multiple async FDs are created then the first one is used to deliver
affiliated events from any ib_uevent_object, with all subsequent ones will
receive only unaffiliated events.

Link: https://lore.kernel.org/r/1578506740-22188-3-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoMerge branch 'mlx5-next' into rdma.git for-next
Jason Gunthorpe [Thu, 16 Jan 2020 19:54:22 +0000 (15:54 -0400)]
Merge branch 'mlx5-next' into rdma.git for-next

From the mlx5-next branch at
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Merged due to dependencies in the next patches.

* branch 'mlx5-next':
  net/mlx5: Expose relaxed ordering bits
  net/mlx5: Add RoCE accelerator counters

4 years agonet/mlx5: Expose relaxed ordering bits
Michael Guralnik [Wed, 8 Jan 2020 18:05:31 +0000 (20:05 +0200)]
net/mlx5: Expose relaxed ordering bits

Expose relaxed ordering bits in HCA capability and mkey context structs.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agonet/mlx5: Add RoCE accelerator counters
Leon Romanovsky [Wed, 15 Jan 2020 14:54:58 +0000 (16:54 +0200)]
net/mlx5: Add RoCE accelerator counters

Add RoCE accelerator definitions.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agonet/rds: Detect need of On-Demand-Paging memory registration
Hans Westgaard Ry [Wed, 15 Jan 2020 12:43:38 +0000 (14:43 +0200)]
net/rds: Detect need of On-Demand-Paging memory registration

Add code to check if memory intended for RDMA is FS-DAX-memory. RDS
will fail with error code EOPNOTSUPP if FS-DAX-memory is detected.

Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoRDMA/mlx5: Fix handling of IOVA != user_va in ODP paths
Jason Gunthorpe [Wed, 15 Jan 2020 12:43:37 +0000 (14:43 +0200)]
RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths

Till recently it was not possible for userspace to specify a different
IOVA, but with the new ibv_reg_mr_iova() library call this can be done.

To compute the user_va we must compute:
  user_va = (iova - iova_start) + user_va_start

while being cautious of overflow and other math problems.

The iova is not reliably stored in the mmkey when the MR is created. Only
the cached creation path (the common one) set it, so it must also be set
when creating uncached MRs.

Fix the weird use of iova when computing the starting page index in the
MR. In the normal case, when iova == umem.address:
  iova & (~(BIT(page_shift) - 1)) ==
  ALIGN_DOWN(umem.address, odp->page_size) ==
  ib_umem_start(odp)

And when iova is different using it in math with a user_va is wrong.

Finally, do not allow an implicit ODP to be created with a non-zero IOVA
as we have no support for that.

Fixes: 7bdf65d411c1 ("IB/mlx5: Handle page faults")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoIB/mlx5: Mask out unsupported ODP capabilities for kernel QPs
Moni Shoua [Wed, 15 Jan 2020 12:43:36 +0000 (14:43 +0200)]
IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs

The ODP handler for WQEs in RQ or SRQ is not implented for kernel QPs.
Therefore don't report support in these if query comes from a kernel user.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoRDMA/mlx5: Don't fake udata for kernel path
Leon Romanovsky [Wed, 15 Jan 2020 12:43:35 +0000 (14:43 +0200)]
RDMA/mlx5: Don't fake udata for kernel path

Kernel paths must not set udata and provide NULL pointer,
instead of faking zeroed udata struct.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoIB/mlx5: Add ODP WQE handlers for kernel QPs
Moni Shoua [Wed, 15 Jan 2020 12:43:34 +0000 (14:43 +0200)]
IB/mlx5: Add ODP WQE handlers for kernel QPs

One of the steps in ODP page fault handler for WQEs is to read a WQE
from a QP send queue or receive queue buffer at a specific index.

Since the implementation of this buffer is different between kernel and
user QP the implementation of the handler needs to be aware of that and
handle it in a different way.

ODP for kernel MRs is currently supported only for RDMA_READ
and RDMA_WRITE operations so change the handler to
- read a WQE from a kernel QP send queue
- fail if access to receive queue or shared receive queue is
  required for a kernel QP

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoIB/core: Add interface to advise_mr for kernel users
Moni Shoua [Wed, 15 Jan 2020 12:43:33 +0000 (14:43 +0200)]
IB/core: Add interface to advise_mr for kernel users

Allow ULPs to call advise_mr, so they can control ODP regions
in the same way as user space applications.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoIB/core: Introduce ib_reg_user_mr
Moni Shoua [Wed, 15 Jan 2020 12:43:32 +0000 (14:43 +0200)]
IB/core: Introduce ib_reg_user_mr

Add ib_reg_user_mr() for kernel ULPs to register user MRs.

The common use case that uses this function is a userspace application
that allocates memory for HCA access but the responsibility to register
the memory at the HCA is on an kernel ULP. This ULP that acts as an agent
for the userspace application.

This function is intended to be used without a user context so vendor
drivers need to be aware of calling reg_user_mr() device operation with
udata equal to NULL.

Among all drivers, i40iw is the only driver which relies on presence
of udata, so check udata existence for that driver.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoIB: Allow calls to ib_umem_get from kernel ULPs
Moni Shoua [Wed, 15 Jan 2020 12:43:31 +0000 (14:43 +0200)]
IB: Allow calls to ib_umem_get from kernel ULPs

So far the assumption was that ib_umem_get() and ib_umem_odp_get()
are called from flows that start in UVERBS and therefore has a user
context. This assumption restricts flows that are initiated by ULPs
and need the service that ib_umem_get() provides.

This patch changes ib_umem_get() and ib_umem_odp_get() to get IB device
directly by relying on the fact that both UVERBS and ULPs sets that
field correctly.

Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoIB/srp: Never use immediate data if it is disabled by a user
Sergey Gorenko [Wed, 15 Jan 2020 13:30:55 +0000 (13:30 +0000)]
IB/srp: Never use immediate data if it is disabled by a user

Some SRP targets that do not support specification SRP-2, put the garbage
to the reserved bits of the SRP login response.  The problem was not
detected for a long time because the SRP initiator ignored those bits. But
now one of them is used as SRP_LOGIN_RSP_IMMED_SUPP. And it causes a
critical error on the target when the initiator sends immediate data.

The ib_srp module has a use_imm_date parameter to enable or disable
immediate data manually. But it does not help in the above case, because
use_imm_date is ignored at handling the SRP login response. The problem is
definitely caused by a bug on the target side, but the initiator's
behavior also does not look correct.  The initiator should not use
immediate data if use_imm_date is disabled by a user.

This commit adds an additional checking of use_imm_date at the handling of
SRP login response to avoid unexpected use of immediate data.

Fixes: 882981f4a411 ("RDMA/srp: Add support for immediate data")
Link: https://lore.kernel.org/r/20200115133055.30232-1-sergeygo@mellanox.com
Signed-off-by: Sergey Gorenko <sergeygo@mellanox.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/rxe: Compute the maximum sges and inline size based on the WQE size
Rao Shoaib [Tue, 14 Jan 2020 00:41:20 +0000 (16:41 -0800)]
RDMA/rxe: Compute the maximum sges and inline size based on the WQE size

The SGE buffer size and max_inline data should be derived from the size of
the WQE. Each value individually sets the WQE size, so compute the actual
sizes based on the actual WQE size and configure the QP with the maximums.

Also fix the missing return of the actual maximum capability to the caller.

Link: https://lore.kernel.org/r/1578962480-17814-3-git-send-email-rao.shoaib@oracle.com
Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIntroduce maximum WQE size to check limits
Rao Shoaib [Tue, 14 Jan 2020 00:41:19 +0000 (16:41 -0800)]
Introduce maximum WQE size to check limits

Introduce maximum WQE size to impose limits on max SGE's and inline data

Link: https://lore.kernel.org/r/1578962480-17814-2-git-send-email-rao.shoaib@oracle.com
Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/efa: Remove unused ucontext parameter from efa_qp_user_mmap_entries_remove
Gal Pressman [Tue, 14 Jan 2020 08:57:05 +0000 (10:57 +0200)]
RDMA/efa: Remove unused ucontext parameter from efa_qp_user_mmap_entries_remove

The ucontext parameter is unused, remove it.

Link: https://lore.kernel.org/r/20200114085706.82229-6-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/efa: Remove {} brackets from single statement if
Gal Pressman [Tue, 14 Jan 2020 08:57:04 +0000 (10:57 +0200)]
RDMA/efa: Remove {} brackets from single statement if

The {} brackets are not needed according to the Linux coding style.

Link: https://lore.kernel.org/r/20200114085706.82229-5-galpress@amazon.com
Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Firas JahJah <firasj@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/efa: Device definitions documentation updates
Gal Pressman [Tue, 14 Jan 2020 08:57:03 +0000 (10:57 +0200)]
RDMA/efa: Device definitions documentation updates

Various clarifications and updates to the documentation of the device
definitions.
No functional changes in this patch.

Link: https://lore.kernel.org/r/20200114085706.82229-4-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Add support for extended atomic in userspace
Jiaran Zhang [Wed, 15 Jan 2020 01:42:26 +0000 (09:42 +0800)]
RDMA/hns: Add support for extended atomic in userspace

To support extended atomic operations including cmp & swap and fetch & add
of 8 bytes, 16 bytes, 32 bytes, 64 bytes in userspace, some field in qpc
should be configured.

Link: https://lore.kernel.org/r/1579052546-11746-1-git-send-email-liweihang@huawei.com
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Get pf capabilities from firmware
Lijun Ou [Sat, 11 Jan 2020 10:32:41 +0000 (18:32 +0800)]
RDMA/hns: Get pf capabilities from firmware

Get pf capabilities from firmware according to different hardwares, if it
fails, all capabilities will be set with a default value.

Link: https://lore.kernel.org/r/1578738761-3176-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Add interfaces to get pf capabilities from firmware
Lijun Ou [Sat, 11 Jan 2020 10:32:40 +0000 (18:32 +0800)]
RDMA/hns: Add interfaces to get pf capabilities from firmware

pf capabilities are set by default for hip08 previously which should
depends on different types of hardware. So add new interfaces to get them
from firmware.

Link: https://lore.kernel.org/r/1578738761-3176-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Remove some redundant variables related to capabilities
Weihang Li [Sat, 11 Jan 2020 10:32:39 +0000 (18:32 +0800)]
RDMA/hns: Remove some redundant variables related to capabilities

In struct hns_roce_caps, max_srq_sg and max_srqwqes is unused, and
max_srqs has the same effect with num_srqs. So remove them from this
structrue.

Link: https://lore.kernel.org/r/1578738761-3176-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Use READ_ONCE for ib_ufile.async_file
Jason Gunthorpe [Wed, 8 Jan 2020 17:22:06 +0000 (19:22 +0200)]
RDMA/core: Use READ_ONCE for ib_ufile.async_file

The writer for async_file holds the ucontext_lock, while the readers are
left unlocked. Most readers rely on an implicit locking, either by having
a uobject (which cannot be created before a context) or by holding the
ib_ufile kref.

However ib_uverbs_free_hw_resources() has no implicit lock and has a
possible race. Make this all clear and sane by using READ_ONCE
consistently.

Link: https://lore.kernel.org/r/1578504126-9400-15-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Make ib_uverbs_async_event_file into a uobject
Jason Gunthorpe [Wed, 8 Jan 2020 17:22:05 +0000 (19:22 +0200)]
RDMA/core: Make ib_uverbs_async_event_file into a uobject

This makes async events aligned with completion events as both are full
uobjects of FD type and use the same uobject lifecycle.

A bunch of duplicate code is consolidated and the general flow between the
two FDs is now very similar.

Link: https://lore.kernel.org/r/1578504126-9400-14-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Remove the ufile arg from rdma_alloc_begin_uobject
Jason Gunthorpe [Wed, 8 Jan 2020 17:22:04 +0000 (19:22 +0200)]
RDMA/core: Remove the ufile arg from rdma_alloc_begin_uobject

Now that all callers provide a non-NULL attrs the ufile is redundant.
Adjust things so that the context handling is done inside alloc_uobj,
and the ib_uverbs_get_ucontext_file() is avoided if we already have the
context.

Link: https://lore.kernel.org/r/1578504126-9400-13-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Simplify type usage for ib_uverbs_async_handler()
Jason Gunthorpe [Wed, 8 Jan 2020 17:22:02 +0000 (19:22 +0200)]
RDMA/core: Simplify type usage for ib_uverbs_async_handler()

This function works on an ib_uverbs_async_file. Accept that as a parameter
instead of the struct ib_uverbs_file.

Consoldiate all the callers working from an ib_uevent_object to a single
function and locate the async_file directly from the struct ib_uobject
instead of using context_ptr.

Link: https://lore.kernel.org/r/1578504126-9400-11-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Do not erase the type of ib_wq.uobject
Jason Gunthorpe [Wed, 8 Jan 2020 17:22:01 +0000 (19:22 +0200)]
RDMA/core: Do not erase the type of ib_wq.uobject

This is a struct ib_uwq_object pointer, instead of using container_of()
all over the place just store it with its actual type.

Link: https://lore.kernel.org/r/1578504126-9400-10-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Do not erase the type of ib_srq.uobject
Jason Gunthorpe [Wed, 8 Jan 2020 17:22:00 +0000 (19:22 +0200)]
RDMA/core: Do not erase the type of ib_srq.uobject

This is a struct ib_usrq_object pointer, instead of using container_of()
all over the place just store it with its actual type.

Link: https://lore.kernel.org/r/1578504126-9400-9-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Do not erase the type of ib_qp.uobject
Jason Gunthorpe [Wed, 8 Jan 2020 17:21:59 +0000 (19:21 +0200)]
RDMA/core: Do not erase the type of ib_qp.uobject

This is a struct ib_uqp_object pointer, instead of using container_of()
all over the place just store it with its actual type.

Link: https://lore.kernel.org/r/1578504126-9400-8-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Do not erase the type of ib_cq.uobject
Jason Gunthorpe [Wed, 8 Jan 2020 17:21:58 +0000 (19:21 +0200)]
RDMA/core: Do not erase the type of ib_cq.uobject

This is a struct ib_ucq_object pointer, instead of using container_of()
all over the place just store it with its actual type.

Link: https://lore.kernel.org/r/1578504126-9400-7-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Make ib_ucq_object use ib_uevent_object
Jason Gunthorpe [Wed, 8 Jan 2020 17:21:57 +0000 (19:21 +0200)]
RDMA/core: Make ib_ucq_object use ib_uevent_object

Any uobject that sends events into the async_event_file should be using
ib_uevent_object so it can use the standard uevent based helper
functions. CQ pushes events into both the async_event and the comp_channel
in an open coded way. Move the async events related stuff to
ib_uevent_object.

Link: https://lore.kernel.org/r/1578504126-9400-6-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Do not allow alloc_commit to fail
Jason Gunthorpe [Wed, 8 Jan 2020 17:21:56 +0000 (19:21 +0200)]
RDMA/core: Do not allow alloc_commit to fail

This is a left over from an earlier version that creates a lot of
complexity for error unwind, particularly for FD uobjects.

The only reason this was done is so that anon_inode_get_file() could be
called with the final fops and a fully setup uobject. Both need to be
setup since unwinding anon_inode_get_file() via fput will call the
driver's release().

Now that the driver does not provide release, we no longer need to worry
about this complicated sequence, simply create the struct file at the
start and allow the core code's release function to deal with the abort
case.

This allows all the confusing error paths around commit to be removed.

Link: https://lore.kernel.org/r/1578504126-9400-5-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Simplify devx async commands
Jason Gunthorpe [Wed, 8 Jan 2020 17:21:55 +0000 (19:21 +0200)]
RDMA/mlx5: Simplify devx async commands

With the new FD structure the async commands do not need to hold any
references while running. The existing mlx5_cmd_exec_cb() and
mlx5_cmd_cleanup_async_ctx() provide enough synchronization to ensure
that all outstanding commands are completed before the uobject can be
destructed.

Remove the now confusing get_file() and the type erasure of the
devx_async_cmd_event_file.

Link: https://lore.kernel.org/r/1578504126-9400-4-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/core: Simplify destruction of FD uobjects
Jason Gunthorpe [Wed, 8 Jan 2020 17:21:54 +0000 (19:21 +0200)]
RDMA/core: Simplify destruction of FD uobjects

FD uobjects have a weird split between the struct file and uobject
world. Simplify this to make them pure uobjects and use a generic release
method for all struct file operations.

This fixes the control flow so that mlx5_cmd_cleanup_async_ctx() is always
called before erasing the linked list contents to make the concurrancy
simpler to understand.

For this to work the uobject destruction must fence anything that it is
cleaning up - the design must not rely on struct file lifetime.

Only deliver_event() relies on the struct file to when adding new events
to the queue, add a is_destroyed check under lock to block it.

Link: https://lore.kernel.org/r/1578504126-9400-3-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/mlx5: Use RCU and direct refcounts to keep memory alive
Jason Gunthorpe [Wed, 8 Jan 2020 17:21:53 +0000 (19:21 +0200)]
RDMA/mlx5: Use RCU and direct refcounts to keep memory alive

dispatch_event_fd() runs from a notifier with minimal locking, and relies
on RCU and a file refcount to keep the uobject and eventfd alive.

As the next patch wants to remove the file_operations release function
from the drivers, re-organize things so that the devx_event_notifier()
path uses the existing RCU to manage the lifetime of the uobject and
eventfd.

Move the refcount puts to a call_rcu so that the objects are guaranteed to
exist and remove the indirect file refcount.

Link: https://lore.kernel.org/r/1578504126-9400-2-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/uverbs: Remove needs_kfree_rcu from uverbs_obj_type_class
Jason Gunthorpe [Mon, 13 Jan 2020 14:33:10 +0000 (14:33 +0000)]
RDMA/uverbs: Remove needs_kfree_rcu from uverbs_obj_type_class

After device disassociation the uapi_objects are destroyed and freed,
however it is still possible that core code can be holding a kref on the
uobject. When it finally goes to uverbs_uobject_free() via the kref_put()
it can trigger a use-after-free on the uapi_object.

Since needs_kfree_rcu is a micro optimization that only benefits file
uobjects, just get rid of it. There is no harm in using kfree_rcu even if
it isn't required, and the number of involved objects is small.

Link: https://lore.kernel.org/r/20200113143306.GA28717@ziepe.ca
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoLinux 5.5-rc6 v5.5-rc6
Linus Torvalds [Mon, 13 Jan 2020 00:55:08 +0000 (16:55 -0800)]
Linux 5.5-rc6

4 years agoMerge tag 'riscv/for-v5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv...
Linus Torvalds [Mon, 13 Jan 2020 00:48:39 +0000 (16:48 -0800)]
Merge tag 'riscv/for-v5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Paul Walmsley:
 "Two fixes for RISC-V:

   - Clear FP registers during boot when FP support is present, rather
     than when they aren't present

   - Move the header files associated with the SiFive L2 cache
     controller to drivers/soc (where the code was recently moved)"

* tag 'riscv/for-v5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fixup obvious bug for fp-regs reset
  riscv: move sifive_l2_cache.h to include/soc

4 years agoIB/mlx5: Add mmap support for VAR
Yishai Hadas [Thu, 12 Dec 2019 11:09:28 +0000 (13:09 +0200)]
IB/mlx5: Add mmap support for VAR

Add mmap support for VAR, it uses the 'offset' command mode with
involvement of IB core APIs to find the previously allocated mmap entry.

Link: https://lore.kernel.org/r/20191212110928.334995-6-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/mlx5: Introduce VAR object and its alloc/destroy methods
Yishai Hadas [Thu, 12 Dec 2019 11:09:27 +0000 (13:09 +0200)]
IB/mlx5: Introduce VAR object and its alloc/destroy methods

Introduce VAR object and its alloc/destroy KABI methods. The internal
implementation uses the IB core API to manage mmap/munamp calls.

Link: https://lore.kernel.org/r/20191212110928.334995-5-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/mlx5: Extend caps stage to handle VAR capabilities
Yishai Hadas [Thu, 12 Dec 2019 11:09:26 +0000 (13:09 +0200)]
IB/mlx5: Extend caps stage to handle VAR capabilities

Extend caps stage to handle VAR capabilities.

Link: https://lore.kernel.org/r/20191212110928.334995-4-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoMerge branch 'mlx5_vdpa' into rdma.git for-next
Jason Gunthorpe [Sun, 12 Jan 2020 23:46:19 +0000 (19:46 -0400)]
Merge branch 'mlx5_vdpa' into rdma.git for-next

From the mlx5-next branch at
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Merged due to dependencies in the next patches.

* branch 'mlx5_vdpa':
  net/mlx5: Expose vDPA emulation device capabilities
  net/mlx5: Add Virtio Emulation related device capabilities

4 years agoriscv: Fixup obvious bug for fp-regs reset
Guo Ren [Sun, 5 Jan 2020 02:52:14 +0000 (10:52 +0800)]
riscv: Fixup obvious bug for fp-regs reset

CSR_MISA is defined in Privileged Architectures' spec: 3.1.1 Machine
ISA Register misa. Every bit:1 indicate a feature, so we should beqz
reset_done when there is no F/D bit in csr_misa register.

Signed-off-by: Guo Ren <ren_guo@c-sky.com>
[paul.walmsley@sifive.com: fix typo in commit message]
Fixes: 9e80635619b51 ("riscv: clear the instruction cache and all registers when booting")
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
4 years agoriscv: move sifive_l2_cache.h to include/soc
Yash Shah [Wed, 8 Jan 2020 06:09:06 +0000 (22:09 -0800)]
riscv: move sifive_l2_cache.h to include/soc

The commit 9209fb51896f ("riscv: move sifive_l2_cache.c to drivers/soc")
moves the sifive L2 cache driver to driver/soc. It did not move the
header file along with the driver. Therefore this patch moves the header
file to driver/soc

Signed-off-by: Yash Shah <yash.shah@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: updated to fix the include guard]
Fixes: 9209fb51896f ("riscv: move sifive_l2_cache.c to drivers/soc")
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
4 years agoMerge tag 'iommu-fixes-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 12 Jan 2020 17:35:42 +0000 (09:35 -0800)]
Merge tag 'iommu-fixes-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - Two fixes for VT-d and generic IOMMU code to fix teardown on error
   handling code paths.

 - Patch for the Intel VT-d driver to fix handling of non-PCI devices

 - Fix W=1 compile warning in dma-iommu code

* tag 'iommu-fixes-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/dma: fix variable 'cookie' set but not used
  iommu/vt-d: Unlink device if failed to add to group
  iommu: Remove device link to group on failure
  iommu/vt-d: Fix adding non-PCI devices to Intel IOMMU

4 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Linus Torvalds [Sat, 11 Jan 2020 23:40:43 +0000 (15:40 -0800)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Two driver bugfixes, a documentation fix, and a removal of a spec
  violation for the bus recovery algorithm in the core"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: fix bus recovery stop mode timing
  i2c: bcm2835: Store pointer to bus clock
  dt-bindings: i2c: at91: fix i2c-sda-hold-time-ns documentation for sam9x60
  i2c: at91: fix clk_offset for sam9x60

4 years agoMerge tag 'clone3-tls-v5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 11 Jan 2020 23:33:48 +0000 (15:33 -0800)]
Merge tag 'clone3-tls-v5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull thread fixes from Christian Brauner:
 "This contains a series of patches to fix CLONE_SETTLS when used with
  clone3().

  The clone3() syscall passes the tls argument through struct clone_args
  instead of a register. This means, all architectures that do not
  implement copy_thread_tls() but still support CLONE_SETTLS via
  copy_thread() expecting the tls to be located in a register argument
  based on clone() are currently unfortunately broken. Their tls value
  will be garbage.

  The patch series fixes this on all architectures that currently define
  __ARCH_WANT_SYS_CLONE3. It also adds a compile-time check to ensure
  that any architecture that enables clone3() in the future is forced to
  also implement copy_thread_tls().

  My ultimate goal is to get rid of the copy_thread()/copy_thread_tls()
  split and just have copy_thread_tls() at some point in the not too
  distant future (Maybe even renaming copy_thread_tls() back to simply
  copy_thread() once the old function is ripped from all arches). This
  is dependent now on all arches supporting clone3().

  While all relevant arches do that now there are still four missing:
  ia64, m68k, sh and sparc. They have the system call reserved, but not
  implemented. Once they all implement clone3() we can get rid of
  ARCH_WANT_SYS_CLONE3 and HAVE_COPY_THREAD_TLS.

  This series also includes a minor fix for the arm64 uapi headers which
  caused __NR_clone3 to be missing from the exported user headers.

  Unfortunately the series came in a little late especially given that
  it touches a range of architectures. Due to the holidays not all arch
  maintainers responded in time probably due to their backlog. Will and
  Arnd have thankfully acked the arm specific changes.

  Given that the changes are straightforward and rather minimal combined
  with the fact the that clone3() with CLONE_SETTLS is broken I decided
  to send them post rc3 nonetheless"

* tag 'clone3-tls-v5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  um: Implement copy_thread_tls
  clone3: ensure copy_thread_tls is implemented
  xtensa: Implement copy_thread_tls
  riscv: Implement copy_thread_tls
  parisc: Implement copy_thread_tls
  arm: Implement copy_thread_tls
  arm64: Implement copy_thread_tls
  arm64: Move __ARCH_WANT_SYS_CLONE3 definition to uapi headers

4 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Linus Torvalds [Fri, 10 Jan 2020 21:41:16 +0000 (13:41 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fix from Jiri Kosina:
 "A regression fix for EPOLLOUT handling in hidraw and uhid"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: hidraw, uhid: Always report EPOLLOUT

4 years agoMerge tag 'usb-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Fri, 10 Jan 2020 21:29:40 +0000 (13:29 -0800)]
Merge tag 'usb-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB/PHY fixes from Greg KH:
 "Here are a number of USB and PHY driver fixes for 5.5-rc6

  Nothing all that unusual, just the a bunch of small fixes for a lot of
  different reported issues. The PHY driver fixes are in here as they
  interacted with the usb drivers.

  Full details of the patches are in the shortlog, and all of these have
  been in linux-next with no reported issues"

* tag 'usb-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (24 commits)
  usb: missing parentheses in USE_NEW_SCHEME
  usb: ohci-da8xx: ensure error return on variable error is set
  usb: musb: Disable pullup at init
  usb: musb: fix idling for suspend after disconnect interrupt
  usb: typec: ucsi: Fix the notification bit offsets
  USB: Fix: Don't skip endpoint descriptors with maxpacket=0
  USB-PD tcpm: bad warning+size, PPS adapters
  phy/rockchip: inno-hdmi: round clock rate down to closest 1000 Hz
  usb: chipidea: host: Disable port power only if previously enabled
  usb: cdns3: should not use the same dev_id for shared interrupt handler
  usb: dwc3: gadget: Fix request complete check
  usb: musb: dma: Correct parameter passed to IRQ handler
  usb: musb: jz4740: Silence error if code is -EPROBE_DEFER
  usb: udc: tegra: select USB_ROLE_SWITCH
  USB: core: fix check for duplicate endpoints
  phy: cpcap-usb: Drop extra write to usb2 register
  phy: cpcap-usb: Improve host vs docked mode detection
  phy: cpcap-usb: Prevent USB line glitches from waking up modem
  phy: mapphone-mdm6600: Fix uninitialized status value regression
  phy: cpcap-usb: Fix flakey host idling and enumerating of devices
  ...

4 years agoMerge tag 'char-misc-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 10 Jan 2020 21:25:24 +0000 (13:25 -0800)]
Merge tag 'char-misc-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc fix from Greg KH:
 "Here is a single fix, for the chrdev core, for 5.5-rc6

  There's been a long-standing race condition triggered by syzbot, and
  occasionally real people, in the chrdev open() path. Will finally took
  the time to track it down and fix it for real before the holidays.

  Here's that one patch, it's been in linux-next for a while with no
  reported issues and it does fix the reported problem"

* tag 'char-misc-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  chardev: Avoid potential use-after-free in 'chrdev_open()'

4 years agoMerge tag 'staging-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 10 Jan 2020 21:22:11 +0000 (13:22 -0800)]
Merge tag 'staging-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging fixes from Greg KH:
 "Here are some small staging driver fixes for 5.5-rc6.

  Nothing major here, just some small fixes for a comedi driver, the
  vt6656 driver, and a new device id for the rtl8188eu driver.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'staging-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: rtl8188eu: Add device code for TP-Link TL-WN727N v5.21
  staging: comedi: adv_pci1710: fix AI channels 16-31 for PCI-1713
  staging: vt6656: set usb_set_intfdata on driver fail.
  staging: vt6656: remove bool from vnt_radio_power_on ret
  staging: vt6656: limit reg output to block size
  staging: vt6656: correct return of vnt_init_registers.
  staging: vt6656: Fix non zero logical return of, usb_control_msg

4 years agoMerge tag 'tty-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Fri, 10 Jan 2020 21:17:21 +0000 (13:17 -0800)]
Merge tag 'tty-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are two tty/serial driver fixes for 5.5-rc6.

  The first fixes a much much reported issue with a previous tty port
  link patch that is in your tree, and the second fixes a problem where
  the serdev driver would claim ACPI devices that it shouldn't be
  claiming.

  Both have been in linux-next for a while with no reported issues"

* tag 'tty-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serdev: Don't claim unsupported ACPI serial devices
  tty: always relink the port

4 years agoMerge tag 'block-5.5-2020-01-10' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 10 Jan 2020 20:05:26 +0000 (12:05 -0800)]
Merge tag 'block-5.5-2020-01-10' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A few fixes that should go into this round.

  This pull request contains two NVMe fixes via Keith, removal of a dead
  function, and a fix for the bio op for read truncates (Ming)"

* tag 'block-5.5-2020-01-10' of git://git.kernel.dk/linux-block:
  nvmet: fix per feat data len for get_feature
  nvme: Translate more status codes to blk_status_t
  fs: move guard_bio_eod() after bio_set_op_attrs
  block: remove unused mp_bvec_last_segment

4 years agoMerge tag 'io_uring-5.5-2020-01-10' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 10 Jan 2020 20:03:12 +0000 (12:03 -0800)]
Merge tag 'io_uring-5.5-2020-01-10' of git://git.kernel.dk/linux-block

Pull io_uring fix from Jens Axboe:
 "Single fix for this series, fixing a regression with the short read
  handling.

  This just removes it, as it cannot safely be done for all cases"

* tag 'io_uring-5.5-2020-01-10' of git://git.kernel.dk/linux-block:
  io_uring: remove punt of short reads to async context

4 years agoMerge tag 'mtd/fixes-for-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 10 Jan 2020 19:57:10 +0000 (11:57 -0800)]
Merge tag 'mtd/fixes-for-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux

Pull MTD fixes from Miquel Raynal:
 "MTD:
   - sm_ftl: Fix NULL pointer warning.

  Raw NAND:
   - Cadence: fix compile testing.
   - STM32: Avoid locking.

  Onenand:
   - Fix several sparse/build warnings.

  SPI-NOR:
   - Add a flag to fix interaction with Micron parts"

* tag 'mtd/fixes-for-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: spi-nor: Fix the writing of the Status Register on micron flashes
  mtd: sm_ftl: fix NULL pointer warning
  mtd: onenand: omap2: Pass correct flags for prep_dma_memcpy
  mtd: onenand: samsung: Fix iomem access with regular memcpy
  mtd: onenand: omap2: Fix errors in style
  mtd: cadence: Fix cast to pointer from integer of different size warning
  mtd: rawnand: stm32_fmc2: avoid to lock the CPU bus

4 years agoMerge tag 'sound-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds [Fri, 10 Jan 2020 19:52:36 +0000 (11:52 -0800)]
Merge tag 'sound-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A few piled ASoC fixes and usual HD-audio and USB-audio fixups. Some
  of them are for ASoC core error-handling"

* tag 'sound-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda: enable regmap internal locking
  ALSA: hda/realtek - Add quirk for the bass speaker on Lenovo Yoga X1 7th gen
  ALSA: hda/realtek - Set EAPD control to default for ALC222
  ALSA: usb-audio: Apply the sample rate quirk for Bose Companion 5
  ALSA: hda/realtek - Add new codec supported for ALCS1200A
  ASoC: Intel: boards: Fix compile-testing RT1011/RT5682
  ASoC: SOF: imx8: Fix dsp_box offset
  ASoC: topology: Prevent use-after-free in snd_soc_get_pcm_runtime()
  ASoC: fsl_audmix: add missed pm_runtime_disable
  ASoC: stm32: spdifrx: fix input pin state management
  ASoC: stm32: spdifrx: fix race condition in irq handler
  ASoC: stm32: spdifrx: fix inconsistent lock state
  ASoC: core: Fix access to uninitialized list heads
  ASoC: soc-core: Set dpcm_playback / dpcm_capture
  ASoC: SOF: imx8: fix memory allocation failure check on priv->pd_dev
  ASoC: SOF: Intel: hda: hda-dai: fix oops on hda_link .hw_free
  ASoC: SOF: fix fault at driver unload after failed probe

4 years agoMerge tag 'thermal-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal...
Linus Torvalds [Fri, 10 Jan 2020 19:48:37 +0000 (11:48 -0800)]
Merge tag 'thermal-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Pull thermal fix from Daniel Lezcano:
 "Fix backward compatibility with old DTBs on QCOM tsens (Amit
  Kucheria)"

* tag 'thermal-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  drivers: thermal: tsens: Work with old DTBs

4 years agoMerge tag 'pm-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 10 Jan 2020 19:46:59 +0000 (11:46 -0800)]
Merge tag 'pm-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "Prevent the cpufreq-dt driver from probing Tegra20/30 (Dmitry
  Osipenko) and prevent the Intel RAPL power capping driver from
  crashing during CPU initialization due to a NULL pointer dereference
  if the processor model in use is not known to it (Harry Pan)"

* tag 'pm-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  powercap: intel_rapl: add NULL pointer check to rapl_mmio_cpu_online()
  cpufreq: dt-platdev: Blacklist NVIDIA Tegra20 and Tegra30 SoCs

4 years agonet/mlx5: Expose vDPA emulation device capabilities
Yishai Hadas [Thu, 12 Dec 2019 11:09:25 +0000 (13:09 +0200)]
net/mlx5: Expose vDPA emulation device capabilities

Expose vDPA emulation device capabilities from the core layer.
It includes reading the capabilities from the firmware and exposing
helper functions to access the data.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agonet/mlx5: Add Virtio Emulation related device capabilities
Yishai Hadas [Thu, 12 Dec 2019 11:09:24 +0000 (13:09 +0200)]
net/mlx5: Add Virtio Emulation related device capabilities

Add Virtio Emulation related fields to the device capabilities.

It includes a general bit to indicate whether Virtio Emulation is
supported and the capabilities structure itself.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agonvmet: fix per feat data len for get_feature
Amit Engel [Tue, 7 Jan 2020 16:47:24 +0000 (01:47 +0900)]
nvmet: fix per feat data len for get_feature

The existing implementation for the get_feature admin-cmd does not
use per-feature data len. This patch introduces a new helper function
nvmet_feat_data_len(), which is used to calculate per feature data len.
Right now we only set data len for fid 0x81 (NVME_FEAT_HOST_ID).

Fixes: commit e9061c397839 ("nvmet: Remove the data_len field from the nvmet_req struct")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Amit Engel <amit.engel@dell.com>
[endiness, naming, and kernel style fixes]
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agonvme: Translate more status codes to blk_status_t
Keith Busch [Thu, 5 Dec 2019 19:50:44 +0000 (04:50 +0900)]
nvme: Translate more status codes to blk_status_t

Decode interrupted command and not ready namespace nvme status codes to
BLK_STS_TARGET. These are not generic IO errors and should use a non-path
specific error so that it can use the non-failover retry path.

Reported-by: John Meneghini <John.Meneghini@netapp.com>
Cc: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoRDMA/core: Remove err in iw_query_port
Guoqing Jiang [Thu, 9 Jan 2020 13:40:43 +0000 (14:40 +0100)]
RDMA/core: Remove err in iw_query_port

Since we can return device->ops.query_port directly, so no need to keep
those lines.

Link: https://lore.kernel.org/r/20200109134043.15568-1-guoqing.jiang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Add support for reporting wc as software mode
Xi Wang [Thu, 9 Jan 2020 12:20:12 +0000 (20:20 +0800)]
RDMA/hns: Add support for reporting wc as software mode

When hardware is in resetting stage, we may can't poll back all the
expected work completions as the hardware won't generate cqe anymore.

This patch allows the driver to compose the expected wc instead of the
hardware during resetting stage. Once the hardware finished resetting, we
can poll cq from hardware again.

Link: https://lore.kernel.org/r/1578572412-25756-1-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoRDMA/hns: Bugfix for posting a wqe with sge
Lijun Ou [Thu, 9 Jan 2020 12:10:52 +0000 (20:10 +0800)]
RDMA/hns: Bugfix for posting a wqe with sge

Driver should first check whether the sge is valid, then fill the valid
sge and the caculated total into hardware, otherwise invalid sges will
cause an error.

Fixes: 52e3b42a2f58 ("RDMA/hns: Filter for zero length of sge in hip08 kernel mode")
Fixes: 7bdee4158b37 ("RDMA/hns: Fill sq wqe context of ud type in hip08")
Link: https://lore.kernel.org/r/1578571852-13704-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Add RcvShortLengthErrCnt to hfi1stats
Mike Marciniszyn [Mon, 6 Jan 2020 13:42:35 +0000 (08:42 -0500)]
IB/hfi1: Add RcvShortLengthErrCnt to hfi1stats

This counter, RxShrErr, is required for error analysis and debug.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20200106134235.119356.29123.stgit@awfm-01.aw.intel.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Add software counter for ctxt0 seq drop
Mike Marciniszyn [Mon, 6 Jan 2020 13:42:28 +0000 (08:42 -0500)]
IB/hfi1: Add software counter for ctxt0 seq drop

All other code paths increment some form of drop counter.

This was missed in the original implementation.

Fixes: 82c2611daaf0 ("staging/rdma/hfi1: Handle packets with invalid RHF on context 0")
Link: https://lore.kernel.org/r/20200106134228.119356.96828.stgit@awfm-01.aw.intel.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Return void in packet receiving functions
Grzegorz Andrejczuk [Mon, 6 Jan 2020 13:42:22 +0000 (08:42 -0500)]
IB/hfi1: Return void in packet receiving functions

Packet receiving functions returns int value, and yet the return values
are not used at all.

This patch converts the functions to return void.

Link: https://lore.kernel.org/r/20200106134222.119356.84098.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Decouple IRQ name from type
Grzegorz Andrejczuk [Mon, 6 Jan 2020 13:42:16 +0000 (08:42 -0500)]
IB/hfi1: Decouple IRQ name from type

IRQ name was connected to IRQ type, this is not sufficient and it would be
better to use name as argument to msix_request_irq instead of assigning it
to variables when function is called.

Index argument was required to generate name and now it can be removed.

To generate name correctly helpers function were added and updated.

Link: https://lore.kernel.org/r/20200106134216.119356.44478.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Create API for auto activate
Mike Marciniszyn [Mon, 6 Jan 2020 13:42:10 +0000 (08:42 -0500)]
IB/hfi1: Create API for auto activate

Add an auto activate routine for use by the interrupt handler.

Link: https://lore.kernel.org/r/20200106134210.119356.43079.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: IB/hfi1: Add an API to handle special case drop
Mike Marciniszyn [Mon, 6 Jan 2020 13:42:04 +0000 (08:42 -0500)]
IB/hfi1: IB/hfi1: Add an API to handle special case drop

This patch pushes special case drop logic into an API to be shared by all
interrupt handlers.

Additionally, convert do_drop to a bool.

Link: https://lore.kernel.org/r/20200106134203.119356.36962.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
4 years agoIB/hfi1: Move common receive IRQ code to function
Grzegorz Andrejczuk [Mon, 6 Jan 2020 13:41:57 +0000 (08:41 -0500)]
IB/hfi1: Move common receive IRQ code to function

Tracing interrupts, incrementing interrupt counter and ASPM are part that
will be reused by HFI1 receive IRQ handlers.

Create common function to have shared code in one place.

Link: https://lore.kernel.org/r/20200106134157.119356.32656.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>