asedeno.scripts.mit.edu Git

KVM: async_pf: avoid recursive flushing of work items

This was reported by syzkaller:

    [ INFO: possible recursive locking detected ]
    4.9.0-rc4+ #49 Not tainted
    ---------------------------------------------
    kworker/2:1/5658 is trying to acquire lock:
     ([ 1644.769018] (&work->work)
    [<     inline     >] list_empty include/linux/compiler.h:243
    [<ffffffff8128dd60>] flush_work+0x0/0x660 kernel/workqueue.c:1511

    but task is already holding lock:
     ([ 1644.769018] (&work->work)
    [<ffffffff812916ab>] process_one_work+0x94b/0x1900 kernel/workqueue.c:2093

    stack backtrace:
    CPU: 2 PID: 5658 Comm: kworker/2:1 Not tainted 4.9.0-rc4+ #49
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Workqueue: events async_pf_execute
     ffff8800676ff630 ffffffff81c2e46b ffffffff8485b930 ffff88006b1fc480
     0000000000000000 ffffffff8485b930 ffff8800676ff7e0 ffffffff81339b27
     ffff8800676ff7e8 0000000000000046 ffff88006b1fcce8 ffff88006b1fccf0
    Call Trace:
    ...
    [<ffffffff8128ddf3>] flush_work+0x93/0x660 kernel/workqueue.c:2846
    [<ffffffff812954ea>] __cancel_work_timer+0x17a/0x410 kernel/workqueue.c:2916
    [<ffffffff81295797>] cancel_work_sync+0x17/0x20 kernel/workqueue.c:2951
    [<ffffffff81073037>] kvm_clear_async_pf_completion_queue+0xd7/0x400 virt/kvm/async_pf.c:126
    [<     inline     >] kvm_free_vcpus arch/x86/kvm/x86.c:7841
    [<ffffffff810b728d>] kvm_arch_destroy_vm+0x23d/0x620 arch/x86/kvm/x86.c:7946
    [<     inline     >] kvm_destroy_vm virt/kvm/kvm_main.c:731
    [<ffffffff8105914e>] kvm_put_kvm+0x40e/0x790 virt/kvm/kvm_main.c:752
    [<ffffffff81072b3d>] async_pf_execute+0x23d/0x4f0 virt/kvm/async_pf.c:111
    [<ffffffff8129175c>] process_one_work+0x9fc/0x1900 kernel/workqueue.c:2096
    [<ffffffff8129274f>] worker_thread+0xef/0x1480 kernel/workqueue.c:2230
    [<ffffffff812a5a94>] kthread+0x244/0x2d0 kernel/kthread.c:209
    [<ffffffff831f102a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433

The reason is that kvm_put_kvm is causing the destruction of the VM, but
the page fault is still on the ->queue list.  The ->queue list is owned
by the VCPU, not by the work items, so we cannot just add list_del to
the work item.

Instead, use work->vcpu to note async page faults that have been resolved
and will be processed through the done list.  There is no need to flush
those.

Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

kvm: kvmclock: let KVM_GET_CLOCK return whether the master clock is in use

Userspace can read the exact value of kvmclock by reading the TSC
and fetching the timekeeping parameters out of guest memory. This
however is brittle and not necessary anymore with KVM 4.11. Provide
a mechanism that lets userspace know if the new KVM_GET_CLOCK
semantics are in effect, and---since we are at it---if the clock
is stable across all VCPUs.

Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: Disable irq while unregistering user notifier

Function user_notifier_unregister should be called only once for each
registered user notifier.

Function kvm_arch_hardware_disable can be executed from an IPI context
which could cause a race condition with a VCPU returning to user mode
and attempting to unregister the notifier.

Signed-off-by: Ignacio Alvarado <ikalvarado@google.com>
Cc: stable@vger.kernel.org
Fixes: 18863bdd60f8 ("KVM: x86 shared msr infrastructure")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: x86: do not go through vcpu in __get_kvmclock_ns

Going through the first VCPU is wrong if you follow a KVM_SET_CLOCK with
a KVM_GET_CLOCK immediately after, without letting the VCPU run and
call kvm_guest_time_update.

To fix this, compute the kvmclock value ourselves, using the master
clock (tsc, nsec) pair as the base and the host CPU frequency as
the scale.

Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

Merge tag 'kvm-arm-for-4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm

KVM/ARM updates for v4.9-rc6

- Fix handling of the 32bit cycle counter
- Fix cycle counter filtering

Merge tag 'batadv-net-for-davem-20161119' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
Here are two batman-adv bugfix patches:

- Revert a splat on disabling interface which created another problem,
by Sven Eckelmann

- Fix error handling when the primary interface disappears during a
throughput meter test, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

sparc: drop duplicate header scatterlist.h

Drop duplicate header scatterlist.h from iommu_common.h.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: add check for dma mapping error in start_xmit()

at91ether_start_xmit() does not check for dma mapping errors.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'acpi-4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
"They fix an ACPI thermal management regression introduced by a recent
  FADT handling cleanup, an ACPI tools build issue introduced by a
  recent ACPICA commit and a PCC mailbox initialization bug causing
  lockdep to complain loudly.

  Specifics:

   - Revert a recent ACPICA cleanup that attempted to get rid of all
     FADT version 2 legacy, but broke ACPI thermal management on at
     least one system (Rafael Wysocki).

   - Fix cross-compiled builds of ACPI tools that stopped working after
     a recent cleanup related to the handling of header files in ACPICA
     (Lv Zheng).

   - Fix a locking issue in the PCC channel initialization code that
     invokes devm_request_irq() under a spinlock (among other things)
     and causes lockdep to complain (Hoan Tran)"

* tag 'acpi-4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  tools/power/acpi: Remove direct kernel source include reference
  mailbox: PCC: Fix lockdep warning when request PCC channel
  Revert "ACPICA: FADT support cleanup"

Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

Pull kbuild fixes from Michal Marek:
"Here are some regression fixes for kbuild:

   - modversion support for exported asm symbols (Nick Piggin). The
     affected architectures need separate patches adding
     asm-prototypes.h.

   - fix rebuilds of lib-ksyms.o (Nick Piggin)

   - -fno-PIE builds (Sebastian Siewior and Borislav Petkov). This is
     not a kernel regression, but one of the Debian gcc package.
     Nevertheless, it's quite annoying, so I think it should go into
     mainline and stable now"

* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  kbuild: Steal gcc's pie from the very beginning
  kbuild: be more careful about matching preprocessed asm ___EXPORT_SYMBOL
  x86/kexec: add -fno-PIE
  scripts/has-stack-protector: add -fno-PIE
  kbuild: add -fno-PIE
  kbuild: modversions for EXPORT_SYMBOL() for asm
  kbuild: prevent lib-ksyms.o rebuilds

Merge tag 'nfsd-4.9-2' of git://linux-nfs.org/~bfields/linux

Pull nfsd bugfix from Bruce Fields:
"Just one fix for an NFS/RDMA crash"

* tag 'nfsd-4.9-2' of git://linux-nfs.org/~bfields/linux:
sunrpc: svc_age_temp_xprts_now should not call setsockopt non-tcp transports

MAINTAINERS: Add LED subsystem co-maintainer

Mark me as a co-maintainer of LED subsystem.

Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>

Merge branches 'acpica-fixes', 'acpi-cppc-fixes' and 'acpi-tools-fixes'

* acpica-fixes:
  Revert "ACPICA: FADT support cleanup"

* acpi-cppc-fixes:
  mailbox: PCC: Fix lockdep warning when request PCC channel

* acpi-tools-fixes:
  tools/power/acpi: Remove direct kernel source include reference

Merge branch 'sparc-lockdep-small'

Babu Moger says:

====================
Adjust lockdep static allocations for sparc

These patches limit the static allocations for lockdep data structures
used for debugging locking correctness. For sparc, all the kernel's code,
data, and bss, must have locked translations in the TLB so that we don't
get TLB misses on kernel code and data. Current sparc chips have 8 TLB
entries available that may be locked down, and with a 4mb page size,
this gives a maximum of 32MB. With PROVE_LOCKING we could go over this
limit and cause system boot-up problems. These patches limit the static
allocations so that everything fits in current required size limit.

patch 1 : Adds new config parameter CONFIG_PROVE_LOCKING_SMALL
Patch 2 : Adjusts the sizes based on the new config parameter

v2-> v3:
   Some more comments from Sam Ravnborg and Peter Zijlstra.
   Defined PROVE_LOCKING_SMALL as invisible and moved the selection to
   arch/sparc/Kconfig.

v1-> v2:
   As suggested by Peter Zijlstra, keeping the default as is.
   Introduced new config variable CONFIG_PROVE_LOCKING_SMALL
   to handle sparc specific case.

v0:
   Initial revision.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

lockdep: Limit static allocations if PROVE_LOCKING_SMALL is defined

Reduce the size of data structure for lockdep entries by half if
PROVE_LOCKING_SMALL if defined. This is used only for sparc.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

config: Adding the new config parameter CONFIG_PROVE_LOCKING_SMALL for sparc

This new config parameter limits the space used for "Lock debugging:
prove locking correctness" by about 4MB. The current sparc systems have
the limitation of 32MB size for kernel size including .text, .data and
.bss sections. With PROVE_LOCKING feature, the kernel size could grow
beyond this limit and causing system boot-up issues. With this option,
kernel limits the size of the entries of lock_chains, stack_trace etc.,
so that kernel fits in required size limit. This is not visible to user
and only used for sparc.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

NFSv4.1: Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state

Now that we're doing TEST_STATEID in nfs4_reclaim_open_state(), we can have
a NFS4ERR_OLD_STATEID returned from nfs41_open_expired() . Instead of
marking state recovery as failed, mark the state for recovery again.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

sunbmac: Fix compiler warning

sunbmac uses '__u32' for dma handle while invoking kernel DMA APIs,
instead of using dma_addr_t. This hasn't caused any 'incompatible
pointer type' warning on SPARC because until now dma_addr_t is of
type u32. However, recent changes in SPARC ATU (iommu) enables 64bit
DMA and therefore dma_addr_t becomes of type u64. This makes
'incompatible pointer type' warnings inevitable.

e.g.
drivers/net/ethernet/sun/sunbmac.c: In function ‘bigmac_ether_init’:
drivers/net/ethernet/sun/sunbmac.c:1166: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’

This patch resolves above compiler warning.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sunqe: Fix compiler warnings

sunqe uses '__u32' for dma handle while invoking kernel DMA APIs,
instead of using dma_addr_t. This hasn't caused any 'incompatible
pointer type' warning on SPARC because until now dma_addr_t is of
type u32. However, recent changes in SPARC ATU (iommu) enables 64bit
DMA and therefore dma_addr_t becomes of type u64. This makes
'incompatible pointer type' warnings inevitable.

e.g.
drivers/net/ethernet/sun/sunqe.c: In function ‘qec_ether_init’:
drivers/net/ethernet/sun/sunqe.c:883: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’
drivers/net/ethernet/sun/sunqe.c:885: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type
./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t *’ but argument is of type ‘__u32 *’

This patch resolves above compiler warnings.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

NFSv4: Don't call close if the open stateid has already been cleared

Ensure we test to see if the open stateid is actually set, before we
send a CLOSE.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Merge branch 'sun4v-64bit-DMA'

Tushar Dave says:

====================
sparc: Enable sun4v hypervisor PCI IOMMU v2 APIs and ATU

ATU (Address Translation Unit) is a new IOMMU in SPARC supported with
sun4v hypervisor PCI IOMMU v2 APIs.

Current SPARC IOMMU supports only 32bit address ranges and one TSB
per PCIe root complex that has a 2GB per root complex DVMA space
limit. The limit has become a scalability bottleneck nowadays that
a typical 10G/40G NIC can consume 500MB DVMA space per instance.
When DVMA resource is exhausted, devices will not be usable
since the driver can't allocate DVMA.

For example, we recently experienced legacy IOMMU limitation while
using i40e driver in system with large number of CPUs (e.g. 128).
Four ports of i40e, each request 128 QP (Queue Pairs). Each queue has
512 (default) descriptors. So considering only RX queues (because RX
premap DMA buffers), i40e takes 4*128*512 number of DMA entries in
IOMMU table. Legacy IOMMU can have at max (2G/8K)- 1 entries available
in table. So bringing up four instance of i40e alone saturate existing
IOMMU resource.

ATU removes bottleneck by allowing guest os to create IOTSB of size
32G (or more) with 64bit address ranges available in ATU HW. 32G is
more than enough DVMA space to be shared by all PCIe devices under
root complex contrast to 2G space provided by legacy IOMMU.

ATU allows PCIe devices to use 64bit DMA addressing. Devices
which choose to use 32bit DMA mask will continue to work with the
existing legacy IOMMU.

The patch set is tested on sun4v (T1000, T2000, T3, T4, T5, T7, S7)
and sun4u SPARC.

Thanks.
-Tushar

v2->v3:
- Patch #5 addresses comment by Joe Perches.
-- use %s, __func__ instead of embedding the function name.

v1->v2:
- Patch #2 addresses comments by Dave M.
-- use page allocator to allocate IOTSB.
-- use true/false with boolean variables.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: Enable 64-bit DMA

ATU 64bit addressing allows PCIe devices with 64bit DMA capabilities
to use ATU for 64bit DMA.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: Enable sun4v dma ops to use IOMMU v2 APIs

Add Hypervisor IOMMU v2 APIs pci_iotsb_map(), pci_iotsb_demap() and
enable sun4v dma ops to use IOMMU v2 API for all PCIe devices with
64bit DMA mask.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: Bind PCIe devices to use IOMMU v2 service

In order to use Hypervisor (HV) IOMMU v2 API for map/demap, each PCIe
device has to be bound to IOTSB using HV API pci_iotsb_bind().

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: Initialize iommu_map_table and iommu_pool

Like legacy IOMMU, use common iommu_map_table and iommu_pool for ATU.
This change initializes iommu_map_table and iommu_pool for ATU.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: Add ATU (new IOMMU) support

ATU (Address Translation Unit) is a new IOMMU in SPARC supported with
Hypervisor IOMMU v2 APIs.

Current SPARC IOMMU supports only 32bit address ranges and one TSB
per PCIe root complex that has a 2GB per root complex DVMA space
limit. The limit has become a scalability bottleneck nowadays that
a typical 10G/40G NIC can consume 300MB-500MB DVMA space per
instance. When DVMA resource is exhausted, devices will not be usable
since the driver can't allocate DVMA.

ATU removes bottleneck by allowing guest os to create IOTSB of size
32G (or more) with 64bit address ranges available in ATU HW. 32G is
more than enough DVMA space to be shared by all PCIe devices under
root complex contrast to 2G space provided by legacy IOMMU.

ATU allows PCIe devices to use 64bit DMA addressing. Devices
which choose to use 32bit DMA mask will continue to work with the
existing legacy IOMMU.

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: Add FORCE_MAX_ZONEORDER and default to 13

This change allows ATU (new IOMMU) in SPARC systems to request
large (32M) contiguous memory during boot for creating IOTSB backing
store.

Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rtnetlink: fix FDB size computation

Add missing NDA_VLAN attribute's size.

Fixes: 1e53d5bb8878 ("net: Pass VLAN ID to rtnl_fdb_notify.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

netns: fix get_net_ns_by_fd(int pid) typo

The argument to get_net_ns_by_fd() is a /proc/$PID/ns/net file
descriptor not a pid. Fix the typo.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Rami Rosen <roszenrami@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mac80211-for-davem-2016-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
A few more bugfixes:
* limit # of scan results stored in memory - this is a long-standing bug
   Jouni and I only noticed while discussing other things in Santa Fe
* revert AP_LINK_PS patch that was causing issues (Felix)
* various A-MSDU/A-MPDU fixes for TXQ code (Felix)
* interoperability workaround for peers with broken VHT capabilities
   (Filip Matusiak)
* add bitrate definition for a VHT MCS that's supposed to be invalid
   but gets used by some hardware anyway (Thomas Pedersen)
* beacon timer fix in hwsim (Benjamin Beichler)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

af_unix: conditionally use freezable blocking calls in read

Commit 2b15af6f95 ("af_unix: use freezable blocking calls in read")
converts schedule_timeout() to its freezable version, it was probably
correct at that time, but later, commit 2b514574f7e8
("net: af_unix: implement splice for stream af_unix sockets") breaks
the strong requirement for a freezable sleep, according to
commit 0f9548ca1091:

    We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
    deadlock if the lock is later acquired in the suspend or hibernate path
    (e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
    cgroup_freezer if a lock is held inside a frozen cgroup that is later
    acquired by a process outside that group.

The pipe_lock is still held at that point.

So use freezable version only for the recvmsg call path, avoid impact for
Android.

Fixes: 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Colin Cross <ccross@android.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'cpsw-fixes'

Johan Hovold says:

====================
net: cpsw: fix leaks and probe deferral

This series fixes as number of leaks and issues in the cpsw probe-error
and driver-unbind paths, some which specifically prevented deferred
probing.

v2
- Keep platform device runtime-resumed throughout probe instead of
   resuming in the probe error path as suggested by Grygorii (patch
   1/7).

- Runtime-resume platform device before registering any children in
   order to make sure it is synchronously suspended after deregistering
   children in the error path (patch 3/7).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: cpsw: fix fixed-link phy probe deferral

Make sure to propagate errors from of_phy_register_fixed_link() which
can fail with -EPROBE_DEFER.

Fixes: 1f71e8c96fc6 ("drivers: net: cpsw: Add support for fixed-link
PHY")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: cpsw: add missing sanity check

Make sure to check for allocation failures before dereferencing a
NULL-pointer during probe.

Fixes: 649a1688c960 ("net: ethernet: ti: cpsw: create common struct to
hold shared driver data")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: cpsw: fix secondary-emac probe error path

Make sure to deregister the primary device in case the secondary emac
fails to probe.

kernel BUG at /home/johan/work/omicron/src/linux/net/core/dev.c:7743!
...
[<c05b3dec>] (free_netdev) from [<c04fe6c0>] (cpsw_probe+0x9cc/0xe50)
[<c04fe6c0>] (cpsw_probe) from [<c047b28c>] (platform_drv_probe+0x5c/0xc0)

Fixes: d9ba8f9e6298 ("driver: net: ethernet: cpsw: dual emac interface
implementation")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: cpsw: fix of_node and phydev leaks

Make sure to drop references taken and deregister devices registered
during probe on probe errors (including deferred probe) and driver
unbind.

Specifically, PHY of-node references were never released and fixed-link
PHY devices were never deregistered.

Fixes: 9e42f715264f ("drivers: net: cpsw: add phy-handle parsing")
Fixes: 1f71e8c96fc6 ("drivers: net: cpsw: Add support for fixed-link
PHY")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: cpsw: fix deferred probe

Make sure to deregister all child devices also on probe errors to avoid
leaks and to fix probe deferral:

cpsw 4a100000.ethernet: omap_device: omap_device_enable() called from invalid state 1
cpsw 4a100000.ethernet: use pm_runtime_put_sync_suspend() in driver?
cpsw: probe of 4a100000.ethernet failed with error -22

Add generic helper to undo the effects of cpsw_probe_dt(), which will
also be used in a follow-on patch to fix further leaks that have been
introduced more recently.

Note that the platform device is now runtime-resumed before registering
any child devices in order to make sure that it is synchronously
suspended after having deregistered the children in the error path.

Fixes: 1fb19aa730e4 ("net: cpsw: Add parent<->child relation support
between cpsw and mdio")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: cpsw: fix mdio device reference leak

Make sure to drop the reference taken by of_find_device_by_node() when
looking up an mdio device from a phy_id property during probe.

Fixes: 549985ee9c72 ("cpsw: simplify the setup of the register
pointers")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: cpsw: fix bad register access in probe error path

Make sure to keep the platform device runtime-resumed throughout probe
to avoid accessing the CPSW registers in the error path (e.g. for
deferred probe) with clocks disabled:

Unhandled fault: external abort on non-linefetch (0x1008) at 0xd0872d08
...
[<c04fabcc>] (cpsw_ale_control_set) from [<c04fb8b4>] (cpsw_ale_destroy+0x2c/0x44)
[<c04fb8b4>] (cpsw_ale_destroy) from [<c04fea58>] (cpsw_probe+0xbd0/0x10c4)
[<c04fea58>] (cpsw_probe) from [<c047b2a0>] (platform_drv_probe+0x5c/0xc0)

Fixes: df828598a755 ("netdev: driver: ethernet: Add TI CPSW driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sky2: Fix shutdown crash

The sky2 frequently crashes during machine shutdown with:

sky2_get_stats+0x60/0x3d8 [sky2]
dev_get_stats+0x68/0xd8
rtnl_fill_stats+0x54/0x140
rtnl_fill_ifinfo+0x46c/0xc68
rtmsg_ifinfo_build_skb+0x7c/0xf0
rtmsg_ifinfo.part.22+0x3c/0x70
rtmsg_ifinfo+0x50/0x5c
netdev_state_change+0x4c/0x58
linkwatch_do_dev+0x50/0x88
__linkwatch_run_queue+0x104/0x1a4
linkwatch_event+0x30/0x3c
process_one_work+0x140/0x3e0
worker_thread+0x60/0x44c
kthread+0xdc/0xf0
ret_from_fork+0x10/0x50

This is caused by the sky2 being called after it has been shutdown.
A previous thread about this can be found here:

https://lkml.org/lkml/2016/4/12/410

An alternative fix is to assure that IFF_UP gets cleared by
calling dev_close() during shutdown. This is similar to what the
bnx2/tg3/xgene and maybe others are doing to assure that the driver
isn't being called following _shutdown().

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

NFSv4: Fix CLOSE races with OPEN

If the reply to a successful CLOSE call races with an OPEN to the same
file, we can end up scribbling over the stateid that represents the
new open state.
The race looks like:

  Client Server
  ====== ======

  CLOSE stateid A on file "foo"
CLOSE stateid A, return stateid C
  OPEN file "foo"
OPEN "foo", return stateid B
  Receive reply to OPEN
  Reset open state for "foo"
  Associate stateid B to "foo"

  Receive CLOSE for A
  Reset open state for "foo"
  Replace stateid B with C

The fix is to examine the argument of the CLOSE, and check for a match
with the current stateid "other" field. If the two do not match, then
the above race occurred, and we should just ignore the CLOSE.

Reported-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4.1: Fix a regression in DELEGRETURN

We don't want to call nfs4_free_revoked_stateid() in the case where
the delegreturn was successful.

Reported-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Merge tag 'sound-4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Three trivial fixes:

  A regression fix for ASRock mobo, a use-after-free fix at hot-unplug
  of USB-audio, and a quirk for new Thinkpad models"

* tag 'sound-4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: usb-audio: Fix use-after-free of usb_device at disconnect
  ALSA: hda - Fix mic regression by ASRock mobo fixup
  ALSA: hda - add a new condition to check if it is thinkpad

Merge tag 'gpio-v4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
"These are hopefully the last GPIO fixes for v4.9. The most important
  is that it fixes the UML randconfig builds that have been nagging me
  for some time and me being confused about where the problem was really
  sitting, now this fix give this nice feeling that everything is solid
  and builds fine.

  Summary:

   - Finally, after being puzzled by a bunch of recurrent UML build
     failures on randconfigs from the build robot, Keno Fischer nailed
     it: GPIO_DEVRES is optional and depends on HAS_IOMEM even though
     many users just unconditionally rely on it to be available. And it
     *should* be available: garbage collection is nice for this and it
     *certainly* has nothing to do with having IOMEM. So we got rid of
     it, and now the UML builds should JustWork(TM).

   - Do not call .get_direction() on sleeping GPIO chips on the fastpath
     when locking GPIOs for interrupts: it is done from atomic context,
     no way.

   - Some driver fixes"

* tag 'gpio-v4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: Remove GPIO_DEVRES option
  gpio: tc3589x: fix up .get_direction()
  gpio: do not double-check direction on sleeping chips
  gpio: pca953x: Move memcpy into mutex lock for set multiple
  gpio: pca953x: Fix corruption of other gpios in set_multiple.

Merge tag 'drm-fixes-for-v4.9-rc6-brown-paper-bag' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"i915 fixes + 2 mediatek regressions.

  So some i915 fixes came in which I thought they might so I'm sending
  those along with two reverts for two patches to the mediatek driver
  that didn't seem to build so well, I've fixed up my -fixes ARM build
  and .config so I could see it, but yes brown paper bag time"

* tag 'drm-fixes-for-v4.9-rc6-brown-paper-bag' of git://people.freedesktop.org/~airlied/linux:
  Revert "drm/mediatek: set vblank_disable_allowed to true"
  Revert "drm/mediatek: fix a typo of OD_CFG to OD_RELAYMODE"
  drm/i915: Assume non-DP++ port if dvo_port is HDMI and there's no AUX ch specified in the VBT
  drm/i915: Refresh that status of MST capable connectors in ->detect()
  drm/i915: Grab the rotation from the passed plane state for VLV sprites
  drm/i915: Mark CPU cache as dirty when used for rendering

Merge tag 'usb-serial-4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial fixes for v4.9-rc6

Here are a couple of new device ids.

Signed-off-by: Johan Hovold <johan@kernel.org>

crypto: algif_hash - Fix NULL hash crash with shash

Recently algif_hash has been changed to allow null hashes. This
triggers a bug when used with an shash algorithm whereby it will
cause a crash during the digest operation.

This patch fixes it by avoiding the digest operation and instead
doing an init followed by a final which avoids the buggy code in
shash.

This patch also ensures that the result buffer is freed after an
error so that it is not returned as a genuine hash result on the
next recv call.

The shash/ahash wrapper code will be fixed later to handle this
case correctly.

Fixes: 493b2ed3f760 ("crypto: algif_hash - Handle NULL hashes correctly")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Laura Abbott <labbott@redhat.com>

mmc: sdhci-of-esdhc: fixup PRESENT_STATE read

Since commit 87a18a6a5652 ("mmc: mmc: Use ->card_busy() to detect busy
cards in __mmc_switch()") the ESDHC driver is broken:
mmc0: Card stuck in programming state! __mmc_switch
mmc0: error -110 whilst initialising MMC card

Since this commit __mmc_switch() uses ->card_busy(), which is
sdhci_card_busy() for the esdhc driver. sdhci_card_busy() uses the
PRESENT_STATE register, specifically the DAT0 signal level bit. But the
ESDHC uses a non-conformant PRESENT_STATE register, thus a read fixup is
required to make the driver work again.

Signed-off-by: Michael Walle <michael@walle.cc>
Fixes: 87a18a6a5652 ("mmc: mmc: Use ->card_busy() to detect busy cards in __mmc_switch()")
Acked-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

Merge tag 'fixes-for-v4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus

Felipe writes:

usb: fixes for v4.9-rc5

One single fix for FunctionFS to make sure we're checking
ffs_func_req_match()'s return code correctly.

powerpc/mm: Fix missing update of HID register on secondary CPUs

We need to update on secondaries for the selected MMU mode.

Fixes: ad410674f560 ("powerpc/mm: Update the HID bit when switching from radix to hash")
Reported-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

usb: gadget: f_fs: fix wrong parenthesis in ffs_func_req_match()

Properly check the return code of ffs_func_revmap_intf() and
ffs_func_revmap_ep() for a non-negative value.

Instead of checking the return code, the comparison was performed for the last
parameter of the function calls, because of wrong parenthesis.

This also fixes the following static checker warning:
drivers/usb/gadget/function/f_fs.c:3152 ffs_func_req_match()
warn: always true condition '(((creq->wIndex)) >= 0) => (0-u16max >= 0)'

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Felix Hädicke <felixhaedicke@web.de>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>

KVM: arm64: Fix the issues when guest PMCCFILTR is configured

KVM calls kvm_pmu_set_counter_event_type() when PMCCFILTR is configured.
But this function can't deals with PMCCFILTR correctly because the evtCount
bits of PMCCFILTR, which is reserved 0, conflits with the SW_INCR event
type of other PMXEVTYPER<n> registers. To fix it, when eventsel == 0, this
function shouldn't return immediately; instead it needs to check further
if select_idx is ARMV8_PMU_CYCLE_IDX.

Another issue is that KVM shouldn't copy the eventsel bits of PMCCFILTER
blindly to attr.config. Instead it ought to convert the request to the
"cpu cycle" event type (i.e. 0x11).

To support this patch and to prevent duplicated definitions, a limited
set of ARMv8 perf event types were relocated from perf_event.c to
asm/perf_event.h.

Cc: stable@vger.kernel.org # 4.6+
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Wei Huang <wei@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

arm64: KVM: pmu: Fix AArch32 cycle counter access

We're missing the handling code for the cycle counter accessed
from a 32bit guest, leading to unexpected results.

Cc: stable@vger.kernel.org # 4.6+
Signed-off-by: Wei Huang <wei@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

perf/x86: Add perf support for AMD family-17h processors

This patch enables perf core PMU support for the new AMD family-17h processors.

In family-17h, there is no PMC-event constraint. All events, irrespective of
the type, can be measured using any of the six generic performance counters.

Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1479399306-13375-1-git-send-email-Janakarajan.Natarajan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/dumpstack: Prevent KASAN false positive warnings

The oops stack dump code scans the entire stack, which can cause KASAN
"stack-out-of-bounds" false positive warnings. Tell KASAN to ignore it.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: davej@codemonkey.org.uk
Cc: dvyukov@google.com
Link: http://lkml.kernel.org/r/5f6e80c4b0c7f7f0b6211900847a247cdaad753c.1479398226.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/unwind: Prevent KASAN false positive warnings in guess unwinder

The guess unwinder scans the entire stack, which can cause KASAN
"stack-out-of-bounds" false positive warnings. Tell KASAN to ignore it.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: davej@codemonkey.org.uk
Cc: dvyukov@google.com
Link: http://lkml.kernel.org/r/61939c0b2b2d63ce97ba59cba3b00fd47c2962cf.1479398226.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

cfg80211: limit scan results cache size

It's possible to make scanning consume almost arbitrary amounts
of memory, e.g. by sending beacon frames with random BSSIDs at
high rates while somebody is scanning.

Limit the number of BSS table entries we're willing to cache to
1000, limiting maximum memory usage to maybe 4-5MB, but lower
in practice - that would be the case for having both full-sized
beacon and probe response frames for each entry; this seems not
possible in practice, so a limit of 1000 entries will likely be
closer to 0.5 MB.

Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1

On POWER9 DD1, when we do a local TLB invalidate we also need to explicitly
invalidate the ERAT.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

i2c: digicolor: use clk_disable_unprepare instead of clk_unprepare

since clk_prepare_enable() is used to get i2c->clk, we should
use clk_disable_unprepare() to release it for the error path.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

Merge tag 'sunxi-fixes-for-4.9' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into fixes

Allwinner fixes for 4.9

A fix to reintroduce missing pinmux options that turned out not to be
optional.

* tag 'sunxi-fixes-for-4.9' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
ARM: dts: sun8i: fix the pinmux for UART1

Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'sti-dt-for-v4.9-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/pchotard/sti into fixes

STi DT fix:

Fix typo cs-gpio to cs-gpios

* tag 'sti-dt-for-v4.9-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/pchotard/sti:
ARM: dts: STiH410-b2260: Fix typo in spi0 chipselect definition

Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'imx-fixes-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes

i.MX fixes for 4.9, 2nd round:

It fixes a boot failure on imx53-qsb board with a DA9053 PMIC, which is
caused by the regulator core change, commit fa93fd4ecc9c ("regulator:
core: Ensure we are at least in bounds for our constraints").

* tag 'imx-fixes-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx53-qsb: Fix regulator constraints

Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'omap-for-v4.9/fixes-for-rc-cycle' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Fixes for omaps for v4.9-rc cycle. Except for the omap3 fix for the SoC
features printed, all these are quite trivial and tiny. The omap5 jack
detection and gpadc patches are not strictly fixes, but I wanted to get
binding document typo fixed before it pops up on other boards. The
gpadc one liner was in the same series and I applied and pushed it out
already before noticing it could have waited. The list of changes is:

- Fix omap3 SoC features printed
- Make sure OMAP_INTERCONNECT is selected for am43xx only configurations
- Add missing memory node for torpedo
- Initialize uart4_mask properly to avoid writing garbage to PRM registers
- Fix NULL pointer dereference for omap4 volt_data
- Add alias for omap5 gpadc needed by iio drivers
- Enable omap5 jack headset jack detection and fix it's binding typo
- Add missing memory node for logicpd-som-lv
- Fix wrong SMPS6 voltage for VDD-DDR3 for omap5

* tag 'omap-for-v4.9/fixes-for-rc-cycle' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: omap5: board-common: fix wrong SMPS6 (VDD-DDR3) voltage
  ARM: omap3: Add missing memory node in SOM-LV
  ASoC: omap-abe-twl6040: fix typo in bindings documentation
  dts: omap5: board-common: enable twl6040 headset jack detection
  dts: omap5: board-common: add phandle to reference Palmas gpadc
  ARM: OMAP2+: avoid NULL pointer dereference
  ARM: OMAP2+: PRM: initialize en_uart4_mask and grpsel_uart4_mask
  ARM: dts: omap3: Fix memory node in Torpedo board
  ARM: AM43XX: Select OMAP_INTERCONNECT in Kconfig
  ARM: OMAP3: Fix formatting of features printed

Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'mvebu-fixes-4.9-1' of git://git.infradead.org/linux-mvebu into fixes

mvebu fixes for 4.9 (part 1)

All of them are fixes for arm64 device tree

- 2 for the SPI node on the Armada 7K/8K
- 1 for the clock node on the Armada 37xx

* tag 'mvebu-fixes-4.9-1' of git://git.infradead.org/linux-mvebu:
  arm64: dts: marvell: add unique identifiers for Armada A8k SPI controllers
  arm64: dts: marvell: fix clocksource for CP110 slave SPI0
  arm64: dts: marvell: Fix typo in label name on Armada 37xx

Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'drm-intel-fixes-2016-11-17' of ssh://git.freedesktop.org/git/drm-intel into drm-fixes

i915 misc fixes.

* tag 'drm-intel-fixes-2016-11-17' of ssh://git.freedesktop.org/git/drm-intel:
  drm/i915: Assume non-DP++ port if dvo_port is HDMI and there's no AUX ch specified in the VBT
  drm/i915: Refresh that status of MST capable connectors in ->detect()
  drm/i915: Grab the rotation from the passed plane state for VLV sprites
  drm/i915: Mark CPU cache as dirty when used for rendering

ipmi/bt-bmc: change compatible node to 'aspeed, ast2400-ibt-bmc'

The Aspeed SoCs have two BT interfaces : one is IPMI compliant and the
other is H8S/2168 compliant.

The current ipmi/bt-bmc driver implements the IPMI version and we
should reflect its nature in the compatible node name using
'aspeed,ast2400-ibt-bmc' instead of 'aspeed,ast2400-bt-bmc'. The
latter should be used for a H8S interface driver if it is implemented
one day.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Olof Johansson <olof@lixom.net>

Revert "drm/mediatek: set vblank_disable_allowed to true"

This reverts commit f752fff611b99f5679224f3990a1f531ea64b1ec.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Revert "drm/mediatek: fix a typo of OD_CFG to OD_RELAYMODE"

This reverts commit 83ba62bc700bab710b22be3a1bf6cf973f754273.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"A set of fixes, one for NVMe from Keith, and a set for nvme-{rdma,t,f}
  from the usual suspects, fixing actual problems that would be a shame
  to release 4.9 with"

* 'for-linus' of git://git.kernel.dk/linux-block:
  nvme/pci: Don't free queues on error
  nvmet-rdma: drain the queue-pair just before freeing it
  nvme-rdma: stop and free io queues on connect failure
  nvmet-rdma: don't forget to delete a queue from the list of connection failed
  nvmet: Don't queue fatal error work if csts.cfs is set
  nvme-rdma: reject non-connect commands before the queue is live
  nvmet-rdma: Fix possible NULL deref when handling rdma cm events

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rmda fixes from Doug Ledford.
"First round of -rc fixes.

  Due to various issues, I've been away and couldn't send a pull request
  for about three weeks. There were a number of -rc patches that built
  up in the meantime (some where there already from the early -rc
  stages). Obviously, there were way too many to send now, so I tried to
  pare the list down to the more important patches for the -rc cycle.

  Most of the code has had plenty of soak time at the various vendor's
  testing setups, so I doubt there will be another -rc pull request this
  cycle. I also tried to limit the patches to those with smaller
  footprints, so even though a shortlog is longer than I would like, the
  actual diffstat is mostly very small with the exception of just three
  files that had more changes, and a couple files with pure removals.

  Summary:
   - Misc Intel hfi1 fixes
   - Misc Mellanox mlx4, mlx5, and rxe fixes
   - A couple cxgb4 fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (34 commits)
  iw_cxgb4: invalidate the mr when posting a read_w_inv wr
  iw_cxgb4: set *bad_wr for post_send/post_recv errors
  IB/rxe: Update qp state for user query
  IB/rxe: Clear queue buffer when modifying QP to reset
  IB/rxe: Fix handling of erroneous WR
  IB/rxe: Fix kernel panic in UDP tunnel with GRO and RX checksum
  IB/mlx4: Fix create CQ error flow
  IB/mlx4: Check gid_index return value
  IB/mlx5: Fix NULL pointer dereference on debug print
  IB/mlx5: Fix fatal error dispatching
  IB/mlx5: Resolve soft lock on massive reg MRs
  IB/mlx5: Use cache line size to select CQE stride
  IB/mlx5: Validate requested RQT size
  IB/mlx5: Fix memory leak in query device
  IB/core: Avoid unsigned int overflow in sg_alloc_table
  IB/core: Add missing check for addr_resolve callback return value
  IB/core: Set routable RoCE gid type for ipv4/ipv6 networks
  IB/cm: Mark stale CM id's whenever the mad agent was unregistered
  IB/uverbs: Fix leak of XRC target QPs
  IB/hfi1: Remove incorrect IS_ERR check
  ...

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
"A couple of regression fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix iov_iter_advance() for ITER_PIPE
xattr: Fix setting security xattrs on sockfs

Merge tag 'for-linus-4.9-rc5-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux

Pull orangefs fix from Mike Marshall:
"orangefs: add .owner to debugfs file_operations

  Without ".owner = THIS_MODULE" it is possible to crash the kernel by
  unloading the Orangefs module while someone is reading debugfs files"

* tag 'for-linus-4.9-rc5-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  orangefs: add .owner to debugfs file_operations

net sched filters: pass netlink message flags in event notification

Userland client should be able to read an event, and reflect it back to
the kernel, therefore it needs to extract complete set of netlink flags.

For example, this will allow "tc monitor" to distinguish Add and Replace
operations.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mremap: fix race between mremap() and page cleanning

Prior to 3.15, there was a race between zap_pte_range() and
page_mkclean() where writes to a page could be lost.  Dave Hansen
discovered by inspection that there is a similar race between
move_ptes() and page_mkclean().

We've been able to reproduce the issue by enlarging the race window with
a msleep(), but have not been able to hit it without modifying the code.
So, we think it's a real issue, but is difficult or impossible to hit in
practice.

The zap_pte_range() issue is fixed by commit 1cf35d47712d("mm: split
'tlb_flush_mmu()' into tlb flushing and memory freeing parts").  And
this patch is to fix the race between page_mkclean() and mremap().

Here is one possible way to hit the race: suppose a process mmapped a
file with READ | WRITE and SHARED, it has two threads and they are bound
to 2 different CPUs, e.g.  CPU1 and CPU2.  mmap returned X, then thread
1 did a write to addr X so that CPU1 now has a writable TLB for addr X
on it.  Thread 2 starts mremaping from addr X to Y while thread 1
cleaned the page and then did another write to the old addr X again.
The 2nd write from thread 1 could succeed but the value will get lost.

        thread 1                           thread 2
     (bound to CPU1)                    (bound to CPU2)

  1: write 1 to addr X to get a
     writeable TLB on this CPU

                                        2: mremap starts

                                        3: move_ptes emptied PTE for addr X
                                           and setup new PTE for addr Y and
                                           then dropped PTL for X and Y

  4: page laundering for N by doing
     fadvise FADV_DONTNEED. When done,
     pageframe N is deemed clean.

  5: *write 2 to addr X

                                        6: tlb flush for addr X

  7: munmap (Y, pagesize) to make the
     page unmapped

  8: fadvise with FADV_DONTNEED again
     to kick the page off the pagecache

  9: pread the page from file to verify
     the value. If 1 is there, it means
     we have lost the written 2.

  *the write may or may not cause segmentation fault, it depends on
  if the TLB is still on the CPU.

Please note that this is only one specific way of how the race could
occur, it didn't mean that the race could only occur in exact the above
config, e.g. more than 2 threads could be involved and fadvise() could
be done in another thread, etc.

For anonymous pages, they could race between mremap() and page reclaim:
THP: a huge PMD is moved by mremap to a new huge PMD, then the new huge
PMD gets unmapped/splitted/pagedout before the flush tlb happened for
the old huge PMD in move_page_tables() and we could still write data to
it.  The normal anonymous page has similar situation.

To fix this, check for any dirty PTE in move_ptes()/move_huge_pmd() and
if any, did the flush before dropping the PTL.  If we did the flush for
every move_ptes()/move_huge_pmd() call then we do not need to do the
flush in move_pages_tables() for the whole range.  But if we didn't, we
still need to do the whole range flush.

Alternatively, we can track which part of the range is flushed in
move_ptes()/move_huge_pmd() and which didn't to avoid flushing the whole
range in move_page_tables().  But that would require multiple tlb
flushes for the different sub-ranges and should be less efficient than
the single whole range flush.

KBuild test on my Sandybridge desktop doesn't show any noticeable change.
v4.9-rc4:
  real    5m14.048s
  user    32m19.800s
  sys     4m50.320s

With this commit:
  real    5m13.888s
  user    32m19.330s
  sys     4m51.200s

Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ip6_tunnel: disable caching when the traffic class is inherited

If an ip6 tunnel is configured to inherit the traffic class from
the inner header, the dst_cache must be disabled or it will foul
the policy routing.

The issue is apprently there since at leat Linux-2.6.12-rc2.

Reported-by: Liam McBirnie <liam.mcbirnie@boeing.com>
Cc: Liam McBirnie <liam.mcbirnie@boeing.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'phy-dev-leaks'

Johan Hovold says:

====================
net: phy: fix of_node and device leaks

These patches fix a couple of of_node leaks in the fixed-link code and a
device reference leak in a phy helper.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: fixed_phy: fix of_node leak in fixed_phy_unregister

Make sure to drop the of_node reference taken in fixed_phy_register()
when deregistering a PHY.

Fixes: a75951217472 ("net: phy: extend fixed driver with
fixed_phy_register()")

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

of_mdio: fix device reference leak in of_phy_find_device

Make sure to drop the reference taken by bus_find_device() before
returning NULL from of_phy_find_device() when the found device is not a
PHY.

Fixes: 6ed742363b9c ("of: of_mdio: Ensure mdio device is a PHY")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

of_mdio: fix node leak in of_phy_register_fixed_link error path

Make sure to drop the of_node reference also on failure to parse the
speed property in of_phy_register_fixed_link().

Fixes: 3be2a49e5c08 ("of: provide a binding for fixed link PHYs")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: check dead netns for peernet2id_alloc()

Andrei reports we still allocate netns ID from idr after we destroy
it in cleanup_net().

cleanup_net():
  ...
  idr_destroy(&net->netns_ids);
  ...
  list_for_each_entry_reverse(ops, &pernet_list, list)
    ops_exit_list(ops, &net_exit_list);
      -> rollback_registered_many()
        -> rtmsg_ifinfo_build_skb()
         -> rtnl_fill_ifinfo()
           -> peernet2id_alloc()

After that point we should not even access net->netns_ids, we
should check the death of the current netns as early as we can in
peernet2id_alloc().

For net-next we can consider to avoid sending rtmsg totally,
it is a good optimization for netns teardown path.

Fixes: 0c7aecd4bde4 ("netns: add rtnl cmd to add and get peer netns ids")
Reported-by: Andrei Vagin <avagin@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

phy: twl4030-usb: Fix for musb session bit based PM

Now with musb driver implementing generic session bit based
PM, we need to have the USB PHYs behaving in a sane way for
platforms implementing PM.

Currently twl4030-usb enables PM in twl4030_phy_power_on()
and then disables it in twl4030_phy_power_off(). This will
block PM runtime for the SoC when no cable is connected.

Fix the issue by moving PM runtime autosuspend call to
happen where it gets called in twl4030_phy_power_on().

Note that this patch should not be backported to anything
before commit 467d5c980709 ("usb: musb: Implement session bit
based runtime PM for musb-core") as before that all the
glue layers implemented their own PM.

Fixes: 467d5c980709 ("usb: musb: Implement session bit based
runtime PM for musb-core")
Tested-by: Ladislav Michl <ladis@linux-mips.org>
Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: musb: Drop pointless PM runtime code for dsps glue

This already gets done automatically by PM runtime and we have
a separate autosuspend timeout in musb_core.c.

Reviewed-by: Johan Hovold <johan@kernel.org>
Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: musb: Add missing pm_runtime_disable and drop 2430 PM timeout

We are missing pm_runtime_disable() in 2430 glue layer. Further,
we only need to enable PM runtime and disable it on exit. With
musb_core.c doing PM, the glue layer as a parent will always be
active when musb_core.c is active.

This fixes host enumeration issues with some devices as reported
by Ladislav Michl <ladis@linux-mips.org>.

And holding an RPM reference while deregistering the child would
lead to a crash in omap2430_runtime_suspend() which dereferences
the now freed child's driver data on put as pointed out by
Johan Hovold <johan@kernel.org>:

Unable to handle kernel paging request at virtual address 6b6b6f17
...
[<c05453d4>] (omap2430_runtime_suspend) from [<c0481410>]
(pm_generic_runtime_suspend+0x3c/0x48)
[<c0481410>] (pm_generic_runtime_suspend) from [<c0121028>]
(_od_runtime_suspend+0x1c/0x30)
[<c0121028>] (_od_runtime_suspend) from [<c04833b0>] (__rpm_callback+0x3c/0x70)
[<c04833b0>] (__rpm_callback) from [<c0483414>] (rpm_callback+0x30/0x90)
[<c0483414>] (rpm_callback) from [<c0483984>] (rpm_suspend+0x118/0x6b4)
[<c0483984>] (rpm_suspend) from [<c04840f4>] (rpm_idle+0x104/0x440)
[<c04840f4>] (rpm_idle) from [<c04844ac>] (__pm_runtime_idle+0x7c/0xb0)
[<c04844ac>] (__pm_runtime_idle) from [<c0545458>] (omap2430_remove+0x38/0x58)
[<c0545458>] (omap2430_remove) from [<c047b2bc>] (platform_drv_remove+0x34/0x4c)

Note that if changes are needed to the autosuspend timeout, it should
be done in musb_core.c.

Reported-by: Ladislav Michl <ladis@linux-mips.org>
Fixes: 87326e858448 ("usb: musb: Remove extra PM runtime calls from
2430 glue layer")
Tested-by: Ladislav Michl <ladis@linux-mips.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: musb: Fix PM for hub disconnect

With a USB hub disconnected, devctl can be 0x19 for about a second
on am335x and will stay forever on at least omap3. And we get no
further interrupts when devctl session bit clears. This keeps
PM runtime active.

Let's fix the issue by polling devctl until the session bit clears
or times out. We can do this by making musb->irq_work into
delayed_work.

And with the polling implemented, we can now also have the quirk
for invalid VBUS it to avoid disconnecting too early while VBUS
is ramping up.

Fixes: 467d5c980709 ("usb: musb: Implement session bit based runtime
PM for musb-core")
Fixes: 65b3f50ed6fa ("usb: musb: Add PM runtime support for MUSB DSPS
Tested-by: Ladislav Michl <ladis@linux-mips.org>
Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: musb: Fix sleeping function called from invalid context for hdrc glue

Commit 65b3f50ed6fa ("usb: musb: Add PM runtime support for MUSB DSPS
glue layer") wrongly added a call for pm_runtime_get_sync to otg_timer
that runs in softirq context. That causes a "BUG: sleeping function called
from invalid context" every time when polling the cable status:

[<c015ebb4>] (__might_sleep) from [<c0413d60>] (__pm_runtime_resume+0x9c/0xa0)
[<c0413d60>] (__pm_runtime_resume) from [<c04d0bc4>] (otg_timer+0x3c/0x254)
[<c04d0bc4>] (otg_timer) from [<c0191180>] (call_timer_fn+0xfc/0x41c)
[<c0191180>] (call_timer_fn) from [<c01915c0>] (expire_timers+0x120/0x210)
[<c01915c0>] (expire_timers) from [<c0191acc>] (run_timer_softirq+0xa4/0xdc)
[<c0191acc>] (run_timer_softirq) from [<c010168c>] (__do_softirq+0x12c/0x594)

I did not notice that as I did not have CONFIG_DEBUG_ATOMIC_SLEEP enabled.
And looks like also musb_gadget_queue() suffers from the same problem.

Let's fix the issue by using a list of delayed work then call it on
resume. Note that we want to do this only when musb core and it's
parent devices are awake, and we need to make sure the DSPS glue
timer is stopped as noted by Johan Hovold <johan@kernel.org>.
Note that we already are re-enabling the timer with mod_timer() in
dsps_musb_enable().

Later on we may be able to remove other delayed work in the musb driver
and just do it from pending_resume_work. But this should be done only
for delayed work that does not have other timing requirements beyond
just being run on resume.

Fixes: 65b3f50ed6fa ("usb: musb: Add PM runtime support for MUSB DSPS
glue layer")
Reported-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: musb: Fix broken use of static variable for multiple instances

We can't use static variable first for checking when musb is
initialized when we have multiple musb instances like on am335x.

Tested-by: Ladislav Michl <ladis@linux-mips.org>
Reviewed-by: Johan Hovold <johan@hovoldconsulting.com>
Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

crypto: caam - fix type mismatch warning

Building the caam driver on arm64 produces a harmless warning:

drivers/crypto/caam/caamalg.c:140:139: warning: comparison of distinct pointer types lacks a cast

We can use min_t to tell the compiler which type we want it to use
here.

Fixes: 5ecf8ef9103c ("crypto: caam - fix sg dump")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

dmaengine: cppi41: More PM runtime fixes

Fix use of u32 instead of int for checking for negative errors values
as pointed out by Dan Carpenter <dan.carpenter@oracle.com>.

And while testing the PM runtime error path by randomly returning
failed values in runtime resume, I noticed two more places that need
fixing:

- If pm_runtime_get_sync() fails in probe, we still need to do
  pm_runtime_put_sync() to keep the use count happy. We could call
  pm_runtime_put_noidle() on the error path, but we're just going
  to call pm_runtime_disable() after that so pm_runtime_put_sync()
  will do what we want

- We should print an error if pm_runtime_get_sync() fails in
  cppi41_dma_alloc_chan_resources() so we know where it happens

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 740b4be3f742 ("dmaengine: cpp41: Fix handling of error path")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>

x86/boot: Avoid warning for zero-filling .bss

The latest binutils are warning about a .fill directive with an explicit
value in a .bss section:

  arch/x86/kernel/head_32.S: Assembler messages:
  arch/x86/kernel/head_32.S:677: Warning: ignoring fill value in section `.bss..page_aligned'
  arch/x86/kernel/head_32.S:679: Warning: ignoring fill value in section `.bss..page_aligned'

This comes from the 'ENTRY()' macro padding the space between the symbols
with 'nop' via:

  .align 4,0x90

Open-coding the .globl directive without the padding avoids that warning,
as all the symbols are already page aligned.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161116141726.2013389-1-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

fix iov_iter_advance() for ITER_PIPE

iov_iter_advance() needs to decrement iter->count by the number of
bytes we'd moved beyond. Normal flavours do that, but ITER_PIPE
doesn't and ITER_PIPE generic_file_read_iter() for O_DIRECT files
ends up with a bogus fallback to page cache read, resulting in incorrect
values for file offset and bytes read.

Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

xattr: Fix setting security xattrs on sockfs

The IOP_XATTR flag is set on sockfs because sockfs supports getting the
"system.sockprotoname" xattr.  Since commit 6c6ef9f2, this flag is checked for
setxattr support as well.  This is wrong on sockfs because security xattr
support there is supposed to be provided by security_inode_setsecurity.  The
smack security module relies on socket labels (xattrs).

Fix this by adding a security xattr handler on sockfs that returns
-EAGAIN, and by checking for -EAGAIN in setxattr.

We cannot simply check for -EOPNOTSUPP in setxattr because there are
filesystems that neither have direct security xattr support nor support
via security_inode_setsecurity.  A more proper fix might be to move the
call to security_inode_setsecurity into sockfs, but it's not clear to me
if that is safe: we would end up calling security_inode_post_setxattr after
that as well.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

bnxt: add a missing rcu synchronization

Add a missing synchronize_net() call to avoid potential use after free,
since we explicitly call napi_hash_del() to factorize the RCU grace
period.

Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: b53: Fix VLAN usage and how we treat CPU port

We currently have a fundamental problem in how we treat the CPU port and
its VLAN membership. As soon as a second VLAN is configured to be
untagged, the CPU automatically becomes untagged for that VLAN as well,
and yet, we don't gracefully make sure that the CPU becomes tagged in
the other VLANs it could be a member of. This results in only one VLAN
being effectively usable from the CPU's perspective.

Instead of having some pretty complex logic which tries to maintain the
CPU port's default VLAN and its untagged properties, just do something
very simple which consists in neither altering the CPU port's PVID
settings, nor its untagged settings:

- whenever a VLAN is added, the CPU is automatically a member of this
VLAN group, as a tagged member
- PVID settings for downstream ports do not alter the CPU port's PVID
since it now is part of all VLANs in the system

This means that a typical example where e.g: LAN ports are in VLAN1, and
WAN port is in VLAN2, now require having two VLAN interfaces for the
host to properly terminate and send traffic from/to.

Fixes: Fixes: a2482d2ce349 ("net: dsa: b53: Plug in VLAN support")
Reported-by: Hartmut Knaack <knaack.h@gmx.de>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'drm-fixes-for-v4.9-rc6' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes fr9om Dave Airlie:
"Fixes for amdgpu, and a bunch of arm drivers.

  There seems to be an uptick in the ARM drivers sending things for
  fixes which is good, so I've decided to dequeue a bit early, more
  stuff may arrive before the weekend.

  This contains mediatek, arcpgu, sunxi, fsl-dcu display controller
  fixes along with 3 amdgpu fixes, one for a fencing issue with
  secondary GPUs"

* tag 'drm-fixes-for-v4.9-rc6' of git://people.freedesktop.org/~airlied/linux:
  drm/amdgpu:fix vpost_needed routine
  drm/amdgpu/powerplay: drop a redundant NULL check
  drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)
  drm/arcpgu: Accommodate adv7511 switch to DRM bridge
  drm/fsl-dcu: disable planes before disabling CRTC
  drm/fsl-dcu: update all registers on flush
  drm/fsl-dcu: do not update when modifying irq registers
  drm/sun4i: Propagate error to the caller
  drm/sun4i: Fix error handling
  drm/mediatek: modify the factor to make the pll_rate set in the 1G-2G range
  drm/mediatek: enhance the HDMI driving current
  drm/mediatek: do mtk_hdmi_send_infoframe after HDMI clock enable
  drm/mediatek: clear IRQ status before enable OVL interrupt
  drm/mediatek: set vblank_disable_allowed to true
  drm/mediatek: fix a typo of OD_CFG to OD_RELAYMODE
  drm/sun4i: rgb: Remove the bridge enable/disable functions
  drm/sun4i: rgb: Enable panel after controller

iw_cxgb4: invalidate the mr when posting a read_w_inv wr

Also, rearrange things a bit to have a common c4iw_invalidate_mr()
function used everywhere that we need to invalidate.

Fixes: 49b53a93a64a ("iw_cxgb4: add fast-path for small REG_MR operations")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

iw_cxgb4: set *bad_wr for post_send/post_recv errors

There are a few cases in c4iw_post_send() and c4iw_post_receive()
where *bad_wr is not set when an error is returned. This can
cause a crash if the application tries to use bad_wr.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

Merge branches 'hfi1' and 'mlx' into k.o/for-4.9-rc

IB/rxe: Update qp state for user query

The method rxe_qp_error() transitions QP to error state
and make sure the QP is drained. It did not though update
the QP state for user's query.

This patch fixes this.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/rxe: Clear queue buffer when modifying QP to reset

RXE resets the send-q only once in rxe_qp_init_req() when
QP is created, but when the QP is reused after QP reset, the send-q
holds previous garbage data.

This garbage data wrongly fails CQEs that otherwise
should have completed successfully.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/rxe: Fix handling of erroneous WR

To correctly handle a erroneous WR this fix does the following
1. Make sure the bad WQE causes a user completion event.
2. Call rxe_completer to handle the erred WQE.

Before the fix, when rxe_requester found a bad WQE, it changed its
status to IB_WC_LOC_PROT_ERR and exit with 0 for non RC QPs.

If this was the 1st WQE then there would be no ACK to invoke the
completer and this bad WQE would be stuck in the QP's send-q.

On top of that the requester exiting with 0 caused rxe_do_task to
endlessly invoke rxe_requester, resulting in a soft-lockup attached
below.

In case the WQE was not the 1st and rxe_completer did get a chance to
handle the bad WQE, it did not cause a complete event since the WQE's
IB_SEND_SIGNALED flag was not set.

Setting WQE status to IB_SEND_SIGNALED is subject to IBA spec
version 1.2.1, section 10.7.3.1 Signaled Completions.

NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[<ffffffffa0590145>] ? rxe_pool_get_index+0x35/0xb0 [rdma_rxe]
[<ffffffffa05952ec>] lookup_mem+0x3c/0xc0 [rdma_rxe]
[<ffffffffa0595534>] copy_data+0x1c4/0x230 [rdma_rxe]
[<ffffffffa058c180>] rxe_requester+0x9d0/0x1100 [rdma_rxe]
[<ffffffff8158e98a>] ? kfree_skbmem+0x5a/0x60
[<ffffffffa05962c9>] rxe_do_task+0x89/0xf0 [rdma_rxe]
[<ffffffffa05963e2>] rxe_run_task+0x12/0x30 [rdma_rxe]
[<ffffffffa059110a>] rxe_post_send+0x41a/0x550 [rdma_rxe]
[<ffffffff811ef922>] ? __kmalloc+0x182/0x200
[<ffffffff816ba512>] ? down_read+0x12/0x40
[<ffffffffa054bd32>] ib_uverbs_post_send+0x532/0x540 [ib_uverbs]
[<ffffffff815f8722>] ? tcp_sendmsg+0x402/0xb80
[<ffffffffa05453dc>] ib_uverbs_write+0x18c/0x3f0 [ib_uverbs]
[<ffffffff81623c2e>] ? inet_recvmsg+0x7e/0xb0
[<ffffffff8158764d>] ? sock_recvmsg+0x3d/0x50
[<ffffffff81215b87>] __vfs_write+0x37/0x140
[<ffffffff81216892>] vfs_write+0xb2/0x1b0
[<ffffffff81217ce5>] SyS_write+0x55/0xc0
[<ffffffff816bc672>] entry_SYSCALL_64_fastpath+0x1a/0xa

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>