asedeno.scripts.mit.edu Git

Merge tag 'edac_fix_for_5.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC fix from Borislav Petkov:
"Fix persistent register offsets of altera_edac, from Thor Thayer"

* tag 'edac_fix_for_5.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, altera: Fix S10 persistent register offset

Merge tag 'for-linus-20190127' of git://git.kernel.dk/linux-block

Pull block revert from Jens Axboe:
"Silly error snuck into a patch from the last series, let's do a revert
to avoid a potential use-after-free"

* tag 'for-linus-20190127' of git://git.kernel.dk/linux-block:
Revert "block: cover another queue enter recursion via BIO_QUEUE_ENTERED"

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

net/rose: fix NULL ax25_cb kernel panic

When an internally generated frame is handled by rose_xmit(),
rose_route_frame() is called:

        if (!rose_route_frame(skb, NULL)) {
                dev_kfree_skb(skb);
                stats->tx_errors++;
                return NETDEV_TX_OK;
        }

We have the same code sequence in Net/Rom where an internally generated
frame is handled by nr_xmit() calling nr_route_frame(skb, NULL).
However, in this function NULL argument is tested while it is not in
rose_route_frame().
Then kernel panic occurs later on when calling ax25cmp() with a NULL
ax25_cb argument as reported many times and recently with syzbot.

We need to test if ax25 is NULL before using it.

Testing:
Built kernel with CONFIG_ROSE=y.

Signed-off-by: Bernard Pidoux <f6bvp@free.fr>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: syzbot+1a2c456a1ea08fa5b5f7@syzkaller.appspotmail.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Bernard Pidoux <f6bvp@free.fr>
Cc: linux-hams@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

net: altera_tse: fix msgdma_tx_completion on non-zero fill_level case

If fill_level was not zero and status was not BUSY,
result of "tx_prod - tx_cons - inuse" might be zero.
Subtracting 1 unconditionally results invalid negative return value
on this case.
Make sure not to return an negative value.

Signed-off-by: Tomonori Sakita <tomonori.sakita@sord.co.jp>
Signed-off-by: Atsushi Nemoto <atsushi.nemoto@sord.co.jp>
Reviewed-by: Dalon L Westergreen <dalon.westergreen@linux.intel.com>
Acked-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netrom: switch to sock timer API

sk_reset_timer() and sk_stop_timer() properly handle
sock refcnt for timer function. Switching to them
could fix a refcounting bug reported by syzbot.

Reported-and-tested-by: syzbot+defa700d16f1bd1b9a05@syzkaller.appspotmail.com
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-hams@vger.kernel.org
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2019-01-25

1) Several patches to fix the fallout from the recent
   tree based policy lookup work. From Florian Westphal.

2) Fix VTI for IPCOMP for 'not compressed' IPCOMP packets.
   We need an extra IPIP handler to process these packets
   correctly. From Su Yanjun.

3) Fix validation of template and selector families for
   MODE_ROUTEOPTIMIZATION with ipv4-in-ipv6 packets.
   This can lead to a stack-out-of-bounds because
   flowi4 struct is treated as flowi6 struct.
   Fix from Florian Westphal.

4) Restore the default behaviour of the xfrm set-mark
   in the output path. This was changed accidentally
   when mark setting was extended to the input path.
   From Benedict Wong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"Quite a few fixes for x86: nested virtualization save/restore, AMD
  nested virtualization and virtual APIC, 32-bit fixes, an important fix
  to restore operation on older processors, and a bunch of hyper-v
  bugfixes. Several are marked stable.

  There are also fixes for GCC warnings and for a GCC/objtool interaction"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Mark expected switch fall-throughs
  KVM: x86: fix TRACE_INCLUDE_PATH and remove -I. header search paths
  KVM: selftests: check returned evmcs version range
  x86/kvm/hyper-v: nested_enable_evmcs() sets vmcs_version incorrectly
  KVM: VMX: Move vmx_vcpu_run()'s VM-Enter asm blob to a helper function
  kvm: selftests: Fix region overlap check in kvm_util
  kvm: vmx: fix some -Wmissing-prototypes warnings
  KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1
  svm: Fix AVIC incomplete IPI emulation
  svm: Add warning message for AVIC IPI invalid target
  KVM: x86: WARN_ONCE if sending a PV IPI returns a fatal error
  KVM: x86: Fix PV IPIs for 32-bit KVM host
  x86/kvm/hyper-v: recommend using eVMCS only when it is enabled
  x86/kvm/hyper-v: don't recommend doing reset via synthetic MSR
  kvm: x86/vmx: Use kzalloc for cached_vmcs12
  KVM: VMX: Use the correct field var when clearing VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL
  KVM: x86: Fix single-step debugging
  x86/kvm/hyper-v: don't announce GUEST IDLE MSR support

Merge tag 'dma-mapping-5.0-2' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:
"Fix a xen-swiotlb regression on arm64"

* tag 'dma-mapping-5.0-2' of git://git.infradead.org/users/hch/dma-mapping:
arm64/xen: fix xen-swiotlb cache flushing

Merge tag 'libnvdimm-fixes-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:
"A fix for namespace label support for non-Intel NVDIMMs that implement
  the ACPI standard label method.

  This has apparently never worked and could wait for v5.1. However it
  has enough visibility with hardware vendors [1] and distro bug
  trackers [2], and low enough risk that I decided it should go in for
  -rc4. The other fixups target the new, for v5.0, nvdimm security
  functionality. The larger init path fixup closes a memory leak and a
  potential userspace lockup due to missed notifications.

    [1] https://github.com/pmem/ndctl/issues/78
    [2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1811785

  These have all soaked in -next for a week with no reported issues.

  Summary:

   - Fix support for NVDIMMs that implement the ACPI standard label
     methods.

   - Fix error handling for security overwrite (memory leak / userspace
     hang condition), and another one-line security cleanup"

* tag 'libnvdimm-fixes-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  acpi/nfit: Fix command-supported detection
  acpi/nfit: Block function zero DSMs
  libnvdimm/security: Require nvdimm_security_setup_events() to succeed
  nfit_test: fix security state pull for nvdimm security nfit_test

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:
"A fixup for the input_event fix for y2038 Sparc64, and couple other
  minor fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: input_event - fix the CONFIG_SPARC64 mixup
  Input: olpc_apsp - assign priv->dev earlier
  Input: uinput - fix undefined behavior in uinput_validate_absinfo()
  Input: raspberrypi-ts - fix link error
  Input: xpad - add support for SteelSeries Stratus Duo
  Input: input_event - provide override for sparc64

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Count ttl-dropped frames properly in mac80211, from Bob Copeland.

2) Integer overflow in ktime handling of bcm can code, from Oliver
    Hartkopp.

3) Fix RX desc handling wrt. hw checksumming in ravb, from Simon
    Horman.

4) Various hash key fixes in hv_netvsc, from Haiyang Zhang.

5) Use after free in ax25, from Eric Dumazet.

6) Several fixes to the SSN support in SCTP, from Xin Long.

7) Do not process frames after a NAPI reschedule in ibmveth, from
    Thomas Falcon.

8) Fix NLA_POLICY_NESTED arguments, from Johannes Berg.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (42 commits)
  qed: Revert error handling changes.
  cfg80211: extend range deviation for DMG
  cfg80211: reg: remove warn_on for a normal case
  mac80211: Add attribute aligned(2) to struct 'action'
  mac80211: don't initiate TDLS connection if station is not associated to AP
  nl80211: fix NLA_POLICY_NESTED() arguments
  ibmveth: Do not process frames after calling napi_reschedule
  net: dev_is_mac_header_xmit() true for ARPHRD_RAWIP
  net: usb: asix: ax88772_bind return error when hw_reset fail
  MAINTAINERS: Update cavium networking drivers
  net/mlx4_core: Fix error handling when initializing CQ bufs in the driver
  net/mlx4_core: Add masking for a few queries on HCA caps
  sctp: set flow sport from saddr only when it's 0
  sctp: set chunk transport correctly when it's a new asoc
  sctp: improve the events for sctp stream adding
  sctp: improve the events for sctp stream reset
  ip_tunnel: Make none-tunnel-dst tunnel port work with lwtunnel
  ax25: fix possible use-after-free
  sfc: suppress duplicate nvmem partition types in efx_ef10_mtd_probe
  hv_netvsc: fix typos in code comments
  ...

Revert "block: cover another queue enter recursion via BIO_QUEUE_ENTERED"

We can't touch a bio after ->make_request_fn(), for all we know it could
already have been completed by the time this function returns.

This reverts commit 698cef173983b086977e633e46476e0f925ca01e.

Reported-by: syzbot+4df6ca820108fd248943@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

net: lmc: remove -I. header search path

The header search path -I. in kernel Makefiles is very suspicious;
it allows the compiler to search for headers in the top of $(srctree),
where obviously no header file exists.

I was able to build without this header search path.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag '5.0-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb3 fixes from Steve French:
"A set of small smb3 fixes, some fixing various crediting issues
  discovered during xfstest runs, five for stable"

* tag '5.0-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: print CIFSMaxBufSize as part of /proc/fs/cifs/DebugData
  smb3: add credits we receive from oplock/break PDUs
  CIFS: Fix mounts if the client is low on credits
  CIFS: Do not assume one credit for async responses
  CIFS: Fix credit calculations in compound mid callback
  CIFS: Fix credit calculation for encrypted reads with errors
  CIFS: Fix credits calculations for reads with errors
  CIFS: Do not reconnect TCP session in add_credits()
  smb3: Cleanup license mess
  CIFS: Fix possible hang during async MTU reads and writes
  cifs: fix memory leak of an allocated cifs_ntsd structure

Merge tag 'vfio-v5.0-rc4' of git://github.com/awilliam/linux-vfio

Pull VFIO fixes from Alex Williamson:

- cleanup licenses in new files (Thomas Gleixner)

- cleanup new compiler warnings (Alexey Kardashevskiy)

* tag 'vfio-v5.0-rc4' of git://github.com/awilliam/linux-vfio:
vfio-pci/nvlink2: Fix ancient gcc warnings
vfio/pci: Cleanup license mess

atheros: atl2: replace dev_kfree_skb_any() by dev_consume_skb_any()

atl2_xmit_frame() should call dev_consume_skb_any() when the
transmission is successful. It makes drop profiles more friendly.

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Six fixes, all of which appear to have user visible consequences.

  The DMA one is a regression fix from the merge window and of the
  others, four are driver specific and one specific to the target code"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: Use explicit access size in ufshcd_dump_regs
  scsi: tcmu: fix use after free
  scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state()
  scsi: lpfc: nvmet: avoid hang / use-after-free when destroying targetport
  scsi: lpfc: nvme: avoid hang / use-after-free when destroying localport
  scsi: communicate max segment size to the DMA mapping code

Merge branch 'jmp32-insns'

Jiong Wang says:

====================
v3 -> v4:
- Fixed rebase issue. JMP32 checks were missing in two new functions:
    + kernel/bpf/verifier.c:insn_is_cond_jump
    + drivers/net/ethernet/netronome/nfp/bpf/main.h:is_mbpf_cond_jump
   (Daniel)
- Further rebased on top of latest llvm-readelf change.

v2 -> v3:
- Added missed check on JMP32 inside bpf_jit_build_body. (Sandipan)
- Wrap ?: statements in s390 port with brace. They are used by macros
   which doesn't guard the operand with brace.
- Fixed the ',' issues test_verifier change.
- Reorder two selftests patches to be near each other.
- Rebased on top of latest bpf-next.

v1 -> v2:
- Updated encoding. Use reserved insn class 0x6 instead of packing with
   existing BPF_JMP. (Alexei)
- Updated code comments in s390 port. (Martin)
- Separate JIT function for jeq32_imm in NFP port. (Jakub)
- Re-implemented auto-testing support. (Jakub)
- Moved testcases to test_verifer.c, plus more unit tests. (Jakub)
- Fixed JEQ/JNE range deduction. (Jakub)
- Also supported JSET in this patch set.
- Fixed/Improved range deduction for all the other operations. All C
   programs under bpf selftest passed verification now.
- Improved min/max code implementation.
- Fixed bpftool/disassembler.

Current eBPF ISA has 32-bit sub-register and has defined a set of ALU32
instructions.

However, there is no JMP32 instructions, the consequence is code-gen for
32-bit sub-registers is not efficient. For example, explicit sign-extension
from 32-bit to 64-bit is needed for signed comparison.

Adding JMP32 instruction therefore could complete eBPF ISA on 32-bit
sub-register support. This also match those JMP32 instructions in most JIT
backends, for example x64-64 and AArch64. These new eBPF JMP32 instructions
could have one-to-one map on them.

A few verifier ALU32 related bugs has been fixed recently, and JMP32
introduced by this set further improves BPF sub-register ecosystem. Once
this is landed, BPF programs using 32-bit sub-register ISA could get
reasonably good support from verifier and JIT compilers. Users then could
compare the runtime efficiency of one BPF program under both modes, and
could use the one shown better from benchmark result.

From benchmark results on some Cilium BPF programs, for 64-bit arches,
after JMP32 introduced, programs compiled with -mattr=+alu32 (meaning
enable sub-register usage) are smaller in code size and generally smaller
in verifier processed insn number.

Benchmark results
===
Text size in bytes (generated by "size")
---
LLVM code-gen option   default  alu32  alu32/jmp32  change Vs.  change Vs.
                                                    alu32       default
bpf_lb-DLB_L3.o:       6456     6280   6160         -1.91%      -4.58%
bpf_lb-DLB_L4.o:       7848     7664   7136         -6.89%      -9.07%
bpf_lb-DUNKNOWN.o:     2680     2664   2568         -3.60%      -4.18%
bpf_lxc.o:             104824   104744 97360        -7.05%      -7.12%
bpf_netdev.o:          23456    23576  21632        -8.25%      -7.78%
bpf_overlay.o:         16184    16304  14648        -10.16%     -9.49%

Processed instruction number
---
LLVM code-gen option   default  alu32  alu32/jmp32  change Vs.  change Vs.
                                                    alu32       default
bpf_lb-DLB_L3.o:       1579     1281   1295         +1.09%      -17.99%
bpf_lb-DLB_L4.o:       2045     1663   1556         -6.43%      -23.91%
bpf_lb-DUNKNOWN.o:     606      513    501          -2.34%      -17.33%
bpf_lxc.o:             85381    103218 94435        -8.51%      +10.60%
bpf_netdev.o:          5246     5809   5200         -10.48%     -0.08%
bpf_overlay.o:         2443     2705   2456         -9.02%      -0.53%

It is even better for 32-bit arches like x32, arm32 and nfp etc, as now
some conditional jump will become JMP32 which doesn't require code-gen for
high 32-bit comparison.

Encoding
===
The new JMP32 instructions are using new BPF_JMP32 class which is using
the reserved eBPF class number 0x6. And BPF_JA/CALL/EXIT only exist for
BPF_JMP, they are reserved opcode for BPF_JMP32.

LLVM support
===
A couple of unit tests has been added and included in this set. Also LLVM
code-gen for JMP32 has been added, so you could just compile any BPF C
program with both -mcpu=probe and -mattr=+alu32 specified. If you are
compiling on a machine with kernel patched by this set, LLVM will select
the ISA automatically based on host probe results. Otherwise specify
-mcpu=v3 and -mattr=+alu32 could also force use JMP32 ISA.

   LLVM support could be found at:

     https://github.com/Netronome/llvm/tree/jmp32-v2

   (clang driver also taught about the new "v3" processor, will send out
    merge request for both clang and llvm once kernel set landed.)

JIT backends support
===
A couple of JIT backends has been supported in this set except SPARC and
MIPS. It shouldn't be a big issue for these two ports as LLVM default won't
generate JMP32 insns, it will only generate them when host machine is
probed to be with the support.

Thanks.
====================

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests: bpf: makefile support sub-register code-gen test mode

This patch enables testing some eBPF programs under sub-register
compilation mode.

Only enable this when there is BPF_JMP32 support on both LLVM and kernel.
This is because only after BPF_JMP32 added, code-gen for complex program
under sub-register mode will be clean enough to pass verification.

This patch splits TEST_GEN_FILES into BPF_OBJ_FILES and
BPF_OBJ_FILES_DUAL_COMPILE. The latter are those objects we would like to
compile for both default and sub-register mode. They are also objects used
by "test_progs".

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests: bpf: functional and min/max reasoning unit tests for JMP32

This patch adds unit tests for new JMP32 instructions.

This patch also added the new BPF_JMP32_REG and BPF_JMP32_IMM macros to
samples/bpf/bpf_insn.h so that JMP32 insn builders are available to tests
under 'samples' directory.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

nfp: bpf: implement jitting of JMP32

This patch implements code-gen for new JMP32 instructions on NFP.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

s390: bpf: implement jitting of JMP32

This patch implements code-gen for new JMP32 instructions on s390.

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

ppc: bpf: implement jitting of JMP32

This patch implements code-gen for new JMP32 instructions on ppc.

For JMP32 | JSET, instruction encoding for PPC_RLWINM_DOT is added to check
the result of ANDing low 32-bit of operands.

Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

arm: bpf: implement jitting of JMP32

This patch implements code-gen for new JMP32 instructions on arm.

For JSET, "ands" (AND with flags updated) is used, so corresponding
encoding helper is added.

Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

arm64: bpf: implement jitting of JMP32

This patch implements code-gen for new JMP32 instructions on arm64.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

x32: bpf: implement jitting of JMP32

This patch implements code-gen for new JMP32 instructions on x32.
Also fixed several reverse xmas tree coding style issues as I am there.

Cc: Wang YanQing <udknight@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

x86_64: bpf: implement jitting of JMP32

This patch implements code-gen for new JMP32 instructions on x86_64.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: JIT blinds support JMP32

This patch adds JIT blinds support for JMP32.

Like BPF_JMP_REG/IMM, JMP32 version are needed for building raw bpf insn.
They are added to both include/linux/filter.h and
tools/include/linux/filter.h.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: interpreter support for JMP32

This patch implements interpreting new JMP32 instructions.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

tools: bpftool: teach cfg code about JMP32

The cfg code need to be aware of the new JMP32 instruction class so it
could partition functions correctly.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: disassembler support JMP32

This patch teaches disassembler about JMP32. There are two places to
update:

  - Class 0x6 now used by BPF_JMP32, not "unused".

  - BPF_JMP32 need to show comparison operands properly.
    The disassemble format is to add an extra "(32)" before the operands if
    it is a sub-register. A better disassemble format for both JMP32 and
    ALU32 just show the register prefix as "w" instead of "r", this is the
    format using by LLVM assembler.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: verifier support JMP32

This patch teach verifier about the new BPF_JMP32 instruction class.
Verifier need to treat it similar as the existing BPF_JMP class.
A BPF_JMP32 insn needs to go through all checks that have been done on
BPF_JMP.

Also, verifier is doing runtime optimizations based on the extra info
conditional jump instruction could offer, especially when the comparison is
between constant and register that the value range of the register could be
improved based on the comparison results. These code are updated
accordingly.

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: refactor verifier min/max code for condition jump

The current min/max code does both signed and unsigned comparisons against
the input argument "val" which is "u64" and there is explicit type casting
when the comparison is signed.

As we will need slightly more complexer type casting when JMP32 introduced,
it is better to host the signed type casting. This makes the code more
clean with ignorable runtime overhead.

Also, code for J*GE/GT/LT/LE and JEQ/JNE are very similar, this patch
combine them.

The main purpose for this refactor is to make sure the min/max code will
still be readable and with minimum code duplication after JMP32 introduced.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: allocate 0x06 to new eBPF instruction class JMP32

The new eBPF instruction class JMP32 uses the reserved class number 0x6.
Kernel BPF ISA documentation updated accordingly.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Merge tag 'for-linus-20190125' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"A collection of fixes for this release. This contains:

   - Silence sparse rightfully complaining about non-static wbt
     functions (Bart)

   - Fixes for the zoned comments/ioctl documentation (Damien)

   - direct-io fix that's been lingering for a while (Ernesto)

   - cgroup writeback fix (Tejun)

   - Set of NVMe patches for nvme-rdma/tcp (Sagi, Hannes, Raju)

   - Block recursion tracking fix (Ming)

   - Fix debugfs command flag naming for a few flags (Jianchao)"

* tag 'for-linus-20190125' of git://git.kernel.dk/linux-block:
  block: Fix comment typo
  uapi: fix ioctl documentation
  blk-wbt: Declare local functions static
  blk-mq: fix the cmd_flag_name array
  nvme-multipath: drop optimization for static ANA group IDs
  nvmet-rdma: fix null dereference under heavy load
  nvme-rdma: rework queue maps handling
  nvme-tcp: fix timeout handler
  nvme-rdma: fix timeout handler
  writeback: synchronize sync(2) against cgroup writeback membership switches
  block: cover another queue enter recursion via BIO_QUEUE_ENTERED
  direct-io: allow direct writes to empty inodes

Merge branch 'ip_tunnel-next'

wenxu says:

====================
ip_tunnel: Refactor ip_gre collect metadata xmit to ip_md_tunnel_xmit

This patchset add tunnel_dst_cache and tnl_update_pmtu feature for
ip_md_tunnel_xmit also bugfix. Then Refactor collect metatdata mode
tunnel xmit to ip_md_tunnel_xmit
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ip_gre: Refactor collect metatdata mode tunnel xmit to ip_md_tunnel_xmit

Refactor collect metatdata mode tunnel xmit to the generic xmit function
ip_md_tunnel_xmit. It makes codes more generic and support more feture
such as pmtu_update through ip_md_tunnel_xmit

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_tunnel: Fix route fl4 init in ip_md_tunnel_xmit

Init the gre_key from tuninfo->key.tun_id and init the mark
from the skb->mark, set the oif to zero in the collect metadata
mode.

Fixes: cfc7381b3002 ("ip_tunnel: add collect_md mode to IPIP tunnel")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit

Add tnl_update_pmtu in ip_md_tunnel_xmit to dynamic modify
the pmtu which packet send through collect_metadata mode
ip tunnel

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_tunnel: Add ip tunnel dst_cache in ip_md_tunnel_xmit

Add ip tunnel dst cache in ip_md_tunnel_xmit to make more
efficient for the route lookup.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

lan743x: Provide Read/Write Access to on chip OTP

The LAN743x includes on chip One-Time-Programmable (OTP) memory.

This patch extends the ethtool EEPROM read/write interface to
access OTP memory space.

The currently existing interface is limited, as it does not
allow OTP read, and OTP writes are restricted to
offset==0, length==512, and data[0]==0xF3.

This patch removes these restrictions and adds a private flag
called OTP_ACCESS, which is used to switch between EEPROM, and
OTP modes.

The private flag OTP_ACCESS is configurable through the
  ethtool --set-priv-flags command.
And visible through the
  ethtool --show-priv-flags command.

By default OTP_ACCESS is false, and there for previously existing
EEPROM commands will work exactly the same. However now access to
OTP requires one extra step of setting OTP_ACCESS to true. This
flag controls the read, write, and length reporting, functions
of ethtool.

EEPROM presence is not checked when setting or clearing this flag.
If the EEPROM is not present, the user, as before, will need to
diagnose that using existing read and write function of ethtool,
while OTP_ACCESS is false.

Updates for V2:
Added comments as to why this patch is needed.
Added comments explaining that EEPROM presence is not check
  when setting or clearing the OTP_ACCESS flag.
Added length checking to all otp/eeprom read/write functions.

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'hns3-next'

Huazhong Tan says:

====================
code optimizations & bugfixes for HNS3 driver

This patchset includes bugfixes and code optimizations for the HNS3
ethernet controller driver
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: don't allow vf to enable promisc mode

VF can receive packets of other functions when in promisc
mode. It's not safe, so don't allow VF to enable promisc
mode.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add initialization for nic state

This patch adds initialization for nic state, sets flag
HNS3_NIC_STATE_DOWN when initialize, clears it before
vectors and napi being enabled in the hns3_nic_net_up(),
and sets it back in the error handler.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add 8 BD limit for tx flow

A single transmit packet can span up to 8 descriptors according
to the HW limit. If a skb has more than 8 frags, driver uses
skb_copy to get a new skb which has less frags.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: call hns3_nic_set_real_num_queue with netdev down

hns3_client_setup_tc in enet is for updating TC configuration to
stack, and hclge_setup_tc in hclge_dcb is mainly for setting the
configuration to hardware.

This patch removes the hns3_nic_set_real_num_queue from
hns3_setup_tc in enet, and call hclge_client_setup_tc to update
TC configuration to stack with netdev down, because the netdev
down operation is done in hclge_dcb now.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: remove dcb_ops->map_update in hclge_dcb

After doing down/uninit/init/up in hclge_dcb, it is not necessary
to call dcb_ops->map_update in enet, so hclge_map_update can be
called directly in hclge_dcb.

This is for preparing to call hns3_nic_set_real_num_queue with
netdev down when user changes mqprio configuration.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: do reinitialization while mqprio configuration changed

When user changes the mqprio configuration, enet need to be
uninited and inited besides down'ed and up'ed, because the queue
num may change when the TC num changes.

Also, it is more suitable to do the down/unint/init/up operation
in hclge module using hclge_notify_client, because this config
change may affect PF and its VF.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: After setting the loopback, add the status of getting MAC

After setting the serdes loopback, you need to determine
the status of the MAC negotiation. If a status exception
is obtained after 200ms, a timeout error is returned.

Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: fix broadcast promisc issue for revision 0x20

For revision 0x20, vlan filter is always bypassed when enable
broadcast promisc mode. In this case, broadcast packets with
any vlan id can be accpeted. We should disable broadcast promisc
mode until user want enable it.

Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: fix return value handle issue for hclge_set_loopback()

In current code, it always return 0, even loopback mode setting failed.
It's incorrect. This patch fixes return value handle for loopback test.

Fixes: 0f29fc23b21d ("net: hns3: Fix for loopback selftest failed problem")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add error handling in hclge_ieee_setets

Currently hclge_ieee_setets returns error directly when there is
error, which may cause netdev not up problem.

This patch adds some error handling when setting ETS configuration
fails.

Fixes: cacde272dd00 ("net: hns3: Add hclge_dcb module for the support of DCB feature")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: clear pci private data when unload hns3 driver

When unload hns3 driver, we should clear the pci private data.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: don't update packet statistics for packets dropped by hardware

Packet statistics for netdev should not include the packets dropped
by hardware.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'r8169-add-EEE-support-for-RTL8168f'

Heiner Kallweit says:

====================
r8169: add EEE support for RTL8168f

This series adds EEE support for RTL8168f. Again first patch adds the
support, and second patch enables EEE per default.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: enable EEE per default on RTL8168f

Enable EEE per default on RTL8168f.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: add EEE support for RTL8168f

Add EEE support for RTL8168f to the recently added EEE handling
framework in the driver. This patch leaves the chip defaults, means
EEE typically is disabled initially and it's up to the user to enable
it via ethtool.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: net: phy: switch documentation to rst format

Switch phylib documentation to rst format.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: allow zerocopy with fastopen

Accept MSG_ZEROCOPY in all the TCP states that allow sendmsg. Remove
the explicit check for ESTABLISHED and CLOSE_WAIT states.

This requires correctly handling zerocopy state (uarg, sk_zckey) in
all paths reachable from other TCP states. Such as the EPIPE case
in sk_stream_wait_connect, which a sendmsg() in incorrect state will
now hit. Most paths are already safe.

Only extension needed is for TCP Fastopen active open. This can build
an skb with data in tcp_send_syn_data. Pass the uarg along with other
fastopen state, so that this skb also generates a zerocopy
notification on release.

Tested with active and passive tcp fastopen packetdrill scripts at
https://github.com/wdebruij/packetdrill/commit/1747eef03d25a2404e8132817d0f1244fd6f129d

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ptp: fix debugfs_simple_attr.cocci warnings

Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE
for debugfs files.

Semantic patch information:
Rationale: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file()
imposes some significant overhead as compared to
DEFINE_DEBUGFS_ATTRIBUTE + debugfs_create_file_unsafe().

Generated by: scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'r8169-add-EEE-support-for-RTL8168g+'

Heiner Kallweit says:

====================
r8169: add EEE support for RTL8168g+

This series adds general EEE support to be used with ethtool.
In addition it implements EEE for chip versions from RTL8168g.
The first patch leaves the default chip settings and the
second enables EEE per default. This allows us to revert patch 2
w/o removing EEE support completely if we should face issues with
EEE on particular chip versions.

Unfortunately Realtek decided not to use the standard EEE MMD
registers but to use proprietary registers. Therefore we can't
use phylib functions like phy_ethtool_set_eee and have to
reimplement the functionality.

Tested on a system with RTL8168g (chip version 40).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: enable EEE per default on chip versions from RTL8168g

Enable EEE per default on chip versions from RTL8168g.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: add general EEE support for chip versions from RTL8168g

This patch adds the general framework to deal with EEE in this driver
plus EEE support for chip versions from RTL8168g. We don't touch the
default chip settings, therefore EEE will usually be disabled and it's
up to the user to enable it.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ipv6-defrag-rbtree'

Peter Oskolkov says:

====================
net: IP defrag: use rbtrees in IPv6 defragmentation

Currently, IPv6 defragmentation code drops non-last fragments that
are smaller than 1280 bytes: see
commit 0ed4229b08c1 ("ipv6: defrag: drop non-last frags smaller than min mtu")

This behavior is not specified in IPv6 RFCs and appears to break compatibility
with some IPv6 implementations, as reported here:
https://www.spinics.net/lists/netdev/msg543846.html

This patchset contains four patches:
- patch 1 moves rbtree-related code from IPv4 to files shared b/w
IPv4/IPv6
- patch 2 changes IPv6 defragmenation code to use rbtrees for defrag
queue
- patch 3 changes nf_conntrack IPv6 defragmentation code to use rbtrees
- patch 4 changes ip_defrag selftest to test changes made in the
previous three patches.

Along the way, the 1280-byte restrictions are removed.

I plan to introduce similar changes to 6lowpan defragmentation code
once I figure out how to test it.
====================

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: net: ip_defrag: cover new IPv6 defrag behavior

This patch adds several changes to the ip_defrag selftest, to cover
new IPv6 defrag behavior:

- min IPv6 frag size is now 8 instead of 1280

- new test cases to cover IPv6 defragmentation in nf_conntrack_reasm.c

- new "permissive" mode in negative (overlap) tests: netfilter
sometimes drops invalid packets without passing them to IPv6
underneath, and thus defragmentation sometimes succeeds when
it is expected to fail; so the permissive mode does not fail the
test if the correct reassembled datagram is received instead of a
timeout.

Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c

Currently, IPv6 defragmentation code drops non-last fragments that
are smaller than 1280 bytes: see
commit 0ed4229b08c1 ("ipv6: defrag: drop non-last frags smaller than min mtu")

This behavior is not specified in IPv6 RFCs and appears to break
compatibility with some IPv6 implemenations, as reported here:
https://www.spinics.net/lists/netdev/msg543846.html

This patch re-uses common IP defragmentation queueing and reassembly
code in IP6 defragmentation in nf_conntrack, removing the 1280 byte
restriction.

Signed-off-by: Peter Oskolkov <posk@google.com>
Reported-by: Tom Herbert <tom@herbertland.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: IP6 defrag: use rbtrees for IPv6 defrag

Currently, IPv6 defragmentation code drops non-last fragments that
are smaller than 1280 bytes: see
commit 0ed4229b08c1 ("ipv6: defrag: drop non-last frags smaller than min mtu")

This behavior is not specified in IPv6 RFCs and appears to break
compatibility with some IPv6 implemenations, as reported here:
https://www.spinics.net/lists/netdev/msg543846.html

This patch re-uses common IP defragmentation queueing and reassembly
code in IPv6, removing the 1280 byte restriction.

Signed-off-by: Peter Oskolkov <posk@google.com>
Reported-by: Tom Herbert <tom@herbertland.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: IP defrag: encapsulate rbtree defrag code into callable functions

This is a refactoring patch: without changing runtime behavior,
it moves rbtree-related code from IPv4-specific files/functions
into .h/.c defrag files shared with IPv6 defragmentation code.

Signed-off-by: Peter Oskolkov <posk@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 's390-qeth-next'

Julian Wiedmann says:

====================
s390/qeth: updates 2019-01-25

please apply a first batch of qeth patches for net-next, primarily touching the
net_device parts of the driver.
In addition to the usual refactoring & code consolidation, patch 7 makes use of
netif_device_detach() to let the stack know when our control plane is down. This
helps quite a bit wrt to overall locking and proper init/shutdown sequencing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: remove VLAN tracking for L2 devices

For recovery purposes, qeth keeps track of all registered VIDs. Replace
this by using the infrastructure introduced in
commit 9daae9bd47cf ("net: Call add/kill vid ndo on vlan filter feature toggling").

By managing NETIF_F_HW_VLAN_CTAG_FILTER as a hw_feature,
netdev_update_features() will select it from dev->wanted_features
and replay all of the netdevice's VIDs to its ndo_vlan_rx_add_vid()
callback.
z/VM NICs strictly require VLAN registration, so don't expose it as
hw_feature there but add a little hack in qeth_enable_hw_features()
to make things work regardless.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: detach netdevice while card is offline

When a qeth card is offline, it has no connection to the HW. So none of
our control callbacks can run IO against it, and we can only cache the
input (eg a new MAC address) without providing proper feedback to the
caller. In this context, it seems much more reasonable to simply detach
the netdevice and let the kernel reject any interaction with it.

This also makes all sorts of internal state checks and locking obsolete.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: delay netdevice registration

Re-order the code flow a bit so that all initial HW setup is done before
putting the netdevice into play. For a netdevice that hasn't been
registered before, we also don't need to re-enable its HW features or
check for recovery actions.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: remove TX disable from online path

At best this is redundant, at worst it papers over a race in the
offline / online code.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: register MAC address earlier

commit 4789a2188048 ("s390/qeth: fix race when setting MAC address")
resolved a race where our initial programming of dev_addr into the HW
and a call to ndo_set_mac_address() could run concurrently. In this
case, we could end up getting confused about which address was actually
set in the HW.

The quick fix was to introduce additional locking that blocks any
ndo_set_mac_address() while the device is being set online. But the race
primarily originated from the fact that we first register the netdevice,
and only then program its dev_addr. By re-ordering this sequence,
userspace will only be able to change the MAC address _after_ we have
finished with setting the initial dev_addr.

Still, the same MAC address race can also occur during a subsequent call
to qeth_l2_set_online(). So keep around the locking for now, until a
follow-up patch fully resolves this.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: consolidate open/stop netdev ops

The L2 and L3 code for these ops is almost identical, we only need to
provide a custom ndo_validate_addr() for L2 that checks whether
programming the MAC address succeeded.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: remove bogus netif_wake_queue()

qeth_qdio_cq_handler() doesn't replenish the Output Queue(s), and thus
has no reason to wake the txq.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390/qeth: streamline TX buffer management

Consolidate the code that marks the current buffer to be flushed, and
let qeth_fill_buffer() advance the Output Queue's buffer cursor.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: net: phy: reflect latest changes to phylib API

Recent changes to the phylib API
- removed phy_stop_interrupts
- replaced phy_start_interrupts with phy_request_interrupt
- moved some functionality from phy_connect() and phy_disconnect()
to phy_start() and phy_stop() respectively.
Reflect these changes in the documentation.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mlx5-updates-2019-01-25' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2019-01-25

This series provides some updates to mlx5 driver,

From Tariq,
1) Make sure RX packet header does not cross page boundary
   To avoid page boundary crossing, use stride size that fits
   the maximum possible header. Stride is increased form 64B to 256B.

2) CQ struct cleanup: Take CQ decompress fields into a separate structure

From Moshe,
3) Expand XPS cpumask to cover all online cpus

From Jason Gunthorpe and Tariq:
4) Compilation warning cleanup

From Or,
5) Add trace points for flow tables create/destroy

From Saeed,
6) Software stats update/folding improvements
   this also solves a compilation warning on 32bit systems that was reported
   last release cycle by Arnd and Andrew.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Revert error handling changes.

This is new code and not bug fixes.

This reverts all changes added by merge commit
8fb18be93efd7292d6ee403b9f61af1008239639

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mmc-v5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:

- sdhci-acpi: Fixup build dependency for PCI

- sdhci-omap: Resolve Kconfig warnings on keystone

- sdhci-iproc: Propagate errors from DT parsing

- meson-gx: Fixup IRQ handling in release callback

- meson-gx: Use signal re-sampling to fixup tuning

- dw_mmc-bluefield: Fix the license information

* tag 'mmc-v5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: dw_mmc-bluefield: : Fix the license information
  mmc: meson-gx: enable signal re-sampling together with tuning
  mmc: sdhci-iproc: handle mmc_of_parse() errors during probe
  mmc: meson-gx: Free irq in release() callback
  mmc: host: Fix Kconfig warnings on keystone_defconfig
  mmc: sdhci-acpi: Make PCI dependency explicit

Merge tag 'char-misc-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
"Here are some small char and misc driver fixes to resolve some
  reported issues, as well as a number of binderfs fixups that were
  found after auditing the filesystem code by Al Viro. As binderfs
  hasn't been in a previous release yet, it's good to get these in now
  before the first users show up.

  All of these have been in linux-next for a bit with no reported
  issues"

* tag 'char-misc-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (26 commits)
  i3c: master: Fix an error checking typo in 'cdns_i3c_master_probe()'
  binderfs: switch from d_add() to d_instantiate()
  binderfs: drop lock in binderfs_binder_ctl_create
  binderfs: kill_litter_super() before cleanup
  binderfs: rework binderfs_binder_device_create()
  binderfs: rework binderfs_fill_super()
  binderfs: prevent renaming the control dentry
  binderfs: remove outdated comment
  binderfs: use __u32 for device numbers
  binderfs: use correct include guards in header
  misc: pvpanic: fix warning implicit declaration
  char/mwave: fix potential Spectre v1 vulnerability
  misc: ibmvsm: Fix potential NULL pointer dereference
  binderfs: fix error return code in binderfs_fill_super()
  mei: me: add denverton innovation engine device IDs
  mei: me: mark LBG devices as having dma support
  mei: dma: silent the reject message
  binderfs: handle !CONFIG_IPC_NS builds
  binderfs: reserve devices for initial mount
  binderfs: rename header to binderfs.h
  ...

Merge tag 'staging-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg KH:
"Here are some small staging driver fixes for 5.0-rc4.

  They resolve some reported bugs and add a new device id for one
  driver. Nothing major at all, but all good to have.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'staging-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: android: ion: Support cpu access during dma_buf_detach
  staging: rtl8723bs: Fix build error with Clang when inlining is disabled
  staging: rtl8188eu: Add device code for D-Link DWA-121 rev B1
  staging: vchiq: Fix local event signalling
  Staging: wilc1000: unlock on error in init_chip()
  staging: wilc1000: fix memory leak in wilc_add_rx_gtk
  staging: wilc1000: fix registration frame size

Merge tag 'tty-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
"Here are a number of small tty core and serial driver fixes for
  5.0-rc4 to resolve some reported issues.

  Nothing major, the small serial driver fixes, a tty core fixup for a
  crash that was reported, and some good vt fixes from Nicolas Pitre as
  he seems to be auditing that chunk of code a lot lately.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'tty-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: fsl_lpuart: fix maximum acceptable baud rate with over-sampling
  tty: serial: qcom_geni_serial: Allow mctrl when flow control is disabled
  tty: Handle problem if line discipline does not have receive_buf
  vgacon: unconfuse vc_origin when using soft scrollback
  vt: invoke notifier on screen size change
  vt: always call notifier with the console lock held
  vt: make vt_console_print() compatible with the unicode screen buffer
  tty/n_hdlc: fix __might_sleep warning
  serial: 8250: Fix serial8250 initialization crash
  uart: Fix crash in uart_write and uart_put_char

Merge tag 'usb-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB/PHY fixes from Greg KH:
"Here are a number of small USB and PHY driver fixes for 5.0-rc4.

  Nothing major at all, just the usual selection of USB gadget bugfixes,
  some new USB serial driver ids, some SPDX fixes, and some PHY driver
  fixes for reported issues.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: keyspan_usa: add proper SPDX lines for .h files
  USB: EHCI: ehci-mv: add MODULE_DEVICE_TABLE
  USB: leds: fix regression in usbport led trigger
  usb: chipidea: fix static checker warning for NULL pointer
  MAINTAINERS: email address update in MAINTAINERS entries
  USB: usbip: delete README file
  USB: serial: pl2303: add new PID to support PL2303TB
  usb: dwc2: gadget: Fix Remote Wakeup interrupt bit clearing
  phy: ath79-usb: Fix the main reset name to match the DT binding
  phy: ath79-usb: Fix the power on error path
  phy: fix build breakage: add PHY_MODE_SATA
  phy: ti: ensure priv is not null before dereferencing it
  USB: serial: ftdi_sio: fix GPIO not working in autosuspend
  usb: gadget: Potential NULL dereference on allocation error
  usb: dwc3: gadget: Fix the uninitialized link_state when udc starts
  usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup
  usb: dwc3: gadget: synchronize_irq dwc irq in suspend
  USB: serial: simple: add Motorola Tetra TPG2200 device id

net/mlx5e: Reuse fold sw stats in representors

Representors software stats are basic, this patch is reusing the
mlx5e_fold_sw_stats in representors, which sums up the basic stats64 for a
mlx5e netdevice.

Fixes: 8bfaf07f7806 ("net/mlx5e: Present SW stats when state is not opened")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Present the representors SW stats when state is not opened

This behavior is already adopted for all other cases in the cited patch.
The representor's functions were missed, here we modify the them to
behave similarly.

Fixes: 8bfaf07f7806 ("net/mlx5e: Present SW stats when state is not opened")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Separate between ethtool and netdev software stats folding

mlx5e_grp_sw_update_stats can be called from two threads,
1) ndo_get_stats64
2) get_ethtool_stats

For this reason and to minimize concurrency issue impact on 64bit machines
mlx5e_grp_sw_update_stats folds the software stats into a temporary
variable then copies it to the global driver stats, both ethtool and ndo
statistics callbacks will use the global software stats variable to report
whatever stats they need.

Actually ndo_get_stats64 doesn't need to fold the whole software stats
(mlx5e_grp_sw_update_stats), all it needs is five counters to fill the
rtnl_link_stats64 relevant stats parameter.

Hence this patch introduces a simpler helper function to fold software
stats for ndo_get_stats64 which will work directly on rtnl_link_stats64
stats parameter and not on the global or even temporary mlx5e_sw_stats
variable.

Since now mlx5e_grp_sw_update_stats is not called by ndo_get_stats64 we
can make it static and remove the temp var.

Unlike mlx5e_grp_sw_update_stats the new fold stats function doesn't
need to zero out the output statistics parameter since it is already
done by the stack @dev_get_stats().

This patch is fixing stack usage of mlx5e_grp_sw_update_stats on
x86 gcc-4.9 and higher, the concurrency issue between mlx5's
ndo_get_stats64 and get_ethtool_stats is resolved as well.

Fixes: 8bfaf07f7806 ("net/mlx5e: Present SW stats when state is not opened")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Add trace points for flow tables create/destroy

We were not tracking flow tables so far, add it up.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Return the allocated flow directly from __mlx5e_add_fdb_flow

This confusing construction confuses the compiler which can't see
that flow is initialized if !err:

drivers/net/ethernet/mellanox/mlx5/core/en_tc.c: In function `mlx5e_configure_flower`
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:2727:28: warning:
`flow` may be used uninitialized in this function
[-Wmaybe-uninitialized]

There is no reason for two function outputs, just return the
pointer directly and use ERR_PTR to encode a failure.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Expand XPS cpumask to cover all online cpus

Currently we have one cpu in XPS cpumask per tx queue, this is good
enough for default configuration where there is a tx queue per cpu.
However, once configuration changes to use less tx queues, part of the
cpus are not XPS-mapped and so the select queue decision falls back to
hash calculation and balancing is not guaranteed.

Expand XPS cpumask to enable using all cpus even when number of tx
queues is smaller than number of cpus.

Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Take CQ decompress fields into a separate structure

Only the Receive CQ makes use of these fields. Take them
out into a separate struct and save space in the generic
CQ structure.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: RX, Make sure packet header does not cross page boundary

In the non-linear SKB memory scheme of Striding RQ, a packet header
could cross page boundary. This requires special care in fast path
that costs LoC, additional runtime instructions and branches.

It could happen when the header (up to 256B) does not fit in
a single stride. Avoid this by working with a stride size that fits
the maximum possible header. Stride is increased form 64B to 256B.

Performance:
Tested packet rate for UDP streams, single ring, on ConnectX-5.

Configuration:
Set Striding RQ and LRO ON (to enabled the non-linear SKB scheme).
GRO OFF, early drop by TC rule.

64B: 4x worse memory utilization, no page-crossers headers
- No degradation (5,887,305 pps).
- The reduction in memory utilization is compensated by the saving of
branches tests.

192B: 1.33x worse memory utilization, avoid page-crossers headers
- Before: 5,727,252. After: 5,777,037. ~1% gain.

256B: Same memory util, no page-crossers
- Before: 5,691,885. After: 5,748,007. ~1% gain.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Unblock setting vid 0 for VFs through the uplink rep

It turns out that libvirt uses 0-vid as a default if no vlan was
set for the guest (which is the case for switchdev mode) and errs
if we disallow that:

error: Failed to start domain vm75
error: Cannot set interface MAC/vlanid to 6a:66:2d:48:92:c2/0 \
for ifname enp59s0f0 vf 0: Operation not supported

So allow this in order not to break existing systems.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Maor Dickman <maord@mellanox.com>
Reviewed-by: Gavi Teitz <gavi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Move to use common phys port names for vport representors

With VF LAG commit 491c37e49b48 "net/mlx5e: In case of LAG, one switch
parent id is used for all representors", both uplinks and all the VFs
(on both of them) get the same switchdev id.

This cause the provisioning system method to identify the rep of a given
VF from the parent PF PCI device using switchev id and physical port
name to break, since VFm of PF0 will have the (id, name) as VFm of PF1.

To fix that, we align to use the framework agreed upstream and set by
nfp commit 168c478e107e "nfp: wire get_phys_port_name on representors":

$ cat /sys/class/net/eth4_*/phys_port_name
p0
pf0vf0
pf0vf1

Now, the names will be different, e.g. pf0vf0 vs. pf1vf0.

Fixes: 491c37e49b48 ("net/mlx5e: In case of LAG, one switch parent id is used for all representors")
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Waleed Musa <waleedm@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Allow MAC invalidation while spoofchk is ON

Prior to this patch the driver prohibited spoof checking on invalid MAC.
Now the user can set this configuration if it wishes to.

This is required since libvirt might invalidate the VF Mac by setting it
to zero, while spoofcheck is ON.

Fixes: 1ab2068a4c66 ("net/mlx5: Implement vports admin state backup/restore")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5: Take lock with IRQs disabled to avoid deadlock

The lock in qp_table might be taken from process context or from
interrupt context. This may lead to a deadlock unless it is taken with
IRQs disabled.

Discovered by lockdep

================================
WARNING: inconsistent lock state
4.20.0-rc6
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W}

python/12572 [HC1[1]:SC0[0]:HE0:SE1] takes:
00000000052a4df4 (&(&table->lock)->rlock#2){?.+.}, /0x50 [mlx5_core]
{HARDIRQ-ON-W} state was registered at:
  _raw_spin_lock+0x33/0x70
  mlx5_get_rsc+0x1a/0x50 [mlx5_core]
  mlx5_ib_eqe_pf_action+0x493/0x1be0 [mlx5_ib]
  process_one_work+0x90c/0x1820
  worker_thread+0x87/0xbb0
  kthread+0x320/0x3e0
  ret_from_fork+0x24/0x30
irq event stamp: 103928
hardirqs last  enabled at (103927): [] nk+0x1a/0x1c
hardirqs last disabled at (103928): [] unk+0x1a/0x1c
softirqs last  enabled at (103924): [] tcp_sendmsg+0x31/0x40
softirqs last disabled at (103922): [] 80

other info that might help us debug this:
Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&table->lock)->rlock#2);

    lock(&(&table->lock)->rlock#2);

*** DEADLOCK ***

Fixes: 032080ab43ac ("IB/mlx5: Lock QP during page fault handling")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

net/mlx5e: Fix wrong private flag usage causing checksum disable

MLX5E_PFLAG_* definitions were changed from bitmask to enumerated
values. However, in mlx5e_open_rq(), the proper API (MLX5E_GET_PFLAG macro)
was not used to read the flag value of MLX5E_PFLAG_RX_NO_CSUM_COMPLETE.
Fixed it.

Fixes: 8ff57c18e9f6 ("net/mlx5e: Improve ethtool private-flags code structure")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

Revert "net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager"

This reverts commit 5f5991f36dce1e69dd8bd7495763eec2e28f08e7.

With the original commit, eswitch instance will not be initialized for
a function which is vport group manager but not eswitch manager such as
host PF on SmartNIC (BlueField) card. This will result in a kernel crash
when such a vport group manager is trying to access vports in its group.
E.g, PF vport manager (not eswitch manager) tries to configure the MAC
of its VF vport, a kernel trace will happen similar as bellow:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
...
RIP: 0010:mlx5_eswitch_get_vport_config+0xc/0x180 [mlx5_core]
...

Fixes: 5f5991f36dce ("net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager")
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reported-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>