asedeno.scripts.mit.edu Git - linux.git/log

]> asedeno.scripts.mit.edu Git - linux.git/log

projects / linux.git / log

summary | shortlog | log | commit | commitdiff | tree
first ⋅ prev ⋅ next

commit | commitdiff | tree

Eric Biggers [Sat, 15 Dec 2018 20:41:53 +0000 (12:41 -0800)]

crypto: skcipher - add might_sleep() to skcipher_walk_virt()

skcipher_walk_virt() can still sleep even with atomic=true, since that
only affects the later calls to skcipher_walk_done(). But,
skcipher_walk_virt() only has to allocate memory for some input data
layouts, so incorrectly calling it with preemption disabled can go
undetected. Use might_sleep() so that it's detected reliably.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Eric Biggers [Sat, 15 Dec 2018 20:40:17 +0000 (12:40 -0800)]

crypto: x86/chacha - avoid sleeping under kernel_fpu_begin()

Passing atomic=true to skcipher_walk_virt() only makes the later
skcipher_walk_done() calls use atomic memory allocations, not
skcipher_walk_virt() itself. Thus, we have to move it outside of the
preemption-disabled region (kernel_fpu_begin()/kernel_fpu_end()).

(skcipher_walk_virt() only allocates memory for certain layouts of the
input scatterlist, hence why I didn't notice this earlier...)

Reported-by: syzbot+9bf843c33f782d73ae7d@syzkaller.appspotmail.com
Fixes: 4af78261870a ("crypto: x86/chacha20 - add XChaCha20 support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Nagadheeraj Rottela [Fri, 14 Dec 2018 11:19:51 +0000 (11:19 +0000)]

crypto: cavium/nitrox - Added AEAD cipher support

Added support to offload AEAD ciphers to NITROX. Currently supported
AEAD cipher is 'gcm(aes)'.

Signed-off-by: Nagadheeraj Rottela <rnagadheeraj@marvell.com>
Reviewed-by: Srikanth Jampala <jsrikanth@marvell.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Fabio Estevam [Thu, 13 Dec 2018 09:52:32 +0000 (07:52 -0200)]

crypto: mxc-scc - fix build warnings on ARM64

The following build warnings are seen when building for ARM64 allmodconfig:

drivers/crypto/mxc-scc.c:181:20: warning: format '%d' expects argument of type 'int', but argument 5 has type 'size_t' {aka 'long unsigned int'} [-Wformat=]
drivers/crypto/mxc-scc.c:186:21: warning: format '%d' expects argument of type 'int', but argument 4 has type 'size_t' {aka 'long unsigned int'} [-Wformat=]
drivers/crypto/mxc-scc.c:277:21: warning: format '%d' expects argument of type 'int', but argument 4 has type 'size_t' {aka 'long unsigned int'} [-Wformat=]
drivers/crypto/mxc-scc.c:339:3: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
drivers/crypto/mxc-scc.c:340:3: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]

Fix them by using the %zu specifier to print a size_t variable and using
a plain %x to print the result of a readl().

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Corentin Labbe [Thu, 13 Dec 2018 08:36:38 +0000 (08:36 +0000)]

crypto: api - document missing stats member

This patchs adds missing member of stats documentation.

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Corentin Labbe [Thu, 13 Dec 2018 08:36:37 +0000 (08:36 +0000)]

crypto: user - remove unused dump functions

This patch removes unused dump functions for crypto_user_stats.
There are remains of the copy/paste of crypto_user_base to
crypto_user_stat and I forgot to remove them.

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Harsh Jain [Tue, 11 Dec 2018 10:51:42 +0000 (16:21 +0530)]

crypto: chelsio - Fix wrong error counter increments

Fix error counter increment in AEAD decrypt operation when
validation of tag is done in Driver instead of H/W.

Signed-off-by: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Harsh Jain [Tue, 11 Dec 2018 10:51:41 +0000 (16:21 +0530)]

crypto: chelsio - Reset counters on cxgb4 Detach

Reset the counters on receiving detach from Cxgb4.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Harsh Jain [Tue, 11 Dec 2018 10:51:40 +0000 (16:21 +0530)]

crypto: chelsio - Handle PCI shutdown event

chcr receives "CXGB4_STATE_DETACH" event on PCI Shutdown.
Wait for processing of inflight request and Mark the device unavailable.

Signed-off-by: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Harsh Jain [Tue, 11 Dec 2018 10:51:39 +0000 (16:21 +0530)]

crypto: chelsio - cleanup:send addr as value in function argument

Send dma address as value to function arguments instead of pointer.

Signed-off-by: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Harsh Jain [Tue, 11 Dec 2018 10:51:38 +0000 (16:21 +0530)]

crypto: chelsio - Use same value for both channel in single WR

Use tx_channel_id instead of rx_channel_id.

Signed-off-by: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Harsh Jain [Tue, 11 Dec 2018 10:51:37 +0000 (16:21 +0530)]

crypto: chelsio - Swap location of AAD and IV sent in WR

Send input as IV | AAD | Data. It will allow sending IV as Immediate
Data and Creates space in Work request to add more dma mapped entries.

Signed-off-by: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

YueHaibing [Tue, 11 Dec 2018 08:11:59 +0000 (08:11 +0000)]

crypto: chelsio - remove set but not used variable 'kctx_len'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/crypto/chelsio/chcr_ipsec.c: In function 'chcr_ipsec_xmit':
drivers/crypto/chelsio/chcr_ipsec.c:674:33: warning:
variable 'kctx_len' set but not used [-Wunused-but-set-variable]
unsigned int flits = 0, ndesc, kctx_len;

It not used since commit 8362ea16f69f ("crypto: chcr - ESN for Inline IPSec Tx")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Nathan Chancellor [Mon, 10 Dec 2018 23:49:54 +0000 (16:49 -0700)]

crypto: ux500 - Use proper enum in hash_set_dma_transfer

Clang warns when one enumerated type is implicitly converted to another:

drivers/crypto/ux500/hash/hash_core.c:169:4: warning: implicit
conversion from enumeration type 'enum dma_data_direction' to different
enumeration type 'enum dma_transfer_direction' [-Wenum-conversion]
direction, DMA_CTRL_ACK | DMA_PREP_INTERRUPT);
^~~~~~~~~
1 warning generated.

dmaengine_prep_slave_sg expects an enum from dma_transfer_direction.
We know that the only direction supported by this function is
DMA_TO_DEVICE because of the check at the top of this function so we can
just use the equivalent value from dma_transfer_direction.

DMA_TO_DEVICE = DMA_MEM_TO_DEV = 1

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Nathan Chancellor [Mon, 10 Dec 2018 23:49:29 +0000 (16:49 -0700)]

crypto: ux500 - Use proper enum in cryp_set_dma_transfer

Clang warns when one enumerated type is implicitly converted to another:

drivers/crypto/ux500/cryp/cryp_core.c:559:5: warning: implicit
conversion from enumeration type 'enum dma_data_direction' to different
enumeration type 'enum dma_transfer_direction' [-Wenum-conversion]
                                direction, DMA_CTRL_ACK);
                                ^~~~~~~~~
drivers/crypto/ux500/cryp/cryp_core.c:583:5: warning: implicit
conversion from enumeration type 'enum dma_data_direction' to different
enumeration type 'enum dma_transfer_direction' [-Wenum-conversion]
                                direction,
                                ^~~~~~~~~
2 warnings generated.

dmaengine_prep_slave_sg expects an enum from dma_transfer_direction.
Because we know the value of the dma_data_direction enum from the
switch statement, we can just use the proper value from
dma_transfer_direction so there is no more conversion.

DMA_TO_DEVICE = DMA_MEM_TO_DEV = 1
DMA_FROM_DEVICE = DMA_DEV_TO_MEM = 2

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Dave Watson [Mon, 10 Dec 2018 19:59:59 +0000 (19:59 +0000)]

crypto: aesni - Add scatter/gather avx stubs, and use them in C

Add the appropriate scatter/gather stubs to the avx asm.
In the C code, we can now always use crypt_by_sg, since both
sse and asm code now support scatter/gather.

Introduce a new struct, aesni_gcm_tfm, that is initialized on
startup to point to either the SSE, AVX, or AVX2 versions of the
four necessary encryption/decryption routines.

GENX_OPTSIZE is still checked at the start of crypt_by_sg. The
total size of the data is checked, since the additional overhead
is in the init function, calculating additional HashKeys.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Dave Watson [Mon, 10 Dec 2018 19:59:38 +0000 (19:59 +0000)]

crypto: aesni - Introduce partial block macro

Before this diff, multiple calls to GCM_ENC_DEC will
succeed, but only if all calls are a multiple of 16 bytes.

Handle partial blocks at the start of GCM_ENC_DEC, and update
aadhash as appropriate.

The data offset %r11 is also updated after the partial block.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Dave Watson [Mon, 10 Dec 2018 19:59:26 +0000 (19:59 +0000)]

crypto: aesni - Introduce READ_PARTIAL_BLOCK macro

Introduce READ_PARTIAL_BLOCK macro, and use it in the two existing
partial block cases: AAD and the end of ENC_DEC. In particular,
the ENC_DEC case should be faster, since we read by 8/4 bytes if
possible.

This macro will also be used to read partial blocks between
enc_update and dec_update calls.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Dave Watson [Mon, 10 Dec 2018 19:59:11 +0000 (19:59 +0000)]

crypto: aesni - Move ghash_mul to GCM_COMPLETE

Prepare to handle partial blocks between scatter/gather calls.
For the last partial block, we only want to calculate the aadhash
in GCM_COMPLETE, and a new partial block macro will handle both
aadhash update and encrypting partial blocks between calls.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Dave Watson [Mon, 10 Dec 2018 19:58:56 +0000 (19:58 +0000)]

crypto: aesni - Fill in new context data structures

Fill in aadhash, aadlen, pblocklen, curcount with appropriate values.
pblocklen, aadhash, and pblockenckey are also updated at the end
of each scatter/gather operation, to be carried over to the next
operation.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Dave Watson [Mon, 10 Dec 2018 19:58:38 +0000 (19:58 +0000)]

crypto: aesni - Merge avx precompute functions

The precompute functions differ only by the sub-macros
they call, merge them to a single macro. Later diffs
add more code to fill in the gcm_context_data structure,
this allows changes in a single place.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Dave Watson [Mon, 10 Dec 2018 19:58:19 +0000 (19:58 +0000)]

crypto: aesni - Split AAD hash calculation to separate macro

AAD hash only needs to be calculated once for each scatter/gather operation.
Move it to its own macro, and call it from GCM_INIT instead of
INITIAL_BLOCKS.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Dave Watson [Mon, 10 Dec 2018 19:57:49 +0000 (19:57 +0000)]

crypto: aesni - Add GCM_COMPLETE macro

Merge encode and decode tag calculations in GCM_COMPLETE macro.
Scatter/gather routines will call this once at the end of encryption
or decryption.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Dave Watson [Mon, 10 Dec 2018 19:57:36 +0000 (19:57 +0000)]

crypto: aesni - support 256 byte keys in avx asm

Add support for 192/256-bit keys using the avx gcm/aes routines.
The sse routines were previously updated in e31ac32d3b (Add support
for 192 & 256 bit keys to AESNI RFC4106).

Instead of adding an additional loop in the hotpath as in e31ac32d3b,
this diff instead generates separate versions of the code using macros,
and the entry routines choose which version once. This results
in a 5% performance improvement vs. adding a loop to the hot path.
This is the same strategy chosen by the intel isa-l_crypto library.

The key size checks are removed from the c code where appropriate.

Note that this diff depends on using gcm_context_data - 256 bit keys
require 16 HashKeys + 15 expanded keys, which is larger than
struct crypto_aes_ctx, so they are stored in struct gcm_context_data.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Dave Watson [Mon, 10 Dec 2018 19:57:12 +0000 (19:57 +0000)]

crypto: aesni - Macro-ify func save/restore

Macro-ify function save and restore. These will be used in new functions
added for scatter/gather update operations.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Dave Watson [Mon, 10 Dec 2018 19:57:00 +0000 (19:57 +0000)]

crypto: aesni - Introduce gcm_context_data

Add the gcm_context_data structure to the avx asm routines.
This will be necessary to support both 256 bit keys and
scatter/gather.

The pre-computed HashKeys are now stored in the gcm_context_data
struct, which is expanded to hold the greater number of hashkeys
necessary for avx.

Loads and stores to the new struct are always done unlaligned to
avoid compiler issues, see e5b954e8 "Use unaligned loads from
gcm_context_data"

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Dave Watson [Mon, 10 Dec 2018 19:56:45 +0000 (19:56 +0000)]

crypto: aesni - Merge GCM_ENC_DEC

The GCM_ENC_DEC routines for AVX and AVX2 are identical, except they
call separate sub-macros. Pass the macros as arguments, and merge them.
This facilitates additional refactoring, by requiring changes in only
one place.

The GCM_ENC_DEC macro was moved above the CONFIG_AS_AVX* ifdefs,
since it will be used by both AVX and AVX2.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

commit | commitdiff | tree

Gustavo A. R. Silva [Fri, 21 Dec 2018 21:22:29 +0000 (15:22 -0600)]

can: af_can: Fix Spectre v1 vulnerability

protocol is indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

net/can/af_can.c:115 can_get_proto() warn: potential spectre issue 'proto_tab' [w]

Fix this by sanitizing protocol before using it to index proto_tab.

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Willem de Bruijn [Sat, 22 Dec 2018 21:53:45 +0000 (16:53 -0500)]

packet: validate address length if non-zero

Validate packet socket address length if a length is given. Zero
length is equivalent to not setting an address.

Fixes: 99137b7888f4 ("packet: validate address length")
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Gustavo A. R. Silva [Fri, 21 Dec 2018 21:47:53 +0000 (15:47 -0600)]

nfc: af_nfc: Fix Spectre v1 vulnerability

proto is indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

net/nfc/af_nfc.c:42 nfc_sock_create() warn: potential spectre issue 'proto_tab' [w] (local cap)

Fix this by sanitizing proto before using it to index proto_tab.

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Gustavo A. R. Silva [Fri, 21 Dec 2018 21:41:17 +0000 (15:41 -0600)]

phonet: af_phonet: Fix Spectre v1 vulnerability

protocol is indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

net/phonet/af_phonet.c:48 phonet_proto_get() warn: potential spectre issue 'proto_tab' [w] (local cap)

Fix this by sanitizing protocol before using it to index proto_tab.

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Gustavo A. R. Silva [Fri, 21 Dec 2018 20:49:01 +0000 (14:49 -0600)]

net: core: Fix Spectre v1 vulnerability

flen is indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

net/core/filter.c:1101 bpf_check_classic() warn: potential spectre issue 'filter' [w]

Fix this by sanitizing flen before using it to index filter at line 1101:

switch (filter[flen - 1].code) {

and through pc at line 1040:

const struct sock_filter *ftest = &filter[pc];

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Linus Torvalds [Sat, 22 Dec 2018 23:03:00 +0000 (15:03 -0800)]

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"This is two simple target fixes and one discard related I/O starvation
  problem in sd.

  The discard problem occurs because the discard page doesn't have a
  mempool backing so if the allocation fails due to memory pressure, we
  then lose the forward progress we require if the writeout is on the
  same device. The fix is to back it with a mempool"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sd: use mempool for discard special page
  scsi: target: iscsi: cxgbit: add missing spin_lock_init()
  scsi: target: iscsi: cxgbit: fix csk leak

commit | commitdiff | tree

Linus Torvalds [Sat, 22 Dec 2018 22:29:21 +0000 (14:29 -0800)]

Merge tag 'compiler-attributes-for-linus-v4.20' of https://github.com/ojeda/linux

Pull compiler_types.h fix from Miguel Ojeda:
"A cleanup for userspace in compiler_types.h: don't pollute userspace
  with macro definitions (Xiaozhou Liu)

  This is harmless for the kernel, but v4.19 was released with a few
  macros exposed to userspace as the patch explains; which this removes,
  so it *could* happen that we break something for someone (although
  leaving inline redefined is probably worse)"

* tag 'compiler-attributes-for-linus-v4.20' of https://github.com/ojeda/linux:
  include/linux/compiler_types.h: don't pollute userspace with macro definitions

commit | commitdiff | tree

Linus Torvalds [Sat, 22 Dec 2018 22:25:23 +0000 (14:25 -0800)]

Merge tag 'auxdisplay-for-linus-v4.20' of https://github.com/ojeda/linux

Pull auxdisplay fix from Miguel Ojeda:
"charlcd: fix x/y command parsing (Mans Rullgard)"

* tag 'auxdisplay-for-linus-v4.20' of https://github.com/ojeda/linux:
auxdisplay: charlcd: fix x/y command parsing

commit | commitdiff | tree

Christian Brauner [Thu, 5 Jul 2018 15:51:20 +0000 (17:51 +0200)]

Revert "vfs: Allow userns root to call mknod on owned filesystems."

This reverts commit 55956b59df336f6738da916dbb520b6e37df9fbd.

commit 55956b59df33 ("vfs: Allow userns root to call mknod on owned filesystems.")
enabled mknod() in user namespaces for userns root if CAP_MKNOD is
available. However, these device nodes are useless since any filesystem
mounted from a non-initial user namespace will set the SB_I_NODEV flag on
the filesystem. Now, when a device node s created in a non-initial user
namespace a call to open() on said device node will fail due to:

bool may_open_dev(const struct path *path)
{
return !(path->mnt->mnt_flags & MNT_NODEV) &&
!(path->mnt->mnt_sb->s_iflags & SB_I_NODEV);
}

The problem with this is that as of the aforementioned commit mknod()
creates partially functional device nodes in non-initial user namespaces.
In particular, it has the consequence that as of the aforementioned commit
open() will be more privileged with respect to device nodes than mknod().
Before it was the other way around. Specifically, if mknod() succeeded
then it was transparent for any userspace application that a fatal error
must have occured when open() failed.

All of this breaks multiple userspace workloads and a widespread assumption
about how to handle mknod(). Basically, all container runtimes and systemd
live by the slogan "ask for forgiveness not permission" when running user
namespace workloads. For mknod() the assumption is that if the syscall
succeeds the device nodes are useable irrespective of whether it succeeds
in a non-initial user namespace or not. This logic was chosen explicitly
to allow for the glorious day when mknod() will actually be able to create
fully functional device nodes in user namespaces.
A specific problem people are already running into when running 4.18 rc
kernels are failing systemd services. For any distro that is run in a
container systemd services started with the PrivateDevices= property set
will fail to start since the device nodes in question cannot be
opened (cf. the arguments in [1]).

Full disclosure, Seth made the very sound argument that it is already
possible to end up with partially functional device nodes. Any filesystem
mounted with MS_NODEV set will allow mknod() to succeed but will not allow
open() to succeed. The difference to the case here is that the MS_NODEV
case is transparent to userspace since it is an explicitly set mount option
while the SB_I_NODEV case is an implicit property enforced by the kernel
and hence opaque to userspace.

[1]: https://github.com/systemd/systemd/pull/9483

Signed-off-by: Christian Brauner <christian@brauner.io>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Sai Praneeth Prakhya [Sat, 22 Dec 2018 02:22:34 +0000 (18:22 -0800)]

x86/efi: Don't unmap EFI boot services code/data regions for EFI_OLD_MEMMAP and EFI_MIXED_MODE

The following commit:

  d5052a7130a6 ("x86/efi: Unmap EFI boot services code/data regions from efi_pgd")

forgets to take two EFI modes into consideration, namely EFI_OLD_MEMMAP and
EFI_MIXED_MODE:

- EFI_OLD_MEMMAP is a legacy way of mapping EFI regions into swapper_pg_dir
  using ioremap() and init_memory_mapping(). This feature can be enabled by
  passing "efi=old_map" as kernel command line argument. But,
  efi_unmap_pages() unmaps EFI boot services code/data regions *only* from
  efi_pgd and hence cannot be used for unmapping EFI boot services code/data
  regions from swapper_pg_dir.

Introduce a temporary fix to not unmap EFI boot services code/data regions
when EFI_OLD_MEMMAP is enabled while working on a real fix.

- EFI_MIXED_MODE is another feature where a 64-bit kernel runs on a
  64-bit platform crippled by a 32-bit firmware. To support EFI_MIXED_MODE,
  all RAM (i.e. namely EFI regions like EFI_CONVENTIONAL_MEMORY,
  EFI_LOADER_<CODE/DATA>, EFI_BOOT_SERVICES_<CODE/DATA> and
  EFI_RUNTIME_CODE/DATA regions) is mapped into efi_pgd all the time to
  facilitate EFI runtime calls access it's arguments in 1:1 mode.

Hence, don't unmap EFI boot services code/data regions when booted in mixed mode.

Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20181222022234.7573-1-sai.praneeth.prakhya@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

commit | commitdiff | tree

Christoph Hellwig [Sat, 22 Dec 2018 08:21:08 +0000 (09:21 +0100)]

dma-mapping: fix flags in dma_alloc_wc

We really need the writecombine flag in dma_alloc_wc, fix a stupid
oversight.

Fixes: 7ed1d91a9e ("dma-mapping: translate __GFP_NOFAIL to DMA_ATTR_NO_WARN")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Oliver O'Halloran [Mon, 19 Mar 2018 05:14:03 +0000 (16:14 +1100)]

powerpc/zImage: Also check for stdout-path

The /chosen/linux,stdout-path is "deprecated" in favour of
/chosen/stdout-path so we should be checking for both.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

commit | commitdiff | tree

Benjamin Herrenschmidt [Mon, 8 Oct 2018 04:08:31 +0000 (15:08 +1100)]

powerpc: Fix HMIs on big-endian with CONFIG_RELOCATABLE=y

HMIs will crash the kernel due to

BRANCH_LINK_TO_FAR(hmi_exception_realmode)

Calling into the OPD instead of the actual code.

Fixes: 2337d207288f ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Use DOTSYM() rather than #ifdef]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

commit | commitdiff | tree

Rob Herring [Wed, 5 Dec 2018 19:50:28 +0000 (13:50 -0600)]

macintosh: Use of_node_name_{eq, prefix} for node name comparisons

Convert string compares of DT node names to use of_node_name_{eq,prefix}
helpers instead. This removes direct access to the node name pointer.

This changes a single case insensitive node name comparison to case
sensitive for "ata4". This is the only instance of a case insensitive
comparison for all the open coded node name comparisons on powerpc.
Searching the commit history, there doesn't appear to be any reason for
it to be case insensitive.

A couple of open coded iterating thru the child node names are converted
to use for_each_child_of_node() instead.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

commit | commitdiff | tree

Rob Herring [Wed, 5 Dec 2018 19:50:25 +0000 (13:50 -0600)]

ide: Use of_node_name_eq for node name comparisons

Convert string compares of DT node names to use of_node_name_eq helper
instead. This removes direct access to the node name pointer.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-ide@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

commit | commitdiff | tree

Rob Herring [Wed, 5 Dec 2018 19:50:18 +0000 (13:50 -0600)]

powerpc: Use of_node_name_eq for node name comparisons

Convert string compares of DT node names to use of_node_name_eq helper
instead. This removes direct access to the node name pointer.

A couple of open coded iterating thru the child node names are converted
to use for_each_child_of_node() instead.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

commit | commitdiff | tree

Rob Herring [Wed, 5 Dec 2018 19:50:17 +0000 (13:50 -0600)]

powerpc/pseries/pmem: Convert to %pOFn instead of device_node.name

In preparation to remove the node name pointer from struct
device_node, convert printf users to use the %pOFn format specifier.
pmem.c was recently added and missed the initial conversion.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

commit | commitdiff | tree

Michael Ellerman [Tue, 27 Nov 2018 08:18:05 +0000 (19:18 +1100)]

powerpc/mm: Remove very old comment in hash-4k.h

This comment talks about PTEs being 64-bits and PMD/PGD being 32-bits,
but that hasn't been true since 2005 when David Gibson implemented
4-level page tables in the commit titled "Four level pagetables for
ppc64".

Remove it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

commit | commitdiff | tree

Michael Ellerman [Tue, 27 Nov 2018 08:16:44 +0000 (19:16 +1100)]

powerpc/pseries: Fix node leak in update_lmb_associativity_index()

In update_lmb_associativity_index() we lookup dr_node using
of_find_node_by_path() which takes a reference for us. In the
non-error case we forget to drop the reference. Note that
find_aa_index() does modify properties of the node, but doesn't need
an extra reference held once it's returned.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

commit | commitdiff | tree

Scott Wood [Sat, 22 Dec 2018 04:07:54 +0000 (22:07 -0600)]

powerpc/configs/85xx: Enable CONFIG_DEBUG_KERNEL

This is required for CONFIG_DEBUG_INFO to work.

Signed-off-by: Scott Wood <oss@buserror.net>

commit | commitdiff | tree

Scott Wood [Sat, 22 Dec 2018 03:32:51 +0000 (21:32 -0600)]

powerpc/dts/fsl: Fix dtc-flagged interrupt errors

mpc8641_hpcn was updated to 4-cell interrupt specifiers, but
PCI interrupt-map was not updated. It was also missing #interrupt-cells
on the outer PCI buses.

p1020rdb-pc was updated to 4-cell interrupt specifiers, but
the ethernet-phy nodes weren't updated.

mpc832x_rdb had an invalid "interrupts = <0>" on the ethernet-phy nodes.
Besides being the wrong number of cells, 0 is not a valid IPIC interrupt
according to ipic.c. Presumably it was meant to indicate that these
PHYs are not connected to an interrupt.

Signed-off-by: Scott Wood <oss@buserror.net>

commit | commitdiff | tree

Yuantian Tang [Wed, 31 Oct 2018 06:57:36 +0000 (14:57 +0800)]

clk: qoriq: add more compatibles strings

Add more SoC compatible strings to support more chips.

Signed-off-by: Yuantian Tang <andy.tang@nxp.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Scott Wood <oss@buserror.net>

commit | commitdiff | tree

Scott Wood [Wed, 31 Oct 2018 06:57:35 +0000 (14:57 +0800)]

powerpc/fsl: Use new clockgen binding

The driver retains compatibility with old device trees, but we don't
want the old nodes lying around to be copied, or used as a reference
(some of the mux options are incorrect), or even just being clutter.

Signed-off-by: Scott Wood <oss@buserror.net>
Signed-off-by: Tang Yuantian <andy.tang@nxp.com>
[scottwood: removed sysclk node added by Andy]
Signed-off-by: Scott Wood <oss@buserror.net>

commit | commitdiff | tree

Christophe Leroy [Mon, 10 Dec 2018 11:41:29 +0000 (11:41 +0000)]

powerpc/83xx: handle machine check caused by watchdog timer

When the watchdog timer is set in interrupt mode, it causes a
machine check when it times out. The purpose of this mode is to
ease debugging, not to crash the kernel and reboot the machine.

This patch implements a special handling for that, in order to not
crash the kernel if the watchdog times out while in interrupt or
within the idle task.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[scottwood: added missing #include]
Signed-off-by: Scott Wood <oss@buserror.net>

commit | commitdiff | tree

Omar Sandoval [Sat, 22 Dec 2018 02:45:18 +0000 (18:45 -0800)]

xfs: reallocate realtime summary cache on growfs

At mount time, we allocate m_rsum_cache with the number of realtime
bitmap blocks. However, xfs_growfs_rt() can increase the number of
realtime bitmap blocks. Using the cache after this happens may access
out of the bounds of the cache. Fix it by reallocating the cache in this
case.

Fixes: 355e3532132b ("xfs: cache minimum realtime summary level")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

commit | commitdiff | tree

Alexandre Belloni [Tue, 20 Nov 2018 15:12:30 +0000 (16:12 +0100)]

powerpc/fsl-rio: fix spelling mistake "reserverd" -> "reserved"

Fix a spelling mistake in a register description.

Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Scott Wood <oss@buserror.net>

commit | commitdiff | tree

Christoph Hellwig [Wed, 14 Nov 2018 08:23:07 +0000 (09:23 +0100)]

powerpc/fsl_pci: simplify fsl_pci_dma_set_mask

swiotlb will only bounce buffer when the effective dma address for the
device is smaller than the actual DMA range. Instead of flipping between
the swiotlb and nommu ops for FSL SOCs that have the second outbound
window just don't set the bus dma_mask in this case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Scott Wood <oss@buserror.net>

commit | commitdiff | tree

Sabyasachi Gupta [Mon, 5 Nov 2018 02:22:48 +0000 (07:52 +0530)]

arch/powerpc/fsl_rmu: Use dma_zalloc_coherent

Replaced dma_alloc_coherent + memset with dma_zalloc_coherent

Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>

commit | commitdiff | tree

David S. Miller [Fri, 21 Dec 2018 23:06:20 +0000 (15:06 -0800)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Dec 2018 22:59:00 +0000 (14:59 -0800)]

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"4 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm, page_alloc: fix has_unmovable_pages for HugePages
  fork,memcg: fix crash in free_thread_stack on memcg charge fail
  mm: thp: fix flags for pmd migration when split
  mm, memory_hotplug: initialize struct pages for the full memory section

commit | commitdiff | tree

Oscar Salvador [Fri, 21 Dec 2018 22:31:00 +0000 (14:31 -0800)]

mm, page_alloc: fix has_unmovable_pages for HugePages

While playing with gigantic hugepages and memory_hotplug, I triggered
the following #PF when "cat memoryX/removable":

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
  #PF error: [normal kernel read fault]
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 1 PID: 1481 Comm: cat Tainted: G            E     4.20.0-rc6-mm1-1-default+ #18
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
  RIP: 0010:has_unmovable_pages+0x154/0x210
  Call Trace:
   is_mem_section_removable+0x7d/0x100
   removable_show+0x90/0xb0
   dev_attr_show+0x1c/0x50
   sysfs_kf_seq_show+0xca/0x1b0
   seq_read+0x133/0x380
   __vfs_read+0x26/0x180
   vfs_read+0x89/0x140
   ksys_read+0x42/0x90
   do_syscall_64+0x5b/0x180
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

The reason is we do not pass the Head to page_hstate(), and so, the call
to compound_order() in page_hstate() returns 0, so we end up checking
all hstates's size to match PAGE_SIZE.

Obviously, we do not find any hstate matching that size, and we return
NULL.  Then, we dereference that NULL pointer in
hugepage_migration_supported() and we got the #PF from above.

Fix that by getting the head page before calling page_hstate().

Also, since gigantic pages span several pageblocks, re-adjust the logic
for skipping pages.  While are it, we can also get rid of the
round_up().

[osalvador@suse.de: remove round_up(), adjust skip pages logic per Michal]
Link: http://lkml.kernel.org/r/20181221062809.31771-1-osalvador@suse.de
Link: http://lkml.kernel.org/r/20181217225113.17864-1-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Rik van Riel [Fri, 21 Dec 2018 22:30:54 +0000 (14:30 -0800)]

fork,memcg: fix crash in free_thread_stack on memcg charge fail

Commit 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting") will
result in fork failing if allocating a kernel stack for a task in
dup_task_struct exceeds the kernel memory allowance for that cgroup.

Unfortunately, it also results in a crash.

This is due to the code jumping to free_stack and calling
free_thread_stack when the memcg kernel stack charge fails, but without
tsk->stack pointing at the freshly allocated stack.

This in turn results in the vfree_atomic in free_thread_stack oopsing
with a backtrace like this:

#5 [ffffc900244efc88] die at ffffffff8101f0ab
#6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86
#7 [ffffc900244efce0] general_protection at ffffffff818ff082
    [exception RIP: llist_add_batch+7]
    RIP: ffffffff8150d487  RSP: ffffc900244efd98  RFLAGS: 00010282
    RAX: 0000000000000000  RBX: ffff88085ef55980  RCX: 0000000000000000
    RDX: ffff88085ef55980  RSI: 343834343531203a  RDI: 343834343531203a
    RBP: ffffc900244efd98   R8: 0000000000000001   R9: ffff8808578c3600
    R10: 0000000000000000  R11: 0000000000000001  R12: ffff88029f6c21c0
    R13: 0000000000000286  R14: ffff880147759b00  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7
#9 [ffffc900244efdb8] copy_process at ffffffff81086e37
#10 [ffffc900244efe98] _do_fork at ffffffff810884e0
#11 [ffffc900244eff10] sys_vfork at ffffffff810887ff
#12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43
    RIP: 000000000049b948  RSP: 00007ffcdb307830  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000896030  RCX: 000000000049b948
    RDX: 0000000000000000  RSI: 00007ffcdb307790  RDI: 00000000005d7421
    RBP: 000000000067370f   R8: 00007ffcdb3077b0   R9: 000000000001ed00
    R10: 0000000000000008  R11: 0000000000000246  R12: 0000000000000040
    R13: 000000000000000f  R14: 0000000000000000  R15: 000000000088d018
    ORIG_RAX: 000000000000003a  CS: 0033  SS: 002b

The simplest fix is to assign tsk->stack right where it is allocated.

Link: http://lkml.kernel.org/r/20181214231726.7ee4843c@imladris.surriel.com
Fixes: 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting")
Signed-off-by: Rik van Riel <riel@surriel.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Peter Xu [Fri, 21 Dec 2018 22:30:50 +0000 (14:30 -0800)]

mm: thp: fix flags for pmd migration when split

When splitting a huge migrating PMD, we'll transfer all the existing PMD
bits and apply them again onto the small PTEs. However we are fetching
the bits unconditionally via pmd_soft_dirty(), pmd_write() or
pmd_yound() while actually they don't make sense at all when it's a
migration entry. Fix them up. Since at it, drop the ifdef together as
not needed.

Note that if my understanding is correct about the problem then if
without the patch there is chance to lose some of the dirty bits in the
migrating pmd pages (on x86_64 we're fetching bit 11 which is part of
swap offset instead of bit 2) and it could potentially corrupt the
memory of an userspace program which depends on the dirty bit.

Link: http://lkml.kernel.org/r/20181213051510.20306-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: <stable@vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Mikhail Zaslonko [Fri, 21 Dec 2018 22:30:46 +0000 (14:30 -0800)]

mm, memory_hotplug: initialize struct pages for the full memory section

If memory end is not aligned with the sparse memory section boundary,
the mapping of such a section is only partly initialized.  This may lead
to VM_BUG_ON due to uninitialized struct page access from
is_mem_section_removable() or test_pages_in_a_zone() function triggered
by memory_hotplug sysfs handlers:

Here are the the panic examples:
CONFIG_DEBUG_VM=y
CONFIG_DEBUG_VM_PGFLAGS=y

kernel parameter mem=2050M
--------------------------
page:000003d082008000 is uninitialized and poisoned
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
Call Trace:
( test_pages_in_a_zone+0xde/0x160)
   show_valid_zones+0x5c/0x190
   dev_attr_show+0x34/0x70
   sysfs_kf_seq_show+0xc8/0x148
   seq_read+0x204/0x480
   __vfs_read+0x32/0x178
   vfs_read+0x82/0x138
   ksys_read+0x5a/0xb0
   system_call+0xdc/0x2d8
Last Breaking-Event-Address:
   test_pages_in_a_zone+0xde/0x160
Kernel panic - not syncing: Fatal exception: panic_on_oops

kernel parameter mem=3075M
--------------------------
page:000003d08300c000 is uninitialized and poisoned
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
Call Trace:
( is_mem_section_removable+0xb4/0x190)
   show_mem_removable+0x9a/0xd8
   dev_attr_show+0x34/0x70
   sysfs_kf_seq_show+0xc8/0x148
   seq_read+0x204/0x480
   __vfs_read+0x32/0x178
   vfs_read+0x82/0x138
   ksys_read+0x5a/0xb0
   system_call+0xdc/0x2d8
Last Breaking-Event-Address:
   is_mem_section_removable+0xb4/0x190
Kernel panic - not syncing: Fatal exception: panic_on_oops

Fix the problem by initializing the last memory section of each zone in
memmap_init_zone() till the very end, even if it goes beyond the zone end.

Michal said:

: This has alwways been problem AFAIU.  It just went unnoticed because we
: have zeroed memmaps during allocation before f7f99100d8d9 ("mm: stop
: zeroing memory during allocation in vmemmap") and so the above test
: would simply skip these ranges as belonging to zone 0 or provided a
: garbage.
:
: So I guess we do care for post f7f99100d8d9 kernels mostly and
: therefore Fixes: f7f99100d8d9 ("mm: stop zeroing memory during
: allocation in vmemmap")

Link: http://lkml.kernel.org/r/20181212172712.34019-2-zaslonko@linux.ibm.com
Fixes: f7f99100d8d9 ("mm: stop zeroing memory during allocation in vmemmap")
Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Suggested-by: Michal Hocko <mhocko@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Dec 2018 22:23:57 +0000 (14:23 -0800)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

Pull sparc fixes from David Miller:
"Just some small fixes here and there, and a refcount leak in a serial
  driver, nothing serious"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  serial/sunsu: fix refcount leak
  sparc: Set "ARCH: sunxx" information on the same line
  sparc: vdso: Drop implicit common-page-size linker flag

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Dec 2018 22:21:17 +0000 (14:21 -0800)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull more networking fixes from David Miller:
"Some more bug fixes have trickled in, we have:

  1) Local MAC entries properly in mscc driver, from Allan W. Nielsen.

  2) Eric Dumazet found some more of the typical "pskb_may_pull() -->
     oops forgot to reload the header pointer" bugs in ipv6 tunnel
     handling.

  3) Bad SKB socket pointer in ipv6 fragmentation handling, from Herbert
     Xu.

  4) Overflow fix in sk_msg_clone(), from Vakul Garg.

  5) Validate address lengths in AF_PACKET, from Willem de Bruijn"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  qmi_wwan: Fix qmap header retrieval in qmimux_rx_fixup
  qmi_wwan: Add support for Fibocom NL678 series
  tls: Do not call sk_memcopy_from_iter with zero length
  ipv6: tunnels: fix two use-after-free
  Prevent overflow of sk_msg in sk_msg_clone()
  packet: validate address length
  net: netxen: fix a missing check and an uninitialized use
  tcp: fix a race in inet_diag_dump_icsk()
  MAINTAINERS: update cxgb4 and cxgb3 maintainer
  ipv6: frags: Fix bogus skb->sk in reassembled packets
  mscc: Configured MAC entries should be locked.

commit | commitdiff | tree

Mans Rullgard [Wed, 5 Dec 2018 13:52:47 +0000 (13:52 +0000)]

auxdisplay: charlcd: fix x/y command parsing

The x/y command parsing has been broken since commit 129957069e6a
("staging: panel: Fixed checkpatch warning about simple_strtoul()").

Commit b34050fadb86 ("auxdisplay: charlcd: Fix and clean up handling of
x/y commands") fixed some problems by rewriting the parsing code,
but also broke things further by removing the check for a complete
command before attempting to parse it.  As a result, parsing is
terminated at the first x or y character.

This reinstates the check for a final semicolon.  Whereas the original
code use strchr(), this is wasteful seeing as the semicolon is always
at the end of the buffer.  Thus check this character directly instead.

Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>

commit | commitdiff | tree

Yangtao Li [Wed, 12 Dec 2018 16:01:45 +0000 (11:01 -0500)]

serial/sunsu: fix refcount leak

The function of_find_node_by_path() acquires a reference to the node
returned by it and that reference needs to be dropped by its caller.

su_get_type() doesn't do that. The match node are used as an identifier
to compare against the current node, so we can directly drop the refcount
after getting the node from the path as it is not used as pointer.

Fix this by use a single variable and drop the refcount right after
of_find_node_by_path().

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Corentin Labbe [Tue, 11 Dec 2018 12:11:09 +0000 (12:11 +0000)]

sparc: Set "ARCH: sunxx" information on the same line

While checking boot log from SPARC qemu, I saw that the "ARCH: sunxx"
information was split on two different line.
This patchs merge both line together.
In the meantime, thoses information need to be printed via pr_info
since printk print them by default via the warning loglevel.

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

ndesaulniers@google.com [Mon, 10 Dec 2018 22:35:13 +0000 (14:35 -0800)]

sparc: vdso: Drop implicit common-page-size linker flag

GNU linker's -z common-page-size's default value is based on the target
architecture. arch/sparc/vdso/Makefile sets it to the architecture
default, which is implicit and redundant. Drop it.

Link: https://lkml.kernel.org/r/20181206191231.192355-1-ndesaulniers@google.com
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Dec 2018 19:15:36 +0000 (11:15 -0800)]

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fix from Paolo Bonzini:
"A simple patch for a pretty bad bug: Unbreak AMD nested
virtualization."

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: nSVM: fix switch to guest mmu

commit | commitdiff | tree

Daniele Palmas [Fri, 21 Dec 2018 12:07:23 +0000 (13:07 +0100)]

qmi_wwan: Fix qmap header retrieval in qmimux_rx_fixup

This patch fixes qmap header retrieval when modem is configured for
dl data aggregation.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Dec 2018 18:51:54 +0000 (10:51 -0800)]

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Ingo Molnar:
"Fix a division by zero crash in the posix-timers code"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
posix-timers: Fix division by zero bug

commit | commitdiff | tree

Jörgen Storvist [Fri, 21 Dec 2018 14:38:52 +0000 (15:38 +0100)]

qmi_wwan: Add support for Fibocom NL678 series

Added support for Fibocom NL678 series cellular module QMI interface.
Using QMI_QUIRK_SET_DTR required for Qualcomm MDM9x40 series chipsets.

Signed-off-by: Jörgen Storvist <jorgen.storvist@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Vakul Garg [Fri, 21 Dec 2018 15:16:52 +0000 (15:16 +0000)]

tls: Do not call sk_memcopy_from_iter with zero length

In some conditions e.g. when tls_clone_plaintext_msg() returns -ENOSPC,
the number of bytes to be copied using subsequent function
sk_msg_memcopy_from_iter() becomes zero. This causes function
sk_msg_memcopy_from_iter() to fail which in turn causes tls_sw_sendmsg()
to return failure. To prevent it, do not call sk_msg_memcopy_from_iter()
when number of bytes to copy (indicated by 'try_to_copy') is zero.

Fixes: d829e9c4112b ("tls: convert to generic sk_msg interface")
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Fri, 21 Dec 2018 18:24:54 +0000 (10:24 -0800)]

Merge branch 'skb_ext-fixes'

Paolo Abeni says:

====================
net: skb extension follow-ups

This series includes some follow-up for the recently added skb extension.
The first patch addresses an unlikely race while adding skb extensions,
and the following two are just minor code clean-up.

v1 -> v2:
- be sure to flag the newly added extension as active in skb_ext_add()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Paolo Abeni [Fri, 21 Dec 2018 18:03:15 +0000 (19:03 +0100)]

net: minor cleanup in skb_ext_add()

When the extension to be added is already present, the only
skb field we may need to update is 'extensions': we can reorder
the code and avoid a branch.

v1 -> v2:
- be sure to flag the newly added extension as active

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Paolo Abeni [Fri, 21 Dec 2018 18:03:14 +0000 (19:03 +0100)]

net: drop the unused helper skb_ext_get()

Such helper is currently unused, and skb extension users are
better off using skb_ext_add()/skb_ext_del(). So let's drop
it.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Paolo Abeni [Fri, 21 Dec 2018 18:03:13 +0000 (19:03 +0100)]

net: fix possible user-after-free in skb_ext_add()

On cow we can free the old extension: we must avoid dereferencing
such extension after skb_ext_maybe_cow(). Since 'new' contents
are always equal to 'old' after the copy, we can fix the above
accessing the relevant data using 'new'.

Fixes: df5042f4c5b9 ("sk_buff: add skb extension infrastructure")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Dec 2018 18:11:51 +0000 (10:11 -0800)]

Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull futex fix from Ingo Molnar:
"A single fix for a robust futexes race between sys_exit() and
sys_futex_lock_pi()"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Cure exit race

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Dec 2018 17:22:24 +0000 (09:22 -0800)]

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"The biggest part is a series of reverts for the macro based GCC
  inlining workarounds. It caused regressions in distro build and other
  kernel tooling environments, and the GCC project was very receptive to
  fixing the underlying inliner weaknesses - so as time ran out we
  decided to do a reasonably straightforward revert of the patches. The
  plan is to rely on the 'asm inline' GCC 9 feature, which might be
  backported to GCC 8 and could thus become reasonably widely available
  on modern distros.

  Other than those reverts, there's misc fixes from all around the
  place.

  I wish our final x86 pull request for v4.20 was smaller..."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs"
  Revert "x86/objtool: Use asm macros to work around GCC inlining bugs"
  Revert "x86/refcount: Work around GCC inlining bug"
  Revert "x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs"
  Revert "x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs"
  Revert "x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops"
  Revert "x86/extable: Macrofy inline assembly code to work around GCC inlining bugs"
  Revert "x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs"
  Revert "x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs"
  x86/mtrr: Don't copy uninitialized gentry fields back to userspace
  x86/fsgsbase/64: Fix the base write helper functions
  x86/mm/cpa: Fix cpa_flush_array() TLB invalidation
  x86/vdso: Pass --eh-frame-hdr to the linker
  x86/mm: Fix decoy address handling vs 32-bit builds
  x86/intel_rdt: Ensure a CPU remains online for the region's pseudo-locking sequence
  x86/dump_pagetables: Fix LDT remap address marker
  x86/mm: Fix guard hole handling

commit | commitdiff | tree

Eric Dumazet [Fri, 21 Dec 2018 15:47:51 +0000 (07:47 -0800)]

ipv6: tunnels: fix two use-after-free

xfrm6_policy_check() might have re-allocated skb->head, we need
to reload ipv6 header pointer.

sysbot reported :

BUG: KASAN: use-after-free in __ipv6_addr_type+0x302/0x32f net/ipv6/addrconf_core.c:40
Read of size 4 at addr ffff888191b8cb70 by task syz-executor2/1304

CPU: 0 PID: 1304 Comm: syz-executor2 Not tainted 4.20.0-rc7+ #356
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x39d lib/dump_stack.c:113
print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
__asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
__ipv6_addr_type+0x302/0x32f net/ipv6/addrconf_core.c:40
ipv6_addr_type include/net/ipv6.h:403 [inline]
ip6_tnl_get_cap+0x27/0x190 net/ipv6/ip6_tunnel.c:727
ip6_tnl_rcv_ctl+0xdb/0x2a0 net/ipv6/ip6_tunnel.c:757
vti6_rcv+0x336/0x8f3 net/ipv6/ip6_vti.c:321
xfrm6_ipcomp_rcv+0x1a5/0x3a0 net/ipv6/xfrm6_protocol.c:132
ip6_protocol_deliver_rcu+0x372/0x1940 net/ipv6/ip6_input.c:394
ip6_input_finish+0x84/0x170 net/ipv6/ip6_input.c:434
NF_HOOK include/linux/netfilter.h:289 [inline]
ip6_input+0xe9/0x600 net/ipv6/ip6_input.c:443
IPVS: ftp: loaded support on port[0] = 21
ip6_mc_input+0x514/0x11c0 net/ipv6/ip6_input.c:537
dst_input include/net/dst.h:450 [inline]
ip6_rcv_finish+0x17a/0x330 net/ipv6/ip6_input.c:76
NF_HOOK include/linux/netfilter.h:289 [inline]
ipv6_rcv+0x115/0x640 net/ipv6/ip6_input.c:272
__netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4973
__netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5083
process_backlog+0x24e/0x7a0 net/core/dev.c:5923
napi_poll net/core/dev.c:6346 [inline]
net_rx_action+0x7fa/0x19b0 net/core/dev.c:6412
__do_softirq+0x308/0xb7e kernel/softirq.c:292
do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1027
</IRQ>
do_softirq.part.14+0x126/0x160 kernel/softirq.c:337
do_softirq+0x19/0x20 kernel/softirq.c:340
netif_rx_ni+0x521/0x860 net/core/dev.c:4569
dev_loopback_xmit+0x287/0x8c0 net/core/dev.c:3576
NF_HOOK include/linux/netfilter.h:289 [inline]
ip6_finish_output2+0x193a/0x2930 net/ipv6/ip6_output.c:84
ip6_fragment+0x2b06/0x3850 net/ipv6/ip6_output.c:727
ip6_finish_output+0x6b7/0xc50 net/ipv6/ip6_output.c:152
NF_HOOK_COND include/linux/netfilter.h:278 [inline]
ip6_output+0x232/0x9d0 net/ipv6/ip6_output.c:171
dst_output include/net/dst.h:444 [inline]
ip6_local_out+0xc5/0x1b0 net/ipv6/output_core.c:176
ip6_send_skb+0xbc/0x340 net/ipv6/ip6_output.c:1727
ip6_push_pending_frames+0xc5/0xf0 net/ipv6/ip6_output.c:1747
rawv6_push_pending_frames net/ipv6/raw.c:615 [inline]
rawv6_sendmsg+0x3a3e/0x4b40 net/ipv6/raw.c:945
kobject: 'queues' (0000000089e6eea2): kobject_add_internal: parent: 'tunl0', set: '<NULL>'
kobject: 'queues' (0000000089e6eea2): kobject_uevent_env
inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
kobject: 'queues' (0000000089e6eea2): kobject_uevent_env: filter function caused the event to drop!
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:631
sock_write_iter+0x35e/0x5c0 net/socket.c:900
call_write_iter include/linux/fs.h:1857 [inline]
new_sync_write fs/read_write.c:474 [inline]
__vfs_write+0x6b8/0x9f0 fs/read_write.c:487
kobject: 'rx-0' (00000000e2d902d9): kobject_add_internal: parent: 'queues', set: 'queues'
kobject: 'rx-0' (00000000e2d902d9): kobject_uevent_env
vfs_write+0x1fc/0x560 fs/read_write.c:549
ksys_write+0x101/0x260 fs/read_write.c:598
kobject: 'rx-0' (00000000e2d902d9): fill_kobj_path: path = '/devices/virtual/net/tunl0/queues/rx-0'
__do_sys_write fs/read_write.c:610 [inline]
__se_sys_write fs/read_write.c:607 [inline]
__x64_sys_write+0x73/0xb0 fs/read_write.c:607
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
kobject: 'tx-0' (00000000443b70ac): kobject_add_internal: parent: 'queues', set: 'queues'
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457669
Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f9bd200bc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457669
RDX: 000000000000058f RSI: 00000000200033c0 RDI: 0000000000000003
kobject: 'tx-0' (00000000443b70ac): kobject_uevent_env
RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9bd200c6d4
R13: 00000000004c2dcc R14: 00000000004da398 R15: 00000000ffffffff

Allocated by task 1304:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
__do_kmalloc_node mm/slab.c:3684 [inline]
__kmalloc_node_track_caller+0x50/0x70 mm/slab.c:3698
__kmalloc_reserve.isra.41+0x41/0xe0 net/core/skbuff.c:140
__alloc_skb+0x155/0x760 net/core/skbuff.c:208
kobject: 'tx-0' (00000000443b70ac): fill_kobj_path: path = '/devices/virtual/net/tunl0/queues/tx-0'
alloc_skb include/linux/skbuff.h:1011 [inline]
__ip6_append_data.isra.49+0x2f1a/0x3f50 net/ipv6/ip6_output.c:1450
ip6_append_data+0x1bc/0x2d0 net/ipv6/ip6_output.c:1619
rawv6_sendmsg+0x15ab/0x4b40 net/ipv6/raw.c:938
inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:631
___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
__sys_sendmsg+0x11d/0x280 net/socket.c:2154
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg net/socket.c:2161 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
kobject: 'gre0' (00000000cb1b2d7b): kobject_add_internal: parent: 'net', set: 'devices'

Freed by task 1304:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
__cache_free mm/slab.c:3498 [inline]
kfree+0xcf/0x230 mm/slab.c:3817
skb_free_head+0x93/0xb0 net/core/skbuff.c:553
pskb_expand_head+0x3b2/0x10d0 net/core/skbuff.c:1498
__pskb_pull_tail+0x156/0x18a0 net/core/skbuff.c:1896
pskb_may_pull include/linux/skbuff.h:2188 [inline]
_decode_session6+0xd11/0x14d0 net/ipv6/xfrm6_policy.c:150
__xfrm_decode_session+0x71/0x140 net/xfrm/xfrm_policy.c:3272
kobject: 'gre0' (00000000cb1b2d7b): kobject_uevent_env
__xfrm_policy_check+0x380/0x2c40 net/xfrm/xfrm_policy.c:3322
__xfrm_policy_check2 include/net/xfrm.h:1170 [inline]
xfrm_policy_check include/net/xfrm.h:1175 [inline]
xfrm6_policy_check include/net/xfrm.h:1185 [inline]
vti6_rcv+0x4bd/0x8f3 net/ipv6/ip6_vti.c:316
xfrm6_ipcomp_rcv+0x1a5/0x3a0 net/ipv6/xfrm6_protocol.c:132
ip6_protocol_deliver_rcu+0x372/0x1940 net/ipv6/ip6_input.c:394
ip6_input_finish+0x84/0x170 net/ipv6/ip6_input.c:434
NF_HOOK include/linux/netfilter.h:289 [inline]
ip6_input+0xe9/0x600 net/ipv6/ip6_input.c:443
ip6_mc_input+0x514/0x11c0 net/ipv6/ip6_input.c:537
dst_input include/net/dst.h:450 [inline]
ip6_rcv_finish+0x17a/0x330 net/ipv6/ip6_input.c:76
NF_HOOK include/linux/netfilter.h:289 [inline]
ipv6_rcv+0x115/0x640 net/ipv6/ip6_input.c:272
__netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4973
__netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5083
process_backlog+0x24e/0x7a0 net/core/dev.c:5923
kobject: 'gre0' (00000000cb1b2d7b): fill_kobj_path: path = '/devices/virtual/net/gre0'
napi_poll net/core/dev.c:6346 [inline]
net_rx_action+0x7fa/0x19b0 net/core/dev.c:6412
__do_softirq+0x308/0xb7e kernel/softirq.c:292

The buggy address belongs to the object at ffff888191b8cac0
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 176 bytes inside of
512-byte region [ffff888191b8cac0, ffff888191b8ccc0)
The buggy address belongs to the page:
page:ffffea000646e300 count:1 mapcount:0 mapping:ffff8881da800940 index:0x0
flags: 0x2fffc0000000200(slab)
raw: 02fffc0000000200 ffffea0006eaaa48 ffffea00065356c8 ffff8881da800940
raw: 0000000000000000 ffff888191b8c0c0 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected
kobject: 'queues' (000000005fd6226e): kobject_add_internal: parent: 'gre0', set: '<NULL>'

Memory state around the buggy address:
ffff888191b8ca00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888191b8ca80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
>ffff888191b8cb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888191b8cb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888191b8cc00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 0d3c703a9d17 ("ipv6: Cleanup IPv6 tunnel receive path")
Fixes: ed1efb2aefbb ("ipv6: Add support for IPsec virtual tunnel interfaces")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Dec 2018 17:17:52 +0000 (09:17 -0800)]

Merge tag 'drm-fixes-2018-12-21' of git://anongit.freedesktop.org/drm/drm

Pull final drm fix from Daniel Vetter:
"Very calm week, so either everything perfect or everyone on holidays
already. Just one array_index_nospec patch, also for stable"

* tag 'drm-fixes-2018-12-21' of git://anongit.freedesktop.org/drm/drm:
drm/ioctl: Fix Spectre v1 vulnerabilities

commit | commitdiff | tree

Vakul Garg [Fri, 21 Dec 2018 15:55:46 +0000 (15:55 +0000)]

Prevent overflow of sk_msg in sk_msg_clone()

Fixed function sk_msg_clone() to prevent overflow of 'dst' while adding
pages in scatterlist entries. The overflow of 'dst' causes crash in kernel
tls module while doing record encryption.

Crash fixed by this patch.

[   78.796119] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[   78.804900] Mem abort info:
[   78.807683]   ESR = 0x96000004
[   78.810744]   Exception class = DABT (current EL), IL = 32 bits
[   78.816677]   SET = 0, FnV = 0
[   78.819727]   EA = 0, S1PTW = 0
[   78.822873] Data abort info:
[   78.825759]   ISV = 0, ISS = 0x00000004
[   78.829600]   CM = 0, WnR = 0
[   78.832576] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000bf8ee311
[   78.839195] [0000000000000008] pgd=0000000000000000
[   78.844081] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   78.849642] Modules linked in: tls xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_filter ip6_tables xt_CHECKSUM cpve cpufreq_conservative lm90 ina2xx crct10dif_ce
[   78.865377] CPU: 0 PID: 6007 Comm: openssl Not tainted 4.20.0-rc6-01647-g754d5da63145-dirty #107
[   78.874149] Hardware name: LS1043A RDB Board (DT)
[   78.878844] pstate: 60000005 (nZCv daif -PAN -UAO)
[   78.883632] pc : scatterwalk_copychunks+0x164/0x1c8
[   78.888500] lr : scatterwalk_copychunks+0x160/0x1c8
[   78.893366] sp : ffff00001d04b600
[   78.896668] x29: ffff00001d04b600 x28: ffff80006814c680
[   78.901970] x27: 0000000000000000 x26: ffff80006c8de786
[   78.907272] x25: ffff00001d04b760 x24: 000000000000001a
[   78.912573] x23: 0000000000000006 x22: ffff80006814e440
[   78.917874] x21: 0000000000000100 x20: 0000000000000000
[   78.923175] x19: 000081ffffffffff x18: 0000000000000400
[   78.928476] x17: 0000000000000008 x16: 0000000000000000
[   78.933778] x15: 0000000000000100 x14: 0000000000000001
[   78.939079] x13: 0000000000001080 x12: 0000000000000020
[   78.944381] x11: 0000000000001080 x10: 00000000ffff0002
[   78.949683] x9 : ffff80006814c248 x8 : 00000000ffff0000
[   78.954985] x7 : ffff80006814c318 x6 : ffff80006c8de786
[   78.960286] x5 : 0000000000000f80 x4 : ffff80006c8de000
[   78.965588] x3 : 0000000000000000 x2 : 0000000000001086
[   78.970889] x1 : ffff7e0001b74e02 x0 : 0000000000000000
[   78.976192] Process openssl (pid: 6007, stack limit = 0x00000000291367f9)
[   78.982968] Call trace:
[   78.985406]  scatterwalk_copychunks+0x164/0x1c8
[   78.989927]  skcipher_walk_next+0x28c/0x448
[   78.994099]  skcipher_walk_done+0xfc/0x258
[   78.998187]  gcm_encrypt+0x434/0x4c0
[   79.001758]  tls_push_record+0x354/0xa58 [tls]
[   79.006194]  bpf_exec_tx_verdict+0x1e4/0x3e8 [tls]
[   79.010978]  tls_sw_sendmsg+0x650/0x780 [tls]
[   79.015326]  inet_sendmsg+0x2c/0xf8
[   79.018806]  sock_sendmsg+0x18/0x30
[   79.022284]  __sys_sendto+0x104/0x138
[   79.025935]  __arm64_sys_sendto+0x24/0x30
[   79.029936]  el0_svc_common+0x60/0xe8
[   79.033588]  el0_svc_handler+0x2c/0x80
[   79.037327]  el0_svc+0x8/0xc
[   79.040200] Code: 6b01005f 54fff788 940169b1 f9000320 (b9400801)
[   79.046283] ---[ end trace 74db007d069c1cf7 ]---

Fixes: d829e9c4112b ("tls: convert to generic sk_msg interface")
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Willem de Bruijn [Fri, 21 Dec 2018 17:06:59 +0000 (12:06 -0500)]

packet: validate address length

Packet sockets with SOCK_DGRAM may pass an address for use in
dev_hard_header. Ensure that it is of sufficient length.

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Dec 2018 17:09:30 +0000 (09:09 -0800)]

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:
"Switching a few devices with Synaptics over to SMbus and disabling
  SMbus on a couple devices with Elan touchpads as they need more
  plumbing on PS/2 side"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: synaptics - enable SMBus for HP EliteBook 840 G4
  Input: elantech - disable elan-i2c for P52 and P72
  Input: synaptics - enable RMI on ThinkPad T560
  Input: omap-keypad - fix idle configuration to not block SoC idle states

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Dec 2018 17:05:28 +0000 (09:05 -0800)]

Merge tag 'gpio-v4.20-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
"Hopefully last round of GPIO fixes.

  The ACPI patch is pretty important for some laptop users, the rest is
  driver-specific for embedded (mostly ARM) systems.

  I took out one ACPI patch that wasn't critical enough because I
  couldn't justify sending it at this point, and that is why the commit
  date is today, but the patches have been in linux-next.

  Sorry for not sending some of them earlier :(

  Notice that we have a co-maintainer for GPIO now, Bartosz Golaszewski,
  and he might jump in and make some pull requests at times when I am
  off.

  Summary:

   - ACPI IRQ request deferral

   - OMAP: revert deferred wakeup quirk

   - MAX7301: fix DMA safe memory handling

   - MVEBU: selective probe failure on missing clk"

* tag 'gpio-v4.20-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: mvebu: only fail on missing clk if pwm is actually to be used
  gpio: max7301: fix driver for use with CONFIG_VMAP_STACK
  gpio: gpio-omap: Revert deferred wakeup quirk handling for regressions
  gpiolib-acpi: Only defer request_irq for GpioInt ACPI event handlers

commit | commitdiff | tree

Kangjie Lu [Fri, 21 Dec 2018 06:22:32 +0000 (00:22 -0600)]

net: netxen: fix a missing check and an uninitialized use

When netxen_rom_fast_read() fails, "bios" is left uninitialized and may
contain random value, thus should not be used.

The fix ensures that if netxen_rom_fast_read() fails, we return "-EIO".

Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Daniel Borkmann [Fri, 21 Dec 2018 13:04:46 +0000 (14:04 +0100)]

bpf: fix segfault in test_verifier selftest

Minor fallout from merge resolution, test_verifier was segfaulting
because the REJECT result was correct, but errstr was NULL. Properly
fix it as in 339bbff2d6e0.

Fixes: 339bbff2d6e0 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Linus Torvalds [Fri, 21 Dec 2018 16:56:31 +0000 (08:56 -0800)]

Merge tag '4.20-rc7-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb3 fix from Steve French:
"An important smb3 fix for an regression to some servers introduced by
  compounding optimization to rmdir.

  This fix has been tested by multiple developers (including me) with
  the usual private xfstesting, but also by the new cifs/smb3 "buildbot"
  xfstest VMs (thank you Ronnie and Aurelien for good work on this
  automation). The automated testing has been updated so that it will
  catch problems like this in the future.

  Note that Pavel discovered (very recently) some unrelated but
  extremely important bugs in credit handling (smb3 flow control problem
  that can lead to disconnects/reconnects) when compounding, that I
  would have liked to send in ASAP but the complete testing of those two
  fixes may not be done in time and have to wait for 4.21"

* tag '4.20-rc7-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: Fix rmdir compounding regression to strict servers

commit | commitdiff | tree

Eric Sandeen [Fri, 21 Dec 2018 16:42:50 +0000 (08:42 -0800)]

iomap: don't search past page end in iomap_is_partially_uptodate

iomap_is_partially_uptodate() is intended to check wither blocks within
the selected range of a not-uptodate page are uptodate; if the range we
care about is up to date, it's an optimization.

However, the iomap implementation continues to check all blocks up to
from+count, which is beyond the page, and can even be well beyond the
iop->uptodate bitmap.

I think the worst that will happen is that we may eventually find a zero
bit and return "not partially uptodate" when it would have otherwise
returned true, and skip the optimization. Still, it's clearly an invalid
memory access that must be fixed.

So: fix this by limiting the search to within the page as is done in the
non-iomap variant, block_is_partially_uptodate().

Zorro noticed thiswhen KASAN went off for 512 byte blocks on a 64k
page system:

BUG: KASAN: slab-out-of-bounds in iomap_is_partially_uptodate+0x1a0/0x1e0
Read of size 8 at addr ffff800120c3a318 by task fsstress/22337

Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

commit | commitdiff | tree

Anup Patel [Tue, 4 Dec 2018 10:29:51 +0000 (15:59 +0530)]

RISC-V: Select GENERIC_SCHED_CLOCK for clocksource drivers

The riscv_timer driver can provide sched_clock using "rdtime"
instruction but to achieve this we require generic sched_clock
framework hence this patch selects GENERIC_SCHED_CLOCK for RISCV.

Signed-off-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>

commit | commitdiff | tree

Olof Johansson [Wed, 31 Oct 2018 06:47:08 +0000 (23:47 -0700)]

RISC-V: lib: minor asm cleanup

Fix tab/space conversion and use ENTRY/ENDPROC macros.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>

commit | commitdiff | tree

Palmer Dabbelt [Fri, 21 Dec 2018 16:15:39 +0000 (08:15 -0800)]

RISC-V: Move from EARLY_PRINTK to SBI earlycon

Now that we have earlycon support in the SBI console driver there is no
reason to have our arch-specific early printk support. This patch set
turns on SBI earlycon support and removes the old early printk.

commit | commitdiff | tree

Nick Kossifidis [Sun, 18 Nov 2018 00:06:56 +0000 (02:06 +0200)]

RISC-V: Update Kconfig to better handle CMDLINE

Added a menu to choose how the built-in command line will be
used and CMDLINE_EXTEND for compatibility with FDT code.

v2: Improved help messages, removed references to bootloader
and made them more descriptive. I also asked help from a
friend who's a language expert just in case.

v3: This time used the corrected text

v4: Copy the config strings from the arm32 port.

v5: Actually copy the config strings from the arm32 port.

Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Signed-off-by: Debbie Maliotaki <dmaliotaki@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>

commit | commitdiff | tree

David Abdurachmanov [Thu, 6 Dec 2018 10:26:26 +0000 (11:26 +0100)]

riscv: remove unused variable in ftrace

Noticed while building kernel-4.20.0-0.rc5.git2.1.fc30 for
Fedora 30/RISCV.

[..]
BUILDSTDERR: arch/riscv/kernel/ftrace.c: In function 'prepare_ftrace_return':
BUILDSTDERR: arch/riscv/kernel/ftrace.c:135:6: warning: unused variable 'err' [-Wunused-variable]
BUILDSTDERR: int err;
BUILDSTDERR: ^~~
[..]

Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
Fixes: e949b6db51dc1 ("riscv/function_graph: Simplify with function_graph_enter()")
Reviewed-by: Olof Johansson <olof@lixom.net>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>

commit | commitdiff | tree

Yangtao Li [Tue, 20 Nov 2018 14:11:02 +0000 (09:11 -0500)]

RISC-V: add of_node_put()

use of_node_put() to release the refcount.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>

commit | commitdiff | tree

Atish Patra [Tue, 20 Nov 2018 23:07:50 +0000 (15:07 -0800)]

RISC-V: Fix of_node_* refcount

Fix of_node* refcount at various places by using of_node_put.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>

commit | commitdiff | tree

Andrea Parri [Sat, 1 Dec 2018 00:01:56 +0000 (01:01 +0100)]

riscv, atomic: Add #define's for the atomic_{cmp,}xchg_*() variants

If an architecture does not define the atomic_{cmp,}xchg_*() variants,
the generic implementation defaults them to the fully-ordered version.

riscv's had its own variants since "the beginning", but it never told
(#define-d these for) the generic implementation: it is time to do so.

Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>

commit | commitdiff | tree

David S. Miller [Fri, 21 Dec 2018 16:00:27 +0000 (08:00 -0800)]

Merge tag 'mlx5-XDP-100Mpps' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-XDP-100Mpps

This series from Tariq, mainly adds the support of mlx5 Multi Packet WQE
(TX descriptor) - ConnectX-5 and above - for XDP TX, which allows us to
overcome the 70Mpps PCIe bottleneck of conventional TX queues (single TX
descriptor per packet), and achieve the 100Mpps milestone with the MPWQE
approach.

In the first five patches, Tariq did minor improvements to mlx5 tx path,
for better debug-ability and code structuring.

Next two patches lay down the foundation for MPWQE implementation to store
the in-flight XDP TX information for multiple packets of one descriptor
(WQE).

Next: Support Enhanced Multi-Packet TX WQE for XDP

In this patch we add support for the HW feature, which is supported
starting from ConnectX-5.

Performance:
Tested packet rate for UDP 64Byte multi-stream over ConnectX-5 NICs.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

XDP_TX:
We see a huge gain on single port ConnectX-5, and reach the 100 Mpps
milestone.
* Single-port HCA:
Before:   70 Mpps
After:   100 Mpps (+42.8%)

* Dual-port HCA:
Before: 51.7 Mpps
After:  57.3 Mpps (+10.8%)

* In both cases we tested traffic on one port and for now On Dual-port
  HCAs we see only a small gain, we are working to overcome this
  bottleneck, but for the moment only with experimental firmware on dual
  port HCAs we can reach the wanted numbers as seen on Single-port HCAs.

XDP_REDIRECT:
Redirect from (A) ConnectX-5 to (B) ConnectX-5.
Due to a setup limitation, (A) and (B) are on different NUMA nodes,
so absolute performance numbers are not optimal.
- Note:
  Below is the transmit rate of (B), not the redirect rate of (A)
  which is in some cases higher.

* (B) is single-port:
Before:   77 Mpps
After:    90 Mpps (+16.8%)

* (B) is dual-port:
Before:  61 Mpps
After:   72 Mpps (+18%)

Last patch adds a knob in mlx5 ethtool private flag to turn on/off
XDP TX MPWQE.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mark Brown [Fri, 21 Dec 2018 13:43:35 +0000 (13:43 +0000)]

Merge remote-tracking branch 'regulator/topic/coupled' into regulator-next

commit | commitdiff | tree

Mark Brown [Fri, 21 Dec 2018 13:43:32 +0000 (13:43 +0000)]

Merge branch 'regulator-4.21' into regulator-next

commit | commitdiff | tree

Mark Brown [Fri, 21 Dec 2018 13:43:30 +0000 (13:43 +0000)]

Merge branch 'regulator-4.20' into regulator-linus

local clone of linux.git