]> asedeno.scripts.mit.edu Git - linux.git/log
linux.git
5 years agoblock: copy ioprio in __bio_clone_fast() and bounce
Hannes Reinecke [Mon, 12 Nov 2018 17:35:25 +0000 (10:35 -0700)]
block: copy ioprio in __bio_clone_fast() and bounce

We need to copy the io priority, too; otherwise the clone will run
with a different priority than the original one.

Fixes: 43b62ce3ff0a ("block: move bio io prio to a new field")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixed up subject, and ordered stores.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agokyber: fix wrong strlcpy() size in trace_kyber_latency()
Omar Sandoval [Mon, 12 Nov 2018 08:08:46 +0000 (00:08 -0800)]
kyber: fix wrong strlcpy() size in trace_kyber_latency()

When copying to the latency type, we should be passing LATENCY_TYPE_LEN,
not DOMAIN_LEN (this isn't a problem in practice because we only pass
"total" or "I/O"). Fix it by changing all of the strlcpy() calls to use
sizeof().

Fixes: 6c3b7af1c975 ("kyber: add tracepoints")
Reported-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agodrm/i915: Fix hpd handling for pins with two encoders
Ville Syrjälä [Mon, 12 Nov 2018 13:36:16 +0000 (15:36 +0200)]
drm/i915: Fix hpd handling for pins with two encoders

In my haste to remove irq_port[] I accidentally changed the
way we deal with hpd pins that are shared by multiple encoders
(DP and HDMI for pre-DDI platforms). Previously we would only
handle such pins via ->hpd_pulse(), but now we queue up the
hotplug work for the HDMI encoder directly. Worse yet, we now
count each hpd twice and this increment the hpd storm count
twice as fast. This can lead to spurious storms being detected.

Go back to the old way of doing things, ie. delegate to
->hpd_pulse() for any pin which has an encoder with that hook
implemented. I don't really like the idea of adding irq_port[]
back so let's loop through the encoders first to check if we
have an encoder with ->hpd_pulse() for the pin, and then go
through all the pins and decided on the correct course of action
based on the earlier findings.

I have occasionally toyed with the idea of unifying the pre-DDI
HDMI and DP encoders into a single encoder as well. Besides the
hotplug processing it would have the other benefit of preventing
userspace from trying to enable both encoders at the same time.
That is simply illegal as they share the same clock/data pins.
We have some testcases that will attempt that and thus fail on
many older machines. But for now let's stick to fixing just the
hotplug code.

Cc: stable@vger.kernel.org # 4.19+
Cc: Lyude Paul <lyude@redhat.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes: b6ca3eee18ba ("drm/i915: Nuke dev_priv->irq_port[]")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181108200424.28371-1-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
(cherry picked from commit 5a3aeca97af1b6b3498d59a7fd4e8bb95814c108)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
5 years agodrm/i915/execlists: Force write serialisation into context image vs execution
Chris Wilson [Thu, 8 Nov 2018 08:17:38 +0000 (08:17 +0000)]
drm/i915/execlists: Force write serialisation into context image vs execution

Ensure that the writes into the context image are completed prior to the
register mmio to trigger execution. Although previously we were assured
by the SDM that all writes are flushed before an uncached memory
transaction (our mmio write to submit the context to HW for execution),
we have empirical evidence to believe that this is not actually the
case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656
References: https://bugs.freedesktop.org/show_bug.cgi?id=108315
References: https://bugs.freedesktop.org/show_bug.cgi?id=106887
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181108081740.25615-1-chris@chris-wilson.co.uk
Cc: stable@vger.kernel.org
(cherry picked from commit 987abd5c62f92ee4970b45aa077f47949974e615)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
5 years agodrm/i915/icl: Fix power well 2 wrt. DC-off toggling order
Imre Deak [Fri, 2 Nov 2018 18:22:00 +0000 (20:22 +0200)]
drm/i915/icl: Fix power well 2 wrt. DC-off toggling order

To enable DC5/6 power well 2 has to be disabled as for previous
platforms, so fix things up.

Bspec: 4234
Fixes: 67ca07e7ac10 ("drm/i915/icl: Add power well support")
Cc: Animesh Manna <animesh.manna@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181102182200.17219-1-imre.deak@intel.com
(cherry picked from commit a33e1ece777996ddddb1f23a30f8c66422ed0b68)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
5 years agodrm/i915: Fix NULL deref when re-enabling HPD IRQs on systems with MST
Lyude Paul [Tue, 6 Nov 2018 21:30:13 +0000 (16:30 -0500)]
drm/i915: Fix NULL deref when re-enabling HPD IRQs on systems with MST

Turns out that if you trigger an HPD storm on a system that has an MST
topology connected to it, you'll end up causing the kernel to eventually
hit a NULL deref:

[  332.339041] BUG: unable to handle kernel NULL pointer dereference at 00000000000000ec
[  332.340906] PGD 0 P4D 0
[  332.342750] Oops: 0000 [#1] SMP PTI
[  332.344579] CPU: 2 PID: 25 Comm: kworker/2:0 Kdump: loaded Tainted: G           O      4.18.0-rc3short-hpd-storm+ #2
[  332.346453] Hardware name: LENOVO 20BWS1KY00/20BWS1KY00, BIOS JBET71WW (1.35 ) 09/14/2018
[  332.348361] Workqueue: events intel_hpd_irq_storm_reenable_work [i915]
[  332.350301] RIP: 0010:intel_hpd_irq_storm_reenable_work.cold.3+0x2f/0x86 [i915]
[  332.352213] Code: 00 00 ba e8 00 00 00 48 c7 c6 c0 aa 5f a0 48 c7 c7 d0 73 62 a0 4c 89 c1 4c 89 04 24 e8 7f f5 af e0 4c 8b 04 24 44 89 f8 29 e8 <41> 39 80 ec 00 00 00 0f 85 43 13 fc ff 41 0f b6 86 b8 04 00 00 41
[  332.354286] RSP: 0018:ffffc90000147e48 EFLAGS: 00010006
[  332.356344] RAX: 0000000000000005 RBX: ffff8802c226c9d4 RCX: 0000000000000006
[  332.358404] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff88032dc95570
[  332.360466] RBP: 0000000000000005 R08: 0000000000000000 R09: ffff88031b3dc840
[  332.362528] R10: 0000000000000000 R11: 000000031a069602 R12: ffff8802c226ca20
[  332.364575] R13: ffff8802c2268000 R14: ffff880310661000 R15: 000000000000000a
[  332.366615] FS:  0000000000000000(0000) GS:ffff88032dc80000(0000) knlGS:0000000000000000
[  332.368658] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  332.370690] CR2: 00000000000000ec CR3: 000000000200a003 CR4: 00000000003606e0
[  332.372724] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  332.374773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  332.376798] Call Trace:
[  332.378809]  process_one_work+0x1a1/0x350
[  332.380806]  worker_thread+0x30/0x380
[  332.382777]  ? wq_update_unbound_numa+0x10/0x10
[  332.384772]  kthread+0x112/0x130
[  332.386740]  ? kthread_create_worker_on_cpu+0x70/0x70
[  332.388706]  ret_from_fork+0x35/0x40
[  332.390651] Modules linked in: i915(O) vfat fat joydev btusb btrtl btbcm btintel bluetooth ecdh_generic iTCO_wdt wmi_bmof i2c_algo_bit drm_kms_helper intel_rapl syscopyarea sysfillrect x86_pkg_temp_thermal sysimgblt coretemp fb_sys_fops crc32_pclmul drm psmouse pcspkr mei_me mei i2c_i801 lpc_ich mfd_core i2c_core tpm_tis tpm_tis_core thinkpad_acpi wmi tpm rfkill video crc32c_intel serio_raw ehci_pci xhci_pci ehci_hcd xhci_hcd [last unloaded: i915]
[  332.394963] CR2: 00000000000000ec

This appears to be due to the fact that with an MST topology, not all
intel_connector structs will have ->encoder set. So, fix this by
skipping connectors without encoders in
intel_hpd_irq_storm_reenable_work().

For those wondering, this bug was found on accident while simulating HPD
storms using a Chamelium connected to a ThinkPad T450s (Broadwell).

Changes since v1:
- Check intel_connector->mst_port instead of intel_connector->encoder

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181106213017.14563-3-lyude@redhat.com
(cherry picked from commit fee61deecb1d850bf34f682a6a452e5ee51b7572)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
5 years agodrm/i915: Fix possible race in intel_dp_add_mst_connector()
Lyude Paul [Tue, 6 Nov 2018 21:30:12 +0000 (16:30 -0500)]
drm/i915: Fix possible race in intel_dp_add_mst_connector()

This hasn't caused any issues yet that I'm aware of, but as Ville
Syrjälä pointed out - we need to make sure that
intel_connector->mst_port is set before initializing MST connectors,
since in theory we could potentially check intel_connector->mst_port in
i915_hpd_poll_init_work() after registering the connector but before
having written it's value.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20181106213017.14563-2-lyude@redhat.com
(cherry picked from commit 66a5ab1034be801630816d1fa6cfc30db1a2f0b0)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
5 years agodrm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5
Chris Wilson [Mon, 5 Nov 2018 09:43:05 +0000 (09:43 +0000)]
drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5

Exercising the gpu reloc path strenuously revealed an issue where the
updated relocations (from MI_STORE_DWORD_IMM) were not being observed
upon execution. After some experiments with adding pipecontrols (a lot
of pipecontrols (32) as gen4/5 do not have a bit to wait on earlier pipe
controls or even the current on), it was discovered that we merely
needed to delay the EMIT_INVALIDATE by several flushes. It is important
to note that it is the EMIT_INVALIDATE as opposed to the EMIT_FLUSH that
needs the delay as opposed to what one might first expect -- that the
delay is required for the TLB invalidation to take effect (one presumes
to purge any CS buffers) as opposed to a delay after flushing to ensure
the writes have landed before triggering invalidation.

Testcase: igt/gem_tiled_fence_blits
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181105094305.5767-1-chris@chris-wilson.co.uk
(cherry picked from commit 55f99bf2a9c331838c981694bc872cd1ec4070b2)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
5 years agoARM: spectre-v2: per-CPU vtables to work around big.Little systems
Russell King [Thu, 19 Jul 2018 11:21:31 +0000 (12:21 +0100)]
ARM: spectre-v2: per-CPU vtables to work around big.Little systems

In big.Little systems, some CPUs require the Spectre workarounds in
paths such as the context switch, but other CPUs do not.  In order
to handle these differences, we need per-CPU vtables.

We are unable to use the kernel's per-CPU variables to support this
as per-CPU is not initialised at times when we need access to the
vtables, so we have to use an array indexed by logical CPU number.

We use an array-of-pointers to avoid having function pointers in
the kernel's read/write .data section.

Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
5 years agoARM: 8810/1: vfp: Fix wrong assignement to ufp_exc
Julien Thierry [Thu, 8 Nov 2018 16:25:28 +0000 (17:25 +0100)]
ARM: 8810/1: vfp: Fix wrong assignement to ufp_exc

In vfp_preserve_user_clear_hwstate, ufp_exc->fpinst2 gets assigned to
itself. It should actually be hwstate->fpinst2 that gets assigned to the
ufp_exc field.

Fixes commit 3aa2df6ec2ca6bc143a65351cca4266d03a8bc41 ("ARM: 8791/1:
vfp: use __copy_to_user() when saving VFP state").

Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
5 years agoARM: add PROC_VTABLE and PROC_TABLE macros
Russell King [Thu, 19 Jul 2018 11:17:38 +0000 (12:17 +0100)]
ARM: add PROC_VTABLE and PROC_TABLE macros

Allow the way we access members of the processor vtable to be changed
at compile time.  We will need to move to per-CPU vtables to fix the
Spectre variant 2 issues on big.Little systems.

However, we have a couple of calls that do not need the vtable
treatment, and indeed cause a kernel warning due to the (later) use
of smp_processor_id(), so also introduce the PROC_TABLE macro for
these which always use CPU 0's function pointers.

Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
5 years agoARM: clean up per-processor check_bugs method call
Russell King [Thu, 19 Jul 2018 11:43:03 +0000 (12:43 +0100)]
ARM: clean up per-processor check_bugs method call

Call the per-processor type check_bugs() method in the same way as we
do other per-processor functions - move the "processor." detail into
proc-fns.h.

Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
5 years agoARM: split out processor lookup
Russell King [Thu, 19 Jul 2018 10:59:56 +0000 (11:59 +0100)]
ARM: split out processor lookup

Split out the lookup of the processor type and associated error handling
from the rest of setup_processor() - we will need to use this in the
secondary CPU bringup path for big.Little Spectre variant 2 mitigation.

Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
5 years agoARM: make lookup_processor_type() non-__init
Russell King [Thu, 19 Jul 2018 10:42:36 +0000 (11:42 +0100)]
ARM: make lookup_processor_type() non-__init

Move lookup_processor_type() out of the __init section so it is callable
from (eg) the secondary startup code during hotplug.

Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
5 years agodrm/omap: dsi: Fix missing of_platform_depopulate()
Tony Lindgren [Tue, 6 Nov 2018 15:28:02 +0000 (07:28 -0800)]
drm/omap: dsi: Fix missing of_platform_depopulate()

We're missing a call to of_platform_depopulate() on errors for dsi.
Looks like dss is already doing this.

Signed-off-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181106152802.38599-1-tony@atomide.com
5 years agodrm/omap: Move DISPC runtime PM handling to omapdrm
Laurent Pinchart [Sat, 10 Nov 2018 11:16:54 +0000 (13:16 +0200)]
drm/omap: Move DISPC runtime PM handling to omapdrm

The internal encoders (DSI, HDMI4, HDMI5 and VENC) runtime PM handlers
attempt to manage the runtime PM state of the connected DISPC, based on
the rationale that the DISPC providing data to the encoders requires
ensuring that the display is active whenever the encoders are active.

While the DISPC provides data to the encoders, it doesn't as such
constitute a resource that encoders require in order to be taken out
of suspend, contrary to for instance a functional clock or a power
supply. Encoders registers can be accessed without the DISPC being
active, and while the encoders will not output any video stream without
being fed by the DISPC, the DISPC PM state doesn't influence the
encoders PM state.

For this reason the DISPC PM state is better managed from the omapdrm
driver, in the CRTC enable and disable operations. This allows the
encoders PM state to be handled separately from the DISPC, and in
particular at times when the DISPC may not be available (for instance at
probe due to the DSS probe being deferred, or at remove time du to the
DISPC being already removed).

Fixes: edb715dffdee ("drm/omap: dss: dsi: Move initialization code from bind to probe")
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181110111654.4387-5-laurent.pinchart@ideasonboard.com
5 years agodrm/omap: dsi: Ensure the device is active during probe
Laurent Pinchart [Sat, 10 Nov 2018 11:16:53 +0000 (13:16 +0200)]
drm/omap: dsi: Ensure the device is active during probe

The probe function performs hardware access to read the number of
supported data lanes from a configuration register and thus requires the
device to be active. Ensure this by surrounding the access with
dsi_runtime_get() and dsi_runtime_put() calls.

Fixes: edb715dffdee ("drm/omap: dss: dsi: Move initialization code from bind to probe")
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181110111654.4387-4-laurent.pinchart@ideasonboard.com
5 years agodrm/omap: hdmi4: Ensure the device is active during bind
Laurent Pinchart [Sat, 10 Nov 2018 11:16:52 +0000 (13:16 +0200)]
drm/omap: hdmi4: Ensure the device is active during bind

The bind function performs hardware access (in hdmi4_cec_init()) and
thus requires the device to be active. Ensure this by surrounding the
bind function by hdmi_runtime_get() and hdmi_runtime_put() calls.

Fixes: 27d624527d99 ("drm/omap: dss: Acquire next dssdev at probe time")
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181110111654.4387-3-laurent.pinchart@ideasonboard.com
5 years agodrm/omap: Populate DSS children in omapdss driver
Laurent Pinchart [Sat, 10 Nov 2018 11:16:51 +0000 (13:16 +0200)]
drm/omap: Populate DSS children in omapdss driver

The DSS DT node contains children that describe the DSS components
(DISPC and internal encoders). Each of those components is handled by a
platform driver, and thus needs to be backed by a platform device.

The corresponding platform devices are created in mach-omap2 code by a
call to of_platform_populate(). While this approach has worked so far,
it doesn't model the hardware architecture very well, as it creates
child devices before the parent is ready to handle them. This would be
akin to creating I2C slaves before the I2C master is available.

The task can be easily performed in the omapdss driver code instead,
simplifying mach-omap2 code. We however can't remove the mach-omap2 code
completely as the omap2fb driver still depends on it, but we can move it
to the omap2fb-specific section, where it can stay until the omap2fb
driver gets removed.

This has the added benefit of not allowing DSS components to probe
before the DSS itself, which led to runtime PM issues when the DSS probe
is deferred.

Fixes: 27d624527d99 ("drm/omap: dss: Acquire next dssdev at probe time")
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181110111654.4387-2-laurent.pinchart@ideasonboard.com
5 years agomnt: fix __detach_mounts infinite loop
Benjamin Coddington [Wed, 3 Oct 2018 14:18:33 +0000 (10:18 -0400)]
mnt: fix __detach_mounts infinite loop

Since commit ff17fa561a04 ("d_invalidate(): unhash immediately")
immediately unhashes the dentry, we'll never return the mountpoint in
lookup_mountpoint(), which can lead to an unbreakable loop in
d_invalidate().

I have reports of NFS clients getting into this condition after the server
removes an export of an existing mount created through follow_automount(),
but I suspect there are various other ways to produce this problem if we
hunt down users of d_invalidate().  For example, it is possible to get into
this state by using XFS' d_invalidate() call in xfs_vn_unlink():

truncate -s 100m img{1,2}

mkfs.xfs -q -n version=ci img1
mkfs.xfs -q -n version=ci img2

mkdir -p /mnt/xfs
mount img1 /mnt/xfs

mkdir /mnt/xfs/sub1
mount img2 /mnt/xfs/sub1

cat > /mnt/xfs/sub1/foo &
umount -l /mnt/xfs/sub1
mount img2 /mnt/xfs/sub1

mount --make-private /mnt/xfs

mkdir /mnt/xfs/sub2
mount --move /mnt/xfs/sub1 /mnt/xfs/sub2
rmdir /mnt/xfs/sub1

Fix this by moving the check for an unlinked dentry out of the
detach_mounts() path.

Fixes: ff17fa561a04 ("d_invalidate(): unhash immediately")
Cc: stable@vger.kernel.org
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
5 years agoperf/x86/intel/uncore: Support CoffeeLake 8th CBOX
Kan Liang [Fri, 19 Oct 2018 17:04:19 +0000 (10:04 -0700)]
perf/x86/intel/uncore: Support CoffeeLake 8th CBOX

Coffee Lake has 8 core products which has 8 Cboxes. The 8th CBOX is
mapped into different MSR space.

Increase the num_boxes to 8 to handle the new products. It will not
impact the previous platforms, SkyLake, KabyLake and earlier CoffeeLake.
Because the num_boxes will be recalculated in uncore_cpu_init and
doesn't exceed the x86_max_cores.

Introduce a new box flag bit to indicate the 8th CBOX.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20181019170419.378-2-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agoperf/x86/intel/uncore: Add more IMC PCI IDs for KabyLake and CoffeeLake CPUs
Kan Liang [Fri, 19 Oct 2018 17:04:18 +0000 (10:04 -0700)]
perf/x86/intel/uncore: Add more IMC PCI IDs for KabyLake and CoffeeLake CPUs

KabyLake and CoffeeLake CPUs have the same client uncore events as SkyLake.

Add the PCI IDs for the KabyLake Y, U, S processor lines and CoffeeLake U,
H, S processor lines.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20181019170419.378-1-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agosched/fair: Fix cpu_util_wake() for 'execl' type workloads
Patrick Bellasi [Mon, 5 Nov 2018 14:53:58 +0000 (14:53 +0000)]
sched/fair: Fix cpu_util_wake() for 'execl' type workloads

A ~10% regression has been reported for UnixBench's execl throughput
test by Aaron Lu and Ye Xiaolong:

  https://lkml.org/lkml/2018/10/30/765

That test is pretty simple, it does a "recursive" execve() syscall on the
same binary. Starting from the syscall, this sequence is possible:

   do_execve()
     do_execveat_common()
       __do_execve_file()
         sched_exec()
           select_task_rq_fair()          <==| Task already enqueued
             find_idlest_cpu()
               find_idlest_group()
                 capacity_spare_wake()    <==| Functions not called from
   cpu_util_wake()           | the wakeup path

which means we can end up calling cpu_util_wake() not only from the
"wakeup path", as its name would suggest. Indeed, the task doing an
execve() syscall is already enqueued on the CPU we want to get the
cpu_util_wake() for.

The estimated utilization for a CPU computed in cpu_util_wake() was
written under the assumption that function can be called only from the
wakeup path. If instead the task is already enqueued, we end up with a
utilization which does not remove the current task's contribution from
the estimated utilization of the CPU.
This will wrongly assume a reduced spare capacity on the current CPU and
increase the chances to migrate the task on execve.

The regression is tracked down to:

 commit d519329f72a6 ("sched/fair: Update util_est only on util_avg updates")

because in that patch we turn on by default the UTIL_EST sched feature.
However, the real issue is introduced by:

 commit f9be3e5961c5 ("sched/fair: Use util_est in LB and WU paths")

Let's fix this by ensuring to always discount the task estimated
utilization from the CPU's estimated utilization when the task is also
the current one. The same benchmark of the bug report, executed on a
dual socket 40 CPUs Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz machine,
reports these "Execl Throughput" figures (higher the better):

   mainline     : 48136.5 lps
   mainline+fix : 55376.5 lps

which correspond to a 15% speedup.

Moreover, since {cpu_util,capacity_spare}_wake() are not really only
used from the wakeup path, let's remove this ambiguity by using a better
matching name: {cpu_util,capacity_spare}_without().

Since we are at that, let's also improve the existing documentation.

Reported-by: Aaron Lu <aaron.lu@intel.com>
Reported-by: Ye Xiaolong <xiaolong.ye@intel.com>
Tested-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Fixes: f9be3e5961c5 (sched/fair: Use util_est in LB and WU paths)
Link: https://lore.kernel.org/lkml/20181025093100.GB13236@e110439-lin/
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agoselftests/powerpc: Fix wild_bctr test to work on ppc64
Michael Ellerman [Mon, 12 Nov 2018 02:46:06 +0000 (13:46 +1100)]
selftests/powerpc: Fix wild_bctr test to work on ppc64

The selftest I recently added to test branching to an out-of-bounds
NIP doesn't work on 64-bit big endian. It does fail but not in the
right way. That is it SEGVs trying to load from the opd at BAD_NIP,
but it never gets as far as branching to BAD_NIP.

To fix it we need to create an opd which is reachable but which holds
the bad address.

Fixes: b7683fc66eba ("selftests/powerpc: Add a test of wild bctr")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/io: Fix the IO workarounds code to work with Radix
Michael Ellerman [Tue, 6 Nov 2018 12:37:58 +0000 (23:37 +1100)]
powerpc/io: Fix the IO workarounds code to work with Radix

Back in 2006 Ben added some workarounds for a misbehaviour in the
Spider IO bridge used on early Cell machines, see commit
014da7ff47b5 ("[POWERPC] Cell "Spider" MMIO workarounds"). Later these
were made to be generic, ie. not tied specifically to Spider.

The code stashes a token in the high bits (59-48) of virtual addresses
used for IO (eg. returned from ioremap()). This works fine when using
the Hash MMU, but when we're using the Radix MMU the bits used for the
token overlap with some of the bits of the virtual address.

This is because the maximum virtual address is larger with Radix, up
to c00fffffffffffff, and in fact we use that high part of the address
range for ioremap(), see RADIX_KERN_IO_START.

As it happens the bits that are used overlap with the bits that
differentiate an IO address vs a linear map address. If the resulting
address lies outside the linear mapping we will crash (see below), if
not we just corrupt memory.

  virtio-pci 0000:00:00.0: Using 64-bit direct DMA at offset 800000000000000
  Unable to handle kernel paging request for data at address 0xc000000080000014
  ...
  CFAR: c000000000626b98 DAR: c000000080000014 DSISR: 42000000 IRQMASK: 0
  GPR00: c0000000006c54fc c00000003e523378 c0000000016de600 0000000000000000
  GPR04: c00c000080000014 0000000000000007 0fffffff000affff 0000000000000030
         ^^^^
  ...
  NIP [c000000000626c5c] .iowrite8+0xec/0x100
  LR [c0000000006c992c] .vp_reset+0x2c/0x90
  Call Trace:
    .pci_bus_read_config_dword+0xc4/0x120 (unreliable)
    .register_virtio_device+0x13c/0x1c0
    .virtio_pci_probe+0x148/0x1f0
    .local_pci_probe+0x68/0x140
    .pci_device_probe+0x164/0x220
    .really_probe+0x274/0x3b0
    .driver_probe_device+0x80/0x170
    .__driver_attach+0x14c/0x150
    .bus_for_each_dev+0xb8/0x130
    .driver_attach+0x34/0x50
    .bus_add_driver+0x178/0x2f0
    .driver_register+0x90/0x1a0
    .__pci_register_driver+0x6c/0x90
    .virtio_pci_driver_init+0x2c/0x40
    .do_one_initcall+0x64/0x280
    .kernel_init_freeable+0x36c/0x474
    .kernel_init+0x24/0x160
    .ret_from_kernel_thread+0x58/0x7c

This hasn't been a problem because CONFIG_PPC_IO_WORKAROUNDS which
enables this code is usually not enabled. It is only enabled when it's
selected by PPC_CELL_NATIVE which is only selected by
PPC_IBM_CELL_BLADE and that in turn depends on BIG_ENDIAN. So in order
to hit the bug you need to build a big endian kernel, with IBM Cell
Blade support enabled, as well as Radix MMU support, and then boot
that on Power9 using Radix MMU.

Still we can fix the bug, so let's do that. We simply use fewer bits
for the token, taking the union of the restrictions on the address
from both Hash and Radix, we end up with 8 bits we can use for the
token. The only user of the token is iowa_mem_find_bus() which only
supports 8 token values, so 8 bits is plenty for that.

Fixes: 566ca99af026 ("powerpc/mm/radix: Add dummy radix_enabled()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm/64s: Fix preempt warning in slb_allocate_kernel()
Michael Ellerman [Thu, 1 Nov 2018 05:21:05 +0000 (16:21 +1100)]
powerpc/mm/64s: Fix preempt warning in slb_allocate_kernel()

With preempt enabled we see warnings in do_slb_fault():

  BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u33:0/98
  futex hash table entries: 4096 (order: 3, 524288 bytes)
  caller is do_slb_fault+0x204/0x230
  CPU: 5 PID: 98 Comm: kworker/u33:0 Not tainted 4.19.0-rc3-gcc-7.3.1-00022-g1936f094e164 #138
  Call Trace:
    dump_stack+0xb4/0x104 (unreliable)
    check_preemption_disabled+0x148/0x150
    do_slb_fault+0x204/0x230
    data_access_slb_common+0x138/0x180

This is caused by the get_paca() in slb_allocate_kernel(), which
includes a call to debug_smp_processor_id().

slb_allocate_kernel() can only be called from do_slb_fault(), and in
that path interrupts are hard disabled and so we can't be preempted,
but we can't update the preempt flags (in thread_info) because that
could cause an SLB fault.

So just use local_paca which is safe and doesn't cause the warning.

Fixes: 48e7b7695745 ("powerpc/64s/hash: Convert SLB miss handlers to C")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoLinux 4.20-rc2 v4.20-rc2
Linus Torvalds [Sun, 11 Nov 2018 23:12:31 +0000 (17:12 -0600)]
Linux 4.20-rc2

5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sun, 11 Nov 2018 23:09:48 +0000 (17:09 -0600)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
 "One last pull request before heading to Vancouver for LPC, here we have:

   1) Don't forget to free VSI contexts during ice driver unload, from
      Victor Raj.

   2) Don't forget napi delete calls during device remove in ice driver,
      from Dave Ertman.

   3) Don't request VLAN tag insertion of ibmvnic device when SKB
      doesn't have VLAN tags at all.

   4) IPV4 frag handling code has to accomodate the situation where two
      threads try to insert the same fragment into the hash table at the
      same time. From Eric Dumazet.

   5) Relatedly, don't flow separate on protocol ports for fragmented
      frames, also from Eric Dumazet.

   6) Memory leaks in qed driver, from Denis Bolotin.

   7) Correct valid MTU range in smsc95xx driver, from Stefan Wahren.

   8) Validate cls_flower nested policies properly, from Jakub Kicinski.

   9) Clearing of stats counters in mc88e6xxx driver doesn't retain
      important bits in the G1_STATS_OP register causing the chip to
      hang. Fix from Andrew Lunn"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
  act_mirred: clear skb->tstamp on redirect
  net: dsa: mv88e6xxx: Fix clearing of stats counters
  tipc: fix link re-establish failure
  net: sched: cls_flower: validate nested enc_opts_policy to avoid warning
  net: mvneta: correct typo
  flow_dissector: do not dissect l4 ports for fragments
  net: qualcomm: rmnet: Fix incorrect assignment of real_dev
  net: aquantia: allow rx checksum offload configuration
  net: aquantia: invalid checksumm offload implementation
  net: aquantia: fixed enable unicast on 32 macvlan
  net: aquantia: fix potential IOMMU fault after driver unbind
  net: aquantia: synchronized flow control between mac/phy
  net: smsc95xx: Fix MTU range
  net: stmmac: Fix RX packet size > 8191
  qed: Fix potential memory corruption
  qed: Fix SPQ entries not returned to pool in error flows
  qed: Fix blocking/unlimited SPQ entries leak
  qed: Fix memory/entry leak in qed_init_sp_request()
  inet: frags: better deal with smp races
  net: hns3: bugfix for not checking return value
  ...

5 years agoMerge tag 'kbuild-fixes-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masah...
Linus Torvalds [Sun, 11 Nov 2018 22:57:55 +0000 (16:57 -0600)]
Merge tag 'kbuild-fixes-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - fix build errors in binrpm-pkg and bindeb-pkg targets

 - fix false positive matches in merge_config.sh

 - fix build version mismatch in deb-pkg target

 - fix dtbs_install handling in (bin)deb-pkg target

 - revert a commit that allows setlocalversion to write to source tree

* tag 'kbuild-fixes-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  builddeb: Fix inclusion of dtbs in debian package
  Revert "scripts/setlocalversion: git: Make -dirty check more robust"
  kbuild: deb-pkg: fix too low build version number
  kconfig: merge_config: avoid false positive matches from comment lines
  kbuild: deb-pkg: fix bindeb-pkg breakage when O= is used
  kbuild: rpm-pkg: fix binrpm-pkg breakage when O= is used

5 years agoMerge tag 'for-4.20-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Linus Torvalds [Sun, 11 Nov 2018 22:54:38 +0000 (16:54 -0600)]
Merge tag 'for-4.20-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "Several fixes to recent release (4.19, fixes tagged for stable) and
  other fixes"

* tag 'for-4.20-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Btrfs: fix missing delayed iputs on unmount
  Btrfs: fix data corruption due to cloning of eof block
  Btrfs: fix infinite loop on inode eviction after deduplication of eof block
  Btrfs: fix deadlock on tree root leaf when finding free extent
  btrfs: avoid link error with CONFIG_NO_AUTO_INLINE
  btrfs: tree-checker: Fix misleading group system information
  Btrfs: fix missing data checksums after a ranged fsync (msync)
  btrfs: fix pinned underflow after transaction aborted
  Btrfs: fix cur_offset in the error case for nocow

5 years agoMerge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 11 Nov 2018 22:53:02 +0000 (16:53 -0600)]
Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "A large number of ext4 bug fixes, mostly buffer and memory leaks on
  error return cleanup paths"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: missing !bh check in ext4_xattr_inode_write()
  ext4: fix buffer leak in __ext4_read_dirblock() on error path
  ext4: fix buffer leak in ext4_expand_extra_isize_ea() on error path
  ext4: fix buffer leak in ext4_xattr_move_to_block() on error path
  ext4: release bs.bh before re-using in ext4_xattr_block_find()
  ext4: fix buffer leak in ext4_xattr_get_block() on error path
  ext4: fix possible leak of s_journal_flag_rwsem in error path
  ext4: fix possible leak of sbi->s_group_desc_leak in error path
  ext4: remove unneeded brelse call in ext4_xattr_inode_update_ref()
  ext4: avoid possible double brelse() in add_new_gdb() on error path
  ext4: avoid buffer leak in ext4_orphan_add() after prior errors
  ext4: avoid buffer leak on shutdown in ext4_mark_iloc_dirty()
  ext4: fix possible inode leak in the retry loop of ext4_resize_fs()
  ext4: fix missing cleanup if ext4_alloc_flex_bg_array() fails while resizing
  ext4: add missing brelse() update_backups()'s error path
  ext4: add missing brelse() add_new_gdb_meta_bg()'s error path
  ext4: add missing brelse() in set_flexbg_block_bitmap()'s error path
  ext4: avoid potential extra brelse in setup_new_flex_group_blocks()

5 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 11 Nov 2018 22:41:50 +0000 (16:41 -0600)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of x86 fixes:

   - Cure the LDT remapping to user space on 5 level paging which ended
     up in the KASLR space

   - Remove LDT mapping before freeing the LDT pages

   - Make NFIT MCE handling more robust

   - Unbreak the VSMP build by removing the dependency on paravirt ops

   - Support broken PIT emulation on Microsoft hyperV

   - Don't trace vmware_sched_clock() to avoid tracer recursion

   - Remove -pipe from KBUILD CFLAGS which breaks clang and is also
     slower on GCC

   - Trivial coding style and typo fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu/vmware: Do not trace vmware_sched_clock()
  x86/vsmp: Remove dependency on pv_irq_ops
  x86/ldt: Remove unused variable in map_ldt_struct()
  x86/ldt: Unmap PTEs for the slot before freeing LDT pages
  x86/mm: Move LDT remap out of KASLR region on 5-level paging
  acpi/nfit, x86/mce: Validate a MCE's address before using it
  acpi/nfit, x86/mce: Handle only uncorrectable machine checks
  x86/build: Remove -pipe from KBUILD_CFLAGS
  x86/hyper-v: Fix indentation in hv_do_fast_hypercall16()
  Documentation/x86: Fix typo in zero-page.txt
  x86/hyper-v: Enable PIT shutdown quirk
  clockevents/drivers/i8253: Add support for PIT shutdown quirk

5 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 11 Nov 2018 22:39:12 +0000 (16:39 -0600)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
 "A bunch of perf tooling fixes:

   - Make the Intel PT SQL viewer more robust

   - Make the Intel PT debug log more useful

   - Support weak groups in perf record so it's behaving the same way as
     perf stat

   - Display the LBR stats in callchain entries properly in perf top

   - Handle different PMu names with common prefix properlin in pert
     stat

   - Start syscall augmenting in perf trace. Preparation for
     architecture independent eBPF instrumentation of syscalls.

   - Fix build breakage in JVMTI perf lib

   - Fix arm64 tools build failure wrt smp_load_{acquire,release}"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Do not zero sample_id_all for group members
  perf tools: Fix undefined symbol scnprintf in libperf-jvmti.so
  perf beauty: Use SRCARCH, ARCH=x86_64 must map to "x86" to find the headers
  perf intel-pt: Add MTC and CYC timestamps to debug log
  perf intel-pt: Add more event information to debug log
  perf scripts python: exported-sql-viewer.py: Fix table find when table re-ordered
  perf scripts python: exported-sql-viewer.py: Add help window
  perf scripts python: exported-sql-viewer.py: Add Selected branches report
  perf scripts python: exported-sql-viewer.py: Fall back to /usr/local/lib/libxed.so
  perf top: Display the LBR stats in callchain entry
  perf stat: Handle different PMU names with common prefix
  perf record: Support weak groups
  perf evlist: Move perf_evsel__reset_weak_group into evlist
  perf augmented_syscalls: Start collecting pathnames in the BPF program
  perf trace: Fix setting of augmented payload when using eBPF + raw_syscalls
  perf trace: When augmenting raw_syscalls plug raw_syscalls:sys_exit too
  perf examples bpf: Start augmenting raw_syscalls:sys_{start,exit}
  tools headers barrier: Fix arm64 tools build failure wrt smp_load_{acquire,release}

5 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 11 Nov 2018 22:37:41 +0000 (16:37 -0600)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
 "Just the removal of a redundant call into the sched deadline overrun
  check"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-cpu-timers: Remove useless call to check_dl_overrun()

5 years agoMerge branch 'sched/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Sun, 11 Nov 2018 22:33:00 +0000 (16:33 -0600)]
Merge branch 'sched/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Thomas Gleixner:
 "Two small scheduler fixes:

   - Take hotplug lock in sched_init_smp(). Technically not really
     required, but lockdep will complain other.

   - Trivial comment fix in sched/fair"

* 'sched/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix a comment in task_numa_fault()
  sched/core: Take the hotplug lock in sched_init_smp()

5 years agoMerge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 11 Nov 2018 22:18:10 +0000 (16:18 -0600)]
Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking build fix from Thomas Gleixner:
 "A single fix for a build fail with CONFIG_PROFILE_ALL_BRANCHES=y in
  the qspinlock code"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/qspinlock: Fix compile error

5 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 11 Nov 2018 22:14:05 +0000 (16:14 -0600)]
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core fixes from Thomas Gleixner:
 "A couple of fixlets for the core:

   - Kernel doc function documentation fixes

   - Missing prototypes for weak watchdog functions"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  resource/docs: Complete kernel-doc style function documentation
  watchdog/core: Add missing prototypes for weak functions
  resource/docs: Fix new kernel-doc warnings

5 years agoact_mirred: clear skb->tstamp on redirect
Eric Dumazet [Sun, 11 Nov 2018 00:22:29 +0000 (16:22 -0800)]
act_mirred: clear skb->tstamp on redirect

If sch_fq is used at ingress, skbs that might have been
timestamped by net_timestamp_set() if a packet capture
is requesting timestamps could be delayed by arbitrary
amount of time, since sch_fq time base is MONOTONIC.

Fix this problem by moving code from sch_netem.c to act_mirred.c.

Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: mv88e6xxx: Fix clearing of stats counters
Andrew Lunn [Sat, 10 Nov 2018 23:41:10 +0000 (00:41 +0100)]
net: dsa: mv88e6xxx: Fix clearing of stats counters

The mv88e6161 would sometime fail to probe with a timeout waiting for
the switch to complete an operation. This operation is supposed to
clear the statistics counters. However, due to a read/modify/write,
without the needed mask, the operation actually carried out was more
random, with invalid parameters, resulting in the switch not
responding. We need to preserve the histogram mode bits, so apply a
mask to keep them.

Reported-by: Chris Healy <Chris.Healy@zii.aero>
Fixes: 40cff8fca9e3 ("net: dsa: mv88e6xxx: Fix stats histogram mode")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotipc: fix link re-establish failure
Jon Maloy [Sat, 10 Nov 2018 22:30:24 +0000 (17:30 -0500)]
tipc: fix link re-establish failure

When a link failure is detected locally, the link is reset, the flag
link->in_session is set to false, and a RESET_MSG with the 'stopping'
bit set is sent to the peer.

The purpose of this bit is to inform the peer that this endpoint just
is going down, and that the peer should handle the reception of this
particular RESET message as a local failure. This forces the peer to
accept another RESET or ACTIVATE message from this endpoint before it
can re-establish the link. This again is necessary to ensure that
link session numbers are properly exchanged before the link comes up
again.

If a failure is detected locally at the same time at the peer endpoint
this will do the same, which is also a correct behavior.

However, when receiving such messages, the endpoints will not
distinguish between 'stopping' RESETs and ordinary ones when it comes
to updating session numbers. Both endpoints will copy the received
session number and set their 'in_session' flags to true at the
reception, while they are still expecting another RESET from the
peer before they can go ahead and re-establish. This is contradictory,
since, after applying the validation check referred to below, the
'in_session' flag will cause rejection of all such messages, and the
link will never come up again.

We now fix this by not only handling received RESET/STOPPING messages
as a local failure, but also by omitting to set a new session number
and the 'in_session' flag in such cases.

Fixes: 7ea817f4e832 ("tipc: check session number before accepting link protocol messages")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobuilddeb: Fix inclusion of dtbs in debian package
Rob Herring [Wed, 7 Nov 2018 14:36:46 +0000 (08:36 -0600)]
builddeb: Fix inclusion of dtbs in debian package

Commit 37c8a5fafa3b ("kbuild: consolidate Devicetree dtb build rules")
moved the location of 'dtbs_install' target which caused dtbs to not be
installed when building debian package with 'bindeb-pkg' target. Update
the builddeb script to use the same logic that determines if there's a
'dtbs_install' target which is presence of the arch dts directory. Also,
use CONFIG_OF_EARLY_FLATTREE instead of CONFIG_OF as that's a better
indication of whether we are building dtbs.

This commit will also have the side effect of installing dtbs on any
arch that has dts files. Previously, it was dependent on whether the
arch defined 'dtbs_install'.

Fixes: 37c8a5fafa3b ("kbuild: consolidate Devicetree dtb build rules")
Reported-by: Nuno Gonçalves <nunojpg@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
5 years agoRevert "scripts/setlocalversion: git: Make -dirty check more robust"
Guenter Roeck [Tue, 6 Nov 2018 18:10:38 +0000 (10:10 -0800)]
Revert "scripts/setlocalversion: git: Make -dirty check more robust"

This reverts commit 6147b1cf19651c7de297e69108b141fb30aa2349.

The reverted patch results in attempted write access to the source
repository, even if that repository is mounted read-only.

Output from "strace git status -uno --porcelain":

getcwd("/tmp/linux-test", 129)          = 16
open("/tmp/linux-test/.git/index.lock", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) =
-1 EROFS (Read-only file system)

While git appears to be able to handle this situation, a monitored
build environment (such as the one used for Chrome OS kernel builds)
may detect it and bail out with an access violation error. On top of
that, the attempted write access suggests that git _will_ write to the
file even if a build output directory is specified. Users may have the
reasonable expectation that the source repository remains untouched in
that situation.

Fixes: 6147b1cf19651 ("scripts/setlocalversion: git: Make -dirty check more robust"
Cc: Genki Sky <sky@genki.is>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
5 years agokbuild: deb-pkg: fix too low build version number
Masahiro Yamada [Tue, 6 Nov 2018 04:18:05 +0000 (13:18 +0900)]
kbuild: deb-pkg: fix too low build version number

Since commit b41d920acff8 ("kbuild: deb-pkg: split generating packaging
and build"), the build version of the kernel contained in a deb package
is too low by 1.

Prior to the bad commit, the kernel was built first, then the number
in .version file was read out, and written into the debian control file.

Now, the debian control file is created before the kernel is actually
compiled, which is causing the version number mismatch.

Let the mkdebian script pass KBUILD_BUILD_VERSION=${revision} to require
the build system to use the specified version number.

Fixes: b41d920acff8 ("kbuild: deb-pkg: split generating packaging and build")
Reported-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Doug Smythies <dsmythies@telus.net>
5 years agokconfig: merge_config: avoid false positive matches from comment lines
Masahiro Yamada [Mon, 5 Nov 2018 08:19:36 +0000 (17:19 +0900)]
kconfig: merge_config: avoid false positive matches from comment lines

The current SED_CONFIG_EXP could match to comment lines in config
fragment files, especially when CONFIG_PREFIX_ is empty. For example,
Buildroot uses empty prefixing; starting symbols with BR2_ is just
convention.

Make the sed expression more robust against false positives from
comment lines. The new sed expression matches to only valid patterns.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Petr Vorel <petr.vorel@gmail.com>
Reviewed-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
5 years agoMerge branch 'pm-cpuidle'
Rafael J. Wysocki [Sat, 10 Nov 2018 23:02:37 +0000 (00:02 +0100)]
Merge branch 'pm-cpuidle'

* pm-cpuidle:
  ARM: cpuidle: Convert to use cpuidle_register|unregister()
  ARM: cpuidle: Don't register the driver when back-end init returns -ENXIO

5 years agoMerge tag 'tty-4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sat, 10 Nov 2018 19:32:14 +0000 (13:32 -0600)]
Merge tag 'tty-4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are some small tty fixes for 4.20-rc2

  One of these missed the original 4.19-final release, I missed that I
  hadn't done a pull request for it as it was in linux-next and my
  branch for a long time, that's my fault.

  The others are small, fixing some reported issues and finally fixing
  the termios mess for alpha so that glibc has a chance to implement
  some missing functionality that has been pending for many years now.

  All of these have been in linux-next with no reported issues"

* tag 'tty-4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: sh-sci: Fix could not remove dev_attr_rx_fifo_timeout
  arch/alpha, termios: implement BOTHER, IBSHIFT and termios2
  termios, tty/tty_baudrate.c: fix buffer overrun
  vt: fix broken display when running aptitude
  serial: sh-sci: Fix receive on SCIFA/SCIFB variants with DMA

5 years agoMerge tag 'drm-fixes-2018-11-11' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Sat, 10 Nov 2018 19:29:47 +0000 (13:29 -0600)]
Merge tag 'drm-fixes-2018-11-11' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "drm: i915, amdgpu, sun4i, exynos and etnaviv fixes:

   - amdgpu has some display fixes, KFD ioctl fixes and a Vega20 bios
     interaction fix.

   - sun4i has some NULL checks added

   - i915 has a 32-bit system fix, LPE audio oops, and HDMI2.0 clock
     fixes.

   - Exynos has a 3 regression fixes (one frame counter, fbdev missing,
     dsi->panel check)

   - Etnaviv has a single fencing fix for GPU recovery"

* tag 'drm-fixes-2018-11-11' of git://anongit.freedesktop.org/drm/drm: (39 commits)
  drm/amd/amdgpu/dm: Fix dm_dp_create_fake_mst_encoder()
  drm/amd/display: Drop reusing drm connector for MST
  drm/amd/display: Cleanup MST non-atomic code workaround
  drm/amd/powerplay: always use fast UCLK switching when UCLK DPM enabled
  drm/amd/powerplay: set a default fclk/gfxclk ratio
  drm/amdgpu/display/dce11: only enable FBC when selected
  drm/amdgpu/display/dm: handle FBC dc feature parameter
  drm/amdgpu/display/dc: add FBC to dc_config
  drm/amdgpu: add DC feature mask module parameter
  drm/amdgpu/display: check if fbc is available in set_static_screen_control (v2)
  drm/amdgpu/vega20: add CLK base offset
  drm/amd/display: Stop leaking planes
  drm/amd/display: Fix misleading buffer information
  Revert "drm/amd/display: set backlight level limit to 1"
  drm/amd: Update atom_smu_info_v3_3 structure
  drm/i915: Fix ilk+ watermarks when disabling pipes
  drm/sun4i: tcon: prevent tcon->panel dereference if NULL
  drm/sun4i: tcon: fix check of tcon->panel null pointer
  drm/i915: Don't oops during modeset shutdown after lpe audio deinit
  drm/i915: Mark pin flags as u64
  ...

5 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm...
Linus Torvalds [Sat, 10 Nov 2018 19:27:58 +0000 (13:27 -0600)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull namespace fixes from Eric Biederman:
 "I believe all of these are simple obviously correct bug fixes. These
  fall into two groups:

   - Fixing the implementation of MNT_LOCKED which prevents lesser
     privileged users from seeing unders mounts created by more
     privileged users.

   - Fixing the extended uid and group mapping in user namespaces.

  As well as ensuring the code looks correct I have spot tested these
  changes as well and in my testing the fixes are working"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  mount: Prevent MNT_DETACH from disconnecting locked mounts
  mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts
  mount: Retest MNT_LOCKED in do_umount
  userns: also map extents in the reverse map to kernel IDs

5 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 10 Nov 2018 19:25:55 +0000 (13:25 -0600)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "A small set of fixes for clk drivers.

  One to fix a DT refcount imbalance, two to mark some Amlogic clks as
  critical, and one final one that fixes a clk name for the Qualcomm
  driver merged this cycle"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: qcom: gcc: Fix board clock node name
  clk: meson: axg: mark fdiv2 and fdiv3 as critical
  clk: meson-gxbb: set fclk_div3 as CLK_IS_CRITICAL
  clk: fixed-factor: fix of_node_get-put imbalance

5 years agoMerge branch 'drm-fixes-4.20' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Dave Airlie [Sat, 10 Nov 2018 18:20:48 +0000 (04:20 +1000)]
Merge branch 'drm-fixes-4.20' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Fixes for 4.20:
- DC MST fixes
- DC FBC fix
- Vega20 updates to support the latest vbios
- KFD type fixes for ioctl headers

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181108035551.2904-1-alexander.deucher@amd.com
5 years agoMerge tag 'drm-misc-fixes-2018-11-07' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Sat, 10 Nov 2018 18:19:52 +0000 (04:19 +1000)]
Merge tag 'drm-misc-fixes-2018-11-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

- sun4i: tcon->panel NULL deref protections (Giulio)

Cc: Giulio Benetti <giulio.benetti@micronovasrl.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20181107205051.GA27823@art_vandelay
5 years agoMerge tag 'drm-intel-fixes-2018-11-08' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Sat, 10 Nov 2018 18:14:22 +0000 (04:14 +1000)]
Merge tag 'drm-intel-fixes-2018-11-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

Bugzilla #108282 fixed: Avoid graphics corruption on 32-bit systems for Mesa 18.2.x
Avoid OOPS on LPE audio deinit. Remove two unused W/As.
Fix to correct HDMI 2.0 audio clock modes to spec.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181108134508.GA28466@jlahtine-desk.ger.corp.intel.com
5 years agonet: sched: cls_flower: validate nested enc_opts_policy to avoid warning
Jakub Kicinski [Sat, 10 Nov 2018 05:06:26 +0000 (21:06 -0800)]
net: sched: cls_flower: validate nested enc_opts_policy to avoid warning

TCA_FLOWER_KEY_ENC_OPTS and TCA_FLOWER_KEY_ENC_OPTS_MASK can only
currently contain further nested attributes, which are parsed by
hand, so the policy is never actually used resulting in a W=1
build warning:

net/sched/cls_flower.c:492:1: warning: ‘enc_opts_policy’ defined but not used [-Wunused-const-variable=]
 enc_opts_policy[TCA_FLOWER_KEY_ENC_OPTS_MAX + 1] = {

Add the validation anyway to avoid potential bugs when other
attributes are added and to make the attribute structure slightly
more clear.  Validation will also set extact to point to bad
attribute on error.

Fixes: 0a6e77784f49 ("net/sched: allow flower to match tunnel options")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoRevert "acpi, nfit: Further restrict userspace ARS start requests"
Dan Williams [Sun, 4 Nov 2018 00:53:09 +0000 (17:53 -0700)]
Revert "acpi, nfit: Further restrict userspace ARS start requests"

The following lockdep splat results from acquiring the init_mutex in
acpi_nfit_clear_to_send():

 WARNING: possible circular locking dependency detected
 lt-daxdev-error/7216 is trying to acquire lock:
 00000000f694db15 (&acpi_desc->init_mutex){+.+.}, at: acpi_nfit_clear_to_send+0x27/0x80 [nfit]

 but task is already holding lock:
 00000000182298f2 (&nvdimm_bus->reconfig_mutex){+.+.}, at: __nd_ioctl+0x457/0x610 [libnvdimm]

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&nvdimm_bus->reconfig_mutex){+.+.}:
        nvdimm_badblocks_populate+0x41/0x150 [libnvdimm]
        nd_region_notify+0x95/0xb0 [libnvdimm]
        nd_device_notify+0x40/0x50 [libnvdimm]
        ars_complete+0x7f/0xd0 [nfit]
        acpi_nfit_scrub+0xbb/0x410 [nfit]
        process_one_work+0x22b/0x5c0
        worker_thread+0x3c/0x390
        kthread+0x11e/0x140
        ret_from_fork+0x3a/0x50

 -> #0 (&acpi_desc->init_mutex){+.+.}:
        __mutex_lock+0x83/0x980
        acpi_nfit_clear_to_send+0x27/0x80 [nfit]
        __nd_ioctl+0x474/0x610 [libnvdimm]
        nd_ioctl+0xa4/0xb0 [libnvdimm]
        do_vfs_ioctl+0xa5/0x6e0
        ksys_ioctl+0x70/0x80
        __x64_sys_ioctl+0x16/0x20
        do_syscall_64+0x60/0x210
        entry_SYSCALL_64_after_hwframe+0x49/0xbe

New infrastructure is needed to be able to perform this check without
acquiring the lock.

Fixes: 594861215c83 ("acpi, nfit: Further restrict userspace ARS start")
Cc: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
5 years agoacpi, nfit: Fix ARS overflow continuation
Dan Williams [Thu, 1 Nov 2018 07:30:22 +0000 (00:30 -0700)]
acpi, nfit: Fix ARS overflow continuation

When the platform BIOS is unable to report all the media error records
it requires the OS to restart the scrub at a prescribed location. The
driver detects the overflow condition, but then fails to report it to
the ARS state machine after reaping the records. Propagate -ENOSPC
correctly to continue the ARS operation.

Cc: <stable@vger.kernel.org>
Fixes: 1cf03c00e7c1 ("nfit: scrub and register regions in a workqueue")
Reported-by: Jacek Zloch <jacek.zloch@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
5 years agofloppy: fix race condition in __floppy_read_block_0()
Jens Axboe [Fri, 9 Nov 2018 22:58:40 +0000 (15:58 -0700)]
floppy: fix race condition in __floppy_read_block_0()

LKP recently reported a hang at bootup in the floppy code:

[  245.678853] INFO: task mount:580 blocked for more than 120 seconds.
[  245.679906]       Tainted: G                T 4.19.0-rc6-00172-ga9f38e1 #1
[  245.680959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  245.682181] mount           D 6372   580      1 0x00000004
[  245.683023] Call Trace:
[  245.683425]  __schedule+0x2df/0x570
[  245.683975]  schedule+0x2d/0x80
[  245.684476]  schedule_timeout+0x19d/0x330
[  245.685090]  ? wait_for_common+0xa5/0x170
[  245.685735]  wait_for_common+0xac/0x170
[  245.686339]  ? do_sched_yield+0x90/0x90
[  245.686935]  wait_for_completion+0x12/0x20
[  245.687571]  __floppy_read_block_0+0xfb/0x150
[  245.688244]  ? floppy_resume+0x40/0x40
[  245.688844]  floppy_revalidate+0x20f/0x240
[  245.689486]  check_disk_change+0x43/0x60
[  245.690087]  floppy_open+0x1ea/0x360
[  245.690653]  __blkdev_get+0xb4/0x4d0
[  245.691212]  ? blkdev_get+0x1db/0x370
[  245.691777]  blkdev_get+0x1f3/0x370
[  245.692351]  ? path_put+0x15/0x20
[  245.692871]  ? lookup_bdev+0x4b/0x90
[  245.693539]  blkdev_get_by_path+0x3d/0x80
[  245.694165]  mount_bdev+0x2a/0x190
[  245.694695]  squashfs_mount+0x10/0x20
[  245.695271]  ? squashfs_alloc_inode+0x30/0x30
[  245.695960]  mount_fs+0xf/0x90
[  245.696451]  vfs_kern_mount+0x43/0x130
[  245.697036]  do_mount+0x187/0xc40
[  245.697563]  ? memdup_user+0x28/0x50
[  245.698124]  ksys_mount+0x60/0xc0
[  245.698639]  sys_mount+0x19/0x20
[  245.699167]  do_int80_syscall_32+0x61/0x130
[  245.699813]  entry_INT80_32+0xc7/0xc7

showing that we never complete that read request. The reason is that
the completion setup is racy - it initializes the completion event
AFTER submitting the IO, which means that the IO could complete
before/during the init. If it does, we are passing garbage to
complete() and we may sleep forever waiting for the event to
occur.

Fixes: 7b7b68bba5ef ("floppy: bail out in open() if drive is not responding to block0 read")
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agoMerge tag 'for-linus-4.20a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 10 Nov 2018 14:58:48 +0000 (08:58 -0600)]
Merge tag 'for-linus-4.20a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "Several fixes, mostly for rather recent regressions when running under
  Xen"

* tag 'for-linus-4.20a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: remove size limit of privcmd-buf mapping interface
  xen: fix xen_qlock_wait()
  x86/xen: fix pv boot
  xen-blkfront: fix kernel panic with negotiate_mq error path
  xen/grant-table: Fix incorrect gnttab_dma_free_pages() pr_debug message
  CONFIG_XEN_PV breaks xen_create_contiguous_region on ARM

5 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Sat, 10 Nov 2018 13:07:21 +0000 (07:07 -0600)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Fix occasional page fault during boot due to memblock resizing before
   the linear map is up.

 - Define NET_IP_ALIGN to 0 to improve the DMA performance on some
   platforms.

 - lib/raid6 test build fix.

 - .mailmap update for Punit Agrawal

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: memblock: don't permit memblock resizing until linear mapping is up
  arm64: mm: define NET_IP_ALIGN to 0
  lib/raid6: Fix arm64 test build
  mailmap: Update email for Punit Agrawal

5 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Linus Torvalds [Sat, 10 Nov 2018 12:57:34 +0000 (06:57 -0600)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c updates from Wolfram Sang:
 "I2C has one bugfix (qcom-geni driver), one arch enablement (i2c-omap
  driver, no code change), and a new driver (nvidia-gpu) this time"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  usb: typec: ucsi: add support for Cypress CCGx
  i2c: nvidia-gpu: make pm_ops static
  i2c: add i2c bus driver for NVIDIA GPU
  i2c: qcom-geni: Fix runtime PM mismatch with child devices
  MAINTAINERS: Add entry for i2c-omap driver
  i2c: omap: Enable for ARCH_K3
  dt-bindings: i2c: omap: Add new compatible for AM654 SoCs

5 years agonet: mvneta: correct typo
Alexandre Belloni [Fri, 9 Nov 2018 16:37:20 +0000 (17:37 +0100)]
net: mvneta: correct typo

The reserved variable should be named reserved1.

Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoflow_dissector: do not dissect l4 ports for fragments
배석진 [Sat, 10 Nov 2018 00:53:06 +0000 (16:53 -0800)]
flow_dissector: do not dissect l4 ports for fragments

Only first fragment has the sport/dport information,
not the following ones.

If we want consistent hash for all fragments, we need to
ignore ports even for first fragment.

This bug is visible for IPv6 traffic, if incoming fragments
do not have a flow label, since skb_get_hash() will give
different results for first fragment and following ones.

It is also visible if any routing rule wants dissection
and sport or dport.

See commit 5e5d6fed3741 ("ipv6: route: dissect flow
in input path if fib rules need it") for details.

[edumazet] rewrote the changelog completely.

Fixes: 06635a35d13d ("flow_dissect: use programable dissector in skb_flow_dissect and friends")
Signed-off-by: 배석진 <soukjin.bae@samsung.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: qualcomm: rmnet: Fix incorrect assignment of real_dev
Subash Abhinov Kasiviswanathan [Sat, 10 Nov 2018 01:56:27 +0000 (18:56 -0700)]
net: qualcomm: rmnet: Fix incorrect assignment of real_dev

A null dereference was observed when a sysctl was being set
from userspace and rmnet was stuck trying to complete some actions
in the NETDEV_REGISTER callback. This is because the real_dev is set
only after the device registration handler completes.

sysctl call stack -

<6> Unable to handle kernel NULL pointer dereference at
    virtual address 00000108
<2> pc : rmnet_vnd_get_iflink+0x1c/0x28
<2> lr : dev_get_iflink+0x2c/0x40
<2>  rmnet_vnd_get_iflink+0x1c/0x28
<2>  inet6_fill_ifinfo+0x15c/0x234
<2>  inet6_ifinfo_notify+0x68/0xd4
<2>  ndisc_ifinfo_sysctl_change+0x1b8/0x234
<2>  proc_sys_call_handler+0xac/0x100
<2>  proc_sys_write+0x3c/0x4c
<2>  __vfs_write+0x54/0x14c
<2>  vfs_write+0xcc/0x188
<2>  SyS_write+0x60/0xc0
<2>  el0_svc_naked+0x34/0x38

device register call stack -

<2>  notifier_call_chain+0x84/0xbc
<2>  raw_notifier_call_chain+0x38/0x48
<2>  call_netdevice_notifiers_info+0x40/0x70
<2>  call_netdevice_notifiers+0x38/0x60
<2>  register_netdevice+0x29c/0x3d8
<2>  rmnet_vnd_newlink+0x68/0xe8
<2>  rmnet_newlink+0xa0/0x160
<2>  rtnl_newlink+0x57c/0x6c8
<2>  rtnetlink_rcv_msg+0x1dc/0x328
<2>  netlink_rcv_skb+0xac/0x118
<2>  rtnetlink_rcv+0x24/0x30
<2>  netlink_unicast+0x158/0x1f0
<2>  netlink_sendmsg+0x32c/0x338
<2>  sock_sendmsg+0x44/0x60
<2>  SyS_sendto+0x150/0x1ac
<2>  el0_svc_naked+0x34/0x38

Fixes: b752eff5be24 ("net: qualcomm: rmnet: Implement ndo_get_iflink")
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'aquantia-fixes'
David S. Miller [Fri, 9 Nov 2018 23:38:11 +0000 (15:38 -0800)]
Merge branch 'aquantia-fixes'

Igor Russkikh says:

====================
net: aquantia: 2018-11 bugfixes

The patchset fixes a number of bugs found in various areas after
driver validation.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: aquantia: allow rx checksum offload configuration
Dmitry Bogdanov [Fri, 9 Nov 2018 11:54:03 +0000 (11:54 +0000)]
net: aquantia: allow rx checksum offload configuration

RX Checksum offloads could not be configured and ignored netdev features
flag for checksumming.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: aquantia: invalid checksumm offload implementation
Dmitry Bogdanov [Fri, 9 Nov 2018 11:54:01 +0000 (11:54 +0000)]
net: aquantia: invalid checksumm offload implementation

Packets with marked invalid IP/UDP/TCP checksums were considered as good
by the driver. The error was in a logic, processing offload bits in
RX descriptor.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: aquantia: fixed enable unicast on 32 macvlan
Igor Russkikh [Fri, 9 Nov 2018 11:53:59 +0000 (11:53 +0000)]
net: aquantia: fixed enable unicast on 32 macvlan

Fixed a condition mistake due to which macvlans unicast
item number 32 was not added in the unicast filter.

The consequence is that when exactly 32 macvlans are created
on NIC, the last created macvlan receives no traffic because
its MAC was not registered in HW.

Fixes: 94b3b542303f ("net: aquantia: vlan unicast address list correct handling")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Tested-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: aquantia: fix potential IOMMU fault after driver unbind
Dmitry Bogdanov [Fri, 9 Nov 2018 11:53:57 +0000 (11:53 +0000)]
net: aquantia: fix potential IOMMU fault after driver unbind

IOMMU fault may occurr on unbind/bind or if_down/if_up sequence.

Although driver disables the rings on down, this is not enough.
Due to internal HW design, during subsequent initialization
NIC sometimes may reuse RX descriptors cache and write to the
host memory from the descriptor cache.
That's get catched by IOMMU on host.

This patch invalidates the descriptor cache in NIC on interface down
to prevent writing to the cached descriptors and to the memory pointed
in those descriptors.

Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: aquantia: synchronized flow control between mac/phy
Igor Russkikh [Fri, 9 Nov 2018 11:53:56 +0000 (11:53 +0000)]
net: aquantia: synchronized flow control between mac/phy

Flow control statuses were not synchronized between blocks,
that caused packets/link drop on some corner cases, when
MAC sent PFC although Phy was not expecting these to come.

Driver should readout the negotiated FC from phy and
configure RX block accordigly.

This is done on each link change event with information from FW.

Fixes: 288551de45aa ("net: aquantia: Implement rx/tx flow control ethtools callback")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'devicetree-fixes-for-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 9 Nov 2018 22:41:58 +0000 (16:41 -0600)]
Merge tag 'devicetree-fixes-for-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull Devicetree fixes from Rob Herring:

 - Add validation of NUMA distance map to prevent crashes with bad map

 - Fix setting of dma_mask

* tag 'devicetree-fixes-for-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of, numa: Validate some distance map rules
  of/device: Really only set bus DMA mask when appropriate

5 years agoMerge tag 'for-linus-20181109' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 9 Nov 2018 22:31:51 +0000 (16:31 -0600)]
Merge tag 'for-linus-20181109' of git://git.kernel.dk/linux-block

Pull block layer fixes from Jens Axboe:

 - Two fixes for an ubd regression, one for missing locking, and one for
   a missing initialization of a field. The latter was an old latent
   bug, but it's now visible and triggers (Me, Anton Ivanov)

 - Set of NVMe fixes via Christoph, but applied manually due to a git
   tree mixup (Christoph, Sagi)

 - Fix for a discard split regression, in three patches (Ming)

 - Update libata git trees (Geert)

 - SPDX identifier for sata_rcar (Kuninori Morimoto)

 - Virtual boundary merge fix (Johannes)

 - Preemptively clear memory we are going to pass to userspace, in case
   the driver does a short read (Keith)

* tag 'for-linus-20181109' of git://git.kernel.dk/linux-block:
  block: make sure writesame bio is aligned with logical block size
  block: cleanup __blkdev_issue_discard()
  block: make sure discard bio is aligned with logical block size
  Revert "nvmet-rdma: use a private workqueue for delete"
  nvme: make sure ns head inherits underlying device limits
  nvmet: don't try to add ns to p2p map unless it actually uses it
  sata_rcar: convert to SPDX identifiers
  ubd: fix missing initialization of io_req
  block: Clear kernel memory before copying to user
  MAINTAINERS: Fix remaining pointers to obsolete libata.git
  ubd: fix missing lock around request issue
  block: respect virtual boundary mask in bvecs

5 years agoMerge tag 'ceph-for-4.20-rc2' of https://github.com/ceph/ceph-client
Linus Torvalds [Fri, 9 Nov 2018 22:26:18 +0000 (16:26 -0600)]
Merge tag 'ceph-for-4.20-rc2' of https://github.com/ceph/ceph-client

Pull Ceph fixes from Ilya Dryomov:
 "Two CephFS fixes (copy_file_range and quota) and a small feature bit
  cleanup"

* tag 'ceph-for-4.20-rc2' of https://github.com/ceph/ceph-client:
  libceph: assume argonaut on the server side
  ceph: quota: fix null pointer dereference in quota check
  ceph: add destination file data sync before doing any remote copy

5 years agoMerge tag 'mips_fixes_4.20_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Linus Torvalds [Fri, 9 Nov 2018 22:21:24 +0000 (16:21 -0600)]
Merge tag 'mips_fixes_4.20_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Paul Burton:
 "A couple of small MIPS fixes for 4.20:

   - Extend an array to avoid overruns on some Octeon hardware, fixing a
     bug introduced in 4.3.

   - Fix a coherent DMA regression for systems without cache-coherent
     DMA introduced in the 4.20 merge window"

* tag 'mips_fixes_4.20_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: Fix `dma_alloc_coherent' returning a non-coherent allocation
  MIPS: OCTEON: fix out of bounds array access on CN68XX

5 years agoclk: qcom: gcc: Fix board clock node name
Vinod Koul [Fri, 9 Nov 2018 09:50:54 +0000 (15:20 +0530)]
clk: qcom: gcc: Fix board clock node name

Device tree node name are not supposed to have "_" in them so fix the
node name use of xo_board to xo-board

Fixes: 652f1813c113 ("clk: qcom: gcc: Add global clock controller driver for QCS404")
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
5 years agox86/cpu/vmware: Do not trace vmware_sched_clock()
Steven Rostedt (VMware) [Fri, 9 Nov 2018 20:22:07 +0000 (15:22 -0500)]
x86/cpu/vmware: Do not trace vmware_sched_clock()

When running function tracing on a Linux guest running on VMware
Workstation, the guest would crash. This is due to tracing of the
sched_clock internal call of the VMware vmware_sched_clock(), which
causes an infinite recursion within the tracing code (clock calls must
not be traced).

Make vmware_sched_clock() not traced by ftrace.

Fixes: 80e9a4f21fd7c ("x86/vmware: Add paravirt sched clock")
Reported-by: GwanYeong Kim <gy741.kim@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
CC: Alok Kataria <akataria@vmware.com>
CC: GwanYeong Kim <gy741.kim@gmail.com>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
CC: Thomas Gleixner <tglx@linutronix.de>
CC: virtualization@lists.linux-foundation.org
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20181109152207.4d3e7d70@gandalf.local.home
5 years agousb: typec: ucsi: add support for Cypress CCGx
Ajay Gupta [Fri, 26 Oct 2018 16:36:59 +0000 (09:36 -0700)]
usb: typec: ucsi: add support for Cypress CCGx

Latest NVIDIA GPU cards have a Cypress CCGx Type-C controller
over I2C interface.

This UCSI I2C driver uses I2C bus driver interface for communicating
with Type-C controller.

Signed-off-by: Ajay Gupta <ajayg@nvidia.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
5 years agoserial: sh-sci: Fix could not remove dev_attr_rx_fifo_timeout
Yoshihiro Shimoda [Tue, 30 Oct 2018 06:13:35 +0000 (15:13 +0900)]
serial: sh-sci: Fix could not remove dev_attr_rx_fifo_timeout

This patch fixes an issue that the sci_remove() could not remove
dev_attr_rx_fifo_timeout because uart_remove_one_port() set
the port->port.type to PORT_UNKNOWN.

Reported-by: Hiromitsu Yamasaki <hiromitsu.yamasaki.ym@renesas.com>
Fixes: 5d23188a473d ("serial: sh-sci: make RX FIFO parameters tunable via sysfs")
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Ulrich Hecht <uli+renesas@fpond.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrm/dp_mst: Check if primary mstb is null
Stanislav Lisovskiy [Fri, 9 Nov 2018 09:00:12 +0000 (11:00 +0200)]
drm/dp_mst: Check if primary mstb is null

Unfortunately drm_dp_get_mst_branch_device which is called from both
drm_dp_mst_handle_down_rep and drm_dp_mst_handle_up_rep seem to rely
on that mgr->mst_primary is not NULL, which seem to be wrong as it can be
cleared with simultaneous mode set, if probing fails or in other case.
mgr->lock mutex doesn't protect against that as it might just get
assigned to NULL right before, not simultaneously.

There are currently bugs 107738, 108616 bugs which crash in
drm_dp_get_mst_branch_device, caused by this issue.

v2: Refactored the code, as it was nicely noticed.
    Fixed Bugzilla bug numbers(second was 108616, but not 108816)
    and added links.

[changed title and added stable cc]
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108616
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107738
Link: https://patchwork.freedesktop.org/patch/msgid/20181109090012.24438-1-stanislav.lisovskiy@intel.com
5 years agoi2c: nvidia-gpu: make pm_ops static
Wolfram Sang [Fri, 9 Nov 2018 16:54:38 +0000 (17:54 +0100)]
i2c: nvidia-gpu: make pm_ops static

sparse rightfully says:

warning: symbol 'gpu_i2c_driver_pm' was not declared. Should it be static?

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
5 years agoi2c: add i2c bus driver for NVIDIA GPU
Ajay Gupta [Fri, 26 Oct 2018 16:36:58 +0000 (09:36 -0700)]
i2c: add i2c bus driver for NVIDIA GPU

Latest NVIDIA GPU card has USB Type-C interface. There is a
Type-C controller which can be accessed over I2C.

This driver adds I2C bus driver to communicate with Type-C controller.
I2C client driver will be part of USB Type-C UCSI driver.

Signed-off-by: Ajay Gupta <ajayg@nvidia.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[wsa: kept Makefile sorting]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
5 years agoext4: missing !bh check in ext4_xattr_inode_write()
Vasily Averin [Fri, 9 Nov 2018 16:34:40 +0000 (11:34 -0500)]
ext4: missing !bh check in ext4_xattr_inode_write()

According to Ted Ts'o ext4_getblk() called in ext4_xattr_inode_write()
should not return bh = NULL

The only time that bh could be NULL, then, would be in the case of
something really going wrong; a programming error elsewhere (perhaps a
wild pointer dereference) or I/O error causing on-disk file system
corruption (although that would be highly unlikely given that we had
*just* allocated the blocks and so the metadata blocks in question
probably would still be in the cache).

Fixes: e50e5129f384 ("ext4: xattr-in-inode support")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 4.13
5 years agoi2c: qcom-geni: Fix runtime PM mismatch with child devices
Stephen Boyd [Fri, 2 Nov 2018 20:57:32 +0000 (13:57 -0700)]
i2c: qcom-geni: Fix runtime PM mismatch with child devices

We need to enable runtime PM on this i2c controller before populating
child devices with i2c_add_adapter(). Otherwise, if a child device uses
runtime PM and stays runtime PM enabled we'll get the following warning
at boot.

 Enabling runtime PM for inactive device (a98000.i2c) with active children

[...]

 Call trace:
  pm_runtime_enable+0xd8/0xf8
  geni_i2c_probe+0x440/0x460
  platform_drv_probe+0x74/0xc8
[...]

Let's move the runtime PM enabling and setup to before we add the
adapter, so that this device can respond to runtime PM requests from
children.

Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
5 years agoMAINTAINERS: Add entry for i2c-omap driver
Vignesh R [Fri, 9 Nov 2018 11:14:12 +0000 (16:44 +0530)]
MAINTAINERS: Add entry for i2c-omap driver

Add separate entry for i2c-omap and add my name as maintainer for this
driver.

Signed-off-by: Vignesh R <vigneshr@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
5 years agoi2c: omap: Enable for ARCH_K3
Vignesh R [Fri, 9 Nov 2018 11:14:11 +0000 (16:44 +0530)]
i2c: omap: Enable for ARCH_K3

Allow I2C_OMAP to be built for K3 platforms.

Signed-off-by: Vignesh R <vigneshr@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
5 years agodt-bindings: i2c: omap: Add new compatible for AM654 SoCs
Vignesh R [Fri, 9 Nov 2018 11:14:10 +0000 (16:44 +0530)]
dt-bindings: i2c: omap: Add new compatible for AM654 SoCs

AM654 SoCs have same I2C IP as OMAP SoCs. Add new compatible to
handle AM654 SoCs. While at that reformat the existing compatible list
for older SoCs to list one valid compatible per line.

Signed-off-by: Vignesh R <vigneshr@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
5 years agoxen: remove size limit of privcmd-buf mapping interface
Juergen Gross [Thu, 1 Nov 2018 12:33:07 +0000 (13:33 +0100)]
xen: remove size limit of privcmd-buf mapping interface

Currently the size of hypercall buffers allocated via
/dev/xen/hypercall is limited to a default of 64 memory pages. For live
migration of guests this might be too small as the page dirty bitmask
needs to be sized according to the size of the guest. This means
migrating a 8GB sized guest is already exhausting the default buffer
size for the dirty bitmap.

There is no sensible way to set a sane limit, so just remove it
completely. The device node's usage is limited to root anyway, so there
is no additional DOS scenario added by allowing unlimited buffers.

While at it make the error path for the -ENOMEM case a little bit
cleaner by setting n_pages to the number of successfully allocated
pages instead of the target size.

Fixes: c51b3c639e01f2 ("xen: add new hypercall buffer mapping device")
Cc: <stable@vger.kernel.org> #4.18
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
5 years agoxen: fix xen_qlock_wait()
Juergen Gross [Thu, 8 Nov 2018 07:35:06 +0000 (08:35 +0100)]
xen: fix xen_qlock_wait()

Commit a856531951dc80 ("xen: make xen_qlock_wait() nestable")
introduced a regression for Xen guests running fully virtualized
(HVM or PVH mode). The Xen hypervisor wouldn't return from the poll
hypercall with interrupts disabled in case of an interrupt (for PV
guests it does).

So instead of disabling interrupts in xen_qlock_wait() use a nesting
counter to avoid calling xen_clear_irq_pending() in case
xen_qlock_wait() is nested.

Fixes: a856531951dc80 ("xen: make xen_qlock_wait() nestable")
Cc: stable@vger.kernel.org
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Juergen Gross <jgross@suse.com>
5 years agofuse: fix use-after-free in fuse_direct_IO()
Lukas Czerner [Fri, 9 Nov 2018 13:51:46 +0000 (14:51 +0100)]
fuse: fix use-after-free in fuse_direct_IO()

In async IO blocking case the additional reference to the io is taken for
it to survive fuse_aio_complete(). In non blocking case this additional
reference is not needed, however we still reference io to figure out
whether to wait for completion or not. This is wrong and will lead to
use-after-free. Fix it by storing blocking information in separate
variable.

This was spotted by KASAN when running generic/208 fstest.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 744742d692e3 ("fuse: Add reference counting for fuse_io_priv")
Cc: <stable@vger.kernel.org> # v4.6
5 years agofuse: fix possibly missed wake-up after abort
Miklos Szeredi [Fri, 9 Nov 2018 14:52:16 +0000 (15:52 +0100)]
fuse: fix possibly missed wake-up after abort

In current fuse_drop_waiting() implementation it's possible that
fuse_wait_aborted() will not be woken up in the unlikely case that
fuse_abort_conn() + fuse_wait_aborted() runs in between checking
fc->connected and calling atomic_dec(&fc->num_waiting).

Do the atomic_dec_and_test() unconditionally, which also provides the
necessary barrier against reordering with the fc->connected check.

The explicit smp_mb() in fuse_wait_aborted() is not actually needed, since
the spin_unlock() in fuse_abort_conn() provides the necessary RELEASE
barrier after resetting fc->connected.  However, this is not a performance
sensitive path, and adding the explicit barrier makes it easier to
document.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: b8f95e5d13f5 ("fuse: umount should wait for all requests")
Cc: <stable@vger.kernel.org> #v4.19
5 years agofuse: fix leaked notify reply
Miklos Szeredi [Fri, 9 Nov 2018 14:52:16 +0000 (15:52 +0100)]
fuse: fix leaked notify reply

fuse_request_send_notify_reply() may fail if the connection was reset for
some reason (e.g. fs was unmounted).  Don't leak request reference in this
case.  Besides leaking memory, this resulted in fc->num_waiting not being
decremented and hence fuse_wait_aborted() left in a hanging and unkillable
state.

Fixes: 2d45ba381a74 ("fuse: add retrieve request")
Fixes: b8f95e5d13f5 ("fuse: umount should wait for all requests")
Reported-and-tested-by: syzbot+6339eda9cb4ebbc4c37b@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> #v2.6.36
5 years agoblock: make sure writesame bio is aligned with logical block size
Ming Lei [Mon, 29 Oct 2018 12:57:19 +0000 (20:57 +0800)]
block: make sure writesame bio is aligned with logical block size

Obviously the created writesame bio has to be aligned with logical block
size, and use bio_allowed_max_sectors() to retrieve this number.

Cc: stable@vger.kernel.org
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Xiao Ni <xni@redhat.com>
Cc: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Fixes: b49a0871be31a745b2ef ("block: remove split code in blkdev_issue_{discard,write_same}")
Tested-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agoblock: cleanup __blkdev_issue_discard()
Ming Lei [Mon, 29 Oct 2018 12:57:18 +0000 (20:57 +0800)]
block: cleanup __blkdev_issue_discard()

Cleanup __blkdev_issue_discard() a bit:

- remove local variable of 'end_sect'
- remove code block of 'fail'

Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Xiao Ni <xni@redhat.com>
Cc: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Tested-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agoblock: make sure discard bio is aligned with logical block size
Ming Lei [Mon, 29 Oct 2018 12:57:17 +0000 (20:57 +0800)]
block: make sure discard bio is aligned with logical block size

Obviously the created discard bio has to be aligned with logical block size.

This patch introduces the helper of bio_allowed_max_sectors() for
this purpose.

Cc: stable@vger.kernel.org
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Xiao Ni <xni@redhat.com>
Cc: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Fixes: 744889b7cbb56a6 ("block: don't deal with discard limit in blkdev_issue_discard()")
Fixes: a22c4d7e34402cc ("block: re-add discard_granularity and alignment checks")
Reported-by: Rui Salvaterra <rsalvaterra@gmail.com>
Tested-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agoRevert "nvmet-rdma: use a private workqueue for delete"
Christoph Hellwig [Wed, 7 Nov 2018 08:20:25 +0000 (09:20 +0100)]
Revert "nvmet-rdma: use a private workqueue for delete"

This reverts commit 2acf70ade79d26b97611a8df52eb22aa33814cd4.

The commit never really fixed the intended issue and caused all
kinds of other issues, including a use before initialization.

Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agonvme: make sure ns head inherits underlying device limits
Sagi Grimberg [Fri, 2 Nov 2018 18:22:13 +0000 (11:22 -0700)]
nvme: make sure ns head inherits underlying device limits

Whenever we update ns_head info, we need to make sure it is still
compatible with all underlying backing devices because although nvme
multipath doesn't have any explicit use of these limits, other devices
can still be stacked on top of it which may rely on the underlying limits.
Start with unlimited stacking limits, and every info update iterate over
siblings and adjust queue limits.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agonvmet: don't try to add ns to p2p map unless it actually uses it
Sagi Grimberg [Fri, 2 Nov 2018 23:12:21 +0000 (16:12 -0700)]
nvmet: don't try to add ns to p2p map unless it actually uses it

Even without CONFIG_P2PDMA this results in a error print:
nvmet: no peer-to-peer memory is available that's supported by rxe0 and /dev/nullb0

Fixes: c6925093d0b2 ("nvmet: Optionally use PCI P2P memory")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agoMerge tag 's390-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Fri, 9 Nov 2018 12:30:44 +0000 (06:30 -0600)]
Merge tag 's390-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Martin Schwidefsky:

 - A fix for the pgtable_bytes misaccounting on s390. The patch changes
   common code part in regard to page table folding and adds extra
   checks to mm_[inc|dec]_nr_[pmds|puds].

 - Add FORCE for all build targets using if_changed

 - Use non-loadable phdr for the .vmlinux.info section to avoid a
   segment overlap that confuses kexec

 - Cleanup the attribute definition for the diagnostic sampling

 - Increase stack size for CONFIG_KASAN=y builds

 - Export __node_distance to fix a build error

 - Correct return code of a PMU event init function

 - An update for the default configs

* tag 's390-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/perf: Change CPUM_CF return code in event init function
  s390: update defconfigs
  s390/mm: Fix ERROR: "__node_distance" undefined!
  s390/kasan: increase instrumented stack size to 64k
  s390/cpum_sf: Rework attribute definition for diagnostic sampling
  s390/mm: fix mis-accounting of pgtable_bytes
  mm: add mm_pxd_folded checks to pgtable_bytes accounting functions
  mm: introduce mm_[p4d|pud|pmd]_folded
  mm: make the __PAGETABLE_PxD_FOLDED defines non-empty
  s390: avoid vmlinux segments overlap
  s390/vdso: add missing FORCE to build targets
  s390/decompressor: add missing FORCE to build targets

5 years agogfs2: Fix metadata read-ahead during truncate (2)
Andreas Gruenbacher [Thu, 8 Nov 2018 20:14:29 +0000 (20:14 +0000)]
gfs2: Fix metadata read-ahead during truncate (2)

The previous attempt to fix for metadata read-ahead during truncate was
incorrect: for files with a height > 2 (1006989312 bytes with a block
size of 4096 bytes), read-ahead requests were not being issued for some
of the indirect blocks discovered while walking the metadata tree,
leading to significant slow-downs when deleting large files.  Fix that.

In addition, only issue read-ahead requests in the first pass through
the meta-data tree, while deallocating data blocks.

Fixes: c3ce5aa9b0 ("gfs2: Fix metadata read-ahead during truncate")
Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
5 years agogfs2: Put bitmap buffers in put_super
Andreas Gruenbacher [Mon, 5 Nov 2018 22:57:24 +0000 (22:57 +0000)]
gfs2: Put bitmap buffers in put_super

gfs2_put_super calls gfs2_clear_rgrpd to destroy the gfs2_rgrpd objects
attached to the resource group glocks.  That function should release the
buffers attached to the gfs2_bitmap objects (bi_bh), but the call to
gfs2_rgrp_brelse for doing that is missing.

When gfs2_releasepage later runs across these buffers which are still
referenced, it refuses to free them.  This causes the pages the buffers
are attached to to remain referenced as well.  With enough mount/unmount
cycles, the system will eventually run out of memory.

Fix this by adding the missing call to gfs2_rgrp_brelse in
gfs2_clear_rgrpd.

(Also fix a gfs2_rgrp_relse -> gfs2_rgrp_brelse typo in a comment.)

Fixes: 39b0f1e92908 ("GFS2: Don't brelse rgrp buffer_heads every allocation")
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
5 years agoMAINTAINERS: Add tree link for Intel pin control driver
Andy Shevchenko [Mon, 5 Nov 2018 16:33:34 +0000 (18:33 +0200)]
MAINTAINERS: Add tree link for Intel pin control driver

Intel pin control driver gets its own tree. Update MAINTAINERS accordingly.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
5 years agocrypto: user - Zeroize whole structure given to user space
Corentin Labbe [Sat, 3 Nov 2018 21:56:01 +0000 (14:56 -0700)]
crypto: user - Zeroize whole structure given to user space

For preventing uninitialized data to be given to user-space (and so leak
potential useful data), the crypto_stat structure must be correctly
initialized.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: cac5818c25d0 ("crypto: user - Implement a generic crypto statistics")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
[EB: also fix it in crypto_reportstat_one()]
[EB: use sizeof(var) rather than sizeof(type)]
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>