Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync up with Linus #50

Merged
merged 33 commits into from
Mar 17, 2015
Merged

Sync up with Linus #50

merged 33 commits into from
Mar 17, 2015

Conversation

dabrace
Copy link
Owner

@dabrace dabrace commented Mar 17, 2015

No description provided.

Valentin Rothberg and others added 30 commits February 14, 2015 14:26
Since commit 1c6c695 ("genirq: Reject
bogus threaded irq requests") threaded IRQs without a primary handler
need to be requested with IRQF_ONESHOT, otherwise the request will fail.

The %irq_flags flag is used to request the threaded IRQ and is also a
parameter of the caller.  Hence, we cannot be sure that IRQF_ONESHOT is
set.  This change avoids the potentially missing flag by setting
IRQF_ONESHOT when requesting the threaded IRQ.

Generated by: scripts/coccinelle/misc/irqf_oneshot.cocci

Signed-off-by: Valentin Rothberg <Valentin.Rothberg@lip6.fr>
Signed-off-by: Mark Brown <broonie@kernel.org>
regcache_sync() spews warnings when a value was cached for a read-only
register as it tries to write all registers no matter whether they are
writable or not.  This patch adds regmap_wrtieable() checks for
avoiding it in regcache_sync_block_single() and regcache_block_raw().

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
when multiport is off, we don't initialize config work,
but we then cancel uninitialized control_work on freeze.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
when multiport is off, virtio console invokes config access from irq
context, config access is blocking on s390.
Fix this up by scheduling work from config irq - similar to what we do
for multiport configs.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
When inserting a new register into a block at the lower end the present
bitmap is currently shifted into the wrong direction. The effect of this is
that the bitmap becomes corrupted and registers which are present might be
reported as not present and vice versa.

Fix this by shifting left rather than right.

Fixes: 472fdec("regmap: rbtree: Reduce number of nodes, take 2")
Reported-by: Daniel Baluta <daniel.baluta@gmail.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
The _regulator_do_enable() call ought to be a no-op when called on an
already-enabled regulator.  However, as an optimization
_regulator_enable() doesn't call _regulator_do_enable() on an already
enabled regulator.  That means we never test the case of calling
_regulator_do_enable() during normal usage and there may be hidden
bugs or warnings.  We have seen warnings issued by the tps65090 driver
and bugs when using the GPIO enable pin.

Let's match the same optimization that _regulator_enable() in
regulator_suspend_finish().  That may speed up suspend/resume and also
avoids exposing hidden bugs.

[Use much clearer commit message from Doug Anderson]

Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Normally _regulator_do_enable() isn't called on an already-enabled
rdev.  That's because the main caller, _regulator_enable() always
calls _regulator_is_enabled() and only calls _regulator_do_enable() if
the rdev was not already enabled.

However, there is one caller of _regulator_do_enable() that doesn't
check: regulator_suspend_finish().  While we might want to make
regulator_suspend_finish() behave more like _regulator_enable(), it's
probably also a good idea to make _regulator_do_enable() robust if it
is called on an already enabled rdev.

At the moment, _regulator_do_enable() is _not_ robust for already
enabled rdevs if we're using an ena_pin.  Each time
_regulator_do_enable() is called for an rdev using an ena_pin the
reference count of the ena_pin is incremented even if the rdev was
already enabled.  This is not as intended because the ena_pin is for
something else: for keeping track of how many active rdevs there are
sharing the same ena_pin.

Here's how the reference counting works here:

* Each time _regulator_enable() is called we increment
  rdev->use_count, so _regulator_enable() calls need to be balanced
  with _regulator_disable() calls.

* There is no explicit reference counting in _regulator_do_enable()
  which is normally just a warapper around rdev->desc->ops->enable()
  with code for supporting delays.  It's not expected that the
  "ops->enable()" call do reference counting.

* Since regulator_ena_gpio_ctrl() does have reference counting
  (handling the sharing of the pin amongst multiple rdevs), we
  shouldn't call it if the current rdev is already enabled.

Note that as part of this we cleanup (remove) the initting of
ena_gpio_state in regulator_register().  In _regulator_do_enable(),
_regulator_do_disable() and _regulator_is_enabled() is is clear that
ena_gpio_state should be the state of whether this particular rdev has
requested the GPIO be enabled.  regulator_register() was initting it
as the actual state of the pin.

Fixes: 967cfb1 ("regulator: core: manage enable GPIO list")
Signed-off-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
virtio spec requires that all drivers set DRIVER_OK
before using devices. While balloon isn't yet
included in the virtio 1 spec, previous spec versions
also required this.

virtio balloon might violate this rule: probe calls
kthread_run before setting DRIVER_OK, which might run
immediately and cause balloon to inflate/deflate.

To fix, call virtio_device_ready before running the kthread.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
Now that QEmu reuses linux virtio headers, we noticed
a typo in the exported virtio block header. Fix it up.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fix up comment to match virtio 1.0 logic:
virtio_blk_outhdr isn't the first elements anymore,
the only requirement is that it comes first in
the s/g list.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
virtio balloon has this code:
        wait_event_interruptible(vb->config_change,
                                 (diff = towards_target(vb)) != 0
                                 || vb->need_stats_update
                                 || kthread_should_stop()
                                 || freezing(current));

Which is a problem because towards_target() call might block after
wait_event_interruptible sets task state to TAST_INTERRUPTIBLE, causing
the task_struct::state collision typical of nesting of sleeping
primitives

See also http://lwn.net/Articles/628628/ or Thomas's
bug report
http://article.gmane.org/gmane.linux.kernel.virtualization/24846
for a fuller explanation.

To fix, rewrite using wait_woken.

Cc: stable@vger.kernel.org
Reported-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
POWER supports irqfds but forgot to advertise them.  Some userspace does
not check for the capability, but others check it---thus they work on
x86 and s390 but not POWER.

To avoid that other architectures in the future make the same mistake, let
common code handle KVM_CAP_IRQFD the same way as KVM_CAP_IRQFD_RESAMPLE.

Reported-and-tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 297e210
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
We're using __get_free_pages with to allocate the guest's stage-2
PGD. The standard behaviour of this function is to return a set of
pages where only the head page has a valid refcount.

This behaviour gets us into trouble when we're trying to increment
the refcount on a non-head page:

page:ffff7c00cfb693c0 count:0 mapcount:0 mapping:          (null) index:0x0
flags: 0x4000000000000000()
page dumped because: VM_BUG_ON_PAGE((*({ __attribute__((unused)) typeof((&page->_count)->counter) __var = ( typeof((&page->_count)->counter)) 0; (volatile typeof((&page->_count)->counter) *)&((&page->_count)->counter); })) <= 0)
BUG: failure at include/linux/mm.h:548/get_page()!
Kernel panic - not syncing: BUG!
CPU: 1 PID: 1695 Comm: kvm-vcpu-0 Not tainted 4.0.0-rc1+ #3825
Hardware name: APM X-Gene Mustang board (DT)
Call trace:
[<ffff80000008a09c>] dump_backtrace+0x0/0x13c
[<ffff80000008a1e8>] show_stack+0x10/0x1c
[<ffff800000691da8>] dump_stack+0x74/0x94
[<ffff800000690d78>] panic+0x100/0x240
[<ffff8000000a0bc4>] stage2_get_pmd+0x17c/0x2bc
[<ffff8000000a1dc4>] kvm_handle_guest_abort+0x4b4/0x6b0
[<ffff8000000a420c>] handle_exit+0x58/0x180
[<ffff80000009e7a4>] kvm_arch_vcpu_ioctl_run+0x114/0x45c
[<ffff800000099df4>] kvm_vcpu_ioctl+0x2e0/0x754
[<ffff8000001c0a18>] do_vfs_ioctl+0x424/0x5c8
[<ffff8000001c0bfc>] SyS_ioctl+0x40/0x78
CPU0: stopping

A possible approach for this is to split the compound page using
split_page() at allocation time, and change the teardown path to
free one page at a time.  It turns out that alloc_pages_exact() and
free_pages_exact() does exactly that.

While we're at it, the PGD allocation code is reworked to reduce
duplication.

This has been tested on an X-Gene platform with a 4kB/48bit-VA host
kernel, and kvmtool hacked to place memory in the second page of
the hardware PGD (PUD for the host kernel). Also regression-tested
on a Cubietruck (Cortex-A7).

 [ Reworked to use alloc_pages_exact() and free_pages_exact() and to
   return pointers directly instead of by reference as arguments
    - Christoffer ]

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
The kernel's pgd_index macro is designed to index a normal, page
sized array. KVM is a bit diffferent, as we can use concatenated
pages to have a bigger address space (for example 40bit IPA with
4kB pages gives us an 8kB PGD.

In the above case, the use of pgd_index will always return an index
inside the first 4kB, which makes a guest that has memory above
0x8000000000 rather unhappy, as it spins forever in a page fault,
whist the host happilly corrupts the lower pgd.

The obvious fix is to get our own kvm_pgd_index that does the right
thing(tm).

Tested on X-Gene with a hacked kvmtool that put memory at a stupidly
high address.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Commit 87366d8 ("arm64: Add boot time configuration of
Intermediate Physical Address size") removed the hardcoded setting
of VTCR_EL2.PS to use ID_AA64MMFR0_EL1.PARange instead, but didn't
remove the (now rather misleading) comment.

Fix the comments to match reality (at least for the next few minutes).

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
If data is read from PIC with invalid access size, the return data stays
uninitialized even though success is returned.

Fix this by always initializing the data.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
On device hot-unplug, 9p/virtio currently will kfree channel while
it might still be in use.

Of course, it might stay used forever, so it's an extremely ugly hack,
but it seems better than use-after-free that we have now.

[ Unused variable removed, whitespace cleanup, msg single-lined --RR ]
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
virtio spec requires that all drivers set DRIVER_OK
before using devices. While rpmsg isn't yet
included in the virtio 1 spec, previous spec versions
also required this.

virtio rpmsg violates this rule: is calls kick
before setting DRIVER_OK.

The fix isn't trivial since simply calling virtio_device_ready earlier
would mean we might get an interrupt in parallel with adding buffers.

Instead, split kick out to prepare+notify calls.  prepare before
virtio_device_ready - when we know we won't get interrupts. notify right
afterwards.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
virtio_mmio currently lacks generation support which
makes multi-byte field access racy.
Fix by getting the value at offset 0xfc for version 2
devices. Nothing we can do for version 1, so return
generation id 0.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
QEMU wants to use virtio scsi structures with
a different VIRTIO_SCSI_CDB_SIZE/VIRTIO_SCSI_SENSE_SIZE,
let's add ifdefs to allow overriding them.

Keep the old defines under new names:
VIRTIO_SCSI_CDB_DEFAULT_SIZE/VIRTIO_SCSI_SENSE_DEFAULT_SIZE,
since that's what these values really are:
defaults for cdb/sense size fields.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Add the missing unlock before return from function kvm_vgic_create()
in the error handling case.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
In commit 3af18d9 ("KVM: nVMX: Prepare for using hardware MSR bitmap"),
we are setting MSR_BITMAP in prepare_vmcs02 if we should use hardware. This
is not enough since the field will be modified by following vmx_set_efer.

Fix this by setting vmx_msr_bitmap_nested in vmx_set_msr_bitmap if vcpu is
in guest mode.

Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
There is an interesting bug in the vgic code, which manifests itself
when the KVM run loop has a signal pending or needs a vmid generation
rollover after having disabled interrupts but before actually switching
to the guest.

In this case, we flush the vgic as usual, but we sync back the vgic
state and exit to userspace before entering the guest.  The consequence
is that we will be syncing the list registers back to the software model
using the GICH_ELRSR and GICH_EISR from the last execution of the guest,
potentially overwriting a list register containing an interrupt.

This showed up during migration testing where we would capture a state
where the VM has masked the arch timer but there were no interrupts,
resulting in a hung test.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Alex Bennee <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
drivers/regulator/tps65910-regulator.c: In function ‘tps65910_parse_dt_reg_data’:
drivers/regulator/tps65910-regulator.c:1018: error: implicit declaration of function ‘of_get_child_by_name’
drivers/regulator/tps65910-regulator.c:1018: warning: assignment makes pointer from integer without a cast
drivers/regulator/tps65910-regulator.c:1034: error: implicit declaration of function ‘of_node_put’
drivers/regulator/tps65910-regulator.c:1056: error: implicit declaration of function ‘of_property_read_u32’

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
…ux/kernel/git/kvmarm/kvmarm

Fixes for KVM/ARM for 4.0-rc5.

Fixes page refcounting issues in our Stage-2 page table management code,
fixes a missing unlock in a gicv3 error path, and fixes a race that can
cause lost interrupts if signals are pending just prior to entering the
guest.
Going over the virtio mmio code, I noticed that it doesn't correctly
access modern device config values using "natural" accessors: it uses
readb to get/set them byte by byte, while the virtio 1.0 spec explicitly states:

	4.2.2.2 Driver Requirements: MMIO Device Register Layout

	...

	The driver MUST only use 32 bit wide and aligned reads and writes to
	access the control registers described in table 4.1.
	For the device-specific configuration space, the driver MUST use
	8 bit wide accesses for 8 bit wide fields, 16 bit wide and aligned
	accesses for 16 bit wide fields and 32 bit wide and aligned accesses for
	32 and 64 bit wide fields.

Borrow code from virtio_pci_modern to do this correctly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As pointed by recent post[1] on exploiting DRAM physical imperfection,
/proc/PID/pagemap exposes sensitive information which can be used to do
attacks.

This disallows anybody without CAP_SYS_ADMIN to read the pagemap.

[1] http://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html

[ Eventually we might want to do anything more finegrained, but for now
  this is the simple model.   - Linus ]

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Seaborn <mseaborn@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull kvm fixes from Marcelo Tosatti:
 "KVM bug fixes (ARM and x86)"

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  arm/arm64: KVM: Keep elrsr/aisr in sync with software model
  KVM: VMX: Set msr bitmap correctly if vcpu is in guest mode
  arm/arm64: KVM: fix missing unlock on error in kvm_vgic_create()
  kvm: x86: i8259: return initialized data on invalid-size read
  arm64: KVM: Fix outdated comment about VTCR_EL2.PS
  arm64: KVM: Do not use pgd_index to index stage-2 pgd
  arm64: KVM: Fix stage-2 PGD allocation to have per-page refcounting
  kvm: move advertising of KVM_CAP_IRQFD to common code
…ux/kernel/git/rusty/linux

Pull virtio fixes from Rusty Russell:
 "Not entirely surprising: the ongoing QEMU work on virtio 1.0 has
  revealed more minor issues with our virtio 1.0 drivers just introduced
  in the kernel.

  (I would normally use my fixes branch for this, but there were a batch
  of them...)"

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  virtio_mmio: fix access width for mmio
  uapi/virtio_scsi: allow overriding CDB/SENSE size
  virtio_mmio: generation support
  virtio_rpmsg: set DRIVER_OK before using device
  9p/trans_virtio: fix hot-unplug
  virtio-balloon: do not call blocking ops when !TASK_RUNNING
  virtio_blk: fix comment for virtio 1.0
  virtio_blk: typo fix
  virtio_balloon: set DRIVER_OK before using device
  virtio_console: avoid config access from irq
  virtio_console: init work unconditionally
…nel/git/broonie/regmap

Pull regmap fixes from Mark Brown:
 "A few things here:

   - a change from Lars to fix insertion of cache values at the start of
     rather than end of a rbtree block.  This hadn't been noticed before
     since almost everything lists registers in ascending order.

   - a fix from Takashi for spurious warnings during cache sync with
     read once registers, a problem which can be very noticeable on
     devices that it affects.

   - a fix from Valentin for a tighening of the oneshot IRQ request
     interface which would have broken affected devices"

* tag 'regmap-v4.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: regcache-rbtree: Fix present bitmap resize
  regmap: Skip read-only registers in regcache_sync()
  regmap-irq: set IRQF_ONESHOT flag to ensure IRQ request
…nux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "The two main fixes here from Javier and Doug both fix issues seen on
  the Exynos-based ARM Chromebooks with reference counting of GPIO
  regulators over system suspend.  The GPIO enable code didn't properly
  take account of this case (a full analysis is in Doug's commit log).

  This is fixed by both fixing the reference counting directly and by
  making the resume code skip enables it doesn't need to do.  We could
  skip the change in the resume code but it's a very simple change and
  adds extra robustness against problems in other drivers"

* tag 'regulator-fix-v4.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: tps65910: Add missing #include <linux/of.h>
  regulator: core: Fix enable GPIO reference counting
  regulator: Only enable disabled regulators on resume
dabrace added a commit that referenced this pull request Mar 17, 2015
@dabrace dabrace merged commit 4f3a14c into dabrace:master Mar 17, 2015
dabrace pushed a commit that referenced this pull request Sep 28, 2015
During quick plug/removal of OTG adapter during dual-role testing
it can happen that xhci_alloc_device() is called for the newly
detected device after the DRD library has called xhci_stop to
remove the HCD.

If that is the case, just fail early to prevent the following warning.

[  154.732649] hub 4-0:1.0: USB hub found
[  154.742204] hub 4-0:1.0: 1 port detected
[  154.824458] hub 3-0:1.0: state 7 ports 1 chg 0002 evt 0000
[  154.854609] hub 4-0:1.0: state 7 ports 1 chg 0000 evt 0000
[  154.944430] usb 3-1: new high-speed USB device number 2 using xhci-hcd
[  154.951009] xhci-hcd xhci-hcd.0.auto: xhci_setup_device
[  155.038191] xhci-hcd xhci-hcd.0.auto: remove, state 4
[  155.043315] usb usb4: USB disconnect, device number 1
[  155.055270] xhci-hcd xhci-hcd.0.auto: xhci_stop
[  155.060094] xhci-hcd xhci-hcd.0.auto: USB bus 4 deregistered
[  155.066576] xhci-hcd xhci-hcd.0.auto: remove, state 1
[  155.071710] usb usb3: USB disconnect, device number 1
[  155.077124] xhci-hcd xhci-hcd.0.auto: xhci_setup_device
[  155.082389] ------------[ cut here ]------------
[  155.087690] WARNING: CPU: 0 PID: 72 at drivers/usb/host/xhci.c:3800 xhci_setup_device+0x410/0x484 [xhci_hcd]()
[  155.097861] Modules linked in: sd_mod usb_storage scsi_mod usb_f_ss_lb g_zero libcomposite ipv6 xhci_plat_hcd xhci_hcd usbcore dwc3 udc_core evdev ti_am335x_adc joydev kfifo_buf industrialio snd_soc_simple_cc
[  155.146734] CPU: 0 PID: 72 Comm: kworker/0:3 Tainted: G        W       4.1.4-00834-gcd9380b-dirty #50
[  155.156073] Hardware name: Generic AM43 (Flattened Device Tree)
[  155.162117] Workqueue: usb_hub_wq hub_event [usbcore]
[  155.167249] Backtrace:
[  155.169751] [<c0012af0>] (dump_backtrace) from [<c0012c8c>] (show_stack+0x18/0x1c)
[  155.177390]  r6:c089d4a4 r5:ffffffff r4:00000000 r3:ee46c000
[  155.183137] [<c0012c74>] (show_stack) from [<c05f7c14>] (dump_stack+0x84/0xd0)
[  155.190446] [<c05f7b90>] (dump_stack) from [<c00439ac>] (warn_slowpath_common+0x80/0xbc)
[  155.198605]  r7:00000009 r6:00000ed8 r5:bf27eb70 r4:00000000
[  155.204348] [<c004392c>] (warn_slowpath_common) from [<c0043a0c>] (warn_slowpath_null+0x24/0x2c)
[  155.213202]  r8:ee49f000 r7:ee7c0004 r6:00000000 r5:ee7c0158 r4:ee7c0000
[  155.220051] [<c00439e8>] (warn_slowpath_null) from [<bf27eb70>] (xhci_setup_device+0x410/0x484 [xhci_hcd])
[  155.229816] [<bf27e760>] (xhci_setup_device [xhci_hcd]) from [<bf27ec10>] (xhci_address_device+0x14/0x18 [xhci_hcd])
[  155.240415]  r10:ee598200 r9:00000001 r8:00000002 r7:00000001 r6:00000003 r5:00000002
[  155.248363]  r4:ee49f000
[  155.250978] [<bf27ebfc>] (xhci_address_device [xhci_hcd]) from [<bf20cb94>] (hub_port_init+0x1b8/0xa9c [usbcore])
[  155.261403] [<bf20c9dc>] (hub_port_init [usbcore]) from [<bf2101e0>] (hub_event+0x738/0x1020 [usbcore])
[  155.270874]  r10:ee598200 r9:ee7c0000 r8:ee7c0038 r7:ee518800 r6:ee49f000 r5:00000001
[  155.278822]  r4:00000000
[  155.281426] [<bf20faa8>] (hub_event [usbcore]) from [<c005754c>] (process_one_work+0x128/0x340)
[  155.290196]  r10:00000000 r9:00000003 r8:00000000 r7:fedfa000 r6:eeec5400 r5:ee598314
[  155.298151]  r4:ee434380
[  155.300718] [<c0057424>] (process_one_work) from [<c00578f8>] (worker_thread+0x158/0x49c)
[  155.308963]  r10:ee434380 r9:00000003 r8:eeec5400 r7:00000008 r6:ee434398 r5:eeec5400
[  155.316913]  r4:eeec5414
[  155.319482] [<c00577a0>] (worker_thread) from [<c005cc40>] (kthread+0xdc/0xf8)
[  155.326765]  r10:00000000 r9:00000000 r8:00000000 r7:c00577a0 r6:ee434380 r5:ee4441c0
[  155.334713]  r4:00000000 r3:00000000
[  155.338341] [<c005cb64>] (kthread) from [<c000fc08>] (ret_from_fork+0x14/0x2c)
[  155.345626]  r7:00000000 r6:00000000 r5:c005cb64 r4:ee4441c0
[  155.356108] ---[ end trace a58d34c223b190e6 ]---
[  155.360783] xhci-hcd xhci-hcd.0.auto: Virt dev invalid for slot_id 0x1!
[  155.574404] xhci-hcd xhci-hcd.0.auto: xhci_setup_device
[  155.579667] ------------[ cut here ]------------

Cc: <stable@vger.kernel.org>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dabrace pushed a commit that referenced this pull request Jun 5, 2018
…access

James Harvey reported that some corrupted compressed extent data can
lead to various kernel memory corruption.

Such corrupted extent data belongs to inode with NODATASUM flags, thus
data csum won't help us detecting such bug.

If lucky enough, KASAN could catch it like:

BUG: KASAN: slab-out-of-bounds in lzo_decompress_bio+0x384/0x7a0 [btrfs]
Write of size 4096 at addr ffff8800606cb0f8 by task kworker/u16:0/2338

CPU: 3 PID: 2338 Comm: kworker/u16:0 Tainted: G           O      4.17.0-rc5-custom+ #50
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Workqueue: btrfs-endio btrfs_endio_helper [btrfs]
Call Trace:
 dump_stack+0xc2/0x16b
 print_address_description+0x6a/0x270
 kasan_report+0x260/0x380
 memcpy+0x34/0x50
 lzo_decompress_bio+0x384/0x7a0 [btrfs]
 end_compressed_bio_read+0x99f/0x10b0 [btrfs]
 bio_endio+0x32e/0x640
 normal_work_helper+0x15a/0xea0 [btrfs]
 process_one_work+0x7e3/0x1470
 worker_thread+0x1b0/0x1170
 kthread+0x2db/0x390
 ret_from_fork+0x22/0x40
...

The offending compressed data has the following info:

Header:			length 32768		(looks completely valid)
Segment 0 Header:	length 3472882419	(obviously out of bounds)

Then when handling segment 0, since it's over the current page, we need
the copy the compressed data to temporary buffer in workspace, then such
large size would trigger out-of-bounds memory access, screwing up the
whole kernel.

Fix it by adding extra checks on header and segment headers to ensure we
won't access out-of-bounds, and even checks the decompressed data won't
be out-of-bounds.

Reported-by: James Harvey <jamespharvey20@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ updated comments ]
Signed-off-by: David Sterba <dsterba@suse.com>
dabrace pushed a commit that referenced this pull request Nov 12, 2018
Increase kasan instrumented kernel stack size from 32k to 64k. Other
architectures seems to get away with just doubling kernel stack size under
kasan, but on s390 this appears to be not enough due to bigger frame size.
The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE
vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting
stack overflow is fs sync on xfs filesystem:

 #0 [9a0681e8]  704 bytes  check_usage at 34b1fc
 #1 [9a0684a8]  432 bytes  check_usage at 34c710
 #2 [9a068658]  1048 bytes  validate_chain at 35044a
 #3 [9a068a70]  312 bytes  __lock_acquire at 3559fe
 #4 [9a068ba8]  440 bytes  lock_acquire at 3576ee
 #5 [9a068d60]  104 bytes  _raw_spin_lock at 21b44e0
 #6 [9a068dc8]  1992 bytes  enqueue_entity at 2dbf72
 #7 [9a069590]  1496 bytes  enqueue_task_fair at 2df5f0
 #8 [9a069b68]  64 bytes  ttwu_do_activate at 28f438
 #9 [9a069ba8]  552 bytes  try_to_wake_up at 298c4c
 #10 [9a069dd0]  168 bytes  wake_up_worker at 23f97c
 #11 [9a069e78]  200 bytes  insert_work at 23fc2e
 #12 [9a069f40]  648 bytes  __queue_work at 2487c0
 #13 [9a06a1c8]  200 bytes  __queue_delayed_work at 24db28
 #14 [9a06a290]  248 bytes  mod_delayed_work_on at 24de84
 #15 [9a06a388]  24 bytes  kblockd_mod_delayed_work_on at 153e2a0
 #16 [9a06a3a0]  288 bytes  __blk_mq_delay_run_hw_queue at 158168c
 #17 [9a06a4c0]  192 bytes  blk_mq_run_hw_queue at 1581a3c
 #18 [9a06a580]  184 bytes  blk_mq_sched_insert_requests at 15a2192
 #19 [9a06a638]  1024 bytes  blk_mq_flush_plug_list at 1590f3a
 #20 [9a06aa38]  704 bytes  blk_flush_plug_list at 1555028
 #21 [9a06acf8]  320 bytes  schedule at 219e476
 #22 [9a06ae38]  760 bytes  schedule_timeout at 21b0aac
 #23 [9a06b130]  408 bytes  wait_for_common at 21a1706
 #24 [9a06b2c8]  360 bytes  xfs_buf_iowait at fa1540
 #25 [9a06b430]  256 bytes  __xfs_buf_submit at fadae6
 #26 [9a06b530]  264 bytes  xfs_buf_read_map at fae3f6
 #27 [9a06b638]  656 bytes  xfs_trans_read_buf_map at 10ac9a8
 #28 [9a06b8c8]  304 bytes  xfs_btree_kill_root at e72426
 #29 [9a06b9f8]  288 bytes  xfs_btree_lookup_get_block at e7bc5e
 #30 [9a06bb18]  624 bytes  xfs_btree_lookup at e7e1a6
 #31 [9a06bd88]  2664 bytes  xfs_alloc_ag_vextent_near at dfa070
 #32 [9a06c7f0]  144 bytes  xfs_alloc_ag_vextent at dff3ca
 #33 [9a06c880]  1128 bytes  xfs_alloc_vextent at e05fce
 #34 [9a06cce8]  584 bytes  xfs_bmap_btalloc at e58342
 #35 [9a06cf30]  1336 bytes  xfs_bmapi_write at e618de
 #36 [9a06d468]  776 bytes  xfs_iomap_write_allocate at ff678e
 #37 [9a06d770]  720 bytes  xfs_map_blocks at f82af8
 #38 [9a06da40]  928 bytes  xfs_writepage_map at f83cd6
 #39 [9a06dde0]  320 bytes  xfs_do_writepage at f85872
 #40 [9a06df20]  1320 bytes  write_cache_pages at 73dfe8
 #41 [9a06e448]  208 bytes  xfs_vm_writepages at f7f892
 #42 [9a06e518]  88 bytes  do_writepages at 73fe6a
 #43 [9a06e570]  872 bytes  __writeback_single_inode at a20cb6
 #44 [9a06e8d8]  664 bytes  writeback_sb_inodes at a23be2
 #45 [9a06eb70]  296 bytes  __writeback_inodes_wb at a242e0
 #46 [9a06ec98]  928 bytes  wb_writeback at a2500e
 #47 [9a06f038]  848 bytes  wb_do_writeback at a260ae
 #48 [9a06f388]  536 bytes  wb_workfn at a28228
 #49 [9a06f5a0]  1088 bytes  process_one_work at 24a234
 #50 [9a06f9e0]  1120 bytes  worker_thread at 24ba26
 #51 [9a06fe40]  104 bytes  kthread at 26545a
 #52 [9a06fea8]             kernel_thread_starter at 21b6b62

To be able to increase the stack size to 64k reuse LLILL instruction
in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE
(65192) value as unsigned.

Reported-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
dabrace pushed a commit that referenced this pull request Mar 5, 2019
scsi_device_quiesce() and scsi_device_resume() are called during
system-wide suspend and resume. scsi_device_quiesce() only succeeds for
SCSI devices that are in one of the RUNNING, OFFLINE or TRANSPORT_OFFLINE
states (see also scsi_set_device_state()).  This patch avoids that the
following warning is triggered when resuming a system for which quiescing a
SCSI device failed:

WARNING: CPU: 2 PID: 11303 at drivers/scsi/scsi_lib.c:2600 scsi_device_resume+0x4f/0x58
CPU: 2 PID: 11303 Comm: kworker/u8:70 Not tainted 5.0.0-rc1+ #50
Hardware name: LENOVO 80E3/Lancer 5B2, BIOS A2CN45WW(V2.13) 08/04/2016
Workqueue: events_unbound async_run_entry_fn
Call Trace:
 scsi_dev_type_resume+0x2e/0x60
 async_run_entry_fn+0x32/0xd8
 process_one_work+0x1f4/0x420
 worker_thread+0x28/0x3c0
 kthread+0x118/0x130
 ret_from_fork+0x22/0x40

Cc: Przemek Socha <soprwa@gmail.com>
Reported-by: Przemek Socha <soprwa@gmail.com>
Fixes: 3a0a529 ("block, scsi: Make SCSI quiesce and resume work reliably") # v4.15
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
dabrace pushed a commit that referenced this pull request Jan 19, 2021
…tirq

Commit 39d42fa ("dm crypt: add flags to optionally bypass kcryptd
workqueues") made it possible for some code paths in dm-crypt to be
executed in softirq context, when the underlying driver processes IO
requests in interrupt/softirq context.

When Crypto API backlogs a crypto request, dm-crypt uses
wait_for_completion to avoid sending further requests to an already
overloaded crypto driver. However, if the code is executing in softirq
context, we might get the following stacktrace:

[  210.235213][    C0] BUG: scheduling while atomic: fio/2602/0x00000102
[  210.236701][    C0] Modules linked in:
[  210.237566][    C0] CPU: 0 PID: 2602 Comm: fio Tainted: G        W         5.10.0+ #50
[  210.239292][    C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[  210.241233][    C0] Call Trace:
[  210.241946][    C0]  <IRQ>
[  210.242561][    C0]  dump_stack+0x7d/0xa3
[  210.243466][    C0]  __schedule_bug.cold+0xb3/0xc2
[  210.244539][    C0]  __schedule+0x156f/0x20d0
[  210.245518][    C0]  ? io_schedule_timeout+0x140/0x140
[  210.246660][    C0]  schedule+0xd0/0x270
[  210.247541][    C0]  schedule_timeout+0x1fb/0x280
[  210.248586][    C0]  ? usleep_range+0x150/0x150
[  210.249624][    C0]  ? unpoison_range+0x3a/0x60
[  210.250632][    C0]  ? ____kasan_kmalloc.constprop.0+0x82/0xa0
[  210.251949][    C0]  ? unpoison_range+0x3a/0x60
[  210.252958][    C0]  ? __prepare_to_swait+0xa7/0x190
[  210.254067][    C0]  do_wait_for_common+0x2ab/0x370
[  210.255158][    C0]  ? usleep_range+0x150/0x150
[  210.256192][    C0]  ? bit_wait_io_timeout+0x160/0x160
[  210.257358][    C0]  ? blk_update_request+0x757/0x1150
[  210.258582][    C0]  ? _raw_spin_lock_irq+0x82/0xd0
[  210.259674][    C0]  ? _raw_read_unlock_irqrestore+0x30/0x30
[  210.260917][    C0]  wait_for_completion+0x4c/0x90
[  210.261971][    C0]  crypt_convert+0x19a6/0x4c00
[  210.263033][    C0]  ? _raw_spin_lock_irqsave+0x87/0xe0
[  210.264193][    C0]  ? kasan_set_track+0x1c/0x30
[  210.265191][    C0]  ? crypt_iv_tcw_ctr+0x4a0/0x4a0
[  210.266283][    C0]  ? kmem_cache_free+0x104/0x470
[  210.267363][    C0]  ? crypt_endio+0x91/0x180
[  210.268327][    C0]  kcryptd_crypt_read_convert+0x30e/0x420
[  210.269565][    C0]  blk_update_request+0x757/0x1150
[  210.270563][    C0]  blk_mq_end_request+0x4b/0x480
[  210.271680][    C0]  blk_done_softirq+0x21d/0x340
[  210.272775][    C0]  ? _raw_spin_lock+0x81/0xd0
[  210.273847][    C0]  ? blk_mq_stop_hw_queue+0x30/0x30
[  210.275031][    C0]  ? _raw_read_lock_irq+0x40/0x40
[  210.276182][    C0]  __do_softirq+0x190/0x611
[  210.277203][    C0]  ? handle_edge_irq+0x221/0xb60
[  210.278340][    C0]  asm_call_irq_on_stack+0x12/0x20
[  210.279514][    C0]  </IRQ>
[  210.280164][    C0]  do_softirq_own_stack+0x37/0x40
[  210.281281][    C0]  irq_exit_rcu+0x110/0x1b0
[  210.282286][    C0]  common_interrupt+0x74/0x120
[  210.283376][    C0]  asm_common_interrupt+0x1e/0x40
[  210.284496][    C0] RIP: 0010:_aesni_enc1+0x65/0xb0

Fix this by making crypt_convert function reentrant from the point of
a single bio and make dm-crypt defer further bio processing to a
workqueue, if Crypto API backlogs a request in interrupt context.

Fixes: 39d42fa ("dm crypt: add flags to optionally bypass kcryptd workqueues")
Cc: stable@vger.kernel.org # v5.9+
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
dabrace pushed a commit that referenced this pull request Jan 19, 2021
Commit 39d42fa ("dm crypt: add flags to optionally bypass kcryptd
workqueues") made it possible for some code paths in dm-crypt to be
executed in softirq context, when the underlying driver processes IO
requests in interrupt/softirq context.

In this case sometimes when allocating a new crypto request we may get
a stacktrace like below:

[  210.103008][    C0] BUG: sleeping function called from invalid context at mm/mempool.c:381
[  210.104746][    C0] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2602, name: fio
[  210.106599][    C0] CPU: 0 PID: 2602 Comm: fio Tainted: G        W         5.10.0+ #50
[  210.108331][    C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[  210.110212][    C0] Call Trace:
[  210.110921][    C0]  <IRQ>
[  210.111527][    C0]  dump_stack+0x7d/0xa3
[  210.112411][    C0]  ___might_sleep.cold+0x122/0x151
[  210.113527][    C0]  mempool_alloc+0x16b/0x2f0
[  210.114524][    C0]  ? __queue_work+0x515/0xde0
[  210.115553][    C0]  ? mempool_resize+0x700/0x700
[  210.116586][    C0]  ? crypt_endio+0x91/0x180
[  210.117479][    C0]  ? blk_update_request+0x757/0x1150
[  210.118513][    C0]  ? blk_mq_end_request+0x4b/0x480
[  210.119572][    C0]  ? blk_done_softirq+0x21d/0x340
[  210.120628][    C0]  ? __do_softirq+0x190/0x611
[  210.121626][    C0]  crypt_convert+0x29f9/0x4c00
[  210.122668][    C0]  ? _raw_spin_lock_irqsave+0x87/0xe0
[  210.123824][    C0]  ? kasan_set_track+0x1c/0x30
[  210.124858][    C0]  ? crypt_iv_tcw_ctr+0x4a0/0x4a0
[  210.125930][    C0]  ? kmem_cache_free+0x104/0x470
[  210.126973][    C0]  ? crypt_endio+0x91/0x180
[  210.127947][    C0]  kcryptd_crypt_read_convert+0x30e/0x420
[  210.129165][    C0]  blk_update_request+0x757/0x1150
[  210.130231][    C0]  blk_mq_end_request+0x4b/0x480
[  210.131294][    C0]  blk_done_softirq+0x21d/0x340
[  210.132332][    C0]  ? _raw_spin_lock+0x81/0xd0
[  210.133289][    C0]  ? blk_mq_stop_hw_queue+0x30/0x30
[  210.134399][    C0]  ? _raw_read_lock_irq+0x40/0x40
[  210.135458][    C0]  __do_softirq+0x190/0x611
[  210.136409][    C0]  ? handle_edge_irq+0x221/0xb60
[  210.137447][    C0]  asm_call_irq_on_stack+0x12/0x20
[  210.138507][    C0]  </IRQ>
[  210.139118][    C0]  do_softirq_own_stack+0x37/0x40
[  210.140191][    C0]  irq_exit_rcu+0x110/0x1b0
[  210.141151][    C0]  common_interrupt+0x74/0x120
[  210.142171][    C0]  asm_common_interrupt+0x1e/0x40

Fix this by allocating crypto requests with GFP_ATOMIC mask in
interrupt context.

Fixes: 39d42fa ("dm crypt: add flags to optionally bypass kcryptd workqueues")
Cc: stable@vger.kernel.org # v5.9+
Reported-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
dabrace pushed a commit that referenced this pull request May 6, 2021
Ritesh reported a bug [1] against UML, noting that it crashed on
startup. The backtrace shows the following (heavily redacted):

(gdb) bt
...
 #26 0x0000000060015b5d in sem_init () at ipc/sem.c:268
 #27 0x00007f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2
 #28 0x00007f8990ab8fb2 in call_init (...) at dl-init.c:72
...
 #40 0x00007f89909bf3a6 in nss_load_library (...) at nsswitch.c:359
...
 #44 0x00007f8990895e35 in _nss_compat_getgrnam_r (...) at nss_compat/compat-grp.c:486
 #45 0x00007f8990968b85 in __getgrnam_r [...]
 #46 0x00007f89909d6b77 in grantpt [...]
 #47 0x00007f8990a9394e in __GI_openpty [...]
 #48 0x00000000604a1f65 in openpty_cb (...) at arch/um/os-Linux/sigio.c:407
 #49 0x00000000604a58d0 in start_idle_thread (...) at arch/um/os-Linux/skas/process.c:598
 #50 0x0000000060004a3d in start_uml () at arch/um/kernel/skas/process.c:45
 #51 0x00000000600047b2 in linux_main (...) at arch/um/kernel/um_arch.c:334
 #52 0x000000006000574f in main (...) at arch/um/os-Linux/main.c:144

indicating that the UML function openpty_cb() calls openpty(),
which internally calls __getgrnam_r(), which causes the nsswitch
machinery to get started.

This loads, through lots of indirection that I snipped, the
libcom_err.so.2 library, which (in an unknown function, "??")
calls sem_init().

Now, of course it wants to get libpthread's sem_init(), since
it's linked against libpthread. However, the dynamic linker
looks up that symbol against the binary first, and gets the
kernel's sem_init().

Hajime Tazaki noted that "objcopy -L" can localize a symbol,
so the dynamic linker wouldn't do the lookup this way. I tried,
but for some reason that didn't seem to work.

Doing the same thing in the linker script instead does seem to
work, though I cannot entirely explain - it *also* works if I
just add "VERSION { { global: *; }; }" instead, indicating that
something else is happening that I don't really understand. It
may be that explicitly doing that marks them with some kind of
empty version, and that's different from the default.

Explicitly marking them with a version breaks kallsyms, so that
doesn't seem to be possible.

Marking all the symbols as local seems correct, and does seem
to address the issue, so do that. Also do it for static link,
nsswitch libraries could still be loaded there.

[1] https://bugs.debian.org/983379

Reported-by: Ritesh Raj Sarraf <rrs@debian.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Tested-By: Ritesh Raj Sarraf <rrs@debian.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
dabrace pushed a commit that referenced this pull request May 11, 2021
An out of bounds write happens when setting the default power state.
KASAN sees this as:

[drm] radeon: 512M of GTT memory ready.
[drm] GART: num cpu pages 131072, num gpu pages 131072
==================================================================
BUG: KASAN: slab-out-of-bounds in
radeon_atombios_parse_power_table_1_3+0x1837/0x1998 [radeon]
Write of size 4 at addr ffff88810178d858 by task systemd-udevd/157

CPU: 0 PID: 157 Comm: systemd-udevd Not tainted 5.12.0-E620 #50
Hardware name: eMachines        eMachines E620  /Nile       , BIOS V1.03 09/30/2008
Call Trace:
 dump_stack+0xa5/0xe6
 print_address_description.constprop.0+0x18/0x239
 kasan_report+0x170/0x1a8
 radeon_atombios_parse_power_table_1_3+0x1837/0x1998 [radeon]
 radeon_atombios_get_power_modes+0x144/0x1888 [radeon]
 radeon_pm_init+0x1019/0x1904 [radeon]
 rs690_init+0x76e/0x84a [radeon]
 radeon_device_init+0x1c1a/0x21e5 [radeon]
 radeon_driver_load_kms+0xf5/0x30b [radeon]
 drm_dev_register+0x255/0x4a0 [drm]
 radeon_pci_probe+0x246/0x2f6 [radeon]
 pci_device_probe+0x1aa/0x294
 really_probe+0x30e/0x850
 driver_probe_device+0xe6/0x135
 device_driver_attach+0xc1/0xf8
 __driver_attach+0x13f/0x146
 bus_for_each_dev+0xfa/0x146
 bus_add_driver+0x2b3/0x447
 driver_register+0x242/0x2c1
 do_one_initcall+0x149/0x2fd
 do_init_module+0x1ae/0x573
 load_module+0x4dee/0x5cca
 __do_sys_finit_module+0xf1/0x140
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Without KASAN, this will manifest later when the kernel attempts to
allocate memory that was stomped, since it collides with the inline slab
freelist pointer:

invalid opcode: 0000 [#1] SMP NOPTI
CPU: 0 PID: 781 Comm: openrc-run.sh Tainted: G        W 5.10.12-gentoo-E620 #2
Hardware name: eMachines        eMachines E620  /Nile , BIOS V1.03       09/30/2008
RIP: 0010:kfree+0x115/0x230
Code: 89 c5 e8 75 ea ff ff 48 8b 00 0f ba e0 09 72 63 e8 1f f4 ff ff 41 89 c4 48 8b 45 00 0f ba e0 10 72 0a 48 8b 45 08 a8 01 75 02 <0f> 0b 44 89 e1 48 c7 c2 00 f0 ff ff be 06 00 00 00 48 d3 e2 48 c7
RSP: 0018:ffffb42f40267e10 EFLAGS: 00010246
RAX: ffffd61280ee8d88 RBX: 0000000000000004 RCX: 000000008010000d
RDX: 4000000000000000 RSI: ffffffffba1360b0 RDI: ffffd61280ee8d80
RBP: ffffd61280ee8d80 R08: ffffffffb91bebdf R09: 0000000000000000
R10: ffff8fe2c1047ac8 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000100
FS:  00007fe80eff6b68(0000) GS:ffff8fe339c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe80eec7bc0 CR3: 0000000038012000 CR4: 00000000000006f0
Call Trace:
 __free_fdtable+0x16/0x1f
 put_files_struct+0x81/0x9b
 do_exit+0x433/0x94d
 do_group_exit+0xa6/0xa6
 __x64_sys_exit_group+0xf/0xf
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fe80ef64bea
Code: Unable to access opcode bytes at RIP 0x7fe80ef64bc0.
RSP: 002b:00007ffdb1c47528 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fe80ef64bea
RDX: 00007fe80ef64f60 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 00007fe80ee2c620 R11: 0000000000000246 R12: 00007fe80eff41e0
R13: 00000000ffffffff R14: 0000000000000024 R15: 00007fe80edf9cd0
Modules linked in: radeon(+) ath5k(+) snd_hda_codec_realtek ...

Use a valid power_state index when initializing the "flags" and "misc"
and "misc2" fields.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211537
Reported-by: Erhard F. <erhard_f@mailbox.org>
Fixes: a48b9b4 ("drm/radeon/kms/pm: add asic specific callbacks for getting power state (v2)")
Fixes: 79daedc ("drm/radeon/kms: minor pm cleanups")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.