z_wr_iss/0: page allocation failure #540

Closed
ghost opened this issue Jan 20, 2012 · 3 comments
Labels
Component: Memory Management kernel memory management

ghost commented Jan 20, 2012

Sorry, but after looking around, I still can't tell whether this is a follow-up to an existing issue or a genuinely new one...

This occurred under RHEL 6.2, 2.6.32-220.2.1.el6.x86_64, 48GB RAM, 2x raidz2 with 6+2 2TB disks.
zfs/spl compiled from master zfsonlinux-zfs-b9c59ec and zfsonlinux-spl-e05bec8 (Jan 4th).

Jan 20 01:57:58 anor kernel: _thread+0x1d2/0x330 [spl]
Jan 20 01:57:58 anor kernel: [<ffffffff8105e770>] ? default_wake_function+0x0/0x20
Jan 20 01:57:58 anor kernel: [<ffffffffa0268910>] ? taskq_thread+0x0/0x330 [spl]
Jan 20 01:57:58 anor kernel: [<ffffffff810906a6>] ? kthread+0x96/0xa0
Jan 20 01:57:58 anor kernel: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
Jan 20 01:57:58 anor kernel: [<ffffffff81090610>] ? kthread+0x0/0xa0
Jan 20 01:57:58 anor kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
Jan 20 01:57:58 anor kernel: z_wr_iss/0: page allocation failure. order:0, mode:0x20
Jan 20 01:57:58 anor kernel: Pid: 1663, comm: z_wr_iss/0 Tainted: P        W  ----------------   2.6.32-220.2.1.el6.x86_64 #1
Jan 20 01:57:58 anor kernel: Call Trace:
Jan 20 01:57:58 anor kernel: <IRQ>  [<ffffffff81123d2f>] ? __alloc_pages_nodemask+0x77f/0x940
Jan 20 01:57:58 anor kernel: [<ffffffff8115dbe2>] ? kmem_getpages+0x62/0x170
Jan 20 01:57:58 anor kernel: [<ffffffff8115e7fa>] ? fallback_alloc+0x1ba/0x270
Jan 20 01:57:58 anor kernel: [<ffffffff8115e24f>] ? cache_grow+0x2cf/0x320
Jan 20 01:57:58 anor kernel: [<ffffffff8115e579>] ? ____cache_alloc_node+0x99/0x160
Jan 20 01:57:58 anor kernel: [<ffffffff8142196a>] ? __alloc_skb+0x7a/0x180
Jan 20 01:57:58 anor kernel: [<ffffffff8115f43f>] ? kmem_cache_alloc_node_notrace+0x6f/0x130
Jan 20 01:57:58 anor kernel: [<ffffffff8115f67b>] ? __kmalloc_node+0x7b/0x100
Jan 20 01:57:58 anor kernel: [<ffffffff8142e018>] ? netif_receive_skb+0x58/0x60
Jan 20 01:57:58 anor kernel: [<ffffffff8142196a>] ? __alloc_skb+0x7a/0x180
Jan 20 01:57:58 anor kernel: [<ffffffff81421ae6>] ? __netdev_alloc_skb+0x36/0x60
Jan 20 01:57:58 anor kernel: [<ffffffffa023d172>] ? e1000_clean_rx_irq+0x392/0x530 [e1000e]
Jan 20 01:57:58 anor kernel: [<ffffffff8100df09>] ? handle_irq+0x49/0xa0
Jan 20 01:57:58 anor kernel: [<ffffffffa023c5f0>] ? e1000_clean+0xb0/0x2b0 [e1000e]
Jan 20 01:57:58 anor kernel: [<ffffffff814308c3>] ? net_rx_action+0x103/0x2f0
Jan 20 01:57:58 anor kernel: [<ffffffff81071f81>] ? __do_softirq+0xc1/0x1d0
Jan 20 01:57:58 anor kernel: [<ffffffff810d9310>] ? handle_IRQ_event+0x60/0x170
Jan 20 01:57:58 anor kernel: [<ffffffff81071fda>] ? __do_softirq+0x11a/0x1d0
Jan 20 01:57:58 anor kernel: [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
Jan 20 01:57:58 anor kernel: [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
Jan 20 01:57:58 anor kernel: [<ffffffff81071d65>] ? irq_exit+0x85/0x90
Jan 20 01:57:58 anor kernel: [<ffffffff814f4dd5>] ? do_IRQ+0x75/0xf0
Jan 20 01:57:58 anor kernel: [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
Jan 20 01:57:58 anor kernel: <EOI>  [<ffffffff81271650>] ? rb_next+0x20/0x50
Jan 20 01:57:58 anor kernel: [<ffffffff811492a0>] ? alloc_vmap_area+0x1c0/0x390
Jan 20 01:57:58 anor kernel: [<ffffffff8115f4d4>] ? kmem_cache_alloc_node_notrace+0x104/0x130
Jan 20 01:57:58 anor kernel: [<ffffffff81149562>] ? __get_vm_area_node+0xf2/0x230
Jan 20 01:57:58 anor kernel: [<ffffffffa0266e0c>] ? kv_alloc+0x7c/0xc0 [spl]
Jan 20 01:57:58 anor kernel: [<ffffffff81149999>] ? __vmalloc_node+0x89/0xb0
Jan 20 01:57:58 anor kernel: [<ffffffffa0266e0c>] ? kv_alloc+0x7c/0xc0 [spl]
Jan 20 01:57:58 anor kernel: [<ffffffff81149d32>] ? __vmalloc+0x22/0x30
Jan 20 01:57:58 anor kernel: [<ffffffffa0266e0c>] ? kv_alloc+0x7c/0xc0 [spl]
Jan 20 01:57:58 anor kernel: [<ffffffffa0267189>] ? spl_kmem_cache_alloc+0x339/0x650 [spl]
Jan 20 01:57:58 anor kernel: [<ffffffffa03c42b3>] ? zio_buf_alloc+0x23/0x30 [zfs]
Jan 20 01:57:58 anor kernel: [<ffffffffa0391b5e>] ? vdev_raidz_io_start+0x29e/0x6b0 [zfs]
Jan 20 01:57:58 anor kernel: [<ffffffffa03c43e7>] ? zio_vdev_io_start+0xa7/0x2e0 [zfs]
Jan 20 01:57:58 anor kernel: [<ffffffffa038e130>] ? vdev_mirror_child_done+0x0/0x30 [zfs]
Jan 20 01:57:58 anor kernel: [<ffffffffa03c5d99>] ? zio_nowait+0xa9/0x120 [zfs]
Jan 20 01:57:58 anor kernel: [<ffffffffa038e77e>] ? vdev_mirror_io_start+0x17e/0x3c0 [zfs]
Jan 20 01:57:58 anor kernel: [<ffffffffa038e130>] ? vdev_mirror_child_done+0x0/0x30 [zfs]
Jan 20 01:57:58 anor kernel: [<ffffffffa03c455f>] ? zio_vdev_io_start+0x21f/0x2e0 [zfs]
Jan 20 01:57:58 anor kernel: [<ffffffffa03c6c59>] ? zio_execute+0x99/0xf0 [zfs]
Jan 20 01:57:58 anor kernel: [<ffffffffa0268ae2>] ? taskq_thread+0x1d2/0x330 [spl]
Jan 20 01:57:58 anor kernel: [<ffffffff8105e770>] ? default_wake_function+0x0/0x20
Jan 20 01:57:58 anor kernel: [<ffffffffa0268910>] ? taskq_thread+0x0/0x330 [spl]
Jan 20 01:57:58 anor kernel: [<ffffffff810906a6>] ? kthread+0x96/0xa0
Jan 20 01:57:58 anor kernel: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
Jan 20 01:57:58 anor kernel: [<ffffffff81090610>] ? kthread+0x0/0xa0
Jan 20 01:57:58 anor kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20

This repeats several times, followed later by:

Jan 20 01:58:12 anor kernel: z_wr_iss/0: page allocation failure. order:0, mode:0x20
Jan 20 01:58:12 anor kernel: Pid: 1663, comm: z_wr_iss/0 Tainted: P        W  ----------------   2.6.32-220.2.1.el6.x86_64 #1
Jan 20 01:58:12 anor kernel: Call Trace:
Jan 20 01:58:12 anor kernel: <IRQ>  [<ffffffff81123d2f>] ? __alloc_pages_nodemask+0x77f/0x940
Jan 20 01:58:12 anor kernel: [<ffffffff8115dbe2>] ? kmem_getpages+0x62/0x170
Jan 20 01:58:12 anor kernel: [<ffffffff8115e7fa>] ? fallback_alloc+0x1ba/0x270
Jan 20 01:58:12 anor kernel: [<ffffffff8115e24f>] ? cache_grow+0x2cf/0x320
Jan 20 01:58:12 anor kernel: [<ffffffff8115e579>] ? ____cache_alloc_node+0x99/0x160
Jan 20 01:58:12 anor kernel: [<ffffffff8115f35b>] ? kmem_cache_alloc+0x11b/0x190
Jan 20 01:58:12 anor kernel: [<ffffffff81434220>] ? dst_alloc+0x30/0x80
Jan 20 01:58:12 anor kernel: [<ffffffff8145f8cc>] ? ip_route_input_slow+0x32c/0xb00
Jan 20 01:58:12 anor kernel: [<ffffffff81460106>] ? ip_route_input+0x66/0x5b0
Jan 20 01:58:12 anor kernel: [<ffffffff81421367>] ? __kfree_skb+0x47/0xa0
Jan 20 01:58:12 anor kernel: [<ffffffff8148df9b>] ? arp_process+0x3ab/0x730
Jan 20 01:58:12 anor kernel: [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
Jan 20 01:58:12 anor kernel: [<ffffffff8115f447>] ? kmem_cache_alloc_node_notrace+0x77/0x130
Jan 20 01:58:12 anor kernel: [<ffffffff8148e441>] ? arp_rcv+0x111/0x140
Jan 20 01:58:12 anor kernel: [<ffffffff8142bf6b>] ? __netif_receive_skb+0x49b/0x6e0
Jan 20 01:58:12 anor kernel: [<ffffffff8142e018>] ? netif_receive_skb+0x58/0x60
Jan 20 01:58:12 anor kernel: [<ffffffff8142e120>] ? napi_skb_finish+0x50/0x70
Jan 20 01:58:12 anor kernel: [<ffffffff814307a9>] ? napi_gro_receive+0x39/0x50
Jan 20 01:58:12 anor kernel: [<ffffffffa023a3bb>] ? e1000_receive_skb+0x5b/0x90 [e1000e]
Jan 20 01:58:12 anor kernel: [<ffffffffa023d090>] ? e1000_clean_rx_irq+0x2b0/0x530 [e1000e]
Jan 20 01:58:12 anor kernel: [<ffffffffa023c5f0>] ? e1000_clean+0xb0/0x2b0 [e1000e]
Jan 20 01:58:12 anor kernel: [<ffffffff814308c3>] ? net_rx_action+0x103/0x2f0
Jan 20 01:58:12 anor kernel: [<ffffffff81071f81>] ? __do_softirq+0xc1/0x1d0
Jan 20 01:58:12 anor kernel: [<ffffffff810d9310>] ? handle_IRQ_event+0x60/0x170
Jan 20 01:58:12 anor kernel: [<ffffffff81071fda>] ? __do_softirq+0x11a/0x1d0
Jan 20 01:58:12 anor kernel: [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
Jan 20 01:58:12 anor kernel: [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
Jan 20 01:58:12 anor kernel: [<ffffffff81071d65>] ? irq_exit+0x85/0x90
Jan 20 01:58:12 anor kernel: [<ffffffff814f4dd5>] ? do_IRQ+0x75/0xf0

and more...

The full log (826kB) can be found at http://wwwlehre.dhbw-stuttgart.de/~bziller/zfs-messages.txt
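
For triaging a log like this, a small script can tally the failure headers and decode the mode value. The sketch below is not part of the original report. The regex follows the header format shown in the excerpts above, and the flag table is an assumption taken from mainline 2.6.32 `include/linux/gfp.h` (the RHEL kernel may differ). Under that assumption, 0x20 is `__GFP_HIGH` alone, which is what `GFP_ATOMIC` expands to in that header; that would be consistent with the allocation being made in the `<IRQ>` part of the trace (the e1000e receive path) rather than by the ZFS thread it interrupted.

```python
import re
from collections import Counter

# Failure header as it appears in the excerpts, e.g.
# "... kernel: z_wr_iss/0: page allocation failure. order:0, mode:0x20"
FAILURE_RE = re.compile(
    r"kernel: (?P<comm>\S+): page allocation failure\. "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)

# ASSUMPTION: low gfp bits as defined in mainline 2.6.32 include/linux/gfp.h.
# A vendor kernel may define them differently; check the matching headers.
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}


def decode_gfp(mode):
    """Return the names of the known flag bits set in mode.

    Bits outside the table are reported as one trailing hex string.
    """
    names = [name for bit, name in sorted(GFP_BITS.items()) if mode & bit]
    unknown = mode & ~sum(GFP_BITS)
    if unknown:
        names.append(hex(unknown))
    return names


def summarize(lines):
    """Count failure headers, keyed by (comm, order, mode)."""
    counts = Counter()
    for line in lines:
        m = FAILURE_RE.search(line)
        if m:
            key = (m.group("comm"), int(m.group("order")), int(m.group("mode"), 16))
            counts[key] += 1
    return counts


# Three lines copied from the excerpts above.
sample = [
    "Jan 20 01:57:58 anor kernel: z_wr_iss/0: page allocation failure. order:0, mode:0x20",
    "Jan 20 01:57:58 anor kernel: Call Trace:",
    "Jan 20 01:58:12 anor kernel: z_wr_iss/0: page allocation failure. order:0, mode:0x20",
]

for (comm, order, mode), n in summarize(sample).items():
    print(comm, "order", order, "mode", hex(mode), decode_gfp(mode), "count", n)
```

To run it over the full log, pass an open file object to `summarize` in place of `sample`; it only needs an iterable of lines.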

Contributor

ryao commented May 17, 2012

The following patch should fix this:

ryao@61c0a39

Contributor

ryao commented Aug 26, 2012

This issue should be resolved by pull request #883.

@behlendorf
Contributor

The #883 changes have been merged into master, so this issue should be resolved.

behlendorf pushed a commit to behlendorf/zfs that referenced this issue May 21, 2018
To reduce mutex footprint, we detect the existence of owner in kernel mutex,
and rely on it if it exists.

Note that before Linux 3.0, mutex owner is of type thread_info. Also note
that, in Linux 3.18, the condition for owner is changed from
CONFIG_DEBUG_MUTEXES || CONFIG_SMP to
CONFIG_DEBUG_MUTEXES || CONFIG_MUTEX_SPIN_ON_OWNER

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#540
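
The conditions in the commit message above can be restated as a small truth-table model. This is only a Python sketch of the rule as worded in that message, not the kernel's or SPL's actual code (which is C preprocessor logic); the function and parameter names are hypothetical.

```python
def mutex_has_owner(version, debug_mutexes=False, smp=False, spin_on_owner=False):
    """Model of when the kernel mutex carries an owner field.

    version is a tuple such as (2, 6, 32) or (3, 18). The boolean
    parameters stand in for CONFIG_DEBUG_MUTEXES, CONFIG_SMP and
    CONFIG_MUTEX_SPIN_ON_OWNER respectively.
    """
    if version >= (3, 18):
        # From Linux 3.18: CONFIG_DEBUG_MUTEXES || CONFIG_MUTEX_SPIN_ON_OWNER
        return debug_mutexes or spin_on_owner
    # Before 3.18: CONFIG_DEBUG_MUTEXES || CONFIG_SMP
    return debug_mutexes or smp


def owner_is_thread_info(version):
    """Per the commit message, the owner is a thread_info before Linux 3.0."""
    return version < (3, 0)


# An SMP kernel without mutex debugging, before and after the 3.18 change.
print(mutex_has_owner((2, 6, 32), smp=True))
print(mutex_has_owner((3, 18), smp=True))
print(owner_is_thread_info((2, 6, 32)))
```

The point of the model is that `CONFIG_SMP` alone stops being sufficient from 3.18 onward, so a build-time check that only looks at `CONFIG_SMP` would give the wrong answer on newer kernels.
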
pcd1193182 pushed a commit to pcd1193182/zfs that referenced this issue Sep 26, 2023
…aster

Merge remote-tracking branch '6.0/stage' into 'master'