FSTRIM fails #431

gstreeter · 2013-11-12T20:32:38Z

Issue after updating to 3.10.18+ #590:
fstrim -v / fails with "FITRIM ioctl failed: Operation not supported"

Worked ok in previous 3.10.18 updates.

SDCard is SanDisk Ultra 8GB class 10
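For readers who want to reproduce the error outside of fstrim, the following is a minimal sketch of issuing the FITRIM ioctl directly. The helper names `pack_fstrim_range` and `try_fitrim` are made up for illustration. The request number assumes the generic ioctl encoding used on ARM and x86, and the call normally needs root.

```python
import errno
import fcntl
import os
import struct

# _IOWR('X', 121, struct fstrim_range) under the generic ioctl encoding
# (ARM, x86). A few architectures encode the direction bits differently,
# so treat this constant as an assumption for other platforms.
FITRIM = 0xC0185879


def pack_fstrim_range(start=0, length=2**64 - 1, minlen=0):
    """Pack struct fstrim_range { __u64 start; __u64 len; __u64 minlen; }."""
    return struct.pack("=QQQ", start, length, minlen)


def try_fitrim(mountpoint):
    """Return bytes trimmed, or None if the kernel answers EOPNOTSUPP."""
    buf = bytearray(pack_fstrim_range())
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        # On success the kernel writes the number of trimmed bytes into 'len'.
        fcntl.ioctl(fd, FITRIM, buf)
    except OSError as exc:
        if exc.errno == errno.EOPNOTSUPP:
            return None  # the "Operation not supported" case from this report
        raise
    finally:
        os.close(fd)
    _start, trimmed, _minlen = struct.unpack("=QQQ", bytes(buf))
    return trimmed
```

Calling `try_fitrim("/")` as root would return `None` on a kernel and card combination that behaves as described in this report, and a byte count otherwise.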

popcornmix · 2013-11-12T20:33:50Z

I don't see how that would have changed with the last update. Can you let me know the most recent kernel that worked?
You can get older firmware with:

sudo rpi-update <git hash>

and you can get the hashes from here:
https://github.com/Hexxeh/rpi-firmware/commits/master

ghollingworth · 2013-11-12T20:41:24Z

FSTRIM is not supported by SD cards, so its failing is a good thing...

More likely the previous updates were incorrect and someone has since fixed it.

gstreeter · 2013-11-12T22:43:47Z

Interesting... I've been using it every day since I got my Pis over a year ago :-) With "-v" it reports that it is doing something: it prints a figure for the bytes trimmed, and then zero if it's run again immediately.

ghollingworth · 2013-11-12T22:46:51Z

Yes, it pretends to do something but can't: SD cards do not support TRIM.

I guess that is why it's been fixed...


fedeaf · 2013-12-08T13:39:38Z

I'm finding the same issue with Linux raspberrypi 3.10.22+ #606.

@ghollingworth TRIM is not supported on SD cards, so fstrim should use the ERASE command instead. It is then up to the wear leveling to realize that an erased block is actually free.

ghollingworth · 2013-12-08T14:17:44Z

Although that's a complete waste of time on all SD cards, because they already handle this by keeping 10% of the card available for wear levelling. The only possible advantage to doing this is a reduction in erase time, but since you've normally got something like 400 MB of erased blocks just waiting around to be used, you'd think this doesn't really make any difference either.

Unfortunately, I still don't have an SD card that supports erasing but complains when you run fstrim on it, so I can't debug the problem.

Gordon


ghost · 2014-01-15T20:02:44Z

> Yes it pretends to do something but can't sd cards do not support trim.

SD cards support "TRIM", which is actually called ERASE in the MMC/SD world.
The driver has a capability flag named "MMC_CAP_ERASE" with which it specifies whether it supports the ERASE command or not.
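One way to see from userspace whether a block device advertises discard at all is to read `queue/discard_max_bytes` in sysfs, which is zero when the block layer has no discard support for that device. The sketch below assumes a device name such as `mmcblk0`. The function names and the `sysfs_root` parameter are hypothetical and exist only to keep the example testable. It does not inspect MMC_CAP_ERASE directly.

```python
from pathlib import Path


def discard_max_bytes(device, sysfs_root="/sys/block"):
    """Return queue/discard_max_bytes for a block device, or None if unreadable."""
    path = Path(sysfs_root) / device / "queue" / "discard_max_bytes"
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def supports_discard(device, sysfs_root="/sys/block"):
    """True when the block layer advertises a non-zero discard limit."""
    return bool(discard_max_bytes(device, sysfs_root))
```

For example, `supports_discard("mmcblk0")` returning `False` would be consistent with fstrim failing with "Operation not supported" on that card.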

> Although that's a complete waste of time on all SD cards because
> they already handle this by keeping 10% of the card available for wear levelling.

SD cards might have a tiny bit of over-provisioning (although I doubt it's 10% when even many SSDs only have 7%), but their controllers are a lot worse.
Most SD cards only do dynamic wear leveling, which means wear is only leveled across blocks that are currently being written to. This is one of the reasons why SD cards in the Pi tend to die so quickly or develop bad blocks exactly where ext4 places its journal.

ghollingworth · 2014-01-15T20:17:20Z

I'll remember to tell the SD card manufacturers that they do not know what they're talking about the next time they're in the office.

Buy the official Raspberry Pi SD card, it's not only the cheapest on the market but also pretty much the fastest card for the Raspberry Pi. If you succeed in trashing one of those then I'd be interested in seeing it.

Raspbian has journalling turned off to avoid trashing the card

ghost · 2014-01-15T20:30:41Z

As the Raspberry Pi SD cards are re-labeled Samsung cards, I'll probably buy a Samsung one, as I don't want to cause warranty replacement costs for the foundation ^^

The question why fstrim worked previously and now fails still remains ....

ghollingworth · 2014-01-15T20:37:02Z

If you buy standard Samsung ones you'll end up with something made to work in a camera, which is not optimised for the Raspberry Pi. The point of the Raspberry Pi SD card is that it was specifically developed and tuned to work with the Raspberry Pi!

ghost · 2014-01-18T13:50:23Z

> Raspbian has journalling turned off to avoid trashing the card

At least not on my installation, which is based on the official 12.2013 image:

Filesystem features: has_journal
ext_attr resize_inode dir_index filetype extent flex_bg
sparse_super large_file uninit_bg dir_nlink extra_isize

It also matches the location of the bad blocks I saw on that 32GB microSD, a few megabytes after the partition start.
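As a rough way to check the same thing on another installation, the sketch below looks for has_journal in the text printed by `tune2fs -l <device>` or `dumpe2fs -h <device>`. It assumes the feature list appears on a single "Filesystem features:" line, as those tools print it. The function names are made up for illustration.

```python
def parse_features(tune2fs_output):
    """Return the set of ext filesystem features listed in tune2fs/dumpe2fs text."""
    for line in tune2fs_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Filesystem features":
            return set(value.split())
    return set()


def has_journal(tune2fs_output):
    """True when the has_journal feature flag is present."""
    return "has_journal" in parse_features(tune2fs_output)
```

Fed the feature list quoted above, `has_journal` returns `True`, meaning that filesystem has a journal.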

fedeaf · 2015-03-22T14:00:35Z

@popcornmix You might want to close this issue as it is now resolved on the new versions of Raspbian + firmware.
(I'm really not sure at what point it was fixed.)

popcornmix · 2015-03-22T14:03:30Z

Okay, thanks for the update.

We got a null pointer deference BUG_ON in blk_mq_rq_timed_out() as following:

    [ 108.825472] BUG: kernel NULL pointer dereference, address: 0000000000000040
    [ 108.827059] PGD 0 P4D 0
    [ 108.827313] Oops: 0000 [#1] SMP PTI
    [ 108.827657] CPU: 6 PID: 198 Comm: kworker/6:1H Not tainted 5.3.0-rc8+ #431
    [ 108.829503] Workqueue: kblockd blk_mq_timeout_work
    [ 108.829913] RIP: 0010:blk_mq_check_expired+0x258/0x330
    [ 108.838191] Call Trace:
    [ 108.838406] bt_iter+0x74/0x80
    [ 108.838665] blk_mq_queue_tag_busy_iter+0x204/0x450
    [ 108.839074] ? __switch_to_asm+0x34/0x70
    [ 108.839405] ? blk_mq_stop_hw_queue+0x40/0x40
    [ 108.839823] ? blk_mq_stop_hw_queue+0x40/0x40
    [ 108.840273] ? syscall_return_via_sysret+0xf/0x7f
    [ 108.840732] blk_mq_timeout_work+0x74/0x200
    [ 108.841151] process_one_work+0x297/0x680
    [ 108.841550] worker_thread+0x29c/0x6f0
    [ 108.841926] ? rescuer_thread+0x580/0x580
    [ 108.842344] kthread+0x16a/0x1a0
    [ 108.842666] ? kthread_flush_work+0x170/0x170
    [ 108.843100] ret_from_fork+0x35/0x40

The bug is caused by the race between timeout handle and completion for flush request.

When timeout handle function blk_mq_rq_timed_out() try to read 'req->q->mq_ops', the 'req' have completed and reinitiated by next flush request, which would call blk_rq_init() to clear 'req' as 0.

After commit 12f5b93 ("blk-mq: Remove generation seqeunce"), normal requests lifetime are protected by refcount. Until 'rq->ref' drop to zero, the request can really be free. Thus, these requests cannot been reused before timeout handle finish.

However, flush request has defined .end_io and rq->end_io() is still called even if 'rq->ref' doesn't drop to zero. After that, the 'flush_rq' can be reused by the next flush request handle, resulting in null pointer deference BUG ON.

We fix this problem by covering flush request with 'rq->ref'. If the refcount is not zero, flush_end_io() return and wait the last holder recall it. To record the request status, we add a new entry 'rq_status', which will be used in flush_end_io().

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: stable@vger.kernel.org # v4.18+
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Yufen Yu <yuyufen@huawei.com>

-------
v2:
- move rq_status from struct request to struct blk_flush_queue
v3:
- remove unnecessary '{}' pair.
v4:
- let spinlock to protect 'fq->rq_status'
v5:
- move rq_status after flush_running_idx member of struct blk_flush_queue

Signed-off-by: Jens Axboe <axboe@kernel.dk>


popcornmix closed this as completed Mar 22, 2015

pelwell mentioned this issue Jul 20, 2015

could you add "MMC_CAP_ERASE" to the bcm2835-sdhost driver mmc->caps initialization? #1076

Closed

pelwell mentioned this issue Nov 22, 2016

Raspbian 2016-05-27 kernel crash #1518

Closed
