Truncating an MFS file 175 times makes it unwritable. #4519

Stebalien · 2017-12-22T01:52:58Z

After exactly 175 truncate/writes, mfs returns an empty string from read (forever, apparently).

Stebalien · 2017-12-22T02:02:42Z

Test case:

func TestTruncateAndWrite(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	ds, rt := setupRoot(ctx, t)

	dir := rt.GetValue().(*Directory)

	nd := dag.NodeWithData(ft.FilePBData(nil, 0))
	fi, err := NewFile("test", nd, dir, ds)
	if err != nil {
		t.Fatal(err)
	}

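	fd, err := fi.Open(OpenReadWrite, true)
	if err != nil {
		t.Fatal(err)
	}
	// Defer Close only after the error check, so a failed Open never
	// leads to calling Close on a nil descriptor.
	defer fd.Close()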
	for i := 0; i < 200; i++ {
		err = fd.Truncate(0)
		if err != nil {
			t.Fatal(err)
		}
		l, err := fd.Write([]byte("test"))
		if err != nil {
			t.Fatal(err)
		}
		if l != len("test") {
			t.Fatal("incorrect write length")
		}
		data, err := ioutil.ReadAll(fd)
		if err != nil {
			t.Fatal(err)
		}
		if string(data) != "test" {
			t.Errorf("read error at read %d, read: %v", i, data)
		}
	}
}

@whyrusleeping any ideas?

whyrusleeping · 2017-12-31T21:31:04Z

thats.... special

schomatis · 2018-02-26T21:43:47Z

@Stebalien Are you currently addressing this in your PR #4517? (I saw you were making fixes to the Truncate function). If not I would like to give this a try, seems like an interesting issue.

Truncating to any size smaller than the original 4 bytes produces the same error. Truncating to zero 174 times and then to 1 for the remaining iterations produces:

--- FAIL: TestTruncateAndWrite (0.20s)
	mfs_test.go:1168: read error at read 175, read: [116 116 101 115 116]
	mfs_test.go:1168: read error at read 176, read: [116]
	mfs_test.go:1168: read error at read 177, read: [116]
	[...]

Similarly, truncating to 2:

--- FAIL: TestTruncateAndWrite (0.20s)
	mfs_test.go:1168: read error at read 175, read: [116 101 116 101 115 116]
	mfs_test.go:1168: read error at read 176, read: [116 101]
	mfs_test.go:1168: read error at read 177, read: [116 101]

The same progression continues when truncating to 3. I'm avoiding truncating to the same size (4) because of the error you already encountered and fixed. Truncating to 5 or bigger (always starting with 174 truncations to zero) only fails in iteration 175:

--- FAIL: TestTruncateAndWrite (0.22s)
	mfs_test.go:1168: read error at read 175, read: [116 101 115 116 0 116 101 115 116]
FAIL
FAIL	github.com/ipfs/go-ipfs/mfs	0.222s

Truncating from the start to a value bigger than 4 panics:

--- FAIL: TestTruncateAndWrite (0.00s)
	mfs_test.go:1147: read error at read 0, read: [0 0]
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x6ac1c6]

goroutine 22 [running]:
testing.tRunner.func1(0xc4201263c0)
	/home/user/go/src/github.com/golang/go/src/testing/testing.go:711 +0x2d2
panic(0x73b180, 0x97e4b0)
	/home/user/go/src/github.com/golang/go/src/runtime/panic.go:491 +0x283
github.com/ipfs/go-ipfs/blockservice.(*blockService).AddBlock(0xc42007f4a0, 0x0, 0x0, 0x0, 0x0)
	/home/user/go/src/github.com/ipfs/go-ipfs/blockservice/blockservice.go:132 +0x26
github.com/ipfs/go-ipfs/merkledag.(*dagService).Add(0xc42007b250, 0x953940, 0xc420086058, 0x0, 0x0, 0xc4200871a8, 0x0)
	/home/user/go/src/github.com/ipfs/go-ipfs/merkledag/merkledag.go:52 +0x8c
github.com/ipfs/go-ipfs/unixfs/mod.dagTruncate(0x953940, 0xc420086058, 0x956080, 0xc4201255e0, 0x6, 0x954e40, 0xc42007b250, 0x0, 0x40, 0x40, ...)
	/home/user/go/src/github.com/ipfs/go-ipfs/unixfs/mod/dagmodifier.go:573 +0x448
github.com/ipfs/go-ipfs/unixfs/mod.(*DagModifier).Truncate(0xc420138120, 0x6, 0x21, 0x2)
	/home/user/go/src/github.com/ipfs/go-ipfs/unixfs/mod/dagmodifier.go:505 +0xc5
github.com/ipfs/go-ipfs/mfs.(*fileDescriptor).Truncate(0xc42010a5c0, 0x6, 0x1f, 0xc42004df78)
	/home/user/go/src/github.com/ipfs/go-ipfs/mfs/fd.go:49 +0x9c
github.com/ipfs/go-ipfs/mfs.TestTruncateAndWrite(0xc4201263c0)
	/home/user/go/src/github.com/ipfs/go-ipfs/mfs/mfs_test.go:1131 +0x38a
testing.tRunner(0xc4201263c0, 0x7b7500)
	/home/user/go/src/github.com/golang/go/src/testing/testing.go:746 +0xd0
created by testing.(*T).Run
	/home/user/go/src/github.com/golang/go/src/testing/testing.go:789 +0x2de

But the last two cases may be related to a problem in expandSparse rather than to the issue in dagTruncate that is causing the error you reported.

schomatis · 2018-03-01T19:01:28Z

@Stebalien I took another look at this (sorry if I jumped the gun here, I don't know if you were currently working on it) and the main issue seems to be that dagTruncate does not preserve the Type of the parent of the node being truncated (while manipulating its Blocksizes to adjust them to the truncated value). The magic number 175 is derived from DefaultLinksPerBlock, which has a computed value (after the division) of 174.
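To make the "after the division" remark concrete, here is a minimal sketch of the arithmetic. The two constants (an 8 KiB link block and roughly 47 bytes per link) are an assumption based on how the importer helpers defined them around that time; they are not quoted in this thread, and `linksPerBlock` is a hypothetical helper name.

```go
package main

import "fmt"

// linksPerBlock estimates how many links fit in one intermediate node
// using integer division, which is where the value 174 would come from.
func linksPerBlock(blockSize, linkSize int) int {
	return blockSize / linkSize
}

func main() {
	// Assumed values, not taken from this thread:
	// 8 KiB budget for links, and per link a 34-byte sha256 multihash,
	// 8 bytes of size and 5 bytes of protobuf framing.
	roughLinkBlockSize := 1 << 13
	roughLinkSize := 34 + 8 + 5

	n := linksPerBlock(roughLinkBlockSize, roughLinkSize)
	fmt.Println("links per block:", n)           // 8192 / 47 = 174
	fmt.Println("first write that overflows:", n+1) // 175
}
```

Under those assumptions the 175th appended node is the first one that no longer fits directly under the root, which matches the iteration where the test starts failing.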

I'm new to IPFS and there aren't a lot of comments in the code for a beginner like me to clearly determine what is the expected behavior of some parts of it, so I document here what has drawn my attention while debugging this issue, and leave it to a more experienced developer to determine if there are any problems (beyond the one already mentioned).

After each write to the test file a new node is added (appendData) through the trickle importer. The first 174 (DefaultLinksPerBlock) additions are leaf Data_Raw nodes, the 175th addition (corresponding to test iteration 174) starts a new layer adding a node of depth 1 (increasing the whole tree depth to 2) of type Data_File that will in turn have a child (Data_Raw) node with the 4 bytes of data just written. When the test iteration 175 starts and Truncate is called, dagTruncate will operate on that (second level) child node, truncating its data to zero, but it will inadvertently change its parent's Type. This happens because the newly created structure ndata that will replace the contents of the node (to adjust the new Blocksizes values after truncation) doesn't copy the Type of the original (nd) node, and it overwrites the original node with its own (uninitialized, and hence zero) Type, leaving it as Data_Raw (instead of Data_File).
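The type loss described above can be illustrated with a simplified model. This is only a sketch: `fsNode`, `rebuildLossy` and `rebuildPreserving` are hypothetical names, not the actual unixfs types or the dagTruncate code. The only property it relies on is that Raw is the zero value of the type enum.

```go
package main

import "fmt"

// DataType is a stand-in for the UnixFS node type; Raw is the zero value.
type DataType int

const (
	Raw DataType = iota
	Directory
	File
)

// fsNode is a hypothetical, reduced version of a UnixFS data structure.
type fsNode struct {
	Type       DataType
	Blocksizes []uint64
}

// rebuildLossy mimics the buggy pattern: a fresh structure receives the
// new block sizes, but the original Type is never copied over.
func rebuildLossy(orig fsNode, sizes []uint64) fsNode {
	var nd fsNode // Type is left at its zero value, i.e. Raw
	nd.Blocksizes = sizes
	return nd
}

// rebuildPreserving carries the original Type into the new structure.
func rebuildPreserving(orig fsNode, sizes []uint64) fsNode {
	return fsNode{Type: orig.Type, Blocksizes: sizes}
}

func main() {
	parent := fsNode{Type: File, Blocksizes: []uint64{4}}

	lossy := rebuildLossy(parent, []uint64{0})
	kept := rebuildPreserving(parent, []uint64{0})

	fmt.Println("lossy is Raw:", lossy.Type == Raw)
	fmt.Println("preserving is File:", kept.Type == File)
}
```

Copying the type when the replacement structure is built is the kind of change the eventual fix title ("fix dagTruncate to preserve node type") suggests, though the real patch may differ in detail.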

In this test iteration 175, while reading the file (CtxReadFull) after truncating it, precalcNextBuf reaches this last (175th) node that contains (now) two (Data_Raw) child nodes with the data to read (in the second child), but as the parent itself is now (incorrectly) marked as Data_Raw (instead of its original Data_File type) it's interpreted to be a buffer directly containing the data to read. When opened as a buffer its contents are empty and its children remain unexplored (as its links are not analyzed).
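A toy reader shows why a mislabeled parent yields an empty read. The `node` type and `readAll` function are illustrative assumptions, not the real PBDagReader logic; the point is only that a reader which dispatches on the type never follows the links of a node it believes to be Raw.

```go
package main

import "fmt"

// DataType is a stand-in for the UnixFS node type.
type DataType int

const (
	Raw DataType = iota
	File
)

// node is a hypothetical DAG node: inline data plus child links.
type node struct {
	Type     DataType
	Data     []byte
	Children []*node
}

// readAll returns the inline data of a Raw node and ignores its links;
// for a File node it concatenates the contents of the children.
func readAll(n *node) []byte {
	if n.Type == Raw {
		return n.Data
	}
	var out []byte
	for _, c := range n.Children {
		out = append(out, readAll(c)...)
	}
	return out
}

func main() {
	children := []*node{
		{Type: Raw},                       // truncated to zero
		{Type: Raw, Data: []byte("test")}, // freshly written data
	}

	correct := &node{Type: File, Children: children}
	mislabeled := &node{Type: Raw, Children: children}

	fmt.Printf("correct parent: %q\n", readAll(correct))
	fmt.Printf("mislabeled parent: %q\n", readAll(mislabeled))
}
```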

Every write in the test adds a new Data_Raw node (either as a leaf of the root or as a depth 2 child). This is because modifyDag (called when syncing after the write) always fails to modify the existing nodes (returning done: false). When it encounters a leaf node it tries to read the previously written data into an empty byte slice (Data); because the slice has length zero, Read() interprets this as a request to read nothing. That buffer (Data) is always empty at this point because its contents were read in the previous iteration (and hence drained from the buffer). After that failed attempt it searches for space in the node's children; as they are also empty (for the same reason as before), their Blocksizes are always zero and the condition to reach the desired offset is never met. I can't tell from the function comments, but this may be the expected behavior, with modifyDag only supposed to overwrite existing bytes and never to change any sizes.
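The zero-length read is standard Go reader behavior and can be reproduced with the standard library alone. This sketch uses `bytes.Reader` as a stand-in for whatever reader modifyDag actually receives; `readInto` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"fmt"
)

// readInto performs a single Read from a reader over src into dst and
// reports how many bytes were copied.
func readInto(src, dst []byte) (int, error) {
	return bytes.NewReader(src).Read(dst)
}

func main() {
	// The destination has length zero, so nothing can be copied even
	// though the source holds four bytes, and no error is reported.
	n, err := readInto([]byte("test"), []byte{})
	fmt.Println(n, err)

	// With room in the destination the same call copies the data.
	dst := make([]byte, 4)
	n, err = readInto([]byte("test"), dst)
	fmt.Println(n, err, string(dst))
}
```

Because the first call returns zero bytes and a nil error, a caller that does not check the count sees nothing that looks like a failure.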

In parallel to all this, DagModifier has two write-related indexes, writeStart and curWrOff, that keep increasing without limit even though the file is being truncated repeatedly. In the 175th iteration their values are 700 for writeStart (175 * 4 bytes) and 1400 for curWrOff (same logic, but it increases at double the rate), and both are used for seek and write operations. The offset that modifyDag receives is actually writeStart, so even if the nodes weren't empty it doesn't seem likely that their sizes would match those increasing values to perform the modification.

The Seek function of DagModifier does reset those indexes, but it is not called during the test (not even in dagTruncate, where one might expect them to be reset to zero when the file is truncated). The Seek that is in fact called in readPrep (when the program is about to read the contents of the file) is the one from PBDagReader. This function, which receives an ever-increasing offset (curWrOff), ends up calling a buffer seek that, as the seek interface specifies, just moves its internal index to the desired value without checking whether that much data exists (the behavior of subsequent I/O operations on the underlying object is implementation-dependent). So the following check always passes, and the offset of PBDagReader is set to a value that doesn't correspond to the size of the underlying data available.

whyrusleeping · 2018-03-01T19:18:16Z

@schomatis wow 👏👏👏👏👏👏👏👏👏👏👏👏👏

Great sleuthing work. I'm sorry for writing crappy undocumented code. So, a bit of unix weirdness: truncating a file does not reset its current position (see https://golang.org/pkg/os/#File.Truncate). So that bit of the behaviour is correct; the weirdness comes from a couple of things, I think:

- we should likely be extending existing data nodes up to the block size before appending a new node
- if a node becomes a parent of other nodes, it should get promoted to type File, not ever type Raw.
- Other things you mention (the read 0 length buffer things)

I don't believe that @Stebalien is actively working on this one, so it's all yours. Thank you for investigating :)

Stebalien · 2018-03-01T19:57:05Z

AWESOME work!

I'm not currently working on this. Would you be up to reviving that PR (you'll probably need to make a new one based on a rebased feat/improve-mfs branch) and fixing this issue on top of it? That PR also has a bunch of other MFS fixes that I never got around to merging (because I couldn't get the test case to pass due to this issue).

schomatis · 2018-03-01T20:59:02Z

@Stebalien Great, I'll give that a try and let you know.

Stebalien added a commit that referenced this issue Dec 22, 2017

add test case for MFS repeated truncation failure.

14c1688

tests #4519 License: MIT Signed-off-by: Steven Allen <steven@stebalien.com>

schomatis mentioned this issue Mar 2, 2018

[WIP] MFS improvements #4758

Closed

schomatis mentioned this issue Jul 11, 2018

unixfs: fix dagTruncate to preserve node type #5216

Merged

ghost assigned schomatis Jul 11, 2018

ghost added the status/in-progress In progress label Jul 11, 2018

whyrusleeping closed this as completed in #5216 Jul 18, 2018

ghost removed the status/in-progress In progress label Jul 18, 2018

Stebalien mentioned this issue Jan 8, 2019

Fix/32/pr ports from go-ipfs to go-mfs ipfs/go-mfs#49

Merged
