Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Truncating an MFS file 175 times makes it unwritable. #4519

Closed
Stebalien opened this issue Dec 22, 2017 · 7 comments · Fixed by #5216
Closed

Truncating an MFS file 175 times makes it unwritable. #4519

Stebalien opened this issue Dec 22, 2017 · 7 comments · Fixed by #5216
Assignees

Comments

@Stebalien
Copy link
Member

After exactly 175 truncate/writes, mfs returns an empty string from read (forever, apparently).

@Stebalien
Copy link
Member Author

Test case:

func TestTruncateAndWrite(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	ds, rt := setupRoot(ctx, t)

	dir := rt.GetValue().(*Directory)

	nd := dag.NodeWithData(ft.FilePBData(nil, 0))
	fi, err := NewFile("test", nd, dir, ds)
	if err != nil {
		t.Fatal(err)
	}

	fd, err := fi.Open(OpenReadWrite, true)
	defer fd.Close()
	if err != nil {
		t.Fatal(err)
	}
	for i := 0; i < 200; i++ {
		err = fd.Truncate(0)
		if err != nil {
			t.Fatal(err)
		}
		l, err := fd.Write([]byte("test"))
		if err != nil {
			t.Fatal(err)
		}
		if l != len("test") {
			t.Fatal("incorrect write length")
		}
		data, err := ioutil.ReadAll(fd)
		if err != nil {
			t.Fatal(err)
		}
		if string(data) != "test" {
			t.Errorf("read error at read %d, read: %v", i, data)
		}
	}
}

@whyrusleeping any ideas?

Stebalien added a commit that referenced this issue Dec 22, 2017
tests #4519

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>
@whyrusleeping
Copy link
Member

thats.... special

@schomatis
Copy link
Contributor

@Stebalien Are you currently addressing this in your PR #4517? (I saw you were making fixes to the Truncate function). If not I would like to give this a try, seems like an interesting issue.

Truncating to any smaller than (the original) 4 bytes value produces the same error. Truncating to zero 174 times and the rest of the times to 1 produces:

--- FAIL: TestTruncateAndWrite (0.20s)
	mfs_test.go:1168: read error at read 175, read: [116 116 101 115 116]
	mfs_test.go:1168: read error at read 176, read: [116]
	mfs_test.go:1168: read error at read 177, read: [116]
	[...]

Similarly, truncating to 2:

--- FAIL: TestTruncateAndWrite (0.20s)
	mfs_test.go:1168: read error at read 175, read: [116 101 116 101 115 116]
	mfs_test.go:1168: read error at read 176, read: [116 101]
	mfs_test.go:1168: read error at read 177, read: [116 101]

The same progression continues truncating to 3. I'm avoiding truncating to the same size for the error you encountered and fixed. Truncating to 5 or bigger (always starting with 174 truncations to zero) only fails in iteration 175:

--- FAIL: TestTruncateAndWrite (0.22s)
	mfs_test.go:1168: read error at read 175, read: [116 101 115 116 0 116 101 115 116]
FAIL
FAIL	github.com/ipfs/go-ipfs/mfs	0.222s

Truncating from the start to a value bigger than 4 panics:

--- FAIL: TestTruncateAndWrite (0.00s)
	mfs_test.go:1147: read error at read 0, read: [0 0]
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x6ac1c6]

goroutine 22 [running]:
testing.tRunner.func1(0xc4201263c0)
	/home/user/go/src/github.com/golang/go/src/testing/testing.go:711 +0x2d2
panic(0x73b180, 0x97e4b0)
	/home/user/go/src/github.com/golang/go/src/runtime/panic.go:491 +0x283
github.com/ipfs/go-ipfs/blockservice.(*blockService).AddBlock(0xc42007f4a0, 0x0, 0x0, 0x0, 0x0)
	/home/user/go/src/github.com/ipfs/go-ipfs/blockservice/blockservice.go:132 +0x26
github.com/ipfs/go-ipfs/merkledag.(*dagService).Add(0xc42007b250, 0x953940, 0xc420086058, 0x0, 0x0, 0xc4200871a8, 0x0)
	/home/user/go/src/github.com/ipfs/go-ipfs/merkledag/merkledag.go:52 +0x8c
github.com/ipfs/go-ipfs/unixfs/mod.dagTruncate(0x953940, 0xc420086058, 0x956080, 0xc4201255e0, 0x6, 0x954e40, 0xc42007b250, 0x0, 0x40, 0x40, ...)
	/home/user/go/src/github.com/ipfs/go-ipfs/unixfs/mod/dagmodifier.go:573 +0x448
github.com/ipfs/go-ipfs/unixfs/mod.(*DagModifier).Truncate(0xc420138120, 0x6, 0x21, 0x2)
	/home/user/go/src/github.com/ipfs/go-ipfs/unixfs/mod/dagmodifier.go:505 +0xc5
github.com/ipfs/go-ipfs/mfs.(*fileDescriptor).Truncate(0xc42010a5c0, 0x6, 0x1f, 0xc42004df78)
	/home/user/go/src/github.com/ipfs/go-ipfs/mfs/fd.go:49 +0x9c
github.com/ipfs/go-ipfs/mfs.TestTruncateAndWrite(0xc4201263c0)
	/home/user/go/src/github.com/ipfs/go-ipfs/mfs/mfs_test.go:1131 +0x38a
testing.tRunner(0xc4201263c0, 0x7b7500)
	/home/user/go/src/github.com/golang/go/src/testing/testing.go:746 +0xd0
created by testing.(*T).Run
	/home/user/go/src/github.com/golang/go/src/testing/testing.go:789 +0x2de

But the last two cases may be more related with a problem in expandSparse than the issue in dagTruncate that is causing the error you reported.

@schomatis
Copy link
Contributor

@Stebalien I took another look at this (sorry if I jumped the gun here, I don't know if you were currently working on it) and the main issue seems to be that dagTruncate does not preserve the Type of the parent of the node that is truncating (while manipulating its Blocksizes to adjust them to the truncated value). The magic number 175 is derived from the DefaultLinksPerBlock, which has a computed value (after the division) of 174.

I'm new to IPFS and there aren't a lot of comments in the code for a beginner like me to clearly determine what is the expected behavior of some parts of it, so I document here what has drawn my attention while debugging this issue, and leave it to a more experienced developer to determine if there are any problems (beyond the one already mentioned).

After each write to the test file a new node is added (appendData) through the trickle importer. The first 174 (DefaultLinksPerBlock) additions are leaf Data_Raw nodes, the 175th addition (corresponding to test iteration 174) starts a new layer adding a node of depth 1 (increasing the whole tree depth to 2) of type Data_File that will in turn have a child (Data_Raw) node with the 4 bytes of data just written. When the test iteration 175 starts and Truncate is called, dagTruncate will operate on that (second level) child node, truncating its data to zero, but it will inadvertently change its parent's Type. This happens because the newly created structure ndata that will replace the contents of the node (to adjust the new Blocksizes values after truncation) doesn't copy the Type of the original (nd) node, and it overwrites the original node with its own (uninitialized, and hence zero) Type, leaving it as Data_Raw (instead of Data_File).

In this test iteration 175, while reading the file (CtxReadFull) after truncating it, precalcNextBuf reaches this last (175th) node that contains (now) two (Data_Raw) child nodes with the data to read (in the second child), but as the parent itself is now (incorrectly) marked as Data_Raw (instead of its original Data_File type) it's interpreted to be a buffer directly containing the data to read. When opened as a buffer its contents are empty and its children remain unexplored (as its links are not analyzed).

In every write of the test a new Data_Raw node (either as the leaf of the root or as a depth 2 child) is added, this is because modifyDag (called when syncing after the write) always fails to modify the existing nodes (returning done: false). When it encounters a leaf node it tries to read the data written before into an empty byte slice (Data), because it has length zero Read() interprets it as an order to read nothing. The value of that buffer (Data) is always zero at this point because the contents have been read in the previous iteration (and hence drained from the buffer). After that failed attempt it searches for space in the node's children, as they are also empty (for the same reason as before) their Blocksizes are always zero and the condition to reach the desired offset is never met. I'm not sure from the function comments but this may be the expected behavior and modifyDag is only supposed to overwrite existing bytes but never to change any sizes.

In parallel to all this DagModifier has two write related indexes, writeStart and curWrOff, that keep increasing without limit, even though the file is being truncated repeatedly. In the 175th iteration their values are at 700 for writeStart (175 * 4 bytes) and 1400 for curWrOff (same logic, but increases at double rate), which are used for seek and write operations. The offset that modifyDag receives is actually writeStart, so even if the nodes weren't empty it doesn't seem likely their sizes would match those increasing values to perform the modification.

The function Seek from DagModifier does reset those indexes but that function is not called during the test (not even in dagTruncate, which might be expected when the file is truncated, restarting them to zero). The Seek function which is in fact called in readPrep (when the program is about to read the contents of the file) is the Seek from PBDagReader. This seek function, wich receives an ever increasing offset (curWrOff) ends up calling a buffer seek that, as the seek interface specifies, it just moves its internal index to the desired value without checking if there is that much data (as the the behavior of subsequent I/O operations on the underlying object is implementation-dependent), so the following check always passes, and the offset of PBDagReader is set to a value that doesn't correspond with the size of the underlying data available.

@whyrusleeping
Copy link
Member

@schomatis wow 👏👏👏👏👏👏👏👏👏👏👏👏👏

Great sleuthing work. I'm sorry for writing crappy undocumented code. So a bit of unix weirdness, truncating a file does not reset its current position (see https://golang.org/pkg/os/#File.Truncate). So that bit of the behaviour is correct, weirdness comes from a couple things i think:

  • we should likely be extending existing data nodes up to the block size before appending a new node
  • if a node becomes a parent of other nodes, it should get promoted to type File, not ever type Raw.
  • Other things you mention (the read 0 length buffer things)

I don't believe that @Stebalien is actively working on this one, so its all yours. Thank you for investigating :)

@Stebalien
Copy link
Member Author

AWESOME work!

I'm not currently working on this. Would you be up to reviving that PR (you'll probably need to make a new one based on a rebased feat/improve-mfs branch) and fixing this issue on top of it? That PR also has a bunch of other MFS fixes that I never got around to merging (because I couldn't get the test case to pass due to this issue).

@schomatis
Copy link
Contributor

@Stebalien Great, I'll give that a try and let you know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants