Remove filestore as a go between with StorageMiner, pass direct io.reader to StorageMiner #49

Merged 3 commits into master from feat/pieceio-modular
Jan 18, 2020

Conversation

@hannahhoward (Collaborator) commented Jan 17, 2020

Goals

Many people have balked at the "Filestore" concept as a handoff between the storage market and the storage miner and have requested that we simply pass a direct io.Reader from one to the other. This PR implements that change.

As I understand it, the filestore was originally introduced in the specs on the theory that markets and miner might run on different machines, and that, rather than sending the whole piece over RPC, the filestore would act as a shared interface to a single staging area (a networked hard drive or other shared device). While that concept might still have value, it seems to me it's a concern best left to the code that integrates storage market and storage miner, and to the miner who wishes to customize that code.

Implementation

While this PR does not delete the Filestore code, it moves all file management inside the module (it also removes the requirement that the StorageMiner delete the files when it's done -- that is now handled internally as well).

I'm tagging @laser, @acruikshank, and @magik6k, who will work with this in their own node implementations. Note @magik6k: we are also working on a replacement for sealed refs that works with CAR files; it will be ready before we attempt to integrate this with Lotus.

The key line of concern to external stakeholders is this:

-	OnDealComplete(ctx context.Context, deal MinerDeal, piecePath string) (uint64, error)
+	OnDealComplete(ctx context.Context, deal MinerDeal, pieceSize uint64, pieceReader io.Reader) (uint64, error)
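
For illustration, here is a minimal sketch of what the receiving side might now look like in a node implementation; the pieceSealer interface and its AddPiece signature are assumptions for this sketch, not the actual miner API:

```go
package example

import (
	"context"
	"io"
)

// MinerDeal stands in for the storagemarket MinerDeal type; the pieceSealer
// interface and AddPiece signature below are assumptions for illustration,
// not the actual lotus / go-sectorbuilder API.
type MinerDeal struct{}

type pieceSealer interface {
	AddPiece(ctx context.Context, pieceSize uint64, pieceReader io.Reader) (uint64, error)
}

type minerNode struct {
	sealer pieceSealer
}

// OnDealComplete now receives the piece as a size plus io.Reader rather than
// a filestore path, so the node streams it straight into sealing and has no
// staged file to locate or delete afterwards.
func (n *minerNode) OnDealComplete(ctx context.Context, deal MinerDeal, pieceSize uint64, pieceReader io.Reader) (uint64, error) {
	return n.sealer.AddPiece(ctx, pieceSize, pieceReader)
}
```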

There is also some code, submitted by @rvagg, that makes PieceIO more modular so that external users of go-fil-markets can generate commP from a CAR file they made themselves (there will be reasons to do this soon).

For Discussion

One potential future inefficiency: in a scenario where you do wish to customize the handoff between markets and miner, this module still writes the piece file to disk internally, so if you also write it to disk outside the module, the piece ends up written twice.

* Refactor use of PadReader
* Don't pad internal CAR files; wrap in PadReader and pass around paddedSize
  rather than using car.Size() (see the sketch after this list)
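
Not part of the PR itself, but a rough sketch of the padding approach described in the second commit above (the zeroReader/padTo names are made up for illustration, not the actual go-fil-markets padreader package):

```go
package example

import "io"

// zeroReader yields an endless stream of zero bytes.
type zeroReader struct{}

func (zeroReader) Read(p []byte) (int, error) {
	for i := range p {
		p[i] = 0
	}
	return len(p), nil
}

// padTo wraps an unpadded CAR reader so that zero bytes are appended up to
// paddedSize; the file on disk stays unpadded and only paddedSize is passed
// around, rather than relying on car.Size().
func padTo(car io.Reader, carSize, paddedSize uint64) io.Reader {
	return io.MultiReader(car, io.LimitReader(zeroReader{}, int64(paddedSize-carSize)))
}
```
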
@hannahhoward changed the title from "Remove filestore as a shared concept, keep internal to StorageMarkets, pass direct io.reader to StorageMiner" to "Remove filestore as a go between with StorageMiner, pass direct io.reader to StorageMiner" on Jan 17, 2020
Remove need to have filestore access outside of storagemarket by passing a direct io.reader +
piecesize to OnDealComplete
}
_, err = f.Seek(0, io.SeekStart)
commitment, paddedSize, err := GeneratePieceCommitment(f, pieceSize)
Contributor
I see that you've fseek'ed the file back to 0 after the call to pio.carIO.WriteCar. I think it would be good manners to do the same after GeneratePieceCommitment.

Contributor
(if you stick with the fseekable approach)

}

-func (pio *pieceIO) GeneratePieceCommitment(payloadCid cid.Cid, selector ipld.Node) ([]byte, filestore.File, error) {
+func (pio *pieceIO) GeneratePieceCommitment(payloadCid cid.Cid, selector ipld.Node) ([]byte, filestore.File, uint64, error) {
Contributor
You're writing the CAR to a tempfile here. Some Linux distros (like Arch) use a ramdisk for /tmp. Does CreateTemp() write to /tmp? If so, I would expect this to be a source of OOMs for miners committing large sectors.

A potential solution would be to modify pieceIO#GeneratePieceCommitment to return a Reader instead of a File. You could use [os.Pipe](https://golang.org/pkg/os/#Pipe), passing the write side of the pipe to pio.carIO.WriteCar in a goroutine and returning the read side of the pipe.
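
For concreteness, a rough sketch of the suggested pipe approach (everything except os.Pipe is a placeholder name; writeCar stands in for pio.carIO.WriteCar):

```go
package example

import (
	"io"
	"os"
)

// streamCAR feeds the write side of an os.Pipe from writeCar in a goroutine
// and hands back the read side, so the CAR never has to land in a temp file.
func streamCAR(writeCar func(io.Writer) error) (io.ReadCloser, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return nil, err
	}
	go func() {
		// Closing the write end gives the reader EOF once writeCar is done;
		// with io.Pipe instead of os.Pipe, CloseWithError could also carry
		// a write error through to the reader.
		defer w.Close()
		_ = writeCar(w)
	}()
	return r, nil
}
```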

Collaborator Author
Go's temp file isn't written to a RAM disk -- it's created in whatever directory you specify.
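
For reference, a tiny sketch of what is meant, assuming the filestore passes an explicit staging directory to the standard library (createStagingFile and stagingDir are made-up names):

```go
package example

import (
	"io/ioutil"
	"os"
)

// createStagingFile: with a non-empty dir argument the temp file is created
// in that directory, not in os.TempDir() (which may be a tmpfs/ramdisk on
// some distros).
func createStagingFile(stagingDir string) (*os.File, error) {
	return ioutil.TempFile(stagingDir, "fstmp")
}
```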

Collaborator Author
I will look at piping.

id: deal.ProposalCid,
err: err,
}:
case <-p.stop:
Contributor
I can't tell from looking at this PR where you send a value to p.stop -- but I do see that you've got at least one anonymous goroutine (elsewhere in the file) which uses receipt of a value on p.stop as its signal to exit.

If only a single termination value is sent on p.stop, then I fear you'll end up with orphaned goroutines (only one of them will receive it).

Collaborator Author
p.stop signals by being closed; a closed channel can be received from any number of times, so every goroutine waiting on it will observe the shutdown.
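
As a self-contained illustration of that pattern (names are made up, not the provider's actual code):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	stop := make(chan struct{})
	var wg sync.WaitGroup

	// Several goroutines all wait on the same stop channel.
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			<-stop // a receive on a closed channel returns immediately
			fmt.Printf("goroutine %d shutting down\n", id)
		}(i)
	}

	// A single close() is observed by every receiver; no per-goroutine send
	// is needed, so none of them are orphaned.
	close(stop)
	wg.Wait()
}
```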

Contributor
Nice. I didn't know that about Golang channels.

@@ -179,7 +185,8 @@ func (p *Provider) staged(ctx context.Context, deal MinerDeal) (func(*MinerDeal)
Ref: deal.Ref,
DealID: deal.DealID,
},
-	string(deal.PiecePath),
+	paddedSize,
+	paddedReader,
Contributor
Very nice!

@laser (Contributor) left a comment

Looks good to me, modulo the temp file thing. Thanks for making this change.

refactor piece io functions to return file paths, also clean up and clarify path usage in filestore
@codecov-io

Codecov Report

Merging #49 into master will decrease coverage by 0.28%.
The diff coverage is 44.24%.


@@            Coverage Diff             @@
##           master      #49      +/-   ##
==========================================
- Coverage    21.5%   21.22%   -0.27%     
==========================================
  Files          31       31              
  Lines        2308     2310       +2     
==========================================
- Hits          496      490       -6     
- Misses       1759     1766       +7     
- Partials       53       54       +1
Impacted Files Coverage Δ
storagemarket/impl/provider.go 0% <0%> (ø) ⬆️
storagemarket/impl/client.go 0% <0%> (ø) ⬆️
storagemarket/impl/provider_states.go 0% <0%> (ø) ⬆️
storagemarket/impl/client_utils.go 32.26% <0%> (+1.96%) ⬆️
filestore/filestore.go 86.85% <100%> (+2.23%) ⬆️
filestore/file.go 76.48% <50%> (-10.19%) ⬇️
pieceio/pieceio.go 84.38% <88.24%> (-7.29%) ⬇️


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d03e1c2...edead00.

@hannahhoward hannahhoward merged commit 00ff81e into master Jan 18, 2020
@hannahhoward hannahhoward deleted the feat/pieceio-modular branch April 30, 2020 21:32