Automatically chunk files bigger than sector size #209

Closed
magik6k opened this issue Sep 18, 2019 · 9 comments

Comments

@magik6k
Contributor

magik6k commented Sep 18, 2019

Requiring users to do that manually is bad UX, and this should be reasonably simple.

@whyrusleeping
Member

Yeah, we should have a whole thing around users adding their files. Encryption and splitting your data redundantly across multiple miners should be thought about too.

@laser
Contributor

laser commented Jun 9, 2020

@magik6k @whyrusleeping -

Requiring users to do that manually is bad UX, and this should be reasonably simple.

We currently have a command which allows a user to import a file and get a single CID.

We also have a command which allows a user to propose a storage deal to a miner, providing a single CID.

What about adding a --split-boundary flag to the client import command?


10:06 $ ./lotus client import ~/.lotus/flarg
bafkreihdwdcefgh4dqkjv67uzcmw7ojee6xedzdetojuzjevtenxquvyku
# assumes catpix.zip is 1028MiB
lotus client import --split-boundary=1GiB ~/.lotus/catpix.zip
(1of2) bafkreihdwdcefgh4dqkjv67uzcmw7ojee6detojuzjevtenxquvykuxedz
(2of2) bafkreih5ffbd63eh5s7wcy253h5pgw6cfjpk6vvn7tdkigsdfsk3bwiiza

We'd need to make sure that the file names (post-split) made sense to the user:

10:09 $ ./lotus client local
bafkreihdwdcefgh4dqkjv67uzcmw7ojee6xedzdetojuzjevtenxquvyku .lotus/catpix.zip_1of2 1GiB ok
bafkreih5ffbd63eh5s7wcy253h5pgw6cfjpk6vvn7tdkigsdfsk3bwiiza .lotus/catpix.zip_2of2 4MiB ok
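The split arithmetic and naming behind the example above can be sketched in Go as follows. This is a minimal illustration only: the `Chunk` type and `SplitPlan` function are hypothetical names, not part of lotus, and the proposed `--split-boundary` flag does not exist. The sketch only plans byte ranges and `_NofM` names; it does not read the file or compute CIDs.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Chunk describes one piece of a split file. This type is hypothetical and
// is not part of the lotus codebase.
type Chunk struct {
	Name   string // display name, e.g. "catpix.zip_1of2"
	Offset int64  // byte offset of the chunk within the original file
	Size   int64  // chunk length in bytes
}

// SplitPlan returns the chunks produced by cutting a file of fileSize bytes
// every boundary bytes. The last chunk holds the remainder.
func SplitPlan(path string, fileSize, boundary int64) ([]Chunk, error) {
	if boundary <= 0 {
		return nil, fmt.Errorf("boundary must be positive, got %d", boundary)
	}
	if fileSize <= 0 {
		return nil, fmt.Errorf("file size must be positive, got %d", fileSize)
	}
	total := (fileSize + boundary - 1) / boundary // ceiling division
	base := filepath.Base(path)
	chunks := make([]Chunk, 0, total)
	for i := int64(0); i < total; i++ {
		offset := i * boundary
		size := boundary
		if remaining := fileSize - offset; remaining < size {
			size = remaining
		}
		chunks = append(chunks, Chunk{
			Name:   fmt.Sprintf("%s_%dof%d", base, i+1, total),
			Offset: offset,
			Size:   size,
		})
	}
	return chunks, nil
}

func main() {
	const MiB = int64(1) << 20
	// Mirrors the example above: a 1028MiB file with a 1GiB boundary.
	chunks, err := SplitPlan("catpix.zip", 1028*MiB, 1024*MiB)
	if err != nil {
		panic(err)
	}
	for _, c := range chunks {
		fmt.Printf("%s offset=%d size=%dMiB\n", c.Name, c.Offset, c.Size/MiB)
	}
}
```

For the 1028MiB example this plans two chunks, one of 1GiB and one of 4MiB, which matches the sizes shown in the listing above.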

If you have further ideas for how this should work, please post some example commands here. That would help me to understand what you had in mind.

@laser
Contributor

laser commented Jun 11, 2020

@arajasek - pinging you for some assistance rounding up some expected CLI interactions here :)

@jennijuju jennijuju added area/client/storage area/ux Area: UX need/team-input Hint: Needs Team Input labels Nov 3, 2020
@jennijuju jennijuju added P2 P2: Should be resolved effort/hours Effort: Hours and removed need/team-input Hint: Needs Team Input labels Nov 4, 2020
@jennijuju jennijuju added need/author-input Hint: Needs Author Input need/team-input Hint: Needs Team Input tribute and removed need/author-input Hint: Needs Author Input need/team-input Hint: Needs Team Input labels Jul 14, 2021
@jennijuju
Member

Users should be able to specify the sector size: 32 GiB or 64 GiB on mainnet.

@frrist frrist self-assigned this Jul 20, 2021
@frrist
Member

frrist commented Jul 20, 2021

Picking up this issue for my Tribute week. I gather that the UX described above is sufficient.

One thing I'll note: lotus client import accepts a path to a CAR file or a "normal file". If a CAR file is provided, no attempt to chunk it will be made. However, "normal files" will be chunked (based on a flag from the user), and a list of CIDs will be returned as described in the above comments.
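The rule described here (CAR files pass through untouched, normal files are chunked only when the user asks for it) could be sketched as the following decision function. The names `ImportPlan` and `PlanImport` are hypothetical, and the sketch assumes the caller already knows whether the input is a CAR file; it does not show how lotus would detect that.

```go
package main

import "fmt"

// ImportPlan is a hypothetical summary of what an import would produce.
type ImportPlan struct {
	Chunked bool  // whether the input is split into several pieces
	Pieces  int64 // number of CIDs the import would return
}

// PlanImport applies the rule from the comment above. A splitBoundary of 0
// stands for "flag not set" in this sketch.
func PlanImport(isCAR bool, fileSize, splitBoundary int64) (ImportPlan, error) {
	if fileSize <= 0 {
		return ImportPlan{}, fmt.Errorf("file size must be positive, got %d", fileSize)
	}
	if splitBoundary < 0 {
		return ImportPlan{}, fmt.Errorf("split boundary must not be negative, got %d", splitBoundary)
	}
	// CAR files are never chunked; without a boundary, or when the file
	// already fits within it, a normal file yields a single CID.
	if isCAR || splitBoundary == 0 || fileSize <= splitBoundary {
		return ImportPlan{Chunked: false, Pieces: 1}, nil
	}
	pieces := (fileSize + splitBoundary - 1) / splitBoundary // ceiling division
	return ImportPlan{Chunked: true, Pieces: pieces}, nil
}

func main() {
	const MiB = int64(1) << 20
	for _, isCAR := range []bool{true, false} {
		plan, err := PlanImport(isCAR, 1028*MiB, 1024*MiB)
		if err != nil {
			panic(err)
		}
		fmt.Printf("isCAR=%v chunked=%v pieces=%d\n", isCAR, plan.Chunked, plan.Pieces)
	}
}
```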

@laser
Contributor

laser commented Jul 20, 2021

@frrist - sounds good to me

@frrist
Member

frrist commented Jul 20, 2021

Thinking about this more carefully, and after exploring the related bits in the lotus codebase, I think this issue needs more detail -- and therefore is a poor fit for Tribute work. The time to spec and implement this likely spans weeks. Details around the following should be discussed before anyone decides to pick this up:

  • A more thoughtful design for how large content should be split up and added to the network. Simply splitting data along some boundary will likely be insufficient and could lead to worse UX, which leads to the next question:
  • How will data (deals) span multiple sectors, and how should it be retrieved once split? I think this format needs to be agreed upon before work here can begin, e.g. what is the "root" CID of data that is split across many sectors, and how does a user go about retrieving it? Furthermore, if the data is split across multiple miners, what does that request look like?
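To make the open questions above concrete, here is one possible shape for a record that ties split parts back together. It is purely illustrative: the thread leaves the format undecided, and the `Part` and `Manifest` types, the placeholder CIDs, and the miner IDs are all hypothetical. The sketch only checks that the parts tile the original byte range and returns them in retrieval order; it does not define a "root" CID.

```go
package main

import (
	"fmt"
	"sort"
)

// Part records where one piece of the original data lives. Hypothetical type.
type Part struct {
	CID    string // CID returned when the part was imported (placeholder strings here)
	Miner  string // miner holding the deal for this part (placeholder IDs here)
	Offset int64  // byte offset within the original file
	Size   int64  // length of the part in bytes
}

// Manifest lists all parts of one split file. Hypothetical type.
type Manifest struct {
	TotalSize int64
	Parts     []Part
}

// RetrievalOrder returns the parts sorted by offset after checking that they
// cover [0, TotalSize) with no gaps or overlaps.
func (m Manifest) RetrievalOrder() ([]Part, error) {
	parts := append([]Part(nil), m.Parts...) // copy so the caller's slice is untouched
	sort.Slice(parts, func(i, j int) bool { return parts[i].Offset < parts[j].Offset })
	var next int64
	for _, p := range parts {
		if p.Size <= 0 {
			return nil, fmt.Errorf("part %s has non-positive size %d", p.CID, p.Size)
		}
		if p.Offset != next {
			return nil, fmt.Errorf("part %s starts at %d, expected %d", p.CID, p.Offset, next)
		}
		next += p.Size
	}
	if next != m.TotalSize {
		return nil, fmt.Errorf("parts cover %d bytes, expected %d", next, m.TotalSize)
	}
	return parts, nil
}

func main() {
	m := Manifest{
		TotalSize: 1028 << 20,
		Parts: []Part{
			{CID: "cid-part-2", Miner: "miner-b", Offset: 1 << 30, Size: 4 << 20},
			{CID: "cid-part-1", Miner: "miner-a", Offset: 0, Size: 1 << 30},
		},
	}
	parts, err := m.RetrievalOrder()
	if err != nil {
		panic(err)
	}
	for _, p := range parts {
		fmt.Printf("fetch %s from %s (offset %d, %d bytes)\n", p.CID, p.Miner, p.Offset, p.Size)
	}
}
```

A record like this still has to be stored and addressed somewhere, which is the design question raised above and is not answered by the sketch.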

@jennijuju I think the tribute tag should be removed from this and its effort rating adjusted to weeks.

@frrist frrist removed their assignment Jul 20, 2021
@jennijuju
Member

@frrist thank you for looking into this. Agreeing on how large data should be stored in the network requires a dedicated project. That being said, can we keep the scope of this ticket small: lotus client import will chunk files into pieces that each fit in one sector and return a series of CIDs? Clients can then decide how they want to make deals with those CIDs.
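As a rough sketch of what "fit in one sector" means in numbers, the following estimates how many pieces a file would need for a given sector size. It assumes that the usable payload of a sector is 127/128 of its size because of Filecoin piece padding, and it ignores the extra bytes added by DAG and CAR encoding, so real limits would be somewhat lower. `MaxPayload` and `PiecesNeeded` are hypothetical names, not lotus functions.

```go
package main

import "fmt"

const GiB = int64(1) << 30

// MaxPayload returns the number of unpadded bytes assumed to fit in one
// sector: the sector size minus 1/128 reserved for piece padding.
// Encoding overhead (DAG, CAR) is deliberately ignored in this sketch.
func MaxPayload(sectorSize int64) int64 {
	return sectorSize - sectorSize/128
}

// PiecesNeeded returns how many sector-sized pieces a file of fileSize
// bytes would be split into under the assumption above.
func PiecesNeeded(fileSize, sectorSize int64) (int64, error) {
	if fileSize <= 0 {
		return 0, fmt.Errorf("file size must be positive, got %d", fileSize)
	}
	if sectorSize < 128 {
		return 0, fmt.Errorf("sector size too small: %d", sectorSize)
	}
	capacity := MaxPayload(sectorSize)
	return (fileSize + capacity - 1) / capacity, nil // ceiling division
}

func main() {
	// The two mainnet sector sizes mentioned earlier in the thread.
	for _, sector := range []int64{32 * GiB, 64 * GiB} {
		n, err := PiecesNeeded(100*GiB, sector)
		if err != nil {
			panic(err)
		}
		fmt.Printf("sector=%dGiB payload=%d bytes pieces for 100GiB=%d\n",
			sector/GiB, MaxPayload(sector), n)
	}
}
```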

@TippyFlitsUK TippyFlitsUK added the need/team-input Hint: Needs Team Input label Mar 30, 2022
@TippyFlitsUK TippyFlitsUK added P1 P1: Must be resolved P0 P0: Critical Blocker and removed P1 P1: Must be resolved P0 P0: Critical Blocker labels Jun 2, 2022
@TippyFlitsUK
Contributor

Hi 👋

The Legacy Lotus Markets sub-system reached EOL at the end of 31 January 2023.

This ticket is being marked as won't fix and closed as the Lotus Team will no longer be making any further fixes or enhancements to the legacy markets subsystem.

Please feel free to re-open this ticket in the new Boost markets sub-system repository at https://github.com/filecoin-project/boost if you feel that it is still relevant.

Many thanks 🙏

@TippyFlitsUK TippyFlitsUK added status/won't fix and removed P2 P2: Should be resolved need/team-input Hint: Needs Team Input tribute labels Feb 9, 2023