provide for data retention and storage payment #40

jimscarver · 2021-07-01T17:43:34Z

Introduction/Motivation/Abstract

It has been stated that RChain will eventually delete data that has not paid for continued storage. We cannot allow tuple space to grow without bound with reads and writes that are never likely to occur. Currently there is no distinction between items in tuple space: no record is kept of when they were created or last accessed.

There is no immediate need to solve all the retention issues at this time, but there is a need to ensure we are creating a tuple space that will allow for a retention policy in the future.

Design

A minimum implementation is a modification of tuple space to keep a least recently used list of tuples along with the block number of last access. Additional fields can be provided for in the future.

This would allow purging old tuples up to some block number.
Reads and writes would refresh the tuples they touch, keeping them alive.
Each access would support storage indirectly through the fees paid for access.
Parameters could be settable in a soft fork.

At some block height, the least recently used tuples could be dropped. After each subsequent propose, data can then be purged for the lower block height plus one. The number of blocks of age always kept may be constant or inflationary (TBD).
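As a rough illustration of the design above, here is a minimal sketch of a store that keeps tuples in least-recently-used order together with the block number of last access. It is a toy key-value model, not RSpace's actual API: `RetentionStore`, `retained_blocks`, and the method names are all hypothetical, and it assumes accesses arrive with non-decreasing block numbers.

```python
from collections import OrderedDict


class RetentionStore:
    """Toy stand-in for tuple space (hypothetical, not the RSpace API)."""

    def __init__(self, retained_blocks):
        # Hypothetical parameter: how many blocks of age are always kept.
        self.retained_blocks = retained_blocks
        # channel -> (value, last_access_block), least recently used first.
        self._entries = OrderedDict()

    def _touch(self, channel, value, block):
        # Record the access block and move the entry to the "recent" end.
        self._entries[channel] = (value, block)
        self._entries.move_to_end(channel)

    def write(self, channel, value, block):
        self._touch(channel, value, block)

    def read(self, channel, block):
        if channel not in self._entries:
            return None
        value, _ = self._entries[channel]
        self._touch(channel, value, block)  # a read refreshes the tuple
        return value

    def purge(self, current_block):
        """Drop entries last accessed before the cutoff; return their channels."""
        cutoff = current_block - self.retained_blocks
        dropped = []
        while self._entries:
            channel, (_, last_access) = next(iter(self._entries.items()))
            if last_access >= cutoff:
                break  # everything after this entry is at least as recent
            del self._entries[channel]
            dropped.append(channel)
        return dropped


store = RetentionStore(retained_blocks=100)
store.write("a", 1, block=1)
store.write("b", 2, block=1)
store.read("a", block=90)                 # refreshes "a"
dropped = store.purge(current_block=150)  # cutoff is block 50
```

Because the list is ordered by last access, a purge only has to walk the stale prefix rather than scan the whole store. Whether that ordering is cheap to maintain inside the real tuple space is exactly the overhead question that would need measuring.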

In the future, keeping the deploy ID for tuples could enable refreshing deploys.

A comprehensive solution might include a Rholang extension allowing names to listen for the deletion event.

Alternatives

?

dckc · 2021-07-01T19:40:42Z

I just mentioned space rental to @leithaus earlier this week; he said there's an architecture / design from 2018 that just hasn't been built yet.

Meanwhile...

> We cannot allow tuple space to grow without bound with reads and writes that are never likely to occur.

The last finalized state has exactly the channels that could ever be accessed again. So a straightforward way to garbage collect is to stop a node and resume from LFS.

This doesn't address the time-cost of storage. But storage does get cheaper over time... at a conference of librarians I saw a presentation on "endowed publication" where they charged, say, $1000 to store 1 GB "forever" using interest on an endowment.

jimscarver · 2021-07-01T23:32:37Z

Garbage collection is not the issue here, though it is certainly important.

The issue is reachable data that is ancient. URIs and deployerIds may be the tips of the iceberg for names in contracts and other data structures that are forever reachable.

> he said there's an architecture / design from 2018 that just hasn't been built yet.

I cannot think of a possible solution that does not require a hard fork, which is why I think we need to address this before the planned hard forks.

dckc · 2021-07-02T03:07:26Z

Before the planned hard forks seems quite ambitious. This seems like a Venus thing.

jimscarver · 2021-07-02T13:29:28Z

> The last finalized state

In the LFS we lose all the history, so we cannot distinguish the old from the new.

Keeping the block number of last access allows mimicking natural memory, where old memories that are not reinforced are lost.

> This seems like a Venus thing.

Greg has a different solution in mind, based on an emergent notion of time. Keeping the block number seems an easy fix for now, with some other notion of time added later, but it seems it will be delayed. Greg does not seem to be considering last access. I am not convinced the charging-for-storage scheme being considered is viable.

jimscarver · 2021-07-04T14:21:11Z

Greg wants a natural notion of relative time ordering to emerge. I agree.

I suggest that we use the last-accessed block number (or block time) to represent that ordering temporarily. I expect it to be a long time before we need to improve it, and it can enable the entropy required by natural systems. The common clock referenced is the blockchain itself.

Validators all need to agree on which tuples are expired, based on the maximum range of last-access block numbers that is considered valid. Tuples encountered with an earlier last access are removed, not executed, to stay in sync with other validators.

A periodic sweep can remove expired tuples, shrinking the size of tuple space and backing them up if desired. There is the possibility of resurrecting expired tuples from backups.
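To make the expiry rule and the sweep concrete, here is a small sketch under stated assumptions. The expiry test is a pure function of on-chain values (block numbers and a shared `max_age` parameter), which is what would let every validator reach the same answer. All names here (`is_expired`, `sweep`, `resurrect`, `max_age`) are hypothetical, and the tuple space is modelled as a plain dict rather than anything from RSpace.

```python
from typing import Dict, Tuple

# channel -> (data, last_access_block); a toy model, not the RSpace layout.
Space = Dict[str, Tuple[str, int]]


def is_expired(last_access_block: int, current_block: int, max_age: int) -> bool:
    """Deterministic rule: expired once the tuple is older than max_age blocks."""
    return current_block - last_access_block > max_age


def sweep(space: Space, current_block: int, max_age: int) -> Tuple[Space, Space]:
    """Split the space into live tuples and a backup of the expired ones."""
    live: Space = {}
    backup: Space = {}
    for channel, (data, last_access) in space.items():
        target = backup if is_expired(last_access, current_block, max_age) else live
        target[channel] = (data, last_access)
    return live, backup


def resurrect(live: Space, backup: Space, channel: str, current_block: int) -> None:
    """Restore an expired tuple from backup, counting the restore as an access."""
    data, _ = backup.pop(channel)
    live[channel] = (data, current_block)


space = {"old": ("x", 10), "new": ("y", 95)}
live, backup = sweep(space, current_block=100, max_age=50)
resurrect(live, backup, "old", current_block=100)
```

Treating resurrection as a fresh access is one possible design choice; without it, a restored tuple would be swept again immediately. How resurrection would be authorized and paid for is left open here, as it is in the discussion.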

A particular block number on a shard is a type of time ordering on blockchains, and we can prepare to add other time ordering types later. Behavioural types can ultimately enable Greg's dream. On blockchains, block height is a natural ordering that is available immediately, without the overhead of its emergence.

@tgrospic suggested we experiment and measure the overhead cost of keeping the last access in the tuple space.

jimscarver · 2021-07-06T15:04:52Z

In my career I have often documented a problem that was inevitable, but often the warning wasn't heeded until the crisis occurred. Most often I had already developed a plan for recovery. However, if the size of tuple space becomes unmanageable and we do not have last access for tuples, I see no path to resolution.

I think it is worth at least determining the overhead of keeping the last access for tuples in rspace.

The argument that we can always add last access later, assuming that is true, is okay as long as we anticipate needing it well in advance of a crisis.

jimscarver added the enhancement label Jul 1, 2021

jimscarver assigned leithaus, dckc and tgrospic Jul 4, 2021
