Update nep-0508.md
Updating based on @wacban's feedback
walnut-the-cat authored Sep 21, 2023
1 parent 396c426 commit 1f1d107
Showing 1 changed file with 18 additions and 5 deletions.
23 changes: 18 additions & 5 deletions neps/nep-0508.md
@@ -1,6 +1,6 @@
---
NEP: 508
Title: Resharding phase 2
Title: Resharding v2
Authors: Waclaw Banasik, Shreyan Gupta, Yoon Hong
Status: Draft
DiscussionsTo: https://github.com/near/nearcore/issues/8992
@@ -14,9 +14,11 @@ LastUpdated: 2023-09-19

In essence, this NEP is an extension of [NEP-40](https://github.com/near/NEPs/blob/master/specs/Proposals/0040-split-states.md), which focused on splitting one shard into multiple shards.

We are introducing the second phase of resharding, which supports splitting one shard into two within one epoch at a predetermined split boundary.
We are introducing resharding v2, which supports splitting one shard into two within one epoch at a predetermined split boundary. The NEP covers the performance improvements needed to make resharding feasible under the current state, as well as the actual resharding on mainnet and testnet (specifically, splitting shard 3 into two).

While the new approach addresses critical limitations left unsolved in NEP-40 and is expected to remain valid for the foreseeable future, it does not serve all use cases, such as dynamic resharding.


While the new approach addresses critical limitations left unsolved in NEP-40 and is expected to remain valid for the foreseeable future, it does not serve all use cases, such as dynamic resharding.

## Motivation

@@ -29,25 +31,35 @@ Currently, NEAR protocol has four shards. With more partners onboarding, we star
* Some form of State sync (centralized or decentralized) is enabled.
* Flat state is enabled.
* Shard split boundary is predetermined. In other words, the need for a shard split is decided manually.
* Merkle Patricia Trie is underlying data structure for the protocol.
* Merkle Patricia Trie is the underlying data structure for the protocol state.
* Minimal epoch gap between two resharding events is X.
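
To illustrate the "predetermined split boundary" assumption, here is a minimal sketch of how a boundary-based shard layout could map accounts to shards and how one shard could be split in two. The names `account_to_shard` and `split_shard`, the single-letter boundary values, and the rule that an account equal to a boundary falls into the upper shard are all assumptions made for illustration; they are not taken from this NEP or from nearcore.

```python
from bisect import bisect_right


def account_to_shard(boundaries, account_id):
    """Return the shard index for account_id.

    boundaries is a sorted list of boundary account strings; N boundaries
    define N + 1 shards. Assumption: an account equal to a boundary belongs
    to the upper shard, so the index is the count of boundaries <= account_id.
    """
    return bisect_right(boundaries, account_id)


def split_shard(boundaries, shard, new_boundary):
    """Return a new boundary list in which `shard` is split at new_boundary.

    The new boundary must fall strictly inside the shard being split,
    otherwise the split would leave one of the two halves empty by
    construction or would move accounts of a neighbouring shard.
    """
    if not 0 <= shard <= len(boundaries):
        raise ValueError("unknown shard index")
    lower = boundaries[shard - 1] if shard > 0 else None
    upper = boundaries[shard] if shard < len(boundaries) else None
    if lower is not None and new_boundary <= lower:
        raise ValueError("boundary is not inside the shard being split")
    if upper is not None and new_boundary >= upper:
        raise ValueError("boundary is not inside the shard being split")
    return boundaries[:shard] + [new_boundary] + boundaries[shard:]


# Hypothetical four-shard layout; these are not the real mainnet boundaries.
old_layout = ["d", "m", "t"]
# Split the last shard (index 3) in two at the hypothetical boundary "w".
new_layout = split_shard(old_layout, 3, "w")
```

Under these assumptions only accounts in the split shard can change their shard index, which is why the boundary can be fixed in advance and applied at an epoch boundary.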

### High level requirements

* Resharding should work even when validators stop tracking all shards.
* Resharding should work after stateless validation is enabled.
* Resharding should be fast enough so that both state sync and resharding can happen within one epoch.
* Resharding should not require additional hardware from nodes.
* ~~Resharding should not require additional hardware from nodes.~~
* This needs to be assessed during testing.
* Resharding should be fault tolerant
* Chain must not stall in case of resharding failure.
* A validator should be able to recover in case they go offline during resharding.
* For now, the aim is at least to allow a validator to rejoin after resharding has finished.
* No transaction or receipt should be lost during resharding.
* Resharding should work regardless of the number of existing shards.
* No app or tool should hardcode the number of shards any longer.

### Out of scope

* Dynamic resharding
* automatically scheduling resharding based on shard usage/capacity
* automatically determining the shard layout
* merging shards
* shard reshuffling
* shard boundary adjustment
* Shard determination logic (shard boundary is still determined by string value)
* Advanced failure handling
* If a validator goes offline during resharding, it can rejoin immediately and move forward as long as enough time is left to perform resharding again.
* TBD

### Required protocol changes
@@ -108,6 +120,7 @@ Other useful features that can be considered as a follow up:

* The resharding process is still not fully automated. Analyzing shard data, determining the split boundary, and triggering an actual shard split all need to be manually curated by a person.
* During resharding, a node is expected to do more work as it will have to apply changes twice (for the current shard and future shard).
* Increased potential for apps and tools to break without proper shard layout change handling.

### Backwards Compatibility
