Open Grant Proposal: Data Onboarding Metrics #858

Closed
Fatman13 opened this issue Aug 12, 2022 · 7 comments
@Fatman13

Fatman13 commented Aug 12, 2022

Open Grant Proposal: Data Onboarding Metrics

Name of Project: Data Onboarding Metrics - Venus

Proposal Category: Choose one of core-dev, devtools-libraries

Proposer: ipfs-force-community

(Optional) Technical Sponsor:

Do you agree to open source all work you do on behalf of this RFP and dual-license under MIT, APACHE2, or GPL licenses?: Yes

Project Description

One of the issues that new and even veteran SPs face every day when they onboard large numbers of sectors is getting a clear picture of the heartbeat of their storage system, so they can diagnose whatever has gone wrong in their pipeline. A thousand things can go wrong when moving sectors through an SP’s storage system: the chain head falling out of sync, messages stuck in the mpool, missed block-producing rounds, high API latency, and so on. SPs have to navigate these anomalies all the time and respond to them quickly.

This is where Data Onboarding Metrics for Venus Filecoin comes into play. We propose to build a series of critical metrics for each component of Venus Filecoin that reflect the live health of a storage system, so that operators know what is going on with their systems and can react to different situations instead of relying on guesswork, digging through tons of logs, or extensive dev-ops experience.


Value


We see many ways in which Data Onboarding Metrics could help SPs take back control of their storage systems instead of spending a lot of time troubleshooting a black box. We believe metrics provide the toolbox for SPs to minimize the impact of operational errors, to see whether windowPost messages are sent out in time, to monitor the time/latency of PoST computation, and much more, so that SPs do not get punished by the protocol unintentionally.

  • Live heartbeat map of all critical information about your storage system
  • Lower protocol penalties from mechanics such as PCD, PoST slashes, missed blocks, etc.
  • Easy integration with third-party monitoring solutions

Deliverables

| Milestone | Area | Deliverable | Funding (USD) |
| --- | --- | --- | --- |
| A | Spec | Begin collecting feedback from the community on what kinds of metrics they would like to see developed to ease their data-onboarding efforts | - |
| A | Spec | Initial design of the metrics system for each component: daemon, miner, messager, gateway, market, cluster | 5,000 |
| B | Impl | Implementation of metrics to be collected from the miner component as the first MVP | 5,000 |
| B | Doc | Initial documentation on configuration and usage of the metrics system | 3,000 |
| C | Impl | Implementation of metrics to be collected from the messager component | 3,000 |
| C | Impl | Implementation of metrics to be collected from the gateway component | 3,000 |
| C | Impl | Implementation of metrics to be collected from the daemon component, and finishing up all other chain service components | 3,000 |
| D | Impl | Implementation of metrics to be collected from the market component | 5,000 |
| E | Impl | Implementation of metrics to be collected from the cluster component | 5,000 |
| F | Spec | Release full documentation on the metrics system along with practical tutorials | 5,000 |
| G | Spec | Collect community feedback on additional metrics they would like to have after their initial tryouts | - |
| H | Impl | Revisit each component and add new metrics that the community deems necessary | 6,000 |
| - | - | Project Management: a dedicated project management budget will help coordinate work between different collaborators, as well as outreach to key stakeholders and key users | 5,000 |
| - | - | Total | 48,000 |

Development Roadmap


The development can be loosely broken down into three parts: 1) design, 2) implementation, and 3) maintenance of the metrics system.

Design

This phase includes milestones A and B in the deliverables table above. The team will collect ideas from the community, conceive the first design of the metrics system, and build a POC/MVP for the miner component. An embedded exporter that allows custom configuration will be included for easier integration with third-party tools. A metrics module will be added to the miner project, which may contain the parameters below for SPs to monitor their storage pipeline.

// latency for GetBaseInfo API
GetBaseInfoDuration   (Milliseconds)
// latency for ComputeTicket API
ComputeTicketDuration (Milliseconds)
// latency for IsRoundWinner API
IsRoundWinnerDuration (Milliseconds)
// latency for ComputeProof API
ComputeProofDuration (Seconds)

// number of blocks produced
NumberOfBlock (Dimensionless)
// number of rounds in which miner_id is the winner
NumberOfIsRoundWinner (Dimensionless)
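
As a rough illustration of how the latency parameters above could be collected, the sketch below wraps an API call in a timer and stores the result in milliseconds. It is only a sketch under stated assumptions: `LatencyRecorder`, `Timed` and the stand-in call are hypothetical names that do not come from the Venus code base, and a real build would forward the samples to whatever metrics library the miner adopts.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// LatencyRecorder is a hypothetical in-memory sink for latency samples.
// It is an assumption made for illustration, not part of Venus.
type LatencyRecorder struct {
	mu      sync.Mutex
	samples map[string][]float64 // metric name -> samples in milliseconds
}

// NewLatencyRecorder returns an empty recorder.
func NewLatencyRecorder() *LatencyRecorder {
	return &LatencyRecorder{samples: make(map[string][]float64)}
}

// Observe stores one sample for the named metric, converted to milliseconds.
func (r *LatencyRecorder) Observe(name string, d time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.samples[name] = append(r.samples[name], float64(d)/float64(time.Millisecond))
}

// Count returns how many samples were recorded for the named metric.
func (r *LatencyRecorder) Count(name string) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.samples[name])
}

// Timed runs call and records its wall-clock duration under name,
// whether or not the call succeeds.
func Timed(r *LatencyRecorder, name string, call func() error) error {
	start := time.Now()
	err := call()
	r.Observe(name, time.Since(start))
	return err
}

func main() {
	rec := NewLatencyRecorder()
	// Stand-in for a real GetBaseInfo API call.
	_ = Timed(rec, "GetBaseInfoDuration", func() error {
		time.Sleep(2 * time.Millisecond)
		return nil
	})
	fmt.Println("samples recorded:", rec.Count("GetBaseInfoDuration"))
}
```

Recording the duration even when the call fails keeps slow failures visible in the latency metric, which seems useful for the diagnosis scenarios described earlier.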

Implementation

This phase includes milestones C to E in the deliverables table above. The team will continue to collect ideas from the community while implementing the metrics system for the rest of the Venus components. The parameters that the metrics module is expected to adopt are listed below.

messager

// The metrics below are updated at per-wallet-address granularity
WalletBalance (Dimensionless)
WalletDBNonce (Dimensionless)
WalletChainNonce (Dimensionless)

// Current number of messages waiting for venus-messager to fill in parameters such as signature, gas usage, nonce, etc.
// This metric is updated at per-wallet-address granularity
NumOfUnFillMsg (Dimensionless)
// Current number of messages for which venus-messager has filled in parameters such as signature, gas usage, nonce, etc.
// This metric is updated at per-wallet-address granularity
NumOfFillMsg (Dimensionless)
// Current number of messages for which venus-messager has failed to fill in parameters such as signature, gas usage, nonce, etc.
// This metric is updated at per-wallet-address granularity
NumOfFailedMsg (Dimensionless)

// Current number of messages that have not landed on-chain for more than 3 minutes
NumOfMsgBlockedThreeMinutes (Dimensionless)
// Current number of messages that have not landed on-chain for more than 5 minutes
NumOfMsgBlockedFiveMinutes (Dimensionless)

// Number of messages selected by venus-messager during the last round of message pushing
SelectedMsgNumOfLastRound (Dimensionless)
// Number of messages pushed by venus-messager during the last round of message pushing
ToPushMsgNumOfLastRound (Dimensionless)
// Number of messages expired by venus-messager during the last round of message pushing
ExpiredMsgNumOfLastRound (Dimensionless)
// Number of messages that encountered errors during the last round of message pushing
ErrMsgNumOfLastRound (Dimensionless)

// Current difference between the chain head time and the system time of the venus-messager machine
ChainHeadStableDelay (Seconds)
// Histogram of the difference between the chain head time and the system time of the venus-messager machine
ChainHeadStableDuration (Seconds)
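
To make the blocked-message and chain-head-delay parameters above more concrete, here is a minimal sketch of how they might be derived from timestamps. The `PendingMsg` type, its fields and both function names are assumptions made for illustration; they are not taken from venus-messager. Times are plain Unix seconds to keep the example small.

```go
package main

import "fmt"

// PendingMsg is a hypothetical view of a message that has been pushed
// but has not yet been seen on-chain.
type PendingMsg struct {
	ID           string
	PushedAtUnix int64 // Unix seconds at which the message was pushed
}

// CountBlocked returns how many messages have been pending for strictly
// longer than thresholdSeconds at time nowUnix.
func CountBlocked(msgs []PendingMsg, nowUnix, thresholdSeconds int64) int {
	n := 0
	for _, m := range msgs {
		if nowUnix-m.PushedAtUnix > thresholdSeconds {
			n++
		}
	}
	return n
}

// ChainHeadDelay returns the local time minus the chain head timestamp,
// in seconds. A growing value suggests the node is falling behind.
func ChainHeadDelay(nowUnix, headUnix int64) int64 {
	return nowUnix - headUnix
}

func main() {
	const now int64 = 1000
	msgs := []PendingMsg{
		{ID: "a", PushedAtUnix: now - 60},  // pending for 1 minute
		{ID: "b", PushedAtUnix: now - 240}, // pending for 4 minutes
		{ID: "c", PushedAtUnix: now - 360}, // pending for 6 minutes
	}
	fmt.Println("NumOfMsgBlockedThreeMinutes:", CountBlocked(msgs, now, 3*60))
	fmt.Println("NumOfMsgBlockedFiveMinutes:", CountBlocked(msgs, now, 5*60))
	fmt.Println("ChainHeadStableDelay:", ChainHeadDelay(now, now-45))
}
```

In this sketch a message older than five minutes is counted in both buckets, so the three-minute value is always at least as large as the five-minute one. Whether the final metrics behave that way is a design choice for the implementation.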

gateway

// Number of wallets connected to the gateway
WalletCount
// Number of wallet addresses connected to the gateway
WalletAddressCount
// IP of each remote wallet connected to the gateway
WalletIPAddress

// Number of SPs connected to the gateway
SPCount
// Number of SP addresses connected to the gateway
SPAddressCount
// IP of each remote SP connected to the gateway
SPIPAddress

// Number of signatures initiated by the gateway
SignCount

market

// Count of storage deals accepted
StorageDealAccepted
// Number of active data transfers
NumberOfActiveTransfer
// Speed of data transfer, per transfer, unit = Mbps
DataTransferSpeed
// The rate of successful data transfers
SucessTransferRate
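
The two derived market parameters above could be computed along the following lines. This is a sketch only: the function names are hypothetical, and it assumes that Mbps means 10^6 bits per second and that the success rate is taken over finished transfers (succeeded plus failed).

```go
package main

import "fmt"

// TransferSpeedMbps converts a byte count moved over elapsedSeconds into
// megabits per second, assuming 1 Mbps = 1,000,000 bits per second.
func TransferSpeedMbps(bytes uint64, elapsedSeconds float64) float64 {
	if elapsedSeconds <= 0 {
		return 0 // avoid dividing by zero for transfers that just started
	}
	return float64(bytes) * 8 / 1e6 / elapsedSeconds
}

// SuccessRate returns the fraction of finished transfers that succeeded,
// or 0 when no transfer has finished yet.
func SuccessRate(succeeded, failed uint64) float64 {
	total := succeeded + failed
	if total == 0 {
		return 0
	}
	return float64(succeeded) / float64(total)
}

func main() {
	// 125 MB moved in 10 seconds.
	fmt.Println("DataTransferSpeed:", TransferSpeedMbps(125_000_000, 10))
	// 3 transfers succeeded, 1 failed.
	fmt.Println("success rate:", SuccessRate(3, 1))
}
```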

daemon

TBD

cluster

// Count of new sectors, per miner_id 
SectorManagerNewSector

// Count of preCommit, per miner_id 
SectorManagerPreCommitSector

// Count of commit, per miner_id 
SectorManagerCommitSector

// time of computing winningPost, per miner_id, unit = Seconds
ProverWinningPostDuration

// time of computing WindowPost, per miner_id, unit = Minutes
ProverWindowPostDuration

// Completion rate for partitions that have passed windowPost, per miner_id
// E.g.: ProverWindowPostCompleteRate=0.9 when 9 out of 10 partitions complete windowPost submission
ProverWindowPostCompleteRate

// Latency of sector manager API calls, unit = ms
APIRequestDuration
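
The per-miner completion rate described above can be sketched as a simple aggregation over partition results. `PartitionResult` and the miner ID used here are hypothetical and only serve the illustration; how the cluster component actually tracks partitions is not specified in this proposal.

```go
package main

import "fmt"

// PartitionResult is a hypothetical record of one partition in the
// current proving window.
type PartitionResult struct {
	MinerID   string
	Submitted bool // true once the windowPost for this partition was submitted
}

// WindowPostCompleteRate returns, per miner ID, the fraction of partitions
// whose windowPost has been submitted.
func WindowPostCompleteRate(results []PartitionResult) map[string]float64 {
	done := make(map[string]int)
	total := make(map[string]int)
	for _, r := range results {
		total[r.MinerID]++
		if r.Submitted {
			done[r.MinerID]++
		}
	}
	rates := make(map[string]float64, len(total))
	for id, n := range total {
		rates[id] = float64(done[id]) / float64(n)
	}
	return rates
}

func main() {
	var results []PartitionResult
	for i := 0; i < 10; i++ {
		// Nine of the ten partitions are submitted; the first one is not.
		results = append(results, PartitionResult{MinerID: "f01000", Submitted: i != 0})
	}
	fmt.Println(WindowPostCompleteRate(results)["f01000"]) // prints 0.9
}
```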

Note that none of these metrics are final; more parameters may be added as the community sees fit.
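
For the embedded exporter mentioned in the design phase, one possible shape is an HTTP endpoint that a third-party monitoring tool can scrape. The sketch below is an assumption, not the planned implementation: it renders gauges in the style of the Prometheus text exposition format, which this proposal does not commit to, and `RenderText`, `MetricsHandler` and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
	"strconv"
	"strings"
)

// RenderText renders gauges as plain text, one "# TYPE" line and one
// "name value" line per metric, sorted by name for stable output.
func RenderText(gauges map[string]float64) string {
	names := make([]string, 0, len(gauges))
	for name := range gauges {
		names = append(names, name)
	}
	sort.Strings(names)

	var b strings.Builder
	for _, name := range names {
		value := strconv.FormatFloat(gauges[name], 'g', -1, 64)
		fmt.Fprintf(&b, "# TYPE %s gauge\n%s %s\n", name, name, value)
	}
	return b.String()
}

// MetricsHandler serves a fresh snapshot of the gauges on every request.
func MetricsHandler(snapshot func() map[string]float64) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		fmt.Fprint(w, RenderText(snapshot()))
	})
}

func main() {
	// Made-up values, for illustration only.
	snapshot := func() map[string]float64 {
		return map[string]float64{"NumberOfBlock": 12, "NumberOfIsRoundWinner": 15}
	}
	mux := http.NewServeMux()
	mux.Handle("/metrics", MetricsHandler(snapshot))
	// A real component would now call http.ListenAndServe(addr, mux), with
	// addr taken from its configuration. Here we only print the payload.
	fmt.Print(RenderText(snapshot()))
}
```

Keeping the rendering separate from the handler makes the output easy to test without starting a server, and would let the same snapshot feed other export formats later.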

Maintenance

This phase includes milestones F to H in the deliverables table above. The team will continue to collect ideas and feedback from the community while iterating on the metrics system for all Venus components. Documentation and easy-to-follow tutorials will be produced to help the metrics system get adopted by the broader community. We hope that once this phase is done, SPs will have the tools they need to remove obstacles when onboarding large numbers of sectors.

Total Budget Requested

The total budget requested is $48,000. The breakdown of the budget follows the deliverables of each milestone, as defined above.


Maintenance and Upgrade Plans

The goal of the team is to support the metrics system long term, which includes continuously adding more critical parameters that the community deems worth monitoring, thereby easing the process of onboarding large amounts of data to the network.


Team

Team Members

Force community engineering team

Team Member LinkedIn Profiles

Team Website

https://forcecommunity.io/

Relevant Experience

Force community has been an active contributor to the Web3 ecosystem in general and the Filecoin ecosystem in particular. The engineering team from Force community has a track record of contributing code to Lotus as far back as Testnet and Space Race.

Team code repositories

https://github.com/ipfs-force-community

Additional information

Force community is committed to becoming a major contributor to Web3 infrastructure, and we see Filecoin at the core of the big Web3 migration. We hope to fast-track Web3 adoption by contributing our software development capacity to the cause and joining hands with all other ecosystem developers around the globe on this historic journey!

@ErinOCon
Collaborator

Hi @Fatman13, thank you for your proposal! We are currently reviewing this grant and expect to have more information available next week.

@ErinOCon
Collaborator

Hi @Fatman13, thank you for your ongoing patience! This grant is still under our review. We will be in touch as soon as we have completed connecting with our ecosystem experts.

@ErinOCon
Collaborator

Hi @Fatman13, can you confirm if the metrics are venus-specific or network-wide? Many thanks!

@Fatman13
Author

Fatman13 commented Sep 20, 2022

Hi @Fatman13, can you confirm if the metrics are venus-specific or network-wide? Many thanks!

The metrics are specific to Venus and are being built into Venus, with an embedded exporter for a front end to consume.

@ErinOCon
Collaborator

Thanks, @Fatman13! This grant has been approved. Would you like us to use the contact information on file for this grant?

@Fatman13
Author

Fatman13 commented Sep 30, 2022

Great to hear that! The email will be venus@ipfsforce.com. Thank you so much!

@ErinOCon
Collaborator

Thanks, @Fatman13!
