
feat(torii): limit number of blocks processed in one go #2505

Merged
1 commit merged into main on Oct 9, 2024

Conversation

lambda-0x (Collaborator) commented Oct 8, 2024

Summary by CodeRabbit

  • New Features
    • Introduced a new command-line argument blocks_chunk_size for configuring the number of blocks to process before committing to the database, enhancing user control over data processing.
  • Improvements
    • Updated processing logic to handle block chunks more efficiently, improving performance during data fetching and event processing.


codecov bot commented Oct 8, 2024

Codecov Report

Attention: Patch coverage is 12.50000% with 7 lines in your changes missing coverage. Please review.

Project coverage is 67.73%. Comparing base (e591364) to head (67dc049).
Report is 4 commits behind head on main.

Files with missing lines | Patch % | Lines
crates/torii/core/src/engine.rs | 16.66% | 5 Missing ⚠️
bin/torii/src/main.rs | 0.00% | 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2505   +/-   ##
=======================================
  Coverage   67.73%   67.73%           
=======================================
  Files         388      388           
  Lines       50421    50427    +6     
=======================================
+ Hits        34153    34158    +5     
- Misses      16268    16269    +1     

☔ View full report in Codecov by Sentry.

Base automatically changed from feat/torii/ercs to main October 9, 2024 14:02

coderabbitai bot commented Oct 9, 2024

Walkthrough

Ohayo, sensei! This pull request introduces a new command-line argument blocks_chunk_size to the Args struct in the torii binary. This allows users to specify how many blocks to process before committing to the database, with a default of 10240. The EngineConfig struct is updated to include this field, and the fetch_data and fetch_range methods in the Engine struct are modified to handle chunked processing. A new process_tasks method is also added to manage concurrency during event processing.
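For orientation, here is a minimal sketch of how such an argument can be wired from the CLI into an engine config with clap's derive API. The field name follows the walkthrough, but both structs are simplified stand-ins, not the actual torii code.

// Simplified sketch, not the torii source: a CLI flag parsed by clap and
// forwarded into the engine configuration.
use clap::Parser;

#[derive(Parser, Debug)]
struct Args {
    /// Number of blocks to process before committing to the database.
    #[arg(long, default_value_t = 10240)]
    blocks_chunk_size: u64,
}

#[derive(Debug, Clone)]
struct EngineConfig {
    blocks_chunk_size: u64,
}

fn main() {
    let args = Args::parse();
    let config = EngineConfig { blocks_chunk_size: args.blocks_chunk_size };
    println!("committing every {} blocks", config.blocks_chunk_size);
}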

Changes

File Path | Change Summary
bin/torii/src/main.rs | Added new command-line argument blocks_chunk_size to the Args struct.
crates/torii/core/src/engine.rs | Added pub blocks_chunk_size: u64 to the EngineConfig struct; default set to 10240.
crates/torii/core/src/engine.rs | Updated fetch_data and fetch_range methods to process blocks in chunks based on the new config.
crates/torii/core/src/engine.rs | Introduced new process_tasks method for managing concurrency during event processing.
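To make the chunked processing described above concrete, here is a rough sketch of a chunk-bounded indexing loop. fetch_range and commit are placeholders, and the real engine's cursor handling and return types differ.

// Illustrative only: process at most `blocks_chunk_size` blocks per pass,
// committing the cursor after each chunk. `head` is the last committed block.
fn index_in_chunks(mut head: u64, latest: u64, blocks_chunk_size: u64) {
    while head < latest {
        let to = (head + blocks_chunk_size).min(latest);
        fetch_range(head + 1, to); // process blocks (head, to]
        commit(to);                // persist progress before the next chunk
        head = to;
    }
}

fn fetch_range(from: u64, to: u64) {
    println!("processing blocks {from}..={to}");
}

fn commit(head: u64) {
    println!("committed cursor at block {head}");
}

fn main() {
    index_in_chunks(0, 25_000, 10_240);
}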

Possibly related PRs

Suggested reviewers

  • Larkooo


coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (3)
bin/torii/src/main.rs (2)

117-119: Ohayo, sensei! New argument looks good, but let's add some spice to the docs!

The new blocks_chunk_size argument is a great addition! It allows users to fine-tune the indexing process. However, to make it even more user-friendly, consider adding a brief explanation of its impact on performance and memory usage in the argument's help text.

Here's a suggestion to enhance the documentation:

 /// Number of blocks to process before commiting to DB
-#[arg(long, default_value = "10240")]
+#[arg(long, default_value = "10240", help = "Number of blocks to process before committing to DB. Higher values may improve performance but increase memory usage.")]
 blocks_chunk_size: u64,
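As an aside on the suggestion above: with clap's derive API, the doc comment itself is rendered as the argument's help text, so the extra guidance could also live there instead of a separate help attribute. A sketch of that alternative follows; the Args struct is simplified and not torii's.

// Sketch only: the `///` doc comment becomes the flag's help text under clap derive.
use clap::Parser;

#[derive(Parser, Debug)]
struct Args {
    /// Number of blocks to process before committing to the DB.
    /// Higher values can improve throughput but increase memory usage.
    #[arg(long, default_value_t = 10240)]
    blocks_chunk_size: u64,
}

fn main() {
    let args = Args::parse();
    println!("blocks_chunk_size = {}", args.blocks_chunk_size);
}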

246-246: Ohayo again, sensei! The engine config update looks solid!

The blocks_chunk_size is correctly added to the EngineConfig initialization. This ensures that the engine respects the user's preference (or the default value) for block processing.

To improve code readability, consider aligning the new field with the others:

 EngineConfig {
     max_concurrent_tasks: args.max_concurrent_tasks,
     start_block: 0,
-    blocks_chunk_size: args.blocks_chunk_size,
+    blocks_chunk_size:    args.blocks_chunk_size,
     events_chunk_size: args.events_chunk_size,
     index_pending: args.index_pending,
     polling_interval: Duration::from_millis(args.polling_interval),
     flags,
 },
crates/torii/core/src/engine.rs (1)

291-291: Ohayo, sensei! Let's tackle the TODO for parallelizing data fetching.

Now that blocks are processed in chunks, implementing parallel data fetching could significantly improve performance.

Would you like assistance in implementing parallel fetching? I can help with code suggestions or open a GitHub issue to track this enhancement.
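For reference, one common shape for such parallel fetching, sketched under the assumption that per-range fetches are independent. It assumes the futures and tokio crates; fetch_chunk is a stand-in, not a torii API.

// Sketch only: concurrent fetching of independent block ranges with `buffer_unordered`.
use futures::stream::{self, StreamExt};

async fn fetch_chunk(from: u64, to: u64) -> Result<Vec<u64>, String> {
    // Placeholder: pretend each block number in the range was fetched.
    Ok((from..=to).collect())
}

async fn fetch_chunks_parallel(
    chunks: Vec<(u64, u64)>,
    max_in_flight: usize,
) -> Result<Vec<Vec<u64>>, String> {
    // Run up to `max_in_flight` fetches at a time; results complete out of
    // order, so ordering (and therefore commit order) must be restored later.
    stream::iter(chunks)
        .map(|(from, to)| fetch_chunk(from, to))
        .buffer_unordered(max_in_flight)
        .collect::<Vec<_>>()
        .await
        .into_iter()
        .collect()
}

#[tokio::main]
async fn main() -> Result<(), String> {
    let ranges = vec![(0, 9), (10, 19), (20, 29)];
    let fetched = fetch_chunks_parallel(ranges, 2).await?;
    println!("fetched {} ranges", fetched.len());
    Ok(())
}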

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed between the base of the PR (e591364) and head (67dc049).

📒 Files selected for processing (2)
  • bin/torii/src/main.rs (2 hunks)
  • crates/torii/core/src/engine.rs (3 hunks)
🧰 Additional context used
🔇 Additional comments (4)
bin/torii/src/main.rs (1)

246-246: Ohayo once more, sensei! Let's make sure our engine is tuned up!

The blocks_chunk_size configuration looks good, but we should verify that the Engine implementation correctly utilizes this new parameter. This ensures that the chunked processing works as intended, maintaining data consistency and optimizing performance.

Could you please run the following script to check the Engine implementation?

This will help us confirm that the Engine is properly implementing the chunked processing based on the new blocks_chunk_size parameter.

✅ Verification successful

Ohayo once more, sensei! It looks like engine.rs wasn't found at the expected location. Let's broaden our search to locate the correct implementation.


Ohayo once more, sensei! The blocks_chunk_size parameter is correctly utilized within the Engine implementation. Everything looks good!

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that the Engine implementation correctly uses the blocks_chunk_size parameter

# Search for blocks_chunk_size usage in the Engine implementation
rg --type rust 'blocks_chunk_size' crates/torii-core/src/engine.rs

# Look for any methods that might be using this parameter for chunked processing
rg --type rust 'fn\s+process_blocks|fn\s+index_blocks' crates/torii-core/src/engine.rs

Length of output: 299


Script:

#!/bin/bash
# Description: Locate the Engine implementation and verify usage of blocks_chunk_size

# Search for blocks_chunk_size usage across the entire codebase
rg 'blocks_chunk_size'

# Search for engine-related files in case the path has changed
rg --files | rg 'engine\.rs'

Length of output: 469

crates/torii/core/src/engine.rs (3)

134-134: Ohayo, sensei! Great addition of blocks_chunk_size to EngineConfig.

Introducing blocks_chunk_size provides better control over block processing, enhancing the engine's flexibility and scalability.


146-146: Setting default blocks_chunk_size to 10240.

The default value appears reasonable for standard workloads, ensuring efficient processing without overwhelming system resources.
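For illustration, this is how such a default is typically expressed on a config struct; the real EngineConfig carries more fields than shown here.

// Illustrative only: a simplified EngineConfig defaulting the new field to 10240.
#[derive(Debug, Clone)]
pub struct EngineConfig {
    pub blocks_chunk_size: u64,
}

impl Default for EngineConfig {
    fn default() -> Self {
        Self { blocks_chunk_size: 10240 }
    }
}

fn main() {
    assert_eq!(EngineConfig::default().blocks_chunk_size, 10240);
}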


303-303: ⚠️ Potential issue

Potential off-by-one error when incrementing from.

Incrementing from when it's not zero might skip processing the block at from. Please verify if this behavior is intentional.

For reference, the block range calculation in question:

let from = cursors.head.unwrap_or(0);
let total_remaining_blocks = latest_block_number - from;
let blocks_to_process = total_remaining_blocks.min(self.config.blocks_chunk_size);
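To make the concern concrete, a tiny illustrative check of the two possible cursor semantics (not torii code): if head records the last block already processed, resuming from head + 1 is correct; if it records the next block still to process, the increment skips one block.

// Illustrative only: `head + 1` is correct when `head` is the last fully
// processed block; it skips a block if `head` is the next block to process.
fn start_block(head: Option<u64>) -> u64 {
    head.map(|last_processed| last_processed + 1).unwrap_or(0)
}

fn main() {
    assert_eq!(start_block(None), 0);       // fresh index starts at block 0
    assert_eq!(start_block(Some(99)), 100); // resume after block 99
}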

⚠️ Potential issue

Possible underflow risk when calculating total_remaining_blocks.

If from is greater than latest_block_number, subtracting may cause an underflow panic. Consider adding a check to handle this scenario gracefully.

Apply this change to prevent potential underflow:

+if from > latest_block_number {
+    return Ok(FetchDataResult::None);
+}
 let total_remaining_blocks = latest_block_number - from;

Committable suggestion was skipped due to low confidence.
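As an alternative to the explicit guard, saturating arithmetic yields zero remaining blocks instead of panicking when the cursor is ahead of the chain head; whether an early return is still desirable then depends on the surrounding fetch logic. A small sketch:

// Sketch only: saturating_sub avoids the underflow panic without a branch.
fn remaining_blocks(latest_block_number: u64, from: u64) -> u64 {
    latest_block_number.saturating_sub(from)
}

fn main() {
    assert_eq!(remaining_blocks(100, 90), 10);
    assert_eq!(remaining_blocks(90, 100), 0); // cursor ahead of head: nothing to index
}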

@lambda-0x lambda-0x merged commit 975f3e4 into main Oct 9, 2024
13 of 15 checks passed
@lambda-0x lambda-0x deleted the torii-chunk-blocks branch October 9, 2024 19:41