feat: change adjust max io from max threads to max io requests #8321

Merged (4 commits) on Oct 20, 2022

BohuTANG
Member

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

With object storage, read throughput grows with the number of concurrent read requests until the network bandwidth is saturated. We already apply this mechanism when reading snapshot and segment files (see #8153).

This PR speeds up block file reads by adding more FuseEngineSource source pipes to the pipeline:
the number of pipes changes from max_threads to max_storage_io_requests.
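
A minimal sketch of the idea, not the actual Databend code: the number of source pipes is derived from the I/O request limit rather than from the CPU thread limit. The `Settings` struct, both helper functions, the capping by block count, and the values 16 and 1000 are assumptions made for illustration; only the setting names `max_threads` and `max_storage_io_requests` come from this PR.

```rust
use std::cmp::{max, min};

/// Hypothetical settings holder. The field names mirror the two settings
/// named in this PR; the struct itself is an illustration only.
struct Settings {
    max_threads: usize,
    max_storage_io_requests: usize,
}

/// Before: the number of source pipes is bounded by the CPU thread setting.
/// Capping by `block_count` (and flooring at 1) is an assumption of this sketch.
fn source_pipes_before(settings: &Settings, block_count: usize) -> usize {
    max(1, min(settings.max_threads, block_count))
}

/// After: the number of source pipes is bounded by the I/O request setting,
/// so many more block reads can be in flight against object storage at once.
fn source_pipes_after(settings: &Settings, block_count: usize) -> usize {
    max(1, min(settings.max_storage_io_requests, block_count))
}

fn main() {
    // Illustrative values, not Databend defaults.
    let settings = Settings {
        max_threads: 16,
        max_storage_io_requests: 1000,
    };
    // block_count reported by fuse_snapshot for the test table.
    let block_count = 1_583_761;

    println!("pipes before: {}", source_pipes_before(&settings, block_count));
    println!("pipes after:  {}", source_pipes_after(&settings, block_count));
}
```

Because block reads against object storage are mostly latency bound, the useful degree of parallelism is set by how many requests can be outstanding, not by how many cores are available, which is why the two limits are kept separate here.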

Performance test:

Table

mysql> select * from fuse_snapshot('db7861', 't7861') limit 1\G;

*************************** 1. row ***************************
         snapshot_id: 074b5e9a528b4543ada3697d5abb0b44
   snapshot_location: 8/8074/_ss/074b5e9a528b4543ada3697d5abb0b44_v1.json
      format_version: 1
previous_snapshot_id: 8884c4cdb4904dcea0ff66dce61a664a
       segment_count: 1545888
         block_count: 1583761
           row_count: 15335500000
  bytes_uncompressed: 19065197766050
    bytes_compressed: 6001882777921
          index_size: 11814144308
           timestamp: 2022-10-17 03:47:31.377732
1 row in set (13.73 sec)
Read 1 rows, 229.00 B in 13.725 sec., 0.07285775055708306 rows/sec., 16.68 B/sec.

select sum(c8) from t7861

main branch: 1 hour 15 minutes, 26 MiB/sec
this PR: 4 minutes, 491 MiB/sec

Fixes #8263

@mergify mergify bot added the pr-feature this PR introduces a new feature to the codebase label Oct 19, 2022
@BohuTANG
Member Author

Wait for #8319 to merge.

@BohuTANG
Member Author

@mergify update

@mergify
Contributor

mergify bot commented Oct 19, 2022

update

✅ Branch has been successfully updated

@BohuTANG BohuTANG marked this pull request as ready for review October 19, 2022 14:11
@BohuTANG BohuTANG requested a review from zhang2014 October 19, 2022 14:12
@Xuanwo
Member

Xuanwo commented Oct 19, 2022

@mergify update

@BohuTANG BohuTANG marked this pull request as draft October 19, 2022 23:10
@BohuTANG BohuTANG marked this pull request as ready for review October 20, 2022 01:43
@BohuTANG BohuTANG merged commit b63a5fe into databendlabs:main Oct 20, 2022
Labels: pr-feature (this PR introduces a new feature to the codebase)
Linked issue: performance: try to fast IO read for FuseTableSource
3 participants