proposal: Use opendal to handle the cache storage operations #1404
Comments
Looks interesting |
Also, do you know any projects using opendal? |
Take s3 as an example,
databend is a cloud data warehouse which uses opendal heavily. |
Sure but I would like to see some benchmark :) The devil is in the details
Are you aware of other projects? |
Agreed. How about this benchmark plan?
Small project like deepeth/mars. No others so far to my knowledge. I'm seeking to expand the usage too. |
opendal was created precisely to simplify access to data. I think opendal is a good fit for sccache, taking over all the data storage services that might need to be considered without having to bother with abstractions and complex dependencies. And vice versa, sccache is a good project to test and validate opendal designs and implementations. Although there may be a lack of more projects using opendal at the moment, I thought it would be worthwhile to try it out on sccache. |
Sure, I just don't want to add a critical dependency which could be unmaintained in a few months |
I did a simple benchmark.
Setup
Test with databend at commit b24be85. Both sccache and cccache use the same local minio, with different buckets.
Sccache
sccache v0.3.1, built with release profile under stable rust.
export RUSTC_WRAPPER=/home/xuanwo/Code/mozilla/sccache/target/release/sccache
export CARGO_INCREMENTAL=0
export SCCACHE_BUCKET=sccache
export SCCACHE_ENDPOINT=127.0.0.1:9900
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
cargo build
Cccache
cccache at commit c1a725f0, built with release profile under stable rust.
export RUSTC_WRAPPER=/home/xuanwo/Code/xuanwo/cccache/target/release/cccache
export CARGO_INCREMENTAL=0
export SCCACHE_BUCKET=cccache
export SCCACHE_ENDPOINT=127.0.0.1:9900
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
cargo build
Result
Without any cache
Finished dev [unoptimized + debuginfo] target(s) in 2m 45s
cargo build 1722.26s user 174.20s system 1147% cpu 2:45.30 total
With sccache
First build
Finished dev [unoptimized + debuginfo] target(s) in 3m 45s
cargo build 407.65s user 76.37s system 214% cpu 3:46.04 total
Build after cargo clean
Finished dev [unoptimized + debuginfo] target(s) in 2m 38s
cargo build 325.56s user 59.88s system 243% cpu 2:38.17 total
With cccache
First build
Finished dev [unoptimized + debuginfo] target(s) in 3m 34s
cargo build 557.05s user 99.79s system 305% cpu 3:34.94 total
Build after cargo clean
Finished dev [unoptimized + debuginfo] target(s) in 2m 31s
cargo build 427.92s user 70.19s system 327% cpu 2:31.96 total
The result is stable within a couple of seconds. |
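To put the wall-clock numbers above side by side, here is a small sketch that converts the `m:ss.ss` totals into seconds and computes the relative difference. The helper names `wall_secs` and `percent_change` are made up for illustration and are not part of sccache or opendal. Since these are single runs that are only stable to within a couple of seconds, the percentages should be read as rough indications rather than a rigorous comparison.

```rust
/// Parse the "total" wall-clock field of the `time` output above,
/// e.g. "2:45.30" (minutes:seconds), into seconds.
/// Hypothetical helper; returns None if the value is not in m:ss form.
fn wall_secs(s: &str) -> Option<f64> {
    let (min, sec) = s.split_once(':')?;
    let min: f64 = min.parse().ok()?;
    let sec: f64 = sec.parse().ok()?;
    Some(min * 60.0 + sec)
}

/// Relative change of `new` against `base`, in percent.
/// A negative value means `new` took less wall-clock time.
fn percent_change(base: f64, new: f64) -> f64 {
    (new - base) / base * 100.0
}
```

With the timings reported above, `percent_change(wall_secs("2:38.17")?, wall_secs("2:31.96")?)` comes out at roughly -3.9 for the build after `cargo clean`, and the first-build pair (`3:46.04` against `3:34.94`) at roughly -4.9.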
Could you please try benchmarking with hyperfine? |
On the same setup: Sccache
Cccache
|
We have great news related to this to share! The newly open-sourced project greptimedb is using OpenDAL too. @sunng87 from the @GreptimeTeam is one of the contributors to OpenDAL. We understand the concern about the long-term maintenance status of the project, and we are working hard to grow our community and bring in more maintainers (outside of our own team) to address that. |
In PR #1412 I built a quick demo in about 5 minutes, please take a look and leave any comments so we can improve. |
@Xuanwo does OpenDAL use a connection pool? All the calls will go to the same S3 backend. It would be nice to reuse connections so that SSL handshakes are minimized and repeated DNS lookups are avoided. @sylvestre I know for sure the aws sdk doesn't use any connection reuse. I used a project very similar to OpenDAL to replace the AWS SDK and gained a tremendous performance improvement: object_store from apache arrow-rs, whose performance against S3-compatible storage is much better than the standard AWS SDK. I assume OpenDAL is similar to object_store; if so, sccache would benefit a lot from that implementation. |
Yes.
We also support caching DNS lookups by enabling |
reopening as we "only" have s3 & azure. @Xuanwo I would be curious if you could do GCS next. I would like to test it for Firefox CI. |
Yes, absolutely! I'm working on migrating gcs now. The only thing blocking me is apache/opendal#1062. I expect to implement it before tomorrow. |
Most opendal related work has been done. Let's close this issue. |
Background
Hi, I'm the maintainer of opendal, a rust lib that focuses on accessing data freely, painlessly, and efficiently.
OpenDAL now supports almost all storage services, including s3, gcs, azblob, oss, obs, hdfs, redis, fs and so on.
I came across sccache last week and found opendal is a great choice for sccache, which can help resolve many issues like #1384, #1180, #1167.
Apart from various storage backend supports, opendal also has the following benefits:
Proposal
So I propose to replace the storage implementation with opendal instead.
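As a rough illustration of the idea (a hypothetical sketch using only the standard library, not code from sccache, opendal, or the demo): the cache logic is written once against a single storage interface, and each service becomes one implementation behind it. The names `CacheStorage`, `MemoryStorage`, and `get_or_compile` are assumptions made up for this example; the in-memory map stands in for a real backend such as s3 or gcs.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical cache-storage interface; NOT sccache's or OpenDAL's actual API.
trait CacheStorage {
    fn get(&self, key: &str) -> io::Result<Option<Vec<u8>>>;
    fn put(&mut self, key: &str, value: Vec<u8>) -> io::Result<()>;
}

/// In-memory stand-in for a real backend (s3, gcs, azblob, ...).
#[derive(Default)]
struct MemoryStorage {
    entries: HashMap<String, Vec<u8>>,
}

impl CacheStorage for MemoryStorage {
    fn get(&self, key: &str) -> io::Result<Option<Vec<u8>>> {
        Ok(self.entries.get(key).cloned())
    }

    fn put(&mut self, key: &str, value: Vec<u8>) -> io::Result<()> {
        self.entries.insert(key.to_string(), value);
        Ok(())
    }
}

/// Cache logic written once against the trait, independent of the backend.
/// Returns the artifact and whether it was a cache hit.
fn get_or_compile<S: CacheStorage>(
    storage: &mut S,
    key: &str,
    compile: impl FnOnce() -> Vec<u8>,
) -> io::Result<(Vec<u8>, bool)> {
    if let Some(hit) = storage.get(key)? {
        return Ok((hit, true));
    }
    // Cache miss: run the compiler and store the result for next time.
    let artifact = compile();
    storage.put(key, artifact.clone())?;
    Ok((artifact, false))
}
```

Calling `get_or_compile` twice with the same key on a `MemoryStorage::default()` runs the `compile` closure only the first time; the second call is served from storage. Swapping the backend means providing another `CacheStorage` implementation, with no change to `get_or_compile`.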
I built a quick demo here: https://github.com/Xuanwo/cccache/tree/debug. Most of the change will be like:
If this proposal is accepted, I plan to
Any comments are welcome! Thanks in advance.