-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
doc: Add shuffle sharding doc #6173
Conversation
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0.3%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall it looks great on technical explanation. There are some suggestions by Dan and I will leave it to Karen for suggesting any corrections in writing.
The maximum number of queriers can be overridden on a per-tenant basis in the limits overrides configuration (`max_queriers_per_tenant`). | ||
|
||
### The impact of “query of death” | ||
In the event a tenant is repeatedly sending a “query of death” which leads the querier to crash or getting killed because of out-of-memory, the crashed querier will get disconnected from the query-frontend and a new querier will be immediately assigned to the tenant’s shard. This practically invalidates the assumption that shuffle-sharding can be used to contain the blast radius in case of a query of death. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add a line explaining a bit more why that is a problem by mentioning that the tenant would get assigned to other running queries to match the max_queriers_per_tenant
and might eventually affect other tenant shards which beats the purpose of this feature.
Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com>
Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com>
Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com>
Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com>
Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com>
Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com>
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
- loki -0.6% |
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. - ingester -0.1%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
- distributor -0.3%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com>
Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com>
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
1 similar comment
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@KMiller-Grafana can you take a look at the doc when you get a chance? |
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the write up. It help me understand shuffle sharding better 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small change please!
Co-authored-by: Karen Miller <84039272+KMiller-Grafana@users.noreply.github.com>
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much.
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
* doc: Add shuffle sharding doc Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Add query-scheduler config as well Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Add line explaining what is "query-of-death" exactly means * PR remarsk from Karen Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Fix image URL Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Karen Miller <84039272+KMiller-Grafana@users.noreply.github.com> Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> Co-authored-by: Karen Miller <84039272+KMiller-Grafana@users.noreply.github.com> (cherry picked from commit a1e75da)
* doc: Add shuffle sharding doc Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Add query-scheduler config as well Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Add line explaining what is "query-of-death" exactly means * PR remarsk from Karen Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Fix image URL Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Karen Miller <84039272+KMiller-Grafana@users.noreply.github.com> Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> Co-authored-by: Karen Miller <84039272+KMiller-Grafana@users.noreply.github.com> (cherry picked from commit a1e75da) Co-authored-by: Kaviraj Kanagaraj <kavirajkanagaraj@gmail.com>
* doc: Add shuffle sharding doc Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Add query-scheduler config as well Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> * Add line explaining what is "query-of-death" exactly means * PR remarsk from Karen Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Fix image URL Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com> * Update docs/sources/operations/shuffle-sharding.md Co-authored-by: Karen Miller <84039272+KMiller-Grafana@users.noreply.github.com> Co-authored-by: Dan Mihai Dinu <33793329+danmdinu@users.noreply.github.com> Co-authored-by: Karen Miller <84039272+KMiller-Grafana@users.noreply.github.com>
Signed-off-by: Kaviraj kavirajkanagaraj@gmail.com
What this PR does / why we need it:
Add shuffle sharding doc for Loki.
And some operational advice.
Which issue(s) this PR fixes:
Fixes #NA
Special notes for your reviewer:
Markdown preview: https://github.com/grafana/loki/blob/0dcc6034ea927b16ff9a79eb71100b98a736da78/docs/sources/operations/shuffle-sharding.md
Checklist
CHANGELOG.md
.