
Loki: Per-tenant stream sharding #7311

Merged
merged 13 commits into from
Oct 5, 2022

Conversation

@DylanGuedes (Contributor) commented Sep 30, 2022

What this PR does / why we need it:

  • Move stream sharding configuration to its own package to avoid cyclic imports
  • Change stream sharding to be a per-tenant configuration
  • Change ingesters to reject whole streams due to rate-limit based on per-tenant stream sharding
  • Change stream sharding flags prefix from distributor.shard-stream to shard-stream

Special notes for your reviewer:
This is part of the stream sharding work
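To make "per-tenant" concrete, a sketch of what the resulting configuration could look like through Loki's limits/overrides mechanism. This is illustrative only: the field names follow the `shardstreams.Config` package referenced in this PR and may differ between Loki versions, and `tenant-a` is a hypothetical tenant ID.

```yaml
# Global default (illustrative sketch of shardstreams.Config fields)
limits_config:
  shard_streams:
    enabled: true
    desired_rate: 3MB

# Per-tenant override in the runtime overrides file:
# tenant-a opts out of stream sharding while all other tenants keep the default.
overrides:
  tenant-a:
    shard_streams:
      enabled: false
```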

@DylanGuedes force-pushed the make-stream-shard-per-tenant branch 8 times, most recently from c6d3b1e to fb6e575 on October 1, 2022 13:49
@grafanabot (Collaborator) reported coverage:

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
-        distributor	-0.1%
+            querier	0%
+ querier/queryrange	0.1%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@DylanGuedes changed the title from "Loki: Reuse stream-sharding as a parameter to ignore pushes." to "Loki: Per-tenant stream sharding" on Oct 3, 2022
@DylanGuedes marked this pull request as ready for review and requested a review from a team as a code owner on October 3, 2022 13:43
@MasslessParticle (Contributor) left a comment:

Looks good overall. We've got to fix a bug and improve the tests.

@@ -346,7 +348,7 @@ func (s *stream) validateEntries(entries []logproto.Entry, isReplay bool) ([]log
}

now := time.Now()
- if !s.cfg.RateLimitWholeStream && !s.limiter.AllowN(now, len(entries[i].Line)) {
+ if rateLimitWholeStream && !s.limiter.AllowN(now, len(entries[i].Line)) {
Contributor (reviewer):

This ought to be !rateLimitWholeStream.

I see why our test missed this: because whole-stream rate limiting happens at the end, the result looks the same. It's important that we not call AllowN when rateLimitWholeStream==true, because a successful call changes the limiter's state and affects the rate limit. Is there a way we can tell how the limiter was called in the test, to help us in the future?
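To make the side effect concrete, here is a minimal token-bucket sketch. This is not Loki's actual limiter implementation, only an illustration of the point above: a successful AllowN consumes budget, so calling it per entry while whole-stream limiting is active silently shrinks what the final whole-stream check can admit.

```go
package main

import "fmt"

// bucket is a minimal stand-in for a per-stream rate limiter.
// NOT Loki's real limiter; it only demonstrates that AllowN has
// a side effect on success.
type bucket struct {
	tokens int
}

// AllowN consumes n tokens if available. The consumption on success
// is the side effect the review comment is about.
func (b *bucket) AllowN(n int) bool {
	if n > b.tokens {
		return false
	}
	b.tokens -= n
	return true
}

func main() {
	b := &bucket{tokens: 10}

	// A per-entry check accidentally left on while whole-stream
	// limiting is active: even if the result is discarded, state changed.
	_ = b.AllowN(4)
	fmt.Println(b.tokens) // 6 — budget already spent

	// The whole-stream check at the end now sees a smaller budget
	// and may reject a batch that should have fit.
	fmt.Println(b.AllowN(8)) // false
}
```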

Contributor (author):

great catch, thank you. I'll investigate how to do that.
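One way to answer the "how was the limiter called" question is a recording wrapper: a test-only limiter that logs every AllowN call, so a test can assert the per-entry path never consulted the limiter when whole-stream limiting is on. This is a hypothetical sketch — the `limiter` interface and all names here are illustrative, not Loki's actual types.

```go
package main

import "fmt"

// limiter is a hypothetical shape of what the stream code depends on.
type limiter interface {
	AllowN(n int) bool
}

// allowAll is a permissive inner limiter for tests.
type allowAll struct{}

func (allowAll) AllowN(n int) bool { return true }

// spyLimiter wraps a limiter and records every AllowN call, so a test
// can assert whether (and with what sizes) the limiter was consulted.
type spyLimiter struct {
	inner limiter
	calls []int // sizes passed to AllowN
}

func (s *spyLimiter) AllowN(n int) bool {
	s.calls = append(s.calls, n)
	return s.inner.AllowN(n)
}

func main() {
	spy := &spyLimiter{inner: allowAll{}}

	// Simulated per-entry loop with whole-stream limiting enabled:
	// the limiter must not be touched per entry.
	rateLimitWholeStream := true
	for _, line := range []string{"foo", "barbaz"} {
		if !rateLimitWholeStream && !spy.AllowN(len(line)) {
			// drop entry (never reached here)
		}
	}
	fmt.Println(len(spy.calls)) // 0 — limiter untouched on the per-entry path
}
```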

Contributor (author):

Pushed a fix and a test covering it, wdyt?

require.Contains(t, err.Error(), (&validation.ErrStreamRateLimit{RateLimit: l.PerStreamRateLimit, Labels: s.labelsString, Bytes: flagext.ByteSize(len(entries[1].Line))}).Error())
}

func TestPushRateLimitAllOrNothing(t *testing.T) {
l := validation.Limits{
PerStreamRateLimit: 10,
PerStreamRateLimitBurst: 10,
ShardStreams: &shardstreams.Config{
Contributor (reviewer):

I don't think we need this setup because the limits for this are used in instance.go to determine rateLimitWholeStream

We should probably have a test in instance.go to make sure this stays wired together the same way in the future.

Contributor (author):

Good call. I added a test covering it, WDYT?

pkg/distributor/distributor_test.go — outdated; resolved
@grafanabot (Collaborator) reported coverage:

+           ingester	0%
-        distributor	-0.1%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@grafanabot (Collaborator) reported coverage:

-           ingester	-0.1%
-        distributor	-0.1%
+            querier	0%
- querier/queryrange	-0.1%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
-               loki	-0.1%

@@ -312,6 +313,9 @@ func TestPushRateLimit(t *testing.T) {
l := validation.Limits{
PerStreamRateLimit: 10,
PerStreamRateLimitBurst: 10,
ShardStreams: &shardstreams.Config{
Contributor (reviewer):

nit: Because whole stream rate limiting is controlled by the flag we pass to Push, there's no need to set this up here, either.

Contributor (author):

Oops, good catch. Fixed!

@MasslessParticle (Contributor) left a comment:

Left a nit but lgtm

@grafanabot (Collaborator) reported coverage:

-           ingester	-0.1%
-        distributor	-0.1%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
-               loki	-0.1%

@grafanabot (Collaborator) reported coverage:

+           ingester	0%
-        distributor	-0.1%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@github-actions bot added the type/docs label (Issues related to technical documentation; the Docs Squad uses this label across many repositories) on Oct 5, 2022
@grafanabot (Collaborator) reported coverage:

+           ingester	0%
-        distributor	-0.1%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

1 similar comment from @grafanabot.

@DylanGuedes merged commit b1d4efa into grafana:main on Oct 5, 2022
lxwzy pushed a commit to lxwzy/loki that referenced this pull request Nov 7, 2022
**What this PR does / why we need it**:
- Move stream sharding configuration to its own package to avoid cyclic
imports
- Change stream sharding to be a per-tenant configuration
- Change ingesters to reject whole streams due to rate-limit based on
per-tenant stream sharding
- Change stream sharding flags prefix from `distributor.shard-stream` to
`shard-stream`
changhyuni pushed a commit to changhyuni/loki that referenced this pull request Nov 8, 2022
Abuelodelanada pushed a commit to canonical/loki that referenced this pull request Dec 1, 2022
Labels: size/L, type/docs (Issues related to technical documentation; the Docs Squad uses this label across many repositories)
Projects: none
3 participants