
Fix partition key computation for aggregation #158

Merged (1 commit, Aug 26, 2021)

Conversation

@jamiees2 (Contributor)

Signed-off-by: James Elias Sigurdarson <jamiees2@gmail.com>

Description of changes:

The aggregator function has a bug I was wondering about earlier, and I recently realized that it causes actual problems in our setup.

Some notes to provide context:
When producing a record and switching over the aggregator due to size limitations, we start a new aggregation record.
When aggregating into the complete record, we choose pkeys[0] to be the partition key of the Kinesis record.

The bug is that when we switch over to a new aggregation record, we don't regenerate the partition key. The partition key passed in stays the same for the entire call, and although we regenerate the key after returning, we have already added the first record to the aggregator under the wrong (old) partition key.

The end result is that the next aggregation record is produced with the same partition key as the previous aggregation record. Funnily enough, this is only a problem for the first aggregation record produced after the switch-over; after that, pkeys[0] will never match the remaining records.

This is a problem when the first record is right up against the 1 MB shard limit, because the next record is then guaranteed to end up in the same shard as the first one. As a result, we see an ErrCodeProvisionedThroughputExceededException, and the chunk can never be submitted.

Anyway, this fixes the issue by refactoring the random string generator into a class which can be passed around, and handing control over random string generation to the aggregator itself. This felt like the most maintainable solution, but it ends up refactoring the way we manage partition keys. The changes themselves aren't too bad: instead of returning a random string, getPartitionKey returns a (partitionKey, ok) tuple, which is handled differently by the aggregated and non-aggregated implementations.
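To illustrate the shape of this refactor, here is a minimal, hypothetical Go sketch of the (partitionKey, ok) idea; the type and function names below are illustrative assumptions, not the actual code in this PR:

```go
package main

import (
	"fmt"
	"math/rand"
)

// randomStringGenerator is a hypothetical stand-in for the refactored helper:
// the aggregator would hold one of these and call it whenever it starts a new
// aggregation record, so the key is regenerated on every switch-over.
type randomStringGenerator struct {
	rnd *rand.Rand
}

func (g randomStringGenerator) randomString(n int) string {
	const letters = "abcdefghijklmnopqrstuvwxyz"
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[g.rnd.Intn(len(letters))]
	}
	return string(b)
}

// getPartitionKey (sketched signature, not the plugin's) returns (key, true)
// when a partition key can be derived from the record, and ("", false) when
// the aggregator should generate a random one itself.
func getPartitionKey(record map[string]string, keyField string) (string, bool) {
	key, ok := record[keyField]
	return key, ok
}

func main() {
	gen := randomStringGenerator{rnd: rand.New(rand.NewSource(42))}
	record := map[string]string{"message": "hello"}

	key, ok := getPartitionKey(record, "partition_key")
	if !ok {
		// No configured key: generate a fresh random one. The important part of
		// the fix is that this happens again for every new aggregation record.
		key = gen.randomString(8)
	}
	fmt.Println("partition key:", key)
}
```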

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@jamiees2 requested a review from a team as a code owner on August 23, 2021 15:32
@jamiees2 (Contributor, Author) left a review:

add comments for code review

if !hasPartitionKey {
    if len(a.partitionKeys) > 0 {
        // Take the first partition key from the map, if any
        for k, _ := range a.partitionKeys {
@jamiees2 (Contributor, Author):

I'm open to suggestions for how to do this better; I'm not sure how to get keys out of a map properly.

@zhonghui12 (Contributor):

Could I ask what "first" means here? Does it mean the first element/key that was put into the map? If so, I'm not sure you can get it from the map through iteration.

@jamiees2 (Contributor, Author):

I think I generally just want any key that is in the map. Basically, we choose a "random" key for the record, but AFAICT this key does not matter, since the real partition key is just the first key; this is just a way to save space. I'll update the comment.
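For reference, grabbing an arbitrary key from a Go map is usually written exactly like this kind of range-and-break (a standalone sketch, not the PR's code):

```go
package main

import "fmt"

func main() {
	partitionKeys := map[string]uint64{"key-a": 0, "key-b": 1}

	// Map iteration order is unspecified in Go, so this picks an arbitrary key;
	// that is fine here because any existing key works equally well.
	var anyKey string
	for k := range partitionKeys {
		anyKey = k
		break
	}
	fmt.Println("chose:", anyKey)
}
```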

}
// Recompute field size, since it changed
pKeyIdx, _ = a.checkPartitionKey(partitionKey)
pkeyFieldSize = protowire.SizeVarint(pKeyIdx) + fieldNumberSize
@jamiees2 (Contributor, Author):

Also fixes a sizing bug: when we switch records, we may be slightly off on the size (it may shrink if the previous pKeyIdx was > 128).
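For context on why the size can change: varints use 7 bits per byte, so protowire.SizeVarint returns 1 byte for values up to 127 and 2 bytes from 128 to 16383. A quick standalone illustration (not code from this PR):

```go
package main

import (
	"fmt"

	"google.golang.org/protobuf/encoding/protowire"
)

func main() {
	// The encoded size of a varint grows at 128, 16384, and so on. If the
	// partition key index resets to a small value when a new aggregation
	// record starts, the pkey field size can shrink and must be recomputed.
	for _, idx := range []uint64{1, 127, 128, 16383, 16384} {
		fmt.Printf("index %5d -> %d byte(s)\n", idx, protowire.SizeVarint(idx))
	}
}
```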

@jamiees2 (Contributor, Author):

To be even clearer, the event sequence we were seeing was:

AddRecord("random key 1", "data 1") -> nil, nil
AddRecord("random key 1", "data 2") -> AggregatedRecords(pkey: "random key 1", records: ["data 1"]), nil
// regenerate aggregation key
AddRecord("random key 2", "data 3") -> nil, nil

Flush() -> AggregatedRecords(pkey: "random key 1", records: ["data 2", "data 3"]), nil

This is because the second AddRecord created a new record and added data 2 to it, but didn't regenerate the random partition key. When aggregating the second time, it used the first partition key, which will be random key 1 in this sequence.
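To make the intended post-fix behavior concrete, here is a toy aggregator (purely illustrative; the real plugin's types and signatures differ) that regenerates its partition key whenever it starts a new aggregation record:

```go
package main

import "fmt"

// toyAggregator is a deliberately simplified stand-in for the real aggregator:
// it "flushes" after maxRecords entries and regenerates its partition key each
// time it starts a new aggregation record, which is the behavior this PR fixes.
type toyAggregator struct {
	partitionKey string
	records      []string
	maxRecords   int
	keyCounter   int
}

func (a *toyAggregator) newPartitionKey() string {
	a.keyCounter++
	return fmt.Sprintf("random key %d", a.keyCounter)
}

// AddRecord returns the completed aggregation (key plus records) when the
// current one is full, and starts the next one with a fresh partition key.
func (a *toyAggregator) AddRecord(data string) (string, []string) {
	if a.partitionKey == "" {
		a.partitionKey = a.newPartitionKey()
	}
	if len(a.records) >= a.maxRecords {
		doneKey, doneRecords := a.partitionKey, a.records
		a.partitionKey = a.newPartitionKey() // the fix: regenerate on switch-over
		a.records = []string{data}
		return doneKey, doneRecords
	}
	a.records = append(a.records, data)
	return "", nil
}

func main() {
	agg := &toyAggregator{maxRecords: 1}

	k, recs := agg.AddRecord("data 1")
	fmt.Printf("%q %v\n", k, recs) // "" []  (nothing flushed yet)
	k, recs = agg.AddRecord("data 2")
	fmt.Printf("%q %v\n", k, recs) // "random key 1" [data 1]
	k, recs = agg.AddRecord("data 3")
	fmt.Printf("%q %v\n", k, recs) // "random key 2" [data 2]  (fresh key, not key 1)

	fmt.Printf("%q %v\n", agg.partitionKey, agg.records) // "random key 3" [data 3]
}
```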

@hossain-rayhan (Contributor):

Hi @jamiees2, just to be clear, does this change have a strong dependency on your previous PR #155? Should we revert that commit and merge them together?

@jamiees2 (Contributor, Author) commented Aug 23, 2021:

Oh, no, this was a separate bug we found today :) #155 is perfectly fine on its own.

@jamiees2 (Contributor, Author):

Funnily enough, this was the problem that led me to investigate what was behind #155. The size estimation was always incorrect if we had more than one partition key, which is what #155 fixed, but this fixes the reason we had more than one in the first place.

@hossain-rayhan (Contributor):

> Oh, no, this was a separate bug we found today :) #155 is perfectly fine on its own.

Thanks for confirming.

@zhonghui12 (Contributor) left a review:

Also, could I double-check that you've tested the code change in an end-to-end test?


@jamiees2 (Contributor, Author):

Yeah, this was causing tons of repeated records in our production environment, so we have already deployed a version built off this branch to mitigate the issue.

@jamiees2 (Contributor, Author):

If anybody's interested, this is the graph of how much Fluent Bit's retry metrics dropped when we deployed the change :D The retries were what was causing the duplicate records, and the immediate increase corresponds to a normal increase in traffic.

[image: graph of Fluent Bit's retry metrics dropping after the change was deployed]

@zhonghui12 (Contributor):

LGTM

@hossain-rayhan (Contributor) commented Aug 24, 2021:

> If anybody's interested, this is the graph of how much Fluent Bit's retry metrics dropped when we deployed the change :D The retries were what was causing the duplicate records, and the immediate increase corresponds to a normal increase in traffic.

Hey @jamiees2, this is great! Thanks for this contribution. Also, did you double-check to make sure no data was lost?

@jamiees2 (Contributor, Author):

Yup, the other metrics on the Kinesis end (records ingested) remained the same before and after the rollout

@PettitWesley (Contributor):

@zackwine FYI

@PettitWesley (Contributor) left a review:

Please squash the commits into one or improve the commit messages

Signed-off-by: James Elias Sigurdarson <jamiees2@gmail.com>
@zhonghui12 (Contributor):

Will include this one in our release this week

@zhonghui12 merged commit 35f3cc9 into aws:mainline on Aug 26, 2021.