ProvisionedThroughputExceededException Error #158

chenyin0126 · 2024-05-23T05:40:38Z

We have 2 pods; each pod is a consumer that relies on this library. We use .Scan to consume from all shards. We have 8 Kinesis shards, and we hit this error when the 2 consumers run together:

shard shardId-000000000010 error: get records error: operation error Kinesis: GetRecords, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: f450bd05-0d8a-739b-aa98-773a29ee48de, ProvisionedThroughputExceededException: Rate exceeded for Shard - 715119903224/command-data-stream/shardId-000000000010

I think the reason is that the consumer calls GetRecords every 0.25 seconds by default, while AWS allows only 5 read transactions per second per shard. When 2 consumers run at the same time, both scan all shards, so a single shard can receive 8 GetRecords calls per second, which raises ProvisionedThroughputExceededException.

Any idea what the best way to solve this is?


chenyin0126 · 2024-05-23T05:57:10Z

cc: @harlow

luanruisong · 2024-06-12T11:28:23Z

Maybe there is something wrong in func isRetriableError.

Error log:

shard {shardId} error: get records error: operation error Kinesis: GetRecords, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: {requestId}, ProvisionedThroughputExceededException: Rate exceeded for Shard - {shard}

func isRetriableError(err error) bool {
	switch err.(type) {
	case *types.ExpiredIteratorException:
		return true
	case *types.ProvisionedThroughputExceededException:
		return true
	}
	return false
}

Scan returns the error ProvisionedThroughputExceededException, but shouldn't this error be retried?

https://github.com/harlow/kinesis-consumer/blob/master/consumer.go#L188

I think it is meant to retry (wait for the next tick), but instead the goroutine finishes.

Maybe it should be changed to:

	if oe := (*types.ProvisionedThroughputExceededException)(nil); errors.As(err, &oe) {
		return true
	}

luanruisong · 2024-06-12T17:51:09Z

BTW, there are some problems in commit 6720a01: it repeatedly starts a goroutine to process the same shard until ProvisionedThroughputExceededException occurs.

Error log (there are only two shards):

{"level":"info","caller":"/kinesis/logger.go:22","ts":1718213184044,"msg":"[CONSUMER] start scan: {shardId} 49647579764499847879996118179013785811305901676174508770"}
{"level":"info","caller":"/kinesis/logger.go:22","ts":1718213184046,"msg":"[CONSUMER] start scan: {shardId} 49651624479223759978416358002305044641501698271515509490"}
{"level":"info","caller":"/kinesis/logger.go:22","ts":1718213214045,"msg":"[CONSUMER] start scan: {shardId} 49651624479223759978416358045822747370169506734998029042"}
{"level":"info","caller":"/kinesis/logger.go:22","ts":1718213214045,"msg":"[CONSUMER] start scan: {shardId} 49647579764499847879996118207531136970195391339902796514"}
{"level":"info","caller":"/kinesis/logger.go:22","ts":1718213244044,"msg":"[CONSUMER] start scan: {shardId} 49647579764499847879996118236082338052034090689242333922"}
{"level":"info","caller":"/kinesis/logger.go:22","ts":1718213244046,"msg":"[CONSUMER] start scan: {shardId} 49651624479223759978416358091472995244637521131381719794"}
{"level":"info","caller":"/kinesis/logger.go:22","ts":1718213274045,"msg":"[CONSUMER] start scan: {shardId} 49647579764499847879996118265124363016636329277354148578"}
{"level":"info","caller":"/kinesis/logger.go:22","ts":1718213274046,"msg":"[CONSUMER] start scan: {shardId} 49651624479223759978416358130257753389514056238450606834"}
{"level":"info","caller":"/kinesis/logger.go:22","ts":1718213285292,"msg":"[CONSUMER] get records error: operation error Kinesis: GetRecords, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: {requestId}, ProvisionedThroughputExceededException: Rate exceeded for Shard {{shardId}}"}
{"level":"info","caller":"/kinesis/logger.go:22","ts":1718213288257,"msg":"[CONSUMER] get records error: operation error Kinesis: GetRecords, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: {requestId}, ProvisionedThroughputExceededException: Rate exceeded for Shard {{shardId}}"}
{"level":"info","caller":"/kinesis/logger.go:22","ts":1718213294575,"msg":"[CONSUMER] get records error: operation error Kinesis: GetRecords, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: {requestId}, ProvisionedThroughputExceededException: Rate exceeded for Shard {{shardId}}"}
{"level":"info","caller":"/kinesis/logger.go:22","ts":1718213298295,"msg":"[CONSUMER] get records error: operation error Kinesis: GetRecords, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: {requestId}, ProvisionedThroughputExceededException: Rate exceeded for Shard {{shardId}}"}

I think the problem is in allgroup.go:121.

When the ticker channel fires, the second loop also puts shards that are already running (those with no parent shard) into the shardc channel again:

	for _, shard := range shards {
		if _, ok := g.shards[*shard.ShardId]; ok {
			continue
		}
		g.shards[*shard.ShardId] = shard
		g.shardsClosed[*shard.ShardId] = make(chan struct{})
	}
	for _, shard := range shards {
		shard := shard // Shadow shard, since we use it in goroutine
		var parent1, parent2 <-chan struct{}
		if shard.ParentShardId != nil {
			parent1 = g.shardsClosed[*shard.ParentShardId]
		}
		if shard.AdjacentParentShardId != nil {
			parent2 = g.shardsClosed[*shard.AdjacentParentShardId]
		}
		go func() {
			// Asynchronously wait for all parents of this shard to be processed
			// before providing it out to our client.  Kinesis guarantees that a
			// given partition key's data will be provided to clients in-order,
			// but when splits or joins happen, we need to process all parents prior
			// to processing children or that ordering guarantee is not maintained.
			if waitForCloseChannel(ctx, parent1) && waitForCloseChannel(ctx, parent2) {
				shardc <- shard
			}
		}()
	}
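One way to avoid re-dispatching running shards is to hand out only the shards that were not known before the current tick. The sketch below is a simplified illustration, not the change that was eventually merged: `shardTracker` and `unseen` are hypothetical names, and shard IDs are plain strings rather than SDK shard values.

```go
package main

import "fmt"

// shardTracker is a hypothetical, simplified stand-in for the group's shard map.
type shardTracker struct {
	seen map[string]struct{}
}

// unseen records every listed shard ID and returns only those that were not
// known before this call, so each shard is dispatched at most once.
func (t *shardTracker) unseen(listed []string) []string {
	var fresh []string
	for _, id := range listed {
		if _, ok := t.seen[id]; ok {
			continue // already dispatched on an earlier tick
		}
		t.seen[id] = struct{}{}
		fresh = append(fresh, id)
	}
	return fresh
}

func main() {
	tracker := &shardTracker{seen: make(map[string]struct{})}
	ids := []string{"shardId-000000000000", "shardId-000000000001"}

	fmt.Println(len(tracker.unseen(ids))) // 2: first tick dispatches both shards
	fmt.Println(len(tracker.unseen(ids))) // 0: later ticks dispatch nothing new
}
```

In the snippet above, the equivalent change would be to run the second loop over only the newly added shards instead of the full `shards` slice.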

mskonovalov · 2024-09-16T06:39:59Z

Fix #161

Fixes #158. Seems the bug was introduced in #155. See #155 (comment)

luanruisong · 2024-09-18T02:41:01Z

good job!

luanruisong mentioned this issue Jun 12, 2024

fix isRetriableError #159

Merged

powersj mentioned this issue Aug 6, 2024

Kinesis consumer not working from 1.20.3 onwards influxdata/telegraf#13853

Closed

harlow pushed a commit that referenced this issue Sep 16, 2024

Fix ProvisionedThroughputExceededException error (#161)

8d10ac8

Fixes #158. Seems the bug was introduced in #155. See #155 (comment)

harlow closed this as completed in #161 Sep 16, 2024
