DataDog scaler outputs wrong log when getting 429 responses #4187

Closed
ccorbacho opened this issue Feb 1, 2023 · 8 comments · Fixed by #4259

@ccorbacho

Report

Based on looking at the DataDog scaler code, if KEDA makes too many requests to DataDog and gets a 429 back, we should hit this code path:
https://github.com/kedacore/keda/blob/main/pkg/scalers/datadog_scaler.go#L293

That path should extract the rate limits from the response headers and return an error message telling us which rate limit we have hit.

However, what we are seeing is that we are failing the check, and hitting this path instead:
https://github.com/kedacore/keda/blob/main/pkg/scalers/datadog_scaler.go#L304

And just getting back a generic 429 error message.

Expected Behavior

KEDA outputs a log line in the format:

"your Datadog account reached the %s queries per hour rate limit, next limit reset will happen in %s seconds"

Which would include the rate limit and rate limit reset information from the headers.

Actual Behavior

We get this message in the logs:

error when retrieving Datadog metrics: 429 Too Many Requests

Which means we are hitting this message format:

error when retrieving Datadog metrics: %s

Steps to Reproduce the Problem

  1. Create enough DataDog scalers that KEDA's requests exceed the DataDog API rate limit.
  2. Review the KEDA logs.

Logs from KEDA operator

2023-02-01T14:48:28Z	ERROR	scalehandler	Error getting scale decision	{"scaledobject.Name": "[REDACTED]-pod-scaler", "scaledObject.Namespace": "[REDACTED]", "scaleTarget.Name": "[REDACTED]", "error": "error when retrieving Datadog metrics: 429 Too Many Requests"}
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScalers
	/workspace/pkg/scaling/scale_handler.go:278

KEDA Version

2.8.1

Kubernetes Version

None

Platform

Amazon Web Services

Scaler Details

DataDog

Anything else?

(This is on KEDA 2.8.2, but that's not an option in the dropdown).

@ccorbacho ccorbacho added the bug Something isn't working label Feb 1, 2023
@zroubalik
Member

@ccorbacho makes sense, are you willing to implement this?

@ccorbacho
Author

I'm not much of a Go developer, I'm afraid. I already took a quick look at the DD API code and this scaler to see if I could work out why it was broken, but couldn't figure it out, so this is probably best left to someone who knows the code.

@JorTurFer
Member

JorTurFer commented Feb 1, 2023

BTW, in KEDA v2.9 we released an important (and experimental) feature that significantly reduces the total number of requests to the DD API (from 5-6 per minute to 1-2).
Basically, KEDA can request the metric only once per pollingInterval and cache it (for the HPA requests) instead of querying the value each time. This may help with the API rate limit.

@ccorbacho
Author

I saw that earlier in my reading, and we'll likely use it. But since we will still need to raise our limits in addition to using the caching, the more informative error message would still be helpful.

@JorTurFer
Member

Yes, of course, I agree on the proper logging. I just wanted to point that feature out to you, just in case :)

@tomkerkhove tomkerkhove moved this from Proposed to To Triage in Roadmap - KEDA Core Feb 16, 2023
@arapulido
Contributor

@tomkerkhove hey! feel free to assign this to me, I will try to reproduce and see what's going on.

@tomkerkhove
Member

Thank you!

arapulido added a commit to arapulido/keda that referenced this issue Feb 21, 2023
Signed-off-by: Ara Pulido <ara.pulido@datadoghq.com>
@arapulido
Contributor

PR opened: #4259

@github-project-automation github-project-automation bot moved this from To Triage to Ready To Ship in Roadmap - KEDA Core Feb 22, 2023
@JorTurFer JorTurFer moved this from Ready To Ship to Done in Roadmap - KEDA Core Mar 13, 2023