Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIVIS-9901] accept gzipped responses from API #170

Merged
merged 7 commits into from
May 17, 2024

Conversation

anmarchenko
Copy link
Member

@anmarchenko anmarchenko commented May 2, 2024

What does this PR do?
Adds capability for HTTP layer to request gzipped payload via Accept-Encoding header and decompress payloads if they are gzipped. Uses this capability for Rapid-API requests in agentless mode.

Motivation
Save time when fetching skippable tests for very large repos. Fetching 20k+ tests for rubocop happens in 5s with this change vs 13s before.

How to test the change?
Unit tests are provided

@anmarchenko anmarchenko force-pushed the anmarchenko/accept_gzip_skippable_tests branch from 2a7f87b to 05cf9d7 Compare May 13, 2024 14:11
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.98%. Comparing base (f2f9160) to head (51a1f72).
Report is 9 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #170   +/-   ##
=======================================
  Coverage   98.98%   98.98%           
=======================================
  Files         205      205           
  Lines        9827     9876   +49     
  Branches      437      445    +8     
=======================================
+ Hits         9727     9776   +49     
  Misses        100      100           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@anmarchenko anmarchenko marked this pull request as ready for review May 14, 2024 12:49
@anmarchenko anmarchenko requested review from a team as code owners May 14, 2024 12:49
external_encoding: Encoding::UTF_8,
internal_encoding: Encoding::UTF_8
)
gzip_reader.read || ""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should read not always return a string already? When would it not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RBS believes that read returns (String | nil), so I need to somehow convince it that Gzip.decompress will never return nil

first_bytes = payload[0, 2]
return false if first_bytes.nil? || first_bytes.empty?

first_bytes.b == Datadog::CI::Ext::Transport::GZIP_MAGIC_NUMBER

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last 3 lines should be replaceable with payload.start_with?(GZIP_MAGIC_NUMBER)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly, I need to convert the string to ASCII_BINARY before comparing bytes with payload.b, I am not sure if this operation takes constant time. Payload could be very large (for my biggest test repos it is about 10Mb, our customers have repos with many more tests, so it could be easily a 50Mb string)

def trace_count
0
end

def gzipped?(payload)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method should examine response headers, specifically Content-Encoding: gzip (https://www.rfc-editor.org/rfc/rfc2616#page-118)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I will add this check in the follow up to this PR as soon as I add headers method to https://github.com/DataDog/dd-trace-rb/blob/9810fe00d76a2ff2407f4772fa2363c3f0623522/lib/datadog/core/transport/http/response.rb

I would still keep the GZIP_MAGIC_NUMBER check even then as I want to be double sure that I don't try to decode not gzipped payload (between Agent, Rapid API backend and any proxies that could exist in customer environments I expect all possible headers/payload mismatches to happen.

@anmarchenko anmarchenko merged commit 79224f2 into main May 17, 2024
26 checks passed
@anmarchenko anmarchenko deleted the anmarchenko/accept_gzip_skippable_tests branch May 17, 2024 08:11
@github-actions github-actions bot added this to the 1.0.0 milestone May 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants