Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIAPP-5371] validate git tags #15

Merged
merged 33 commits into from
Sep 6, 2023

Conversation

anmarchenko
Copy link
Member

@anmarchenko anmarchenko commented Sep 1, 2023

What does this PR do?
In cases when it is not possible to automatically extract git environment information for CI visibility we allow users to supply this info via environment variables, see docs here.

DD_GIT_REPOSITORY_URL and DD_GIT_COMMIT_SHA are required but currently not validated. When these values are not provided or SHA is not valid CI-APP backend silently drops these traces. End users get no feedback in these cases.

We add the following validations to these tags:

DD_GIT_REPOSITORY_URL must be present
DD_GIT_COMMIT_SHA must be present, must be 40 characters long must be a valid hex number
In case these values are incorrect we continue to send traces but we log an error message explaining the problem.

Motivation
Improve end user experience and prevent confusion when test spans are not shown in UI without any reason.

Additional Notes

This PR includes several refactorings:

  • Git constants moved from dd-trace-rb Core module here as they are only used by CI visibility (I will remove them from dd-trace-rb before 2.0 release)
  • giant module Datadog::CI::Ext::Environment is split into smaller chunks encapsulating CI provider environment logic, user defined environment variables, and local git fallbacks

How to test the change?
Tested according to CI spec in environment_spec.rb

@anmarchenko anmarchenko changed the title move git constants here from dd-trace-rb [CIAPP-5371] validate git tags Sep 1, 2023
@anmarchenko anmarchenko requested a review from a team September 4, 2023 12:18
@anmarchenko anmarchenko marked this pull request as ready for review September 4, 2023 12:18
@anmarchenko anmarchenko requested a review from a team September 4, 2023 12:35
Copy link
Contributor

@TonyCTHsu TonyCTHsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the polymorphic approach that each extractor encapsulates vendor specific logic, but I expect there should be corresponding test for each of them, am i missing?

When I read through the changes, there are a lot of methods that are returning nil and performing nil check. This causes a lot of cognitive burden and I wonder if there is a better way to handle it.

lib/datadog/ci/ext/environment/extractor.rb Show resolved Hide resolved
def git_commit_sha
end

def normalize_git!(tags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Input tags conflict with the method tags, I would suggest extract this stateless function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe I need separate modules for normalization and expansion 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TonyCTHsu I made this more readable using instance variable @tags and removing some nil checks (moved the code that removes nil tags after the normalization)

Comment on lines 119 to 132
fields = output.split("\t").each(&:strip!)

@commit_users = {
author_name: fields[0],
author_email: fields[1],
# Because we can't get a reliable UTC time from all recent versions of git
# We have to rely on converting the date to UTC ourselves.
author_date: Time.at(fields[2].to_i).utc.to_datetime.iso8601,
committer_name: fields[3],
committer_email: fields[4],
# Because we can't get a reliable UTC time from all recent versions of git
# We have to rely on converting the date to UTC ourselves.
committer_date: Time.at(fields[5].to_i).utc.to_datetime.iso8601
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps consider extracting an CommitUser object instead of a hash?

It is less readable when each field is depending on the index of the array for variable fields.

Consider

name, email, ... = output.split("\t").each(&:strip!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for this, will explore

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TonyCTHsu I refactored this code to use GitUser concept but I am not sure if it is actually better than hash... wdyt about it now?

@anmarchenko
Copy link
Member Author

@TonyCTHsu I will add tests for each separate class as you suggested: they are all thoroughly tested by environment_spec.rb according to the datadog-ci-spec specifications but it does not hurt to test them in isolation.

As for the nil checks: we are dealing with environment so everything can be nil at all times as we don't control what comes from the external environment. Will try to find a better way to define checks.

@anmarchenko
Copy link
Member Author

anmarchenko commented Sep 5, 2023

@TonyCTHsu @GustavoCaso I added more specs, moved fixtures around and refactored some bits of code

@GustavoCaso
Copy link
Member

@anmarchenko There was something that wasn't clicking with me with the organization of the code.

  • The fact that the extractor had to require all providers.
  • The fact that providers inherit from extractor

It felt like a smell to me.
I think the idea of having a list of providers and all of those providers having a base class makes sense. I think the extractor should care only about asking the provider for the right method to build the tags hash.

I open a DRAFT PR to show you want I mean #18

Let me know what you think

@anmarchenko
Copy link
Member Author

anmarchenko commented Sep 6, 2023

@GustavoCaso I like your cleanup, especially the fact that it draws a clear line between Extractor and Provider concepts.

I will incorporate this change in this PR today

@anmarchenko
Copy link
Member Author

@GustavoCaso @TonyCTHsu now it is finally finally finally ready for review

Copy link
Member

@GustavoCaso GustavoCaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work, and thanks to being open to my suggestion 😄

I left a couple comments which are not blockers at all, feel free to address them or not

@anmarchenko anmarchenko merged commit 562cbaf into main Sep 6, 2023
20 checks passed
@anmarchenko anmarchenko deleted the anmarchenko/ci_vis_validate_git_tags branch September 6, 2023 14:06
@anmarchenko anmarchenko added this to the 0.1.0 milestone Sep 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants