Comprehensive overhaul of Harvester and Server enhancements #171

Merged

Contributor

@maxlambrecht maxlambrecht commented May 24, 2023

Pull request check list

  • Proper tests/regressions included?
  • Documentation updated?

Summary of Changes

This Pull Request presents a comprehensive overhaul of the Harvester and several improvements to the Server. The enhancements presented in this PR mainly concern:

  1. Improved synchronization of federation bundles: Enhancements in its relationship configuration and consent status.
  2. Bundle signing and verification functionality: Configurable signing and verification through Providers.
  3. Newly introduced Endpoint: Serving the Harvester admin API via a Unix Domain Socket (UDS).
  4. Introduction of two Harvester Syncers:
    • The first Syncer focuses on SPIRE bundle synchronization and server upload.
    • The second Syncer orchestrates the synchronization of federated bundles from the server.
  5. Streamlining of the data model: Elimination of unused fields, namely 'onboarding_bundle' and 'harvester_spiffe_if'.
  6. Upgrades to the Server harvester and admin APIs: Improving naming of components to better align with REST operations. Adding consistency to the specs.
  7. Enhancements in logging: More efficient debugging and system tracking.
  8. Improvements in naming conventions: Fostering better code readability and maintainability.
  9. Revisions in package organization and distribution of responsibilities: Enhancing modular design.
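
Item 3 above mentions serving the Harvester admin API over a Unix Domain Socket. As a rough illustration only (this is not the code from the PR), the sketch below shows one way to serve and query HTTP over a UDS with the Go standard library. The socket file name `admin.sock`, the `/health` route, and all helper names are assumptions made for the example.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"os"
	"path/filepath"
)

// listenUDS removes any stale socket file and listens on socketPath.
func listenUDS(socketPath string) (net.Listener, error) {
	if err := os.RemoveAll(socketPath); err != nil {
		return nil, fmt.Errorf("failed to remove stale socket: %w", err)
	}
	return net.Listen("unix", socketPath)
}

// udsClient returns an HTTP client whose connections always dial socketPath.
func udsClient(socketPath string) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
}

// demoUDS serves a hypothetical /health route on a temporary socket and
// fetches it through that socket, returning the response body.
func demoUDS() (string, error) {
	dir, err := os.MkdirTemp("", "uds")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	socketPath := filepath.Join(dir, "admin.sock")

	ln, err := listenUDS(socketPath)
	if err != nil {
		return "", err
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "ok")
	})
	srv := &http.Server{Handler: mux}
	go func() { _ = srv.Serve(ln) }()
	defer srv.Close()

	// The host part of the URL is ignored; the transport dials the socket.
	resp, err := udsClient(socketPath).Get("http://unix/health")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := demoUDS()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(body)
}
```

Binding an admin API to a socket file rather than a TCP port means access can be restricted with ordinary file permissions on the socket path.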

Additional Improvements

The Harvester now requests a new JWT token every half an hour and stores it on disk. This resilience-improving measure enables the Harvester to be restarted without needing to re-onboard it with a join token.
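
A minimal sketch of the renew-and-persist idea described above, assuming a hypothetical `fetch` callback that obtains a fresh token from the server. The function names, the file name, and the atomic write via rename are illustrative choices, not necessarily what the PR implements.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"time"
)

// renewInterval mirrors the "every half an hour" cadence from the description.
const renewInterval = 30 * time.Minute

// saveToken writes the token to a temporary file and renames it into place,
// so a crash mid-write does not leave a truncated token on disk.
func saveToken(path, token string) error {
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, []byte(token), 0o600); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// loadToken reads a previously stored token, e.g. after a restart.
func loadToken(path string) (string, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(b)), nil
}

// renewLoop calls fetch (a hypothetical callback) on every tick and persists
// the result. A failed fetch is reported and retried on the next tick.
func renewLoop(ctx context.Context, interval time.Duration, path string,
	fetch func(context.Context) (string, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			token, err := fetch(ctx)
			if err != nil {
				fmt.Fprintf(os.Stderr, "token renewal failed: %v\n", err)
				continue
			}
			if err := saveToken(path, token); err != nil {
				return err
			}
		}
	}
}

// roundTrip stores a token in a temporary directory and reads it back,
// which is what a restarted process would do.
func roundTrip(token string) (string, error) {
	dir, err := os.MkdirTemp("", "jwt")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "token.jwt")
	if err := saveToken(path, token); err != nil {
		return "", err
	}
	return loadToken(path)
}

func main() {
	got, err := roundTrip("example.jwt.token")
	fmt.Println(got, err)
	// In a long-running process, the loop would be started with something like:
	//   go renewLoop(ctx, renewInterval, path, fetch)
}
```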

Additional Notes

Noteworthy changes to the CLI and configuration files are planned for a subsequent PR. Certain packages still lack comprehensive tests; this will be addressed in follow-up PRs.

Looking Ahead

The aforementioned enhancements in this PR reflect our ongoing commitment to evolve and refine the Harvester.

Which issue this pull request fixes

Signed-off-by: Max Lambrecht <maxlambrecht@gmail.com>
// LogAndRespondWithError logs the error and returns an HTTP error.
func LogAndRespondWithError(logger logrus.FieldLogger, err error, errorMessage string, statusCode int) error {
if err != nil {
logger.Errorf("%s: %v", errorMessage, err)

Check failure

Code scanning / CodeQL

Log entries created from user input

This log write receives unsanitized user input from here.
Contributor Author

The error message is sanitized.
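
For readers unfamiliar with this CodeQL finding: the concern is that user-controlled text containing line breaks could forge extra log entries. The thread does not show how the message is sanitized in this PR, so the following is only a generic sketch of one common mitigation (stripping CR and LF before logging). The helper name `sanitizeForLog` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// logSanitizer strips CR and LF so a value cannot start a forged log line.
var logSanitizer = strings.NewReplacer("\n", "", "\r", "")

// sanitizeForLog returns s with all line breaks removed.
func sanitizeForLog(s string) string {
	return logSanitizer.Replace(s)
}

func main() {
	// Simulated user input that tries to inject a second log entry.
	userInput := "bad request\nlevel=info msg=\"forged entry\""
	fmt.Printf("%s: %s\n", "failed to parse request", sanitizeForLog(userInput))
}
```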

Collaborator

@mgbcaio mgbcaio left a comment

Left a few comments/questions, Max

func ParseCertificates(pemBytes []byte) ([]*x509.Certificate, error) {
var certs []*x509.Certificate
block, rest := pem.Decode(pemBytes)
fmt.Println("block", block, "rest", rest)
Collaborator

leftover ?

Contributor Author

Yep

@@ -10,19 +10,17 @@ import (
type ConsentStatus string

const (
-ConsentStatusAccepted ConsentStatus = "accepted"
+ConsentStatusAccepted ConsentStatus = "approved"
Collaborator

Suggested change
-ConsentStatusAccepted ConsentStatus = "approved"
+ConsentStatusApproved ConsentStatus = "approved"

Contributor Author

🙏

return fmt.Errorf("failed to fetch federated bundles from SPIRE Server: %w", err)
}

galadrielCtx, cancel := context.WithTimeout(ctx, galadrielCallTimeout)
Collaborator

Curiosity question: with two defer cancel calls, which one would trigger first, and does that cancel the whole context?

Contributor Author

good catch, reusing the same variable for the cancel is a mistake.
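
As background to the question above: deferred calls run in last-in, first-out order, and cancelling a derived context does not cancel its parent or its siblings. The sketch below illustrates this with a distinct name per cancel function. Apart from `galadrielCtx`, the names and the one-minute timeouts are assumptions for the example, not the PR's code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// childStates derives two timeout contexts from one parent, cancels only the
// first, and reports the error state of all three contexts.
func childStates() (parentErr, spireErr, galadrielErr error) {
	parent, cancelParent := context.WithCancel(context.Background())
	defer cancelParent() // deferred first, so it runs last

	spireCtx, cancelSpire := context.WithTimeout(parent, time.Minute)
	defer cancelSpire()

	// A distinct name keeps it obvious which cancel belongs to which context.
	galadrielCtx, cancelGaladriel := context.WithTimeout(parent, time.Minute)
	defer cancelGaladriel() // deferred last, so it runs first

	// Cancel functions are safe to call more than once.
	cancelSpire()

	return parent.Err(), spireCtx.Err(), galadrielCtx.Err()
}

func main() {
	p, s, g := childStates()
	fmt.Println("parent:", p)
	fmt.Println("spire:", s)
	fmt.Println("galadriel:", g)
}
```

Only the explicitly cancelled child reports an error; the parent and the sibling context stay usable until their own cancel functions run.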


// Check if the bundle is the same as the last one fetched
if s.lastSpireBundle != nil && s.lastSpireBundle.Equal(bundle) {
return nil // No new bundle
Collaborator

What do you think about putting a log here to indicate that there is no new bundle?

Contributor Author

That specific log line might be redundant, as we have a Debug log statement that says "Checking..." before the action takes place, and another one that states "New bundle..." when it fetches a new bundle, and an Info log statement indicating "Uploaded...". I think these log lines provide sufficient visibility into the process.

Collaborator

okay, sounds good

}

func (p *jwtProvider) setToken(jwt string) {
p.mu.Lock()
Collaborator

why not p.mu.RLock in here ?

Contributor Author

The setToken function modifies the shared JWT token, so we need to ensure that no other goroutine is reading or writing it at the same time. A full lock, i.e., Lock(), accomplishes this by blocking both reads and writes to the shared resource. RLock, on the other hand, only blocks write operations and allows multiple concurrent read operations.
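
A small sketch of the pattern described in this reply, assuming the provider guards a single string field with a `sync.RWMutex`. The field name `jwt` and the `getToken` method are assumptions; only `jwtProvider`, `setToken`, and `p.mu.Lock()` appear in the diff excerpt.

```go
package main

import (
	"fmt"
	"sync"
)

// jwtProvider holds a token shared between goroutines.
type jwtProvider struct {
	mu  sync.RWMutex
	jwt string
}

// setToken takes the write lock: it excludes all readers and other writers.
func (p *jwtProvider) setToken(jwt string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.jwt = jwt
}

// getToken takes the read lock: many readers may hold it at once, but it
// cannot be held while a writer holds the write lock.
func (p *jwtProvider) getToken() string {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return p.jwt
}

func main() {
	p := &jwtProvider{}
	p.setToken("initial")

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = p.getToken() // concurrent readers
		}()
	}
	wg.Add(1)
	go func() {
		defer wg.Done()
		p.setToken("renewed") // a single writer
	}()
	wg.Wait()

	fmt.Println(p.getToken())
}
```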

Collaborator

thanks for clarifying!

Contributor

@wibarre wibarre left a comment

hats off!!! @maxlambrecht

func ParseCertificates(pemBytes []byte) ([]*x509.Certificate, error) {
var certs []*x509.Certificate
block, rest := pem.Decode(pemBytes)
fmt.Println("block", block, "rest", rest)
Contributor

Was this for debugging?

Contributor Author

it was a leftover, already removed.

@@ -0,0 +1 @@
package bundlemanager
Contributor

//TODO?

-const kidHeader = "kid"
+const (
+kidHeader = "kid"
+defaultJWTTTL = 10 * time.Minute
Contributor

I wonder if this would be better to have them as a configuration. It could give the users of Galadriel the opportunity to align with internal policies they may have for their JWTs.

Contributor Author

I'll take a note to address this later on.
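
To make the suggestion above concrete, here is one possible shape for a configurable TTL, sketched under the assumption that the value would arrive as a duration string such as "15m". The function name `parseJWTTTL` and the input format are hypothetical; only `defaultJWTTTL` and its 10-minute value come from the diff.

```go
package main

import (
	"fmt"
	"time"
)

// defaultJWTTTL matches the constant shown in the diff excerpt.
const defaultJWTTTL = 10 * time.Minute

// parseJWTTTL parses a hypothetical configuration value such as "15m".
// An empty value falls back to the default; non-positive values are rejected.
func parseJWTTTL(raw string) (time.Duration, error) {
	if raw == "" {
		return defaultJWTTTL, nil
	}
	ttl, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid JWT TTL %q: %w", raw, err)
	}
	if ttl <= 0 {
		return 0, fmt.Errorf("JWT TTL must be positive, got %s", ttl)
	}
	return ttl, nil
}

func main() {
	for _, raw := range []string{"", "15m", "-1s", "soon"} {
		ttl, err := parseJWTTTL(raw)
		fmt.Printf("%q -> %v (err: %v)\n", raw, ttl, err)
	}
}
```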

return nil
}

// isOnboarded Check if the client has been onboarded by checking if there is a JWT token
Contributor

Do we need to check if the JWT has not expired as well? If there is no token or if the token is expired, we need new onboarding, correct?

Contributor Author

I have enhanced the client's constructor function so that it now attempts to renew the token during initialization. If it fails to obtain a renewed token from the server, construction fails and the Harvester does not start. In that case, the Harvester logs a message stating that it was unable to establish a connection with the server using the stored token, and the error message suggests re-onboarding the Harvester with a fresh join token.
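
The fail-fast behaviour described in this reply could look roughly like the following. This is a sketch under stated assumptions: `renewFunc`, `newClient`, the `client` struct, and the error texts are hypothetical stand-ins, and the real constructor presumably talks to the Galadriel server rather than to a callback.

```go
package main

import (
	"errors"
	"fmt"
)

// renewFunc is a hypothetical stand-in for the call that exchanges the
// stored token for a fresh one on the server.
type renewFunc func(storedToken string) (string, error)

// errServerRejected simulates the server refusing an expired token.
var errServerRejected = errors.New("server rejected token")

type client struct {
	token string
}

// newClient renews the stored token during construction. If there is no
// token, or the renewal fails, no client is returned and startup can abort.
func newClient(storedToken string, renew renewFunc) (*client, error) {
	if storedToken == "" {
		return nil, errors.New("no stored token: onboard the Harvester with a join token")
	}
	fresh, err := renew(storedToken)
	if err != nil {
		return nil, fmt.Errorf("unable to connect to the server using the stored token; "+
			"re-onboarding with a fresh join token may be required: %w", err)
	}
	return &client{token: fresh}, nil
}

func main() {
	expired := func(string) (string, error) { return "", errServerRejected }
	if _, err := newClient("stored-token", expired); err != nil {
		fmt.Println("startup aborted:", err)
	}
}
```

With this shape, an expired token is detected at startup by the failed renewal, so a separate expiry check in isOnboarded may not be needed.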

}
err = server.Close()
if err != nil {
e.Logger.WithError(err).Error("Error closing Echo Server")
}
<-errChan
e.Logger.Info("TCP Server stopped")
log.Info("TCP Server stopped")
Contributor

TCP Listener stopped


sonarcloud bot commented May 24, 2023

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 16 (rating A)

Coverage: no coverage information
Duplication: 0.1%

Collaborator

@mgbcaio mgbcaio left a comment

Finished my second round of the review. Amazing work, thanks Max!

@maxlambrecht maxlambrecht merged commit ac99727 into HewlettPackard:main May 25, 2023
@maxlambrecht maxlambrecht deleted the reimplementing-galadriel branch May 25, 2023 14:20