Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add wsclient library to ecs-agent module #3690

Merged
merged 2 commits into from
May 23, 2023

Conversation

RichaGangwar
Copy link
Contributor

@RichaGangwar RichaGangwar commented May 11, 2023

Summary

Add wsclient library to ecs-agent

Implementation details

  1. Added a new directory wsclient to ecs-agent.
  2. Added the new files in wsclient related to wsclient functionality
  3. Added utils directory to move the needed dependencies from agent to ecs-agent.
  4. Edited the needed imports in agent related to moved utils.
  5. Moved Proxy( *http.Request) function to proxy.go from utils.go to avoid moving the whole utils.go in ecs-agent.

TODOs to be handled in future PRs:

  1. Periodic timeout/disconnection of WebSocket connections in wsclient instead of tcs/acs clients
  2. Move exitcodes package to /ecs-agent
  3. Use context.Canceled.Error() instead of hardcoding "context canceled"
  4. Replace GetMockServer() used in unit tests with gomock generated mock interfaces.
  5. Remove docker specific checks in wsclient library.
  6. Remove existing agent/wsclient after plugging in the new libraries in ecs-agent, i.e. tcs and acs.

Testing

Unit tested with existing tests in wsclient
Unit tested agent with make test

New tests cover the changes: no

Description for the changelog

Add wsclient library to ecs-agent module.

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@RichaGangwar RichaGangwar requested a review from a team as a code owner May 11, 2023 16:02
@RichaGangwar RichaGangwar changed the title Ecs agent wsclient lib Add wsclient library to ecs-agent module May 11, 2023
@danehlim
Copy link
Contributor

Linux / Linux unit tests check is failing at the moment, just want to confirm is this PR ready for review or currently still WIP?

@RichaGangwar
Copy link
Contributor Author

Linux / Linux unit tests check is failing at the moment, just want to confirm is this PR ready for review or currently still WIP?
@danehlim Its ready for review. The unit tests passed in my local setup. The Linux unit tests is failing here due to less coverage than expected. Ideally, we should not see the coverage being lowered as we are moving the existing files. I am checking on it.

@@ -35,10 +35,11 @@ import (
"time"

"github.com/aws/amazon-ecs-agent/agent/config"
"github.com/aws/amazon-ecs-agent/agent/utils"
"github.com/aws/amazon-ecs-agent/agent/utils/cipher"
"github.com/aws/amazon-ecs-agent/agent/wsclient/wsconn"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't imports of "github.com/aws/amazon-ecs-agent/agent/wsclient/wsconn" be updated to "github.com/aws/amazon-ecs-agent/ecs-agent/wsclient/wsconn" then we can remove the unused files in agent/wsclient/wsconn, right? (i.e., like what we've done with agent/utils/cipher and agent/utils/httpproxy)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think it makes sense to switch just wsconn imports to ecs-agent/wsclient. The plan is to plug in the ecs-agent/wsclient with ecs-agent/tcs and ecs-agent/acs, and once we have the confidence in testing, we would completely remove the agent/wsclient package. It made sense to move the utils to avoid the cyclic dependency issue between agent and ecs-agent.

Copy link
Contributor

@danehlim danehlim May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just thinking since the files in agent/wsclient/wsconn and ecs-agent/wsclient/wsconn are identical, we could just have a single one living in ecs-agent/to refer to as a source of truth since we will be removing agent/wsclient/wsconn in the future anyway.

Copy link
Member

@fierlion fierlion May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once we have the confidence in testing, we would completely remove the agent/wsclient package

do we have a TODO for this anywhere (in code or in a tracking ticket)?

Reply from RichaGangwar:
Yes Ray, Thats the plan. As part of tcs and acs shared lib, we will integrate this new wsclient lib in the clients. And once well tested, we will remove the agent/wsclient. There isn't any TODO right now. Will add it in description to keep track of it. Thank you :)

// AgentConfig is a subset of agent's config. Since this requires agent specific information,
// making this as a struct to be passed from the individual agents while using the wsclient shared lib.
// The agent where the field is not applicable can be left empty
type AgentConfig struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename this to something that makes it more distinguishable from the original agent config (maybe something along the lines ofMinimalAgentConfig, or WSClientAgentConfigs)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I will update it.

timeoutDialer := &net.Dialer{Timeout: wsConnectTimeout}
tlsConfig := &tls.Config{ServerName: parsedURL.Host, InsecureSkipVerify: cs.cfg.AcceptInsecureCert, MinVersion: tls.VersionTLS12}

//TODO: In order to get rid of the check -
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to get rid of this check as part of this PR? It seems to be specific to Docker, and I don't think we want anything tied to or specific to Docker in the ecs-agent module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ideally I want to get rid of the check. I had a discussion with multiple folks and able to resolve few other such checks. I am preparing to doc to deal with these two checks. Once everyone agrees on the strategy, we can remove these checks as well. I had raised the PR so that ACS is unblocked, as driving consensus is taking some time.

Copy link
Contributor

@danehlim danehlim May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, will look forward to this in a followup PR then. If not already, let's make sure we're tracking this internally.

// ClientServerImpl wraps commonly used methods defined in ClientServer interface.
type ClientServerImpl struct {
// cfg is the subset of user-specified runtime configuration
cfg *AgentConfig
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be capitalized to be exported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing out. Will fix this.


"github.com/aws/aws-sdk-go/aws/credentials"
"github.com/aws/aws-sdk-go/private/protocol/json/jsonutil"
"github.com/cihub/seelog"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to be using github.com/aws/amazon-ecs-agent/ecs-agent/logger for any logging in ecs-agent/. I see we are already using it for some of the logging in this file, could we update to make sure we’re using that and not seelog directly in ecs-agent/ (whether that be in this PR or a future one)?

// by backend. It allows for bidirectional communication and acts as both a
// client-and-server in terms of requests, but only as a client in terms of
// connecting.
package wsclient
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Periodic timeout/disconnection of WebSocket connections seems to be missing from this PR, let’s make sure to add that in if we can.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the timeouts are initiated by individual components using the library, i.e. ACS and TACS have their own implementation of timeouts. I had not accounted for that as part of wsclient library, hence it's missing from this PR. Is that something we want to unify?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I believe we decided that we want to have the same logic for refreshing our websocket connection across all our websocket connections (i.e., ACS and TACS websocket connections). I'll reach out to you offline and point you in the direction of these previous discussions.

signer := v4.NewSigner(creds)
_, err := signer.Sign(req, body, service, region, time.Now())
if err != nil {
seelog.Warnf("Signing HTTP request failed: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be updated to use logger instead of seelog as well (per this previous comment).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thank you for pointing out.


// ExitTerminal indicates the agent run into error that's not recoverable
// no need to restart
ExitTerminal = 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move the exitcodes package to /ecs-agent? Perhaps in a fast follow-up PR, if not this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Will move it as a follow-up PR. This is already quite bulky. But thank you for pointing out.

}

// RespondFunc specifies a function callback that can be used by the
// RequestResonder to respond to requests.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo, RequestResonder -> RequestResponder

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks.

// ClientServerImpl wraps commonly used methods defined in ClientServer interface.
type ClientServerImpl struct {
// Cfg is the subset of user-specified runtime configuration
Cfg *WSClientMinAgentConfig
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Cfg need to be exported?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it would need to be populated by the ACS/TACS libraries. So it needs to be exported.

io.Closer
}

// AgentConfig is a subset of agent's config. Since this requires agent specific information,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "WSClientMinAgentConfig is a subset[...]"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks.

// messages from an active connection.
func (cs *ClientServerImpl) ConsumeMessages(ctx context.Context) error {
// Since ReadMessage is blocking, we don't want to wait for timeout when context gets cancelled

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: weird newline

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// ClientFactory interface abstracts the method that creates new ClientServer
// objects. This is helpful when writing unit tests.
type ClientFactory interface {
New(url string, credentialProvider *credentials.Credentials, rwTimeout time.Duration) ClientServer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be updated to include a *wsclient.WSClientMinAgentConfig parameter in the function signature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thank you for pointing it out. :)

)

// GetMockServer returns a mock websocket server that can be started up as TLS or not.
// TODO replace with gomock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this TODO doable within this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mockServer is heavily used in all existing test cases. Maybe it can be picked up in a separate PR. But it would need good amount of rewrite.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this isn't something we can pull in from our mockgen process in the agent. Ideally all of our mocks are generated. (We'll follow up offline).

Comment on lines 349 to 352
//test for DecodeConnectionError
//use WriteCloseMessage

//send httpResponse in Connect()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there missing tests here on L349-L352, or what are these comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed removing the comments. I was using it while writing new tests to ensure the coverage requirements are met. Will remove them in latest commit.

waitForRequests.Wait()
}

func getClientServer(url string, msgType []interface{}) *ClientServerImpl {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change this name to be dummyClientServer or testClientServer or similar? That seems a lot more clear to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Done.

}

func getClientServer(url string, msgType []interface{}) *ClientServerImpl {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: spacing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


// Close closes the underlying connection. Implement Close() in this file
// as ClientServerImpl doesn't implement it. This is needed by the
// TestSetReadDeadline* tests
Copy link
Contributor

@ohsoo ohsoo May 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's update comments to end with periods/have proper punctuation if they don't already

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Fixed.

Copy link
Contributor

@ohsoo ohsoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do some commit squashing please. Overall, not seeing any major concerns!

@RichaGangwar
Copy link
Contributor Author

Let's do some commit squashing please. Overall, not seeing any major concerns!

Sure Soo.

for {
select {
case <-ctx.Done():
// Close connection and wait for Read goroutine to finish
Copy link
Contributor

@danehlim danehlim May 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we update our tests to cover this code path (i.e., L491-L495)? This is a major change from our current agent/wsclient/client.go, and we currently don't have any tests that validate this behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Added.

}()
// Cancel the context.
cancel()
assert.EqualError(t, <-messageError, "context canceled")
Copy link
Contributor

@danehlim danehlim May 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit (not blocking, can be added in a followup PR so that this PR can get merged and unblock other work ASAP):

Can we use context.Canceled.Error() here instead of hardcoding "context canceled" (i.e., similar to what we do on L283)?

danehlim
danehlim previously approved these changes May 18, 2023
Copy link
Contributor

@danehlim danehlim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved under the agreement that we will fast follow with a PR to include periodic timeout/disconnection of websocket connections logic as part of the wsclient itself (per this comment thread) and that we will squash commits before merging.

Copy link
Member

@fierlion fierlion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the description to highlight TODOs from code and from comments so we can track these in our release process?

@@ -35,10 +35,11 @@ import (
"time"

"github.com/aws/amazon-ecs-agent/agent/config"
"github.com/aws/amazon-ecs-agent/agent/utils"
"github.com/aws/amazon-ecs-agent/agent/utils/cipher"
"github.com/aws/amazon-ecs-agent/agent/wsclient/wsconn"
Copy link
Member

@fierlion fierlion May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once we have the confidence in testing, we would completely remove the agent/wsclient package

do we have a TODO for this anywhere (in code or in a tracking ticket)?

Reply from RichaGangwar:
Yes Ray, Thats the plan. As part of tcs and acs shared lib, we will integrate this new wsclient lib in the clients. And once well tested, we will remove the agent/wsclient. There isn't any TODO right now. Will add it in description to keep track of it. Thank you :)

)

// GetMockServer returns a mock websocket server that can be started up as TLS or not.
// TODO replace with gomock
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this isn't something we can pull in from our mockgen process in the agent. Ideally all of our mocks are generated. (We'll follow up offline).

"github.com/aws/amazon-ecs-agent/agent/version"
"github.com/aws/amazon-ecs-agent/ecs-agent/utils/cipher"
"github.com/aws/amazon-ecs-agent/ecs-agent/utils/httpproxy"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, agent/utils/httpproxy is already moved to ecs-agent/utils/httpproxy as part of this PR. Is that what you are referring to Ray?

Copy link
Member

@fierlion fierlion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the description to highlight TODOs from code and from comments so we can track these in our release process?

@RichaGangwar
Copy link
Contributor Author

Can you update the description to highlight TODOs from code and from comments so we can track these in our release process?

Sure Ray.

@Realmonia Realmonia merged commit 6f994c6 into aws:dev May 23, 2023
@ubhattacharjya ubhattacharjya mentioned this pull request May 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants