Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go Agent 3.0 #106

Closed
willnewrelic opened this issue Oct 1, 2019 · 16 comments
Closed

Go Agent 3.0 #106

willnewrelic opened this issue Oct 1, 2019 · 16 comments

Comments

@willnewrelic
Copy link

willnewrelic commented Oct 1, 2019

UPDATE

These changes are released as beta in the v3 directory of the new release! Please take a look and give it a try!

https://godoc.org/github.com/newrelic/go-agent/v3/newrelic

This 3.0 work will be officially released in the next release, which will be tagged with v3.0.0 and tags for each of the integrations such as v3/integrations/nrecho-v3/v1.0.0.

Go Agent 3.0 Ideas

Hi All!

We are considering a major version release with breaking changes to improve the Go Agent API.
We would like to reduce minor points of irration.
We want to balance improvement with avoiding upgrade friction.
Here are the changes we propose. We would love your feedback!

Note that we will not be maintaining old agent versions after this major version has been released.

Complete List of Proposed Changes

Changes That Definitely Affect You

  • The newrelic package moves from "github.com/newrelic/go-agent" to "github.com/newrelic/go-agent/v3/newrelic".

  • The underscore in the _integrations directory is removed. Thus the "github.com/newrelic/go-agent/_integrations/nrgin/v1" import path becomes "github.com/newrelic/go-agent/v3/integrations/nrgin".

  • Go module files will be added. Each integration will be in its own module. Our first release will have git tags like this:

    • v3.0.0.0 for the base newrelic package.
    • v3/integrations/nrgin/v1.0.0 for the integration which supports the Gin framework
    • v3/integrations/nrecho-v3/v1.0.0 for the integration which supports version 3 of the Echo framework
    • v3/integrations/nrecho-v4/v1.0.0 for the integration which supports version 4 of the Echo framework
    • etc
  • Application and Transaction change from interfaces to structs. All methods on these types will have pointer receiver. References to these types in your code will need a pointer added.

  • NewConfig is removed and the NewApplication signature changes to:

    func NewApplication(options ...func(*Config)) (*Application, error)

    New functions are provided to modify the Config. Here's what your
    Application creation will look like:

    app, err := newrelic.NewApplication(
      newrelic.ConfigAppName("My Application"),
      newrelic.ConfigLicenseKey(os.Getenv("NEW_RELIC_LICENSE_KEY")),
    )

Changes That Affect You If You Call StartTransaction

The following changes will affect you are starting transactions manually. You will not be affected if you are using on the framework integration packages or rely on WrapHandle or WrapHandleFunc.

  • Application.StartTransaction signature changes to:

    func (app *Application) StartTransaction(name string) *Transaction
  • Transaction no longer implements http.ResponseWriter.

  • Transaction.SetWebResponse signature changes to:

    func (txn *Transaction) SetWebResponse(w http.ResponseWriter) (replacement http.ResponseWriter)
  • The WebRequest type changes from an interface to a struct
    and Transaction will have two methods for adding request information:

    type WebRequest struct {
      Header    http.Header
      URL       *url.URL
      Method    string
      Transport TransportType
    }
    func (txn *Transaction) SetWebRequest(WebRequest) error
    func (txn *Transaction) SetWebRequestHTTP(*http.Request) error
    

Other Misc Changes

  • The transaction parameter to NewRoundTripper is removed. The function signature becomes:

    func NewRoundTripper(t http.RoundTripper) http.RoundTripper
  • TransportType type is changed from a struct to a string.

  • Turn DistributedTracePayload from and interface into a struct.

  • Attributes are renamed:

    • "httpResponseCode" becomes "response.status"
  • The Config.TransactionTracer.SegmentThreshold field moves to Config.TransactionTracer.Segments.Threshold and the
    Config.TransactionTracer.StackTraceThreshold field moves to Config.TransactionTracer.Segments.StackTraceThreshold.

  • The payload parameter to Transaction.AcceptDistributedTracePayload changes to a string. The signature becomes:

    func (txn *Transaction) AcceptDistributedTracePayload(t TransportType, payload string) error

Discussion

Change Directory Layout and Import Paths

Problems:

  • The newrelic package name doesn't match the "github.com/newrelic/go-agent" import path directory. Tools don't like this.
  • The integrations are located inside a directory with an underscore. This makes it tough for tooling to automatically import them.

Solution: We propose moving the newrelic package into a newrelic/ directory and removing the underscore from the _integrations/ directory.

  • The newrelic package import path becomes "github.com/newrelic/go-agent/v3/newrelic".
  • The nrhttprouter package import path becomes "github.com/newrelic/go-agent/v3/integrations/nrhttprouter".

Change Application and Transaction Types

Problem: The Application and Transaction types are interfaces. This causes pain:

  • Consumers need nil checks throughout their code.

  • Consumers who have created mockApplication and mockTransaction types for testing or debugging situations need to update them every time we add new methods. Technically, we have been breaking the semantic versioning rules by adding methods without incrementing the major version number.

  • We can't provide use godoc links to specific application and transaction methods.

Solution: We propose changing Application and Transaction into structs with pointer method receivers. Consumers wouldn't need nil checks in their code or need to create their own mocks. This would be a big change for consumers using lots of transaction and application parameters. Functions that look like this:

func doSomething(txn newrelic.Transaction) {
  // do something and use the txn
}

Should be changed to look like this:

func doSomething(txn *newrelic.Transaction) {
  // do something and use the txn
}

But what if consumers don't remember to add the * in all locations? We don't want consumers to copy a struct containing a mutex and non-pointer fields. We think we can avoid this problem using another level of indirection:

type Transaction struct {
  internalTxn *internalTxn
}

With this design, everything will work fine if consumers copy the Transaction value. Not only would the pointers to the Transaction be made nil-safe, but the pointers to the *internalTxn would be made nil-safe too:

// This is safe!
var nilTxn *newrelic.Transaction
nilTxn.SetName("something")

// This is safe too!
incompleteTxn := &newrelic.Transaction{}
incompleteTxn.SetName("something")

// No need to add nil checks in your code!
txn := newrelic.FromContext(ctx)
txn.AddAttribute("myAttribute", 123)

We will change transaction and application methods and functions to use pointers. For example, the transaction context functions will now look like this:

func FromContext(ctx context.Context) *Transaction
func NewContext(ctx context.Context, txn *Transaction) context.Context

These changes are not strictly necessary given the extra level of indirection, but the lack of pointer might be a source of confusion since Go programmers don't expect stateful structs to be copied by value.

http.ResponseWriter and StartTransaction Changes

Changing Transaction into a struct will necessitate other changes.

Currently, the transaction implements "optional" response writer methods (such as http.CloseNotifier) to match the input response writer (passed in to Application.StartTransaction or Transaction.SetWebResponse). If the transaction becomes a struct, then its method set can't be changed. Therefore, we propose the following change: Transaction would not implement http.ResponseWriter. Instead, the SetWebResponse method would return the replacement writer:

func (t *Transaction) SetWebResponse(w http.ResponseWriter) (replacement http.ResponseWriter)

This method will return the input parameter if the transaction is nil so that it can be used safely in all circumstances.

If the response writer parameter is removed from StartTransaction, we believe it makes sense to remove the request parameter too.

func (a *Application) StartTransaction(name string) *Transaction

This makes the API simpler and easier for consumers creating background transactions. Here is what the implementation of WrapHandle would look like after the proposed changes:

func WrapHandle(app *Application, pattern string, handler http.Handler) (string, http.Handler) {
	return pattern, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		txn := app.StartTransaction(pattern)
		txn.SetWebRequestHTTP(r)
		w = txn.SetWebResponse(w)
		defer txn.End()

		r = RequestWithTransactionContext(r, txn)

		handler.ServeHTTP(w, r)
	})
}

Fortunately, consumers using WrapHandle or an integration framework support package will be unaffected by the changes.

Change Config and Application Construction

Problem: NewConfig is clunky. Multiple lines of code are needed to set up an application, and it is possible to pass an invalid Config to NewApplication.

Solution: We propose using the functional options pattern (as done in our new telemetry sdk):

func main() {
	app, err := newrelic.NewApplication(
		newrelic.ConfigAppName("Example App"),
		newrelic.ConfigLicenseKey(mustGetEnv("NEW_RELIC_LICENSE_KEY")),
		newrelic.NewDebugLogger(os.Stdout),
		func (cfg *newrelic.Config) {
			// Consumers still have complete control over Config fields
			// by creating their own functions.
		},
	)
}

For consumers who want to use the environment variables consistent with other agents, we will provide a function like this:

func main() {
	app, err := newrelic.NewApplication(newrelic.ConfigFromEnvironmentVariables)
}
@cjcjameson
Copy link

Generally in favor -- seems like a good idea. Obviously transitioning over a major version bump like this can be slow/painful (I work in an enterprise setting with a too-long release cycle and various maintenance needs). But the overall approach is cleaner and clear.

Specifically from the context of #74 ... will the new Application config have a method to validate the license key? I'm cool with a functional approach to configuration; can we tack on validation?

@willnewrelic
Copy link
Author

Hi @cjcjameson Thanks for the feedback!

will the new Application config have a method to validate the license key? I'm cool with a functional approach to configuration; can we tack on validation?

Without waiting for a connect (which is done async in a goroutine), the only license key validation that can be done is checking the length (40 characters). Is that worth it? What would you like that API to look like?

@cjcjameson
Copy link

ah, I see. No sweat if not.

@whuang8
Copy link

whuang8 commented Oct 7, 2019

Having Application and Transaction methods be nil safe by making the switch to struct and defining pointer method receivers is a great idea!

@colesnodgrass
Copy link

Since we're discussing breaking changes, I'd like to propose removing the error responses from most (maybe all) of the methods on the Application, Transaction, Segment, DatastoreSegment, and ExternalSegment interfaces/structs.

As these errors exist only within the go-agent workflow, I'm never going to let these errors prevent the rest of my code (my business-logic) from executing. So either I:

  1. implicitly ignore these errors (which Goland will constantly warn me about)
  2. explicitly ignore these errors (waste of time)
  3. or I capture the error, log the error, and then ignore the error (bigger waste of time)

None of the docs show checking for these errors and if I have already configured the go-agent to log, then why not simply let the agent log these errors instead of returning them?

@purple4reina
Copy link
Contributor

Another proposed change for v3.0 that we would love everyone's feedback on.

We would like to propose changing the names of transactions created by WrapHandle, WrapHandleFunc, and when using our integrations for echo, gin, and gorilla. We would like to add the request method to the transaction name. Therefore, a transaction that used to be /hello/world would become GET /hello/world or POST /hello/world. Metrics that used to be WebTransaction/Go/hello/world would become WebTransaction/Go/GET /hello/world or WebTransaction/Go/POST /hello/world.

Breaking transactions out by method increases the granularity of transaction data. Routes that used to roll up into a single transaction name would now be more properly split. This would allow you to better monitor your routes.

In this Gorilla example, these two routes would previously be rolled up under the transaction name /hello. Our proposal would split these into two transactions: GET /hello and POST /hello.

r := mux.NewRouter()
r.HandleFunc("/hello", handlerGET).Methods("GET")
r.HandleFunc("/hello", handlerPOST).Methods("POST")
http.Handle("/", r)

Unfortunately however, changing transaction names comes at a cost. It would mean that you could not directly compare timing information on the transactions between the v2 and v3 agents. You may have to change your alerting or custom dashboards to match. This is why we are eager to hear your thoughts as we want the next version of the Go Agent to be the best it can be!

@willnewrelic
Copy link
Author

Hi All

These changes are released as beta in the v3 directory of the new release! Please take a look and give it a try!

The godocs for this 3.0 API are located here!

https://godoc.org/github.com/newrelic/go-agent/v3/newrelic

This 3.0 work will be officially released in the next release, which will be tagged with v3.0.0 and tags for each of the integrations such as v3/integrations/nrecho-v3/v1.0.0.

@edmorley
Copy link

edmorley commented Dec 3, 2019

Hi and thank you for the new release! Here is our feedback as of the 2.16.1 v3 beta...

The migration process

Overall I found the migration checklist really easy to follow - the before/after examples really helped. Some things I spotted:

  1. The migration example here says to use the import path of "github.com/newrelic/go-agent/v3/integrations/nrlogrus/v1" however it should instead be "github.com/newrelic/go-agent/v3/integrations/nrlogrus" (since that's one of the few integrations that isn't using the versioned paths).

  2. After changing the imports to v3 and running go mod tidy I got:

    go: finding github.com/newrelic/go-agent/v3/integrations/nrlogrus latest
    go: finding github.com/newrelic/go-agent/v3 latest
    go: github.com/newrelic/go-agent/v3/integrations/nrlogrus@v0.0.0-20191202221010-85c56f3fd8ba requires
    	github.com/newrelic/go-agent/v3@v3.0.0: reading github.com/newrelic/go-agent/go.mod at revision v3.0.0: unknown revision v3.0.0
    

    I'm presuming this is due to there (intentionally) being no 3.0.0 git tag.

    I worked around the issue by adding the following to my go.mod:

    replace github.com/newrelic/go-agent/v3 v3.0.0 => github.com/newrelic/go-agent/v3 v3.0.0-20191202221010-85c56f3fd8ba
    

    ...however it would be good to fix the integration's go.mod files to use the SHA version form for any future beta releases until the v3.0.0 tag is added.

  3. The nrlogrus v3 godocs have references to newrelic.NewConfig() that no longer work. We've switched to the nrlogrus.StandardLogger() configuration helper instead - though it's worth noting I only happened to spot it after looking at the v3 integration docs, the new functions aren't mentioned in the migration guide (though perhaps doing so would be overkill).

  4. The "Update how you configure your application" item on the checklist is great - though for us the massive win here was the addition of ConfigFromEnvironment, which isn't currently mentioned in the migration checklist itself - perhaps worth adding?

  5. For the section that discusses Config.ErrorCollector.RecordPanics, it would be good to outline the trade-offs (if any) of this setting. The fact it's being turned off by default makes it seem like there are occasions where it's counter-productive?

  6. There are a few typos in the migration guide: Tranasaction, peferred, acceps

Config from environment variables

This is a great new addition - saves us having to add env-handling boilerplate to each app :-)

  1. Would it be possible to add support for NEW_RELIC_ATTRIBUTES_EXCLUDE and NEW_RELIC_ATTRIBUTES_INCLUDE (as comma separated lists) for parity with the Ruby agent?

  2. The env vars that require casting (eg the bool and int options) currently silently ignore failures. Should they log an error or else cause NewApplication to return err instead? I can't decide.

  3. I noticed that the env var for controlling whether the agent is enabled differs across agents:

    Would it be possible to consolidate around one naming scheme? (With the old name being supported for backwards compatibility in agents that formerly used it).

Changes to distributed tracing methods

In the v3 release, AcceptDistributedTracePayload() was renamed to AcceptDistributedTraceHeaders() and now must be passed an http.header:
https://github.com/newrelic/go-agent/blob/master/v3/MIGRATION.md#distributed-trace-methods

For non-HTTP use-cases (such as passing a trace ID via command line to another process, or for an event being added to a queue for consumption by a worker on another instance) it feels a bit strange to be having to create an http headers object for this. This can be seen in one of the examples:

// Accept the payload that was passed on the command line.
hdrs := http.Header{}
hdrs.Set(newrelic.DistributedTracePayloadHeader, payload)
txn.AcceptDistributedTraceHeaders(newrelic.TransportOther, hdrs)
time.Sleep(1 * time.Second)

Are there any further changes that could be made to make the API feel more natural in these use-cases?

Misc

  1. Re this comment:

    We would like to propose changing the names of transactions created by WrapHandle, WrapHandleFunc, and when using our integrations for echo, gin, and gorilla. We would like to add the request method to the transaction name.

    ...we'd love transactions to be broken down this way, so fully support this change.

  2. In v3 GUIDE.md, this example uses txn.Close() -- I presume it should be txn.End() instead?

@willnewrelic
Copy link
Author

willnewrelic commented Dec 3, 2019

Hi @edmorley!

Thank you for this feedback! We really appreciate it! You're the best!!

The migration example here says to use the import path of "github.com/newrelic/go-agent/v3/integrations/nrlogrus/v1" however it should instead be "github.com/newrelic/go-agent/v3/integrations/nrlogrus" (since that's one of the few integrations that isn't using the versioned paths).

Oops! Good catch! This will be fixed.

it would be good to fix the integration's go.mod files to use the SHA version form for any future beta releases until the v3.0.0 tag is added.

We did not know about the SHA version form. Good to know! Hopefully, we do not have more beta releases, but if so, we will look into this!

The nrlogrus v3 godocs have references to newrelic.NewConfig() that no longer work. We've switched to the nrlogrus.StandardLogger() configuration helper instead - though it's worth noting I only happened to spot it after looking at the v3 integration docs, the new functions aren't mentioned in the migration guide (though perhaps doing so would be overkill).

Oops! Good catch! This will be fixed.

the massive win here was the addition of ConfigFromEnvironment, which isn't currently mentioned in the migration checklist itself - perhaps worth adding?

I'm glad ConfigFromEnvironment is useful! We will ensure it is better emphasized in the migration guide.

For the section that discusses Config.ErrorCollector.RecordPanics, it would be good to outline the trade-offs (if any) of this setting.

This sounds useful. We will improve the RecordPanics documentation.

There are a few typos in the migration guide: Tranasaction, peferred, acceps

Yikes! Typos will be fixed!

Would it be possible to add support for NEW_RELIC_ATTRIBUTES_EXCLUDE and NEW_RELIC_ATTRIBUTES_INCLUDE

Sure, we will add them!

The env vars that require casting (eg the bool and int options) currently silently ignore failures. Should they log an error or else cause NewApplication to return err instead? I can't decide.

Great question. We aren't sure: Populating Config.Error to prevent application creation seems heavy-handed. Logging an error is unreliable since the consumer may add logging configuration after ConfigFromEnvironment. What do you think we should do?

I noticed that the env var for controlling whether the agent is enabled differs across agents: Go: NEW_RELIC_ENABLED, Ruby: NEW_RELIC_AGENT_ENABLED, Node.js: NEW_RELIC_ENABLED Would it be possible to consolidate around one naming scheme?

We apologize for the naming confusion. We lament these differences. Changing and deprecating names across agents is a hurdle, but we will look into it!

  • Regarding the AcceptDistributedTraceHeaders change: We made this change to prepare for the W3C Trace Context header future https://www.w3.org/TR/trace-context-1 which will send the distributed tracing information through multiple headers (we anticipate having something like a Config.DistributedTracing.Format configuration field). We want to embrace open standards. We acknowledge that this is a pain for message queue systems. We could have both AcceptDistributedTracePayload and AcceptDistributedTraceHeaders, but we want to keep the API simple and minimize code paths. Any thoughts about supporting this W3C future?

  • Regarding the transaction rename proposal: We have decided not to go through with this because of the extra hassle it would cause customers with existing dashboards and alerts. Having the method in there would be real nice, but we think it would not be worth the extra confusion.

In v3 GUIDE.md, this example uses txn.Close() -- I presume it should be txn.End() instead?

Good catch. We will fix this!

Thanks again for all your thoughts @edmorley!!

@edmorley
Copy link

edmorley commented Dec 4, 2019

Thank you for your fast reply!

Great question. We aren't sure: Populating Config.Error to prevent application creation seems heavy-handed. Logging an error is unreliable since the consumer may add logging configuration after ConfigFromEnvironment. What do you think we should do?

A few ideas:

  • have ConfigFromEnvironment take a single bool argument named strict, which controls whether to fail on invalid values
  • add a Warning field of type string to Config (alongside Error), and have NewApplication collect all non-nil c.Warnings here, so that once all config functions have been invoked, the final logger config can be used to output those warnings.

That said, perhaps just silently ignoring is the simplest option!

Regarding the AcceptDistributedTraceHeaders change: We made this change to prepare for the W3C Trace Context header future

Ah I didn't realise that. In which case it makes sense. I love the idea of supporting open standards - so happy to have a slight compromise in API in return :-)

Regarding the transaction rename proposal: We have decided not to go through with this because of the extra hassle it would cause customers with existing dashboards and alerts.

Would it be possible to make it opt in?

Would it be possible to add support for NEW_RELIC_ATTRIBUTES_EXCLUDE and NEW_RELIC_ATTRIBUTES_INCLUDE

Sure, we will add them!

Amazing, thank you :-)

@willnewrelic
Copy link
Author

Hi @edmorley

We just released another version of the 3.0 beta with some changes you might like 😄

https://github.com/newrelic/go-agent/blob/master/CHANGELOG.md#2162

We highly value your feedback! Please keep it coming!

@edmorley
Copy link

We just released another version of the 3.0 beta with some changes you might like 😄

Awesome, thank you! I've pulled it locally, will try it out tomorrow :-)

@edmorley
Copy link

edmorley commented Dec 11, 2019

With the latest version of the agent I get a "license length is not 40" error returned from newrelic.NewApplication(). This appears to be due to assignString assigning the name of the environment variable to *field rather than the value of the variable, here:

assignString := func(field *string, name string) {
if env := getenv(name); env != "" {
*field = name
}
}

@purple4reina
Copy link
Contributor

With the latest version of the agent I get a "license length is not 40" error returned from newrelic.NewApplication(). This appears to be due to assignString assigning the name of the environment variable to *field rather than the value of the variable, here:

assignString := func(field *string, name string) {
if env := getenv(name); env != "" {
*field = name
}
}

That is indeed a bug. We will get that fixed asap and issue another patch release. Thanks for catching it @edmorley !!

@edmorley
Copy link

One other thing I spotted was that whilst v3/GUIDE.md has been updated to remove all of the now-unnecessary nil-checks on Application and Transaction (now that they are structs), some of the examples still use them (so could be simplified further).

eg:

if txn := newrelic.FromContext(r.Context()); txn != nil {
txn.NoticeError(errors.New("my error message"))
}

if txn := newrelic.FromContext(r.Context()); txn != nil {
txn.SetName("other-name")
}

if txn := nrecho.FromContext(c); nil != txn {
txn.AddAttribute("userId", id)
}

if txn := nrecho.FromContext(c); nil != txn {
txn.AddAttribute("userId", id)
}

if txn := nrgin.Transaction(c); nil != txn {
txn.SetName("custom-name")
}

@willnewrelic
Copy link
Author

Hi All

Version 3.0.0 has been released! Thanks for all your help @edmorley. You will be pleased to see that Heroku dyno support is included!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants