Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return an error from BrowserContext.AddCookies and handle panic in mapping layer #744

Closed
wants to merge 3 commits into from

Conversation

ankur22
Copy link
Collaborator

@ankur22 ankur22 commented Feb 1, 2023

At the moment the codebase uses panic too liberally and we don't have control over when we actually want to abort all test runs vs when we only want to fail the current test iteration.

This is a POC (which if we're all on agreement with could become the first PR for this way of working with errors and panics) of what was suggested in a comment on #688. The idea is that we will bubble up errors all the way to the mapping layer, where it can decide on whether to panic or to return an error to goja.

This would mean that we have more control over when to panic, and to be more aware of what we want to panic on, as well as using good GO error handling practices throughout the codebase.

@ankur22 ankur22 changed the title Refactor/688 init Return an error from BrowserContext.AddCookies and handle panic in mapping layer Feb 1, 2023
k6error/internal.go Outdated Show resolved Hide resolved
k6error/internal.go Outdated Show resolved Hide resolved
Copy link
Member

@inancgumus inancgumus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @ankur22, for following up on the discussion we made earlier and putting it into reality in this PR.

One question: Do you think we need a separate error type like that (InternalError) and all the machinery around it? Can we have a single error variable, wrap it using Errorf: %w, and detect it with errors.Is?

For example:

// k6error package
var ErrInternal = errors.New("internal error")

// common package
func (b *BrowserContext) AddCookies(cookies goja.Value) error {
  return fmt.Errorf("BrowserContext.addCookies(cookies) has not been implemented yet: %w", k6error.ErrInternal)
}

// mapping layer
...
"addCookies": func(cookies goja.Value) *goja.Promise {
  return k6ext.Promise(ctx, func() (result any, reason error) {
    err := bc.AddCookies(cookies)
    panicIfInternalError(ctx, err)
}

func panicIfInternalError(ctx context.Context, err error) {
  if err == nil || !errors.Is(err, k6error.ErrInternal) {
    return
  }
  k6ext.Panic(ctx, err.Error())
}

@ankur22
Copy link
Collaborator Author

ankur22 commented Feb 1, 2023

One question: Do you think we need a separate error type like that (InternalError) and all the machinery around it? Can we have a single error variable, wrap it using Errorf: %w, and detect it with errors.Is?

Nice! I do like the simplicity. Let me try that and see.

@ankur22
Copy link
Collaborator Author

ankur22 commented Feb 1, 2023

One question for you @inancgumus and @ka3de:

If you're scanning through the code, which one looks clearer:

fmt.Errorf("BrowserContext.addCookies(cookies) has not been implemented yet: %w", k6error.ErrInternal)

or

k6error.NewInternalError("BrowserContext.addCookies(cookies) has not been implemented yet")

@inancgumus
Copy link
Member

inancgumus commented Feb 1, 2023

If you're scanning through the code, which one looks clearer:

That's the idiomatic way of doings things after Go 1.13. So it depends on the Go authors, not us :)

However, you can still make it cleaner if you like as follows:

// k6error
func Internal(err error) error {
  return fmt.Errorf("%s: %w", err, ErrInternal)
}

// common
return k6error.Internal("BrowserContext.addCookies(cookies) has not been implemented yet")

You can go fancier if needed, like copying Errorf, something like:

// k6error
func Internal(format string, a ...any) error {
  a = append(a, ErrInternal)
  return fmt.Errorf(format+": %w", a...)
}

But, I like the idiomatic way I proposed earlier—it's pretty explicit and the conventional way of returning and checking errors in Go.

On a tangent, for panicIfInternalError function or k6error.NewInternalError (stutters):

Effective Go on naming:

Long names don't automatically make things more readable. A helpful doc comment can often be more valuable than an extra long name.

@ankur22
Copy link
Collaborator Author

ankur22 commented Feb 1, 2023

That's the idiomatic way of doings things after Go 1.13. So it depends on the Go authors, not us :)

Updated to the simpler idiomatic way: 5abd5e4

@ka3de
Copy link
Collaborator

ka3de commented Feb 1, 2023

One question for you @inancgumus and @ka3de:

If you're scanning through the code, which one looks clearer:

fmt.Errorf("BrowserContext.addCookies(cookies) has not been implemented yet: %w", k6error.ErrInternal)

or

k6error.NewInternalError("BrowserContext.addCookies(cookies) has not been implemented yet")

Get a little late for the discussion, but yes, I agree with @inancgumus that if idiomatic Go fullfils our use case, I would stick to that without adding extra complexity. So agree on the refactor done on 5abd5e4.

This internal error type will help us track internal errors so that
we can panic at the goja mapping layer instead of panicking from
within the business logic code.
This refactoring will catch ErrInternal errors and panic if found.
Copy link
Collaborator

@ka3de ka3de left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@inancgumus
Copy link
Member

inancgumus commented Feb 6, 2023

The code is OK as it is 👍 However, we will get a huge stack trace because of an unimplemented method. The stack trace—hence the panic—would benefit if an "unexpected" error occurs, and we can track the bug through stack trace. In this case, it will harm the user experience. I would prefer saving the panic for really unexpected situations and use the panicIfInternalError function for those cases.

Instead of doing it like this, we can return the error as follows:

...
"addCookies": func(cookies goja.Value) *goja.Promise {
	return k6ext.Promise(ctx, func() (result any, reason error) {
		err := bc.AddCookies(cookies)
		return nil, err //nolint:wrapcheck
	})
},

For example, let's try it with the following script:

import { chromium } from 'k6/x/browser';

export const options = {
  thresholds: {
    checks: ["rate==1.0"]
  }
}

export default async function() {
  const browser = chromium.launch({ headless: true });
  const context = browser.newContext();

  try {
    await context.addCookies("...");
    console.log("should not reach here");
  } finally {
    browser.close();
  }
}

Just returning the error in the mapping layer will result in the following error output in the console:

ERRO[0000] Uncaught (in promise) BrowserContext.addCookies(cookies) has not been implemented yet  executor=per-vu-iterations scenario=default

I believe it's clear enough for the user.

We can make it better by using the AbortingPromise function instead—which I wrote for this sort of case:

...
"addCookies": func(cookies goja.Value) *goja.Promise {
	return k6ext.AbortingPromise(ctx, func() (result any, reason error) {
		err := bc.AddCookies(cookies)
		return nil, err //nolint:wrapcheck
	})
},

Which will result in the following error output:

ERRO[0000] BrowserContext.addCookies(cookies) has not been implemented yet  executor=per-vu-iterations scenario=default

@grafana/k6-browser WDYT?

@ka3de
Copy link
Collaborator

ka3de commented Feb 6, 2023

@grafana/k6-browser WDYT?

But in this cases the test would continue after the error is returned right? I thought the idea of the definition of ErrInternal was a way for us to understand in the mapping layer when a panic must occur so the test is ended because it's a "non-recoverable error", therefore I understand that the panic was the proper action. Is this the behavior that we are trying to define? Or did I misunderstood it?

@inancgumus
Copy link
Member

inancgumus commented Feb 6, 2023

But in this cases the test would continue after the error is returned right?

@ka3de, you're right: The test run will continue if we don't panic (more evident if we run with the script with multiple VUs). Then we can compromise and have a long stack trace with a hard-to-see error message—even in this expected case 🤷 Maybe we can use go.k6.io/k6/execution.AbortTestRun() instead of panicking 🤔 But I haven't looked into it yet. I'm happy to keep the PR as it is without complicating things further 👍


@ankur22, by the way, we get the following error when we run the script I shared above:

panic: no browser process ID in context
        panic: GoError: BrowserContext.addCookies(cookies) has not been implemented yet: internal error

The interesting line is this:

panic: no browser process ID in context

The k6ext.Panic can't get the browser process ID and can't clear the browser process. We might want to check what's going on there.

@ka3de
Copy link
Collaborator

ka3de commented Feb 6, 2023

The interesting line is this:

panic: no browser process ID in context

The k6ext.Panic can't get the browser process ID and can't clear the browser process. We might want to check what's going on there.

You are right @inancgumus that does not look very friendly. Maybe we should pause this until we have analyzed better the different options and establish a consensus on the behavior from now one.

For me personally I see various different ways to return an error for which I don't fully understand the differences. E.g.:

Or maybe this should be tackled in a different discussion, but I think we need a better analysis for this.
What do you think @ankur22 ?

@ankur22
Copy link
Collaborator Author

ankur22 commented Feb 6, 2023

@grafana/k6-browser Thanks for the reviews and suggestions.

My feeling is that we're looking for the perfect solution instead of one that will get us moving in the right direction.

There is confusion between all the ways of panicking and erroring (as i've mentioned in #688).

We need to move towards a solution where we:

  • Stop the whole test run when we find a genuine reason to do so (e.g. panic when a method is not implemented or an unexpected error such as when parsing string to json, etc.).

That is the key goal for issue #688, but at the moment panics are placed throughout the code and it's difficult to reason where we should be erroring vs panicking.

This PR is an attempt to:

  • Force us to think about errors and bubble them up to the mapping layer.
  • At the mapping layer, it can easily discern an internal error and panic, otherwise return an error.

This will hopefully (one day) leave us with three things:

  • good old errors;
  • k6ext.Panic;
  • and a tool that doesn't abort the whole test run when there's no need to.

Panics should rarely occur. When they do, we will see a messy stack trace. The stack trace is a concern, but one that could be solved in another PR which is unrelated to this PR, which is trying to solve a different problem.

@inancgumus
Copy link
Member

inancgumus commented Feb 6, 2023

@ankur22, when we panic, we clean up the browser process. That is the current behavior until v0.8.

However, this PR will introduce a regression I explained in my earlier comment 😒 I'd be happy to merge this PR if we can solve that issue 👍

We can tackle finding the perfect solution later (#753)—I agree 🚀

@ankur22
Copy link
Collaborator Author

ankur22 commented Feb 6, 2023

However, this PR will introduce a regression I explained in my earlier comment 😒 I'd be happy to merge this PR if we can solve that issue 👍

Apologies, this is a valid blocker. I've attempted to solve it with 42871c3.

EDIT: I've created a new PR (#755) which uses a cleaner mechanism to force the shutdown of all the browser processes.

We can tackle finding the perfect solution later (#753)—I agree 🚀

💪

@@ -35,4 +38,7 @@ type BrowserContext interface {
StorageState(opts goja.Value)
Unroute(url goja.Value, handler goja.Callable)
WaitForEvent(event string, optsOrPredicate goja.Value) any

// Internally available methods
Ctx() context.Context
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another solution could be to have a BrowserProcessRegister which would keep track of all the running browser process (i.e. store the pid). When the panic occurs, it would just be a case of running something like bpr.Shutdown().

This way we wouldn't dirty the APIs with these "internal" exported methods.

Another reason to work with a BrowserProcessRegister would be to clean up the context a bit by removing the the pid from the context.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to understand better how this would be done, but on a first sight this seems like a better option to me. Otherwise we'll have to expose a method in order to retrieve the ctx from every mapping function (as a panic could be called from methods on each one of them), right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created #755 which uses a register.

@inancgumus
Copy link
Member

inancgumus commented Feb 10, 2023

I was wondering whether we can solve this issue more simply without doing medium-sized changes (#755 and #762).

The idea is this. Since BrowserType.Launch is the root method that launches a browser and gets the process ID, we can make the process ID available to the mapping layer. So it would be safe to get the process ID and make it available to the child mappings.

  1. Let Launch return the process ID.
  2. Put the processID in moduleVU after we get it from Launch.
  3. Return a new context with the process ID in moduleVU.Context().

I've tested this solution, and it worked without an issue. It has minimal code changes to allow us to move forward instead of spending time on refactorings (maybe with baggage of additional unforeseen bugs). When we finish, we can move the usage of k6ext.Panic to the mapping layer.

I know it's not the cleanest solution, and we can perfect it later, as we discussed above.

func (vu moduleVU) Context() context.Context {
	...
	ctx := k6ext.WithProcessID(vu.VU.Context(), vu.browserProcessID)
	return k6ext.WithVU(ctx, vu.VU)
}
...
func mapBrowserType(vu moduleVU, bt api.BrowserType) mapping {
	...
	return mapping{
		...
		"launch": func(opts goja.Value) *goja.Object {
			var b api.Browser
			b, vu.browserProcessID = bt.Launch(opts)
			m := mapBrowser(vu, b)
			return rt.ToValue(m).ToObject(rt)
		},
	}
}

@grafana/k6-browser

This will allow us to panic from the mapping layer and ensure that the
browser process can be shutdown by retrieving the pid from the context
@ankur22
Copy link
Collaborator Author

ankur22 commented Feb 10, 2023

I was wondering whether we can solve this issue more simply without doing medium-sized changes (#755 and #762).

This is indeed a more elegant solution.

I still prefer the #755 solution since it handles the closing of all browser processes that a test might start up. I'd be happy to remove the ugly runID code if we were in agreement to using the hacky workaround of using Kill in #762.

I think #762 is still a valid issue that we should merge in since calling Release on the process will prevent the Kill from functioning as we expect, which can lead to zombie browser processes.

@inancgumus
Copy link
Member

inancgumus commented Feb 10, 2023

Thanks, @ankur22! I believe we can start simpler with this solution and work towards the others next. Since this solution is a low-hanging fruit.

I still prefer the #755 solution since it handles the closing of all browser processes that a test might start up.

Currently (v0.8), we should also be closing all test browser processes since each test gets its own context, and we should already be cleaning them up. Then we can include #762 for sure if you think that's critical.

@ankur22
Copy link
Collaborator Author

ankur22 commented Feb 10, 2023

Currently (v0.8), we should also be closing all test browser processes since each test gets its own context, and we should already be cleaning them up. Then we can include #762 for sure if you think that's critical.

That's true about the context. What i meant was when a panic occurs in k6ext.Panic, it can only retrieve the single PID, but if the browser test spawns multiple browsers then the panic will only try to kill the last opened browser. With the register in #755, we can register multiple pids, that then kill all open browser processes. The register also helps encapsulate pids in the k6os package.

@inancgumus
Copy link
Member

inancgumus commented Feb 21, 2023

@ankur22, this should be unblocked now 🥳 Thanks for your cooperation and understanding 🙇 #779 couldn't be possible without your previous work.

@ankur22
Copy link
Collaborator Author

ankur22 commented Feb 27, 2023

Closing and changes moved to #796 as reworking this PR would be too much work.

@ankur22 ankur22 closed this Feb 27, 2023
@ankur22 ankur22 deleted the refactor/688-init branch March 16, 2023 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants