Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-iterational goroutines should not abort an iteration #688

Open
12 tasks
inancgumus opened this issue Jan 5, 2023 · 4 comments
Open
12 tasks

Non-iterational goroutines should not abort an iteration #688

inancgumus opened this issue Jan 5, 2023 · 4 comments
Labels
bug Something isn't working compatibility k6 core compatibility epic ux
Milestone

Comments

@inancgumus
Copy link
Member

inancgumus commented Jan 5, 2023

Brief summary

Goroutines other than the one that drives an iteration should not panic. Otherwise, k6 won't catch the panic and fail.

We should take control of our non-iteration goroutines: Not panic and cause an iteration to be prematurely aborted.

More details about how to craft a solution:

xk6-browser version

v0.7.0

OS

Staging Cloud

Docker version and image (if applicable)

grafana/k6-cloud-scripts-to-archive:2.40.6-xk6browser-v0.7.0-1

Steps to reproduce the problem

Pawel ran the following script, and we got sentry alerts because of the panics.

This script panics.
import { check, sleep } from 'k6';
import { Counter } from 'k6/metrics';
import { chromium } from 'k6/x/browser';


const checkpoint1 = new Counter('checkpoint1');
const checkpoint2 = new Counter('checkpoint2');
const checkpoint3 = new Counter('checkpoint3');
const checkpoint4 = new Counter('checkpoint4');
const checkpoint5 = new Counter('checkpoint5');
const checkpoint6 = new Counter('checkpoint6');
const checkpoint7 = new Counter('checkpoint7');


export let options = {
  vus: 100,
  duration: '10h',
}

export function setup(){
  console.log("Running setup")
}

export default function () {
  const browser = chromium.launch({
    headless: true
  });
  const context = browser.newContext();
  const page = context.newPage();

  checkpoint1.add(1);

  // Goto front page, find login link and click it
  page.goto('https://test.k6.io/', { waitUntil: 'networkidle' });

  checkpoint2.add(1);

  Promise.all([
    page.waitForNavigation(),
    page.locator('a[href="/my_messages.php"]').click(),
  ]).then(() => {
    checkpoint3.add(1)

    // Enter login credentials and login
    page.locator('input[name="login"]').type('admin');
    page.locator('input[name="password"]').type('123');
    // We expect the form submission to trigger a navigation, so to prevent a
    // race condition, setup a waiter concurrently while waiting for the click
    // to resolve.

    checkpoint4.add(1)

    return Promise.all([
      page.waitForNavigation(),
      page.locator('input[type="submit"]').click(),
    ]);
  }).then(() => {
    checkpoint5.add(1)

    check(page, {
      'header': page.locator('h2').textContent() == 'Welcome, admin!',
    });
  }).finally(() => {
    checkpoint6.add(1)
    page.close();
    browser.close();
    checkpoint7.add(1)
  });
  sleep(1);
}

export function teardown() {
  console.log("Running teardown")
}

The script has an incorrect usage of promises, and the fix is easy:

page.goto('https://test.k6.io/', { waitUntil: 'networkidle' });

// this should be page.goto(...).then(...)

However, the problem is deeper than it seems.

See more details in the test run: 1620122 (proxy).

Actual behaviour

Panic causes an iteration to be prematurely aborted.

Tasks

@inancgumus inancgumus added bug Something isn't working compatibility k6 core compatibility ux next Might be eligible for the next planning (not guaranteed!) labels Jan 5, 2023
@inancgumus inancgumus added this to the v0.8.0 milestone Jan 5, 2023
@inancgumus inancgumus changed the title Goroutines should not abort an iteration Non-iterational goroutines should not abort an iteration Jan 5, 2023
@ankur22 ankur22 assigned ankur22 and unassigned ankur22 Jan 5, 2023
@inancgumus inancgumus added evaluate and removed next Might be eligible for the next planning (not guaranteed!) labels Jan 6, 2023
@ankur22 ankur22 self-assigned this Jan 9, 2023
@ankur22
Copy link
Collaborator

ankur22 commented Jan 17, 2023

Types of panic

  1. ✅ go panic. -- all documented
  2. k6ext.Panic.
  3. k6common.Throw. -- all documented

Panic vs Error

Error

Should denote to the user that an error occurred but there's a way around it, i.e. they can fix it themselves. This shouldn't stop the whole test but just the current iteration.

Panic

Should denote an abort of the whole test run (all iterations) as an unexpected internal error was encountered which we need to fix e.g. the goja runtime is missing from the content. The assumption i'm making is that if it happens in the current iteration it's probably happening in all iterations.

Areas where I think are ok to use panic (abort the whole test run)

NOTE:

Where runs on main thread Reason
panic("too many arguments to promiseThen")
Y used in unit/integration tests
panic(fmt.Errorf("testBrowser: unexpected browser type %T", v))
Y used in unit/integration tests
defer k6common.Throw(rt, fmt.Errorf(format, a...))
Y k6ext.panic eventually uses a k6common.Throw, this seems fine for now as long as k6ext.Panic is used sensibly
panic(fmt.Sprintf("internal error while getting goroutine ID: %v", err))
Y panics when goroutine id can't be found during logging
panic(fmt.Sprintf("keyboard layout already registered: %s", lang))
Y When keyboard registration fails
panic("no browser process ID in context")
Y No browser PID
panic("no k6 JS runtime in context")
Y No k6 goja runtime
k6common.Throw(vu.Runtime(), errors.New(msg))
Y To prevent cloud runs
k6common.Throw(rt, fmt.Errorf("mapping: %w", err))
Y When the goja mapping fails
panic("dst should be one of: map, struct, slice")
Y programmer error if we're not passing in the correct type (change to k6ext.Panic)
panic("dst should be a pointer")
Y programmer error if we're not passing in the correct type (change to k6ext.Panic)
panic(err)
N programmer error if we're not passing in the correct type (change to k6ext.Panic)
Y Programmer error -- trying to update subframe viewport which can only be done on the main frame (change to k6ext.Panic)
panic(fmt.Errorf("unexpected DOM error type %T", v))
? unexpected error that we now need to code in
panic("DOM error is nil")
? nil error from dom, programmer needs to investigate symptom

List of panics that could be errors (only stop the current test iteration)

Where reason
k6common.Throw(rt, errors.New("BrowserType.connect() has not been implemented yet"))
unimplemented method, user can remove the call
k6common.Throw(rt, errors.New("BrowserType.LaunchPersistentContext(userDataDir, opts) has not been implemented yet"))
unimplemented method, user can remove the call

Question

  1. Can we return an error to goja instead of panicking?
    1. Yes we can. I tested this with the Launch method, and the behaviour seems to be the same if we panic from Launch or return an error. The difference is that the API changes from Launch(opts goja.Value) api.Browser to Launch(opts goja.Value) (api.Browser, error). POC: Refactor the Launch code to work with errors #718
    2. IMO a panic should be used to denote an abort of the whole test run (all iterations) as an unexpected internal error was encountered which we need to fix e.g. the goja runtime is missing from the content.
    3. IMO an error should denote a user error that can be fixed by the user themselves e.g. the element with name "foo" couldn't be found, and a time out occured. This shouldn't stop the whole test but just the current iteration.
  2. Panic in code is ok if running on the "main" thread, which should mean it's ok to panic from APIs, e.g. goto since it will only stop the current iteration.
    1. This is true. A panic on the "main" thread will stop the current iteration but not the whole test run.

@ankur22
Copy link
Collaborator

ankur22 commented Jan 30, 2023

We briefly discussed the strategy going forward, and we've agreed that we should pick a file and convert the errors into panic panics into errors where appropriate and ensure that a panic is only used if we need to abort the whole test process due to an unexpected internal error.

Based on the feedback from the first PR, we should be able to convert the rest of the codebase.

@inancgumus
Copy link
Member Author

@ankur22, what I have in mind is to let the mapping layer handle the panics and the internal (core/business logic) parts of the code return errors. This is more idiomatic and maintainable in the long run (from the perspective of tests + code). But it's not something you necessarily need to tackle in this issue. I agree with the rest of your findings 👍

@ankur22
Copy link
Collaborator

ankur22 commented Jan 30, 2023

That's a good idea. We could assert on the behaviour of the error to determine whether to panic or return the error: https://dave.cheney.net/2014/12/24/inspecting-errors

@ankur22 ankur22 modified the milestones: v0.8.0, v0.9.0 Feb 2, 2023
@ankur22 ankur22 removed their assignment Feb 21, 2023
inancgumus added a commit that referenced this issue Mar 9, 2023
This allows us to:
- Better handle extension errors
- Leaves the decision of handling to Goja when an element is not found.
  This is how it's done in the k6-core and other extensions.

Updates: #804
Related: #688
inancgumus added a commit that referenced this issue Mar 10, 2023
* Add error handling to wildcard selectors

This allows us to:
- Better handle extension errors.
- Leaves the decision of properly handling errors to Goja when an element is not found. This is how it's done in the k6-core and other extensions.
- Returning an error from the mapping layer also prevents multiple unnecessary panics.

Updates: #804
Related: #688
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 14, 2023
inancgumus added a commit that referenced this issue Mar 15, 2023
inancgumus added a commit that referenced this issue Mar 15, 2023
inancgumus added a commit that referenced this issue Mar 15, 2023
inancgumus added a commit that referenced this issue Mar 15, 2023
@inancgumus inancgumus modified the milestones: v0.9.0, v0.10.0 Mar 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working compatibility k6 core compatibility epic ux
Projects
None yet
Development

No branches or pull requests

3 participants