Non-iterational goroutines should not abort an iteration #688

inancgumus · 2023-01-05T09:42:15Z

Brief summary

Goroutines other than the one that drives an iteration should not panic. Otherwise, k6 won't catch the panic and fail.

We should take control of our non-iteration goroutines: Not panic and cause an iteration to be prematurely aborted.

More details about how to craft a solution:

xk6-browser version

v0.7.0

OS

Staging Cloud

Docker version and image (if applicable)

grafana/k6-cloud-scripts-to-archive:2.40.6-xk6browser-v0.7.0-1

Steps to reproduce the problem

Pawel ran the following script, and we got sentry alerts because of the panics.

This script panics.

import { check, sleep } from 'k6';
import { Counter } from 'k6/metrics';
import { chromium } from 'k6/x/browser';


const checkpoint1 = new Counter('checkpoint1');
const checkpoint2 = new Counter('checkpoint2');
const checkpoint3 = new Counter('checkpoint3');
const checkpoint4 = new Counter('checkpoint4');
const checkpoint5 = new Counter('checkpoint5');
const checkpoint6 = new Counter('checkpoint6');
const checkpoint7 = new Counter('checkpoint7');


export let options = {
  vus: 100,
  duration: '10h',
}

export function setup(){
  console.log("Running setup")
}

export default function () {
  const browser = chromium.launch({
    headless: true
  });
  const context = browser.newContext();
  const page = context.newPage();

  checkpoint1.add(1);

  // Goto front page, find login link and click it
  page.goto('https://test.k6.io/', { waitUntil: 'networkidle' });

  checkpoint2.add(1);

  Promise.all([
    page.waitForNavigation(),
    page.locator('a[href="/my_messages.php"]').click(),
  ]).then(() => {
    checkpoint3.add(1)

    // Enter login credentials and login
    page.locator('input[name="login"]').type('admin');
    page.locator('input[name="password"]').type('123');
    // We expect the form submission to trigger a navigation, so to prevent a
    // race condition, setup a waiter concurrently while waiting for the click
    // to resolve.

    checkpoint4.add(1)

    return Promise.all([
      page.waitForNavigation(),
      page.locator('input[type="submit"]').click(),
    ]);
  }).then(() => {
    checkpoint5.add(1)

    check(page, {
      'header': page.locator('h2').textContent() == 'Welcome, admin!',
    });
  }).finally(() => {
    checkpoint6.add(1)
    page.close();
    browser.close();
    checkpoint7.add(1)
  });
  sleep(1);
}

export function teardown() {
  console.log("Running teardown")
}

The script has an incorrect usage of promises, and the fix is easy:

page.goto('https://test.k6.io/', { waitUntil: 'networkidle' });

// this should be page.goto(...).then(...)

However, the problem is deeper than it seems.

See more details in the test run: 1620122 (proxy).

Actual behaviour

Panic causes an iteration to be prematurely aborted.

The text was updated successfully, but these errors were encountered:

ankur22 · 2023-01-17T16:13:07Z

Types of panic

✅ go panic. -- all documented
k6ext.Panic.
✅ k6common.Throw. -- all documented

Panic vs Error

Error

Should denote to the user that an error occurred but there's a way around it, i.e. they can fix it themselves. This shouldn't stop the whole test but just the current iteration.

Panic

Should denote an abort of the whole test run (all iterations) as an unexpected internal error was encountered which we need to fix e.g. the goja runtime is missing from the content. The assumption i'm making is that if it happens in the current iteration it's probably happening in all iterations.

Areas where I think are ok to use panic (abort the whole test run)

NOTE:

Where runs on main thread Reason

xk6-browser/tests/test_browser.go

Line 349 in 63d5db7

panic("too many arguments to promiseThen")

Y

used in unit/integration tests

xk6-browser/tests/test_browser.go

Line 92 in 63d5db7

panic(fmt.Errorf("testBrowser: unexpected browser type %T", v))

Y

used in unit/integration tests

xk6-browser/k6ext/panic.go

Line 34 in 16c0b8c

defer k6common.Throw(rt, fmt.Errorf(format, a...))

Y

k6ext.panic eventually uses a k6common.Throw, this seems fine for now as long as k6ext.Panic is used sensibly

xk6-browser/log/logger.go

Line 181 in 9047580

panic(fmt.Sprintf("internal error while getting goroutine ID: %v", err))

Y

panics when goroutine id can't be found during logging

xk6-browser/keyboardlayout/layout.go

Line 72 in 4485c6d

panic(fmt.Sprintf("keyboard layout already registered: %s", lang))

Y

When keyboard registration fails

xk6-browser/k6ext/panic.go

Line 39 in 16c0b8c

panic("no browser process ID in context")

Y

No browser PID

xk6-browser/k6ext/panic.go

Line 22 in 16c0b8c

panic("no k6 JS runtime in context")

Y

No k6 goja runtime

xk6-browser/browser/module.go

Line 56 in f9153fd

k6common.Throw(vu.Runtime(), errors.New(msg))

Y

To prevent cloud runs

xk6-browser/browser/mapping.go

Line 52 in 575b975

k6common.Throw(rt, fmt.Errorf("mapping: %w", err))

Y

When the goja mapping fails

xk6-browser/common/options.go

Line 50 in 9047580

panic("dst should be one of: map, struct, slice")

Y

programmer error if we're not passing in the correct type (change to k6ext.Panic)

xk6-browser/common/options.go

Line 41 in 9047580

panic("dst should be a pointer")

Y

programmer error if we're not passing in the correct type (change to k6ext.Panic)

xk6-browser/common/frame.go

Line 401 in 575b975

panic(err)

N

programmer error if we're not passing in the correct type (change to k6ext.Panic)

xk6-browser/common/frame_session.go

Line 991 in 9884e91

panic(err)

Y

Programmer error -- trying to update subframe viewport which can only be done on the main frame (change to k6ext.Panic)

xk6-browser/common/element_handle.go

Line 1508 in 575b975

panic(fmt.Errorf("unexpected DOM error type %T", v))

?

unexpected error that we now need to code in

xk6-browser/common/element_handle.go

Line 1504 in 575b975

panic("DOM error is nil")

?

nil error from dom, programmer needs to investigate symptom

List of panics that could be errors (only stop the current test iteration)

Where

reason

xk6-browser/chromium/browser_type.go

Line 68 in e62e7c1

    
           k6common.Throw(rt, errors.New("BrowserType.connect() has not been implemented yet"))

unimplemented method, user can remove the call

xk6-browser/chromium/browser_type.go

Line 210 in e62e7c1

    
           k6common.Throw(rt, errors.New("BrowserType.LaunchPersistentContext(userDataDir, opts) has not been implemented yet"))

unimplemented method, user can remove the call

Question

Can we return an error to goja instead of panicking?
1. Yes we can. I tested this with the Launch method, and the behaviour seems to be the same if we panic from Launch or return an error. The difference is that the API changes from Launch(opts goja.Value) api.Browser to Launch(opts goja.Value) (api.Browser, error). POC: Refactor the Launch code to work with errors #718
2. IMO a panic should be used to denote an abort of the whole test run (all iterations) as an unexpected internal error was encountered which we need to fix e.g. the goja runtime is missing from the content.
3. IMO an error should denote a user error that can be fixed by the user themselves e.g. the element with name "foo" couldn't be found, and a time out occured. This shouldn't stop the whole test but just the current iteration.
Panic in code is ok if running on the "main" thread, which should mean it's ok to panic from APIs, e.g. goto since it will only stop the current iteration.
1. This is true. A panic on the "main" thread will stop the current iteration but not the whole test run.

ankur22 · 2023-01-30T12:34:16Z

We briefly discussed the strategy going forward, and we've agreed that we should pick a file and convert the ~~errors into panic~~ panics into errors where appropriate and ensure that a panic is only used if we need to abort the whole test process due to an unexpected internal error.

Based on the feedback from the first PR, we should be able to convert the rest of the codebase.

inancgumus · 2023-01-30T12:37:57Z

@ankur22, what I have in mind is to let the mapping layer handle the panics and the internal (core/business logic) parts of the code return errors. This is more idiomatic and maintainable in the long run (from the perspective of tests + code). But it's not something you necessarily need to tackle in this issue. I agree with the rest of your findings 👍

ankur22 · 2023-01-30T12:40:56Z

That's a good idea. We could assert on the behaviour of the error to determine whether to panic or return the error: https://dave.cheney.net/2014/12/24/inspecting-errors

This allows us to: - Better handle extension errors - Leaves the decision of handling to Goja when an element is not found. This is how it's done in the k6-core and other extensions. Updates: #804 Related: #688

* Add error handling to wildcard selectors This allows us to: - Better handle extension errors. - Leaves the decision of properly handling errors to Goja when an element is not found. This is how it's done in the k6-core and other extensions. - Returning an error from the mapping layer also prevents multiple unnecessary panics. Updates: #804 Related: #688

Updates: #804 Related: #688

inancgumus added bug Something isn't working compatibility k6 core compatibility ux next Might be eligible for the next planning (not guaranteed!) labels Jan 5, 2023

inancgumus added this to the v0.8.0 milestone Jan 5, 2023

inancgumus changed the title ~~Goroutines should not abort an iteration~~ Non-iterational goroutines should not abort an iteration Jan 5, 2023

ankur22 assigned ankur22 and unassigned ankur22 Jan 5, 2023

inancgumus added evaluate and removed next Might be eligible for the next planning (not guaranteed!) labels Jan 6, 2023

ankur22 self-assigned this Jan 9, 2023

ankur22 mentioned this issue Feb 1, 2023

Return an error from BrowserContext.AddCookies and handle panic in mapping layer #744

Closed

inancgumus mentioned this issue Feb 2, 2023

Remove Async handling from the core methods #746

Merged

ankur22 modified the milestones: v0.8.0, v0.9.0 Feb 2, 2023

inancgumus mentioned this issue Feb 6, 2023

Improve user error messages on panic #753

Open

inancgumus mentioned this issue Feb 21, 2023

Clear zombie browser processes on panic #779

Merged

ankur22 removed their assignment Feb 21, 2023

ka3de mentioned this issue Feb 28, 2023

Implement BrowserType.connect #800

Merged

andrewslotin removed the evaluate label Feb 28, 2023

inancgumus mentioned this issue Mar 9, 2023

Add error handling to wildcard selectors #820

Merged

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to EvaluateHandle

0d5de5e

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to getProperties

95a0d8b

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to contentFrame

8d2f71c

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to waitForSelector

ac0a395

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to FrameElement

adaf1f9

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to newPage

e341851

Updates: #804 Related: #688

inancgumus mentioned this issue Mar 14, 2023

Fix Edge-case: Mapping fails if the extension code returns nil values #804 #826

Merged

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to EvaluateHandle

423bb02

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to getProperties

2eb28cc

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to contentFrame

8cb8539

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to ownerFrame

e1121a7

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to waitForSelector

aa18c6a

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to FrameElement

904bd9b

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to newPage

568133b

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to EvaluateHandle

a5411f8

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to getProperties

2513a3f

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to contentFrame

20094c0

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to ownerFrame

2f42334

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to waitForSelector

dac28f8

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to FrameElement

62ca4d1

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 14, 2023

Add error handling to newPage

d9bc3c2

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 15, 2023

Add error handling to ownerFrame

5cabef4

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 15, 2023

Add error handling to waitForSelector

bad95b5

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 15, 2023

Add error handling to FrameElement

c60bb48

Updates: #804 Related: #688

inancgumus added a commit that referenced this issue Mar 15, 2023

Add error handling to newPage

fcaddb3

Updates: #804 Related: #688

inancgumus modified the milestones: v0.9.0, v0.10.0 Mar 31, 2023

inancgumus added the epic label May 29, 2023

ankur22 mentioned this issue Sep 21, 2023

Fix browserContext.waitForEvent #1042

Merged

3 tasks

ankur22 mentioned this issue Nov 22, 2023

Refactor isHidden to return an error #1109

Merged

3 tasks

inancgumus mentioned this issue Dec 4, 2023

Panic: Unexpected end of JSON input #987

Closed

inancgumus mentioned this issue Dec 14, 2023

Making k6/experimental/browser module part of k6 core #1117

Closed

inancgumus mentioned this issue May 30, 2024

Turn Browser panics into errors #1360

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Non-iterational goroutines should not abort an iteration #688

Non-iterational goroutines should not abort an iteration #688

inancgumus commented Jan 5, 2023 •

edited

Loading

Tasks

ankur22 commented Jan 17, 2023 •

edited

Loading

ankur22 commented Jan 30, 2023 •

edited

Loading

inancgumus commented Jan 30, 2023

ankur22 commented Jan 30, 2023

Non-iterational goroutines should not abort an iteration #688

Non-iterational goroutines should not abort an iteration #688

Comments

inancgumus commented Jan 5, 2023 • edited Loading

Brief summary

xk6-browser version

OS

Docker version and image (if applicable)

Steps to reproduce the problem

Actual behaviour

Tasks

ankur22 commented Jan 17, 2023 • edited Loading

Types of panic

Panic vs Error

Error

Panic

Areas where I think are ok to use panic (abort the whole test run)

List of panics that could be errors (only stop the current test iteration)

Question

ankur22 commented Jan 30, 2023 • edited Loading

inancgumus commented Jan 30, 2023

ankur22 commented Jan 30, 2023

inancgumus commented Jan 5, 2023 •

edited

Loading

ankur22 commented Jan 17, 2023 •

edited

Loading

ankur22 commented Jan 30, 2023 •

edited

Loading