NETSCRIPT: Update ScriptDeath to extend Error #1685

alainbryden · 2024-10-07T01:37:21Z

Summary of changes

Rename the "name" property to "hostname" so that it doesn't clash with inherited Error.name property.
Extend Error, this way if a user ends up catching it they can:
- Test the type of error caught (name) and ignore it if desired
- Get a stack trace and find out where they went wrong
- Not require special case error printing logic for catching things that are neither strings, nor Error objects.

Motivation

Not sure how many have seen one of these:

We throw a new ScriptDeath instance in the netscript checkEnvFlags when we want to (silently) kill code still running on a script that's been killed - whenever they next invoke an ns instance. Problem is users with their own error handling logic can catch these, and it's difficult to make heads or tails of them. If they behaved more like normal Error instances, they would be easier to deal with.

As of #1513 - we no longer even display these errors in the UI when handled, so this is really just for the benefit of the user, who might still catch one of these in their try/catch handlers and not know what it is or where it came from (I was one such user).

Tests

nano test.js

let _ns = null;
/** @param {NS} ns */
export async function main(ns) {
    try {
        _ns ??= ns; // Simulates a bad global cache (usually a result of bad coding)
        ns.tail();
        runHelper();
    } catch (err) {
        ns.print('ERROR: Caught an unexpected error!')
        ns.print('Direct ns.print:')
        ns.print(err)
        ns.print('Class Name: ' + err.constructor?.name)
        ns.print('Name: ' + err.name)
        if (err.toString !== undefined)
            ns.print('toString: ' + err.toString())
        if (err.stack !== undefined)
            ns.print('stack: ' + err.stack)
        ns.print('Stringified: ' + JSON.stringify(err))
        ns.print('Manually serialized: ' + Object.keys(err).map(k => `${k}: ${err[k]}`).join(', '))
    }
}

function runHelper() {
    _ns.print("Now run this script a second time.");
}

This is what it looks like when you run this script twice within a few seconds on the current "stable build"

This is what it looks like when you run it on the latest "dev" build (2.6.3):

Note that the unused "errorMessage" property was recently removed - other than that, no difference.

This is what it now looks like with the change in place:

In release mode, the "internal" part of the stack trace (first 2 lines) would be minified, but the important part is that the player can now see that the line of code in test.js that caused the error was "runHelper.js:24:7":

This information, combined with the error message "Attempted to invoke an unsupported ns function from a killed process" hopefully points them in the right direction (that they were using a "bad" ns instance from the previous run)

- Rename the "name" property to "hostname" so that it doesn't clash with inherited Error.name property. - Extend Error, this way if a user ends up catching it they can: - Test the type of error caught (name) and ignore it if desired - Get a stack trace and find out where they went wrong - Not require special case error printing logic for catching things that are neither strings, nor Error objects.

d0sboots · 2024-10-08T06:11:55Z

There's one (potential) significant issue here: ScriptDeath objects are created in a variety of circumstances, and the (small) object currently created is much cheaper than a full Error which contains stacktrace information. This might not be an issue, because normal script termination doesn't invoke ScriptDeath (or it can be trivially tweaked to not), but many scenarios that involve kill do. I'm not certain, but I'm wary that there may be a high-performance usecase that currently involves a lot of killing.

Is there a different way the object could be laid out that would give you more of a hint, without requiring it to be a full Error?

Alpheus · 2024-10-08T06:31:37Z

There's one (potential) significant issue here: ScriptDeath objects are created in a variety of circumstances, and the (small) object currently created is much cheaper than a full Error which contains stacktrace information. This might not be an issue, because normal script termination doesn't invoke ScriptDeath (or it can be trivially tweaked to not), but many scenarios that involve kill do. I'm not certain, but I'm wary that there may be a high-performance usecase that currently involves a lot of killing.

Is there a different way the object could be laid out that would give you more of a hint, without requiring it to be a full Error?

You can get a stack trace from new Error().stack but the performance impact will be from what the error does primarily, not how we're passing it.

Though you said it may not be an issue, let's run everyone's batchers on dev after the merge for a quick test or do perf tests (there already are some).

d0sboots · 2024-10-08T06:36:21Z

The issue is that I'm concerned there's a class of scripts I don't know about, that will be impacted. There may not be, but I vaguely feel like there might be. And there is an overhead from Error, because the stack has to grabbed at the time of creation.

If it would be good enough to implement toString on ScriptDeath so that it has a more friendly representation, that has no overhead (since it's just part of the prototype).

alainbryden · 2024-10-08T12:19:44Z

To help with the decision, here's a performance test of the two I whipped up.
Throwing the new object is 4-10 times more expensive depending on environment, but we can still generally throw ten thousand of these in under 30 milliseconds in a "kill-all-scripts" scenario:

Latest dev run in Firefox:

Latest dev run in Chrome:

Latest Steam:

Test code is below:

/** @param {NS} ns */
export async function main(ns) {
    let lastError;
    const tests = [1, 10, 100, 1000, 10_000, 100_000];
    for (const testSize of tests) {
        const startTime = performance.now();
        for (let i = 0; i < testSize; i++) {
            try { trace(true) } catch (error) { lastError = error; }
        }
        const endTime = performance.now();
        ns.print(`Took ${endTime - startTime} milliseconds to throw ${testSize} NEW ScriptDeath errors.`);
    }
    ns.print(`ERROR: ${lastError}\n${lastError.stack}`);

    for (const testSize of tests) {
        const startTime = performance.now();
        for (let i = 0; i < testSize; i++) {
            try { trace(false) } catch (error) { lastError = error; }
        }
        const endTime = performance.now();
        ns.print(`Took ${endTime - startTime} milliseconds to throw ${testSize} OLD ScriptDeath errors.`);
    }
    ns.print(`ERROR: ${lastError}\n${lastError.stack}`);
}

function trace(useNewScriptDeath) {
    stack(useNewScriptDeath);
}

function stack(useNewScriptDeath) {
    deep(useNewScriptDeath);
}

function deep(useNewScriptDeath) {
    some(useNewScriptDeath);
}

function some(useNewScriptDeath) {
    if (useNewScriptDeath)
        throw new NewScriptDeath(mockProcess);
    else
        throw new OldScriptDeath(mockProcess);
}

const mockProcess = { 'pid': 1234, 'name': 'test.js', 'hostname': 'home' };

export class OldScriptDeath {
    /** Process ID number. */
    pid;
    /** Filename of the script. */
    filename;
    /** IP Address on which the script was running */
    hostname;

    constructor(ws) {
        this.pid = ws.pid;
        this.name = ws.name;
        this.hostname = ws.hostname;

        Object.freeze(this);
    }
}

export class NewScriptDeath extends Error {
    /** Process ID number. */
    pid;
    /** Filename of the script. */
    filename;
    /** IP Address on which the script was running */
    hostname;

    constructor(ws) {
        // Invoke the Error constructor with a meaningful message
        const message = `Attempted to invoke an unsupported ns function from a killed process (${ws.name} running on ${ws.hostname} with pid ${ws.pid})`;
        super(message);
        // Setting the base Error.name property is important to facilitate easy
        // detection, since prototype.constructor.name might be minified for them.
        this.name = "ScriptDeath";

        // Set own properties
        this.pid = ws.pid;
        this.filename = ws.name;
        this.hostname = ws.hostname;

        // Fix prototype, required when extending Error
        Object.setPrototypeOf(this, NewScriptDeath.prototype);
        Object.freeze(this);
    }
}

Alpheus · 2024-10-08T17:21:56Z

Could you check in the tests as well please?

catloversg · 2024-10-08T17:28:29Z

I think d0sboots's suggestion is the better approach. We can implement 2 functions:

toString gives a short but still useful error message.
getStack gives a full stack trace if the player wants.

PoC:

export class ScriptDeath {
  /** Process ID number. */
  pid: number;

  /** Filename of the script. */
  name: string;

  /** IP Address on which the script was running */
  hostname: string;

  constructor(ws: WorkerScript) {
    this.pid = ws.pid;
    this.name = ws.name;
    this.hostname = ws.hostname;

    Object.freeze(this);
  }

  toString(): string {
    return `Attempted to invoke an NS function from a killed process (${this.name} running on ${this.hostname} with pid ${this.pid}). Call getStack() to get the stack trace.`;
  }

  getStack(): string | undefined {
    return Error().stack;
  }
}

Alpheus · 2024-10-08T17:38:56Z

I think d0sboots's suggestion is the better approach. We can implement 2 functions:

toString gives a short but still useful error message.
getStack gives a full stack trace if the player wants.

PoC:

export class ScriptDeath {
  /** Process ID number. */
  pid: number;

  /** Filename of the script. */
  name: string;

  /** IP Address on which the script was running */
  hostname: string;

  constructor(ws: WorkerScript) {
    this.pid = ws.pid;
    this.name = ws.name;
    this.hostname = ws.hostname;

    Object.freeze(this);
  }

  toString(): string {
    return `Attempted to invoke an NS function from a killed process (${this.name} running on ${this.hostname} with pid ${this.pid}). Call getStack() to get the stack trace.`;
  }

  getStack(): string | undefined {
    return Error().stack;
  }
}

Won't that generate the wrong trace?

catloversg · 2024-10-08T17:48:15Z

No.
The only difference is between this:

and this:

What matters is the stack trace in the player's scripts. That's what they need to troubleshoot their problems.

Alpheus · 2024-10-08T17:51:27Z

No. The only difference is between this: and this: What matters is the stack trace in the player's scripts. That's what they need to troubleshoot their problems.

It's odd to me that its implied you have to call stack immediately after the object is created. If the snapshot time is important then we're talking about 2 different ScriptDeaths, one with a stack, one without. Why not add a separate constructor for the stack case instead?

catloversg · 2024-10-08T17:59:56Z

I don't understand your question. When the player catches a ScriptDeath, they can optionally get the stack trace (via calling getStack() on that ScriptDeath instance) to see why their script tries to run an NS function after the script was killed. If they don't care about that, they (and we) don't need to do anything. "On-demand" getStack() is why there is need to worry about the (potential) performance impact of a new Error object.

alainbryden · 2024-10-09T00:14:09Z

Alpheus is right, that's the wrong stack trace - the stack where the exception was handled, not where it was thrown.

Try running the example I gave in my previous description. The stack trace should at least start from "runHelper", but it will only include main.

Throw an error in a function 4 or 5 deep (like in my performance test) and it'll be even more clear.

d0sboots · 2024-10-09T00:22:00Z

Throwing the new object is 4-10 times more expensive depending on environment, but we can still generally throw ten thousand of these in under 30 milliseconds in a "kill-all-scripts" scenario

This is what I worried about, because IIRC that's approaching the same speed of running scripts in a tight loop, currently. So in performance critical code, this very well could be an issue.

Alpheus is right, that's the wrong stack trace - the stack where the exception was handled, not where it was thrown.

Also agreed. If the stack trace is actually needed, it must be captured at the creation of the object. What is still not clear to me is if/why the stack trace is needed, or if it's good enough to clearly identify the object in other ways.

alainbryden · 2024-10-09T00:33:23Z

Throwing the new object is 4-10 times more expensive depending on environment, but we can still generally throw ten thousand of these in under 30 milliseconds in a "kill-all-scripts" scenario

This is what I worried about, because IIRC that's approaching the same speed of running scripts in a tight loop, currently. So in performance critical code, this very well could be an issue.

Is there any scenario we would be throwing on the order of 10,000 script deaths while someone is trying to run their performance-critical (e.g. H/G/W timing) loops? Only time I think this would happen is resetting / ascending, in which case hack timing loops don't matter.

As you said earlier, normal scripts ending don't trigger this, only an explicit kill command trying to prematurely terminate a running script.

Here's my thinking:

Typically scripts are killed only one at a time. Maybe a few hundred in special circumstances (e.g. a user killing a slew of scheduled hack/grow scripts that are going against stock they want to manipulate). In these cases, its still sub-1ms, not even noticeable.

The only time all thousands or tens of thousands get killed, which is still only 30ms, less than a game tick, is when resetting, at which point user-script timings don't matter anymore (they're all being killed). Even then, I would suspect that there are much slower processes being kicked off to reset most aspects of the game (mostly UI stuff), and this is just a drop in the bucket that amounts to having to wait slightly longer for prestigeWorkerScripts to finish up.

alainbryden · 2024-10-09T00:45:20Z

What is still not clear to me is if/why the stack trace is needed, or if it's good enough to clearly identify the object in other ways.

This is a tough pitch, you've either been in a situation where stack traces save your bacon, or you haven't. Anecdotally, a bunch of my scripts are upwards of 1000 lines of code. Unless we have a way to say which line of code triggered the ns call that blew up, I'd be hard pressed to track it down.

This is what happened to me the other day. I made a slight change to my daemon.js (1997 lines of code) wherein I cached a list of Server objects (custom helper class) in sorted order, so that I didn't have to pay the O(N^2) price of re-sorting them each loop. This cache was declared at the top of the file with a few others, but whenever I introduced it, I forgot to also re-initialize it at the start of my "function main(ns)" with the other globals...

This is the change that murdered me, after 2 years of barely touching this code:

Rather than re-sort targets by their hack-experience potential on each loop, I decided to cache it after the first time it was retrieved, since it doesn't change significantly over time. Unfortunately, this variable which used to be overwritten on each loop was now persisted between script invocations.

The Server objects get this little "convenience" copy of the ns object when they're constructed, which is supposed to die with the class as it's disposed (when the script dies):

Innocuous enough, in the name of letting me define some ns-dependent getters, but definitely a recipe for disaster if somehow this "Server" instance doesn't die with the script.

So suddenly after a few days of coding changes (including that addition) my script started throwing encountering these weird throws and I had to start scouring all recently changed code to see what could have caused it. I was this close to wrapping each and every function in my script (about 100) with a try/catch and custom log so I could at least narrow down which function the error was happening in.

I finally solved it on a hunch, after remembering the issues the game had with global properties still having their old values between different runs of the same script. (I know this is a me-problem, and I need to re-write all my old scripts to not use any globals, but it's a big undertaking)

d0sboots · 2024-10-09T03:54:56Z

You might have misunderstood what I was asking, but that's OK because you also answered the question just now. :) I wasn't sure if the problem was "it was unclear initially what these error objects were" or "I needed the stack traces to figure out what was going wrong in my code." From your description, it was primarily the latter.

I did an audit of the code, to find all the places a new ScriptDeath object can be created. I found these cases:

ns.exit()
ns.spawn()
Calling an ns function after the script is already dead.
Finishing a promise-based operation (like hack(), grow(), etc.) after the script is already dead.
Killing (for any reason) a script while a promise-based operation is ongoing. (Note that this is a different codepath than the previous bullet, but they essentially cover the same scenario. They are guarding different edge cases.)

Out of all of these, I only have high-frequency concerns about the last, and maybe the first. (But if someone is using ns.exit to kill themselves and running something like a batcher, probably they can restructure their code.) Even in the last case, killing a script while it is executing a long-running operation is not something you should be doing thousands of times per second; that seems like a failure to calculate proper run times. So, on the whole I think I will note my potential concerns, but say they are overridden by the better debuggability here.

src/Netscript/ScriptDeath.ts

Alpheus · 2024-10-09T05:56:56Z

I did an audit of the code, to find all the places a new ScriptDeath object can be created. I found these cases:

ns.exit()

ns.spawn()

Calling an ns function after the script is already dead.

Finishing a promise-based operation (like hack(), grow(), etc.) after the script is already dead.

Killing (for any reason) a script while a promise-based operation is ongoing. (Note that this is a different codepath than the previous bullet, but they essentially cover the same scenario. They are guarding different edge cases.)

Out of all of these, I only have high-frequency concerns about the last, and maybe the first. (But if someone is using ns.exit to kill themselves and running something like a batcher, probably they can restructure their code.)

Now how do we make sure that we don't accidentally introduce ScriptDeath into one of the high-speed paths in the future? Or at least signal that it's a bad idea?

alainbryden · 2024-10-09T11:13:32Z

I see what you're saying. We are creating ScriptDeath objects in a few places where they aren't necessarily being thrown, and in those non-throw situations extending Error is quite frivolous...

Is it worth creating a separate ScriptDeathError object (constructed with a ScriptDeath object) which we construct only as and when we're getting ready to explicitly throw a ScriptDeath? Other cases where we merely return, or reject a promise via a ScriptDeath instance can continue using the simple object.

Alpheus · 2024-10-09T11:31:16Z

I see what you're saying. We are creating ScriptDeath objects in a few places where they aren't necessarily being thrown, and in those non-throw situations extending Error is quite frivolous...

Is it worth creating a separate ScriptDeathError object (constructed with a ScriptDeath object) which we construct only as and when we're getting ready to explicitly throw a ScriptDeath? Other cases where we merely return, or reject a promise via a ScriptDeath instance can continue using the simple object.

The crazy+good UX path would likely entail having a config option. Ie. an interface setting in-game that controls the stack trace generation. ("Enable Stack traces on Script Death")

That way you could be more aggressive on the performance for the positive case without impacting users. Also makes this PR much more trivial to merge if it's controllable.

alainbryden · 2024-10-09T12:12:02Z

I'm going off on a bit of a tangent here, but I'm wondering if there are other clever ways we can invalidate the ns object that was previously passed to a script (and which is perhaps living longer than it should).

Like if instead of passing an NS instance when we invoke a script's main we passed a revocable Proxy to an NS object - then when any script is killed we can simply revoke the proxy and any further attempts to call NS functions would naturally generate a TypeError with a stack trace to where they tried to use the "dead" ns instance.

That's probably a bigger refactor, but it seems to have been designed precisely for our use case. To be sure, we'd still need to deal with many existing edge cases such as rejecting promises-in-progress, but it may simplify others.

Edit I guess upon closer inspection NSProxyHandler is already our own home-grown version of the Proxy<T> built-in, and helpers.checkEnvFlags is our "revoke" implementation, where we throw a ScriptDeath object if the instance is revoked.

src/Netscript/ScriptDeath.ts

Co-authored-by: David Walker <d0sboots@gmail.com>

d0sboots · 2024-10-10T04:13:20Z

I see what you're saying. We are creating ScriptDeath objects in a few places where they aren't necessarily being thrown, and in those non-throw situations extending Error is quite frivolous...

Actually in all the situations I found, we do throw the ScriptDeath. (Or reject a Promise with it, which is equivalent.)

Now how do we make sure that we don't accidentally introduce ScriptDeath into one of the high-speed paths in the future? Or at least signal that it's a bad idea?

A combination of review diligence and general speed testing. (We do check to see how long various operations take, and if certain things, especially the common path of script creation/destruction, get slower, people with batchers will notice.

The "good" answer would involve automated tracking of metrics like scripts/second, but we are nowhere near there. I've done that sort of testing professionally, and it's a pain to set up due to its inherent flakiness, and only worth it if you actually are willing to spend the time to chase down every performance regression.

* Update ScriptDeath to extend Error - Rename the "name" property to "hostname" so that it doesn't clash with inherited Error.name property. - Extend Error, this way if a user ends up catching it they can: - Test the type of error caught (name) and ignore it if desired - Get a stack trace and find out where they went wrong - Not require special case error printing logic for catching things that are neither strings, nor Error objects. It is possible (but unlikely) that this could make killing scripts slower in some circumstances.

Alpheus approved these changes Oct 7, 2024

View reviewed changes

d0sboots reviewed Oct 9, 2024

View reviewed changes

src/Netscript/ScriptDeath.ts Outdated Show resolved Hide resolved

src/Netscript/ScriptDeath.ts Outdated Show resolved Hide resolved

src/Netscript/ScriptDeath.ts Outdated Show resolved Hide resolved

alainbryden commented Oct 9, 2024

View reviewed changes

src/Netscript/ScriptDeath.ts Outdated Show resolved Hide resolved

src/Netscript/ScriptDeath.ts Outdated Show resolved Hide resolved

Apply suggestions from code review

83110e1

Co-authored-by: David Walker <d0sboots@gmail.com>

d0sboots merged commit cd99006 into bitburner-official:dev Oct 10, 2024
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NETSCRIPT: Update ScriptDeath to extend Error #1685

NETSCRIPT: Update ScriptDeath to extend Error #1685

alainbryden commented Oct 7, 2024

d0sboots commented Oct 8, 2024

Alpheus commented Oct 8, 2024

d0sboots commented Oct 8, 2024 •

edited

Loading

alainbryden commented Oct 8, 2024

Alpheus commented Oct 8, 2024

catloversg commented Oct 8, 2024

Alpheus commented Oct 8, 2024

catloversg commented Oct 8, 2024

Alpheus commented Oct 8, 2024

catloversg commented Oct 8, 2024

alainbryden commented Oct 9, 2024

d0sboots commented Oct 9, 2024

alainbryden commented Oct 9, 2024 •

edited

Loading

alainbryden commented Oct 9, 2024 •

edited

Loading

d0sboots commented Oct 9, 2024

Alpheus commented Oct 9, 2024

alainbryden commented Oct 9, 2024 •

edited

Loading

Alpheus commented Oct 9, 2024

alainbryden commented Oct 9, 2024 •

edited

Loading

d0sboots commented Oct 10, 2024 •

edited

Loading

NETSCRIPT: Update ScriptDeath to extend Error #1685

NETSCRIPT: Update ScriptDeath to extend Error #1685

Conversation

alainbryden commented Oct 7, 2024

Summary of changes

Motivation

Tests

d0sboots commented Oct 8, 2024

Alpheus commented Oct 8, 2024

d0sboots commented Oct 8, 2024 • edited Loading

alainbryden commented Oct 8, 2024

Alpheus commented Oct 8, 2024

catloversg commented Oct 8, 2024

Alpheus commented Oct 8, 2024

catloversg commented Oct 8, 2024

Alpheus commented Oct 8, 2024

catloversg commented Oct 8, 2024

alainbryden commented Oct 9, 2024

d0sboots commented Oct 9, 2024

alainbryden commented Oct 9, 2024 • edited Loading

alainbryden commented Oct 9, 2024 • edited Loading

d0sboots commented Oct 9, 2024

Alpheus commented Oct 9, 2024

alainbryden commented Oct 9, 2024 • edited Loading

Alpheus commented Oct 9, 2024

alainbryden commented Oct 9, 2024 • edited Loading

d0sboots commented Oct 10, 2024 • edited Loading

d0sboots commented Oct 8, 2024 •

edited

Loading

alainbryden commented Oct 9, 2024 •

edited

Loading

alainbryden commented Oct 9, 2024 •

edited

Loading

alainbryden commented Oct 9, 2024 •

edited

Loading

alainbryden commented Oct 9, 2024 •

edited

Loading

d0sboots commented Oct 10, 2024 •

edited

Loading