Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: make a standard API to run tests inside a worker #4441

Merged
merged 32 commits into from
Jan 12, 2024

Conversation

sheremet-va
Copy link
Member

@sheremet-va sheremet-va commented Nov 4, 2023

Instead of having several entry points for each pool, Vitest now exposes a single entry point that accepts an object with worker property that should export an object with getRpcOptions and runTests methods.

  • getRpcOptions should return an object that is passed down to birpc - this is primarily to establish communication between the threads (Vitest exposes createForksRpcOptions and createThreadsRpcOptions from vitest/workers entry point)
  • runTests receives WorkerGlobalState object which should be used to decide what tests to run and what the current config is (Vitest exposes runVmTests and runBaseTests methods from vitest/workers)

All pools workers are still bundled, but now they are located inside workers folder:

  • forks pool: vitest/dist/workers/forks.js
  • threads pool: vitest/dist/workers/threads.js
  • vmForks pool: vitest/dist/workers/vmForks.js
  • vmThreads pool: vitest/dist/workers/vmThreads.js

projectFiles on Vitest instance doesn't exist anymore, instead, there is distPath that can be used to resolve pool paths.

We don't consider this to be a breaking change because Vitest API is in an experimental state and doesn't follow SemVer. pool option is a public API but it doesn't have to rely on experimental Vitest API to work, - it's functionality was not changed.

Copy link

stackblitz bot commented Nov 4, 2023

Review PR in StackBlitz Codeflow Run & review this pull request in StackBlitz Codeflow.

Copy link

netlify bot commented Nov 4, 2023

Deploy Preview for fastidious-cascaron-4ded94 ready!

Name Link
🔨 Latest commit 8619291
🔍 Latest deploy log https://app.netlify.com/sites/fastidious-cascaron-4ded94/deploys/65a13c9c46643100077d0c10
😎 Deploy Preview https://deploy-preview-4441--fastidious-cascaron-4ded94.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@sheremet-va sheremet-va force-pushed the feat/refactor-pool-worker branch from 5eebc3b to b52461e Compare November 4, 2023 22:26
Copy link
Member

@AriPerkkio AriPerkkio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runtime/workers look good. I would imagine similar changes on main thread's side would simplify things a lot.

packages/vitest/src/types/rpc.ts Outdated Show resolved Hide resolved
packages/vitest/src/runtime/workers/forks.ts Outdated Show resolved Hide resolved
packages/vitest/src/integrations/vi.ts Outdated Show resolved Hide resolved
packages/vitest/src/runtime/worker.ts Show resolved Hide resolved
@AriPerkkio
Copy link
Member

Something like this should work on main thread's side for single instance of Tinypool. I didn't test it but it might work as is.

import { Tinypool } from 'tinypool'

const pool = new Tinypool({
  filename: 'dist/worker.js',
  // min/max options here. Shared by all pools.
})

async function waitForEmptyQueue() {
  return new Promise<void>((resolve) => {
    if (pool.queueSize === 0) {
      return resolve()
    }

    pool.once('drain', resolve)
  })
}

const runningTasks: Promise<void>[] = [];

// TODO handle vm ones too
async function runTests(files: string[], vitestPool: 'threads' | 'forks') {
  const runtime = vitestPool === 'threads' ? 'worker_threads' : 'child_process'

  await pool.recycleWorkers({ runtime })

  const data = {
    pool,
    files,
    // ... add more options here
  }

  // At this point there may be previous tasks running on different runtime.
  // All tasks that are now pushed to queue with `run` will start with the new runtime.
  // Note that `run` is not await'ed. Empty queue will indicate finished tasks.
  runningTasks.push(pool.run(data))
  await waitForEmptyQueue()
}

type TestFile = string
const testsByPool: ['threads' | 'forks', TestFile[]][] = [
  ['threads', ['... files ...']],
  ['forks', ['... files ...']],
]

// Run tests
for (const [pool, files] of testsByPool) {
  await runTests(files, pool)
}

// Wait for last round's tasks to finish too
await Promise.all(runningTasks);

@sheremet-va
Copy link
Member Author

sheremet-va commented Nov 5, 2023

await pool.recycleWorkers({ runtime })

Is it fast though? What does this mean for a non-isolated environment? Does it mean we would need to wait to start another process/worker and then restart it again? Can we optimize it somehow? Maybe create half workers and half forks, and close/restart other runtimes (so, 2/2 -> 3/1) when they finish running? (So, when there is a workers queue, but forks are idling)

@AriPerkkio
Copy link
Member

Is it fast though?

Calling recycleWorkers will create new worker with given runtime as soon as it's possible. If there are idle workers with previous runtime, those ones will be destroyed and new ones will be spawned. Recycling this many threads/processes is fast.

What does this mean for a non-isolated environment? Does it mean we would need to wait to start another process/worker and then restart it again?

From Tinypool's API's perspective it's identical usage. When we want to change the runtime, recycleWorkers is needed and new process/worker is spawned.
But note that isolateWorkers option cannot change, just like maxThreads and minThreads. If they need changing, we need multiple instances of Tinypool.

Maybe create half workers and half forks, and close/restart other runtimes (so, 2/2 -> 3/1) when they finish running? (So, when there is a workers queue, but forks are idling)

Currently it's working like this:

  • Imagine we have 5 tasks where 3 are for worker_threads and 2 for child_process. First 3 tasks take 2s, rest 1s. We allocate 2 threads for Tinypool.
  • First create worker_threads pool with 2 threads. Push 3 tasks there. 2 tasks start running.
  • 2 tasks finish. 1 task from queue is started.
  • Pool emits pool.drain event and indicates queue is empty.
  • We call recycleWorkers and change runtime to child_process. Now Tinypool will destroy the one idle worker and spawns a new one.
  • 1 task is started in child_process, 1 task is still running in worker_threads.
  • Both tasks are finished. worker_thread worker is recycled and new child_process one is spawned. Both take tasks from queue.

Explaining this would be easier with whiteboard 😄. Hopefully this explains the situation a bit.

@sheremet-va
Copy link
Member Author

Calling recycleWorkers will create new worker with given runtime as soon as it's possible. If there are idle workers with previous runtime, those ones will be destroyed and new ones will be spawned. Recycling this many threads/processes is fast.

But if we call it 100 times at the same time, and isolation is disabled (so, we only need 4 workers for example) - won't it just create new ones and destroy previous instead of reusing them?

@AriPerkkio
Copy link
Member

won't it just create new ones and destroy previous instead of reusing them

It will destroy workers and create new ones, yes. And that's expected. We are also using this mechanism when looping over environments:

for (const files of Object.values(filesByOptions)) {
// Always run environments isolated between each other
await pool.recycleWorkers()

But we definitely should not call it 100 times. Instead, we should sort files by pools and their options like done currently.

  1. Push tasks to pool with initial options
  2. Wait for queue to be empty
  3. Call recycleWorkers to change runtime and other options.
  4. Push tasks to queue and repeat 2-4 until no tasks left.

@sheremet-va sheremet-va force-pushed the feat/refactor-pool-worker branch 2 times, most recently from b630c30 to f8ffa7b Compare November 15, 2023 09:27
@sheremet-va
Copy link
Member Author

@AriPerkkio looks like vm now takes even more memory :(

@AriPerkkio
Copy link
Member

I haven't checked the latest changes yet but by comparing --heap-prof outputs between main and PR it might be visible what's consuming the memory now.

export default defineConfig({
  test: {
    pool: "vmThreads",
    poolOptions: {
      vmThreads: {
        execArgv: ["--heap-prof", "--heap-prof-dir=./profiling"],
      },
    },
  },
});

Then open the files in Chrome dev-tools

@sheremet-va
Copy link
Member Author

I haven't checked the latest changes yet but by comparing --heap-prof outputs between main and PR it might be visible what's consuming the memory now.

export default defineConfig({

  test: {

    pool: "vmThreads",

    poolOptions: {

      vmThreads: {

        execArgv: ["--heap-prof", "--heap-prof-dir=./profiling"],

      },

    },

  },

});

Then open the files in Chrome dev-tools

It looks like vite-node test calls mocked process.exit, can't exit and the memory piles up. I will check it with your example, thank you!

@AriPerkkio
Copy link
Member

It looks like vite-node test calls mocked process.exit, can't exit

I've also run into this during development. Back then it caused some of those 100% CPU spikes and left zombie processes. Maybe we should add some handling for these cases:

process.exit(1)

+ // This is run if exit was overwritten
+ throw new Error('Unable to exit since process.exit is overwritten')

@AriPerkkio
Copy link
Member

looks like vm now takes even more memory

What kind of setup does reproducing this require? I'm seeing some memory increment on vm.ts's resolveModule.

@sheremet-va
Copy link
Member Author

What kind of setup does reproducing this require? I'm seeing some memory increment on vm.ts's resolveModule.

I don't know actually, but CI always fails here. I didn't start working on it locally yet.

@sheremet-va sheremet-va added this to the 1.2.0 milestone Dec 9, 2023
@sheremet-va sheremet-va force-pushed the feat/refactor-pool-worker branch 2 times, most recently from e9363b9 to 2ba2c00 Compare January 3, 2024 14:54
@sheremet-va
Copy link
Member Author

sheremet-va commented Jan 3, 2024

@AriPerkkio this is reproducible 100% on my local machine:

cd test/vite-node
pnpm test --pool=forks --run

It's called because SIGTERM is triggered by someone (tinypool?). Previously we had these lines:

const exit = process.exit
try {
 // run tests
} finally {
  process.exit = exit
}

After bringing it back, tests work correctly. 🤔 🤔 🤔

@sheremet-va sheremet-va marked this pull request as ready for review January 3, 2024 16:40
@sheremet-va sheremet-va requested a review from AriPerkkio January 3, 2024 16:40
@AriPerkkio
Copy link
Member

It's called because SIGTERM is triggered by someone (tinypool?).

Tinypool uses first SIGTERM, which is the default signal when calling process.kill(): https://nodejs.org/api/child_process.html#child_processforkmodulepath-args-options. If process doesn't terminate in time (Tinypool default 1000ms), it calls process.kill('SIGKILL'). This is similar to Jest.

If overwriting process.exit becomes issue I think we could store original copy in Tinypool's entrypoints before any user-provided code is even loaded. And when we start to terminate the processes (or threads), we could restore it back to original one. Doing the same thing in Vitest might be a good thing too.

Is this issue still reproducible easily? I'm interested to see where Node hangs when proces/thread is trying to exit in these cases. Might be able to capture that with debugger. 🤔

I'll take a good look at this PR later this week and run some manual tests myself.

@sheremet-va
Copy link
Member Author

Is this issue still reproducible easily? I'm interested to see where Node hangs when proces/thread is trying to exit in these cases. Might be able to capture that with debugger. 🤔

Yes, if you remove the process.exit assignment in the forks worker.

@AriPerkkio
Copy link
Member

AriPerkkio commented Jan 4, 2024

Need to add await vnServer.server.close() in the test/vite-node/test/server.test.ts tests. Then tests will pass without hacking around process.exit. I think following is happening now:

  1. Vite server remains open once tests exit
  2. Tinypool kills process with process.kill() -> SIGTERM is sent to test process
  3. Vite server running inside that child_process has SIGTERM handler which captures this and calls process.exit: https://github.com/vitejs/vite/blob/f43945c1c35de34d64b2b1772b1629aba83cb476/packages/vite/src/node/server/index.ts#L625-L632
  4. Test process's RPC catches process.exit and sends it to main process via onWorkerExit
  5. Main process exits:
    process.exit(code || 1)
  6. Other tests that were running become zombie processes, something like Unrefed child_process inside a worker thread becomes a zombie nodejs/node#46569. But I think in this case it's child_process running in child_process

I wonder why we exit whole process here instead of ctx.close(). Maybe we should also call ctx.cancelCurrentRun() before that.

async onWorkerExit(error, code) {
await ctx.logger.printError(error, { type: 'Unexpected Exit', fullStack: true })
process.exit(code || 1)
},

Also need to check why Tinypool calls unref for processes/threads: https://github.com/tinylibs/tinypool/blob/8e941b511c1d54522bdfaae4c7e9de9be93b9e6d/src/runtime/process-worker.ts#L117-L125.

Here's minimal repro for the zombie process:

import { fork } from "node:child_process";
import { fileURLToPath } from "node:url";

const filename = fileURLToPath(import.meta.url);
const isMainThread = !Boolean(process.env.CHILD_PROCESS);

if (isMainThread) {
  console.log("Running main");
  const subprocess = fork(filename, { env: { CHILD_PROCESS: "true" } });
  subprocess.unref();
  subprocess.on("message", console.log);

  await new Promise((r) => setTimeout(r, 3000));

  console.log("Exit main");
  process.exit();
} else {
  setInterval(() => console.log("Fork here"), 1000);
}

packages/vitest/src/runtime/workers/vm.ts Outdated Show resolved Hide resolved
packages/vitest/src/runtime/workers/vm.ts Outdated Show resolved Hide resolved
packages/vitest/src/runtime/workers/base.ts Show resolved Hide resolved
packages/vitest/src/runtime/workers/utils.ts Outdated Show resolved Hide resolved
@sheremet-va sheremet-va force-pushed the feat/refactor-pool-worker branch from f8fbb67 to ba5715e Compare January 8, 2024 10:39
@sheremet-va sheremet-va force-pushed the feat/refactor-pool-worker branch from 420d645 to 8619291 Compare January 12, 2024 13:20
@sheremet-va sheremet-va merged commit 3ca3174 into vitest-dev:main Jan 12, 2024
18 of 19 checks passed
@sheremet-va sheremet-va deleted the feat/refactor-pool-worker branch January 12, 2024 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants