Produce better errors for failing jobsets #1025

lukegb · 2021-09-22T21:01:11Z

Recently nixpkgs had a problem where evaluations were failing with the dreaded `[json.exception.type_error.302] type must be string, but is null` error from nlohmann/json.

It turns out this is because some of the NixOS tests included in the tested job could no longer be evaluated.

No more! With this change, not only will we produce an evaluation, but we'll also generate more useful evaluation errors for people to use when debugging, rather than trying to manually bisect nixpkgs to work out at what point evaluations broke.

More context in the commits included in this PR. It even includes a test (which appears to be the first test of the aggregate job functionality, and I'm... asserting that it breaks correctly. Oh well.)

src/hydra-eval-jobs/hydra-eval-jobs.cc

grahamc

Looks good to me, thank you. Let me know about that question and I'll happily merge.

andersk · 2021-10-26T04:24:25Z

nixos-unstable evaluation has been broken again on Hydra for several days (NixOS/nixpkgs#142918) with the same useless error: [json.exception.type_error.302] type must be string, but is null. I just confirmed that if this had been merged, it would have produced a helpful error pointing straight at the real problem.

nixos.tests.gnome-xorg.x86_64-linux: error: undefined variable 'source-sans-pro'

       at /home/anders/nixpkgs/nixos/modules/services/x11/desktop-managers/gnome.nix:456:9:

          455|         source-code-pro # Default monospace font in 3.32
          456|         source-sans-pro
             |         ^
          457|       ];
nixos.tests.gnome.x86_64-linux: error: undefined variable 'source-sans-pro'

       at /home/anders/nixpkgs/nixos/modules/services/x11/desktop-managers/gnome.nix:456:9:

          455|         source-code-pro # Default monospace font in 3.32
          456|         source-sans-pro
             |         ^
          457|       ];

Maybe it’s time to consider merging this now and worrying about that small cleanup later?

lukegb · 2021-10-26T09:05:29Z

Whoops, meant to get back to this. Extracted the logic out into its own function, but it's a bit messy because it's a "get or perform spooky side-effects" lambda. Not sure that's a huge improvement.

At the moment, the jobset object is unlikely to actually retrieve the evaluation error output, because it isn't refreshed after hydra-eval-jobsets is run. Explicitly calling DBIx::Class::Row->discard_changes causes any updated data to be refreshed, at the cost of losing any not-yet-committed changes to the row.

At the moment, aggregate jobs can easily break and cause the entire evaluation to fail, which is not ideal. For Nixpkgs, we do have some important aggregate jobs (like `tested`), but for debugging and building purposes it's still useful to get a partial result even if the channel won't actually advance.

This commit changes the behaviour of hydra-eval-jobs so that it collects any errors found while constructing an aggregate and, rather than failing the whole evaluation, annotates the aggregate job with the evaluation failure so that it shows up in a "cleaner" way.

There are really two types of failure that we care about: one where the attribute ends up missing altogether from the final output, and one where the attribute is in the output but fails to evaluate. Both are handled here.

Note that this does mean that the same error message may be output multiple times, but this aids debuggability because it'll be much clearer what's blocking the job from being created.
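The behaviour described in this commit message can be sketched roughly as follows. This is an illustration in Python, not the actual C++ code in hydra-eval-jobs.cc: the function name `resolve_aggregate`, the dictionary layout, and the field names (`drvPath`, `error`, `constituents`) are assumptions made for the example. It shows both failure types (a constituent missing from the output, and a constituent present but failing to evaluate) being collected onto the aggregate instead of aborting.

```python
from typing import Any, Dict, List


def resolve_aggregate(
    name: str,
    constituents: List[str],
    jobs: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Build an aggregate job, collecting constituent errors instead of raising.

    Hypothetical helper: the names and data layout are illustrative only.
    """
    drv_paths: List[str] = []
    errors: List[str] = []

    for constituent in constituents:
        job = jobs.get(constituent)
        if job is None:
            # Failure type 1: the attribute is missing from the final output.
            errors.append(f"{constituent}: does not exist")
        elif "error" in job:
            # Failure type 2: the attribute exists but failed to evaluate.
            errors.append(f"{constituent}: {job['error']}")
        else:
            drv_paths.append(job["drvPath"])

    result: Dict[str, Any] = {"name": name, "constituents": drv_paths}
    if errors:
        # Annotate the aggregate rather than failing the whole evaluation.
        result["error"] = "\n".join(errors)
    return result


# Example input with one broken job, one healthy job, and one missing job.
jobs = {
    "nixos.tests.gnome": {"error": "undefined variable 'source-sans-pro'"},
    "nixos.tests.login": {"drvPath": "/nix/store/example-login.drv"},
}
tested = resolve_aggregate(
    "tested",
    ["nixos.tests.gnome", "nixos.tests.login", "nixos.tests.missing"],
    jobs,
)
print(tested["error"])
```

In this sketch the aggregate still comes back with its healthy constituents, and the per-constituent messages are repeated on the aggregate, which matches the trade-off noted above: some duplication in exchange for seeing directly what blocks the job.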

lukegb mentioned this pull request Sep 22, 2021

hydra-eval-jobs masks error messages from Nix (again) #822

Open

grahamc reviewed Sep 25, 2021

src/hydra-eval-jobs/hydra-eval-jobs.cc (resolved)

grahamc approved these changes Sep 25, 2021

andersk mentioned this pull request Oct 26, 2021

nixos-unstable evaluation broken on Hydra (error: [json.exception.type_error.302] type must be string, but is null) NixOS/nixpkgs#142918

Closed

lukegb force-pushed the hydra-better-errors branch from 757b979 to 8478697 on October 26, 2021 09:04

lukegb added 2 commits October 26, 2021 10:13

lukegb force-pushed the hydra-better-errors branch from 8478697 to 67ebce8 on October 26, 2021 09:14

grahamc merged commit ef9a9fa into NixOS:master Oct 26, 2021

andersk mentioned this pull request Oct 31, 2021

hydra-unstable: 2021-08-11 → 2021-10-27 NixOS/nixpkgs#144044

Closed

12 tasks

andersk mentioned this pull request Mar 1, 2022

nixos-unstable blocked on evaluation error “error: unexpected EOF reading a line” NixOS/nixpkgs#162317

Closed
