Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ATTENTION: Ziglings currently only works manually with the new build system of Zig since 0.11.0-dev.2157 #202

Closed
Tracked by #212
chrboesch opened this issue Mar 18, 2023 · 29 comments
Labels
bug Something isn't working

Comments

@chrboesch
Copy link
Collaborator

Zigling is currently not executable with the new Zig build system. The parallel processing of the individual processes no longer allows a sequence as before. Therefore Ziglings must be rebuilt. This will take some time and we ask for your understanding.
ziglang/zig#14647

@chrboesch chrboesch added the bug Something isn't working label Mar 18, 2023
@aitoehigie
Copy link

@chrboesch
I am a newbie who just got started on Ziglang. Is this bug related to this build error?

error: no field named 'color' in struct 'Build'. I suddenly started to see this error out of the blue.

Zig version: 0.11.0-dev.2157+f56f3c582
OS: Ubuntu Linux Mint 21:
Kernel 5.19.0-37-generic

Thanks in advance for your help

@Durobot
Copy link

Durobot commented Mar 19, 2023

Is this bug related to this build error?
error: no field named 'color' in struct 'Build'. I suddenly started to see this error out of the blue.
Zig version: 0.11.0-dev.2157+f56f3c582 OS: Ubuntu Linux Mint 21: Kernel 5.19.0-37-generic

I'm getting this error with 0.11.0-dev.2154+2089b3f19 as i try zig build in the root Ziglings directory.

@rohlem
Copy link

rohlem commented Mar 19, 2023

No Zig build since parallel builds was merged can work with the current build.zig file (which zig build uses).

(version number details)

The downloadable version is 0.11.0-dev.2160+49d37e2d1 (meaning 2160 commits since 0.11.0 was released, and based on commit with hash 49d37e2d1).
The merge of parallel builds was in the commit with hash bd242ce1c, which was chronologically at least 19 commits prior to this - but afaik chronologically-previous commits that are merged in are also counted, so it'll be a couple more.
Parallel builds itself was 156 commits, so the tag of a working version must lie at least 19+156 = 175 commits back. 2160 - 175 = 1995.

Therefore, if your version tag doesn't lie before 0.11.0-dev.1995, failure is to be expected (and evidently, the error message you are getting is the way it manifests).

Note that you can (probably) do many (most?) examples by hand, if you manually invoke zig run <source file> on the .zig files in directory /exercises.
The hints usually printed by build.zig can also be read from the build.zig source starting at line 59 - so not quite as streamlined, but manageable.

@chrboesch
Copy link
Collaborator Author

@aitoehigie Yes, this is also related to the new build system.

@chrboesch chrboesch changed the title ATTENTION: Ziglings currently does not work with the new build system of Zig since 0.11.0-dev.2157 ATTENTION: Ziglings currently only works manually with the new build system of Zig since 0.11.0-dev.2157 Mar 19, 2023
@chrboesch
Copy link
Collaborator Author

chrboesch commented Mar 19, 2023

First of all, all tasks can be called manually by their number. The automatic version will also come back, I'm already working on the solution.

@chrboesch
Copy link
Collaborator Author

#203

@chrboesch
Copy link
Collaborator Author

ziglang/zig#15013

@chrboesch
Copy link
Collaborator Author

The idea for a sustainable solution is to use the arguments in the build to select whether to compile all tasks one by one, and then re-run the build system one step at a time. In principal it works. I tested yesterday.

@sterchelen
Copy link

sterchelen commented Mar 20, 2023

whether to compile all tasks one by one, and then re-run the build system one step at a time. In principal it works.

not sure to understand @chrboesch ; Isn't it already the case?

@chrboesch
Copy link
Collaborator Author

@sterchelen No. If you start the build system without paramter, all exercises are packed into an array and then a list of steps is build from it. So far this list was processed one after the other and now it is processed in parallel.

This leads to several problems: 1. the order is no longer correct, 2. the build process aborts because too many errors also occur once due to parallel processing.

Therefore my thought, the build process does NOT create a list of steps but calls itself recursively and passes one step at a time as a parameter.

@rohlem
Copy link

rohlem commented Mar 20, 2023

@chrboesch Would it be an option to make each exercise's step depend on the previous one's step? (I.e. using y.step.dependOn(&x.step))
From my understanding that should sequentialize the process again, using mechanisms native to zig build.
I'm guessing in particular compiling exercise N+1 needs to depend on successful execution of exercise N.
Previous run-and-check-output steps need to hide their output from the user, but I assume that is encapsulated already.

@chrboesch
Copy link
Collaborator Author

@rohlem The dependencies are already set the same way. I hadn't seen that directly because I just haven't had time to look at it closely yet. The question is, why doesn't the compiler stop and abort after the first error? Instead, he keeps running and producing errors until he runs into one himself. Maybe something has changed in the dependency syntax and the problem can be solved by adjusting it.

@chrboesch
Copy link
Collaborator Author

As far as I can see, there is no way to continue with the build system in the old way because the dependencies do not work with parallel threads. Each exercise is put in the pipe and compiled independently of the other exercises.

It would need semaphores that exchange information between threads. This is possible to program, but would probably destroy the entire mechanism.

I will continue with the experimentation.

@perillo
Copy link
Contributor

perillo commented Apr 2, 2023

I started a local branch to improve, simplify and fix build.zig.

For ensuring that my branch does not change the behavior of ziglings, I need to have a working build.zig in the master branch.

There are two quick fixes:

1. Use an old version of lib/build_runner.zig

Not tested.

@chrboesch can you tell me the version required?

2. Use single-threaded mode

It requires a small change to the code, since the current build runner iterates steps starting from the latest.

Tested.

@chrboesch
Copy link
Collaborator Author

@chrboesch can you tell me the version required?
zig version 0.11.0-dev.1711 should ddefinitely work.

@perillo
Copy link
Contributor

perillo commented Apr 3, 2023

@chrboesch can you tell me the version required?
zig version 0.11.0-dev.1711 should ddefinitely work.

Unfortunately https://github.com/ziglang/zig/blob/6f13a725a/lib/build_runner.zig does not work.

However I managed to get the new build runner work with a simple patch:

diff --git a/lib/build_runner.zig b/lib/build_runner.zig
index bfc6ab7..e195b32 100644
--- a/lib/build_runner.zig
+++ b/lib/build_runner.zig
@@ -355,7 +355,7 @@ fn runStepNames(
     } else {
         try step_stack.ensureUnusedCapacity(gpa, step_names.len);
         for (0..step_names.len) |i| {
-            const step_name = step_names[step_names.len - i - 1];
+            const step_name = step_names[i];
             const s = b.top_level_steps.get(step_name) orelse {
                 std.debug.print("no step named '{s}'. Access the help menu with 'zig build -h'\n", .{step_name});
                 process.exit(1);
@@ -410,12 +410,12 @@ fn runStepNames(
         // check whether it should run any dependants.
         const steps_slice = step_stack.keys();
         for (0..steps_slice.len) |i| {
-            const step = steps_slice[steps_slice.len - i - 1];
+            const step = steps_slice[i];
 
             wait_group.start();
-            thread_pool.spawn(workerMakeOneStep, .{
+            @call(.auto, workerMakeOneStep, .{
                 &wait_group, &thread_pool, b, step, &step_prog, run,
-            }) catch @panic("OOM");
+            });
         }
     }
     assert(run.memory_blocked_steps.items.len == 0);
@@ -846,9 +846,9 @@ fn workerMakeOneStep(
         // Successful completion of a step, so we queue up its dependants as well.
         for (s.dependants.items) |dep| {
             wg.start();
-            thread_pool.spawn(workerMakeOneStep, .{
+            @call(.auto, workerMakeOneStep, .{
                 wg, thread_pool, b, dep, prog_node, run,
-            }) catch @panic("OOM");
+            });
         }
     }
 
@@ -872,9 +872,9 @@ fn workerMakeOneStep(
                 remaining -= dep.max_rss;
 
                 wg.start();
-                thread_pool.spawn(workerMakeOneStep, .{
+                @call(.auto, workerMakeOneStep, .{
                     wg, thread_pool, b, dep, prog_node, run,
-                }) catch @panic("OOM");
+                });
             } else {
                 run.memory_blocked_steps.items[i] = dep;
                 i += 1;

In my local repository, I have saved this patch in patches/zig/build_runner.patch and the build_runner.zig file in the lib directory.

In order to use it:

$ ZIG_BUILD_RUNNER=lib/build_runner.zig zig build

@chrboesch
Copy link
Collaborator Author

Unfortunately https://github.com/ziglang/zig/blob/6f13a725a/lib/build_runner.zig does not work.

Did you try it with our original build system before we made changes because of the update? https://github.com/ratfactor/ziglings/commits/main/build.zig

@perillo
Copy link
Contributor

perillo commented Apr 3, 2023

Unfortunately https://github.com/ziglang/zig/blob/6f13a725a/lib/build_runner.zig does not work.

Did you try it with our original build system before we made changes because of the update? https://github.com/ratfactor/ziglings/commits/main/build.zig

No.

@chrboesch
Copy link
Collaborator Author

@perillo Sorry, I'll summarise again briefly. Before the update of the Zig Build system with parallel processing, Ziglings ran as follows: zig build started with the first exercise, compiled it and stopped on error. Otherwise, the next exercise from the list was compiled and so on until the end. When entering zig build xy, where xy stands for the number of an exercise, exactly this exercise was compiled.

After the update of the zig build system, all exercises run in parallel, so that a simple zig build no longer makes sense. The only way that still works is zig build xy.

@perillo
Copy link
Contributor

perillo commented Apr 3, 2023

@chrboesch This is the same behavior when using the latest commit of ziglings with the custom build_runner.

What I did not test is the commit before the one you mentioned.

I tested 0d154fe now, but it fails since the code try to use Builder.color that is no more available.

@chrboesch
Copy link
Collaborator Author

Ok, I understand. The problem with Builder.color was solved here: 59f9c48#diff-f87bb3596894756629bc39d595fb18d479dc4edf168d93a911cadcb060f10fccR536-R553
That was a part of fixing the problems with the new build system. You can use it in this way.

@chrboesch
Copy link
Collaborator Author

Well, I'll summarise:

With @perillo's patch of Build_Runner we can run 'Ziglings' as usual thanks to @SuperAuguste's build extension. ziglang/zig@25b8318

What we need for this is
a) ship a patched Build_Runner with it.
b) we need to set the environment variable 'ZIG_BUILD_RUNNER' before running 'zig build'.

The latter is a bit of a challenge because it breaks the native feeling of'zig build'. I would like to know what @ratfactor thinks about this.

@ratfactor
Copy link
Owner

Yeah, I agree that it breaks the native feeling. The ZIG_BUILD_RUNNER env variable would be bad enough, but the extra source file...doing this is a very unpleasant initial step for people who just want to learn the basics!

I really appreciate the attempt to make this work like before, but we've got to figure out how to make it less awkward.

I'm half tempted to suggest my original shell script, which was so simple:

971ab7f#diff-e98d7fec1a7ab427eac77bee3049921b81f13a2b9df3b6de0f326c0dd91350d1

But the advantage of the Zig build system is that it's cross-platform and...you are guaranteed to already have it.

Is there some other way to trick the build script into just running one exercise at a time? Some sort of hack like writing to a text file when an exercise is completed?

@perillo
Copy link
Contributor

perillo commented Apr 4, 2023

I would like to clarify that using a custom build runner is a temporary solution; it can be used by person reviewing #212 to test that my changes does not break the behavior of ziglings.

@chrboesch
Copy link
Collaborator Author

Does that mean you think it's a general bug in the build_runner that we can fix in Zig?

@perillo
Copy link
Contributor

perillo commented Apr 5, 2023

I finally managed to fix the bug in build.zig, making it working again with the exact same behavior as before.
There are still minor issues, so I will continue do add commits to the perillo:improve-build branch.

For testing the old behavior I used Zig 0.10.1 and ziglings commit f534355.

Should I mark this issue as closed, in the next commit I will push in #212?

@chrboesch
Copy link
Collaborator Author

For testing the old behavior I used Zig 0.10.1 and ziglings commit f534355.

ok

Should I mark this issue as closed, in the next commit I will push in #212?

yes

@chrboesch
Copy link
Collaborator Author

@ratfactor

Is there some other way to trick the build script into just running one exercise at a time?

@perillo submitted a PR with an extension for single-threaded-option in the Zig build system. This should make it possible again, for my understanding. I would like to take this opportunity to thank @perillo for his great work.

perillo added a commit to perillo/ziglings that referenced this issue Apr 6, 2023
The new parallel build support in Zig broke the exercise chain, so that
each esercise check is no longer strictly serialized.

  1. Add the Dexno option, in order to isolate the chain starting from a
     named exercise from the normal chain, thus simplify the code.

     The current code have an additional issue: it added 4 x n steps,
     making reading the help message or the list of steps very hard.

     Add only the `install`, `uninstall`, `zigling`, `test` and `start`
     steps.  The last three steps match the old steps `n`, `n_test` and
     `n_start`.

     The default step is zigling (note the singular form).

     The `install` step override the builtin install step, showing a
     custom description and matches the old `n_install` step.
     The uninstall step was added for consistency, so that the
     description is consistent.

     Setup a new chain starting at `zig build -Dexno=n start` so that it
     is stricly serialized.

     The behavior should be the same as the old one.

  2. Handle the code for all the exercises separately.

     Add only the `ziglings step`, making it the default step, in
     addition to the install and uninstall steps.

     Setup a new chain starting at the first exercise, to that it is
     strictly serialized.

     The behavior should be the same as the old one.

The current code has a know issue: the messages from the ZiglingStep and
the ones from the compiler compilation progress are interleaved, but each
message is written atomically, due to the use of `std.debug.getStderrMutex()`.

Update the README.md file.

Closes ratfactor#202
@perillo perillo mentioned this issue Apr 6, 2023
4 tasks
@perillo
Copy link
Contributor

perillo commented Apr 6, 2023

[...]
I'm half tempted to suggest my original shell script, which was so simple:

971ab7f#diff-e98d7fec1a7ab427eac77bee3049921b81f13a2b9df3b6de0f326c0dd91350d1

@ratfactor

IMHO, the solution was to reimplement the ziglings script in Zig and running it with zig run ziglings.zig.

perillo added a commit to perillo/ziglings that referenced this issue Apr 6, 2023
The new parallel build support in Zig broke the exercise chain, so that
each esercise check is no longer strictly serialized.

  1. Add the Dexno option, in order to isolate the chain starting from a
     named exercise from the normal chain, thus simplify the code.

     The current code have an additional issue: it added 4 x n steps,
     making reading the help message or the list of steps very hard.

     Add only the `install`, `uninstall`, `zigling`, `test` and `start`
     steps.  The last three steps match the old steps `n`, `n_test` and
     `n_start`.

     The default step is zigling (note the singular form).

     The `install` step override the builtin install step, showing a
     custom description and matches the old `n_install` step.
     The uninstall step was added for consistency, so that the
     description is consistent.

     Setup a new chain starting at `zig build -Dexno=n start` so that it
     is stricly serialized.

     The behavior should be the same as the old one.

  2. Handle the code for all the exercises separately.

     Add only the `ziglings step`, making it the default step, in
     addition to the install and uninstall steps.

     Setup a new chain starting at the first exercise, to that it is
     strictly serialized.

     The behavior should be the same as the old one.

The current code has a know issue: the messages from the ZiglingStep and
the ones from the compiler compilation progress are interleaved, but each
message is written atomically, due to the use of `std.debug.getStderrMutex()`.

Update the README.md file.

Closes ratfactor#202
vamega pushed a commit to vamega/ziglings that referenced this issue Jul 25, 2023
The new parallel build support in Zig broke the exercise chain, so that
each esercise check is no longer strictly serialized.

  1. Add the Dexno option, in order to isolate the chain starting from a
     named exercise from the normal chain, thus simplify the code.

     The current code have an additional issue: it added 4 x n steps,
     making reading the help message or the list of steps very hard.

     Add only the `install`, `uninstall`, `zigling`, `test` and `start`
     steps.  The last three steps match the old steps `n`, `n_test` and
     `n_start`.

     The default step is zigling (note the singular form).

     The `install` step override the builtin install step, showing a
     custom description and matches the old `n_install` step.
     The uninstall step was added for consistency, so that the
     description is consistent.

     Setup a new chain starting at `zig build -Dexno=n start` so that it
     is stricly serialized.

     The behavior should be the same as the old one.

  2. Handle the code for all the exercises separately.

     Add only the `ziglings step`, making it the default step, in
     addition to the install and uninstall steps.

     Setup a new chain starting at the first exercise, to that it is
     strictly serialized.

     The behavior should be the same as the old one.

The current code has a know issue: the messages from the ZiglingStep and
the ones from the compiler compilation progress are interleaved, but each
message is written atomically, due to the use of `std.debug.getStderrMutex()`.

Update the README.md file.

Closes ratfactor#202
fleimgruber pushed a commit to fleimgruber/ziglings that referenced this issue Oct 26, 2023
The new parallel build support in Zig broke the exercise chain, so that
each esercise check is no longer strictly serialized.

  1. Add the Dexno option, in order to isolate the chain starting from a
     named exercise from the normal chain, thus simplify the code.

     The current code have an additional issue: it added 4 x n steps,
     making reading the help message or the list of steps very hard.

     Add only the `install`, `uninstall`, `zigling`, `test` and `start`
     steps.  The last three steps match the old steps `n`, `n_test` and
     `n_start`.

     The default step is zigling (note the singular form).

     The `install` step override the builtin install step, showing a
     custom description and matches the old `n_install` step.
     The uninstall step was added for consistency, so that the
     description is consistent.

     Setup a new chain starting at `zig build -Dexno=n start` so that it
     is stricly serialized.

     The behavior should be the same as the old one.

  2. Handle the code for all the exercises separately.

     Add only the `ziglings step`, making it the default step, in
     addition to the install and uninstall steps.

     Setup a new chain starting at the first exercise, to that it is
     strictly serialized.

     The behavior should be the same as the old one.

The current code has a know issue: the messages from the ZiglingStep and
the ones from the compiler compilation progress are interleaved, but each
message is written atomically, due to the use of `std.debug.getStderrMutex()`.

Update the README.md file.

Closes ratfactor#202
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

7 participants