WITH DOCKER is not executed in parallel in latest v0.6.28 (but it was in version v0.6.25) #2377

jrodrigv · 2022-11-08T10:42:10Z

Hi,

I have some targets meant for running test projects in parallel that use WITH DOCKER to start a RabbitMQ service before running the tests. With v0.6.25 I can see the test-executor instances running fully in parallel, loading the rabbitmq tar files in parallel, etc. With v0.6.28 this is no longer the case: the test-executor instances run sequentially instead. Is there a way to prevent this from happening?

parallel-testing:
    FROM +build
    WAIT
        FOR dir IN $(ls Tests/*Tests/*.csproj)
            COPY (+test-executor/TestResults --PROJECT=./$dir) ./TestResults
        END
        SAVE ARTIFACT ./TestResults testresults AS LOCAL earthly-artifacts/testresults
    END

test-executor:
    FROM +build
    ARG PROJECT
    ARG ID
    COPY docker-compose-unit-tests-rabbitmq.yml .
    COPY coverlet.runsettings .

    WITH DOCKER \
        --compose docker-compose-unit-tests-rabbitmq.yml \
        --service rabbitmq
        RUN dotnet test ${PROJECT} --blame-hang-timeout 5m --no-build --collect:"XPlat Code Coverage" -p:CollectCoverage=true --settings coverlet.runsettings --logger trx --configuration $buildConfiguration --results-directory ./TestResults
    END

    SAVE ARTIFACT ./TestResults TestResults


vladaionescu · 2022-11-08T17:47:08Z

Hi @jrodrigv - what's your VERSION declaration at the top of the Earthfile?

jrodrigv · 2022-11-08T18:33:14Z

Hi @vladaionescu - In both cases I was using:

VERSION --wait-block 0.6

vladaionescu · 2022-11-08T18:46:08Z

I suspect it has something to do with either WAIT or this new piece of functionality that we released in v0.6.26:

Loading Docker images as part of WITH DOCKER is now faster through the use of an embedded registry in Buildkit. This functionality was previously hidden (VERSION --use-registry-for-with-docker) and was only auto-enabled for Earthly Satellite users. It is now enabled by default for all builds

Can you try the following:

On earthly v0.6.29 enable VERSION --wait-block --no-use-registry-for-with-docker 0.6 and see if the issue goes away.
On earthly v0.6.29, remove --wait-block (just use VERSION 0.6 with no other flags) and see if that makes the issue go away (you might have to change your build a little bit to remove WAIT for this test).

It's a way to narrow down to what the cause of this might be.

jrodrigv · 2022-11-09T08:54:28Z

Hi @vladaionescu .

I have tested the VERSION --wait-block --no-use-registry-for-with-docker 0.6 and it is working fine as before in v0.6.25.

Thanks

vladaionescu · 2022-11-09T22:16:06Z

I was hoping you would be able to try the second suggestion too and see if that influences things too. The --no-use-registry-for-with-docker flag is not something we plan to support long-term - it might go away at some point without warning.

jrodrigv · 2022-11-10T10:17:47Z

Hi @vladaionescu ,

I have also tried the second suggestion, using just VERSION 0.6 without any other flags and removing the WAIT.

In this scenario, all the +test-executor targets are executed sequentially following the same pattern (screenshot attached)

I'm not sure what the root cause is, but from the logs it looks like each test-executor does a final image transfer of the docker compose service:
output | [ ] 0% transferring registry.code.com/connectivity/common-library/rabbitmq:management

And then the next test-executor pulls the previously transferred image from the embedded registry? That would create a kind of artificial dependency between the test-executor instances.

+test-executor | Loading images from BuildKit via embedded registry...
+test-executor | Pulling 172.30.0.1:8371/sess-ofbf05t00k2o4z7js3zfss7q4:img-0 and retagging as registry.code.com/connectivity/common-library/rabbitmq:management
+test-executor | img-0: Pulling from sess-ofbf05t00k2o4z7js3zfss7q4

However, going back to VERSION --wait-block --no-use-registry-for-with-docker 0.6 we can see how the test-executor targets are executed in parallel:

jrodrigv · 2023-01-20T23:34:32Z

Hi @vladaionescu

I have tested this issue on 0.7 and I can reproduce it. I have created an Earthfile here that can be used to test it.

https://github.com/jrodrigv/EarthlySamples/blob/test/Earthfile

sesgoe · 2023-03-01T15:58:37Z

Seeing the same issue here! Version 0.6.30.

Also re-parallelizes for me if I use VERSION --no-use-registry-for-with-docker 0.6 in Earthfile.

…rent args (#3406)

Fixes #2377

The previous `VisitedCollection` implementation attempts to wait for ongoing targets with the same name to complete before comparing the inputs of the target. This has the advantage of being more precise with which ARGs actually influence the outcome of the targets, ignoring overriding ARGs that end up being unused within those targets. But it also has the disadvantage that all targets with the same name (but different overriding args) execute sequentially.

This new implementation simplifies this greatly, by computing the target input hash upfront based on all overriding args, without knowing if there are any args that will end up being unused. The result is that we are able to run all targets with the same name but different args in parallel. But it also has the disadvantage that in some cases we would create duplicated LLBs for some targets when certain overriding args are different but they are unused. Buildkit will generally de-duplicate the LLB in these cases, although it's possible that there might be edge cases if the LLB construction is not consistent.

---------

Co-authored-by: nacho <idelvall@brutusin.org>
Co-authored-by: Vlad A. Ionescu <vladaionescu@users.noreply.github.com>
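The upfront-hash idea from the commit message above can be illustrated with a small Python sketch. This is not Earthly's actual Go implementation: `target_input_hash`, `UpfrontHashVisited`, and `demo` are hypothetical names chosen for illustration, and the hashing scheme is an assumption. It only shows why keying the visited collection on a hash of the target name plus all overriding args lets same-name targets with different args overlap, while identical invocations are still built once.

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor


def target_input_hash(target, overriding_args):
    """Hash the target name plus ALL overriding args, before the target runs."""
    h = hashlib.sha256()
    h.update(target.encode())
    for name in sorted(overriding_args):  # sorted so arg order does not matter
        h.update(b"\0" + name.encode() + b"=" + overriding_args[name].encode())
    return h.hexdigest()


class UpfrontHashVisited:
    """Hypothetical visited-target collection keyed by the upfront hash."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # hash -> {"done": Event, "result": value}

    def run_once(self, target, args, build):
        key = target_input_hash(target, args)
        with self._lock:  # held only for the lookup, never during the build
            entry = self._entries.get(key)
            owner = entry is None
            if owner:
                entry = {"done": threading.Event(), "result": None}
                self._entries[key] = entry
        if owner:
            try:
                entry["result"] = build(target, args)
            finally:
                entry["done"].set()
        else:
            entry["done"].wait()  # identical invocation: reuse its result
        return entry["result"]


def demo():
    # The barrier only releases if all three builds are in flight at once,
    # so this would raise BrokenBarrierError if the builds were serialized.
    barrier = threading.Barrier(3)

    def build(target, args):
        barrier.wait(timeout=5)
        return f"{target} PROJECT={args['PROJECT']}"

    visited = UpfrontHashVisited()
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [
            pool.submit(visited.run_once, "+test-executor", {"PROJECT": str(n)}, build)
            for n in (1, 2, 3)
        ]
        return [f.result() for f in futures]


if __name__ == "__main__":
    print(demo())
```

The trade-off mentioned in the commit message shows up here as well: two invocations that differ only in an arg the target never reads still get different keys, so in this sketch they would be built twice.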

jrodrigv · 2023-10-24T05:52:06Z

Thanks Vlad! I was waiting specifically for this one to jump from 0.6 to 0.7, since we execute all test projects in parallel :)

vladaionescu · 2023-10-24T20:18:55Z

Great - please note that this is behind a feature flag. To enable it, you need to use VERSION --use-visited-upfront-hash-collection 0.7. This just went out in Earthly v0.7.21.

jrodrigv · 2023-11-18T11:03:58Z

Hi @vladaionescu, I have been testing the flag with v0.7.21 using the example I provided, and the targets still seem to run sequentially rather than in parallel. I have recorded a video as evidence.

earthly-parallel-withdocker.mp4

vladaionescu · 2023-11-21T02:02:27Z

Hi @jrodrigv - Many thanks for the video - it showcases the issue clearly. It seems that I fixed a slightly different situation than your setup. They're closely related though.

It's possible that we either don't yet support the FOR loop, or the series of COPY commands as a way to parallelize. I'll take a look into this. The old --no-use-registry-for-with-docker should still work as a workaround in the meantime.

jrodrigv · 2024-05-20T10:38:21Z

Hi @vladaionescu today I was doing some testing with v0.8.11 and I found something interesting that I'd like to share with you.

It seems the problem occurs when combining FOR with WITH DOCKER: the targets are executed sequentially rather than in parallel. However, N explicit BUILD commands are executed in parallel successfully.

VERSION 0.8

FROM earthly/dind:alpine

# this executes in parallel
test-parallel:
    BUILD +test-executor --PROJECT=1
    BUILD +test-executor --PROJECT=2
    BUILD +test-executor --PROJECT=3
    BUILD +test-executor --PROJECT=4
    BUILD +test-executor --PROJECT=5

# this sequentially
test-parallel-for:
    BUILD +parallel

parallel:
    FOR num IN $(seq 5)
        BUILD +test-executor --PROJECT=$num
    END

test-executor:
    ARG PROJECT

    WITH DOCKER \
        --pull hello-world
        RUN for i in {1..5}; do sleep 1; echo "hello $PROJECT"; done
    END

alexcb · 2024-05-21T21:01:00Z

Great find, @jrodrigv. I have a potential implementation to support this in #4138; however, it produces some duplicate output that needs to be addressed before it's ready to be merged.

alexcb added the type:bug Something isn't working label Nov 8, 2022

idelvall self-assigned this Oct 18, 2023

idelvall mentioned this issue Oct 18, 2023

Fix WITH DOCKER is not executed in parallel #3403

Closed

vladaionescu mentioned this issue Oct 19, 2023

Fix for limited parallelism when the target is the same but has different args #3406

Merged

vladaionescu closed this as completed in #3406 Oct 23, 2023

vladaionescu reopened this Nov 21, 2023

idelvall assigned vladaionescu and unassigned idelvall Dec 3, 2023

jrodrigv mentioned this issue Feb 3, 2024

v0.8.3 --no-use-registry-for-with-docker - error docker load: solve: local directory . not enabled #3776

Open

idelvall mentioned this issue Feb 14, 2024

WITH DOCKER breaks parallelization #3808

Open
