Add support for service containers #1949

Merged
merged 33 commits into from
Oct 19, 2023

GuessWhoSamFoo
Contributor

Closes #173

This PR cherry picks:

https://gitea.com/gitea/act/pulls/50 is not moved over, as it does not exist as a GitHub Actions feature.

Gitea-specific features such as createSimpleContainerName and the AutoRemove flag were removed during this process. In addition, ContainerMaxLifetime is also removed, since running /bin/sleep 0 by default can be confusing and doesn't seem to have a use case outside of usage as an SDK.

@GuessWhoSamFoo GuessWhoSamFoo requested a review from a team as a code owner August 7, 2023 02:04
@GuessWhoSamFoo GuessWhoSamFoo changed the title Service container Add support for service containers Aug 7, 2023
@mergify
Contributor

mergify bot commented Aug 7, 2023

@GuessWhoSamFoo this pull request has failed checks 🛠

@mergify mergify bot added the needs-work Extra attention is needed label Aug 7, 2023
@GuessWhoSamFoo
Contributor Author

GuessWhoSamFoo commented Aug 7, 2023

Not sure what is up with the artifact upload/download test. Maybe there's a side effect somewhere, gonna have to investigate later

[Test that artifact uploads and downloads succeed/test-artifacts]   | Create Artifact Container - Attempt 5 of 5 failed with error: connect ECONNREFUSED 127.0.0.1:12345
[Test that artifact uploads and downloads succeed/test-artifacts]   ❗  ::error::Create Artifact Container failed: connect ECONNREFUSED 127.0.0.1:12345

EDIT: Realized it's because ContainerNetworkMode needs to be host; otherwise it can't reach the test server.

@codecov

codecov bot commented Aug 7, 2023

Codecov Report

Merging #1949 (93089ed) into master (4989f44) will increase coverage by 0.23%.
Report is 254 commits behind head on master.
The diff coverage is 59.71%.

@@            Coverage Diff             @@
##           master    #1949      +/-   ##
==========================================
+ Coverage   61.22%   61.45%   +0.23%     
==========================================
  Files          46       53       +7     
  Lines        7141     8774    +1633     
==========================================
+ Hits         4372     5392    +1020     
- Misses       2462     2953     +491     
- Partials      307      429     +122     
Files Coverage Δ
pkg/common/executor.go 51.69% <100.00%> (+1.69%) ⬆️
pkg/container/docker_cli.go 82.23% <ø> (ø)
pkg/container/docker_logger.go 52.08% <ø> (ø)
pkg/runner/step_action_local.go 93.54% <100.00%> (ø)
pkg/runner/step_action_remote.go 91.56% <100.00%> (+0.65%) ⬆️
pkg/runner/step_docker.go 93.18% <100.00%> (ø)
pkg/container/file_collector.go 37.30% <0.00%> (ø)
pkg/container/util.go 0.00% <0.00%> (ø)
pkg/container/docker_build.go 60.00% <80.00%> (+1.02%) ⬆️
...ontainer/linux_container_environment_extensions.go 23.07% <0.00%> (-1.25%) ⬇️
... and 31 more

... and 2 files with indirect coverage changes

@mergify mergify bot removed the needs-work Extra attention is needed label Aug 7, 2023
@mergify mergify bot requested a review from a team August 7, 2023 05:06
wolfogre
wolfogre previously approved these changes Aug 7, 2023
Member

@wolfogre wolfogre left a comment

LGTM, and I believe more practical tests are needed.

Some notes for other reviewers:

  • Services always run in containers, so Docker is required to run services (even when jobs run in host mode).
  • It creates a new Docker network for the service containers and lets the job container join that network, then deletes the network when the job is done.
  • IIRC, there will be problems connecting to services when running jobs in host mode without job containers. Since there's no job container, nothing can join the network.
  • A new Docker network makes it possible for services to bind any ports without conflict; however, it also makes things more complex. (That's why the Gitea team is cautious about porting it upstream.)
  • IIRC, act keeps the job container for reuse (while gitea/act never reuses it), so I'm not sure whether the network can be used and deleted correctly.
  • Docker can only provide a limited number of networks, so the network must be deleted promptly.

@ChristopherHX
Contributor

Please create stubs for NewDockerNetworkCreateExecutor, NewDockerNetworkRemoveExecutor returning an empty noop executor.

In this file: https://github.com/nektos/act/blob/master/pkg/container/docker_stub.go

I rely on extended platform support, where the Docker libraries do not compile (see build tags).

Seems like CI tests should be extended to test this

Contributor

@ChristopherHX ChristopherHX left a comment

This has been a static code review from my side, I haven't tested anything locally.

IIRC, there will be problems connecting to services when running jobs in host mode without job containers. Since there's no job container, nothing can join the network.

Yeah, that's why GitHub advises using a job container; the ports section is used to expose ports in that case and also allows assigning a random port.

@ChristopherHX
Contributor

Actually it makes sense to just merge this change, otherwise act may never have this feature.
I defer my approval to a later date in August, unless these points are addressed.

@GuessWhoSamFoo
Contributor Author

@ChristopherHX Thanks for the review - I'll take a closer look if something got missed in the backport and likely want to explicitly test https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idservicesservice_idvolumes as well

Contributor

@ChristopherHX ChristopherHX left a comment

I found a file descriptor leak; this has to be fixed in my opinion.
I opened a gitea/act issue about this: https://gitea.com/gitea/act/issues/76
If you don't use act with very large workflows or in watch mode for a long time, you won't notice the consequences of this code.
Yes, act had this problem in more places before I fixed them all. In 2021 the fd leak per job was very large, but due to high limits you didn't see any errors in the short term.
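
The general pattern for avoiding this kind of leak is to close every client on every return path. The sketch below illustrates that with a fake client; `fakeClient`, `openClients`, and `removeNetwork` are hypothetical names made up for this example, and the counter merely stands in for the sockets a real client would hold.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// openClients counts live fake clients; a real client would hold file
// descriptors instead of incrementing a counter.
var openClients int

type fakeClient struct{ closed bool }

func newFakeClient() *fakeClient {
	openClients++
	return &fakeClient{}
}

// Close releases the client exactly once.
func (c *fakeClient) Close() error {
	if !c.closed {
		c.closed = true
		openClients--
	}
	return nil
}

// Compile-time check that fakeClient satisfies io.Closer.
var _ io.Closer = (*fakeClient)(nil)

// removeNetwork opens a client for a single operation and closes it on every
// return path, including the early error return.
func removeNetwork(name string) (err error) {
	cli := newFakeClient()
	defer func() { err = errors.Join(err, cli.Close()) }()

	if name == "" {
		return errors.New("empty network name")
	}
	return nil
}

func main() {
	_ = removeNetwork("")
	_ = removeNetwork("act-net")
	fmt.Println("open clients:", openClients)
}
```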

@GuessWhoSamFoo
Contributor Author

@wolfogre @ChristopherHX Still missing some tests, but wanted to run by the feedback so far. The host network will be used by default unless a service container is defined or a network is passed through the flag. This can mitigate some concerns around breaking existing users and orphaned networks eventually reaching limits.
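
The defaulting rule described here can be written down as a small decision function. This is a sketch of the rule as stated in the comment, not the code in the PR: `resolveNetworkMode` and its return values are hypothetical, and returning an empty mode together with `create == true` is just one way to signal "create a dedicated network".

```go
package main

import "fmt"

// resolveNetworkMode applies the rule from the comment above:
//   - an explicit flag value always wins,
//   - otherwise a job with service containers gets a dedicated network,
//   - otherwise the host network is used.
// The (mode, create) return shape is an assumption of this sketch.
func resolveNetworkMode(flag string, hasServices bool) (mode string, create bool) {
	switch {
	case flag != "":
		return flag, false
	case hasServices:
		return "", true
	default:
		return "host", false
	}
}

func main() {
	cases := []struct {
		flag     string
		services bool
	}{
		{"", false},
		{"", true},
		{"bridge", true},
	}
	for _, tc := range cases {
		mode, create := resolveNetworkMode(tc.flag, tc.services)
		fmt.Printf("flag=%q services=%v -> mode=%q create=%v\n", tc.flag, tc.services, mode, create)
	}
}
```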

I have also thought about creating a network name by convention and re-using instead. Higher chance of port conflicts but less complexity

ChristopherHX
ChristopherHX previously approved these changes Aug 13, 2023
Contributor

@ChristopherHX ChristopherHX left a comment

All good now for me

Does gitea/act provide contextdata for services? If not it's also ok for me

Zettat123 and others added 2 commits August 14, 2023 11:45
Removed createSimpleContainerName and AutoRemove flag

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Jason Song <i@wolfogre.com>
Reviewed-on: https://gitea.com/gitea/act/pulls/42
Reviewed-by: Jason Song <i@wolfogre.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>
Reviewed-on: https://gitea.com/gitea/act/pulls/45
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>
jmikedupont2 pushed a commit to meta-introspector/act that referenced this pull request Mar 10, 2024
* Support services (nektos#42)

Removed createSimpleContainerName and AutoRemove flag

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Jason Song <i@wolfogre.com>
Reviewed-on: https://gitea.com/gitea/act/pulls/42
Reviewed-by: Jason Song <i@wolfogre.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>

* Support services options (nektos#45)

Reviewed-on: https://gitea.com/gitea/act/pulls/45
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>

* Support interpolation for `env` of `services` (nektos#47)

Reviewed-on: https://gitea.com/gitea/act/pulls/47
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>

* Support services `credentials` (nektos#51)

If a service's image is from a container registry that requires authentication, `act_runner` will need `credentials` to pull the image, see [documentation](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idservicesservice_idcredentials).
Currently, `act_runner` incorrectly uses the `credentials` of `containers` to pull services' images and the `credentials` of services won't be used, see the related code: https://gitea.com/gitea/act/src/commit/0c1f2edb996a87ee17dcf3cfa7259c04be02abd7/pkg/runner/run_context.go#L228-L269

Co-authored-by: Jason Song <i@wolfogre.com>
Reviewed-on: https://gitea.com/gitea/act/pulls/51
Reviewed-by: Jason Song <i@wolfogre.com>
Reviewed-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>

* Add ContainerMaxLifetime and ContainerNetworkMode options

from: https://gitea.com/gitea/act/commit/b9c20dcaa43899cb3bb327619d447248303170e0

* Fix container network issue (nektos#56)

Follow: https://gitea.com/gitea/act_runner/pulls/184
Close https://gitea.com/gitea/act_runner/issues/177

- `act` creates new networks only if the value of `NeedCreateNetwork` is true, and removes these networks at the end. `NeedCreateNetwork` is passed by `act_runner`. `NeedCreateNetwork` is true only if `container.network` in the configuration file of the `act_runner` is empty.
- In the `docker create` phase, specify the network to which containers will connect. If it is not specified, the container will connect to the `bridge` network, which is created automatically by Docker.
  - If the network is a user-defined network (the value of `container.network` is empty or `<custom-network>`; the network created by `act` is also a user-defined network), also specify an alias via `--network-alias`. The alias of a service is `<service-id>`, so the service container can be accessed at `<service-id>:<port>` in the steps of the job.
- No longer try to `docker network connect` the network after `docker start`.
  - On the one hand, `docker network connect` applies only to user-defined networks; trying `docker network connect host <container-name>` returns an error.
  - On the other hand, specifying the network in the `docker create` stage achieves the same effect.
- No longer try to remove containers and networks before the `docker start` stage, because the names of these containers and networks won't be repeated.
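
The alias rule in the list above (set `<service-id>` as an alias only on user-defined networks) could be sketched as follows. Both helpers are hypothetical and written for illustration; in particular, treating an empty mode as "not user-defined" is an assumption of this sketch and may differ from how a given Docker client library classifies it.

```go
package main

import (
	"fmt"
	"strings"
)

// isUserDefined reports whether a network mode names a user-defined network,
// i.e. anything other than the built-in modes. Treating "" as built-in is an
// assumption made for this sketch.
func isUserDefined(mode string) bool {
	switch mode {
	case "", "default", "bridge", "host", "none":
		return false
	}
	return !strings.HasPrefix(mode, "container:")
}

// networkAliases returns the aliases to attach at container creation time.
// Only user-defined networks get the service ID as an alias, so steps can
// reach the service at <service-id>:<port>.
func networkAliases(mode, serviceID string) []string {
	if !isUserDefined(mode) || serviceID == "" {
		return nil
	}
	return []string{serviceID}
}

func main() {
	for _, mode := range []string{"host", "bridge", "act-job-network"} {
		fmt.Printf("%s -> %v\n", mode, networkAliases(mode, "redis"))
	}
}
```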

Co-authored-by: Jason Song <i@wolfogre.com>
Reviewed-on: https://gitea.com/gitea/act/pulls/56
Reviewed-by: Jason Song <i@wolfogre.com>
Co-authored-by: sillyguodong <gedong_1994@163.com>
Co-committed-by: sillyguodong <gedong_1994@163.com>

* Check volumes (nektos#60)

This PR adds a `ValidVolumes` config. Users can specify the volumes (including bind mounts) that can be mounted to containers by this config.

Options related to volumes:
- [jobs.<job_id>.container.volumes](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idcontainervolumes)
- [jobs.<job_id>.services.<service_id>.volumes](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idservicesservice_idvolumes)

In addition, volumes specified by `options` will also be checked.

Currently, the following default volumes (see https://gitea.com/gitea/act/src/commit/a72822b3f83d3e68ffc697101b713b7badf57e2f/pkg/runner/run_context.go#L116-L166) will be added to `ValidVolumes`:
- `act-toolcache`
- `<container-name>` and `<container-name>-env`
- `/var/run/docker.sock` (We need to add a new configuration to control whether the docker daemon can be mounted)
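
As a rough illustration of the kind of check a `ValidVolumes` list implies, the sketch below validates the source part of a volume spec against a list of allowed patterns. It is not the implementation from gitea/act: the helper names are hypothetical, matching uses the standard library's `path.Match`, anonymous volumes are assumed to be always allowed, and Windows-style paths containing `:` are not handled.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// volumeSource extracts the source of a "source:target[:mode]" spec.
// A spec without a colon is treated as an anonymous volume with no source.
func volumeSource(spec string) string {
	parts := strings.Split(spec, ":")
	if len(parts) < 2 {
		return ""
	}
	return parts[0]
}

// isValidVolume reports whether the volume's source matches any allowed
// pattern. Anonymous volumes are allowed (an assumption of this sketch).
func isValidVolume(spec string, allowed []string) bool {
	src := volumeSource(spec)
	if src == "" {
		return true
	}
	for _, pattern := range allowed {
		if ok, err := path.Match(pattern, src); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	allowed := []string{"act-toolcache", "/var/run/docker.sock"}
	for _, spec := range []string{
		"act-toolcache:/toolcache",
		"/var/run/docker.sock:/var/run/docker.sock",
		"/etc:/host-etc:ro",
	} {
		fmt.Printf("%s -> %v\n", spec, isValidVolume(spec, allowed))
	}
}
```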

Co-authored-by: Jason Song <i@wolfogre.com>
Reviewed-on: https://gitea.com/gitea/act/pulls/60
Reviewed-by: Jason Song <i@wolfogre.com>
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>

* Remove ContainerMaxLifetime; fix lint

* Remove unused ValidVolumes

* Remove ConnectToNetwork

* Add docker stubs

* Close docker clients to prevent file descriptor leaks

* Fix the error when removing network in self-hosted mode (nektos#69)

Fixes https://gitea.com/gitea/act_runner/issues/255

Reviewed-on: https://gitea.com/gitea/act/pulls/69
Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-committed-by: Zettat123 <zettat123@gmail.com>

* Move service container and network cleanup to rc.cleanUpJobContainer

* Add --network flag; default to host if not using service containers or set explicitly

* Correctly close executor to prevent fd leak

* Revert to tail instead of full path

* fix network duplication

* backport networkingConfig for aliases

* don't hardcode netMode host

* Convert services test to table driven tests

* Add failing tests for services

* Expose service container ports onto the host

* Set container network mode in artifacts server test to host mode

* Log container network mode when creating/starting a container

* fix: Correctly handle ContainerNetworkMode

* fix: missing service container network

* Always remove service containers

Although we usually keep containers running if the workflow errored
(unless `--rm` is given) in order to facilitate debugging and we have
a flag (`--reuse`) to always keep containers running in order to speed
up repeated `act` invocations, I believe that these should only apply
to job containers and not service containers, because changing the
network settings on a service container requires re-creating it anyway.
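
The policy argued for in this commit message can be summarized as a small predicate. This is a sketch only: `shouldRemove` and its types are hypothetical, and letting `--reuse` take precedence over `--rm` for job containers is an assumption, since the message does not say how the two flags interact.

```go
package main

import "fmt"

type containerKind int

const (
	jobContainer containerKind = iota
	serviceContainer
)

// cleanupOptions mirrors the flags named in the commit message.
type cleanupOptions struct {
	Reuse  bool // --reuse: keep job containers to speed up repeated runs
	Rm     bool // --rm: remove containers even when the workflow errored
	Failed bool // whether the workflow errored
}

// shouldRemove decides whether a container is removed after a run.
// Service containers are always removed; job containers follow the flags.
func shouldRemove(kind containerKind, o cleanupOptions) bool {
	if kind == serviceContainer {
		return true
	}
	if o.Reuse {
		return false // assumed to take precedence over --rm
	}
	if o.Failed && !o.Rm {
		return false // keep the failed job container for debugging
	}
	return true
}

func main() {
	o := cleanupOptions{Reuse: true, Failed: true}
	fmt.Println("job:", shouldRemove(jobContainer, o), "service:", shouldRemove(serviceContainer, o))
}
```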

* Remove networks only if no active endpoints exist

* Ensure job containers are stopped before starting a new job

* fix: go build -tags WITHOUT_DOCKER

---------

Co-authored-by: Zettat123 <zettat123@gmail.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Jason Song <i@wolfogre.com>
Co-authored-by: sillyguodong <gedong_1994@163.com>
Co-authored-by: ChristopherHX <christopher.homberger@web.de>
Co-authored-by: ZauberNerd <zaubernerd@zaubernerd.de>
vicamo added a commit to vicamo/runner-images that referenced this pull request Mar 11, 2024
Inside container environment we're not running a second service process,
but use service container for act instead.

See: nektos/act#1949
Signed-off-by: You-Sheng Yang <vicamo@gmail.com>
@devnoname120
PR to update the documentation accordingly: nektos/act-docs#30

ChristopherHX pushed a commit to nektos/act-docs that referenced this pull request Nov 2, 2024

Successfully merging this pull request may close these issues.

Services not working
8 participants