Bring SDK base image to debian:bullseye #2769

Merged
merged 5 commits into from
Oct 31, 2022

Conversation

markmandel
Member

What type of PR is this?

Uncomment only one /kind <> line, press enter to put that in a new line, and remove leading whitespace from that line:

/kind breaking
/kind bug

/kind cleanup

/kind documentation
/kind feature
/kind hotfix

What this PR does / Why we need it:

The upgrade to gRPC solved one issue. I also added a limit to the number of processes that `make -j` can run; otherwise the whole build would fall over (and would also crash my dev machine!).

Which issue(s) this PR fixes:

Closes #2224

Special notes for your reviewer:

$(nproc) returns the number of cores on the machine.
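As a rough illustration of the difference (a sketch, not the exact line from this PR's Dockerfile): a bare `make -j` puts no limit on the number of parallel jobs, while `make -j$(nproc)` caps it at the value `nproc` reports. The variable names `job_count` and `make_cmd` are hypothetical and only used here to show the resulting command.

```shell
# nproc (GNU coreutils) prints the number of processing units available
# to the current process.
job_count="$(nproc)"

# Bare `make -j` allows unlimited parallel jobs; `make -jN` allows at most N.
make_cmd="make -j${job_count}"

# Only print the command here rather than running a real build.
echo "would run: ${make_cmd} && make install"
```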

@markmandel markmandel added kind/cleanup Refactoring code, fixing up documentation, etc area/build-tools Development tooling. I.e. pretty much everything in the `build` directory. labels Oct 20, 2022
@markmandel markmandel added the feature-freeze-do-not-merge Only eligible to be merged once we are out of feature freeze (next full release) label Oct 20, 2022
Member

@roberthbailey roberthbailey left a comment

LGTM but will wait to approve (and merge) until after the release.

@agones-bot
Collaborator

Build Failed 😱

Build Id: 1938de11-eeee-4e5a-880a-9a29f48f767d

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Collaborator

Build Failed 😱

Build Id: 5f2572b2-89bc-4ebc-8c5b-9412ff7ead08

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

The upgrade to gRPC solved one issue, and I also added a limit to the number
of processes that could run for `make -j`; otherwise the whole thing
would fall over (and would also crash my dev machine!).

Closes googleforgames#2224
* Revert CI cache increment (don't think we need it)
* Add shell to cpp image for debugging.
* Fix formatting issue that is breaking CI.
@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 860dc0d8-85ef-4b4f-bd16-c1d66a240560

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2769/head:pr_2769 && git checkout pr_2769
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.tag=1.27.0-cea11bf-amd64

@mangalpalli mangalpalli removed the feature-freeze-do-not-merge Only eligible to be merged once we are out of feature freeze (next full release) label Oct 26, 2022
@govargo
Contributor

govargo commented Oct 27, 2022

I built the Docker image agones-build-sdk-base from scratch on my local machine (MacBook 2020) and got an error when installing gRPC.

==> ERROR [3/6] RUN git clone --recurse-submodules -b v1.36.1 --depth 1 --shallow-submodules https://github.com/grpc/grpc /var/local/git/grpc &&     cd /var/local/git/grpc &&     mkdir -p cmake/b  687.8s
------
 > [3/6] RUN git clone --recurse-submodules -b v1.36.1 --depth 1 --shallow-submodules https://github.com/grpc/grpc /var/local/git/grpc &&     cd /var/local/git/grpc &&     mkdir -p cmake/build &&     cd cmake/build &&     cmake -DgRPC_INSTALL=ON -DgRPC_BUILD_TESTS=OFF ../.. &&     make -j && make install:
#7 0.283 Cloning into '/var/local/git/grpc'...
...
#7 108.8 [ 20%] Building C object third_party/boringssl-with-bazel/CMakeFiles/crypto.dir/src/crypto/cipher_extra/e_rc4.c.o
#7 108.8 [ 20%] Built target upb
#7 108.9 [ 20%] Building C object third_party/boringssl-with-bazel/CMakeFiles/crypto.dir/src/crypto/cipher_extra/e_tls.c.o
#7 119.2 c++: fatal error: Killed signal terminated program cc1plus
#7 119.2 compilation terminated.
#7 125.8 c++: fatal error: Killed signal terminated program cc1plus
#7 125.8 compilation terminated.
#7 128.7 c++: fatal error: Killed signal terminated program cc1plus
#7 128.7 compilation terminated.
...
#7 678.0 make[1]: *** [CMakeFiles/Makefile2:6055: third_party/boringssl-with-bazel/CMakeFiles/ssl.dir/all] Error 2
#7 678.1 [ 33%] Built target absl_bad_optional_access
#7 678.2 [ 33%] Built target absl_throw_delegate
#7 678.2 [ 33%] Linking CXX static library libabsl_strings_internal.a
#7 678.3 [ 33%] Built target absl_debugging_internal
#7 678.6 [ 33%] Built target absl_strings_internal
#7 679.3 [ 33%] Linking CXX static library libabsl_base.a
#7 679.8 [ 33%] Built target absl_base
#7 682.9 make[1]: *** [CMakeFiles/Makefile2:5503: third_party/re2/CMakeFiles/re2.dir/all] Error 2
#7 685.4 make[1]: *** [CMakeFiles/Makefile2:5025: third_party/protobuf/CMakeFiles/libprotobuf-lite.dir/all] Error 2
#7 686.6 [ 33%] Linking C static library libcrypto.a
#7 686.8 [ 33%] Built target crypto
#7 687.3 make[1]: *** [CMakeFiles/Makefile2:4950: third_party/protobuf/CMakeFiles/libprotobuf.dir/all] Error 2
#7 687.3 make: *** [Makefile:130: all] Error 2
------
executor failed running [/bin/sh -c git clone --recurse-submodules -b $GRPC_RELEASE_TAG --depth 1 --shallow-submodules https://github.com/grpc/grpc /var/local/git/grpc &&     cd /var/local/git/grpc &&     mkdir -p cmake/build &&     cd cmake/build &&     cmake -DgRPC_INSTALL=ON -DgRPC_BUILD_TESTS=OFF ../.. &&     make -j && make install]: exit code: 2
make[5]: *** [build-build-sdk-image-base] Error 1
make[4]: *** [ensure-image] Error 2
make[3]: *** [ensure-build-sdk-image-base] Error 2
make[2]: *** [ensure-image] Error 2
make[1]: *** [ensure-build-sdk-image] Error 2

Current command is...

RUN git clone --recurse-submodules -b $GRPC_RELEASE_TAG --depth 1 --shallow-submodules https://github.com/grpc/grpc /var/local/git/grpc && \
    cd /var/local/git/grpc && \
    mkdir -p cmake/build && \
    cd cmake/build && \
    cmake -DgRPC_INSTALL=ON -DgRPC_BUILD_TESTS=OFF ../.. && \
    make -j && make install

I think -DCMAKE_BUILD_TYPE=Release is needed according to the official example: https://github.com/grpc/grpc/blob/master/test/distrib/cpp/run_distrib_test_cmake_module_install.sh#L35-L36
After adding -DCMAKE_BUILD_TYPE=Release flag, I can build and install gRPC in my local container.

Does this make sense?

If this is only an issue on my local machine, please ignore it. If a PR is needed, I'll send one.

@govargo
Contributor

govargo commented Oct 27, 2022

Sorry, my comment above was wrong.
The most important change is Mark's addition of -j$(nproc).
Adding only -DCMAKE_BUILD_TYPE=Release failed too.

Please ignore my comment. It's my misunderstanding.
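For context, `Killed signal terminated program cc1plus` in the log above usually means the kernel killed the compiler because the build ran out of memory, which is what an unbounded `make -j` tends to cause. On memory-constrained machines, even `$(nproc)` jobs may be too many. A minimal sketch of capping the job count by memory as well, assuming a hypothetical helper `pick_jobs` and a guessed per-job memory figure (neither is part of this PR):

```shell
# pick_jobs CORES MEM_KB PER_JOB_KB  (hypothetical helper, not from the PR)
# Prints the smaller of CORES and MEM_KB / PER_JOB_KB, but never less than 1.
pick_jobs() {
  pj_cores="$1"
  pj_mem_kb="$2"
  pj_per_job_kb="$3"

  # How many jobs fit in memory, using integer division.
  pj_by_mem=$(( pj_mem_kb / pj_per_job_kb ))

  pj_result="$pj_cores"
  if [ "$pj_by_mem" -lt "$pj_result" ]; then
    pj_result="$pj_by_mem"
  fi
  if [ "$pj_result" -lt 1 ]; then
    pj_result=1
  fi
  echo "$pj_result"
}

# Example: 8 cores, 4 GiB available, assuming about 1.5 GiB per compiler process.
pick_jobs 8 4194304 1572864   # prints 2
```

On Linux, the available memory could be read from `MemAvailable` in `/proc/meminfo` and the result passed to `make -j`. The 1.5 GiB per-job figure is only a guess and would need tuning for a real build.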

@markmandel
Member Author

Gentle bump @roberthbailey 😄

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: ad2427ee-d5d0-4343-917e-a056f8a1c4f2

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2769/head:pr_2769 && git checkout pr_2769
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.tag=1.28.0-5aa797d-amd64

@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: markmandel, roberthbailey

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [markmandel,roberthbailey]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@agones-bot
Collaborator

Build Failed 😱

Build Id: d2989f51-6383-461f-8319-30e07b55c688

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Member Author

Weird SDK conformance flake:

{"grpcEndpoint":":9001","message":"Could not listen on grpc endpoint","severity":"fatal","source":"main","time":"2022-10-31T18:14:36.43899715Z"}
   Compiling tokio-macros v1.8.0
   Compiling tracing-attributes v0.1.23
   Compiling prost-derive v0.8.0
   Compiling pin-project-internal v1.0.12
   Compiling async-stream-impl v0.3.3
   Compiling thiserror-impl v1.0.37
make[1]: *** [includes/sdk.mk:145: run-sdk-conformance-no-build] Error 1
make: *** [includes/sdk.mk:181: run-sdk-conformance-test-rest] Error 2

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 982a7b20-75d3-4f07-849f-cff233a0a3e3

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/2769/head:pr_2769 && git checkout pr_2769
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.tag=1.28.0-3737160-amd64

@markmandel markmandel merged commit d032d44 into googleforgames:main Oct 31, 2022
@markmandel markmandel deleted the build/sdk-bullseye branch October 31, 2022 19:44
@markmandel markmandel added this to the 1.28.0 milestone Oct 31, 2022
chiayi pushed a commit to chiayi/agones that referenced this pull request Nov 17, 2022
nodepools and regional clusters

Updates to release checklist. (googleforgames#2772)

* Updates to release checklist.

Adding items that showed up in the recent release that were not written
down or required better clarification.

* Review updates, and some other small tweaks.

Co-authored-by: Robert Bailey <robertbailey@google.com>

Release 1.27.0 (googleforgames#2776)

* Release 1.27.0

* Update FAQ on ExternalDNS (googleforgames#2773)

The feature flag it points to has been moved to stable, so the link
is not useful any more.

Also removed notes on ipv6, since they aren't 100% accurate, as we were
discussing in googleforgames#2767.

* Updates to release checklist. (googleforgames#2772)

* Updates to release checklist.

Adding items that showed up in the recent release that were not written
down or required better clarification.

* Review updates, and some other small tweaks.

Co-authored-by: Robert Bailey <robertbailey@google.com>

* Release-changes

* Review comment

* Review changes

Co-authored-by: Mark Mandel <markmandel@google.com>
Co-authored-by: Robert Bailey <robertbailey@google.com>

Version updates (googleforgames#2778)

Players in-game metric for when PlayerTracking is enabled (googleforgames#2765)

* Check for DeletionTimestamp of fleet and gameserverset before scaling

* Add metric to track player count in gameservers

* check PlayerStatus is not nil

* Update metrics available in docs

* Wrong relref path

* typo

* Change name for players in game metric to player connected. Add player capacity metric. Hide docs until next agones release.

* Duplicate metrics table

* add gameserver player tracking metrics to fleetViews

Co-authored-by: Mark Mandel <markmandel@google.com>

Remove generation for swagger Go code and Add static swagger codes for test (googleforgames#2757)

Co-authored-by: Mark Mandel <markmandel@google.com>

Updated allocation yaml files under examples/ to use selectors

Show how to set graceful termination in a game server that is safe to evict. (googleforgames#2780)

Avoid retry from allocateFromLocalCluster under context kill. (googleforgames#2783)

* Version updates

* issue-2736-changes

Co-authored-by: Mark Mandel <markmandel@google.com>

Bring SDK base image to debian:bullseye (googleforgames#2769)

* Bring SDK base image to debian:bullseye

The upgrade to gRPC solved one issue, and I also added a limit to the number
of processes that could run for `make -j`; otherwise the whole thing
would fall over (and would also crash my dev machine!).

Closes googleforgames#2224

* Force refresh of cpp cache on Cloud Build.

* Fixes for CI:

* Revert CI cache increment (don't think we need it)
* Add shell to cpp image for debugging.
* Fix formatting issue that is breaking CI.

Co-authored-by: Robert Bailey <robertbailey@google.com>

Update health-checking.md (googleforgames#2785)

Fixed spell error: spec.health.failureTheshold to spec.health.failureThreshold

Updated allocation yaml files under examples/ to use selectors (googleforgames#2787)

Cleanup of load tests (googleforgames#2784)

* issue-2744 updated changes with new description
* 2744 review changes

Sync Pod host ports back to GameServer in GCP (googleforgames#2782)

This is the start of the implementation for googleforgames#2777:

* Most of this is mechanical and implements a thin cloud product
abstraction layer in pkg/cloud, instantiated with New(product). The
product abstraction provides a single function so far:
SyncPodPortsToGameServer.

* SyncPodPortsToGameServer is inserted as a hook while syncing
IP/ports, to let different cloud providers handle port allocation
slightly differently (in this case, GKE Autopilot)

* In GKE Autopilot, we look for a JSON string like
`{"min":7000,"max":8000,"portsAssigned":{"7001":7737,"7002":7738}}`
as an indication that the host ports were reassigned (per policy).
As a side note to anyone watching, this is currently an unreleased
feature. If we see this, we use the provided mapping to map the
host ports in the GameServer.Spec.

With this change, it's possible to launch a GameServer and get a
healthy GameServer Pod by adding the following annotation:

```
annotations:
  cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
  autopilot.gke.io/host-port-assignment: '{"min": 7000, "max": 8000}'
```

If this PR causes any issues, the cloud product auto detection can
be disabled by setting `agones.cloudProduct=generic`, or forced to
GKE Autopilot using `agones.cloudProduct=gke-autopilot`.

In a future PR, I will add the host-port-assignment annotation
automatically on Autopilot

Co-authored-by: Mark Mandel <markmandel@google.com>

Update gke terraform files to allow autoscaling

Fix (not really) problems reported by VSCode (googleforgames#2790)

VSCode reports `main redeclared` between allocationload.go and
runscenario.go due to the fact that they both look like `package main`
binaries in the same directory, similar to e.g. [this poster on a
different
project](https://stackoverflow.com/questions/66970531/vs-code-go-main-redeclared-in-this-block)

To fix it, it's easy enough to just give these binaries their own
package path and fix up the calling scripts.

Along the way, fix a lint complaint in runscenario.go

Add location variable for cluster location argument

Minor fix

changed default of location var to empty string

GameServerRestartBeforeReadyCrash: Run serially (googleforgames#2791)

Narrow the race in googleforgames#2445 by running GameServerRestartBeforeReadyCrash serially. See googleforgames#2445 (comment) for a detailed analysis.

Does not fix the issue - this is stopgap until we understand how to fix it.

Enable fieldalignment linter, then mostly ignore it (googleforgames#2795)

Enable the fieldalignment linter by enabling all `govet` checks
except shadowing. Ignore large swaths of code (tests, cmd/, APIs),
and nolint'd existing complaints that seemed irrelevant.

Along the way:

* removed existing nolint:maligned, as `maligned` is no more.
* disabled `structcheck` and `deadcode` as they are deprecated (and I
think have been subsumed by other linters?)
* changed `gameServerCacheEntry` to `gameServerCache`. It is the
cache, not just an entry.
* fixed alignment of `gameServerSetCacheEntry`.

Add fswatch library to watch and batch filesystem events, use in allocator (googleforgames#2792)

This pull refactors the fsnotify code in allocator/main out to a
shared library, and in that shared library implements a batched
notification processor.

Closes googleforgames#1816: This takes a slightly different approach than specified
in the issue, instead choosing to just delay processing until after a
batch processing period. I chose 1s - it's far longer than necessary,
but still much shorter than it takes for the secret changes to
propagate to the container anyways.

I considered the approach in googleforgames#1816 of trying to parse the actual
events, but it's too fiddly to get exactly right: e.g. maybe you only
refresh on "write", but then "chmod" could make the file readable
whereas it wasn't before, "rename" could expose a file that wasn't
there before, etc.

Cloud product: Split port allocators, implement Autopilot port allocation/policies (googleforgames#2789)

In the Agones on GKE Autopilot implementation, we have no need for the
port allocator - the informer/etc. is an unnecessary moving piece.
This PR allows for cloud products to provide their own port allocation
implementation, and implements the GKE Autopilot "allocator". We do
this by:

* Splitting portallocator off to its own package. It was basically
self-sufficient anyways, except it was a little too friendly with
controller_test.go. I solved that by introducing a TestInterface for
controller_test.go to upcast to.

* Allow cloud product implementations to define their own port
allocator.

* Defining a new port allocator for GKE that does a simple per-port
HostPort allocation, and adds the host-port-assignment annotation to
the pod template.

* Extend cloudproduct again to add a GameServer validator

* And in Autopilot, reject if the PortPolicy is not `Dynamic`

Release: Note to switch away from `agones-images` (googleforgames#2809)

Since we have few guardrails on accidentally touching `agones-images`
project, adding a note in the release checklist to switch back to a
local development project after running a release.

Flake: TestControllerGameServerCount (googleforgames#2805)

Made it deterministic in the test, and got rid of the potential race
conditions.

Also fix it such that the util function for generating GameServer names
always produces a unique name.

Closes googleforgames#2804

Co-authored-by: Robert Bailey <robertbailey@google.com>

Remove Windows FAQ Entry (googleforgames#2811)

The contents are no longer accurate, and are covered in the installation
section now.

Makefile changes for adding location variable

added autoscale parameters to Makefile and README

Markdown fix in readme

Changed LOCATION to always be set with ZONE as default

use  only if the variable has a value

fixed extraneous characters

update gke terraform example module

Update Node.js dependencies and package (googleforgames#2815)

* Update all dependencies and Node.js to LTS version

* Update other docker images that use Node.js

Added autoscale to example cluster and added to website docs

Added defaults and feature expiry

Remove zone from gke/variable.tf file.
Labels
approved area/build-tools Development tooling. I.e. pretty much everything in the `build` directory. kind/cleanup Refactoring code, fixing up documentation, etc lgtm size/S
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upgrade build tools from debian buster to bullseye
5 participants