Temporarily disable 2-node tests #3812
ci.yaml
on: pull_request
metadata
0s
Matrix: amd64 / test-distribution
Matrix: arm64 / test-distribution
Matrix: amd64 / test-jax / run-unit-test
amd64 / ... / launch-slurm-runner
40m 31s
amd64 / test-nsys-jax-eks
3m 48s
Matrix: amd64 / test-nsys-jax / run-unit-test
Matrix: arm64 / test-jax / run-unit-test
Waiting for pending jobs
arm64 / test-nsys-jax-eks
0s
arm64 / ... / launch-slurm-runner
Matrix: arm64 / test-nsys-jax / run-unit-test
Waiting for pending jobs
Matrix: amd64 / test-te / run-unit-test
Waiting for pending jobs
Matrix: amd64 / test-upstream-pax / pax-multi-node
Waiting for pending jobs
Matrix: amd64 / test-upstream-pax / single-process-evaluation
Waiting for pending jobs
Matrix: amd64 / test-upstream-pax / single-process-multi-device
Waiting for pending jobs
Matrix: amd64 / test-te-multigpu / te-multi-gpu
Waiting for pending jobs
Matrix: amd64 / test-upstream-t5x / t5x-multi-gpu
Matrix: amd64 / test-gemma / run-unit-test
Matrix: amd64 / test-levanter / run-unit-test
Matrix: amd64 / test-maxtext / maxtext-multinode
Matrix: amd64 / test-maxtext / single-process-multi-device
Matrix: amd64 / test-triton / run-unit-test
Matrix: amd64 / test-nsys-jax-archive
Matrix: arm64 / test-te / run-unit-test
Waiting for pending jobs
Matrix: arm64 / test-upstream-pax / pax-multi-node
Waiting for pending jobs
Matrix: arm64 / test-upstream-pax / single-process-evaluation
Waiting for pending jobs
Matrix: arm64 / test-upstream-pax / single-process-multi-device
Waiting for pending jobs
Matrix: arm64 / test-te-multigpu / te-multi-gpu
Waiting for pending jobs
Matrix: arm64 / test-upstream-t5x / t5x-multi-gpu
Waiting for pending jobs
Matrix: arm64 / test-gemma / run-unit-test
Waiting for pending jobs
Matrix: arm64 / test-levanter / run-unit-test
Waiting for pending jobs
Matrix: arm64 / test-maxtext / maxtext-multinode
Waiting for pending jobs
Matrix: arm64 / test-maxtext / single-process-multi-device
Waiting for pending jobs
Matrix: arm64 / test-triton / run-unit-test
Waiting for pending jobs
Matrix: arm64 / test-nsys-jax-archive
Matrix: amd64 / test-rosetta-pax / rosetta-pax-multi-node-te
Waiting for pending jobs
Matrix: amd64 / test-rosetta-pax / rosetta-pax-multi-node
Waiting for pending jobs
Matrix: amd64 / test-rosetta-pax / rosetta-pax-single-node-dropout-te
Waiting for pending jobs
Matrix: amd64 / test-rosetta-pax / single-process-evaluation-te
Waiting for pending jobs
Matrix: amd64 / test-rosetta-pax / single-process-multi-device-te
Waiting for pending jobs
Matrix: amd64 / test-rosetta-t5x / single-process-multi-device
Matrix: amd64 / test-rosetta-t5x / vit-multi-gpu-multi-node
Matrix: amd64 / test-rosetta-t5x / vit-single-process-multi-device
Matrix: arm64 / test-rosetta-pax / rosetta-pax-multi-node-te
Waiting for pending jobs
Matrix: arm64 / test-rosetta-pax / rosetta-pax-multi-node
Waiting for pending jobs
Matrix: arm64 / test-rosetta-pax / rosetta-pax-single-node-dropout-te
Waiting for pending jobs
Matrix: arm64 / test-rosetta-pax / single-process-evaluation-te
Waiting for pending jobs
Matrix: arm64 / test-rosetta-pax / single-process-multi-device-te
Waiting for pending jobs
Matrix: arm64 / test-rosetta-t5x / single-process-multi-device
Waiting for pending jobs
Matrix: arm64 / test-rosetta-t5x / vit-multi-gpu-multi-node
Waiting for pending jobs
Matrix: arm64 / test-rosetta-t5x / vit-single-process-multi-device
Waiting for pending jobs
Matrix: publish-containers
Waiting for pending jobs
finalize / publish-badge
Annotations
19 errors and 6 warnings
arm64 / build-upstream-pax / build-upstream-pax
buildx failed with: ERROR: failed to solve: process "/bin/sh -c <<\"EOF\" bash -exu -o pipefail\ngit-clone.sh ${URLREF_LINGVO} ${SRC_PATH_LINGVO}\npushd ${SRC_PATH_LINGVO}\n\nCPU_ARCH=\"$(dpkg --print-architecture)\"\nif [[ \"${CPU_ARCH}\" == \"arm64\" ]]; then\n\n# Use aarch distribution of protobufs\npatch -p1 <<\"EOFINNER\"\ndiff --git a/lingvo/repo.bzl b/lingvo/repo.bzl\nindex ce65822d2..d9c0277aa 100644\n--- a/lingvo/repo.bzl\n+++ b/lingvo/repo.bzl\n@@ -232,9 +232,9 @@ filegroup(\n )\n \"\"\",\n urls = [\n- \"https://github.com/protocolbuffers/protobuf/releases/download/v21.9/protoc-21.9-linux-x86_64.zip\",\n+ \"https://github.com/protocolbuffers/protobuf/releases/download/v21.9/protoc-21.9-linux-aarch_64.zip\",\n ],\n- sha256 = \"3cd951aff8ce713b94cde55e12378f505f2b89d47bf080508cf77e3934f680b6\",\n+ sha256 = \"a584286dfa8ebb17032ece206ed74d5e9931e2edb9016e427be2a0dab3b21071\",\n )\n\n def icu():\nEOFINNER\n\nfi\n\npip install tensorflow_datasets==4.9.2 auditwheel tensorflow==2.18.0\nfor pattern in \\\n \"s|tensorflow=|#tensorflow=|g\" \\\n \"s|dataclasses=|#dataclasses=|g\" \\\n \"s|==.*||g\" \\\n; do\n sed -i \"${pattern}\" ${SRC_PATH_LINGVO}/docker/dev.requirements.txt\ndone\n# Lingvo support only python < 3.12, so we hack it and update dependencies\n# to be able to build for py-3.12\nfor pattern in \\\n \"s|tensorflow-text~=2.13.0|tensorflow-text~=2.18.1|g\" \\\n \"s|tensorflow~=2.13.0|tensorflow~=2.18.0|g\" \\\n \"s|python_requires='>=3.8,<3.11'|python_requires='>=3.8,<3.13'|\" \\\n; do\n sed -i \"${pattern}\" ${SRC_PATH_LINGVO}/pip_package/setup.py;\ndone\npip install -r docker/dev.requirements.txt\n\n# Some tests are flaky right now, so we skip running the tests.\nBUILD_ARCH=\"x86_64\"\nif [[ \"$CPU_ARCH\" == \"arm64\" ]]; then\n BUILD_ARCH=\"aarch64\";\nfi\nsed -i 's/manylinux2014_x86_64/manylinux_2_38_'\"${BUILD_ARCH}\"'/' pip_package/build.sh\nSKIP_TESTS=1 PYTHON_MINOR_VERSION=$(python --version | cut -d ' ' -f 2 | cut -d '.' 
-f 2) pip_package/build.sh\nEOF" did not complete successfully: exit code: 1
|
amd64 / build-equinox / build-equinox
The annotation body is GitHub's "Unicorn! · GitHub" HTML error page (page markup and inline base64 logo omitted; the captured output is truncated).
|
amd64 / build-upstream-pax / build-upstream-pax
buildx failed with: ERROR: failed to solve: process "/bin/sh -c <<\"EOF\" bash -exu -o pipefail\ngit-clone.sh ${URLREF_LINGVO} ${SRC_PATH_LINGVO}\npushd ${SRC_PATH_LINGVO}\n\nCPU_ARCH=\"$(dpkg --print-architecture)\"\nif [[ \"${CPU_ARCH}\" == \"arm64\" ]]; then\n\n# Use aarch distribution of protobufs\npatch -p1 <<\"EOFINNER\"\ndiff --git a/lingvo/repo.bzl b/lingvo/repo.bzl\nindex ce65822d2..d9c0277aa 100644\n--- a/lingvo/repo.bzl\n+++ b/lingvo/repo.bzl\n@@ -232,9 +232,9 @@ filegroup(\n )\n \"\"\",\n urls = [\n- \"https://github.com/protocolbuffers/protobuf/releases/download/v21.9/protoc-21.9-linux-x86_64.zip\",\n+ \"https://github.com/protocolbuffers/protobuf/releases/download/v21.9/protoc-21.9-linux-aarch_64.zip\",\n ],\n- sha256 = \"3cd951aff8ce713b94cde55e12378f505f2b89d47bf080508cf77e3934f680b6\",\n+ sha256 = \"a584286dfa8ebb17032ece206ed74d5e9931e2edb9016e427be2a0dab3b21071\",\n )\n\n def icu():\nEOFINNER\n\nfi\n\npip install tensorflow_datasets==4.9.2 auditwheel tensorflow==2.18.0\nfor pattern in \\\n \"s|tensorflow=|#tensorflow=|g\" \\\n \"s|dataclasses=|#dataclasses=|g\" \\\n \"s|==.*||g\" \\\n; do\n sed -i \"${pattern}\" ${SRC_PATH_LINGVO}/docker/dev.requirements.txt\ndone\n# Lingvo support only python < 3.12, so we hack it and update dependencies\n# to be able to build for py-3.12\nfor pattern in \\\n \"s|tensorflow-text~=2.13.0|tensorflow-text~=2.18.1|g\" \\\n \"s|tensorflow~=2.13.0|tensorflow~=2.18.0|g\" \\\n \"s|python_requires='>=3.8,<3.11'|python_requires='>=3.8,<3.13'|\" \\\n; do\n sed -i \"${pattern}\" ${SRC_PATH_LINGVO}/pip_package/setup.py;\ndone\npip install -r docker/dev.requirements.txt\n\n# Some tests are flaky right now, so we skip running the tests.\nBUILD_ARCH=\"x86_64\"\nif [[ \"$CPU_ARCH\" == \"arm64\" ]]; then\n BUILD_ARCH=\"aarch64\";\nfi\nsed -i 's/manylinux2014_x86_64/manylinux_2_38_'\"${BUILD_ARCH}\"'/' pip_package/build.sh\nSKIP_TESTS=1 PYTHON_MINOR_VERSION=$(python --version | cut -d ' ' -f 2 | cut -d '.' 
-f 2) pip_package/build.sh\nEOF" did not complete successfully: exit code: 1
|
amd64 / test-jax / jax-A100-unit-test
Process completed with exit code 1.
|
amd64 / test-nsys-jax / nsys-jax-A100-unit-test
Process completed with exit code 1.
|
amd64 / test-nsys-jax / nsys-jax-A100-unit-test
Process completed with exit code 4.
|
amd64 / test-triton / triton-A100-unit-test
Process completed with exit code 1.
|
amd64 / test-rosetta-t5x / vit-multi-gpu-multi-node (1, 1)
The job running on runner jumpbox-vc69x-w5njr has exceeded the maximum execution time of 360 minutes.
|
amd64 / test-rosetta-t5x / vit-multi-gpu-multi-node (1, 1)
The operation was canceled.
|
amd64 / test-rosetta-t5x / vit-single-process-multi-device (8)
The job running on runner jumpbox-vc69x-rbqcb has exceeded the maximum execution time of 360 minutes.
|
amd64 / test-rosetta-t5x / vit-single-process-multi-device (8)
The operation was canceled.
|
amd64 / test-rosetta-t5x / single-process-multi-device (1P1G_te-0, 1, --enable-te 0)
The job running on runner jumpbox-vc69x-qb666 has exceeded the maximum execution time of 360 minutes.
|
amd64 / test-rosetta-t5x / single-process-multi-device (1P1G_te-0, 1, --enable-te 0)
The operation was canceled.
|
amd64 / test-upstream-t5x / test-upstream-t5x-outcome
Process completed with exit code 1.
|
amd64 / test-rosetta-t5x / test-t5x-rosetta-outcome
Process completed with exit code 1.
|
amd64 / test-maxtext / test-maxtext-outcome
Process completed with exit code 1.
|
amd64 / test-gemma / gemma-A100-unit-test
Canceling since a higher priority waiting request for 'CI-olupton/disable-doomed-tests' exists
|
CI
Error when evaluating 'strategy' for job 'publish-containers'. .github/workflows/ci.yaml (Line: 424, Col: 15): Matrix vector 'config' does not contain any values
|
amd64 / build-levanter / build-levanter
Failed to download action 'https://api.github.com/repos/docker/metadata-action/tarball/369eb591f429131d6889c46b94e711f089e6ca96'. Error: Response status code does not indicate success: 503 (Service Unavailable). 819C:26C4B:254C2D5:267331E:67A488F0
|
amd64 / build-levanter / build-levanter
Back off 13.809 seconds before retry.
|
amd64 / test-jax / jax-A100-unit-test
Failed to download action 'https://api.github.com/repos/actions/checkout/tarball/11bd71901bbe5b1630ceea73d27597364c9af683'. Error: Response status code does not indicate success: 504 (Gateway Timeout).
|
amd64 / test-jax / jax-A100-unit-test
Back off 11.575 seconds before retry.
|
amd64 / test-jax / jax-A100-unit-test
Failed to download action 'https://api.github.com/repos/docker/login-action/tarball/9780b0c442fbb1117ed29e0efdff1e18412f7567'. Error: Response status code does not indicate success: 504 (Gateway Timeout).
|
amd64 / test-jax / jax-A100-unit-test
Back off 19.869 seconds before retry.
|
Artifacts
Produced during runtime
Name | Size
---|---
artifact-base-build-amd64 | 567 Bytes
artifact-base-build-arm64 | 566 Bytes
artifact-equinox-build-amd64 | 474 Bytes
artifact-equinox-build-arm64 | 569 Bytes
artifact-gemma-build-amd64 | 559 Bytes
artifact-jax-build-amd64 | 554 Bytes
artifact-jax-build-arm64 | 552 Bytes
artifact-levanter-build-amd64 | 572 Bytes
artifact-levanter-build-arm64 | 572 Bytes
artifact-maxtext-build-amd64 | 568 Bytes
artifact-maxtext-build-arm64 | 567 Bytes
artifact-maxtext-test | 661 Bytes
artifact-pax-build-amd64 | 472 Bytes
artifact-pax-build-arm64 | 471 Bytes
artifact-rosetta-build-t5x-amd64 | 584 Bytes
artifact-rosetta-build-t5x-arm64 | 583 Bytes
artifact-rosetta-t5x-mgmn-test | 632 Bytes
artifact-t5x-build-amd64 | 569 Bytes
artifact-t5x-build-arm64 | 567 Bytes
artifact-triton-build-amd64 | 565 Bytes
artifact-upstream-t5x-mgmn-test | 1.63 KB
jax-unit-test-A100 | 182 KB
levanter-unit-test-A100 | 15 KB
nsys-jax-unit-test-A100 | 30.5 MB
rosetta-t5x-13174448497-1P8G_te-1 | 5.26 MB
rosetta-t5x-vit-13174448497-VIT8G1N | 33.2 KB
triton-unit-test-A100 | 3.1 KB
upstream-maxtext-13174448497-1DP1FSDP1TP1PP | 16.1 KB
upstream-maxtext-13174448497-1DP1FSDP8TP1PP | 22.1 KB
upstream-maxtext-13174448497-1DP2FSDP4TP1PP_single_process | 16.4 KB
upstream-maxtext-13174448497-1DP4FSDP2TP1PP | 21.7 KB
upstream-maxtext-13174448497-1DP8FSDP1TP1PP | 22.1 KB
upstream-maxtext-13174448497-2DP2FSDP2TP1PP | 22.1 KB
upstream-maxtext-metrics-test-log | 2.94 KB
upstream-t5x-13174448497-1P2G_fmha | 6.4 MB
upstream-t5x-13174448497-1P8G | 6.4 MB
upstream-t5x-metrics-test-log | 7.78 KB