Storage release 2024-12-09 #10053

Merged
merged 65 commits into from
Dec 9, 2024
65 commits
45658cc
Update pgvector to 0.8.0 (#9733)
lubennikovaav Dec 2, 2024
5330122
test_runner: improve `wait_until` (#9936)
erikgrinaker Dec 2, 2024
bd09369
storcon: add metric for AZ scheduling violations (#9949)
jcsp Dec 2, 2024
cd1d2d1
fix(proxy): forward notifications from authentication (#9948)
conradludgate Dec 2, 2024
c18716b
CI(replication-tests): fix notifications about replication-tests fail…
bayandin Dec 2, 2024
1b60571
proxy: Create Elasticache credentials provider lazily (#9967)
cloneable Dec 2, 2024
fa909c2
Update consensus protocol spec (#9607)
arssher Dec 2, 2024
243bca1
Bump OTel, tracing, reqwest crates (#9970)
cloneable Dec 2, 2024
2dc238e
feat(proxy): emit JWT auth method and JWT issuer in parquet logs (#9971)
conradludgate Dec 2, 2024
d8ebd33
Stop changing the value of neon.extension_server_port at runtime (#9972)
tristan957 Dec 2, 2024
2e9207f
fix(testing): Use 1 MB shared_buffers even with LFC (#9969)
ololobus Dec 2, 2024
aaee713
storcon: use proper schedule context during node delete (#9958)
jcsp Dec 3, 2024
15d01b2
storcon_cli tenant-describe: include tenant-wide information in outpu…
problame Dec 3, 2024
cb10be7
page_service: batching observability & include throttled time in smgr…
problame Dec 3, 2024
a2a942f
Add support for the extensions test for Postgres v17 (#9748)
a-masterov Dec 3, 2024
dcb24ce
safekeeper,pageserver: add heap profiling (#9778)
erikgrinaker Dec 3, 2024
bbe4dfa
test_runner: use immediate shutdown in `test_sharded_ingest` (#9984)
erikgrinaker Dec 3, 2024
4d422b9
pageserver: only throttle pagestream requests & bring back throttling…
problame Dec 3, 2024
71d0042
storcon: in shard splits, inherit parent's AZ (#9946)
jcsp Dec 3, 2024
dcb6295
pageserver: only store SLRUs & aux files on shard zero (#9786)
jcsp Dec 3, 2024
b04ab46
pageserver: more detailed logs when calling re-attach (#9996)
jcsp Dec 3, 2024
27a42d0
chore(proxy): remove postgres config parser and md5 support (#9990)
conradludgate Dec 3, 2024
f312c65
pageserver: respond to multiple shutdown signals (#9982)
erikgrinaker Dec 3, 2024
3baef0b
Improvement: add console redirect timeout warning (#9985)
luixo Dec 3, 2024
9ef0662
chore(proxy): enforce single host+port (#9995)
conradludgate Dec 3, 2024
ca85f36
Support tenant manifests in the scrubber (#9942)
arpad-m Dec 3, 2024
944c1ad
tests & benchmarks: unify the way we customize the default tenant con…
problame Dec 3, 2024
023821a
test_page_service_batching: fix non-numeric metrics (#9998)
bayandin Dec 3, 2024
8d93d02
page_service: enable batching in Rust & Python Tests + Python benchma…
problame Dec 4, 2024
68205c4
storcon: return an error for drain attempts while paused (#9997)
VladLazar Dec 4, 2024
1b3558d
optimize parms for ingest bench (#9999)
Bodobolero Dec 4, 2024
9d75218
fix parsing human time output like "50m37s" (#10001)
Bodobolero Dec 4, 2024
7b18e33
pageserver: return proper status code for heatmap_upload errors (#9991)
erikgrinaker Dec 4, 2024
dcd016b
Assign /libs/proxy/ to proxy team (#10003)
cloneable Dec 4, 2024
bd52822
feat(proxy): add option to forward startup params (#9979)
conradludgate Dec 4, 2024
9a4157d
feat(compute): Set default application_name for pgbouncer connections…
ololobus Dec 4, 2024
699a213
Display reqwest error source (#10004)
erikgrinaker Dec 4, 2024
dec2e2f
Create a branch for compute release (#9637)
a-masterov Dec 4, 2024
60c0d19
tests: make storcon scale test AZ-aware (#9952)
jcsp Dec 4, 2024
e6cd505
pageserver: make `BufferedWriter` do double-buffering (#9693)
yliang412 Dec 4, 2024
0bab7e3
chore: update clap (#10009)
conradludgate Dec 4, 2024
131585e
chore: update rust-postgres (#10002)
conradludgate Dec 4, 2024
ed2d892
pageserver: fix buffered-writer on macos build (#10019)
yliang412 Dec 5, 2024
ffc9c33
proxy: Present new auth backend cplane_proxy_v1 (#10012)
awarus Dec 5, 2024
db79304
storage_controller: increase shard scan timeout (#10000)
erikgrinaker Dec 5, 2024
13e8105
feat(compute): Allow specifying the reconfiguration concurrency (#10006)
ololobus Dec 5, 2024
c0ba416
Add compute_logical_snapshots_bytes metric (#9887)
tristan957 Dec 5, 2024
71f38d1
feat(pageserver): support schedule gc-compaction (#9809)
skyzh Dec 5, 2024
6331cb2
Bump anyhow to 1.0.94 (#10028)
tristan957 Dec 5, 2024
6ff4175
Send Content-Type header on reconfigure request from neon_local (#10029)
tristan957 Dec 5, 2024
d1ab747
Fix desc_str for Azure container (#10021)
arpad-m Dec 5, 2024
56f867b
pageserver: only zero truncated FSM page on owning shard (#10032)
erikgrinaker Dec 6, 2024
ec4072f
pageserver: add `wait_until_flushed` parameter for timeline checkpoin…
erikgrinaker Dec 6, 2024
3f1c542
pageserver: add disk consistent and remote lsn metrics (#10005)
VladLazar Dec 6, 2024
7838659
pageserver: assert that keys belong to shard (#9943)
erikgrinaker Dec 6, 2024
fa07097
chore: Reorganize and refresh CODEOWNERS (#10008)
ololobus Dec 6, 2024
cc70fc8
pageserver: add metric for number of wal records received by each sha…
VladLazar Dec 6, 2024
14c4fae
test_runner/performance: add improved bulk insert benchmark (#9812)
erikgrinaker Dec 6, 2024
e4837b0
Bump sql_exporter to 0.16.0 (#10041)
tristan957 Dec 6, 2024
c42c28b
feat(pageserver): gc-compaction split job and partial scheduler (#9897)
skyzh Dec 6, 2024
b6eea65
Fix error message if PS connection is lost while receiving prefetch (…
hlinnaka Dec 6, 2024
b1fd086
test(pageserver): disable gc_compaction smoke test for now (#10045)
skyzh Dec 6, 2024
4d7111f
page_service: don't count time spent flushing towards smgr latency me…
problame Dec 7, 2024
ec79087
storcon: automatically clear Pause/Stop scheduling policies to enable…
jcsp Dec 7, 2024
6c349e7
Storage release 2024-12-09
github-actions[bot] Dec 9, 2024
3 changes: 2 additions & 1 deletion .github/actions/allure-report-generate/action.yml
@@ -43,7 +43,8 @@ runs:
  PR_NUMBER=$(jq --raw-output .pull_request.number "$GITHUB_EVENT_PATH" || true)
  if [ "${PR_NUMBER}" != "null" ]; then
  BRANCH_OR_PR=pr-${PR_NUMBER}
- elif [ "${GITHUB_REF_NAME}" = "main" ] || [ "${GITHUB_REF_NAME}" = "release" ] || [ "${GITHUB_REF_NAME}" = "release-proxy" ]; then
+ elif [ "${GITHUB_REF_NAME}" = "main" ] || [ "${GITHUB_REF_NAME}" = "release" ] || \
+ [ "${GITHUB_REF_NAME}" = "release-proxy" ] || [ "${GITHUB_REF_NAME}" = "release-compute" ]; then
  # Shortcut for special branches
  BRANCH_OR_PR=${GITHUB_REF_NAME}
  else
3 changes: 2 additions & 1 deletion .github/actions/allure-report-store/action.yml
@@ -23,7 +23,8 @@ runs:
  PR_NUMBER=$(jq --raw-output .pull_request.number "$GITHUB_EVENT_PATH" || true)
  if [ "${PR_NUMBER}" != "null" ]; then
  BRANCH_OR_PR=pr-${PR_NUMBER}
- elif [ "${GITHUB_REF_NAME}" = "main" ] || [ "${GITHUB_REF_NAME}" = "release" ] || [ "${GITHUB_REF_NAME}" = "release-proxy" ]; then
+ elif [ "${GITHUB_REF_NAME}" = "main" ] || [ "${GITHUB_REF_NAME}" = "release" ] || \
+ [ "${GITHUB_REF_NAME}" = "release-proxy" ] || [ "${GITHUB_REF_NAME}" = "release-compute" ]; then
  # Shortcut for special branches
  BRANCH_OR_PR=${GITHUB_REF_NAME}
  else
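The two Allure action hunks above apply the same naming rule: a pull request number wins, otherwise one of the four special branches is used verbatim. A minimal sketch of that rule as a standalone function follows. The function name `branch_or_pr` and its argument order are hypothetical, and the final fallback branch is collapsed in the diff, so the sketch returns an error there instead of guessing.

```shell
# Hypothetical helper mirroring only the branches visible in the hunks above.
# Usage: branch_or_pr <pr_number_or_null> <ref_name>
branch_or_pr() {
  pr_number="$1"
  ref_name="$2"
  if [ "${pr_number}" != "null" ]; then
    # A pull request number takes priority over the branch name.
    echo "pr-${pr_number}"
    return 0
  fi
  case "${ref_name}" in
    main|release|release-proxy|release-compute)
      # Shortcut for special branches
      echo "${ref_name}"
      ;;
    *)
      # The fallback is not shown in the diff, so it is not modelled here.
      return 1
      ;;
  esac
}

branch_or_pr 10053 release          # prints: pr-10053
branch_or_pr null release-compute   # prints: release-compute
```

A `case` statement is used here instead of the chained `[ ... ] || [ ... ]` tests only to keep the sketch short; the workflow itself keeps the chained form.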
2 changes: 1 addition & 1 deletion .github/workflows/_create-release-pr.yml
@@ -21,7 +21,7 @@ defaults:
  shell: bash -euo pipefail {0}

  jobs:
- create-storage-release-branch:
+ create-release-branch:
  runs-on: ubuntu-22.04

  permissions:
2 changes: 1 addition & 1 deletion .github/workflows/benchmarking.yml
@@ -249,7 +249,7 @@ jobs:

  # Post both success and failure to the Slack channel
  - name: Post to a Slack channel
- if: ${{ github.event.schedule }}
+ if: ${{ github.event.schedule && !cancelled() }}
  uses: slackapi/slack-github-action@v1
  with:
  channel-id: "C06T9AMNDQQ" # on-call-compute-staging-stream
38 changes: 23 additions & 15 deletions .github/workflows/build_and_test.yml
@@ -6,6 +6,7 @@ on:
  - main
  - release
  - release-proxy
+ - release-compute
  pull_request:

  defaults:
@@ -70,8 +71,10 @@ jobs:
  echo "tag=release-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT
  elif [[ "$GITHUB_REF_NAME" == "release-proxy" ]]; then
  echo "tag=release-proxy-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT
+ elif [[ "$GITHUB_REF_NAME" == "release-compute" ]]; then
+ echo "tag=release-compute-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT
  else
- echo "GITHUB_REF_NAME (value '$GITHUB_REF_NAME') is not set to either 'main' or 'release'"
+ echo "GITHUB_REF_NAME (value '$GITHUB_REF_NAME') is not set to either 'main' or 'release', 'release-proxy', 'release-compute'"
  echo "tag=$GITHUB_RUN_ID" >> $GITHUB_OUTPUT
  fi
  shell: bash
@@ -513,7 +516,7 @@ jobs:
  })

  trigger-e2e-tests:
- if: ${{ !github.event.pull_request.draft || contains( github.event.pull_request.labels.*.name, 'run-e2e-tests-in-draft') || github.ref_name == 'main' || github.ref_name == 'release' || github.ref_name == 'release-proxy' }}
+ if: ${{ !github.event.pull_request.draft || contains( github.event.pull_request.labels.*.name, 'run-e2e-tests-in-draft') || github.ref_name == 'main' || github.ref_name == 'release' || github.ref_name == 'release-proxy' || github.ref_name == 'release-compute' }}
  needs: [ check-permissions, promote-images, tag ]
  uses: ./.github/workflows/trigger-e2e-tests.yml
  secrets: inherit
@@ -669,7 +672,7 @@ jobs:
  neondatabase/compute-node-${{ matrix.version.pg }}:${{ needs.tag.outputs.build-tag }}-${{ matrix.version.debian }}-${{ matrix.arch }}

  - name: Build neon extensions test image
- if: matrix.version.pg == 'v16'
+ if: matrix.version.pg >= 'v16'
  uses: docker/build-push-action@v6
  with:
  context: .
@@ -684,8 +687,7 @@ jobs:
  pull: true
  file: compute/compute-node.Dockerfile
  target: neon-pg-ext-test
- cache-from: type=registry,ref=cache.neon.build/neon-test-extensions-${{ matrix.version.pg }}:cache-${{ matrix.version.debian }}-${{ matrix.arch }}
- cache-to: ${{ github.ref_name == 'main' && format('type=registry,ref=cache.neon.build/neon-test-extensions-{0}:cache-{1}-{2},mode=max', matrix.version.pg, matrix.version.debian, matrix.arch) || '' }}
+ cache-from: type=registry,ref=cache.neon.build/compute-node-${{ matrix.version.pg }}:cache-${{ matrix.version.debian }}-${{ matrix.arch }}
  tags: |
  neondatabase/neon-test-extensions-${{ matrix.version.pg }}:${{needs.tag.outputs.build-tag}}-${{ matrix.version.debian }}-${{ matrix.arch }}

@@ -708,7 +710,7 @@ jobs:
  push: true
  pull: true
  file: compute/compute-node.Dockerfile
- cache-from: type=registry,ref=cache.neon.build/neon-test-extensions-${{ matrix.version.pg }}:cache-${{ matrix.version.debian }}-${{ matrix.arch }}
+ cache-from: type=registry,ref=cache.neon.build/compute-node-${{ matrix.version.pg }}:cache-${{ matrix.version.debian }}-${{ matrix.arch }}
  cache-to: ${{ github.ref_name == 'main' && format('type=registry,ref=cache.neon.build/compute-tools-{0}:cache-{1}-{2},mode=max', matrix.version.pg, matrix.version.debian, matrix.arch) || '' }}
  tags: |
  neondatabase/compute-tools:${{ needs.tag.outputs.build-tag }}-${{ matrix.version.debian }}-${{ matrix.arch }}
@@ -744,7 +746,7 @@ jobs:
  neondatabase/compute-node-${{ matrix.version.pg }}:${{ needs.tag.outputs.build-tag }}-${{ matrix.version.debian }}-arm64

  - name: Create multi-arch neon-test-extensions image
- if: matrix.version.pg == 'v16'
+ if: matrix.version.pg >= 'v16'
  run: |
  docker buildx imagetools create -t neondatabase/neon-test-extensions-${{ matrix.version.pg }}:${{ needs.tag.outputs.build-tag }} \
  -t neondatabase/neon-test-extensions-${{ matrix.version.pg }}:${{ needs.tag.outputs.build-tag }}-${{ matrix.version.debian }} \
@@ -833,6 +835,7 @@ jobs:
  fail-fast: false
  matrix:
  arch: [ x64, arm64 ]
+ pg_version: [v16, v17]

  runs-on: ${{ fromJson(format('["self-hosted", "{0}"]', matrix.arch == 'arm64' && 'small-arm64' || 'small')) }}

@@ -871,7 +874,10 @@ jobs:

  - name: Verify docker-compose example and test extensions
  timeout-minutes: 20
- run: env TAG=${{needs.tag.outputs.build-tag}} ./docker-compose/docker_compose_test.sh
+ env:
+ TAG: ${{needs.tag.outputs.build-tag}}
+ TEST_VERSION_ONLY: ${{ matrix.pg_version }}
+ run: ./docker-compose/docker_compose_test.sh

  - name: Print logs and clean up
  if: always()
@@ -931,7 +937,7 @@ jobs:
  neondatabase/neon-test-extensions-v16:${{ needs.tag.outputs.build-tag }}

  - name: Configure AWS-prod credentials
- if: github.ref_name == 'release'|| github.ref_name == 'release-proxy'
+ if: github.ref_name == 'release'|| github.ref_name == 'release-proxy' || github.ref_name == 'release-compute'
  uses: aws-actions/configure-aws-credentials@v4
  with:
  aws-region: eu-central-1
@@ -940,12 +946,12 @@ jobs:

  - name: Login to prod ECR
  uses: docker/login-action@v3
- if: github.ref_name == 'release'|| github.ref_name == 'release-proxy'
+ if: github.ref_name == 'release'|| github.ref_name == 'release-proxy' || github.ref_name == 'release-compute'
  with:
  registry: 093970136003.dkr.ecr.eu-central-1.amazonaws.com

  - name: Copy all images to prod ECR
- if: github.ref_name == 'release'|| github.ref_name == 'release-proxy'
+ if: github.ref_name == 'release' || github.ref_name == 'release-proxy' || github.ref_name == 'release-compute'
  run: |
  for image in neon compute-tools {vm-,}compute-node-{v14,v15,v16,v17}; do
  docker buildx imagetools create -t 093970136003.dkr.ecr.eu-central-1.amazonaws.com/${image}:${{ needs.tag.outputs.build-tag }} \
@@ -965,7 +971,7 @@ jobs:
  tenant_id: ${{ vars.AZURE_TENANT_ID }}

  push-to-acr-prod:
- if: github.ref_name == 'release'|| github.ref_name == 'release-proxy'
+ if: github.ref_name == 'release' || github.ref_name == 'release-proxy' || github.ref_name == 'release-compute'
  needs: [ tag, promote-images ]
  uses: ./.github/workflows/_push-to-acr.yml
  with:
@@ -1053,7 +1059,7 @@ jobs:
  deploy:
  needs: [ check-permissions, promote-images, tag, build-and-test-locally, trigger-custom-extensions-build-and-wait, push-to-acr-dev, push-to-acr-prod ]
  # `!failure() && !cancelled()` is required because the workflow depends on the job that can be skipped: `push-to-acr-dev` and `push-to-acr-prod`
- if: (github.ref_name == 'main' || github.ref_name == 'release' || github.ref_name == 'release-proxy') && !failure() && !cancelled()
+ if: (github.ref_name == 'main' || github.ref_name == 'release' || github.ref_name == 'release-proxy' || github.ref_name == 'release-compute') && !failure() && !cancelled()

  runs-on: [ self-hosted, small ]
  container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/ansible:latest
@@ -1102,13 +1108,15 @@ jobs:
  -f deployProxyAuthBroker=true \
  -f branch=main \
  -f dockerTag=${{needs.tag.outputs.build-tag}}
+ elif [[ "$GITHUB_REF_NAME" == "release-compute" ]]; then
+ gh workflow --repo neondatabase/infra run deploy-compute-dev.yml --ref main -f dockerTag=${{needs.tag.outputs.build-tag}}
  else
- echo "GITHUB_REF_NAME (value '$GITHUB_REF_NAME') is not set to either 'main' or 'release'"
+ echo "GITHUB_REF_NAME (value '$GITHUB_REF_NAME') is not set to either 'main', 'release', 'release-proxy' or 'release-compute'"
  exit 1
  fi

  - name: Create git tag
- if: github.ref_name == 'release' || github.ref_name == 'release-proxy'
+ if: github.ref_name == 'release' || github.ref_name == 'release-proxy' || github.ref_name == 'release-compute'
  uses: actions/github-script@v7
  with:
  # Retry script for 5XX server errors: https://github.com/actions/github-script#retries
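The tag step in the `build_and_test.yml` hunks names each release build after its branch plus the commit count, and falls back to the run ID on other branches. The sketch below restates that mapping as a pure function so it can be tried outside a workflow. The name `build_tag` and its three positional arguments are assumptions made for illustration; in the workflow the count comes from `git rev-list --count HEAD` and the fallback from `$GITHUB_RUN_ID`. The `main` case sits above the visible hunk, so the sketch refuses to guess it.

```shell
# Hypothetical pure-function restatement of the visible tag logic.
# Usage: build_tag <ref_name> <commit_count> <run_id>
build_tag() {
  ref_name="$1"
  commit_count="$2"
  run_id="$3"
  case "${ref_name}" in
    release|release-proxy|release-compute)
      # Release branches: "<branch>-<commit count>"
      echo "${ref_name}-${commit_count}"
      ;;
    main)
      # Handled above the visible hunk; deliberately not modelled.
      return 2
      ;;
    *)
      # Any other ref falls back to the workflow run ID.
      echo "${run_id}"
      ;;
  esac
}

build_tag release-compute 7000 123456789   # prints: release-compute-7000
build_tag some-feature 7000 123456789      # prints: 123456789
```

Passing the commit count and run ID as arguments keeps the function free of git and environment lookups, which makes the three release cases easy to compare side by side.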
1 change: 1 addition & 0 deletions .github/workflows/ingest_benchmark.yml
@@ -26,6 +26,7 @@ concurrency:
  jobs:
  ingest:
  strategy:
+ fail-fast: false # allow other variants to continue even if one fails
  matrix:
  target_project: [new_empty_project, large_existing_project]
  permissions:
23 changes: 20 additions & 3 deletions .github/workflows/release.yml
@@ -15,6 +15,10 @@ on:
  type: boolean
  description: 'Create Proxy release PR'
  required: false
+ create-compute-release-branch:
+ type: boolean
+ description: 'Create Compute release PR'
+ required: false

  # No permission for GITHUB_TOKEN by default; the **minimal required** set of permissions should be granted in each job.
  permissions: {}
@@ -25,20 +29,20 @@ defaults:

  jobs:
  create-storage-release-branch:
- if: ${{ github.event.schedule == '0 6 * * MON' || format('{0}', inputs.create-storage-release-branch) == 'true' }}
+ if: ${{ github.event.schedule == '0 6 * * MON' || inputs.create-storage-release-branch }}

  permissions:
  contents: write

  uses: ./.github/workflows/_create-release-pr.yml
  with:
- component-name: 'Storage & Compute'
+ component-name: 'Storage'
  release-branch: 'release'
  secrets:
  ci-access-token: ${{ secrets.CI_ACCESS_TOKEN }}

  create-proxy-release-branch:
- if: ${{ github.event.schedule == '0 6 * * THU' || format('{0}', inputs.create-proxy-release-branch) == 'true' }}
+ if: ${{ github.event.schedule == '0 6 * * THU' || inputs.create-proxy-release-branch }}

  permissions:
  contents: write
@@ -49,3 +53,16 @@ jobs:
  release-branch: 'release-proxy'
  secrets:
  ci-access-token: ${{ secrets.CI_ACCESS_TOKEN }}
+
+ create-compute-release-branch:
+ if: inputs.create-compute-release-branch
+
+ permissions:
+ contents: write
+
+ uses: ./.github/workflows/_create-release-pr.yml
+ with:
+ component-name: 'Compute'
+ release-branch: 'release-compute'
+ secrets:
+ ci-access-token: ${{ secrets.CI_ACCESS_TOKEN }}
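After this change, `release.yml` has three jobs with different triggers: storage runs on the Monday cron or its manual input, proxy on the Thursday cron or its manual input, and compute only on its manual input. The following sketch lists which jobs would be selected for a given trigger. The function `release_jobs` and its argument order are hypothetical, and the boolean inputs are modelled as the strings `true` and `false`.

```shell
# Hypothetical restatement of the three job conditions in release.yml.
# Usage: release_jobs <schedule> <storage_input> <proxy_input> <compute_input>
release_jobs() {
  schedule="$1"
  storage_input="$2"
  proxy_input="$3"
  compute_input="$4"
  # Storage: Monday 06:00 cron, or manual dispatch with the storage input set.
  if [ "${schedule}" = "0 6 * * MON" ] || [ "${storage_input}" = "true" ]; then
    echo "create-storage-release-branch"
  fi
  # Proxy: Thursday 06:00 cron, or manual dispatch with the proxy input set.
  if [ "${schedule}" = "0 6 * * THU" ] || [ "${proxy_input}" = "true" ]; then
    echo "create-proxy-release-branch"
  fi
  # Compute: manual dispatch only; the diff shows no cron for it.
  if [ "${compute_input}" = "true" ]; then
    echo "create-compute-release-branch"
  fi
}

release_jobs "0 6 * * MON" false false false   # prints: create-storage-release-branch
release_jobs "" false false true               # prints: create-compute-release-branch
```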
2 changes: 2 additions & 0 deletions .github/workflows/trigger-e2e-tests.yml
@@ -51,6 +51,8 @@ jobs:
  echo "tag=release-$(git rev-list --count HEAD)" | tee -a $GITHUB_OUTPUT
  elif [[ "$GITHUB_REF_NAME" == "release-proxy" ]]; then
  echo "tag=release-proxy-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT
+ elif [[ "$GITHUB_REF_NAME" == "release-compute" ]]; then
+ echo "tag=release-compute-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT
  else
  echo "GITHUB_REF_NAME (value '$GITHUB_REF_NAME') is not set to either 'main' or 'release'"
  BUILD_AND_TEST_RUN_ID=$(gh run list -b $CURRENT_BRANCH -c $CURRENT_SHA -w 'Build and Test' -L 1 --json databaseId --jq '.[].databaseId')
32 changes: 23 additions & 9 deletions CODEOWNERS
@@ -1,15 +1,29 @@
- /.github/ @neondatabase/developer-productivity
- /compute_tools/ @neondatabase/control-plane @neondatabase/compute
- /libs/pageserver_api/ @neondatabase/storage
- /libs/postgres_ffi/ @neondatabase/compute @neondatabase/storage
- /libs/remote_storage/ @neondatabase/storage
- /libs/safekeeper_api/ @neondatabase/storage
+ # Autoscaling
  /libs/vm_monitor/ @neondatabase/autoscaling
- /pageserver/ @neondatabase/storage
+
+ # DevProd
+ /.github/ @neondatabase/developer-productivity
+
+ # Compute
  /pgxn/ @neondatabase/compute
- /pgxn/neon/ @neondatabase/compute @neondatabase/storage
+ /vendor/ @neondatabase/compute
+ /compute/ @neondatabase/compute
+ /compute_tools/ @neondatabase/compute
+
+ # Proxy
+ /libs/proxy/ @neondatabase/proxy
  /proxy/ @neondatabase/proxy
+
+ # Storage
+ /pageserver/ @neondatabase/storage
  /safekeeper/ @neondatabase/storage
  /storage_controller @neondatabase/storage
  /storage_scrubber @neondatabase/storage
- /vendor/ @neondatabase/compute
+ /libs/pageserver_api/ @neondatabase/storage
+ /libs/remote_storage/ @neondatabase/storage
+ /libs/safekeeper_api/ @neondatabase/storage
+
+ # Shared
+ /pgxn/neon/ @neondatabase/compute @neondatabase/storage
+ /libs/compute_api/ @neondatabase/compute @neondatabase/control-plane
+ /libs/postgres_ffi/ @neondatabase/compute @neondatabase/storage
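In a CODEOWNERS file the last matching pattern takes precedence, which is why the reorganized file can list `/pgxn/` under Compute and still give `/pgxn/neon/` to both teams from the later Shared section. The sketch below illustrates that ordering on a few rules copied from the new file. It is a simplification: `owners_for` is a hypothetical helper, and it treats each pattern as a plain path prefix, whereas real CODEOWNERS patterns follow gitignore-style matching.

```shell
# Hypothetical last-match-wins lookup over a subset of the new CODEOWNERS rules.
# Patterns are treated as simple path prefixes (an assumption of this sketch).
# Usage: owners_for <repo-relative path>
owners_for() {
  path="$1"
  owners=""
  while read -r pattern rest; do
    # Skip blank lines and section comments.
    case "${pattern}" in
      ''|'#'*) continue ;;
    esac
    # A later matching rule overwrites an earlier one.
    case "/${path}" in
      "${pattern}"*) owners="${rest}" ;;
    esac
  done <<'EOF'
# Compute
/pgxn/ @neondatabase/compute
/compute_tools/ @neondatabase/compute

# Storage
/pageserver/ @neondatabase/storage

# Shared
/pgxn/neon/ @neondatabase/compute @neondatabase/storage
EOF
  echo "${owners}"
}

owners_for pgxn/neon/pagestore_smgr.c   # prints: @neondatabase/compute @neondatabase/storage
owners_for pgxn/other/file.c            # prints: @neondatabase/compute
```

The file names passed in the usage lines are placeholders chosen for the example; only the directory prefixes matter to the lookup.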