feat(dockerhub-mirror) set up a new dedicated ACR to mirror DockerHub inside the Jenkins Azure infrastructure (#794)

Related to jenkins-infra/helpdesk#4192

Fixup of
91cf2dc

Reference Azure documentation:
https://learn.microsoft.com/en-us/azure/container-registry/container-registry-artifact-cache?pivots=development-environment-azure-portal

This PR introduces an Azure Container Registry set up as a DockerHub
mirror using a "Cache Rule" which mirrors `docker.io/*` to `*` (note: it
prevents us from using any other caching mechanism!).

This registry has the following properties:

- Only available in the "sponsorship" subscription
- Anonymous pull access (constraint due to Docker pull through cache -
moby/moby#30880)
- Private network only: since the pull policy is anonymous (see above),
  we restrict access to a subset of private networks. It uses
  ["Azure Private Endpoints"](https://learn.microsoft.com/en-us/azure/private-link/private-endpoint-overview)
  for this
  - Note: this implies using Private DNS zones linked to the networks. These
    zones might need to be reused in the future for other private links if
    required
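
As a minimal sketch of how an agent could consume the mirror (not part of this commit): Docker Engine reads pull-through caches from the `registry-mirrors` key of its `daemon.json`. The local and output names below are hypothetical; only `azurerm_container_registry.dockerhub_mirror` comes from this change.

```hcl
locals {
  # Hypothetical local (assumption, not in this commit): renders a Docker
  # Engine daemon.json that points the daemon at the ACR mirror.
  # "registry-mirrors" is the dockerd setting for DockerHub pull-through caches.
  dockerhub_mirror_daemon_json = jsonencode({
    "registry-mirrors" = ["https://${azurerm_container_registry.dockerhub_mirror.login_server}"]
  })
}

# Hypothetical output so that the rendered JSON can be copied into an agent template.
output "dockerhub_mirror_daemon_json" {
  value = local.dockerhub_mirror_daemon_json
}
```

Because the daemon cannot authenticate against a mirror (moby/moby#30880), this only works with the anonymous pull setting described above.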


The registry is available to the following services (heavy DockerHub
users; I've only set up the Azure ephemeral VM agents subnets for now)
through a combination of a private endpoint with a NIC in the subnet, a
private DNS zone with automatic records, and inbound and outbound NSG
rules:
- ci.jenkins.io
- cert.ci.jenkins.io
- trusted.ci.jenkins.io
- infra.ci.jenkins.io
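
To see which private IPs each service gets for the registry, a hypothetical output (the output name is an assumption, not part of this commit) could reuse the same expression as the NSG rules in the diff:

```hcl
# Hypothetical output (assumption): maps each private link key
# (e.g. "cijenkinsio") to the IPs registered by its private endpoint.
output "dockerhub_mirror_private_endpoint_ips" {
  description = "Private IPs of the ACR private endpoints, per service"
  value = {
    for name, endpoint in azurerm_private_endpoint.dockerhub_mirror :
    name => distinct(
      flatten(
        [for rs in endpoint.private_dns_zone_configs.*.record_sets : rs.*.ip_addresses]
      )
    )
  }
}
```

These are the addresses `dockerhubmirror.azurecr.io` should resolve to from inside the matching agent subnet.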

Azure makes it mandatory to log in to DockerHub for such a mirror
system. As such, we use a dedicated "Public Images Read Only" token,
stored in an Azure Key Vault and associated with the `jenkinsciinfra`
organization, to avoid the "application" rate limit (e.g. 5k pull / day /
IP) and only have the DockerHub anti-abuse system as upper limit (which
seems to be a combination of requests and amount of data).

![Screenshot 2024-08-05 at 16 31 38](https://github.com/user-attachments/assets/f04e4c49-3500-4589-b0fc-42b5b1792066)

----

*Testing and approving*

This PR is expected to have no changes in the plan as it was applied
manually:

- End to end testing was done on each controller by:
  - Starting an Azure VM ephemeral agent using a pipeline replay with
    the correct label
  - The pipeline tries to resolve the DNS name
    `dockerhubmirror.azurecr.io`, which should resolve to an IP local to
    the VM subnet
  - Once the VM is up, checking the connectivity in the Azure portal UI
    (`Network Watcher` -> `Connection troubleshoot`)
    - Source VM is the agent VM, whose name is retrieved from the build log
    - Destination is `https://dockerhubmirror.azurecr.io`

<img width="1185" alt="Screenshot 2024-08-06 at 10 42 25"
src="https://github.com/user-attachments/assets/11d762a6-119c-4e03-b7f0-91072364aaa2">


- The bootstrap must be done in 2 `terraform apply` commands as
documented, because the ACR component `CredentialSet` is not supported
by Terraform yet (see comments in TF code).
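
For reference, a sketch of what the manual step could look like once the provider supports it. This assumes a newer `azurerm` provider release that ships an `azurerm_container_registry_credential_set` resource (to be checked against the provider documentation; it is not available in the version used by this commit). The secret names are those mentioned in the comments of `dockerhub-mirror.tf`, and the resource label `dockerhub` is hypothetical.

```hcl
# Assumption: requires an azurerm provider version that includes this resource.
resource "azurerm_container_registry_credential_set" "dockerhub" {
  provider              = azurerm.jenkins-sponsorship
  name                  = "dockerhub"
  container_registry_id = azurerm_container_registry.dockerhub_mirror.id
  login_server          = "docker.io"

  # The credential set gets its own identity, which needs read access
  # to the Key Vault secrets.
  identity {
    type = "SystemAssigned"
  }

  authentication_credentials {
    # Secret names as written in the dockerhub-mirror.tf comments; the
    # secrets themselves are still inserted manually in the Key Vault.
    username_secret_id = "${azurerm_key_vault.dockerhub_mirror.vault_uri}secrets/dockerhub-username"
    password_secret_id = "${azurerm_key_vault.dockerhub_mirror.vault_uri}secrets/docker-password"
  }
}
```

With such a resource, the role assignment's `principal_id` could presumably reference `azurerm_container_registry_credential_set.dockerhub.identity[0].principal_id` instead of a hard-coded ID, which would remove the need for a second `terraform apply`.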

Signed-off-by: Damien Duportal <damien.duportal@gmail.com>
dduportal authored Aug 6, 2024
1 parent 91cf2dc commit 59c4445
Showing 5 changed files with 281 additions and 141 deletions.
38 changes: 38 additions & 0 deletions cert.ci.jenkins.io.tf
@@ -117,3 +117,41 @@ resource "azurerm_dns_a_record" "cert_ci_jenkins_io" {
ttl = 60
records = [module.cert_ci_jenkins_io.controller_private_ipv4]
}

## Allow access to/from ACR endpoint
resource "azurerm_network_security_rule" "allow_out_https_from_cert_ephemeral_agents_to_acr" {
provider = azurerm.jenkins-sponsorship
name = "allow-out-https-from-ephemeral-agents-to-acr"
priority = 4050
direction = "Outbound"
access = "Allow"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "443"
source_address_prefixes = data.azurerm_subnet.cert_ci_jenkins_io_sponsorship_ephemeral_agents.address_prefixes
destination_address_prefixes = distinct(
flatten(
[for rs in azurerm_private_endpoint.dockerhub_mirror["certcijenkinsio"].private_dns_zone_configs.*.record_sets : rs.*.ip_addresses]
)
)
resource_group_name = azurerm_resource_group.cert_ci_jenkins_io_controller_jenkins_sponsorship.name
network_security_group_name = module.cert_ci_jenkins_io_azurevm_agents_jenkins_sponsorship.ephemeral_agents_nsg_name
}
resource "azurerm_network_security_rule" "allow_in_https_from_cert_ephemeral_agents_to_acr" {
provider = azurerm.jenkins-sponsorship
name = "allow-in-https-from-ephemeral-agents-to-acr"
priority = 4050
direction = "Inbound"
access = "Allow"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "443"
source_address_prefixes = data.azurerm_subnet.cert_ci_jenkins_io_sponsorship_ephemeral_agents.address_prefixes
destination_address_prefixes = distinct(
flatten(
[for rs in azurerm_private_endpoint.dockerhub_mirror["certcijenkinsio"].private_dns_zone_configs.*.record_sets : rs.*.ip_addresses]
)
)
resource_group_name = azurerm_resource_group.cert_ci_jenkins_io_controller_jenkins_sponsorship.name
network_security_group_name = module.cert_ci_jenkins_io_azurevm_agents_jenkins_sponsorship.ephemeral_agents_nsg_name
}
160 changes: 19 additions & 141 deletions ci.jenkins.io.tf
@@ -95,117 +95,7 @@ resource "azurerm_network_security_rule" "allow_outbound_https_from_cijio_to_cij
network_security_group_name = module.ci_jenkins_io_sponsorship.controller_nsg_name
}

## Service DNS records
resource "azurerm_dns_cname_record" "ci_jenkins_io" {
name = trimsuffix(trimsuffix(local.ci_jenkins_io_fqdn, data.azurerm_dns_zone.jenkinsio.name), ".")
zone_name = data.azurerm_dns_zone.jenkinsio.name
resource_group_name = data.azurerm_resource_group.proddns_jenkinsio.name
ttl = 60
record = module.ci_jenkins_io_sponsorship.controller_public_fqdn
tags = local.default_tags
}
resource "azurerm_dns_cname_record" "assets_ci_jenkins_io" {
name = "assets.${azurerm_dns_cname_record.ci_jenkins_io.name}"
zone_name = data.azurerm_dns_zone.jenkinsio.name
resource_group_name = data.azurerm_resource_group.proddns_jenkinsio.name
ttl = 60
record = module.ci_jenkins_io_sponsorship.controller_public_fqdn
tags = local.default_tags
}

#### ACR to use as DockerHub (and other) Registry mirror
data "azurerm_resource_group" "cijio_agents" {
name = "ci-jenkins-io-ephemeral-agents"
provider = azurerm.jenkins-sponsorship
}

resource "azurerm_container_registry" "cijenkinsio" {
name = "cijenkinsio"
provider = azurerm.jenkins-sponsorship
resource_group_name = data.azurerm_resource_group.cijio_agents.name
location = data.azurerm_resource_group.cijio_agents.location
sku = "Premium"
admin_enabled = false
public_network_access_enabled = false # private links are used to reach the registry
anonymous_pull_enabled = true # Require "Standard" or "Premium" sku. Docker Engine cannot use auth. for pull trough cache - ref. https://github.com/moby/moby/issues/30880
data_endpoint_enabled = true # Required for endpoint private link

tags = local.default_tags
}

locals {
# CredentialSet is not supported by Terraform, so we have to specify its name
acr_cijenkinsio_dockerhub_credentialset = "dockerhub"
}

resource "azurerm_container_registry_cache_rule" "mirror_dockerhub" {
name = "mirror"
provider = azurerm.jenkins-sponsorship
container_registry_id = azurerm_container_registry.cijenkinsio.id
source_repo = "docker.io/*"
target_repo = "*"
# Credential created manually (unsupported by Terraform)
credential_set_id = "${azurerm_container_registry.cijenkinsio.id}/credentialSets/${local.acr_cijenkinsio_dockerhub_credentialset}"
}

resource "azurerm_private_endpoint" "acr_cijenkinsio_agents" {
name = "acr-cijenkinsio-agents"
provider = azurerm.jenkins-sponsorship
location = data.azurerm_resource_group.cijio_agents.location
resource_group_name = data.azurerm_resource_group.cijio_agents.name
subnet_id = data.azurerm_subnet.ci_jenkins_io_ephemeral_agents_jenkins_sponsorship.id

private_service_connection {
name = "acr-cijenkinsio-agents"
private_connection_resource_id = azurerm_container_registry.cijenkinsio.id
subresource_names = ["registry"]
is_manual_connection = false
}
private_dns_zone_group {
name = "privatelink.azurecr.io"
private_dns_zone_ids = [azurerm_private_dns_zone.acr_ci_jenkins_io.id]
}
tags = local.default_tags
}

resource "azurerm_private_dns_zone" "acr_ci_jenkins_io" {
name = "privatelink.azurecr.io"
provider = azurerm.jenkins-sponsorship
resource_group_name = data.azurerm_resource_group.cijio_agents.name

tags = local.default_tags
}

resource "azurerm_private_dns_zone_virtual_network_link" "acr_ci_jenkins_io_vnet_dns" {
name = "acr-ci-jenkins-io-vnet_-dns"
provider = azurerm.jenkins-sponsorship
resource_group_name = data.azurerm_resource_group.cijio_agents.name
private_dns_zone_name = azurerm_private_dns_zone.acr_ci_jenkins_io.name
virtual_network_id = data.azurerm_virtual_network.public_jenkins_sponsorship.id

registration_enabled = true
tags = local.default_tags
}

resource "azurerm_key_vault" "ci_jenkins_io" {
name = "ddutest" # "ci-jenkins-io"
provider = azurerm.jenkins-sponsorship
location = data.azurerm_resource_group.cijio_agents.location
resource_group_name = data.azurerm_resource_group.cijio_agents.name

tenant_id = data.azurerm_client_config.current.tenant_id
soft_delete_retention_days = 7
purge_protection_enabled = false
enable_rbac_authorization = true
enabled_for_deployment = true
enabled_for_disk_encryption = true
enabled_for_template_deployment = true

sku_name = "standard"

tags = local.default_tags
}

## Allow access to/from ACR endpoint
resource "azurerm_network_security_rule" "allow_out_https_from_cijio_agents_to_acr" {
provider = azurerm.jenkins-sponsorship
name = "allow-out-https-from-cijio-agents-to-acr"
@@ -215,42 +105,30 @@ resource "azurerm_network_security_rule" "allow_out_https_from_cijio_agents_to_a
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "443"
source_address_prefixes = data.azurerm_subnet.ci_jenkins_io_ephemeral_agents.address_prefixes
source_address_prefixes = data.azurerm_subnet.ci_jenkins_io_ephemeral_agents_jenkins_sponsorship.address_prefixes
destination_address_prefixes = distinct(
flatten(
[for rs in azurerm_private_endpoint.acr_cijenkinsio_agents.private_dns_zone_configs.*.record_sets : rs.*.ip_addresses]
[for rs in azurerm_private_endpoint.dockerhub_mirror["cijenkinsio"].private_dns_zone_configs.*.record_sets : rs.*.ip_addresses]
)
)
resource_group_name = module.ci_jenkins_io_sponsorship.controller_resourcegroup_name
network_security_group_name = module.ci_jenkins_io_sponsorship.controller_nsg_name
network_security_group_name = module.ci_jenkins_io_azurevm_agents_jenkins_sponsorship.ephemeral_agents_nsg_name
}

resource "azurerm_network_security_rule" "allow_in_https_from_cijio_agents_to_acr" {
provider = azurerm.jenkins-sponsorship
name = "allow-in-https-from-cijio-agents-to-acr"
priority = 4050
direction = "Inbound"
access = "Allow"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "443"
source_address_prefixes = distinct(
flatten(
[for rs in azurerm_private_endpoint.acr_cijenkinsio_agents.private_dns_zone_configs.*.record_sets : rs.*.ip_addresses]
)
)
destination_address_prefixes = data.azurerm_subnet.ci_jenkins_io_ephemeral_agents.address_prefixes
resource_group_name = module.ci_jenkins_io_sponsorship.controller_resourcegroup_name
network_security_group_name = module.ci_jenkins_io_sponsorship.controller_nsg_name
## Service DNS records
resource "azurerm_dns_cname_record" "ci_jenkins_io" {
name = trimsuffix(trimsuffix(local.ci_jenkins_io_fqdn, data.azurerm_dns_zone.jenkinsio.name), ".")
zone_name = data.azurerm_dns_zone.jenkinsio.name
resource_group_name = data.azurerm_resource_group.proddns_jenkinsio.name
ttl = 60
record = module.ci_jenkins_io_sponsorship.controller_public_fqdn
tags = local.default_tags
}

# This role allows the ACR registry to read secrets
# Note: an admin must insert secrets into the keyvault manually and then create the credentialset in ACR manually
# which requires the "Key Vault Secrets Officer" or "Owner" role temporarily
resource "azurerm_role_assignment" "acr_read_keyvault_secrets" {
provider = azurerm.jenkins-sponsorship
scope = azurerm_key_vault.ci_jenkins_io.id
role_definition_name = "Key Vault Secrets User"
principal_id = "201fed0a-6e86-4600-a12b-945f2c1c0eb2"
skip_service_principal_aad_check = true
resource "azurerm_dns_cname_record" "assets_ci_jenkins_io" {
name = "assets.${azurerm_dns_cname_record.ci_jenkins_io.name}"
zone_name = data.azurerm_dns_zone.jenkinsio.name
resource_group_name = data.azurerm_resource_group.proddns_jenkinsio.name
ttl = 60
record = module.ci_jenkins_io_sponsorship.controller_public_fqdn
tags = local.default_tags
}
150 changes: 150 additions & 0 deletions dockerhub-mirror.tf
@@ -0,0 +1,150 @@
#### ACR to use as DockerHub (and other) Registry mirror
resource "azurerm_resource_group" "dockerhub_mirror" {
name = "dockerhub-mirror"
provider = azurerm.jenkins-sponsorship
location = var.location
}

resource "azurerm_container_registry" "dockerhub_mirror" {
name = "dockerhubmirror"
provider = azurerm.jenkins-sponsorship
resource_group_name = azurerm_resource_group.dockerhub_mirror.name
location = azurerm_resource_group.dockerhub_mirror.location
sku = "Premium"
admin_enabled = false
public_network_access_enabled = false # private links are used to reach the registry
anonymous_pull_enabled = true # Requires "Standard" or "Premium" sku. Docker Engine cannot use auth. for pull through cache - ref. https://github.com/moby/moby/issues/30880
data_endpoint_enabled = true # Required for endpoint private link. Requires "Premium" sku.

tags = local.default_tags
}

locals {
acr_private_links = {
"cijenkinsio" = {
"subnet_id" = data.azurerm_subnet.ci_jenkins_io_kubernetes_sponsorship.id
"vnet_id" = data.azurerm_virtual_network.public_jenkins_sponsorship.id
"rg_name" = data.azurerm_virtual_network.public_jenkins_sponsorship.resource_group_name
},
"certcijenkinsio" = {
"subnet_id" = data.azurerm_subnet.cert_ci_jenkins_io_sponsorship_ephemeral_agents.id,
"vnet_id" = data.azurerm_virtual_network.cert_ci_jenkins_io_sponsorship.id
"rg_name" = data.azurerm_virtual_network.cert_ci_jenkins_io_sponsorship.resource_group_name
},
"trustedcijenkinsio" = {
"subnet_id" = data.azurerm_subnet.trusted_ci_jenkins_io_sponsorship_ephemeral_agents.id,
"vnet_id" = data.azurerm_virtual_network.trusted_ci_jenkins_io_sponsorship.id
"rg_name" = data.azurerm_virtual_network.trusted_ci_jenkins_io_sponsorship.resource_group_name
},
"infracijenkinsio" = {
"subnet_id" = data.azurerm_subnet.infra_ci_jenkins_io_sponsorship_ephemeral_agents.id,
"vnet_id" = data.azurerm_virtual_network.infra_ci_jenkins_io_sponsorship.id
"rg_name" = data.azurerm_virtual_network.infra_ci_jenkins_io_sponsorship.resource_group_name
},
}
}

resource "azurerm_private_endpoint" "dockerhub_mirror" {
for_each = local.acr_private_links

name = "acr-${each.key}"
provider = azurerm.jenkins-sponsorship

location = azurerm_resource_group.dockerhub_mirror.location
resource_group_name = azurerm_resource_group.dockerhub_mirror.name
subnet_id = each.value.subnet_id

custom_network_interface_name = "acr-${each.key}-nic"

private_service_connection {
name = "acr-${each.key}"
private_connection_resource_id = azurerm_container_registry.dockerhub_mirror.id
subresource_names = ["registry"]
is_manual_connection = false
}
private_dns_zone_group {
name = "privatelink.azurecr.io"
private_dns_zone_ids = [azurerm_private_dns_zone.dockerhub_mirror[each.key].id]
}
tags = local.default_tags
}

resource "azurerm_private_dns_zone" "dockerhub_mirror" {
for_each = local.acr_private_links

# Conventional and static name required by Azure (otherwise automatic record creation does not work)
name = "privatelink.azurecr.io"
provider = azurerm.jenkins-sponsorship

# Private DNS zone name is static: we can only have one per RG
resource_group_name = each.value.rg_name

tags = local.default_tags
}

resource "azurerm_private_dns_zone_virtual_network_link" "dockerhub_mirror" {
for_each = local.acr_private_links

name = "privatelink.azurecr.io"
provider = azurerm.jenkins-sponsorship
# Private DNS zone name is static: we can only have one per RG
resource_group_name = each.value.rg_name
private_dns_zone_name = azurerm_private_dns_zone.dockerhub_mirror[each.key].name
virtual_network_id = each.value.vnet_id

registration_enabled = true
tags = local.default_tags
}

#trivy:ignore:avd-azu-0016
resource "azurerm_key_vault" "dockerhub_mirror" {
name = "dockerhubmirror"
provider = azurerm.jenkins-sponsorship
location = azurerm_resource_group.dockerhub_mirror.location
resource_group_name = azurerm_resource_group.dockerhub_mirror.name

tenant_id = data.azurerm_client_config.current.tenant_id
soft_delete_retention_days = 7
purge_protection_enabled = false
enable_rbac_authorization = true
enabled_for_deployment = true
enabled_for_disk_encryption = true
enabled_for_template_deployment = true
public_network_access_enabled = false

network_acls {
bypass = "AzureServices"
default_action = "Deny"
}

sku_name = "standard"

tags = local.default_tags
}

# IMPORTANT: when bootstrapping, multiple Terraform applies are required until the ACR CredentialSet can be created by Terraform (unsupported by Terraform until https://github.com/hashicorp/terraform-provider-azurerm/issues/26539 is done).
# 1. Start by creating the dockerhub-username and docker-password secrets in the Key Vault (once created), which requires the "Key Vault Secrets Officer" or "Owner" role temporarily
# 2. Then create the CredentialSet in the registry (once created) with the name 'dockerhub'. It will be marked as "Unhealthy" (expected).
# 3. Then retrieve the principal ID and set it in the attributes below.
# 4. Finally re-run terraform apply one last time to create this role_assignment and the ACR cache rule. The CredentialSet in ACR will be marked as "Healthy" right after this apply.
resource "azurerm_role_assignment" "acr_read_keyvault_secrets" {
provider = azurerm.jenkins-sponsorship
scope = azurerm_key_vault.dockerhub_mirror.id
role_definition_name = "Key Vault Secrets User"
skip_service_principal_aad_check = true
# Needs to be retrieved manually from Azure UI -> Container Registries -> Select the "azurerm_container_registry.dockerhub_mirror" resource -> Services -> Cache -> Credentials -> select "dockerhub"
principal_id = "90872c87-43ab-446d-89b2-741693c34b90"
}

resource "azurerm_container_registry_cache_rule" "mirror_dockerhub" {
name = "mirror"
provider = azurerm.jenkins-sponsorship
container_registry_id = azurerm_container_registry.dockerhub_mirror.id
source_repo = "docker.io/*"
target_repo = "*"

# Credential created manually (unsupported by Terraform until https://github.com/hashicorp/terraform-provider-azurerm/issues/26539 is done).
# Check dependent resource
depends_on = [azurerm_role_assignment.acr_read_keyvault_secrets]
credential_set_id = "${azurerm_container_registry.dockerhub_mirror.id}/credentialSets/dockerhub"
}