Fix HTCondor Windows download URI #1847

Merged


tpdownes
Member

@tpdownes tpdownes commented Oct 17, 2023

Upon the release of HTCondor 23.0, the URLs for the 10.x series have been reorganized. This does not impact the Linux repositories; however, it does affect the Windows MSI installer download.

In addition, this PR:

  • adds validation of the user-supplied version string to enforce 10.x compatibility.
  • works around a problem where dnf-automatic.service interferes with GPG key verification: the keys are now imported manually before the yum repositories are set up, and the key import is allowed to retry for up to 10 minutes. In practice, dnf-automatic is observed to take approximately 1 minute to perform a kernel upgrade.
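
For illustration only, the two additions above can be sketched in Python. This is not the code in this PR: the accepted version pattern, the function names `is_10x_version` and `retry_until`, and the 10-second delay between attempts are all assumptions. Only the 10.x restriction and the 10-minute retry window come from the description.

```python
import re
import time

# Hypothetical rule: accept "10.<minor>" or "10.<minor>.<patch>".
# The module's actual validation may be stricter or looser.
_TEN_X = re.compile(r"10\.\d+(\.\d+)?")


def is_10x_version(version: str) -> bool:
    """Return True if the string looks like an HTCondor 10.x version."""
    return _TEN_X.fullmatch(version) is not None


def retry_until(action, timeout_s=600.0, delay_s=10.0,
                sleep=time.sleep, clock=time.monotonic):
    """Call action() until it succeeds or the time budget is spent.

    timeout_s defaults to 600 seconds (the 10 minutes from the description);
    delay_s is an assumed pause between attempts.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            return action()
        except Exception:
            # Give up if another wait would run past the deadline.
            if clock() + delay_s > deadline:
                raise
            sleep(delay_s)
```

A caller would wrap the key import (for example, a subprocess call that fails while dnf-automatic is still running) in `retry_until`, so that a transient failure during the roughly 1-minute kernel upgrade is absorbed well within the 10-minute budget.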

This has been manually tested against the following blueprint:

blueprint_name: htc-htcondor

vars:
  project_id:  ## Set GCP Project ID Here ##
  deployment_name: example-pool
  region: us-central1
  zone: us-central1-f
  disk_size_gb: 100
  new_image_family: htc-example-10x
  new_windows_image_family: htc-example-win-10x
  spool_parent_dir: /shared

# Documentation for each of the modules used below can be found at
# https://github.com/GoogleCloudPlatform/hpc-toolkit/blob/main/modules/README.md

deployment_groups:
- group: primary
  modules:
  - id: network1
    source: modules/network/vpc
    settings:
      enable_iap_rdp_ingress: true
      enable_iap_winrm_ingress: true
    outputs:
    - network_name

  - id: htcondor_install
    source: community/modules/scripts/htcondor-install
    settings:
      condor_version: 10.7.1

  - id: htcondor_install_script
    source: modules/scripts/startup-script
    use:
    - htcondor_install

  - id: windows_startup
    source: community/modules/scripts/windows-startup-script
    settings:
      install_nvidia_driver: true

  - id: spoolfs
    source: modules/file-system/filestore
    use:
    - network1
    settings:
      filestore_tier: ENTERPRISE
      local_mount: $(vars.spool_parent_dir)

- group: packer
  modules:
  - id: custom-image
    source: modules/packer/custom-image
    kind: packer
    use:
    - network1
    - htcondor_install_script
    settings:
      image_family: $(vars.new_image_family)
      source_image_family: hpc-rocky-linux-8
      disk_size: $(vars.disk_size_gb)

- group: packer-windows
  modules:
  - id: image-windows
    source: modules/packer/custom-image
    kind: packer
    use:
    - network1
    settings:
      image_family: $(vars.new_windows_image_family)
      source_image_family: windows-2016
      machine_type: n1-standard-16
      accelerator_type: nvidia-tesla-t4
      accelerator_count: 1
      disk_size: 75
      disk_type: pd-ssd
      state_timeout: 15m
      windows_startup_ps1:
      - $(windows_startup.windows_startup_ps1[0])
      - $(htcondor_install.windows_startup_ps1)
      - |
        Write-Output 'Hello, World!'

Submission Checklist

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cloud HPC Toolkit Contribution guidelines

@tpdownes tpdownes added the release-bugfix label (added to release notes under the "Bug fixes" heading) Oct 17, 2023
@tpdownes tpdownes force-pushed the fix_htcondor_windows_url branch from 7aa8364 to 8e788da Compare October 18, 2023 15:32
@tpdownes tpdownes requested a review from cdunbar13 October 18, 2023 15:53
@tpdownes tpdownes marked this pull request as ready for review October 18, 2023 15:53
@tpdownes tpdownes enabled auto-merge October 18, 2023 16:03
Contributor

@cdunbar13 cdunbar13 left a comment

LGTM, just a minor comment on the validation of the versions. We could do something similar to what we did with Slurm by keeping a list of versions that are known to be supported.
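
One way to read this suggestion, sketched in Python purely for illustration: keep an explicit set of versions known to work and reject anything else. The names `SUPPORTED_CONDOR_VERSIONS` and `check_supported` are hypothetical, and the set contains only 10.7.1 because that is the only version this PR reports testing.

```python
# Hypothetical allowlist; 10.7.1 is the version used in the tested blueprint.
SUPPORTED_CONDOR_VERSIONS = frozenset({"10.7.1"})


def check_supported(version: str) -> str:
    """Return the version unchanged if it is on the allowlist, else raise."""
    if version not in SUPPORTED_CONDOR_VERSIONS:
        supported = ", ".join(sorted(SUPPORTED_CONDOR_VERSIONS))
        raise ValueError(
            f"condor_version {version!r} is not known to be supported; "
            f"expected one of: {supported}"
        )
    return version
```

Compared with a pattern check, an allowlist is stricter and needs an update for each new release, but it fails early on versions whose download URLs have not been verified.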

@tpdownes tpdownes disabled auto-merge October 19, 2023 16:28
@tpdownes tpdownes merged commit 638f602 into GoogleCloudPlatform:develop Oct 19, 2023
@tpdownes tpdownes deleted the fix_htcondor_windows_url branch October 19, 2023 16:32
@mr0re1 mr0re1 mentioned this pull request Nov 6, 2023