Version 1.10.0 #767

Merged
merged 469 commits into from
Dec 7, 2022
f307484
trying to resolve conflict
Craig-Wilson-NAG Apr 13, 2022
69ce65a
trying to resolve conflict
Craig-Wilson-NAG Apr 13, 2022
c6e347f
trying to resolve conflict
Craig-Wilson-NAG Apr 13, 2022
37b30b5
Merge branch 'add-docs' into wb_docs
Craig-Wilson-NAG Apr 13, 2022
749ab61
adding images
Craig-Wilson-NAG Apr 13, 2022
703eae3
Merge pull request #30 from brandenm-nag/workbench_docs
ningli Apr 13, 2022
9f1977c
Rework machine type info for Clusters into a string
Apr 13, 2022
1cb21f1
Revise the user guide
ningli Apr 14, 2022
6448466
tentative GPU support using new machine_type impl
ptooley Apr 14, 2022
bb713e6
small visual fix
ptooley Apr 14, 2022
79030fc
various fixes for new instance datamodel
ptooley Apr 14, 2022
05265f3
Minor GPU cleanup
Apr 14, 2022
fbb2648
Revise the developer's guide
ningli Apr 14, 2022
c7ed56b
disable running cluster edits
ptooley Apr 14, 2022
375618b
Merge pull request #33 from brandenm-nag/disable_invalid_cluster_updates
Apr 14, 2022
1280b4f
Merge remote-tracking branch 'origin/develop' into update_to_latest_d…
Apr 14, 2022
3f96154
Merge pull request #34 from brandenm-nag/update_to_latest_develop
Apr 14, 2022
ba6472c
Merge pull request #27 from brandenm-nag/precommit
ptooley Apr 14, 2022
9daa184
Make admin guide to pass pre-commit
ningli Apr 14, 2022
49e2bae
Reformat user guide images and make it pre-commit clean
ningli Apr 14, 2022
312d0f9
Merge branch 'new_frontend' into add-docs
ningli Apr 14, 2022
131de95
fix template error from c7ed56b
ptooley Apr 14, 2022
804934b
Fix up AppInstallLoc cluster_using
Apr 14, 2022
b5c9747
Add GPU support for Spack Installs
Apr 14, 2022
b5a1f46
Merge pull request #31 from brandenm-nag/machinetype_info_gpu
Apr 18, 2022
4f1e118
Add created_by label to GHPC-created resources
Apr 18, 2022
c638328
change dirs from frontend to community/front-end
Apr 18, 2022
7596a27
Merge pull request #36 from brandenm-nag/chdir
ningli Apr 18, 2022
83deea2
Merge branch 'new_frontend' into labeling
ningli Apr 19, 2022
cfef4a3
Merge branch 'labeling' into new_frontend
ningli Apr 19, 2022
22a4b64
Give superuser unlimited quota
ningli Apr 19, 2022
b361439
Hide buttons while creating new VPC/Subnet
ningli Apr 19, 2022
ec9f328
Add a validator for subnet CIDR field
ningli Apr 19, 2022
cf27d0a
Use ipaddress module to check CIDR
ningli Apr 20, 2022
86300f5
Adjust conditions to show buttons in VPC detail page
ningli Apr 20, 2022
df9934b
Merge pull request #37 from brandenm-nag/improvement-0419
ningli Apr 20, 2022
4c49806
begin linting with c2daemon
ptooley Apr 19, 2022
517a9b8
fix missing line too long
ptooley Apr 19, 2022
89b8c9e
clean lint from views
ptooley Apr 19, 2022
a85c31c
delint urls.py
ptooley Apr 19, 2022
191343c
clean remainder of django python
ptooley Apr 20, 2022
3b8dd19
fix broken forms.py
ptooley Apr 20, 2022
f166c80
more changes to fix userupdateform
ptooley Apr 20, 2022
4013c05
finish linting and first consistency and sanity pass
ptooley Apr 20, 2022
7b4479d
cleanup pass 2
ptooley Apr 20, 2022
c5aa268
hopefully final cleanups
ptooley Apr 20, 2022
5d82cfa
Fix up ansible yaml to actually update pip
Apr 20, 2022
720c571
final cleanups
ptooley Apr 20, 2022
31578f5
add python lint and format to pre-commit
ptooley Apr 20, 2022
33ecbae
revert attempt to tighten exception scope (also black fix)
ptooley Apr 20, 2022
2f9a2dc
Add information on custom image in admin guide
ningli Apr 21, 2022
57ec1b9
fix admin.py
ptooley Apr 21, 2022
133fc3d
cluster_manager deblacking fixes
ptooley Apr 21, 2022
82cd989
add pylintrc copyright notice
ptooley Apr 21, 2022
0b1db84
more c2 daemon and newline tidying
ptooley Apr 21, 2022
59b72a1
Merge pull request #39 from brandenm-nag/linting
ptooley Apr 21, 2022
60139e1
5d82cfa6 means we can remove the controller dev role again
ptooley Apr 21, 2022
0a30640
Merge pull request #41 from brandenm-nag/fixup_ansible_lint
ptooley Apr 21, 2022
a6c02d8
First attempt to support SlurmGCP Debian images
Apr 19, 2022
da2d262
Fix Javascript in job rerun page
ningli Apr 21, 2022
29813d4
Add space around button
ningli Apr 21, 2022
e461bfb
Fix Javascript in rerun form
ningli Apr 21, 2022
8295ce0
Update pricing info when changing partitions
ningli Apr 21, 2022
faf9a04
Fix missing instance type of job detail page
ningli Apr 21, 2022
3d7aedf
Add tentative support for N1 GPUs
Apr 19, 2022
f0e568f
Minor JS and form tweaks
Apr 20, 2022
6e2f5c4
Cleanup some form handling and JS
Apr 20, 2022
71dc8f2
Cleanup for linting
Apr 21, 2022
d676f2f
Set VPC cloud_state not status
Apr 21, 2022
cc48073
Add CSV export of cost data
ningli Apr 21, 2022
0540123
Pylint new codes
ningli Apr 21, 2022
5c3e65f
Remove unused old CITC firewalling code
Apr 21, 2022
2aca65f
Improve link text
ningli Apr 21, 2022
22cb09c
Merge pull request #42 from brandenm-nag/fix-rerun
Apr 21, 2022
28b9e38
Add machine_type options for controller and login nodes
Apr 21, 2022
a706df9
Add disk type and size for controller and login
Apr 21, 2022
774823c
Merge new_frontend into add-docs
ningli Apr 22, 2022
84c7fff
fix replacement error from linting
ptooley Apr 22, 2022
3282c9d
Add GPU pricing support, fix up N1 memory pricing
Apr 22, 2022
da7540c
Remove filter that is now filtered elsewhere
Apr 22, 2022
f2f6901
fix filesystem not displaying "creating" state
ptooley Apr 22, 2022
cf47719
Fix up Price calculations
Apr 22, 2022
75d2f87
Add N1 GPU/CPU waring
Apr 22, 2022
d3cc2f0
Remove 'CPU' part of the warning
Apr 22, 2022
50d40f7
Remove 'CPU' part of the warning
Apr 22, 2022
40c7f89
Always remove the warning before re-adding it
Apr 22, 2022
32ca3b5
Merge pull request #38 from brandenm-nag/debian_image_support
ningli Apr 22, 2022
68b9568
Merge pull request #40 from brandenm-nag/n1_gpu
Apr 22, 2022
d3a67c0
Merge pull request #32 from brandenm-nag/add-docs
ningli Apr 22, 2022
b17cfc2
Merge branch 'develop' into new_frontend
Apr 22, 2022
99c0788
fix workbench create view not listing filesystems
ptooley Apr 25, 2022
c97a4a2
Merge pull request #44 from brandenm-nag/fix_workbench_fs
Apr 25, 2022
2d20339
Fix bug showing wrong logged in admin user in user detail/update pages
ningli Apr 25, 2022
eec7205
Merge branch 'new_frontend' of https://github.com/brandenm-nag/hpc-to…
ningli Apr 25, 2022
932c4fe
Install and run Grafna
Apr 25, 2022
b897710
Integrate Grafana with Django
Apr 26, 2022
a90127b
Embed grafana
Apr 26, 2022
f2cc650
Update models.py
Apr 27, 2022
8f22772
To make default be default if blank, for Integer fields, must set bla…
Apr 27, 2022
56935c6
Merge pull request #43 from brandenm-nag/cluster_extensions
ningli Apr 27, 2022
dad8268
Only support C2 for Placement Groups as per SlurmGCP
Apr 27, 2022
ede0499
Install modern OpsAgent for monitoring
Apr 27, 2022
43323ae
Add default dashboard for new clusters
Apr 27, 2022
6b8f7be
cleanup and add slurm
ptooley Apr 28, 2022
88a8a35
PR #46 review changes
ptooley Apr 28, 2022
2844de5
Grafana integration fixups
Apr 27, 2022
403baed
Merge branch 'new_frontend' into grafana
Apr 28, 2022
9a69101
Merge pull request #2 from ikatsardis/slurm_workbench_integr
ptooley Apr 28, 2022
0273161
Merge branch 'new_frontend' into grafana
ptooley Apr 28, 2022
8f80dfd
Fixup GitHub bad spacing
Apr 28, 2022
b0e3afc
Quick fix for too-short DB name
Apr 28, 2022
2a73db9
Merge branch 'new_frontend' into grafana
Apr 28, 2022
1674463
Various HTML tweaks
Apr 28, 2022
1976485
Fix up tied-to-specific instance
Apr 28, 2022
b9a27ec
Update serializers to reflect changes in models
ningli Apr 29, 2022
b287bd5
Add application and job list in CLI application; fix pylint problems
ningli Apr 29, 2022
a8a9818
Hide subnet message as required
ningli Apr 29, 2022
6697a51
fix for upstream slurm being a moving target
ptooley May 9, 2022
16c25f5
Merge pull request #1 from ikatsardis/grafana
ningli May 9, 2022
358b814
Fix issues with workbench os-login and update docs
ptooley May 10, 2022
658194e
Merge pull request #3 from ikatsardis/workbench_fixes
ningli May 11, 2022
57f3acf
add not implemented warning code path to cli
ptooley May 11, 2022
317fda4
Merge pull request #4 from ikatsardis/cli-fix
ningli May 11, 2022
8c719fc
Update the frontend README file
ningli May 12, 2022
2da854d
Fix for errors when user not logged in.
mattstreet-nag Jun 27, 2022
720be15
Fix for error in generated YAML when selecting GPU compute VMs.
mattstreet-nag Jun 27, 2022
11c8aff
Merge branch 'main' into new_frontend
mattstreet-nag Jun 27, 2022
69e3fe3
Added new TKFE lock file to git ignore
mattstreet-nag Jun 29, 2022
dd74513
Restructured and rewritten deploy script.
mattstreet-nag Jun 30, 2022
bfcb780
Housekeeping - added log files to git ignore
mattstreet-nag Jun 30, 2022
66b6d3a
New destroy script.
mattstreet-nag Jun 30, 2022
d65c53f
Added extra checking to deploy script.
mattstreet-nag Jul 1, 2022
e15aac1
Major update to documentation.
mattstreet-nag Jul 6, 2022
ec3b5da
Renamed destroy.sh to teardown.sh
mattstreet-nag Jul 6, 2022
047d88d
Updated documentation.
mattstreet-nag Jul 12, 2022
ce95758
Multiple improvements to deploy script
mattstreet-nag Jul 19, 2022
f588f77
Minor changes to in-line docs and tidy up.
mattstreet-nag Jul 19, 2022
3c3541c
Separated out service account management into own script - this facil…
mattstreet-nag Aug 8, 2022
9e57936
Merge remote-tracking branch 'hpc-toolkit/main' into new_frontend
mattstreet-nag Aug 12, 2022
816bc60
Fix to Makefile to resolve merge
mattstreet-nag Aug 12, 2022
4db39d5
Added check on deployment name to deploy script
mattstreet-nag Aug 12, 2022
d4fdfef
Changes required to work with new version of ghpc toolkit (v1.1.0)
mattstreet-nag Aug 12, 2022
f8aa0cf
Implemented a fix for issue encountered when 'gcluster' account exist…
mattstreet-nag Aug 18, 2022
d0dfb8f
Increased minimum default disk size of Slurm login node from 20GB to …
mattstreet-nag Aug 24, 2022
2f7bd0b
Updated version of Slurm for workbench, to match SlurmGCP v5wq
mattstreet-nag Aug 24, 2022
3957c9b
Key-value changes required in GHPC YAML for Slurm GCP v5.
mattstreet-nag Aug 24, 2022
7d74ed4
Changes required to fix for Slurm GCP v5.
mattstreet-nag Sep 12, 2022
a014182
Merge remote-tracking branch 'hpc-toolkit/main' into new_frontend
mattstreet-nag Sep 12, 2022
7446e6f
Tidy up of generated YAML file, removing old SlurmGCP v4 comments
mattstreet-nag Sep 12, 2022
53a7d81
Update README.md
ikatsardis Oct 8, 2022
22daaf0
Update admin_guide.md
ikatsardis Oct 8, 2022
b3ce782
Update deploy.sh
ikatsardis Oct 8, 2022
7a81475
Update deploy.sh
ikatsardis Oct 8, 2022
0cfb678
Bug fix to deploy script.
mattstreet-nag Oct 17, 2022
fc6ba4a
Modified error checking and message to account for availability of Pl…
mattstreet-nag Oct 17, 2022
148b3d3
Update settings.py
ikatsardis Oct 18, 2022
168fba8
Update benchmarks.py
ikatsardis Oct 18, 2022
ecbbf0d
Update admin_guide.md
ikatsardis Oct 26, 2022
9afa398
Update admin_guide.md
ikatsardis Oct 26, 2022
fc34eb8
Update applications.py
ikatsardis Oct 31, 2022
89d5f42
Update clusters.py
ikatsardis Oct 31, 2022
2c3406b
Update jobs.py
ikatsardis Oct 31, 2022
7b06206
Added check to deployment name to ensure it is accepted by GHPC.
mattstreet-nag Oct 31, 2022
7299f37
Updated to fix issues found by pre-commit.
mattstreet-nag Nov 1, 2022
90d645d
Updates to fix pre-commit warnings/errors.
mattstreet-nag Nov 1, 2022
ae17dd7
Updated docs to pass pre-commit checks
mattstreet-nag Nov 2, 2022
e668d95
Merge remote-tracking branch 'upstream/develop' into new_frontend
heyealex Nov 2, 2022
c3671a0
Bump cloud.google.com/go/serviceusage from 1.3.0 to 1.4.0
dependabot[bot] Nov 7, 2022
fc8c2d5
Bump github.com/zclconf/go-cty from 1.12.0 to 1.12.1
dependabot[bot] Nov 9, 2022
378b095
Bump github.com/otiai10/copy from 1.7.0 to 1.9.0
dependabot[bot] Nov 9, 2022
05e9aba
Fixes to Go version for updated ghpc
mattstreet-nag Nov 10, 2022
d0b9437
Minor mod to deploy for shellcheck compliance
mattstreet-nag Nov 10, 2022
63a0c87
Automated pre-commit updates
heyealex Nov 10, 2022
cbf3299
Merge remote-tracking branch 'upstream/develop' into new_frontend
heyealex Nov 10, 2022
870579d
Pre-commit fixes in shell and markdown
heyealex Nov 10, 2022
a48e2ab
Bump github.com/hashicorp/hcl/v2 from 2.14.1 to 2.15.0
dependabot[bot] Nov 10, 2022
aa5e302
Add Batch MPI example running WRF
nick-stroud Nov 9, 2022
4affb4c
Add documentation for Batch MPI example
nick-stroud Nov 9, 2022
bec94fb
Add Batch MPI integration test
nick-stroud Nov 10, 2022
f611ff9
Update pbspro-preinstall
tpdownes Nov 10, 2022
86ecf40
Merge pull request #722 from tpdownes/fix_pbspro_devel_rpm
tpdownes Nov 10, 2022
d23f0a4
Address feedback
nick-stroud Nov 10, 2022
4db76ce
Remove some temporary files
heyealex Nov 11, 2022
59e8702
Remove module.json files, no longer needed
heyealex Nov 11, 2022
48485a8
Add node group when setting partition
heyealex Nov 11, 2022
74a44a6
Merge pull request #718 from nick-stroud/batch_mpi_example
nick-stroud Nov 11, 2022
eee942c
Update default spack version to 0.19.0
douglasjacobsen Nov 11, 2022
1f9196a
Merge pull request #728 from GoogleCloudPlatform/main
heyealex Nov 11, 2022
0ea1f45
Increase the timeout to 30min for Omnia blueprint
heyealex Nov 11, 2022
125ff34
Merge pull request #700 from heyealex/new_frontend
heyealex Nov 12, 2022
3129ce2
Merge pull request #725 from douglasjacobsen/spack_19_update
heyealex Nov 15, 2022
9a70a6e
Remove ansible install warning message in README
heyealex Nov 16, 2022
fddbdb2
Add requirements.txt for builder Dockerfile
heyealex Nov 16, 2022
d7c6e69
Merge pull request #729 from heyealex/increase-timeout-omnia
heyealex Nov 17, 2022
b82ebc0
Added newlines and group names after ghpc create
omartin2010 Nov 17, 2022
ff2b342
Fixed variable name to avoid confusion
omartin2010 Nov 17, 2022
b566bf2
Merge pull request #735 from omartin2010/martinolivier-fix-creation-o…
heyealex Nov 17, 2022
9af2f09
Add pointer to network storage doc in file systems
heyealex Nov 17, 2022
ea881d3
Set unique script names for all mount points in nfs-server
heyealex Nov 17, 2022
51f7d6d
Add mounting instructions to network storage modules
heyealex Nov 17, 2022
1466a45
Merge pull request #704 from GoogleCloudPlatform/dependabot/go_module…
heyealex Nov 17, 2022
982d435
Bump google.golang.org/api from 0.102.0 to 0.103.0
dependabot[bot] Nov 17, 2022
e5e4e2a
Merge pull request #715 from GoogleCloudPlatform/dependabot/go_module…
heyealex Nov 17, 2022
5ec4400
Merge pull request #708 from GoogleCloudPlatform/dependabot/go_module…
heyealex Nov 18, 2022
226f13d
Merge pull request #716 from GoogleCloudPlatform/dependabot/go_module…
heyealex Nov 18, 2022
b86fbeb
Merge pull request #721 from GoogleCloudPlatform/dependabot/go_module…
heyealex Nov 18, 2022
052edc8
Switch to GA provider v4.19.0 and above
tpdownes Nov 18, 2022
698b2ae
Change dependency enforcement for login to controller
heyealex Nov 18, 2022
f2173ef
Change default local_mounts to only one, update test
heyealex Nov 18, 2022
de894aa
Add pointer to the compatibility matrix
heyealex Nov 18, 2022
e3665df
Merge pull request #736 from heyealex/bugfix/multi-mount-nfs-server
heyealex Nov 18, 2022
ffb2780
Validate unique destinations startup-scripts
heyealex Nov 18, 2022
f5a78d0
Merge pull request #737 from heyealex/doc/file-system-mount-and-doc-l…
heyealex Nov 18, 2022
040ef38
Fix test_configs with duplicated destinations
heyealex Nov 18, 2022
b52f69b
Use one() rather than splat
heyealex Nov 18, 2022
9bf06ec
Merge pull request #740 from heyealex/validate-unique-destination-ss
heyealex Nov 18, 2022
de1fc19
Merge pull request #733 from GoogleCloudPlatform/builder/reconfig-pyt…
heyealex Nov 18, 2022
7d60be4
Merge pull request #732 from heyealex/doc/startup-script-rm-ansible-req
heyealex Nov 18, 2022
079f039
Bump github.com/spf13/afero from 1.9.2 to 1.9.3
dependabot[bot] Nov 18, 2022
e32b7ee
Merge pull request #741 from GoogleCloudPlatform/dependabot/go_module…
heyealex Nov 18, 2022
138eeab
Merge pull request #739 from heyealex/bugfix/controller_id_list
heyealex Nov 21, 2022
80235e5
Merge pull request #738 from tpdownes/fix_provider
cboneti Nov 21, 2022
199bb75
Increased timeout of Ominia cluster creation
cboneti Nov 22, 2022
b1a9ea8
wait-for-startup now detects timeouts
cboneti Nov 22, 2022
323d853
Merge pull request #742 from cboneti/increase-omnia-timeout
cboneti Nov 22, 2022
d466bc6
Adding a test case for timeouts
cboneti Nov 22, 2022
c2cc6f4
Merge branch 'develop' into wait-for-startup-timeout
cboneti Nov 22, 2022
dece72e
Merge pull request #743 from cboneti/wait-for-startup-timeout
cboneti Nov 22, 2022
d18a2df
Cap TF Google provider version at latest stable (4.43)
heyealex Nov 22, 2022
98f20f8
Merge pull request #744 from heyealex/cap-plugin-version
heyealex Nov 22, 2022
2e7f48c
make overwrite error exit with rc=1
kkr16 Nov 22, 2022
fef8a2b
Merge pull request #701 from kkr16/fix-overwrite-rc
heyealex Nov 22, 2022
9da0419
Add High IO Slurm on GCP v5 example blueprint
heyealex Nov 21, 2022
946cb74
Update command to only selected needed nodes
heyealex Nov 22, 2022
bb9b48b
Update suggested node group specification
heyealex Nov 22, 2022
a61ece2
Merge pull request #730 from heyealex/examples/slurm-gcp-v5-high-io
heyealex Nov 22, 2022
2c40597
Add high-io-slurm-gcp-v5 example as daily test
heyealex Nov 23, 2022
0961c73
Add reference to htcondor tutorial from example documentation
nick-stroud Nov 28, 2022
a19d16e
Add an update comments
heyealex Nov 29, 2022
3b3f06a
Merge pull request #731 from heyealex/tests/high-io-v5
heyealex Nov 29, 2022
111299d
Set enable_reconfigure to true by default in high io example
heyealex Nov 29, 2022
fd274ce
Merge pull request #747 from nick-stroud/add_ref_to_htcondor_tutorial
nick-stroud Nov 29, 2022
24eecf7
Fix for issue #746 - removed invalid options for filesystem (STANDARD…
mattstreet-nag Nov 29, 2022
9625544
Remove unspecified tier from Filestore options
heyealex Nov 29, 2022
0a311f2
Merge pull request #734 from heyealex/default-enable-reconfig-high-io
heyealex Nov 29, 2022
ffe0527
Merge pull request #749 from mattstreet-nag/develop
heyealex Nov 29, 2022
9d5a568
Update max google provider to 4.44.1
heyealex Nov 29, 2022
e4162b9
Merge pull request #751 from heyealex/update-terraform-provider
heyealex Nov 30, 2022
c1f4a44
Rolling version to 1.10.0
cboneti Dec 7, 2022
0295b00
Merge pull request #762 from cboneti/version/1.10.0
cboneti Dec 7, 2022
6 changes: 6 additions & 0 deletions .gitignore
@@ -7,6 +7,11 @@ ghpc
# workspace level vscode settings
.vscode/

####
community/front-end/ofe/credential.json
**/.tkfe.lock
**/deployment.tar.gz

#### TERRAFORM

# Local .terraform directories
@@ -25,6 +30,7 @@ crash.log
# to change depending on the environment.
#
*.tfvars
**/tf*.log

# Ignore override files as they are usually used to override modules locally and so
# are not checked in
2 changes: 2 additions & 0 deletions cmd/create.go
@@ -23,6 +23,7 @@ import (
"hpc-toolkit/pkg/config"
"hpc-toolkit/pkg/modulewriter"
"log"
"os"

"github.com/spf13/cobra"
)
@@ -98,6 +99,7 @@ func runCreateCmd(cmd *cobra.Command, args []string) {
var target *modulewriter.OverwriteDeniedError
if errors.As(err, &target) {
fmt.Printf("\n%s\n", err.Error())
os.Exit(1)
} else {
log.Fatal(err)
}
2 changes: 1 addition & 1 deletion cmd/root.go
@@ -49,7 +49,7 @@ HPC deployments on the Google Cloud Platform.`,
log.Fatalf("cmd.Help function failed: %s", err)
}
},
-Version: "v1.9.0",
+Version: "v1.10.0",
Annotations: annotation,
}
)
4 changes: 4 additions & 0 deletions community/examples/README.md
@@ -47,6 +47,10 @@ Examples using Intel HPC technologies can be found in the

[See description in core](../../examples/README.md#slurm-gcp-v5-ubuntu2004yaml-)

### slurm-gcp-v5-high-io.yaml

[See description in core](../../examples/README.md#slurm-gcp-v5-high-ioyaml-)

### htcondor-pool.yaml

[See description in core](../../examples/README.md#htcondor-poolyaml--)
1 change: 1 addition & 0 deletions community/examples/omnia-cluster.yaml
@@ -91,3 +91,4 @@ deployment_groups:
source: community/modules/scripts/wait-for-startup
settings:
instance_name: ((module.manager.name[0]))
timeout: 2100
150 changes: 150 additions & 0 deletions community/examples/slurm-gcp-v5-high-io.yaml
@@ -0,0 +1,150 @@
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

---

blueprint_name: hpc-cluster-high-io-v5

vars:
  project_id: ## Set GCP Project ID Here ##
  deployment_name: high-io-slurm-gcp-v5
  region: us-central1
  zone: us-central1-c
  # By default, public IPs are set in the login and controller to allow easier
  # SSH access. To turn this behavior off, set this to true.
  disable_public_ips: false
  # Set to true for active cluster reconfiguration. Note that setting this
  # option requires additional dependencies to be installed locally.
  enable_reconfigure: true

# Documentation for each of the modules used below can be found at
# https://github.com/GoogleCloudPlatform/hpc-toolkit/blob/main/modules/README.md

deployment_groups:
- group: primary
  modules:
  # Source is an embedded module, denoted by "modules/*" without ./, ../, /
  # as a prefix. To refer to a local or community module, prefix with ./, ../ or /
  # Example - ./modules/network/pre-existing-vpc
  - id: network1
    source: modules/network/vpc

  - id: homefs
    source: modules/file-system/filestore
    use: [network1]
    settings:
      local_mount: /home

  - id: projectsfs
    source: modules/file-system/filestore
    use: [network1]
    settings:
      filestore_tier: HIGH_SCALE_SSD
      size_gb: 10240
      local_mount: /projects

  - id: scratchfs
    source: community/modules/file-system/DDN-EXAScaler
    use: [network1]
    settings:
      local_mount: /scratch

  # The lowcost partition is designed to run at a lower cost and without additional quota
  # Use:
  # `srun -N 4 <<Command>>` for any node in the partition.
  # `srun -N 4 --mincpus 2` for node group n2s4.
  - id: low_cost_node_group_n2s2
    source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
    settings:
      name: n2s2
      machine_type: n2-standard-2
      node_count_dynamic_max: 10

  - id: low_cost_node_group_n2s4
    source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
    settings:
      name: n2s4
      machine_type: n2-standard-4
      node_count_dynamic_max: 10

  - id: low_cost_partition
    source: community/modules/compute/schedmd-slurm-gcp-v5-partition
    use:
    - network1
    - homefs
    - scratchfs
    - projectsfs
    - low_cost_node_group_n2s2
    - low_cost_node_group_n2s4
    settings:
      is_default: true
      partition_name: lowcost
      enable_placement: false
      exclusive: false

  # The compute partition is designed for performance.
  # Use:
  # `srun -N 4 -p compute <<Command>>` for any node in the partition.
  # `srun -N 4 -p compute --mincpus 30 <<Command>>` for node group c2s60.

  - id: compute_node_group_c2s60
    source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
    settings:
      name: c2s60
      node_count_dynamic_max: 200

  - id: compute_node_group_c2s30
    source: community/modules/compute/schedmd-slurm-gcp-v5-node-group
    settings:
      name: c2s30
      node_count_dynamic_max: 200
      machine_type: c2-standard-30

  - id: compute_partition
    source: community/modules/compute/schedmd-slurm-gcp-v5-partition
    use:
    - network1
    - homefs
    - scratchfs
    - projectsfs
    - compute_node_group_c2s60
    - compute_node_group_c2s30
    settings:
      partition_name: compute

  - id: slurm_controller
    source: community/modules/scheduler/schedmd-slurm-gcp-v5-controller
    use:
    - network1
    - homefs
    - scratchfs
    - projectsfs
    - low_cost_partition
    - compute_partition
    settings:
      machine_type: c2-standard-8
      disable_controller_public_ips: $(vars.disable_public_ips)

  - id: slurm_login
    source: community/modules/scheduler/schedmd-slurm-gcp-v5-login
    use:
    - network1
    - slurm_controller
    settings:
      machine_type: n2-standard-4
      disable_login_public_ips: $(vars.disable_public_ips)

  - id: hpc_dashboard
    source: modules/monitoring/dashboard
    outputs: [instructions]
2 changes: 0 additions & 2 deletions community/examples/spack-gromacs.yaml
@@ -49,8 +49,6 @@ deployment_groups:
source: community/modules/scripts/spack-install
settings:
install_dir: /sw/spack
-spack_url: https://github.com/spack/spack
-spack_ref: v0.18.0
log_file: /var/log/spack.log
configs:
- type: single-config
27 changes: 27 additions & 0 deletions community/front-end/.gitignore
@@ -0,0 +1,27 @@
/clusters/
fs/
vpcs/
runs/
myvenv/
__pycache__
website/ghpcfe/migrations/
.*.swp
*.sqlite3
bin/
lib/
share/
run/
configuration.yaml
website/static/
workbenches/
dependencies/
tf/deployment.tar.gz
key1
key2
.terraform/
.terraform.lock.hcl
terraform.tfstate*
terraform.tfvars
tf/.tkfe.lock
tf/tfapply.log
tf/tfdestroy.log
33 changes: 33 additions & 0 deletions community/front-end/ofe/README.md
@@ -0,0 +1,33 @@
# Google HPC Toolkit Open Front End

This is a web front-end for HPC applications on GCP. It delegates to the Cloud
HPC Toolkit to create cloud resources for HPC clusters. Through the convenience
of a web interface, system administrators can manage the life cycles of HPC
clusters and install applications; users can prepare & submit HPC jobs and run
benchmarks. This web application is built upon the Django framework.

## Deployment

This system can be deployed on GCP by an administrator using the following
steps:

* Arrange a hosting GCP project for this web application.
* Prepare the client side environment and secure sufficient IAM permissions for
the system deployment.
* When ready, clone this repository and run the deployment script at
`hpc-toolkit/community/front-end/deploy.sh` from a client machine or a Cloud
Shell. Follow instructions to complete the deployment. The whole process is
automated via Terraform and should complete within 15 minutes.
* Perform post-deployment configurations.

Please visit the [Administrator's Guide](docs/admin_guide.md) for more
information on system deployment.

Once the deployment is done, the administrator can use the web interface to
create HPC clusters, install applications, and set up other users. More
information is available in the [Administrator's Guide](docs/admin_guide.md)
and [User Guide](docs/user_guide.md).

You are welcome to contribute to this project. The
[Developer's Guide](docs/developer_guide.md) contains more information on the
implementation details of the system.