Releases: NVIDIA/ais-k8s

v1.6.1

09 Dec 21:17

See https://github.com/NVIDIA/ais-k8s/releases/tag/v1.6.0

AIS Operator v1.6.1

  • Added reconciliation of target and proxy container resources spec
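
A minimal sketch of what reconciling a container resources spec can look like, not the operator's actual code: the desired resources from the spec are compared with what the live container carries, and the container is updated only when they differ. The function name and the dict layout are assumptions made for illustration.

```python
from copy import deepcopy


def reconcile_resources(desired: dict, container: dict) -> bool:
    """Bring container["resources"] in line with the desired spec.

    Returns True when an update was needed, False when already in sync.
    Both arguments are plain dicts standing in for Kubernetes objects.
    """
    if container.get("resources", {}) == desired:
        return False  # nothing to do, avoid a needless rollout
    # Copy so later edits to the spec do not silently leak into the container.
    container["resources"] = deepcopy(desired)
    return True
```

Calling it a second time with the same spec returns False, which is the property that lets a reconcile loop run repeatedly without causing restarts.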

Full Changelog: v1.6.0...v1.6.1

v1.6.0

06 Dec 05:39

IMPORTANT: Please see compatibility docs for information on deploying clusters with this new version. It requires a new aisinit container >= v3.25 to generate configs for AIS pods.

AIS Operator v1.6.0

  • Added support for init container managed configs. See compatibility docs. This will improve compatibility between versions and help with upgrade paths.
  • Operator will now reconcile the entire pod spec for aisnode when the image changes
  • Operator will now reconcile the entire init pod spec when the init image changes
  • Added resource management options to AIS spec
  • Added MY_NODE env var to aisnode container
  • Added support for deployments with distributed tracing

Full Changelog: v1.5.0...v1.6.0

v1.5.0

22 Oct 18:31

AIS Operator v1.5.0

  • Updated to Go 1.23 and latest dependencies
  • Added support for custom annotations passed from spec to aisnode containers via Annotations spec option
  • Added support for custom environment variables passed from spec to aisnode containers via Env spec option
  • Fixed a bug where rebalance would not properly disable and re-enable for upgrades if it had been modified manually
  • Removed the option for the operator manager to run external to the k8s cluster
  • Internal logic refactoring of AIS API and AuthN clients
  • Added Sync option to version config
  • Changed the net.http.UseHttps option to solely control whether aisnode expects to use HTTPS, rather than relying on the presence of TLS secrets or a cert-manager issuer
  • Improved logging and requeue logic to make it easier to follow deployment progress and debug issues
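
To illustrate the net.http.UseHttps change, here is a hedged sketch of the decision as described: only the explicit flag is consulted, and TLS-related fields do not turn HTTPS on by themselves. The nested dict layout and the function name are assumptions; the real option lives in the AIS spec and may be structured differently.

```python
def expects_https(spec: dict) -> bool:
    """Decide whether aisnode should expect HTTPS.

    Only the explicit net.http.UseHttps flag is read. Other TLS-related
    keys in the (hypothetical) spec dict are deliberately ignored.
    """
    http_conf = spec.get("net", {}).get("http", {})
    return bool(http_conf.get("UseHttps", False))
```

Under this reading, a spec that names a cert-manager issuer but leaves UseHttps unset would still be treated as plain HTTP.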

Helm

Full Changelog: v1.4.1...v1.5.0

v1.4.1

20 Sep 15:32

AIS Operator v1.4.1

  • Fixed an issue where the operator would modify the rebalance config in the provided spec and not restore previous config after upgrades
  • Cleaned up logging and handling of DNS resolution on proxy startup
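
The rebalance fix can be pictured as a save/disable/restore pattern. The sketch below is an illustration under assumptions, not the operator's implementation: the live config is modelled as a dict with a hypothetical rebalance.enabled key, and the previous value is restored even if the upgrade step fails.

```python
from contextlib import contextmanager


@contextmanager
def rebalance_paused(live_config: dict):
    """Temporarily disable rebalance, then restore whatever was set before."""
    previous = live_config["rebalance"]["enabled"]  # remember the prior setting
    live_config["rebalance"]["enabled"] = False
    try:
        yield live_config
    finally:
        # Restore the saved value rather than blindly re-enabling, so a
        # manually disabled rebalance stays disabled after the upgrade.
        live_config["rebalance"]["enabled"] = previous
```

A caller would wrap the upgrade in `with rebalance_paused(cfg): ...`. Because the function works on the live config it is handed, the user-provided spec is never modified.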

Major release v1.4.0: https://github.com/NVIDIA/ais-k8s/releases/tag/v1.4.0
Full Changelog: v1.4.0...v1.4.1

v1.4.0

10 Sep 16:05

AIS Operator v1.4.0

  • Improved state management to reconcile based on state rather than using blocking waits
  • Disabled rebalance at the AIS level before cluster modifications -- scaling, rolling upgrades, cluster re-creation
  • Added a watch on the AIS spec configToUpdate field so that changes are kept in sync with the cluster
  • Added ability to reconcile statefulset status
  • Updated default AIS config generation and improved compatibility through version changes
  • Added new AIS states for the following:
    • Scaling
    • HostCleanup
    • Finalized
  • Bug fixes
    • Fixed deep equal comparison with spec
    • Fixed cleanup jobs with proper status and termination
    • Improved wait behavior when waiting for AIS cluster readiness or decommissioning
  • QOL improvements -- cleaned up logging, added unit testing
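
The move from blocking waits to state-based reconciliation can be sketched roughly as follows. Each pass inspects the current state, records the next one, and asks to be called again later instead of sleeping. "Scaling" and "Finalized" come from the list above; the "Ready" state, the 10-second delay, and all names in the code are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    state: str            # state to record on the resource
    requeue_after: float  # seconds until the next pass; 0 means no requeue


def reconcile(state: str, ready: int, desired: int) -> Result:
    """One non-blocking reconcile pass; it never sleeps or polls."""
    if state == "Finalized":
        # Terminal state: nothing left to drive forward.
        return Result("Finalized", 0.0)
    if ready != desired:
        # Not converged yet: record progress and ask to be called again.
        return Result("Scaling", 10.0)
    # Converged: settle into the (assumed) steady state.
    return Result("Ready", 0.0)
```

Because every pass returns immediately, the controller stays free to handle other resources while a scale-up is still in progress.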

API Changes

  • New options
    • cleanupMetadata -- Allows for cluster decommission while preserving cluster metadata for future deployments
    • tlsCertManagerIssuerName -- Specifies a cert-manager CSI issuer

Full Changelog: v1.3.0...v1.4.0

v1.3.0

01 Aug 20:54

AIS Operator v1.3.0

  • Added sidecar container for accessing stdout logs via k8s
  • Test improvements including unit tests for controller
  • Improved state management including new states for in-progress shutdown, in-progress decommission, and cleanup. See ClusterCondition list in aistore_types.go
  • Improved state logging and event recording
  • Removed unused "env-mount" volume mount
  • Added AuthN support

API changes

  • New cleanupMetadata option. Previous behavior matches cleanupMetadata=true. This option can now be disabled to allow preservation of cluster metadata (such as buckets) when decommissioning and transitioning to an entirely new cluster (new AIS custom resource).
  • New authNSecretName option to add secret signing key for JWT tokens in AIStore.

Full Changelog: v1.2.0...v1.3.0

v1.2.0

11 Jul 17:43

AIS Operator v1.2.0

Operator:

  • Breaking Change

    • Deployments with Operator versions >= 1.2.0 must specify an ais-init image >= 1.2.0
  • Changes

    • Added stateStorageClass field to AIS spec for dynamic state storage
    • Added handling for destroying statefulsets in unready state
    • Decommission now waits for cleanup job success before continuing
    • Added internal shutdown status
    • Fixed duration type in AIS config
    • Added ais-init docker build (moved from aistore repo)
    • Moved bash script logic into the init image
    • Switched to proper HTTP probes for liveness/readiness
  • Deprecated

Full Changelog: v1.1.1...v1.2.0

v1.1.1

10 Jun 18:42

AIS Operator v1.1.1

Highlights:

  • General Improvements:

    • Updated AIStore version to v3.23 in Helm chart, operator tests, deployment roles, and config samples.
    • Enhanced security and execution efficiency by refining the use of 'become: true' in Ansible playbooks, restricting elevated privileges to necessary tasks only.
    • Transitioned the default branch name from 'master' to 'main'.
  • Monitoring Enhancements:

    • Improved Grafana dashboard visuals and organization, enhancing panel visibility and highlighting unavailable numbers.
    • Updated AlertManager timings and Slack titles to better distinguish between alert statuses.
    • Fixed and optimized Grafana dashboard metrics, including throughput calculation and error graph adjustments.
    • Added more alerts for various AIS node states, including restart and maintenance mode alerts.
  • Operator Enhancements:

    • Fixed Backend field marshaling in the operator.
    • Made .spec.size optional, simplifying operator configuration.
    • Simplified the waitForDNSEntry method.
    • Explicitly disallowed multiple proxies on a single node for better stability.
    • Bumped AIStore dependency and default version to v1.1.0.
  • Documentation and Miscellaneous:

    • Added a compatibility matrix for AIStore and ais-operator.
    • Updated generated files and lint configurations.

Full Changelog: v1.1.0...v1.1.1

v1.1.0

23 Apr 04:09

AIS-Operator v1.1.0 Release Notes

Operator Enhancements:

  • New logsDir field to mount logs.
  • New cleanup jobs after decommissioning.
  • Automatic cluster decommission upon deletion.
  • Added mountLabel field to CRD; support for backward compatibility.
  • Enhanced DNS checks for proxies before resolving targets.
  • Improved flows for startup, restart, shutdown, and decommission.
  • Added shutdownCluster field to CRD spec.
  • Added hostNetwork parameter to target specifications in CRD.
  • General fixes and updates.

Documentation Updates:

  • Guidelines for deploying multiple targets per Kubernetes node.
  • General documentation fixes and updates.

Playbooks:

  • Updated to accommodate new operator field enhancements.

Additional Updates:

  • Experimental Helm chart for deploying AIS.
  • New ais-operator-helper Docker image for post-decommission cleanup jobs.
  • Various test fixes and improvements.

v1.0.0

05 Mar 20:35

AIS Operator v1.0.0

New Features

  • Support for different sizes of proxy/target stateful sets.
  • Enabled TLS certificate verification.
  • Operator client updated for TLS support.
  • Added multi-home support.
  • Helm chart creation added to Makefile bundle-manifests.

Fixes

  • Webhook fix for proxy/target spec size adjustments.

Updates

  • Upgraded to AISNode image v3.22.
  • Updated operator dependencies for better performance and security.