Releases: NVIDIA/aistore
3.25
Changelog
- "S3 compatibility API: add missing access control" c046cb8
- "core: async shutdown/decommission; primary to reject node-level requests" 2e17aaf
| * primary will now fail node-level decommission and similar lifecycle and cluster membership (changing) requests
| * keeping shutdown-cluster exception when forced (in re: local playground)
| * when shutting down or decommissioning an entire cluster primary will now perform the final step asynchronously
| * (so that the API caller receives ok)
- "python/sdk: improve error handling and logging for ObjectFile" b61b3db
- "core: cold-GET vs upgrading rlock to wlock" 9857e78
| * remove all sync.Cond related state and logic
| * reduce low-level lock-info to just rc and wlock
| * poll for up to host-busy timeout
| * return err-busy if unsuccessful
- "CLI show cluster to sort rows by POD names with primary on top" e469684
- "health check to be forwarded to primary when invoked with "primary-ready-to-rebalance" query param" a59f921
| * (previously, non-primary would fail the request)
- "python: avoid module level import of webds; remove 'webds' dependency" 228f23f
| * refactor dataset_config.py: avoid module-level import of ShardWriter
| * update pyproject.toml: add webdataset==0.2.86 as an optional dependency
- "aisloader: '--subdir' vs prefix (clarify)" 7e7e8e4
- "CLI: directory walk: do not call lstat on every entry (optimize)" 4a22b88
| * skip errors iff "continue-on-error"
| * add verbose mode to see all warnings - especially when invoked with the "continue-on-error" option
| * otherwise, stop walking and return the error in question
| * with partial rewrite
- "docs: add tips for copying files from Lustre; ais put vs ais promote" 3cb20f6
- "CLI: --num-workers option ('ais put', 'ais archive', and more)" d5e6fbc
| * add; amend
| * an option to execute serially (consistent with aistore)
| * limit not to exceed (2 * num-CPUs)
| * remove --conc flag (obsolete)
| * fix inline help
- "CLI: PUT and archive files from multiple matching directories" 16edff7
| * GLOBalize
| * PUT: add back --include-src-dir option
- "trim prefix: list-objects; bucket-summary; multi-obj operations" 7cf1546
| * rtrim(prefix, '*') to satisfy one common expectation
| * proxy only (leaving CLI intact)
- "unify 'validate-prefix' & 'validate-objname'; count list-objects errors" 5789273
| * add ErrInvalidPrefix (type-code)
| * refactor and micro-optimize validate-* helpers; unify
| * move object name validation to proxies; proxies to (also) count err.list.n
| * refactor ver-changed and obj-move
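The "trim prefix" entry in this changelog (rtrim(prefix, '*')) can be illustrated with a minimal sketch. The helper names normalize_prefix and list_with_prefix are hypothetical and are not part of aistore; the sketch also assumes that all trailing '*' characters are stripped, which the changelog does not spell out. The point is only that a prefix written with a trailing wildcard, such as 'images/*', presumably selects the same objects as the plain prefix 'images/'.

```python
def normalize_prefix(prefix: str) -> str:
    """Strip trailing '*' characters (assumption: all of them, not just one)."""
    return prefix.rstrip("*")


def list_with_prefix(names, prefix):
    """Hypothetical helper: return the names that start with the normalized prefix."""
    p = normalize_prefix(prefix)
    return [n for n in names if n.startswith(p)]


names = ["images/cat.jpg", "images/dog.jpg", "labels/cat.txt"]
print(list_with_prefix(names, "images/*"))
```

With this normalization, 'images/*' and 'images/' yield the same listing, which appears to be the "common expectation" the commit refers to.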
3.24
Version 3.24 arrives nearly 4 months after the previous one and contains more than 400 commits that fall into several main categories, topics, and sub-topics:
1. Core
1.1 Observability
We improved and optimized stats-reporting logic and introduced multiple new metrics and new management alerts.
There's now an easy way to observe per-backend performance and errors, if any. Instead of (or rather, in addition to) a single combined counter or latency, the system separately tracks requests that utilize AWS, GCP, and/or Azure backends.
For latencies, we additionally added cumulative "total-time" metrics:
- "GET: total cumulative time (nanoseconds)"
- "PUT: total cumulative time (nanoseconds)"
- and more
Together with respective counters, those total-times can be used to compute precise latencies and throughputs over arbitrary time intervals - either on a per-backend basis or averaged across all remote backends, if any.
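As a rough illustration of how cumulative counters and total-times combine, the sketch below takes two snapshots of the same metrics and derives the average latency and throughput for the interval between them. The Sample type, its field names, and the cumulative byte counter are assumptions made for this example; they are not the actual metric names exported by aistore.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    count: int        # cumulative number of completed requests (e.g., GETs)
    total_ns: int     # cumulative total time, in nanoseconds
    total_bytes: int  # cumulative transferred size, in bytes (assumed metric)
    at_s: float       # wall-clock time of the snapshot, in seconds


def interval_stats(prev: Sample, curr: Sample):
    """Return (average latency in ms, throughput in bytes/s) between two snapshots."""
    dn = curr.count - prev.count
    dt = curr.at_s - prev.at_s
    avg_latency_ms = (curr.total_ns - prev.total_ns) / dn / 1e6 if dn > 0 else 0.0
    throughput_bps = (curr.total_bytes - prev.total_bytes) / dt if dt > 0 else 0.0
    return avg_latency_ms, throughput_bps


prev = Sample(count=1000, total_ns=2_000_000_000, total_bytes=10_000_000, at_s=0.0)
curr = Sample(count=1500, total_ns=3_500_000_000, total_bytes=60_000_000, at_s=10.0)
print(interval_stats(prev, curr))  # (3.0, 5000000.0)
```

Because both inputs are monotonically increasing totals, the same subtraction works for any pair of snapshots, which is what makes arbitrary time intervals possible.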
New management alerts include keep-alive, tls-certificate-will-soon-expire (see next section), low-memory, low-capacity, and more.
Build-wise, aisnode with StatsD will now require the corresponding build tag.
Prometheus is effectively the default; for details, see related:
1.2 HTTPS; TLS
HTTPS deployment implies (and requires) that each AIS node (aisnode) has a valid TLS (X.509) certificate.
Every TLS certificate eventually expires, with a standard-defined maximum validity of 13 months - roughly, 397 days.
AIS v3.24 automatically reloads updated certificates, tracks expiration times, and reports any inconsistencies between certificates in a cluster:
Associated Grafana and CLI-visible management alerts:
alert | comment |
---|---|
tls-cert-will-soon-expire | Warning: less than 3 days remain until the current X.509 cert expires |
tls-cert-expired | Critical (red) alert (as the name implies) |
tls-cert-invalid | ditto |
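The expiration-related alerts in the table can be sketched as a simple classification on the certificate's not-after time. This is a minimal illustration, not the aistore implementation: the function name classify_cert and the use of None for "no alert" are assumptions, and the tls-cert-invalid case (which requires actually parsing and validating the certificate) is left out.

```python
from datetime import datetime, timedelta, timezone

# Per the table: warn when less than 3 days remain until expiration.
WARN_WINDOW = timedelta(days=3)


def classify_cert(not_after: datetime, now: datetime):
    """Return the alert name for a certificate, or None when no alert applies."""
    if now >= not_after:
        return "tls-cert-expired"
    if not_after - now < WARN_WINDOW:
        return "tls-cert-will-soon-expire"
    return None


now = datetime(2024, 7, 1, tzinfo=timezone.utc)
print(classify_cert(now + timedelta(days=2), now))   # tls-cert-will-soon-expire
print(classify_cert(now - timedelta(days=1), now))   # tls-cert-expired
print(classify_cert(now + timedelta(days=30), now))  # None
```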
Finally, there's a brand-new management API and ais tls CLI.
1.3 Filesystem Health Checker (FSHC)
FSHC component detects disk faults, raises associated alerts, and disables degraded mountpaths.
AIS v3.24 comes with a major (version 2) FSHC update, with new capabilities that include:
- detect mountpath changes at runtime;
- differentiate in-cluster IO errors from network and remote backend (errors);
- support associated configuration (section "API changes; Config changes" below);
- resolve (mountpath, filesystem) to disk(s), and handle:
  - no-disks exception;
  - disk loss, disk fault;
  - new disk attachments.
1.4 Keep-Alive; Primary Election
In-cluster keep-alive mechanism (a.k.a. heartbeat) was generally micro-optimized and improved. In particular, when and if failing to ping primary via intra-cluster control, an AIS node will now utilize its public network, if available.
And vice versa.
As an aside, AIS does not require provisioning 3 different networks at deployment time. This has always been and remains a recommended option. But our experience running Kubernetes clusters in production environments proves that it is, well, highly recommended.
1.5 Rebalance; Erasure Coding: Intra-Cluster streams
Needless to say, erasure coding produces a lot of in-cluster traffic. For all those erasure-coded slice-sending-receiving transactions, AIS targets establish long-living peer-to-peer connections dubbed streams.
Long story short, any operation on an erasure-coded bucket requires streams. But there's also the motivation not to keep those streams open when there's no erasure coding. The associated overhead (expectedly) grows proportionally with the size of the cluster.
In AIS v3.24, we solve this problem, or part of this problem, by piggybacking on keep-alive messages that provide timely updates. Closing EC streams is a lazy process that may take several extra minutes, which is still preferable given that AIS clusters may run for days and weeks at a time with no EC traffic at all.
1.6 List Virtual Directories
Unlike hierarchical POSIX, object storage is flat, treating forward slash ('/') in object names as simply another symbol.
But that's not the entire truth. The other part of it is that users may want to operate on (i.e., list, load, shuffle, copy, transform, etc.) a subset of objects in a dataset that, for lack of a better word, looks exactly like a directory.
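To make the idea of a virtual directory concrete, here is a minimal sketch of a non-recursive listing over a flat namespace. The function list_non_recursive is hypothetical and only illustrates the concept; it is not how aistore implements list-objects.

```python
def list_non_recursive(names, prefix=""):
    """Split flat object names under `prefix` into virtual subdirectories and direct objects."""
    objects, dirs = [], set()
    for name in names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition("/")
        if sep:
            # a further '/' means the object lives in a nested virtual directory
            dirs.add(prefix + head + "/")
        else:
            objects.append(name)
    return sorted(dirs), sorted(objects)


names = ["a/b/1.jpg", "a/b/2.jpg", "a/c/3.jpg", "a/readme.txt", "x.bin"]
print(list_non_recursive(names, "a/"))
# (['a/b/', 'a/c/'], ['a/readme.txt'])
```

The storage itself stays flat; the "directories" exist only as shared name prefixes computed at listing time.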
For details, please refer to:
1.7 API changes; Config changes
Including:
- "[API change] show TLS certificate details; add top-level 'ais tls' command" 091f7b0
- "[API change]: extend HEAD(object) to check remote metadata" c1004dd
- "[config change]: FSHC v2: track and handle total number of soft errors" a2d04da
- and more
1.8 Performance Optimization; Bug fixes; Improvements
Including:
- "new RMD not to trigger rebalance when disabled in the config" 550cade20
- "prefetch/copy/transform: number of concurrent workers" a5a30247d, 8aa832619
- "intra-cluster notifications: reduce locking, mem allocations" b7965b7be
- and much more
2. Initial Sharding (ishard); Distributed Shuffle (dsort)
Initial Sharding utility (ishard) is intended to create well-formed WebDataset-formatted shards from the original dataset.
Goes without saying: original ML datasets will have an arbitrary structure, a massive number of small files and/or very large files, and deeply nested directories. Notwithstanding, there's almost always the need to batch associated files (that constitute computable samples) together and maybe pre-shuffle them for immediate consumption by a model.
Hence, ishard:
3. Authentication; Access Control
Other than code improvements and micro-optimizations (as in continuous refactoring) of the AuthN codebase, the most notable updates also include:
topic | what changed |
---|---|
CLI | improved token handling; user-friendly (and improved) error management; easy-to-use configuration that entails admin credentials, secret keys, and tokens |
Configuration | notable (and related) environment variables: AIS_AUTHN_SECRET_KEY, AIS_AUTHN_SU_NAME, AIS_AUTHN_SU_PASS, and AIS_AUTHN_TOKEN |
AuthN container image (new) | tailored specifically for Kubernetes deployments - for seamless integration and easy setup in K8s environments |
4. CLI
Usability improvements across the board, including:
- "add 'ais tls validate-certificates' command" 0a2f25c
- "'ais put --retries ' with increasing timeout, if need be" 99b7a96
- "copy/transform: add '--num-workers' (number of concurrent workers) option" 2414c68
- "extend 'show cluster' - add 'alert' column" 40d6580df
- "show configured backend providers" ba492a1
- "per-backend...
3.23
Version 3.23 arrives three months after the previous one. In addition to datapath optimizations and bug fixes, most of the other changes are enumerated in the following
Table of Contents
- List Objects; Bucket Inventory
- Selecting Primary at startup; Restarting cluster when node IPs change (K8s)
- S3 (backend, frontend)
- BLOBs
- Mountpath labels
- Reading shards; Reading from shards
See also:
List Objects; Bucket Inventory
- S3 backend: S3 ListObjectsV2 may return a directory !6672
- list very large buckets using bucket inventory !6682, !6684, !6686, !6689, !6692
- list-objects: optimize for prefix; add 'dont-optimize' feature flag !6685
- list very large buckets using bucket inventory (major update, API changes) !6695, !6698
- list very large buckets using bucket inventory !6704
- list-objects: support non-recursive operation (new) !6711, !6712
- refactor and code-generate (message pack) list-objects results !6714
- bucket inventory; generic no-recursion helper !6715
- bucket inventory: support arbitrary schema; add validation !6769
- list-objects: micro-optimize setting custom properties of remote objects !6770
- list very large buckets using bucket inventory !6775, !6776, !6777, !6778
- list very large buckets using bucket inventory (major) !6810, !6811
- list very large buckets using bucket inventory !6815
- list-objects: skip virtual directories !6835
- list very large buckets using bucket inventory !6847, !6851, !6853
Selecting Primary at startup; Restarting cluster when node IPs change (K8s)
- primary role: add 'is-secondary' environment; precedence !6746
- 'original' & 'discovery' URLs (major) !6747, !6749
- cluster config: new convention for primary URL; role of the primary during: initial deployment, cluster restart !6752, !6755
- cluster restart with simultaneous change of primary (major) !6758, !6760, !6761
- primary startup: always update node net-infos !6762
- all proxies to store RMD (previously, only primary) !6764
- node join: remove duplicate IP check (is redundant) !6783
- K8s startup with proxies change their network infos !6785
- primary startup: initial version of the cluster map !6787
- non-primary startup: retry and refactor; factor in !6788
- K8s: primary startup when net-infos change !6789
S3 (backend, frontend)
- backend put-object interface; presigned S3 (refactoring & cleanup) !6662
- default AWS region (cleanup) !6679
- s3cmd: add negative testing !6681
- backend: S3 ListObjectsV2 may return a directory !6672
- backend: consolidate environment and defaults !6678
- backend: retain S3-specific error code !6688, !6691
- move presigned URLs code to backend package !6801
- multipart upload: read and send next part in parallel !6803
- backend: refactor and simplify !6819
- new feature flag to enable (older) path-style addressing !6821
BLOBs
- config change: assorted feature flags now have bucket scope (major) !6664, !6666
- Python: blob-download API !6687
- Python: get and prefetch with blob-download !6708
- blob downloader (minor ref) !6793
- blob-downloader: finalize control structures; refactor !6812
- GET via blob-download !6873
- multiple blob-download jobs (fixes) !6876
- prefetch via blob-downloader !6882
Mountpath labels
- override-config, fspaths section (minor ref) !6718
- config change, API change: mountpath labels (major) !6721, !6722, !6725, !6726, !6733, !6734, !6735, !6736, !6738
- backward compatibility v3.22 and prior; bump CLI version !6740, !6742
- log: mountpath labels vs shared filesystems; memory pressure !6744
Reading shards; Reading from shards
- reading (from) shards: add read-until, read-one, and read-regex methods !6823
- reading shards: read-until, read-one, read-regex !6824
- WebDataset: add wds-key; add comments !6826
- reading .TAR, .TGZ, etc. formatted objects (a.k.a. shards) - multiple selection !6827
- GET request to select multiple archived files (feature) !6859
- GET multiple archived files in one shot (feature) !6861, !6862, !6863, !6864, !6866
- Python: GET multiple files from an archive (shard) !6860
Core
- backend put-object interface (refactoring & cleanup) !6662
- get-stats API vs attach/detach mountpaths !6669
- unwrap URL errors; remove mux.unhandle; CLI: more tips !6673
- removing a node from a 2-node cluster (in re: rebalance) !6674
- POST /v1/buckets handler: add one more check to URI validation !6690
- last byte (minor ref) !6694
- project layout: move and consolidate all scripts !6699
- extend RMD to reinforce cluster integrity checking !6702
- micro-optimize fast-path fqn parsing !6707
- continued refactoring !6709, !6710
- security dependabot: fix #15 and #16 !6713
- aisnode: remove logs from conf !6727
- extract and unify cluster information; add flags !6741
- copy shared FS capacity; color high/low usage pct; up cli !6743
- node flags in a cluster map vs (node | cluster) restart; node equality !6765
- receive cluster-level metadata (minor ref) !6766
- dsort: write compressed tar !6771
- dsort: read compressed tar; add linter !6772
- backend: uniform naming, common base !6774
- remove AIS_IS_PRIMARY environment (is obsolete) !6781
- nlog: allow setting logging to STDERR flag in config !6791
- feature flags fsync-put will now have (also) bucket scope !6804
- cold GET: write locally and transmit in parallel (new) !6805, !6807
- move atomic 'stopping' (ref) !6817
- aisloader: add 's3-use-path-style' command line, to use older path-style addressing !6822
- cold GET (fast): fclose and check !6825
- speed-up batch jobs (prefetch, archive, copy/transform, multi-object evict/delete) !6830
- LOM: add open-file method !6836
- nlog: while stopping !6837
- multi-object TCB/TCO; not in-cluster objects; multi-page fix !6840, !6842
- xaction registry: when hk call is premature !6843
- add metrics: get-size and put-size !6849
- memsys/SGL: add compliant 'write-to' interface impl.; amend fast/simplified 'write-to' !6854, !6856, !6857
- stats and metrics: report cumulative GET and PUT sizes in bytes !6855
- datapath query parameters: preparse, reduce size !6858
- stats: fix Prometheus label for total size !6871
- imports (ref) !6878
- move and rename 'node-state-info' and 'node-state-flags' (ref) !6879
- new metric: node-state-flags (bitwise, gauge) !6880
- add management alerts: out-of-space & low-capacity (major) !6883
- add management alerts: out-of-memory & low-on-memory !6885
- microbench: use math/rand/v2 !6886
- transition to Go 1.22 math/rand/v2; crypto/rand reader !6887
- dsort test: use rand.v2 !6888
- transition to Go 1.22 math/rand/v2; add seeded-reader !6890
- cleanup 'cos/math' (ref) !6891
- tests: fix prefix-test for remote ais cluster !6893
CLI
- 'more' fixes !6665
- more tips !6673
- warn when switching cluster to operate in reverse proxy mode !6703
- show feature flags symbolically !6705
- backward compatibility v3.22 and prior; bump CLI version !6740
- 'ais show cluster' to highlight nodes that are low on memory !6745
- 'ls' and 'show object' to support size units (raw, SI, IEC) !6795
- progress bar decorators; elapsed time !6797
- fix used and available capacity !6806
- fix 'show throughput' to not show throughput when !6813
- quiet 'show cluster', 'show performance'; misplaced flags !6814
- 'ais ls' help and inline examples; native GET: add query params !6816
- copying remote objects; progress bar; usability !6839
- extend 'ais gen-shards' to generate WD-formatted shards !6865
- add '--count-and-time-only' option !6868, !6869
- max-pages and limit !6870
- stopping jobs !6875
Python
- add test for invalid bucket name !6683
- blob-download API !6687
- add timeout option to client + version bump !6693
- get and prefetch with blob-download !6708
- tests constants and refactoring !6717
- prefetch blob-download tests !6719
- cluster performance API !6724
- remote enabled tests cleanup refactored !6731
- add missing job tests !6737
- fix formatting issues !6753
- PyTorch: add Iterable-style datasets for AIS Backend !6759
- writer for image dataset !6767
- AISSource: list all objects !6779
- add example for dataset_writer !6794
- add tests for dataset writer !6799
- log missing attributes in write_dataset !6820
- update docs !6844
- add MultiShard Stream to PyTorch !6852
- GET multiple files from an archive !6860
Build, CI
- transition to Go 1.22 !6675
- upgrade OSS packages !6680, !6750, !6768
- lint: upgrade; Go 1.22 int range !6728, !6732
- CI: MacOS fix !6729
- remove HDFS backend !6773
- upgrade golang.org/x/net !6831
- lint; min/max shadow !6850
- build: transition to Go 1.22 math/rand/v2 !6892
- CI: maintenance !6838
- lint: golangci-lint !6894
Documentation
- docs: fix https getting-started !6668
- docs: amend getting started !6670
- docs: fix the broken table of contents link !6677
- blog: Very large !6874
3.22
Highlights
- Blob downloader
- Multi-homing: support multiple user-facing network interfaces
- Versioning and remote sync
  - execute in presence of out-of-band changes/deletions
  - support latest version: the capability to check in-cluster metadata and, possibly, GET, download, prefetch, and/or copy the latest remote (object) version
  - remote sync: same as above, plus: remove in-cluster object if its remote counterpart is not present (any longer)
  - both latest version and remote sync are supported in a variety of APIs (including GET primitive) and tools (CLI, aisloader)
- Intra-cluster n-way mirroring
  - to withstand a loss of node(s) erasure coding is now optional
- AWS S3 (frontend) API
  - multipart V2 (major upgrade); other productization
  - listing very large S3 datasets
  - support presigned S3 requests (beta)
- List objects (job): show diff: in-cluster vs. remote
- Prefetch (job): V2 (major upgrade)
- Copy/transform (jobs): V2 (major upgrade)
- AWS S3: migrate AWS backend to AWS SDK V2
- Azure Blob Storage: transition to latest stable native SDK
See also: aistore features and brief overview.
Core
- NVMe multipathing: pick alternative block-stats location !6432
- rotate logs; remove redundant interfaces, other refactoring !6433
- cold GET: add stats !6435
- http(s) clients: unify naming, construction; reduce code !6438, !6439
- don't escape URL paths; up cli !6441
- dsort: sort records (minor) !6445
- core: micro-optimize copy-buffer !6447
- list-objects utilities and helpers; rerun list-objects code-gen: refactor and optimize; cleanup !6450, !6451
- intra-cluster transport: zero-copy header !6455
- Go API: (object, multi-object): ref !6456
- add 'read header timeout'; docs: aistore environment variables !6459
- core: support target multi-homing - comma-separated IPs (part one) !6464
- package 'ais': continued refactoring; up cli !6466
- support multiple user-facing network interfaces (multi-homing) !6467, !6468
- when setting backend two (or more) times in a row !6469
- core: (begin, abort, commit) job - corner cases !6470
- in-cluster K8s environment: prune and cleanup, comment, and document !6471
- multi-object PUT - variations !6473, !6474
- unify PUT and PROMOTE destination naming !6475
- APPEND (verb) to append if exists; amend metadata (major) !6476
- EC: refactor and simplify erasure-coding datapath; docs: remove all gitlab references !6477
- list-objects: enforce intra-cluster access, validate !6480
- EC: remove redundant state; simplify !6481
- Go API get-bmd; follow-up !6483
- EC: cleanup manager: remove rlock and unused map - micro-optimize !6490
- copy bucket: extend the command to sync remote bucket !6491
- extend 'copy bucket' to sync remote !6494, !6495, !6497, !6498, !6499
- don't compare checksums of different (checksum) types !6496
- when deleting non-present (remote) object !6502
- move transform/copy-bucket from 'mirror' package to 'xs' !6503
- don't create data mover in a single-node cluster !6504
- multi-object transform/copy (job): add missing cleanup !6506
- multi-object transform & copy !6507
- core: abort all (jobs) of a given kind; CLI 'ais stop'; strings: Damerau-Levenshtein !6508
- revamp target initialization !6509
- copy/transform remote, non-present !6510
- revamp target initialization !6512, !6513
- [API change] get latest version (feature) !6516
- amend Prefetch; flush atime cache when shutting down !6517
- amend metadata cache flushing logic (atime, prefetch, is-dirty) !6518
- core: remote reader to support 'latest version' !6519
- extend config ROM; follow-up !6520
- Prefetch v2 !6521
- backend error formatting; notification-listener name !6522
- [API change] Prefetch v2; multi-object operations !6523
- Prefetch v2; cold-get stats; put size !6524
- [config change] versioning vs remote version changed or deleted !6525, !6526
- add 'remote-deleted' stats counter; Prefetch: test more !6528
- AWS backend not-found; job status; other cleanup !6529
- core: refactor 'copy-object' interface, prep to sync remote => in-cluster !6531
- [Cluster Config change] versioning vs remote version: remote changed, deleted !6532
- copy/transform (bucket | multi-object); intra-cluster notifications !6533
- revise/simplify 'is-not-exist' check; ldp.reader to honor sync-remote option !6537
- pre-parse (log-modules, log-level); micro-optimize !6538
- amend error handling: not-found vs list iterator; OOS !6539
- jobs ("xactions"): add and log non-critical errors; join(error) and friends !6540
- [API change] list-objects to report 'version-changed' (new) !6541
- list-objects to report 'version-changed' (new) !6543, !6545
- list-objects to report: 'version-changed', 'deleted' !6546
- list-objects to support (in-cluster <=> remote) diff !6547, !6548
- copy/transform with an option to sync remote: prune destination !6549
- copy/transform --sync: add stress test, extract "pruning" logic !6550
- revise and refine object write transaction (OWT) !6554, !6555
- Go API: amend 'wait-idle' helper method !6558
- copy/transform '--sync': use probabilistic filtering !6559
- refactor list-range-prefix iterator !6560
- multi-object copy/transform with '--sync' option !6561
- S3 API (on the front): fix list-objects !6562, !6563
- multi-object copy/transform with '--sync' option !6564
- core: reset idle timer; xaction names (micro-optimizations) !6565
- core: ETag in response headers !6569
- S3 API (frontend): validate object names; multipart pathnames !6570
- copy/transform with '--sync' option: add scripted test !6571, !6573
- backend: special case to return 404 instead of 403 !6575
- productize Azure backend !6576, !6578, !6580
- S3 multipart: write-through all parts !6585
- multipart upload: write-through all parts !6586
- multipart upload: add extended error message; add stress test !6587
- all supported backends: revisit range read (make it consistent across) !6589
- introduce blob downloader (new) !6592
- xaction (job) descriptor: remove unused specifiers !6593
- blob downloader: add dedicated (non-generic) control path !6595
- blob downloader (new) !6596, !6599, !6603
- multipart upload: fix s3cmd to run elsewhere !6600, !6601
- blob downloader (new) !6605, !6606, !6608
- blob downloader (new); remote AIS cluster !6613
- silent HEAD(bucket) !6614
- leverage erasure coding to provide intra-cluster mirroring (new) !6615, !6616
- blob downloader (new) !6618
- S3 (frontend): support presigned S3 requests (new) !6621
- intra-cluster mirroring: add integration test (no limit) !6622
- blob downloader (new) !6628, !6629, !6631, !6632, !6633, !6639
- add target's get-cold-blob interface; refactoring !6634
- AWS backend: nil client !6636
- Prefetch via blob-downloader: add 'blob-threshold' option !6637, !6638
- blob-downloader: user abort; expected checksum !6646
- Azure: ETag as object version; build !6647
- Azure: transition from preview to stable 1.x (major) !6648
- AWS backend: use sync.Map instead !6649, !6651
- (AWS, GCP) backend: log extended error info; RC5 !6653
- S3: presigned S3 requests; bucket config: add max-page-size !6657
Python
- v1.4.17 release !6431
- add support for self-signed certificates with or without verification !6465
- add 'latest' flag for GET !6536
- latest flag for prefetch and copy !6542
- release 1.4.19 !6544
- stress test for copy w/ '--sync' !6552
- fix pylint to pass !6556
- test multi-object copy with '--sync' flag !6567
- fix black formatter issues in github CI !6582
- github-CI lint - follow up !6583
- support range read (offset, length) !6588
- update common requirements !6609
- bump SDK version !6610
- lint: add more !6454
Bench
- aisloader-composer: install docker alongside latest cri-o on CentOS !6436
- aisloader-composer: fix install-docker and update OCI inventory !6446, !6449
- aisloader-composer: update OCI inventory; avoid using reserved variables in playbooks !6452
- aisloader-composer: update dashboard with k8s only networking visualization !6453
- aisloader: support latest-version !6581
- aisloader: add '--cached' flag !6623
Build, CI
- refactor common 'k8s' package; up cli mod; docs !6434
- build/minikube: skip making cli !6437
- gitlab-CI: scheduled pipeline changes !6442
- upgrade OSS packages !6443
- lint: enable gocritic "huge-param" !6457
- lint: add gosec linter !6462
- gitlab: add etl label & rule !6488
- github-CI: publish pypi package for aistore !6492
- build: upgrade all minors !6501
- rename 'cluster' package !6514
- 'api' package not to import 'core' !6515
- tests, tests, and more tests !6530
- CI: fix HDFS docker image !6566
- CI: remove HDFS build and tests !6572
- deployment: add jq to init container for parsing JSON in Bash scripts !6577
- CI: update tgt cnt for test short !6579
- gitlab CI: add short test for cloud providers and long test for Azure !6584
- build: new linter !6624
- add github issue templates !6630
- build: release candidate 4 (rc4) !6640
- build: rc7; fixes !6658
Documentation
- blog: aistore fast tier cache !6444
- blog: maximizing cluster bandwidth with multihoming !6642
- document aistore environment variables !6459
- aistore environment variables !6461
- packages (meta, ais, cmn); docs !6463
- in-cluster K8s environme...
3.21
Highlights
- cold GET: extract and micro-optimize the flow
- sync Cloud bucket
  - leverage validate-warm-GET bucket config, and
  - extend it to support non-versioned Cloud buckets, and
  - optionally, delete (remotely deleted) objects
- bucket sizing and counting:
  - support very large buckets that are not necessarily present in the cluster;
  - unify ais ls --summary and ais storage summary to utilize the same control message and flags
- list, summarize, and lookup the properties of remote buckets without adding them to cluster's BMD
- HTTPS:
  - support TLS configuration to authenticate clients
  - switch cluster from HTTP to HTTPS, and vice versa
- optimize metadata cache
- optimize capacity management
- bug fixes, performance improvements
Core
- set prime-time to amend local generation of globally unique IDs !6325
- multi-object (archive, copy, transform) jobs: transport endpoint !6326
- core: (maintenance, decommission, shutdown) transition w/ rebalancing !6327
- core: (maintenance, decommission, shutdown) transition w/ rebalancing !6328
- intra-cluster transport: make receive-side stats optional !6329
- intra-cluster transport: reduce receive side contention !6330
- fix channel full condition; rebalance-cluster; transport !6331
- feature flags: add limited-coexistence; transport: track closed endpoints !6334
- fix prime-time: add caller-is-primary; up cli module !6335
- switch existing cluster between HTTPS and HTTP !6336
- Go 1.21: use built-in min and max functions !6337
- list-objects (remote-bucket-and-only-remote-props); Go 1.21 clear built-in !6339
- Go 1.20: use typed atomic pointer, remove unsafe !6343
- core: assorted micro-optimizations; remove read locks !6346
- tweak multi-error join-err, remove error channel (minor) !6347
- [API change] capacity management !6348
- xxhash; field-align vol package !6349
- bucket: new-query help; silent GET; test tools !6350
- etl: adding fqn param to spec templates !6351
- low-level control structs: bucket, namespace !6352
- etl: Keras template fix !6355
- etl: fix hello-world ais-etl tests !6356
- core: don't recompute uname hash !6359
- repackage HRW methods !6361
- core: lom cache v2 (major update) !6362
- refactor: downloader's diff resolver; control plane (receive BMD) !6363
- core: lom metadata cache (cont-ed) !6365
- dsort: error handling, assorted cleanups, more scripted tests !6366
- core transactions: concurrency !6368
- downloader: throttle; wait !6369
- optimize cold GET !6370
- global rebalance: log; minor edits !6373
- core: update backend 'get-reader' API (all supported backends) !6374
- core: validate-warm-get to support non-versioned buckets, and more !6375
- validate-warm-get to support non-versioned buckets !6376
- [API change] silent HEAD(object) request !6378
- core: add load-unsafe (the faster way to load local metadata) !6382
- total disk size: compute at startup, recompute on change !6383
- [API change] new bucket summary; unify list-objects and summary !6384, !6386, !6387
- add config.Rom to consolidate assorted "read-mostly" config values; refactor and unify !6388
- [API change] new bucket summary (major update) !6390
- mountpath jogger: support bucket query !6392
- backend providers: do not include (checksum, version) if not asked to !6394
- python: updated bucket info API !6395
- feature flags: don't-add-remote & don't-head-remote; log: add s3 module; verbosity; !6398
- support listing remote buckets without adding them to cluster's BMD !6399
- concurrent HEAD(object) vs evict/create bucket - fix the race !6400
- [API change] list and summarize remote buckets without adding remote buckets to cluster's BMD !6401
- datapath query (dpq) !6402
- Go-based API: response header to error message !6403
- [API change] new bucket summary !6405, !6406
- downloader: streamline and cleanup initialization sequence !6409
- HTTPS: support TLS configuration !6410, !6411, !6412, !6413, !6414, !6415, !6416
- assorted minor fixes !6417, !6418
- core: cold GET: fast path & slow path !6419
- cluster configuration: flip validate-cold-get !6420
- downloader (major update); [API change]: xaction registry !6422
- validate-warm-get: add scripted test utilizing remote ais cluster !6423
- core: cold GET: fast path & slow path !6424, !6427
- feature flags: add disable-fast-cold-get; show performance latency; up cli module !6425
- refactor ais/utils !6429
Bench: aisloader and aisloader-composer
- skip list objects for 100% put load !6332
- composer: add playbook and script for initial aisloader copy !6333
- composer: add support for aisloader --filelist option !6345
- default value for duration should be infinite if num-epochs value is defined !6353
- composer: add epochs option for GET workloads !6354
- composer: add cluster name prefix to netdata sources for easier filtering !6357
- new bucket not to be listed; usability !6358
CLI
- typed `does-not-exist` error; misc !6358
- always print `dsort` job description !6367
- `show cluster` to report total num disks !6371
- `show performance`: usability fixes, improvements !6379
- `show performance` not to filter regex-selected zero columns !6380
- attempt to copy/transform an empty remote bucket !6393
- new bucket summary; evict multiple buckets in one shot; pretty print !6396, !6397
- `ais show bucket` with an option to add remote bucket to cluster's BMD (effectively, create bucket) !6404
- `ais search`: CLI command search results to include idiomatic extensions !6428
Build, CI, Deployment
- tests: upon node shutdown: wait for the node to stop (tcp) listening !6338
- CI: add gather-logs template for K8s tests !6340
- deploy: ais with HTTPS in minikube !6364
- build: bump `urllib` version !6372
- tests: `validate-warm-get` (scripted) !6423
- K8s playbooks: update kill `aisloader` command !6385
- docs: validate-warm-get; assorted !6377
- docs: add performance.md; inline help; rm all-columns flag (redundant) !6381
- build: upgrade all minors !6389
- CI: add `checkmarx` scan !6391
- build: upgrade `golangci-lint`, add linters !6407, !6408
3.20
Core
- tweak stop-maintenance logic; rebalance: cleanup log messages; assorted minor fixes !6288
- do not timestamp `err-aborted` message !6290
- [API change] dsort: remove extended metrics; add new counters; revise and refactor !6297
- list-objects; house-keeper; `aisloader`, logger (assorted fixes) !6298
- core stats: remove mutex and work channel - speed up !6299
- slab allocator: remove stats mutex, do not sort !6300
- consolidate and revise OOM handling !6301
- ETL: require admin access to create & delete; add feature flag !6302
- remove unused heartbeat tracker w/ minor ref !6308
- reimplement keep-alive mechanism (major) !6309
- keep-alive v2 (major update) !6312
- keep-alive v2: remove timeout stats (control structure and code) !6317
- keep-alive v2: add fast path !6320
- micro-optimize get-all-running (jobs); atomic heard-from/timed-out !6321
- node-restarted: remove 'lsof', use net dialer; fix node-decommissioning tests !6322
Tools and tests
- CI: update fspath (aka mountpath) config for minikube-based aistore deployments !6289
- `aisloader`: list and read s3 buckets directly !6291
- `aisloader`: list, read, and write s3 buckets directly !6292
- tests: K8s long tests (EchoGolang) fix !6293
- `aisloader`: fix cleanup option for s3 bucket benchmarks !6294
- `aisloader`: reimplement direct get from s3 - use SDK !6295
- `aisloader`: show progress when listing s3 directly !6296
- CLI: add show details param to etl !6304
- tools: add check for ais etl deployment !6305
- tools: add `ETL_NAME` var for CLI tests !6310
- `aisloader-composer`: add playbook and script for clearing Linux Page Cache on all AIS targets !6311
- `aisloader-composer`: add playbook for copying aws credentials !6314
- tools: update check for aistore Kubernetes deployment !6315
- CI: update github action version (all modules) !6316
- CLI/ETL: support enumerated `arg-type` !6287, !6323
Build
- upgrade all OSS packages (minor versions) !6313
- transition to Go 1.21 !6318
3.19
Core
- [API change] archive and download logs (feature) !6172, !6175
- [API change] dsort: extend input format !6181
- [API change] dsort spec; CLI: print job spec !6204
- [API change] revise request spec (major upd) !6217
- [API change] dsort: is now 'xaction' as well !6253
- (downloader, dsort, ETL): disallow to run when out of space !6235
- handle "DNS lookup fail" as one of the unreachable err types; nlog flush-exit !6164
- when electing new primary; when joining nodes at startup !6165
- k8s: Change prod k8s and docker default to not log all to stderr !6166
- revise GFN !6167
- stats runner is now responsible to periodically flush logs !6170
- core: fail user attempt to abort global rebalance when !6184
- new Go API; assorted fixes !6189
- metasync BMD; up modules !6190
- downloader: return not-found when not found !6196
- start using scripted integration tests; CLI: 'dsort src dst spec' !6198
- support S3 AWS profiles with alternative creds (feature) !6214
- core: state transition => rebalance => (point of no return) !6216
- amend low-level Go API check-response routine; add error type-code !6228, !6229
- control plane: deserialize original error from call result !6230
- xactions: when checking inactivity ("is idle") !6242 !6243
- primary readiness vs cluster shutdown !6244
- Go API: wait for xaction-related conditions !6245
- assorted tuneups: space cleanup; housekeeping (HK) callback; log !6246
- access control: when copying/transforming/dsorting to non-existing 'ais://' destination !6255
- core: a call to update stats should never block !6257
- core stats: add fast counters !6258 !6259 !6261
- sparsify latency stats !6260
- ETL: refactor and cleanup construction !6267
- deploy/dev: updated minikube scripts !6272
- new option to add Cloud bucket to aistore without checking accessibility !6275, !6277
- un-throttle PUT mirroring; assorted changes !6278
- feature: local generation of global (job) IDs !6280 !6282
Performance
- Add distributed loader scripts and playbooks for using aisloader with multiple hosts !6156
- pyaisloader: usability improvements !6215
- Update Grafana dashboard to include latency statistics !6249
- Reorganize benchmarks and related tools !6254
- aisloader: no need to call `rand` for 100% or 50% read/write workloads !6256
- aisloader-composer: add dashboard for DC network and disk !6266
- aisloader: add an option to randomize gateways !6279
- aisloader-composer: fix output files for GET bench !6283
Python
- sdk: update ETL templates (docker migration) !6168
- sdk: Release version 1.4.1 !6169
- sdk: ETL templates (compress + ffmpeg decode) !6185
- sdk: ETL templates (imagepullpolicy as always) !6191
- sdk: adding keras_transform template !6200
- sdk: ETL templates fix !6201
- sdk: ETL templates (ffmpeg decode transformer) !6205
- sdk: compress ETL template (updated usage) !6211
- sdk: torchvision sample transformer ETL template !6221
- sdk: fix comments (minor) !6240
- sdk: update version !6248
- sdk: increase timeout for torchvision transformer template (large image) !6252
- sdk: updated torchvision transform ETL !6262
- sdk: update dsort job info query and related tests !6265
- sdk: switch ETL init code 'transform_url' boolean flag to 'arg_type' string !6269
- docs: update ETL dev deployment for macOS !6163
- ETL: keras template minor fix !6213
- ETL: remove incorrect reference !6268
- ETL: add 'arg-type=FQN' (new) !6271
Datasets (resize, resort, and shuffle)
- [API change] dsort: extend input format !6181
- dsort input format: iterate list, iterate range !6186 !6187
- start using scripted integration tests; CLI: 'dsort src dst spec' !6198
- add test scripts; memsys: init gmm only once !6192
- refactoring and renaming !6193
- move/consolidate error types; continued refactoring !6202
- Go API change; add dsort/api.go; CLI: print job spec !6203
- [API change]: dsort spec; CLI: print job spec !6204
- CLI/dsort: extend inline help, pretty-print job spec; update docs !6206
- dsort: continued refactoring (major update) !6208, !6209, !6210
- free sgl on error; feature: any extension !6212
- [API change] revise request spec (major upd) !6217
- create destination on the fly !6218
- record content path to retain full shard name !6219
- output shard size estimation (rewrite) !6223
- add is-compressed; refactor dsort-mem !6227
- compressible shards (major) !6231
- output ext; rcb buffer; fixes !6232
- duplicated records (full coverage & stress); fixes !6233
- fix tests; add stress !6234
- rename subpackage, fix comments, refactor !6237
- remove dsort-context, rewrite initialization !6238
- static/stateless shard readers/writers; refactor and simplify !6239
- two goroutines per each shard-distributing request !6241
- [API change]: dsort: is now 'xaction' as well !6253
- dsort: support generic abort-xaction API !6264
- no need to block when sending shard records !6286
CLI
- archive and download logs (feature) !6180
- clarify "copying" vs "transforming" and "cached" vs "present" !6183
- start using scripted integration tests; CLI: 'dsort src dst spec' !6198
- dsort: extend inline help, pretty-print job spec; update docs !6206
- dsort: Go API change; add dsort/api.go; CLI: print job spec !6203
- 'archive get' is now a shortcut (an alias) !6222
Build, test, and tools
- add test scripts; memsys: init gmm only once !6192
- tests and tools: cleanup around stop-maintenance, wait-rebalance !6194
- deployment: update local deployment script to allow target-only deployment with defined primary host !6195
- deployment: optionally, skip deploying primary proxy !6197
- start using scripted integration tests; CLI: 'dsort src dst spec' !6198
- tools/generate shards: optimize buffer allocation !6224
- deploy/dev: Add ansible deployment scripts for deploying locally on multiple nodes !6199
- aistorage/CI docker image (lzma libraries) !6220
- tests: init with cleanup and without !6226
- CI: Retry stuck Python ETL tests in GitLab CI pipeline !6270
- remove `aisfs` (FUSE) !6273
- dev tools: readers; handle read from corrupted arch or non-arch !6250
Documentation
- update getting started !6161
- updated python sdk readme !6162
- update ETL dev deployment for macOS !6163
- update documentation with recent ETL changes !6173
- CLI/dsort: extend inline help, pretty-print job spec; update docs !6206
3.18
Core
- add `htext` to track restarted state; target run and misc !5966
- cluster rebalance (scenarios) !5969, !5971, !5973, !5974, !5975, !5977, !5980, !5983, !5986, !5987, !5989, !5991, !5992, !5993, !5995, !6002
- add 'cluster-ready' helper; use it to reinforce !5976
- cleanup better when decommissioning; previous BMD at startup !5979
- fs: reliable remove-all !5981, !5982, !5984
- yet another buf pool !5985
- do not modify cluster map when starting up; always skip logging idle disks !5988
- rebalance (scenarios, major update) !5992
- [API change]: core: rebalance (scenarios) !5993
- rebalance (major update); when receiving new cluster map !5995
- up modules; handle housekeeper registration race !5994
- 'not present in the loaded cluster map' and similar startup validation !5996
- shutdown or decommission a node that's already in maintenance !5998
- transport: never establish a streaming connection to the peer that's in maintenance (or will be) !5999
- `metasync` just-in-time; assorted refactoring (minor) !6001
- maintenance mode: pre & post vs keepalive & metasync; CLI: more colored cues !6004
- shutdown is also 'maintenance'; docs: adding-removing intro !6005
- add `meta` package !6006, !6007
- ETL: add arg-type parameter when initializing with code !6008
- archive v2: support empty template (tar entire bucket); atime !6013
- keep poi.atime in nanoseconds !6015
- archive v2: append to arch; refactoring !6017
- archive v2: up modules !6018
- archive v2: part four (major) !6019
- archive v2: detect an empty tar when appending, and handle !6020
- archive v2: part six !6022
- archive v2: mime detection !6024
- archive v2: extend 'append-to-arch' to support tar.gz !6025, !6027
- archive v2: tar and tgz append; fixes !6028
- log filenames; overlapping run vs node-restarted !6029
- archive v2: multi-object append-to-arch !6030
- archive v2: multi-object append-to-arch !6033
- archive v2: multi-object append-to-arch !6034
- cleanup disk utils (minor) !6035
- ios startup: run the command only once !6036
- hide AuthN secret !6038
- archive v2: append to zip !6041
- archive v2: append to msgpack !6043
- add `cmn/archive` package !6044
- archive v2: write and copy via new 'cmn/archive' !6045
- archive v2: append via new 'cmn/archive' !6046
- [API change] archive v2: MIME vs file extensions !6047
- ios: cleanup lsblk cache; CLI: refactor get-node-arg; up modules !6048
- archive v2: remove msgpack; refactor !6051
- archive v2: add '.tar.lz4' serialization (new) !6053
- archive v2: tar.lz4 cont-d !6054
- archive v2: lz4 features; checksum !6055
- s3 compat: run E2E tests with correct HTTP/HTTPS mode !6057
- [API change]: append to arch if exists !6062
- [API change] append to arch if doesn't exist; CLI cont-d !6064
- checksumming and buffering vs reader-from !6074
- core: content-length universally; revise write-json and friends !6075, !6076
- archive v2: [API change] put (files, dirs) with an option to append !6081, !6082
- archive v2: quiesce faster, refine continue-on-error logic !6083
- core: double-check target-in-maintenance, quiesce faster !6084
- archive v2: finalize cmn/archive package !6085
- archive v2: finalize cmn/archive package !6086
- log verbosity: core and modules !6087
- http client: disable compression; core: undefer & micro-optimize !6066
- append to (non-existing) arch: an option to create !6068
- mem-pool alloc/free symmetry: copy/transform & archive !6069
- copy/transform, multi-archive: refactor Rx logic and error handling !6071
- log verbosity: core and modules; remote cluster !6089
- log verbosity: core and modules; remote cluster !6091
- ec: minor refactoring !6092
- archive v2: WD basename; get with extraction; Range !6093
- archive v2: WD basename; get with extraction; Range !6094
- archive v2: tools/archive utils !6095
- archive v2: tools/archive utils !6096
- compile-out asserts; super-verbose logging; log module 'mirror' (ref) !6097
- log verbosity at runtime; log modules; remove glog; unify (major update) !6099
- fields iterator; size converter; log rotation (fixes) !6101
- [API change] get-bucket-info to count remote objects !6102
- [API change] get-bucket-info to count remote objects !6103
- [API change] get-bucket-info (part three); docs and CLI !6106
- list-objects vs buckets: revise and refactor, add validation, clarify !6107
- list-objects: introduce optional args (ref, cleanup) !6108
- list-objects: mem-pool msgpack buffers !6109
- kvdb: remove redundant err-not-found; amend dsort, downloader, authn !6110
- x-lso must idle more time !6111
- log modules (part three) !6112
- add `nlog` (new logger) !6113, !6122, !6124
- do not log perf counters when there's no change; sort the names !6119
- fix disk usage call for clusters on mac OS !6121
- log etl events: spec parsed, pod ready, hpull/hpush !6125
- cleanup `fs.PathError`; add object name validation !6127
- extend dsort to support `.tar.lz4` !6128
- aisloader: cleanup output when running with json option !6129
- xactions to return extended error info !6130
- xactions: add and return errors (major upd) !6131, !6132
- xactions: error handling cont-d; bucket-scope multi-object op-s !6133
- jobs error handling and reporting; tests !6134, !6135, !6136
- use Go 1.20 `join-err`; reduce default pre-election interval !6141
- http request multiplexer !6142
- dsort: use common cmn/archive pkg (major upd) !6143
- archive v2: tar formats (USTAR, etc.) !6153
- archive v2: extend list-objects and GET to operate inside archives !6155
- archive v2: extend list-objects and GET to operate inside archives !6157
- control plane: consistently propagate cluster map !6158
- old EC metadata shouldn't terminate cluster-wide rebalance !6159
- core: when primary goes down it notifies !6160
CLI
- performance tabs: always show 'cluster idle' if idle !5997
- stop-maintenance w/ no rebalance; stats idle-ness (minor) !6000
- add `transform-url` flag (used when initializing ETL with code) !6010
- misc. improvements !6011
- `mountpath` completions; disable/detach; minor ref !6056
- de-spaghettify put handler !6059
- archive multi-object (cont-d) !6060
- archive v2: CLI put, append, alias, docs !6072
- move 'gen-shards' and extend it to support all formats !6073
- archive v2: CLI: APPEND-to-arch is now on-par with multi-PUT !6077
- archive v2: CLI: `dry-run` option is back !6078
- archive v2: CLI: destination naming; dry-run; tips and examples !6079
- list-objects: extended help, template with no ranges !6104
- error and warning verbosity (major update) !6105
- log modules (part two); CLI archive: pre-parse and add tip, advice !6100
- archive help !6126
- archive v2: CLI and docs !6149
- archive v2: multiple CLI updates and improvements; is-archive bit !6150
- remove k8s apimachinery pkg (minor) !6151
Python
- sdk/python: Add option to pre-import modules when initializing ETL with code !5963
- sdk/python: Refactor ETL to provide name once on object creation !5968
- sdk/python: Release version 1.2.0 !5970
- sdk/python: Fix python package links !5978
- sdk/python: improve SDK testing, fix bucket eviction keep_md parameter !5990
- sdk/python: Release version 1.2.2 !6012
- sdk/python: Add dSort support in SDK !6032
- sdk/python: Add wait to dsort abort test !6037
- sdk/python: Fix dsort abort test on faster systems !6039
- sdk/python: Add get_url option to bucket, object group, and objects !6042
- sdk/python: Add interface for AIS sources that can be accessed via list of URLs !6052
- PyTorch: Add new PyTorch DataPipe to iterate over URLs from various AIS sources !6058
- update lint-tests to work with Python 3.11 (pytorch unsupported) !6061
- sdk/python: Include user agent in SDK requests !6067
- sdk/python: Use msgpack content type when listing objects in bucket !6070
- sdk/python: Add msgspec to pyproject dependencies !6088
- sdk/python: Release SDK version 1.3.0 !6090
- sdk/python: bucket summary + bucket info !6098
- sdk/python: bck info integration test fix !6115
- sdk/python: bck summary integration test fix !6117
- sdk/python: bck summ + bck info fixes (revert and cleanup) !6120
- sdk/python: pyaisloader !6123
- pyaisloader: bucket utils (minor fixes) !6139
- sdk/python: bucket summary and info (minor fixes) !6140
- `pyaisloader`: total size/count fix !6146
- sdk/python: Release version 1.4.0 !6152
Docs
- add lifecycle.md !6009
- WebDataset Blog Post part 1: Storing WebDataset in AIS !6014
- WebDataset Blog Post part 2: AIS ETL on WebDataset shards !6016
- update docs/tools and cmd/README !6050
- WebDataset blog post pt 3 -- PyTorch DataPipe with ETL !6065
- archive v2: CLI put, append, alias, docs !6072
- archive v2: CLI: destination naming; dry-run; tips and examples !6079
- archive v2: CLI and docs !6149
Build, CI
- add local build options to k8s scripts and fix local registry !6021
- build: upgrade all minors !6040
- add `pyaisloader` stage !6148
- k8s: fix deployment scripts for compatibility with the latest `aisnode` image !6154
3.17
Table of Contents
- CLI v1.2
- Python SDK v1.1.2
- S3 compatibility and Botocore
- API changes
- Tests and Documentation
- Core: bug fixes and improvements
- Build and Continuous Integration
- Extensions: Downloader, dSort, ETL
CLI
- show all jobs !5645
- start/stop job/xaction !5660
- refresh rate and countdown; long-running 'show job' and friends !5651
- 'show log node-name' to mimic 'tail -f' !5652, !5654
- add custom duration flag and logic !5655
- 'ais config (cluster|node|cli)', 'ais config reset', and friends !5656
- bucket completions !5657
- set-config to show all updates; tweak iter-fields reflection !5658
- 'show job' to aggregate all categories and support all selections !5661
- transition to using job display names (major) !5663
- start, stop, show jobs and xactions (cont-d) !5665
- amend and restructure jobs !5666
- running xactions (completions) !5672
- tweak config json printout; get-config from memory !5673
- update backend config !5674
- update backend config (part two) !5677
- add footnote, marshal message only once !5678
- remove `xaction` term and subcommand (everything is `job` now) !5692
- suggest (targets, proxies, nodes) !5694, !5696
- revise bash completions script !5697
- remove 'xaction' (term and subcommand) !5698, !5699, !5700
- 'show cluster': separate cluster nodes from all other (tab-tab) completions !5701
- consolidate and refactor cluster map access !5704
- tweak `ais create bucket --props` & `ais bucket props set` !5706
- extend 'job start' to support (resilver, copy-bucket, rename-bucket) !5715
- tweak listed props !5719
- remove (cleanup) download and dsort jobs !5721
- extend 'ais stop' to support --all|--regex !5722
- 'show job' verbose option; unify usage args; ref PUT/APPEND !5723
- rewrite command-not-found logic; add similar commands !5724
- `show jobs` (major) !5726, !5727
- bash autocomplete ordering improvements !5728
- improvements (usability) !5729
- add `bucket cp` alias !5730
- flag printable name; split 'show job' in parts; usability !5736
- further unify stopping, waiting-for, and showing jobs !5744
- revise & amend 'show rebalance' - all permutations !5761
- universal start-end formatting; template refactoring !5762
- jobs grouping by name and, within name, by UUID !5764
- complete `etl-name` transition !5767
- ETL tools, UUID (part one) !5745, !5746, !5749, !5753, !5754, !5763
- fix download/dsort progress !5769
- new table to show target statistics !5788
- 'ais show performance' (new) !5791, !5793, !5800, !5802, !5803, !5809, !5810, !5811, !5812, !5816
- IEC, SI, and raw (bytes, nanoseconds) formatting (major) !5820
- reduce code, simplify, cleanup !5821
- IEC, SI, and raw (bytes, nanoseconds) formatting (major) !5823
- disk stats: add average read/write sizes !5824
- amend existing mountpath tab and add a new one !5833, !5834
- expect node unreachable when iterating '--refresh' !5837
- assorted usability; add 'no-color' config !5839
- 'ais show performance': average (GET, PUT, etc.) sizes on the fly !5840
- support new API to reset stats !5841, !5843
- 'ais show performance': refactor throughput, add latency !5844
- 'ais show performance': finalize latency tab !5847
- 'ais show performance' cont-d !5848
- 'ais show performance': finalize top-level tab !5850
- 'ais show performance': add cluster-level throughput, beautify !5852
- 'ais show performance': alias 'stats' and remove older code !5853
- 'ais show performance': disk table v2 !5855
- 'ais show performance': finalize disk table !5857
- 'ais show performance': new mountpaths/disks/capacity table !5858, !5859
- 'ais show performance': finalize capacity table !5861
- refactor and cleanup multi-object put !5862
- multi-object PUT: source dir, list/range; matching pattern !5865
- fix concatenation logic, refactor progress bar !5867
- copy bucket: support progress bars (copied objects and size) !5870
- consistent timeout management !5871
- copy/transform a list or range of objects: add progress bar !5873
- copy/transform with progress bar: style, reuse !5874
- multi-object PUT !5876
- progress bar: all multi-object operations; universal 'wait-for' !5879
- PUT multi-object - all flavors !5880
- get multiple objects in one shot ("multi-object GET") !5884
- GET destination & assorted fixes !5882
- copy-bucket: prepend prefix, command helps, examples !5889, !5891
- more inline help !5892
- assorted improvements (minor) !5900
- fix downloading with progress bar enabled !5903
- how-to text: how to reconfigure remote ais cluster !5932
- add CLI compatibility warning (new CLI vs old cluster and vice versa) !5952
- cluster membership-changing operations (shutdown, decommission, et al): improve usability !5956
Python: SDK library and ETL
- Add unit tests for cluster class !5675
- Add unit test for api client class !5676
- Fix python ETL test workflow !5680
- Add unit tests for Xaction class !5681
- Add unit test for object class, fix object put data from a filepath !5682
- Add unit tests for ETL class !5683
- Add unit tests for sdk utils !5685
- Restructure python subdirectories and containment !5686
- Run python unit tests as part of default python test make option, include python ETL tests in all python test runs !5688
- Only run python ETL tests when python labels are added !5702
- update test utils to support running Python tests on Windows !5716
- Add multi-object functionality !5720
- Add tests and validation to object ranges, support leading zeros !5731
- Add unit test for object group class !5732
- Update documentation for python multi-object ops !5734
- Use aws bucket for python sdk CI tests to test caching functions !5739
- Add string template support to object groups !5741
- Refactor all references from xaction to job !5742
- Update ETL runtime defaults and add new python 3.11 option !5743
- Fix remote bucket tests to avoid collisions !5748
- Increment cloudpickle version and update github action to use 3.11 for tests !5756
- Fix python test dependencies !5780
- Refactor to simplify typing and tests !5789
- Bump python SDK to 1.0.5 !5790
- Update python sdk version !5792
- Update pytorch integration README with compatibility issue !5794
- Patch sdk to support torchdata integration !5795
- Bump aistore package version !5798
- Fix pylint and formatting !5799
- Add ObjAttr type for returning additional object metadata !5805
- Address lint warnings, general improvements !5808
- Add writer option to object get !5813
- Split README for different python projects !5814
- Add PROMOTE functionality to objects !5818
- Standardize pylint version, fix lint errors !5822
- Improve object put behavior and add directory put options !5826
- Use Pathlib over os.path !5827
- Refactor multi-file put to bucket class !5830
- Improve job interface and update promote options !5832
- Update python package build tools, increase version to 1.0.9 !5835
- Improve example documentation !5842
- Improve input validation !5846
- Set up proper logging, update constants !5849
- Add job wait for idle status and fix job bucket filter !5863
- Add multi-object copy !5866
- Fix remote test fixture !5877
- Add multi-object ETL !5878
- Release 1.0.10 !5881
- Expand multi-object examples !5883
- Improve usability !5888
- Follow-up for copy/transform prepend !5890
- Improve object interface !5893
- Add prefix_filter option to bucket copy !5899
- Add flags and target options to bucket list_objects methods !5901
- Release 1.1.0 !5910
- Release 1.1.1 !5911
- Improve python sdk examples !5913
- Add cluster list_jobs, require id for individual job status query !5916
- Add support for multi-object archive !5920
- Update bucket params to use Bucket object, support namespaces !5925
- Add prefix filter to bucket transform !5949
- Release 1.1.2 !5950
- PyTorch: add support for `etl-name` in `AISDataset` and `AISFileLoaderIterDataPipe` !5957
S3 compatibility and Botocore
- Add botocore monkey patch alongside python SDK !5684
- Move botocore and pytorch packages to python top-level, separate from sdk !5691
- Add s3 compatibility testing with boto3 !5703
- list-objects vs HEAD !5705
- compute multipart md5 and set etag !5708
- Improve s3 compat test documentation and update validated tests !5747
- Pass S3 delete a list of objects !5854
- fix infinite loop when listing objects; add bucket name to list object response !5864
- return bucket creation date in UTC !5869
- new flag `NoRecursion` to support S3 delimiter feature !5930
API changes and new APIs
- add API to query multiple xactions via any IC proxy !5670, !5671
- yet another API to query xactions (new) !5687
- list-objects API: tweak listed props !5718
- [API change] flatten `xaction-snap` control structure (major upd) !5740
- [API change] ETL: tools, UUID (part nine) !5754
- [API change] ETL: new query parameter to specify transform name !5755, !5765
- [API change] init/start ETL to return xaction ID !5768
- [API change] GET(object) !5770, !5772, !5773, !5775
- [API change] remove xaction 'query-msg' (deprecated) !5783, !5784, !5785
- get-object API: amend comments !5769
- refactor `api` package (major) !5776
- [new API] query metric names and kinds ('counter', 'latency', 'throughput', et al.) !5796
- [API change] get node status !5811
- [API change] core stats: consistency between stats-querying APIs !5817
- [API change] PUT(object) !5828
- [API change] amend capacity-disks-filesystems control !5831
- [API change] add API to reset stats !...
3.16.rc2
v1.3.16 v3.16.rc2