[feat] Add AimCallback for distributed runs using the Hugging Face API
A single aim.Run is initialized and managed by the main worker.
Auxiliary workers (the local_rank 0 workers hosted on other nodes) collect
their metrics and forward them to the main worker, which records
them in Aim.

Signed-off-by: Vassilis Vassiliadis <vassilis.vassiliadis@ibm.com>
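
The flow described in the commit message can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code added by this commit: `MetricEvent`, `MainWorker`, `auxiliary_worker`, and `run_demo` are hypothetical names, threads stand in for the per-node workers, and an in-process queue stands in for whatever transport the adapter actually uses between nodes.

```python
import queue
import threading
from dataclasses import dataclass, field


@dataclass
class MetricEvent:
    """One metric reading forwarded by a worker (hypothetical message shape)."""
    node_rank: int
    name: str
    value: float
    step: int


@dataclass
class MainWorker:
    """Stand-in for the main worker, the only one that owns the run."""
    inbox: queue.Queue = field(default_factory=queue.Queue)
    recorded: list = field(default_factory=list)

    def track(self, event):
        # Assumption: the real callback would write to its single aim.Run here.
        self.recorded.append(event)

    def drain(self, expected_senders):
        """Record events until every auxiliary worker has sent its end marker."""
        finished = 0
        while finished < expected_senders:
            event = self.inbox.get()
            if event is None:  # None marks "this sender is done"
                finished += 1
            else:
                self.track(event)


def auxiliary_worker(node_rank, inbox, losses):
    """Collect metrics locally and forward them instead of recording them."""
    for step, loss in enumerate(losses):
        inbox.put(MetricEvent(node_rank, "loss", loss, step))
    inbox.put(None)


def run_demo():
    main = MainWorker()
    per_node_losses = {1: [0.9, 0.7], 2: [1.1, 0.8]}  # made-up values
    threads = [
        threading.Thread(target=auxiliary_worker, args=(rank, main.inbox, losses))
        for rank, losses in per_node_losses.items()
    ]
    for thread in threads:
        thread.start()
    main.drain(expected_senders=len(threads))
    for thread in threads:
        thread.join()
    return main.recorded


if __name__ == "__main__":
    for event in run_demo():
        print(event)
```

Keeping a single writer means only one process touches the run's storage. Events from different nodes may interleave in any order, so each event carries its own `node_rank` and `step`.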
VassilisVassiliadis committed Jan 27, 2025
1 parent eba27a9 commit 8ca4b36
Showing 2 changed files with 500 additions and 0 deletions.
2 changes: 2 additions & 0 deletions aim/distributed_hugging_face.py
@@ -0,0 +1,2 @@
# Alias to SDK distributed hugging face interface
from aim.sdk.adapters.distributed_hugging_face import AimCallback # noqa F401