Commit
[feat] Add AimCallback for distributed runs using the Hugging Face API
A single aim.Run is initialized and managed by the main worker. All auxiliary workers (the local_rank 0 workers hosted on other nodes) collect their metrics and forward them to the main worker, which records them in Aim.

Signed-off-by: Vassilis Vassiliadis <vassilis.vassiliadis@ibm.com>
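
The forwarding pattern described in the commit message can be sketched roughly as follows. This is not the code from the commit: `FakeRun` is a stand-in for `aim.Run` so the example runs without Aim installed, the `MainWorker`, `AuxiliaryWorker`, and `MetricRecord` names are hypothetical, and an in-process `queue.Queue` replaces whatever cross-node transport the real callback uses (the commit message does not say which).

```python
import queue
from dataclasses import dataclass, field


@dataclass
class MetricRecord:
    """One metric value, tagged with the node that produced it (hypothetical type)."""
    node_rank: int
    name: str
    value: float
    step: int


@dataclass
class FakeRun:
    """Stand-in for aim.Run (assumption): it only stores what it is asked to track."""
    tracked: list = field(default_factory=list)

    def track(self, value, name, step, context):
        self.tracked.append((name, value, step, context))


class AuxiliaryWorker:
    """Collects metrics on a non-main node and forwards them; never opens a run."""

    def __init__(self, node_rank, channel):
        self.node_rank = node_rank
        self.channel = channel

    def on_log(self, logs, step):
        for name, value in logs.items():
            self.channel.put(MetricRecord(self.node_rank, name, value, step))


class MainWorker:
    """Owns the single run and records both its own and the forwarded metrics."""

    def __init__(self, channel):
        self.channel = channel
        self.run = FakeRun()  # the only run object in the whole job

    def on_log(self, logs, step):
        for name, value in logs.items():
            self._record(MetricRecord(0, name, value, step))

    def drain(self):
        """Record everything the auxiliary workers have forwarded so far."""
        while True:
            try:
                record = self.channel.get_nowait()
            except queue.Empty:
                break
            self._record(record)

    def _record(self, record):
        self.run.track(
            record.value,
            name=record.name,
            step=record.step,
            context={"node_rank": record.node_rank},
        )


def demo():
    channel = queue.Queue()  # assumption: replaces the real cross-node transport
    main = MainWorker(channel)
    auxiliaries = [AuxiliaryWorker(rank, channel) for rank in (1, 2)]

    main.on_log({"loss": 0.9}, step=1)
    for worker in auxiliaries:
        worker.on_log({"loss": 0.9 + 0.01 * worker.node_rank}, step=1)

    main.drain()
    return main.run.tracked


if __name__ == "__main__":
    for entry in demo():
        print(entry)
```

Tagging each record with its node rank in the context is one way to keep per-node metrics distinguishable inside a single run; whether the actual callback does this is not stated in the commit message.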