Set fs_local_rank as global_rank when FS_LOCAL_RANK is not available #456

Open: wants to merge 2 commits into base: main

hxdtest (Contributor) commented on Feb 18, 2024

In the script scripts/run_with_environment.sh, FS_LOCAL_RANK is set to the same value as RANK:

export RANK=$SLURM_PROCID
export FS_LOCAL_RANK=$SLURM_PROCID

If the job is not launched by scripts/run_with_environment.sh and all ranks share the same filesystem, FS_LOCAL_RANK is not set, so the local rank 0 process on every node writes global_indices.npy.
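
To make the difference concrete, here is a minimal sketch of the two fallbacks, not the repository's actual code. The function names, the `env` parameter, and the use of the torchrun-style `RANK` and `LOCAL_RANK` variables are assumptions for illustration. It counts how many processes would consider themselves the filesystem writer (rank 0) on a shared filesystem with 2 nodes and 2 processes per node.

```python
import os
from typing import Callable, List, Mapping, Optional


def fs_rank_fallback_local(env: Optional[Mapping[str, str]] = None) -> int:
    """Hypothetical: fall back to the per-node local rank when FS_LOCAL_RANK is unset."""
    env = os.environ if env is None else env
    if "FS_LOCAL_RANK" in env:
        return int(env["FS_LOCAL_RANK"])
    return int(env.get("LOCAL_RANK", "0"))


def fs_rank_fallback_global(env: Optional[Mapping[str, str]] = None) -> int:
    """Hypothetical: fall back to the global rank when FS_LOCAL_RANK is unset."""
    env = os.environ if env is None else env
    if "FS_LOCAL_RANK" in env:
        return int(env["FS_LOCAL_RANK"])
    return int(env.get("RANK", "0"))


def count_writers(
    envs: List[Mapping[str, str]],
    rank_fn: Callable[[Mapping[str, str]], int],
) -> int:
    """Count the processes whose filesystem-local rank is 0."""
    return sum(1 for env in envs if rank_fn(env) == 0)


# Simulated environments: 2 nodes x 2 processes, FS_LOCAL_RANK not set.
simulated_envs = [
    {"RANK": str(node * 2 + local), "LOCAL_RANK": str(local)}
    for node in range(2)
    for local in range(2)
]

writers_with_local_fallback = count_writers(simulated_envs, fs_rank_fallback_local)
writers_with_global_fallback = count_writers(simulated_envs, fs_rank_fallback_global)
```

With the local-rank fallback, one process per node treats itself as the writer, so two processes write the same file on a shared filesystem. With the global-rank fallback, only one process does. The trade-off is that on per-node filesystems the global-rank fallback would leave every node except the first without the file.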

hxdtest changed the title from "get_fs_local_rank" to "Set fs_local_rank as global_rank when FS_LOCAL_RANK is not available" on Feb 18, 2024
2015aroras (Collaborator) commented:

I don't think this is the right approach to this problem. Our code (arbitrarily) assumes that unless FS_LOCAL_RANK is set, each node has a separate file system. I don't think assuming that all nodes share one file system is a better default. The best behavior might be to raise an error and tell the user to explicitly set FS_LOCAL_RANK, so that no assumption is made.
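
A minimal sketch of that stricter behavior, assuming a hypothetical helper named `get_fs_local_rank_strict` (the name, the `env` parameter, and the error message are illustrative and not taken from the repository):

```python
import os
from typing import Mapping, Optional


def get_fs_local_rank_strict(env: Optional[Mapping[str, str]] = None) -> int:
    """Return FS_LOCAL_RANK, raising instead of guessing the filesystem layout."""
    env = os.environ if env is None else env
    value = env.get("FS_LOCAL_RANK")
    if value is None:
        # No assumption is made about shared vs. per-node filesystems.
        raise RuntimeError(
            "FS_LOCAL_RANK is not set. Set it to the global rank if all nodes "
            "share one filesystem, or to the local rank if each node has its own."
        )
    return int(value)
```

A launcher for a shared filesystem would then export FS_LOCAL_RANK explicitly, as scripts/run_with_environment.sh already does with `export FS_LOCAL_RANK=$SLURM_PROCID`.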

3 participants