Docs: add "Using Environment Variables". #2188

Merged
merged 2 commits into from
Jul 7, 2023
2 changes: 2 additions & 0 deletions docs/source/examples/spot-jobs.rst
Original file line number Diff line number Diff line change
@@ -106,6 +106,8 @@ The :code:`MOUNT` mode in :ref:`SkyPilot Storage <sky-storage>` ensures the chec
Note that the application code should save program checkpoints periodically and reload those states when the job is restarted.
This is typically achieved by reloading the latest checkpoint at the beginning of your program.

.. _spot-jobs-end-to-end:

An end-to-end example
---------------------

21 changes: 20 additions & 1 deletion docs/source/reference/yaml-spec.rst
@@ -112,7 +112,7 @@ Available fields:
# image_id: skypilot:k80-ubuntu-2004
# image_id: skypilot:gpu-ubuntu-1804
# image_id: skypilot:k80-ubuntu-1804
# It is also possible to specify a per-region image id (failover will only go through the regions specified as keys;
# useful when you have the custom images in multiple regions):
# image_id:
# us-east-1: ami-0729d913a335efca7
@@ -132,6 +132,16 @@ Available fields:
# To use a more limited but easier to manage tool:
# https://github.com/IBM/vpc-img-inst

# Environment variables (optional). These values can be accessed in the
# `file_mounts`, `setup`, and `run` sections below.
#
# Values set here can be overridden by a CLI flag:
# `sky launch/exec --env ENV=val` (if ENV is present).
envs:
  MY_BUCKET: skypilot-temp-gcs-test
  MY_LOCAL_PATH: tmp-workdir
  MODEL_SIZE: 13b

file_mounts:
# Uses rsync to sync local files/directories to all nodes of the cluster.
#
@@ -156,6 +166,12 @@ Available fields:
# Copies a cloud object store URI to the cluster. Can be private buckets.
/datasets-s3: s3://my-awesome-dataset

# Demoing env var usage.
/checkpoint/${MODEL_SIZE}: ~/${MY_LOCAL_PATH}
/mydir:
  name: ${MY_BUCKET}  # Name of the bucket.
  mode: MOUNT

# Setup script (optional) to execute on every `sky launch`.
# This is executed before the 'run' commands.
#
@@ -170,3 +186,6 @@ Available fields:
run: |
  echo "Beginning task."
  python train.py

  # Demoing env var usage.
  echo Env var MODEL_SIZE has value: ${MODEL_SIZE}
107 changes: 107 additions & 0 deletions docs/source/running-jobs/environment-variables.rst
@@ -0,0 +1,107 @@

.. _env-vars:

Using Environment Variables
================================================

User-specified environment variables
------------------------------------------------------------------

You can specify environment variables to be made available to a task in two ways:

- The ``envs`` field (dict) in a :ref:`task YAML <yaml-spec>`
- The ``--env`` flag in the ``sky launch/exec`` :ref:`CLI <cli>` (takes precedence over the above)

The ``file_mounts``, ``setup``, and ``run`` sections of a task YAML can access the variables via the ``${MYVAR}`` syntax.

Using in ``file_mounts``
~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: yaml

  # Sets default values for some variables; can be overridden by --env.
  envs:
    MY_BUCKET: skypilot-temp-gcs-test
    MY_LOCAL_PATH: tmp-workdir
    MODEL_SIZE: 13b

  file_mounts:
    /mydir:
      name: ${MY_BUCKET}  # Name of the bucket.
      mode: MOUNT

    /another-dir2:
      name: ${MY_BUCKET}-2
      source: ["~/${MY_LOCAL_PATH}"]

    /checkpoint/${MODEL_SIZE}: ~/${MY_LOCAL_PATH}

The values of these variables are filled in by SkyPilot at task YAML parse time.
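
As a rough illustration of what parse-time substitution means, the sketch below mimics the behavior with Python's standard ``string.Template``, which happens to use the same ``${NAME}`` syntax. This is not SkyPilot's implementation; the helper names ``resolve_envs`` and ``fill`` are hypothetical, and the ``MODEL_SIZE=7b`` override stands in for a ``--env MODEL_SIZE=7b`` flag.

.. code-block:: python

  from string import Template


  def resolve_envs(yaml_envs, cli_envs):
      """Merge defaults from ``envs`` with ``--env`` overrides (CLI wins)."""
      merged = dict(yaml_envs)
      merged.update(cli_envs)
      return merged


  def fill(text, envs):
      """Replace ${NAME} placeholders; raises KeyError if a name is missing."""
      return Template(text).substitute(envs)


  # Defaults taken from the task YAML above.
  yaml_envs = {
      "MY_BUCKET": "skypilot-temp-gcs-test",
      "MY_LOCAL_PATH": "tmp-workdir",
      "MODEL_SIZE": "13b",
  }
  # Hypothetical override, as if `--env MODEL_SIZE=7b` had been passed.
  cli_envs = {"MODEL_SIZE": "7b"}

  envs = resolve_envs(yaml_envs, cli_envs)
  mount_path = fill("/checkpoint/${MODEL_SIZE}", envs)  # "/checkpoint/7b"
  bucket_name = fill("${MY_BUCKET}-2", envs)  # "skypilot-temp-gcs-test-2"

Because the values are substituted before the task runs, the mount path and bucket name are fixed by the time any node starts executing.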

Read more at `examples/using_file_mounts_with_env_vars.yaml <https://github.com/skypilot-org/skypilot/blob/master/examples/using_file_mounts_with_env_vars.yaml>`_.

Using in ``setup`` and ``run``
~~~~~~~~~~~~~~~~~~~~~~~~

All user-specified environment variables are exported to a task's ``setup`` and ``run`` commands (i.e., accessible when they are being run).

For example, this is useful for passing secrets to the task (see below).

Passing secrets
~~~~~~~~~~~~~~~~~~~~~~~~

We recommend passing secrets to any node(s) executing your task by first making
them available in your current shell, then using ``--env`` to pass them to SkyPilot:

.. code-block:: console

  $ sky launch -c mycluster --env WANDB_API_KEY task.yaml
  $ sky exec mycluster --env WANDB_API_KEY task.yaml

.. tip::

  In other words, you do not need to pass the value directly, as in ``--env
  WANDB_API_KEY=1234``.
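
On the receiving side, a program launched from ``run`` can read the secret from its environment and fail early if it is absent. The following is a minimal sketch under that assumption; ``require_env`` is a hypothetical helper, not part of SkyPilot.

.. code-block:: python

  import os


  def require_env(name, environ=os.environ):
      """Return the value of an environment variable, or fail with a hint."""
      value = environ.get(name)
      if not value:
          raise RuntimeError(
              f"{name} is not set; pass it with `--env {name}`.")
      return value


  # Typical use at program start-up (left commented out so that this
  # sketch can be imported without the variable being set):
  # wandb_api_key = require_env("WANDB_API_KEY")

Checking at start-up surfaces a missing secret immediately instead of partway through a long-running job.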





SkyPilot environment variables
------------------------------------------------------------------

SkyPilot exports these environment variables for a task's execution (while ``run`` commands are running):

.. list-table::
  :widths: 20 70 10
  :header-rows: 1

  * - Name
    - Definition
    - Example
  * - ``SKYPILOT_NODE_RANK``
    - Rank (an integer ID from 0 to :code:`num_nodes-1`) of the node executing the task. Read more :ref:`here <dist-jobs>`.
    - 0
  * - ``SKYPILOT_NODE_IPS``
    - A string of IP addresses of the nodes reserved to execute the task, where each line contains one IP address. Read more :ref:`here <dist-jobs>`.
    - 1.2.3.4
  * - ``SKYPILOT_NUM_GPUS_PER_NODE``
    - Number of GPUs reserved on each node to execute the task; the same as the
      count in ``accelerators: <name>:<count>`` (rounded up if a fraction). Read
      more :ref:`here <dist-jobs>`.
    - 0
  * - ``SKYPILOT_TASK_ID``
    - A unique ID assigned to each task.
      Useful for logging purposes: e.g., use a unique output path on the cluster; pass to Weights & Biases; etc.

      If a task is run as a :ref:`managed spot job <spot-jobs>`, then all
      recoveries of that job will have the same ID value. Read more :ref:`here <spot-jobs-end-to-end>`.
    - sky-2023-07-06-21-18-31-563597_myclus_id-1

Collaborator
There are a bunch more env vars added by #2106, but it should be fine to leave them out, as that feature is still experimental.

Member Author
Sounds good.

The values of these variables are filled in by SkyPilot at task execution time.

You can access these variables in the following ways:

* In the task YAML's ``run`` commands (a Bash script), access them using the ``${MYVAR}`` syntax;
* In the program(s) launched in ``run``, access them using the
  language's standard method (e.g., ``os.environ`` for Python).
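
For the Python case, a program started from ``run`` might collect these variables as sketched below. The helper names ``read_skypilot_env`` and ``output_dir`` are hypothetical, and the fallback values used when a variable is unset are assumptions made so the sketch also runs outside SkyPilot.

.. code-block:: python

  import os


  def read_skypilot_env(environ=os.environ):
      """Collect the SkyPilot-provided variables into a plain dict."""
      # One IP address per line, as described in the table above.
      node_ips = environ.get("SKYPILOT_NODE_IPS", "").split()
      return {
          "node_rank": int(environ.get("SKYPILOT_NODE_RANK", "0")),
          "node_ips": node_ips,
          "num_nodes": len(node_ips),
          "gpus_per_node": int(environ.get("SKYPILOT_NUM_GPUS_PER_NODE", "0")),
          "task_id": environ.get("SKYPILOT_TASK_ID", ""),
      }


  def output_dir(base, info):
      """Build a per-task output path from the task ID."""
      return f"{base}/{info['task_id']}"


  # Example values mirroring the table above (the second IP is made up).
  sample = {
      "SKYPILOT_NODE_RANK": "1",
      "SKYPILOT_NODE_IPS": "1.2.3.4\n1.2.3.5",
      "SKYPILOT_NUM_GPUS_PER_NODE": "4",
      "SKYPILOT_TASK_ID": "sky-2023-07-06-21-18-31-563597_myclus_id-1",
  }
  info = read_skypilot_env(sample)
  run_dir = output_dir("outputs", info)

Using the task ID in the output path keeps results from different tasks apart, while recoveries of the same managed spot job share one ID and can therefore find their earlier checkpoints.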
1 change: 1 addition & 0 deletions docs/source/running-jobs/index.rst
@@ -5,3 +5,4 @@ More User Guides

distributed-jobs
grid-search
environment-variables