Docs: add "Using Environment Variables". #2188

Merged on Jul 7, 2023 (2 commits).

2 changes: 2 additions & 0 deletions docs/source/examples/spot-jobs.rst
@@ -106,6 +106,8 @@ The :code:`MOUNT` mode in :ref:`SkyPilot Storage <sky-storage>` ensures the chec
Note that the application code should save program checkpoints periodically and reload those states when the job is restarted.
This is typically achieved by reloading the latest checkpoint at the beginning of your program.

.. _spot-jobs-end-to-end:

An end-to-end example
---------------------

21 changes: 20 additions & 1 deletion docs/source/reference/yaml-spec.rst
@@ -112,7 +112,7 @@ Available fields:
# image_id: skypilot:k80-ubuntu-2004
# image_id: skypilot:gpu-ubuntu-1804
# image_id: skypilot:k80-ubuntu-1804
# It is also possible to specify a per-region image id (failover will only go through the regions specified as keys;
# useful when you have the custom images in multiple regions):
# image_id:
# us-east-1: ami-0729d913a335efca7
@@ -132,6 +132,16 @@ Available fields:
# To use a more limited but easier to manage tool:
# https://github.com/IBM/vpc-img-inst

# Environment variables (optional). These values can be accessed in the
# `file_mounts`, `setup`, and `run` sections below.
#
# Values set here can be overridden by a CLI flag:
# `sky launch/exec --env ENV=val` (if ENV is present).
envs:
  MY_BUCKET: skypilot-temp-gcs-test
  MY_LOCAL_PATH: tmp-workdir
  MODEL_SIZE: 13b

file_mounts:
# Uses rsync to sync local files/directories to all nodes of the cluster.
#
@@ -156,6 +166,12 @@ Available fields:
# Copies a cloud object store URI to the cluster. Can be private buckets.
/datasets-s3: s3://my-awesome-dataset

# Demoing env var usage.
/checkpoint/${MODEL_SIZE}: ~/${MY_LOCAL_PATH}
/mydir:
  name: ${MY_BUCKET}  # Name of the bucket.
  mode: MOUNT

# Setup script (optional) to execute on every `sky launch`.
# This is executed before the 'run' commands.
#
@@ -170,3 +186,6 @@ Available fields:
run: |
  echo "Beginning task."
  python train.py

  # Demoing env var usage.
  echo Env var MODEL_SIZE has value: ${MODEL_SIZE}
110 changes: 110 additions & 0 deletions docs/source/running-jobs/environment-variables.rst
@@ -0,0 +1,110 @@

.. _env-vars:

Using Environment Variables
================================================

User-specified environment variables
------------------------------------------------------------------

You can specify environment variables to be made available to a task in two ways:

- The ``envs`` field (dict) in a :ref:`task YAML <yaml-spec>`
- The ``--env`` flag in the ``sky launch/exec`` :ref:`CLI <cli>` (takes precedence over the above)

The ``file_mounts``, ``setup``, and ``run`` sections of a task YAML file can then access these variables via the ``${MYVAR}`` syntax.
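
The precedence rule above can be pictured as a dictionary merge in which CLI values win. The following is only an illustrative sketch, not SkyPilot's implementation; ``merge_envs`` is a hypothetical helper name.

```python
def merge_envs(yaml_envs, cli_envs):
    """Return the effective variables: CLI values override YAML values.

    Hypothetical helper for illustration; not part of SkyPilot's API.
    """
    merged = dict(yaml_envs)   # start from the task YAML's `envs` field
    merged.update(cli_envs)    # `--env` values take precedence
    return merged


yaml_envs = {"MODEL_SIZE": "13b", "MY_BUCKET": "skypilot-temp-gcs-test"}
cli_envs = {"MODEL_SIZE": "7b"}  # as if passed via `--env MODEL_SIZE=7b`
effective = merge_envs(yaml_envs, cli_envs)
```

Here ``effective["MODEL_SIZE"]`` is ``"7b"``, while ``MY_BUCKET`` keeps the value from the YAML.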

Using in ``file_mounts``
~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: yaml

  envs:
    MY_BUCKET: skypilot-temp-gcs-test
    MY_LOCAL_PATH: tmp-workdir
    MODEL_SIZE: 13b

  file_mounts:
    /mydir:
      name: ${MY_BUCKET}  # Name of the bucket.
      mode: MOUNT

    /another-dir2:
      name: ${MY_BUCKET}-2
      source: ["~/${MY_LOCAL_PATH}"]

    /checkpoint/${MODEL_SIZE}: ~/${MY_LOCAL_PATH}

The values of these variables are filled in by SkyPilot at task YAML parse time.

Read more at `examples/using_file_mounts_with_env_vars.yaml <https://github.com/skypilot-org/skypilot/blob/master/examples/using_file_mounts_with_env_vars.yaml>`_.
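
To make the parse-time substitution concrete, here is a rough sketch that mimics ``${VAR}`` expansion over a ``file_mounts`` mapping using Python's standard ``string.Template``. It is an assumption-laden illustration, not SkyPilot's actual parser; ``expand`` and ``expand_file_mounts`` are hypothetical names.

```python
from string import Template


def expand(value, envs):
    """Replace ${NAME} placeholders in a string (illustrative only)."""
    # safe_substitute leaves unknown ${NAME} placeholders untouched.
    return Template(value).safe_substitute(envs)


def expand_file_mounts(file_mounts, envs):
    """Expand placeholders in both the keys and the values of file_mounts."""
    expanded = {}
    for dst, src in file_mounts.items():
        if isinstance(src, dict):
            # Storage-style entry: expand each string field.
            src = {k: expand(v, envs) if isinstance(v, str) else v
                   for k, v in src.items()}
        else:
            src = expand(src, envs)
        expanded[expand(dst, envs)] = src
    return expanded


envs = {
    "MY_BUCKET": "skypilot-temp-gcs-test",
    "MY_LOCAL_PATH": "tmp-workdir",
    "MODEL_SIZE": "13b",
}
file_mounts = {
    "/mydir": {"name": "${MY_BUCKET}", "mode": "MOUNT"},
    "/checkpoint/${MODEL_SIZE}": "~/${MY_LOCAL_PATH}",
}
resolved = expand_file_mounts(file_mounts, envs)
```

With the values above, ``resolved`` maps ``/checkpoint/13b`` to ``~/tmp-workdir`` and gives ``/mydir`` the bucket name ``skypilot-temp-gcs-test``.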

Using in ``setup``
~~~~~~~~~~~~~~~~~~~~~~~~

All user-specified environment variables are exported to a task's ``setup`` commands.

Using in ``run``
~~~~~~~~~~~~~~~~~~~~~~~~

All user-specified environment variables are exported to a task's execution (i.e., while its ``run`` commands are running).
Collaborator

Maybe we can combine the two sections into one to reduce the duplication.

Using in ``setup`` and ``run``

Member Author

Good call, done.

For example, this is useful for passing secrets to the task (see below).

Passing secrets
~~~~~~~~~~~~~~~~~~~~~~~~

We recommend passing a secret to the node(s) executing your task by first making
it available in your current shell, then using ``--env`` to pass it to SkyPilot:

.. code-block:: console

  $ sky launch -c mycluster --env WANDB_API_KEY task.yaml
  $ sky exec mycluster --env WANDB_API_KEY task.yaml

.. tip::

  In other words, you do not need to pass the value directly, as in
  ``--env WANDB_API_KEY=1234``.
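
The two forms of ``--env`` described above (``NAME=value`` and a bare ``NAME`` taken from the current shell) can be modeled with a few lines of Python. This is a sketch of the described behavior, not SkyPilot's code: ``resolve_env_flag`` is a hypothetical helper, and raising an error for an unset variable is an assumption.

```python
def resolve_env_flag(arg, local_env):
    """Turn one `--env` argument into a (name, value) pair.

    `NAME=value` uses the given value; a bare `NAME` is looked up in
    `local_env`, which stands in for the shell that ran the command.
    Hypothetical helper for illustration only.
    """
    if "=" in arg:
        name, value = arg.split("=", 1)  # split only on the first '='
        return name, value
    if arg not in local_env:
        # Assumption: treat an unset variable as an error.
        raise KeyError(f"{arg} is not set in the local environment")
    return arg, local_env[arg]


# A plain dict stands in for os.environ so the example is reproducible.
fake_shell = {"WANDB_API_KEY": "abc123"}
pair = resolve_env_flag("WANDB_API_KEY", fake_shell)
```

Passing the bare name keeps the secret's value out of the command line you type; only the variable name appears there.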





SkyPilot environment variables
------------------------------------------------------------------

SkyPilot exports these environment variables for a task's execution (``run`` commands):

.. list-table::
   :widths: 20 70 10
   :header-rows: 1

   * - Name
     - Definition
     - Example
   * - ``SKYPILOT_NODE_RANK``
     - Rank (an integer ID from 0 to :code:`num_nodes-1`) of the node executing the task. Read more :ref:`here <dist-jobs>`.
     - 0
   * - ``SKYPILOT_NODE_IPS``
     - A string of IP addresses of the nodes reserved to execute the task, where each line contains one IP address. Read more :ref:`here <dist-jobs>`.
     - 1.2.3.4
   * - ``SKYPILOT_NUM_GPUS_PER_NODE``
     - Number of GPUs reserved on each node to execute the task; the same as the
       count in ``accelerators: <name>:<count>`` (rounded up if a fraction). Read
       more :ref:`here <dist-jobs>`.
     - 0
   * - ``SKYPILOT_TASK_ID``
     - A unique ID assigned to each task.
       Useful for logging purposes: e.g., use a unique output path on the cluster; pass to Weights & Biases; etc.

       If a task is run as a :ref:`managed spot job <spot-jobs>`, then all
       recoveries of that job will have the same ID value. Read more :ref:`here <spot-jobs-end-to-end>`.
     - sky-2023-07-06-21-18-31-563597_myclus_id-1

Collaborator

There are a bunch more env vars added by #2106, but it should be fine to leave them out, as that feature is still experimental.

Member Author

Sounds good.

The values of these variables are filled in by SkyPilot at task execution time.

You can access these variables in the following ways:

* In the task YAML's ``run`` commands (a Bash script), access them using the ``${MYVAR}`` syntax;
* In the program(s) launched in ``run``, access them using the
language's standard method (e.g., ``os.environ`` for Python).
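
As a sketch of the second option, a Python program launched from ``run`` might collect these variables as follows. The helper name ``read_skypilot_env`` and the fallback defaults (used when a variable is absent, e.g. when running outside SkyPilot) are assumptions made for illustration.

```python
import os


def read_skypilot_env(environ=None):
    """Collect the SkyPilot-exported variables into a dict.

    Hypothetical helper; the defaults are assumptions so that the
    function also works when the variables are not set.
    """
    environ = os.environ if environ is None else environ
    # SKYPILOT_NODE_IPS holds one IP address per line.
    node_ips = environ.get("SKYPILOT_NODE_IPS", "").splitlines()
    return {
        "node_rank": int(environ.get("SKYPILOT_NODE_RANK", "0")),
        "node_ips": node_ips,
        "num_nodes": len(node_ips),
        "gpus_per_node": int(environ.get("SKYPILOT_NUM_GPUS_PER_NODE", "0")),
        "task_id": environ.get("SKYPILOT_TASK_ID", ""),
    }


# A plain dict stands in for os.environ so the example is reproducible.
info = read_skypilot_env({
    "SKYPILOT_NODE_RANK": "1",
    "SKYPILOT_NODE_IPS": "1.2.3.4\n1.2.3.5",
    "SKYPILOT_NUM_GPUS_PER_NODE": "8",
    "SKYPILOT_TASK_ID": "sky-2023-07-06-21-18-31-563597_myclus_id-1",
})
```

Inside a real task, calling ``read_skypilot_env()`` with no argument reads from ``os.environ``; the task ID can then be used, for example, to build a unique output path.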
1 change: 1 addition & 0 deletions docs/source/running-jobs/index.rst
@@ -5,3 +5,4 @@ More User Guides

distributed-jobs
grid-search
environment-variables