Commit
Deploying to gh-pages from @ dstackai/dstack@831cf91 🚀
peterschmidt85 committed Nov 15, 2024
1 parent c0a6c89 commit a7b2d0c
Showing 7 changed files with 147 additions and 227 deletions.
6 changes: 4 additions & 2 deletions docs/index.html
Original file line number Diff line number Diff line change
@@ -2913,9 +2913,11 @@ <h4 id="2-define-configurations">2. Define configurations<a class="headerlink" h
<p><code>dstack</code> supports the following configurations:</p>
<ul>
<li><a href="dev-environments/">Dev environments</a> &mdash; for interactive development using a desktop IDE</li>
<li><a href="tasks/">Tasks</a> &mdash; for scheduling jobs (incl. distributed jobs) or running web apps</li>
<li><a href="services/">Services</a> &mdash; for deployment of models and web apps (with auto-scaling and authorization)</li>
<li><a href="tasks/">Tasks</a> &mdash; for scheduling jobs, incl. distributed ones (or running web apps)</li>
<li><a href="services/">Services</a> &mdash; for deploying models (or web apps)</li>
<li><a href="concepts/fleets/">Fleets</a> &mdash; for managing cloud and on-prem clusters</li>
<li><a href="concepts/volumes/">Volumes</a> &mdash; for managing instance and network volumes (to persist data)</li>
<li><a href="concepts/gateways/">Gateways</a> &mdash; for handling auto-scaling and ingress traffic</li>
</ul>
<p>Configurations can be defined as YAML files within your repo.</p>
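As an illustration, a minimal fleet configuration might look like the following (a hedged sketch; the name and values are placeholders, not taken from this page):

```yaml
type: fleet
# The name is optional; if not specified, it's generated randomly
name: my-fleet

# Number of instances to provision
nodes: 2

resources:
  gpu: 24GB
```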
<h4 id="3-apply-configurations">3. Apply configurations<a class="headerlink" href="#3-apply-configurations" title="Permanent link">&para;</a></h4>
96 changes: 41 additions & 55 deletions docs/quickstart/index.html
@@ -2684,24 +2684,19 @@ <h2 id="run-a-configuration">Run a configuration<a class="headerlink" href="#run
<div class="tabbed-set tabbed-alternate" data-tabs="1:3"><input checked="checked" id="dev-environment" name="__tabbed_1" type="radio" /><input id="task" name="__tabbed_1" type="radio" /><input id="service" name="__tabbed_1" type="radio" /><div class="tabbed-labels"><label for="dev-environment">Dev environment</label><label for="task">Task</label><label for="service">Service</label></div>
<div class="tabbed-content">
<div class="tabbed-block">
<p>A dev environment lets you provision a remote machine with your code, dependencies, and resources, and access it
with your desktop IDE.</p>
<p>A dev environment lets you provision an instance and access it with your desktop IDE.</p>
<h5 id="define-a-configuration">Define a configuration<a class="headerlink" href="#define-a-configuration" title="Permanent link">&para;</a></h5>
<p>Create the following configuration file inside the repo:</p>
<p><div editor-title=".dstack.yml"> </p>
<div class="highlight"><pre><span></span><code><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">dev-environment</span>
<span class="c1"># The name is optional; if not specified, it&#39;s generated randomly</span>
<span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">vscode</span>

<span class="c1"># If `image` is not specified, dstack uses its default image</span>
<span class="nt">python</span><span class="p">:</span><span class="w"> </span><span class="s">&quot;3.11&quot;</span>
<span class="c1"># Uncomment to use a custom Docker image</span>
<span class="c1">#image: dstackai/base:py3.13-0.6-cuda-12.1</span>

<span class="nt">ide</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">vscode</span>

<span class="c1"># Use either spot or on-demand instances</span>
<span class="nt">spot_policy</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">auto</span>

<span class="c1"># Uncomment to request resources</span>
<span class="c1">#resources:</span>
<span class="c1"># gpu: 24GB</span>
@@ -2727,19 +2722,18 @@ <h5 id="run-the-configuration">Run the configuration<a class="headerlink" href="
</code></pre></div>
</div>
<p>Open the link to access the dev environment using your desktop IDE.</p>
<p>Alternatively, you can access it via <code>ssh &lt;run name&gt;</code>.</p>
</div>
<div class="tabbed-block">
<p>A task allows you to schedule a job or run a web app. It lets you configure
dependencies, resources, ports, the number of nodes (if you want to run the task on a cluster), etc.</p>
<p>A task allows you to schedule a job or run a web app. Tasks can be distributed and can forward ports.</p>
<h5 id="define-a-configuration_1">Define a configuration<a class="headerlink" href="#define-a-configuration_1" title="Permanent link">&para;</a></h5>
<p>Create the following configuration file inside the repo:</p>
<p><div editor-title="streamlit.dstack.yml"> </p>
<p><div editor-title="examples/misc/streamlit/serve-task.dstack.yml"> </p>
<div class="highlight"><pre><span></span><code><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">task</span>
<span class="c1"># The name is optional; if not specified, it&#39;s generated randomly</span>
<span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">streamlit</span>

<span class="c1"># If `image` is not specified, dstack uses its default image</span>
<span class="nt">python</span><span class="p">:</span><span class="w"> </span><span class="s">&quot;3.11&quot;</span>
<span class="c1"># Uncomment to use a custom Docker image</span>
<span class="c1">#image: dstackai/base:py3.13-0.6-cuda-12.1</span>

<span class="c1"># Commands of the task</span>
@@ -2750,14 +2744,14 @@ <h5 id="define-a-configuration_1">Define a configuration<a class="headerlink" hr
<span class="nt">ports</span><span class="p">:</span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8501</span>

<span class="c1"># Use either spot or on-demand instances</span>
<span class="nt">spot_policy</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">auto</span>

<span class="c1"># Uncomment to request resources</span>
<span class="c1">#resources:</span>
<span class="c1"># gpu: 24GB</span>
</code></pre></div>
</div>
<p>By default, tasks run on a single instance. To run a distributed task, specify
<a href="../reference/dstack.yml/task/#distributed-tasks"><code>nodes</code> and system environment variables</a>,
and <code>dstack</code> will run it on a cluster.</p>
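The distributed setup described above can be sketched as follows (a hedged illustration; the training script and the exact `torchrun` flags are assumptions, while the `DSTACK_*` variables are the system environment variables dstack documents for distributed tasks):

```yaml
type: task
name: train-distributed

# Run the task on a cluster of 2 nodes
nodes: 2

python: "3.11"

commands:
  # DSTACK_* variables are set by dstack on each node of the cluster
  - torchrun --nnodes=$DSTACK_NODES_NUM --node_rank=$DSTACK_NODE_RANK --nproc_per_node=$DSTACK_GPUS_PER_NODE --master_addr=$DSTACK_MASTER_NODE_IP --master_port=12345 train.py

resources:
  gpu: 24GB
```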
<h5 id="run-the-configuration_1">Run the configuration<a class="headerlink" href="#run-the-configuration_1" title="Permanent link">&para;</a></h5>
<p>Run the configuration via <a href="../reference/cli/#dstack-apply"><code>dstack apply</code></a>:</p>
<div class="termy">
@@ -2780,44 +2774,35 @@ <h5 id="run-the-configuration_1">Run the configuration<a class="headerlink" href
<span class="w"> </span>Local<span class="w"> </span>URL:<span class="w"> </span>http://localhost:8501
</code></pre></div>
</div>
<p><code>dstack apply</code> automatically forwards the remote ports to <code>localhost</code> for convenient access.</p>
<p>If you specified <code>ports</code>, they will be automatically forwarded to <code>localhost</code> for convenient access.</p>
</div>
<div class="tabbed-block">
<p>A service allows you to deploy a web app or a model as a scalable endpoint. It lets you configure
dependencies, resources, authorization, auto-scaling rules, etc. </p>
<details class="info">
<summary>Prerequisites</summary>
<p>If you're using the open-source server, you must set up a <a href="../concepts/gateways/">gateway</a> before you can run a service.</p>
<p>If you're using <a href="https://sky.dstack.ai" target="_blank">dstack Sky <span class="twemoji external"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="m11.93 5 2.83 2.83L5 17.59 6.42 19l9.76-9.75L19 12.07V5z"/></svg></span></a>,
the gateway is already set up for you.</p>
</details>
<p>A service allows you to deploy a model or any web app as an endpoint.</p>
<h5 id="define-a-configuration_2">Define a configuration<a class="headerlink" href="#define-a-configuration_2" title="Permanent link">&para;</a></h5>
<p>Create the following configuration file inside the repo:</p>
<p><div editor-title="streamlit-service.dstack.yml"> </p>
<p><div editor-title="examples/deployment/vllm/service.dstack.yml"> </p>
<div class="highlight"><pre><span></span><code><span class="nt">type</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">service</span>
<span class="c1"># The name is optional; if not specified, it&#39;s generated randomly</span>
<span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">streamlit-service</span>
<span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">llama31-service</span>

<span class="c1"># If `image` is not specified, dstack uses its default image</span>
<span class="nt">python</span><span class="p">:</span><span class="w"> </span><span class="s">&quot;3.11&quot;</span>
<span class="c1"># Uncomment to use a custom Docker image</span>
<span class="c1">#image: dstackai/base:py3.13-0.6-cuda-12.1</span>

<span class="c1"># Commands of the service</span>
<span class="c1"># Required environment variables</span>
<span class="nt">env</span><span class="p">:</span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">HF_TOKEN</span>
<span class="nt">commands</span><span class="p">:</span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">pip install streamlit</span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">streamlit hello</span>
<span class="c1"># Port of the service</span>
<span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8501</span>

<span class="c1"># Comment to enable authorization</span>
<span class="nt">auth</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">False</span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">pip install vllm</span>
<span class="w"> </span><span class="p p-Indicator">-</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096</span>
<span class="c1"># Expose the vllm server port</span>
<span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">8000</span>

<span class="c1"># Use either spot or on-demand instances</span>
<span class="nt">spot_policy</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">auto</span>
<span class="c1"># Specify a name if it&#39;s an OpenAI-compatible model</span>
<span class="nt">model</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">meta-llama/Meta-Llama-3.1-8B-Instruct</span>

<span class="c1"># Uncomment to request resources</span>
<span class="c1">#resources:</span>
<span class="c1"># gpu: 24GB</span>
<span class="c1"># Required resources</span>
<span class="nt">resources</span><span class="p">:</span>
<span class="w"> </span><span class="nt">gpu</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">24GB</span>
</code></pre></div>
</div>
<h5 id="run-the-configuration_2">Run the configuration<a class="headerlink" href="#run-the-configuration_2" title="Permanent link">&para;</a></h5>
@@ -2837,32 +2822,33 @@ <h5 id="run-the-configuration_2">Run the configuration<a class="headerlink" href
Provisioning<span class="w"> </span><span class="sb">`</span>streamlit<span class="sb">`</span>...
---&gt;<span class="w"> </span><span class="m">100</span>%

<span class="w"> </span>Welcome<span class="w"> </span>to<span class="w"> </span>Streamlit.<span class="w"> </span>Check<span class="w"> </span>out<span class="w"> </span>our<span class="w"> </span>demo<span class="w"> </span><span class="k">in</span><span class="w"> </span>your<span class="w"> </span>browser.

<span class="w"> </span>Local<span class="w"> </span>URL:<span class="w"> </span>https://streamlit-service.example.com
Service<span class="w"> </span>is<span class="w"> </span>published<span class="w"> </span>at:<span class="w"> </span>
<span class="w"> </span>http://localhost:3000/proxy/services/main/llama31-service
</code></pre></div>
</div>
<p>Once the service is up, its endpoint is accessible at <code>https://&lt;run name&gt;.&lt;gateway domain&gt;</code>.</p>
<p>If you specified <code>model</code>, the model will also be available via an OpenAI-compatible endpoint at
<code>&lt;dstack server URL&gt;/proxy/models/&lt;project name&gt;</code>.</p>
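Once the service is up, the model can be queried with any OpenAI-compatible client. Below is a hedged sketch: the server URL, project name, and token are placeholders, and `ask` assumes the service from the configuration above is running.

```python
def model_base_url(server_url: str, project: str) -> str:
    """Build the OpenAI-compatible base URL exposed by the dstack server."""
    return f"{server_url.rstrip('/')}/proxy/models/{project}"


def ask(server_url: str, project: str, token: str, prompt: str) -> str:
    """Send one chat completion request to the deployed model (requires a running server)."""
    from openai import OpenAI  # pip install openai

    client = OpenAI(base_url=model_base_url(server_url, project), api_key=token)
    resp = client.chat.completions.create(
        model="meta-llama/Meta-Llama-3.1-8B-Instruct",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


# Example (placeholders; not executed here):
# print(ask("http://localhost:3000", "main", "<dstack token>", "Hello!"))
```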
<details class="info">
<summary>Gateway</summary>
<p>By default, services run on a single instance. However, you can specify <code>replicas</code> and <code>target</code> to enable
<a href="../reference/dstack.yml/service/#auto-scaling">auto-scaling</a>.</p>
<p>Note, to use auto-scaling, a custom domain, or HTTPS, set up a
<a href="../concepts/gateways/">gateway</a> before running the service.
A gateway is pre-configured for you if you are using <a href="https://sky.dstack.ai" target="_blank">dstack Sky <span class="twemoji external"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="m11.93 5 2.83 2.83L5 17.59 6.42 19l9.76-9.75L19 12.07V5z"/></svg></span></a>.</p>
</details>
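The auto-scaling described above can be sketched by extending the service configuration with `replicas` and a `scaling` target (a hedged illustration; the replica range and threshold values are assumptions):

```yaml
type: service
name: llama31-service

python: "3.11"

env:
  - HF_TOKEN
commands:
  - pip install vllm
  - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
port: 8000

# Scale between 1 and 4 replicas based on requests per second
replicas: 1..4
scaling:
  metric: rps
  target: 10

resources:
  gpu: 24GB
```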
</div>
</div>
</div>
<blockquote>
<p><code>dstack apply</code> automatically uploads the code from the current repo, including your local uncommitted changes.</p>
</blockquote>
<p><code>dstack apply</code> automatically provisions instances, uploads the code from the current repo (incl. your local uncommitted changes).</p>
<h2 id="troubleshooting">Troubleshooting<a class="headerlink" href="#troubleshooting" title="Permanent link">&para;</a></h2>
<p>Something not working? Make sure to check out the <a href="../guides/troubleshooting/">troubleshooting</a> guide.</p>
<p>Something not working? See the <a href="../guides/troubleshooting/">troubleshooting</a> guide.</p>
<h2 id="whats-next">What's next?<a class="headerlink" href="#whats-next" title="Permanent link">&para;</a></h2>
<ol>
<li>Read about <a href="../dev-environments/">dev environments</a>, <a href="../tasks/">tasks</a>,
<a href="../services/">services</a>, and <a href="../concepts/fleets/">fleets</a> </li>
<li>Join <a href="https://discord.gg/u8SmfwPpMd">Discord <span class="twemoji external"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="m11.93 5 2.83 2.83L5 17.59 6.42 19l9.76-9.75L19 12.07V5z"/></svg></span></a></li>
<li>Browse <a href="https://dstack.ai/examples">examples</a></li>
<li>Join the community via <a href="https://discord.gg/u8SmfwPpMd">Discord <span class="twemoji external"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="m11.93 5 2.83 2.83L5 17.59 6.42 19l9.76-9.75L19 12.07V5z"/></svg></span></a></li>
</ol>
<div class="admonition info">
<p class="admonition-title">Examples</p>
<p>To see how dev environments, tasks, services, and fleets can be used for
training and deploying AI models, check out the <a href="examples/index.md">examples</a>.</p>
</div>



2 changes: 1 addition & 1 deletion docs/reference/api/rest/openapi.json

Large diffs are not rendered by default.

