# SHARK User Guide

These instructions cover the latest stable release of SHARK. For a more bleeding-edge build, install the nightly releases instead.

## Prerequisites

Our current user guide requires that you have:

- Access to a computer with an installed AMD Instinct™ MI300X Series Accelerator
- A compatible version of Linux and ROCm installed on that computer (see the ROCm compatibility matrix)

## Set up Environment

This section will help you install Python and set up a Python environment with venv.

We officially support Python versions 3.11, 3.12, and 3.13.

The rest of this guide assumes you are using Python 3.11.

### Install Python

To install Python 3.11 on Ubuntu:

```bash
sudo apt install python3.11 python3.11-dev python3.11-venv

which python3.11
# /usr/bin/python3.11
```

### Create a Python Environment

Set up your Python environment with the following commands:

```bash
# Set up a virtual environment to isolate packages from other envs.
python3.11 -m venv 3.11.venv
source 3.11.venv/bin/activate
```
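To confirm activation worked, check that `python` now resolves to the interpreter inside the venv:

```bash
# After activation, `python` should point into the venv directory.
which python
# .../3.11.venv/bin/python
python --version
# Python 3.11.x
```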

## Install SHARK and its dependencies

First install a torch version that fulfills your needs:

```bash
# Fast installation of torch with just CPU support.
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
```

For other options, see https://pytorch.org/get-started/locally/.

Next install shark-ai:

```bash
pip install shark-ai[apps]
```

> [!TIP]
> To switch from the stable release channel to the nightly release channel, see nightly_releases.md.

Test the installation:

```bash
python -m shortfin_apps.sd.server --help
```

## Quickstart

### Run the SDXL Server

Start the SDXL server, as sketched below.
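A minimal startup sketch, assuming an MI300X (`gfx942`) target; the flag values here are illustrative assumptions drawn from the flag table later in this guide, so check `--help` for the options your install supports:

```bash
# Minimal sketch: start the SDXL server on HIP targeting gfx942 (MI300X).
# Flag values are illustrative assumptions, not the only valid
# configuration; verify with `python -m shortfin_apps.sd.server --help`.
python -m shortfin_apps.sd.server \
  --device=hip \
  --target=gfx942 \
  --build_preference=precompiled
```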

### Run the SDXL Client

```bash
python -m shortfin_apps.sd.simple_client --interactive
```

Congratulations! At this point you can experiment with the server and client to fit your usage.

> [!NOTE]
> The SDXL server's implementation does not account for extremely large client batches. For heavy workloads, services are normally composed under a load balancer so that each service is fed requests at an optimal rate. Outside of large-scale deployments, the server's internal batching and load balancing are sufficient.
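As a rough, illustrative sketch of that load-balanced pattern (ports and instance count are assumptions, not a recommendation from this guide):

```bash
# Illustrative only: run two server instances on separate ports for an
# external load balancer (nginx, haproxy, etc.) to front. On multi-device
# machines, each instance would typically be pinned to its own accelerator
# (see --device_ids in the flag table below).
python -m shortfin_apps.sd.server --port 8000 &
python -m shortfin_apps.sd.server --port 8001 &
```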

## Update flags

See `--help` for both the server and the client for full usage instructions. Here's a quick snapshot.

Update server options:

| Flag | Options |
| ---- | ------- |
| `--host HOST` | |
| `--port PORT` | server port |
| `--root-path ROOT_PATH` | |
| `--timeout-keep-alive` | |
| `--device` | `local-task`, `hip`, `amdgpu` |
| `--target` | `gfx942`, `gfx1100` |
| `--device_ids` | |
| `--tokenizers` | |
| `--model_config` | |
| `--workers_per_device` | |
| `--fibers_per_device` | |
| `--isolation` | `per_fiber`, `per_call`, `none` |
| `--show_progress` | |
| `--trace_execution` | |
| `--amdgpu_async_allocations` | |
| `--splat` | |
| `--build_preference` | `compile`, `precompiled` |
| `--compile_flags` | |
| `--flagfile FLAGFILE` | |
| `--artifacts_dir ARTIFACTS_DIR` | where to store cached artifacts from the cloud |
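If you pass the same flags on every launch, `--flagfile` can bundle them. The one-flag-per-line format below is an assumption about the flagfile syntax; confirm it against `--help` before relying on it:

```bash
# Hedged sketch: the one-flag-per-line flagfile format is an assumption;
# verify against `python -m shortfin_apps.sd.server --help`.
cat > sdxl_flags.txt <<'EOF'
--device=hip
--target=gfx942
EOF
python -m shortfin_apps.sd.server --flagfile sdxl_flags.txt
```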

Update client options:

| Flag | Options |
| ---- | ------- |
| `--file` | |
| `--reps` | |
| `--save` | whether to save images generated by the server |
| `--outputdir` | output directory to store images generated by SDXL |
| `--steps` | |
| `--interactive` | |
| `--port` | port used to interact with the server |
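As a sketch of non-interactive use, combining flags from the table above (their exact semantics are inferred here, so confirm with `--help`):

```bash
# Hedged example built from the client flag table; values are illustrative
# and the --save/--outputdir semantics are assumptions; confirm with --help.
python -m shortfin_apps.sd.simple_client \
  --reps 2 \
  --save \
  --outputdir ./sdxl_outputs \
  --steps 20 \
  --port 8000
```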