TT-Studio

TT-Studio enables rapid deployment of TT Inference servers locally and is optimized for Tenstorrent hardware. This guide explains how to set up and use TT-Studio in both standard and development modes.

Table of Contents

  1. Prerequisites
  2. Quick Start
  3. Using startup.sh
  4. Documentation

Prerequisites

  1. Docker: Ensure that Docker is installed on your machine; see Docker's official installation guide.
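
To confirm Docker (and the Compose plugin used in the steps below) is available, a quick sanity check:

    # Both commands should print a version string; if either fails,
    # revisit the Docker installation before continuing.
    docker --version
    docker compose version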

Quick Start

For General Users

To set up TT-Studio:

  1. Clone the Repository:

    git clone https://github.com/tenstorrent/tt-studio.git
    cd tt-studio
  2. Run the Startup Script:

    ./startup.sh

    See the Using startup.sh section below for the command-line arguments the script accepts.

  3. Access the Application:

    The app will be available at http://localhost:3000.

  4. Cleanup:

    • To stop and remove Docker services, run:
      ./startup.sh --cleanup
  5. Running on a Remote Machine

    To access the frontend application in your local browser while TT-Studio runs on a remote server, forward the frontend port over SSH (a longer-lived variant is sketched after this step):

    # Forward the remote frontend port (3000) to your local machine
    ssh -L 3000:localhost:3000 <username>@<remote_server>
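
If you only need the tunnel and not a remote shell, ssh can keep the forward open without running a remote command; a minimal sketch:

    # -N: run no remote command, just hold the port forward open
    # -f: go to the background once the connection is established
    ssh -N -f -L 3000:localhost:3000 <username>@<remote_server>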

⚠️ Note: To use Tenstorrent hardware, select "yes" when the startup.sh script prompts you to mount hardware. This automatically applies the necessary settings, so no manual edits to docker-compose.yml are needed.


For Developers

Developers can control and run the app directly via docker compose. Keeping it running in a terminal enables hot reload of the frontend app; for backend changes, it is advisable to restart the services, for example as sketched below.
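
A typical loop for backend changes, run from tt-studio/app (whether a plain restart suffices depends on how the backend source is mounted into the container):

    # Restart the running services in place
    docker compose restart
    # If your changes are baked into the image, rebuild instead
    docker compose up --build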

  1. Run in Development Mode:

    cd tt-studio/app
    docker compose up --build
  2. Stop the Services:

    docker compose down
  3. Using the Mock vLLM Model:

    • For local testing, you can use the Mock vLLM model, which echoes back a random set of characters. See HowToRun_vLLM_Models.md (linked in the Documentation section below) for instructions on running it.
  4. Running on a Machine with Tenstorrent Hardware:

    To run TT-Studio on a device with Tenstorrent hardware, you need to uncomment specific lines in the app/docker-compose.yml file. Follow these steps:

    1. Navigate to the app directory:

      cd app/
    2. Open the docker-compose.yml file in an editor (e.g., vim, or a code editor such as VS Code):

      vim docker-compose.yml
      # or
      code docker-compose.yml
    3. Uncomment the lines marked with a ! flag to enable Tenstorrent hardware support (the uncommented result is shown after this list):

      #* DEV: Uncomment devices to use Tenstorrent hardware
      #! devices:
      #* mounts all Tenstorrent devices to the backend container
      #!   - /dev/tenstorrent:/dev/tenstorrent

      By uncommenting these lines, Docker will mount the Tenstorrent device (/dev/tenstorrent) into the backend container, allowing the container to run machine learning models directly on the card.
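
For reference, after uncommenting, that part of app/docker-compose.yml should read (surrounding keys elided):

      devices:
        # Mounts all Tenstorrent devices to the backend container
        - /dev/tenstorrent:/dev/tenstorrent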


Using startup.sh

The startup.sh script automates the TT-Studio setup process. It can be run with or without Docker, depending on your usage scenario.

Basic Usage

To use the startup script, run:

./startup.sh [options]

Command-Line Options

  Option      Description
  --help      Display the help message with usage details.
  --setup     Run the setup.sh script with sudo privileges for all steps.
  --cleanup   Stop and remove all Docker services.

To display the same help text in the terminal, run:

./startup.sh --help

Automatic Tenstorrent Hardware Detection

If a Tenstorrent device (/dev/tenstorrent) is detected, the script will prompt you to mount it.
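
To check manually for the device node the script looks for, a minimal sketch:

    # If a Tenstorrent device is present, this lists /dev/tenstorrent;
    # otherwise ls reports "No such file or directory".
    ls -l /dev/tenstorrent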


Documentation

  • Frontend Documentation: app/frontend/README.md
    Detailed documentation about the frontend of TT Studio, including setup, development, and customization guides.

  • Backend API Documentation: app/api/README.md
    Information on the backend API, powered by Django Rest Framework, including available endpoints and integration details.

  • Running vLLM Llama3.1-70B and vLLM Mock Model(s) in TT-Studio: HowToRun_vLLM_Models.md
    Step-by-step instructions on how to configure and run the vLLM model(s) using TT-Studio.

  • Contribution Guide: CONTRIBUTING.md
    If you’re interested in contributing to the project, please refer to our contribution guidelines. This includes setting up a development environment, code standards, and the process for submitting pull requests.

  • Frequently Asked Questions (FAQ): FAQ.md
    A compilation of frequently asked questions to help users quickly solve common issues and understand key features of TT-Studio.