Merge pull request #9 from video-db/ankit/add-documentation
Ankit/add documentation
Showing 14 changed files with 206 additions and 51 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,39 @@ | ||
# Reasoning Engine | ||
## Reasoning Engine | ||
|
||
The Reasoning Engine is the core of the system. It is responsible for processing the input data and generating the output data. The Reasoning Engine is a collection of modules that work together to perform the reasoning process. Each module is responsible for a specific task, such as data processing, rule evaluation, or output generation. | ||
The Reasoning Engine is the core component that directly interfaces with the user. It interprets natural language input in any conversation and orchestrates agents to fulfill the user's requests. The primary functions of the Reasoning Engine are: | ||
|
||
* Maintain Context of Conversational History: Manage memory, context limits, input, and output experiences to ensure coherent and context-aware interactions. | ||
* Natural Language Understanding (NLU): Use LLMs of your choice to understand the task. | ||
* Intelligent Reference Deduction: Intelligently deduce references to previous messages, outputs, files, agents, etc., to provide relevant and accurate responses. | ||
* Agent Orchestration: Decide on agents and their workflows to fulfill requests. Multiple strategies can be employed to create agent workflows, such as step-by-step processes or chaining of agents provided by default. | ||
* Final Control Over Conversation Flow: Maintain ultimate control over the flow of conversation with the user, ensuring coherence and goal alignment. | ||
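The functions listed above can be illustrated with a toy sketch. Everything in it is hypothetical (`ToyReasoningEngine`, `plan`, `run`, the keyword matching); it is not the Spielberg `ReasoningEngine`. A bounded history stands in for context-limit management, and a keyword match stands in for the LLM's choice of agents.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

# Hypothetical agent type: takes the user message, returns a reply string.
AgentFn = Callable[[str], str]


class ToyReasoningEngine:
    """Illustrative only; not the actual Spielberg ReasoningEngine."""

    def __init__(self, agents: Dict[str, AgentFn], max_history: int = 4) -> None:
        self.agents = agents
        # A bounded deque drops the oldest turns once the limit is reached.
        self.history: Deque[Tuple[str, str]] = deque(maxlen=max_history)

    def plan(self, message: str) -> List[str]:
        # Assumption: a keyword match replaces the LLM's agent selection.
        text = message.lower()
        return [name for name in self.agents if name in text]

    def run(self, message: str) -> str:
        self.history.append(("user", message))
        steps = self.plan(message)
        if not steps:
            reply = "No suitable agent found."
        else:
            # Step-by-step strategy: run the selected agents in order.
            reply = " | ".join(self.agents[name](message) for name in steps)
        # The engine, not the agents, decides what goes back to the user.
        self.history.append(("assistant", reply))
        return reply


engine = ToyReasoningEngine(
    agents={
        "summary": lambda msg: "summary agent ran",
        "download": lambda msg: "download agent ran",
    }
)
```

Calling `engine.run("Give me a summary of this video")` selects only the `summary` agent, while a message that mentions both names runs the two agents in registration order.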
|
||
# Agents | ||
|
||
Agents are the core building blocks of the Reasoning Engine. They are responsible for processing the input data and generating the output data. Agents are designed to be modular and extensible, allowing developers to easily add new functionality to the system. Each agent is responsible for a specific task, such as data processing, rule evaluation, or output generation. | ||
## Agents | ||
|
||
# Tools | ||
An Agent is an autonomous entity that performs specific tasks using available tools. Agents define the user experience, and each one is unique. Some agents can make the conversation fun while accomplishing tasks, similar to your favorite barista. Others might provide experiences such as a video player, a display for images or image collections, or a text-based chat. Agents can also have personalities. We plan to add multiple agents for the same tasks but with a variety of user experiences. | ||
|
||
|
||
|
||
For example, the task "Give me a summary of this video" can be accomplished by choosing one of the summary agents: | ||
|
||
* "PromptSummarizer": This agent asks you for prompts that can be used for generating a summary. You have control and freedom over the style in each interaction. | ||
* "SceneSummarizer": This agent uses scene descriptions, audio, etc., to generate a summary in a specific format using its internal prompt. | ||
|
||
|
||
|
||
Key aspects of Agents include: | ||
|
||
* Task Autonomy: Agents perform tasks independently, utilizing tools to achieve their objectives. | ||
* Unique User Experiences (UX): Each agent offers a distinct user experience, enhancing engagement and satisfaction. Multiple agents for the same task offer personalized interactions and cater to different user preferences, such as loading a specific UI or replying with a plain text message. | ||
* Standardized Agent Interface: Agents communicate with the Reasoning Engine through a common API or protocol, ensuring consistent integration and interaction. | ||
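A standardized interface of this kind could look like the sketch below. The names `BaseAgent`, `AgentResponse`, and `EchoSummaryAgent` are assumptions made for illustration, and the actual Spielberg classes may differ. The point is only that every agent exposes the same `run` entry point and returns the same response shape.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class AgentResponse:
    # Hypothetical response shape shared by all agents.
    status: str
    message: str


class BaseAgent(ABC):
    # Name and description let an engine list and select agents.
    name: str = ""
    description: str = ""

    @abstractmethod
    def run(self, **kwargs) -> AgentResponse:
        """Perform the agent's task and return a uniform response."""


class EchoSummaryAgent(BaseAgent):
    name = "summary"
    description = "Toy agent that truncates text in place of a real summary."

    def run(self, text: str = "", max_words: int = 5, **kwargs) -> AgentResponse:
        # Truncation stands in for a real LLM-generated summary.
        words = text.split()
        return AgentResponse(status="success", message=" ".join(words[:max_words]))
```

Because `run` is abstract, `BaseAgent` itself cannot be instantiated; only concrete agents that implement it can be registered.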
|
||
## Tools | ||
|
||
Tools are functional building blocks that can be created from any library and used within agents. They are the functions that enable agents to perform their tasks. For example, we have created an upload tool that wraps the videodb upload function; another is an index function with parameters. | ||
|
||
Key aspects of Tools include: | ||
|
||
* Functional Building Blocks: Serve as modular functions that agents can utilize to perform tasks efficiently. | ||
* Wrapper Functions: Act as wrappers for existing functions or libraries, enhancing modularity and reusability. | ||
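The wrapper idea can be sketched as follows. The `Tool` class is hypothetical, and `fake_upload` is a stand-in written for this example; it does not call the videodb library, and its return value is invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Tool:
    # Hypothetical wrapper: a name and description plus the wrapped callable.
    name: str
    description: str
    func: Callable[..., Any]

    def __call__(self, **kwargs: Any) -> Any:
        # Delegate to the wrapped function; agents only see the Tool.
        return self.func(**kwargs)


def fake_upload(url: str) -> dict:
    # Stand-in for a real library call such as an upload function.
    return {"source": url, "status": "uploaded"}


upload_tool = Tool(
    name="upload",
    description="Upload media from a URL.",
    func=fake_upload,
)
```

An agent would then call `upload_tool(url=...)` without knowing which library sits behind it, which is what makes the tool reusable across agents.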
|
||
Tools are the core building blocks of the Agents. They are used to extend the capabilities of the agents. Tools are designed to be modular and extensible, allowing developers to easily add new functionality to the system. Each tool is responsible for a specific task, such as data processing, rule evaluation, or output generation. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,4 @@ | ||
## Reasoning | ||
|
||
The "Reasoning" component of the Video Agent system comprises the ReasoningEngine and its configuration model, ReasoningEngineConfig. These core elements are designed to analyze and process input messages by utilizing a configurable set of language models. This facilitates advanced decision-making and response generation tailored to the context of video sessions. The configuration model allows precise control over operational parameters such as the number of iterations, system prompts, and integration with Langfuse for detailed operational tracing, enabling the system to adapt effectively to various interaction scenarios. | ||
|
||
### Reasoning Engine | ||
## Reasoning Engine | ||
|
||
|
||
::: spielberg.core.reasoning.ReasoningEngine | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,39 +1,86 @@ | ||
# Getting Started | ||
|
||
* Clone the repository: | ||
### Prerequisites | ||
|
||
```console | ||
- Python 3.9 or higher | ||
- Node.js 22.8.0 or higher | ||
- npm | ||
|
||
### Installation | ||
|
||
1. Clone the repository: | ||
|
||
```bash | ||
git clone https://github.com/video-db/Spielberg.git | ||
cd Spielberg | ||
``` | ||
|
||
* Create the .env file and set the environment variables: | ||
2. Set up the environment: | ||
|
||
```console | ||
cp .env.example .env | ||
```bash | ||
./setup.sh | ||
``` | ||
|
||
* Use virtualenv as: | ||
This script will: | ||
- Install nvm (Node Version Manager) if not already installed | ||
- Install Node.js 22.8.0 using nvm | ||
- Install Python and pip | ||
- Set up virtual environments for both frontend and backend | ||
- Install dependencies for both frontend and backend | ||
|
||
Supported platforms: | ||
- Mac | ||
- Linux | ||
|
||
3. Configure the environment variables: | ||
|
||
```console | ||
python3 -m venv .venv | ||
source .venv/bin/activate | ||
```bash | ||
cp backend/.env.example backend/.env | ||
cp frontend/.env.example frontend/.env | ||
``` | ||
|
||
* Init the database | ||
Edit the `.env` files to add your API keys and other configuration options. | ||
|
||
```console | ||
[TODO]: Add all supported variables or point to documentation where we have given the list. | ||
|
||
4. Initialize and configure the database: | ||
|
||
For SQLite (default): | ||
```bash | ||
make init-sqlite-db | ||
``` | ||
|
||
* Install the dependencies: | ||
This command will initialize the SQLite DB file in the `backend` directory. No additional configuration is required for SQLite. | ||
|
||
```console | ||
make install | ||
``` | ||
For other databases, follow the documentation [here](TODO: Add link to database configuration docs). | ||
|
||
|
||
## Project Structure | ||
|
||
* Start the server: | ||
- `backend/`: Contains the Flask backend application | ||
- `frontend/`: Contains the Vue 3 frontend application | ||
- `docs/`: Project documentation | ||
- `infra/`: Infrastructure-related files | ||
|
||
```console | ||
|
||
## Running the Application | ||
|
||
To start both the backend and frontend servers: | ||
|
||
```bash | ||
make run | ||
``` | ||
|
||
This will start the backend server on `http://127.0.0.1:8000` and the frontend server on `http://127.0.0.1:8080`. | ||
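To confirm that both servers came up, a small check like the one below may help. It is a sketch that uses only the Python standard library; the helper name `is_listening` is hypothetical, and the ports are the ones stated above.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example usage after `make run`:
# print("backend:", is_listening("127.0.0.1", 8000))
# print("frontend:", is_listening("127.0.0.1", 8080))
```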
|
||
To run only the backend server: | ||
|
||
```bash | ||
make run-be | ||
``` | ||
|
||
To run only the frontend development server: | ||
|
||
```bash | ||
make run-fe | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,3 @@ | ||
# Welcome to Spielberg | ||
|
||
The Spielberg project is an advanced video processing and analysis platform that utilizes a range of AI agents and language models to handle diverse video management needs and tasks. It features a modular architecture that supports easy expansion and integration of new functionalities. Core components include specialized agents for distinct processing tasks, multiple language models for natural language processing, and a flexible database interface for data storage and retrieval. The project emphasizes ease of installation and setup through a streamlined Makefile, catering to developers looking to deploy or extend its capabilities efficiently. | ||
|
||
## Features |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,18 @@ | ||
{% extends "base.html" %} | ||
|
||
{% block announce %} | ||
<strong>Video Agents</strong> is in open beta. Come join our | ||
<strong>Spielberg</strong> is in open beta. Come join our | ||
<a href="https://discord.com/invite/py9P639jGz"> | ||
Discord community | ||
</a>. Feedback and questions are welcome! 🚀 | ||
{% endblock %} | ||
|
||
{% block htmltitle %} | ||
{% if page.meta and page.meta.title %} | ||
<title>{{ page.meta.title }}</title> | ||
{% elif page.title and not page.is_homepage %} | ||
<title>{{ page.title | striptags }}</title> | ||
{% else %} | ||
<title>{{ config.site_name }}</title> | ||
{% endif %} | ||
{% endblock %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,53 @@ | ||
## API Routes | ||
|
||
Routes are defined in the `routes` folder. | ||
|
||
|
||
## Agent routes | ||
## GET /agent | ||
|
||
Returns information about the available agents. | ||
|
||
```json | ||
[ | ||
{ | ||
"description": "This is an agent to summarize the given video of VideoDB.", | ||
"name": "summary" | ||
}, | ||
{ | ||
"description": "Get the download URLs of the VideoDB generated streams.", | ||
"name": "download" | ||
}, | ||
{ | ||
"description": "Agent to get information about the pricing and usage of VideoDB, it is also helpful for running scenarios to get the estimates.", | ||
"name": "pricing" | ||
} | ||
] | ||
``` | ||
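A client could read this response as in the sketch below. It assumes the backend is reachable at `http://127.0.0.1:8000` and that the route is served at `/agent` with no extra prefix; both are assumptions, as are the helper names `fetch_json` and `agent_index`.

```python
import json
from urllib.request import urlopen

# Assumption: local backend address; adjust to your deployment.
BASE_URL = "http://127.0.0.1:8000"


def fetch_json(path: str, base_url: str = BASE_URL, timeout: float = 5.0):
    """GET base_url + path and decode the JSON body."""
    with urlopen(f"{base_url}{path}", timeout=timeout) as resp:
        return json.load(resp)


def agent_index(agents: list) -> dict:
    """Map each agent name to its description."""
    return {agent["name"]: agent["description"] for agent in agents}


# Example usage with a running backend:
# print(agent_index(fetch_json("/agent")))
```

Keeping the parsing in `agent_index` separate from the network call makes it possible to exercise the parsing against a saved response.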
|
||
## Session routes | ||
|
||
## GET /session | ||
|
||
Returns all sessions. | ||
|
||
```json | ||
[ | ||
{ | ||
"collection_id": "c-890bd0a5-2ec3-47c0-86dc-685953995206", | ||
"created_at": 1729092742, | ||
"metadata": {}, | ||
"session_id": "52881f6b-7560-4844-ac35-52af41d07ab8", | ||
"updated_at": 1729092742, | ||
"video_id": "m-138de44f-d963-4a4c-a239-a30df4dc496a" | ||
}, | ||
{ | ||
"collection_id": "c-890bd0a5-2ec3-47c0-86dc-685953995206", | ||
"created_at": 1729092642, | ||
"metadata": {}, | ||
"session_id": "6bf075a7-e7d4-4aba-985c-4cf0d3dc6f5b", | ||
"updated_at": 1729092642, | ||
"video_id": "m-13d436a6-ad61-410d-b51c-5ebd80e87066" | ||
} | ||
] | ||
``` |
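The session list can be post-processed on the client. The sketch below sorts sessions by most recent update and converts the timestamps to `datetime` objects. It assumes `created_at` and `updated_at` are Unix timestamps in seconds, which the sample values suggest but the source does not state; `latest_sessions` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def latest_sessions(sessions: list, limit: int = 10) -> list:
    """Return the most recently updated sessions, newest first."""
    ordered = sorted(sessions, key=lambda s: s["updated_at"], reverse=True)
    return [
        {
            "session_id": s["session_id"],
            "video_id": s["video_id"],
            # Assumption: timestamps are Unix seconds, interpreted as UTC.
            "updated_at": datetime.fromtimestamp(s["updated_at"], tz=timezone.utc),
        }
        for s in ordered[:limit]
    ]
```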
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
:root { | ||
--md-primary-fg-color: hsla(var(--md-hue), 15%, 9%, 1); | ||
--md-primary-fg-color--light: hsla(var(--md-hue), 15%, 9%, 1); | ||
--md-primary-fg-color--dark: hsla(var(--md-hue), 15%, 9%, 1); | ||
} | ||
|
||
[data-md-color-scheme="slate"] { | ||
--md-typeset-a-color: #EC5B16; | ||
} | ||
|
||
[data-md-color-scheme="default"] { | ||
--md-typeset-a-color: #EC5B16; | ||
} |