[DOC] Add tutorial for adding your own objects #368

Merged
Changes from 60 commits
77 commits
0a479c7
Add draft docstring
smokestacklightnin Mar 12, 2024
c95d795
Add empty `TutorialAssistant`
smokestacklightnin Mar 12, 2024
254941c
Add first drafts of descriptions of the parts of an `Assistant`
smokestacklightnin Mar 13, 2024
7f257ff
Remove `max_input_size` from tutorial
smokestacklightnin Mar 13, 2024
c4c1f29
Add a very simple and primitive default answer
smokestacklightnin Mar 13, 2024
33438f9
Minor formatting changes
smokestacklightnin Mar 14, 2024
4b499d8
Add section on including the assistant in Ragna
smokestacklightnin Mar 14, 2024
09c38ba
Add step labels to subheadings
smokestacklightnin Mar 14, 2024
14d2215
Add `TutorialAssistant._default_answer` and add clarification to expl…
smokestacklightnin Mar 14, 2024
c9abade
Add syntax highlighting for `__init__.py`
smokestacklightnin Mar 14, 2024
bbb3715
Rename tutorial to use the singular instead of the plural
smokestacklightnin Mar 14, 2024
fc44d46
Reorganize the tutorial
smokestacklightnin Mar 15, 2024
6581b1c
Remove incorrect assumption
smokestacklightnin Mar 15, 2024
c62f9ff
Remove unnecessary typing annotations
smokestacklightnin Mar 15, 2024
c4b9a5f
Remove unnecessary code obfuscation
smokestacklightnin Mar 15, 2024
6a0a505
Remove incorrect discussion on how to add assistant to Ragna and add …
smokestacklightnin Mar 15, 2024
8ea1cb7
Add note on streaming responses
smokestacklightnin Mar 15, 2024
771fc71
Remove unnecessary docstring
smokestacklightnin Mar 18, 2024
715e6bb
Add back type hints
smokestacklightnin Mar 18, 2024
6e7c6a8
Replace docstring with explanation in paragraphs before
smokestacklightnin Mar 18, 2024
eccb483
Remove unnecessary notes
smokestacklightnin Mar 18, 2024
edc7b45
Rename `gallery_adding_an_assistant.py` to `gallery_adding_components…
smokestacklightnin Mar 18, 2024
7a8a618
Restructure and condense Adding Assistants docs
smokestacklightnin Mar 19, 2024
9a6f0fa
Add tutorial implementation of source storage
smokestacklightnin Mar 19, 2024
944698c
Add explanation how to add source storage
smokestacklightnin Mar 19, 2024
2ca8d5e
Break up long lines
smokestacklightnin Mar 19, 2024
9bf74f7
Add section on including external Python objects
smokestacklightnin Mar 19, 2024
599180c
Simplify tutorial title
smokestacklightnin Mar 19, 2024
8d987d4
Move description before implementation
smokestacklightnin Mar 19, 2024
f7ec2f9
Make subheadings more concise
smokestacklightnin Mar 20, 2024
d053f52
Shorten sample answer
smokestacklightnin Mar 20, 2024
15614c4
Move explanation before implementation again
smokestacklightnin Mar 20, 2024
01d8ec0
Rephrase docstring
smokestacklightnin Mar 20, 2024
bb31ce3
Include custom Python objects using the config file (preferred) inste…
smokestacklightnin Apr 3, 2024
be2ae8e
Remove unnecessary comment
smokestacklightnin Apr 3, 2024
4253a68
Change formatting for argument list
smokestacklightnin Apr 3, 2024
a68c313
Make description more concise
smokestacklightnin Apr 3, 2024
147b028
Move source storage section before assistant section
smokestacklightnin Apr 4, 2024
67e0a05
Move custom Python objects section before source storage
smokestacklightnin Apr 4, 2024
9baa430
Add short description to Assistant section of tutorial
smokestacklightnin Apr 4, 2024
795c3bd
Add explanation of `Source`s and `SourceStorage`s
smokestacklightnin Apr 4, 2024
b6e45b7
Put currently present sections in correct order
smokestacklightnin Apr 5, 2024
52d07af
Add Python API tutorial
smokestacklightnin Apr 6, 2024
c1ec46e
Add space
smokestacklightnin Apr 6, 2024
08e44a5
Add section on Web UI
smokestacklightnin Apr 6, 2024
0266c50
Add REST API section without explanation
smokestacklightnin Apr 6, 2024
5dc104a
Add description to the REST API section
smokestacklightnin Apr 6, 2024
a4d50c1
Remove printing of dummy file
smokestacklightnin Apr 6, 2024
81e1247
Make title consistent with others
smokestacklightnin Apr 6, 2024
614d6eb
Import `Assistant` when running `documentation_helper.py`
smokestacklightnin Apr 8, 2024
a6d60b4
Add newlines
smokestacklightnin Apr 8, 2024
76b1ff9
Import `Source` when running `documentation_helper.py`
smokestacklightnin Apr 8, 2024
5f832db
Import `typing.Iterator` when running `documentation_helper.py`
smokestacklightnin Apr 8, 2024
67cd9a3
Add print statements for readers to know where and when functions are…
smokestacklightnin Apr 9, 2024
e61db90
Simplify the starting and authenticating process
smokestacklightnin Apr 9, 2024
33bb68b
Don't include redundant link about uploading documents
smokestacklightnin Apr 9, 2024
225ced7
Consolidate configuration commands
smokestacklightnin Apr 9, 2024
b968870
Remove redundant line
smokestacklightnin Apr 10, 2024
9e18d2b
Add screenshot of Web UI
smokestacklightnin Apr 10, 2024
88c95d0
add support for custom source storages in documentation helpers
pmeier Apr 10, 2024
7bd848d
Track tutorial image with git-lfs
smokestacklightnin Apr 10, 2024
d0306c6
Add image via git-lfs
smokestacklightnin Apr 10, 2024
66df502
progress
pmeier Apr 21, 2024
4f7a9aa
dirty
pmeier Apr 21, 2024
19b1b30
async
pmeier Apr 22, 2024
ad88e98
ui custom
pmeier Apr 23, 2024
0ae3df6
cleanup
pmeier Apr 23, 2024
8c90c0f
Merge branch 'main' into tutorial/adding-more-components/rebased
pmeier Apr 24, 2024
71e9b18
lint
pmeier Apr 24, 2024
d0a2867
include image for web UI
pmeier Apr 24, 2024
c5f4b34
remove unnecessary lfs filter
pmeier Apr 24, 2024
3b603c6
Merge branch 'main' into tutorial/adding-more-components/rebased
pmeier May 1, 2024
65a7263
update
pmeier May 1, 2024
572e587
phrasing
pmeier May 16, 2024
26d5aa8
Merge branch 'main' into tutorial/adding-more-components/rebased
pmeier May 16, 2024
feb375c
cleanup
pmeier May 16, 2024
737e322
Rewrite rest API setup
pmeier May 16, 2024
Binary file added docs/assets/images/ragna-tutorial-components.png
21 changes: 13 additions & 8 deletions docs/documentation_helpers.py
@@ -1,8 +1,10 @@
 import inspect
+import itertools
 import os
 import subprocess
 import sys
 import tempfile
+import textwrap
 from pathlib import Path
 from typing import Optional
 
@@ -47,14 +49,17 @@ def _prepare_config(self, config: Config) -> tuple[str, str]:
         )[2].filename
         custom_module = deploy_directory.name
         with open(deploy_directory / f"{custom_module}.py", "w") as file:
-            # TODO: this currently only handles assistants. When needed, we can extend
-            # to source storages.
-            file.write("from ragna import assistants\n\n")
-
-            for assistant in config.assistants:
-                if assistant.__module__ == "__main__":
-                    file.write(f"{inspect.getsource(assistant)}\n\n")
-                    assistant.__module__ = custom_module
+            # FIXME Find a way to automatically detect necessary imports
+            file.write("import uuid; from uuid import *\n")
+            file.write("import textwrap; from textwrap import*\n")
+            file.write("from typing import *\n")
+            file.write("from ragna import *\n")
+            file.write("from ragna.core import *\n")
+
+            for component in itertools.chain(config.source_storages, config.assistants):
+                if component.__module__ == "__main__":
+                    file.write(f"{textwrap.dedent(inspect.getsource(component))}\n\n")
+                    component.__module__ = custom_module
 
         config.to_file(config_path)
         return python_path, config_path
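
The new `textwrap.dedent(inspect.getsource(component))` call presumably guards against components whose source is indented, for example a class defined inside a function or a conditional block. The sketch below illustrates why dedenting matters before writing such source to a module file. It assumes a hand-written string standing in for what `inspect.getsource` would return; the `Demo` class is hypothetical and not part of Ragna.

```python
import textwrap

# Stand-in for the text inspect.getsource() returns for a class defined in
# an indented scope. `Demo` is a hypothetical name used only for illustration.
indented_source = "    class Demo:\n        value = 1\n"

# Written to a module as-is, the leading indentation is a syntax error.
try:
    compile(indented_source, "<demo>", "exec")
    compiles_as_is = True
except IndentationError:
    compiles_as_is = False

# After dedenting, the same source is valid at module level.
namespace = {}
exec(compile(textwrap.dedent(indented_source), "<demo>", "exec"), namespace)
print(compiles_as_is, namespace["Demo"].value)
```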
269 changes: 269 additions & 0 deletions docs/tutorials/gallery_adding_components.py
@@ -0,0 +1,269 @@
"""
# Adding Components

While Ragna has builtin support for a few [assistants][ragna.assistants] and
[source storages][ragna.source_storages], its real strength is allowing users
to incorporate custom components. This tutorial covers the basics of how to do that.
"""


# %%
# ## Source Storage

# %%
# A [`Source`][ragna.core.Source] is a data class that stores the documents that Ragna will
# use to augment your prompts. [`SourceStorage`][ragna.core.SourceStorage]s, usually [vector
# databases][ragna.source_storages], are the tools Ragna uses to store the documents held in the
# [`Source`][ragna.core.Source]s.

# %%
# [`SourceStorage`][ragna.core.SourceStorage] has two abstract methods,
# [`store()`][ragna.core.SourceStorage.store] and [`retrieve()`][ragna.core.SourceStorage.retrieve].

# %%
# [`store()`][ragna.core.SourceStorage.store] takes a list of [`Document`][ragna.core.Document]s
# as an argument and places them in the database that you are using to hold them. How this
# is done differs for each source storage implementation.

# %%
# [`retrieve()`][ragna.core.SourceStorage.retrieve] returns sources matching
# the given prompt in order of relevance.

import textwrap
import uuid

from ragna.core import Document, Source, SourceStorage


class TutorialSourceStorage(SourceStorage):
    def __init__(self):
        # set up database
        self._storage: dict[uuid.UUID, list[Source]] = {}

    def store(self, documents: list[Document], chat_id: uuid.UUID) -> None:
        print(f"Hello from {type(self).__name__}().store()")
        self._storage[chat_id] = [
            Source(
                id=str(uuid.uuid4()),
                document=document,
                location=f"page {page.number}"
                if (page := next(document.extract_pages())).number
                else "",
                content=(content := textwrap.shorten(page.text, width=100)),
                num_tokens=len(content.split()),
            )
            for document in documents
        ]

    def retrieve(
        self, documents: list[Document], prompt: str, *, chat_id: uuid.UUID
    ) -> list[Source]:
        # look up the sources stored for this chat and report how many there are
        sources = self._storage[chat_id]
        print(f"Retrieving {len(sources)} sources from {type(self).__name__}")
        return sources
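

# %%
# The tutorial storage above returns every stored source regardless of the prompt. As a
# rough illustration of what "in order of relevance" could mean, the sketch below ranks
# sources by word overlap with the prompt. It is not how Ragna's builtin source storages
# work. `SketchSource` and `rank_by_overlap` are hypothetical names, and `SketchSource`
# only mimics two fields of [`Source`][ragna.core.Source].

```python
from dataclasses import dataclass


@dataclass
class SketchSource:
    # Hypothetical stand-in for ragna.core.Source, reduced to two fields.
    id: str
    content: str


def rank_by_overlap(prompt: str, sources: list[SketchSource]) -> list[SketchSource]:
    # Score each source by how many distinct prompt words appear in its content.
    prompt_words = set(prompt.lower().split())

    def score(source: SketchSource) -> int:
        return len(prompt_words & set(source.content.lower().split()))

    # sorted() is stable, so sources with equal scores keep their stored order.
    return sorted(sources, key=score, reverse=True)


sources = [
    SketchSource(id="a", content="Bananas are yellow"),
    SketchSource(id="b", content="Ragna is a RAG orchestration framework"),
]
ranked = rank_by_overlap("what is ragna", sources)
print([source.id for source in ranked])
```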


# %%
# ## Assistant

# %%
# This is an example of an [`Assistant`][ragna.core.Assistant], which provides an interface between the user and the API for the LLM that they are using. For simplicity, we are not going to implement an actual LLM here, but rather a demo assistant that just mirrors back the inputs. This is similar to the [`RagnaDemoAssistant`][ragna.assistants.RagnaDemoAssistant].

# %%
# The main thing to do is to implement the [`answer()`][ragna.core.Assistant.answer] abstract method.
# The [`answer()`][ragna.core.Assistant.answer] method is where you put the logic to access your LLM.
# This could call an API directly or call a local LLM.

# %%
# Your [`answer()`][ragna.core.Assistant.answer] method should take a prompt in the form of a
# string and a list of [`Source`][ragna.core.Source]s, in addition to whatever other arguments
# are necessary for your particular assistant. The return type is an [`Iterator`](https://docs.python.org/3/library/stdtypes.html#typeiter) of strings.

# %%
# !!! note
#     Ragna also supports streaming responses from the assistant. See the
#     [example of how to use streaming responses](../../generated/examples/gallery_streaming.md)
#     for more information.

from typing import Iterator

from ragna.core import Assistant, Source


class TutorialAssistant(Assistant):
    def answer(self, prompt: str, sources: list[Source]) -> Iterator[str]:
        print(f"Giving an answer from {type(self).__name__}")
        yield (
            f"This is a default answer. There were {len(sources)} sources. "
            f"The prompt was: "
            f"{prompt}"
        )
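

# %%
# Because the return type is an iterator of strings, an assistant may yield its reply in
# several pieces instead of one. The sketch below shows the idea with a plain generator
# function. `answer_in_chunks` is a hypothetical stand-in that takes a source count
# instead of real [`Source`][ragna.core.Source] objects, so it runs without Ragna.

```python
from typing import Iterator


def answer_in_chunks(prompt: str, num_sources: int) -> Iterator[str]:
    # Hypothetical stand-in for an assistant's answer(): instead of one big
    # string, it yields the reply word by word.
    reply = f"There were {num_sources} sources. The prompt was: {prompt}"
    for word in reply.split(" "):
        yield word + " "


# A consumer can display chunks as they arrive or join them into one string.
chunks = list(answer_in_chunks("What is Ragna?", 1))
full_answer = "".join(chunks).rstrip()
print(full_answer)
```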


# %%
# ## Including Custom Python Objects

# %%
# If the module containing the custom object you want to include is in your
# [`PYTHONPATH`](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH),
# you can either use the [config file](../../references/config.md#referencing-python-objects)
# to add it, or follow the [Python API](#using-the-python-api-with-custom-objects) instructions below.

# %%
# If the module containing the custom object you want to include is not in your
# [`PYTHONPATH`](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH),
# you need to add its directory first. For example, if the module is located at
# `~/tutorials/tutorial.py`, you can add `~/tutorials/` to your
# [`PYTHONPATH`](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH) using
# the command
#
# ```bash
# $ export PYTHONPATH=$PYTHONPATH:~/tutorials/
# ```
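
# %%
# To check that a directory is picked up, you can try importing the module from Python.
# The sketch below simulates this with a throwaway directory and appends it to `sys.path`
# at runtime, which makes it importable for the running interpreter much like a
# `PYTHONPATH` entry would. The module name `sketch_tutorial` is hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway directory containing a module, standing in for
# ~/tutorials/tutorial.py. The module name is hypothetical.
directory = Path(tempfile.mkdtemp())
(directory / "sketch_tutorial.py").write_text("GREETING = 'hello'\n")

# Make the directory importable for the running interpreter.
sys.path.append(str(directory))
importlib.invalidate_caches()

module = importlib.import_module("sketch_tutorial")
print(module.GREETING)
```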

# %%
# ## Using the Python API with Custom Objects

# We first import some helpers that tell Python where to find the demo document that we
# will use for RAG.

import sys
from pathlib import Path

sys.path.insert(0, str(Path.cwd().parent))

import documentation_helpers

document_path = documentation_helpers.assets / "ragna.txt"

# %%
# We next import the [ragna.Rag][] class and set up a chat using the custom objects from above.

from ragna import Rag

chat = Rag().chat(
    documents=[document_path],
    source_storage=TutorialSourceStorage,
    assistant=TutorialAssistant,
)

# %%
# Before we can ask a question, we need to [`prepare`][ragna.core.Chat.prepare] the chat, which
# under the hood stores the documents we have selected in the source storage.

_ = await chat.prepare()

# %%
# Finally, we can get an [`answer`][ragna.core.Chat.answer] to a question.

print(await chat.answer("What is Ragna?"))


# %%
# ## Using the REST API with Custom Objects

# %%
# To use our custom objects with the REST API or Web UI, make sure they are in your
# [`PYTHONPATH`](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH).

# %%
# If you want to use a [configuration file](../../references/config.md#example), you can add lines like the following.
#
# ```toml
# source_storages = [
#     "tutorial.TutorialSourceStorage"
# ]
# assistants = [
#     "tutorial.TutorialAssistant"
# ]
# ```
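
# %%
# Entries such as `"tutorial.TutorialAssistant"` are dotted names: a module path followed
# by the object's name. The sketch below shows one common way to turn such a string into a
# Python object with `importlib`. It is an illustration only and does not claim to be how
# Ragna resolves these entries. `resolve_dotted_name` is a hypothetical helper, and a
# standard library class stands in for the tutorial components.

```python
import importlib


def resolve_dotted_name(name: str) -> object:
    # Split "package.module.Object" into the module path and the attribute name.
    module_name, _, attribute = name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# A standard library class stands in for "tutorial.TutorialAssistant".
resolved = resolve_dotted_name("collections.OrderedDict")
print(resolved.__name__)
```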

# %%
# If you're using the Web UI and you have added the components to the configuration file
# properly, you should see something like this when you create a chat:
# ![](../../assets/images/ragna-tutorial-components.png)

# %%
# The rest of this section will follow the steps of
# [the REST API tutorial](../../generated/tutorials/gallery_rest_api.md). More detail
# can be found there. This section focuses specifically on using custom objects.


# %%
# We first set up the [configuration](../../references/config.md) and start the API.

import sys
from pathlib import Path

sys.path.insert(0, str(Path.cwd().parent))

import documentation_helpers

from ragna.deploy import Config


config = Config(source_storages=[TutorialSourceStorage], assistants=[TutorialAssistant])

rest_api = documentation_helpers.RestApi()

client = rest_api.start(config, authenticate=True)


# %%
# Next, we upload the documents.

import json

document_name = "ragna.txt"

with open(documentation_helpers.assets / document_name, "rb") as file:
    content = file.read()

response = client.post("/document", json={"name": document_name}).raise_for_status()
document_upload = response.json()

document = document_upload["document"]

parameters = document_upload["parameters"]
client.request(
    parameters["method"],
    parameters["url"],
    data=parameters["data"],
    files={"file": content},
).raise_for_status()

# %%
# We can now start chatting, which also takes place in two steps.
# See
# [the REST API tutorial](../../generated/tutorials/gallery_rest_api.md#step-5-start-chatting)
# for more details.

# %%
# Note that our source storage and assistant are referenced by name, as strings, in the JSON payload below.

response = client.post(
    "/chats",
    json={
        "name": "Tutorial REST API",
        "documents": [document],
        "source_storage": "TutorialSourceStorage",
        "assistant": "TutorialAssistant",
        "params": {},
    },
).raise_for_status()
chat = response.json()

client.post(f"/chats/{chat['id']}/prepare").raise_for_status()

# %%
# Finally, we ask our assistant a question.

response = client.post(
    f"/chats/{chat['id']}/answer",
    json={"prompt": "What is Ragna?"},
).raise_for_status()
answer = response.json()
print(json.dumps(answer, indent=2))

print(answer["content"])

rest_api.stop()