Future-House · jamesbraza · Oct 26, 2024 · Oct 25, 2024 · Oct 26, 2024 · Oct 26, 2024
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -55,6 +55,10 @@ repos:
     rev: 0.0.8
     hooks:
       - id: markdown-toc-creator
+  - repo: https://github.com/adamchainz/blacken-docs
+    rev: 1.19.1
+    hooks:
+      - id: blacken-docs
   - repo: https://github.com/srstevenson/nb-clean
     rev: 4.0.1
     hooks:

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -0,0 +1,21 @@
+# Contributing to aviary
+
+## Repo Structure
+
+aviary is a monorepo using
+[`uv`'s workspace layout](https://docs.astral.sh/uv/concepts/workspaces/#workspace-layouts).
+
+## Installation
+
+1. Git clone this repo
+2. Install the project manager `uv`:
+   https://docs.astral.sh/uv/getting-started/installation/
+3. Run `uv sync`
+
+This installs the full monorepo into your local environment in editable mode.
+
+## Testing
+
+To run the tests, run `pytest` from the repo root.
+
+Note that the tests need the `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` environment variables set.
diff --git a/README.md b/README.md
@@ -1,10 +1,17 @@
 # aviary
 
+![PyPI Version](https://img.shields.io/pypi/v/fhaviary)
+![PyPI Python Versions](https://img.shields.io/pypi/pyversions/fhaviary)
+![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)
+![Tests](https://github.com/Future-House/aviary/actions/workflows/tests.yml/badge.svg)
+
 Gymnasium framework for training language model agents on constructive tasks.
 
 <!--TOC-->
 
 - [Installation](#installation)
+  - [Google Colab](#google-colab)
+  - [Developer Installation](#developer-installation)
 - [Messages](#messages)
 - [Environment](#environment)
   - [Environment subclass and state](#environment-subclass-and-state)
@@ -30,26 +37,43 @@ Gymnasium framework for training language model agents on constructive tasks.
 
 ## Installation
 
-To install aviary:
+To install aviary (note `fh` stands for FutureHouse):
 
 ```bash
-pip install -e .
+pip install fhaviary
 ```
 
-To install aviary and the provided environments:
+To install aviary with the bundled environments:
 
 ```bash
-pip install -e . -e packages/gsm8k -e packages/hotpotqa
+pip install "fhaviary[gsm8k]"
+# or
+pip install "fhaviary[hotpotqa]"
+# or everything
+pip install "fhaviary[dev]"
 ```
 
-To run test suites you will need to set the `OPENAI_API_KEY` and `ANTHROPIC_API_KEY`
-environment variables. In `~/.bashrc` you can add:
+### Google Colab
+
+As of October 25, 2024, Google Colab does not yet support Python 3.11 or 3.12
+([issue](https://github.com/googlecolab/colabtools/issues/3190)).
+
+As a workaround, you will need to install Python 3.11 into your notebook.
+The following snippet does that:
 
 ```bash
-export OPENAI_API_KEY=your_openai_api_key
-export ANTHROPIC_API_KEY=your_anthropic_api_key
+!sudo apt update > /dev/null
+!sudo apt-get install -y python3.11 python3.11-dev python3.11-distutils python3.11-venv > /dev/null
+!curl -sS https://bootstrap.pypa.io/get-pip.py | python3.11 > /dev/null
+!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1 > /dev/null
+!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 2 > /dev/null
+!sudo apt autoremove -y > /dev/null
 ```
 
+### Developer Installation
+
+For local development, please see [CONTRIBUTING.md](CONTRIBUTING.md).
+
 ## Messages
 
 Communication between the agent and environment is done through messages.
@@ -96,12 +120,14 @@ information that you want to persist between steps and between tools.
 
 ```py
 from pydantic import BaseModel
-from aviary.env import Environment
+from aviary.core import Environment
+
 
 class ExampleState(BaseModel):
     reward: float = 0
     done: bool = False
 
+
 class ExampleEnv(Environment[ExampleState]):
     state: ExampleState
 ```
@@ -114,7 +140,7 @@ tasks, etc. attached to it.
 We expose a simple interface to some commonly-used environments that are included in the aviary codebase. You can instantiate one by referring to its name and passing keyword arguments:
 
 ```py
-from aviary.env import Environment
+from aviary.core import Environment
 
 env = Environment.from_name(
     "calculator",
@@ -128,7 +154,7 @@ Included with some environments are collections of problems that define training
 We refer to these as `TaskDataset`s, and expose them with a similar interface:
 
 ```py
-from aviary.env import TaskDataset
+from aviary.core import TaskDataset
 
 dataset = TaskDataset.from_name("hotpotqa", split="dev")
 ```
@@ -203,8 +229,9 @@ def print_story(story: str | bytes, state: ExampleState) -> None:
 Now we'll define the `reset` function which should set-up the tools and return one or more observations and the tools.
 
 ```py
-from aviary.message import Message
-from aviary.tools import Tool
+from aviary.core import Message
+from aviary.core import Tool
+
 
 def reset(self) -> tuple[list[Message], list[Tool]]:
     self.tools = [Tool.from_function(ExampleEnv.print_story)]
@@ -220,7 +247,8 @@ Now we can define the `step` function which should take an action and return the
 the episode was truncated.
 
 ```py
-from aviary.message import Message
+from aviary.core import Message
+
 
 async def step(self, action: Message) -> tuple[list[Message], float, bool, bool]:
     msgs: list[Message] = await self.exec_tool_calls(action, state=self.state)
@@ -234,7 +262,8 @@ You will probably often use this specific syntax for calling the tools - calling
 Lastly, we can define a function to export the state for visualization or debugging purposes. This is optional.
 
 ```py
-from aviary.env import Frame
+from aviary.core import Frame
+
 
 def export_frame(self) -> Frame:
     return Frame(

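The README hunks above move every example import from `aviary.env`, `aviary.message`, and `aviary.tools` to `aviary.core`. Downstream code that still uses the old paths could be updated with something like the sketch below. The helper name `rewrite_imports` is hypothetical and not part of aviary. The sketch assumes that every name imported from the three old modules is also available from `aviary.core`, which this diff only confirms for `Environment`, `TaskDataset`, `Message`, `Tool`, `Frame`, and `DummyEnv`.

```python
import re

# Old module paths that appear on the removed lines of this diff.
# Assumption: names imported from them are re-exported by aviary.core.
OLD_MODULES = ("env", "message", "tools")

# Matches the `from aviary.<old> import ` prefix of a single-line import.
_OLD_IMPORT = re.compile(
    r"^([ \t]*from[ \t]+aviary)\.(?:" + "|".join(OLD_MODULES) + r")([ \t]+import[ \t]+)",
    flags=re.MULTILINE,
)


def rewrite_imports(source: str) -> str:
    """Point `from aviary.<old> import ...` lines at `aviary.core` (hypothetical helper)."""
    return _OLD_IMPORT.sub(r"\1.core\2", source)


if __name__ == "__main__":
    before = "from aviary.env import Environment\nfrom aviary.tools import Tool\n"
    print(rewrite_imports(before), end="")
```

Only `from ... import ...` statements are handled; plain `import aviary.env` statements and attribute access such as `aviary.env.Environment` would need separate treatment.
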
diff --git a/pyproject.toml b/pyproject.toml
@@ -32,16 +32,15 @@ cloud = [
 ]
 dev = [
     "SQLAlchemy[aiosqlite]~=2.0",  # Match aviary dependencies
-    "aviary.gsm8k[typing]",  # So `uv sync` pulls this in, and for type stubs
-    "aviary.hotpotqa",  # So `uv sync` pulls this in
-    "codeflash",
+    "aviary.gsm8k[typing]",
+    "aviary.hotpotqa",
     "fhaviary[image,llm,server,typing,xml]",
     "ipython>=8",  # Pin to keep recent
     "mypy>=1.8",  # Pin for mutable-override
     "pre-commit>=3.4",  # Pin to keep recent
     "pydantic~=2.9",  # Pydantic 2.9 changed JSON schema exports 'allOf', so ensure tests match
     "pylint-pydantic",
-    "pylint>=3.2",
+    "pylint>=3.2",  # Pin to keep recent
     "pytest-asyncio",
     "pytest-recording",
     "pytest-subtests",

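The README text removed in this PR named `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` as the environment variables the test suite needs. A quick pre-flight check before running `pytest` might look like the sketch below. The function `missing_api_keys` is a hypothetical helper, not something this repo provides, and treating an empty value as missing is an assumption.

```python
import os
from collections.abc import Mapping

# Variable names taken from the README lines removed in this diff.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def missing_api_keys(environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    return [name for name in REQUIRED_KEYS if not environ.get(name)]


# Demonstration with an explicit mapping, so the real environment is not needed.
print(missing_api_keys({"OPENAI_API_KEY": "placeholder"}))
```

Calling `missing_api_keys()` with no argument inspects the current process environment; an empty list means both keys are set.
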
diff --git a/tests/conftest.py b/tests/conftest.py
@@ -2,7 +2,7 @@
 
 import pytest
 
-from aviary.env import DummyEnv
+from aviary.core import DummyEnv
 
 from . import CASSETTES_DIR