Update README.md
mlejva authored Apr 11, 2024
1 parent 3422c10 commit 6f7f251
Showing 1 changed file with 18 additions and 51 deletions.
69 changes: 18 additions & 51 deletions README.md
@@ -1,17 +1,14 @@
# Code Interpreter SDK

The repository contains a template and modules for the code interpreter sandbox. It is based on the Jupyter server and implements the Jupyter Kernel messaging protocol. This allows for sharing context between code executions and improves support for plotting charts and other displayable data.
This Code Interpreter SDK allows you to run AI-generated Python code, with each run sharing the same context. That means that subsequent runs can reference variables, definitions, etc. from past code execution runs.
The code interpreter runs inside the [E2B Sandbox](https://github.com/e2b-dev/e2b) - an open-source secure micro VM made for running untrusted AI-generated code and AI agents.
- ✅ Works with any LLM and AI framework
- ✅ Supports streaming content like charts and stdout, stderr
- ✅ Python & JS SDK
- ✅ Runs on serverless and edge functions
- ✅ 100% open source (including [infrastructure](https://github.com/e2b-dev/infra))

## Motivation

The code generated by LLMs is often split into code blocks, where each subsequent block references the previous one. This is a common pattern in Jupyter notebooks, where each cell can reference the variables and definitions from the previous cells. In a classical sandbox, each code execution is independent and does not share context with the previous executions.

This is suboptimal for a lot of Python use cases with LLMs. GPT-3.5 and GPT-4 in particular expect to run in a Jupyter Notebook environment, even when one tries to convince them otherwise. In practice, LLMs will generate code blocks that reference previous code blocks. This becomes an issue if a user wants to execute each code block separately, which is often the case.

This new code interpreter template runs a Jupyter server inside the sandbox, which allows for sharing context between code executions.
Additionally, this new template partly implements the [Jupyter Kernel messaging protocol](https://jupyter-client.readthedocs.io/en/latest/messaging.html). This means that, for example, support for plotting charts is improved and we no longer need hack-ish solutions like the one [in the current production version](https://github.com/e2b-dev/E2B/blob/main/sandboxes/code-interpreter/e2b_matplotlib_backend.py) of our code interpreter.

The current code interpreter allows you to run Python code, with each run sharing the same context. That means that subsequent runs can reference variables, definitions, etc. from past code execution runs.
<img width="1200" alt="Post-02" src="https://github.com/e2b-dev/code-interpreter/assets/5136688/2fa8c371-f03c-4186-b0b6-4151e68b0539">
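
To make the idea of shared context concrete, here is a minimal, self-contained sketch in plain Python. It is not how the SDK or the sandbox is implemented; `StatefulRunner` and `run_isolated` are hypothetical names used only to contrast a shared namespace with the "fresh state per run" behavior of a classical sandbox.

```python
import contextlib
import io


class StatefulRunner:
    """Toy stand-in for a kernel: every run shares one namespace (illustrative only)."""

    def __init__(self):
        # Variables and definitions from earlier runs live here.
        self.namespace = {}

    def run(self, code: str) -> str:
        """Execute `code` in the shared namespace and return captured stdout."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        return buffer.getvalue()


def run_isolated(code: str) -> str:
    """Classical sandbox behavior: a brand-new namespace for every run."""
    return StatefulRunner().run(code)


# Shared context: the second run can see `x` from the first run.
runner = StatefulRunner()
runner.run("x = 21")
stateful_output = runner.run("print(x * 2)")

# Independent runs: `x` is gone by the time the second block executes.
run_isolated("x = 21")
try:
    run_isolated("print(x * 2)")
    isolated_error = None
except NameError as exc:
    isolated_error = exc
```

With the shared namespace, the second block prints a result computed from `x`; with isolated runs, the same block raises a `NameError`. That failure is what an LLM-generated sequence of code blocks would hit in a sandbox that does not keep state between executions.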

## Installation

@@ -181,48 +178,18 @@ await sandbox.notebook.execCell(code, {
await sandbox.close()
```

### Pre-installed Python packages inside the sandbox
## How it works
The code generated by LLMs is often split into code blocks, where each subsequent block references the previous one. This is a common pattern in Jupyter notebooks, where each cell can reference the variables and definitions from the previous cells. In a classical sandbox, each code execution is independent and does not share context with the previous executions.

The full and always up-to-date list can be found in the [`requirements.txt`](https://github.com/e2b-dev/E2B/blob/stateful-code-interpreter/sandboxes/code-interpreter-stateful/requirements.txt) file.
This is suboptimal for a lot of Python use cases with LLMs. GPT-3.5 and GPT-4 in particular expect to run in a Jupyter Notebook environment, even when one tries to convince them otherwise. In practice, LLMs will generate code blocks that reference previous code blocks. This becomes an issue if a user wants to execute each code block separately, which is often the case.

```text
# Jupyter server requirements
jupyter-server==2.13.0
ipykernel==6.29.3
ipython==8.22.2
# Other packages
aiohttp==3.9.3
beautifulsoup4==4.12.3
bokeh==3.3.4
gensim==4.3.2
imageio==2.34.0
joblib==1.3.2
librosa==0.10.1
matplotlib==3.8.3
nltk==3.8.1
numpy==1.26.4
opencv-python==4.9.0.80
openpyxl==3.1.2
pandas==1.5.3
plotly==5.19.0
pytest==8.1.0
python-docx==1.1.0
pytz==2024.1
requests==2.26.0
scikit-image==0.22.0
scikit-learn==1.4.1.post1
scipy==1.12.0
seaborn==0.13.2
soundfile==0.12.1
spacy==3.7.4
textblob==0.18.0
tornado==6.4
urllib3==1.26.7
xarray==2024.2.0
xlrd==2.0.1
```
This new code interpreter template runs a Jupyter server inside the sandbox, which allows for sharing context between code executions.
Additionally, this new template partly implements the [Jupyter Kernel messaging protocol](https://jupyter-client.readthedocs.io/en/latest/messaging.html). This means that, for example, support for plotting charts is improved and we no longer need hack-ish solutions like the one [in the current production version](https://github.com/e2b-dev/E2B/blob/main/sandboxes/code-interpreter/e2b_matplotlib_backend.py) of our code interpreter.
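
For readers unfamiliar with the Jupyter Kernel messaging protocol, the sketch below builds an `execute_request` message as a plain dictionary, following the field layout described in the public protocol documentation. It only illustrates the message shape and is not taken from this repository's code; the helper name `make_execute_request`, the `"sandbox"` username, and the session ID are assumptions made for the example.

```python
import json
import uuid
from datetime import datetime, timezone


def make_execute_request(code: str, session_id: str) -> dict:
    """Build an `execute_request` message (hypothetical helper, illustrative only)."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,       # unique per message
            "session": session_id,            # unique per client session
            "username": "sandbox",            # assumption: any identifier works here
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.3",                 # messaging protocol version
        },
        "parent_header": {},                  # empty: this message starts the exchange
        "metadata": {},
        "content": {
            "code": code,                     # source code the kernel should run
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
        # Used when the message is sent as JSON over a Jupyter server WebSocket.
        "channel": "shell",
    }


def is_reply_to(reply: dict, request: dict) -> bool:
    """Replies carry the request's header in `parent_header`, so match on msg_id."""
    return reply.get("parent_header", {}).get("msg_id") == request["header"]["msg_id"]


request = make_execute_request("import math\nprint(math.pi)", session_id=uuid.uuid4().hex)
payload = json.dumps(request)  # what would be written to the kernel connection
```

In the protocol, the kernel answers such a request with messages like `stream` (stdout and stderr), `display_data` (for example, rendered charts), `execute_result`, and `error`, each pointing back to the request through `parent_header`. This is what lets a client collect chart output as structured data instead of relying on a patched plotting backend.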

## Pre-installed Python packages inside the sandbox

The full and always up-to-date list can be found in the [`requirements.txt`](https://github.com/e2b-dev/E2B/blob/stateful-code-interpreter/sandboxes/code-interpreter-stateful/requirements.txt) file.

### Custom template using Code Interpreter
## Custom E2B sandbox with Code Interpreter SDK

The template requires custom setup. If you want to build your own custom template and use Code Interpreter, look at [README.md](./template/README.md) in the template folder.
