Helm

Helm is an open-source application for feature-based steering of large language models (LLMs), inspired by Goodfire. It clamps sparse autoencoder (SAE) features to chosen values in order to steer a model's responses, giving users direct control over AI-generated content.

Introduction

Helm lets users apply multiple features at the same time, which allows fine-grained steering. By clamping several features at once, users can orient a language model's responses toward their needs.
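To make "clamping" concrete, here is a minimal sketch of the idea on a toy sparse autoencoder, using only the standard library. It assumes the common ReLU encoder and linear decoder form of an SAE. The weights, shapes, and function names are made up for illustration and are not taken from Helm's code.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def encode(x: Vector, w_enc: Matrix, b_enc: Vector) -> Vector:
    """SAE encoder: ReLU(W_enc x + b_enc), one activation per feature."""
    return [max(0.0, p + b) for p, b in zip(matvec(w_enc, x), b_enc)]


def decode(f: Vector, w_dec: Matrix, b_dec: Vector) -> Vector:
    """SAE decoder: W_dec f + b_dec, back in residual stream space."""
    return [r + b for r, b in zip(matvec(w_dec, f), b_dec)]


def clamp_and_decode(x: Vector, w_enc: Matrix, b_enc: Vector,
                     w_dec: Matrix, b_dec: Vector,
                     clamps: Dict[int, float]) -> Vector:
    """Encode x, overwrite the selected features, then decode."""
    feats = encode(x, w_enc, b_enc)
    for idx, value in clamps.items():
        feats[idx] = value  # force the feature to the requested activation
    return decode(feats, w_dec, b_dec)


# Toy SAE (illustrative values): 2-dim residual stream, 3 features.
W_ENC = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # shape (3, 2)
B_ENC = [0.0, 0.0, -1.0]
W_DEC = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]    # shape (2, 3)
B_DEC = [0.0, 0.0]

x = [1.0, 0.5]
baseline = clamp_and_decode(x, W_ENC, B_ENC, W_DEC, B_DEC, {})
# Two features clamped at once: feature 0 switched off, feature 2 boosted.
steered = clamp_and_decode(x, W_ENC, B_ENC, W_DEC, B_DEC, {0: 0.0, 2: 4.0})
print(baseline, steered)
```

In a real model the steered vector would replace (or be added to) the residual stream at the SAE's layer, so that later layers see the modified activations.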

Usage

To get started with Helm, follow the steps below:

git clone git@github.com:morgancmartin/helm.git
cd helm/
source venv/bin/activate
pip install -r requirements.txt
cd ./frontend/
npm run dev

Implementation Details

Helm is built on the following technologies and frameworks:

Model: Uses the GPT2-small model to generate responses.
Frontend: Built with Remix.
Transformer Hooking: Uses the HookedTransformer class from the TransformerLens library.
Sparse Autoencoders: Uses Joseph Bloom's open-source sparse autoencoders, which cover all residual stream layers of GPT2-small. More details are available on Neuronpedia.
Feature Search and Explanation: Uses Neuronpedia's feature search and explanation API.
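The components above meet at a single feature: a (layer, index) pair selects both a hook point in the model and a feature record on Neuronpedia. The sketch below shows one way to represent that pair. The hook name follows TransformerLens's `blocks.{layer}.hook_resid_pre` naming. The Neuronpedia URL pattern, the `res-jb` set id, and the `FeatureRef` class are assumptions for illustration and should be checked against Neuronpedia's API documentation and Helm's own code.

```python
from dataclasses import dataclass

NEURONPEDIA_API = "https://www.neuronpedia.org/api/feature"  # assumed base URL
MODEL_ID = "gpt2-small"
SAE_SET = "res-jb"  # assumed Neuronpedia id for the residual stream SAEs
N_LAYERS = 12       # GPT2-small has 12 transformer blocks


@dataclass(frozen=True)
class FeatureRef:
    """Hypothetical handle for one SAE feature in one residual stream layer."""
    layer: int
    index: int

    def __post_init__(self) -> None:
        if not 0 <= self.layer < N_LAYERS:
            raise ValueError(f"layer must be in [0, {N_LAYERS}), got {self.layer}")
        if self.index < 0:
            raise ValueError(f"index must be non-negative, got {self.index}")

    @property
    def hook_name(self) -> str:
        """TransformerLens hook point for the residual stream entering this layer."""
        return f"blocks.{self.layer}.hook_resid_pre"

    @property
    def neuronpedia_url(self) -> str:
        """Assumed URL of the feature's record (explanations, activations)."""
        return f"{NEURONPEDIA_API}/{MODEL_ID}/{self.layer}-{SAE_SET}/{self.index}"


ref = FeatureRef(layer=6, index=650)  # arbitrary example feature
print(ref.hook_name)
print(ref.neuronpedia_url)
```

Keeping the layer and index together in one value makes it harder to pair a feature index with the wrong layer's autoencoder.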

Because several features can be applied at once, Helm is well suited to applications that need fine-grained control over language model outputs.
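Since the autoencoders sit on different residual stream layers, applying several features at once means sorting the requested clamps by layer first. The following is a rough sketch of that bookkeeping. The `Clamp` type, the "later clamp wins" rule, and the example values are illustrative assumptions, not Helm's actual implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Clamp(NamedTuple):
    layer: int    # residual stream layer of the SAE (0-11 for GPT2-small)
    feature: int  # feature index within that layer's SAE
    value: float  # activation the feature is forced to


def group_by_layer(clamps: Iterable[Clamp]) -> Dict[int, Dict[int, float]]:
    """Collect clamps per layer; a later clamp on the same feature wins."""
    grouped: Dict[int, Dict[int, float]] = defaultdict(dict)
    for c in clamps:
        grouped[c.layer][c.feature] = c.value
    return dict(grouped)


def apply_clamps(features: List[float], layer_clamps: Dict[int, float]) -> List[float]:
    """Return a copy of one layer's feature activations with clamps applied."""
    out = list(features)
    for idx, value in layer_clamps.items():
        out[idx] = value
    return out


clamps = [
    Clamp(layer=6, feature=2, value=5.0),
    Clamp(layer=6, feature=0, value=0.0),
    Clamp(layer=9, feature=1, value=3.0),
    Clamp(layer=6, feature=2, value=8.0),  # overrides the first clamp
]
plan = group_by_layer(clamps)

# Toy activations for layer 6 (a real SAE has thousands of features).
layer6 = apply_clamps([0.4, 0.0, 1.2, 0.7], plan[6])
print(plan, layer6)
```

Each entry of `plan` would then drive one hook on the corresponding layer during generation.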

Demo

For a live demonstration of Helm in action, please check out our video demo here.

Links

Here are some useful resources and technologies associated with Helm:

Future Improvements

In future updates, Helm aims to introduce the following enhancements:

Support for Multiple Models: Extend support beyond GPT2-small to incorporate a variety of models, enhancing flexibility and application scope.
Max Activating Examples for Feature Cards: Show the maximally activating examples on each feature card, giving users better insight and control.

Helm continues to evolve, with the aim of making it precise and easy to steer the output of large language models. Contributions and feedback are always welcome!
