Multi stage attack #51

Draft: wants to merge 6 commits into base: main
1 change: 1 addition & 0 deletions .gitignore
@@ -89,3 +89,4 @@ report.xml
cmake-build-*/
*/artifacts/
/examples/chrome-data/
/venv/
30 changes: 22 additions & 8 deletions README.md
@@ -1,8 +1,15 @@
# LLAMATOR

## Description 📖
Red teaming Python framework for testing chatbots and LLM systems

Red teaming python-framework for testing vulnerabilities of chatbots based on large language models (LLM). Supports testing of Russian-language RAG systems.
---

[![License: CC BY-NC-SA 4.0](https://img.shields.io/badge/License-CC_BY--NC--SA_4.0-lightgrey.svg)](https://creativecommons.org/licenses/by-nc-sa/4.0/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/llamator)](https://pypi.org/project/llamator)
[![PyPI](https://badge.fury.io/py/llamator.svg)](https://badge.fury.io/py/llamator)
[![Downloads](https://pepy.tech/badge/llamator)](https://pepy.tech/project/llamator)
[![Downloads](https://pepy.tech/badge/llamator/month)](https://pepy.tech/project/llamator)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

## Install 🚀

@@ -26,19 +33,26 @@ Documentation Link: [https://romiconez.github.io/llamator](https://romiconez.git

* 🌐 All LangChain clients
* 🧠 OpenAI-like API
* ⚙️ Custom Class (Telegram, Selenium, etc.)
* ⚙️ Custom Class (Telegram, WhatsApp, Selenium, etc.)

## Unique Features 🌟

* 🛡️ Support for custom attacks from the user
* 📊 Results of launching each attack in CSV format
* 📈 Report with attack requests and responses for all tests in Excel format
* 📄 Test report document available in DOCX format
* ️🗡 Support for custom attacks from the user
* 👜 Large selection of attacks on RAG / Agent / Prompt in English and Russian
* 🛡 Custom configuration of chat clients
* 📊 History of attack requests and responses in Excel and CSV format
* 📄 Test report document in DOCX format

## OWASP Classification 🔒

* 💉 [LLM01: Prompt Injection and Jailbreaks](https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/blob/main/2_0_vulns/LLM01_PromptInjection.md)
* 🕵 [LLM07: System Prompt Leakage](https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/blob/main/2_0_vulns/LLM07_SystemPromptLeakage.md)
* 🎭 [LLM09: Misinformation](https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/blob/main/2_0_vulns/LLM09_Misinformation.md)

## License 📜

© Roman Neronov, Timur Nizamov, Nikita Ivanov

This project is licensed under the terms of the **Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International** license. See the [LICENSE](LICENSE) file for details.

[![Creative Commons License](https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png)](https://creativecommons.org/licenses/by-nc-sa/4.0/)
[![Creative Commons License](https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png)](https://creativecommons.org/licenses/by-nc-sa/4.0/)
1 change: 0 additions & 1 deletion docs/howtos.md
@@ -8,7 +8,6 @@
- **WhatsApp bot testing via Selenium** - [GitHub](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-whatsapp.ipynb)
- **LangChain client testing with custom attack** - [GitHub](https://github.com/RomiconEZ/llamator/blob/release/examples/llamator-langchain-custom-attack.ipynb)


## Installation Guide

```bash
9 changes: 6 additions & 3 deletions docs/project_overview.md
@@ -3,8 +3,11 @@
LLAMATOR is a framework for testing vulnerabilities of chatbot systems and LLMs.

**Key Features**
- Custom configuration of clients both for carrying out attacks and for testing
- Large selection of attacks in English and Russian languages
- Detailed testing results with reports in Excel format

* ️🗡 Support for custom attacks from the user
* 👜 Large selection of attacks (RAG / Agent / Prompt) in English and Russian
* 🛡 Custom configuration of chat clients
* 📊 History of attack requests and responses in Excel and CSV format
* 📄 Test report document in DOCX format

This project is designed for developers and researchers working in NLP and LLM domains.
6 changes: 5 additions & 1 deletion src/llamator/attacks/base64_injection.py
@@ -90,7 +90,11 @@ def run(self) -> Generator[StatusUpdate, None, None]:
) # constant 'random_state' for better reproducibility
else:
# If dataset is smaller than requested number of attack samples, fit the dataset repeating until num_attempts is reached
data = pd.concat([data] * (self.num_attempts // len(data) + 1))[:self.num_attempts].sort_index().reset_index(drop=True)
data = (
pd.concat([data] * (self.num_attempts // len(data) + 1))[: self.num_attempts]
.sort_index()
.reset_index(drop=True)
)

# Generate list of attack prompt variations
yield StatusUpdate(self.client_config, self.test_name, self.status, "Working", 0, self.num_attempts)
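
The dataset-fitting step reformatted in this hunk (and repeated in the other attack modules below) can be sketched in isolation. This is a minimal illustration, not the project's code: the helper name `fit_to_num_attempts`, the `goal` column, the `>=` condition, and the `random_state=42` value are all assumptions, since the diff shows only the `else` branch and the tail of the sampling call.

```python
import pandas as pd


def fit_to_num_attempts(data: pd.DataFrame, num_attempts: int) -> pd.DataFrame:
    """Return exactly num_attempts rows, sampling or repeating as needed."""
    if len(data) >= num_attempts:
        # Assumed sampling branch: a constant random_state keeps runs reproducible.
        return data.sample(n=num_attempts, random_state=42).reset_index(drop=True)
    # Repeat the whole dataset often enough to cover num_attempts, then truncate.
    repeats = num_attempts // len(data) + 1
    return (
        pd.concat([data] * repeats)[:num_attempts]
        .sort_index()  # groups the repeated copies of each original row together
        .reset_index(drop=True)
    )


# Hypothetical three-row dataset stretched to seven attempts.
prompts = pd.DataFrame({"goal": ["a", "b", "c"]})
fitted = fit_to_num_attempts(prompts, 7)
print(fitted["goal"].tolist())  # ['a', 'a', 'a', 'b', 'b', 'c', 'c']
```

Because `sort_index()` runs after truncation, repeated copies of a row end up adjacent rather than interleaved. Note that an empty dataset would raise `ZeroDivisionError` in the repeat branch, so a caller may want to guard against that case.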
6 changes: 5 additions & 1 deletion src/llamator/attacks/dan.py
@@ -71,7 +71,11 @@ def run(self) -> Generator[StatusUpdate, None, None]:
) # constant `random_state` for better reproducibility
else:
# If dataset is smaller than requested number of attack samples, fit the dataset repeating until num_attempts is reached
data = pd.concat([data] * (self.num_attempts // len(data) + 1))[:self.num_attempts].sort_index().reset_index(drop=True)
data = (
pd.concat([data] * (self.num_attempts // len(data) + 1))[: self.num_attempts]
.sort_index()
.reset_index(drop=True)
)

# Generate list of DAN (do anything now) attack prompt variations
yield StatusUpdate(
6 changes: 5 additions & 1 deletion src/llamator/attacks/ethical_compliance.py
@@ -108,7 +108,11 @@ def run(self) -> Generator[StatusUpdate, None, None]:
) # constant `random_state` for better reproducibility
else:
# If dataset is smaller than requested number of attack samples, fit the dataset repeating until num_attempts is reached
data = pd.concat([data] * (self.num_attempts // len(data) + 1))[:self.num_attempts].sort_index().reset_index(drop=True)
data = (
pd.concat([data] * (self.num_attempts // len(data) + 1))[: self.num_attempts]
.sort_index()
.reset_index(drop=True)
)

chat = ChatSession(self.attack_config.attack_client.get_target_client())
for attack_prompt_index, row in data.iterrows():
6 changes: 5 additions & 1 deletion src/llamator/attacks/harmful_behavior.py
@@ -86,7 +86,11 @@ def run(self) -> Generator[StatusUpdate, None, None]:
) # constant `random_state` for better reproducibility
else:
# If dataset is smaller than requested number of attack samples, fit the dataset repeating until num_attempts is reached
data = pd.concat([data] * (self.num_attempts // len(data) + 1))[:self.num_attempts].sort_index().reset_index(drop=True)
data = (
pd.concat([data] * (self.num_attempts // len(data) + 1))[: self.num_attempts]
.sort_index()
.reset_index(drop=True)
)

# Generate list of attack prompt variations
yield StatusUpdate(self.client_config, self.test_name, self.status, "Generating", 0, self.num_attempts)
6 changes: 5 additions & 1 deletion src/llamator/attacks/past_tense.py
@@ -67,7 +67,11 @@ def run(self) -> Generator[StatusUpdate, None, None]:
) # constant `random_state` for better reproducibility
else:
# If dataset is smaller than requested number of attack samples, fit the dataset repeating until num_attempts is reached
data = pd.concat([data] * (self.num_attempts // len(data) + 1))[:self.num_attempts].sort_index().reset_index(drop=True)
data = (
pd.concat([data] * (self.num_attempts // len(data) + 1))[: self.num_attempts]
.sort_index()
.reset_index(drop=True)
)

# Lists to store prompts, responses, and statuses for report generation
attack_prompts = []
6 changes: 5 additions & 1 deletion src/llamator/attacks/ru_dan.py
@@ -71,7 +71,11 @@ def run(self) -> Generator[StatusUpdate, None, None]:
) # constant `random_state` for better reproducibility
else:
# If dataset is smaller than requested number of attack samples, fit the dataset repeating until num_attempts is reached
data = pd.concat([data] * (self.num_attempts // len(data) + 1))[:self.num_attempts].sort_index().reset_index(drop=True)
data = (
pd.concat([data] * (self.num_attempts // len(data) + 1))[: self.num_attempts]
.sort_index()
.reset_index(drop=True)
)

# Generate list of DAN (do anything now) attack prompt variations
yield StatusUpdate(
6 changes: 5 additions & 1 deletion src/llamator/attacks/ru_ucar.py
@@ -72,7 +72,11 @@ def run(self) -> Generator[StatusUpdate, None, None]:
) # constant `random_state` for better reproducibility
else:
# If dataset is smaller than requested number of attack samples, fit the dataset repeating until num_attempts is reached
data = pd.concat([data] * (self.num_attempts // len(data) + 1))[:self.num_attempts].sort_index().reset_index(drop=True)
data = (
pd.concat([data] * (self.num_attempts // len(data) + 1))[: self.num_attempts]
.sort_index()
.reset_index(drop=True)
)

# Generate list of attack prompt variations
yield StatusUpdate(
6 changes: 5 additions & 1 deletion src/llamator/attacks/ucar.py
@@ -75,7 +75,11 @@ def run(self) -> Generator[StatusUpdate, None, None]:
) # constant `random_state` for better reproducibility
else:
# If dataset is smaller than requested number of attack samples, fit the dataset repeating until num_attempts is reached
data = pd.concat([data] * (self.num_attempts // len(data) + 1))[:self.num_attempts].sort_index().reset_index(drop=True)
data = (
pd.concat([data] * (self.num_attempts // len(data) + 1))[: self.num_attempts]
.sort_index()
.reset_index(drop=True)
)

# Generate list of attack prompt variations
yield StatusUpdate(