The Large Language Model Gateway (LLMGW) is an API middleware designed to interface with AI models for chat completions. It provides endpoints to interact with AI models, manage prompts, and retrieve completions, with built-in rate limiting, security headers, and clustering for scalability.
- AI Model Integration: Send prompts to AI models and get responses.
- Scalable Architecture: Utilizes clustering to fork workers across CPU cores for better performance.
- Rate Limiting: Built-in rate limiting to prevent abuse (1000 requests per 15 minutes).
- Security: Uses Helmet for secure HTTP headers.
- CLI Interaction: Support for clearing the console and exiting the server from the command line (see the sketch after this list).
- Verbose Logging: Optional detailed logging of requests and responses.
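The console commands themselves aren't specified in this README; below is a minimal sketch of how the CLI interaction mentioned above might be wired up using Node's built-in `readline` module. The command names `clear` and `exit` are assumptions for illustration.

```js
// Hypothetical sketch of console command handling via Node's readline module;
// the command names "clear" and "exit" are assumptions.
const readline = require('node:readline');

const rl = readline.createInterface({ input: process.stdin });

rl.on('line', (line) => {
  switch (line.trim().toLowerCase()) {
    case 'clear':
      console.clear(); // wipe the console
      break;
    case 'exit':
      console.log('Shutting down...');
      process.exit(0); // stop the server process
      break;
  }
});
```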
| Option | Description | Default |
|---|---|---|
| `--bindip` | IP address to bind the server to | `127.0.0.1` |
| `--bindport` | Port to bind the server to | `42069` |
| `--aihost` | AI model server host | `10.0.0.1` |
| `--aihostport` | AI model server port | `443` |
| `--verbose` | Enable verbose logging | `false` |
To start the server on IP 127.0.0.1, port 5000, connecting to an AI model at localhost:8000 with verbose logging:

```bash
./llmw --bindip 127.0.0.1 --bindport 5000 --aihost localhost --aihostport 8000 --verbose
```
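The README doesn't show how these flags are parsed; below is a minimal sketch using Node's built-in `util.parseArgs` (Node 18+), with option names and defaults taken from the table above. The actual gateway code may use a different parser.

```js
// Hedged sketch: parse the CLI flags with Node's built-in util.parseArgs (Node 18+).
const { parseArgs } = require('node:util');

const { values } = parseArgs({
  options: {
    bindip:     { type: 'string',  default: '127.0.0.1' },
    bindport:   { type: 'string',  default: '42069' },
    aihost:     { type: 'string',  default: '10.0.0.1' },
    aihostport: { type: 'string',  default: '443' },
    verbose:    { type: 'boolean', default: false },
  },
});

console.log(`Binding to ${values.bindip}:${values.bindport}, ` +
            `forwarding to ${values.aihost}:${values.aihostport}, ` +
            `verbose=${values.verbose}`);
```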
- Endpoint: `POST /v1/chat/completions`
- Headers: `Content-Type: application/json`
- Body:

```json
{
  "model": "string",        // (Optional) Model ID, default: "TheBloke/Mistral-7B-Instruct-v0.2-AWQ"
  "messages": [
    {
      "role": "string",     // (Required) Role of the message (user/system/assistant)
      "content": "string"   // (Required) Content of the message
    }
  ],
  "max_tokens": 128,        // (Optional) Maximum number of tokens in the response
  "temperature": 0.7        // (Optional) Sampling temperature (0.1 - 1.0)
}
```
- Response (Status 200):

```json
{
  "id": "string",               // Unique ID for the completion
  "object": "chat.completion",  // Response type
  "created": 1636107200,        // Timestamp (Unix epoch)
  "model": "string",            // Model ID used
  "choices": [
    {
      "message": {
        "role": "assistant",    // Role of the responder
        "content": "string"     // AI response content
      },
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 45,
    "total_tokens": 168
  }
}
```
- Error (400 Bad Request):

```json
{
  "error": {
    "message": "Temperature must be between 0.1 and 1.0",
    "type": "invalid_request_error"
  }
}
```
- Testing the API Locally: You can use tools like Postman or curl to test the API. Example request using curl:

```bash
curl -X POST http://localhost:42069/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "TheBloke/Mistral-7B-Instruct-v0.2-AWQ",
    "messages": [{"role": "user", "content": "Hello AI"}],
    "max_tokens": 100,
    "temperature": 0.7
  }'
```
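The same request can be made from Node itself, which ships a global `fetch` as of Node 18 (save as an `.mjs` file so top-level `await` works):

```js
// Equivalent request using Node 18+'s global fetch; run with: node request.mjs
const res = await fetch('http://localhost:42069/v1/chat/completions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'TheBloke/Mistral-7B-Instruct-v0.2-AWQ',
    messages: [{ role: 'user', content: 'Hello AI' }],
    max_tokens: 100,
    temperature: 0.7,
  }),
});
console.log(JSON.stringify(await res.json(), null, 2));
```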
- Logging: If the `--verbose` flag is enabled, detailed logs appear in the console, showing received inputs and AI responses with ANSI color coding.
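The exact log format isn't documented here; a tiny sketch of what ANSI-coded verbose logging might look like (the color choices and label text are assumptions):

```js
// Hypothetical verbose logger using raw ANSI escape codes; colors are assumptions.
const CYAN = '\x1b[36m';
const GREEN = '\x1b[32m';
const RESET = '\x1b[0m';

function logVerbose(prompt, reply) {
  console.log(`${CYAN}>> prompt:${RESET} ${prompt}`);
  console.log(`${GREEN}<< reply:${RESET} ${reply}`);
}

logVerbose('Hello AI', 'Hello! How can I help?');
```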
The server uses clustering to distribute requests across multiple CPU cores. By default, it forks workers equal to the number of available CPU cores.
If you need to change this behavior, you can modify the clustering logic in the code.
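The clustering code itself isn't reproduced in this README; below is a minimal sketch of the standard Node `cluster` pattern the paragraph describes. The HTTP handler in the worker branch is a placeholder, not the gateway's actual server.

```js
// Standard Node clustering pattern: the primary forks one worker per CPU core.
const cluster = require('node:cluster');
const os = require('node:os');

if (cluster.isPrimary) {
  const workers = os.cpus().length; // change this number to alter the fork count
  for (let i = 0; i < workers; i++) cluster.fork();

  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.process.pid} exited; forking a replacement`);
    cluster.fork();
  });
} else {
  // Placeholder: each worker would start its own copy of the gateway's HTTP server.
  require('node:http')
    .createServer((req, res) => res.end('ok'))
    .listen(42069, '127.0.0.1');
}
```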
This API uses Helmet to set HTTP headers that help mitigate common attacks such as cross-site scripting (XSS) and clickjacking. It also implements rate limiting to prevent abuse.
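A sketch of how Helmet and the rate limiter are typically wired into an Express app follows. The 1000-requests-per-15-minutes window comes from the feature list above; the `helmet` and `express-rate-limit` packages are the usual choices and are assumed here.

```js
// Hedged sketch: Helmet for secure headers plus express-rate-limit (assumed packages).
const express = require('express');
const helmet = require('helmet');
const rateLimit = require('express-rate-limit');

const app = express();
app.use(helmet()); // secure HTTP headers (mitigates XSS, clickjacking, etc.)
app.use(rateLimit({
  windowMs: 15 * 60 * 1000, // 15-minute window
  max: 1000,                // 1000 requests per IP per window (from the feature list)
}));
```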
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a pull request or create an issue if you encounter any problems.
This README covers installation, configuration, usage, and development guidelines for users and contributors deploying and interacting with the API.