AI Endpoint

Introduction

English | 简体中文

AI Endpoint provides a unified, OpenAI-compatible interface to AI models. It is multi-tenant, supports API key load balancing, rate limiting, and logging, and is containerized for easy deployment.

Capabilities

  • Azure API Proxy
    • Compatible with OpenAI API format
    • API key load balancing
      • Weighted round-robin
      • Adaptive weight
  • Multi-tenant
    • Request authentication
    • Configuration isolation
  • Rate limiting capabilities
    • Tenant-level rate limiting
    • Model-level rate limiting
  • Logging capabilities
    • Multi-channel writing
    • File splitting
  • Containerized deployment
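
The capability list names weighted round-robin for API keys but does not describe the algorithm. The following is a minimal sketch of one common variant, smooth weighted round-robin, and is not necessarily how this project implements it. The Peer class and pick function are hypothetical names used only for illustration, and the adaptive weight feature is not covered.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """One upstream API key. Illustrative only, not the project's own type."""
    name: str
    weight: int
    current: int = 0  # running score used by the smooth algorithm


def pick(peers: list[Peer]) -> Peer:
    """Smooth weighted round-robin.

    Raise every peer's score by its weight, choose the peer with the
    highest score, then lower the winner's score by the total weight.
    """
    total = sum(p.weight for p in peers)
    for p in peers:
        p.current += p.weight
    best = max(peers, key=lambda p: p.current)
    best.current -= total
    return best


peers = [Peer("key-a", 3), Peer("key-b", 1)]
order = [pick(peers).name for _ in range(4)]
print(order)  # ['key-a', 'key-a', 'key-b', 'key-a']
```

Over one full cycle (as many picks as the total weight) each key is chosen in proportion to its weight, and the heavier key's picks are spread out rather than grouped into one long burst.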

Supported routes

  • POST /v1/chat/completions

Invocation method

curl --location 'localhost:8080/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer xxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' \
--data '{
    "stream": true,
    "model": "gpt-3.5-turbo",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Does Azure OpenAI support customer managed keys?"
        },
        {
            "role": "assistant",
            "content": "Yes, customer managed keys are supported by Azure OpenAI."
        },
        {
            "role": "user",
            "content": "Do other Azure Cognitive Services support this too?"
        }
    ]
}'
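
The same request can be issued from code. The sketch below uses only the Python standard library and assumes the endpoint streams its reply in the OpenAI format (lines of the form "data: {json}", terminated by "data: [DONE]"), which the README implies but does not spell out. The function names build_request and iter_deltas are hypothetical helpers, not part of this project.

```python
import json
import urllib.request


def build_request(base_url: str, token: str, messages: list,
                  model: str = "gpt-3.5-turbo") -> urllib.request.Request:
    """Build (but do not send) the same POST that the curl command issues."""
    body = json.dumps({"stream": True, "model": model, "messages": messages})
    return urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
        method="POST",
    )


def iter_deltas(lines):
    """Yield text fragments from OpenAI-style 'data: {...}' stream lines.

    Assumes the OpenAI streaming chunk layout: choices[].delta.content.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


def stream_chat(req: urllib.request.Request) -> str:
    """Send the request and return the concatenated streamed reply."""
    with urllib.request.urlopen(req) as resp:
        return "".join(iter_deltas(resp))
```

With the service running locally, calling stream_chat(build_request("http://localhost:8080", token, messages)) with a tenant token and a message list like the one in the curl example would return the assistant's full reply as one string.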

Dependencies

  • MySQL (database.addr in the configuration file)
  • Redis (redis.addr in the configuration file)

Configuration file

logger:
  channels:
    - name: app
      filename: /var/log/app.log
      maxSize: 1
      maxAge: 30
      maxBackups: 10
      compress: false
      level: info
    - name: request
      filename: /var/log/request.log
      maxSize: 1
      maxAge: 30
      maxBackups: 10
      compress: false
      level: info
database:
  addr: root:password@tcp(127.0.0.1:3306)/endpoints?charset=utf8mb4&parseTime=True&loc=Local
redis:
  addr: 127.0.0.1:6379
  password: ""
  db: 0
azure:
  openai:
    models:
      - gpt-3.5-turbo
      - gpt-4
      - gpt-4-32k
    peers:
      - key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        endpoint: https://api.openai.com
        weight: 20
        deployments:
          - model: gpt-3.5-turbo
            # Set isOpenAI to true to call the original OpenAI API instead of an Azure deployment
            isOpenAI: true
          - model: gpt-4
            isOpenAI: true
          - model: gpt-4-32k
            isOpenAI: true
      - key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        endpoint: https://xxxxx.openai.azure.com
        weight: 20
        deployments:
          - name: gpt-35-turbo
            model: gpt-3.5-turbo
            version: 2023-03-15-preview
          - name: gpt-4
            model: gpt-4
            version: 2023-03-15-preview
          - name: gpt-4-32k
            model: gpt-4-32k
            version: 2023-03-15-preview
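
To illustrate how a peer entry in the configuration above could translate into an upstream call, here is a rough sketch. It relies on the public URL and header conventions of the OpenAI and Azure OpenAI REST APIs. The function upstream_request and the dict layout (a direct mirror of the YAML keys) are assumptions made for this example and may differ from the project's internals.

```python
def upstream_request(peer: dict, model: str) -> tuple[str, dict]:
    """Return (url, auth headers) for the peer deployment that serves `model`."""
    base = peer["endpoint"].rstrip("/")
    for dep in peer["deployments"]:
        if dep["model"] != model:
            continue
        if dep.get("isOpenAI"):
            # Original OpenAI API: the model is named in the JSON body and
            # the key is sent as a bearer token.
            return (base + "/v1/chat/completions",
                    {"Authorization": "Bearer " + peer["key"]})
        # Azure OpenAI: deployment name and API version are part of the URL
        # and the key is sent in the api-key header.
        url = "{}/openai/deployments/{}/chat/completions?api-version={}".format(
            base, dep["name"], dep["version"])
        return url, {"api-key": peer["key"]}
    raise KeyError("peer has no deployment for model " + model)


azure_peer = {
    "key": "xxxxxxxx",
    "endpoint": "https://xxxxx.openai.azure.com",
    "deployments": [
        {"name": "gpt-35-turbo", "model": "gpt-3.5-turbo",
         "version": "2023-03-15-preview"},
    ],
}
url, headers = upstream_request(azure_peer, "gpt-3.5-turbo")
print(url)
```

This is why the Azure peer lists a name and a version for each deployment while the OpenAI peer only needs the model and the isOpenAI flag.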