[add] mistral-7b model #63

Merged
merged 1 commit on Mar 1, 2024
1 change: 1 addition & 0 deletions README.md
@@ -16,6 +16,7 @@ English | [简体中文](docs/README_zh-CN.md)


## Latest Progress 🎉
- \[February 2024\] Add mistral-7b model
- \[February 2024\] Add gemini-pro model
- \[January 2024\] Refactor the config-template.yaml to control the backend and the frontend settings at the same time, [click](https://github.com/InternLM/OpenAOE/blob/main/docs/tech-report/config-template.md) for more details about the `config-template.yaml`
- \[January 2024\] Add internlm2-chat-7b model
1 change: 1 addition & 0 deletions docs/README_zh-CN.md
@@ -13,6 +13,7 @@


## Latest Progress 🎉
- \[2024/02\] Add mistral-7b model
- \[2024/02\] Add gemini-pro model
- \[2024/01\] Refactor config-template.yaml so that frontend and backend settings can be configured together
- \[2024/01\] Add internlm2-chat-7b model
3 changes: 3 additions & 0 deletions docs/todo/TODO.md
@@ -4,6 +4,9 @@
- [x] add internlm2-chat-7b model as default
- [x] add Gemini model as default
- [x] refactor the config.yaml to make the model settings look more logical
- [x] add Mistral-7b model
- [x] add Gemma model
- [ ] integrate ollama as one of the inference engines
- [ ] dynamically add new models by editing external Python files and the config.yaml
- [ ] build the frontend project when OpenAOE starts up
- [ ] support image interaction
18 changes: 18 additions & 0 deletions openaoe/backend/api/route_mistral.py
@@ -0,0 +1,18 @@
from fastapi import APIRouter, Request, Response

from openaoe.backend.service.mistral import Mistral
from openaoe.backend.model.openaoe import AoeChatBody

router = APIRouter()


@router.post("/v1/mistral/chat", tags=["Mistral"])
async def mistral_chat(body: AoeChatBody, request: Request, response: Response):
    """
    chat api for Mistral 7b model
    :param body: request body
    :param request: request
    :param response: response
    :return:
    """
    return await Mistral(request, response).chat(body)
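
For context, a call to the new endpoint carries the general `AoeChatBody` payload (added in `openaoe/backend/model/openaoe.py` below) and receives the reply as a server-sent-event stream. A minimal client sketch in Python; the host, port, and route prefix here are illustrative assumptions, not values taken from this PR:

```python
import json
import requests

# Hypothetical base URL; point this at wherever the OpenAOE backend is actually served.
url = "http://localhost:10099/v1/mistral/chat"

payload = {
    "model": "mistral-7b",
    "prompt": "Why is the sky blue?",
    "messages": [],   # optional chat history
    "stream": True,
}

with requests.post(url, json=payload, stream=True) as resp:
    for line in resp.iter_lines(decode_unicode=True):
        # EventSourceResponse frames every yielded chunk as an SSE "data: ..." line.
        if line and line.startswith("data:"):
            chunk = json.loads(line[len("data:"):].strip())
            print(chunk.get("msg", ""), end="", flush=True)
```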
20 changes: 20 additions & 0 deletions openaoe/backend/config/config-template.yaml
@@ -77,4 +77,24 @@ models:
      app_id:
      ak:
      sk:
  mistral-7b:
    provider: mistral
    webui:
      avatar: 'https://oss.openmmlab.com/frontend/OpenAOE/mistral.webp'
      isStream: true
      background: 'linear-gradient(#4848cf26 0%, #7498be 100%)'
      path: '/v1/mistral/v1/mistral/chat'
      payload:
        messages: [ ]
        model: mistral
        prompt: ""
        role_meta:
          user_name: "user"
          bot_name: "assistant"
        stream: true
    api:
      api_base: http://localhost:11434
      app_id:
      ak:
      sk:
...
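
Here `api_base` points at a local Ollama server (11434 is Ollama's default port), and the service added below appends `/api/chat` to it. A small sketch of how this block resolves to the URL the backend calls — the loader shown is illustrative, not the project's actual `biz_config` logic:

```python
import yaml

# Illustrative config lookup: read the model block and build the Ollama chat URL.
with open("openaoe/backend/config/config-template.yaml") as f:
    cfg = yaml.safe_load(f)

mistral_cfg = cfg["models"]["mistral-7b"]
api_base = mistral_cfg["api"]["api_base"]   # http://localhost:11434 (Ollama's default port)
chat_url = api_base + "/api/chat"           # endpoint used by openaoe/backend/service/mistral.py
print(chat_url)                             # -> http://localhost:11434/api/chat
```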
1 change: 1 addition & 0 deletions openaoe/backend/config/constant.py
@@ -11,6 +11,7 @@
PROVIDER_XUNFEI = "spark"
PROVIDER_CLAUDE = "claude"
PROVIDER_INTERNLM = "internlm"
PROVIDER_MISTRAL = "mistral"

DEFAULT_TIMEOUT_SECONDS = 600

36 changes: 36 additions & 0 deletions openaoe/backend/model/mistral.py
@@ -0,0 +1,36 @@
"""
ref. to https://github.com/ollama/ollama/blob/main/docs/api.md
Parameters
model: (required) the model name
messages: the messages of the chat, this can be used to keep a chat memory
The message object has the following fields:

role: the role of the message, either system, user or assistant
content: the content of the message
images (optional): a list of images to include in the message (for multimodal models such as llava)
Advanced parameters (optional):

format: the format to return a response in. Currently the only accepted value is json
options: additional model parameters listed in the documentation for the Modelfile such as temperature
template: the prompt template to use (overrides what is defined in the Modelfile)
stream: if false the response will be returned as a single response object, rather than a stream of objects
keep_alive: controls how long the model will stay loaded into memory following the request (default: 5m)
"""

from typing import List, Optional, Literal, Dict
from pydantic import BaseModel


class Message(BaseModel):
    role: Optional[Literal["user", "system", "assistant"]] = "user"
    content: str
    images: Optional[List[str]] = None  # img in base64


class MistralChatBody(BaseModel):
    model: str
    messages: List[Message]
    options: Optional[Dict] = {}
    template: Optional[str] = None
    stream: Optional[bool] = True
    keep_alive: Optional[str] = '5m'
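
As a quick illustration of how these models serialize, the sketch below builds a `MistralChatBody` and prints the JSON that would be posted to Ollama's `/api/chat`; the message contents are invented for the example:

```python
from openaoe.backend.model.mistral import MistralChatBody, Message

# Example only: the request body the service layer sends to Ollama.
body = MistralChatBody(
    model="mistral",
    messages=[
        Message(role="system", content="You are a helpful assistant."),
        Message(role="user", content="Why is the sky blue?"),
    ],
)
# Unset fields fall back to the defaults above: options={}, stream=True, keep_alive='5m'.
print(body.model_dump_json(exclude_none=True))
```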
25 changes: 25 additions & 0 deletions openaoe/backend/model/openaoe.py
@@ -0,0 +1,25 @@
from typing import Optional, List, Literal
from pydantic import BaseModel


class Context(BaseModel):
    send_type: str = 'assistant'
    sender_type: str = "assistant"
    text: str = ''


class RoleMeta(BaseModel):
    user_name: Optional[str] = 'user'
    bot_name: Optional[str] = 'assistant'


class AoeChatBody(BaseModel):
    """
    OpenAOE general request body
    """
    model: str
    prompt: str
    messages: Optional[List[Context]] = []
    role_meta: Optional[RoleMeta] = None
    type: Optional[Literal['text', 'json']] = 'json'
    stream: Optional[bool] = True
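
This is the body shape that the `payload:` template in `config-template.yaml` above describes. A short validation sketch with invented field values:

```python
from openaoe.backend.model.openaoe import AoeChatBody

# Example only: the kind of JSON the frontend sends, mirroring the config's payload template.
raw = {
    "model": "mistral-7b",
    "prompt": "Why is the sky blue?",
    "messages": [
        {"sender_type": "user", "text": "Hi"},
        {"sender_type": "bot", "text": "Hello! How can I help?"},
    ],
    "role_meta": {"user_name": "user", "bot_name": "assistant"},
    "stream": True,
}
body = AoeChatBody.model_validate(raw)
print(body.model, len(body.messages))   # -> mistral-7b 2
```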
3 changes: 2 additions & 1 deletion openaoe/backend/requirements.txt
@@ -12,4 +12,5 @@ pyyaml==6.0.1
httpx==0.25.0
sse-starlette==1.8.2
anyio==3.7.1
jsonstreamer==1.3.8
jsonstreamer==1.3.8
twine==5.0.0
61 changes: 61 additions & 0 deletions openaoe/backend/service/mistral.py
@@ -0,0 +1,61 @@
import json

import requests
from fastapi import Request, Response
from sse_starlette import EventSourceResponse

from openaoe.backend.config.biz_config import get_base_url
from openaoe.backend.config.constant import PROVIDER_MISTRAL
from openaoe.backend.model.openaoe import AoeChatBody
from openaoe.backend.model.mistral import MistralChatBody, Message

from openaoe.backend.util.log import log
logger = log(__name__)


class Mistral:
    def __init__(self, request: Request, response: Response):
        self.request = request
        self.response = response

    async def chat(self, body: AoeChatBody):
        msgs = []
        for msg in body.messages:
            # map the frontend's 'bot' sender to the 'assistant' role expected by Ollama
            m = Message(role=msg.sender_type if msg.sender_type != 'bot' else "assistant", content=msg.text)
            msgs.append(m)
        last_m = Message(role='user', content=body.prompt)
        msgs.append(last_m)
        chat_url = get_base_url(PROVIDER_MISTRAL, body.model) + "/api/chat"
        chat_body = MistralChatBody(
            model="mistral",
            messages=msgs
        )
        return self.chat_response_streaming(chat_url, chat_body)

    def chat_response_streaming(self, chat_url: str, chat_body: MistralChatBody):
        async def do_response_streaming():
            try:
                res = requests.post(chat_url, json=json.loads(chat_body.model_dump_json()), stream=True)
                if res:
                    for chunk in res.iter_content(chunk_size=512, decode_unicode=True):
                        # iter_content may yield str (when the encoding is known) or bytes
                        if isinstance(chunk, bytes):
                            chunk = chunk.decode()
                        logger.info(f"chunk: {chunk}")
                        chunk_json = json.loads(chunk)
                        yield json.dumps({
                            "success": True,
                            "msg": chunk_json.get("message").get("content")
                        }, ensure_ascii=False)
            except Exception as e:
                logger.error(f"{e}")
                yield json.dumps(
                    {
                        "success": False,
                        "msg": f"from backend: {e}"
                    }
                )

        return EventSourceResponse(do_response_streaming())
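
One design note on the streaming path: `requests.post` is synchronous, so each chunk read blocks the event loop inside the async generator. Since `httpx` is already pinned in the backend requirements, a non-blocking variant could look like the sketch below; this is an illustrative alternative, not part of the PR:

```python
import json

import httpx

from openaoe.backend.model.mistral import MistralChatBody


async def do_response_streaming_async(chat_url: str, chat_body: MistralChatBody):
    # Illustrative sketch: same output format as above, but using the async httpx client.
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream("POST", chat_url, json=json.loads(chat_body.model_dump_json())) as res:
            async for line in res.aiter_lines():
                if not line:
                    continue
                chunk_json = json.loads(line)   # Ollama streams one JSON object per line
                yield json.dumps({
                    "success": True,
                    "msg": chunk_json.get("message", {}).get("content", "")
                }, ensure_ascii=False)
```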



