
add dashscope multimodal #92

Merged · 19 commits · Apr 9, 2024
29 changes: 15 additions & 14 deletions README.md
@@ -64,20 +64,21 @@ applications in a centralized programming manner for streamlined development.
AgentScope provides a list of `ModelWrapper` to support both local model
services and third-party model APIs.

| API | Task | Model Wrapper |
|------------------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------|
| OpenAI API | Chat | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |
| | Embedding | [`OpenAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |
| | DALL·E | [`OpenAIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |
| DashScope API | Chat | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| | Image Synthesis | [`DashScopeImageSynthesisWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| | Text Embedding | [`DashScopeTextEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) |
| | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) |
| ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) |
| | Embedding | [`OllamaEmbedding`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) |
| | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) |
| Post Request based API | - | [`PostAPIModelWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) |
| API | Task | Model Wrapper |
| ---------------------- |-------------------------| ------------------------------------------------------------ |
| OpenAI API | Chat | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |
| | Embedding | [`OpenAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |
| | DALL·E | [`OpenAIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |
| DashScope API | Chat | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| | Image Synthesis | [`DashScopeImageSynthesisWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| | Text Embedding | [`DashScopeTextEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| | Multimodal Conversation | [`DashScopeMultiModalWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) |
| | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) |
| ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) |
| | Embedding | [`OllamaEmbedding`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) |
| | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) |
| Post Request based API | - | [`PostAPIModelWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) |

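The table above adds a `DashScopeMultiModalWrapper` whose `model_type` is `"dashscope_multimodal"` (declared later in this diff). A minimal sketch of a model configuration that would select this wrapper — the exact keys besides `model_type` (`config_name`, `model_name`, `api_key`) are assumptions following the pattern of other AgentScope configs, not confirmed by this PR:

```python
# Hypothetical model configuration selecting the new multimodal wrapper.
# Only the "model_type" value is taken from this PR
# (DashScopeMultiModalWrapper.model_type); the other keys are illustrative.
multimodal_config = {
    "config_name": "my-qwen-vl",            # local alias for this config
    "model_type": "dashscope_multimodal",   # routes to DashScopeMultiModalWrapper
    "model_name": "qwen-vl-plus",           # a DashScope vision-language model
    "api_key": "<your-dashscope-api-key>",  # placeholder; never commit real keys
}
```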
**Supported Local Model Deployment**

29 changes: 15 additions & 14 deletions README_ZH.md
@@ -53,20 +53,21 @@ AgentScope is an innovative multi-agent development platform, designed to empower developers

AgentScope provides a list of `ModelWrapper` to support local model services and third-party model APIs.

| API | Task | Model Wrapper |
|------------------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------|
| OpenAI API | Chat | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |
| | Embedding | [`OpenAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |
| | DALL·E | [`OpenAIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |
| DashScope API | Chat | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| | Image Synthesis | [`DashScopeImageSynthesisWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| | Text Embedding | [`DashScopeTextEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) |
| | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) |
| ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) |
| | Embedding | [`OllamaEmbedding`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) |
| | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) |
| Post Request based API | - | [`PostAPIModelWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) |
| API | Task | Model Wrapper |
| ---------------------- |-------------------------| ------------------------------------------------------------ |
| OpenAI API | Chat | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |
| | Embedding | [`OpenAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |
| | DALL·E | [`OpenAIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |
| DashScope API | Chat | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| | Image Synthesis | [`DashScopeImageSynthesisWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| | Text Embedding | [`DashScopeTextEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| | Multimodal Conversation | [`DashScopeMultiModalWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) |
| Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) |
| | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) |
| ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) |
| | Embedding | [`OllamaEmbedding`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) |
| | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) |
| Post Request based API | - | [`PostAPIModelWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) |

**Supported Local Model Deployment**

2 changes: 2 additions & 0 deletions src/agentscope/models/__init__.py
@@ -21,6 +21,7 @@
DashScopeChatWrapper,
DashScopeImageSynthesisWrapper,
DashScopeTextEmbeddingWrapper,
DashScopeMultiModalWrapper,
)
from .ollama_model import (
OllamaChatWrapper,
@@ -48,6 +49,7 @@
"DashScopeChatWrapper",
"DashScopeImageSynthesisWrapper",
"DashScopeTextEmbeddingWrapper",
"DashScopeMultiModalWrapper",
"OllamaChatWrapper",
"OllamaEmbeddingWrapper",
"OllamaGenerationWrapper",
191 changes: 162 additions & 29 deletions src/agentscope/models/dashscope_model.py
@@ -153,7 +153,7 @@ def __call__(
)

# TODO: move this to prompt engineering
messages = self._preprocess_role(messages)
messages = _preprocess_role(messages)
# step3: forward to generate response
response = dashscope.Generation.call(
model=self.model_name,
@@ -197,34 +197,6 @@ def __call__(
raw=response,
)

def _preprocess_role(self, messages: list) -> list:
"""preprocess role rules for DashScope"""
# The models in this list require that the roles of messages must
# alternate between "user" and "assistant".
message_length = len(messages)
if message_length % 2 == 1:
# If the length of the message list is odd, roles will
# alternate, starting with "user"
roles = [
"user" if i % 2 == 0 else "assistant"
for i in range(message_length)
]
else:
# If the length of the message list is even, the first role
# will be "system", followed by alternating "user" and
# "assistant"
roles = ["system"] + [
"user" if i % 2 == 1 else "assistant"
for i in range(1, message_length)
]

# Assign the roles list to the "role" key for each message in
# the messages list
for message, role in zip(messages, roles):
message["role"] = role

return messages


class DashScopeImageSynthesisWrapper(DashScopeWrapperBase):
"""The model wrapper for DashScope Image Synthesis API."""
@@ -426,3 +398,164 @@ def __call__(
],
raw=response,
)


class DashScopeMultiModalWrapper(DashScopeWrapperBase):
"""The model wrapper for DashScope MultiModal Conversation API."""

model_type: str = "dashscope_multimodal"

def _register_default_metrics(self) -> None:
# Set monitor accordingly
# TODO: set quota to the following metrics
self.monitor.register(
self._metric("call_counter"),
metric_unit="times",
)
self.monitor.register(
self._metric("prompt_tokens"),
metric_unit="token",
)
self.monitor.register(
self._metric("completion_tokens"),
metric_unit="token",
)
self.monitor.register(
self._metric("total_tokens"),
metric_unit="token",
)

def __call__(
self,
messages: list,
**kwargs: Any,
) -> ModelResponse:
"""Process the messages with DashScope MultiModal Conversation API.

Args:
messages (`list`):
A list of messages to process.
**kwargs (`Any`):
The keyword arguments to DashScope MultiModal API,
e.g. `stream`. Please refer to
https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-plus-api
for more detailed arguments.

Returns:
`ModelResponse`:
The response text in text field, and the raw response in
raw field.

Note:
If involving image links, then the messages should be of the
following form:
messages = [
{
"role": "system",
"content": [
{"text": "You are a helpful assistant."},
],
},
{
"role": "user",
"content": [
{"text": "What does this picture depict?"},
{"image": "http://example.com/image.jpg"},
],
},
]
Therefore, the `content` value should be a list matching the form
above. If a message involves only text, a plain string is also
accepted and will be wrapped automatically.

`parse_func`, `fault_handler` and `max_retries` are reserved
for `_response_parse_decorator` to parse and check the response
generated by model wrapper. Their usages are listed as follows:
- `parse_func` is a callable function used to parse and
check the response generated by the model, which takes the
response as input.
- `max_retries` is the maximum number of retries when
`parse_func` raises an exception.
- `fault_handler` is a callable function which is called
when the response generated by the model is invalid after
`max_retries` retries.
"""
# step1: prepare keyword arguments
kwargs = {**self.generate_args, **kwargs}

for message in messages:
if not isinstance(message["content"], list):
message["content"] = [{"text": message["content"]}]
messages = _preprocess_role(messages)

# step2: forward to generate response
response = dashscope.MultiModalConversation.call(
model=self.model_name,
messages=messages,
**kwargs,
)

if response.status_code != HTTPStatus.OK:
error_msg = (
f" Request id: {response.request_id},"
f" Status code: {response.status_code},"
f" error code: {response.code},"
f" error message: {response.message}."
)
raise RuntimeError(error_msg)

# step3: record the model api invocation if needed
self._save_model_invocation(
arguments={
"model": self.model_name,
"messages": messages,
**kwargs,
},
response=response,
)

# step4: update monitor accordingly
self.update_monitor(
call_counter=1,
prompt_tokens=response.usage["input_tokens"],
completion_tokens=response.usage["output_tokens"],
total_tokens=response.usage["input_tokens"]
+ response.usage["output_tokens"],
)

# step5: return response
return ModelResponse(
text=response.output["choices"][0]["message"]["content"][0][
qbc2016 marked this conversation as resolved.
Show resolved Hide resolved
"text"
],
raw=response,
)
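Before calling `dashscope.MultiModalConversation.call`, the `__call__` above normalizes each message's `content` into the list-of-dicts form the API expects. A standalone sketch of that normalization step, using the message values from the docstring (the image URL is illustrative):

```python
# Normalize message content the same way DashScopeMultiModalWrapper.__call__
# does: a plain string becomes a one-element [{"text": ...}] list, while
# content that is already a list passes through unchanged.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {
        "role": "user",
        "content": [
            {"text": "What does this picture depict?"},
            {"image": "http://example.com/image.jpg"},
        ],
    },
]
for message in messages:
    if not isinstance(message["content"], list):
        message["content"] = [{"text": message["content"]}]
```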


def _preprocess_role(messages: list) -> list:
"""preprocess role rules for DashScope"""
# The models in this list require that the roles of messages must
# alternate between "user" and "assistant".
message_length = len(messages)
if message_length % 2 == 1:
# If the length of the message list is odd, roles will
# alternate, starting with "user"
roles = [
"user" if i % 2 == 0 else "assistant"
for i in range(message_length)
]
else:
# If the length of the message list is even, the first role
# will be "system", followed by alternating "user" and
# "assistant"
roles = ["system"] + [
"user" if i % 2 == 1 else "assistant"
for i in range(1, message_length)
]

# Assign the roles list to the "role" key for each message in
# the messages list
for message, role in zip(messages, roles):
message["role"] = role

return messages
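The alternation rule implemented by `_preprocess_role` can be checked in isolation. The sketch below restates the same logic as a standalone function (`assign_roles` is a hypothetical name for this demo, not part of the PR):

```python
# Demonstration of the role-assignment rule in _preprocess_role:
# odd-length message lists alternate starting with "user"; even-length
# lists start with "system" and then alternate "user"/"assistant".
def assign_roles(messages: list) -> list:
    n = len(messages)
    if n % 2 == 1:
        roles = ["user" if i % 2 == 0 else "assistant" for i in range(n)]
    else:
        roles = ["system"] + [
            "user" if i % 2 == 1 else "assistant" for i in range(1, n)
        ]
    for message, role in zip(messages, roles):
        message["role"] = role
    return messages

odd = assign_roles([{"content": c} for c in ("a", "b", "c")])
even = assign_roles([{"content": c} for c in ("a", "b", "c", "d")])
```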