-
Notifications
You must be signed in to change notification settings - Fork 754
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Integrate Persona Hub Techniques into CAMEL for Enhanced Agent Diversity #716
Conversation
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configuration File (
|
@coderabbitai resume |
Actions performedReviews resumed. |
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
… test_persona_group.py
@willshang76 Thanks for your reviewing, I will reply your comments soon! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Leave some comments
Hey @harryeqs , the error is due to
I think the argument naming has been changed in latest master branch, could you help check and fix this? |
@Wendong-Fan Hi Wendong the embedding model selection feature has been included and the bug is fixed, please have a look. |
camel/personas/persona_hub.py
Outdated
self, | ||
persona1: Persona, | ||
persona2: Persona, | ||
threshold: float, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
threshold: float, | |
similarity_threshold: float, |
camel/personas/persona_hub.py
Outdated
def deduplicate( | ||
self, | ||
similarity_threshold: float = 0.85, | ||
embedding_model: str = "text-embedding-3-small", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here make the type as BaseEmbedding
would be better, so that user can define the model they want to pass to this method
camel/personas/persona_hub.py
Outdated
threshold: float, | ||
embedding_model: str, | ||
) -> bool: | ||
r"""Check if two personas are similar.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add more detailed docstring
camel/personas/persona_hub.py
Outdated
persona1: Persona, | ||
persona2: Persona, | ||
threshold: float, | ||
embedding_model: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above, user BaseEmbedding
@Wendong-Fan Hi Wendong sorry I forgot to mention you when I completed the update last week. Please have a look when you've got time? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @harryeqs ! Overall looks good, left some comments below related to the deduplicate, but I think for this PR we still need further polish, from format side, the docstring format and code format could be improved, from functionality side, like can we put id
as attribute under Persona
; Persona
now has attribute t2p_prompt
and p2p_prompt
, does this make sense?; in def text_to_persona(self, text: str, action: str = "read")
, we set read
as the default str value, from paper there are other types like |write|like|dislike.. should we set the action
as Literal
? etc..
I would be great you can do a comprehensive review and polish after you updated the deduplicate part, feel free to reach out to me for further discussion
def deduplicate( | ||
self, | ||
embedding_model: BaseEmbedding, # Removed default value | ||
similarity_threshold: float = 0.85, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel set the default value as 0.85
would be too high, 0.7
would be a more suitable value
camel/personas/persona_hub.py
Outdated
|
||
def deduplicate( | ||
self, | ||
embedding_model: BaseEmbedding, # Removed default value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
embedding_model: BaseEmbedding, # Removed default value | |
embedding_model: BaseEmbedding, |
camel/personas/persona_hub.py
Outdated
similarity_threshold (float): The similarity threshold for | ||
deduplication (default is 0.85). | ||
embedding_model (BaseEmbedding): The embedding model | ||
for similarity compairsion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docstring format
similarity_threshold (float): The similarity threshold for | |
deduplication (default is 0.85). | |
embedding_model (BaseEmbedding): The embedding model | |
for similarity compairsion. | |
similarity_threshold (float): The similarity threshold for | |
deduplication (default is 0.85). | |
embedding_model (BaseEmbedding): The embedding model | |
for similarity compairsion. |
camel/personas/persona_hub.py
Outdated
unique_personas[persona_id] = persona | ||
self.personas = unique_personas | ||
|
||
def is_similar( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this method is internal
def is_similar( | |
def _is_similar( |
camel/personas/persona_hub.py
Outdated
persona1 (Persona1): A persona. | ||
persona2 (Persona2): The other persona. | ||
similarity_threshold (float): The threshold on consine similarity | ||
to determine whether the two personas are similar. | ||
embedding_model (BaseEmbedding): The embedding model | ||
for similarity compairsion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docstring format
persona1 (Persona1): A persona. | |
persona2 (Persona2): The other persona. | |
similarity_threshold (float): The threshold on consine similarity | |
to determine whether the two personas are similar. | |
embedding_model (BaseEmbedding): The embedding model | |
for similarity compairsion. | |
persona1 (Persona1): A persona. | |
persona2 (Persona2): The other persona. | |
similarity_threshold (float): The threshold on consine similarity | |
to determine whether the two personas are similar. | |
embedding_model (BaseEmbedding): The embedding model | |
for similarity compairsion. |
camel/personas/persona_hub.py
Outdated
def cosine_similarity(vec1, vec2): | ||
return np.dot(vec1, vec2) / ( | ||
np.linalg.norm(vec1) * np.linalg.norm(vec2) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this method out of def is_similar
for better code structure
camel/personas/persona_hub.py
Outdated
similarity_threshold: float, | ||
embedding_model: BaseEmbedding, | ||
) -> bool: | ||
r"""Check if two personas are similar by consine simlarity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
r"""Check if two personas are similar by consine simlarity | |
r"""Check if two personas are similar by consine similarity |
camel/personas/persona_hub.py
Outdated
similarity_threshold (float): The threshold on consine similarity | ||
to determine whether the two personas are similar. | ||
embedding_model (BaseEmbedding): The embedding model | ||
for similarity compairsion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
for similarity compairsion. | |
for similarity comparison. |
camel/personas/persona_hub.py
Outdated
persona1 (Persona1): A persona. | ||
persona2 (Persona2): The other persona. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
persona1 (Persona1): A persona. | |
persona2 (Persona2): The other persona. | |
persona1 (Persona): A persona. | |
persona2 (Persona): The other persona. |
camel/personas/persona_hub.py
Outdated
for persona_id, persona in self.personas.items(): | ||
if not any( | ||
self.is_similar( | ||
persona, up, similarity_threshold, embedding_model | ||
) | ||
for up in unique_personas.values() | ||
): | ||
unique_personas[persona_id] = persona | ||
self.personas = unique_personas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use caching for embeddings to reduce the cost caused by repeated embed like below?
@staticmethod
@lru_cache(maxsize=128)
def get_embedding(embedding_model: BaseEmbedding, description: Optional[str]):
r"""Cache embeddings to reduce recomputation."""
return embedding_model.embed(description)
Hi Wendong @Wendong-Fan , sorry for the delay! I've made a few changes, including updating the embedding method and setting |
Hey @harryeqs , Thanks for the contribution! I added one more commit here 3bdc4f6 to finish this PR, feel free to check and leave your comments~ |
Description
Describe your changes in detail.
Motivation and Context
close #715
Types of changes
What types of changes does your code introduce? Put an
x
in all the boxes that apply:Implemented Tasks
Checklist
Go over all the following points, and put an
x
in all the boxes that apply.If you are unsure about any of these, don't hesitate to ask. We are here to help!