Draft: Support execution limits in run_ functions #374
API here is changing, this is not up to date.
message_limit and token_limit params in run_ functions
_request_count: int = 0
_request_tokens_count: int = 0
_response_tokens_count: int = 0
_total_tokens_count: int = 0
Should these be public?
I would say yes if we want to also include this structure in RunContext...
I don't like the idea that the settings object also holds state, it feels to me like there should be a separate object for tracking state, and we can check the state against the settings. If I were a user I'd be inclined to reuse an instance of ExecutionLimitSettings, which obviously will cause issues.
I would imagine we make a private type _UsageState or similar (which holds all the fields you are talking about here), and have one of ExecutionLimits and _UsageState have a method that accepts the other and raises an error if appropriate.
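A minimal sketch of that split, to make the proposal concrete. The names `ExecutionLimits` and `_UsageState` come from the comment above; the field names, the `check` method, and the `LimitExceeded` exception are assumptions made up for illustration, not the PR's actual API.

```python
from __future__ import annotations

from dataclasses import dataclass


class LimitExceeded(Exception):
    """Hypothetical error type, used only in this sketch."""


@dataclass(frozen=True)
class ExecutionLimits:
    """Immutable settings: holds no counters, so one instance can be reused."""

    request_limit: int | None = None
    total_tokens_limit: int | None = None


@dataclass
class _UsageState:
    """Per-run counters, created fresh at the start of every run."""

    request_count: int = 0
    total_tokens_count: int = 0

    def check(self, limits: ExecutionLimits) -> None:
        """Raise if any counter has passed its configured limit."""
        pairs = [
            ('request', limits.request_limit, self.request_count),
            ('total_tokens', limits.total_tokens_limit, self.total_tokens_count),
        ]
        for name, limit, count in pairs:
            if limit is not None and count > limit:
                raise LimitExceeded(f'Exceeded {name} limit of {limit} by {count - limit}')


def simulate_run(limits: ExecutionLimits, n_requests: int) -> _UsageState:
    """Stand-in for an agent run: each loop iteration is one model request."""
    state = _UsageState()
    for _ in range(n_requests):
        state.request_count += 1
        state.check(limits)
    return state


# The same limits object is shared by two runs without leaking state between them.
shared_limits = ExecutionLimits(request_limit=2)
first = simulate_run(shared_limits, 2)
second = simulate_run(shared_limits, 2)
```

Because the counters live on `_UsageState`, reusing `shared_limits` across runs is harmless, which is the failure mode the comment is worried about.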
If we want to put the usage state on the RunContext we can make it public, but I feel like we can do that later/separately. I'll note that I could imagine Samuel disagreeing with all this, and I wouldn't find that unreasonable.
model_settings = merge_model_settings(self.model_settings, model_settings)
execution_limit_settings = execution_limit_settings or ExecutionLimitSettings(request_limit=50)
Is this where we want to set the default?
@@ -191,6 +191,7 @@ async def run(
    model: models.Model | models.KnownModelName | None = None,
    deps: AgentDeps = None,
    model_settings: ModelSettings | None = None,
    execution_limit_settings: ExecutionLimitSettings | None = None,
- execution_limit_settings: ExecutionLimitSettings | None = None,
+ execution_limits: ExecutionLimits | None = None,
this would be my preference
def _check_limit(self, limit: int | None, count: int, limit_name: str) -> None:
    if limit and limit < count:
        raise UnexpectedModelBehavior(f'Exceeded {limit_name} limit of {limit} by {count - limit}')
I feel like this deserves its own exception, and probably one that doesn't inherit from UnexpectedModelBehavior (as this is more or less expected behavior)
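To illustrate what a dedicated exception would buy callers, here is a rough sketch. `UsageLimitExceeded` is a hypothetical name, `UnexpectedModelBehavior` is redefined locally as a stand-in so the snippet runs on its own, and `check_limit` is a free-function variant of the `_check_limit` method quoted above. One deliberate difference: the sketch tests `limit is not None` rather than truthiness, so a limit of 0 would be enforced; that is a choice of this sketch, not something the PR states.

```python
from __future__ import annotations


class UnexpectedModelBehavior(Exception):
    """Local stand-in for the library exception of the same name."""


class UsageLimitExceeded(Exception):
    """Hypothetical: a user-configured limit was hit (expected, not a model fault)."""


def check_limit(limit: int | None, count: int, limit_name: str) -> None:
    """Raise UsageLimitExceeded when count has gone past a configured limit."""
    if limit is not None and count > limit:
        raise UsageLimitExceeded(f'Exceeded {limit_name} limit of {limit} by {count - limit}')


def describe(limit: int | None, count: int) -> str:
    """Show how a caller can tell the two failure modes apart."""
    try:
        check_limit(limit, count, 'request')
    except UnexpectedModelBehavior:
        return 'model misbehaved'
    except UsageLimitExceeded as exc:
        return f'stopped by limit: {exc}'
    return 'ok'
```

Since the two classes are unrelated, an `except UnexpectedModelBehavior` handler written for genuine model errors would not swallow a limit being reached.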
@@ -254,6 +256,8 @@ async def run(

messages.append(model_response)
cost += request_cost
# TODO: is this the right location? Should we move this earlier in the logic?
execution_limit_settings.increment(request_cost)
I personally would prefer if we added a request_count field to the Cost type, and then just did execution_limit_settings.validate(cost) here (rather than incrementing both cost and the limits).
I'd also prefer we rename Cost to Usage or similar, since that's really what it's representing now, and would make it feel less weird to add the request_count field. But even if we don't rename it like that, I think it's reasonable to add request_count: int (or requests: int) as a field on the type currently known as Cost.
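A sketch of how that could look, under assumptions: `Usage` stands in for the renamed `Cost`, its fields and `__add__` are invented for this example, and `ExecutionLimits.validate` and `LimitExceeded` are hypothetical. The point is only that the run loop keeps a single accumulator and validates it, instead of incrementing two objects.

```python
from __future__ import annotations

from dataclasses import dataclass


class LimitExceeded(Exception):
    """Hypothetical error type, used only in this sketch."""


@dataclass
class Usage:
    """Hypothetical renamed Cost that also counts requests."""

    requests: int = 0
    request_tokens: int = 0
    response_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.request_tokens + self.response_tokens

    def __add__(self, other: Usage) -> Usage:
        return Usage(
            requests=self.requests + other.requests,
            request_tokens=self.request_tokens + other.request_tokens,
            response_tokens=self.response_tokens + other.response_tokens,
        )


@dataclass(frozen=True)
class ExecutionLimits:
    request_limit: int | None = None
    total_tokens_limit: int | None = None

    def validate(self, usage: Usage) -> None:
        """Compare accumulated usage against the limits; this object keeps no state."""
        if self.request_limit is not None and usage.requests > self.request_limit:
            raise LimitExceeded(f'Exceeded request limit of {self.request_limit}')
        if self.total_tokens_limit is not None and usage.total_tokens > self.total_tokens_limit:
            raise LimitExceeded(f'Exceeded total_tokens limit of {self.total_tokens_limit}')


limits = ExecutionLimits(request_limit=5, total_tokens_limit=100)
usage = Usage()
# Each element plays the role of request_cost for one model request.
for request_usage in [Usage(1, 10, 5), Usage(1, 12, 8)]:
    usage += request_usage   # the only accumulator that gets updated
    limits.validate(usage)   # replaces a separate increment on the settings object
```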
Fix #70
Should also fix #267
TODO:
execution_limit_settings to RunContext
StreamedRunResult?
There's a part of me that's tempted to call this AgentSettings or something, though I think that's misleading bc it still only takes effect on a run call, not across multiple agent run calls...