
Draft: Support execution limits in run_ functions #374

Draft · wants to merge 3 commits into main
Conversation

@sydney-runkle (Member) commented Dec 17, 2024

Fix #70
Should also fix #267

TODO:

  • docs, once we verify this is the API we want (decide on name for this dataclass)
  • tests, same blocker as above
  • Add execution_limit_settings to RunContext
  • Confirm correct behavior for streaming - how to deal with StreamedRunResult?

There's a part of me that's tempted to call this AgentSettings or something, though I think that's misleading because it still only takes effect on a single run call, not across multiple agent run calls...

cloudflare-workers-and-pages bot commented Dec 17, 2024

Deploying pydantic-ai with Cloudflare Pages

Latest commit: bc08546
Status: ✅ Deploy successful!
Preview URL: https://3853196a.pydantic-ai.pages.dev
Branch Preview URL: https://cost-and-msg-limits.pydantic-ai.pages.dev

@sydney-runkle (Member, Author) commented

API here is changing; this is not up to date.

@sydney-runkle changed the title from "Draft: Support message_limit and token_limit params in run_ functions" to "Draft: Support execution limits in run_ functions" on Dec 17, 2024
Comment on lines +92 to +95
_request_count: int = 0
_request_tokens_count: int = 0
_response_tokens_count: int = 0
_total_tokens_count: int = 0
sydney-runkle (Member, Author):

Should these be public?

sydney-runkle (Member, Author):

I would say yes if we want to also include this structure in RunContext...

@dmontagu (Contributor) commented Dec 17, 2024:

I don't like the idea that the settings object also holds state; it feels to me like there should be a separate object for tracking state, and we can check the state against the settings. If I were a user, I'd be inclined to reuse an instance of ExecutionLimitSettings, which obviously would cause issues.

I would imagine we make a private type _UsageState or similar (holding all the fields you're talking about here), and have either ExecutionLimits or _UsageState expose a method that accepts the other and raises an error if appropriate.
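For concreteness, here's a rough sketch of the separation being suggested; every name and field below is illustrative, not the API this PR settles on:

from dataclasses import dataclass


@dataclass
class ExecutionLimits:
    # User-facing, reusable limits; holds no mutable state.
    request_limit: int | None = 50
    total_tokens_limit: int | None = None


@dataclass
class _UsageState:
    # Private per-run counters; a fresh instance is created for every run.
    request_count: int = 0
    total_tokens_count: int = 0

    def check_against(self, limits: ExecutionLimits) -> None:
        # Raise if any counter has exceeded the corresponding limit.
        if limits.request_limit is not None and self.request_count > limits.request_limit:
            raise RuntimeError(f'Exceeded request limit of {limits.request_limit}')
        if limits.total_tokens_limit is not None and self.total_tokens_count > limits.total_tokens_limit:
            raise RuntimeError(f'Exceeded total tokens limit of {limits.total_tokens_limit}')

Because ExecutionLimits carries no mutable counters, a user could safely reuse a single instance across runs, which is exactly the reuse scenario the comment above is worried about.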

Contributor:

If we want to put the usage state on the RunContext we can make it public, but I feel like we can do that later/separately. I'll note that I could imagine Samuel disagreeing with all this, and I wouldn't find that unreasonable.

model_settings = merge_model_settings(self.model_settings, model_settings)
execution_limit_settings = execution_limit_settings or ExecutionLimitSettings(request_limit=50)
sydney-runkle (Member, Author):

Is this where we want to set the default?

@@ -191,6 +191,7 @@ async def run(
model: models.Model | models.KnownModelName | None = None,
deps: AgentDeps = None,
model_settings: ModelSettings | None = None,
execution_limit_settings: ExecutionLimitSettings | None = None,
Contributor:

Suggested change
execution_limit_settings: ExecutionLimitSettings | None = None,
execution_limits: ExecutionLimits | None = None,

this would be my preference
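For illustration, a hypothetical call site under the suggested name; the parameter name, the ExecutionLimits class, and the default are assumptions from this thread, not the released API:

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

# `ExecutionLimits` is the hypothetical class sketched earlier in this thread;
# `execution_limits` is the parameter name suggested above.
result = agent.run_sync(
    'What is the capital of France?',
    execution_limits=ExecutionLimits(request_limit=5),
)
print(result.data)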


def _check_limit(self, limit: int | None, count: int, limit_name: str) -> None:
    if limit and limit < count:
        raise UnexpectedModelBehavior(f'Exceeded {limit_name} limit of {limit} by {count - limit}')
Contributor:

I feel like this deserves its own exception, and probably one that doesn't inherit from UnexpectedModelBehavior (as this is more or less expected behavior)
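One possible shape for that (a sketch only; the exception name and its placement are assumptions):

class UsageLimitExceeded(Exception):
    # Hypothetical dedicated exception for exceeded execution limits.
    # Deliberately not a subclass of UnexpectedModelBehavior: hitting a
    # user-configured limit is expected behavior, not a model error.
    pass


# _check_limit would then raise UsageLimitExceeded instead of UnexpectedModelBehavior.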

@@ -254,6 +256,8 @@ async def run(

messages.append(model_response)
cost += request_cost
# TODO: is this the right location? Should we move this earlier in the logic?
execution_limit_settings.increment(request_cost)
Contributor:

I personally would prefer if we added a request_count field to the Cost type, and then just did execution_limit_settings.validate(cost) here (rather than incrementing both cost and the limits).

I'd also prefer we rename Cost to Usage or similar, since that's really what it's representing now, and would make it feel less weird to add the request_count field. But even if we don't rename it like that, I think it's reasonable to add request_count: int (or requests: int) as a field on the type currently known as Cost
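A sketch of how that could fit together, reusing the hypothetical names from the earlier sketches; none of this is the final API:

from dataclasses import dataclass


class UsageLimitExceeded(Exception):
    # Same hypothetical exception as sketched earlier in the thread.
    pass


@dataclass
class Usage:
    # The type currently known as Cost, with a request counter folded in.
    request_count: int = 0
    request_tokens: int | None = None
    response_tokens: int | None = None
    total_tokens: int | None = None


@dataclass
class ExecutionLimits:
    request_limit: int | None = 50
    total_tokens_limit: int | None = None

    def validate(self, usage: Usage) -> None:
        # Check accumulated usage against the limits, rather than keeping a
        # second set of counters inside the settings object.
        if self.request_limit is not None and usage.request_count > self.request_limit:
            raise UsageLimitExceeded(f'Exceeded request limit of {self.request_limit}')
        if self.total_tokens_limit is not None and (usage.total_tokens or 0) > self.total_tokens_limit:
            raise UsageLimitExceeded(f'Exceeded total tokens limit of {self.total_tokens_limit}')


# In run(), the call site would then be roughly:
#     cost += request_cost
#     execution_limits.validate(cost)
# instead of incrementing both the cost and the limit counters.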
