# Releases: andygill/haverscript
## Second Release (with 3.10 and 3.11 support)

### Added
- Support for Python 3.10 and 3.11, as well as 3.12 and 3.13 (0.2.1).
- Added a `Middleware` type for composable prompt and response handlers.
- `Middleware` can be added using `|`, giving a small pipe-based representation
  of flow (see the sketches after this list). We have the following middleware
  components:
  - `echo()` adds echoing of prompts and replies.
  - `retry()` uses the tenacity package to provide a generic retry.
  - `validate()` checks the response against a predicate.
  - `stats()` adds a dynamic single-line summary of each LLM call.
  - `cache()` adds a caching component.
  - `transcript()` adds a transcript component (records the session to a file).
  - `trace()` logs the calls through the middleware in both directions.
  - `fresh()` requests a fresh call to the LLM.
  - `options()` sets specific options.
  - `model()` sets the model being used.
  - `format()` requires the output in JSON, with an optional pydantic class
    schema (see the second sketch below).
  - `meta()` is a hook that allows middleware to act like a test-time LLM.
- Added prompt-specific flags to `Model.chat` (both appear in the first sketch
  below):
  - `images: list[str]` are images to be passed to the model.
  - `middleware: Middleware` appends a chat-specific middleware to the call.
- Added a `Service` class, which can be asked about models and can generate
  `Model`s.
- Added `response.value`, which returns the JSON `dict` of the reply, the
  pydantic class, or `None`.
- Added a spinner when waiting for the first token from the LLM when using
  `echo`.
- Added `metrics` to `Response`, which contains basic metrics about the LLM
  call.
- Added a `render()` method to `Model`, for outputting a markdown-style view of
  the session.
- Added a `load()` method to `Model`, for parsing markdown-style sessions.
- Added `LLMError` and subclasses.
- Added support for together.ai's API as a first-class alternative to ollama.
- Added many more examples.
- Added many more tests.
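To illustrate the new style, here is a minimal sketch of a middleware pipeline
together with the chat-specific flags. The middleware names come from the list
above, but the import path, the cache filename, the option values, and the
image filename are illustrative assumptions rather than confirmed API details.

```python
from haverscript import connect, echo, cache, options, stats  # import path assumed

# Compose a session by piping middleware onto a model; each component
# wraps the prompt/response flow of the components to its left.
session = connect("modelname") | echo() | cache("session.db") | options(temperature=0.7)

# Prompt-specific flags on Model.chat: images to show the model, plus
# middleware that applies to this single call only.
response = session.chat(
    "Describe this picture.",
    images=["picture.png"],  # hypothetical image file
    middleware=stats(),
)

print(response.metrics)  # basic metrics about the LLM call
```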
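Likewise, a sketch of structured output: `format()` requests JSON, and
`response.value` gives it back. Passing the pydantic class directly to
`format()` is an assumption about its signature.

```python
from pydantic import BaseModel

from haverscript import connect, format  # import path assumed; shadows the builtin format

class Verdict(BaseModel):
    answer: str
    confidence: float

# With a schema, response.value is the parsed pydantic instance;
# without one it is a plain JSON dict, or None when parsing fails.
session = connect("modelname") | format(Verdict)
response = session.chat("Is water wet? Reply with an answer and a confidence.")
verdict = response.value
```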
### Fixed

### Changed
- Updated the `children` method to return all children when no prompt is
  supplied.
- Reworked the SQL cache schema to store context as a chain of responses, and
  to use a string pool.
- Using the cache now consumes cached LLM results in order, until exhausted,
  then calls the LLM.
### Removed
There are some breaking API changes. In all cases, the functionality has been
replaced with something more general and principled.
The concepts that caused changes are:
- Once you have a `Response`, that interaction with the LLM is considered done.
  There are no longer functions that attempt to re-run the call. Instead,
  middleware functions can be used to filter out responses as needed.
- There is no longer the concept of a `Response` being "fresh". Instead, the
  cache uses a cursor when reading cached responses, and it is possible to ask
  that a specific interaction bypass the cache, using the `fresh()` middleware
  (see the sketch below).
- Most helper methods (`echo()`, `cache()`, etc.) are now `Middleware`, and
  thus more flexible.
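For example, a minimal sketch of the cursor-plus-bypass behaviour (the import
path and the cache filename are assumptions):

```python
from haverscript import connect, cache, fresh  # import path assumed

session = connect("modelname") | cache("session.db")

session.chat("What is 2 + 2?")                      # replays the next cached response, if any
session.chat("What is 2 + 2?", middleware=fresh())  # forces a real call to the LLM
```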
Specifically, here are the changes:
- Removed `check()` and `redo()` from `Response`. Replace them with
  `validate()` and `retry()` before the call to `chat`, or as chat-specific
  middleware.
- Removed `fresh` from `Response`. The concept of fresh responses has been
  replaced with a more robust caching middleware; there is now `fresh()`
  middleware.
- Removed `json()` from `Model`. It is replaced with the more general
  `format()` middleware.
- `echo()` and `cache()` are no longer `Model` methods; they are now
  `Middleware` instances.
- The utility functions `accept` and `valid_json` are removed. They added no
  value, given the removal of `redo`.
So, previously we would have `session = connect("modelname").echo()`, and we
now have `session = connect("modelname") | echo()`.
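As a fuller migration sketch for the removed `check()`/`redo()`: the old usage
shown in the comment is hypothetical, and composing the two replacement
middleware with `|` is inferred from the pipe-based design rather than
confirmed.

```python
from haverscript import connect, validate, retry  # import path assumed

def has_digits(reply: str) -> bool:
    return any(ch.isdigit() for ch in reply)

session = connect("modelname")

# Old (removed): response = session.chat(prompt).check(has_digits).redo()
# New: validate the reply and retry on failure, as chat-specific middleware.
response = session.chat(
    "Pick a number between 1 and 10.",
    middleware=validate(has_digits) | retry(),
)
```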
## First release

v0.1.0: Change license to MIT.