Design error handling mechanism #7
Currently batches are executed like this: datasette-enrichments/datasette_enrichments/__init__.py Lines 115 to 118 in 58b4990
Having that method raise an exception isn't ideal, because if you hand it 100 rows and it works fine for 99 of them but errors on one, we don't want to assume the entire batch failed. Instead, it could return a list of errors - something like this:

    errors = await self.enrich_batch(
        db, job["table_name"], rows, json.loads(job["config"])
    )

That returned list would need to map row primary keys to errors though - and the information about what a primary key is has not really been dealt with yet. Maybe that list is a Bit of an ugly thing to have to implement each time in the embeddings though. Unless... there could be a Then if you want to do fancy batches - like with the OpenAI embeddings API - you can implement
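The "return errors instead of raising" idea can be pictured as a batch method that tries each row on its own and collects the failures. This is only a sketch under assumptions: `enrich_row`, the `id` key, and the `(primary key, message)` tuple shape are hypothetical names chosen for illustration, not part of the plugin's API.

```python
import asyncio
from typing import Any, Dict, List, Tuple


async def enrich_row(row: Dict[str, Any]) -> None:
    # Hypothetical per-row work: fails for rows with no text to enrich.
    if not row.get("text"):
        raise ValueError("row has no text")


async def enrich_batch(rows: List[Dict[str, Any]]) -> List[Tuple[Any, str]]:
    # Returns (primary key, error message) pairs rather than raising,
    # so one bad row does not mark the whole batch as failed.
    errors: List[Tuple[Any, str]] = []
    for row in rows:
        try:
            await enrich_row(row)
        except Exception as ex:
            errors.append((row["id"], str(ex)))
    return errors


rows = [
    {"id": 1, "text": "hello"},
    {"id": 2, "text": ""},
    {"id": 3, "text": "world"},
]
errors = asyncio.run(enrich_batch(rows))
```

Here only row 2 ends up in `errors`; rows 1 and 3 are treated as successfully enriched.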
Then I can have a table called _enrichment_errors:

    create table _enrichment_errors (
        id integer primary key,
        job_id integer references _enrichment_jobs(id),
        row_pk text, -- JSON encoded, can be integer or string or ["compound", "pk"]
        error text,
        created_at text
    )
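To see how the JSON-encoded `row_pk` column could hold all three kinds of primary key, here is a small sketch using Python's standard `sqlite3` module. The one-column `_enrichment_jobs` table and the "example error" message are placeholders assumed for the demonstration; the real jobs table has its own schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Placeholder jobs table (assumption): only the id column matters here.
conn.execute("create table _enrichment_jobs (id integer primary key)")
conn.execute(
    """
    create table _enrichment_errors (
        id integer primary key,
        job_id integer references _enrichment_jobs(id),
        row_pk text,
        error text,
        created_at text
    )
    """
)
conn.execute("insert into _enrichment_jobs (id) values (1)")

# Integer, string and compound primary keys all go through json.dumps.
for pk in (5, "abc", ["compound", "pk"]):
    conn.execute(
        "insert into _enrichment_errors (job_id, row_pk, error, created_at) "
        "values (?, ?, ?, datetime('now'))",
        (1, json.dumps(pk), "example error"),
    )

# Decoding with json.loads recovers the original key values.
stored = [
    json.loads(row[0])
    for row in conn.execute("select row_pk from _enrichment_errors order by id")
]
```

Storing the key as JSON text keeps the column type uniform while still letting a reader tell the integer `5` apart from the string `"5"`.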
I think it's on the plugins themselves to catch errors and write them to this table. I'll add a
I've been thinking about having an optional
Or I could just have a default If you override
New idea: instead of this:

    async def log_error(self, db: "Database", job_id: int, id: Any, error: str):

I'll do this:

    async def log_error(self, db: "Database", job_id: int, ids: List[Union[int, str, Tuple[Union[int, str], ...]]], error: str):

So any time you record an error you can record it against a LIST of row primary keys:

    from typing import Union, Tuple, List

    # Define the type for the elements in the list:
    # an integer pk, a string pk, or a tuple for a compound pk
    ElementType = Union[int, str, Tuple[Union[int, str], ...]]

    async def log_error(self, db: "Database", job_id: int, ids: List[ElementType], error: str):
        pass
That way I can have a default error logging mechanism where if your
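A rough sketch of what a list-based `log_error` might do is below. It assumes one `_enrichment_errors` row is written per primary key, and it takes a plain `sqlite3` connection in place of the `self` and `db: "Database"` arguments from the signature above; both choices are assumptions made to keep the example self-contained.

```python
import asyncio
import json
import sqlite3
from typing import List, Tuple, Union

ElementType = Union[int, str, Tuple[Union[int, str], ...]]


async def log_error(
    conn: sqlite3.Connection, job_id: int, ids: List[ElementType], error: str
) -> None:
    # Assumption: one row per primary key, all sharing the same error message.
    conn.executemany(
        "insert into _enrichment_errors (job_id, row_pk, error) values (?, ?, ?)",
        [(job_id, json.dumps(pk), error) for pk in ids],
    )


conn = sqlite3.connect(":memory:")
# Trimmed-down version of the errors table, enough for this demonstration.
conn.execute(
    "create table _enrichment_errors "
    "(id integer primary key, job_id integer, row_pk text, error text)"
)
# A whole batch failed for the same reason, so log it against every pk at once.
asyncio.run(log_error(conn, 1, [1, "two", ("a", 3)], "API timeout"))
logged = conn.execute(
    "select row_pk, error from _enrichment_errors order by id"
).fetchall()
```

Accepting a list means a batch-level failure (for example a single failed API call covering many rows) can be recorded in one call without the plugin looping over rows itself.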
Refs:
I added an error_count column, and I have the idea that the job should automatically be cancelled if more than X errors occur (default X = 5). But... how does that actually work at the code level?
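One possible shape for the cancellation check, offered as a sketch and not as the answer the issue settles on: keep a running `error_count` while looping over batches and stop once it passes the threshold. The `run_job` function, the in-memory `job` dict, and the `"cancelled"` / `"finished"` status strings are all hypothetical, and the loop is synchronous for brevity.

```python
from typing import Callable, Dict, List, Union

MAX_ERRORS = 5  # the default X mentioned above


def run_job(
    batches: List[List[int]],
    process_batch: Callable[[List[int]], List[int]],
    max_errors: int = MAX_ERRORS,
) -> Dict[str, Union[str, int]]:
    # process_batch returns the items that failed; everything else succeeded.
    job: Dict[str, Union[str, int]] = {"status": "running", "error_count": 0}
    for batch in batches:
        failed = process_batch(batch)
        job["error_count"] += len(failed)
        if job["error_count"] > max_errors:
            # Too many errors so far: stop before touching later batches.
            job["status"] = "cancelled"
            break
    else:
        # The loop ran to completion without hitting the threshold.
        job["status"] = "finished"
    return job


def fail_odd(batch: List[int]) -> List[int]:
    # Stand-in for an enrichment that fails on odd-numbered rows.
    return [n for n in batch if n % 2]


batches = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
cancelled_job = run_job(batches, fail_odd)
finished_job = run_job(batches, fail_odd, max_errors=100)
```

With the default threshold the job stops after the third batch, since the count only exceeds 5 at that point; checking between batches means a job can overshoot X by up to one batch's worth of errors.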