1.12.0
Migrated providers from fetch to remote.net.request. Closes #26
This avoids CORS issues and improves performance.
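As an illustration, a request routed through Electron's `net` module runs in the main process and is therefore not subject to the renderer's CORS checks. The helper below is a hypothetical sketch of that pattern, not the plugin's actual wrapper:

```ts
// Hypothetical fetch-like helper over Electron's net module, called from
// the renderer through the remote interface. The request executes in the
// main process, so the renderer's CORS restrictions do not apply.
import { remote } from "electron"; // on Electron >= 14: require("@electron/remote")

function netRequest(url: string, body?: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const request = remote.net.request({ method: body ? "POST" : "GET", url });
    request.on("response", (response) => {
      const chunks: Buffer[] = [];
      response.on("data", (chunk: Buffer) => chunks.push(chunk));
      response.on("end", () => resolve(Buffer.concat(chunks).toString("utf8")));
    });
    request.on("error", reject);
    if (body) {
      request.setHeader("Content-Type", "application/json");
      request.write(body);
    }
    request.end();
  });
}
```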
Refactored AI provider and embedding functionality; added optimized model reloading
By default, the Ollama API caps the context window at 2048 tokens, even for the largest models. Heuristics now provide the full context window when a request needs it, while keeping VRAM consumption in check.
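A sketch of one such heuristic, using the standard Ollama `options.num_ctx` parameter; the chars-per-token estimate, the cap, and the rounding strategy are illustrative assumptions, not the plugin's exact logic:

```ts
// Hypothetical heuristic: keep Ollama's 2048-token default for short
// prompts and only request a larger context window when the prompt
// actually demands it. The ratio and cap below are rough guesses.
const DEFAULT_NUM_CTX = 2048;
const APPROX_CHARS_PER_TOKEN = 4;

function estimateNumCtx(prompt: string, modelMaxCtx = 8192): number | undefined {
  const estimatedTokens = Math.ceil(prompt.length / APPROX_CHARS_PER_TOKEN);
  if (estimatedTokens <= DEFAULT_NUM_CTX) {
    return undefined; // the default is enough; don't waste VRAM on a bigger window
  }
  // Snap to the next power of two so num_ctx takes only a few distinct
  // values; Ollama reloads the model when num_ctx changes, so fewer
  // distinct values means fewer reloads.
  return Math.min(2 ** Math.ceil(Math.log2(estimatedTokens)), modelMaxCtx);
}

// Usage: pass the result as options.num_ctx in the Ollama request body,
// e.g. { model, prompt, options: { num_ctx: estimateNumCtx(prompt) } }.
```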
Added cache invalidation after changing the embedding model
Previously, the cache was not invalidated when the embedding model changed. This is critical because embeddings are not interchangeable between models.
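A minimal sketch of the idea, with hypothetical names: tag the cache with the model that produced it and discard it whenever a different model is selected.

```ts
// Hypothetical cache keyed by the embedding model that produced it.
interface EmbeddingCache {
  model: string;                  // model the vectors came from
  vectors: Map<string, number[]>; // content hash -> embedding vector
}

let cache: EmbeddingCache | null = null;

function getCacheFor(model: string): EmbeddingCache {
  // Vectors from different models live in different vector spaces, so a
  // stale cache would silently return meaningless nearest neighbors.
  if (cache === null || cache.model !== model) {
    cache = { model, vectors: new Map() };
  }
  return cache;
}
```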
Added prompt templating for context and selection
```
Context information is below.
{{=CONTEXT_START=}}
---------------------
{{=CONTEXT=}}
---------------------
{{=CONTEXT_END=}}
Given the context information and not prior knowledge, answer the query.
Query: {{=SELECTION=}}
Answer:
```
More about prompt templating in prompt-templating.md.
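As a hypothetical illustration of how the `{{=NAME=}}` placeholders in the template above can be substituted (the helper below is a sketch, not the plugin's implementation):

```ts
// Hypothetical placeholder substitution for the template shown above.
const template = [
  "Context information is below.",
  "{{=CONTEXT_START=}}",
  "---------------------",
  "{{=CONTEXT=}}",
  "---------------------",
  "{{=CONTEXT_END=}}",
  "Given the context information and not prior knowledge, answer the query.",
  "Query: {{=SELECTION=}}",
  "Answer:",
].join("\n");

function renderTemplate(tpl: string, vars: Record<string, string>): string {
  // Replace each {{=NAME=}} marker; unknown names are left untouched.
  return tpl.replace(/\{\{=([A-Z_]+)=\}\}/g, (match: string, name: string) =>
    name in vars ? vars[name] : match
  );
}

const prompt = renderTemplate(template, {
  CONTEXT_START: "",
  CONTEXT_END: "",
  CONTEXT: "Meeting notes: the sync moved to Friday at 10:00.",
  SELECTION: "When is the sync?",
});
```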