feat(settings): Update default model to TheBloke/Mistral-7B-Instruct-v0.2-GGUF (#1415)

* Update LlamaCPP dependency

* Default to TheBloke/Mistral-7B-Instruct-v0.2-GGUF

* Fix API docs
imartinez authored Dec 17, 2023
1 parent c71ae7c commit 8ec7cf4
Showing 5 changed files with 1,433 additions and 1,233 deletions.
13 changes: 13 additions & 0 deletions fern/docs/pages/api-reference/api-reference.mdx
@@ -1 +1,14 @@
# API Reference

The API is divided in two logical blocks:

1. High-level API, abstracting all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation:
- Ingestion of documents: internally managing document parsing, splitting, metadata extraction,
embedding generation and storage.
- Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt
engineering and the response generation.

2. Low-level API, allowing advanced users to implement their own complex pipelines:
- Embeddings generation: based on a piece of text.
- Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested
documents.
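
The added API Reference page splits the API into a high-level block (ingest documents, then chat/complete with their context) and a low-level block (embeddings and contextual chunk retrieval). The sketch below shows roughly how a client could exercise both blocks; it is illustrative only, assuming a local server at http://localhost:8001, and the endpoint paths and payload fields (`/v1/ingest`, `/v1/chat/completions`, `/v1/embeddings`, `/v1/chunks`, `use_context`) should be verified against the generated API docs.

```python
# Illustrative client sketch; endpoint paths and payload fields are assumptions
# and should be checked against the project's generated API reference.
import requests

BASE_URL = "http://localhost:8001"  # assumed default local server address

# High-level block: ingest a document, then chat using its context.
with open("manual.pdf", "rb") as f:  # hypothetical document
    requests.post(f"{BASE_URL}/v1/ingest", files={"file": f})

chat = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Summarize the manual."}],
        "use_context": True,  # retrieve context from ingested documents
    },
)
print(chat.json())

# Low-level block: embeddings for a piece of text, and the most relevant
# chunks from ingested documents for a query.
emb = requests.post(f"{BASE_URL}/v1/embeddings", json={"input": "a piece of text"})
chunks = requests.post(f"{BASE_URL}/v1/chunks", json={"text": "What does the manual say about setup?"})
print(emb.json())
print(chunks.json())
```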
15 changes: 0 additions & 15 deletions fern/docs/pages/overview/welcome.mdx
@@ -32,21 +32,6 @@ The installation guide will help you in the [Installation section](/installation
/>
</Cards>

## API Organization

The API is divided in two logical blocks:

1. High-level API, abstracting all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation:
- Ingestion of documents: internally managing document parsing, splitting, metadata extraction,
embedding generation and storage.
- Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt
engineering and the response generation.

2. Low-level API, allowing advanced users to implement their own complex pipelines:
- Embeddings generation: based on a piece of text.
- Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested
documents.

<Callout intent = "info">
A working **Gradio UI client** is provided to test the API, together with a set of useful tools such as bulk
model download script, ingestion script, documents folder watch, etc.
