models(gallery): add granite-3.0-1b-a400m-instruct (#3994)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
mudler · Oct 28, 2024 · d1cb246
1 parent a8e10f0

Showing 2 changed files with 66 additions and 0 deletions.
diff --git a/gallery/granite.yaml b/gallery/granite.yaml
@@ -0,0 +1,40 @@
+---
+name: "granite"
+
+config_file: |
+  mmap: true
+  template:
+    chat_message: |
+      <|{{ .RoleName }}|>
+      {{ if .FunctionCall -}}
+      Function call:
+      {{ else if eq .RoleName "tool" -}}
+      Function response:
+      {{ end -}}
+      {{ if .Content -}}
+      {{.Content }}
+      {{ end -}}
+      {{ if .FunctionCall -}}
+      {{toJson .FunctionCall}}
+      {{ end -}}
+    function: |
+      <|system|>
+      You are a function calling AI model. You are provided with functions to execute. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools:
+      {{range .Functions}}
+      {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
+      {{end}}
+      For each function call return a json object with function name and arguments
+      {{.Input -}}
+      <|assistant|>
+    chat: |
+      {{.Input -}}
+      <|assistant|>
+    completion: |
+      {{.Input}}
+  context_size: 4096
+  f16: true
+  stopwords:
+  - '<|im_end|>'
+  - '<dummy32000>'
+  - '</s>'
+  - '<|'
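
Review note: the `chat_message` and `chat` templates added in gallery/granite.yaml rely on Go's `-}}` trim markers, so the prompt layout is not obvious from the YAML alone. The Python sketch below approximates what they should render to. It is not LocalAI code: `render_chat_message` and `render_chat_prompt` are hypothetical helpers, the `"\n"` joiner between messages is an assumption about how `.Input` is assembled, and `json.dumps` only stands in for `toJson` (key order and escaping may differ).

```python
import json


def render_chat_message(role, content="", function_call=None):
    """Approximate the `chat_message` template for a single message."""
    lines = [f"<|{role}|>"]
    if function_call:
        lines.append("Function call:")
    elif role == "tool":
        lines.append("Function response:")
    if content:
        lines.append(content)
    if function_call:
        # Stand-in for the template's `toJson .FunctionCall`.
        lines.append(json.dumps(function_call, separators=(",", ":")))
    # The YAML block scalar keeps one trailing newline after the last line.
    return "\n".join(lines) + "\n"


def render_chat_prompt(messages, joiner="\n"):
    """Approximate the `chat` template: joined messages, then the assistant tag.

    `joiner` is an assumption about how the rendered messages are combined
    into `.Input`; adjust it if the runtime joins them differently.
    """
    rendered = [render_chat_message(**message) for message in messages]
    # `{{.Input -}}` trims the newline that follows it, so the assistant tag
    # comes directly after the joined input.
    return joiner.join(rendered) + "<|assistant|>\n"


if __name__ == "__main__":
    print(render_chat_prompt([{"role": "user", "content": "Hi"}]), end="")
```

Under these assumptions a single user message "Hi" renders as `<|user|>`, `Hi`, `<|assistant|>` on three consecutive lines, which matches the role-tag layout the `'<|'` stopword is meant to cut off.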
diff --git a/gallery/index.yaml b/gallery/index.yaml
@@ -1,4 +1,30 @@
 ---
+- &granite3
+  name: "granite-3.0-1b-a400m-instruct"
+  urls:
+    - https://huggingface.co/ibm-granite/granite-3.0-1b-a400m-instruct
+    - https://huggingface.co/QuantFactory/granite-3.0-1b-a400m-instruct-GGUF
+  overrides:
+    parameters:
+      model: granite-3.0-1b-a400m-instruct.Q4_K_M.gguf
+  files:
+    - filename: granite-3.0-1b-a400m-instruct.Q4_K_M.gguf
+      sha256: 9571b5fc9676ebb59def3377dc848584463fb7f09ed59ebbff3b9f72fd7bd38a
+      uri: huggingface://QuantFactory/granite-3.0-1b-a400m-instruct-GGUF/granite-3.0-1b-a400m-instruct.Q4_K_M.gguf
+  url: "github:mudler/LocalAI/gallery/granite.yaml@master"
+  description: |
+    Granite 3.0 language models are a new set of lightweight state-of-the-art, open foundation models that natively support multilinguality, coding, reasoning, and tool usage, including the potential to be run on constrained compute resources. All the models are publicly released under an Apache 2.0 license for both research and commercial use. The models' data curation and training procedure were designed for enterprise usage and customization in mind, with a process that evaluates datasets for governance, risk and compliance (GRC) criteria, in addition to IBM's standard data clearance process and document quality checks.
+    Granite 3.0 includes 4 different models of varying sizes:
+        Dense Models: 2B and 8B parameter models, trained on 12 trillion tokens in total.
+        Mixture-of-Expert (MoE) Models: Sparse 1B and 3B MoE models, with 400M and 800M activated parameters respectively, trained on 10 trillion tokens in total.
+    Accordingly, these options provide a range of models with different compute requirements to choose from, with appropriate trade-offs with their performance on downstream tasks. At each scale, we release a base model — checkpoints of models after pretraining, as well as instruct checkpoints — models finetuned for dialogue, instruction-following, helpfulness, and safety.
+  tags:
+    - llm
+    - gguf
+    - gpu
+    - cpu
+    - moe
+    - granite
 - name: "moe-girl-1ba-7bt-i1"
   icon: https://cdn-uploads.huggingface.co/production/uploads/634262af8d8089ebaefd410e/kTXXSSSqpb21rfyOX7FUa.jpeg
   # chatml
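
Review note: the new index entry pins the GGUF file to a `sha256` value. For anyone who fetches the file manually, the following Python sketch shows one way to check a local copy against that value. The filename and digest are copied from the entry above; the `models/` directory and the helper names are hypothetical and not part of LocalAI.

```python
import hashlib
from pathlib import Path

# Values copied from the gallery entry in this commit.
EXPECTED_SHA256 = "9571b5fc9676ebb59def3377dc848584463fb7f09ed59ebbff3b9f72fd7bd38a"
FILENAME = "granite-3.0-1b-a400m-instruct.Q4_K_M.gguf"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_gallery_checksum(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical download location; point this at the real file.
    model_path = Path("models") / FILENAME
    if model_path.exists():
        ok = matches_gallery_checksum(model_path)
        print("checksum ok" if ok else "checksum mismatch")
    else:
        print(f"{model_path} not found")
```

Reading in fixed-size chunks keeps memory use flat regardless of model size, which matters for multi-gigabyte GGUF files.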