chore(docs): update readme
hanxiao committed Feb 10, 2022
1 parent 2ed1f72 commit 8e0cd10
Showing 2 changed files with 42 additions and 17 deletions.
2 changes: 1 addition & 1 deletion docs/advanced/document-store/index.md
@@ -3,8 +3,8 @@
```{toctree}
:hidden:
sqlite
weaviate
sqlite
```

57 changes: 41 additions & 16 deletions docs/advanced/document-store/weaviate.md
@@ -1,34 +1,59 @@
# Weaviate

One can use [Weaviate](https://www.semi.technology) as the document store for DocumentArray. It is useful when one wants to leverage Weaviate for storage and vector search.
One can use [Weaviate](https://www.semi.technology) as the document store for DocumentArray. It is useful when one wants to have faster Document retrieval on embeddings, i.e. `.match()`, `.find()`.


## Start Weaviate Service
## Usage

To use Weaviate as storage backend, one is required to have the Weaviate service started.
### Start Weaviate service

To use Weaviate as the storage backend, it is required to have the Weaviate service started. Create `docker-compose.yml` as follows:

```yaml
---
version: '3.4'
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: semitechnologies/weaviate:1.9.0
    ports:
    - 8080:8080
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'none'
      ENABLE_MODULES: ''
      CLUSTER_HOSTNAME: 'node1'
...
```

To start the Weaviate Docker Service, simply follow the instructions detailed [here](https://www.semi.technology/developers/weaviate/current/getting-started/installation.html#weaviate-without-any-modules).
Then start the service:

## Usage
```bash
docker compose up
```
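
Before creating a DocumentArray, it can help to confirm that the service is actually reachable. The following is a minimal sketch, not part of DocArray: the helper name `weaviate_is_ready` is hypothetical, and it assumes the service exposes Weaviate's readiness endpoint `/v1/.well-known/ready` at the default address used above.

```python
import http.client
import urllib.error
import urllib.request


def weaviate_is_ready(base_url: str = 'http://localhost:8080', timeout: float = 2.0) -> bool:
    """Return True if the Weaviate readiness endpoint answers with HTTP 200.

    Hypothetical helper: the endpoint path is an assumption about the
    running Weaviate version, not something DocArray provides.
    """
    url = base_url.rstrip('/') + '/v1/.well-known/ready'
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or malformed reply.
        return False
```

Calling `weaviate_is_ready()` right after `docker compose up` may return `False` for a few seconds while the container starts, so retrying in a short loop is a reasonable choice.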

### Create DocumentArray with Weaviate backend

Assuming service is started using the default configuration (i.e. server address is `http://localhost:8080`), one can instantiate a `DocumentArray` with Weaviate storage as such:
Assuming the service is started using the default configuration (i.e. the server address is `http://localhost:8080`), one can instantiate a DocumentArray with Weaviate storage as follows:

```python
from docarray import DocumentArray

da = DocumentArray(storage='weaviate', config={'n_dim': 10})
```

```{admonition} Config is Required
:class: tip
Unlike SQLite storage, `config` is a required parameter to instantiate `DocumentArray` with Weaviate as storage.
In `config`, user is required to provide an `int` value for `n_dim` that represents the number of dimension of embeddings to be stored.
This allows for Weaviate's vector search capability.
```
The usage would be the same as the ordinary DocumentArray.

To access `DocumentArray` formerly persisted in Weaviate, one can specify the name of the persisted Weaviate object representing the `DocumentArray`,
along with the address or the client connecting to the server where data is persisted (`name` is required in this case but `client` is optional.
If `client` is not provided, then it will connect to the Weaviate service bound to `http://localhost:8080`).
To access a DocumentArray formerly persisted, one can specify the name, the address or the client connecting to the server. `name` is required in this case but `client` is optional. If `client` is not provided, then it will connect to the Weaviate service bound to `http://localhost:8080`.

Note that the `name` parameter in `config` needs to be capitalized.
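
The rules above (`n_dim` always required, `name` required and capitalized when reconnecting, `client` optional) can be sketched as a small config builder. This is an illustrative sketch only: `weaviate_config` and `open_persisted` are hypothetical helpers, and "capitalized" is assumed here to mean that the name starts with an uppercase letter.

```python
from typing import Any, Dict, Optional


def weaviate_config(
    n_dim: int,
    name: Optional[str] = None,
    client: str = 'http://localhost:8080',
) -> Dict[str, Any]:
    """Build a `config` dict for DocumentArray(storage='weaviate', ...)."""
    if not isinstance(n_dim, int) or n_dim <= 0:
        raise ValueError('n_dim must be a positive int')
    config: Dict[str, Any] = {'n_dim': n_dim, 'client': client}
    if name is not None:
        # Assumption: "capitalized" means the first character is uppercase.
        if not name[:1].isupper():
            raise ValueError(f'name must be capitalized, got {name!r}')
        config['name'] = name
    return config


def open_persisted(name: str, n_dim: int, client: str = 'http://localhost:8080'):
    """Reconnect to a DocumentArray persisted under `name`.

    Requires docarray with Weaviate support and a running Weaviate service.
    """
    from docarray import DocumentArray  # imported lazily; needs docarray installed

    return DocumentArray(storage='weaviate', config=weaviate_config(n_dim, name, client))
```

For example, `open_persisted('Persisted', n_dim=10)` would reconnect to the class named `Persisted` on the default local service, while `weaviate_config(10, name='persisted')` raises a `ValueError` before any connection is attempted.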

@@ -48,7 +73,7 @@ The following configs can be set:

| Name | Description | Default |
|--------------------|---------------------------------------------------------------------------------------------------------|-----------------------------|
| `n_dim` | Number of dimensions of embeddings to be stored and retrieved | N/A, this is required field |
| `n_dim` | Number of dimensions of embeddings to be stored and retrieved | **This is always required** |
| `client` | Weaviate client; this can be a string uri representing the server address or a `weaviate.Client` object | `'http://localhost:8080'` |
| `name` | Weaviate class name; the class name of the Weaviate object to represent this DocumentArray | None |
| `serialize_config` | [Serialization config of each Document](../../fundamentals/document/serialization.md) | None |
