Verba - Pip Installation not working at all on Ubuntu 24.04 #317

Open · Flagelmann opened this issue Nov 5, 2024 · 6 comments
Labels: investigating (Bugs that are still being investigated whether they are valid)

Comments

Flagelmann commented Nov 5, 2024

Description

There is no way to get Verba working, even on a brand-new Ubuntu 24.04 VM.

Installation

  • pip install goldenverba

If you installed via pip, please specify the version:

Latest Python version on Ubuntu 24.04

Weaviate Deployment

  • Local Deployment

Configuration

Once I access the Verba portal, I do not have any "RAG", "Add Documents", etc. tabs.

Steps to Reproduce

Spin up a brand-new VM based on Ubuntu 24.04, create a simple Python venv, run pip install goldenverba, set up the .env with Ollama, and open the Verba frontend on port 8000. Once Local Deployment is selected, there are no RAG or Add Documents tabs and no settings available.
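
For reference, the commands were roughly the following (a sketch; the venv name is just an example):

python3 -m venv verba-env
source verba-env/bin/activate   # activation step, see the note under "Additional info"
pip install goldenverba
# populate .env with the Ollama settings from the Verba README (host-side Ollama URL, model names)
verba start
# then open http://localhost:8000 and select Local Deployment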

Additional context

During the startup I get the following:

✘ Document Count retrieval failed: Query call with protocol GQL
Aggregate failed with message Error in GraphQL response: [ { "locations":
[ { "column": 12, "line": 1 } ], "message":
"Cannot query field "VERBA_Embedding_gemma2_9b" on type
"AggregateObjectsObj".", "path": null } ], for the following query:
{Aggregate{VERBA_Embedding_gemma2_9b(groupBy: ["doc_uuid"]){meta{count}
groupedBy { path value } }}}.
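
(Just a guess on my side: this GraphQL error usually means the VERBA_Embedding_gemma2_9b collection does not exist in Weaviate yet, e.g. because nothing has been imported with that embedder. The schema can be checked directly against Weaviate; adjust host/port to wherever your instance listens:)

curl http://localhost:8080/v1/schema   # lists the classes Weaviate actually knows about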

Additional info:

  • Both the host and the VM can ping each other.
  • After creating the virtualenv in Python there is a missing step in the docs: I had to activate it before running the pip command (as in the commands above).

Flagelmann commented Nov 5, 2024

I tried also with:

git clone https://github.com/weaviate/Verba

pip install -e .

No luck, always the same output (and it does seem to connect to the Ollama server I'm running on the host side).

It is always stuck in "Reconnecting", but it actually pulled the model I've installed on the host side correctly, and there are no Ollama-related errors in the terminal when starting the Verba application.

It seems it does not work at all.

[image attached]


whiskeytangofoxy commented Nov 27, 2024

Exact same issue here, but on Ubuntu 22.04. I have tried Docker and built-from-source deployments, as well as dockerized and local Ollama set-ups. Everything seems to work correctly (e.g., Verba sees and connects to the Ollama models), but the "Reconnecting" button never goes away. Below are some logs from a built-from-source start-up:

$ verba start

ℹ Couldn't connect to Groq (https://api.groq.com/openai/v1/)
INFO: Will watch for changes in these directories: ['/home/ubuntu/verbapip']
WARNING: "workers" flag is ignored when reloading is enabled.
INFO: Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)
INFO: Started reloader process [148869] using WatchFiles
ℹ Couldn't connect to Groq (https://api.groq.com/openai/v1/)
INFO: Started server process [148877]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: [redacted]:0 - "GET / HTTP/1.0" 200 OK
ℹ Cleaning Clients Cache
ℹ Cleaned up 0 clients
INFO: [redacted]:0 - "GET /api/health HTTP/1.0" 200 OK
ℹ Cleaning Clients Cache
ℹ Cleaned up 0 clients
INFO: [redacted]:0 - "GET /api/health HTTP/1.0" 200 OK
ℹ Cleaning Clients Cache
ℹ Cleaned up 0 clients
INFO: [redacted]:0 - "GET /api/health HTTP/1.0" 200 OK
✔ Connecting new Client
ℹ Connecting to Weaviate Cluster http://localhost:8080 with Auth
✘ Couldn't connect to Weaviate, check your URL/API KEY: Invalid port:
'8080:443'
✘ Failed to connect to Weaviate Couldn't connect to Weaviate, check
your URL/API KEY: Invalid port: '8080:443'
/home/ubuntu/verbapip/venv/lib/python3.10/site-packages/weaviate/warnings.py:303: ResourceWarning: Con004: The connection to Weaviate was not closed properly. This can lead to memory leaks.
Please make sure to close the connection using client.close().
warnings.warn(
INFO: [redacted]:0 - "POST /api/connect HTTP/1.0" 400 Bad Request
ℹ Cleaning Clients Cache
ℹ Cleaned up 0 clients
INFO: [redacted]:0 - "GET /api/health HTTP/1.0" 200 OK
✔ Connecting new Client
ℹ Connecting to Weaviate Cluster http://localhost:8000 with Auth
✘ Couldn't connect to Weaviate, check your URL/API KEY: Invalid port:
'8000:443'
✘ Failed to connect to Weaviate Couldn't connect to Weaviate, check
your URL/API KEY: Invalid port: '8000:443'
INFO: [redacted]:0 - "POST /api/connect HTTP/1.0" 400 Bad Request
ℹ Cleaning Clients Cache
ℹ Cleaned up 0 clients
INFO: [redacted]:0 - "GET /api/health HTTP/1.0" 200 OK

✔ Connecting new Client

ℹ Connecting to Weaviate Embedded

{"action":"startup","default_vectorizer_module":"none","level":"info","msg":"the default vectorizer modules is set to "none", as a result all new schema classes without an explicit vectorizer setting, will use this vectorizer","time":"2024-11-27T17:40:01Z"}
{"action":"startup","auto_schema_enabled":true,"level":"info","msg":"auto schema enabled setting is set to "true"","time":"2024-11-27T17:40:01Z"}
{"level":"info","msg":"No resource limits set, weaviate will use all available memory and CPU. To limit resources, set LIMIT_RESOURCES=true","time":"2024-11-27T17:40:01Z"}
{"level":"info","msg":"module offload-s3 is enabled","time":"2024-11-27T17:40:01Z"}
{"level":"warning","msg":"Multiple vector spaces are present, GraphQL Explore and REST API list objects endpoint module include params has been disabled as a result.","time":"2024-11-27T17:40:01Z"}
{"level":"info","msg":"open cluster service","servers":{"Embedded_at_8079":56321},"time":"2024-11-27T17:40:01Z"}
{"address":"10.0.0.252:56322","level":"info","msg":"starting cloud rpc server ...","time":"2024-11-27T17:40:01Z"}
{"level":"info","msg":"starting raft sub-system ...","time":"2024-11-27T17:40:01Z"}
{"address":"10.0.0.252:56321","level":"info","msg":"tcp transport","tcpMaxPool":3,"tcpTimeout":10000000000,"time":"2024-11-27T17:40:01Z"}
{"level":"info","msg":"loading local db","time":"2024-11-27T17:40:01Z"}
{"level":"info","msg":"local DB successfully loaded","time":"2024-11-27T17:40:01Z"}
{"level":"info","msg":"schema manager loaded","n":0,"time":"2024-11-27T17:40:01Z"}
{"level":"info","metadata_only_voters":false,"msg":"construct a new raft node","name":"Embedded_at_8079","time":"2024-11-27T17:40:01Z"}
{"action":"raft","index":1,"level":"info","msg":"raft initial configuration","servers":"[[{Suffrage:Voter ID:Embedded_at_8079 Address:10.0.0.252:46651}]]","time":"2024-11-27T17:40:01Z"}
{"last_snapshot_index":0,"last_store_applied_index":0,"last_store_log_applied_index":5,"level":"info","msg":"raft node constructed","raft_applied_index":0,"raft_last_index":5,"time":"2024-11-27T17:40:01Z"}
{"action":"raft","follower":{},"leader-address":"","leader-id":"","level":"info","msg":"raft entering follower state","time":"2024-11-27T17:40:01Z"}
{"action":"bootstrap","error":"could not join a cluster from [10.0.0.252:56321]","level":"warning","msg":"failed to join cluster, will notify next if voter","servers":["10.0.0.252:56321"],"time":"2024-11-27T17:40:02Z","voter":true}
{"action":"bootstrap","candidates":[{"Suffrage":0,"ID":"Embedded_at_8079","Address":"10.0.0.252:56321"}],"level":"info","msg":"starting cluster bootstrapping","time":"2024-11-27T17:40:02Z"}
{"action":"bootstrap","error":"bootstrap only works on new clusters","level":"error","msg":"could not bootstrapping cluster","time":"2024-11-27T17:40:02Z"}
{"action":"bootstrap","level":"info","msg":"notified peers this node is ready to join as voter","servers":["10.0.0.252:56321"],"time":"2024-11-27T17:40:02Z"}
{"action":"raft","last-leader-addr":"","last-leader-id":"","level":"warning","msg":"raft heartbeat timeout reached, starting election","time":"2024-11-27T17:40:02Z"}
{"action":"raft","level":"info","msg":"raft entering candidate state","node":{},"term":3,"time":"2024-11-27T17:40:02Z"}
{"action":"raft","level":"info","msg":"raft election won","tally":1,"term":3,"time":"2024-11-27T17:40:02Z"}
{"action":"raft","leader":{},"level":"info","msg":"raft entering leader state","time":"2024-11-27T17:40:02Z"}
{"level":"info","msg":"reload local db: update schema ...","time":"2024-11-27T17:40:02Z"}
{"index":"VERBA_CONFIG","level":"info","msg":"reload local index","time":"2024-11-27T17:40:02Z"}
{"docker_image_tag":"unknown","level":"info","msg":"configured versions","server_version":"1.26.1","time":"2024-11-27T17:40:03Z"}
{"action":"grpc_startup","level":"info","msg":"grpc server listening at [::]:50050","time":"2024-11-27T17:40:03Z"}
{"address":"10.0.0.252:56321","level":"info","msg":"current Leader","time":"2024-11-27T17:40:03Z"}
{"action":"restapi_management","docker_image_tag":"unknown","level":"info","msg":"Serving weaviate at http://127.0.0.1:8079","time":"2024-11-27T17:40:03Z"}
{"index":"VERBA_DOCUMENTS","level":"info","msg":"reload local index","time":"2024-11-27T17:40:03Z"}
{"action":"telemetry_push","level":"info","msg":"telemetry started","payload":"\u0026{MachineID:e3a87443-75d2-4a45-8c49-4366676a7ed4 Type:INIT Version:1.26.1 NumObjects:0 OS:linux Arch:arm64 UsedModules:[]}","time":"2024-11-27T17:40:03Z"}
{"index":"VERBA_Embedding_nomic_embed_text_latest","level":"info","msg":"reload local index","time":"2024-11-27T17:40:03Z"}
{"action":"hnsw_prefill_cache_async","level":"info","msg":"not waiting for vector cache prefill, running in background","time":"2024-11-27T17:40:04Z","wait_for_cache_prefill":false}
{"level":"info","msg":"Completed loading shard verba_config_42Mq7GU59WVB in 2.868712ms","time":"2024-11-27T17:40:04Z"}
{"action":"hnsw_vector_cache_prefill","count":3000,"index_id":"main","level":"info","limit":1000000000000,"msg":"prefilled vector cache","time":"2024-11-27T17:40:04Z","took":69960}

✔ Succesfully Connected to Weaviate
ℹ Connection time: 3.40 seconds
ℹ Using New RAG Configuration

INFO: [redacted]:0 - "POST /api/connect HTTP/1.0" 200 OK
ℹ Cleaning Clients Cache
ℹ Cleaned up 0 clients
INFO: [redacted]:0 - "GET /api/health HTTP/1.0" 200 OK
ℹ Cleaning Clients Cache
INFO: [redacted]:0 - "GET /ws/generate_stream HTTP/1.0" 404 Not Found
ℹ Cleaned up 0 clients
INFO: [redacted]:0 - "GET /api/health HTTP/1.0" 200 OK
ℹ Found existing Client
{"action":"hnsw_prefill_cache_async","level":"info","msg":"not waiting for vector cache prefill, running in background","time":"2024-11-27T17:40:04Z","wait_for_cache_prefill":false}
{"level":"info","msg":"Completed loading shard verba_documents_VqGEOnixuVWg in 1.029118ms","time":"2024-11-27T17:40:04Z"}
{"action":"hnsw_vector_cache_prefill","count":3000,"index_id":"main","level":"info","limit":1000000000000,"msg":"prefilled vector cache","time":"2024-11-27T17:40:04Z","took":76879}
ℹ Cleaning Clients Cache
{"action":"hnsw_prefill_cache_async","level":"info","msg":"not waiting for vector cache prefill, running in background","time":"2024-11-27T17:40:04Z","wait_for_cache_prefill":false}
{"level":"info","msg":"Completed loading shard verba_embedding_nomic_embed_text_latest_WOrKxNgiecbu in 894.277µs","time":"2024-11-27T17:40:04Z"}
{"action":"hnsw_vector_cache_prefill","count":3000,"index_id":"main","level":"info","limit":1000000000000,"msg":"prefilled vector cache","time":"2024-11-27T17:40:04Z","took":71800}
INFO: [redacted]:0 - "POST /api/get_datacount HTTP/1.0" 200 OK
ℹ Cleaned up 0 clients
INFO: [redacted]:0 - "GET /api/health HTTP/1.0" 200 OK
ℹ Found existing Client
INFO: [redacted]:0 - "POST /api/get_meta HTTP/1.0" 200 OK
ℹ Cleaning Clients Cache
ℹ Found existing Client
ℹ Cleaned up 0 clients
INFO: [redacted]:0 - "GET /api/health HTTP/1.0" 200 OK
INFO: [redacted]:0 - "POST /api/get_datacount HTTP/1.0" 200 OK
ℹ Found existing Client
ℹ Cleaning Clients Cache
ℹ Cleaned up 0 clients
INFO: [redacted]:0 - "GET /api/health HTTP/1.0" 200 OK
INFO: [redacted]:0 - "POST /api/get_labels HTTP/1.0" 200 OK
ℹ Found existing Client
INFO: [redacted]:0 - "POST /api/get_labels HTTP/1.0" 200 OK
{"action":"bootstrap","level":"info","msg":"node reporting ready, node has probably recovered cluster from raft config. Exiting bootstrap process","time":"2024-11-27T17:40:04Z"}
INFO: [redacted]:0 - "GET /ws/generate_stream HTTP/1.0" 404 Not Found
INFO: [redacted]:0 - "GET /ws/import_files HTTP/1.0" 404 Not Found

I imagine that the 404s on generate_stream and import_files are to blame, but I cannot track down why this error is occurring. This is a completely fresh install of Ubuntu 22.04. The only other running processes of note are Ollama and nginx (which is acting as a simple authenticated reverse proxy for public connections).
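
One way to narrow this down might be to attempt the websocket handshake directly against the Uvicorn port, bypassing nginx, and compare it with the same request made through the proxy (just a sketch; port and path are taken from the logs above):

curl -i -N \
  -H "Connection: Upgrade" \
  -H "Upgrade: websocket" \
  -H "Sec-WebSocket-Version: 13" \
  -H "Sec-WebSocket-Key: $(openssl rand -base64 16)" \
  http://localhost:8000/ws/generate_stream
# If the handshake succeeds (HTTP 101) directly but fails through nginx, the proxy is
# likely dropping the upgrade: websockets need HTTP/1.1 plus the Upgrade/Connection
# headers forwarded, and the HTTP/1.0 lines in the access log above suggest that
# might be what's happening here.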


whiskeytangofoxy commented Dec 2, 2024

After thorough testing of further deployments using different Weaviate set-ups, including local, Docker, custom (via a Docker endpoint), and WCS configurations, nothing has changed. Signs seem to point to this being either a websocket issue in server/api.py or a frontend issue in app/util.ts.
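
It might also be worth confirming that the installed goldenverba build actually registers those websocket routes (the file path here is an assumption based on the repo layout):

pip show goldenverba                           # which version/build is actually installed
grep -n "websocket" goldenverba/server/api.py  # in a source checkout, check that the /ws/* routes are defined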

thomashacker (Collaborator) commented

Hey, sorry for the late reply! Thanks for the issue, I'll look into this

thomashacker added the investigating label (Bugs that are still being investigated whether they are valid) on Dec 6, 2024
thomashacker (Collaborator) commented

@whiskeytangofoxy I think you're right: if the reconnecting button keeps appearing, the problem really might be with the websockets. If the error still persists and you have time, could you share the frontend developer console? There might be some further indications about the websocket status there.

I'll try to recreate the errors you all mentioned. Thanks for all the details 🚀

thomashacker (Collaborator) commented

Can you also let me know which exact Python version you are using?
