Unable to run privateGPT without internet #1458

Closed
ninjanimus opened this issue Dec 26, 2023 · 5 comments

Comments

@ninjanimus

ninjanimus commented Dec 26, 2023

My objective is to set up PrivateGPT with internet access and then cut off the internet so I can use it locally, avoiding any potential data leakage.

I'm trying to install it on an Ubuntu VM with no GPU. Everything works fine while the internet is enabled, but once the internet is disabled and the VM is restarted, the PrivateGPT server fails to start.

Below are the error details:

$ poetry run python3.11 -m private_gpt
Traceback (most recent call last):
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/socket.py", line 961, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 404, in _make_request
    self._validate_conn(conn)
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1058, in _validate_conn
    conn.connect()
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 363, in connect
    self.sock = conn = self._new_conn()
                       ^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f83a8887350>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /gpt-2/encodings/main/vocab.bpe (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f83a8887350>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/tony/installs/privateGPT/private_gpt/__main__.py", line 5, in <module>
    from private_gpt.main import app
  File "/home/tony/installs/privateGPT/private_gpt/main.py", line 3, in <module>
    import llama_index
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/__init__.py", line 21, in <module>
    from llama_index.indices import (
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/indices/__init__.py", line 4, in <module>
    from llama_index.indices.composability.graph import ComposableGraph
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/indices/composability/__init__.py", line 4, in <module>
    from llama_index.indices.composability.graph import ComposableGraph
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/indices/composability/graph.py", line 7, in <module>
    from llama_index.indices.base import BaseIndex
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/indices/base.py", line 6, in <module>
    from llama_index.chat_engine.types import BaseChatEngine, ChatMode
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/chat_engine/__init__.py", line 1, in <module>
    from llama_index.chat_engine.condense_question import CondenseQuestionChatEngine
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/chat_engine/condense_question.py", line 6, in <module>
    from llama_index.chat_engine.types import (
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/chat_engine/types.py", line 11, in <module>
    from llama_index.memory import BaseMemory
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/memory/__init__.py", line 1, in <module>
    from llama_index.memory.chat_memory_buffer import ChatMemoryBuffer
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/memory/chat_memory_buffer.py", line 12, in <module>
    class ChatMemoryBuffer(BaseMemory):
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/memory/chat_memory_buffer.py", line 18, in ChatMemoryBuffer
    default_factory=cast(Callable[[], Any], GlobalsHelper().tokenizer),
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/utils.py", line 55, in tokenizer
    enc = tiktoken.get_encoding("gpt2")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken/registry.py", line 73, in get_encoding
    enc = Encoding(**constructor())
                     ^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken_ext/openai_public.py", line 11, in gpt2
    mergeable_ranks = data_gym_to_mergeable_bpe_ranks(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken/load.py", line 82, in data_gym_to_mergeable_bpe_ranks
    vocab_bpe_contents = read_file_cached(vocab_bpe_file).decode()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken/load.py", line 50, in read_file_cached
    contents = read_file(blobpath)
               ^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken/load.py", line 24, in read_file
    resp = requests.get(blobpath)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /gpt-2/encodings/main/vocab.bpe (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f83a8887350>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

I tried disabling /tmp file deletion on restart, since I saw some files being downloaded on the first run after a reboot, but that does not bypass the requirement to connect to openaipublic.blob.core.windows.net. Thanks!

@onButtonUp

Same on macOS: unable to run privateGPT without internet.

@ParetoOptimalDev

This happens because the vocab files are downloaded at startup. You can bypass it by pre-downloading them (along with a few other files) as described in:

openai/whisper#1399 (comment)
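The pre-download approach above can be sketched as follows. This is a minimal sketch, assuming tiktoken's `read_file_cached` names each cached file after the SHA-1 hex digest of the blob URL it was fetched from (true of tiktoken's `load.py` around the time of this issue; verify against your installed version before relying on it):

```python
import hashlib
import os

# The two files tiktoken's "gpt2" encoding fetches, per the traceback above.
VOCAB_URLS = [
    "https://openaipublic.blob.core.windows.net/gpt-2/encodings/main/vocab.bpe",
    "https://openaipublic.blob.core.windows.net/gpt-2/encodings/main/encoder.json",
]

def cache_filename(blobpath: str) -> str:
    # tiktoken keys its cache on the SHA-1 of the blob URL
    # (assumption based on tiktoken's load.py; check your version).
    return hashlib.sha1(blobpath.encode()).hexdigest()

# While still online, download each URL to this path; afterwards tiktoken
# will read the files from disk instead of hitting the network.
cache_dir = os.environ.get("TIKTOKEN_CACHE_DIR", "tiktoken_cache")
for url in VOCAB_URLS:
    print(f"{url} -> {os.path.join(cache_dir, cache_filename(url))}")
```

Running this prints the exact cache paths to populate; you can then fetch each URL into its path with curl or wget before going offline.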

@ParetoOptimalDev

My objective is to setup PrivateGPT with internet and then cutoff the internet for using it locally to avoid any potential data leakage.

Do you plan to use something like firejail or systemd jails for this? I had something similar in mind!

@yadav-arun
Contributor

yadav-arun commented Dec 28, 2023

@ninjanimus I too faced the same issue. Here is the reason and fix :

Reason:
PrivateGPT uses llama_index, which uses OpenAI's tiktoken; tiktoken's plugin downloads the vocab and encoder.json files from the internet every time you restart.

Fix:
You need to put the vocab and encoder files in a cache. To point tiktoken at a cache folder inside the project, add the following to <ROOT>/private_gpt/__init__.py

os.environ["TIKTOKEN_CACHE_DIR"] = "tiktoken_cache"

and create a folder with the same name, tiktoken_cache.
Running once with internet access will then load the files into your cache folder, and they will be read from that path every time afterward.
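The fix above can be sketched as a short snippet for <ROOT>/private_gpt/__init__.py. This is a sketch, not the project's actual file; it assumes the environment variable must be set before tiktoken is first imported (directly or via llama_index), and it creates the folder so the first online run can populate it:

```python
import os
from pathlib import Path

# Point tiktoken at a project-local cache so vocab.bpe and encoder.json
# are read from disk instead of downloaded. Must run before tiktoken is
# imported, since the cache dir is read at encoding-load time.
os.environ["TIKTOKEN_CACHE_DIR"] = "tiktoken_cache"

# Create the cache folder so the first (online) run can populate it.
Path("tiktoken_cache").mkdir(exist_ok=True)
```

After one run with internet access, the cached files let subsequent offline runs start without contacting openaipublic.blob.core.windows.net.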

@ninjanimus
Author

ninjanimus commented Dec 28, 2023

Thanks @ParetoOptimalDev and @yadav-arun for your answers!

I have tried @yadav-arun's suggestion and it worked flawlessly on Ubuntu.

My objective is to setup PrivateGPT with internet and then cutoff the internet for using it locally to avoid any potential data leakage.

Do you plan to use something like firejail or systemd jails for this? I had something similar in mind!

Currently, my requirement is to run the LLM in a separate VM and disconnect the guest VM's network adapter once installation is complete, to avoid data leakage from accidental executions or misconfigurations.
