Unable to run privateGPT without internet #1458

Closed
ninjanimus opened this issue Dec 26, 2023 · 5 comments

Comments

@ninjanimus

ninjanimus commented Dec 26, 2023

My objective is to set up PrivateGPT with internet access and then cut off the internet so I can use it locally, avoiding any potential data leakage.

I'm trying to install it on an Ubuntu VM with no GPU. Everything works fine while the internet is enabled, but once the internet is disabled and the VM is restarted, the PrivateGPT server fails to start.

Below are the error details:

$ poetry run python3.11 -m private_gpt
Traceback (most recent call last):
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/socket.py", line 961, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 715, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 404, in _make_request
    self._validate_conn(conn)
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1058, in _validate_conn
    conn.connect()
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 363, in connect
    self.sock = conn = self._new_conn()
                       ^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f83a8887350>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 799, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /gpt-2/encodings/main/vocab.bpe (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f83a8887350>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/tony/installs/privateGPT/private_gpt/__main__.py", line 5, in <module>
    from private_gpt.main import app
  File "/home/tony/installs/privateGPT/private_gpt/main.py", line 3, in <module>
    import llama_index
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/__init__.py", line 21, in <module>
    from llama_index.indices import (
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/indices/__init__.py", line 4, in <module>
    from llama_index.indices.composability.graph import ComposableGraph
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/indices/composability/__init__.py", line 4, in <module>
    from llama_index.indices.composability.graph import ComposableGraph
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/indices/composability/graph.py", line 7, in <module>
    from llama_index.indices.base import BaseIndex
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/indices/base.py", line 6, in <module>
    from llama_index.chat_engine.types import BaseChatEngine, ChatMode
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/chat_engine/__init__.py", line 1, in <module>
    from llama_index.chat_engine.condense_question import CondenseQuestionChatEngine
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/chat_engine/condense_question.py", line 6, in <module>
    from llama_index.chat_engine.types import (
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/chat_engine/types.py", line 11, in <module>
    from llama_index.memory import BaseMemory
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/memory/__init__.py", line 1, in <module>
    from llama_index.memory.chat_memory_buffer import ChatMemoryBuffer
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/memory/chat_memory_buffer.py", line 12, in <module>
    class ChatMemoryBuffer(BaseMemory):
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/memory/chat_memory_buffer.py", line 18, in ChatMemoryBuffer
    default_factory=cast(Callable[[], Any], GlobalsHelper().tokenizer),
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/llama_index/utils.py", line 55, in tokenizer
    enc = tiktoken.get_encoding("gpt2")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken/registry.py", line 73, in get_encoding
    enc = Encoding(**constructor())
                     ^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken_ext/openai_public.py", line 11, in gpt2
    mergeable_ranks = data_gym_to_mergeable_bpe_ranks(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken/load.py", line 82, in data_gym_to_mergeable_bpe_ranks
    vocab_bpe_contents = read_file_cached(vocab_bpe_file).decode()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken/load.py", line 50, in read_file_cached
    contents = read_file(blobpath)
               ^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/tiktoken/load.py", line 24, in read_file
    resp = requests.get(blobpath)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tony/installs/privateGPT/.venv/lib/python3.11/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /gpt-2/encodings/main/vocab.bpe (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f83a8887350>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

I tried disabling /tmp file deletion on restart, since I saw some files being downloaded on the first run after a reboot, but that does not bypass the requirement to connect to openaipublic.blob.core.windows.net. Thanks!

@onButtonUp

Same on macOS: unable to run privateGPT without internet.

@ParetoOptimalDev

This happens because the vocab files are downloaded at startup. You can bypass it by pre-downloading them (along with a few other files) as described in:

openai/whisper#1399 (comment)
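The pre-download approach above can be sketched as follows. This is a minimal sketch, assuming tiktoken's `read_file_cached` names each cached file after the SHA-1 hex digest of the blob URL it was fetched from (true of tiktoken's `load.py` around the time of this issue; verify against your installed version before relying on it):

```python
import hashlib
import os

# The two files tiktoken's "gpt2" encoding fetches, per the traceback above.
VOCAB_URLS = [
    "https://openaipublic.blob.core.windows.net/gpt-2/encodings/main/vocab.bpe",
    "https://openaipublic.blob.core.windows.net/gpt-2/encodings/main/encoder.json",
]

def cache_filename(blobpath: str) -> str:
    # tiktoken keys its cache on the SHA-1 of the blob URL
    # (assumption based on tiktoken's load.py; check your version).
    return hashlib.sha1(blobpath.encode()).hexdigest()

# While still online, download each URL to this path; afterwards tiktoken
# will read the files from disk instead of hitting the network.
cache_dir = os.environ.get("TIKTOKEN_CACHE_DIR", "tiktoken_cache")
for url in VOCAB_URLS:
    print(f"{url} -> {os.path.join(cache_dir, cache_filename(url))}")
```

Running this prints the exact cache paths to populate; you can then fetch each URL into its path with curl or wget before going offline.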

@ParetoOptimalDev

My objective is to setup PrivateGPT with internet and then cutoff the internet for using it locally to avoid any potential data leakage.

Do you plan to use something like firejail or systemd jails for this? I had something similar in mind!

@yadav-arun
Contributor

yadav-arun commented Dec 28, 2023

@ninjanimus I too faced the same issue. Here is the reason and fix :

Reason:
PrivateGPT uses llama_index, which uses OpenAI's tiktoken; tiktoken's plugin downloads the vocab and encoder.json files from the internet every time you restart.

Fix:
You need to put the vocab and encoder files in a cache. To point tiktoken at a cache folder inside the project, add the following to <ROOT>/private_gpt/__init__.py

os.environ["TIKTOKEN_CACHE_DIR"] = "tiktoken_cache"

and create a folder with the same name, tiktoken_cache.
Running once with internet access will then load the files into your cache folder, and they will be read from that path every time afterward.
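The fix above can be sketched as a short snippet for <ROOT>/private_gpt/__init__.py. This is a sketch, not the project's actual file; it assumes the environment variable must be set before tiktoken is first imported (directly or via llama_index), and it creates the folder so the first online run can populate it:

```python
import os
from pathlib import Path

# Point tiktoken at a project-local cache so vocab.bpe and encoder.json
# are read from disk instead of downloaded. Must run before tiktoken is
# imported, since the cache dir is read at encoding-load time.
os.environ["TIKTOKEN_CACHE_DIR"] = "tiktoken_cache"

# Create the cache folder so the first (online) run can populate it.
Path("tiktoken_cache").mkdir(exist_ok=True)
```

After one run with internet access, the cached files let subsequent offline runs start without contacting openaipublic.blob.core.windows.net.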

@ninjanimus
Author

ninjanimus commented Dec 28, 2023

Thanks @ParetoOptimalDev and @yadav-arun for your answers!

I have tried @yadav-arun's suggestion and it worked flawlessly on Ubuntu.

My objective is to setup PrivateGPT with internet and then cutoff the internet for using it locally to avoid any potential data leakage.

Do you plan to use something like firejail or systemd jails for this? I had something similar in mind!

Currently, my requirement is to run the LLM in a separate VM and disconnect the guest VM's network adapter once installation is complete, to avoid data leakage from accidental executions or misconfigurations.
