Kosmos-2.5 - Deployment Challenges on a Windows 11 + RTX 3090 PC #1596

abgulati · 2024-07-06T01:00:59Z

Hi,

I've spent the greater part of the last ten days trying to get the Kosmos-2.5 model working on my Windows 11 PC. Relevant specs below:

Intel Core i9 13900KF
Nvidia RTX 3090FE
32GB DDR5 5600MT/s (16x2)
Windows 11 - OS Build 22631.3737
Python 3.11.5 & PIP 24.1.1
CUDA 12.4
Flash-Attention-2 (v2.5.9.post1)

This proved ridiculously difficult despite following the elaborate (/s) steps mentioned in the Kosmos-2.5 repo, and I ran around in circles trying to fix it. Turns out this model is, at the moment, EXTREMELY sensitive to the software environment: Python v3.11 causes many, many issues, and one must stick to v3.10.x.

Devs, I REALLY wish you'd mentioned this in the Kosmos repo! Since PyTorch & FlashAttention2 have no issues with v3.11, I didn't think Kosmos would either given it's not mentioned anywhere!

Turns out, sticking to the default v3.10.12 of WSL-Ubuntu works, but figuring this out was quite the journey. Sharing it below as well as all the steps that worked in case it may help someone facing the same issues.

Amongst the many errors I faced were (DO NOT TRY ANY OF THE RESOLUTIONS IN THIS SECTION, THEY'RE SHARED FOR REFERENCE ONLY. THE SOLUTION IS IN THE SECTION THAT FOLLOWS THIS ONE):

Error: ImportError: cannot import name II from omegaconf

Resolutions tried:

Error: ValueError: mutable default <class 'fairseq.dataclass.configs.CommonConfig'> for field common is not allowed: use default_factory

Resolutions tried:

Third party Fairseq lib as per this thread
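For context on why this one shows up on 3.11 but not 3.10: as far as I understand it, Python 3.11 tightened the dataclass check so that any unhashable default is rejected (an ordinary dataclass instance is unhashable), where 3.10 only rejected list, dict and set defaults. The sketch below only illustrates that behaviour. `CommonConfig` here is a simplified stand-in with one made-up field and `TopConfig` is a hypothetical name, so this is not a patch for fairseq.

```python
from dataclasses import dataclass, field


@dataclass
class CommonConfig:
    # Simplified stand-in for fairseq's CommonConfig (assumption: one field).
    seed: int = 1


@dataclass
class TopConfig:
    # Writing `common: CommonConfig = CommonConfig()` here raises the
    # "mutable default ... use default_factory" ValueError on Python 3.11+.
    # default_factory builds a fresh CommonConfig for every TopConfig instead.
    common: CommonConfig = field(default_factory=CommonConfig)


a, b = TopConfig(), TopConfig()
a.common.seed = 42
print(a.common.seed, b.common.seed)
```

Each `TopConfig` gets its own `CommonConfig`, so changing one does not leak into the other, and the same definition is accepted on both 3.10 and 3.11.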

Error: omegaconf.errors.ConfigAttributeError: Missing key seed full_key: common.seed object_type=dict

Resolution tried:

Requested assistance from Claude 3.5 Sonnet, added a random 'seed': 42 to the init args of inference.py as per its advice

Error: ValueError: Default process group has not been initialized, please make sure to call init_process_group.

Resolution tried:

Requested assistance from Claude 3.5 Sonnet, made the following changes:

a. To inference.py, added import torch.distributed as dist to imports and

if not dist.is_initialized():
    dist.init_process_group(backend='gloo', init_method='env://', rank=0, world_size=1)
    torch.cuda.set_device(0)

to init() before use_cuda = True

b. To gpt.py, added the below to the build_model method of the GPTmodel class:

if hasattr(distributed_utils, 'get_data_parallel_rank'):
    args.ddp_rank = distributed_utils.get_data_parallel_rank()
else:
    args.ddp_rank = 0

c. Ran with environment variables:

$env:MASTER_ADDR = "localhost"
$env:MASTER_PORT = "12355"

...which then led to:

Error: RuntimeError: use_libuv was requested but PyTorch was build without libuv support

Resolution tried:

Ran with environment variable: $env:USE_LIBUV = "0"

Error: TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not NoneType

This then led to a host of modifications to the .py files which led to messes best forgotten. So anyways...

TURNS OUT THE ISSUE WAS THE PYTHON VERSION 3.11.x ALL ALONG! PLEASE STICK TO 3.10.x!
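A small guard like the one below, run before anything else, would have saved me the detour. It is only a sketch: treating 3.10 as the sole supported version is an assumption based on what worked in this issue, and `is_supported` is a made-up helper name, not part of Kosmos-2.5.

```python
import sys

# Assumption: 3.10.x is the only series reported working in this issue.
SUPPORTED = (3, 10)


def is_supported(version_info=sys.version_info):
    """Return True when major.minor matches the tested Python series."""
    return tuple(version_info[:2]) == SUPPORTED


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {found} matches the tested 3.10.x series.")
    else:
        print(f"Python {found} detected; this setup was only tested on 3.10.x.")
```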

SHARING MY WORKING WINDOWS 11 WSL SETUP BELOW:

Make sure the Nvidia GPU drivers & CUDA (I used v12.4) are installed on the host Windows 11 system

Via PowerShell:

Ensure you have WSL version 2 by running:

wsl -v
# or
wsl --status

Update if not

Install Ubuntu:

wsl --install -d Ubuntu-22.04

# after installation & setup completes:

wsl --set-default Ubuntu-22.04

Now open a WSL terminal by typing wsl in the Start Menu or a Command Prompt, or by searching for Ubuntu in the Start Menu

Install CUDA Toolkit v12.4.1:

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin

sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600

wget https://developer.download.nvidia.com/compute/cuda/12.4.1/local_installers/cuda-repo-wsl-ubuntu-12-4-local_12.4.1-1_amd64.deb

sudo dpkg -i cuda-repo-wsl-ubuntu-12-4-local_12.4.1-1_amd64.deb

sudo cp /var/cuda-repo-wsl-ubuntu-12-4-local/cuda-*-keyring.gpg /usr/share/keyrings/

sudo apt-get update

sudo apt-get -y install cuda-toolkit-12-4

Set NVCC PATH:

Confirm symlink for cuda:

ls -l /usr/local/cuda
ls -l /etc/alternatives/cuda

Update bashrc:

nano ~/.bashrc

# add this line to the end of bashrc:
export PATH=/usr/local/cuda/bin:$PATH

Reload bashrc:

source ~/.bashrc

Confirm CUDA installation:

nvcc -V
nvidia-smi
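If you'd rather script this check, here is a minimal sketch. It assumes nvcc is on PATH (as set up above) and that `nvcc -V` prints a line containing "release X.Y". `parse_nvcc_release` and `nvcc_release` are hypothetical helper names.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract (major, minor) from `nvcc -V` text, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def nvcc_release():
    """Run nvcc from PATH and parse its release; None when nvcc is missing."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "-V"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("nvcc release:", nvcc_release())
```

A result of `None` usually means the PATH line in bashrc has not been picked up by the current shell yet.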

Install flash-attention:

Install PyTorch:

sudo apt install python3-pip
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu124

Install dependencies:

pip install wheel==0.37.1
pip install ninja==1.11.1
pip install packaging==24.1
pip install numpy==1.22
pip install psutil==6.0.0

Clone the repo and cd into it:

git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention

Install from repo:

pip install . --no-build-isolation

Test flash-attention installation (example output: 2.5.9.post1):

python3
import flash_attn
print(flash_attn.__version__)
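The same check can be done without importing the package, which helps when the import itself is what's failing. This is a sketch using only the standard library. The distribution name `flash-attn` is taken from my pip freeze output, and `installed_version` is a made-up helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("flash-attn", "torch"):
        print(name, installed_version(name) or "not installed")
```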

Install Kosmos-2.5!

PIP Requirements:

pip install tiktoken
pip install tqdm
pip install "omegaconf<=2.1.0"
pip install boto3
pip install iopath
pip install "fairscale==0.4"
pip install "scipy==1.10"
pip install triton
pip install git+https://github.com/facebookresearch/xformers.git@04de99bb28aa6de8d48fab3cdbbc9e3874c994b8
pip install git+https://github.com/Dod-o/kosmos2.5_tools.git@fairseq
pip install git+https://github.com/Dod-o/kosmos2.5_tools.git@infinibatch
pip install git+https://github.com/Dod-o/kosmos2.5_tools.git@torchscale
pip install git+https://github.com/Dod-o/kosmos2.5_tools.git@transformers

Clone Repo and Checkpoint:

git clone https://github.com/microsoft/unilm.git

cd unilm/kosmos-2.5/

wget https://huggingface.co/microsoft/kosmos-2.5/resolve/main/ckpt.pt

Run OCR!

python3 inference.py --do_ocr --image assets/example/in.png --ckpt ckpt.pt

python3 inference.py --do_md --image assets/example/in.png --ckpt ckpt.pt
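To run several images in a row, the two commands can be wrapped in a few lines of Python. This is just a sketch, to be run from the unilm/kosmos-2.5/ directory. The flags are the ones used in the commands above with the checkpoint passed as `--ckpt`, and `build_command`, `run` and `TASK_FLAGS` are hypothetical names, not part of the repo.

```python
import subprocess
import sys

# Task name -> inference.py flag, as used in the commands above.
TASK_FLAGS = {"ocr": "--do_ocr", "md": "--do_md"}


def build_command(image, task="ocr", ckpt="ckpt.pt"):
    """Build the argv list for one inference.py call."""
    if task not in TASK_FLAGS:
        raise ValueError(f"unknown task: {task!r}")
    return [sys.executable, "inference.py", TASK_FLAGS[task],
            "--image", image, "--ckpt", ckpt]


def run(image, task="ocr", ckpt="ckpt.pt"):
    """Run inference.py for one image; raises if the process fails."""
    return subprocess.run(build_command(image, task, ckpt), check=True)


if __name__ == "__main__":
    print(" ".join(build_command("assets/example/in.png", "md")))
```

Using `sys.executable` keeps the call on the same interpreter (and so the same 3.10 environment) that launched the wrapper.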

(Optional) GUI for WSL - Very Helpful

sudo apt update
sudo apt upgrade
sudo apt install lxde
DISPLAY=:0 startlxde


abgulati · 2024-07-06T01:36:43Z

After the above, I'm able to run Kosmos-2.5 on a single RTX3090 Windows 11 PC via WSL-Ubuntu. I created (and am closing) this issue in case anyone is facing the same errors and thus may benefit from my experience above.

If you're facing any Python package issues, Python v3.10.12 worked for me and my entire requirements.txt looks as below. Feel free to reach out if you think I can help in any way:

antlr4-python3-runtime==4.8
attrs==21.2.0
Automat==20.2.0
bcrypt==3.2.0
blinker==1.4
boto==2.49.0
boto3==1.34.140
botocore==1.34.140
Brotli==1.0.9
certifi==2020.6.20
cffi==1.16.0
chardet==4.0.0
charset-normalizer==3.3.2
click==8.0.3
colorama==0.4.4
command-not-found==0.3
constantly==15.1.0
cryptography==3.4.8
cupshelpers==1.0
Cython==3.0.10
dbus-python==1.2.18
defer==1.0.6
deluge==2.0.3
distro==1.7.0
distro-info==1.1+ubuntu0.2
einops==0.8.0
fairscale==0.4.0
fairseq @ git+https://github.com/Dod-o/kosmos2.5_tools.git@ec5dc5a2523a58bd8172519a4f5820c637bb22fd
filelock==3.13.1
flash-attn @ file:///home/abheekg/Downloads/flash-attention
fsspec==2024.6.1
GeoIP==1.3.2
httplib2==0.20.2
huggingface-hub==0.23.4
hydra-core==1.0.7
hyperlink==21.0.0
idna==3.3
importlib-metadata==4.6.4
incremental==21.3.0
infinibatch @ git+https://github.com/Dod-o/kosmos2.5_tools.git@aea555a721d701ec3b81c41c3d59b54dc28ced23
iopath==0.1.10
jeepney==0.7.1
Jinja2==3.1.4
jmespath==1.0.1
keyring==23.5.0
language-selector==0.1
launchpadlib==1.10.16
lazr.restfulclient==0.14.4
lazr.uri==1.0.6
libtorrent===2.0.5-build-libtorrent-rasterbar-qrM5vM-libtorrent-rasterbar-2.0.5-bindings-python
lxml==5.2.2
macaroonbakery==1.3.1
Mako==1.1.3
MarkupSafe==2.0.1
more-itertools==8.10.0
mpmath==1.3.0
mutagen==1.45.1
netifaces==0.11.0
networkx==3.3
ninja==1.11.1
numpy==1.22.0
nvidia-cublas-cu12==12.4.2.65
nvidia-cuda-cupti-cu12==12.4.99
nvidia-cuda-nvrtc-cu12==12.4.99
nvidia-cuda-runtime-cu12==12.4.99
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.0.44
nvidia-curand-cu12==10.3.5.119
nvidia-cusolver-cu12==11.6.0.99
nvidia-cusparse-cu12==12.3.0.142
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.99
nvidia-nvtx-cu12==12.4.99
oauthlib==3.2.0
olefile==0.46
omegaconf==2.0.6
packaging==24.1
Pillow==9.0.1
portalocker==2.10.0
protobuf==3.12.4
psutil==6.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.1
pycairo==1.20.1
pycparser==2.22
pycryptodomex==3.11.0
pycups==2.0.1
pygame==2.1.2
PyGObject==3.42.1
PyHamcrest==2.0.2
PyJWT==2.3.0
pymacaroons==0.13.0
PyNaCl==1.5.0
pyOpenSSL==21.0.0
pyparsing==2.4.7
pyRFC3339==1.1
python-apt==2.4.0+ubuntu3
python-dateutil==2.9.0.post0
pytorch-triton==3.0.0+dedb7bdf33
pytz==2022.1
pyxattr==0.7.2
pyxdg==0.27
PyYAML==5.4.1
regex==2024.5.15
rencode==1.0.6
requests==2.32.3
s3transfer==0.10.2
sacrebleu==2.4.2
safetensors==0.4.3
scipy==1.10.0
SecretStorage==3.3.1
service-identity==18.1.0
setproctitle==1.2.2
six==1.16.0
sympy==1.12.1
systemd-python==234
tabulate==0.9.0
tiktoken==0.7.0
timm==0.4.12
tokenizers==0.13.3
torch==2.5.0.dev20240705+cu124
torchaudio==2.4.0.dev20240705+cu124
torchscale @ git+https://github.com/Dod-o/kosmos2.5_tools.git@a0275c11a1de798e342c9dde34a3752fd2984719
torchvision==0.20.0.dev20240705+cu124
tqdm==4.66.4
transformers @ git+https://github.com/Dod-o/kosmos2.5_tools.git@b7b7e7cab86931c46890376b5fd02d53ea36da67
triton==2.3.1
Twisted==22.1.0
typing_extensions==4.12.2
ubuntu-pro-client==8001
ufw==0.36.1
unattended-upgrades==0.1
urllib3==1.26.5
wadllib==1.3.6
websockets==9.1
xdg==5
xformers @ git+https://github.com/facebookresearch/xformers.git@04de99bb28aa6de8d48fab3cdbbc9e3874c994b8
yt-dlp==2022.4.8
zipp==1.0.0
zope.interface==5.4.0
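Note that the list above is a full pip freeze, so it also contains Ubuntu system packages that have nothing to do with Kosmos-2.5. A rough sketch for pulling out just the pins you care about follows. `parse_freeze` is a made-up helper, the `WANTED` subset is my own guess at the relevant packages, and the sample lines are copied from the list above.

```python
def parse_freeze(lines):
    """Map package name -> version or source URL from `pip freeze` lines."""
    pins = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Check " @ " and "===" before "==", since "===" contains "==".
        for sep in (" @ ", "===", "=="):
            if sep in line:
                name, _, source = line.partition(sep)
                pins[name.strip().lower()] = source.strip()
                break
    return pins


# Assumption: a hand-picked subset; adjust to taste.
WANTED = ("torch", "omegaconf", "fairseq")

sample = [
    "omegaconf==2.0.6",
    "torch==2.5.0.dev20240705+cu124",
    "fairseq @ git+https://github.com/Dod-o/kosmos2.5_tools.git@ec5dc5a2523a58bd8172519a4f5820c637bb22fd",
    "yt-dlp==2022.4.8",
]

pins = parse_freeze(sample)
for name in WANTED:
    print(name, "->", pins.get(name, "not pinned"))
```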

abgulati closed this as completed Jul 6, 2024

abgulati mentioned this issue Jul 16, 2024

Kosmos-2.5 - Python version 3.10.x (and any other confirmed working versions) should be mentioned as a requirement to deploy & infer the Kosmos-2.5 model #1603

Open
