Update Mac One click installer Mar 08, 2024 #1456

Merged
merged 3 commits on Mar 6, 2024
6 changes: 5 additions & 1 deletion README.md
@@ -147,8 +147,12 @@ Note that for all platforms, some packages such as DocTR, Unstructured, BLIP, St
---

#### macOS (CPU/M1/M2) with full document Q/A capability
* One-click Installers (Experimental and subject to changes)
* One-click Installers (Experimental and subject to change; we haven't tested every feature with these installers, so we encourage the community to try them and report any issues)

Mar 07, 2024
- [h2ogpt-osx-m1-cpu](https://h2o-release.s3.amazonaws.com/h2ogpt/Mar2024/h2ogpt-osx-m1-cpu)
- [h2ogpt-osx-m1-gpu](https://h2o-release.s3.amazonaws.com/h2ogpt/Mar2024/h2ogpt-osx-m1-gpu)

Nov 08, 2023
- [h2ogpt-osx-m1-cpu](https://h2o-release.s3.amazonaws.com/h2ogpt/Nov2023/h2ogpt-osx-m1-cpu)
- [h2ogpt-osx-m1-gpu](https://h2o-release.s3.amazonaws.com/h2ogpt/Nov2023/h2ogpt-osx-m1-gpu)
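
A minimal usage sketch for a downloaded installer, assuming it is a plain single-file executable and that macOS Gatekeeper may quarantine the unsigned download (these steps are not part of the README change itself):

```shell
# Make the downloaded binary executable and launch it
chmod +x ~/Downloads/h2ogpt-osx-m1-cpu
# If Gatekeeper blocks the unsigned binary, clearing the quarantine attribute may help
xattr -d com.apple.quarantine ~/Downloads/h2ogpt-osx-m1-cpu
~/Downloads/h2ogpt-osx-m1-cpu
```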
25 changes: 22 additions & 3 deletions dev_installers/mac/README.md
@@ -17,12 +17,31 @@ This document provides the details to build one-click installers for macOS. To ma

- Clone `h2ogpt` from https://github.com/h2oai/h2ogpt.git
- Create a conda environment and install all required dependencies; consult [build_mac_installer.sh](build_mac_installer.sh) for more details.
- Run below commands to build the installer
- Run the commands below to build the spec file for the installer; replace the `--name` value depending on whether you are building for CPU only or with MPS (GPU) support (a GPU-name variant is sketched after the command)
```shell
cd h2ogpt
pyinstaller ./dev_installers/mac/mac_run_app.py -D -w --name=h2ogpt-osx-m1-cpu-debug --hiddenimport=h2ogpt --collect-all=h2ogpt --noconfirm --recursive-copy-metadata=transformers --collect-data=langchain --collect-data=gradio_client --collect-all=gradio --path=${CONDA_PREFIX}/python3.10/site-packages --collect-all=sentencepiece --add-data=./Tesseract-OCR:Tesseract-OCR --add-data=./poppler:poppler
pyi-makespec mac_run_app.py -F --name=h2ogpt-osx-m1-cpu \
--hidden-import=h2ogpt \
--collect-all=h2ogpt \
--recursive-copy-metadata=transformers \
--collect-data=langchain \
--collect-data=gradio_client \
--collect-all=gradio \
--collect-all=sentencepiece \
--collect-all=gradio_pdf \
--collect-all=llama_cpp \
--collect-all=tiktoken_ext \
--add-data=../../Tesseract-OCR:Tesseract-OCR \
--add-data=../../poppler:poppler
```
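For the MPS (GPU) build, presumably only the bundle name changes; a sketch of the equivalent command (not taken verbatim from this PR):
```shell
# Same flags as the CPU command above, with the GPU bundle name
pyi-makespec mac_run_app.py -F --name=h2ogpt-osx-m1-gpu \
    --hidden-import=h2ogpt \
    --collect-all=h2ogpt \
    --recursive-copy-metadata=transformers \
    --collect-data=langchain \
    --collect-data=gradio_client \
    --collect-all=gradio \
    --collect-all=sentencepiece \
    --collect-all=gradio_pdf \
    --collect-all=llama_cpp \
    --collect-all=tiktoken_ext \
    --add-data=../../Tesseract-OCR:Tesseract-OCR \
    --add-data=../../poppler:poppler
```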

- Edit `h2ogpt-osx-m1-cpu.spec` and/or `h2ogpt-osx-m1-gpu.spec` and add the code block below to the `Analysis()` call, to explicitly tell PyInstaller to collect all `.py` modules from the listed dependencies.
```python
module_collection_mode={
'gradio' : 'py',
'gradio_pdf' : 'py',
},
```
- Run `pyinstaller h2ogpt-osx-m1-cpu.spec` to build the installer.
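
After a successful build, the one-file app should land in PyInstaller's default `dist/` directory; a quick smoke test might look like this (a sketch, assuming the CPU spec):

```shell
# Build from the spec and launch the resulting one-file app
pyinstaller h2ogpt-osx-m1-cpu.spec --noconfirm
./dist/h2ogpt-osx-m1-cpu
```
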
### Deployment Mode

- Clone `h2ogpt` from https://github.com/h2oai/h2ogpt.git
69 changes: 47 additions & 22 deletions dev_installers/mac/build_mac_installer.sh
@@ -6,57 +6,82 @@ then
echo "conda could not be found, need conda to continue!"
exit 1
fi

# Remove old Tesseract and poppler deps
rm -rf ./Tesseract-OCR poppler

conda env remove -n h2ogpt-mac
conda create -n h2ogpt-mac python=3.10 rust -y
conda activate h2ogpt-mac

pip install --upgrade pip
python -m pip install --upgrade setuptools

# Install required dependencies into conda environment
pip install -r requirements.txt --extra-index https://download.pytorch.org/whl/cpu
pip install -r requirements.txt --extra-index https://download.pytorch.org/whl/cpu -c reqs_optional/reqs_constraints.txt
# Required for Doc Q/A: LangChain:
pip install -r reqs_optional/requirements_optional_langchain.txt
pip install -r reqs_optional/requirements_optional_langchain.txt -c reqs_optional/reqs_constraints.txt
# Optional: PyMuPDF/ArXiv:
pip install -r reqs_optional/requirements_optional_langchain.gpllike.txt -c reqs_optional/reqs_constraints.txt
# Optional: Selenium/PlayWright:
pip install -r reqs_optional/requirements_optional_langchain.urls.txt -c reqs_optional/reqs_constraints.txt
# Optional: DocTR OCR:
conda install weasyprint pygobject -c conda-forge -y
pip install -r reqs_optional/requirements_optional_doctr.txt -c reqs_optional/reqs_constraints.txt
# Optional: for supporting unstructured package
python -m nltk.downloader all

# Required for CPU: LLaMa/GPT4All:
# For MPS support
if [ -z "$BUILD_MPS" ]
then
echo "BUILD_MPS is not set, skipping MPS specific configs..."
pip uninstall llama-cpp-python -y
CMAKE_ARGS="-DLLAMA_METAL=off" FORCE_CMAKE=1 pip install -r reqs_optional/requirements_optional_llamacpp_gpt4all.txt -c reqs_optional/reqs_constraints.txt --no-cache-dir
else
if [ "$BUILD_MPS" = "1" ]
then
echo "BUILD_MPS is set to 1, running MPS specific configs..."
export CMAKE_ARGS=-DLLAMA_METAL=on # remove if CPU MAC
export FORCE_CMAKE=1
pip uninstall llama-cpp-python -y
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -r reqs_optional/requirements_optional_llamacpp_gpt4all.txt -c reqs_optional/reqs_constraints.txt --no-cache-dir
fi
fi
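# Example invocation (sketch): to build the MPS/GPU variant, run this script with BUILD_MPS=1,
# e.g. `BUILD_MPS=1 bash ./dev_installers/mac/build_mac_installer.sh`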
pip install -r reqs_optional/requirements_optional_llamacpp_gpt4all.txt --no-cache-dir

# Optional: PyMuPDF/ArXiv:
pip install -r reqs_optional/requirements_optional_langchain.gpllike.txt
# Optional: Selenium/PlayWright:
pip install -r reqs_optional/requirements_optional_langchain.urls.txt
# Optional: for supporting unstructured package
python -m nltk.downloader all
# Additional Requirements
pip install https://h2o-release.s3.amazonaws.com/h2ogpt/chromamigdb-0.3.25-py3-none-any.whl
pip install https://h2o-release.s3.amazonaws.com/h2ogpt/hnswmiglib-0.7.0.tgz
pip install librosa -c reqs_optional/reqs_constraints.txt

# Install PyInstaller
pip install PyInstaller

# Install and copy tesseract & poppler
#brew install tesseract@5.3.3
#brew install poppler@23.10.0
cp -R /opt/homebrew/Cellar/poppler/23.10.0/ ./poppler
cp -R /opt/homebrew/Cellar/tesseract/5.3.3/ ./Tesseract-OCR

#brew install poppler
#brew install tesseract
cp -R /opt/homebrew/Cellar/poppler/24.02.0/ ./poppler
cp -R /opt/homebrew/Cellar/tesseract/5.3.4_1/ ./Tesseract-OCR
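# If your Homebrew keg versions differ, the paths could be resolved dynamically instead
# (sketch; assumes Homebrew's default /opt/homebrew prefix on Apple Silicon):
#cp -R "$(brew --prefix poppler)/" ./poppler
#cp -R "$(brew --prefix tesseract)/" ./Tesseract-OCR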

# Build and install h2ogpt
make clean dist
pip install ./dist/h2ogpt*.whl

# Build Mac Installer
# below command is used to build current .spec file replace it whenever use new configs
# pyinstaller mac_run_app.py -F --name=h2ogpt-osx-m1-cpu --hiddenimport=h2ogpt --collect-all=h2ogpt --noconfirm --recursive-copy-metadata=transformers --collect-data=langchain --collect-data=gradio_client --collect-all=gradio --collect-all=sentencepiece --add-data=./Tesseract-OCR:Tesseract-OCR --add-data=./poppler:poppler
# the command below was used to build the current .spec file from the project root; re-run it whenever the configs change
#pyi-makespec mac_run_app.py -F --name=h2ogpt-osx-m1-cpu \
# --hidden-import=h2ogpt \
# --collect-all=h2ogpt \
# --recursive-copy-metadata=transformers \
# --collect-data=langchain \
# --collect-data=gradio_client \
# --collect-all=gradio \
# --collect-all=sentencepiece \
# --collect-all=gradio_pdf \
# --collect-all=llama_cpp \
# --collect-all=tiktoken_ext \
# --add-data=../../Tesseract-OCR:Tesseract-OCR \
# --add-data=../../poppler:poppler

# add the argument below to the Analysis() call in the h2ogpt-osx-m1-cpu.spec file
#module_collection_mode={
# 'gradio' : 'py',
# 'gradio_pdf' : 'py',
#}
if [ "$BUILD_MPS" = "1" ]
then
echo "BUILD_MPS is set to 1, building one click installer for MPS..."
12 changes: 11 additions & 1 deletion dev_installers/mac/h2ogpt-osx-m1-cpu.spec
@@ -15,10 +15,16 @@ tmp_ret = collect_all('gradio')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]
tmp_ret = collect_all('sentencepiece')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]
tmp_ret = collect_all('gradio_pdf')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]
tmp_ret = collect_all('llama_cpp')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]
tmp_ret = collect_all('tiktoken_ext')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]


a = Analysis(
['./mac_run_app.py'],
['mac_run_app.py'],
pathex=[],
binaries=binaries,
datas=datas,
@@ -28,6 +34,10 @@ a = Analysis(
runtime_hooks=[],
excludes=[],
noarchive=False,
module_collection_mode={
'gradio' : 'py',
'gradio_pdf' : 'py',
},
)
pyz = PYZ(a.pure)
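
For reference, the `collect_all` helper used throughout this spec presumably comes from PyInstaller's hook utilities (its import sits above the hunk shown here); each call returns a `(datas, binaries, hiddenimports)` tuple that is merged into the lists passed to `Analysis()`. A sketch of that pattern:

```python
# Assumed preamble of the spec (not shown in this hunk)
from PyInstaller.utils.hooks import collect_all

datas, binaries, hiddenimports = [], [], []
for pkg in ('h2ogpt', 'gradio', 'sentencepiece', 'gradio_pdf', 'llama_cpp', 'tiktoken_ext'):
    d, b, h = collect_all(pkg)  # (datas, binaries, hiddenimports) for the package
    datas += d; binaries += b; hiddenimports += h
```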

12 changes: 11 additions & 1 deletion dev_installers/mac/h2ogpt-osx-m1-gpu.spec
@@ -15,10 +15,16 @@ tmp_ret = collect_all('gradio')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]
tmp_ret = collect_all('sentencepiece')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]
tmp_ret = collect_all('gradio_pdf')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]
tmp_ret = collect_all('llama_cpp')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]
tmp_ret = collect_all('tiktoken_ext')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]


a = Analysis(
['./mac_run_app.py'],
['mac_run_app.py'],
pathex=[],
binaries=binaries,
datas=datas,
@@ -28,6 +34,10 @@ a = Analysis(
runtime_hooks=[],
excludes=[],
noarchive=False,
module_collection_mode={
'gradio' : 'py',
'gradio_pdf' : 'py',
},
)
pyz = PYZ(a.pure)

Empty file added src/vision/__init__.py
Empty file.