Select the OS you are installing the environment for:
- EC2 (with Linux) - execute steps as described in Section 1.1
- Windows - execute steps as described in Section 1.2
Prerequisites:
- An SSH key generated and added to GitHub (so that you can clone the repo)
- Generate SSH key pair
cd ~
ssh-keygen -t rsa -b 4096 -C "your_name@gmail.com"
Save key to: ~/.ssh/id_rsa_projectx
>>>yes
- No passphrase used
- Start the ssh-agent in the background
eval "$(ssh-agent -s)"
- Add your SSH private key to the ssh-agent
ssh-add ~/.ssh/id_rsa_projectx
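Because the key is saved under a non-default name, SSH only offers it to GitHub while it is loaded in the agent. One common way to make the choice persistent is a `Host github.com` entry in `~/.ssh/config`. The sketch below renders such an entry in Python. The helper name `render_github_ssh_config` is hypothetical, and adding this entry at all is an assumption about your setup, not a step required by this guide.

```python
def render_github_ssh_config(identity_file: str = "~/.ssh/id_rsa_projectx") -> str:
    """Return an ssh_config stanza that pins `identity_file` for github.com."""
    lines = [
        "Host github.com",
        "    HostName github.com",
        "    User git",
        f"    IdentityFile {identity_file}",
        # Only offer the key named above instead of every key the agent holds.
        "    IdentitiesOnly yes",
    ]
    return "\n".join(lines) + "\n"


# Print the stanza so it can be reviewed before being appended to ~/.ssh/config.
print(render_github_ssh_config())
```

After appending the printed stanza to `~/.ssh/config`, running `ssh -T git@github.com` should report whether GitHub accepts the key.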
Find the most recent Anaconda distribution for Windows at https://repo.anaconda.com/archive/ (I am using Anaconda3-2021.11-Windows-x86_64.exe).
- Download and install
https://repo.anaconda.com/archive/Anaconda3-2021.11-Windows-x86_64.exe
- In my case, I installed Anaconda to
C:\Users\Nazar\Anaconda3
- Open the environment variables on your Windows machine and make sure your Path variable contains the following:
C:\Users\Nazar\Anaconda3\Scripts\windows
C:\Users\Nazar\Anaconda3\Library\
C:\Users\Nazar\Anaconda3\Library\bin\
C:\Users\Nazar\Anaconda3\Library\mingw-w64\bin\
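The Path check above can also be done programmatically. The following is a minimal sketch, not part of the repo: the function name `missing_path_entries` is hypothetical, and the `REQUIRED` list is simply copied from the entries above, so replace it with your own install location.

```python
import ntpath
import os
from typing import List


def _norm(entry: str) -> str:
    """Normalize a Windows path for comparison (case, slashes, trailing separator)."""
    return ntpath.normcase(ntpath.normpath(entry.strip().strip('"')))


def missing_path_entries(path_value: str, required: List[str]) -> List[str]:
    """Return the entries of `required` that are absent from a ';'-separated Path value."""
    present = {_norm(p) for p in path_value.split(";") if p.strip()}
    return [r for r in required if _norm(r) not in present]


# Copied from the list above (trailing backslashes dropped; they are ignored anyway).
REQUIRED = [
    r"C:\Users\Nazar\Anaconda3\Scripts\windows",
    r"C:\Users\Nazar\Anaconda3\Library",
    r"C:\Users\Nazar\Anaconda3\Library\bin",
    r"C:\Users\Nazar\Anaconda3\Library\mingw-w64\bin",
]

# An empty list means every required directory is already on the Path.
print(missing_path_entries(os.environ.get("PATH", ""), REQUIRED))
```

The comparison ignores letter case and trailing backslashes, since Windows treats such paths as equivalent.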
For the majority of this repo's use cases, create a lightweight conda environment (mainly for EDA, data wrangling, ...). In Windows PowerShell:
cd C:\ProjectX\workspace\project_x
conda activate base
conda install nb_conda_kernels -y
- If the environment already exists:
conda env remove -n py38eda
conda clean --all -y
conda env create -f environment_py38eda_windows.yml
conda activate py38eda
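Whether the "already exists" step applies can be checked from the output of `conda env list --json`, which lists the prefix path of every environment. The sketch below is a hedged illustration: the helper names `env_names`, `env_exists`, and `list_envs_json` are hypothetical, and `list_envs_json` assumes a `conda` executable is reachable on the Path.

```python
import json
import ntpath
import subprocess
from typing import List


def env_names(conda_env_list_json: str) -> List[str]:
    """Extract environment names from the output of `conda env list --json`."""
    prefixes = json.loads(conda_env_list_json).get("envs", [])
    # Each prefix is a full path; the last component is the environment name.
    # The base install appears under its folder name (e.g. "Anaconda3").
    return [ntpath.basename(p.rstrip("\\/")) for p in prefixes]


def env_exists(conda_env_list_json: str, name: str) -> bool:
    """Return True if an environment called `name` is listed."""
    return name in env_names(conda_env_list_json)


def list_envs_json() -> str:
    """Run conda and return its JSON listing (assumes conda is on the Path)."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


# Demonstration with a hand-written listing instead of a live conda call.
sample = json.dumps({"envs": [
    r"C:\Users\Nazar\Anaconda3",
    r"C:\Users\Nazar\Anaconda3\envs\py38eda",
]})
print(env_exists(sample, "py38eda"))  # True
print(env_exists(sample, "py38dnn"))  # False
```

On a real machine, `env_exists(list_envs_json(), "py38eda")` would indicate whether `conda env remove -n py38eda` is needed before recreating the environment.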
For heavyweight work such as NER or geocoding, install the following conda environment. In Windows PowerShell:
cd C:\ProjectX\workspace\project_x
conda activate base
conda install nb_conda_kernels -y
- If the environment already exists:
conda env remove -n py38dnn
conda clean --all -y
conda env create -f environment_py38dnn_windows.yml
conda activate py38dnn
- To install spaCy (GPU-enabled):
conda install -c pytorch torchvision==0.11.3 pytorch==1.10.2 cudatoolkit==11.3.1 -y
conda install -c conda-forge cupy==10.1.0 spacy==3.2.1 spacy-transformers==1.1.5 -y
- Install pre-trained spaCy models:
python -m spacy download en_core_web_lg
python -m spacy download en_core_web_trf
- Install Prodigy (optional, needed for NER):
cd C:\ProjectX\workspace
- If it does not exist yet:
mkdir prodigy
cd C:\ProjectX\workspace\prodigy\prodigy_source\1.11.7\windows
pip install prodigy-1.11.7-cp38-cp38-win_amd64.whl
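Once the py38dnn environment is installed, it can be worth confirming that the GPU is actually visible to PyTorch and spaCy. The following is a minimal sanity-check sketch, to be run inside the activated environment. The function name `check_gpu_stack` is hypothetical; `torch.cuda.is_available()` and `spacy.prefer_gpu()` are the only library calls used.

```python
from typing import Dict, Optional


def check_gpu_stack() -> Dict[str, Optional[bool]]:
    """Report GPU visibility for PyTorch and spaCy.

    A value of None means the package could not be imported.
    """
    report: Dict[str, Optional[bool]] = {"torch_cuda": None, "spacy_gpu": None}
    try:
        import torch
        report["torch_cuda"] = bool(torch.cuda.is_available())
    except ImportError:
        pass
    try:
        import spacy
        # prefer_gpu() returns True only if spaCy could switch to the GPU.
        report["spacy_gpu"] = bool(spacy.prefer_gpu())
    except ImportError:
        pass
    return report


print(check_gpu_stack())
```

If both values are True, loading a model with `spacy.load("en_core_web_trf")` after this check should place it on the GPU. A False or None value suggests revisiting the PyTorch, cupy, or cudatoolkit installation steps.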