Exploring semantic similarity algorithms in NLP to build text search solutions. Rather than matching text exactly, the search looks for sentences or phrases with similar meaning.
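To make the distinction concrete, here is a minimal sketch of similarity-based ranking. The three-dimensional vectors are hand-picked toy values, not the output of any model, and rank_by_similarity is a hypothetical helper rather than a function from this repository. In practice the vectors would come from a text encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, corpus):
    """Return (sentence, score) pairs sorted from most to least similar.

    `corpus` maps each sentence to its embedding vector.
    """
    scored = [(s, cosine_similarity(query_vec, v)) for s, v in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (illustrative values only, not produced by a real model).
corpus = {
    "A fast fox leaps over a sleepy hound": [0.9, 0.1, 0.0],
    "Stock prices fell sharply on Monday": [0.0, 0.2, 0.9],
}
query = "The quick brown fox jumps over the lazy dog"
query_vec = [0.8, 0.2, 0.1]  # toy embedding for the query above

# Exact substring matching finds nothing: no sentence contains the query.
exact_hits = [s for s in corpus if query in s]

# Similarity ranking still puts the paraphrase first.
ranked = rank_by_similarity(query_vec, corpus)
```

With these toy values, exact_hits is empty while the fox sentence ranks above the unrelated one, which is the behavior a meaning-based search aims for.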
-
Install the dependencies for our code using Conda. You may need to adjust the environment YAML file depending on your setup.
conda env create -f environment.yaml
-
Install the CLIP repository
pip install git+https://github.com/openai/CLIP.git
-
Download pretrained weights (conceptual_weights.pt) and place them in the weights directory. You can find the COCO and Conceptual Captions pretrained models on Google Drive.
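A quick check that the weights are in place can save a confusing error later. This is a minimal sketch: find_weights is a hypothetical helper, and resolving the weights directory relative to the current working directory is an assumption, not something this README states.

```python
from pathlib import Path


def find_weights(weights_dir="weights", filename="conceptual_weights.pt"):
    """Return the path to the weights file, or raise if it is missing."""
    # Assumption: weights_dir is relative to the current working directory.
    path = Path(weights_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected pretrained weights at {path}; download them first."
        )
    return path
```

Calling find_weights() from the repository root either returns the path or fails with a clear message.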
-
Activate your environment with
conda activate understanding
or
source activate understanding
-
Place the text files you want to search in the data folder.
-
Run
python -i main.py
to open an interactive Python session. Use the search method to find phrases and sentences that appear in the text. For example:
search("The quick brown fox jumps over the lazy dog")
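Before a query can be matched, the files in the data folder have to be turned into candidate sentences. The sketch below shows one plausible way to do that. It assumes plain .txt files encoded as UTF-8 and uses a naive regex sentence splitter; load_sentences is a hypothetical helper, and main.py may load and split the text differently.

```python
import re
from pathlib import Path


def load_sentences(data_dir="data"):
    """Read every .txt file directly inside data_dir and return its sentences."""
    sentences = []
    for path in sorted(Path(data_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        # Naive split: break on whitespace that follows ., ! or ?
        for piece in re.split(r"(?<=[.!?])\s+", text):
            piece = piece.strip()
            if piece:
                sentences.append(piece)
    return sentences
```

A splitter this simple will break on abbreviations such as "e.g."; a dedicated sentence tokenizer would be a reasonable replacement if that matters for your files.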