The plan is to use Sentence Transformers to create embeddings that we can later look up with Annoy nearest-neighbor search to find relevant files. We might later need to cache the embeddings (perhaps in SQLite) to speed up startup.
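A minimal sketch of that idea (the model name and the 768-dimensional vectors match msmarco-distilbert-base-tas-b; the Annoy metric and tree count here are assumptions, not necessarily what main.py uses):

    from annoy import AnnoyIndex
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-tas-b")
    texts = ["how to set up ssh keys", "grocery list", "annoy build notes"]

    # Embed each note and add it to an Annoy index (768 = DistilBERT hidden size).
    index = AnnoyIndex(768, "angular")  # metric is an assumption; "dot" also fits TAS-B
    for i, text in enumerate(texts):
        index.add_item(i, model.encode(text))
    index.build(10)  # 10 trees

    # Embed the query and fetch the nearest notes.
    ids = index.get_nns_by_vector(model.encode("ssh"), 2)
    print([texts[i] for i in ids])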
Optionally, make a models directory and, inside it, git clone the embedding model and the cross-encoder model.
The folder structure (tree -d output) should look like:

    models/
    ├── msmarco-distilbert-base-tas-b
    │   └── 1_Pooling
    └── ms-marco-TinyBERT-L-2
Remember to "git-lfs pull" after git cloning to get the model files. The main.py automatically checks these two folders before trying to load the models from the internet
Usage:
source venv/bin/activate
First, create the embeddings for the notes and the Annoy index with the following:
python3 main.py build (notes-dir) (data directory name)
ex. python3 main.py build ~/Notes notes
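Roughly, the build step amounts to the following sketch: walk the notes directory, embed each file, and write the Annoy index plus the file list into the data directory (the file names index.ann and files.pkl are hypothetical, not necessarily what main.py writes):

    import pickle
    from pathlib import Path

    from annoy import AnnoyIndex
    from sentence_transformers import SentenceTransformer

    def build(notes_dir: str, data_dir: str) -> None:
        model = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-tas-b")
        files = sorted(p for p in Path(notes_dir).rglob("*") if p.is_file())

        # One Annoy item per note, keyed by its position in the file list.
        index = AnnoyIndex(768, "angular")
        for i, path in enumerate(files):
            index.add_item(i, model.encode(path.read_text(errors="ignore")))
        index.build(10)  # more trees = better recall, slower build

        out = Path(data_dir)
        out.mkdir(exist_ok=True)
        index.save(str(out / "index.ann"))         # hypothetical file name
        with open(out / "files.pkl", "wb") as fh:  # hypothetical file name
            pickle.dump([str(p) for p in files], fh)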
Do semantic search on the notes with the following:
python3 main.py search "(query string)" (data directory name)
ex. python3 main.py search "ssh" notes
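Under the hood, a search like the example above can be sketched as: embed the query, pull candidates from the Annoy index, then rerank them with the cross-encoder. The rerank step and the file names below are assumptions about how main.py uses the cloned ms-marco-TinyBERT-L-2 model:

    import pickle
    from pathlib import Path

    from annoy import AnnoyIndex
    from sentence_transformers import CrossEncoder, SentenceTransformer

    def search(query: str, data_dir: str, top_k: int = 10):
        model = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-tas-b")
        reranker = CrossEncoder("cross-encoder/ms-marco-TinyBERT-L-2")

        index = AnnoyIndex(768, "angular")
        index.load(str(Path(data_dir) / "index.ann"))          # hypothetical file name
        with open(Path(data_dir) / "files.pkl", "rb") as fh:   # hypothetical file name
            files = pickle.load(fh)

        # Annoy gives cheap approximate candidates; the cross-encoder reranks them.
        ids = index.get_nns_by_vector(model.encode(query), top_k)
        pairs = [(query, Path(files[i]).read_text(errors="ignore")) for i in ids]
        scores = reranker.predict(pairs)
        ranked = sorted(zip((files[i] for i in ids), scores),
                        key=lambda x: x[1], reverse=True)
        return [path for path, _ in ranked]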