When crawling or collecting text data, it is easy to get confused by entries that are nearly identical to one another, such as brand names, news headlines, journal titles, or other short sentences. I created a similarity check function to measure how similar sentences are to each other. This project uses:
Cosine Similarity
Difflib get_close_matches (fuzzy string matching)
Regex
Pandas
Numpy
Math
Time
Sys
OS
To test this project, run python main.py.
To see how to use the cosine similarity function, open cosine_function.ipynb.
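As a rough illustration of the idea, here is a minimal sketch of cosine similarity between two sentences using word counts. It is not the code from cosine_function.ipynb; the names `tokenize` and `cosine_similarity` and the bag-of-words approach are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and split it into word tokens (assumed preprocessing).
    return re.findall(r"\w+", text.lower())


def cosine_similarity(sentence_a, sentence_b):
    # Represent each sentence as a word-count vector.
    vec_a = Counter(tokenize(sentence_a))
    vec_b = Counter(tokenize(sentence_b))

    # Dot product over the words the two sentences share.
    dot = sum(vec_a[word] * vec_b[word] for word in vec_a.keys() & vec_b.keys())

    # Euclidean length of each vector.
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))

    # An empty sentence has no direction, so treat it as not similar.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity("Daily Tech News", "daily tech news today"))
```

The score ranges from 0 (no shared words) to 1 (same words in the same proportions), so word order and letter case do not affect it in this sketch.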
*** Note: The output of main.py is a data table with close matches and similarity scores. You can customize the threshold inside the code. Enjoy! ***
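To show how a threshold affects the close match result, here is a small sketch built on `difflib.get_close_matches`. It is not the code in main.py: the function name `close_match_table`, the column names, and the use of `SequenceMatcher.ratio()` as the similarity score are assumptions for illustration, and it returns a list of dictionaries rather than a Pandas table to stay dependency-free.

```python
import difflib


def close_match_table(queries, candidates, threshold=0.6):
    # Build one row per query: the best candidate above the threshold, if any.
    rows = []
    for query in queries:
        # cutoff is difflib's name for the minimum similarity ratio (0 to 1).
        matches = difflib.get_close_matches(query, candidates, n=1, cutoff=threshold)
        best = matches[0] if matches else None

        # Same argument order that get_close_matches uses internally
        # (candidate first, query second).
        score = difflib.SequenceMatcher(None, best, query).ratio() if best else 0.0

        rows.append({"text": query, "close_match": best, "similarity": round(score, 3)})
    return rows


if __name__ == "__main__":
    brands = ["Samsung Galaxy", "Apple iPhone", "Google Pixel"]
    for row in close_match_table(["Samsng Galaxy", "Aple iPhone", "Nokia"], brands):
        print(row)
```

Raising the threshold keeps only near-identical pairs, while lowering it lets looser matches through; rows with no candidate above the threshold get `None` as the close match.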