A website made using the difflib library and Streamlit
Check this website now!
- Reading the .csv file
- Some features are selected for further calculations:
- Genres
- Keywords
- Tagline
- Cast
- Director
- Replacing all the null values with an empty string
for feature in selected_feature:
    mov_data[feature] = mov_data[feature].fillna('')
- Combining all the selected features
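The null-filling and combining steps can be sketched as follows. This is a minimal illustration, not the project's actual code: the tiny DataFrame, the lowercase column names, and the `combined_features` variable are assumptions made for the example.

```python
import pandas as pd

# Made-up stand-in for the movie dataset; the real data comes from the .csv file.
mov_data = pd.DataFrame({
    "title": ["Movie A", "Movie B"],
    "genres": ["Action Adventure", None],
    "keywords": ["space hero", "love letter"],
    "tagline": [None, "A quiet story"],
    "cast": ["Actor One", "Actor Two"],
    "director": ["Director X", None],
})

# Assumed column names for the five selected features.
selected_feature = ["genres", "keywords", "tagline", "cast", "director"]

# Replace missing values with an empty string so joining the text does not fail.
for feature in selected_feature:
    mov_data[feature] = mov_data[feature].fillna("")

# Join the selected columns row by row into one text string per movie.
combined_features = mov_data[selected_feature].apply(
    lambda row: " ".join(row), axis=1
)

print(combined_features[0])
```

Joining with a space keeps words from neighbouring columns apart, so the vectorizer later sees them as separate tokens.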
- Converting text data into feature vector
vectoriser = TfidfVectorizer()
A feature vector is an n-dimensional vector of numerical features that represent some object
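As a rough sketch of this step, `TfidfVectorizer` can turn a list of text strings into a matrix with one row per movie. The three sample strings and the names `combined_features` and `feature_vectors` are hypothetical and only serve the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical combined feature strings, one per movie.
combined_features = [
    "action adventure space hero",
    "romance drama love letter",
    "action space war hero",
]

vectorizer = TfidfVectorizer()

# fit_transform learns the vocabulary and returns a sparse matrix:
# one row per movie, one column per distinct word.
feature_vectors = vectorizer.fit_transform(combined_features)

print(feature_vectors.shape)
```

Each row is the feature vector of one movie. Words shared by many movies get a lower weight than words that are specific to a few.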
- Getting Similarity scores using Cosine similarity
similarity = cosine_similarity(feature_vectorizer)
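To show what a cosine similarity score means, here is a small pure-Python illustration of the formula (dot product divided by the product of the vector lengths). It is not how scikit-learn implements `cosine_similarity`; the helper `cosine` and the sample vectors are made up for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up 3-dimensional feature vectors for three movies.
vectors = [
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Pairwise matrix: similarity[i][j] compares movie i with movie j.
similarity = [[cosine(u, v) for v in vectors] for u in vectors]

for row in similarity:
    print([round(score, 3) for score in row])
```

A movie compared with itself scores 1, and two movies with no features in common score 0.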
- Getting movie name as input from user
- Searching for similar titles with difflib to find the closest match to that movie name
- Selecting the first matching movie title and getting its index value
- Getting its similarity scores with all the movies in the database
list(enumerate(similarity[index_of_movie]))
- Slicing the list to select any number of movie names; I give 10 suggestions
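The lookup steps above can be sketched like this. The title list, the small similarity matrix, and the variable names are assumptions for illustration. Sorting the scores in descending order before slicing is also an assumption about how the top suggestions are picked.

```python
import difflib

# Hypothetical titles and a made-up similarity matrix (row i = movie i vs. all movies).
titles = ["Iron Man", "Iron Man 2", "Batman Begins", "Avatar"]
similarity = [
    [1.0, 0.8, 0.3, 0.1],
    [0.8, 1.0, 0.2, 0.1],
    [0.3, 0.2, 1.0, 0.05],
    [0.1, 0.1, 0.05, 1.0],
]

movie_name = "Iron Men"  # stands in for the user's (misspelled) input

# get_close_matches returns the most similar titles, best match first.
close_matches = difflib.get_close_matches(movie_name, titles)

suggestions = []
if close_matches:
    close_match = close_matches[0]
    index_of_movie = titles.index(close_match)

    # Pair every movie index with its similarity score to the chosen movie.
    similarity_score = list(enumerate(similarity[index_of_movie]))

    # Highest scores first, then drop the movie itself and keep up to 10.
    ranked = sorted(similarity_score, key=lambda pair: pair[1], reverse=True)
    suggestions = [titles[i] for i, _ in ranked if i != index_of_movie][:10]

print(suggestions)
```

Checking `close_matches` before indexing avoids an error when the input is too far from every known title.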
- Import pickle library
- Dumping the computed similarity matrix (the model) into a .pkl file
pickle.dump(similarity, open(filename,'wb'))
- Loading the model with
pickle.load(open('similarity.pkl','rb'))
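A minimal round-trip sketch of saving and loading with pickle follows. The small `similarity` list and the temporary directory are assumptions so the example runs on its own. The `with` blocks are a suggested variation that closes the file handles once writing or reading is done.

```python
import os
import pickle
import tempfile

# Made-up similarity matrix standing in for the real one.
similarity = [[1.0, 0.8], [0.8, 1.0]]

# Write into a temporary directory so the example leaves the project folder untouched.
filename = os.path.join(tempfile.mkdtemp(), "similarity.pkl")

# Save the matrix to disk.
with open(filename, "wb") as f:
    pickle.dump(similarity, f)

# Load it back, for example when the Streamlit app starts.
with open(filename, "rb") as f:
    loaded = pickle.load(f)

print(loaded == similarity)
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.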
- Function: does all the remaining operations to get the suggestions
- Dataset
- Numpy
- Pandas
- Difflib
- Pickle
- Streamlit
- TfidfVectorizer from sklearn
- Cosine_similarity from sklearn