I was tasked with scraping review data from Skytrax. I focused specifically on reviews of British Airways.
Once the dataset was scraped, I cleaned the data to prepare it for analysis. The NLP techniques used were topic modelling, sentiment analysis and word clouds, which provided some insight into the content of the reviews.
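The cleaning step can be illustrated with a minimal sketch. It is not the code used in the project: the stop-word list, the sample reviews and the helper names `clean_review` and `word_frequencies` are all hypothetical, chosen only to show how raw review text might be normalised and counted before any NLP technique is applied.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for illustration).
STOP_WORDS = {"the", "a", "an", "and", "but", "was", "were", "is", "it", "i", "to", "of"}


def clean_review(text):
    """Lowercase the text, replace non-letters with spaces and drop stop words."""
    letters_only = re.sub(r"[^a-z\s]", " ", text.lower())
    return [word for word in letters_only.split() if word not in STOP_WORDS]


def word_frequencies(reviews):
    """Count how often each cleaned word appears across all reviews."""
    counts = Counter()
    for review in reviews:
        counts.update(clean_review(review))
    return counts


# Made-up sample reviews, not taken from the scraped dataset.
sample_reviews = [
    "The flight was delayed and the staff were unhelpful.",
    "Comfortable seat, friendly staff, but the flight was delayed!",
]

freqs = word_frequencies(sample_reviews)
print(freqs.most_common(3))
```

A frequency table of this kind is the sort of input a word cloud is typically drawn from, with more frequent words rendered larger.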
Finally, I summarised my findings in a single PowerPoint slide.
I explored the dataset to understand its columns and some basic statistics. Then, I prepared the dataset for a predictive model.
I trained a machine learning model to predict the target outcome, which was a customer making a booking. I used a RandomForest algorithm, which makes it easy to output information about how each variable contributes to the model's predictive power.
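As a rough sketch of how a random forest exposes variable contributions, the example below fits scikit-learn's `RandomForestClassifier` on synthetic data and reads its `feature_importances_` attribute. The data and the feature names (`signal_feature`, `noise_a`, `noise_b`) are invented for illustration and are not the columns of the booking dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption): only the first column drives the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] > 0).astype(int)
feature_names = ["signal_feature", "noise_a", "noise_b"]  # hypothetical names

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# feature_importances_ holds one impurity-based score per feature; scores sum to 1.
importances = dict(zip(feature_names, model.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Impurity-based importances like these are quick to obtain, though they can favour high-cardinality features, so they are best read as a first indication rather than a definitive ranking.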
After training the model, I evaluated how well it performed by conducting cross-validation and outputting appropriate evaluation metrics. Furthermore, I created a visualisation to interpret how each variable contributed to the model. Finally, I summarised my findings in a single slide.
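The evaluation step might look something like the sketch below, which runs 5-fold cross-validation with scikit-learn's `cross_validate` and reports accuracy and ROC AUC. The synthetic data, the number of folds and the choice of metrics are assumptions made for illustration, not the settings used in the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

# Synthetic stand-in for the prepared dataset (an assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0)

# Each fold is held out once while the model is trained on the remaining folds.
results = cross_validate(model, X, y, cv=5, scoring=["accuracy", "roc_auc"])

mean_accuracy = results["test_accuracy"].mean()
mean_auc = results["test_roc_auc"].mean()
print(f"accuracy: {mean_accuracy:.3f}, ROC AUC: {mean_auc:.3f}")
```

Reporting more than one metric is useful when the target classes are imbalanced, since accuracy alone can look high for a model that rarely predicts the minority class.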