Maybe separate the data-processing step from the data-exploration step; then we can run bigram and trigram TF-IDF explorations.
How should a switch statement be defined here?
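One possible answer to the switch question, as a rough sketch only: Python has no classic switch statement, so a dict that maps a mode name to a function can play that role. The code below is pure Python so it runs without extra packages. The names `ngrams`, `tfidf`, `EXPLORATIONS` and `explore` are made up for illustration and are not from our script, and the TF-IDF formula used (relative term frequency times log(N / df)) is just one common variant.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n):
    """Score the n-grams of each document with a simple TF-IDF.

    tf  = n-gram count / total n-grams in the document
    idf = log(number of documents / documents containing the n-gram)
    """
    grams_per_doc = [Counter(ngrams(doc.lower().split(), n)) for doc in docs]

    # Document frequency: in how many documents does each n-gram occur?
    df = Counter()
    for counts in grams_per_doc:
        df.update(counts.keys())

    scores = []
    for counts in grams_per_doc:
        total = sum(counts.values())
        scores.append({
            gram: (count / total) * math.log(len(docs) / df[gram])
            for gram, count in counts.items()
        })
    return scores


# Dict-based dispatch stands in for a switch statement (hypothetical modes).
EXPLORATIONS = {
    "unigram": lambda docs: tfidf(docs, 1),
    "bigram": lambda docs: tfidf(docs, 2),
    "trigram": lambda docs: tfidf(docs, 3),
}


def explore(docs, mode):
    """Run the exploration selected by `mode`, or fail on an unknown mode."""
    try:
        handler = EXPLORATIONS[mode]
    except KeyError:
        raise ValueError(f"unknown exploration mode: {mode!r}") from None
    return handler(docs)
```

Calling `explore(docs, "bigram")` returns one dict of scores per document. Keeping processing (tokenizing, n-gram building) in its own functions and exploration behind `explore` also matches the idea of separating the two steps. On Python 3.10 or later a `match` statement would be another option.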
What do we want to communicate to our audience?
Change the main script to print out variations of our data that are revealing and quantifiable.
Eliminate neutral words from the collection.
Get the highest feature words.
Sentiment dictionary, TF-IDF, word2vec/doc2vec, naive Bayes classifier.
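A minimal sketch of how some of these pieces could fit together, under loud assumptions: the `SENTIMENT` dictionary below is a tiny made-up lexicon (not a real resource), words missing from it are treated as neutral, and the classifier is a plain multinomial naive Bayes with add-one smoothing written from scratch. `drop_neutral`, `NaiveBayes` and `top_words` are illustrative names only.

```python
import math
from collections import Counter, defaultdict

# Hypothetical sentiment dictionary: word -> polarity, 0 means neutral.
SENTIMENT = {
    "great": 1, "love": 1,
    "bad": -1, "awful": -1,
    "the": 0, "was": 0, "food": 0,
}


def drop_neutral(tokens, lexicon=SENTIMENT):
    """Keep only tokens with non-zero polarity (unknown words count as neutral)."""
    return [t for t in tokens if lexicon.get(t, 0) != 0]


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(drop_neutral(doc.lower().split()))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def top_words(self, label, k=5):
        """Most frequent non-neutral words seen for a class."""
        return self.word_counts[label].most_common(k)

    def predict(self, doc):
        # Ignore words that were never seen in training.
        tokens = [t for t in drop_neutral(doc.lower().split()) if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            total = sum(counts.values())
            # log prior + sum of smoothed log likelihoods
            score = math.log(n / n_docs)
            for t in tokens:
                score += math.log((counts[t] + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


train_docs = ["the food was great", "love the food", "the food was bad", "awful food"]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("the service was great"))
print(model.top_words("neg", 2))
```

Raw counts are used as features here to keep the sketch short; TF-IDF weights or word2vec/doc2vec vectors would be alternative feature sets to compare against.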