The main objective of this repository is to share my knowledge of Natural Language Processing (NLP).

Vitor-Sallenave/Studies-in-NLP

⚙️🤖 Studies in Natural Language Processing (NLP)


◾ Module 1: NLP with spaCy

  • Tokenization
  • POS Tagging and Dependencies
  • Named Entity Recognition (NER)
  • Managing Stopwords
  • Creating a Vocabulary
  • Searching for similarity
  • Expression Matching
  • displaCy Visualization
  • Working with Pipelines

◾ Module 2: NLP with NLTK

  • Tokenization
  • Managing Stopwords and Punctuation
  • Stemming
  • Metrics
  • Customized POS Tagging
  • NER
  • Lemmatization
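
As a rough illustration of the first preprocessing steps listed in this module (tokenization, then stopword and punctuation removal), here is a minimal sketch that uses only the Python standard library rather than NLTK itself. The `STOPWORDS` set and the `preprocess` helper are hypothetical names made up for this example; NLTK ships its own tokenizers and a much larger stopword list.

```python
import re
from collections import Counter

# Tiny illustrative stopword set (an assumption; NLTK's English list is far larger).
STOPWORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}

def preprocess(text):
    """Lowercase, split into word tokens, and drop stopwords and punctuation."""
    # \w+ keeps runs of letters, digits and underscores, so punctuation is discarded.
    tokens = re.findall(r"\w+", text.lower())
    return [tok for tok in tokens if tok not in STOPWORDS]

tokens = preprocess("The cat sat in the hat, and the hat is red!")
counts = Counter(tokens)  # token frequencies after cleaning
print(tokens)
print(counts.most_common(1))
```

A regex tokenizer like this is deliberately crude: it splits contractions such as "don't" into two tokens, which is one reason to prefer a library tokenizer in practice.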

◾ Module 3: Machine Learning and Deep Learning in NLP

  • Implementing Neural Networks (Keras and TensorFlow)
  • Spam Classification (NN)
  • Creating Embeddings with NNs

◾ Module 4: Sentiment Analysis

  • LSTM: Supervised model
  • VADER: Rule-based model
  • Comparison: LSTM vs. VADER
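
To make the contrast in this module concrete: an LSTM learns sentiment from labeled examples, while VADER scores text with a fixed lexicon plus hand-written rules and needs no training. The sketch below is a toy scorer in that spirit, not VADER itself. The `LEXICON` values, the `NEGATIONS` set and the `rule_based_score` function are all assumptions invented for illustration.

```python
# Toy lexicon: word -> valence. These values are made up for illustration.
LEXICON = {"good": 2.0, "great": 3.0, "bad": -2.0, "terrible": -3.0}
NEGATIONS = {"not", "never", "no"}

def rule_based_score(text):
    """Sum word valences, flipping the sign of a word that follows a negation."""
    score = 0.0
    negate = False
    for word in text.lower().split():
        word = word.strip(".,!?")  # drop punctuation attached to the word
        if word in NEGATIONS:
            negate = True
            continue
        valence = LEXICON.get(word, 0.0)  # unknown words are neutral
        score += -valence if negate else valence
        negate = False  # a negation only affects the next word
    return score

print(rule_based_score("The movie was great"))
print(rule_based_score("The movie was not good"))
```

The real VADER lexicon is much larger and its rules also cover intensifiers, capitalization and punctuation emphasis, so this sketch only shows the general shape of a rule-based approach.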

◾ Module 5: Transformers, BERT and GPTs

  • HuggingFace and OpenAI
  • Question Answering
  • Fill-mask
  • Summarization
  • Text Generation
  • Text Translation

◾ Module 6: Topic Modeling with BERT (BERTopic)

  • Data Processing
  • Main Hyperparameters

◾ Module 7: NLP with Spark

  • Working on the Databricks Environment
  • Data Pre-processing
  • Training and Evaluating the model