mrhallonline/NARST2024_Taking_a_look_under_the_hoodII-beyond_automation

NARST2024_Taking_a_look_under_the_hoodII-beyond_automation

Natural Language Toolkit NLP-Workshop

Overview

The data corpus for this workshop is a CSV file containing a Whisper AI transcription of a high school math class. The data is stored in Google Drive. We will use NLTK, Python, and Google Colab to copy and process the file so that you can analyze it during the workshop. From there, we will do some basic processing and analysis to extract specific features that tell us about student discussions during this math class.

Section 1 ( minutes)

  1. What do we know?
  2. What do we want to know?
  3. Some NLP Basics
  4. What is feature extraction?
  5. Using Google Colab

Section 2 ( minutes)

  1. Installing dependencies and libraries
  2. Connecting to Google Drive
  3. Importing and initial processing of the Uncertainty Transcript
  4. Some quick analysis
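
As a rough preview of the import and initial-processing steps, here is a minimal sketch using only the Python standard library. The column names (`speaker`, `text`), the sample rows, and the helper names `load_utterances` and `tokenize` are assumptions made for illustration; the actual workshop transcript and notebook may be organized differently. The regular-expression tokenizer is a simple stand-in, not NLTK's tokenizer.

```python
import csv
import io
import re

# Hypothetical stand-in for the workshop transcript; the real column names may differ.
SAMPLE_CSV = """speaker,text
Teacher,"What do you think the slope is?"
Student A,"Maybe it is two, but I am not sure."
Student B,"I think it might be three."
"""


def load_utterances(csv_file):
    """Read every row of an open CSV file into a list of dicts keyed by column name."""
    return list(csv.DictReader(csv_file))


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


# In Colab, the file would typically be opened from a mounted Drive path instead, e.g.
#   from google.colab import drive; drive.mount("/content/drive")
rows = load_utterances(io.StringIO(SAMPLE_CSV))
tokens = [tok for row in rows for tok in tokenize(row["text"])]

# Some quick analysis: size of the transcript and of its vocabulary.
print(len(rows), "utterances")
print(len(tokens), "tokens")
print(len(set(tokens)), "distinct words")
```

Keeping the speaker column alongside the text makes it possible to split the later analysis by who is talking, which matters when the goal is to look at student discussion specifically.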

Section 3 ( minutes)

  1. Word counts and sorting
  2. Concordance
  3. N-grams and collocations
  4. Visualizations
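
NLTK provides ready-made tools for the first three items (for example `nltk.FreqDist`, `nltk.Text.concordance`, and `nltk.ngrams`). To show what those operations compute under the hood, the sketch below reimplements simplified versions in plain Python. The token list and the helper names `ngrams` and `concordance` are hypothetical and chosen only for illustration; the output format does not match NLTK's.

```python
from collections import Counter

# Hypothetical token list standing in for the tokenized transcript.
tokens = ("i think it might be two but i am not sure "
          "i think it is three").split()


def ngrams(seq, n):
    """Return every run of n consecutive items in seq as a tuple."""
    return list(zip(*(seq[i:] for i in range(n))))


def concordance(seq, word, window=3):
    """Return each occurrence of word with up to `window` tokens of context per side."""
    lines = []
    for i, tok in enumerate(seq):
        if tok == word:
            left = " ".join(seq[max(0, i - window):i])
            right = " ".join(seq[i + 1:i + 1 + window])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Word counts and sorting: most frequent words first.
word_counts = Counter(tokens)
print(word_counts.most_common(3))

# N-grams: count adjacent word pairs (bigrams).
bigram_counts = Counter(ngrams(tokens, 2))
print(bigram_counts.most_common(2))

# Concordance: every use of a word shown in its surrounding context.
for line in concordance(tokens, "think"):
    print(line)
```

Raw bigram counts are only a starting point for collocations; collocation measures additionally compare how often a pair occurs against how often its words occur separately.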

Conclusion ( minutes)

  1. Issues to keep in mind when normalizing your data corpus
  2. Potential pitfalls and ethical considerations
  3. What did we learn?
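
One normalization issue can be illustrated with a small sketch: each cleaning step changes what is left to count. The utterance and the stopword set below are made up for illustration (this is not NLTK's stopword list), and the example assumes tokens are already separated by spaces.

```python
# Hypothetical utterance and a tiny hand-picked stopword set (not NLTK's list).
utterance = "I am not sure , it MIGHT be two . Maybe not ."
STOPWORDS = {"i", "am", "not", "it", "be"}

# Step 0: raw tokens, punctuation and capitalization included.
raw = utterance.split()

# Step 1: drop punctuation tokens and lowercase everything.
lowered = [tok.lower() for tok in raw if tok.isalpha()]

# Step 2: remove stopwords.
filtered = [tok for tok in lowered if tok not in STOPWORDS]

print(len(raw), raw)
print(len(lowered), lowered)
print(len(filtered), filtered)
```

In this sketch the stopword step removes "not", so a negated statement and its opposite can end up with the same tokens. Whether that loss is acceptable depends on the features being extracted.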
