Bowles-Halliday Word Clouds

Scott Cohn

This project takes a folder of PDFs (\bh_pdfs), one per textbook chapter, converts the files to a .txt files and places the .txt file in a new folder (\bh_txts). A cleaning function parses the text and finds the most frequent words and n-grams and compiles them into a list of top-X words. This list is ported to an external application, Wordl, to create a word cloud.

This is an extension of the forthcoming microeconomics text found here.

Here is an example:

The only file that needs to be run is the bh_wordcloud file.

REQUIREMENTS:

Install: https://www.xpdfreader.com/download.html and copy the executables (xpdf, pdftotext, etc.) to to /usr/local/bin. Use bash script: sudo cp Applications/xpdf/bin64/pdftotext /usr/local/bin

DIRECTORY:

  +-- README.md
  +-- bh_wordcloud.RProj
  +-- bh_cleantext.R
  +-- bh_text_analysis.R
  +-- bh_wordcloud.R 
  +-- excel_aggregation.R
  +-- wordstop.csv
  +-- compdict.csv
  +-- _bh_pdfs
  |   +-- BHMicro_chxx.pdf
  |   +-- BHMicro_chxx.pdf
  +-- _bh_txts
  |   +-- BHMicro_chxx.txt
  |   +-- BHMicro_chxx.txt
  +-- _bh_txts_clean
  |   +-- chxx_wordle.txt
  |   +-- chxx_wordle.txt
  +-- _bh_wordlist
  |   +-- wordlist_master.xlsx
  |   +-- chxx_wordlist.csv
  |   +-- chxx_wordlist.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Bowles-Halliday Word Clouds

Scott Cohn

REQUIREMENTS:

DIRECTORY:

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
bh_clouds		bh_clouds
bh_pdfs		bh_pdfs
bh_txts		bh_txts
bh_txts_clean		bh_txts_clean
bh_wordlist		bh_wordlist
README.md		README.md
bh_cleantext.R		bh_cleantext.R
bh_text_analysis.R		bh_text_analysis.R
bh_wordcloud.R		bh_wordcloud.R
bh_wordcloud.Rproj		bh_wordcloud.Rproj
bh_wordcloud_ggplot.R		bh_wordcloud_ggplot.R
compdict.csv		compdict.csv
excel_aggregation.R		excel_aggregation.R
wordstop.csv		wordstop.csv

scottcohn97/bh_wordcloud

Folders and files

Latest commit

History

Repository files navigation

Bowles-Halliday Word Clouds

Scott Cohn

REQUIREMENTS:

DIRECTORY:

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages