job-titles

Normalized dataset of 70k job titles

Data Normalizations

The data is normalized in the following ways:

  • converted to lowercase
  • hyphens (-) replaced with a space
  • commas (,) removed
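
The three rules above can be sketched in a few lines of Python. This is only an illustration of the listed normalizations, not the repository's actual tooling; the function name `normalize_title` is hypothetical, and the order in which the rules are applied is an assumption.

```python
def normalize_title(title: str) -> str:
    """Apply the three listed normalizations to a single job title.

    Hypothetical helper for illustration; not part of this repository.
    """
    title = title.lower()            # lowercase
    title = title.replace("-", " ")  # hyphen replaced with a space
    title = title.replace(",", "")   # comma removed
    return title


if __name__ == "__main__":
    print(normalize_title("Front-End Developer, Senior"))
```

Note that the sketch does nothing beyond the three rules: it does not trim or collapse whitespace, since the list above does not mention it.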

Caveats

  • Duplicates such as "a and p mechanic" and "a&p mechanic"
  • Non-English titles such as "ab initio etl developer"
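
As a rough starting point for the duplicates caveat, the following Python sketch groups titles that differ only in "&" versus "and". The helper names and the canonicalization rule are assumptions made for illustration; the rule covers only this one kind of duplicate and is not how the dataset is currently processed.

```python
from collections import defaultdict


def duplicate_key(title: str) -> str:
    """Hypothetical canonical form: treat "&" as "and", collapse whitespace."""
    return " ".join(title.replace("&", " and ").split())


def find_duplicate_groups(titles):
    """Return lists of titles that share the same canonical key."""
    groups = defaultdict(list)
    for title in titles:
        groups[duplicate_key(title)].append(title)
    # Keep only keys that more than one title maps to.
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    sample = ["a and p mechanic", "a&p mechanic", "accountant"]
    print(find_duplicate_groups(sample))
```

Any group it reports still needs a manual check before removing entries, since two titles can share a key without being true duplicates.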

See also

Contribute

Feel free to open a pull request that fixes the caveats listed above or makes any other enhancement.

Only edit job-titles.txt. After editing, run ./format.sh.

Attribution

This dataset is a collection of the following sources: