CD4Py 0.1.0

mir-am released this 11 Sep 20:04
· 4 commits to master since this release

Added

  • A parallel tokenizer for Python source code files.
  • A library module for pre-processing tokenized files, calculating TF-IDF, finding KNNs, and identifying duplicate files.
  • A command-line interface for detection of duplicate files in Python projects.
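
The features above (tokenizing, TF-IDF, nearest neighbours, duplicate identification) can be illustrated with a minimal, standard-library-only sketch. This is not CD4Py's API: every function name, the `k` and `threshold` parameters, and the smoothed IDF formula are assumptions made for illustration, and the sketch runs sequentially rather than in parallel.

```python
import io
import math
import tokenize
from collections import Counter

# Hypothetical helpers for illustration only; these are NOT CD4Py functions.

def tokenize_source(source):
    """Return the token strings of a Python source, skipping comments and layout tokens."""
    skip = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in skip]

def tfidf_vectors(docs):
    """Map each file name to a sparse {token: weight} TF-IDF vector."""
    n = len(docs)
    df = Counter()
    for toks in docs.values():
        df.update(set(toks))  # document frequency: count each token once per file
    vectors = {}
    for name, toks in docs.items():
        tf = Counter(toks)
        # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1
        vectors[name] = {t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1)
                         for t, c in tf.items()}
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def find_duplicates(sources, k=1, threshold=0.95):
    """Report file pairs whose similarity among the k nearest neighbours meets the threshold."""
    docs = {name: tokenize_source(src) for name, src in sources.items()}
    vecs = tfidf_vectors(docs)
    pairs = set()
    for name, vec in vecs.items():
        # Brute-force KNN: rank every other file by cosine similarity.
        neighbours = sorted(((cosine(vec, other), o)
                             for o, other in vecs.items() if o != name),
                            reverse=True)
        for sim, o in neighbours[:k]:
            if sim >= threshold:
                pairs.add(tuple(sorted((name, o))))
    return sorted(pairs)
```

Calling `find_duplicates` with a dict that maps file names to source strings returns a sorted list of `(name, name)` pairs. Because comments are dropped during tokenization, two files that differ only in comments are treated as duplicates. The brute-force neighbour search is quadratic in the number of files, so a real tool would likely use an indexed KNN search instead.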