Releases: saltudelft/CD4Py

CD4Py 0.1.0

11 Sep 20:04

Added

  • A parallel tokenizer for Python source code files.
  • A library module for pre-processing tokenized files, calculating TF-IDF, finding KNNs, and identifying duplicate files.
  • A command-line interface for detection of duplicate files in Python projects.
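The pipeline these items describe (tokenize, weight tokens with TF-IDF, compare files by similarity) can be illustrated with a minimal standard-library sketch. This is not CD4Py's API: the function names `tokenize_source`, `tfidf_vectors`, `cosine`, and `find_duplicates`, the smoothed IDF formula, and the 0.95 threshold are all assumptions made for illustration. It also runs serially and uses a brute-force all-pairs comparison in place of a parallel tokenizer and a KNN search.

```python
import io
import math
import tokenize
from collections import Counter
from itertools import combinations


def tokenize_source(source):
    """Return token strings for Python source, skipping comments and layout tokens."""
    skip = {
        tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
        tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
    }
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in skip]


def tfidf_vectors(docs):
    """Map each token list to a sparse {term: weight} TF-IDF vector.

    Uses a smoothed IDF, log((1 + n) / (1 + df)) + 1 (an assumed variant).
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_duplicates(files, threshold=0.95):
    """Return (name_a, name_b, similarity) for every pair at or above threshold.

    `files` maps a file name to its source text. All pairs are compared,
    which is quadratic; a KNN index would avoid that on large projects.
    """
    names = sorted(files)
    vectors = tfidf_vectors([tokenize_source(files[name]) for name in names])
    pairs = []
    for (i, name_a), (j, name_b) in combinations(enumerate(names), 2):
        similarity = cosine(vectors[i], vectors[j])
        if similarity >= threshold:
            pairs.append((name_a, name_b, similarity))
    return pairs


if __name__ == "__main__":
    example = {
        "a.py": "def add(x, y):\n    return x + y\n",
        "b.py": "# copy\ndef add(x, y):\n    return x + y\n",
        "c.py": "import os\nprint(os.getcwd())\n",
    }
    for name_a, name_b, similarity in find_duplicates(example):
        print(name_a, name_b, round(similarity, 3))
```

Because comments are dropped during tokenization, `a.py` and `b.py` in the example produce identical token lists and are reported as a pair, while `c.py` shares only parentheses with them and falls well below the threshold.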