A tool for cleaning and preprocessing plaintext files: it removes characters that are not numeric, alphabetic, or punctuation, and can optionally collapse repeated punctuation.
nlp
sanitization
machine-learning
natural-language-processing
automation
regex
data-transformation
data-analysis
mit-license
command-line-tool
text-processing
data-preprocessing
regular-expressions
plaintext
data-cleaning
numeric-data
file-manipulation
punctuation-handling
machine-learing-preprocessing
Updated Jan 31, 2024 - Rust
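
The cleaning behavior described above (drop characters that are not letters, digits, or punctuation, then optionally collapse repeated punctuation) could be sketched roughly as follows. This is an illustration only, not the repository's actual code: the function name `clean_text`, the `collapse_punctuation` flag, the ASCII-only character classes, and the decision to keep whitespace are all assumptions made for the example.

```rust
/// Hypothetical helper (not the tool's real API): keeps ASCII letters,
/// digits, punctuation, and whitespace, and drops everything else.
/// When `collapse_punctuation` is true, a punctuation character that
/// repeats the last kept character is skipped, so "!!!" becomes "!".
fn clean_text(input: &str, collapse_punctuation: bool) -> String {
    let mut out = String::with_capacity(input.len());
    // Last character that was actually written to `out`.
    let mut prev: Option<char> = None;

    for c in input.chars() {
        // Assumption: whitespace is preserved so words stay separated.
        let keep = c.is_ascii_alphanumeric()
            || c.is_ascii_punctuation()
            || c.is_ascii_whitespace();
        if !keep {
            continue;
        }

        // Skip a punctuation mark identical to the previously kept character.
        if collapse_punctuation && c.is_ascii_punctuation() && prev == Some(c) {
            continue;
        }

        out.push(c);
        prev = Some(c);
    }

    out
}
```

Under these assumptions, `clean_text("Hello!!! Wörld... ok??", true)` returns `"Hello! Wrld. ok?"`: the non-ASCII `ö` is dropped and each run of identical punctuation is reduced to one character. A real tool might instead use the `regex` crate or Unicode-aware character classes; the standard-library version is used here only to keep the sketch dependency-free.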