Please note: this project is still experimental.
A library for detecting, parsing, and removing personally identifiable information from strings and objects.
This repository contains the datasets and the npm module for pii-filter.
A showcase example of this library is the pii-filter web-extension for browsers.
We hope that this software can be useful in some of the following scenarios:
- privacy, security, fraud-detection, and data-auditing
- anonymizing data for research, marketing and machine learning
- accessibility and online guidance
- word tagging and word spotting
- designing chatbots
pii-filter currently supports the following languages and PII:
- Dutch
  - First Names
  - Family Names
  - Pet Names
  - Medicine Names
  - Phone Numbers
  - Email Addresses
  - Dates
More information can be found in dataset/README.md.
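
To make "detecting and removing PII from strings" concrete, here is a minimal, self-contained sketch in TypeScript. It does not use or reproduce the pii-filter API: the names `PiiMatch`, `detectPii`, and `removePii` are hypothetical, and the two regular expressions are deliberately simplistic stand-ins for the dataset-driven detection this library aims to provide. It only covers two of the PII types listed above (email addresses and Dutch-style phone numbers).

```typescript
// Hypothetical illustration only: none of these names belong to the pii-filter API.
interface PiiMatch {
  kind: "email_address" | "phone_number";
  value: string;
  start: number; // index of the first matched character
  end: number;   // index one past the last matched character
}

// Assumed, simplified patterns; real detection needs far more care than this.
const PATTERNS: Array<[PiiMatch["kind"], RegExp]> = [
  ["email_address", /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/],
  // Ten-digit Dutch number starting with 0, or the same number written with +31.
  ["phone_number", /(?:\+31|0)[ -]?\d(?:[ -]?\d){8}/],
];

// Collect every match of every pattern, ordered by position in the text.
function detectPii(text: string): PiiMatch[] {
  const matches: PiiMatch[] = [];
  for (const [kind, pattern] of PATTERNS) {
    const re = new RegExp(pattern.source, "g"); // fresh global regex per call
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      matches.push({ kind, value: m[0], start: m.index, end: m.index + m[0].length });
    }
  }
  return matches.sort((a, b) => a.start - b.start);
}

// Replace each detected span with a placeholder naming its kind.
function removePii(text: string): string {
  let result = "";
  let cursor = 0;
  for (const match of detectPii(text)) {
    if (match.start < cursor) continue; // skip matches overlapping an earlier one
    result += text.slice(cursor, match.start) + `{${match.kind}}`;
    cursor = match.end;
  }
  return result + text.slice(cursor);
}

console.log(removePii("Mail jan@example.com of bel 0612345678."));
// prints "Mail {email_address} of bel {phone_number}."
```

Returning start and end offsets alongside each match keeps detection separate from removal, so the same matches could be used for tagging or highlighting instead of replacement.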
Planned additions:
- adding support for English and other languages
- detecting more PII
  - bank numbers and credit card numbers
  - passport / ID numbers
  - social security numbers
- detecting PII in images
  - providing coordinates which can be used for blurring or erasing PII