Support for custom regular expressions #115

Closed
vaidik opened this issue Dec 10, 2020 · 1 comment
Labels
enhancement New feature or request

Comments

@vaidik

vaidik commented Dec 10, 2020

Is there a way to add custom regular expressions? If not, supporting them through configuration or a separate JSON file would be very useful and would make piicatcher useful to a wider audience.

@vrajat
Member

vrajat commented Dec 10, 2020

Not yet. However, this is a good feature addition. Right now, there is a RegexScanner that uses CommonRegex for a list of regular expressions and maps each one to a PIIType.

In a custom scanner, the list can be picked up from a JSON file. Each item in the list should consist of the regular expression and the PIIType it maps to.

Does this sound reasonable? Are you open to contributing this feature?
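
For illustration, the JSON-driven scanner described above could be sketched roughly as follows. This is not piicatcher's actual API: the JSON keys `regex` and `pii_type`, the functions `load_patterns` and `scan`, and the sample patterns are all assumptions made for this example, and PII types are represented as plain strings.

```python
import json
import re
from typing import List, Optional, Pattern, Tuple

# Hypothetical config: a JSON list where each item pairs a regular
# expression with the name of the PII type it maps to.
CONFIG = """
[
  {"regex": "EMP-[0-9]{6}", "pii_type": "EmployeeId"},
  {"regex": "[0-9]{3}-[0-9]{2}-[0-9]{4}", "pii_type": "SSN"}
]
"""


def load_patterns(json_text: str) -> List[Tuple[Pattern[str], str]]:
    """Compile every regex in the JSON list, keeping its PII type name."""
    items = json.loads(json_text)
    return [(re.compile(item["regex"]), item["pii_type"]) for item in items]


def scan(value: str, patterns: List[Tuple[Pattern[str], str]]) -> Optional[str]:
    """Return the PII type of the first pattern found in value, else None."""
    for pattern, pii_type in patterns:
        if pattern.search(value):
            return pii_type
    return None


patterns = load_patterns(CONFIG)
print(scan("badge EMP-123456", patterns))
```

Patterns are tried in list order, so more specific expressions should come first in the file.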

@vrajat vrajat added the enhancement New feature or request label Dec 17, 2020
vrajat added a commit to vrajat/dbcat that referenced this issue Dec 13, 2021
PIIType as enum cannot be extended. Change PIIType to a base class that
can be inherited to create new PII types. Consequently, do not create an
enum in Postgres. Serialize and deserialize to JSON and store in the
database using Pydantic.

Related to tokern/piicatcher#115
vrajat added a commit to tokern/dbcat that referenced this issue Dec 13, 2021
PIIType as enum cannot be extended. Change PIIType to a base class that
can be inherited to create new PII types. Consequently, do not create an
enum in Postgres. Serialize and deserialize to JSON and store in the
database using Pydantic.

Related to tokern/piicatcher#115
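
As a rough sketch of the idea in the commit message above (an extensible base class instead of an enum, stored as JSON), something like the following could work. It uses the standard library `json` module as a stand-in for Pydantic, and the names `PiiType`, `Email`, `EmployeeId`, `to_json`, and `from_json` are illustrative assumptions, not dbcat's actual classes.

```python
import json


class PiiType:
    """Base class; each subclass is one PII type (hypothetical design)."""

    name = "PiiType"

    def to_json(self) -> str:
        # Store only the type name; this string is what would go in the database.
        return json.dumps({"name": self.name})


class Email(PiiType):
    name = "Email"


class EmployeeId(PiiType):
    # A user-defined type; an enum could not be extended this way.
    name = "EmployeeId"


def from_json(text: str) -> PiiType:
    """Rebuild a PiiType instance by looking up its name among direct subclasses."""
    known = {cls.name: cls for cls in PiiType.__subclasses__()}
    return known[json.loads(text)["name"]]()


restored = from_json(EmployeeId().to_json())
print(type(restored).__name__)
```

Because the stored value is a JSON string rather than a database enum, adding a new subclass would not require a schema change under this design.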
vrajat added a commit to vrajat/piicatcher that referenced this issue Dec 17, 2021
Detectors detect PII. Change PII Type to class hierarchy instead of
enums. With this change new PII types can be defined.

Support adding plugins using entry points for new detectors. Remove
spacy detector and convert it into plugin hosted in another repository.

Fix tokern#115
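
A minimal sketch of how detector plugins might be discovered through entry points, using the standard library's `importlib.metadata`. The group name `example_pii_detectors` and the function `load_detectors` are hypothetical and chosen for this example; the group that piicatcher actually uses is not stated in this thread.

```python
from importlib.metadata import entry_points

# Hypothetical entry point group; a plugin package would register its
# detector class under the same group name in its packaging metadata.
DETECTOR_GROUP = "example_pii_detectors"


def load_detectors(group: str = DETECTOR_GROUP) -> dict:
    """Return {entry point name: loaded object} for every plugin in the group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions expose a selectable collection.
        selected = eps.select(group=group)
    else:
        # Older versions return a dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.load() for ep in selected}


detectors = load_detectors()
print(sorted(detectors))
```

With no plugins installed under that group, the function returns an empty dict, so the host application can run with only its built-in detectors.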
@vrajat vrajat closed this as completed in 0a82c66 Dec 17, 2021