This library cleans fields across a large set of documents, using configured paths and clean functions. A document can be JSON, BSON (from gopkg.in/mgo.v2/bson) or, more generally, any map[string]interface{}.
Some interesting use cases:
- anonymize some fields (password, username, credit cards...)
- fix some typos on a set of documents
- update the dates in a data set for testing purposes
- finally, all these changes can be applied during an export from a document-oriented database such as MongoDB or Elasticsearch
To associate an element of a document with a clean method, we need a configuration file (TOML format):
method="constantCleaner"
args=[]
["node2.leaf2"]
method="constantCleaner"
args=[]
["node2.leaf4"]
method="constantCleaner"
args=[]
["node3.node31.node311.node3111.leaf32"]
method="constantCleaner"
args=[]
["leaf4"]
method="constantCleaner"
args=[]
["node5.node51.node511.leaf5"]
method="constantCleaner"
args=[]
For each X.Y.Z path, these rules are applied:
- X, Y and Z can be any field names
- if X is an array, the library looks for the Y field in each element of the array
- the clean method is applied to Z with the given args
- in the example above, only the constantCleaner method is used, but the configuration can map different paths to different clean methods
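The path rules above can be sketched as a small recursive walk. This is only an illustration of the rules, not the library's actual implementation: the names `cleanFunc`, `applyPath` and `walk` are hypothetical, and the sketch handles only plain `map[string]interface{}` and `[]interface{}` values.

```go
package main

import (
	"fmt"
	"strings"
)

// cleanFunc is a hypothetical stand-in for a clean method.
type cleanFunc func(value interface{}) (interface{}, error)

// applyPath walks node along a dot-separated path such as "X.Y.Z"
// and replaces the value found at the last key with the result of clean.
func applyPath(node interface{}, path string, clean cleanFunc) error {
	return walk(node, strings.Split(path, "."), clean)
}

func walk(node interface{}, keys []string, clean cleanFunc) error {
	switch n := node.(type) {
	case []interface{}:
		// An array is transparent: look for the same key on each element.
		for _, elem := range n {
			if err := walk(elem, keys, clean); err != nil {
				return err
			}
		}
	case map[string]interface{}:
		child, ok := n[keys[0]]
		if !ok {
			return nil // path absent in this branch: nothing to clean
		}
		if len(keys) == 1 {
			// Last key of the path: apply the clean method in place.
			cleaned, err := clean(child)
			if err != nil {
				return fmt.Errorf("cleaning %q: %w", keys[0], err)
			}
			n[keys[0]] = cleaned
			return nil
		}
		return walk(child, keys[1:], clean)
	}
	return nil // scalars cannot be descended into
}
```

In this sketch a missing path is silently skipped, and an array found at the last key is passed to the clean function as a whole; the library may choose differently on both points.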
// define a struct type
type constantValueCleaner struct{}

// this type must implement the Clean method
func (c *constantValueCleaner) Clean(value interface{}) (changed interface{}, err error) {
	// ignore the input and always return the same constant
	return 1234, nil
}
This Clean method returns the constant 1234 whatever value it is given.
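A cleaner can also depend on its input. The sketch below shows a hypothetical `maskCleaner` that hides string values, which would suit the anonymization use case. The local `valueCleaner` interface is an assumption: it simply mirrors the Clean signature shown above and is presumed, not confirmed, to match `doccleaner.ValueCleaner`.

```go
package main

import (
	"fmt"
	"strings"
)

// valueCleaner is a local stand-in mirroring the Clean signature;
// it is assumed, not confirmed, to match doccleaner.ValueCleaner.
type valueCleaner interface {
	Clean(value interface{}) (changed interface{}, err error)
}

// maskCleaner is a hypothetical cleaner that replaces every character
// of a string value with an asterisk.
type maskCleaner struct{}

func (m *maskCleaner) Clean(value interface{}) (interface{}, error) {
	s, ok := value.(string)
	if !ok {
		return nil, fmt.Errorf("maskCleaner: expected string, got %T", value)
	}
	// Count runes rather than bytes so multi-byte characters yield one asterisk each.
	return strings.Repeat("*", len([]rune(s))), nil
}

// Compile-time check that maskCleaner satisfies the interface.
var _ valueCleaner = (*maskCleaner)(nil)
```

Returning an error for non-string input is a design choice of this sketch; leaving such values unchanged would be an equally reasonable alternative.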
// init
propertiesReader := ... // properties reader (from string, file, ...)
// map the cleaner names used in the properties file to real cleaner instances
cleaners := make(map[string]doccleaner.ValueCleaner)
cleaners["constantCleaner"] = &constantValueCleaner{}
// initialize the JSON cleaner
jsonCleaner := doccleaner.NewDocCleaner(propertiesReader, cleaners)
// call: clean an unmarshalled JSON object
jsonCleaner.Clean(objJson)
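For reference, an unmarshalled JSON object such as `objJson` can be obtained with the standard `encoding/json` package, which decodes objects into `map[string]interface{}` and arrays into `[]interface{}`. The helper names `decodeDocument` and `encodeDocument` below are hypothetical, and the call to the cleaner itself is left out so that the sketch stays self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeDocument turns raw JSON into the generic map form
// that a document cleaner can traverse.
func decodeDocument(raw []byte) (map[string]interface{}, error) {
	var doc map[string]interface{}
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, fmt.Errorf("decoding document: %w", err)
	}
	return doc, nil
}

// encodeDocument serializes a (cleaned) document back to JSON.
func encodeDocument(doc map[string]interface{}) ([]byte, error) {
	return json.Marshal(doc)
}
```

A document decoded this way would be cleaned between the two calls, then written back out with `encodeDocument`.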
More examples in node_test.go.