go library for cleaning documents (json, bson, map[string]interface{} ...)

garnaud/doccleaner

doccleaner

goal

This library cleans fields across a large set of documents, given a list of paths and the clean functions to apply to them. A document can be json, bson (from gopkg.in/mgo.v2/bson) or, more generally, any map[string]interface{}.

use cases

Some interesting use cases:

  • anonymize some fields (passwords, usernames, credit card numbers...)
  • fix typos across a set of documents
  • update the dates of a data set for testing purposes
  • finally, all of these changes can be applied while exporting from a document-oriented database such as MongoDB or Elasticsearch

howto

configuration

To associate an element of a document with a clean method, we need a configuration file (toml format):

method="constantCleaner"
args=[]
["node2.leaf2"]
method="constantCleaner"
args=[]
["node2.leaf4"]
method="constantCleaner"
args=[]
["node3.node31.node311.node3111.leaf32"]
method="constantCleaner"
args=[]
["leaf4"]
method="constantCleaner"
args=[]
["node5.node51.node511.leaf5"]
method="constantCleaner"
args=[]

For each X.Y.Z path, these rules are applied:

  • X, Y and Z can be any field names
  • if X is an array, the library looks for the Y field in each element of the array
  • the clean method is applied to Z with the given args
  • in the example above, only the constantCleaner method is used, but the properties can reference different clean methods
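The path rules above can be sketched on a plain map[string]interface{}. This is only an illustration of the traversal idea, not the library's actual implementation: applyPath and cleanFunc are hypothetical names, and args are left out for brevity.

```go
package main

import (
	"fmt"
	"strings"
)

// cleanFunc is a hypothetical stand-in for a clean method (args omitted).
type cleanFunc func(value interface{}) (interface{}, error)

// applyPath follows path through node and replaces the last field with the
// result of clean. When an array is met on the way, every element is visited.
// Missing fields and non-container values are silently skipped.
func applyPath(node interface{}, path []string, clean cleanFunc) error {
	switch n := node.(type) {
	case []interface{}:
		for _, elem := range n {
			if err := applyPath(elem, path, clean); err != nil {
				return err
			}
		}
	case map[string]interface{}:
		if len(path) == 0 {
			return nil
		}
		child, ok := n[path[0]]
		if !ok {
			return nil
		}
		if len(path) == 1 {
			changed, err := clean(child)
			if err != nil {
				return err
			}
			n[path[0]] = changed
			return nil
		}
		return applyPath(child, path[1:], clean)
	}
	return nil
}

func main() {
	doc := map[string]interface{}{
		"node2": []interface{}{
			map[string]interface{}{"leaf2": "a"},
			map[string]interface{}{"leaf2": "b"},
		},
		"leaf4": "c",
	}
	constant := func(interface{}) (interface{}, error) { return 1234, nil }
	for _, p := range []string{"node2.leaf2", "leaf4"} {
		if err := applyPath(doc, strings.Split(p, "."), constant); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println(doc)
}
```

Here node2 is an array, so leaf2 is cleaned in each of its elements, while leaf4 is cleaned directly at the root.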

clean function

// define a struct type
type constantValueCleaner struct{}

// this type must implement the Clean method
func (c *constantValueCleaner) Clean(value interface{}) (changed interface{}, err error) {
  // ignore the input value and always return the same constant
  return 1234, nil
}

This clean method ignores its input and returns the constant 1234 for any given value.
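For the anonymization use case, a cleaner can also inspect the value it receives. The sketch below is a hypothetical example, not part of the library: maskCleaner and its keep field are invented for illustration, and the local ValueCleaner interface is assumed to mirror doccleaner.ValueCleaner based on the Clean signature shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// ValueCleaner mirrors the Clean signature shown above. In real code this
// would be doccleaner.ValueCleaner (assumed to declare the same method).
type ValueCleaner interface {
	Clean(value interface{}) (changed interface{}, err error)
}

// maskCleaner is a hypothetical cleaner that hides every character of a
// string value except the last `keep` ones.
type maskCleaner struct {
	keep int
}

// Clean returns the masked string, or an error if value is not a string.
func (m *maskCleaner) Clean(value interface{}) (interface{}, error) {
	s, ok := value.(string)
	if !ok {
		return nil, fmt.Errorf("maskCleaner: expected string, got %T", value)
	}
	r := []rune(s)
	if len(r) <= m.keep {
		// Value is too short to reveal anything: mask it entirely.
		return strings.Repeat("*", len(r)), nil
	}
	hidden := len(r) - m.keep
	return strings.Repeat("*", hidden) + string(r[hidden:]), nil
}

func main() {
	var c ValueCleaner = &maskCleaner{keep: 4}
	masked, err := c.Clean("4111111111111111")
	fmt.Println(masked, err)
}
```

Returning an error for non-string values is a design choice of this sketch; a cleaner could just as well return the value unchanged.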

all together

// init
propertiesReader := ... // properties reader (from string, file, ...)
cleaners := make(map[string]doccleaner.ValueCleaner) // maps the cleaner names used in the properties file to real cleaner instances
cleaners["constantCleaner"] = &constantValueCleaner{} // fill the cleaners map
jsonCleaner := doccleaner.NewDocCleaner(propertiesReader, cleaners) // initialize the json cleaner

// call
jsonCleaner.Clean(objJson) // clean an unmarshalled json object
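An unmarshalled json object such as objJson can be obtained with the standard encoding/json package. The sketch below only shows the decode and encode steps around the cleaning call: decodeDocument is a hypothetical helper, and the line that overwrites leaf4 is a stand-in for what jsonCleaner.Clean would do with the constantCleaner configuration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeDocument unmarshals raw json into the generic map form.
// Nested objects become map[string]interface{} and arrays []interface{}.
func decodeDocument(raw []byte) (map[string]interface{}, error) {
	var doc map[string]interface{}
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, err
	}
	return doc, nil
}

func main() {
	raw := []byte(`{"leaf4":"secret","node2":[{"leaf2":"a"}]}`)
	doc, err := decodeDocument(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}

	// The cleaner call from the snippet above would go here.
	// Stand-in for this sketch: overwrite one field by hand.
	doc["leaf4"] = 1234

	out, err := json.Marshal(doc)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(string(out))
}
```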

More examples in node_test.go.
