Besides the Java API, Ranger also supports configuring generators through YAML configuration. A configuration can be parsed into an ObjectGenerator,
which can then be used directly or built upon further with the Java API. The following code block shows how a configuration can be loaded from Java.
InputStream inputStream = ...;
String jsonPath = "$.parent";
Map<String, Object> config = (Map<String, Object>) YamlUtils.load(inputStream, jsonPath);
ObjectGenerator<Map<String, Object>> generator = new ConfigurationParser(config).build();
This constructs an ObjectGenerator from the input source and uses the parent property as root.
You can use YamlUtils or any other means to parse the YAML file and select the desired node as root. Currently, YamlUtils only parses JSON paths in the format $.a.b.c; nothing else is supported. The YAML must have the following structure:
firstLevel:
  secondLevel:
    configRoot:
      values:
        firstName: random(['Stephen', "Richard", 'Arnold'])
        addr:
          city: random(["New York", "London"])
          street: 2nd St
          houseNumber: random(1..20)
        user:
          name: $firstName
          address: $addr
      output: $user
In this case configRoot is the root element. The configuration must have two elements below it: values, where all the values are defined, and output, which is the return value of the constructed ObjectGenerator. Any other element below the root element is ignored.
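The root selection described earlier (a path of the form $.a.b.c) can be pictured as a walk through nested maps. The following is a minimal sketch of that idea in plain Java, assuming the YAML has already been parsed into nested Map instances. The class DottedPathSketch and its select method are hypothetical names used for illustration; this is not the YamlUtils implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Ranger.
final class DottedPathSketch {

    // Walks a nested map along a path such as "$.firstLevel.secondLevel.configRoot".
    @SuppressWarnings("unchecked")
    static Map<String, Object> select(Map<String, Object> document, String path) {
        if (!path.startsWith("$")) {
            throw new IllegalArgumentException("Path must start with '$': " + path);
        }
        String rest = path.substring(1);
        if (rest.isEmpty()) {
            return document; // "$" alone selects the whole document
        }
        if (!rest.startsWith(".")) {
            throw new IllegalArgumentException("Only the $.a.b.c format is handled: " + path);
        }
        Map<String, Object> current = document;
        for (String key : rest.substring(1).split("\\.")) {
            Object next = current.get(key);
            if (!(next instanceof Map)) {
                throw new IllegalArgumentException("No mapping at '" + key + "' in path " + path);
            }
            current = (Map<String, Object>) next;
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> configRoot = new LinkedHashMap<>();
        configRoot.put("output", "$user");
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("parent", configRoot);
        System.out.println(select(document, "$.parent"));
    }
}
```

The map returned by such a selection is what would be handed to the configuration parser as the root element.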
A value is defined as it normally would be in a YAML file.
values:
  name: Patrick
A value can be a primitive, a reference or an expression.
A primitive value can be of any primitive type (boolean, byte, short, integer, long, float, double, string and date). The following example shows how each type is written.
values:
  booleanTrueVal1: true
  booleanTrueVal2: True
  booleanFalseVal1: false
  booleanFalseVal2: False
  byteVal: byte(23)
  shortVal: short(-832)
  implicitIntegerVal: 3242
  explicitIntegerVal: int(3221)
  implicitLongVal: 332848429842932
  explicitLongVal: long(323)
  explicitFloatVal: float(-88.64)
  implicitDoubleVal: 32.23
  explicitDoubleVal: double(-0.11)
  stringVal1: some text
  stringVal2: 'some text'
  stringVal3: "some text"
  dateVal: 2017-06-21
A defined value can be referenced elsewhere using the '$' sign; anywhere you can define a value, you can also use a value reference.
Value references honor local scope, and nested values can be reached with the '.' dereference operator.
values:
  randomNames: random(["Peter", "Patrick", "Nick"])
  a:
    b:
      c: random(1..10)
  text:
    firstLine: "Global first line"
  user:
    text:
      firstLine: "User first line"
    innerUser:
      firstName: $randomNames
      num: $a.b.c
      text: $text.firstLine
output: $user.innerUser
In this case the text field has the value 'User first line' because of local scope and shadowing.
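One way to picture local scope and shadowing is to look a reference up in the innermost enclosing scope first and then move outward. The sketch below does this over nested maps in plain Java. It is a simplified assumption about the lookup order, not Ranger's parser; ScopedReferenceSketch, resolve and descend are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Ranger.
final class ScopedReferenceSketch {

    // Resolves a reference such as "text.firstLine" as seen from scopePath,
    // for example ["user", "innerUser"], trying the innermost scope first.
    static Object resolve(Map<String, Object> values, List<String> scopePath, String reference) {
        List<String> keys = Arrays.asList(reference.split("\\."));
        for (int depth = scopePath.size(); depth >= 0; depth--) {
            Object scope = descend(values, scopePath.subList(0, depth));
            Object found = descend(scope, keys);
            if (found != null) {
                return found; // a match in an inner scope shadows outer ones
            }
        }
        throw new IllegalArgumentException("Unresolved reference: $" + reference);
    }

    // Follows the keys through nested maps; returns null if any step is missing.
    private static Object descend(Object node, List<String> keys) {
        for (String key : keys) {
            if (!(node instanceof Map)) {
                return null;
            }
            node = ((Map<?, ?>) node).get(key);
        }
        return node;
    }
}
```

With the configuration above, resolving text.firstLine from the scope user.innerUser finds user.text.firstLine before the global text.firstLine, which matches the shadowing behavior described.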
A value can be an expression.
values:
  age: random(7..77)
output: $age
The YAML configuration has an analogous expression for every helper method defined in the Java API.
The random expression has two meanings depending on its arguments.
Given a list, random generates a random value from the list of possible values. An optional distribution can be set; the default distribution is UniformDistribution. The elements of the list can be of any type and need not all be of the same type, although mixing types within one list is probably rarely needed.
There are two parameter variations:
values:
  name: random(["Mike", "Peter", "Adam", "Mathew"])
  name: random(["Mike", "Peter", "Adam", "Mathew"], uniform())
output: $name
Either variation creates an ObjectGenerator that can generate a sequence such as:
"Peter", "Peter", "Mathew", "Adam", "Mathew", "Peter", "Mike", "Mike", "Adam", ...
Given a range, random generates a random value within the specified range. Optional useEdgeCases and distribution arguments can be set. The default value for useEdgeCases is false and the default distribution is UniformDistribution.
There are several parameter variations:
values:
  age: random(1..100)
  age: random(1..100, false)
  age: random(int(1)..int(100), false, uniform())
output: $age
Any of these variations creates an ObjectGenerator that can generate a sequence such as:
1, 36, 17, 87, 43, 55, 91, 83, 2, 21, 76
If useEdgeCases is set to true, the first two generated values are the edge cases: the beginning of the range, followed by the last value before the end, since the range is half-open [a, b) and includes only its beginning.
values:
  age: random(short(1)..short(100), true)
output: $age
This configuration generates a sequence whose first two elements are always 1 and 99, since those are the edge cases. After that, random values are picked.
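The edge-case behavior can be sketched in a few lines of plain Java: emit the beginning of the range, then the last value before the end, then fall back to random values from [begin, end). This is only an illustration under the assumption of a long range and a uniform distribution; EdgeCaseRangeSketch is a hypothetical class, not Ranger's range generator.

```java
import java.util.Random;

// Hypothetical illustration class; not part of Ranger.
final class EdgeCaseRangeSketch {
    private final long begin;   // inclusive
    private final long end;     // exclusive
    private final Random random;
    private int edgeCasesEmitted = 0;

    EdgeCaseRangeSketch(long begin, long end, Random random) {
        if (end <= begin) {
            throw new IllegalArgumentException("Range must contain at least one value");
        }
        this.begin = begin;
        this.end = end;
        this.random = random;
    }

    long next() {
        if (edgeCasesEmitted < 2) {
            // First the beginning of the range, then the last value before the end.
            return edgeCasesEmitted++ == 0 ? begin : end - 1;
        }
        // Afterwards a uniformly chosen value from [begin, end).
        return begin + (long) (random.nextDouble() * (end - begin));
    }

    public static void main(String[] args) {
        EdgeCaseRangeSketch age = new EdgeCaseRangeSketch(1, 100, new Random());
        for (int i = 0; i < 5; i++) {
            System.out.println(age.next());
        }
    }
}
```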
Currently only two distributions are supported: uniform and normal.
The uniform distribution is used by stating uniform().
values:
  age: random([1, 5, 17, 18, 20], uniform())
output: $age
The normal distribution can be used in two ways: normal(), where the defaults are mean=0.5, standardDeviation=0.125, lowerBound=0 and upperBound=1, and normal(mean, standardDeviation, lowerBound, upperBound).
values:
  age:
    age1: random(byte(1)..byte(100), true, normal())
    age2: random(double(1)..double(100), false, normal(0, 1, -4, 4))
output: $age
The circular expression has two meanings depending on its arguments.
Given a list, circular generates the values in the order they are specified until the end, then starts again from the beginning.
values:
  serverIpAddress: circular(["10.10.0.1", "10.10.0.2", "10.10.0.3", "10.10.0.4"])
output: $serverIpAddress
This creates an ObjectGenerator that generates the following sequence:
"10.10.0.1", "10.10.0.2", "10.10.0.3", "10.10.0.4", "10.10.0.1", "10.10.0.2", "10.10.0.3", "10.10.0.4", "10.10.0.1", ...
Given a range and a step, circular generates values from the beginning of the range to the end using the step as increment. When the end is reached, values are generated again from the beginning. Currently supports long and double ranges.
values:
  temperature: circular(float(12.0)..float(25.0), float(0.2))
  dayOfYear: circular(1..365, 1)
The first generator generates the following sequence:
12.0, 12.2, 12.4, 12.6, 12.8, 13.0, ..., 24.6, 24.8, 25.0, 12.0, 12.2, ...
And the second generates:
1, 2, 3, 4, 5, 6, ..., 363, 364, 365, 1, 2, 3, ...
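Both circular forms boil down to wrapping around once the end is reached. The following plain Java sketch illustrates the two cases, a list and a long range with a step; following the dayOfYear example, it treats the end of the range as included. CircularSketch and its nested classes are hypothetical names for illustration, not Ranger classes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of Ranger.
final class CircularSketch {

    // Cycles through a fixed list of values in order.
    static final class CircularList<T> {
        private final List<T> values;
        private int index = 0;

        CircularList(List<T> values) {
            if (values.isEmpty()) {
                throw new IllegalArgumentException("At least one value is required");
            }
            this.values = values;
        }

        T next() {
            T value = values.get(index);
            index = (index + 1) % values.size(); // wrap around after the last element
            return value;
        }
    }

    // Counts from begin to end (inclusive here, as in the dayOfYear example) by step.
    static final class CircularLongRange {
        private final long begin;
        private final long end;
        private final long step;
        private long current;

        CircularLongRange(long begin, long end, long step) {
            if (step <= 0 || end < begin) {
                throw new IllegalArgumentException("Expected begin <= end and step > 0");
            }
            this.begin = begin;
            this.end = end;
            this.step = step;
            this.current = begin;
        }

        long next() {
            long value = current;
            current += step;
            if (current > end) {
                current = begin; // start again from the beginning
            }
            return value;
        }
    }

    public static void main(String[] args) {
        CircularList<String> ips = new CircularList<>(Arrays.asList("10.10.0.1", "10.10.0.2"));
        CircularLongRange dayOfYear = new CircularLongRange(1, 365, 1);
        for (int i = 0; i < 3; i++) {
            System.out.println(ips.next() + " " + dayOfYear.next());
        }
    }
}
```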
The list expression generates a list out of the specified values.
values:
  names: list(["Ema", circular(["Mike", "Steve", "John"]), "Ned", circular(["Jessica", "Lisa"])])
This creates an ObjectGenerator that generates the following sequence:
["Ema", "Mike", "Ned", "Jessica"]
["Ema", "Steve", "Ned", "Lisa"]
["Ema", "John", "Ned", "Jessica"]
["Ema", "Mike", "Ned", "Lisa"]
.
.
.
Called without arguments, list() generates an empty list.
values:
  names: list()
The emptyMap expression generates an empty map.
values:
  names: emptyMap()
This is useful when generating JSON with an empty object. For example:
values:
  user:
    username: peter
    email: email_address@domain.com
    address: emptyMap()
  jsonUser: json($user)
This will generate:
{"username":"peter","email":"email_address@domain.com","address":{}}
Given a minimum size, a maximum size and an ObjectGenerator, list generates a list of random length out of the values of the specified ObjectGenerator.
There are two parameter variations:
values:
  r: random(10..100)
  names: list(3, 5, $r)
  names: list(3, 5, $r, uniform())
This creates an ObjectGenerator that generates lists with a minimum of 3 and a maximum of 5 members:
[16, 36, 44]
[92, 96, 33, 25]
[12, 14, 54]
[78, 79, 35, 88, 96]
.
.
.
Note that the object generator used within the random list generator (in this case r) should not be referenced by any other object generator in the hierarchy, since its values are reset and regenerated multiple times while the list is constructed. Referencing it anywhere else would result in different values across the value hierarchy.
The weighted expression generates values with probability based on their weights.
values:
  names: weighted([("Stephen", 11.5), ("George", 50), ("Charles", 38.5)])
This creates an ObjectGenerator that can generate a sequence such as:
"Stephen", "George", "Charles", "George", "Charles", "George", "George", "Stephen", "Charles", ...
Here the probability of "George" is 50%, of "Charles" 38.5% and of "Stephen" 11.5%. The weights do not need to sum to 100; this example does so only to make the probabilities easy to calculate.
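A common way to implement weighted selection is to lay the weights end to end and see where a random roll lands. The sketch below shows that technique in plain Java with cumulative weights; it is an illustration of the idea, not Ranger's weighted distribution, and WeightedSketch with its add, pick and next methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class; not part of Ranger.
final class WeightedSketch<T> {
    private final List<T> values = new ArrayList<>();
    private final List<Double> cumulative = new ArrayList<>();
    private double total = 0;

    WeightedSketch<T> add(T value, double weight) {
        if (weight <= 0) {
            throw new IllegalArgumentException("Weight must be positive");
        }
        total += weight;
        values.add(value);
        cumulative.add(total); // upper bound of this value's band
        return this;
    }

    // Maps a roll in [0, total) to the value whose band contains it.
    T pick(double roll) {
        for (int i = 0; i < values.size(); i++) {
            if (roll < cumulative.get(i)) {
                return values.get(i);
            }
        }
        throw new IllegalArgumentException("Roll out of range: " + roll);
    }

    T next(Random random) {
        return pick(random.nextDouble() * total);
    }

    public static void main(String[] args) {
        WeightedSketch<String> names = new WeightedSketch<String>()
                .add("Stephen", 11.5).add("George", 50).add("Charles", 38.5);
        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(names.next(random));
        }
    }
}
```

Because the roll is scaled by the sum of the weights, the weights need not add up to 100, as noted above.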
A weighted distribution is great for some use cases, but there are times when you need a precision that it cannot give you, especially when working with small numbers of generated values (< 1 000 000). The exactly expression provides an exact weighted distribution, which gives you that precision at the cost of a limited number of generated objects.
values:
  names: exactly([("Stephen", 11), ("George", 50), ("Charles", 39)])
output: $names
Values are generated with the probability specified by their weight, which in this case must be of long type. If 100 elements are generated in this example, "George" is generated exactly 50 times, "Charles" 39 times and "Stephen" 11 times. If generation of more than 100 elements is attempted, an exception is thrown. If fewer than 100 elements are generated, they follow the weighted distribution. To provide this precision, the exact weighted distribution removes a value from further generation once it has reached its quota, which is why the number of generated values is limited.
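The quota mechanism described above can be sketched as follows: each value starts with a quota equal to its weight, a value is picked by weight among those that still have quota left, and a value drops out once its quota is used up. This is a plain Java illustration under those assumptions, not Ranger's implementation; ExactWeightedSketch is a hypothetical class, and the exception type it throws is chosen for the sketch only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration class; not part of Ranger.
final class ExactWeightedSketch<T> {
    private final List<T> values = new ArrayList<>();
    private final List<Long> weights = new ArrayList<>();
    private final List<Long> remaining = new ArrayList<>();
    private long activeWeight = 0; // sum of weights of values that still have quota

    ExactWeightedSketch(Map<T, Long> quotas) {
        for (Map.Entry<T, Long> entry : quotas.entrySet()) {
            if (entry.getValue() <= 0) {
                throw new IllegalArgumentException("Weight must be positive");
            }
            values.add(entry.getKey());
            weights.add(entry.getValue());
            remaining.add(entry.getValue());
            activeWeight += entry.getValue();
        }
    }

    T next(Random random) {
        if (activeWeight == 0) {
            throw new IllegalStateException("All quotas are exhausted");
        }
        long roll = (long) (random.nextDouble() * activeWeight);
        for (int i = 0; i < values.size(); i++) {
            if (remaining.get(i) == 0) {
                continue; // this value already reached its quota
            }
            if (roll < weights.get(i)) {
                remaining.set(i, remaining.get(i) - 1);
                if (remaining.get(i) == 0) {
                    activeWeight -= weights.get(i); // discard the exhausted value
                }
                return values.get(i);
            }
            roll -= weights.get(i);
        }
        throw new IllegalStateException("Unreachable: roll exceeded active weight");
    }

    public static void main(String[] args) {
        Map<String, Long> quotas = new LinkedHashMap<>();
        quotas.put("Stephen", 11L);
        quotas.put("George", 50L);
        quotas.put("Charles", 39L);
        ExactWeightedSketch<String> names = new ExactWeightedSketch<>(quotas);
        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(names.next(random));
        }
    }
}
```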
The uuid expression generates UUID strings.
values:
  id: uuid()
output: $id
A possible sequence is:
"27dbc38f-cadf-4d42-b18a-44c839e8b8f1", "575fb812-bb98-4f76-b31b-bf42e3ac2d62", "a7e229f3-875d-4a6a-9a5d-fb0670c3afdf", ...
The randomContentString expression generates a random string of the specified length with optional character ranges. If no ranges are specified, the string contains only characters from the following ranges: 'A'-'Z', 'a'-'z' and '0'-'9'. The length can be specified as a number, but also as an expression that can evaluate to a different number each time. A uniform distribution is used to select characters from the character ranges.
There are two parameter variations:
values:
  randomString1: randomContentString(5)
output: $randomString1
values:
  randomString2: randomContentString(8, ['A'..'F', '0'..'9'])
output: $randomString2
values:
  randomString3: randomContentString(random(5..10), ['A'..'Z'])
output: $randomString3
randomString1 will generate strings of length 5 with characters from the ranges 'A'-'Z', 'a'-'z' and '0'-'9'.
"Ldsfa", "3Jdf0", "AOSyu", "qr4Qe", "sf23c", "sdFfi", "320fS", ...
randomString2 will generate strings of length 8 from the specified ranges of characters.
"EF893232", "2E49D0AB", "BE129E15", "938FFC1C", "BB8A43ED", "829D1CA2", ...
randomString3 will generate strings of length from 5 to 10 with characters from the range 'A'-'Z'.
"SDFAD", "LJAOSDUF", "DJSKIEMNLS", "KEUXLANX", "DFSAW", "DFAEAN", ...
These functions return the current time:
now() returns long UTC time in milliseconds
nowDate() returns Date
nowLocalDate() returns LocalDate
nowLocalDateTime() returns LocalDateTime
These functions perform arithmetic operations on provided values:
add('byte', 4, random(1..10)) returns byte value from 5 to 14
add('short', 10, 5) returns short value 15
add('int', 10, add('int', 3, 4)) returns int value 17
add('long', long(86400000), now()) returns long current millis one day in future
add('float', float(3.2), 4.4) returns float value 7.6
add('double', 3.2, 20) returns double value 23.2
subtract('byte', 4, 2) returns byte value 2
subtract('short', 4, 2) returns short value 2
subtract('int', random(0..100), 10) returns int value from -10 to 90
subtract('long', now(), long(345600000)) returns long current millis 4 days in past
subtract('float', 1, 3.2) returns float value -2.2
subtract('double', 1000.1, 2) returns double value 998.1
multiply('byte', 4, 2) returns byte value 8
multiply('short', 1, 10) returns short value 10
multiply('int', 60, 60) returns int value 3600
multiply('long', circular([1, 2, 3]), 5) returns long values 5, 10, 15, 5, ...
multiply('float', 44, 0.05) returns float value 2.2
multiply('double', 4, 2) returns double value 8
divide('byte', 4, 2) returns byte value 2
divide('short', 4, 2) returns short value 2
divide('int', 4, 2) returns int value 2
divide('long', 4, 2) returns long value 2
divide('float', 4, 2) returns float value 2
divide('double', 4, 2) returns double value 2
It is also possible to use a CSV file as a source of data, through the csv expression, and to combine it with Ranger's other functions and values. There are three parameter variations:
values:
  csv: csv("my-csv.csv")
output: $csv
values:
  csv: csv("my-csv.csv", ',')
output: $csv
values:
  csv: csv("my-csv.csv", ',', "\\n", false, '"', '#', true, null())
output: $csv
The first two variations are shorthand with reasonable defaults; the third shows the full list of parameters. If parameters are not specified (first or second variation), default values are used. The parameters, with their default values and descriptions, are:
1. path
Default value - No default value, mandatory
Description - Path to a CSV file
2. delimiter
Default value - ','
Description - Delimiter of columns within CSV file
3. recordSeparator
Default value - "\n"
Description - Delimiter of records within CSV file
4. trim
Default value - true
Description - True if each column value is to be trimmed for leading and trailing whitespace, otherwise false
5. quote
Default value - null
Description - Character that will be stripped from the beginning and end of each column if present. If set to null, no characters are stripped (nothing is used as the quote character). If null needs to be set explicitly, that can be done with the null value function null().
6. commentMarker
Default value - '#'
Description - Character to use as a comment marker, everything after it is considered comment
7. ignoreEmptyLines
Default value - true
Description - True if empty lines are to be ignored, otherwise false
8. nullString
Default value - null
Description - Converts a string with the given value to null. If set to null, no conversion is done. If null needs to be set explicitly, that can be done with the null value function null().
For example, given a CSV file with the following content:
John,Smith,555-1331,New York,US
Peter,Braun,133-1123,Berlin,DE
# Commented line,Should, not be taken,into,account
Jose,Garcia,328-3221,Madrid,ES
and the following configuration:
values:
  csv: csv("filePath", ',', "\n", true, '"', '#', true, null())
output: string("{} {} - {} {}", get("c0", $csv), get("c1", $csv), get("c3", $csv), get("c4", $csv))
the following lines are generated:
John Smith - New York US
Peter Braun - Berlin DE
Jose Garcia - Madrid ES
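The string expression used in the output above fills each {} placeholder with the next value, in order. A minimal plain Java sketch of that substitution is shown here to make the behavior concrete. It is not Ranger's formatter; PlaceholderFormatSketch and its format method are hypothetical names, and rejecting templates with more placeholders than values is an assumption of the sketch.

```java
// Hypothetical illustration class; not part of Ranger.
final class PlaceholderFormatSketch {

    // Replaces each "{}" in the template with the next argument, left to right.
    static String format(String template, Object... args) {
        StringBuilder out = new StringBuilder();
        int from = 0;
        int argIndex = 0;
        while (true) {
            int at = template.indexOf("{}", from);
            if (at < 0) {
                break; // no more placeholders
            }
            if (argIndex >= args.length) {
                throw new IllegalArgumentException("More placeholders than values");
            }
            out.append(template, from, at).append(args[argIndex++]);
            from = at + 2; // skip past the "{}"
        }
        out.append(template.substring(from));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(format("{} {} - {} {}", "John", "Smith", "New York", "US"));
    }
}
```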
The string expression creates a formatted string using the specified format string and values.
values:
  name: random(["Peter", "Stephen", "Charles"])
  age: random(15..40)
  text: string("{} is {} years old.", $name, $age)
output: $text
Possible generated values are:
"Peter is 18 years old.", "Peter is 34 years old.", "Charles is 27 years old.", ...
The time expression transforms a long, Date, LocalDate or LocalDateTime value into a string in the given date format.
values:
  date: time("yyyy-MM-dd", random(1483228800000..1514764800000))
output: $date
Possible generated values are:
"2017-03-25", "2017-08-08", "2017-10-11", ...
values:
  date: time("yyyy-MM-dd HH:mm:ss.SSS", nowDate())
output: $date
This configuration generates string timestamps, which can be helpful in many cases.
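To see what the long values in the first example stand for, the sketch below formats epoch milliseconds with the same patterns using java.time. It assumes UTC as the time zone, which may differ from what Ranger uses; TimeFormatSketch is a hypothetical class for illustration and not Ranger's time transformer.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration class; not part of Ranger.
final class TimeFormatSketch {

    // Formats epoch milliseconds with the given pattern, interpreted in UTC (an assumption).
    static String format(String pattern, long epochMillis) {
        return DateTimeFormatter.ofPattern(pattern)
                .withZone(ZoneOffset.UTC)
                .format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        // Bounds of the range used in the first example.
        System.out.println(format("yyyy-MM-dd", 1483228800000L)); // 2017-01-01
        System.out.println(format("yyyy-MM-dd", 1514764800000L)); // 2018-01-01
        System.out.println(format("yyyy-MM-dd HH:mm:ss.SSS", System.currentTimeMillis()));
    }
}
```

In UTC the two bounds correspond to the first day of 2017 and the first day of 2018, which is why the sample values all fall within 2017.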
The json expression transforms the value of a complex ObjectGenerator into JSON.
values:
  user:
    id: circular(1..2000000, 1)
    username: string("{}{}", random(["aragorn", "johnsnow", "mike", "batman"]), random(1..100))
    firstName: random(["Peter", "Rodger", "Michael"])
    lastName: random(["Smith", "Cooper", "Stark", "Grayson", "Atkinson", "Durant"])
    maried: false
    accountBalance: random(0.0..10000.0)
    address:
      city: random(["New York", "Washington", "San Francisco"])
      street: random(["2nd St", "5th Avenue", "21st St", "Main St"])
      houseNumber: random(1..55)
output: json($user)
Possible generated values are:
{"id":1,"username":"mike1","firstName":"Michael","lastName":"Cooper","maried":false,"accountBalance":0.0,"address":{"city":"San Francisco","street":"Main St","houseNumber":1}}
{"id":2,"username":"mike99","firstName":"Rodger","lastName":"Smith","maried":false,"accountBalance":9999.99999999999,"address":{"city":"San Francisco","street":"21st St","houseNumber":54}}
{"id":3,"username":"johnsnow35","firstName":"Michael","lastName":"Atkinson","maried":false,"accountBalance":9636.00274910154,"address":{"city":"New York","street":"Main St","houseNumber":37}}
The get expression extracts a property value from a complex ObjectGenerator, which is useful for data correlation.
values:
  lightLoad:
    type: LIGHT
    value: random(1..10)
  mediumLoad:
    type: MEDIUM
    value: random(11..100)
  heavyLoad:
    type: HEAVY
    value: random(101..1000)
  randomLoad: random([$lightLoad, $mediumLoad, $heavyLoad])
  load:
    additionalField: "Some value"
    additionalField2: true
    type: get("type", $randomLoad)
    value: get("value", $randomLoad)
output: $load
Possible generated values are maps, presented here as JSON for brevity:
{ "additionalField": "Some value", "additionalField2": true, "type": "LIGHT", "value": 4 },
{ "additionalField": "Some value", "additionalField2": true, "type": "HEAVY", "value": 120 },
{ "additionalField": "Some value", "additionalField2": true, "type": "MEDIUM", "value": 94 },
{ "additionalField": "Some value", "additionalField2": true, "type": "HEAVY", "value": 823 },
...
Since the output of the YAML configuration is an ObjectGenerator, it can be further combined with the Java API if needed. However, the reverse is not possible.