
Transformation

Anton Le edited this page Nov 5, 2021 · 11 revisions

Transformation features

Now that we have learned how to extract data from Excel spreadsheets, let's look at how we can further consume and manipulate that data.

In the extraction examples, all parsed records were instances of GenericRecord, which holds the data in a denormalized form as a Map<String, Any>. What we need, however, is a strict model with checks along the way.

For example, let's look at the broken stats spreadsheet in the basic_examples file. You can see that the Club Brugge row is missing its total points.

However, this is critical information for us, so we want to check that the data is non-null during extraction and capture which rows do not comply with our model validation.

Apart from this, we want to store the data in a normalised form as a data class with statically typed fields. For example, we know that points can only be integers; anything else should not pass the validation checks!
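To make the two requirements concrete, here is a minimal sketch in plain Kotlin that does not use refinery's API. The names TeamStats, Conversion, toPoints and convert, as well as the column keys "team" and "points", are assumptions made for illustration. The raw row is modelled as a Map<String, Any?> so that a missing cell can be represented as null, and whole-number doubles are accepted as integers on the assumption that numeric cells may be read as doubles.

```kotlin
// Typed target model: both fields are non-nullable, so a valid instance always has points.
data class TeamStats(val team: String, val points: Int)

// Outcome of converting one raw row: either a typed record or the reasons it was rejected.
sealed class Conversion {
    data class Valid(val stats: TeamStats) : Conversion()
    data class Invalid(val errors: List<String>) : Conversion()
}

// Hypothetical helper (not part of refinery): accepts only whole numbers as points.
fun toPoints(value: Any?): Int? = when (value) {
    is Int -> value
    is Double -> if (value % 1.0 == 0.0) value.toInt() else null
    else -> null
}

// Checks non-nullability and types, collecting every violation found in the row.
fun convert(raw: Map<String, Any?>): Conversion {
    val errors = mutableListOf<String>()
    val team = raw["team"] as? String
    if (team == null) errors.add("team is missing or not a string")
    val points = toPoints(raw["points"])
    if (points == null) errors.add("points is missing or not an integer")
    return if (team != null && points != null) {
        Conversion.Valid(TeamStats(team, points))
    } else {
        Conversion.Invalid(errors)
    }
}
```

With this sketch, a row such as the Club Brugge one, where points is absent, comes back as Conversion.Invalid with a message naming the offending field, while a complete row becomes a TeamStats instance.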

To support both features, refinery lets you define a data class together with a custom row parser that maps the values to the fields of that data class.
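The general shape of this idea can be sketched as follows. This is not refinery's actual parser interface: RowMapper, ParseReport, parseRows and teamStatsMapper are hypothetical names, and the sample rows (including the "Example FC" values) are made up for illustration. The sketch only shows how a per-row mapper and a record of rejected rows fit together.

```kotlin
data class TeamStats(val team: String, val points: Int)

// Hypothetical contract (not refinery's interface): turn one raw row into a typed
// record, or throw IllegalArgumentException when the row violates the model.
fun interface RowMapper<T> {
    fun map(row: Map<String, Any?>): T
}

// Valid records plus the 1-based numbers of rejected rows and the reason for each.
data class ParseReport<T>(val records: List<T>, val rejectedRows: Map<Int, String>)

fun <T> parseRows(rows: List<Map<String, Any?>>, mapper: RowMapper<T>): ParseReport<T> {
    val records = mutableListOf<T>()
    val rejected = linkedMapOf<Int, String>()
    for ((index, row) in rows.withIndex()) {
        try {
            records.add(mapper.map(row))
        } catch (e: IllegalArgumentException) {
            // Keep going: one bad row should not abort the whole sheet.
            rejected[index + 1] = e.message ?: "invalid row"
        }
    }
    return ParseReport(records, rejected)
}

// Maps the assumed "team" and "points" columns onto the data class fields.
// requireNotNull throws IllegalArgumentException with the given message.
val teamStatsMapper = RowMapper<TeamStats> { row ->
    val team = requireNotNull(row["team"] as? String) { "team is required" }
    val points = requireNotNull(row["points"] as? Int) { "points must be an integer" }
    TeamStats(team, points)
}
```

Passing a list of rows to parseRows with teamStatsMapper yields the typed records and a map from row number to rejection reason, so a row with missing points is reported instead of being silently dropped.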
