Provides reading and writing of Parquet files.
First, we need to create a mapping configuration.
If the columns and the properties have the same data types:
var mapConfig = new MapperConfig<TestModel>()
.MapProperty(x => x.Id, "ID")
.MapProperty(x => x.Name, "NAME")
.MapProperty(x => x.Value, "VALUE");
If the columns and the properties have different types, pass a pair of converter functions for each such property:
var mapConfig = new MapperConfig<TestModel2>()
.MapProperty(x => x.IdStr, "ID", x => int.Parse(x.Split()[0]), x => x + " modified")
.MapProperty(x => x.NameInt, "NAME", IntToStr, StrToInt)
.MapProperty(x => x.Value, "VALUE");
private int StrToInt(string x)
{
    return int.Parse(x.Split()[2]);
}
private string IntToStr(int x)
{
    return $"Name is {x}";
}
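To see how such a converter pair fits together, here is a minimal standalone sketch that uses only the two helper bodies shown above, without the library. Judging by the signatures in the example, the first converter appears to go from the property type to the column type and the second one back, but that direction is an inference from the snippet, not something the library documentation confirms here. The class name ConverterDemo is hypothetical.

```csharp
using System;

public static class ConverterDemo
{
    // int -> string: same body as IntToStr above.
    public static string IntToStr(int x) => $"Name is {x}";

    // string -> int: takes the third whitespace-separated token and parses it.
    // This only works for strings shaped like "Name is 42".
    public static int StrToInt(string x) => int.Parse(x.Split()[2]);

    public static void Main()
    {
        string asString = IntToStr(42);   // "Name is 42"
        int asInt = StrToInt(asString);   // 42

        Console.WriteLine(asString);
        Console.WriteLine(asInt);
    }
}
```

The two functions should be inverses of each other on the data you expect; otherwise a value written and then read back will not match the original.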
We can use a special class to map types. For instance, Parquet.Net does not know about the DateTime type and uses DateTimeOffset instead, so we can use:
var mapConfig = new MapperConfig<TestModel3>()
.MapProperty(x => x.DateValue, "DATE", DefaultMappers.DateTimeOffsetToDateTime);
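For orientation, the sketch below shows what a DateTime and DateTimeOffset conversion can look like in plain .NET. It is not the implementation of DefaultMappers.DateTimeOffsetToDateTime, whose internals are not shown in this document. The class name DateMappingDemo and the choice to treat every value as UTC are assumptions of this example.

```csharp
using System;

public static class DateMappingDemo
{
    // DateTime -> DateTimeOffset. The value is treated as UTC (an assumption
    // of this sketch), so the resulting offset is always zero.
    public static DateTimeOffset ToOffset(DateTime value) =>
        new DateTimeOffset(DateTime.SpecifyKind(value, DateTimeKind.Utc));

    // DateTimeOffset -> DateTime, normalized to UTC.
    public static DateTime ToDateTime(DateTimeOffset value) => value.UtcDateTime;

    public static void Main()
    {
        var original = new DateTime(2024, 1, 15, 10, 30, 0, DateTimeKind.Utc);

        DateTimeOffset stored = ToOffset(original);
        DateTime restored = ToDateTime(stored);

        Console.WriteLine(stored.Offset == TimeSpan.Zero); // True
        Console.WriteLine(restored == original);           // True
    }
}
```

If your data carries local times, decide explicitly how they should be normalized before writing, since a DateTimeOffset always stores an offset alongside the time.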
Furthermore, you can use custom mappers that implement the IDifferentTypesMapper<TProperty, TColumn> interface.
To read from a file:
var parquetEngine = new ParquetDataEngine();
var data = parquetEngine.Read(mapConfig, "test.parquet");
To read from a stream:
var parquetEngine = new ParquetDataEngine();
var data = parquetEngine.Read(mapConfig, parquetStream);
We can write data to a file or a stream:
var parquetEngine = new ParquetDataEngine();
// We can use a file
parquetEngine.Write(mapConfig, "test.parquet", testData);
// Or we can use a stream
// parquetEngine.Write(mapConfig, parquetStream, testData);
We can append data to an existing file as a new row group:
var parquetEngine = new ParquetDataEngine();
parquetEngine.Append(mapConfig, "test.parquet", data);