v2.0.0-RC2
Pre-release
Introduction of v2
Avro4k was created back in 2019. Over the past five years, a lot of work has gone into Avro generic records and schema generation.
Recently, kotlinx-serialization and Kotlin shipped major releases that improved many things (features, performance, better APIs). The JSON API of kotlinx-serialization is a great one, so we tried to replicate its simplicity.
A big focus has been put on making Avro4k more lenient, to simplify developers' lives and improve adoption.
I hope this major release will make Avro easier to use, especially in pure Kotlin 🚀
As a side note, we may implement our own plugins to generate data classes and schemas, stay tuned!
Highlights and Breaking changes
Needs Kotlin 2.0.0 and kotlinx.serialization 1.7.0-RC
You need at least Kotlin 2.0.0 and kotlinx.serialization 1.7.0-RC to use Avro4k v2 (the version matrix is in the README), as there are breaking changes in the kotlinx-serialization plugin and library (which are released in tandem with the Kotlin version).
More information here: kotlinx-serialization v1.7.0-RC
ExperimentalSerializationApi
Since the API changed deeply, all the new functions, properties, classes and annotations that are annotated with ExperimentalSerializationApi will show a warning, as they could change at any moment. These members will be un-annotated after a few releases once they have proven their stability 🪨
To suppress this warning, you may opt in to the experimental serialization API. It is advised not to opt in globally through the compiler arguments, to avoid surprises when using experimental features 😅
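As a minimal sketch of a local opt-in, the annotation can be placed on the declaration that uses the experimental members instead of in the compiler arguments. The `User` class and the `encodeUser` function are hypothetical names, and the `com.github.avrokotlin.avro4k` import paths are an assumption; `Avro.encodeToByteArray` is the call shown in the examples of these notes.

```kotlin
import com.github.avrokotlin.avro4k.Avro // assumed package
import com.github.avrokotlin.avro4k.encodeToByteArray // assumed package
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable

// Hypothetical data class, used only for illustration.
@Serializable
data class User(val name: String)

// The opt-in is scoped to this function only, so any other use of
// experimental members elsewhere in the code base still raises the warning.
@OptIn(ExperimentalSerializationApi::class)
fun encodeUser(user: User): ByteArray = Avro.encodeToByteArray(user)
```

Scoping the opt-in this way keeps each use of an experimental member visible in code review.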
Direct binary serialization
Previously, serializing Avro with Avro4k went through a generic step: the data classes were first converted to generic maps, and this generic data was then passed to the Apache Avro library.
Now, encoding to and decoding from binary is done directly, which greatly improves performance (see the Performances & benchmark section).
Note
We still support generic data serialization until there is a solution for Kafka schema registry serialization (a future avro4k module to be created), but it will be removed in the future to simplify the avro4k library, as it is not really a serialization but rather a conversion.
Support anything to encode and decode at root level
Now, there is no need to wrap your value in a record: you can serialize nearly anything and generate the corresponding schema!
This includes any data class, enum, sealed interface, value class, primitive or contextual values 🚀
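As an illustration of root-level values, here is a hedged sketch that round-trips an enum and a plain string without wrapping them in a record. The `Color` enum and the `roundTripRootValues` function are hypothetical, and the import paths are an assumption; only `Avro.encodeToByteArray` and `Avro.decodeFromByteArray` come from these notes.

```kotlin
import com.github.avrokotlin.avro4k.Avro // assumed package
import com.github.avrokotlin.avro4k.decodeFromByteArray // assumed package
import com.github.avrokotlin.avro4k.encodeToByteArray // assumed package
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable

// Hypothetical enum, used only for illustration.
@Serializable
enum class Color { RED, GREEN, BLUE }

@OptIn(ExperimentalSerializationApi::class)
fun roundTripRootValues(): Pair<Color, String> {
    // An enum at root level, without any wrapping record.
    val colorBytes = Avro.encodeToByteArray(Color.GREEN)
    val color = Avro.decodeFromByteArray<Color>(colorBytes)

    // A primitive (here a String) at root level.
    val textBytes = Avro.encodeToByteArray("hello")
    val text = Avro.decodeFromByteArray<String>(textBytes)

    return color to text
}
```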
Totally new API
The previous API was hard to use without a good understanding of it, especially when working with InputStream and OutputStream.
There are now different entrypoints for different purposes:
- Avro: the main entrypoint to generate schemas, and to encode and decode the Avro format. This is the pure raw Avro format without anything else.
- AvroObjectContainerFile: the entrypoint to encode Avro data files, following the official spec, and using Avro for each value serialization.
- AvroSingleObject: the entrypoint for encoding a single object prefixed with the schema fingerprint, following the official spec, and also using Avro for value serialization.
Here are some examples of the changes:
Pure avro serialization (no specific format, no prefix, no magic byte, just pure avro binary)
// Previously
val bytes = Avro.default.encodeToByteArray(TheDataClass.serializer(), TheDataClass(...))
Avro.default.decodeFromByteArray(TheDataClass.serializer(), bytes)
// Now
val bytes = Avro.encodeToByteArray(TheDataClass(...))
Avro.decodeFromByteArray<TheDataClass>(bytes)
Generic data serialization (converts a Kotlin data class to a GenericRecord, to then be handled by a `GenericDatumWriter` in Avro)
// Previously
val genericRecord: GenericRecord = Avro.default.toRecord(TheDataClass.serializer(), TheDataClass(...))
Avro.default.fromRecord(TheDataClass.serializer(), genericRecord)
// Now
val genericData: Any? = Avro.encodeToGenericData(TheDataClass(...))
Avro.decodeFromGenericData<TheDataClass>(genericData)
Configure the `Avro` instance
// Previously
val avro = Avro(
    AvroConfiguration(
        namingStrategy = SnakeCaseNamingStrategy,
        implicitNulls = true,
    ),
    SerializersModule {
        contextual(CustomSerializer())
    }
)
// Now
val avro = Avro {
    fieldNamingStrategy = FieldNamingStrategy.SnakeCase
    implicitNulls = true
    serializersModule = SerializersModule {
        contextual(CustomSerializer())
    }
}
Changing the name of a record
// Previously
@AvroName("TheName")
@AvroNamespace("a.custom.namespace")
data class TheDataClass(...)
// Now
@SerialName("a.custom.namespace.TheName")
data class TheDataClass(...)
Writing an avro object container file with a custom field naming strategy
// Previously
Files.newOutputStream(Path("/your/file.avro")).use { outputStream ->
    Avro(AvroConfiguration(namingStrategy = SnakeCaseNamingStrategy))
        .openOutputStream(TheDataClass.serializer()) { encodeFormat = AvroEncodeFormat.Data(CodecFactory.snappyCodec()) }
        .to(outputStream)
        .write(TheDataClass(...))
        .write(TheDataClass(...))
        .write(TheDataClass(...))
        .close()
}
// Now
val dataSequence = sequenceOf(
    TheDataClass(...),
    TheDataClass(...),
    TheDataClass(...),
)
val avro = Avro { fieldNamingStrategy = FieldNamingStrategy.SnakeCase }
Files.newOutputStream(Path("/your/file.avro")).use { outputStream ->
    AvroObjectContainerFile(avro)
        .encodeToStream(dataSequence, outputStream) {
            codec(CodecFactory.snappyCodec())
            // you can also add your metadata!
            metadata("myProp", 1234L)
            metadata("a string metadata", "hello")
        }
}
Warning
Migration guide: WIP
Implicit nulls by default
Previously, decoding failed when nothing was decoded for a nullable field.
Now, it decodes null and does not fail. To opt out of this feature, configure your Avro instance with implicitNulls = false.
It has been enabled by default to simplify the use of Avro4k.
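A minimal sketch of the opt-out, reusing the `Avro { ... }` builder shown in the configuration example of these notes. The `strictAvro` name is hypothetical and the import path is an assumption.

```kotlin
import com.github.avrokotlin.avro4k.Avro // assumed package

// With implicitNulls disabled, a nullable field for which nothing is decoded
// makes decoding fail again, as it did before v2.
val strictAvro = Avro { implicitNulls = false }
```

The resulting instance is then used in place of the default `Avro` for encoding and decoding.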
Lenient
The Apache Avro library is strict regarding types and strongly follows the Avro spec. For example, a float in Kotlin can be written and read as a float and a double in Avro.
Avro4k pushes the lenience further: a float can be written and read as a float, a double, a string, an int and a long in Avro.
A type matrix has been written in the README.
No more reflection
Thanks to this little change, there is absolutely no more reflection, which allows using Android or GraalVM AOT native compilation (requires kotlinx-serialization 1.7.0).
Unified & cleaned annotations
Some numbers: 4 of the 12 annotations have been removed!
- AvroJsonProp has been merged into AvroProp: the JSON content is automatically detected, so any non-JSON content is handled as a string
- AvroAliases has been merged into AvroAlias: there is now a varargs parameter to pass as many aliases as you want using the same annotation
- AvroInline has been removed in favor of Kotlin's native value class
- AvroEnumDefault is now to be applied directly on the default enum member
- ScalePrecision has been renamed to AvroDecimal to keep a common prefix
- AvroNamespace and AvroName have been replaced by the native kotlinx-serialization SerialName annotation
- AvroNamespaceOverride has been created to allow replacing the namespace of a field schema (⚠️ this annotation is not stable and can disappear at any moment)
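The annotation changes listed above could look like the following sketch. The `Order`, `OrderId` and `Status` types are hypothetical, the import paths of the Avro4k annotations are an assumption, and the file-level opt-in is only needed if these annotations are marked experimental.

```kotlin
// Assumption: the Avro4k annotations may require this opt-in.
@file:OptIn(kotlinx.serialization.ExperimentalSerializationApi::class)

import com.github.avrokotlin.avro4k.AvroAlias // assumed package
import com.github.avrokotlin.avro4k.AvroEnumDefault // assumed package
import kotlinx.serialization.SerialName
import kotlinx.serialization.Serializable

// Replaces @AvroName + @AvroNamespace: the full name goes in @SerialName.
@Serializable
@SerialName("a.custom.namespace.Order")
data class Order(
    // A single @AvroAlias now takes several aliases (varargs).
    @AvroAlias("orderId", "order_id") val id: OrderId,
    val status: Status,
)

// Replaces @AvroInline: a native Kotlin value class.
@JvmInline
@Serializable
value class OrderId(val value: String)

@Serializable
enum class Status {
    // @AvroEnumDefault is set on the default member, not on the class.
    @AvroEnumDefault
    UNKNOWN,
    CREATED,
    SHIPPED,
}
```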
Caching
All schemas are cached using a WeakIdentityHashMap, which allows the GC to remove cache entries when available memory is low.
Some other expensive internal parts are also cached for quicker encoding and decoding.
Performances & benchmark
Warning
WIP
What's Changed
- fix: Assume kotlin.Pair as a normal data class instead of an union by @Chuckame in #174
- feat!: No more reflection and customizable logical types by @Chuckame in #175
- feat: Add support for decoding with avro aliases by @Chuckame in #177
- Generalize encoding/decoding tests (#168) by @Chuckame in #179
- chore: Add spotless with ktlint + editorconfig by @Chuckame in #180
- feat: Support kotlin's value classes by @Chuckame in #183
- feat: Revamp naming strategy and related annotations by @Chuckame in #182
- feat: Merge ScalePrecision to AvroDecimalLogicalType by @Chuckame in #191
- chore: Upgrade github actions and use standard gradle actions by @Chuckame in #192
- feat: revamp the schema generation by @Chuckame in #190
- feat: New Avro entrypoint by @Chuckame in #186
- feat: Support everything at root level by @Chuckame in #202
- feat!: Set @AvroEnumDefault directly to the enum value instead of the class by @Chuckame in #203
- feat!: Merge AvroJsonProp to AvroProp by @Chuckame in #204
- build: Explicit API mode to prevent exposing internal stuff by @Chuckame in #205
- Union perf improvement by @Chuckame in #208
- deps: Upgrade kotlinx-serialization and kotlin by @Chuckame in #209
- docs: Improve documentation by @Chuckame in #210
- feat!: No more kotlin-reflect for logical types by @Chuckame in #214
- Direct encoding by @Chuckame in #215
- feat: Allow generating a release on a non-main branch by @Chuckame in #217
Full Changelog: v1.10.1...v2.0.0-RC2