From 31b7889b06a3988437c4472030307655ab6c9fe1 Mon Sep 17 00:00:00 2001 From: Chuckame Date: Thu, 9 May 2024 13:53:41 +0200 Subject: [PATCH 1/4] docs: Improve README.md --- README.md | 1065 ++++++++++------- api/avro4k-core.api | 1 - .../github/avrokotlin/avro4k/annotations.kt | 5 + .../avrokotlin/avro4k/decoder/AvroDecoder.kt | 10 + 4 files changed, 649 insertions(+), 432 deletions(-) diff --git a/README.md b/README.md index 8ac35238..cf6311b3 100644 --- a/README.md +++ b/README.md @@ -1,617 +1,820 @@ -# -[Avro](https://avro.apache.org/) format for [kotlinx.serialization](https://github.com/Kotlin/kotlinx.serialization). This library is a port of [sksamuel's](https://github.com/sksamuel) Scala Avro generator [avro4s](https://github.com/sksamuel/avro4s). - ![build-main](https://github.com/avro-kotlin/avro4k/workflows/build-main/badge.svg) -[](http://search.maven.org/#search%7Cga%7C1%7Cavro4k) +[![Download](https://img.shields.io/maven-central/v/com.github.avro-kotlin.avro4k/avro4k-core)](https://search.maven.org/artifact/com.github.avro-kotlin.avro4k/avro4k-core) +[![Kotlin](https://img.shields.io/badge/kotlin-1.6-blue.svg?logo=kotlin)](http://kotlinlang.org) +[![Avro spec](https://img.shields.io/badge/avro%20spec-1.11.1-blue.svg?logo=apache)](https://avro.apache.org/docs/1.11.1/specification/) + +# Introduction +**Avro4k** (or Avro for Kotlin) is a library that brings [Avro](https://avro.apache.org/) serialization format in kotlin, based on the **reflection-less** kotlin library +called [kotlinx-serialization](https://github.com/Kotlin/kotlinx.serialization). -## Introduction +Here are the main features: -Avro4k is a Kotlin library that brings support for Avro to the Kotlin Serialization framework. This library supports reading and writing to/from binary and json streams as well as supporting Avro schema generation. 
+- **Full avro support**, including logical types, unions, recursive types, and schema evolution :white_check_mark: +- **Encode and decode** anything to and from binary format, and also in generic data :toolbox: +- **Generate schemas** based on your values and data classes :pencil: +- **Customize** the generated schemas and encoded data with annotations :construction_worker: +- **Fast** as it is reflection-less :rocket: +- **Simple API** to get started quickly, also with native support of `java.time`, `BigDecimal`, `BigInteger` and `UUID` classes :1st_place_medal: +- **Relaxed matching** for easy schema evolution as it natively [adapts compatible types](#types-matrix) :cyclone: -- [Generate schemas](https://github.com/avro-kotlin/avro4k#schemas) from Kotlin data classes. This allows you to use data classes as the canonical source for the schemas and generate Avro schemas from _them_, rather than define schemas externally and then generate (or manually write) data classes to match. -- Marshall data classes to / from instances of Avro Generic Records. The basic _structure type_ in Avro is the _IndexedRecord_ or it's more common subclass, the _GenericRecord_. This library will marshall data to and from Avro records. This can be useful for interop with frameworks like Kafka which provide serializers which work at the record level. -- [Read / Write](https://github.com/avro-kotlin/avro4k#input--output) data classes to input or output streams. Avro records can be serialized as binary (with or without embedded schema) or json, and this library provides _AvroInputStream_ and _AvroOutputStream_ classes to support data classes directly. -- [Support logical types](https://github.com/avro-kotlin/avro4k#types). This library provides support for the Avro [logical types](https://avro.apache.org/docs/1.11.0/spec.html#Logical+Types) out of the box, in addition to the _standard_ types supported by the Kotlin serialization framework. -- Add custom serializer for other types. 
With Avro4k you can easily add your own _AvroSerializer_ instances that provides schemas and serialization for types not supported by default. +> [!WARNING] +> **Important**: As of today, avro4k is **only available for JVM platform**, and theoretically for android platform (as apache avro library is already **android-ready**).
If +> you would like to have js/wasm/native compatible platforms, please put a :thumbsup: on [this issue](https://github.com/avro-kotlin/avro4k/issues/207) -## Schemas +# Quick start -Writing schemas manually through the Java based `SchemaBuilder` classes can be tedious for complex domain models. -Avro4k allows us to generate schemas directly from data classes at compile time using the Kotlin Serialization library. -This gives you both the convenience of generated code, without the annoyance of having to run a code generation step, as well as avoiding the performance penalty of runtime reflection based code. +## Basic -Let's define some classes. +
+Example:
 ```kotlin
-@Serializable
-data class Ingredient(val name: String, val sugar: Double, val fat: Double)
+package myapp
+
+import com.github.avrokotlin.avro4k.*
+import kotlinx.serialization.*
 @Serializable
-data class Pizza(val name: String, val ingredients: List, val vegetarian: Boolean, val kcals: Int)
+data class Project(val name: String, val language: String)
+
+fun main() {
+    // Generating schemas
+    val schema = Avro.schema<Project>()
+    println(schema.toString()) // {"type":"record","name":"Project","namespace":"myapp","fields":[{"name":"name","type":"string"},{"name":"language","type":"string"}]}
+
+    // Serializing objects
+    val data = Project("kotlinx.serialization", "Kotlin")
+    val bytes = Avro.encodeToByteArray(data)
+
+    // Deserializing objects
+    val obj = Avro.decodeFromByteArray<Project>(bytes)
+    println(obj) // Project(name=kotlinx.serialization, language=Kotlin)
+}
 ```
-To generate an Avro Schema, we need to use the `Avro` object, invoking `schema` and passing in the serializer generated by the Kotlin Serialization compiler plugin for your target class. This will return an `org.apache.avro.Schema` instance.
+
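The bytes produced by the basic example can be worked out by hand. The sketch below is a minimal illustration, assuming the standard binary encoding from the Avro specification: a record is the concatenation of its fields, and a `string` is a zig-zag varint length followed by its UTF-8 bytes. The helpers `writeAvroLong` and `writeAvroString` are hypothetical names written for this illustration and are not part of avro4k.

```kotlin
import java.io.ByteArrayOutputStream

// Hypothetical helper (not an avro4k API): Avro "long" encoding,
// i.e. zig-zag mapping followed by a variable-length base-128 encoding.
fun ByteArrayOutputStream.writeAvroLong(value: Long) {
    var n = (value shl 1) xor (value shr 63)
    while ((n and 0x7FL.inv()) != 0L) {
        write(((n and 0x7FL) or 0x80L).toInt())
        n = n ushr 7
    }
    write(n.toInt())
}

// Hypothetical helper (not an avro4k API): Avro "string" encoding,
// i.e. the UTF-8 byte count as an Avro long, then the UTF-8 bytes.
fun ByteArrayOutputStream.writeAvroString(value: String) {
    val utf8 = value.toByteArray(Charsets.UTF_8)
    writeAvroLong(utf8.size.toLong())
    write(utf8)
}

fun main() {
    // A record with two string fields is just both strings, one after the other.
    val out = ByteArrayOutputStream()
    out.writeAvroString("kotlinx.serialization")
    out.writeAvroString("Kotlin")
    val bytes = out.toByteArray()
    println(bytes.size) // 29 = (1 + 21) + (1 + 6)
    println(bytes[0])   // 42, the zig-zag encoding of the length 21
}
```

Note that no field names or schema information appear in the payload, which is why the reader always needs a schema to decode it. The sketch only mirrors the specification and does not call the library.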
-In other words: +## Single object + +Avro4k provides a way to encode and decode single objects with `AvroSingleObject` class. This encoding will prefix the binary data with the schema fingerprint to +allow knowing the writer schema when reading the data. The downside is that you need to provide a schema registry to get the schema from the fingerprint. +This format is perfect for payloads sent through message brokers like kafka or rabbitmq as it is the most compact schema-aware format. + +
+Example:
 ```kotlin
-val schema = Avro.schema()
-println(schema.toString(true))
-```
+package myapp
-Where the generated schema is as follows:
+import com.github.avrokotlin.avro4k.*
+import kotlinx.serialization.*
+import org.apache.avro.SchemaNormalization
-```json
-{
-  "type":"record",
-  "name":"Pizza",
-  "namespace":"com.github.avrokotlin.avro4k.example",
-  "fields":[
-    {
-      "name":"name",
-      "type":"string"
-    },
-    {
-      "name":"ingredients",
-      "type":{
-        "type":"array",
-        "items":{
-          "type":"record",
-          "name":"Ingredient",
-          "fields":[
-            {
-              "name":"name",
-              "type":"string"
-            },
-            {
-              "name":"sugar",
-              "type":"double"
-            },
-            {
-              "name":"fat",
-              "type":"double"
-            }
-          ]
-        }
-      }
-    },
-    {
-      "name":"vegetarian",
-      "type":"boolean"
-    },
-    {
-      "name":"kcals",
-      "type":"int"
-    }
-  ]
+@Serializable
+data class Project(val name: String, val language: String)
+
+fun main() {
+    val schema = Avro.schema<Project>()
+    // Map each schema fingerprint to its schema, so the writer schema can be found when decoding
+    val schemasByFingerprint = mapOf(SchemaNormalization.parsingFingerprint64(schema) to schema)
+    val singleObjectInstance = AvroSingleObject { schemasByFingerprint[it] }
+
+    // Serializing objects
+    val data = Project("kotlinx.serialization", "Kotlin")
+    val bytes = singleObjectInstance.encodeToByteArray(data)
+
+    // Deserializing objects
+    val obj = singleObjectInstance.decodeFromByteArray<Project>(bytes)
+    println(obj) // Project(name=kotlinx.serialization, language=Kotlin)
 }
 ```
-You can see that the schema generator handles nested data classes, lists, primitives, etc. For a full list of supported object types, see the table later.
+
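To make the fingerprint prefix concrete, here is a minimal sketch of how a single-object payload is laid out according to the Avro specification: a two-byte marker `C3 01`, then the 8-byte little-endian CRC-64-AVRO fingerprint of the writer schema, then the binary-encoded object. The `SingleObjectHeader` type and the `readSingleObjectHeader` function are hypothetical names used only for this illustration; they are not part of avro4k, which handles this header internally.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical type (not an avro4k API): the parsed header of a single-object payload.
data class SingleObjectHeader(val fingerprint: Long, val payloadOffset: Int)

// Hypothetical helper (not an avro4k API): validates the marker and extracts the fingerprint.
fun readSingleObjectHeader(bytes: ByteArray): SingleObjectHeader {
    require(bytes.size >= 10) { "Too short to hold a single-object header" }
    require(bytes[0] == 0xC3.toByte() && bytes[1] == 0x01.toByte()) { "Missing C3 01 marker" }
    // The fingerprint is stored as 8 little-endian bytes right after the marker.
    val fingerprint = ByteBuffer.wrap(bytes, 2, 8).order(ByteOrder.LITTLE_ENDIAN).long
    return SingleObjectHeader(fingerprint, payloadOffset = 10)
}

fun main() {
    // Marker, then a made-up fingerprint equal to 1, then one made-up payload byte.
    val message = byteArrayOf(0xC3.toByte(), 0x01, 1, 0, 0, 0, 0, 0, 0, 0, 0x2A)
    println(readSingleObjectHeader(message)) // SingleObjectHeader(fingerprint=1, payloadOffset=10)
}
```

The fingerprint read this way is the value a reader looks up in its schema registry, which is the role played by the `schemasByFingerprint` map in the example above.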
+> For more details, check in the avro spec the [single object encoding](https://avro.apache.org/docs/1.11.1/specification/#single-object-encoding). -### Overriding class name and namespace +## Object container -Avro schemas for complex types (RECORDs) contain a name and a namespace. -By default, these are the name of the class and the enclosing package name, but it is possible to customize these using the annotations `@SerialName`. +Avro4k provides a way to encode and decode object container — also known as data file — with `AvroObjectContainerFile` class. This encoding will prefix the binary data with the +full schema to +allow knowing the writer schema when reading the data. This format is perfect for storing multiple long-term objects in a single file. -For example, the following class: +
+Example:
 ```kotlin
-package com.github.avrokotlin.avro4k.example
+package myapp
-data class Foo(a: String)
-```
+import com.github.avrokotlin.avro4k.*
+import kotlinx.serialization.*
+import org.apache.avro.SchemaNormalization
+
+@Serializable
+data class Project(val name: String, val language: String)
-Would normally have a schema like this:
+fun main() {
+    val schema = Avro.schema<Project>()
+    val schemasByFingerprint = mapOf(SchemaNormalization.parsingFingerprint64(schema) to schema)
+    val singleObjectInstance = AvroSingleObject { schemasByFingerprint[it] }
-```json
-{
-  "type":"record",
-  "name":"Foo",
-  "namespace":"com.github.avrokotlin.avro4k.example",
-  "fields":[
-    {
-      "name":"a",
-      "type":"string"
-    }
-  ]
+    // Serializing objects
+    val data = Project("kotlinx.serialization", "Kotlin")
+    val bytes = singleObjectInstance.encodeToByteArray(data)
+
+    // Deserializing objects
+    val obj = singleObjectInstance.decodeFromByteArray<Project>(bytes)
+    println(obj) // Project(name=kotlinx.serialization, language=Kotlin)
 }
 ```
-#### Overriding the class name and the namespace
+
+
+> For more details, see the [object container files](https://avro.apache.org/docs/1.11.1/specification/#object-container-files) section of the avro spec.
+
+# Important notes
+
+- **Avro4k** is built on top of the apache avro library, so all the schema validation is delegated to it
+- All members annotated with `@ExperimentalSerializationApi` are **subject to change** in future releases without notice as they are experimental, so please
+  check the release notes for any required migration
+
+# Setup
+
+
+ Gradle Kotlin DSL ```kotlin -package com.github.avrokotlin.avro4k.example +plugins { + kotlin("jvm") version kotlinVersion + kotlin("plugin.serialization") version kotlinVersion +} -@SerialName("com.other.Wibble") -data class Foo(val a: String) +dependencies { + implementation("com.github.avro-kotlin.avro4k:avro4k-core:$avro4kVersion") +} ``` -And then the generated schema looks like this: +
-```json -{ - "type":"record", - "name":"Wibble", - "namespace":"com.other", - "fields":[ - { - "name":"a", - "type":"string" - } - ] +
+ +
+Gradle Groovy DSL
+
+```groovy
+plugins {
+    id 'org.jetbrains.kotlin.jvm' version kotlinVersion
+    id 'org.jetbrains.kotlin.plugin.serialization' version kotlinVersion
+}
+
+dependencies {
+    implementation "com.github.avro-kotlin.avro4k:avro4k-core:$avro4kVersion"
+}
 ```
+
-#### Overriding only the namespace +
-We can also just override the namespace while keeping the class name as record name: +
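+Maven
-```kotlin
-package com.github.avrokotlin.avro4k.example
+Add the serialization plugin to the Kotlin compiler plugin:
+
+```xml
-@SerialName("com.other.Foo")
-data class Foo(val a: String)
+<build>
+    <plugins>
+        <plugin>
+            <groupId>org.jetbrains.kotlin</groupId>
+            <artifactId>kotlin-maven-plugin</artifactId>
+            <version>${kotlin.version}</version>
+            <executions>
+                <execution>
+                    <id>compile</id>
+                    <phase>compile</phase>
+                    <goals>
+                        <goal>compile</goal>
+                    </goals>
+                </execution>
+            </executions>
+            <configuration>
+                <compilerPlugins>
+                    <plugin>kotlinx-serialization</plugin>
+                </compilerPlugins>
+            </configuration>
+            <dependencies>
+                <dependency>
+                    <groupId>org.jetbrains.kotlin</groupId>
+                    <artifactId>kotlin-maven-serialization</artifactId>
+                    <version>${kotlin.version}</version>
+                </dependency>
+            </dependencies>
+        </plugin>
+    </plugins>
+</build>
 ```
-And then the generated schema looks like this:
+Add the avro4k dependency:
-```json
-{
-  "type":"record",
-  "name":"Foo",
-  "namespace":"com.other",
-  "fields":[
-    {
-      "name":"a",
-      "type":"string"
-    }
-  ]
-}
+```xml
+
+<dependency>
+    <groupId>com.github.avro-kotlin.avro4k</groupId>
+    <artifactId>avro4k-core</artifactId>
+    <version>${avro4k.version}</version>
+</dependency>
 ```
+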
+# How to generate schemas
-#### Overriding only the name
+Writing schemas manually or using the Java based `SchemaBuilder` can be tedious.
+`kotlinx-serialization` simplifies this by generating the corresponding descriptors for us, which allows generating avro schemas easily, without any reflection.
+Also, it provides native compatibility with data classes (including open and sealed classes), inline classes, any collection, array, enums, and primitive values.
-We can just override the name while keeping the namespace. Note that you need to replicate the namespace in the `@SerialName` annotation:
+> [!NOTE]
+> For more information about the avro schema, please refer to the [avro specification](https://avro.apache.org/docs/1.11.1/specification/)
+
+To allow generating a schema for a specific class, you need to annotate it with `@Serializable`:
 ```kotlin
-package com.github.avrokotlin.avro4k.example
+@Serializable
+data class Ingredient(val name: String, val sugar: Double)
-@SerialName("com.github.avrokotlin.avro4k.example.Wibble")
-data class Foo(val a: String)
+@Serializable
+data class Pizza(val name: String, val ingredients: List<Ingredient>, val topping: Ingredient?, val vegetarian: Boolean)
 ```
-And then the generated schema looks like this:
+Then you can generate the schema using the `Avro.schema` function:
+
+```kotlin
+val schema = Avro.schema<Pizza>()
+println(schema.toString(true))
+```
+
+The generated schema will look as follows:
 ```json
 {
-  "type":"record",
-  "name":"Wibble",
-  "namespace":"com.github.avrokotlin.avro4k.example",
-  "fields":[
-    {
-      "name":"a",
-      "type":"string"
-    }
-  ]
+  "type": "record",
+  "name": "Pizza",
+  "namespace": "com.github.avrokotlin.avro4k.example",
+  "fields": [
+    {
+      "name": "name",
+      "type": "string"
+    },
+    {
+      "name": "ingredients",
+      "type": {
+        "type": "array",
+        "items": {
+          "type": "record",
+          "name": "Ingredient",
+          "fields": [
+            {
+              "name": "name",
+              "type": "string"
+            },
+            {
+              "name": "sugar",
+              "type": "double"
+            }
+          ]
+        }
+      }
+    },
+    {
+      "name": "topping",
+      "type": [
+        "null",
+        "Ingredient"
+      ],
+      "default": null
+    },
+    {
+      "name": "vegetarian",
+      "type": "boolean"
+    }
+  ]
 }
 ```
+To customize the configuration, create your own instance of `Avro` with the wanted configuration, and then use it to generate the schema:
-### Overriding a field name
-
-The `@SerialName` annotation can also be used to override field names.
-This is useful when the record instances you are generating or reading need to have field names different from the Kotlin data classes.
-For example if you are reading data generated by another system or another language.
+```kotlin
+val yourAvroInstance = Avro {
+    // your configuration
+}
+yourAvroInstance.schema<Pizza>()
+```
-Given the following class.
+# Usage
+
+## Types matrix
+
+| Kotlin type | Avro reader type | Compatible avro writer type | Avro logical type | Note / Serializer class |
+|---|---|---|---|---|
+| `Boolean` | `boolean` | `string` | | |
+| `Byte`, `Short`, `Int` | `int` | `long`, `float`, `double`, `string` | | |
+| `Long` | `long` | `int`, `float`, `double`, `string` | | |
+| `Float` | `float` | `double`, `string` | | |
+| `Double` | `double` | `float`, `string` | | |
+| `Char` | `int` | `string` | | The value serialized is the char code. When reading from a `string`, requires exactly 1 char |
+| `String` | `string` | `bytes`, `fixed` | | |
+| `ByteArray` | `bytes` | `string`, `fixed` | | |
+| `Map<*, *>` | `map` | | | The map key must be string-able. Mainly everything is string-able except null and composite types (collection, data classes) |
+| `out Collection<*>` | `array` | | | |
+| `data class` | `record` | | | |
+| `enum class` | `enum` | | | |
+| Any field with `@AvroFixed` | `fixed` | `bytes`, `string` | | You can only annotate fields that are compatible with `bytes` or `string`, otherwise it throws an error |
+| `java.math.BigDecimal` | `bytes` | `int`, `long`, `float`, `double`, `string` | `decimal` | By default, the scale is `2` and the precision `8`. To change it, annotate the field with `@AvroDecimal` |
+| `java.math.BigDecimal` | `string` | | | To use it, [register the serializer](#support-additional-non-serializable-types) `com.github.avrokotlin.avro4k.serializer.BigDecimalAsStringSerializer`. `@AvroDecimal` is ignored in that case |
+| `java.util.UUID` | `string` | | `uuid` | To use it, just annotate the field with `@Contextual` |
+| `java.net.URL` | `string` | | | To use it, just annotate the field with `@Contextual` |
+| `java.math.BigInteger` | `string` | `int`, `long`, `float`, `double` | | To use it, just annotate the field with `@Contextual` |
+| `java.time.LocalDate` | `int` | `long`, `string` | `date` | To use it, just annotate the field with `@Contextual` |
+| `java.time.Instant` | `long` | `string` | `timestamp-millis` | To use it, just annotate the field with `@Contextual` |
+| `java.time.Instant` | `long` | `string` | `timestamp-micros` | To use it, [register the serializer](#support-additional-non-serializable-types) `com.github.avrokotlin.avro4k.serializer.InstantToMicroSerializer` |
+| `java.time.LocalDateTime` | `long` | `string` | `timestamp-millis` | To use it, just annotate the field with `@Contextual` |
+| `java.time.LocalTime` | `int` | `long`, `string` | `time-millis` | To use it, just annotate the field with `@Contextual` |
+
+> [!NOTE]
+> For more details, check the [built-in classes in kotlinx-serialization](https://github.com/Kotlin/kotlinx.serialization/blob/master/docs/builtin-classes.md)
+
+## Add documentation to a schema
+
+You may want to add documentation to a schema to provide more information about a field or a named type (only RECORD and ENUM for the moment).
+
+> [!WARNING]
+> Do not use `@org.apache.avro.reflect.AvroDoc` as this annotation is not visible to Avro4k.
 ```kotlin
-package com.github.avrokotlin.avro4k.example
-
-data class Foo(val a: String, @SerialName("z") val b : String)
-```
+import com.github.avrokotlin.avro4k.AvroDoc
-Then the generated schema would look like this:
+@Serializable
+@AvroDoc("This is a record documentation")
+data class MyData(
+    @AvroDoc("This is a field documentation")
+    val myField: String
+)
-```json
-{
-  "type":"record",
-  "name":"Foo",
-  "namespace":"com.github.avrokotlin.avro4k.example",
-  "fields":[
-    {
-      "name":"a",
-      "type":"string"
-    },
-    {
-      "name":"z",
-      "type":"string"
-    }
-  ]
+@Serializable
+@AvroDoc("This is an enum documentation")
+enum class MyEnum {
+    A,
+    B
 }
 ```
-Notice that the second field is z and not b.
+> [!NOTE]
+> This impacts only the schema generation.
+
+## Support additional non-serializable types
+
+The [types matrix](#types-matrix) lists the types natively supported by Avro4k, but other types are not supported out of the box.
+Also, your own types may not be serializable.
-Note: `@SerialName` does not add an alternative name for the field, but an override.
-If you wish to have alternatives then you should use `@AvroAlias`.
+To fix it, you need to create a custom **serializer** that will handle the serialization and deserialization of the value, and provide a descriptor.
+> [!NOTE]
+> This impacts the serialization and the deserialization. It can also impact the schema generation if the serializer is providing a custom logical type or a custom schema through the descriptor.
-### Overriding the namespaces for all nested records, fixed and enums in a field +### Write your own serializer + +To create a custom serializer, you need to implement the `KSerializer` interface and override the `serialize` and `deserialize` functions. +Also, you'll need to provide a descriptor that includes the `@AvroLogicalType` annotation. + +
+Create a generic serializer that doesn't need specific Avro features
 ```kotlin
-package com.github.avrokotlin.avro4k.example
+object YourTypeSerializer : KSerializer<YourType> {
+    override val descriptor: SerialDescriptor = PrimitiveSerialDescriptor("YourType", PrimitiveKind.STRING)
+
+    override fun serialize(encoder: Encoder, value: YourType) {
+        encoder.encodeString(value.toString())
+    }
-data class Foo(@AvroNamespaceOverride("overridden") val nested: Bar)
-data class Bar(val a: String)
+    override fun deserialize(decoder: Decoder): YourType {
+        return YourType.fromString(decoder.decodeString())
+    }
+}
 ```
-Then the generated schema would look like this:
+
-```json -{ - "type":"record", - "name":"Foo", - "namespace":"com.github.avrokotlin.avro4k.example", - "fields":[ - { - "name":"nested", - "type": { - "type":"record", - "name":"Bar", - "namespace":"overridden", - "fields":[ - { - "name":"a", - "type":"string" - } - ] - } - } - ] +
+Create a serializer that needs Avro features like getting the schema or encoding bytes and fixed types
+
+```kotlin
+object YourTypeSerializer : AvroSerializer<YourType> {
+    // the descriptor that will be used to generate the schema
+    override val descriptor: SerialDescriptor = PrimitiveSerialDescriptor("YourType", PrimitiveKind.STRING)
+
+    override fun serializeAvro(encoder: AvroEncoder, value: YourType) {
+        encoder.currentWriterSchema // you can access the current writer schema
+        encoder.encodeString(value.toString())
+    }
+
+    override fun deserializeAvro(decoder: AvroDecoder): YourType {
+        decoder.currentWriterSchema // you can access the current writer schema
+        return YourType.fromString(decoder.decodeString())
+    }
+
+    override fun serializeGeneric(encoder: Encoder, value: YourType) {
+        // you may want to implement this function if you also want to use the serializer outside of Avro4k
+        encoder.encodeString(value.toString())
+    }
+
+    override fun deserializeGeneric(decoder: Decoder): YourType {
+        // you may want to implement this function if you also want to use the serializer outside of Avro4k
+        return YourType.fromString(decoder.decodeString())
+    }
 }
 ```
-Notice that the second field is z and not b.
+
-Note: `@SerialName` does not add an alternative name for the field, but an override.
-If you wish to have alternatives then you should use `@AvroAlias`.
+### Register the serializer globally
+You first need to configure your `Avro` instance with the wanted serializer instance:
+
+```kotlin
+import kotlinx.serialization.modules.SerializersModule
+import kotlinx.serialization.modules.contextual
+
+val myCustomizedAvroInstance = Avro {
+    serializersModule = SerializersModule {
+        // give the object serializer instance
+        contextual(YourTypeSerializerObject)
+        // or instantiate it if it's a class and not an object
+        contextual(YourTypeSerializerClass())
+    }
+}
+```
+Then just annotate the field with `@Contextual`:
-### Change the record/enum/fixed naming strategy
-To change the naming strategy for records, enums and fixed types, create your own instance of `Avro` with the wanted naming strategies.
+```kotlin
+@Serializable
+data class MyData(
+    @Contextual val myField: YourType
+)
+```
-- `fieldNamingStrategy` is used for field names.
-- `typeNamingStrategy` is used for record, enum and fixed names.
+### Register the serializer just for a field
 ```kotlin
-Avro(AvroConfiguration(
-    fieldNamingStrategy = /* ... */,
-    typeNamingStrategy = /* ... */,
-))
+@Serializable
+data class MyData(
+    @Serializable(with = YourTypeSerializer::class) val myField: YourType
+)
 ```
-### Adding properties and docs to a Schema
+## Changing record's field name
-Avro allows a doc field, and arbitrary key/values to be added to generated schemas.
-Avro4k supports this through the use of `@AvroDoc` and `@AvroProp` annotations.
+By default, field names are the original name of the kotlin fields in the data classes.
-These properties works on either complex or simple types - in other words, on both fields and classes.
+> [!NOTE]
+> This impacts the schema generation, the serialization and the deserialization of the field.
-For example, the following code: +### Individual field name change -```kotlin -package com.github.avrokotlin.avro4k.example +To change a field name, annotate it with `@SerialName`: +```kotlin @Serializable -@AvroDoc("hello, is it me you're looking for?") -data class Foo(@AvroDoc("I am a string") val str: String, - @AvroDoc("I am a long") val long: Long, - val int: Int) +data class MyData( + @SerialName("custom_field_name") val myField: String +) ``` -Would result in the following schema: +> [!NOTE] +> `@SerialName` will still be handled by the naming strategy -```json -{ - "type": "record", - "name": "Foo", - "namespace": "com.github.avrokotlin.avro4k.example", - "doc":"hello, is it me you're looking for?", - "fields": [ - { - "name": "str", - "type": "string", - "doc" : "I am a string" - }, - { - "name": "long", - "type": "long", - "doc" : "I am a long" - }, - { - "name": "int", - "type": "int" - } - ] +### Field naming strategy (overall change) + +To apply a naming strategy to all fields, you need to set the `fieldNamingStrategy` in the `Avro` configuration. + +> [!NOTE] +> This is only applicable for RECORD fields, and not for ENUM symbols. 
+
+There are 3 built-in strategies:
+
+- `NoOp` (default): keeps the original kotlin field name
+- `SnakeCase`: converts the original kotlin field name to snake_case with underscores before each uppercase letter
+- `PascalCase`: upper-cases the first letter of the original kotlin field name
+
+First, create your own instance of `Avro` with the wanted naming strategy:
+
+```kotlin
+
+val myCustomizedAvroInstance = Avro {
+    fieldNamingStrategy = FieldNamingStrategy.Builtins.SnakeCase
 }
+
 ```
-Similarly, for properties:
+Then, use this instance to generate the schema or encode/decode data:
 ```kotlin
-package com.github.avrokotlin.avro4k.example
+package my.package
 @Serializable
-@AvroProp("jack", "bruce")
-data class Annotated(@AvroProp("richard", "ashcroft") val str: String,
-                     @AvroProp("kate", "bush") val long: Long,
-                     val int: Int)
+data class MyData(val myField: String)
+
+val schema = myCustomizedAvroInstance.schema<MyData>() // {...,"fields":[{"name":"my_field",...}]}
 ```
-Would generate this schema:
+## Set a default field value
-```json
-{
-  "type": "record",
-  "name": "Annotated",
-  "namespace": "com.github.avrokotlin.avro4k.example",
-  "fields": [
-    {
-      "name": "str",
-      "type": "string",
-      "richard": "ashcroft"
-    },
-    {
-      "name": "long",
-      "type": "long",
-      "kate": "bush"
-    },
-    {
-      "name": "int",
-      "type": "int"
-    }
-  ],
-  "jack": "bruce"
-}
-```
+When reading avro binary data, a field may be missing (the kotlin field exists but is absent from the avro binary data). In that case, Avro4k fails as it is not capable of constructing the kotlin
+type without the missing field value.
-### Decimal scale and precision
+> [!NOTE]
+> By default, all nullable fields are optional as a `default: null` is automatically added to the schema ([check this section](#disable-implicit-default-null-for-nullable-fields)
+> to opt out from this default behavior).
-In order to customize the scale and precision used by BigDecimal schema generators,
-you can add the `@AvroDecimal` annotation to instances of BigDecimal.
+### @AvroDefault
-For example, this code:
+To avoid this error, you can set a default value for a field by annotating it with `@AvroDefault`:
 ```kotlin
+import com.github.avrokotlin.avro4k.AvroDefault
+
 @Serializable
-data class Test(@AvroDecimal(1, 4) val decimal: BigDecimal)
+data class MyData(
+    @AvroDefault("default value") val stringField: String,
+    @AvroDefault("42") val intField: Int?,
+    @AvroDefault("""{"stringField":"custom value"}""") val nestedType: MyData? = null
+)
+```
+
+> [!NOTE]
+> This impacts only the schema generation and the deserialization of the field, and not the serialization.
+
+> [!WARNING]
+> Do not use `@org.apache.avro.reflect.AvroDefault` as this annotation is not visible to Avro4k.
-val schema = Avro.schema(Test.serializer())
+### kotlin default value
+
+You can also set a kotlin default value, but this default won't be present in the generated schema as Avro4k is not able to retrieve it:
+
+```kotlin
+@Serializable
+data class MyData(
+    val stringField: String = "default value",
+    val intField: Int? = 42,
+)
 ```
-Would generate the following schema:
+> This impacts only the deserialization of the field, and not the serialization or the schema generation.
-```json
-{
-  "type":"record",
-  "name":"Test",
-  "namespace":"com.foo",
-  "fields":[{
-    "name":"decimal",
-    "type":{
-      "type":"bytes",
-      "logicalType":"decimal",
-      "scale":"1",
-      "precision":"4"
-    }
-  }]
-}
+## Add aliases
+
+To read data written with different schemas, or to write to different schemas, you can add aliases to a named type (record, enum) or to a field by annotating it
+with `@AvroAlias`. The given aliases may contain the full name of the alias type or only the name.
+ +> [Avro spec link](https://avro.apache.org/docs/1.11.1/specification/#aliases) + +> [!NOTE] +> Aliases are not impacted by [naming strategy](#field-naming-strategy-overall-change), so you need to provide aliases directly applying the corresponding naming strategy if you +> need to respect it. + +```kotlin +import com.github.avrokotlin.avro4k.AvroAlias + +@Serializable +@AvroAlias("full.name.RecordName", "JustOtherRecordName") +data class MyData( + @AvroAlias("anotherFieldName", "old_field_name") val myField: String +) ``` -### Avro Fixed +> [!NOTE] +> This impacts the schema generation, the serialization and the deserialization. -Avro supports the idea of fixed length byte arrays. -To use these we can either override the schema generated for a type to return Schema.Type.Fixed. -This will work for types like String or UUID. -You can also annotate a field with `@AvroFixed(size)`. +> [!WARNING] +> Do not use `@org.apache.avro.reflect.AvroAlias` as this annotation is not visible by Avro4k. -For example: +## Add metadata to a schema (custom properties) + +You can add custom properties to a schema to have additional metadata on a type. +To do so, you can annotate the data class or field with `@AvroProp`. The value can be a regular string or any json content: ```kotlin -package com.github.avrokotlin.avro4k.example +@Serializable +@AvroProp("custom_string_property", "The default non-json value") +@AvroProp("custom_int_property", "42") +@AvroProp("custom_json_property", """{"key":"value"}""") +data class MyData( + @AvroProp("custom_field_property", "Also working on fields") + val myField: String +) +``` + +> [!NOTE] +> This impacts only the schema generation. For more details, check the [avro specification](https://avro.apache.org/docs/1.11.1/specification/#schema_props). + +> [!WARNING] +> Do not use `@org.apache.avro.reflect.AvroMeta` as this annotation is not visible by Avro4k. 
+
+## Change scale and precision for `decimal` logical type
-data class Foo(@AvroFixed(7) val mystring: String)
+By default, the scale is `2` and the precision `8`. To change it, annotate the field with `@AvroDecimal`:
-val schema = Avro.schema(Foo.serializer())
+```kotlin
+@Serializable
+data class MyData(
+    @AvroDecimal(scale = 4, precision = 10) val myField: BigDecimal
+)
 ```
-Will generate the following schema:
+> [!NOTE]
+> This impacts the schema generation, the serialization and the deserialization.
-```json
-{
-  "type": "record",
-  "name": "Foo",
-  "namespace": "com.github.avrokotlin.avro4k.example",
-  "fields": [
-    {
-      "name": "mystring",
-      "type": {
-        "type": "fixed",
-        "name": "mystring",
-        "size": 7
-      }
-    }
-  ]
+## Change enum values' name
+
+By default, enum symbols are exactly the names of the enum values in the enum classes. To change this default, you need to annotate enum values with `@SerialName`.
+
+```kotlin
+@Serializable
+enum class MyEnum {
+    @SerialName("CUSTOM_NAME")
+    A,
+    B,
+    C
 }
 ```
-### Transient Fields
+> [!NOTE]
+> This impacts the schema generation, the serialization and the deserialization.
-The kotlinx.serialization framework does not support the standard @transient anotation to mark a field as ignored, but instead supports its own `@kotlinx.serialization.Transient` annotation to do the same job.
- Any field marked with this will be excluded from the generated schema.
+## Set enum default
-For example, the following code:
+When the data was written with a different schema than the one used for reading, the reader may encounter an enum symbol it does not know, which triggers an error.
+To avoid this error, you can set a default symbol for an enum by annotating the expected fallback with `@AvroEnumDefault`.
```kotlin -package com.github.avrokotlin.avro4k.example - -data class Foo(val a: String, @Transient val b: String = "default value") -``` +@Serializable +enum class MyEnum { + A, -Would result in the following schema: + @AvroEnumDefault + B, -```json -{ - "type": "record", - "name": "Foo", - "namespace": "com.github.avrokotlin.avro4k.example", - "fields": [ - { - "name": "a", - "type": "string" - } - ] + C } ``` -### Nullable fields, optional fields and compatibility +> [!NOTE] +> This impacts the schema generation, the serialization and the deserialization. + +## Change type name (RECORD and ENUM) + +RECORD and ENUM types in Avro have a name and a namespace (composing a full-name like `namespace.name`). By default, the name is the name of the class/enum and the namespace is the +package name. +To change this default, you need to annotate data classes and enums with `@SerialName`. + +> [!WARNING] +> `@SerialName` is redefining the full-name of the annotated class or enum, so you **must** repeat the name or the namespace if you only need to change the namespace or the name +> respectively. + +> [!NOTE] +> This impacts the schema generation, the serialization and the deserialization. + +### Changing the name while keeping the namespace -#### TL;DR; -To make your nullable fields optional (put `default: null` on all nullable fields if no other explicit default provided) and be able to remove nullable fields regarding compatibility checks, -you can set in the configuration the `defaultNullForNullableFields` to `true`. Example: ```kotlin -Avro(AvroConfiguration(defaultNullForNullableFields = true)) +package my.package + +@Serializable +@SerialName("my.package.MyRecord") +data class MyData(val myField: String) ``` -#### Longer story +### Changing the namespace while keeping the name -With avro, you can have nullable fields and optional fields, that are taken into account for compatibility checking when using the schema registry. 
+```kotlin +package my.package -But if you want to remove a nullable field that is not optional, depending on the compatibility mode, it may not be compatible because of the missing default value. +@Serializable +@SerialName("custom.namespace.MyData") +data class MyData(val myField: String) +``` -- What is an optional field ? -> An optional field is a field that have a *default* value, like an int with a default as `-1`. - -- What is a nullable field ? -> A nullable field is a field that contains a `null` type in its type union, but **it's not an optional field if you don't put `default` value to `null`**. +### Changing the name and the namespace -So to mark a field as optional and facilitate avro contract evolution regarding compatibility checks, then set `default` to `null`. +```kotlin +package my.package + +@Serializable +@SerialName("custom.namespace.MyRecord") +data class MyData(val myField: String) +``` -## Types - -Avro4k supports the Avro logical types out of the box as well as other common JDK types. - -Avro has no understanding of Kotlin types, or anything outside of it's built in set of supported types, so all values must be converted to something that is compatible with Avro. - -For example a `java.sql.Timestamp` is usually encoded as a `Long`, and a `java.util.UUID` is encoded as a `String`. - -Some values can be mapped in multiple ways depending on how the schema was generated. For example a `String`, which is usually encoded as an `org.apache.avro.util.Utf8` could also be encoded as an array of bytes if the generated schema for that field was `Schema.Type.BYTES`. -Therefore some serializers will take into account the schema passed to them when choosing the avro compatible type. - -The following table shows how types used in your code will be mapped / encoded in the generated Avro schemas and files. -If a type can be mapped in multiple ways, it is listed more than once. 
- -| JVM Type | Schema Type | Logical Type | Encoded Type | -|--------------------------------|------------------|--------------------|--------------------------| -| String | STRING | | Utf8 | -| String | FIXED | | GenericFixed | -| String | BYTES | | ByteBuffer | -| Boolean | BOOLEAN | | java.lang.Boolean | -| Long | LONG | | java.lang.Long | -| Int | INT | | java.lang.Integer | -| Short | INT | | java.lang.Integer | -| Byte | INT | | java.lang.Integer | -| Double | DOUBLE | | java.lang.Double | -| Float | FLOAT | | java.lang.Float | -| UUID | STRING | UUID | Utf8 | -| LocalDate | INT | Date | java.lang.Int | -| LocalTime | INT | Time-Millis | java.lang.Int | -| LocalDateTime | LONG | Timestamp-Millis | java.lang.Long | -| Instant | LONG | Timestamp-Millis | java.lang.Long | -| Timestamp | LONG | Timestamp-Millis | java.lang.Long | -| BigDecimal | BYTES | Decimal<8,2> | ByteBuffer | -| BigDecimal | FIXED | Decimal<8,2> | GenericFixed | -| BigDecimal | STRING | Decimal<8,2> | String | -| T? (nullable type) | UNION | | null, T | -| ByteArray | BYTES | | ByteBuffer | -| ByteArray | FIXED | | GenericFixed | -| ByteBuffer | BYTES | | ByteBuffer | -| List[Byte] | BYTES | | ByteBuffer | -| Array | ARRAY | | Array[T] | -| List | ARRAY | | Array[T] | -| Set | ARRAY | | Array[T] | -| Map[String, V] | MAP | | java.util.Map[String, V] | -| data class T | RECORD | | GenericRecord | -| enum class | ENUM | | GenericEnumSymbol | - -In order to use logical types, annotate the value with an appropriate Serializer, or `@Contextual` as it is handled by default: +## Changing the namespace of all nested named type(s) + +Sometimes, using classes from other packages or libraries, you may want to change the namespace of a nested named type. This is done annotating the field +with `@AvroNamespaceOverride`. 
```kotlin +import kotlinx.serialization.Serializable +import com.github.avrokotlin.avro4k.AvroNamespaceOverride + @Serializable -data class WithInstant( - @Serializable(with=InstantSerializer::class) val inst: Instant +data class MyData( + @AvroNamespaceOverride("new.namespace") val myField: NestedRecord ) + +// ... +package external.package.name + @Serializable -data class WithInstantContextual( - @Contextual val inst: Instant -) +data class NestedRecord(val field: String) ``` -All the logical type serializers are available in the package `com.github.avrokotlin.avro4k.serializer`. You can find additional serializers for [arrow](https://arrow-kt.io/) types in the -[avro4k-arrow](https://github.com/avro-kotlin/avro4k-arrow) project. +> [!NOTE] +> This impacts the schema generation, the serialization and the deserialization. -## Input / Output +## Change type name (FIXED only) -### Formats +> [!WARNING] +> For the moment, it is not possible to manually change the namespace or the name of a FIXED type as the type name is coming from the field name and the namespace from the +> enclosing data class package. -Avro supports four different encoding types [serializing records](https://avro.apache.org/docs/current/spec.html#Data+Serialization+and+Deserialization). +## Set a custom logical type -These are binary with schema, binary without schema, json and single object encoding. +To create a custom logical type, you need to create a serializer that will handle the serialization and deserialization of the value, and provide a descriptor that include +the `@AvroLogicalType` annotation. +Additionally, you need to [register your serializer](#support-additional-non-serializable-types). -In avro4k these are represented by an `AvroFormat` enum with three values - `AvroFormat.Binary` (binary no schema), `AvroFormat.Data` (binary with schema), and `AvroFormat.Json`. 
+> [!WARNING] +> Once [this issue](https://github.com/Kotlin/kotlinx.serialization/issues/2631) is fixed and released, this section will be updated as the implementation will change. -Binary encoding without the schema does not include field names, self-contained information about the types of individual bytes, nor field or record separators. -Therefore readers are wholly reliant on the schema used when the data was encoded but the format is by far the most compact. Binary encodings are [fast](https://www.slideshare.net/oom65/orc-files?next_slideshow=1). +TODO: complete this section once kotlinx-serialization releases the version that can unwrap a nullable descriptor -Binary encoding with the schema is still quick to deserialize, but is obviously less compact. However, as the schema is included readers do not need to have access to the original schema. +## Skip a kotlin field -Json encoding is the largest and slowest, but the easist to work with outside of Avro, and of course is easy to view on the wire (if that is a concern). +To skip a field during encoding, you can annotate it with `@kotlinx.serialization.Transient`. +Note that you need to provide a default value for the field, as the field will also be totally discarded during decoding (IntelliJ should trigger a warning). +```kotlin +import kotlinx.serialization.Serializable +import kotlinx.serialization.Transient -### Using avro4k in your project +@Serializable +data class Foo(val a: String, @Transient val b: String = "default value") +``` -Gradle -```implementation 'com.github.avro-kotlin.avro4k:avro4k-core:xxx'``` +> [!NOTE] +> This impacts the schema generation, the serialization and the deserialization. -Maven -```xml - - com.github.avro-kotlin.avro4k - avro4k-core - xxx - +## Disable implicit `default: null` for nullable fields + +By default, Avro4k makes your nullable fields optional (it puts `default: null` on all nullable fields if no other explicit default is provided). 
+You can opt out of this feature by setting `implicitNulls` to `false` in the `Avro` configuration: + +```kotlin +Avro { + implicitNulls = false +} ``` -Check the latest released version on Maven Central +> [!NOTE] +> This impacts the schema generation, the serialization and the deserialization. + +# Nullable fields, optional fields and compatibility + +With Avro, you can have nullable fields and optional fields, which are taken into account for compatibility checking when using the schema registry. + +But if you want to remove a nullable field that is not optional, depending on the compatibility mode, it may not be compatible because of the missing default value. + +- What is an optional field? + +> An optional field is a field that has a *default* value, like an int with a default of `-1`. + +- What is a nullable field? + +> A nullable field is a field that contains a `null` type in its type union, but **it's not an optional field if you don't set its `default` value to `null`**. + +So to mark a field as optional and ease Avro contract evolution regarding compatibility checks, set its `default` to `null`. -### Known problems +# Known problems -Kotlin 1.7.20 up to 1.8.10 cannot properly compile @SerialInfo-Annotations on enums (see https://github.com/Kotlin/kotlinx.serialization/issues/2121). -This is fixed with kotlin 1.8.20. So if you are planning to use any of avro4k's annotations on enum types, please make sure that you are using kotlin >= 1.8.20. +- Kotlin 1.7.20 up to 1.8.10 cannot properly compile @SerialInfo-Annotations on enums (see https://github.com/Kotlin/kotlinx.serialization/issues/2121). + This is fixed with kotlin 1.8.20. So if you are planning to use any of avro4k's annotations on enum types, please make sure that you are using kotlin >= 1.8.20. -### Contributions +# Contributions Contributions to avro4k are always welcome. 
Good ways to contribute include: - Raising bugs and feature requests -- Fixing bugs and enhancing the DSL +- Fixing bugs and enhancing the API - Improving the performance of avro4k -- Adding to the documentation +- Adding documentation diff --git a/api/avro4k-core.api b/api/avro4k-core.api index 733548fc..0073d4ca 100644 --- a/api/avro4k-core.api +++ b/api/avro4k-core.api @@ -161,7 +161,6 @@ public final class com/github/avrokotlin/avro4k/AvroSingleObjectKt { } public abstract interface class com/github/avrokotlin/avro4k/decoder/AvroDecoder : kotlinx/serialization/encoding/Decoder { - public abstract fun decodeValue ()Ljava/lang/Object; public abstract fun getCurrentWriterSchema ()Lorg/apache/avro/Schema; } diff --git a/src/main/kotlin/com/github/avrokotlin/avro4k/annotations.kt b/src/main/kotlin/com/github/avrokotlin/avro4k/annotations.kt index e44dd7d1..38b29d91 100644 --- a/src/main/kotlin/com/github/avrokotlin/avro4k/annotations.kt +++ b/src/main/kotlin/com/github/avrokotlin/avro4k/annotations.kt @@ -5,6 +5,7 @@ package com.github.avrokotlin.avro4k import com.github.avrokotlin.avro4k.serializer.BigDecimalSerializer import kotlinx.serialization.ExperimentalSerializationApi import kotlinx.serialization.SerialInfo +import kotlinx.serialization.SerializationException import kotlinx.serialization.descriptors.SerialDescriptor import org.apache.avro.LogicalType import org.apache.avro.Schema @@ -113,6 +114,10 @@ public interface AvroSchemaSupplier { /** * Allows to specify the logical type applied on the generated schema of a property. + * + * The given class **must** be an object and implement [AvroLogicalTypeSupplier], otherwise an [SerializationException] will be thrown. + * + * WARNING: This uses reflection to retrieve the object instance. 
*/ @SerialInfo @ExperimentalSerializationApi diff --git a/src/main/kotlin/com/github/avrokotlin/avro4k/decoder/AvroDecoder.kt b/src/main/kotlin/com/github/avrokotlin/avro4k/decoder/AvroDecoder.kt index 582ac5bc..9ede1604 100644 --- a/src/main/kotlin/com/github/avrokotlin/avro4k/decoder/AvroDecoder.kt +++ b/src/main/kotlin/com/github/avrokotlin/avro4k/decoder/AvroDecoder.kt @@ -6,6 +6,10 @@ import org.apache.avro.Schema import org.apache.avro.generic.GenericFixed public interface AvroDecoder : Decoder { + /** + * Provides the schema used to encode the current value. + * It won't return a union as the schema corresponds to the actual value. + */ @ExperimentalSerializationApi public val currentWriterSchema: Schema @@ -15,5 +19,11 @@ public interface AvroDecoder : Decoder { @ExperimentalSerializationApi public fun decodeFixed(): GenericFixed + /** + * Decodes a value that corresponds to the [currentWriterSchema]. + * + * You should prefer reading [currentWriterSchema] directly to get the schema, and then decoding the value using the appropriate **decode*** method. 
+ */ + @ExperimentalSerializationApi public fun decodeValue(): Any } \ No newline at end of file From a0306818fc63f94e275d3d3a0bb7a482ae9e1bd9 Mon Sep 17 00:00:00 2001 From: Chuckame Date: Thu, 9 May 2024 15:58:05 +0200 Subject: [PATCH 2/4] chore!: Stop supporting Timestamp class --- api/avro4k-core.api | 13 ------ .../com/github/avrokotlin/avro4k/Avro.kt | 2 - .../avrokotlin/avro4k/serializer/date.kt | 43 ------------------- .../encoding/LogicalTypesEncodingTest.kt | 7 --- .../avro4k/schema/DateSchemaTest.kt | 2 - 5 files changed, 67 deletions(-) diff --git a/api/avro4k-core.api b/api/avro4k-core.api index 0073d4ca..1528334a 100644 --- a/api/avro4k-core.api +++ b/api/avro4k-core.api @@ -310,19 +310,6 @@ public final class com/github/avrokotlin/avro4k/serializer/LocalTimeSerializer : public fun serializeGeneric (Lkotlinx/serialization/encoding/Encoder;Ljava/time/LocalTime;)V } -public final class com/github/avrokotlin/avro4k/serializer/TimestampSerializer : com/github/avrokotlin/avro4k/serializer/AvroTimeSerializer { - public static final field INSTANCE Lcom/github/avrokotlin/avro4k/serializer/TimestampSerializer; - public synthetic fun deserializeAvro (Lcom/github/avrokotlin/avro4k/decoder/AvroDecoder;)Ljava/lang/Object; - public fun deserializeAvro (Lcom/github/avrokotlin/avro4k/decoder/AvroDecoder;)Ljava/sql/Timestamp; - public synthetic fun deserializeGeneric (Lkotlinx/serialization/encoding/Decoder;)Ljava/lang/Object; - public fun deserializeGeneric (Lkotlinx/serialization/encoding/Decoder;)Ljava/sql/Timestamp; - public fun getLogicalType (Ljava/util/List;)Lorg/apache/avro/LogicalType; - public synthetic fun serializeAvro (Lcom/github/avrokotlin/avro4k/encoder/AvroEncoder;Ljava/lang/Object;)V - public fun serializeAvro (Lcom/github/avrokotlin/avro4k/encoder/AvroEncoder;Ljava/sql/Timestamp;)V - public synthetic fun serializeGeneric (Lkotlinx/serialization/encoding/Encoder;Ljava/lang/Object;)V - public fun serializeGeneric 
(Lkotlinx/serialization/encoding/Encoder;Ljava/sql/Timestamp;)V -} - public final class com/github/avrokotlin/avro4k/serializer/URLSerializer : kotlinx/serialization/KSerializer { public static final field INSTANCE Lcom/github/avrokotlin/avro4k/serializer/URLSerializer; public synthetic fun deserialize (Lkotlinx/serialization/encoding/Decoder;)Ljava/lang/Object; diff --git a/src/main/kotlin/com/github/avrokotlin/avro4k/Avro.kt b/src/main/kotlin/com/github/avrokotlin/avro4k/Avro.kt index dd74e977..fd8384b0 100644 --- a/src/main/kotlin/com/github/avrokotlin/avro4k/Avro.kt +++ b/src/main/kotlin/com/github/avrokotlin/avro4k/Avro.kt @@ -13,7 +13,6 @@ import com.github.avrokotlin.avro4k.serializer.InstantSerializer import com.github.avrokotlin.avro4k.serializer.LocalDateSerializer import com.github.avrokotlin.avro4k.serializer.LocalDateTimeSerializer import com.github.avrokotlin.avro4k.serializer.LocalTimeSerializer -import com.github.avrokotlin.avro4k.serializer.TimestampSerializer import com.github.avrokotlin.avro4k.serializer.URLSerializer import com.github.avrokotlin.avro4k.serializer.UUIDSerializer import kotlinx.serialization.DeserializationStrategy @@ -64,7 +63,6 @@ public sealed class Avro( contextual(LocalDateSerializer) contextual(LocalTimeSerializer) contextual(LocalDateTimeSerializer) - contextual(TimestampSerializer) } ) diff --git a/src/main/kotlin/com/github/avrokotlin/avro4k/serializer/date.kt b/src/main/kotlin/com/github/avrokotlin/avro4k/serializer/date.kt index 53cd8040..6c59472c 100644 --- a/src/main/kotlin/com/github/avrokotlin/avro4k/serializer/date.kt +++ b/src/main/kotlin/com/github/avrokotlin/avro4k/serializer/date.kt @@ -18,7 +18,6 @@ import kotlinx.serialization.encoding.Encoder import org.apache.avro.LogicalType import org.apache.avro.LogicalTypes import org.apache.avro.Schema -import java.sql.Timestamp import java.time.Instant import java.time.LocalDate import java.time.LocalDateTime @@ -179,48 +178,6 @@ public object LocalDateTimeSerializer 
: AvroTimeSerializer(LocalD } } -public object TimestampSerializer : AvroTimeSerializer(Timestamp::class, PrimitiveKind.LONG) { - override fun getLogicalType(inlinedStack: List): LogicalType { - return LogicalTypes.timestampMillis() - } - - override fun serializeAvro( - encoder: AvroEncoder, - value: Timestamp, - ) { - encoder.encodeResolvingUnion({ - with(encoder) { - BadEncodedValueError(value, encoder.currentWriterSchema, Schema.Type.STRING, Schema.Type.LONG) - } - }) { schema -> - when (schema.type) { - Schema.Type.LONG -> - when (schema.logicalType) { - is LogicalTypes.TimestampMillis, null -> encoder.encodeLong(value.toInstant().toEpochMilli()) - is LogicalTypes.TimestampMicros -> encoder.encodeLong(value.toInstant().toEpochMicros()) - else -> null - } - - Schema.Type.STRING -> encoder.encodeString(value.toInstant().toString()) - else -> null - } - } - } - - override fun serializeGeneric( - encoder: Encoder, - value: Timestamp, - ) { - encoder.encodeLong(value.toInstant().toEpochMilli()) - } - - override fun deserializeAvro(decoder: AvroDecoder): Timestamp = deserializeGeneric(decoder) - - override fun deserializeGeneric(decoder: Decoder): Timestamp { - return Timestamp(decoder.decodeLong()) - } -} - public object InstantSerializer : AvroTimeSerializer(Instant::class, PrimitiveKind.LONG) { override fun getLogicalType(inlinedStack: List): LogicalType { return LogicalTypes.timestampMillis() diff --git a/src/test/kotlin/com/github/avrokotlin/avro4k/encoding/LogicalTypesEncodingTest.kt b/src/test/kotlin/com/github/avrokotlin/avro4k/encoding/LogicalTypesEncodingTest.kt index bffe7d2c..535167bb 100644 --- a/src/test/kotlin/com/github/avrokotlin/avro4k/encoding/LogicalTypesEncodingTest.kt +++ b/src/test/kotlin/com/github/avrokotlin/avro4k/encoding/LogicalTypesEncodingTest.kt @@ -15,7 +15,6 @@ import org.apache.avro.SchemaBuilder import java.math.BigDecimal import java.math.BigInteger import java.net.URL -import java.sql.Timestamp import java.time.Instant import 
java.time.LocalDate import java.time.LocalDateTime @@ -44,7 +43,6 @@ internal class LogicalTypesEncodingTest : StringSpec({ BigDecimal("123.45"), LocalDate.ofEpochDay(18262), LocalTime.ofSecondOfDay(45296), - Timestamp(1577889296000), Instant.ofEpochSecond(1577889296), Instant.ofEpochSecond(1577889296, 424000), UUID.fromString("123e4567-e89b-12d3-a456-426614174000"), @@ -61,7 +59,6 @@ internal class LogicalTypesEncodingTest : StringSpec({ null, null, null, - null, null ) @@ -82,7 +79,6 @@ internal class LogicalTypesEncodingTest : StringSpec({ 18262, 45296000, 1577889296000, - 1577889296000, 1577889296000424, "123e4567-e89b-12d3-a456-426614174000", "http://example.com", @@ -98,7 +94,6 @@ internal class LogicalTypesEncodingTest : StringSpec({ null, null, null, - null, null ) ) @@ -111,7 +106,6 @@ internal class LogicalTypesEncodingTest : StringSpec({ @Serializable(BigDecimalAsStringSerializer::class) val decimalString: BigDecimal, @Contextual val date: LocalDate, @Contextual val time: LocalTime, - @Contextual val timestamp: Timestamp, @Contextual val instant: Instant, @Serializable(InstantToMicroSerializer::class) val instantMicros: Instant, @Contextual val uuid: UUID, @@ -123,7 +117,6 @@ internal class LogicalTypesEncodingTest : StringSpec({ @Serializable(BigDecimalAsStringSerializer::class) val decimalStringNullable: BigDecimal?, @Contextual val dateNullable: LocalDate?, @Contextual val timeNullable: LocalTime?, - @Contextual val timestampNullable: Timestamp?, @Contextual val instantNullable: Instant?, @Serializable(InstantToMicroSerializer::class) val instantMicrosNullable: Instant?, @Contextual val uuidNullable: UUID?, diff --git a/src/test/kotlin/com/github/avrokotlin/avro4k/schema/DateSchemaTest.kt b/src/test/kotlin/com/github/avrokotlin/avro4k/schema/DateSchemaTest.kt index b15cfd7e..07489d8f 100644 --- a/src/test/kotlin/com/github/avrokotlin/avro4k/schema/DateSchemaTest.kt +++ b/src/test/kotlin/com/github/avrokotlin/avro4k/schema/DateSchemaTest.kt @@ -7,7 
+7,6 @@ import com.github.avrokotlin.avro4k.serializer.InstantToMicroSerializer import com.github.avrokotlin.avro4k.serializer.LocalDateSerializer import com.github.avrokotlin.avro4k.serializer.LocalDateTimeSerializer import com.github.avrokotlin.avro4k.serializer.LocalTimeSerializer -import com.github.avrokotlin.avro4k.serializer.TimestampSerializer import io.kotest.core.spec.style.FunSpec import kotlinx.serialization.KSerializer import kotlinx.serialization.builtins.nullable @@ -20,7 +19,6 @@ internal class DateSchemaTest : FunSpec({ LocalTimeSerializer to LogicalTypes.timeMillis().addToSchema(Schema.create(Schema.Type.INT)), LocalDateTimeSerializer to LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG)), InstantSerializer to LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG)), - TimestampSerializer to LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG)), InstantToMicroSerializer to LogicalTypes.timestampMicros().addToSchema(Schema.create(Schema.Type.LONG)) ).forEach { (serializer: KSerializer<*>, expected) -> test("generate date logical type for $serializer") { From 7c07046b7d81fa0233d8c756664633ad8020620b Mon Sep 17 00:00:00 2001 From: Chuckame Date: Thu, 9 May 2024 15:57:35 +0200 Subject: [PATCH 3/4] chore! 
Remove custom schema as it is not supported in the serialization --- api/avro4k-core.api | 5 ---- .../github/avrokotlin/avro4k/annotations.kt | 14 --------- .../avrokotlin/avro4k/schema/ValueVisitor.kt | 6 ---- .../avro4k/schema/VisitorContext.kt | 5 ---- .../avro4k/schema/AvroSchemaTest.kt | 29 ------------------- 5 files changed, 59 deletions(-) delete mode 100644 src/test/kotlin/com/github/avrokotlin/avro4k/schema/AvroSchemaTest.kt diff --git a/api/avro4k-core.api b/api/avro4k-core.api index 1528334a..5069227e 100644 --- a/api/avro4k-core.api +++ b/api/avro4k-core.api @@ -150,11 +150,6 @@ public synthetic class com/github/avrokotlin/avro4k/AvroProp$Impl : com/github/a public final synthetic fun value ()Ljava/lang/String; } -public synthetic class com/github/avrokotlin/avro4k/AvroSchema$Impl : com/github/avrokotlin/avro4k/AvroSchema { - public fun (Lkotlin/reflect/KClass;)V - public final synthetic fun value ()Ljava/lang/Class; -} - public final class com/github/avrokotlin/avro4k/AvroSingleObjectKt { public static final fun decodeFromByteArray (Lcom/github/avrokotlin/avro4k/AvroSingleObject;Lkotlinx/serialization/DeserializationStrategy;[B)Ljava/lang/Object; public static final fun encodeToByteArray (Lcom/github/avrokotlin/avro4k/AvroSingleObject;Lorg/apache/avro/Schema;Lkotlinx/serialization/SerializationStrategy;Ljava/lang/Object;)[B diff --git a/src/main/kotlin/com/github/avrokotlin/avro4k/annotations.kt b/src/main/kotlin/com/github/avrokotlin/avro4k/annotations.kt index 38b29d91..94501abf 100644 --- a/src/main/kotlin/com/github/avrokotlin/avro4k/annotations.kt +++ b/src/main/kotlin/com/github/avrokotlin/avro4k/annotations.kt @@ -8,7 +8,6 @@ import kotlinx.serialization.SerialInfo import kotlinx.serialization.SerializationException import kotlinx.serialization.descriptors.SerialDescriptor import org.apache.avro.LogicalType -import org.apache.avro.Schema import org.intellij.lang.annotations.Language import kotlin.reflect.KClass @@ -99,19 +98,6 @@ public 
annotation class AvroDefault( @Target(AnnotationTarget.PROPERTY) public annotation class AvroEnumDefault -/** - * Allows to specify the schema of a property. - */ -@SerialInfo -@ExperimentalSerializationApi -@Target(AnnotationTarget.PROPERTY) -public annotation class AvroSchema(val value: KClass) - -@ExperimentalSerializationApi -public interface AvroSchemaSupplier { - public fun getSchema(stack: List): Schema -} - /** * Allows to specify the logical type applied on the generated schema of a property. * diff --git a/src/main/kotlin/com/github/avrokotlin/avro4k/schema/ValueVisitor.kt b/src/main/kotlin/com/github/avrokotlin/avro4k/schema/ValueVisitor.kt index 2267ce2f..e022e1a5 100644 --- a/src/main/kotlin/com/github/avrokotlin/avro4k/schema/ValueVisitor.kt +++ b/src/main/kotlin/com/github/avrokotlin/avro4k/schema/ValueVisitor.kt @@ -3,7 +3,6 @@ package com.github.avrokotlin.avro4k.schema import com.github.avrokotlin.avro4k.Avro import com.github.avrokotlin.avro4k.AvroFixed import com.github.avrokotlin.avro4k.AvroLogicalType -import com.github.avrokotlin.avro4k.AvroSchema import com.github.avrokotlin.avro4k.internal.AvroSchemaGenerationException import com.github.avrokotlin.avro4k.internal.jsonNode import com.github.avrokotlin.avro4k.internal.nonNullSerialName @@ -117,7 +116,6 @@ internal class ValueVisitor internal constructor( logicalType = annotations.logicalType.getLogicalType(annotations) } when { - annotations.customSchema != null -> setSchema(annotations.customSchema.getSchema(annotations)) annotations.fixed != null -> visitFixed(annotations.fixed) descriptor.isByteArray() -> visitByteArray() else -> super.visitValue(descriptor) @@ -127,10 +125,6 @@ internal class ValueVisitor internal constructor( private fun AnnotatedElementOrType.getLogicalType(valueAnnotations: ValueAnnotations): LogicalType { return this.annotation.value.newObjectInstance().getLogicalType(valueAnnotations.stack) } - - private fun AnnotatedElementOrType.getSchema(valueAnnotations: 
ValueAnnotations): Schema { - return this.annotation.value.newObjectInstance().getSchema(valueAnnotations.stack) - } } private fun KClass.newObjectInstance(): T { diff --git a/src/main/kotlin/com/github/avrokotlin/avro4k/schema/VisitorContext.kt b/src/main/kotlin/com/github/avrokotlin/avro4k/schema/VisitorContext.kt index 447a61a9..1c350b55 100644 --- a/src/main/kotlin/com/github/avrokotlin/avro4k/schema/VisitorContext.kt +++ b/src/main/kotlin/com/github/avrokotlin/avro4k/schema/VisitorContext.kt @@ -9,7 +9,6 @@ import com.github.avrokotlin.avro4k.AvroFixed import com.github.avrokotlin.avro4k.AvroLogicalType import com.github.avrokotlin.avro4k.AvroNamespaceOverride import com.github.avrokotlin.avro4k.AvroProp -import com.github.avrokotlin.avro4k.AvroSchema import com.github.avrokotlin.avro4k.internal.findAnnotation import com.github.avrokotlin.avro4k.internal.findAnnotations import com.github.avrokotlin.avro4k.internal.findElementAnnotation @@ -72,20 +71,17 @@ internal data class FieldAnnotations( internal data class ValueAnnotations( val stack: List, val fixed: AnnotatedElementOrType?, - val customSchema: AnnotatedElementOrType?, val logicalType: AnnotatedElementOrType?, ) { constructor(descriptor: SerialDescriptor, elementIndex: Int) : this( listOf(SimpleAnnotatedLocation(descriptor, elementIndex)), AnnotatedElementOrType(descriptor, elementIndex), - AnnotatedElementOrType(descriptor, elementIndex), AnnotatedElementOrType(descriptor, elementIndex) ) constructor(descriptor: SerialDescriptor) : this( listOf(SimpleAnnotatedLocation(descriptor)), AnnotatedElementOrType(descriptor), - AnnotatedElementOrType(descriptor), AnnotatedElementOrType(descriptor) ) } @@ -137,6 +133,5 @@ internal fun ValueAnnotations?.appendAnnotations(other: ValueAnnotations) = ValueAnnotations( fixed = this?.fixed ?: other.fixed, logicalType = this?.logicalType ?: other.logicalType, - customSchema = this?.customSchema ?: other.customSchema, stack = (this?.stack ?: emptyList()) + other.stack ) 
\ No newline at end of file diff --git a/src/test/kotlin/com/github/avrokotlin/avro4k/schema/AvroSchemaTest.kt b/src/test/kotlin/com/github/avrokotlin/avro4k/schema/AvroSchemaTest.kt deleted file mode 100644 index e93437c7..00000000 --- a/src/test/kotlin/com/github/avrokotlin/avro4k/schema/AvroSchemaTest.kt +++ /dev/null @@ -1,29 +0,0 @@ -package com.github.avrokotlin.avro4k.schema - -import com.github.avrokotlin.avro4k.AnnotatedLocation -import com.github.avrokotlin.avro4k.AvroAssertions -import com.github.avrokotlin.avro4k.AvroSchema -import com.github.avrokotlin.avro4k.AvroSchemaSupplier -import io.kotest.core.spec.style.StringSpec -import kotlinx.serialization.Serializable -import org.apache.avro.Schema -import org.apache.avro.SchemaBuilder - -internal class AvroSchemaTest : StringSpec({ - "@AvroLogicalType annotation should be supported" { - AvroAssertions.assertThat() - .generatesSchema(SchemaBuilder.fixed("myCustomSchema").doc("a doc").size(42)) - } -}) { - @JvmInline - @Serializable - private value class Something( - @AvroSchema(CustomSchemaSupplier::class) val value: String, - ) -} - -internal object CustomSchemaSupplier : AvroSchemaSupplier { - override fun getSchema(stack: List): Schema { - return Schema.createFixed("myCustomSchema", "a doc", null, 42) - } -} \ No newline at end of file From f3964d0a74f1cec29af6ceba4cc207f360d9c02d Mon Sep 17 00:00:00 2001 From: Chuckame Date: Thu, 9 May 2024 13:55:16 +0200 Subject: [PATCH 4/4] chore!: Internalize AvroLogicalType annotation, waiting for the new kotlinx release --- api/avro4k-core.api | 5 ----- .../kotlin/com/github/avrokotlin/avro4k/annotations.kt | 10 ++-------- 2 files changed, 2 insertions(+), 13 deletions(-) diff --git a/api/avro4k-core.api b/api/avro4k-core.api index 5069227e..90162415 100644 --- a/api/avro4k-core.api +++ b/api/avro4k-core.api @@ -101,11 +101,6 @@ public final class com/github/avrokotlin/avro4k/AvroKt { public static final fun schema 
(Lcom/github/avrokotlin/avro4k/Avro;Lkotlinx/serialization/KSerializer;)Lorg/apache/avro/Schema; } -public synthetic class com/github/avrokotlin/avro4k/AvroLogicalType$Impl : com/github/avrokotlin/avro4k/AvroLogicalType { - public fun (Lkotlin/reflect/KClass;)V - public final synthetic fun value ()Ljava/lang/Class; -} - public abstract interface annotation class com/github/avrokotlin/avro4k/AvroNamespaceOverride : java/lang/annotation/Annotation { public abstract fun value ()Ljava/lang/String; } diff --git a/src/main/kotlin/com/github/avrokotlin/avro4k/annotations.kt b/src/main/kotlin/com/github/avrokotlin/avro4k/annotations.kt index 94501abf..3a2e9788 100644 --- a/src/main/kotlin/com/github/avrokotlin/avro4k/annotations.kt +++ b/src/main/kotlin/com/github/avrokotlin/avro4k/annotations.kt @@ -5,7 +5,6 @@ package com.github.avrokotlin.avro4k import com.github.avrokotlin.avro4k.serializer.BigDecimalSerializer import kotlinx.serialization.ExperimentalSerializationApi import kotlinx.serialization.SerialInfo -import kotlinx.serialization.SerializationException import kotlinx.serialization.descriptors.SerialDescriptor import org.apache.avro.LogicalType import org.intellij.lang.annotations.Language @@ -99,16 +98,11 @@ public annotation class AvroDefault( public annotation class AvroEnumDefault /** - * Allows to specify the logical type applied on the generated schema of a property. - * - * The given class **must** be an object and implement [AvroLogicalTypeSupplier], otherwise an [SerializationException] will be thrown. - * - * WARNING: This uses reflection to retrieve the object instance. + * Will be removed when we will be able to unwrap a nullable descriptor. */ @SerialInfo -@ExperimentalSerializationApi @Target(AnnotationTarget.PROPERTY) -public annotation class AvroLogicalType(val value: KClass) +internal annotation class AvroLogicalType(val value: KClass) @ExperimentalSerializationApi public interface AvroLogicalTypeSupplier {