Merge pull request #224 from Chuckame/docs
fix: Removed AvroNamespaceOverride as it was not fully implemented
Chuckame authored Jun 25, 2024
2 parents 9df5c94 + 87c58ac commit 25f3f38
Showing 15 changed files with 225 additions and 301 deletions.
117 changes: 117 additions & 0 deletions Migrating-from-v1.md
@@ -0,0 +1,117 @@
This guide shows, through examples, how to migrate from Avro4k v1 to v2.

> [!NOTE]
> If your migration case is not covered here, please [file an issue](https://github.com/avro-kotlin/avro4k/issues/new/choose) or [open a PR](https://github.com/avro-kotlin/avro4k/compare).

## Pure Avro serialization

```kotlin
// Previously
val bytes = Avro.default.encodeToByteArray(TheDataClass.serializer(), TheDataClass(...))
Avro.default.decodeFromByteArray(TheDataClass.serializer(), bytes)

// Now
val bytes = Avro.encodeToByteArray(TheDataClass(...))
Avro.decodeFromByteArray<TheDataClass>(bytes)
```
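
As a fuller illustration of the "Now" form, here is a minimal round-trip sketch. It assumes a hypothetical `TheDataClass` with two fields. Since `Avro` now implements `BinaryFormat`, the reified `encodeToByteArray` and `decodeFromByteArray` helpers are imported from `kotlinx.serialization` rather than from `com.github.avrokotlin.avro4k`. Depending on your library versions, an opt-in annotation may also be required.

```kotlin
import com.github.avrokotlin.avro4k.Avro
import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromByteArray
import kotlinx.serialization.encodeToByteArray

// Hypothetical class used only for this illustration.
@Serializable
data class TheDataClass(val name: String, val count: Int)

fun main() {
    val original = TheDataClass(name = "example", count = 3)

    // The schema is derived from the serializer's descriptor.
    val bytes: ByteArray = Avro.encodeToByteArray(original)
    val decoded = Avro.decodeFromByteArray<TheDataClass>(bytes)

    check(decoded == original) { "round trip should preserve the value" }
}
```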

## Set a field default value to null

```kotlin
// Previously
data class TheDataClass(
    @AvroDefault(Avro.NULL)
    val field: String?
)

// Now
// ... Nothing, as it is the default behavior!
data class TheDataClass(
    val field: String?
)
```

## Generic data serialization
Convert a Kotlin data class to a `GenericRecord` that can then be handled by Avro's `GenericDatumWriter`.

```kotlin
// Previously
val genericRecord: GenericRecord = Avro.default.toRecord(TheDataClass.serializer(), TheDataClass(...))
Avro.default.fromRecord(TheDataClass.serializer(), genericRecord)

// Now
val genericData: Any? = Avro.encodeToGenericData(TheDataClass(...))
Avro.decodeFromGenericData<TheDataClass>(genericData)
```


## Configure the `Avro` instance

```kotlin
// Previously
val avro = Avro(
    AvroConfiguration(
        namingStrategy = SnakeCaseNamingStrategy,
        implicitNulls = true,
    ),
    SerializersModule {
        contextual(CustomSerializer())
    }
)

// Now
val avro = Avro {
    fieldNamingStrategy = FieldNamingStrategy.SnakeCase
    implicitNulls = true
    serializersModule = SerializersModule {
        contextual(CustomSerializer())
    }
}
```

## Changing the name of a record

```kotlin
// Previously
@AvroName("TheName")
@AvroNamespace("a.custom.namespace")
data class TheDataClass(...)

// Now
@SerialName("a.custom.namespace.TheName")
data class TheDataClass(...)
```
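
In Avro, the namespace of a full name is everything before its last dot, which is why a single `@SerialName` can carry both parts. The sketch below illustrates that split with a hypothetical helper, `splitFullName`, which is not part of avro4k and only mirrors the naming rule.

```kotlin
// Hypothetical holder for the two parts of an Avro full name.
data class AvroFullName(val namespace: String?, val name: String)

// Splits a serial name at its last dot: the left part is the namespace,
// the right part is the record name. No dot means no namespace.
fun splitFullName(serialName: String): AvroFullName {
    val lastDot = serialName.lastIndexOf('.')
    return if (lastDot < 0) {
        AvroFullName(namespace = null, name = serialName)
    } else {
        AvroFullName(
            namespace = serialName.substring(0, lastDot),
            name = serialName.substring(lastDot + 1),
        )
    }
}

fun main() {
    println(splitFullName("a.custom.namespace.TheName"))
    // AvroFullName(namespace=a.custom.namespace, name=TheName)
    println(splitFullName("TheName"))
    // AvroFullName(namespace=null, name=TheName)
}
```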

## Writing an avro object container file with a custom field naming strategy

```kotlin
// Previously
Files.newOutputStream(Path("/your/file.avro")).use { outputStream ->
    Avro(AvroConfiguration(namingStrategy = SnakeCaseNamingStrategy))
        .openOutputStream(TheDataClass.serializer()) { encodeFormat = AvroEncodeFormat.Data(CodecFactory.snappyCodec()) }
        .to(outputStream)
        .write(TheDataClass(...))
        .write(TheDataClass(...))
        .write(TheDataClass(...))
        .close()
}


// Now
val dataSequence = sequenceOf(
    TheDataClass(...),
    TheDataClass(...),
    TheDataClass(...),
)
val avro = Avro { fieldNamingStrategy = FieldNamingStrategy.SnakeCase }
Files.newOutputStream(Path("/your/file.avro")).use { outputStream ->
    AvroObjectContainerFile(avro)
        .encodeToStream(dataSequence, outputStream) {
            codec(CodecFactory.snappyCodec())
            // You can also add your own metadata
            metadata("myProp", 1234L)
            metadata("a string metadata", "hello")
        }
}
```

29 changes: 4 additions & 25 deletions README.md
@@ -136,7 +136,7 @@ fun main() {

- **Avro4k** is highly based on apache avro library, that implies all the schema validation is done by it
- All members annotated with `@ExperimentalSerializationApi` are **subject to changes** in future releases without any notice as they are experimental, so please
check the release notes to check the needed migration
check the release notes for any required migration. For a version `A.B.C`, such changes will only increment the minor number `B`, not the major number `A`.
- **Avro4k** also supports encoding and decoding generic data, mainly because of confluent schema registry compatibility as their serializers only handle generic data. When avro4k
will support their schema registry, the generic encoding will be removed to keep this library as simple as possible.

@@ -737,30 +737,6 @@ package my.package
data class MyData(val myField: String)
```

## Changing the namespace of all nested named type(s)

Sometimes, using classes from other packages or libraries, you may want to change the namespace of a nested named type. This is done annotating the field
with `@AvroNamespaceOverride`.

```kotlin
import kotlinx.serialization.Serializable
import com.github.avrokotlin.avro4k.AvroNamespaceOverride

@Serializable
data class MyData(
@AvroNamespaceOverride("new.namespace") val myField: NestedRecord
)

// ...
package external.package.name

@Serializable
data class NestedRecord(val field: String)
```

> [!NOTE]
> This impacts the schema generation, the serialization and the deserialization.
## Change type name (FIXED only)

> [!WARNING]
@@ -836,6 +812,9 @@ So to mark a field as optional and facilitate avro contract evolution regarding
- Kotlin 1.7.20 up to 1.8.10 cannot properly compile @SerialInfo-Annotations on enums (see https://github.com/Kotlin/kotlinx.serialization/issues/2121).
This is fixed with kotlin 1.8.20. So if you are planning to use any of avro4k's annotations on enum types, please make sure that you are using kotlin >= 1.8.20.

# Migrating from v1 to v2
Head over to the [migration guide](Migrating-from-v1.md) to update your code from avro4k v1 to v2.

# Contributions

Contributions to avro4k are always welcome. Good ways to contribute include:
29 changes: 12 additions & 17 deletions api/avro4k-core.api
@@ -1,14 +1,20 @@
public final class com/github/avrokotlin/avro4k/AnnotationsKt {
public static final fun asAvroLogicalType (Lkotlinx/serialization/descriptors/SerialDescriptor;)Lkotlinx/serialization/descriptors/SerialDescriptor;
}

public abstract interface class com/github/avrokotlin/avro4k/AnyValueDecoder {
public abstract fun decodeAny (Lcom/github/avrokotlin/avro4k/AvroDecoder;)Ljava/lang/Object;
}

public abstract class com/github/avrokotlin/avro4k/Avro {
public abstract class com/github/avrokotlin/avro4k/Avro : kotlinx/serialization/BinaryFormat {
public static final field Default Lcom/github/avrokotlin/avro4k/Avro$Default;
public synthetic fun <init> (Lcom/github/avrokotlin/avro4k/AvroConfiguration;Lkotlinx/serialization/modules/SerializersModule;Lkotlin/jvm/internal/DefaultConstructorMarker;)V
public fun decodeFromByteArray (Lkotlinx/serialization/DeserializationStrategy;[B)Ljava/lang/Object;
public final fun decodeFromByteArray (Lorg/apache/avro/Schema;Lkotlinx/serialization/DeserializationStrategy;[B)Ljava/lang/Object;
public fun encodeToByteArray (Lkotlinx/serialization/SerializationStrategy;Ljava/lang/Object;)[B
public final fun encodeToByteArray (Lorg/apache/avro/Schema;Lkotlinx/serialization/SerializationStrategy;Ljava/lang/Object;)[B
public final fun getConfiguration ()Lcom/github/avrokotlin/avro4k/AvroConfiguration;
public final fun getSerializersModule ()Lkotlinx/serialization/modules/SerializersModule;
public fun getSerializersModule ()Lkotlinx/serialization/modules/SerializersModule;
public final fun schema (Lkotlinx/serialization/descriptors/SerialDescriptor;)Lorg/apache/avro/Schema;
}

@@ -24,10 +30,6 @@ public synthetic class com/github/avrokotlin/avro4k/AvroAlias$Impl : com/github/
public final synthetic fun value ()[Ljava/lang/String;
}

public final class com/github/avrokotlin/avro4k/AvroAnnotationsKt {
public static final fun asAvroLogicalType (Lkotlinx/serialization/descriptors/SerialDescriptor;)Lkotlinx/serialization/descriptors/SerialDescriptor;
}

public final class com/github/avrokotlin/avro4k/AvroBuilder {
public final fun build ()Lcom/github/avrokotlin/avro4k/AvroConfiguration;
public final fun getFieldNamingStrategy ()Lcom/github/avrokotlin/avro4k/FieldNamingStrategy;
@@ -164,15 +166,6 @@ public final class com/github/avrokotlin/avro4k/AvroKt {
public static final fun schema (Lcom/github/avrokotlin/avro4k/Avro;Lkotlinx/serialization/KSerializer;)Lorg/apache/avro/Schema;
}

public abstract interface annotation class com/github/avrokotlin/avro4k/AvroNamespaceOverride : java/lang/annotation/Annotation {
public abstract fun value ()Ljava/lang/String;
}

public synthetic class com/github/avrokotlin/avro4k/AvroNamespaceOverride$Impl : com/github/avrokotlin/avro4k/AvroNamespaceOverride {
public fun <init> (Ljava/lang/String;)V
public final synthetic fun value ()Ljava/lang/String;
}

public final class com/github/avrokotlin/avro4k/AvroObjectContainerFile {
public fun <init> ()V
public fun <init> (Lcom/github/avrokotlin/avro4k/Avro;)V
@@ -224,16 +217,18 @@ public synthetic class com/github/avrokotlin/avro4k/AvroProp$Impl : com/github/a
public final synthetic fun value ()Ljava/lang/String;
}

public final class com/github/avrokotlin/avro4k/AvroSingleObject {
public final class com/github/avrokotlin/avro4k/AvroSingleObject : kotlinx/serialization/BinaryFormat {
public fun <init> (Lkotlin/jvm/functions/Function1;Lcom/github/avrokotlin/avro4k/Avro;)V
public synthetic fun <init> (Lkotlin/jvm/functions/Function1;Lcom/github/avrokotlin/avro4k/Avro;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
public fun decodeFromByteArray (Lkotlinx/serialization/DeserializationStrategy;[B)Ljava/lang/Object;
public final fun decodeFromStream (Lkotlinx/serialization/DeserializationStrategy;Ljava/io/InputStream;)Ljava/lang/Object;
public fun encodeToByteArray (Lkotlinx/serialization/SerializationStrategy;Ljava/lang/Object;)[B
public final fun encodeToStream (Lorg/apache/avro/Schema;Lkotlinx/serialization/SerializationStrategy;Ljava/lang/Object;Ljava/io/OutputStream;)V
public final fun getAvro ()Lcom/github/avrokotlin/avro4k/Avro;
public fun getSerializersModule ()Lkotlinx/serialization/modules/SerializersModule;
}

public final class com/github/avrokotlin/avro4k/AvroSingleObjectKt {
public static final fun decodeFromByteArray (Lcom/github/avrokotlin/avro4k/AvroSingleObject;Lkotlinx/serialization/DeserializationStrategy;[B)Ljava/lang/Object;
public static final fun encodeToByteArray (Lcom/github/avrokotlin/avro4k/AvroSingleObject;Lorg/apache/avro/Schema;Lkotlinx/serialization/SerializationStrategy;Ljava/lang/Object;)[B
}

@@ -1,9 +1,9 @@
package com.github.avrokotlin.benchmark

import com.github.avrokotlin.avro4k.Avro
import com.github.avrokotlin.avro4k.decodeFromByteArray
import com.github.avrokotlin.avro4k.encodeToByteArray
import kotlinx.benchmark.Benchmark
import kotlinx.serialization.decodeFromByteArray
import kotlinx.serialization.encodeToByteArray

internal object Avro4kClientsStaticReadBenchmark {
@JvmStatic
@@ -11,18 +11,6 @@ import kotlinx.serialization.descriptors.SerialDescriptor
import org.apache.avro.LogicalType
import org.intellij.lang.annotations.Language

/**
* When annotated on a property, deeply overrides the namespace for all the nested named types (records, enums and fixed).
*
* Works with standard classes and inline classes.
*/
@SerialInfo
@ExperimentalSerializationApi
@Target(AnnotationTarget.PROPERTY)
public annotation class AvroNamespaceOverride(
val value: String,
)

/**
* Adds a property to the Avro schema or field. Its value could be any valid JSON or just a string.
*
@@ -83,6 +71,11 @@ public annotation class AvroFixed(val size: Int)
/**
* Sets the default avro value for a record's field.
*
* - Records and maps have to be represented as a json object
* - Arrays have to be represented as a json array
* - Nulls have to be represented as a json `null`. To set the string `"null"`, don't forget to quote the string, example: `""""null""""` or `"\"null\""`.
* - Any non json content will be treated as a string
*
* Ignored in inline classes.
*/
@SerialInfo
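
The KDoc rules above can be illustrated with a small declaration-only sketch. It assumes that `AvroDefault` takes a single string value and lives in `com.github.avrokotlin.avro4k`. The class and field names are hypothetical, and an opt-in annotation may be needed depending on the version.

```kotlin
import com.github.avrokotlin.avro4k.AvroDefault
import kotlinx.serialization.Serializable

// Hypothetical record showing one field per rule from the KDoc.
@Serializable
data class DefaultsExample(
    // Maps (and records) are written as a JSON object.
    @AvroDefault("""{"a": 1}""") val counts: Map<String, Int>,
    // Arrays are written as a JSON array.
    @AvroDefault("[]") val tags: List<String>,
    // A JSON null: the default is the null value.
    @AvroDefault("null") val nickname: String?,
    // A quoted string: the default is the four-character string "null".
    @AvroDefault("\"null\"") val literalNull: String,
    // Not valid JSON, so it is treated as a plain string.
    @AvroDefault("hello") val greeting: String,
)
```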
@@ -112,7 +105,7 @@ public annotation class AvroEnumDefault
* }
* ```
*
* For more complex needs, please file an issue [here](https://github.com/avro-kotlin/avro4k/issues).
* For more complex needs, please file a feature request [here](https://github.com/avro-kotlin/avro4k/issues).
*/
@ExperimentalSerializationApi
public fun SerialDescriptor.asAvroLogicalType(): SerialDescriptor {
39 changes: 19 additions & 20 deletions src/main/kotlin/com/github/avrokotlin/avro4k/Avro.kt
@@ -12,6 +12,7 @@ import com.github.avrokotlin.avro4k.serializer.LocalDateTimeSerializer
import com.github.avrokotlin.avro4k.serializer.LocalTimeSerializer
import com.github.avrokotlin.avro4k.serializer.URLSerializer
import com.github.avrokotlin.avro4k.serializer.UUIDSerializer
import kotlinx.serialization.BinaryFormat
import kotlinx.serialization.DeserializationStrategy
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.KSerializer
@@ -33,15 +34,15 @@ import java.io.ByteArrayInputStream
*/
public sealed class Avro(
public val configuration: AvroConfiguration,
public val serializersModule: SerializersModule,
) {
public override val serializersModule: SerializersModule,
) : BinaryFormat {
// We use the identity hash map because we could have multiple descriptors with the same name, especially
// when having 2 different version of the schema for the same name. kotlinx-serialization is instanciating the descriptors
// when having 2 different version of the schema for the same name. kotlinx-serialization is instantiating the descriptors
// only once, so we are safe in the main use cases. Combined with weak references to avoid memory leaks.
private val schemaCache: MutableMap<SerialDescriptor, Schema> = WeakIdentityHashMap()
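
The comment above explains why the schema cache is keyed by identity rather than equality. As a rough, JDK-only illustration (the `Descriptor` class is a hypothetical stand-in for `SerialDescriptor`, and weak references are left out), two equal-but-distinct keys collapse into one entry in a `HashMap` but stay separate in an `IdentityHashMap`:

```kotlin
import java.util.IdentityHashMap

// Hypothetical stand-in for a descriptor: equal by name, but possibly
// describing two different versions of the same named schema.
data class Descriptor(val name: String)

// Returns the number of entries kept by an equality-based map and by an
// identity-based map after inserting two equal but distinct keys.
fun cacheSizes(): Pair<Int, Int> {
    val first = Descriptor("my.Record")
    val second = Descriptor("my.Record") // equal to `first`, different instance

    val byEquality = HashMap<Descriptor, String>()
    val byIdentity = IdentityHashMap<Descriptor, String>()

    byEquality[first] = "schema v1"
    byEquality[second] = "schema v2" // overwrites the previous entry
    byIdentity[first] = "schema v1"
    byIdentity[second] = "schema v2" // kept as a separate entry

    return Pair(byEquality.size, byIdentity.size)
}

fun main() {
    println(cacheSizes()) // (1, 2)
}
```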

internal val recordResolver = RecordResolver(this)
internal val polymorphicResolver = PolymorphicResolver(this.serializersModule)
internal val polymorphicResolver = PolymorphicResolver(serializersModule)
internal val enumResolver = EnumResolver()

public companion object Default : Avro(
@@ -88,6 +89,20 @@ public sealed class Avro(
}
return result
}

override fun <T> decodeFromByteArray(
deserializer: DeserializationStrategy<T>,
bytes: ByteArray,
): T {
return decodeFromByteArray(schema(deserializer.descriptor), deserializer, bytes)
}

override fun <T> encodeToByteArray(
serializer: SerializationStrategy<T>,
value: T,
): ByteArray {
return encodeToByteArray(schema(serializer.descriptor), serializer, value)
}
}

public fun Avro(
@@ -121,8 +136,6 @@ public class AvroBuilder internal constructor(avro: Avro) {
private class AvroImpl(configuration: AvroConfiguration, serializersModule: SerializersModule) :
Avro(configuration, serializersModule)

// schema gen extensions

public inline fun <reified T> Avro.schema(): Schema {
val serializer = serializersModule.serializer<T>()
return schema(serializer.descriptor)
@@ -132,13 +145,6 @@ public fun <T> Avro.schema(serializer: KSerializer<T>): Schema {
return schema(serializer.descriptor)
}

// encoding extensions

public inline fun <reified T> Avro.encodeToByteArray(value: T): ByteArray {
val serializer = serializersModule.serializer<T>()
return encodeToByteArray(schema(serializer), serializer, value)
}

public inline fun <reified T> Avro.encodeToByteArray(
writerSchema: Schema,
value: T,
@@ -147,13 +153,6 @@ public inline fun <reified T> Avro.encodeToByteArray(
return encodeToByteArray(writerSchema, serializer, value)
}

// decoding extensions

public inline fun <reified T> Avro.decodeFromByteArray(bytes: ByteArray): T {
val serializer = serializersModule.serializer<T>()
return decodeFromByteArray(schema(serializer.descriptor), serializer, bytes)
}

public inline fun <reified T> Avro.decodeFromByteArray(
writerSchema: Schema,
bytes: ByteArray,
10 changes: 10 additions & 0 deletions src/main/kotlin/com/github/avrokotlin/avro4k/AvroDecoder.kt
@@ -45,9 +45,19 @@ public interface AvroDecoder : Decoder {
@ExperimentalSerializationApi
public val currentWriterSchema: Schema

/**
* Decode a [Schema.Type.BYTES] value.
*
* A bytes value is a sequence of bytes prefixed with an int corresponding to its length.
*/
@ExperimentalSerializationApi
public fun decodeBytes(): ByteArray

/**
* Decode a [Schema.Type.FIXED] value.
*
* A fixed value is a fixed-size sequence of bytes, where the length is not materialized in the binary output as it is known by the [currentWriterSchema].
*/
@ExperimentalSerializationApi
public fun decodeFixed(): GenericFixed
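
To make the difference between the two KDoc comments concrete, here is a minimal, JDK-only sketch of how a `bytes` value and a `fixed` value would be laid out. The helper names are hypothetical and this is not avro4k's encoder. It assumes the length prefix uses the zigzag variable-length integer coding described in the Avro specification.

```kotlin
import java.io.ByteArrayOutputStream

// Writes a number using zigzag + variable-length coding (7 bits per byte,
// high bit set on every byte except the last).
fun writeVarLong(value: Long, out: ByteArrayOutputStream) {
    var n = (value shl 1) xor (value shr 63)
    while ((n and 0x7FL.inv()) != 0L) {
        out.write(((n and 0x7FL) or 0x80L).toInt())
        n = n ushr 7
    }
    out.write(n.toInt())
}

// BYTES: the payload is prefixed with its length.
fun encodeBytes(payload: ByteArray): ByteArray {
    val out = ByteArrayOutputStream()
    writeVarLong(payload.size.toLong(), out)
    out.write(payload, 0, payload.size)
    return out.toByteArray()
}

// FIXED: the size comes from the schema, so only the raw payload is written.
fun encodeFixed(payload: ByteArray, schemaSize: Int): ByteArray {
    require(payload.size == schemaSize) { "expected $schemaSize bytes, got ${payload.size}" }
    return payload.copyOf()
}

fun main() {
    val payload = byteArrayOf(1, 2, 3)
    println(encodeBytes(payload).toList())    // [6, 1, 2, 3]
    println(encodeFixed(payload, 3).toList()) // [1, 2, 3]
}
```

The leading `6` is the zigzag encoding of the length `3`. A reader of the `fixed` value has no such prefix to rely on, which is why the decoder needs the writer schema to know how many bytes to consume.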
