permalink | sidebar | title | keywords | enterprise | toc | Tags | previous | next |
---|---|---|---|---|---|---|---|---|
enterprise_avro.html | mydoc_sidebar | Avro Plugin | Enterprise, Avro, Binary, File, HyperStream | true | false | Enterprise, Avro, Binary, File | enterprise_virtualcolumns.html | |
{% include prev_next.html %}
When working with large data volumes, it is sometimes unnecessary to have a database at all. It can be more performant to read binary data directly from a file and then analyze it in Speedment HyperStream. One such format for storing binary data is Avro.
Speedment generally uses the metadata in a database as the domain model when generating code. The metadata is then stored in a speedment.json file, and unless you call mvn speedment:reload, it will only connect to the database if that file doesn't exist. When working with Avro files, this can be used to your advantage. Instead of using the database metadata to generate the speedment.json file, use the Maven plugin speedment-avro-maven-plugin (only available for Speedment HyperStream) to create it from a number of Avro schemas. Then run mvn speedment:generate as usual to generate Java code.
<plugin>
    <groupId>com.speedment.enterprise.plugins</groupId>
    <artifactId>speedment-avro-maven-plugin</artifactId>
    <version>${speedment.enterprise.version}</version>
    <configuration>
        <projectName>sakila</projectName>
        <deleteExisting>true</deleteExisting>
        <enableEnumPlugin>true</enableEnumPlugin>
        <overridesFile>src/main/json/speedment_override.json</overridesFile>
        <schemas>
            <schema>
                <tableId>film</tableId>
                <schemaFile>src/main/avro/Film.avsc</schemaFile>
                <overridesFile>src/main/json/speedment_film_override.json</overridesFile>
            </schema>
        </schemas>
    </configuration>
    <!-- Execute automatically -->
    <executions>
        <execution>
            <id>Reload From Avro File</id>
            <phase>generate-sources</phase>
            <goals>
                <goal>avro</goal>
            </goals>
        </execution>
    </executions>
</plugin>
In the example above, the plugin is configured to generate a project called "sakila" with one table called "film" based on the Avro schema src/main/avro/Film.avsc. Two <overridesFile> elements are specified: one for the project as a whole and one that applies only to the film table. These are optional .json files that you can create manually to override any settings generated by the plugin. This is useful for configuring things that the Avro plugin can't yet figure out automatically, like the package structure of your project or which type mappers to use.
Here is an example of how src/main/json/speedment_override.json could look:
{
  "config" : {
    "id" : "sakila",
    "name" : "sakila",
    "companyName" : "speedment",
    "packageName" : "com.speedment.example.sakila",
    "appId" : "b9c8eb47-3810-4f45-972b-f2cf64f43d71",
    "dbmses" : [
      {
        "id" : "sakila",
        "alias" : "my_dbms",
        "schemas" : [
          {
            "id" : "sakila",
            "alias" : "my_schema"
          }
        ]
      }
    ]
  }
}
A custom companyName and packageName are set on the project level, and a custom alias is specified for both the generated dbms and the schema. Note that the id of the project, the dbms and the schema will all be the value you specified as projectName in the Maven plugin configuration.
Here is an example of how the table-specific src/main/json/speedment_film_override.json could look:
{
  "config" : {
    "id" : "sakila",
    "dbmses" : [
      {
        "id" : "sakila",
        "schemas" : [
          {
            "id" : "sakila",
            "tables" : [
              {
                "id" : "film",
                "avroDataFile" : "Film.avro",
                "columns" : [
                  {
                    "id" : "film_id",
                    "alias" : "id"
                  },
                  {
                    "id" : "rating",
                    "enabled" : false
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}
In this case, the location of the binary data file for the film table is specified, and some of the default settings are changed. First, the column film_id is given an alias so that it can be referenced as id in the code. Second, the column rating is disabled since it is not needed in the application.
The next step is to invoke mvn speedment:generate automatically after the speedment-avro-maven-plugin has been invoked. This can be done like this:
<plugin>
    <groupId>com.speedment.enterprise</groupId>
    <artifactId>speedment-enterprise-maven-plugin</artifactId>
    <version>${speedment.enterprise.version}</version>
    <configuration>
        <components>
            <component>com.speedment.enterprise.datastore.tool.DataStoreToolBundle</component>
            <component>com.speedment.enterprise.plugins.avro.generator.AvroGeneratorBundle</component>
        </components>
    </configuration>
    <executions>
        <execution>
            <id>Generate Speedment Sources</id>
            <phase>generate-sources</phase>
            <goals>
                <goal>generate</goal>
            </goals>
        </execution>
    </executions>
</plugin>
Finally, make sure that the Avro dependencies are on the classpath. To do this, add the avro-runtime dependency to the pom.xml file.
<dependency>
    <groupId>com.speedment.enterprise.plugins</groupId>
    <artifactId>avro-runtime</artifactId>
</dependency>
To load data from the .avro files instead of a database, you need to add the AvroRuntimeBundle to the Speedment application builder and append withSkipCheckDatabaseConnectivity().
new SakilaApplicationBuilder()
    .withSkipCheckDatabaseConnectivity()
    .withBundle(AvroRuntimeBundle.class)
    .withBundle(DataStoreBundle.class)
    .build();
Note that the order is important! The AvroRuntimeBundle needs to come before the DataStoreBundle since both specify a custom StreamSupplierComponent.
The generated code looks a bit different from what you get with a regular SQL database. You no longer have any SqlAdapter.java files, but instead a number of AvroAdapter.java files. These handle the deserialization from Avro to Speedment.
To change the location of the data file (the one that ends with .avro) at runtime, you can override the dataFile method in XXXAvroAdapter.java. For example, in the Sakila demo, you could override the method like this:
public class FilmAvroAdapter extends GeneratedFilmAvroAdapter {

    // Read from the "film.data.file" setting; defaults to "Film.avro"
    @Config(name = "film.data.file", value = "Film.avro")
    private String filmDataFile;

    @Override
    protected Path dataFile(ProjectComponent projects) {
        return Paths.get(filmDataFile);
    }
}
You can now change the location of the data file without recompiling by creating a settings.properties file in the root of your project with the line:
film.data.file = another_dir/another_film.avro
That path will then be used instead of the default.
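The fallback behaviour described above (use the value from settings.properties if the key is present, otherwise the default given in the annotation) can be sketched in plain Java. This is only an illustration of the lookup semantics using java.util.Properties, not Speedment's actual implementation of @Config; the class DataFileSetting and the method resolveDataFile are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class DataFileSetting {

    // Hypothetical helper (not part of Speedment): parses the text of a
    // .properties file and returns the path stored under the given key,
    // or the default value if the key is absent.
    static Path resolveDataFile(String settingsText, String key, String defaultValue)
            throws IOException {
        Properties settings = new Properties();
        settings.load(new StringReader(settingsText));
        return Paths.get(settings.getProperty(key, defaultValue));
    }

    public static void main(String[] args) throws IOException {
        // No settings present: the default from the annotation would apply.
        System.out.println(resolveDataFile("", "film.data.file", "Film.avro"));

        // The line from settings.properties overrides the default.
        String settings = "film.data.file = another_dir/another_film.avro";
        System.out.println(resolveDataFile(settings, "film.data.file", "Film.avro"));
    }
}
```

The whitespace around the equals sign is ignored by the properties format, so the line can be written exactly as shown in the text.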
If you have any questions, don't hesitate to reach out to the Speedment developers on Gitter.