permalink: enterprise_avro.html
sidebar: mydoc_sidebar
title: Avro Plugin
keywords: Enterprise, Avro, Binary, File, HyperStream
enterprise: true
toc: false
Tags: Enterprise, Avro, Binary, File
previous: enterprise_virtualcolumns.html
next:

{% include prev_next.html %}

About

When working with large data volumes, it is sometimes unnecessary to have a database at all. It can be more performant to read binary data directly from a file and then analyze it in Speedment HyperStream. One such format for storing binary data is Avro.

Generating Code

Speedment normally uses the metadata in a database as the domain model when generating code. That metadata is stored in a speedment.json file, and unless you call mvn speedment:reload, Speedment only connects to the database if that file doesn't exist. When working with Avro files, you can use this to your advantage: instead of generating the speedment.json file from database metadata, use the Maven plugin speedment-avro-maven-plugin (only available for Speedment HyperStream) to create it from a number of Avro schemas. Then run mvn speedment:generate as usual to generate Java code.
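
The rule for when a database connection is needed can be summarized in a few lines of plain Java. This is only a sketch of the decision as described above, not Speedment's actual implementation: the class and method names are hypothetical, and the location src/main/json/speedment.json is an assumption.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch; not part of Speedment.
final class MetadataSourceSketch {

    /**
     * Returns true if the database metadata would have to be read:
     * either a reload was requested explicitly, or no speedment.json exists yet.
     */
    static boolean needsDatabase(Path speedmentJson, boolean reloadRequested) {
        return reloadRequested || !Files.exists(speedmentJson);
    }

    public static void main(String[] args) {
        // Assumed location of the metadata file
        Path json = Paths.get("src", "main", "json", "speedment.json");
        System.out.println("Database needed: " + needsDatabase(json, false));
    }
}
```

Once the Avro plugin has written the speedment.json file, the check comes out false for a normal mvn speedment:generate run, which is why no database is required.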

<plugin>
    <groupId>com.speedment.enterprise.plugins</groupId>
    <artifactId>speedment-avro-maven-plugin</artifactId>
    <version>${speedment.enterprise.version}</version>
    
    <configuration>
        <projectName>sakila</projectName>
        <deleteExisting>true</deleteExisting>
        <enableEnumPlugin>true</enableEnumPlugin>
        <overridesFile>src/main/json/speedment_override.json</overridesFile>
        <schemas>
            <schema>
                <tableId>film</tableId>
                <schemaFile>src/main/avro/Film.avsc</schemaFile>
                <overridesFile>src/main/json/speedment_film_override.json</overridesFile>
            </schema>
        </schemas>
    </configuration>
    
    <!-- Execute automatically -->
    <executions>
        <execution>
            <id>Reload From Avro File</id>
            <phase>generate-sources</phase>
            <goals>
                <goal>avro</goal>
            </goals>
        </execution>
    </executions>
</plugin>

In the example above, the plugin is configured to generate a project called "sakila" with one table, "film", based on the Avro schema src/main/avro/Film.avsc. Two <overridesFile> elements are specified: one for the project as a whole and one that applies only to the film table. These are optional .json files that you create manually to override any settings generated by the plugin. This is useful for configuring things that the Avro plugin can't yet figure out automatically, such as the package structure of your project or which type mappers to use.

Override Generated speedment.json

Here is an example of how src/main/json/speedment_override.json could look:

{
  "config" : {
    "id" : "sakila",
    "name" : "sakila",
    "companyName" : "speedment",
    "packageName" : "com.speedment.example.sakila",
    "appId" : "b9c8eb47-3810-4f45-972b-f2cf64f43d71",
    "dbmses" : [
      {
        "id" : "sakila",
        "alias" : "my_dbms",
        "schemas" : [
          {
            "id" : "sakila",
            "alias" : "my_schema"
          }
        ]
      }
    ]
  }
}

A custom companyName and packageName are set at the project level, and a custom alias is specified for both the generated dbms and the schema. Note that the id of the project, the dbms and the schema will all be the value you specified as projectName in the Maven plugin configuration.

Override Table Specific Settings

Here is an example of how the table-specific src/main/json/speedment_film_override.json could look:

{
  "config" : {
    "id" : "sakila",
    "dbmses" : [
      {
        "id" : "sakila",
        "schemas" : [
          {
            "id" : "sakila",
            "tables" : [
              {
                "id" : "film",
                "avroDataFile" : "Film.avro",
                "columns" : [
                  {
                    "id" : "film_id",
                    "alias" : "id"
                  },
                  {
                    "id" : "rating",
                    "enabled" : false
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}

In this case, the location of the binary data file for the film table is specified and some of the default settings are changed. First, the column film_id is given an alias so that it can be referenced as id in the code. Second, the column rating is disabled since it is not needed in the application.
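
To get a feel for what these two overrides mean for the generated code, here is a small plain-Java sketch. It assumes that accessor names are derived from the alias when one is set (otherwise from the id) by converting snake_case to camelCase, and that disabled columns get no accessor at all. The Column class and both helper methods are hypothetical and are not part of Speedment, so the names in your generated code may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not part of Speedment.
final class ColumnOverrideSketch {

    /** Stand-in for one column entry in the override file. */
    static final class Column {
        final String id;
        final String alias;    // null when no alias is set
        final boolean enabled;

        Column(String id, String alias, boolean enabled) {
            this.id = id;
            this.alias = alias;
            this.enabled = enabled;
        }
    }

    /** Builds a getter name from the alias if present, otherwise from the id. */
    static String getterFor(Column column) {
        String name = column.alias != null ? column.alias : column.id;
        StringBuilder sb = new StringBuilder("get");
        for (String part : name.split("_")) {
            if (part.isEmpty()) {
                continue;
            }
            sb.append(Character.toUpperCase(part.charAt(0)))
              .append(part.substring(1));
        }
        return sb.toString();
    }

    /** Lists getter names for enabled columns only. */
    static List<String> getters(List<Column> columns) {
        List<String> result = new ArrayList<>();
        for (Column column : columns) {
            if (column.enabled) {
                result.add(getterFor(column));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Column> columns = Arrays.asList(
            new Column("film_id", "id", true),   // aliased to "id"
            new Column("rating", null, false));  // disabled
        System.out.println(getters(columns));    // prints [getId]
    }
}
```

Without the alias, the same helper would produce getFilmId for the film_id column.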

Automating Build Process

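The next step is to invoke mvn speedment:generate automatically after the speedment-avro-maven-plugin has run. Both executions are bound to the generate-sources phase, and Maven runs executions bound to the same phase in the order the plugins are declared in the pom.xml, so declare this plugin after the Avro plugin: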

<plugin>
    <groupId>com.speedment.enterprise</groupId>
    <artifactId>speedment-enterprise-maven-plugin</artifactId>
    <version>${speedment.enterprise.version}</version>

    <configuration>
        <components>
            <component>com.speedment.enterprise.datastore.tool.DataStoreToolBundle</component>
            <component>com.speedment.enterprise.plugins.avro.generator.AvroGeneratorBundle</component>
        </components>
    </configuration>

    <executions>
        <execution>
            <id>Generate Speedment Sources</id>
            <phase>generate-sources</phase>
            <goals>
                <goal>generate</goal>
            </goals>
        </execution>
    </executions>
</plugin>

Finally, make sure that the Avro dependencies are on the classpath by adding the avro-runtime dependency to the pom.xml file.

<dependency>
    <groupId>com.speedment.enterprise.plugins</groupId>
    <artifactId>avro-runtime</artifactId>
</dependency>

Using Avro Files at Runtime

To load data from the .avro-files instead of a database, you need to add the AvroRuntimeBundle to the Speedment application builder and append withSkipCheckDatabaseConnectivity().

new SakilaApplicationBuilder()
    .withSkipCheckDatabaseConnectivity()
    .withBundle(AvroRuntimeBundle.class)
    .withBundle(DataStoreBundle.class)
    .build();

Note that the order is important! The AvroRuntimeBundle needs to come before the DataStoreBundle since both specify a custom StreamSupplierComponent.

Customizing Generated Code

The generated code will look a bit different than if a regular SQL database was used. Instead of SqlAdapter.java files, you get a number of AvroAdapter.java files. These handle the deserialization from Avro to Speedment.

To change the location of the data file (the one that ends with .avro) at runtime, you can override the dataFile method in XXXAvroAdapter.java. For example, in the Sakila demo, the method could be overridden like this:

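import java.nio.file.Path;
import java.nio.file.Paths;
// Imports for the Speedment types (Config, ProjectComponent, GeneratedFilmAvroAdapter) are omitted here.

public class FilmAvroAdapter extends GeneratedFilmAvroAdapter {

    // Configurable location of the data file; "Film.avro" is used if the key is not set
    @Config(name = "film.data.file", value = "Film.avro")
    private String filmDataFile;

    // Return the configured path as the location of the data file
    @Override
    protected Path dataFile(ProjectComponent projects) {
        return Paths.get(filmDataFile);
    }

}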

You can now change the location of the data file without recompiling by creating a settings.properties file in the root of the project with the line:

film.data.file = another_dir/another_film.avro

The file at that path will then be used instead of the default Film.avro.
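
The default-versus-override behaviour can be illustrated with plain java.util.Properties. This is only an analogy under stated assumptions: it does not use Speedment's configuration mechanism, and the class and method names are hypothetical. It shows how a key with the default value Film.avro resolves with and without the line from settings.properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical sketch; not part of Speedment.
final class DataFileSettingSketch {

    static final String KEY = "film.data.file";
    static final String DEFAULT_FILE = "Film.avro";

    /** Resolves the data file from properties-formatted text, falling back to the default. */
    static Path resolve(String settingsContent) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(settingsContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Paths.get(props.getProperty(KEY, DEFAULT_FILE));
    }

    public static void main(String[] args) {
        // No settings: the default file name is used
        System.out.println(resolve(""));
        // With the override line: the configured path is used
        System.out.println(resolve("film.data.file = another_dir/another_film.avro"));
    }
}
```

Keeping the default in code and the override in a properties file means the same build can point at different data files in different environments.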

{% include prev_next.html %}

Questions and Discussion

If you have any questions, don't hesitate to reach out to the Speedment developers on Gitter.