
Storage Plugin Model

Paul Rogers edited this page Jan 4, 2017 · 20 revisions

Overview

Storage Plugin API

A storage plugin provides access to a set of tables. In Hadoop-based systems, tables are implemented as files in the HDFS file system, so Drill's primary storage plugin is the FileSystemPlugin. Since a file system stores many kinds of files, the file system plugin is associated with a collection of format plugins (see below). Drill supports any arbitrary storage plugin, including for systems other than file systems.

The storage plugin is a plan-time concept and is not directly available at run time (while fragments execute). The storage plugin itself is meant to be a lightweight description of the external system and, as such, can be created and discarded frequently, even within the scope of a single query planning session.

A storage plugin itself is a "type": it provides system-specific behavior, and it typically allows multiple storage system instances to be defined for a given storage plugin type. It is these instances that we see in the storage plugin section of the Drill web UI. We often call these storage plugins, but they are really storage plugin configurations: the information needed to define a specific instance.

Every storage plugin must implement the StoragePlugin interface, often by subclassing the AbstractStoragePlugin class. Each storage plugin also defines a Jackson-serialized configuration object which extends StoragePluginConfig: it is the JSON-serialized version of this object which we see in Drill's storage plugin web UI.

Drill provides a top-level namespace of storage plugins. The names here are not the storage plugin names (types) themselves, but rather the names associated with storage plugin configurations. (The schema name space also holds the names of schemas, which will be discussed later.) Then each storage plugin defines a name space for tables:

SELECT * FROM `myPlugin`.`myTable`

Here, myPlugin is the name of a storage plugin configuration (that maps to a storage plugin via the type field in the configuration), and myTable is a table defined by that plugin instance.
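For example, a configuration for the file-system plugin has a shape like the following (the exact fields vary by plugin type, and the values here are illustrative):

```json
{
  "type": "file",
  "enabled": true,
  "connection": "file:///"
}
```

The `type` field is what links this named configuration to a storage plugin class; the rest of the fields are defined by that plugin's configuration class.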

Storage Plugin Lifecycle

The best way to understand a storage plugin is to sketch out the storage plugin lifecycle as seen from a single query.

The following setup steps make the plugin available to use:

  • The developer defines a storage plugin class that implements the StoragePlugin interface.
  • The developer also provides a drill-module.conf file that adds the developer's package to the Drill scan path.
  • The developer may also provide a bootstrap-storage-plugins.json file that is used to create the initial set of plugin configurations when Drill first starts.
  • At startup, Drill scans the class path looking for drill-module.conf files, and loads each, which adds the custom plugin package to the scan path.
  • Drill then scans the scan path looking for classes that extend StoragePlugin. Each is registered as a storage plugin (type), keyed by the type of the first argument to its (required) three-argument constructor. The first argument is the plugin configuration, so this provides a link from the plugin configuration class to the plugin itself.
  • Drill checks if the storage plugin configuration registry has any entries. If so, Drill loads them. If not, Drill loads bootstrap-storage-plugins.json instead.
  • The user defines a storage plugin configuration that creates an entry in Drill's global schema name space.
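The constructor-keyed registration described above can be sketched with plain reflection. The classes below are stand-ins invented for illustration, not Drill's actual API:

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

// Toy sketch of how the startup scan keys the plugin registry: the type of
// the first argument of the required three-argument constructor is the
// plugin's configuration class.
class PluginRegistrySketch {
  interface StoragePluginConfig {}          // stand-in for Drill's interface
  static class DrillbitContext {}           // stand-in

  static class MockStorageEngineConfig implements StoragePluginConfig {}

  static class MockStorageEngine {
    MockStorageEngine(MockStorageEngineConfig config, DrillbitContext ctx, String name) {}
  }

  // Build a (config class -> plugin class) map by inspecting constructors,
  // mimicking what Drill's classpath scan does at startup.
  static Map<Class<?>, Class<?>> buildRegistry(Class<?>... pluginClasses) {
    Map<Class<?>, Class<?>> registry = new HashMap<>();
    for (Class<?> plugin : pluginClasses) {
      for (Constructor<?> c : plugin.getDeclaredConstructors()) {
        Class<?>[] params = c.getParameterTypes();
        if (params.length == 3
            && StoragePluginConfig.class.isAssignableFrom(params[0])
            && params[1] == DrillbitContext.class
            && params[2] == String.class) {
          registry.put(params[0], plugin);   // config class is the key
        }
      }
    }
    return registry;
  }
}
```

Given this table, deserializing a stored configuration yields a config object whose class leads directly back to the plugin class to instantiate.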

Now the plugin is available to users. Next the planner must resolve the table name:

  • Drill looks for the schema name space portion of the table name: the myPlugin in the earlier example.
  • Drill uses that name to look up the storage plugin configuration in the schema name space.
  • Drill looks at the type field to identify the type of the schema. The type for storage plugins is file (sic).
  • Drill uses the class of the storage plugin configuration as a key to locate the constructor for the storage plugin itself.
  • Drill creates an instance of the storage plugin to use for the query, passing it an instance of the plugin configuration. (In fact, Drill will create multiple instances as planning proceeds.)
  • Drill calls the registerSchemas method of the plugin. This method creates an instance of a schema which implements the Calcite Schema interface and adds it to the schemas registered for this plugin. (The schemas are registered in Calcite, not in the plugin itself.)
  • Calcite calls the getTable() method on the schema object to resolve the table name from the query into a Calcite Table object, typically using the DynamicDrillTable class which extends Table.
  • DynamicDrillTable holds a list of Jackson-serializable objects which the plugin can retrieve later by deserializing the serialized form of the table data.
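The name-resolution chain above can be sketched as follows. The classes here are stand-ins for the Calcite Schema/Table machinery, invented for illustration: registerSchemas() publishes a schema under the plugin configuration's name, and the planner then resolves `myPlugin`.`myTable` by asking that schema for the table by name.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of schema registration and table lookup; not Calcite's API.
class SchemaSketch {
  interface Table {}                                   // stand-in for Calcite's Table
  static class DynamicTable implements Table {
    final Object selection;                            // Jackson-serializable table hints
    DynamicTable(Object selection) { this.selection = selection; }
  }

  // Stand-in for the registry Calcite passes to registerSchemas():
  // plugin configuration name -> (table name -> table).
  private final Map<String, Map<String, Table>> schemas = new HashMap<>();

  // What a plugin's registerSchemas() boils down to: publish tables
  // under the plugin configuration's name.
  void registerTable(String pluginName, String tableName, Table table) {
    schemas.computeIfAbsent(pluginName, n -> new HashMap<>()).put(tableName, table);
  }

  // What resolving `myPlugin`.`myTable` boils down to.
  Table getTable(String pluginName, String tableName) {
    Map<String, Table> tables = schemas.get(pluginName);
    return tables == null ? null : tables.get(tableName);
  }
}
```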

The planner now must move from a table definition to an operator (definition) for a scan of that table. Tables in Drill usually resolve to a directory of files, or a large file with distributed blocks. In either case, the query can launch multiple parallel scans on the table. The next step is for the planner to resolve the table into a physical scan plan (definition).

  • The planner calls the getPhysicalScan method in the plugin class, providing it with the user name (for security checks), a place to obtain the deserialized table hints created above, and a list of columns that the user selected from the table. (Though, strangely, for a SQL query, the list of columns at this point is always just *, even if the SELECT statement names specific columns.)
  • The actual set of columns is provided later in a call to the clone() method which must make a copy of the group scan using the now-non-empty list of columns.
  • The getPhysicalScan() method returns a "group scan": a definition of the scan of the table as a whole. (Presumably the "group" refers to the fact that this object represents a group of scans.) The returned group scan object implements GroupScan, typically by extending AbstractGroupScan. This class is also Jackson-serializable and is serialized and cloned during the planning process. (Thus, the object cannot hold pointers to live objects, only simple string and primitive values, and/or objects composed of such values.)
  • The group scan provides much metadata information to the planner about how to create the required scan plan.
  • The planner then calls applyAssignments() to inform the group scan of the Drillbits on which scans will run. (It is not clear how the planner or group scan decides how many scans to run on each Drillbit.)
  • The planner calls getSpecificScan() on the group scan, providing a minor fragment id, to get the SubScan physical operator (definition) for one specific scan. The storage plugin provides a subclass that is, again, Jackson serializable and is sent to the fragment executor as the physical operator ("POp") for this operation.
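The assignment and per-fragment scan steps can be sketched as below. This is a toy stand-in, not Drill's GroupScan API; the round-robin split is one plausible policy (as the text notes, how Drill actually decides scans per Drillbit is unclear):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy sketch: the planner tells the group scan which Drillbits will run
// scans, and the group scan splits its work units (e.g. files or blocks)
// across them. getSpecificScan(minorFragmentId) returns one fragment's work.
class GroupScanSketch {
  private final String[] workUnits;                    // e.g. file or block ids
  private final Map<Integer, List<String>> assignments = new HashMap<>();

  GroupScanSketch(String... workUnits) { this.workUnits = workUnits; }

  // Mirrors the role of applyAssignments(); here, one minor fragment per
  // endpoint, work units split round-robin (an assumed policy).
  void applyAssignments(int endpointCount) {
    for (int i = 0; i < workUnits.length; i++) {
      int fragment = i % endpointCount;                // round-robin split
      assignments.computeIfAbsent(fragment, f -> new ArrayList<>()).add(workUnits[i]);
    }
  }

  // Mirrors the role of getSpecificScan(): the "sub-scan" work for one
  // minor fragment, which would be serialized to the fragment executor.
  List<String> getSpecificScan(int minorFragmentId) {
    return assignments.getOrDefault(minorFragmentId, List.of());
  }
}
```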

Storage Plugin Configuration

This section provides more detail about plugin configuration. Several forms of configuration are required.

Basics

The plugin class itself provides two forms of configuration information:

  • If the storage plugin is not within one of Drill's own packages, then the jar file containing the plugin must contain a drill-module.conf file which adds that package to Drill's class path scan:
drill: {
  classpath.scanning: {
    packages += "org.apache.drill.exec.store.kudu"
  }
}
  • The storage plugin class must implement StoragePlugin (often via the AbstractStoragePlugin class). It is the implementation of this interface which marks the class as a storage plugin: any class that implements StoragePlugin is presumed to be one. To disable a class which is not really a plugin, create a configuration for it but mark that configuration as disabled.
  • The class must implement a three-argument constructor:
  public MockStorageEngine(MockStorageEngineConfig configuration,
                           DrillbitContext context, String name) {
    // plugin-specific initialization
  }
  • The type of the first argument identifies the class of the configuration for this plugin.
  • The plugin configuration class must be 1) Jackson serializable, and 2) implement the StoragePluginConfig interface (often via the StoragePluginConfigBase class.)
  • Somewhere on the class path a file called bootstrap-storage-plugins.json must exist which contains at least one serialized form of the storage plugin configuration. Without such a file, the plugin is invisible to Drill. (That is, the plugin exists only via a configuration.)

Given these six elements, Drill will find, load and configure the storage plugin, and you will see the default plugin configuration in Drill's web UI when you first start Drill. (Perhaps the configuration must be created manually if the plugin registration already exists?...)
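For reference, a bootstrap-storage-plugins.json file has this general shape (the plugin name and fields here are illustrative, not a real plugin's configuration):

```json
{
  "storage": {
    "myplugin": {
      "type": "myplugin",
      "enabled": true
    }
  }
}
```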

Each operator seems to also need a unique enum value in UserProtos.CoreOperatorType, but a quick survey of the code finds very limited use of this enum.

Storage Plugin Configuration Class

The storage plugin configuration class must:

  • Extend StoragePluginConfigBase
  • Implement the equals() and hashCode() methods. Plugin configurations are compared based on their content, and Drill must sometimes determine if two plugin configurations are identical.
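A hypothetical configuration class (the connection/enabled fields are invented for illustration) showing the value-equality contract described above might look like:

```java
import java.util.Objects;

// Illustrative configuration class; a real one would extend
// StoragePluginConfigBase and be Jackson-annotated for serialization.
// Configurations are compared by value, not by object identity.
class ExampleStorageConfig {
  private final String connection;   // hypothetical field
  private final boolean enabled;     // hypothetical field

  ExampleStorageConfig(String connection, boolean enabled) {
    this.connection = connection;
    this.enabled = enabled;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof ExampleStorageConfig)) return false;
    ExampleStorageConfig that = (ExampleStorageConfig) o;
    return enabled == that.enabled && Objects.equals(connection, that.connection);
  }

  @Override
  public int hashCode() {
    return Objects.hash(connection, enabled);
  }
}
```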

Plan-time Operations

Format Plugin API

Format Plugin Lifecycle

Scan Operator

Scans are created at runtime as follows:

  • The physical plan contains the "sub-scan" physical operator (definition) created above.
  • The ImplCreator class uses the operator name in the sub-scan to look up the corresponding operator implementation (record batch) using the OperatorCreatorRegistry associated with the Drillbit.
  • A batch creator class is associated with each operator (implementation) and is responsible for converting the physical operator (definition) into a record batch (operator implementation).
  • Scans appear to create a RecordReader subclass to handle actual reading, and pair this with a ScanBatch operator implementation to handle interfacing into the Drill operator hierarchy.
  • ScanBatch calls setup on the RecordReader to do initial setup. Operators that know their schema can set up the schema here. Otherwise, schema setup can wait until later.

The fragment now runs. The tree calls next() on the ScanBatch to return a batch of records. (By convention, the first batch should return just a schema.)

  • ScanBatch calls allocate() on the RecordReader to set up the vectors for the record batch.
  • Repeatedly calls the next() method on the RecordReader to read a batch of rows into the vectors allocated above.
  • Sets the row count on each of the value vectors.
  • Determines if the batch includes a schema different than the previous batch.
  • Passes along the proper status code to the caller.
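The steps above can be sketched as a driver loop. The RecordReader interface and status enum below are stand-ins for illustration, not Drill's actual API:

```java
// Toy sketch of the ScanBatch.next() control flow listed above.
class ScanLoopSketch {
  interface RecordReader {
    void allocate();          // set up the value vectors for the next batch
    int next();               // fill them, returning the number of rows read
  }

  enum Outcome { OK, NONE }   // stand-in for Drill's IterOutcome status codes

  private final RecordReader reader;
  int lastRowCount;           // in Drill, the count is set on each value vector

  ScanLoopSketch(RecordReader reader) { this.reader = reader; }

  Outcome next() {
    reader.allocate();                  // 1. allocate vectors
    int rows = reader.next();           // 2. read a batch of rows into them
    lastRowCount = rows;                // 3. record the row count
    return rows == 0 ? Outcome.NONE : Outcome.OK;  // 4. status to the caller
  }
}
```

(The schema-change check is omitted here; in Drill, a batch whose schema differs from the previous one returns a distinct status code.)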

Of course, some of these steps involve more than the summary suggests.

  • ScanBatch defines a Mutator class which must be used to build the set of value vectors. The Mutator takes a field schema in the form of a MaterializedField.
  • Uses the TypeHelper to convert from the field schema to a value vector instance.
  • Registers the value vector with the vector container associated with the ScanBatch.
  • Adds the value vector to the field vector map, indexed by field name (actually, a field path for nested structures.)
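The Mutator flow above can be sketched as follows, with stand-ins (invented for illustration) for Drill's MaterializedField and value vector classes:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy sketch of the Mutator: a field schema comes in, a vector is created
// for it (TypeHelper's job in Drill), and the vector is indexed by field path.
class MutatorSketch {
  static class Field {                                // stand-in for MaterializedField
    final String path;
    Field(String path) { this.path = path; }
  }
  static class Vector {                               // stand-in for a value vector
    final Field field;
    Vector(Field field) { this.field = field; }
  }

  private final Map<String, Vector> fieldVectorMap = new LinkedHashMap<>();

  // Mirrors the role of Mutator.addField: build the vector for the schema
  // and register it by field path (nested fields use a dotted path, "a.b").
  Vector addField(Field field) {
    Vector v = new Vector(field);
    fieldVectorMap.put(field.path, v);
    return v;
  }

  Vector vectorFor(String path) { return fieldVectorMap.get(path); }
}
```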

Comments on Current Design

A number of improvements are possible to the current design.

  • Each plugin should have a registration file that identifies the class name of the plugin. It is far cheaper to search for such files than to scan every class looking for those that extend some particular interface.
  • Each plugin should use annotations to provide static properties such as the plugin tag name (file, mock or whatever) instead of looking for the type of the first constructor argument as is done now.
  • Rather than searching for a known constructor, require that plugins have no constructor or a zero-argument constructor. In the plugin interface define a register method that does what the three-argument constructor currently does. This moves registration into the API rather than as a special non-obvious form of the constructor.
  • Each plugin configuration should name its storage plugin using the tag from the plugin definition.

The above greatly simplifies the storage plugin system:

  • Definition files name the storage plugin class.
  • The class (or definition entry) gives a tag to identify the plugin.
  • Given a definition file, Drill builds a (tag --> plugin) table.
  • Given a storage plugin definition, the plugin tag in the definition maps, via the above table, to the plugin itself.

Other changes:

  • Storage plugins should have extended lives. As it is, they are created multiple times for each query, and thus many times across queries. This makes it hard for the plugin to hold onto state (such as a connection to an external system, cached data, etc.)
  • Clearly separate the planner-time (transient) objects and the persistent plan objects passed across the network. This will allow the planner-only objects to contain handles to internal state (something that now is impossible given the internal Jackson serializations.)
  • Avoid unnecessary copies of the planning-time objects by tightening up the API. (Provide columns once rather than twice, etc.)