Storage Plugin Model
A storage plugin provides access to a set of tables. In Hadoop-based systems, tables are implemented as files in the HDFS file system, so Drill's primary storage plugin is the `FileSystemPlugin`. Since a file system stores many kinds of files, the file system plugin is associated with a collection of format plugins (see below). Drill supports arbitrary storage plugins, including those for systems other than file systems.
The storage plugin is a plan-time concept and is not directly available at run time (while fragments execute). The storage plugin itself is meant to be a light-weight description of the external system and, as such, can be created and discarded frequently, even within the scope of a single query planning session.
A storage plugin itself is a "type": it provides system-specific behavior, and a given storage plugin type can typically define multiple storage system instances. It is these instances that appear in the storage plugin section of the Drill web UI: we often call these storage plugins, but they are really storage plugin configurations: the information needed to define a specific instance.
Every storage plugin must implement the `StoragePlugin` interface, often by subclassing the `AbstractStoragePlugin` class. Each storage plugin also defines a Jackson-serialized configuration object which extends `StoragePluginConfig`: it is the JSON-serialized version of this object which we see in Drill's storage plugin web UI.
Drill provides a top-level namespace of storage plugins. The names here are not the storage plugin names (types) themselves, but rather the names associated with storage plugin configurations. (The schema name space also holds the names of schemas, which will be discussed later.) Then each storage plugin defines a name space for tables:
SELECT * FROM `myPlugin`.`myTable`
Here, `myPlugin` is the name of a storage plugin configuration (which maps to a storage plugin via the `type` field in the configuration), and `myTable` is a table defined by that plugin instance.
The best way to understand a storage plugin is to sketch out the storage plugin lifecycle as seen from a single query.
The following setup steps make the plugin available to use:
- The developer defines a storage plugin class that implements the `StoragePlugin` interface.
- The developer also provides a `drill-module.conf` file that adds the developer's package to the Drill scan path.
- The developer may also provide a `bootstrap-storage-plugins.json` file that is used to create the initial set of plugin configurations when Drill first starts.
- At startup, Drill scans the class path looking for `drill-module.conf` files and loads each one, which adds the custom plugin package to the scan path.
- Drill then scans the scan path looking for classes that extend `StoragePlugin`. Each is registered as a storage plugin (type), keyed by the type of the first argument to its (required) three-argument constructor. The first argument is the plugin configuration, so this provides a link from the plugin configuration class to the plugin itself.
- Drill checks if the storage plugin configuration registry has any entries. If so, Drill loads them. If not, Drill loads `bootstrap-storage-plugins.json` instead.
- The user defines a storage plugin configuration that creates an entry in Drill's global schema name space.
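As a concrete illustration, a minimal `bootstrap-storage-plugins.json` might look like the following. The `myPlugin` name and `mock` type are placeholders; the `type` value must match the JSON type name declared by the plugin's configuration class:

```json
{
  "storage": {
    "myPlugin": {
      "type": "mock",
      "enabled": true
    }
  }
}
```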
Now the plugin is available to users. Next the planner must resolve the table name:
- Drill looks for the schema name space portion of the table name: the `myPlugin` in the earlier example.
- Drill uses that name to look up the storage plugin configuration in the schema name space.
- Drill looks at the `type` field to identify the type of the schema. For example, the type for the file system storage plugin is `file` (sic).
- Drill uses the class of the storage plugin configuration as a key to locate the constructor for the storage plugin itself.
- Drill creates an instance of the storage plugin to use for the query, passing the storage plugin an instance of the plugin configuration. In fact, Drill creates multiple instances as planning proceeds.
- Drill calls the `registerSchemas` method of the plugin. This method creates an instance of a schema which implements the Calcite `Schema` interface and adds it to the schemas registered for this plugin. (The schemas are registered in Calcite, not in the plugin itself.)
- Calcite calls the `getTable()` method on the schema object to resolve the table name from the query into a Calcite `Table` object, typically using the `DynamicDrillTable` class which extends `Table`.
- `DynamicDrillTable` holds a list of Jackson-serializable objects which the plugin can retrieve later by deserializing the serialized form of the table data.
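The constructor-keyed lookup described above can be sketched in plain Java. This is a self-contained toy (all class names here are invented, not Drill's real registry code): the registry keys each plugin class by the type of the first argument of its three-argument constructor, which is how the configuration class links back to the plugin.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

// Toy model of keying a storage plugin by its configuration class.
public class PluginRegistryDemo {

  // Stand-ins for StoragePluginConfig and StoragePlugin implementations.
  public static class MyPluginConfig { }

  public static class MyPlugin {
    public final MyPluginConfig config;
    public final String name;

    // Drill's required constructor shape: (config, context, name).
    // The DrillbitContext is represented by a plain Object for brevity.
    public MyPlugin(MyPluginConfig config, Object context, String name) {
      this.config = config;
      this.name = name;
    }
  }

  // Map from configuration class to plugin constructor.
  private static final Map<Class<?>, Constructor<?>> REGISTRY = new HashMap<>();

  public static void register(Class<?> pluginClass) {
    for (Constructor<?> ctor : pluginClass.getConstructors()) {
      if (ctor.getParameterCount() == 3) {
        // Key the plugin by the first parameter type: the configuration class.
        REGISTRY.put(ctor.getParameterTypes()[0], ctor);
        return;
      }
    }
    throw new IllegalArgumentException("No three-argument constructor");
  }

  public static Object create(Object config, String name) {
    try {
      return REGISTRY.get(config.getClass()).newInstance(config, null, name);
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    register(MyPlugin.class);
    MyPlugin plugin = (MyPlugin) create(new MyPluginConfig(), "myPlugin");
    System.out.println(plugin.name); // prints "myPlugin"
  }
}
```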
The planner now must move from a table definition to an operator (definition) for a scan of that table. Tables in Drill usually resolve to a directory of files, or a large file with distributed blocks. In either case, the query can launch multiple parallel scans on the table. The next step is for the planner to resolve the table into a physical scan plan (definition).
- The planner calls the `getPhysicalScan` method in the plugin class, providing it with the user name (for security checks), a place to obtain the deserialized table hints created above, and a list of columns that the user selected from the table. (Though, strangely, for a SQL query, the list of columns is always just `*`, even if the SELECT statement names specific columns.)
- The actual set of columns is provided later in a call to the `clone()` method, which must make a copy of the group scan using the now-non-empty list of columns.
- The `getPhysicalScan()` method returns a "group scan": a definition of the scan of the table as a whole. (Presumably the "group" refers to the fact that this object represents a group of scans.) The returned group scan object implements `GroupScan`, typically by extending `AbstractGroupScan`. This class is also Jackson-serializable and is serialized and cloned during the planning process. (Thus, the object cannot hold pointers to objects, only simple string and primitive values, and/or objects composed of such values.)
- The group scan provides extensive metadata to the planner about how to create the required scan plan.
- The planner then calls `applyAssignments()` to inform the group scan of the Drillbits on which scans will run. (It is not clear how the planner or group scan decides how many scans to run on each Drillbit.)
- The planner calls `getSpecificScan()` on the group scan, providing a minor fragment id, to get the `SubScan` physical operator (definition) for one specific scan. The storage plugin provides a subclass that is, again, Jackson-serializable and is sent to the fragment executor as the physical operator ("POp") for this operation.
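The group scan / sub-scan split can be sketched as a toy model. These are invented, simplified interfaces, not Drill's actual `GroupScan` API: the group scan describes the whole table, `applyAssignments()` records how many fragments will run, and `getSpecificScan(minorFragmentId)` yields the slice of work for one fragment.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of splitting a table scan across minor fragments.
public class GroupScanDemo {

  public static class SubScan {
    public final List<String> files;
    public SubScan(List<String> files) { this.files = files; }
  }

  public static class GroupScan {
    private final List<String> allFiles;
    private int fragmentCount; // set by applyAssignments()

    public GroupScan(List<String> allFiles) { this.allFiles = allFiles; }

    // The planner tells the group scan how many scan fragments will run.
    public void applyAssignments(int fragmentCount) {
      this.fragmentCount = fragmentCount;
    }

    // Round-robin the table's files across the minor fragments.
    public SubScan getSpecificScan(int minorFragmentId) {
      List<String> mine = new ArrayList<>();
      for (int i = 0; i < allFiles.size(); i++) {
        if (i % fragmentCount == minorFragmentId) {
          mine.add(allFiles.get(i));
        }
      }
      return new SubScan(mine);
    }
  }

  public static void main(String[] args) {
    GroupScan scan = new GroupScan(List.of("a.csv", "b.csv", "c.csv"));
    scan.applyAssignments(2);
    System.out.println(scan.getSpecificScan(0).files); // prints [a.csv, c.csv]
  }
}
```

In real Drill, the group scan also reports costs and preferred endpoints so the planner can co-locate fragments with data; this sketch shows only the assignment mechanics.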
This section provides more detail about plugin configuration. Several forms are required.
The plugin class itself provides two forms of configuration information:
- If the storage plugin is not within one of Drill's own packages, then the jar file containing the plugin must contain a `drill-module.conf` file which adds that package to Drill's class path scan:
```
drill: {
  classpath.scanning: {
    packages += "org.apache.drill.exec.store.kudu"
  }
}
```
- The storage plugin class must implement `StoragePlugin` (often via the `AbstractStoragePlugin` class). It is the implementation of this interface which marks the class as a storage plugin: any class that implements `StoragePlugin` is presumed to be one. To disable a class which is not really a plugin, create a configuration but mark the plugin as disabled in the configuration.
- The class must implement a three-argument constructor:

```
public MockStorageEngine(MockStorageEngineConfig configuration,
                         DrillbitContext context, String name) {
```
- The type of the first argument identifies the class of the configuration for this plugin.
- The plugin configuration class must be 1) Jackson-serializable, and 2) implement the `StoragePluginConfig` interface (often via the `StoragePluginConfigBase` class).
- Somewhere on the class path a file must exist called `bootstrap-storage-plugins.json` which contains at least one serialized form of the storage plugin configuration. Without such a file, the plugin is invisible to Drill. (That is, the plugin exists only via a configuration.)
Given these six elements, Drill will find, load and configure the storage plugin, and you will see the default plugin configuration in Drill's web UI when you first start Drill. (Perhaps the configuration must be created manually if the plugin registration already exists?...)
Each operator also seems to need a unique enum value in `UserProtos.CoreOperatorType`, but a quick survey of the code finds very limited use of this enum.
The storage plugin configuration class must:
- Extend `StoragePluginConfigBase`.
- Implement the `equals()` and `hashCode()` methods. Evidently, plugin configurations are compared based on their content, and Drill must sometimes determine if two plugin configurations are identical.
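A minimal sketch of such a content-based configuration class follows. The class and field names here are invented for illustration (a real configuration would also extend `StoragePluginConfigBase` and carry Jackson annotations):

```java
import java.util.Objects;

// Toy plugin configuration compared by content, not identity.
public class PluginConfigDemo {

  public static class MyPluginConfig {
    public final String connection;
    public final boolean writable;

    public MyPluginConfig(String connection, boolean writable) {
      this.connection = connection;
      this.writable = writable;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof MyPluginConfig)) return false;
      MyPluginConfig that = (MyPluginConfig) o;
      return writable == that.writable
          && Objects.equals(connection, that.connection);
    }

    @Override
    public int hashCode() {
      return Objects.hash(connection, writable);
    }
  }

  public static void main(String[] args) {
    MyPluginConfig a = new MyPluginConfig("mock://local", true);
    MyPluginConfig b = new MyPluginConfig("mock://local", true);
    // Two configurations with the same content are the same configuration.
    System.out.println(a.equals(b)); // prints "true"
  }
}
```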
Scans are created at runtime as follows:
- The physical plan contains the "sub-scan" physical operator (definition) created above.
- The `ImplCreator` class uses the operator name in the sub-scan to look up the corresponding operator implementation (record batch) using the `OperatorCreatorRegistry` associated with the Drillbit.
- The batch creator class is associated with each operator (implementation) and is responsible for converting the physical operator (definition) into a record batch (operator implementation).
- Scans appear to create a `RecordReader` subclass to handle actual reading, and pair this with a `ScanBatch` operator implementation to handle interfacing into the Drill operator hierarchy.
- `ScanBatch` calls `setup()` on the `RecordReader` to do initial setup. Operators that know their schema can set up the schema here. Otherwise, schema setup can wait until later.
The fragment now runs. The operator tree calls `next()` on the `ScanBatch` to return a batch of records. (By convention, the first batch should return just a schema.)
- `ScanBatch` calls `allocate()` on the `RecordReader` to set up the vectors for the record batch.
- Repeatedly calls the `next()` method on the `RecordReader` to read a batch of rows into the vectors allocated above.
- Sets the row count on each of the value vectors.
- Determines if the batch includes a schema different than the previous batch.
- Passes along the proper status code to the caller.
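The allocate/read/report loop above can be sketched with invented, simplified interfaces (these are not Drill's actual `ScanBatch` or `RecordReader` classes; value vectors are elided entirely):

```java
// Toy model of the ScanBatch driving a RecordReader until data is exhausted.
public class ScanLoopDemo {

  public interface RecordReader {
    void allocate(); // set up value vectors for the next batch
    int next();      // fill the vectors; return row count, 0 = end of data
  }

  // A reader that serves a fixed number of rows in fixed-size batches.
  public static class FixedReader implements RecordReader {
    private int remaining;
    private final int batchSize;

    public FixedReader(int totalRows, int batchSize) {
      this.remaining = totalRows;
      this.batchSize = batchSize;
    }

    public void allocate() { /* vectors would be allocated here */ }

    public int next() {
      int n = Math.min(remaining, batchSize);
      remaining -= n;
      return n;
    }
  }

  // Drive the reader the way ScanBatch does, summing rows until the end.
  public static int drain(RecordReader reader) {
    int total = 0;
    while (true) {
      reader.allocate();
      int rows = reader.next();
      if (rows == 0) break; // equivalent to returning a NONE status upstream
      total += rows;        // in Drill, this count is set on each value vector
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(drain(new FixedReader(250, 100))); // prints 250
  }
}
```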
Of course, some of these steps involve more than the summary suggests.
- `ScanBatch` defines a `Mutator` class which must be used to build the set of value vectors. The `Mutator` takes a field schema in the form of a `MaterializedField`.
- Uses the `TypeHelper` to convert from the field schema to a value vector instance.
- Registers the value vector with the vector container associated with the `ScanBatch`.
- Adds the value vector to the field vector map, indexed by field name (actually, a field path for nested structures).
A number of improvements are possible to the current design.
- Each plugin should have a registration file that identifies the class name of the plugin. It is far cheaper to search for such files than to scan every class looking for those that extend some particular interface.
- Each plugin should use annotations to provide static properties such as the plugin tag name (`file`, `mock` or whatever) instead of looking for the type of the first constructor argument as is done now.
- Rather than searching for a known constructor, require that plugins have no constructor or a zero-argument constructor. In the plugin interface, define a `register` method that does what the three-argument constructor currently does. This moves registration into the API rather than leaving it as a special, non-obvious form of the constructor.
- Each plugin configuration should name its storage plugin using the tag from the plugin definition.
The above greatly simplifies the storage plugin system:
- Definition files name the storage plugin class.
- The class (or definition entry) gives a tag to identify the plugin.
- Given a definition file, Drill builds a (tag --> plugin) table.
- Given a storage plugin definition, the plugin tag in the definition maps, via the above table, to the plugin itself.
Other changes:
- Storage plugins should have extended lives. As it is, they are created multiple times for each query, and thus many times across queries. This makes it hard for the plugin to hold onto state (such as a connection to an external system, cached data, etc.)
- Clearly separate the planner-time (transient) objects and the persistent plan objects passed across the network. This will allow the planner-only objects to contain handles to internal state (something that now is impossible given the internal Jackson serializations.)
- Avoid unnecessary copies of the planning-time objects by tightening up the API. (Provide columns once rather than twice, etc.)