
UDFs Background Information

Paul Rogers edited this page May 3, 2018 · 31 revisions

Introduction

Drill supports User Defined Functions (UDFs), which are SQL functions added to an existing Drill installation. UDFs use the same mechanism as Drill's built-in SQL functions. To develop UDFs effectively, you need a strong knowledge of Drill internals, gained mostly from experience, because the only documentation of those internals is the code itself.

Drill provides documentation about how to create a UDF. The information is procedural and walks you through the steps, assuming that you already know enough about Drill internals to fill in the gaps. The purpose of this page is to explain a bit of that background information for UDF authors who are not yet Drill internals experts.

If there is only one message you take away from this page, let it be:


Drill UDFs are a Strict Subset of Java


UDFs are not ordinary Java. Instead, they use a Drill-specific domain-specific language (DSL) that happens to be expressed in a subset of Java. Use only those Java constructs that Drill specifically allows.

The material here describes the theory behind Drill's UDF support so you know what is going on behind the scenes. We then present a simple framework to make UDFs easier to develop, and suggest debugging strategies by walking through the process of developing and testing a simple (row-by-row) UDF. We then dive into the undocumented details of the mechanisms your code will use. Finally, we present a troubleshooting guide to the many things that can go wrong, what they mean, and how to correct them.

To avoid excessive duplication, this page assumes you are familiar with the existing documentation. We'll touch on some sections to offer simpler alternatives, but mostly rely on the Drill documentation for the basics, such as setting up a Maven project.

Topics
