LucidDbUdfJavaHowto

From Eigenpedia

Jump to: navigation, search

Contents

So, you want to write a UDF in Java?

As explained in LucidDbUdfHowto, simple user-defined functions can be written directly as SQL expressions, but in general you'll need the power of a procedural programming language and third-party libraries to express custom logic. LucidDB supports Java as its extension language for writing external routines; if you're clever, you can probably also figure out how to use other languages such as Python via extension projects such as Jython. We'll stick to Java here, and show how to use Java's regular expression library to enhance SQL's filtering capabilities.

Set up Java development environment

There's a good chance you'll want to use an IDE such as Eclipse to write your routine implementations; this will make the process much easier. But to keep things IDE-neutral, we'll explain things in terms of command line execution of utilities from Sun's JDK. You shouldn't need any other software, and you don't need any special libraries from LucidDB; this is a nice feature of the way SQL:2003 defines calls to Java methods. Any version of the JDK should work for creating a UDF, but LucidDB itself requires a 1.5 JVM to run.

Your LucidDB distribution should contain a directory named examples, with a subdirectory named udf.

Create a .java file defining your class

The examples/udf directory already has what you need; take a look at file examples/udf/com/yoyodyne/RegexMatchUdf.java in a text editor:

package com.yoyodyne;
public class RegexMatchUdf
{
    public static boolean execute(String input, String pattern)
    {
        return input.matches(pattern);
    }
}

NOTE: For a real UDF, you'll probably want to write JUnit testcases to help you develop and debug your UDF in a test-driven fashion, so that by the time you plug it into SQL, it will already be working smoothly.

Compile to a .class file

From the examples/udf directory, run the Java compiler:

cd examples/udf
javac com/yoyodyne/RegexMatchUdf.java

This will produce a file RegexMatchUdf.class in the com/yoyodyne directory.

Package the class into a .jar file

Before it can be loaded by LucidDB, the .class file needs to be packaged into a .jar file:

jar cf ../../plugin/regex-udf.jar com/yoyodyne/RegexMatchUdf.class

This creates the jar in the plugin directory under LucidDB's installation root.

Install the .jar in LucidDB

Now, execute the following SQL commands:

create schema regex;
set schema 'regex';
call sqlj.install_jar('file:${FARRAGO_HOME}/plugin/regex-udf.jar','udf_jar',0);
create function pattern_match(input varchar(1024), pattern varchar(1024))
returns boolean
language java
no sql
external name 'regex.udf_jar:com.yoyodyne.RegexMatchUdf.execute';

Before testing it out, there's a bit of explaining to do. In LucidDB, an installed jar is a first-class object just like a table or view, and it lives in a schema just like those objects. So the schema-qualified SQL name of the jar in this case is REGEX.UDF_JAR. The installed jar also has a URL property recording where the actual jar resource should be loaded from (in the file system, in this case, although LucidDB could just as easily load it from an http URL).

Other first-class objects such as as user-defined functions can reference installed jars containing their implementation. In this case, the external name property of the REGEX.PATTERN_MATCH UDF has three parts:

  1. the regex.udf_jar part before the colon tells LucidDB the SQL name of the jar (from which LucidDB can find the corresponding URL)
  2. the com.yoyodyne.RegexMatchUdf part tells LucidDB the package-qualified name of the Java class containing the method to invoke
  3. the execute part tells LucidDB the name of the static method in that class to invoke

Create test data

OK, we're ready to try out the UDF, but first we need some test data:

create table countries(name varchar(128));
insert into countries
values
    ('Uruguay'),
    ('Paraguay'),
    ('Chile'),
    ('Argentina'),
    ('Venezuela');

Execute UDF

Let's find countries with names starting with either 'A' or 'U':

select *
from countries
where regex.pattern_match(name, '(A|U).*');

The results should look like this:

0: jdbc:luciddb:> select *
. . . . . . . . > from countries
. . . . . . . . > where regex.pattern_match(name, '(A|U).*');
+------------+
|    NAME    |
+------------+
| Uruguay    |
| Argentina  |
+------------+

Deal with existing jars

What if you already have a jar file from someone else? If you're lucky enough to know of a public static method in that jar which already does what you want, then you can just load it directly.

In most cases, however, you'll need to wrap the jar's functionality with an adapter interface suitable for invocation from SQL. This is the current procedure:

  1. Compile your adapter class(es), importing classes from the third-party jar.
  2. Create a new jar which combines your classes with the classes from the third-party jar.
  3. Load the combined jar into LucidDB

SQL:2003 actually provides a facility for declaring jar dependencies so that you would instead be able to load the thirdparty jar as-is, load a small adapter jar you have written separately, declare the dependency, and then invoke routines in your adapter jar. LucidDB does not yet support this facility.

Move on to Table Functions

If you can't express a complex transformation you need as a UDF, you may be able to use a table function instead. LucidDbUdxJavaHowto explains.

Personal tools