EXPLORING LISP ON THE JVM

One of the most exciting things in the Java world right now is the work being done to get other programming languages to run on the virtual machine. There is a lot of buzz around JRuby, Groovy, Scala, and the JavaScript engine Rhino. But why stop there? If you really want to take a step outside the mainstream and dive into something completely different from Java, Lisp is a great option. And there are several open-source implementations of the Lisp programming language for the JVM out there, ready to be explored.

So what makes Lisp worth looking at? For one thing, this fifty-year-old language has been the catalyst of many ideas we take for granted today. The if-then-else construct originally came out of Lisp, as did early attempts at object-orientation and automatic memory management with garbage collection. Lexical closures -- a hot topic for Java programmers right now -- were first explored in Lisp back in the seventies. And beyond that, Lisp still has many other unique features that other languages have yet to adopt, the kind of good ideas that are bound to make a comeback in the future.

This article is aimed at Java developers curious about Lisp. It will discuss the different dialects of Lisp that are available on the JVM today, and give you a crash course in how Lisp programming works and what makes it unique. Finally we'll look at how Lisp code can be integrated with a Java system.

There are plenty of both free and commercial Lisp systems available for many different platforms. For a Java user who wants to start out exploring Lisp, staying on the JVM is a good first choice. It's easy to get going and you get the benefit of being able to make use of all the existing Java libraries and tools that you're familiar with.

COMMON LISP AND SCHEME

There are two main dialects of Lisp: Common Lisp and Scheme. They're based on many of the same ideas, but are still different enough to stir up intense debates about which one is the better choice.

Common Lisp is an ANSI standard that was finalized in 1991, consolidating ideas from several earlier Lisps. It's a large environment designed for many kinds of application development, most famously artificial intelligence. Scheme on the other hand comes out of the academic world, is deliberately more minimalistic and has turned out to be good language for both teaching computer science and for embedded scripting. Another couple of well-known Lisps you may have run into are the smaller application-specific DSLs, like Emacs Lisp or AutoCAD's AutoLISP.

On the JVM there are implementations of both the main dialects, but the Schemes are more mature. Armed Bear Common Lisp (www.armedbear.org/abcl.html) is a reasonably complete implementation of the Common Lisp standard, but it suffers from that the distribution can't be built unless you have another Common Lisp system installed, which may be a problem for beginners.

On the Scheme-side the two main players are Kawa (www.gnu.org/software/kawa) and SISC (www.sisc-scheme.org -- the Second Interpreter of Scheme Code). For the examples in this article, we're going to use Kawa. Kawa is actually a framework for creating new languages compiled to Java bytecode, with Scheme being one of the implementations. Incidentally, its creator Per Bothner is now working for Sun on the compiler for the JavaFX project.

Another contender worth looking out for is Clojure (clojure.sourceforge.net). This is a new language, its own dialect of Lisp that sits somewhere between Scheme and Common Lisp. Being designed directly for the JVM it has the cleanest Java integration of all the Lisps mentioned so far, as well as featuring some other exciting ideas, like built-in support for concurrency and transactional memory. Clojure is still in an exploratory beta-stage right now, so it's perhaps a little too early to build something on top of, but it's definitely a project to keep your eyes on going forward.

THE READ-EVAL-PRINT-LOOP

Let's start out by getting Kawa installed. The distribution, a single jar-file, can be downloaded directly from ftp://ftp.gnu.org/pub/gnu/kawa/kawa-1.9.1.jar. Once you get the jar, add it to the classpath. With that done you're ready to start up the REPL by running this command:

  java kawa.repl
  #|kawa:1|# 

This starts Kawa and gives you a prompt. So what is this? REPL means READ-EVAL-PRINT-LOOP and is a way to interface with a running Lisp system -- it READs your input, EVALuates it, PRINTs the result, and then loops back to the prompt again. The way you develop a Lisp program is often different from the "write code, compile, run" cycle followed when programming Java. Lisp programmers typically fire up their Lisp systems and keep them running, making the line between compile- and runtime blurry. Within the REPL, functions and variables can be modified on the fly. Code is dynamically interpreted or compiled.

First up, something real simple: adding two numbers together.

  #|kawa:1|# (+ 1 2)
  3

This is the typical structure of a Lisp expression, or "form". The syntax is very consistent: the expressions are always wrapped with parenthesis, and use an prefix notation, so the plus goes before the two terms. To make a more advanced construct, you nest several forms to build a tree structure:

  #|kawa:2|# (* (+ 1 2) (- 3 4))
  -3

Scheme's built-in functions work the same way:

  #|kawa:3|# (if (> (string-length "Hello world") 5) 
                 (display "Longer than 5 characters"))
  Longer than 5 characters

Here you have an if-statement checking whether a specific string is longer than five characters. If that happens to be true, like in this case, the next expression is executed causing a message to be printed. Note that the indentation here is for readability purposes only. You could write this whole thing on one line if you wanted to.

The parenthesis-heavy style used for Lisp code is known as "s-expressions". It also doubles as a generic way to define structured data, much like XML. Lisp has many built-in functions that let you manipulate data in s-expression format very easily, and that in turn leads to one of its strengths. Since the syntax is so simple, writing programs that generate or modify code gets a lot simpler than in other languages. We'll see more of that when we get into examples of macros.

FUNCTIONS

Scheme is usually considered part of the family of functional programming languages. Unlike the object-oriented world, the main means of abstraction here are functions and the data they operate on -- not classes and objects. Everything you do in Scheme is really calling functions that take some arguments and return a result. To create a function you use the define keyword:

  #|kawa:4|# (define (add a b) (+ a b))

This defines add, which takes two arguments, a and b. The body of the function simply executes +, and automatically returns the result. Note that there are no static type declarations. All type checking is done at runtime, the same way as in other dynamic languages.

With the function defined, you can simply call it from the REPL:

  #|kawa:5|# (add 1 2)
  3

Functions are first-class citizens in the Scheme world, and can be passed around much like objects in Java. This opens up some very interesting possibilities. Let's start by creating a function that takes an argument and doubles it:

  #|kawa:6|# (define (double a) (* a 2))

And then define a list of three numbers, by calling the list function:

  #|kawa:7|# (define numbers (list 1 2 3))

Now for the exciting part:

  #|kawa:8|# (map double numbers)
  (2 4 6)

Here you called map, which takes two arguments: another function, and a list of some kind. map will iterate over each of the elements in the list and call the supplied function for each one. The results are then collected into a new list, which is what you can see getting returned to the REPL. This is a more "functional" way to do what would be solved with a for-loop in Java.

LAMBDAS

Even more convenient is to use the lambda keyword to define an anonymous function, similar to how an anonymous inner class would work in Java. Reworking the example above, you can the skip the definition of the intermediate double function and write your map statement as just:

  #|kawa:9|# (map (lambda (a) (* 2 a)) numbers)
  (2 4 6)

It's also possible to define a function that just returns a lambda. A classic textbook example is this:

  #|kawa:10|# (define (make-adder a) (lambda (b) (+ a b)))
  #|kawa:11|# (make-adder 2)
  #<procedure /dev/stdin:9>

What happened here? First you defined a function called make-adder that takes one argument, a. make-adder returns an anonymous function that expects another argument, "b". When called, this anonymous function will in turn calculate the sum of a and b.

Executing (make-adder 2) -- or "give me a function that will add 2 to the argument I send to it" -- causes the REPL to report back the cryptic <procedure /dev/stdin:12> which is just the actual lambda procedure printed out as a string. To use this you can do something like:

  #|kawa:12|# (define add-3 (make-adder 3))
  #|kawa:13|# (add-3 2)
  5

The great thing here is that the lambda works as a closure. It "closes over" and keeps references to the variables that were in scope when it was created. The lambda that was the result of the (make-adder 3) call holds on to the value of a, and when (add-3 2) is executed it will be able to calculate 3 + 2 and return the expected 5.

MACROS

The features we've looked at so far are very similar to what can be found in newer dynamic languages. Ruby, for example, also lets you use anonymous blocks to process collections of objects, exactly what we did with a lambda and the map function earlier. So let's switch gears now and look at something more uniquely Lispy: macros.

Both Scheme and Common Lisp have macro systems. When you see people referring to Lisp as "the programmable programming language", this is what they mean. With macros you can actually hook into the compiler and redefine the language itself. This is where Lisp's uniform syntax really starts paying off and the whole thing gets really interesting.

For a simple example, let's look at loops. Originally there were no loops defined in the Scheme language. The classic way to iterate over things was to either use map or recursive function calls. Thanks to a compiler trick known as tail-call optimizations recursion can be used without blowing out the stack. Later a very flexible do command was introduced, and using it to execute a loop would look something like this:

(do ((i 0 (+ i 1)))
    ((= i 5) #t)
    (display "Print this "))

Here we define an index variable, i, initialize it to zero, and set it up to be incremented by one each iteration. The looping breaks once the expression (= i 5) evaluates to true, and then #t, Scheme's equivalent of Java's boolean true, is returned. Inside the loop we just print a string.

This is a lot of complicated boilerplate code if all we need is just a simple loop. In some cases being able to do something more straight forward would be preferable, like:

(dotimes 5 (display "Print this"))

Thanks to macros it is possible to add the special syntax for dotimes to the language, using the appropriately named define-syntax function:

(define-syntax dotimes
  (syntax-rules ()
    ((dotimes count command) ; Pattern to match
     (do ((i 0 (+ i 1)))     ; What it expands to
         ((= i count) #t)
         command))))

Executing this tells the system that any call to dotimes needs to be treated in a special way. Scheme will use the syntax rules we defined to match a pattern and expand it before sending the result over to the compiler. In this case the pattern is (dotimes count command), which is then transformed into the regular do loop.

Execute this in the REPL, and you get:

#|kawa:14|# (dotimes 5 (display "Print this "))
Print this Print this Print this Print this Print this #t

Two questions are bound to come up after this example. First, why do we need to use a macro at all? Couldn't something like this be done with a regular function? The answer is no - a call to a function would actually trigger the evaluation of all its arguments before they get sent off, and that wouldn't work in this case. How would you handle (do-times 0 (format #t "Never print this")), for example? The evaluation needs to be deferred, and that can only be done with a macro.

Second, we're using the variable i inside the macro. Won't there be collisions if the code in the command expression happens to use the same name for one of its variables? No need to worry - Scheme's macros are what's known as "hygienic". The compiler automatically detects and knows how to work around naming collisions like that, completely transparent for the programmer.

Now that you've seen this, imagine trying to add your own loop construct to Java. It's close to impossible. Well, not quite perhaps -- the compiler is after all open source so you're free to download it and get right to work, but that's not really a realistic option. In other dynamic languages closures can get you a bit further along the way to make the language look the way you want, but there are still cases where those constructs are not powerful or flexible enough to completely let you tweak the syntax.

This power is why Lisp often comes up as the language to beat when the topic is meta-programming or domain-specific languages. Lisp programmers have been the long-time champions of "bottom-up programming", when the language itself is adjusted to fit your problem domain, instead of the other way around.

CALLING SCHEME CODE FROM JAVA

One of the main benefits with running another language on top of the JVM is how code written in it can be integrated with existing applications. It's easy to imagine using Scheme to model some complex business logic that tends to change often, and then embedding it in a more stable Java framework. The rule engine Jess (www.jessrules.com) is one example of an idea going in that direction, running on the JVM but using its own Lisp-like language to declare rules.

But getting interoperability between different programming languages to work in a clean way is a tricky problem, especially when the mismatch between the two is as big as for Java and Lisp. No standards exists for how to do this integration, and the dialects that live on the JVM all approach the problem differently. Kawa has relatively nice support for Java integration, and we'll keep using it to investigate how to define a Swing GUI using Scheme code.

Running some Kawa code from within a Java program is simple:

import java.io.InputStream;
import java.io.InputStreamReader;

import kawa.standard.Scheme;

public class SwingExample {	
	
    public void run() {
	
        InputStream is = getClass().getResourceAsStream("/swing-app.scm");
        InputStreamReader isr = new InputStreamReader(is);
		
        Scheme scm = new Scheme();
        try {
            scm.eval(isr);
        } catch (Throwable schemeException) {
            schemeException.printStackTrace();
        }
    }
	
    public static void main(String arg[]) {
        new SwingExample().run();
    }
}

In this example, a file containing a Scheme program called swing-app.scm is expected to be found somewhere on the classpath. A new instance of the interpreter, kawa.standard.Scheme, is created, and called to evaluate the contents of that file.

Kawa doesn't support the JSR-223 scripting APIs that were introduced in Java 1.6 yet (javax.scripting.ScriptEngine, etc). If you need a Lisp that does that, your best bet is SISC.

CALLING JAVA LIBRARIES FROM SCHEME

Before we go on to write a bigger Lisp program it's time to find a more fitting editor to work with, or keeping track of all those parenthesis will soon drive you crazy. One of the most popular choices is of course to use Emacs -- it is after all programmable in its own dialect of Lisp -- but for a Java developer sticking with Eclipse may be more comfortable. If that is the case you should install the free SchemeScript plugin before moving on. It can be found at schemeway.sourceforge.net/schemescript.html. There is also a plugin for Common Lisp development called Cusp (bitfauna.com/projects/cusp).

Now, let's look at the actual contents of swing-app.scm, and what we need to do to define a simple GUI using Kawa. This example will open up a small frame with one button in it. After the button has been clicked once it is disabled.

(define-namespace JFrame <javax.swing.JFrame>)
(define-namespace JButton <javax.swing.JButton>)
(define-namespace ActionListener <java.awt.event.ActionListener>)
(define-namespace ActionEvent <java.awt.event.ActionEvent>)

(define frame (make JFrame))
(define button (make JButton "Click only once"))

(define action-listener
  (object (ActionListener)
    ((action-performed e :: ActionEvent) :: <void>
      (*:set-enabled button #f))))

(*:set-default-close-operation frame (JFrame:.EXIT_ON_CLOSE))
(*:add-action-listener button action-listener)
(*:add frame button)
(*:pack frame)
(*:set-visible frame #t)

The first couple of lines use the define-namespace command to set up shorter names for the Java classes we're about to use, similar to Java's import statements.

We then define the frame and the button. Java objects are created by using the make function. In the button case we supply a string as an argument to the constructor, and Kawa is smart enough to do the translation to a java.lang.String object as needed.

Let's skip the ActionListener definition for now, and take a look at the last five lines first. Here the notation *: is used to trigger methods on objects. For example (*:add frame button) is the equivalent of frame.add(button). Also notice how the names of methods are automatically translated from Java's camel-case style to the lower-case words separated by dashes typical for Scheme. Behind the scenes set-default-close-operation will get turned into setDefaultCloseOperation, for example. Another detail here is how :. is used to access static fields. (JFrame:.EXIT_ON_CLOSE) is the equivalent of JFrame.EXIT_ON_CLOSE.

Now back to the ActionListener. Here the object function is used to create an anonymous class that implements the java.awt.event.ActionListener interface. The action-performed function is set up to call setEnabled(false) on the button. Some type information must be added here, to let the compiler know that action-performed is supposed to be the implementation of void actionPerformed(ActionEvent e) defined in the ActionListener interface. Earlier we said that normally in Scheme you don't need types, but in this case when you're interfacing with Java the compiler needs the extra information.

Once you have both these files, compile SwingExample.java and make sure to put both the resulting class and swing-app.scm on the classpath somewhere. Next, run java SwingExample to see the GUI. You can also have the REPL evaluate the code in the file by using the load function: (load "swing-app.scm"). This opens the door to manipulating the GUI components dynamically. For example, you can switch the text on the button by executing (*:set-text button "New text") at the REPL prompt, and see the changes take effect immediately.

Of course, this example was only meant to show how to call Java from Kawa. By no means is it the most elegant Scheme code you could imagine. If you in fact wanted to define a large Swing UI in Scheme, you're probably best off raising the abstraction level a bit and hide the messy integration code behind a couple of well-chosen functions and macros.

RESOURCES

Hopefully reading this has sparked some interest in Lisp. Trust me when I say that there are plenty of things left to explore. Here are some resources where you can learn more:

ABOUT THE AUTHOR

Per Jacobsson is a software architect at eHarmony.com in Los Angeles. He has worked with Java for ten years and been a Lisp hobbyist for two. You can reach him at pjacobsson.com.