Orangutans grok Common Lisp

This being the season to be jolly (and eat bananas), I produced a magical piece of software: a compiler and decompiler for the language Ook!  Which proves that orangutans grok Common Lisp.

You can try out this recreational and magical piece of software by downloading the library from common-lisp.net.  A brief introduction can be found, of course, in the library (if you can see it!)

Please send bananas and DO NOT say "monkey"!




(R)creational Common Lisp

In my real work, I (or better, my students and colleagues) use R to program our data analyses (if you are interested, you can check out the TRONCO package.)

Of course, being a Common Lisp quite-high priest at this point and a recreational programmer in it, I wondered how to render several R constructs in the language with parentheses.  It turns out that it was quite fun to re-engineer some of the key features of the R language.

This is not the first such project trying to mimic R.  I know at least two of them; the most public one being Common Lisp Stat, but, being known for suffering from NIH and RTW syndromes, I had to jump in.

Here is a short summary of what I did and what was achieved.

Generic Math, Arrays and Vectors

R provides generic math on vectors.  Pretty much everything in R is a vector (stored in column major order; more on this later) including arrays.  The generic math helps you writing things like
> 42 * 1:5 # The ':' operator produces a vector.
[1]  42  84 126 168 210


The generic math and the construction of some operators in CL is pretty much straightforward.  Where things get hairy is in the handling of the several ways that a R programmer may use to index, slice and dice vectors and arrays. As an example, you could write
> a = array(11:14, dim = c(2, 2), dimnames = list(list("a", "b"), list("x", "y")))

> a
   x  y
a 11 13
b 12 14

In brief, you have a matrix with "named" rows and columns. There are other idiosyncrasies but this is already something interesting to deal with, as you can now write things like:
> a[1, "y"]
[1] 3

> a["b", ]
 x  y 
12 14

The second "indexing" selects, or slices, the entire second row (counting from 1) and, as a result, returns a vector with the column names retained.

Indexing can also become interesting because of other idiosyncrasies in R.  Consider the following:
> a[2]
[1] 12

> a[42]
[1] NA # 'NA' is the 'not available' indicator.

> a[c(2, 3, 4, 10)]
[1]  12  13  14 NA

Indexing by only one index accesses the underlying vector in column major order. If the index is beyond the current limit, the value NA ('not available') is returned.  If the sole index is a vector, then the indexing returns a vector of the values in the array accessed column major order.

Indexing and Assignments

Of course, once you handle slicing and dicing of arrays, you have to deal with setting these slices and dice.

In R you can have a lot of what you can dream of (and something more, some of which, you wished was not there).

Basic assignment on vectors and arrays works as expected. Consider:
> a = array(11:19, dim = c(3, 3), dimnames = list(list("a", "b", "c"), list("x", "y", "z")))

> a
   x  y  z
a 11 14 17
b 12 15 18
c 13 16 19

> a[2, "y"] = 42

> a
   x  y  z
a 11 14 17
b 12 42 18
c 13 16 19

So far, nothing special is happening. Things actually make sense when trying to set a sub-slice of the array.
> b = array(42, dim = c(2, 2))

> b
     [,1] [,2]
[1,]   42   42
[2,]   42   42

> a[2:3, 2:3] = b

> a
   x  y  z
a 11 14 17
b 12 42 42
c 13 42 42

However, when setting a sub-slice of an array to a vector, or setting the elements of an array using a single vector index (i.e., when accessing the array in column major order), things may become a bit surprising.
> a[2:3, 2:3] = c(1, 1, 1, 1)

> a # This is not a surprise.
   x  y  z
a 11 14 17
b 12  1  1
c 13  1  1

> a[2:3, 2:3] = c(1, 2, 3, 4, 5, 6) # This is also expected (modulo the 'multiple'.)
Error in a[2:3, 2:3] = c(1, 2, 3, 4, 5, 6) : 
  number of items to replace is not a multiple of replacement length

> a[c(1, 2, 6, 7)] = c(111, 111, 111, 111) # Ok, understandable.
> a
    x   y   z
a 111  14 111
b 111   1   1
c  13 111   1

> a[c(1, 2, 12, 7)] = c(1, 1, 1, 1)

> a # What?!?
 [1]   1   1  13  14   1 111   1   1   1  NA  NA   1

Ok, the last example shows how R manages assigning past the limit of the underlying vector.

Not really a problem, once you get the hang of it, but, source of a bit of confusion nevertheless, at least to yours truly.

In any case, all of this must be dealt with once you want to get into building some facilities like these on top of Common Lisp.

Let's "Reinvent the Wheel" (RTW) in Common Lisp

Getting the basics down was not so difficult.  Common Lisp has a somewhat richer set of array and vector data types than R, however, some design decisions resulted in some fun (albeit probably sub-optimal) setup.

Here is a list of decisions taken, which may be subject to debate.

  • Objects are represented as structures.
  • Constructors follow the R conventions (no make-array!)
  • Vectors and arrays are mapped to CL vectors and arrays.
    This means that
    • Arrays are kept as multi-dimensional entities which are accessed in row major order
    • Vector and array "objects" are separate and different.
  • A generic reference function AT is introduced (and its counterpart, (SETF AT).)

Of course, these choices will make porting code from R to the CL library/tool quite cumbersome; let's say that this goal was dropped early on. The CL library (which, by the way, is named ρ, that is: RHO) will look familiar to a R programmer, but it will not be the same. After all, is it Common Lisp, isn't it?

Here is a run down of the basic features.  This was the hard part, in the opinion of the author.  Getting more of R implemented will be somewhat easier.

The AT Generic Function

The AT generic function works as expected on the standard CL objects (as it should).
RHO 10 > (defvar qwerty "qwerty")

RHO 11 > (setf (at qwerty 3) #\R)

RHO 12 > qwerty

RHO 13 > (at qwerty 4)

RHO 14 > (defvar a #2A((1 2 3) (4 5 6) (7 8 9)))

;;; ...

RHO 16 > (at a 1 1)

RHO 17 > (setf (at a 0 0) 42)

RHO 18 > a
#2A((42 2 3) (4 5 6) (7 8 9))

The generic function AT also works on structures and class instances.
RHO 22 > (defstruct foo a s d)

RHO 23 > (defvar foo (make-foo :s 42))

RHO 24 > (at foo 'foo-s) ; In this case the accessor function FOO-S is invoked.

RHO 25 > (setf (at foo 'foo-d) "Food")

RHO 26 > foo
#S(FOO :A NIL :S 42 :D "Food")

So far, so good.  Now let's have some fun.

Creating Arrays (and Vectors and Matrices)

Let's create some arrays, vectors and matrices.
RHO 28 > (setf rr (range 1 9)) ; This would be 'rr = 1:8' in R.
  1  2  3  4  5  6  7  8

RHO 29 > (setf rv (vector 1 22/23 -3.2 42 5.8))
      1  22/23   -3.2     42    5.8

RHO 30 > (setf rv2 (vector* (list 1 22/23 -3.2 42 5.8)
                            :names '(a b c d e)))
      A      B      C      D      E
      1  22/23   -3.2     42    5.8

The objects returned by VECTOR and VECTOR* (the full constructor) are wrappers around the underlying CL vectors.  Their class is RHO:VECTOR.  Note also that *PRINT-READABLY* is set to NIL, therefore, a "pretty printing" of the vectors is produced.
RHO 37 > (setf ra (array (range 1 9) :dim '(3 3)))
      [, 0] [, 1] [, 2]
[0, ]     1     2     3
[1, ]     4     5     6
[2, ]     7     8     1

RHO 38 > (at ra 1 1)

The function RANGE excludes the upper limit (following CL conventions), while the ARRAY constructors "recycles" the values passed to it (following R conventions).  The value produces is pretty-printed, while its class is RHO:ARRAY.  Indexing is 0-based, as in CL.

Creating matrices is straightforward and, again, a R-like function is used.
RHO 3 > (setf rm (matrix '(1 1 2 0) :nrow 2 :ncol 2))
      [, 0] [, 1]
[0, ]     1     1
[1, ]     2     0

RHO 4 > (at rm 0 1)

RHO 5 > (at rm 1 0)

Slicing and Dicing in RHO

Now that we have all the bases covered, we can proceed to some "slicing and dicing".  All the R indexing and assignment methods are provided in RHO, modulo the Common Lisp background choices, bugs and unimplemented features; not many of the last kind.

First of all, let's set some dimension names...
RHO 7 > (setf (dimension-names ra) '(("a" "b") ("x" "y")))
(("a" "b") ("x" "y"))

RHO 8 > ra
      [, x] [, y] [, 2]
[a, ]     1     2     3
[b, ]     4     5     6
[2, ]     7     8     1

RHO 9 > (at ra 1 "y")

Next we create a new array (a matrix, but in R, matrices are separate subclasses of arrays) rb and use it as a value to set a sub-matrix of ra.
RHO 13 > (setf rb (array 42 :dim '(2 2)))
      [, 0] [, 1]
[0, ]    42    42
[1, ]    42    42

RHO 29 > (setf (at ra (vector 1 2) (vector 1 2)) rb)
#2A((42 42) (42 42))

RHO 15 > ra
      [, x] [, y] [, 2]
[a, ]     1     2     3
[b, ]     4    42    42
[2, ]     7    42    42

As you can see, everything works as expected.

Remember that we have built a matrix rm; let's use it to set some elements of the array ra, according to the R "access-by-matrix" semantics.
RHO 16 > rm
      [, 0] [, 1]
[0, ]     1     1
[1, ]     2     0

RHO 24 > (type-of rm)

RHO 27 > (setf (at ra rm) 1024)

RHO 28 > ra
      [, x] [, y] [, 2]
[a, ]     1     2     3
[b, ]     4  1024    42
[2, ]  1024    42    42

RHO 29 > (setf (at ra #(1 2) #(1 2)) #(1 1 1 1))
#(1 1 1 1)

RHO 30 > ra
      [, x] [, y] [, 2]
[a, ]     1     2     3
[b, ]     4     1     1
[2, ]  1024     1     1

Note however, that, although "correct", no attempt to signal an error is made for incongruous dimensions.
RHO 42 > (setf (at ra #(1 2) #(1 2)) #(11 12 13 14 15))
#(11 12 13 14 15)

RHO 43 > ra
      [, x] [, y] [, 2]
[a, ]     1     2     3
[b, ]     4    11    12
[2, ]  1024    13    14

Getting the indexing of different combinations of scalars, vectors, and arrays was an interesting exercise. The implementation is probably bloated and not easily understandable; most probably because of the design choice of keeping the underlying CL vectors and arrays "as-they-are", especially multidimensional arrays.

Missing Things...

Most important one is "negative" indexing, which does not work completely yet.
Of course, I am limiting this comment to arrays and vectors: other R data structures, such as factors and data.frames are in the making, but all the heavy lifting is now done.

Want to Play?

Well, for the time being, drop me a line (you know where to find me).  We can talk about setting up a proper project and discuss some design choices.




Thanks to Masayuki Takagi, a not-so-obvious bug was fixed in the basic unification machinery of CL-UNIFICATION.  This led to the addition of a couple of new utility functions and some other cleanups (hopefully).

The result should be in Quicklisp in the next round.  Otherwise you can always clone/update/pull from the main repository.




I just revamped the CL-ENUMERATIONS library and API.

This was simply a nice cleanup of the previous library. The new documentation pages are, of course, generated with a little HEΛP, which also got some fine tuning to be able to generate doc pages from quite a variety of relatively "plain" doc-strings.



There's a pulse

Hello there,

After a long time I finally had a few hours (sic!) to get back to actual hacking.

The main thing I did was to get myself re-accustomed with the new common-lisp.net infrastructure based on GitLab. This took some time to "realign" my repositories and getting my head wrapped around the way things work.

The result is that I went back to HEΛP to fix some tricky package issues, which are the result of wanting to be too cocky. Next I want to rework the web page layout in order to use the new HTML5 "semantic" tags.

After that, I will rework the documentation of my libraries and go on to "finish" a couple of other ones I have on the back burner.

Do I have too much on my lisping plate? :) :)




Open parenthesis

It has been a year since I posted something about Common Lisp.  What have I been doing meanwhile on this front?  Well, not much visible, but I am still ill from NIH syndrome, therefore I have been cooking up a few things, while doing my real work ;)
In any case, the most time consuming things in my corner of the Common Lisp world have been:
  • Moving repos from CVS to git and getting the new common-lisp.net to agree with me (or me with it: you decide).  Some new things have been deployed on Sourceforge as I had a very old account there.
  • Fixing HEΛP to ensure that it worked nicely in most implementations and Quicklisp.
  • Building a new library called CLAST (Common Lisp Abstract Syntax Trees; reminder of "clastic rocks") that will do TRT according to my personal tastes; this library will play a role in my rebuilding of CLAZY and other little things.
Stay tuned.



Enumerated Types in CL

The Holidays gave me some time to go back to work on another of my little libraries. Here is the result.

Enumerated Types in Common Lisp

Several programming languages have the notion of enumerated type, e.g., C/C++ enum. Common Lisp can substitute the notion of enumerated type using the member type specifiers.

    (deftype colors () '(member red green blue))

This is sufficient for the annotation purposes of traditional Common Lisp, but it falls short of the standard uses that are expected of enumerated types in C/C++ and other languages. Moreover, languages like Java now provide very structured enumerated types.
Consider the following Java enumerated type, which provides an idiomatic way to express certain language processing functionalities:

    public enum Operation {
      PLUS   { double eval(double x, double y) { return x + y; } },
      MINUS  { double eval(double x, double y) { return x - y; } },
      TIMES  { double eval(double x, double y) { return x * y; } },
      DIVIDE { double eval(double x, double y) { return x / y; } };

      // Do arithmetic op represented by this constant
      abstract double eval(double x, double y);

It is not difficult to emulate such methods in Common Lisp using EQL specializers and using symbols as enumerated type tags.
On the other hand, C/C++ use of enumerated types as numeric integer constants, as in

    enum numbers {ONE,
                  FORTY_TWO = 42,

can easily be emulated in Common Lisp by using appropriate defconstant's; alas, this breaks the use of case and similar constructs.
Common Lisp programmers have, needless to say, come up with several versions of enumerated types. Most available definitions provide a DEFENUM or DEF-ENUM macro which provides functionalities similar to C/C++ enumerated types, while adding a switch or select macros as work-around the case problem. A typical rendition of the C/C++ enum just shown could be rendered as

    (def-enum numbers ONE TWO (FORTY-TWO 42) FORTY-THREE)

In the best Common Lisp N.I.H. tradition, yet anothern DEFENUM macro is necessary.

The DEFENUM Library

The present library, unimaginatively named DEFENUM, provides a new DEFENUM macro that merges most of the functionalities provided elsewhere. The objective is, as always, to make the simple things simple and the complex ones possible (and possibly, simple as well).
The simplest enumerated types are represented as expected:

     (defenum season (spring summer fall winter))

This defines an enumerated type (the macro expands - also - into a deftype) with the four tags indicating the seasons. The type checks work as expected, and the C/C++ behavior is also reproduced, as the following shows:

    cl-prompt> (typep 'fall 'season)

    cl-prompt> (+ fall winter)

The library provides also a number of facilities to handle enumerated types.

    cl-prompt> (season-p 'spring)

    cl-prompt> (season-p winter) ; No quote.

The function season accesses the individual tags.

    cl-prompt> (season winter) 

Tags can be handled as lists.

    cl-prompt> (tags 'season)

Other functions are available as well.

    cl-prompt> (loop for season in (tags 'season)
                     for s-id from spring
                     do (format t "~S is followed by ~S~%"
                                (tag-of 'season (mod (1+ s-id) 4))))
    SPRING is followed by SUMMER
    SUMMER is followed by FALL
    FALL is followed by WINTER
    WINTER is followed by SPRING

    cl-prompt> (previous-enum-tag 'season 'fall)

Enumerated types live in their own namespace.

    cl-prompt> (find-enum 'season)

Simple and Structured Enumerated Types

The season enumerated type is simple: nothing particularly complicated. On the other hand, Java enumerated types show that is possible to extend the notion of enumerated type in an interesting way, by introducing the notion of structured enumerated type.
Two interesting examples in Java [JE5] are the Planet and the Operation enumerated types.

    public enum Planet {
      MERCURY (3.303e+23, 2.4397e6),
      VENUS   (4.869e+24, 6.0518e6),
      EARTH   (5.976e+24, 6.37814e6),
      MARS    (6.421e+23, 3.3972e6),
      JUPITER (1.9e+27,   7.1492e7),
      SATURN  (5.688e+26, 6.0268e7),
      URANUS  (8.686e+25, 2.5559e7),
      NEPTUNE (1.024e+26, 2.4746e7),
      PLUTO   (1.27e+22,  1.137e6);

      private final double mass;   // in kilograms
      private final double radius; // in meters
      Planet(double mass, double radius) {
        this.mass = mass;
        this.radius = radius;
      public double mass()   { return mass; }
      public double radius() { return radius; }

      // universal gravitational constant  (m3 kg-1 s-2)
      public static final double G = 6.67300E-11;

      public double surfaceGravity() {
        return G * mass / (radius * radius);
      public double surfaceWeight(double otherMass) {
        return otherMass * surfaceGravity();

A program that exercises the Planet enumerated type is the following (the units are unimportant):

    $ cat Planet.java
    public enum Planet {


        public static void main(String[] args) {
          double earthWeight = Double.parseDouble(args[0]);
          double mass = earthWeight/EARTH.surfaceGravity();
          for (Planet p : Planet.values())
             System.out.printf("Your weight on %s is %f%n",
                               p, p.surfaceWeight(mass));

    $ java Planet 175
    Your weight on MERCURY is 66.107583
    Your weight on VENUS is 158.374842
    Your weight on EARTH is 175.000000
    Your weight on MARS is 66.279007
    Your weight on JUPITER is 442.847567
    Your weight on SATURN is 186.552719
    Your weight on URANUS is 158.397260
    Your weight on NEPTUNE is 199.207413
    Your weight on PLUTO is 11.703031

DEFENUM provides structured enumerated types as well. The Java Planet enumerated type and the test program above are rendered in Common Lisp as follows:

    (defconstant g 6.67300D-11)

    (defenum (planet (:initargs (mass radius)))
         ((MERCURY (3.303D+23 2.4397D6))
          (VENUS   (4.869D+24 6.0518D6))
          (EARTH   (5.976D+24 6.37814D6))
          (MARS    (6.421D+23 3.3972D6))
          (JUPITER (1.9D+27   7.1492D7))
          (SATURN  (5.688D+26 6.0268D7))
          (URANUS  (8.686D+25 2.5559D7))
          (NEPTUNE (1.024D+26 2.4746D7))
          (PLUTO   (1.27D+22  1.137D6))
         ((mass 0.0d0 :type double-float)
          (radius 0.0d0 :type double-float)

         (:documentation "The Planet Enum.")

         (:method surface-gravity ((p planet))
          (* g (/ mass (* radius radius))))

         (:method surface-weight ((p planet) other-mass)
          (* other-mass (surface-gravity p)))

Note the definition of methods that will be specialized on the tags of the enumeration. With the definition above, the test Java program can be rendered, as an example, as follows:

    cl-prompt> (let* ((earth-weight 175)
                      (mass (/ earth-weight (surface-gravity 'earth)))
                   (dolist (p (tags 'planet))
                     (format t "Your weight on ~S is ~F~%"
                             (tag-name  p)
                             (surface-weight p mass)))

    Your weight on MERCURY is 66.10758266016366
    Your weight on VENUS is 158.37484247218296
    Your weight on EARTH is 174.99999999999997
    Your weight on MARS is 66.27900720649754
    Your weight on JUPITER is 442.84756696175464
    Your weight on SATURN is 186.55271929202414
    Your weight on URANUS is 158.39725989314937
    Your weight on NEPTUNE is 199.20741268219015
    Your weight on PLUTO is 11.703030772485283

tag-name is necessary because the print-object method for structured tags is not too extreme; it could be made to print the tag name.
Of course, all the other functions presented before work as expected.

    cl-prompt> (planet-p 'venus)

    cl-prompt> (planet-p 'vulcan)

The Java Operation example shows how to define methods that are actually specialized on tags (the rationale is to avoid writing error-prone non-object-oriented switch statements. The Java Operation is the following:

    public enum Operation {
      PLUS    { double eval(double x, double y) { return x + y; } },
      MINUS   { double eval(double x, double y) { return x - y; } },
      TIMES   { double eval(double x, double y) { return x * y; } },
      DIVIDE  { double eval(double x, double y) { return x / y; } };

      // Do arithmetic op represented by this constant
      abstract double eval(double x, double y);

      public static void main(String args[]) {
        double x = Double.parseDouble(args[0]);
        double y = Double.parseDouble(args[1]);
        for (Operation op : Operation.values())
          System.out.printf("%f %s %f = %f%n", x, op, y, op.eval(x, y));

This form of Java enum type definition allows for the direct association of methods to tags. In Common Lisp this is nothing more than EQL specialized methods, and DEFENUM directly provides for this idiom. Actually there are two equivalent forms for it.

    (defenum operation
         (:method evaluate ('plus x y) (+ x y)) ; Note the 'quote' shorthand.
         (:method evaluate ('minus x y) (- x y))
         (:method evaluate ('times x y) (* x y))
         (:method evaluate ('divide x y) (/ x y))

... or, more similar to the original Java idiom ...

    (defenum operation
         ((PLUS () (:method evaluate (x y) (+ x y))) ; Note the 'no tag argument' shorthand.
          (MINUS () (:method evaluate (x y) (- x y)))
          (TIMES () (:method evaluate (x y) (* x y)))
          (DIVIDE () (:method evaluate (x y) (/ x y)))

The test program works as expected.

    cl-prompt> (let ((x 4) (y 2))
                  (dolist (op (tags 'operation))
                     (format t "~S ~S ~S = ~S~%" x (tag-name op) y (evaluate op x y))))
    4 PLUS 2 = 6
    4 MINUS 2 = 2
    4 TIMES 2 = 8
    4 DIVIDE 2 = 2

Therefore DEFENUM allows for all Java enumerated types idioms.

Having Your Cake...

Java enumerated types cannot be used in the way C/C++ enum types are (or, at least, not directly). This is because Java has only structured enumerated types.
DEFENUM lets you eat your cake.
Consider the following (contrived) C++ program.

// rgb.cc --

#include <iostream>

using namespace std;

enum rgb {
  RED   = 0xff0000,
  GREEN = 0x00ff00,
  BLUE  = 0x0000ff

mix_colors(enum rgb c1, int c2, int c3) { return c1 | c2 | c3; }

main() {
  enum rgb basic_color = GREEN;

  cout << "The basic color chosen is: " << basic_color << endl;
  cout << "WHITE is: " << (RED | GREEN | BLUE) << endl;
  cout << "WHITE is also: " << mix_colors(RED, GREEN, BLUE) << endl;

// end of file -- rgb.cc --

DEFENUM allows to mix and match simple and structured enumerated types, as the following (again, contrived) example shows.

    cl-prompt> (defenum (colors (:initargs (r g b)))
                  ((red   #xff0000 (255 0 0) (:method mix (c2 c3) (logior red c1 c2)))
                   (green #x00ff00 (0 255 0) (:method mix (c2 c3) (logior green c1 c2)))
                   (blue  #x0000ff (0 0 255) (:method mix (c2 c3) (logior blue c1 c2)))
                  ((r 0 :type (integer 0 255))
                   (g 0 :type (integer 0 255))
                   (b 0 :type (integer 0 255))
                  (:documentation "The Colors Enum."))

    cl-prompt> (format t "The color WHITE is ~D~%" (mix red green blue)) ; Yes, this works as is!
    The color WHITE is 16777215

    cl-prompt> (mapcar #'colors-r (tags 'colors))
    (255 0 0)

The C/C++ style numeric tag can be mixed with structured enumerated types.
Now you can eat your cake.
Not that you have to, but it is nice to know you can.

Final Remarks

The DEFENUM library is inspired by Java 5.0 'enum' classes and it allows you to build both simple and structured enumerated types in Common Lisp. It offers all the facilities of the Java version, while also retaining the C/C++ 'enum tags are integers' feature.
The use of the DEFENUM enumeration types has some limitations, due, of course, to Common Lisp slack typing and a few implementation choices.


The usual ones, plus searches of 'common lisp defenum def-enum'.
[JE5] Java Online Documentation, Enums, link.

Project site

DEFENUM is hosted at http://defenum.sourceforge.net.