Clementson's Blog

Bits and pieces (mostly Lisp-related) that I collect from the ether.

March 2004
Sun Mon Tue Wed Thu Fri Sat
1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31
Feb  Apr

Lisp is slow (NOT!)

Monday, March 8, 2004

Every so often, some troll on c.l.l. tries to upset people by saying things like "why the @#$@! parenthesis?" or "why is Lisp so slow?". Usually the c.l.l. regulars just recognize trolls for what they are and ignore these types of comments. Recently, however, a number of regulars decided to take one particular troll up on his "challenge".

The troll was (fittingly) named "nobody" and one of his contentions was that "1. The fastest Lisp implementations are slow (See any third-party benchmark)". When he was was asked what his definition of "slow" was and what "third-party benchmark" he was talking about, he threw down the gauntlet:

"The best known non-stupid (real problem, any algorithm) benchmark is probably the Coyote Gulch test. There are many languages that it has been translated into. If you can produce (write or find) and post a Lisp version that is within 10% of C performance, I will admit that #1 is incorrect."
The "Coyote Gulch" test that he selected was almabench, a program that calculates ephemeris for eight planets (excluding Pluto) using a mathematical series. It is available on Scott Robert Ladd's web site and is designed to test floating point performance. The test has already been heavily optimized for C++, Fortran and Java but there were no existing Lisp versions of the code. So, the selected benchmark was a "bit unfair" in that a lot of effort had already been put into coming up with optimized C++ code but no initial work had been done on a Lisp implementation. However, this didn't deter people.

Pascal Bourguignon made the initial translation from C++ to Lisp. Christian Lynbech, Marco Antoniotti, Gareth McCaughan, Raymond Toy, Brian Mastenbrook, rif, and Duane Rettig all contributed code and ideas to come up with a Lisp version that actually bettered the highly optimized C++ code in execution speed. Brian Mastenbrook summarized the results on a number of different processors and CL implementations. Here is one set of results (1 C++ implementation and 2 CL implementations on a single hardware platform) for comparison purposes:

Intel Pentium 4 3.0 GHz

Compiler Real Time
g++ (GCC) 3.3.1 27.374s
CMUCL 18e 26.16s
SBCL 0.8.5.24 26.895s


The source code for the final Lisp version is here. Although the resulting code wasn't very "lispish", Thomas Burdick explained why this doesn't really matter:
"rif writes:
> So what have we learned? We confirmed what we pretty much knew: you
> can write a C program in CL, at which point the relative speed of your
> C and CL versions will depend on the relative quality of the code
> generation. It's worthwhile to know the specifics of your
> implementation (for instance, it seems that a substantial amount of
> the last-mile speed up for this benchmark on CMUCL came from figuring
> out ftruncate could be fast inside a macro, but not inside an inline
> function).

We already knew this, but I think examples of how to do this are of potentially great value. As more people are drawn to Lisp, there will be more people who need to microoptimize portions of their code, but don't know how. Educationals like this are one way we can get this knowledge passed on. Hey, c.l.lisp just made something out of nothing^[^Dnobody.

> What would be even more interesting to me would be an example more
> like what I feel I experience anecdotally --- that I write programs
> that are a bit slower than C (maybe 25% to 50%), but they're much more
> flexible, more abstract, cleaner, easily modifiable, and so on. I
> don't really feel like the CL version of almabench we currently have
> shows any of the benefits of CL.

This didn't show any of the benefits of CL wrt C++, but it does show some of the benefits wrt languages like Squeak Smalltalk or Python. Sure, it's possible to call out to C or C++ or Fortran, but if you have working Lisp code, and you know how to micro-optimize for your implementation, you can spend an hour or so (how long did Duane say he spent on it?) and get C-ish performance. And still have portable CL, and all the safety benefits thereof."
Duane Rettig summarized the real value of doing a benchmark very nicely:
"I think that the lesson here is important - although I never took the troll who started this thread seriously, I definitely took his benchmark seriously. And although many here have decried nobody's business, almost to the point of even coming right out and stating the half-truth that 'there's no such thing as a good benchmark', please note that it is only half the truth; the other half is that 'there's no such thing as a bad benchmark', either; it behooves us all to take every timing test seriously enough to at least know what is occurring, and also to know the implementation you work with well enough to know its strengths and weaknesses."
Good points. This episode is unlikely to silence the trolls; however, it might have openned some people's eyes to the type of performance that can be extracted from current CL implementations.

emacs Copyright © 2004 by Bill Clementson