Iteration in Scheme

drobilla.net ⋅ Thu 13 March 2008

Thoughts brought on by using SWIG to make Scheme bindings for C/C++ code:

In Scheme, being a LISP, pretty much everything is usually a list. This is great and makes for elegant code, but as a data structures guy I cringe to see the ridiculous amount of O(n) searching done in LISP code. Why is the underlying structure (a list) a dependency of the code using it? Lack of abstraction: car and cdr only work on cons cells, and lists are made of cons cells - period. This sucks for plenty of reasons (the one that encouraged this rant is making Scheme bindings for C++ code).

Say you want to expose a C or C++ list in Scheme. To be really useful in Scheme, it will have to work with car and cdr. Luckily any list has the concept of 'get the value at this position' (car) and 'get the next position' (cdr). For C++ the position is an iterator, car is *, and cdr is ++. Unfortunately making 'nice' (ie typical and un-weird) Scheme bindings for these means you literally have to duplicate the entire C++ list and build a Scheme list out of it, because there's no functional abstraction going on here, and the type system is nonexistant.

I guess this boils down to a rant about needing iterators in Scheme. I cringe at sounding so much like an OO "patterns" kind of guy, but there it is :)

Lets say our Scheme dialect has some a type system with polymorphism. Scheme code is generally something like:

(define (do-something list)
  (something (car list))
  (if (not (null? (cdr list)))
    (do-something (cdr list))))

(define a-list '(1 2 3 4 5))

(do-something a-list)

Rename car "get" and cdr "next" to avoid confusion, and use an iterator:

(define (do-something iter)
  (something (get iter))
  (if (valid? (next iter))
    (do-something (next iter))))

(define a-collection (whatever))

(do-something (start a-collection))

The idiom for do-something (the fundamental LISP idiom if there ever was one) is the same. The only difference is the call to do-something must pass an iterator, here with the start function. Slightly unfortunate maybe, but the benefits are huge:

Common list-like iteration for all data structures (e.g. trees)
Different styles of iteration (use reverse-end instead of start and have do-something work in reverse without modification)

I suppose since I'm assuming a stronger/better/faster Scheme dialect with a good type system, a collection could just automatically convert to the 'start' iterator, and the calling code could even be the same if you just want classic LISP style code. I'm a huge fan of LISP style languages (mostly because the code is a simple list structure itself), but there really needs to be an abstraction between the underlying data structure and the code that uses it.