Nil points

Eurovision memories from back in the day (image source bbc.co.uk)

Visualising Clojure

I recently visited a splendid blog by Joseph Wilk that explains Clojure sequence functions by visualising their effects.

After being impressed by the visuals I was struck by a couple of the functions (nthnext and nthrest) that are shown to have the same effect on a non-empty sequence.

This post is supported by live code snippets and that you can live edit too:

(nthnext [0 1 2 3] 2)
(nthrest [0 1 2 3] 2)

I was not familiar with them so I started digging a bit deeper. They are based, possibly obviously, on next and rest. The only difference between next and rest (and their derivatives) is their return value when the sequence is nil or empty.

(let [nil-rest-gives (rest nil)
      empty-rest-gives (rest [])]
  [nil-rest-gives empty-rest-gives])
(let [nil-next-gives (next nil)
      empty-next-gives (next [])]
  [nil-next-gives empty-next-gives])

Seq and ye shall find

In turn the docs say that both rest and next are based on seq.

Looking at the code for next and rest we see how this is achieved: they are wrapped in method calls to native code.

Here are the Clojure definitions, stripped of most metadata for brevity:

(def
 ^{:doc "Returns a seq of the items after the first. Calls seq on its
  argument.  If there are no more items, returns nil."}  
 next (fn ^:static next [x] (. clojure.lang.RT (next x))))

(def
 ^{:doc "Returns a possibly empty seq of the items after the first.
 Calls seq on its argument."}  
 rest (fn ^:static rest [x] (. clojure.lang.RT (more x))))

Swan dive

Diving deeper, next and more are written in Java, so let’s sneak a view behind the curtain:

public static ISeq next(Object x) {
  if(x instanceof ISeq) {
    return ((ISeq)x).next();
  } else {
    ISeq seq = seq(x);
    return seq == null? null : seq.next();
  }
}

public static ISeq more(Object x) {
  if(x instanceof ISeq) {
    return ((ISeq)x).more();
  } else {
    ISeq seq = seq(x);
    return (ISeq)(seq == null? PersistentList.EMPTY : seq.more());
  }
}

The significant difference is that when rest calls into more, it will return PersistentList.EMPTY if the call to seq is a Java null. The next implementation instead returns the null.

Well, that’s cool: having drilled right the way down we can see precisely why the behaviour differs.

There maybe reason

So what does it mean to Clojure programmer? Having a variant that returns nil might seem strange in the FP world where Tony Hoare’s nillion dollar mistake has been, ahem, fixed.

In other languages there are no more accidental nils once we have the monadic finery afforded by types like Option, Maybe, Coulda, Shoulda and Woulda. Yes, I did invent those last three to see if you’re still with me.

Functionil Streaming

We don’t have Maybe or Option (or other static) types in Clojure so let’s see how this works in practice.

Let’s start with a trivial - even silly - chain of map / filter over a sequence.

(map inc (nthnext (filter even? (range 200)) 90))

Now lets write a function that does not do well with nils

(defn next-nil-fail [x] (if (= 0 x) 0 (/ x x)))
; (next-nil-fail 10)
(next-nil-fail nil)

Now, let’s throw that function in a working chain and see what happens

(defn next-nil-fail [x] (if (= 0 x) 0 (/ x x)))
(map next-nil-fail (nthnext (filter even? (range 200)) 90))

Boring but OK, now let’s see what happens when we start throwing nils

(defn next-nil-fail [x] (if (= 0 x) 0 (/ x x)))
; filter returns 100 values so 101 > 100 and nthnext will return a nil
(map next-nil-fail (nthnext (filter even? (range 200)) 101))

Nothing will come of nothing

Say what? How in the Lear didn’t it generate an error?

(defn next-nil-fail [x] (if (= 0 x) 0 (/ x x)))
(map next-nil-fail [nil])

We can see that nil-fail will fail if it is passed a nil or a collection of nils.

So something funny going is on here! Let’s look at how map works:

; relevant code snippet from map
([f coll]
 (lazy-seq
  (when-let [s (seq coll)]
    (if (chunked-seq? s)
      (let [c (chunk-first s)
            size (int (count c))
            b (chunk-buffer size)]
        (dotimes [i size]
            (chunk-append b (f (.nth c i))))
        (chunk-cons (chunk b) (map f (chunk-rest s))))
      (cons (f (first s)) (map f (rest s)))))))

And now we see: the last line shows that the map implementation uses rest to avoid producing nils on the stream when the seq is empty or nil.

Nils desperandum

But map does not inspect the content of each element so cannot stop us from hitting all fail cases.

(defn next-nil-fail [x] (if (= 0 x) 0 (/ x x)))
(map next-nil-fail (interpose nil [1 2 3]))

[NB: This throws a null pointer exception for CLJ on the JVM but CLJS is more forgiving!)]

The idiomatic route is to fix up your code:

(defn div-nil-ok [x]
  (cond (nil? x) x
        (= 0 x) 0
        :else (/ x x))) ; it's just a toy, don't make me write all the cases!
(div-nil-ok nil)
(map next-nil-fail (interpose nil [1 2 3]))

Monadic puns

Is nil a seq in Clojure?

(seq? nil)

No, the implementation of map uses a monadic approach to ensure that nils are not propagated through the chain.

Other seq functions take the same approach to ensure that functions which generate nil instead of empty lists are treated equally.

This is also called nil-punning. Oh and one small niggle with that excellent article: Eric claims

first has nothing to return if the seq is empty, and so it returns nil. Because nil is a seq, first works on nil, and returns nil. rest works on nil as well, because nil is a seq.

Naughty Eric ;-) We see that nil is not a seq but instead, the operations on sequences work to provide that effect.

But but but next

Ah, yes in the case of next and its derivatives, the sequence functions make no such concessions.

So why would such functions exist?

Functionil Recursion

Nil can be used to good effect as a terminal condition when writing recursive programs.

(loop [nums (range 10)
       results []]
  (if nums
    (recur (nthnext nums 2) (conj results (first nums)))
    results))

Do not change nthnext for nthrest ;-)

Conclusion

Clojure has made working with nil pain free in general due to the design of the sequence operations.

There are however still some edge cases that need to be addressed where nils can be inserted into sequences.

There are also many options (or enough rope) for adventurous programmers to use them where they have value.

Follow up

I am reserving this space to explain in more detail why nil is treated differently between CLJ and CLJS.

Comments / links to help me are welcome!

Credits

This post was written using the power of KLIPSE to enable the interactive code blocks.