Learning Clojure

Tuesday, August 24, 2010

SLIME debugger

http://hugoduncan.org/post/2010/swank_clojure_gets_a_break_with_the_local_environment.xhtml
```
;; All power to:

;; http://hugoduncan.org/post/2010/swank_clojure_gets_a_break_with_the_local_environment.xhtml

;; Who has given us a way to debug things under emacs/slime/swank, by stopping a
;; running program to examine the variables

;; First define this function under emacs/slime/swank

(defn factorial [n]
(when (= n 23) (swank.core/break))
(if (< n 2) n
(* n (factorial (dec n)))))

;; Then try
(factorial 30)
;; at the repl, so it runs in the repl thread, rather than executing it with
;; C-M-x or C-x e, which for some reason doesn't work as well.

;; Hugo's article explains how to view the local environment and create a repl
;; in context so that you can examine the state of the program when break was
;; called. You can then restart as if nothing had happened.

;; It's not as wonderful as traditional SLIME/LISP debugging, but it's a good
;; start!

```

Reduce: Not Scary

```;; Three very fundamental operators in any sort of programming are map, filter
;; and reduce.

;; They represent the common programming tasks of transforming a collection,
;; selecting from it, and summing over it.

;; Most people think that map and filter are fairly obvious, but there seems to
;; be a certain amount of confusion about reduce.

;; But it's actually very simple in concept, and represents an easy idea.

;; Often one needs to loop over a collection, and store results in an
;; accumulator.

;; The simplest example of a reduction would be adding the numbers in a list.

;; Suppose the numbers are 1,2,3,4,5,6,7,8,9,10 and we want to find their sum

;; In C, or another naturally imperative language, we'd say:

;; int list[]={1,2,3,4,5,6,7,8,9,10};

;; int len=10;

;; int a=0;
;; int i;

;; for (i=0; i<len; i++){
;;      a += list[i];
;; }

;; Using atoms to provide mutable state, we can do something similar in Clojure:

(def lst '(1 2 3 4 5 6 7 8 9 10))
(def len 10)

(def a (atom 0))

(dotimes [i len]
(swap! a + (nth lst i)))

;; The value ends up in the atom a, just as in the C version.

;; In clojure, this looks slightly more complicated.

;; Partly because mutation in clojure is intrinsically more complicated, because
;; clojure is extremely concerned with thread safety, and so we need to allocate
;; and dereference atoms rather than mutating local variables.

;; And partly because C has very good notations for its fundamental operations.

;; But logically they're the same algorithm.

;; But I'd feel dirty writing this code in clojure, even though that would have
;; been a perfectly good piece of LISP in the sixties. It's just a feeling that
;; I have that it is better to avoid mutation unless it's actually necessary.

;; Even though the mutation-less algorithms are often harder to write, they're
;; often easier to debug and test.

;; A more natural way to accumulate over a list in clojure is the
;; loop-as-function-call, with accumulator and iterator as parameters:

(loop [a 0 i 0]
(if (= i len) a
(recur (+ a (nth lst i)) (inc i))))

;; This is much more idiomatic code in clojure, and it doesn't mutate any values
;; even though the variable-rebinding in the recur call produces a very similar

;; effect.

;; And here the final value is the value of the expression, which is nicer.

;; Of course, clojure's lists know when they are empty, so we don't need an
;; explicit loop counter.

(loop [a 0 l lst]
(if (empty? l) a
(recur (+ a (first l)) (rest l))))

;; l moves along the list, while a accumulates the values.

;; It still looks a bit long-winded, but we can easily imagine that this is a
;; common pattern:
(loop [a _ l _]
(if (empty? l) a
(recur (_ a (first l)) (rest l))))

;; Where the blanks represent holes in the boilerplate we have to fill in.

;; It should be almost as common as the equivalent pattern:

;; a= _
;; for(i=0; i<_; i++)
;; {
;;   a _= _ [i]
;; }

;; is in C.

;; Where in both cases we need to fill in the _ with the initial value of the
;; accumulator, the list to be accumulated over, and the operation to be
;; performed.

;; Pretty much the first law of programming is:
;; If you see a common pattern, you should name it and abstract it so it goes away.

;; The pattern is called reduce.

;; We need to fill in the blanks with the function to do the accumulating,
;; the initial value of the accumulator, and the list

;; Since we're reducing the list lst, using the operation +, and starting
;; with the value zero, we write:

(reduce + 0 lst)

;; reduce is clojure's natural way of expressing accumulation over a list

;; in the same way as the for-loop over += and ++ is C's

;; Here are some other examples

(reduce * 1 lst)

;; This produces the product of the numbers in the list.

;; In these cases where the order of the arguments doesn't matter

;; we can think of reduce as 'put the function between the values'

(reduce + 0 '(1 2 3 4 5 6 7 8 9 10)) ; (0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10)
(reduce * 1 '(1 2 3 4 5 6 7 8 9 10)) ; (1 * 1 * 2 * 3 * 4 * 5 * ........)

;; But this image is actually harmful when the ordering does matter.

;; What's really going on is that the operator is being used to feed values
;; into the accumulator one by one.

(reduce + 0 '(1 2 3))
;; proceeds like:
;; (+ 0 1) -> 1
;;         (+ 1 2) -> 3

;;                 (+ 3 3) -> 6

;; If this seems confusing, even though you're happy with the C version,
;; think about the C loop.

;; They're actually just different ways of expressing the same idea,
;; and should look equally natural. Seeing how either one can always be
;; transformed into the other will help.

;; Here's an example where the order does matter:
(reduce conj '() '(1 2 3))

(conj '()  1)   ; -> '(1)
(conj '(1) 2)   ; -> '(2 1)

(conj '(2 1) 3) ; -> '(3 2 1)

;; So
(reduce conj '() '(1 2 3)) -> '(3 2 1)

;; Of course this simple reduction is so common that the pattern
;; (reduce conj '() _ ) already has a name

(reverse '( 1 2 3 ))

;; Here's the definition of reverse in the clojure.core source code!
(defn reverse
"Returns a seq of the items in coll in reverse order. Not lazy."

[coll]
(reduce conj () coll))

;; An acceptable definition of reduce itself would be:

(defn my-reduce [fn init coll]
(loop [acc init l (seq coll)]
(if (empty? l) acc
(recur (fn acc (first l)) (rest l)))))

;; This works on any collection that can be made into a sequence:
(my-reduce * 1 '(1 2 3)) ;; a list
(my-reduce * 1 #{1,2,3}) ;; a set
(my-reduce * 1  [1,2,3]) ;; a vector

;; The real reduce in clojure.core is an optimised version and can deal with all
;; sorts of collections efficiently, but in spirit it is just making every
;; collection into a sequence and then doing what my little skeleton above did.

;; It also has another feature, which is that if you don't provide an initial
;; value for the accumulator, then it takes the first element of the sequence as
;; its initial value, and accumulates over the rest of the sequence.

;; For operations which produce answers of the same type as their arguments,
;; this is often what you want.

(reduce * '(1 2 3 4)) ;24

(reduce +  [1 2 3 4])  ;10
(reduce bit-xor '(1 2 3 4 5)) ;1

;; So why has this simple operation got a scary reputation?

;; I think it's because all the common cases are so useful that they have

;; already been further abstracted away, like reverse.  So in fact you don't
;; meet it that often in practice.

;; Let's see if we can construct something more interesting:

;; Suppose you had a list of strings

(def strlist '("fred" "barney" "fred" "wilma"))

;; And wanted to count how many times each string occurs in the list.

;; We want an accumulator to keep the strings and counts in, and a function
;; which will take that accumulator, and a new string, and return the updated
;; accumulator.

;; The obvious accumulator is a map. We'd want the answer to be something like
{"fred" 2, "barney" 1, "wilma" 1}

;; So what function will add strings to a map?

;;In a rather naive and long-winded way:

(let [oldval
(if (contains? map string)
(map string)
0)]
(assoc map string (inc oldval))))

;; Here's how we'd use it to count our list, starting from the empty map {}, and
;; using addtomap to add each string into the accumulator returned by each call.
(addtomap {} "fred")                      ;; {"fred" 1}
(addtomap {"fred" 1} "barney")            ;; {"barney" 1, "fred" 1}

(addtomap {"fred" 1, "barney" 1} "fred")  ;; {"fred" 2, "barney" 1}
(addtomap {"fred" 2, "barney" 1} "wilma") ;; {"wilma" 1, "fred" 2, "barney" 1}

;; So the reduce part is obvious, once you have addtomap

;; But a real Clojure programmer would look at addtomap and think:

;; We can write (map string 0) instead of
;; (if (contains? map string)
;;           (map string)

;;           0)

;; So a better version of addtomap would be:

(let [oldval (map string 0)]
(assoc map string (inc oldval))))

;; And now the let statement looks redundant, so let's say
(assoc map string (inc (map string 0))))

;; And then he might say
;; "since I'm only going to use this function here, why not make it anonymous?"
(fn [map string] (assoc map string (inc (map string 0))))

;; And now the reduce looks like:
(reduce (fn [map string] (assoc map string (inc (map string 0)))) {} strlist)

;; And, well, at this point, any reasonable man is going to think:
;; "Since I'm writing a one-liner, I might as well use the anonymous shorthand"

#(assoc %1 %2 (inc (%1 %2 0)))

(reduce #(assoc %1 %2 (inc (%1 %2 0))) {} strlist)

;; And if you already understand reduce and anonymous functions, and how maps
;; work, this is actually not too hard to understand.

;; In fact this is the version of the function that I originally wrote.

;; But I can see it might be a bit off-putting if you thought reduce itself was
;; scary.

;; Actually the obfuscation / pleasing terseness is all in the anonymous
;; function, and the behaviour of maps, and the reduce bit isn't scary at all.

;; Here's another deliberately obscure example, using a little structure as an
;; accumulator.  See if you can figure out what it does using the above ideas to
;; unpack it. I'm using the destructuring notation to take the little structure
;; apart, and then the function modifies each part and puts them back together again

(reduce (fn[[c s] n] [(+ c n), (str s n)]) [0,""] lst)

;; The trick is to work out what the anonymous function does to the starting
;; value of the accumulator when it gets a value from the list.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; Clojure's natural facility with abstractions and small functions allows
;; some truly terse code.

;; This little piece of code counts words in a file and orders them by popularity:

(sort #(< (%1 1) (%2 1))
(reduce #(assoc %1 %2 (inc (%1 %2 0))) {}
(clojure.contrib.string/split
#"\W"
(slurp "/home/john/hobby-code/reduce.clj"))))

;; With practice this sort of thing is actually readable. Promise!

;; But if I was actually writing it for someone else to read,
;; I'd probably split it up and give the bits names.

(let [filecontents (slurp "/home/john/hobby-code/reduce.clj")
words        (clojure.contrib.string/split #"\W" filecontents)
wordmap      (reduce #(assoc %1 %2 (inc (%1 %2 0))) {}  words)
sortedwords  (sort #(< (%1 1) (%2 1)) wordmap)]
sortedwords)

;; And if I knew the library, I'd remember that two of those little operations

;; The idiom
( reduce #(assoc %1 %2 (inc (%1 %2 0)) {} .... )

;; which I used to find myself writing all the time, is such a useful thing
;; that it too has made it into clojure.core as the function frequencies:

(frequencies strlist)

;; and the sort with the comparator on the second elements can be replaced by sort-by, and second

(let [filecontents (slurp "/home/john/hobby-code/reduce.clj")
words        (clojure.contrib.string/split #"\W" filecontents)
wordmap      (frequencies words)
sortedwords  (sort-by second wordmap)]
sortedwords)

;; And then I'd abstract away the word counting operations from the file reading part

(defn sorted-word-frequencies [string]
(sort-by second (frequencies
(clojure.contrib.string/split #"\W+" string))))

;; So now I can ask for the 10 most popular words:

(take 10 (reverse (sorted-word-frequencies (slurp "/home/john/hobby-code/reduce.clj"))))

;; which is also pleasingly terse, but I think more readable.

```

Wednesday, August 18, 2010

A Special Squarish Age

Microsoft Research published some problems for their 15th anniversary, some of which are quite tricky.
This one isn't, but I liked the program I wrote to solve it because it demonstrates how nice lazy streams are.

```
;; Microsoft Research 15th Anniversary Problem: A special squarish age.

;; A number is squarish if it is the product of two consecutive integers.
;; e.g. 6 is squarish because it is the product of 2 and 3

;; A colleague claims that his age is squarish, and furthermore, that his last squarish birthday was a squarish number of years ago.
;; What age could he be?

(def squarish (map (fn[[x y]] (* x y)) (partition 2 1 (iterate inc 0))))
(def squarishdiffs (map - (drop 1 squarish) squarish))

(defn find-in-increasing-seq [x seq]
(cond (empty? seq) false
(< (first seq) x) (recur x (rest seq))
(= (first seq) x) true
(> (first seq) x) false))

(defn squarish? [x] (find-in-increasing-seq x squarish))

(def possible-ages (mapcat
(fn [s d] (if (squarish? d) (list s) '()))
(drop 1 squarish)
squarishdiffs))

(take 10 possible-ages)

;; He's probably 42. 12 is a bit young for a colleague, 110 is pushing it.```

Saturday, August 7, 2010

Returning after a long absence (Incanter and Clojure version 1.2)

It's been five months since I last did anything in Clojure. What has changed?

Much to my relief my pom.xml file still produces a working version of clojure which I can connect to from emacs using:

\$ mvn clojure:swank
and M-x slime-connect from emacs.

Now the first thing I want to do, with Clojure being a bit of a moving target, is to find out what the latest versions of everything are.

I'll use maven's versions plugin to find out. Add this snippet to the pom file.

<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>versions-maven-plugin</artifactId>
<version>1.2</version>
</plugin>

[INFO] The following dependencies in Dependencies have newer versions:
[INFO]   org.clojure:clojure ............................... 1.1.0 -> 1.2.0-RC2
[INFO]   org.clojure:clojure-contrib ....................... 1.1.0 -> 1.2.0-RC2
[INFO]   swank-clojure:swank-clojure .................. 1.2.0-SNAPSHOT -> 1.2.1

So it looks like clojure's gone up a version, which is about to be released.

\$ mvn versions:use-latest-versions

updates these in the pom automatically

\$ mvn clojure:swank

and rather remarkably, everything still seems to work.

Slime says, on connecting: Versions differ 2010-02-01 (slime) vs 2009-09-14, which is interesting since swank-clojure was one of the things updated, and 14th September last year isn't very recent.

The first thing I want to look at is incanter, so I go to http://clojars.org and search for incanter. Clojars tells me that the maven snippet should be

<dependency>
<groupId>incanter</groupId>
<artifactId>incanter</artifactId>
<version>1.2.3-SNAPSHOT</version>
</dependency>

So I add that to my pom.xml

\$ mvn clojure:swank

Now causes a lot of downloading... including clojure-1.2.0-SNAPSHOT and clojure-contrib-1.2.0-SNAPSHOT, which are presumably dependencies of incanter. I wonder how different they are from the release candidates?

At any rate the server starts, so (I've never used incanter), I'll try the first test I find when googling:

; SLIME 2010-02-01
user>
user> (use '(incanter core charts))
nil
user> (view (function-plot sin -4 4))

And up pops a very nice graph.

I must say, that was all a lot easier than I was expecting!