Search This Blog

Wednesday, September 30, 2009

How it Works: the Monad

I'm currently reading the excellent tutorial on monads here: and paraphrasing it to help me understand.
The simplest monad is let, the identity monad.

(let [a 1]
  (let [b 2]
    (let [c (* a b)]
      (let [d (* a a)]
        (let [e (+ (* b b) (/ c d))]
          (let [f (+ c d e)]
            (let [g (- c)]
              (* a b c d e f g))))))))

The let above represents a complex computation.
During the computation, values are computed, names are bound to values, values are used as inputs to functions.
Values could also be pulled in from the global namespace, and side effects could be produced. Let is a very simple thing allowing the representation of very complex and powerful things.
In fact, let is only a syntactic variant of lambda, or as clojure calls it, fn.

((fn[a] (*a 2)) 1)

is exactly equivalent to

(let [a 1]
  (* a 2))

In fact in some lisps, that's how it's implemented.
If all we had was fn, we could build let with a simple macro.

(let [a 1]
  (let [b (inc a)]
    (* a b)))

is just

((fn [a]
   ((fn [b]
      (* a b)
      ) (inc a))
   ) 1)

And it's easy to transform one into the other.
But the second form is much harder to read.

If we didn't have let, how could we do this sort of thing and remain sane?
We could make it easier to read by defining the function bind:

(defn bind [value function]
  (function value))

This, by reversing the order of function and argument, allows us to write:

(bind 1 (fn [a]
          (bind (inc a) (fn [b]
                          (* a b)))))

Thus putting the names nearer to the values they take.

This being lisp, we could introduce a special syntax to take away the boilerplate:

(defmacro with-binder [binder bindings expression]
  (if (= 0 (count bindings))
    `(~binder ~(second bindings) (fn[~(first bindings)]
                                (with-binder ~binder ~(drop 2 bindings) ~expression)))))

and we have let back!

(with-binder bind 
  [a 1
   b (inc a)]
  (* a b))

Notice that I've put bind as a parameter of the macro. It could have been hard coded, but now we can put other functions in its place.

There are sometimes reasons to use a different bind.

For instance, suppose that we have functions that can produce nil.
Consider a function which looks something up in a list and returns nil if it's not there.

(defn string->int[x]
  (condp = x "one" 1 "two" 2 nil))
(defn int->string[x]
  (condp = x 1 "one" 2 "two" nil))

Here are some unit tests:

(map string->int '("one" "two" "three"))
(map int->string '(1 2 3))

If we want to compute without throwing exceptions, we need to catch nils and short-circuit the bits of the computation that can't deal with them.

(defn buggy-add-string [a b]
  (let [av (string->int a)
        bv (string->int b)
        cv (+ av bv)
        c (int->string cv)] 

(buggy-add-string "one" "one")   ;;-> "two"
(buggy-add-string "one" "two")   ;;-> nil
(buggy-add-string "one" "three") ;;is an error. 

We need instead to write:

(defn guarded-add-string [a b]
  (let [av (string->int a)]
    (if (nil? av) nil
        (let [bv (string->int b)]
          (if (nil? bv) nil
              (let [cv (+ av bv)]
                (let [c (int->string cv)]

(guarded-add-string "one" "one")   ;;-> "two"
(guarded-add-string "one" "two")   ;;-> nil
(guarded-add-string "one" "three") ;;-> nil 

This sort of code is repetitive, difficult to read and understand, boring and error-prone to write. If we want to do this sort of thing (and we want to do it all the time!) we need to find a way of abstracting the pattern away, so that we can leave the interesting parts to be our program.

First let's notice that it doesn't do any harm to check for nils even after functions that don't produce them, so we could make the code more uniform by always checking:

(defn over-guarded-add-string [a b]
  (let [av (string->int a)]
    (if (nil? av) nil
        (let [bv (string->int b)]
          (if (nil? bv) nil
              (let [cv (+ av bv)]
                (if (nil? cv) nil
                    (let [c (int->string cv)]
                      (if (nil? c) nil

Of course, this makes the readability even worse, but we have more hope of abstracting away a uniform pattern.

By analogy with the bind function above, we can use maybe-bind to abstract away the checking.

(defn maybe-bind [value function]
  (if (nil? value) nil
      (function value)))

(defn maybe-add-string [a b]
  (maybe-bind (string->int a) (fn [av]
        (maybe-bind (string->int b) (fn [bv]
              (maybe-bind (+ av bv) (fn [cv]
                    (maybe-bind (int->string cv) (fn[c]

(maybe-add-string "one" "one")   ;;-> "two"
(maybe-add-string "one" "two")   ;;-> nil
(maybe-add-string "one" "three") ;;-> nil

This is still a bit of a nightmare, but a similar pattern to the one above has emerged.

If we use our macro from above, it looks much better

(defn monadic-add-string [a b]
  (with-binder maybe-bind
    [av (string->int a)
     bv (string->int b)
     cv (+ av bv)
     c (int->string cv)] 

but it still works:

(monadic-add-string "one" "one") ;;-> "two"
(monadic-add-string "one" "two") ;;-> nil
(monadic-add-string "one" "three") ;;-> nil

This is just like the code we would have written if we'd been happy to let nils cause exceptions.

We've literally substituted "with-binder maybe-bind" for "let" and all the horror has gone away.

I think this would be pretty impressive of itself, since removing exactly this source of complexity was the major motivation for inventing exception handling.

But it turns out to be an example of a general pattern where you want to process values before assigning them to variables in a series of lets.

Monday, September 28, 2009

A Simple Monad Example

A Simple Monad Example

There's an excellent tutorial on monads here:

Here's a very simple example of the use of monads, which are nothing more scary than a generalization of 'let'.

We'll calculate Pythagoras' rule in words.

Here are some numbers as strings:

(def integers (partition 2 '[1 "one" 2 "two" 3 "three" 4 "four" 5 "five" 12 "twelve"]))

This is a little recursion to look up a string or number and return its partner:

(defn swap
  ([x] (swap x integers))
  ([x lst]
     (if (=(count lst) 0) nil
         (let [[a s] (first lst)]
           (if (= x a) s
               (if (= x s) a
                   (recur x (rest lst))))))))

The function returns nil if it can't find a match. This seems reasonable behaviour for our pythagoras function too.

(defn buggy-pythag[xs ys]
  (let [x (swap xs)
        y (swap ys)
        x2 (* x x)
        y2 (* y y)
        z (Math/sqrt (+ x2 y2))
        zs (swap z)]

Now (buggy-pythag "three" "four") is "five" as we expect, but sadly, if we call (buggy-pythag "one" "six") , then an exception is thrown.

The problem is that nils passed into the maths functions cause exceptions. In order to make the program work we'd have to wrap all the calls to swap in code to detect nils and return nil instead of proceeding with the calculation.

Fortunately, we can replace the let statement with a monadic computation which does just that. There's a built-in monad library:

(use 'clojure.contrib.monads)

The identity monad is an exact drop-in replacement for let. The maybe monad is like that, but also does what we want, immediately returning nil as the result of the whole computation as soon as it sees one produced as the value of an intermediate step. All we have to do is replace 'let' with 'domonad maybe-m'.

(defn pythagoras [xs ys]
  (domonad maybe-m
           [x (swap xs)
            y (swap ys)
            x2 (* x x)
            y2 (* y y)
            z (Math/sqrt (+ x2 y2))
            zs (swap z)]

Now we can call our function safely on any two strings:

(pythagoras "three" "four")
(pythagoras "four" "five")
(pythagoras "one" "six")
(pythagoras "five" "twelve")

Wednesday, September 23, 2009

further macros

The other day while writing a program, I found myself repeatedly typing the same code over and over again.

To avoid unnecessary explanations, let us imagine that it was the factorial program.

(defn factorial [n]
  (if (= n 0) 1 (* n (factorial (dec n)))))
Because factorial is such a tricky program, and so likely to go wrong, I find myself paranoid testing the function

(factorial 1)
(factorial 10)
(map factorial (range 10))
I'm using an environment where with a single click I can evaluate a piece of code, so I write these little paranoid checks into the file, and then whenever I modify the function I can go and re-evaluate the tests to check that I haven't broken it.

Eventually I find that, while I can't be bothered going back and doing this too often, I'd like my little set of regression tests to be run every time I recompile the file. This catches many mysterious bugs, and is generally worth its weight in gold. I find if this is going on, then I don't miss the security of static type checking.

It's also nice to have the original tests still in the file, so that I can execute them explicitly if I want to, and they also serve as handy documentation for the function.

What I end up writing is:

 (= (factorial 1) 1)
 (= (factorial 10) 3628800)
 (= (map factorial (range 10)) '(1 1 2 6 24 120 720 5040 40320 362880)))
Emacs' handy evaluate-and-put-the result-into-the-file function (C-uxe) is invaluable here.

Here is a single expression which gets executed every time the file is loaded, which runs my factorial function in a number of cases.

That's actually really handy in itself, and I can execute it manually, and I can execute the sub-expressions.

But it silently throws the result away if the test fails on load, so although it will catch type errors, it doesn't catch bugs.

So I change it to be:

(let [test
       (= (factorial 1) 1)
       (= (factorial 10) 3628800)
       (= (map factorial (range 10)) '(1 1 2 6 24 120 720 5040 40320 362880)))]
  (println "factorial" "tests" (if test "pass" "fail"))

In fact the code that I usually use in scheme is more complicated than this, and will tell me how many tests passed, and which tests failed, and what the answers for the failing tests were and what they should have been, and it will deal with exceptions that get thrown, and it will deliberately cause exceptions and check that they happen as expected, and it will redirect standard output and standard error so that the anything printed by the functions can be checked too, etc, etc, but this is a good start.

But at this point I'm already annoyed that after every function that I write, I have to repeat a block of code like this, and I find myself using cut and paste a lot, which is never a good sign, and if I want to improve the test code I find myself making the same changes in various different places, and in short my senses are screaming for an abstraction of some sort.

In fact for this example a functional approach would work, but I know from doing this many times before in various languages that it always ends up being difficult when you come to reporting errors, and so you end up, if you can, doing it as a macro eventually, so I'll make it a macro from the start.

Now, what I want is to be able to write

(testset "factorial"
         (= (factorial 1) 1)
         (= (factorial 10) 3628800)
         (= (map factorial (range 10)) '(1 1 2 6 24 120 720 5040 40320 362880)
and have the compiler treat it as if I had written:

(let [test
       (= (factorial 1) 1)
       (= (factorial 10) 3628800)
       (= (map factorial (range 10)) '(1 1 2 6 24 120 720 5040 40320 362880)))]
  (println "factorial" "tests" (if test "pass" "fail"))
or getting rid of the specifics:

(testset title test1 test2 ...) ->

(let [test (and
  (println title "tests" (if test "pass" "fail"))

and this is a very easy macro to write.

(defmacro testset [title & tests]
  `(let [test# (and 
     (println ~title "tests" (if test# "pass" "fail"))
Note how we use the backquote syntax so that the macro looks very like the desired code
and ~ (unquote) to put the title where it should be
and ~@ (splicing unquote) to put the variable argument list of tests into the code as if they'd been typed there one by one, and not as a list
and we've got a local variable test which we want to get generated as an opaque symbol so that it won't interfere with any other variables, so we use clojure's auto-gensym facility to do this automatically, just by calling it test# .

We can use macroexpand to see what's going on

(macroexpand '(testset "hi" (= (println "Hello World") nil)))
on my machine the generated code looks like:

(let* [test__6260__auto__ (clojure.core/and 
                           (= (println "Hello World") nil)
      (clojure.core/println "hi" "tests" (if test__6260__auto__ "pass" "fail")) 
and we can see that clojure is doing something clever in its backquote syntax that means that the println and the and in the macro are getting resolved into the namespace where the macro was defined, but the if, and the println that was in the test are getting left unresolved.

And I have to confess that I am not sure what is going on here, but it seems to work!

It worries me that I don't understand the macro mechanism here.

In classic lisp, macros are very simple and powerful things, but there are a couple of traps into which it is easy to fall for beginners. But they are not too difficult to avoid if you understand how things work

In classic scheme, there is a hygenic macro mechanism, which is hard to understand but results in the traps not being there. But it also results in some of the power being difficult to get at. Once you actually grok it, however, it's a dream to use.

Clojure appears to have taken a middle path which I do not yet understand. Macros seem easy to write and the usual traps seem covered.

I worry that there may be different traps for the unwary.

But it is so useful that I am doomed to find them, if they are there....

And this is a good thing!

macros 101

Having thoroughly embarrassed myself recently by being unable to get a simple debug macro working whilst trying to demonstrate how simple and useful they are, I thought I'd write the example down. Suppose we have a simple program:

(def a 10)
(def b 20)

(println "answer:" (* a b)) 
And suppose we want to debug it. We may wish to insert tracing statements, that print out the values of various expressions as the program runs.

These tend to look like:

(println "the current value of a is" a)

or equivalently:

(println "the current value of" (quote a) "is" a)

This rapidly gets old, particularly if what we want is to look at the values of complex expressions:

(println "the current value of" (quote (* a b)) "is" (* a b))

We can make the repeated text go away with a function:

(defn debug-fn [exp val]
  (println "the current value of" exp "is" val)) 
which allows us to write

(debug-fn '(* a a) (* a a)) 
But that's as far as it goes. We still need to write the expression twice, and we still need the ', or (quote ) to stop the literal expression from getting evaluated.

Even if we're prepared to use run-time evaluation, we have to jump through hoops to make sure that the expression gets evaluated in the correct environment. Which is worse.

Analogues of this exact problem drive me up the wall in every language which doesn't have macros. It's such a useful construct that I'm surprised that there isn't a special way to do it.

In the absence of such a special feature, what would be nice, is if we could say to the compiler:

" every time you see something like

(debug (* a a)) 
imagine I'd written

(println "the current value of" (quote (* a a)) "is" (* a a)) 
instead. "

And in a lisp, we can:

(defmacro debug [var]
  `(println "the current value of " (quote ~var) "is" ~var)) 
here are some examples of the macro in use:

(debug a)
(debug b)
(debug *)
(debug (* a b))
(debug (* 1 2 3 4)) 
We can ask the compiler exactly what it pretends to see when it sees (debug a)

(println (macroexpand '(debug a)))

or we could be smug:

(debug (macroexpand '(debug a))) 

Saturday, September 12, 2009

threads and transactional memory

This beautiful program is straight from the clojure docs: It creates nvecs vectors of nitems numbers, e.g. [1 2 3][4 5 6][7 8 9] and then runs nthreads threads in parallel, swapping numbers from vector to vector niters times

(defn run [nvecs nitems nthreads niters]
  (let [vec-refs (vec (map (comp ref vec)
                           (partition nitems (range (* nvecs nitems)))))
        swap #(let [v1 (rand-int nvecs)
                    v2 (rand-int nvecs)
                    i1 (rand-int nitems)
                    i2 (rand-int nitems)]
                 (let [temp (nth @(vec-refs v1) i1)]
                   (alter (vec-refs v1) assoc i1 (nth @(vec-refs v2) i2))
                   (alter (vec-refs v2) assoc i2 temp))))
        report #(do
                 (prn (map deref vec-refs))
                 (println "Distinct:"
                          (count (distinct (apply concat (map deref vec-refs))))))]
    (dorun (apply pcalls (repeat nthreads #(dotimes [_ niters] (swap)))))

Monday, September 7, 2009

a dynamic java with a REPL

I found some interesting tips in a thread on Stack Overflow. It turns out that from the clojure REPL, java looks quite like a dynamic language. For example:

user> (println (apply str (interpose "\n" (map str (.getMethods (class "")))))
There are also some nice goodies in repl-utils. Try:

user> (use 'clojure.contrib.repl-utils)
user> (javadoc "")
user> (source javadoc)
user> (keys (ns-interns 'clojure.contrib.repl-utils))
user> (show String)
user> (show String 70)

Sunday, September 6, 2009

a minimal web app

(use 'compojure)

(def posts (ref []))

(defn link [text url] [:a {:href url} text])

(defn layout [title body]
   [:head [:title title]]
    [:h1 title]
    [:br] (link "home" "/")]))

(defn main-page []
  (layout "main page" 
     [:ul (map (fn [x] [:li (str x)]) @posts)]
     (form-to [:post "/add-record"]
       (text-area "text")
       (submit-button "save")
     (form-to [:post "/destroy-database"]
       (submit-button "clear"))]))

;;for post-redirect-get whilst modifying state
(defmacro sync-redirect [& body]
     (dosync ~@body)
     (redirect-to "/")))

;;associate actions with pages
(defroutes app
  (GET "/" (main-page))
  (POST "/add-record" (sync-redirect (alter posts conj (:text params))))
  (POST "/destroy-database" (sync-redirect (ref-set posts [])))

  (GET  "*" (layout "uninterpreted GET request"             (str params)))
  (POST "*" (layout "uninterpreted POST request"            (str params)))
  (ANY  "*" (layout "uninterpreted request of unknown type" (str params))))

;;Run the server. If we reload this file into a running image then run-server
;;throws an exception. 
(import '( BindException))
(defn start-server [port]
  (try (run-server {:port port} "/*" (servlet app))
       (println (str "Server started on port " port "."))
       (catch _
  (println (str "Failed to start server on port " port ". Already running?")))))

(start-server 8080)

Wednesday, September 2, 2009

Fractal tree

I ripped off this fractal tree program from:
I've modified it slightly. To me the most impressive thing about it is that you can redefine one of the functions, without closing the window displaying the tree, and then as soon as you resize the window the tree is redrawn using the new definition.
This makes the process of trying out new formulae great fun.

(import '(javax.swing JFrame JPanel )
 '(java.awt Color Graphics Graphics2D))

(defn draw-tree [g2d angle x y length branch-angle depth]
  (if (> depth 0)
    (let [new-x (+ x (* -1 length (Math/sin (Math/toRadians angle))))
   new-y (+ y (* -1 length (Math/cos (Math/toRadians angle))))
   new-length (fn [] (* length (+ 0.75 (rand 0.1))))
   new-angle (fn [op] (op angle (* branch-angle (+ 0.75 (rand)))))]
      (. g2d drawLine x y new-x new-y)
      (draw-tree g2d (new-angle +) new-x new-y (new-length) branch-angle (- depth 1))
      (draw-tree g2d (new-angle -) new-x new-y (new-length) branch-angle (- depth 1)))))

(defn render [g w h ]
  (doto g
    (.setColor (Color/BLACK))
    (.fillRect 0 0 w h)
    (.setColor (Color/GREEN)))
  (let [init-length ( / (min w h) 5),
 branch-angle (* 10 (/ w h)),
 max-depth 10]
    (draw-tree  g 0.0 (/ w 2) h init-length branch-angle max-depth)))

(defn create-panel []
    "Create a panel with a customised render"
  (proxy [JPanel] []
    (paintComponent [g]
      (proxy-super paintComponent g)
      (render g (. this getWidth) (. this getHeight)))))

(defn run []
  (let [frame (JFrame. "Fractal Tree")
 panel (create-panel)]
    (doto frame
      (.add panel)
      (.setSize 640 400)
      (.setVisible true))))


Tuesday, September 1, 2009

nested maps and some ways of getting at the keys

;;nested maps and some ways of getting at the keys

(def me {:name
         {:firstname "John"
          :middlename "Lawrence"
          :surname "Aspden"}
         {:street "Catherine Street"
          :town {:name "Cambridge"
                 :county "Cambridgeshire"
                          :name "England"
                          :history "Faded Imperial Power"
                          :role-in-world "Beating Australia at Cricket"}}}})

(:name me)
(get me :name)

(get-in me [:name :middlename])
(reduce get me [:address :town :country :role-in-world])
(-> me :address :town :county)

(assoc-in me [:name :initials] "JLA")
(update-in me [:address :street] #(str "33 " %))

HTML Formatter

Here's an interesting clojure->html pretty printer. It's been used to format itself.

Pig Latin

Ripped off from the fine tutorial which I am currently reading.

(use '[clojure.contrib.str-utils :only (str-join)])

(defn pig-latin [word]
  (let [first-letter (first word)]
    (if ((set "aeiou") first-letter)
      (str word "ay")
      (str (subs word 1) first-letter "ay"))))

(defn pls [sentence]
  (str-join " "
      (map pig-latin
        (re-seq #"\w+" sentence))))

(pls "red orange yellow green blue indigo violet")

(pls "The quality of mercy is not strained. It falleth as the gentle rain from heaven.")