Blog Archive

Search This Blog

Saturday, September 11, 2010

Clojure Macro Tutorial (Part I, Getting the Compiler to Write Your Code For You)

I am asked to point out that this is part of a series, following the form of the Ring Cycle, although not directly inspired by it. Syntax Quote plays the role of Siegfried in part III.

http://www.learningclojure.com/2010/09/clojure-macro-tutorial-part-i-getting.html http://www.learningclojure.com/2010/09/clojure-macro-tutorial-part-ii-compiler.html http://www.learningclojure.com/2010/09/clojure-macro-tutorial-part-ii-syntax.html

Suggestions for Götterdämmerung will be most welcome.

;; A couple of chaps have asked me to write a clojure macro tutorial, to explain
;; my debugging macro:

(defmacro dbg[x] `(let [x# ~x] (println '~x "=" x#) x#))

;; Which I use to print out intermediate values in functions.

;; Here's an example function that we might want to debug:
(defn pythag [ x y ] (* (* x x) (* y y)))

;; And here is a version enhanced to print out its thought process as it runs
(defn pythag [ x y ]  (dbg (* (dbg (* x x)) (dbg (* y y)))))

(pythag 4 5)
;; (* x x) = 16
;; (* y y) = 25
;; (* (dbg (* x x)) (dbg (* y y))) = 400
;; 400

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; I'm going to try to imagine that I didn't know how to write dbg, and had to
;; go at it by trial and error, to show why it is as it is.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; The problem that dbg solves:

;; Often the best way to debug code is by adding print statements.

;; Consider our old friend factorial

(defn factorial [n]
  (if (< n 2) n
      (* n (factorial (dec n)))))

(factorial 5) ;; 120

;; How would we watch it at work?
;; This modified version prints out the value of every recursive call:

(defn factorial [n]
  (if (< n 2) n
      (* n (let [a (factorial (dec n))]
             (println "(factorial (dec n))=" a)
             a))))

(factorial 5)
;;(factorial (dec n))= 1
;;(factorial (dec n))= 2
;;(factorial (dec n))= 6
;;(factorial (dec n))= 24
;;120

;; So now we can watch the stack unwind. This gives us confidence in the inner
;; workings of the function.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; The problem with this solution is that I've had to do a fair bit of typing to
;; change the function into a version that prints out its intermediate values.

;; First, let's give n a global value so that we can evaluate fragments out of
;; context:
(def n 5) 

;; Here's the original function again (re-evaluate the definition)
(defn factorial [n]
  (if (< n 2) n
      (* n (factorial (dec n)))))

;; Specifically, what I had to do was change
(factorial (dec n))
;; into
(let [a (factorial (dec n))] (println "(factorial (dec n))=" a) a)

;; Which is an expression which not only evaluates to 24 when n=5 , like the
;; original did, but which prints out (factorial (dec n))= 24 at the same time.

;; Notice that the phrase (factorial (dec n)) has to be repeated in the code.

;; Every time I would like to examine the value returned by an expression as my
;; program runs, I have to make this complicated but mechanical change. Even
;; more annoyingly, I have to tell the compiler the same thing twice.

;; Any time you find that you have to do too much typing and perform mechanical
;; repetitions, you will also find difficulty in reading, and potential for
;; error. It is always to be avoided.

;; As Larry Wall said, the chief virtue of a programmer is laziness.

;; This simple repetitive task should be as easy as changing
(factorial (dec n))
;; to
(dbg (factorial (dec n)))

;; Normally, when one spots a common pattern like this, one makes a function.
;; But a function to do what we want here is problematical, because we need the
;; source code as well as the evaluated value of (factorial (dec n)) We might
;; try something like:
(defn dbgf [s x]
  (println s "=" x)
  x)

;; And use it like this:
(defn factorial [n]
  (if (< n 2) n
      (dbgf "(* n factorial(dec n))" (* n (factorial (dec n))))))
  
;; That's a bit better, but I'm still changing
(* n (factorial (dec n)))
;; into 
(dbgf "(* n factorial(dec n))" (* n (factorial (dec n))))

;; Which is less error prone, but still repetitive.

;; The reason that we need to hold dbgf's hand like this, telling it what to
;; print out in two different ways, is that a function's arguments are evaluated
;; before the function is called.

;; If we want to write
(dbg (* n (factorial (dec n))))
;; and have it work as we intend, then we need to take control of when
;; (* n (factorial (dec n))) is evaluated.

;; And this is the problem that macros solve.

;; We need to work out how to write:
(dbg (* n (factorial (dec n))))
;; And get:
(let [a (factorial (dec n))] (println "(factorial (dec n))=" a) a)
;; Instead.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; Generating Code

;; Now because lisp code and lisp data are very similar things, we can easily
;; write a function which will generate the code that we want:

;; Let's define:
(defn dbg-code [s]
  (list 'let ['a s] (list 'println (list 'quote s) "=" 'a) 'a))
;; Which is just a function that takes some code and gives back some code.

;; We can call this function on little pieces of code, to get other little
;; pieces of code

(dbg-code 'x)
;; (let [a x] (println (quote x) "=" a) a)

(dbg-code '(* x x))
;; (let [a (* x x)] (println (quote (* x x)) "=" a) a)

(dbg-code '(* n (factorial (dec n))))
;; (let [a (* n (factorial (dec n)))] (println (quote (* n (factorial (dec n)))) = a) a)

;; Nothing 'macro' has gone on yet! This is just a function, taking advantage of
;; lisp's ability to easily deal with the lists and trees and vectors that make
;; up lisp code.

;; But the function generates exactly the code that we'd like the compiler to
;; substitute in.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; Macros

;; Now how shall we turn our code-generating function into a macro?
;; Just change defn to defmacro:

(defmacro dbg-1 [s]
  (list 'let ['a s] (list 'println (list 'quote s) "=" 'a) 'a))

;; Now it's a macro!

;; Let's try it out:
(defn factorial [n]
  (if (< n 2) n
      (dbg-1 (* n (factorial (dec n))))))

(factorial 5)
;; (* n (factorial (dec n))) = 2
;; (* n (factorial (dec n))) = 6
;; (* n (factorial (dec n))) = 24
;; (* n (factorial (dec n))) = 120
;; 120

;; Bingo!

;; When the compiler sees a macro, which is just a function that returns some
;; code, it runs the function, and substitutes the code that is returned into
;; the program.

;; It is like programming the compiler to be an apprentice programmer who will
;; write out the tedious bits longhand for you.

;; We can even ask the compiler what it sees when it expands dbg-1:
(macroexpand-1 '(dbg-1 x))
;; (let [a x] (println (quote x) "=" a) a)
(macroexpand-1 '(dbg-1 (* x x)))
;; (let [a (* x x)] (println (quote (* x x)) "=" a) a)
(macroexpand-1 '(dbg-1 (println x)))
;; (let [a (println x)] (println (quote (println x)) "=" a) a)
(macroexpand-1 '(dbg-1 (inc x)))
;; (let [a (inc x)] (println (quote (inc x)) "=" a) a)
(macroexpand-1 '(dbg-1 (* n (factorial (dec n)))))
;; (let [a (* n (factorial (dec n)))] (println (quote (* n (factorial (dec n)))) "=" a) a)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; So have we won??

;; We have certainly solved our problem as we originally conceived it, but there
;; are potentially a couple of difficulties with our solution, which I shall
;; cover in a later post.

;; Neither are terribly likely to occur in practice, but it is better to write
;; bug-free code than almost-bug-free code.

;; The chief virtues of a programmer are laziness and paranoia.
;; Million to one chances have a way of coming up nine times out of ten.

;; Also, compare our solution with the actual dbg macro:

(defmacro dbg-1 [s]
  (list 'let ['a s] (list 'println (list 'quote s) "=" 'a) 'a))

(defmacro dbg[x] `(let [x# ~x] (println '~x "=" x#) x#))

;; There are obviously similarities, but the complete version looks more like
;; the generated code, and is thus easier to write and to read once you've got
;; the hang of the weird `, #, and ~ things.

;; And the second version also solves the two potential difficulties which I
;; haven't explained yet!


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; Things to play with to check you've understood:

;; I. Find the source for clojure.core/when
(clojure.repl/source when)
;; figure out what it does.

;; II. What is wrong with:
(defmacro dbg-oops [s]
  (list 'do (list 'println (list 'quote s) "=" s) s))

;; III. Can you improve dbg so you can write (dbg * 3 2) rather than
;; (dbg (* 3 2)), saving a couple of brackets every time you use it?

;; IV. Using the methods above, can you write a for loop. We'd want
(forloop [i 1 10]
  (print i)
  (print (* i i)))

;; to expand to:
(loop [i 1]
  (when (<= i 10)
    (print i)
    (print (* i i))
    (recur (inc i))))

;; V. (hard) If you managed to do IV above, and write forloop, then congratulations! ;; You have
;; understood this post. But I bet your solution suffers from either or both of
;; the subtle problems that I mentioned above. Can you work out what they are?

;; Hint. How many times does the finish condition get evaluated as you go
;; through the loop? How would you make sure that only happened once? What if a
;; name you'd used in the macro clashed with a name used in the code where the
;; macro is?

;; Both problems are, in fact, easy to solve. See part II.




25 comments:

  1. "Often the best way to debug code is by adding print statements."

    Scattering print statements throughout your code is a Bad Thing; you either have to keep removing them from your code after dev (hurting you later if you need to debug), or end up with no granularity control over the information being emitted and unnecessarily hurt performance.

    That's why we have logging libraries for production code.

    Take a look at clojure.contrib.logging, in particular the spy macro.

    http://clojure.github.com/clojure-contrib/branch-master/logging-api.html

    ReplyDelete
  2. I'm not entirely certain that I agree with you when just trying to get a function working in the first place, but that looks like a fine logging library for production code. Thank you very much!

    ReplyDelete
  3. Nice post you have written. I already know lisp macros (from a long time ago... On Lisp & ANSI Common Lisp) and after asking Which Programming Language Should I Learn Next? in my blog, the answer I gave myself (and most readers gave) was Clojure. Thus every bit of clojure knowledge I get from the web is pretty welcome! Thanks!

    Ruben

    ReplyDelete
  4. Ruben,

    Do notice that it isn't finished yet. Clojure's quasiquote does a good job of making gensyms automatic (and almost compulsory!) and the clever read-time resolution of names means that clojure can have lisp-style macros without being a lisp-2. Really the best of both worlds. That will be part 2.

    John.

    ReplyDelete
  5. Ruben, checked your post and I notice you're a mathematician. My degree was in maths.

    Obviously I think you should learn Clojure, but I'm biased because it's my current favourite.

    Others you might want to look at are:

    Python. A lovely design which fits your head and doesn't try to force you into one paradigm. It actually feels a lot like clojure. Both of them have nicely integrated modern data structures.

    Clojure's more functional, Python's more
    imperative.
    Clojure's a lisp, Python's more C/Algol.

    On both those counts, I think mathematicians feel more happy with the first style over the second.

    Ocaml, (or standard ML). For those who like compile time type checking. A bit more restrictive than lispy-pythony dynamic languages, but wonderful fun once you get the hang of the type system, and blazingly fast.

    Haskell. An experiment in what it would be like to have a purely functional language. Great for mathematics. Not so good for webservers.

    ReplyDelete
  6. Hi, thanks for your answer!

    Purely functional code is hard to write for some iterative problems like numeric integrators (I tried to code a Runge-Kutta-Fehlberg 7-8 integrator in Lisp, and it doesn't quite fit the language... too many progns needed).

    For me, functional-ness depends more on the problem that being a mathematician... but indeed, I feel better writing functional code, if I can.

    ReplyDelete
  7. Ruben, indeed! The classic example where functional is horrible and mutable is good is a program with a random number generator. It's more that the "feel better .... if I can" feeling seems to happen more often with mathmos.

    I'm surprised that a Runge-Kutta solver would be a case here though. Maybe there's something I'm missing about the algorithm. If you post or link to imperative code I'll see what I can do!

    ReplyDelete
  8. I have only a C example, not in my GoogleCode (not worth it, I just use it for teaching), but it is a standard procedure where you call repeatedly something, and have several straight sums. It can be "functionalized", at the expense of speed. The problem is this ;)

    Ruben

    ReplyDelete
  9. Oh, well Clojure is unlikely to be as fast as C for that sort of thing, but I don't think it would be a functional vs imperative question.

    I would think it could be as fast as Java. And I would also think that an OCaml or Haskell version could be as fast as C. I may be dissing the JVM here. I'm told it's pretty good these days.

    Can I have a look? I haven't tried optimizing something for speed in Clojure and it would be nice to see how it looked (almost certainly not as nice as the C version or even the Java version, but I'd be interested in finding out).

    Just out of interest, why do you need an ODE solver to be fast? I would have thought it would be fast enough in Javascript! Now PDEs are a different question, obviously....

    ReplyDelete
  10. "Haskell. An experiment in what it would be like to have a purely functional language. Great for mathematics. Not so good for webservers."

    Really? Simon Marlow wrote about a high-performance webserver in Haskell over ten years ago. Here's a more recent version of his paper. See also the classic paper "Tackling the Awkward Squad"".

    Jumping forward to the present day, there's a number of Haskell web frameworks, including Happstack, Snap, Yesod, and a number of others you can find on Hackage.

    GHC Haskell already has great support for concurrency, and the next version of GHC will support a new I/O event manager that should improve performance on web-like workloads.

    Everyone Knows that Haskell is an "experiment" and not suitable for real work, simply because everyone repeats this fact. It has no grounding in reality.

    ReplyDelete
  11. "Haskell. An experiment in what it would be like to have a purely functional language. Great for mathematics. Not so good for webservers."

    Really? Simon Marlow wrote about a high-performance webserver in Haskell over ten years ago. Here's a more recent version of his paper. See also the classic paper "Tackling the awkward squad".

    Jumping forward to the present day, there's a number of web frameworks for Haskell, including Happstack, Snap, Yesod, and others you'll find on Hackage.

    GHC Haskell has excellent concurrency support; see for example chapters 24, 27, and 28 of Real World Haskell. The next version of GHC will have a new I/O manager [pdf] which should provide better scalability on web-like workloads.

    Of course, the "common wisdom" is that Haskell is an "experiment" and is unsuitable for real-world tasks. The facts are irrelevant; if you repeat something often enough it's accepted as true.

    ReplyDelete
  12. An Anonymous Poster left the following comment:

    Unfortunately it appears to have been eaten by some weird process in blogger. It did get e-mailed to me though, so I'm posting it for him:


    "Haskell. An experiment in what it would be like to have a purely functional language. Great for mathematics. Not so good for webservers."

    Really? Simon Marlow wrote about a high-performance webserver in Haskell over ten years ago. Here's a more recent version of his paper. See also the classic paper "Tackling the awkward squad".

    Jumping forward to the present day, there's a number of web frameworks for Haskell, including Happstack, Snap, Yesod, and others you'll find on Hackage.

    GHC Haskell has excellent concurrency support; see for example chapters 24, 27, and 28 of Real World Haskell. The next version of GHC will have a new I/O manager [pdf] which should provide better scalability on web-like workloads.

    Of course, the "common wisdom" is that Haskell is an "experiment" and is unsuitable for real-world tasks. The facts are irrelevant; if you repeat something often enough it's accepted as true.

    ReplyDelete
  13. And my reply would have been:

    Oh God, I'm sorry.

    I did put it fourth on my list of recommended languages.

    I wish Haskell well. It's very impressive.

    I just bloody *hate* not having mutable state. I can't see how you can write anything except mathematics like that.

    I do think that mathematics is a valid thing to write. I write lots of it.

    I know that people have written all sorts of cool stuff in Haskell, I just can't imagine how they did it.

    I shouldn't have said it wasn't good for webservers.

    I did say that. It's obviously wrong, and I take it back.

    Can I back off and just say that I'm not clever enough to use it?

    And yes, I should read those papers. And I probably will.

    Thank you for the links, which survived as far as my e-mail inbox but didn't survive being copied and pasted here.

    ReplyDelete
  14. "I just bloody *hate* not having mutable state. I can't see how you can write anything except mathematics like that."

    You can't. That's why Haskell has many ways to handle mutable state.

    Take a look at Real World Haskell. It's accessible for beginners, and goes pretty deep into these real-world issues.

    ReplyDelete
  15. If you're the same anonymous as before, do you have any idea what happened to your comment? Any error message when it was submitted or was it just quietly swallowed?

    ReplyDelete
  16. This is good explanation. I am new to lisp macros.

    Thanks, makes it a bit simpler to understand, but still struggling to find out which ones supposed to keep as code and which ones to evaluate.

    I am going through the "homework" steps you wrote at the end. Can't get the forloop macro to work, hopefully I'll get it down before you post the second part of the tutorial.

    ReplyDelete
  17. Good luck! If you can get IV to work then you can say that you understand what macros are.

    If it helps, I can't get the second half of this damned tutorial I said I'd write to work. What I thought was simple and obvious is turning out to be a bitch to explain clearly. Which probably means that I don't understand it as well as I thought I did.

    Race you!

    ReplyDelete
  18. Thanks for a very good post

    Could you please clarify this part as well
    (println '~x "=" x#)
    ;; what happens when we do '~x given say x = (* 5 6). I assumed ~x wud force it to become '30 giving 30. But as you point out it prints (* 5 6)=30.

    Thanks in advance.

    ReplyDelete
  19. Anonymous, I believe that you may be a little confused about the whole subject of evaluation and quoting. I have made you a kata:

    http://www.learningclojure.com/2010/11/syntax-quote-kata-for-confused.html

    Try it. Enlightenment may dawn. If not you are going to have to read SICP or The Little Schemer or something to work out what is going on.

    I didn't really get it until I'd written my first scheme interpreter. I would recommend this exercise. It's not nearly as frightening as it sounds, and takes only a short page of code.

    ReplyDelete
  20. "I just bloody *hate* not having mutable state"

    Doesn't seem to be holding Erlang back in real-world applications.

    ReplyDelete
    Replies
    1. Are you retarded, or just confused about basic terminology? Erlang has mutable state. I think you are confusing immutability with message passing.

      Delete
  21. Great post. Im using this now... Added a hook for those wonderful pre and post conditions :

    (defmacro log [fn-name args pre & body]
    `(defn ~fn-name ~args ~pre
    (println "call func -> ..." ~fn-name ~args)
    ~@body))

    ReplyDelete
  22. Thank you for this article.
    The point where you transition your function to a macro, was a big A-HA moment for me. Now I get that a macro is simply a function with benefits:
    (my-macro expr) == (eval (my-function (quote expr)))
    This seems to me to even be a good way to develop a macro.

    ReplyDelete

Followers