Search This Blog


Thursday, September 30, 2010

Clojure 1.3 : First Impression : Clojure vs Scala Revisited

Fractal Tree of the Gods
A while ago, I was disappointed to discover that a certain fractal tree program was visibly slower in Clojure than a very similar program in Scala, and that there was no obvious way to improve it, because it intrinsically involved a tree-recursion, which caused boxing and unboxing of function arguments.

So the first thing I tried with this new version of Clojure, which is supposed to cure such difficulties, was the fractal tree. I had to add the new ^:static declaration to the recursive function, and then type-hint the function's arguments. And now look at it go! It redraws the tree in about 25ms, or 50 frames/second, which means that as you resize the window, the tree rustles cheerfully!

An extraordinary difference. Try it yourself, a pom.xml for the new clojure is here, and very simple full instructions for setting up the full clojure 1.2/emacs/swank/slime are here. All that you need to change to try the new version is the pom.xml file.

And the thing that impressed me most about it the first time, the ability to fiddle with draw-tree in emacs, press M-C-x to redefine the function in the running image, and then see the changes as soon as the window is redrawn, is still true. I think graphics programming in Clojure is about to become wonderful fun. The speed of compiled java combined with the experimental, exploratory qualities of LISP.

Since this alpha release is likely to turn into the next version of clojure, and I've been quite interested in the speed of things recently, I'm going to change over to 1.3-alpha now, and all further posts will be based on it. No point learning to optimize the current version when the new one will obviously require different tricks.

Here's the new source code for fractal-tree.clj:

(import '(javax.swing JFrame JPanel )
        '(java.awt Color Graphics Graphics2D))

(defn ^:static draw-tree [ #^Graphics g2d ^double angle ^double x ^double y ^double length ^double branch-angle ^long depth]
  (if (> depth 0)
    (let [new-x (- x (* length (Math/sin (Math/toRadians angle))))
          new-y (- y (*  length (Math/cos (Math/toRadians angle))))
          new-length (fn [] (* length (+ 0.75 (rand 0.1))))
          new-angle  (fn [op] (op angle (* branch-angle (+ 0.75 (rand)))))]
      (. g2d drawLine x y new-x new-y)
      (draw-tree g2d (new-angle +) new-x new-y (new-length) branch-angle (- depth 1))
      (draw-tree g2d (new-angle -) new-x new-y (new-length) branch-angle (- depth 1)))))

(defn render [ #^Graphics g w h ]
  (doto g
    (.setColor (Color/BLACK))
    (.fillRect 0 0 w h)
    (.setColor (Color/GREEN)))
  (let [init-length ( / (min w h) 5),
        branch-angle (* 10 (/ w h)),
        max-depth 12]
    (#'draw-tree  g 0.0 (/ w 2) h init-length branch-angle max-depth)))

(defn create-panel []
    "Create a panel with a customised render"
  (proxy [JPanel] []
    (paintComponent [g]
                    (proxy-super paintComponent g)
                    (time (render g (. this getWidth) (. this getHeight))))))

(defn run []
  (let [frame (JFrame. "Clojure Fractal Tree")
        panel (create-panel)]
    (doto frame
      (.add panel)
      (.setSize 640 400)
      (.setVisible true))))


Clojure 1.3 alpha release : A pom.xml to try it out with

There's a new version 1.3 of clojure in alpha release:

To try it out, install maven, create a file pom.xml as below, and then use $ mvn clojure:repl

This should start a clojure 1.3 REPL with editing and history provided by the jLine library. Have fun.

If you're already using emacs and slime, then all you have to do is $ mvn clojure:swank to start the clojure end, and M-x slime-connect to connect emacs to the running server.

If you'd like to use emacs, but don't know how to configure it to work with clojure, then there are complete instructions here, which I've just tested on a brand new account, and which still work fine. Emacs moans a bit during the install, as is its wont, but everything works nevertheless.

Only the file pom.xml needs to change to get the 1.3 alpha release working instead of 1.2 using these instructions.

Shawn comments below that this even works on Windows nowadays if you can install maven.

 If anyone else has success or trouble on any platform please comment below.






Wednesday, September 29, 2010

Macros and Type Hints: Metadata and the Medusa.

In my last post, there was a piece of code-generation that I couldn't do, and Meikel Brandmeyer ( ) helped me to do it in spite of how dim I was being. This was his advice.
;; Meikel's answer to my question: How to get rid of the call to int-array?

;; In my last post, I posed a question about how to get rid of a call to
;; int-array in some generated code.

;; I was using it to let the compiler know that a certain array was a Java array
;; of ints.

;; Normally, you'd do this with the type hint ^ints, but I was generating the
;; code, and couldn't make it work.

;; Meikel Brandmeyer ( ) not only answered my question,
;; but through sheer patience managed to get the answer through my skull, and so
;; I'm writing it down here as a reference for the next time I can't remember.

;; Using type hints turns out to be tricky from generated code, because
;; type hints are metadata. However once you've seen the trick it's OK.

;; Here's a model problem:

;; Some arrays of java ints:
(def thousand-ints    (int-array 1000     (range)))
(def million-ints     (int-array 1000000  (range)))
(def ten-million-ints (int-array 10000000 (range)))

;; A function that looks at them, but does nothing.
(defn do-nowt [a]
  (let [len (alength a)]
     (loop [i 0]
       (when (< i len)
         (aget a i)
         (recur (inc i))))))

;; As if the array were the Medusa, just looking turns the function to stone:
(time (do-nowt thousand-ints))
"Elapsed time: 46.054186 msecs"
; 200000 cycles/loop!!! What is it doing??

;; How to make it fast?
(defn do-nowt-quickly [^ints a]
  (let [len (alength a)]
     (loop [i (int 0)]
       (when (< i len)
         (aget a i)
         (recur (inc i))))))

(time (do-nowt-quickly thousand-ints))
"Elapsed time: 0.109981 msecs"
; 433 cycles/loop

;; How fast can HotSpot make it over a long run?
(time (do-nowt-quickly ten-million-ints))
"Elapsed time: 266.500137 msecs"
;; 115 cycles/loop

;; If the array's all zeroes, it can actually get it down to 27. What is that
;; about? Is it optimizing away *looking at the array*?  If not, what is so
;; special about zeros that you can look at them faster?


;; We need to apply the function int to the initializer for the loop variable
;; and the type-hint ^ints to the input variable. 

;; How to generate the code rather than writing it by hand?

;; One's first thought is just to take the fast function, and use it as a
;; template for syntax-quote and auto-gensyms:

(defn mek-thing-as-does-nowt-quickly[]
  `(fn[^ints a#]
     (let [len# (alength a#)]
       (loop [i# (int 0)]
         (when (< i# len#)
           (aget a# i#)
           (recur (inc i#)))))))

;; However:
#_ (eval (mek-thing-as-does-nowt-quickly))

;; Unable to resolve classname: clojure.core/ints
;;   [Thrown class java.lang.IllegalArgumentException]

;; I don't know what is going on here.

;; I fear the reader and the metadata it brings.


;; The Mirror of Perseus:

(defn mek-another-thing-as-does-nowt-quickly[]
  `(fn[^"ints" a#]
     (let [len# (alength a#)]
       (loop [i# (int 0)]
         (when (< i# len#)
           (aget a# i#)
           (recur (inc i#)))))))

(let [f (eval (mek-another-thing-as-does-nowt-quickly))]
  (time (f ten-million-ints)))
"Elapsed time: 282.178954 msecs"

;; Meikel pointed out that "[I" is a synonym for "ints". I don't know why he
;; thought that was important, since they both seem to work, but that probably
;; means that there's still something going on here that I haven't understood,
;; so I mention it.


;; An Entertaining Variation:

;; Here we use a gensym for the function argument rather than an auto-gensym.
;; This allows for run-time generation of the string "ints" so we can make
;; various specialized functions for arrays of different types.

(defn mek-yet-another-thing-as-does-nowt-quickly[]
  (let [a (gensym)]
    `(fn[ ~(with-meta a {:tag "ints"}) ]
       (let [len# (alength ~a)]
         (loop [i# (int 0)]
           (when (< i# len#)
             (aget ~a i#)
             (recur (inc i#))))))))

(let [f (eval (mek-yet-another-thing-as-does-nowt-quickly))]
  (time (f ten-million-ints)))
"Elapsed time: 271.423182 msecs"

Monday, September 27, 2010

Clojure Faster than Machine Code?

I woke up this morning having had a dream about how to solve a very large scale data processing problem that a friend of mine was wrestling with, and I wrote the following program:

It seems to run very fast.

;; Cheating on a Map Lookup. 

;; Suppose we have a function:

(defn step-function [x]
  (cond (< x 1) 0
        (< x 2) 1
        (< x 3) 3
        (< x 4) 4
        (< x 6) 3
        (< x 8) 2
        (< x 9) 3
        (< x 10) 3
        (< x 11) 2
        (< x 12) 1
        :else 0))

(step-function 6) ;2

;; Now imagine that the function is to be generated from some data. We might use a
;; map to hold the values at which the value of the function changes:

(def lookup-table {1 1, 2 3, 3 4, 4 3, 6 2, 8 3, 9 3, 10 2, 11 1, 12 0})

;; And given such a map, we might implement the function by this simple program:
;; Take all the keys that are less than or equal to x.
;; If there are none, then give the default value.
;; Otherwise provide the value of the biggest key which is less than or equal to x
(defn lookup-fn [map default]
  (fn [x]
    (if-let [ [k v]  (last (filter (fn[[k v]] (<= k x)) map ))  ]

((lookup-fn lookup-table 0) 6) ;2

;; Here we look up the values  1, 7, 6, and 20 and -10 in our table:

(map (lookup-fn lookup-table 0) '(1 7 6 20 -10))  ;(1 2 2 0 0)

;; 1 is actually in the map, so that goes to 1, 7 is between 6 and 8, so that
;; goes to 6's value of 2, 20 is higher than all the entries, so it gets 12's
;; value of 0. -10 < 1, is lower than all the map entries so we'll give it the
;; default value 0.

;; A quick test:

(defn fn-to-map [fn range]
  "Evaluate fn everywhere in range. Return map of all values to all results."
  (apply sorted-map (interleave range (map fn range))))

(fn-to-map (lookup-fn lookup-table 0) (range -1 14))
;;{-1 0, 0 0, 1 1, 2 3, 3 4, 4 3, 5 3, 6 2, 7 2, 8 3, 9 3, 10 2, 11 1, 12 0, 13 0}

(fn-to-map step-function (range -1 14))
;;{-1 0, 0 0, 1 1, 2 3, 3 4, 4 3, 5 3, 6 2, 7 2, 8 3, 9 3, 10 2, 11 1, 12 0, 13 0}

;; At least as far as integers go, the map and the function are interchangeable.

;; But how fast is it?

;; This is the hard coded version:
(time (doall (map step-function (range 1000))))
"Elapsed time: 3.054365 msecs"

;; And here's our version:
(time (doall (map (lookup-fn lookup-table 0) (range 1000))))
"Elapsed time: 15.657764 msecs"

;; Not bad, but what if we wanted a performant version, to use on trillions of
;; data points? Either version would take hours to run.

;; And what if we were expecting our map to grow and grow until running
;; down the map list one by one was an insane option? This would affect the hard
;; coded version too, of course. The difference is only a constant factor.

;; Well, a binary search is one way to deal with the growth in the map.
;; We could visualize the binary search on our simple map like this:
;; (1 2 3 4 6 8 9 10 11 12)
;; test >= 8
;; Dividing and conquering at each step:
;; ((1 2 3 4 6) (8 9 10 11 12))
;; test >= 3 or 10
;; Looking in the two halves
;; (((1 2) (3 4 6)) ((8 9) (10 11 12)))
;; test >= 1, 4, 8, or 11
;; Making finer and finer distinctions at each step
;; ((((1) (2)) ((3) (4 6))) (((8) (9)) ((10) (11 12))))
;; test >= 6 or 12
;; ((((1) (2)) ((3) ((4) (6)))) (((8) (9)) ((10) ((11) (12)))))
;; and we're done in at most four steps.
;; The corresponding values are:
;; ((((1) (2)) ((3) ((4) (6)))) (((8) (9)) ((10) ((11) (12)))))
;; ((((1) (3)) ((4) ((3) (2)))) (((3) (3)) ((2 ) ((1 ) (0 )))))

;; So the sane way of proceeding would probably be to write a binary search
;; function and call it a day.

;; But if we really really needed it to be fast, why not:
(defn lookup-fn-handwritten [x]
  (if (< x 6) 
    (if (< x 3)                         ; x is < 6
      (if (< x 2)                       ; x is < 3
        (if ( < x 1)                    ; x is < 2
          0                             ; < 1
          1)                            ; 1 <= x < 2
        3)                              ; 2 <= x < 3
      (if (< x 4)                       ; 3 <= x < 6
        4                               ; 3 <= x < 4
        2))                             ; 4 <= x < 6
    (if (< x 10)                        ; 6 <= x < 10
      (if (< x 9)                       ; 6 <= x < 9
        (if (< x 8) 
          2                             ; 6 <= x < 8
          3)                            ; 8 <= x < 9
        3)                              ; 9 <= x < 10
      (if (< x 11)                      ; 10 < x
        (if (< x 12)                    ; 11 <= x
          1                             ; 11 <= x < 12
        0))))                           ; 12 <= x
;; I have seen this sort of code occasionally in dark corners.  When a man knows
;; how his processor works, knows how his C compiler works, knows his data
;; structures, and really, really needs his loops to be fast then he will
;; occasionally write this sort of thing.

;; This is sort of code that Real Programmers write.

;; A quick test:
(fn-to-map lookup-fn-handwritten (range -1 14))
{-1 0, 0 0, 1 1, 2 3, 3 4, 4 2, 5 2, 6 2, 7 2, 8 3, 9 3, 10 1, 11 0, 12 0, 13 0}

;; It works and it's fast. Already it's faster than the original cond, and its
;; performance advantage will only increase as the map grows:
(time (doall (map lookup-fn-handwritten (range 1000))))
"Elapsed time: 1.442812 msecs"

;; I'd hope it would be faster than the general binary search, because it's
;; implementing the same algorithm but replacing a lot of indirections with
;; branches in the code.

;; Every reference into the map gets replaced by a simple 'less than' test and
;; possibly a jump.

;; Why not, indeed?

;; Well, first of all because it's wrong:
 (fn-to-map lookup-fn-handwritten (range -1 14))
 (fn-to-map (lookup-fn lookup-table 0) (range -1 14))) ; false

;; Go on, find the bug. I dare you.

;; Such code is horrible to write and impossible to read.  We could do it, if we
;; really needed to, but it would be mechanical, repetitive, boring and error
;; prone.

;; And if we were gathering the data for the lookup table as part of our
;; program, we wouldn't be able to hand code a special function every time.

;; Hmmmmmmm...

;; Let's look at some easy cases of an imaginary program to write the code for
;; us:

;; The easiest case is:
'(make-lookup-fn {} default)

;; The second easiest is:
'(make-lookup-fn {10 yo} default)
'(fn[x] (if (< x 10) default yo))

;; The third easiest is
'(make-lookup-fn  {8 hey 10 yo 12 hi} default)
;; Which we could represent as:
'(fn[x] (if (< x 10)
          (make-lookup-fn {8 hey} default)
          (make-lookup-fn {12 hi} yo)))

;; oh hell, let's just write it:

(defn make-lookup-expression [var lookup-map lowdefault]
  (let [vmap (sort (seq lookup-map))
        vmcount (count vmap)]
    (cond ;; base cases
     (= vmcount 0) lowdefault
     (= vmcount 1) (let [[test high] (first vmap)]
                     (list 'if (list '< var test) lowdefault high))
     :else ;; recursion (divide map at a pivot element half way along)
     (let [pivot (int (/ (count vmap) 2))
           [test highdefault] (nth vmap pivot)
           before-pivot (take pivot vmap)
           after-pivot  (drop (inc pivot) vmap)]
       (list 'if (list '< var test) ;; and generate an if that chooses which half
             (make-lookup-expression var before-pivot lowdefault)
             (make-lookup-expression var after-pivot  highdefault))))))

;; I actually found that easier to write than the hand-written nest of if
;; statements above.  It all just seemed to fit together according to plan.

;; Let's try it on our example lookup table:
(make-lookup-expression 'x {1 1, 2 3, 3 4, 4 3, 6 2, 8 3, 9 3, 10 2, 11 1, 12 0} 'default)

 (< x 8)
  (< x 3)
  (if (< x 2) (if (< x 1) default 1) 3)
  (if (< x 6) (if (< x 4) 4 3) 2))
 (if (< x 11) (if (< x 10) (if (< x 9) 3 3) 2) (if (< x 12) 1 0)))

;; Looks like the sort of thing.

;; A warning: As Meikel Brandmeyer points out in a comment below, it would be
;; better to use `< than '< here in real code.

;; If we can generate the code for the nest of ifs, we can generate the code for
;; a lookup function: We shouldn't use x as the variable though, just in case it
;; somehow finds its way into the map! Let's use a gensym for the variable so
;; that it can't capture anything:

(defn make-lookup-fn [lookup-map default]
  (let [var (gensym)]
    (list 'fn [var] (make-lookup-expression var lookup-map default))))

;; The compiler is with us always, so we can turn that into a real function:
(def lookup-fn-automatic (eval (make-lookup-fn {1 1, 2 3, 3 4, 4 3, 6 2, 8 3, 9 3, 10 2, 11 1, 12 0} 0)))

;; The bug is banished:
 (fn-to-map lookup-fn-automatic (range -1 14))
 (fn-to-map (lookup-fn lookup-table 0) (range -1 14))) ;true!

;; Compilers don't make those sorts of mistakes.

;; So we now have the advantages of hard-coding, without the drawbacks. If we
;; can construct our map, then we can construct our hand-coded function, it's
;; just that it's being hand-coded by the compiler at runtime, which is the best
;; sort of hand coding.

;; And it seems to do the business, speed-wise
(def million (doall (range 1000000)))

(time (dorun (map lookup-fn-automatic million)))
"Elapsed time: 778.459478 msecs"

;; Just for comparison:
(time (dorun (map #(* 3 %) million)))
"Elapsed time: 474.40039 msecs"

;; So it seems that our lookup is now comparable in speed to multiplication.
;; In terms of cycles, 778 milliseconds for 1000000 operations means
;; 778 nanoseconds per operation, which is about (* 4.33 778)
;; 3300 cpu cycles per operation with my processor running at 4.33 GHz

;; That's still a lot of cycles! But we're still doing generic arithmetic on
;; arrays of boxed objects.  There is a two orders of magnitude cost for that
;; sort of thing, which is why dynamic languages are often thought to be slow.

;; Let's have a look at how we can speed things up on the occasions when we need
;; to.  We end up writing code that looks like optimized C, but in return we get
;; optimized C speeds.

;; Optimizing Clojure

;; Things go faster if we work on primitive integers and native arrays, although
;; currently the semantics of this are surprisingly subtle. The compiler will
;; not compile literals like 0 to primitive constants without being told to. I
;; don't understand why.

;; It's better to bind them to local variables, at which point the extremely
;; clever HotSpot JVM will notice and optimize at runtime!

;; It seems to be the case that replacing 0 with (int 0) speeds things up a lot,
;; but that (let [zero (int 0)] ....) is even better. I'd love to know why.

;; Also, primitives will not survive function calls, which means that we have to
;; throw away the advantages of first order functions and abstractions like map,
;; and stick everything in a big loop, as if we were writing Java or C.

;; I'm told that these little idiosyncracies are being worked on....

;; Meanwhile, ...

;; Here's an example loop, using the primitive types, but without the
;; type hints that reassure the compiler that there's nothing fishy going on.
(last (let [source (into-array Integer/TYPE (range 1000))
            destination (aclone source)]
        ;; after the arrays are created, time the inner loop:
        (time (loop [x 0]
                (if (< x (alength source))
                  (do (aset destination x (* 3 (aget source x)))
                      (recur (inc x))))))
"Elapsed time: 168.434663 msecs"
;; three quarters of a million machine cycles per loop!

;; Now move all the constants out of the inner loop, type hint them as integers And
;; use unchecked-inc for the loop variable. 
(last (let [source (int-array (range 1000))
            destination (int-array (aclone source))
            length (alength source)
            zero (int 0) three (int 3)]
        ;; after the arrays are created, time the inner loop
        (time (loop [x zero]
                (if (< x length)
                  (do (aset destination x (*  three (aget source x)))
                      (recur (unchecked-inc x))))))
"Elapsed time: 1.1944 msecs"
;; 5000 cycles per loop. Still not brilliant, but a lot better! It's great fun
;; removing these optimizations one by one until suddenly the whole thing
;; becomes 100 times slower!

;; Now we've speeded it up, we can use it on much larger arrays. Try length
;; 1000000 now.
(last (let [source (int-array million)
            destination (aclone source)
            length (alength source)
            zero (int 0) three (int 3)]
        ;; after the arrays are created, time the inner loop
        (time (loop [x zero]
                (if (< x length)
                  (do (aset destination x (* three (aget source x)))
                      (recur (unchecked-inc x))))))
"Elapsed time: 45.465962 msecs"
;; 200 cycles per loop.

;; The loop seems to be sub-linear in the number of things it's looping over!  I
;; figure that this must be HotSpot spotting something clever that it can do.

;; Although actually we're only down to 200 cycles/multiply even now. I guess
;; we're reading and writing from RAM all the time?

;; However, look how long it takes just to make an array of a million integers
;; in the first place:
(time (int-array 1000000))
"Elapsed time: 5.84744 msecs"

;; Since the looping, multiplying and mapping is now only taking ten times
;; longer than it takes to allocate a suitable destination array in the first
;; place, let's call that a wrap, even though there might be another factor of
;; ten hiding in there somewhere!

;; I'm told that this should be as fast as the equivalent Java.
;; I wonder if that's true? Only one way to find out, I suppose....

;; Note to self: Write Java version and benchmark it.

;; Cheating Optimally

;; So what would the handwritten map-lookup loop look like in clojure's
;; equivalent of assembly language?  The nasty bit is that we have to hard-code
;; the constants, and we're going to need a lengthy let-expression and an
;; if-expression transformed to use the local variables.

;; First, let's generate the list of all the constants in the map, as well as
;; the default value From that we can generate the binding form for the big
;; let-expression, and a map of which constants have been bound to which local
;; variables.
(defn constant-helper [mp default]
  (let [constants (sort (set (cons default (apply concat (sort (seq mp))))))
        constants-let (apply vector (mapcat #(list (symbol (str "n" %))(list 'int  %)) constants))
        constant-symbols (map #(symbol (str "n" %)) constants)
        constants-symbols-map (apply sorted-map (interleave constants constant-symbols))]
    (list constants-let constants-symbols-map)))

;; Trying this on our example map
(constant-helper {1 1, 2 3, 3 4, 4 3, 6 2, 8 3, 9 3, 10 2, 11 1, 12 0} 255)
;; We can see what it does:
#_([n0 (int 0) n1 (int 1) n2 (int 2) ............. n12 (int 12) n255 (int 255)]
   {0 n0, 1 n1, 2 n2, 3 n3, 4 n4, 6 n6, 8 n8, 9 n9, 10 n10, 11 n11, 12 n12, 255 n255})

;; To put the constants into the expression, once it's generated, we can use the
;; code-walker function from clojure walk:

;; Here's a simple example of a code-walk, or tree map as it's sometimes known.
 #(if (integer? %) (get {1 "1", 2 "2"} % %) %)
 '(+ 1 (* 2 3)))
;; Gives:
;; (+ "1" (* "2" 3))
;; See how it's changed 1 and 2 but left everything else alone?

;; So we can make both the let-expression and the transformed nest of ifs with
(defn transformed-exprs [mp default]
  (let [[let-expr cs-map] (constant-helper mp default)
        if-expr (make-lookup-expression 'x (sort (seq mp)) default)
        transformed-if-expr (clojure.walk/postwalk
                             #(if (integer? %) (get cs-map % %) %)
    (list transformed-if-expr let-expr)))

;; Evaluating:
(transformed-exprs {1 1, 2 3, 3 4, 4 3, 6 2, 8 3, 9 3, 10 2, 11 1, 12 0} 255)
;; Gives us:
#_(((if (< x n8)   ...   (if (< x n9) n3 n3) n2) (if (< x n12) n1 n0))
   [n0 (int 0) n1 (int 1) ... n12 (int 12) n255 (int 255)])
;; Which are the parts we need to make an optimal loop:

;; so the final expression we're looking at would be:

(let [source (int-array million)
      destination (aclone source)
      length (alength source)]
        (let  [n0 (int 0) n1 (int 1) n2 (int 2) n3 (int 3) n4 (int 4) n6 (int 6) n8 (int 8) n9 (int 9) n10 (int 10) n11 (int 11) n12 (int 12) n255 (int 255)]
           (loop [i (int 0)]
             (if (< i length)
               (do (aset destination i (let [x (aget source i)]
                                         (if (< x n8) (if (< x n3) (if (< x n2) (if (< x n1) n255 n1) n3) (if (< x n6) (if (< x n4) n4 n3) n2)) (if (< x n11) (if (< x n10) (if (< x n9) n3 n3) n2) (if (< x n12) n1 n0)))))
                   (recur (unchecked-inc i)))))))

"Elapsed time: 28.387293 msecs"

;; Which somewhat to my amazement is not only executable and produces the correct answer,
;; but actually seems faster than the multiplication example! (something like 100 cycles/loop).

;; So the final step is to take a map and a default value, and generate the
;; expression above, which takes a java int array as input and gives back
;; another one, with all the values passed through the lookup table.

(defn generate-array-transformer [mp default]
  (let [[if-expr let-expr] (transformed-exprs mp default)]
          (let [source# (int-array source#)
                destination# (aclone source#)
                length# (int (alength source#))]
            (time (let  ~let-expr
              (loop [i# (int 0)]
                (if (< i# length#)
                  (do (aset destination# i# (let [~'x (aget source# i#)] ~if-expr))
                   (recur (unchecked-inc i#)))))))

;; Here's how we use it to make the loop code
(generate-array-transformer {1 1, 2 3, 3 4, 4 3, 6 2, 8 3, 9 3, 10 2, 11 1, 12 0} 255)

;; And now we'll compile it and assign it to a suitably named variable
(def never-going-to-work (eval (generate-array-transformer {1 1, 2 3, 3 4, 4 3, 6 2, 8 3, 9 3, 10 2, 11 1, 12 0} 255)))

;; Let's create a java array with our data in it
(def million-ints (int-array million))

(take 100 (never-going-to-work million-ints))
"Elapsed time: 11.656722 msecs"
;;(255 1 3 4 3 3 2 2 3 3 2 1 0 0 0 ......
;; bloody hell!

;; Not only does it appear to be working, but the inner loop now appears to down
;; to twelve milliseconds.  50 cycles per number transformed.

;; However, if we time the whole thing:
(time (never-going-to-work million-ints))
"Elapsed time: 10.887267 msecs"
"Elapsed time: 148.408179 msecs"

;; I'm very happy with that, considering that I've managed to optimize away a
;; lookup into a data structure.

;; But it's annoying that the whole loop actually takes 148ms.

;; Most of the time is being spent in the call to int-array. But the call to
;; int-array is only there so that the compiler can tell it's an int array!

;; The thing passed in is an int-array already! It doesn't need to spend this
;; time transforming it!

;; How do I let the compiler know that it's actually going to get an int-array passed in?

;; Just to prove that it wasn't a fluke, here's a different map
(def never-going-to-work-2 (eval (generate-array-transformer {1 99, 2 33, 3 4, 6 2, 8 3, 9 3, 10 2, 11 1, 12 0, 15 -1, 24 100} 100)))

(time (take 100 (never-going-to-work-2 million-ints))) ;(100 99 33 4 4 4 2 2 3 3 2 1 0 0 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 100 100 ......
;;inner loop:
"Elapsed time: 11.455121 msecs"
;;whole call:
"Elapsed time: 138.169579 msecs"

;; Just to check that it's actually doing something, and that we're not being fooled by
;; some sort of lazy eval
(def ten-million-ints (int-array (apply concat (repeat 400000 (range 25) ))))

(time (take 100 (never-going-to-work-2 ten-million-ints)))
"Elapsed time: 156.814438 msecs"
"Elapsed time: 1437.094696 msecs"
;; (100 99 33 4 4 4 2 2 3 3 2 1 0 0 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 100 100 99 33 4 4 ...

;; As a stiffer test, let's make a completely random map with 100 entries:
(def random-map (apply sorted-map (for [i (range 200)] (rand-int 100))))

;; And generate and compile code to inline the binary search in this large random map
(def large-random-step-function (eval (generate-array-transformer random-map 100)))

;; Let's see how it does:
(time (take 100 (large-random-step-function ten-million-ints)))
"Elapsed time: 276.477877 msecs"
"Elapsed time: 1497.910398 msecs"
;;(19 19 12 29 28 28 28 4 72 99 99 87 87 ............

;; 119 cycles per lookup over 10 million integers. The whole loop in less than a second,
;; although there's still this completely silly second and a half where it's turning an array
;; of ints into an identical array of ints.

;; I should be able to get rid of this using the type hint ^ints, but I can't
;; make the expression generator use it.  Does anyone know how to modify it so
;; this problem goes away?

;; Wild Speculation

;; And so I wonder:

;; This technique strikes me as very general, and very useful. All sorts of
;; things can be represented as lookups in tables.

;; This whole program took me one short day to write, and the whole time I was
;; doing things that I've never done before, just going by intuition. Once
;; you've got the hang of it, it's easy.

;; I think that the program should be as fast as the equivalent java program
;; would be, although I haven't got around to actually testing that, so I may
;; have dropped the ball somewhere.

;; In any case, it's probably possible to generate code like this that does run
;; as fast as whatever the fastest Java implementation actually is.

;; The JVM is widely thought to be around the same speed as native machine code
;; or optimized C.

;; I'm absolutely sure that I'm not able to write the equivalent program in
;; Java, C, or assembler without code-generation.

;; The code generation would be very very very much harder in Java, C or
;; assembler.

;; And so I wonder, is Clojure the fastest computer language in the world?

;; Bugger

;; If you try to make the map much larger than 100 entries
;; e.g.
(def random-map (apply sorted-map (for [i (range 400)] (rand-int 1000))))

;; Then when you try to compile it:
(def large-random-step-function (eval (generate-array-transformer random-map 100)))
;; The compilation fails with this interesting error:
;; Too many arguments in method signature in class file user$eval23503$fn__23504$fn__23505
;; [Thrown class java.lang.ClassFormatError]

;; Something tells me that the let-bound local variables are getting translated
;; to a function call (let/lambda equivalence and all that), and that Java has a
;; hard coded limit somewhere. May I guess that that limit is 256?

;; I don't know whether there's any way round that. I'm too tired to think.

;; If we change the program so that it just uses (int 0) instead of
#_ (let [n0 (int 0)] ...)
;; then we get most of the benefits of optimizing it, but not all.

Friday, September 24, 2010

Graphics like it's 1979: How to get clojure to act like a ZX Spectrum

I once wrote a rant about how, in 1982, it had seemed very easy to draw graphics on the screens of computers, and that nowadays it seems hard, and wondering how a child nowadays is supposed to start writing the graphics programs that were the best bit of programming when I was young.

This rant got on to the front page of reddit for a full day, and was the top hit on the programming reddit for several, and got voted on and commented on so much that I think that a few hundred thousand people must have read it.

About one third of people agreed with me, about one third disagreed and showed me ways to do it, and lots of kids said that yes, computers were incomprehensible, but they quite enjoyed programming their pocket calculators, which these days have screens and BASIC and are essentially portable ZX81s.

In the end, I ended up being convinced by the optimists, and wrote a program for twelve year old programmers which uses the turtle module that is built into python, and relaxed, knowing that the kids would be OK, and figuring that I could probably muddle through.

But it did still seem to be true that there was no effort at all to plotting on the first computers I had, and then later with Turbo Pascal it was a question of copying a boilerplate preamble to set up the graphics card and then off you go in the same way, and then windows arrived and all hell broke loose and it was ages before I learned how to make pictures again.

I've written some windows graphical programs, but it was always a chore. Java made it easier, but it was still a bit of a pain. And notwithstanding python's turtle module, it's always a pain in every language I use, and it never seems easy or fun like it used to do.

The other day, I wanted to plot something. And I started wondering gloomily how to do it from clojure. And the easiest way seemed to be to import incanter and use its graphics libraries, which are intended for statistics and make lovely pictures of functions, but don't quite seem to be the sort of thing that I remember from when I was a boy.

And I thought: Scratch that itch.

So I downloaded a ZX Spectrum emulator

$ sudo apt-get install spectemu-x11

Fully expecting to be badly disappointed.

But no. First of all, the emulator's great. It's so like the real thing was that I could feel old habits and reflexes coming back. Well done whoever wrote it. And the ZX BASIC manual is just as well written as I remembered. I can quite see how this marvellous little device got me hooked.

What I wasn't expecting was how impressive the spectrum itself was. I've designed and programmed embedded devices with about the same amount of power, and these guys took what was basically a glorified washing machine controller, a TV output, and a rubber keyboard they invented themselves and made a wonderful little computer out of it that cost half as much as it would have cost to buy just the keyboard for the first IBM PC, which launched around the same time and actually used a very similar glorified washing machine controller, but cost about what my father earned in a month when I was 12.

You would expect it to have been utter rubbish, but it's really not. It takes about 20 minutes to get used to its strange keyboard and line based editor (vi was invented in 1976, and at the time was a vast thing that ran on computers that cost hundreds of thousands of pounds), and after that you're just talking directly to the little processor in BASIC.

With hindsight, those guys must have been awesome hackers. And awesome designers. And awesome industrial-process men, and awesome engineers, and awesome marketers. My God they must have been good.

And it may be somewhat limited, but a child could learn to program on one of these even these days, and it would be fun.

And I spent a happy few hours plotting things. Without having to think about how a java.awt.Graphics2D.DoubleBuffered.Image.ofDespair plugs into an OverComplicatedInterfaceAdapterFactoryProxy so it can be ........

And once my itch was scratched, I thought that it would be quite easy to make a little library for clojure that worked in the same way as the Spectrum's plotting routines. So I looked up some old java programs that I had written back in the days when Java seemed fresh and new and sane, and this is what I came up with:

;; The classic ZX Spectrum sine drawing program

(use 'simple-plotter)
(create-window "ZX Graphics" 256 176)

;; 10 FOR X = 1 TO 255
;; 20 PLOT X, 88+80*SIN(X/128*PI)
;; 30 NEXT X

(doseq [x (range 256)]
  (plot x (+ 88 (* 80 (Math/sin (* Math/PI (/ x 128)))))))

;; Hey, let's have some axes as well

;; 40 INK 6
;; 50 PLOT 0,88: DRAW 255,0
;; 60 PLOT 127,0: DRAW 0,168

(ink gray)
(plot 0 88) (draw 255 0)
(plot 127 0) (draw 0 175)

And then I started to think that it was a bit much that ZX BASIC, which really is not a terribly expressive language, seemed to have things to teach clojure about plotting, and I thought of Barnsley's lovely fractal fern, which would be a big and difficult program to write on the Spectrum.

(use 'simple-plotter)

(defn transform [[xx xy yx yy dx dy]]
  (fn [[x y]] [(+ (* xx x) (* xy y) dx)
               (+ (* yx x) (* yy y) dy)]))

(def barnsleys-fern '((2  [  0    0     0     0.16 0 0    ])
                      (6  [  0.2 -0.26  0.23  0.22 0 0    ])
                      (7  [ -0.15 0.28  0.26  0.24 0 0.44 ])
                      (85 [  0.85 0.04 -0.004 0.85 0 1.6  ])))

(defn choose [lst] (let [n (count lst)] (nth lst (rand n))))

(defn iteration [transforms]
  (let [transform-list (mapcat (fn [[n t]] (repeat n (transform t))) transforms)]
    (fn [x] ((choose transform-list) x))))

(def barnsley-points (iterate (iteration barnsleys-fern) [0 1]))

(create-window "Barnsley's Fern" 350 450)

(ink green)
(scaled-scatter-plot (take 10000 barnsley-points) 50 300 50 400 100)

And that appears to have handed ZXBASIC its ass on a plate, if you will pardon the expression.

And doing graphics programming is fun again.

Although performance-wise, my little library appears to have much in common with the ZX Spectrum.

No matter. I know that I can make it run at least 300 times faster without too much trouble. I am fairly confident that I will be writing more programs using it, and I am sure that one day I will find something that needs it to be faster, and take the trouble to sit down with performance tuning tools and sort it out.

For the moment, I am still working on the interface, which is the important bit. After that, algorithms and correctness. Once it's all nice and stable, I'll probably translate it into Java, or Clojure 1.3, which I am told may be able to achieve Java-like performance with minimal tuning and type hints.

Here's my little library. Sorry about the code. It isn't well written, it's very slow, and I haven't taken the trouble to comment or explain it, because I'll have completely rewritten it in a couple of days. I post it in case anyone else misses their Spectrum.

(ns simple-plotter
  (:import (javax.swing JFrame JPanel )
           (java.awt Color Graphics Graphics2D Image))
  (:use (clojure.contrib def)))

;; This is an attempt to make graphics in clojure as simple as it was on a ZX
;; Spectrum. Let us count the whole maven/leiningen/cake-clojure-emacs-(require
;; 'simple-plotter) thing as being a one-time effort equivalent to persuading
;; one's father to buy a ZX Spectrum in the first place.

;; Define some colours to use:
;; java.awt.Color/GREEN -> simple-plotter/green

(defmacro- defcolours [& colours]
  (list* 'do (map #(list 'def  (symbol (. (str %) toLowerCase)) (symbol (str "Color/" (str %)))) colours)))

(defcolours black blue cyan darkGray gray green lightGray magenta orange pink red white yellow)

;; Private machinery

(defn- draw-lines [lines #^Graphics2D g2d xs ys]
  (doseq [[x1 y1 x2 y2 color] @lines]
    (. g2d setColor color)
    (. g2d drawLine (* xs x1) (* ys y1) (* xs x2) (* ys y2))))

(defn- render-lines-to-graphics [lines paper-color height width
                                 #^Graphics2D g w h ]
  (doto g
    (.setColor @paper-color)
    (.fillRect 0 0 w h))
    (draw-lines lines g (/ w @width) (/ h @height)))

(defn- primitive-repaint [plotter]
  (. (plotter :panel) repaint))

(defn create-plotter [title width height ink paper]
  (let [lines       (atom [])
        height      (atom height)
        width       (atom width)
        paper-color (atom paper)
        panel       (proxy [JPanel] []
                      (paintComponent [g]
                                      (proxy-super paintComponent g)
                                       lines paper-color height width
                                       #^Graphics2D g
                                       (. this getWidth)
                                       (. this getHeight))))
        frame (JFrame. title)]
    (doto frame
      (.add panel)
      ;; The extra space 2,32 is taken up by the window decorations
      ;; in GNOME. How to get that from the OS? 
      (.setSize (+ @width 2) (+ @height 32))
      (.setVisible true))
    {:height height
     :width  width
     :ink-color (atom ink)
     :paper-color paper-color
     :current-position (atom [0,0])
     :lines lines
     :panel panel}))

(defn- primitive-line [plotter x1 y1 x2 y2]
  (let [ink @(:ink-color plotter)]
    (swap! (:lines plotter) conj [x1 y1 x2 y2 ink]))
  (primitive-repaint plotter))

(defn- set-paper-color [plotter color]
  (swap! (plotter :paper-color) (constantly color))
  (primitive-repaint plotter))

(defn- set-ink-color [plotter color]
  (swap! (plotter :ink-color) (constantly color)))

(defn- set-current-position [plotter [x y]]
  (swap! (plotter :current-position) (constantly [x y])))

(defn- remove-lines [plotter] (swap! (plotter :lines) (constantly [])))

(defn- make-scalars [points xleft xright ytop ybottom]
  (let [xmax (reduce max (map first points))
        xmin (reduce min (map first points))
        ymax (reduce max (map second points))
        ymin (reduce min (map second points))]
    [(fn[x] (+ xleft (* (/ (- x xmin) (- xmax xmin))    (- xright xleft))))
     (fn[y] (+ ybottom  (* (/ (- y ymin) (- ymax ymin)) (- ytop ybottom))))]))

(defvar- current-plotter (atom nil))

;; ;; Public Interface

(defn create-window
  ([] (create-window "Simple Plotter"))
  ([title] (create-window title 1024 768))
  ([title width height] (create-window title width height white black ))
  ([title width height ink paper]
     (let [plotter (create-plotter title width height ink paper)]
       (swap! current-plotter (constantly plotter))

;;Makes a version of a function with an implicit first argument of plotter, and
;;a default version with no first argument is not supplied, and which uses
;;current-plotter instead.
(defmacro ddefn [fnname args & body]
  `(defn ~fnname
     (~args (~fnname @~'current-plotter ~@args))
     ([~'plotter ~@args] ~@body)))

;; So that instead of saying
;; (defn plot
;;   ([x1 y1] (plot @current-plotter x1 y1))
;;   ([plotter x1 y1]
;;      (primitive-line plotter x1 y1 x1 y1)
;;      (set-current-position plotter [x1 y1])))
;; we can say:

(ddefn plot [x1 y1]
       (primitive-line plotter x1 y1 x1 y1)
       (set-current-position plotter [x1 y1]))

(ddefn cls [] 
       (remove-lines plotter)
       (primitive-repaint plotter))

(ddefn plot [x1 y1]
     (primitive-line plotter x1 y1 x1 y1)
     (set-current-position plotter [x1 y1]))

(ddefn draw [dx dy]
  (let [[x1 y1] @(plotter :current-position)
        [x2 y2] [(+ x1 dx) (+ y1 dy)]]
    (primitive-line plotter x1 y1 x2 y2)
    (set-current-position plotter [x2 y2])))

(ddefn line [x1 y1 x2 y2]
   (plot plotter x1 y1)
   (draw plotter (- x2 x1) (- y2 y1)))

(ddefn ink   [color] (set-ink-color plotter color))

(ddefn paper [color] (set-paper-color plotter color))

(ddefn scaled-scatter-plot [points xleft xright ytop ybottom scalepoints]
  (let [[xsc ysc] (make-scalars (take scalepoints points) xleft xright ytop ybottom)]
      (doseq [[x y] points]
        (plot (* (xsc x))
              (* (ysc y))))))

(defn window [plotter]
  (swap! current-plotter (fn[x] plotter)))

(ddefn get-height [] @(plotter :height))

(ddefn get-width  [] @(plotter :width))

;; Examples

(defn sine-example[]

  (create-window "sine")

  ;; sine graph
  (doseq [x (range 1024)]
    (plot x (+ 384 (* 376 (Math/sin (* Math/PI (/ x 512)))))))

  ;; axes
  (ink yellow)
  (plot 0 384) (draw 1024 0)
  (line 512 0 512 1024))


;; (defn examples []
;;   (sine-example)
;;   (in-ns 'user)
;;   (use 'simple-plotter)
;;   (load-file "fractal-fern.clj")
;;   (load-file "zxsin.clj")
;;   (load-file "gridpattern.clj") )