Blog Archive

Search This Blog

Thursday, September 30, 2010

Clojure 1.3 : First Impression : Clojure vs Scala Revisited


Fractal Tree of the Gods
A while ago, I was disappointed to discover that a certain fractal tree program was visibly slower in Clojure than a very similar program in Scala, and that there was no obvious way to improve it, because it intrinsically involved a tree-recursion, which caused boxing and unboxing of function arguments.

So the first thing I tried with this new version of Clojure, which is supposed to cure such difficulties, was the fractal tree. I had to add the new ^:static declaration to the recursive function, and then type-hint the function's arguments. And now look at it go! It redraws the tree in about 25ms, or 50 frames/second, which means that as you resize the window, the tree rustles cheerfully!

An extraordinary difference. Try it yourself, a pom.xml for the new clojure is here, and very simple full instructions for setting up the full clojure 1.2/emacs/swank/slime are here. All that you need to change to try the new version is the pom.xml file.

And the thing that impressed me most about it the first time, the ability to fiddle with draw-tree in emacs, press M-C-x to redefine the function in the running image, and then see the changes as soon as the window is redrawn, is still true. I think graphics programming in Clojure is about to become wonderful fun. The speed of compiled java combined with the experimental, exploratory qualities of LISP.

Since this alpha release is likely to turn into the next version of clojure, and I've been quite interested in the speed of things recently, I'm going to change over to 1.3-alpha now, and all further posts will be based on it. No point learning to optimize the current version when the new one will obviously require different tricks.

Here's the new source code for fractal-tree.clj:



(import '(javax.swing JFrame JPanel )
        '(java.awt Color Graphics Graphics2D))

(defn ^:static draw-tree [ #^Graphics g2d ^double angle ^double x ^double y ^double length ^double branch-angle ^long depth]
  (if (> depth 0)
    (let [new-x (- x (* length (Math/sin (Math/toRadians angle))))
          new-y (- y (*  length (Math/cos (Math/toRadians angle))))
          new-length (fn [] (* length (+ 0.75 (rand 0.1))))
          new-angle  (fn [op] (op angle (* branch-angle (+ 0.75 (rand)))))]
      (. g2d drawLine x y new-x new-y)
      (draw-tree g2d (new-angle +) new-x new-y (new-length) branch-angle (- depth 1))
      (draw-tree g2d (new-angle -) new-x new-y (new-length) branch-angle (- depth 1)))))

(defn render [ #^Graphics g w h ]
  (doto g
    (.setColor (Color/BLACK))
    (.fillRect 0 0 w h)
    (.setColor (Color/GREEN)))
  (let [init-length ( / (min w h) 5),
        branch-angle (* 10 (/ w h)),
        max-depth 12]
    (#'draw-tree  g 0.0 (/ w 2) h init-length branch-angle max-depth)))

(defn create-panel []
    "Create a panel with a customised render"
  (proxy [JPanel] []
    (paintComponent [g]
                    (proxy-super paintComponent g)
                    (time (render g (. this getWidth) (. this getHeight))))))


(defn run []
  (let [frame (JFrame. "Clojure Fractal Tree")
        panel (create-panel)]
    (doto frame
      (.add panel)
      (.setSize 640 400)
      (.setVisible true))))

(run)

27 comments:

  1. Very cool program. I haven't checked out Clojure 1.3-alpha yet, but on 1.2 I seem to be able to get a big speedup just by moving the call to Math/toRadians out of the draw-tree function. With the original code (minus primitive type hints), I was getting 20-50ms to redraw, but after converting draw-tree to work with radians directly it consistently draws in under 5ms.

    The modified code passes angle directly to Math/sin and Math/cos when calculating new-x and new-y in draw-tree, and converts branch-angle to radians in the render function before passing it to draw-tree.

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete
  3. It's definitely faster. After multiple runs:

    1.3.0:
    "Elapsed time: 55.751885 msecs"
    "Elapsed time: 27.157153 msecs"
    "Elapsed time: 32.88198 msecs"
    "Elapsed time: 11.831966 msecs"
    "Elapsed time: 20.818177 msecs"
    "Elapsed time: 13.818499 msecs"
    "Elapsed time: 11.109498 msecs"
    "Elapsed time: 29.436303 msecs"
    "Elapsed time: 10.542898 msecs"
    "Elapsed time: 12.035864 msecs"

    1.2.0:
    "Elapsed time: 179.576282 msecs"
    "Elapsed time: 56.212614 msecs"
    "Elapsed time: 67.700993 msecs"
    "Elapsed time: 344.026875 msecs"
    "Elapsed time: 49.263416 msecs"
    "Elapsed time: 46.297098 msecs"
    "Elapsed time: 66.401638 msecs"
    "Elapsed time: 65.916399 msecs"
    "Elapsed time: 94.904712 msecs"
    "Elapsed time: 98.595937 msecs"
    "Elapsed time: 47.049942 msecs"
    "Elapsed time: 109.587326 msecs"

    ReplyDelete
  4. As someone commented on your previous post - you're still slowing the clojure version down a bit by creating two functions and calling them twice each, rather than just calculating new-angle and new-length directly.

    ReplyDelete
  5. -- (time (render g (.getWidth this) (.getHeight this)))

    ++ (time (render g (.getWidth ^JPanel this) (.getHeight ^JPanel this))

    Reflection is slowing you down I think.

    /Lau

    ReplyDelete
  6. That piece of code only gets executed once per render, so it won't speed things up much. The slowdown is due to autoboxing in the draw-tree function, where the code is spending most of its time.

    Timings (on Clojure 1.3.0-alpha1 this time):

    Original code:
    Elapsed time: 14.461624 msecs
    Elapsed time: 11.558751 msecs
    Elapsed time: 11.723124 msecs
    Elapsed time: 17.423027 msecs
    Elapsed time: 12.037191 msecs
    Elapsed time: 11.961879 msecs

    With the JPanel type hint:
    Elapsed time: 11.135711 msecs
    Elapsed time: 11.234126 msecs
    Elapsed time: 14.418567 msecs
    Elapsed time: 11.233905 msecs
    Elapsed time: 11.165679 msecs
    Elapsed time: 13.988894 msecs

    The real speedup is from the primitive type hints in draw-tree (which is the point of the blog post). Without those, the results are much worse:
    Elapsed time: 59.64743 msecs
    Elapsed time: 58.919367 msecs
    Elapsed time: 57.896841 msecs
    Elapsed time: 61.501948 msecs
    Elapsed time: 58.25483 msecs
    Elapsed time: 59.627082 msecs

    PS: these results are from a different (much slower) machine to the ones in my previous post. Also, the REPL is awesome for testing things quickly like this.

    ReplyDelete
  7. It's nice to see an actual example of the efficiency improvements.

    Out of interest and on a completely different note, how might you add a loop to this code that forces a re-render every t ms? It might be nice to watch the tree bristle on its own accord. However, on inspection of the code, it's not immediately obvious to me where I might insert such a loop...

    ReplyDelete
  8. What is the runtime for the same program in Scala?

    ReplyDelete
  9. @Sam: You can force re-rendering by adding the following code to the run function (substituting 500 with whatever value of t you want):
    (.start (javax.swing.Timer. 500
    (proxy [java.awt.event.ActionListener] []
    (actionPerformed [evt] (.repaint panel)))))

    But that probably won't do what you want - it randomly re-generates the entire tree every frame, so it doesn't really look like the tree is rustling. To do get that sort of effect, you would need to save the points for the first 8 or 10 levels, and only have the final few levels be re-randomized on a per-frame basis. That would need a considerable change to the program structure, I think.

    ReplyDelete
  10. Ok, and how is the performance compared to the Scala version?

    ReplyDelete
  11. I find Criterium a great library for benchmarking in Clojure, I do use it for working in rinzelight.

    ReplyDelete
  12. @ Nadeem, well spotted with the radians! I prefer it like that actually. 1/36th of a circle is more intuitive than 10 degrees anyway, although there's not much in it! Also thanks for all your comments.

    @ Anonymous, it didn't make much difference to the earlier version. Now it's so fast the overhead of creating 200000 anonymous functions/second has become noticeable, although I think it's the garbage collection that slows it down. It appears more as a variance in the timing than as an absolute speedup. I don't notice any difference on my netbook, where there are two cores.

    @Lau, that's only called 50 times a second so it probably isn't that important. I'll do it anyway though.

    @gnuvince, anonymous
    Aargh. Sorry, I didn't mean this post to be an attack on Scala. I wrote it without thinking in ten minutes while a bit drunk because I'd decided I'd try 1.3.0-alpha between coming home from the pictures and going to bed and the first thing I tried was the fractal tree and it was so fast and pretty.

    And then I check my email for lunch and it's the top hit on Hacker News and has got more readers in a day than my carefully constructed macro tutorial that I spent days on has got in a month. Oops.

    I like Scala, and ML-style languages in general, and Clojure and Scala should be friends.

    My initial impression with the original Scala program is that it's a tiny bit faster, say 18ms vs 15ms or something.

    But I am not very good at Scala. I'll make another post with the two programs side by side and proper timings, and then if anyone's interested proper Scala hackers can have a pop at speeding that up.

    I'm not even a proper Clojure hacker. I only got interested in Clojure speed about a week ago!

    ReplyDelete
  13. OK, ran original Scala program as blogged earlier with latest Ubuntu scala version and got:

    Elapsed time: 73.279809 msecs
    Elapsed time: 19.729761 msecs
    Elapsed time: 19.634687 msecs
    Elapsed time: 14.264257 msecs
    Elapsed time: 18.697175 msecs
    Elapsed time: 15.330199 msecs
    Elapsed time: 17.168918 msecs


    With the clojure version currently installed and fractaltree already modified to take into account the suggestions above:

    "Elapsed time: 85.193506 msecs"
    "Elapsed time: 32.344713 msecs"
    "Elapsed time: 37.643066 msecs"
    "Elapsed time: 20.565986 msecs"
    "Elapsed time: 35.433783 msecs"
    "Elapsed time: 18.332186 msecs"
    "Elapsed time: 21.329968 msecs"
    "Elapsed time: 19.300131 msecs"


    You can see HotSpot doing its thing in the timings.

    I think that's a clear win for Scala.

    I'm being unfair to Scala even then. For instance I must make the equivalent mods, particularly radian measure, to the Scala version.

    When I do I'll time it and publish.

    My point was not "Clojure is faster than Scala". My point was "The difference used to be very obvious for this program. Now it isn't. Side by side you'd be hard pushed to tell them apart."

    ReplyDelete
  14. Of course, what we should really do is make a collaborative fractal tree project with maven to make one program open Clojure, Scala and Java versions side by side, and let people run this program on their own machines.

    Totally useless as a benchmark, but what an advert for JVM language interop!

    And all you need to do is $git clone and $mvn run!

    ReplyDelete
  15. And hell, we could open a swank server and you could *still* fiddle with the running image from EMACS.

    ReplyDelete
  16. PS, no-one picked up that you had to re-evaluate both draw-tree and render, because I somehow had published the wrong program.

    I have added the vital #' to render that makes what I said true.

    Did not one of the 7000 people who read this post today actually execute the program?!

    ReplyDelete
  17. Weirdly, replacing the (rand) with (Math/random) like in the scala version seems to stabilise something.

    "Elapsed time: 117.733204 msecs"
    "Elapsed time: 20.40241 msecs"
    "Elapsed time: 27.394806 msecs"
    "Elapsed time: 18.451451 msecs"
    "Elapsed time: 18.437798 msecs"
    "Elapsed time: 20.133874 msecs"
    "Elapsed time: 19.439343 msecs"

    Looking at core.clj, (rand 0.1) calls (rand), and in doing so, I bet it boxes and unboxes the double, and does generic arithmetic, causing garbage generation and slowness.

    Those times are looking pretty close now...

    ReplyDelete
  18. But then Scala, in its mightiness, the radians modification having been applied, produces:


    Elapsed time: 54.284259 msecs
    Elapsed time: 16.918854 msecs
    Elapsed time: 13.562566 msecs
    Elapsed time: 13.070082 msecs
    Elapsed time: 12.38225 msecs
    Elapsed time: 10.266561 msecs
    Elapsed time: 11.741986 msecs

    The crowd go mad, scenting blood. Can Clojure recover from this mortal blow?

    ReplyDelete
  19. My google-fu is failing me. How is this: (#'draw-tree) different from just calling (draw-tree)?

    Care to you point me in the right direction?

    ReplyDelete
  20. #'draw-tree is reader-macro sugar for (var draw-tree)

    Maybe that keeps it from optimizing away the lookup of the draw-tree symbol, to make the interactive slime magic work?

    http://clojure.org/reader

    ReplyDelete
  21. Thanks for the Scala numbers. Pretty impressive speedup with the new Clojure version. Keep up the good work.

    ReplyDelete
  22. > The speed of compiled java combined with the experimental, exploratory qualities of LISP.

    I'm sorry, but shouldn't actual natively-compiled Common Lisp code come very close to compiled java?

    According to http://shootout.alioth.debian.org/u64/which-programming-languages-are-fastest.php#about it actually comes ahead of Clojure on the median (whether that means much is another matter -- I'm pretty sure the benchmarks weren't updated to take the new Clojure's type hinting into account).

    ReplyDelete
  23. @Samium

    No question that there are fast lisps! In fact I'm surprised that SBCL isn't higher on that list, given how tunable it is.

    I was celebrating the fact that clojure appears to be taking steps towards becoming one of them.

    ReplyDelete
  24. Interestingly this code won't compile with 1.3.0-alpha4 as draw-tree has too many arguments:

    CompilerException java.lang.IllegalArgumentException: fns taking primitives support only 4 or fewer args

    See here: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/Compiler.java#L4658

    I wonder what purpose the restriction serves.

    ReplyDelete
  25. In case people are interested, here's a thread on the Clojure mailing list that I started regarding the 4 or fewer args restriction:

    http://groups.google.com/group/clojure/browse_thread/thread/308ee26a432a117a

    ReplyDelete
  26. Hi Sam, well spotted! I've actually gone back to using clojure 1.2, since there seems to be an unnecessary amount of ugliness in the 1.3-alphas released so far.

    Partly the fact that the 1.3's don't seem to work well with slime, partly the fact that I hardly ever notice the speedup, partly because I hate having to litter my code with Ns to get arithmetic to work, and partly because when I do want to optimise a particular routine, the approaches I was using in 1.2 actually seem to get better results than I can get in 1.3.

    I've also decided that I really don't like the JVM, which I've come to think of as a sort of putrid bog on which clojure is built, and that Java interop isn't much use to me.

    I love clojure as a LISP, but I'm starting to hope for a nice modern LISP that uses all the good stuff in clojure and ditches the Java-evil that pollutes it.

    That said, it's still my favourite language, the cleanest and most expressive language I know, and the one I use all the time. Which is not bad for a language designed with Java interop as an explicit goal!

    ReplyDelete
  27. Your example is flawed the same way as the last post. Please replace these closure definitions:
    new-length (fn [] (* length (+ 0.75 (rand 0.1))))
    new-angle (fn [op] (op angle (* branch-angle (+ 0.75 (rand)))))]

    with regular functions just like you did in scala and it will run a ton faster. Someone posted that in your previous scala vs. clojure post and that is nearly all the time difference between the implementations.

    ReplyDelete

Followers