Search This Blog

Monday, September 30, 2013

Efficiency and Progress III : How Close Can We Get to C Speeds?

;; Efficiency and Progress III : How Close Can We Get to C Speeds?

;; When messing about trying to make clojure quick, it's often well to have
(set! *warn-on-reflection* true)

;; A kind commenter on my previous post (Bernard) made me aware of the
;; -march=native option for gcc which allows it to optimize for the
;; particular platform it's compiled on.

;; On my little netbook, this doubles the speed of the C program to
;; 8.6 seconds, so it looks like C can add two vectors of a million
;; integers to one another, and then add all the entries in the
;; result, in around 9 milliseconds.

;; On the other hand, another commenter, Dmitry points out that I
;; should make sure that my jvm is running in server mode, and that
;; leiningen isn't turning off the jvm's performance optimizations.

;; This is done by either by adding
;;  :jvm-opts ^:replace ["-server"]
;; Or as a paranoid check, starting a clean clojure by hand:
;; java -server -classpath ~/.m2/repository/org/clojure/clojure/1.5.1/clojure-1.5.1.jar clojure.main

;; I also figure that if I'm going to accuse clojure of being slow, I ought to upgrade to java version 7.

(System/getProperty "java.vm.version") ;-> "23.7-b01"

;; So:
(time (reduce + (map + (range 1000000) (range 1000000)))) ;-> 999999000000
"Elapsed time: 3238.342039 msecs"
"Elapsed time: 2925.501909 msecs"
"Elapsed time: 2079.815112 msecs"
"Elapsed time: 2031.237985 msecs"
"Elapsed time: 2023.951652 msecs"
"Elapsed time: 2095.66391 msecs"
"Elapsed time: 2031.429136 msecs"

;; You can see the jvm optimizing as it runs this code repeatedly
;; People keep telling me to use a benchmarking library called
;; criterium, but I reckon I don't need precision instruments to
;; measure a difference of more than two orders of magnitude, and it
;; sounds like one more complication.

;; If I'm wrong, I'm sure someone will point it out.

;; Let's call Clojure's time for this operation 2023 ms, the fastest
;; that it managed in several runs. I think that's actually fair
;; enough. Other random stuff that's going on is probably only going
;; to slow it down.

(/ 2023 8.6) ;-> 235.2325581395349

;; That's a speed ratio of about 250x

;; Any further suggestions about tuning the underlying tools gratefully received!

;; As a sanity check I also translated the C program into
;; similar-looking java, and that runs in 22secs, so the jvm should be
;; able to add and then reduce two integer arrays in about 22ms.

;; Just as a sanity check:
(import efficiencyandprogress)
(time (efficiencyandprogress/microbenchmark))
"Elapsed time: 22761.461075 msecs"

;; That also allows us a comparison of clojure to its underlying
(/ 2023 22.) ;-> 91.95454545454545

;; which looks about right for a dynamic language.
;; In fact, given lazy-sequences and immutable data structures, it's very good indeed!


;; I've also had various suggestions about how this little
;; microbenchmark could be speeded up.

;; Another commenter, Mark, proved that Clojure can do the actual adding up quickly:
 (loop [i 0
        a 0]
   (if (= i 1000000) a (recur (inc i) (+ a i i))))) ;-> 999999000000
"Elapsed time: 56.808898 msecs"

;; But of course that's not really fair, since it's not adding big
;; vectors but can potentially do all its calculating in the processor
;; itself. The graph algorithms I'm playing with won't be so amenable.


;; After a bit of reading round, it appears that the only way to get
;; this sort of thing to run fast in clojure is to use java's
;; primitive arrays, essentially trying to recreate the java version
;; inside clojure.

;; It's quick to create these arrays. (Not that that really matters)
(time (def a (int-array 1000000)))
"Elapsed time: 5.553077 msecs"
(time (def b (int-array 1000000)))
"Elapsed time: 5.758837 msecs"

;; And you can iterate over them using dotimes and aset, C-style.

;; Do this naively and it's insanely slow
(time (dotimes [i 1000000] (aset a i i)))
;; this is where *warn-on-reflection* pays off : Reflection warning: call to aset can't be resolved.
"a long time...."

;; But what a difference a type-hint makes!
 (let [a ^ints a]
   (dotimes [i 1000000] (aset a i i))))
"Elapsed time: 68.792057 msecs"

;; However, we're still in clojure, which is nice if you want to do
;; pretty much anything other than fast arithmetic on arrays. Arrays
;; can be seqs just like any other collection.
(take 10 a) ;-> (0 1 2 3 4 5 6 7 8 9)

;; dotimes is a macro, so we can have a look at what it's doing
(macroexpand '(dotimes [i 1000000] (aset a i i)))

;; And by analogy construct loops of our own
 (let [a ^ints a b ^ints b]
   (loop [i 0] 
     (when (< i 1000000) 
       (aset a i i)
       (aset b i i)
       (recur (unchecked-inc i))))))
"Elapsed time: 92.136649 msecs"

;; Here we do a vector addition (eek! mutation!)
 (let [a ^ints a b ^ints b]
   (loop [i 0] 
     (when (< i 1000000) 
       (aset b i (+ (aget a i)(aget b i)))
       (recur (unchecked-inc i))))))
"Elapsed time: 152.685547 msecs"

(take 10 b) ;-> (0 2 4 6 8 10 12 14 16 18)

;; and reduction
 (let [b ^ints b]
   (loop [i 0 sum 0] 
     (if (< i 1000000)
       (recur (unchecked-inc i) (+ sum (aget b i)))
"Elapsed time: 106.103927 msecs"

;; So let's bite the bullet and see whether we can do the actual computation that C and Java did.
;; I'm estimating about 4 minutes rather than 9 seconds, but it's progress.

;; To be fair to clojure, we'll allow it to skip the overflow checks
;; that neither C nor Java are bothering with.
(set! *unchecked-math* true)

(def N 1000000)
(def a (int-array (range N)))
(def b (int-array N))

 (let [a ^ints a b ^ints b N ^int N]
   (loop [count 0 sum 0]
     (if (= count 1000) sum
           (println count sum)
           (dotimes [i N] (aset b i (+ (aget a i)(aget b i))))
           (recur (inc count) (+ sum (loop [i 0 ret 0] 
                                       (if (= i N) ret
                                           (recur (unchecked-inc i) (+ ret (aget b i))))))))))))

;; There's clearly something unpleasant in some woodshed somewhere, since I get this warning:
;; recur arg for primitive local: sum is not matching primitive, had: Object, needed: long
;; Auto-boxing loop arg: sum

;; but actually, this does ok!
"Elapsed time: 174125.982988 msecs"
;; The right answer, 9x slower than Java, 20x slower than C. A vast improvement.

;; At this point I'm starting to think that this might give me a way
;; of avoiding the grisly prospect of dropping into C or Java whenever
;; I have some graph algorithm to run.

;; The code is completely unreadable, and almost unwriteable, but it's
;; possible that we can do something about that. We are in a lisp
;; after all.

;; Still, currently I'm thinking, after what's now many days of
;; wrestling with this problem :

;; Difficult and Time Consuming to Write
;; Fragile (whether you get the speedups seems to depend on the detailed structure of the expressions)
;; Impossible to Understand
;; Still Pretty Slow

;; Can anyone show me a better way?

;; I have tried using the visualvm profiler, which I always found very
;; useful when trying to speed up clojure code in version 1.2.

;; Getting it running was an odyssey in itself, and it hasn't given me
;; any helpful insights yet.

;; I'm probably just not understanding what it's doing.

Wednesday, September 25, 2013

Efficiency and Progress II : Behold, C

/* A guest language on this blog. A welcome, please, for C */

/* The reason that I am off on this particular one at the moment is
   because I recently waited 3 hours for a clojure program to
   terminate, after about a day trying to get it to run at all. */

/* When it became apparent that it was not going to finish any time
   soon, I hacked together an answer in C, using what should have been
   a less efficient algorithm that I had come up with while
   waiting. */

/* That program took about 15 minutes to write and 65 seconds to run,
   and got the correct answer. */

/* That comparison is entirely unfair to both clojure and C in all
   sorts of ways, but if I am going to spend time getting clojure to
   run at C-ish speeds, I need to know what I should be aiming for. */

/* This program is what I am using as a comparison for (reduce + (map + _ _ )) */

/* To make sure that clever compilers and runtimes aren't doing any
   sneaky dead-code elimination, it is actually doing some sort of
   computation. But it is mainly mapping and reducing. Lots.*/


#define N 1000000

int a[N];
int b[N];

int main(void)
  int i, count;
  long long sum=0;

  for (i=0; i< N; i++) {

  for(count=0; count<1000; count++){
    for (i=0; i< N; i++) {

    for (i=0; i< N; i++) {

  printf("sum=%lli\n", sum);

/* gcc -std=gnu99 -Ofast efficiencyandprogress.c -o efficiencyandprogress && time ./efficiencyandprogress */
/* sum=250249749750000000 */

/* real 0m16.053s */
/* user 0m15.992s */
/* sys  0m0.032s */

/* So it looks as though adding one array to another and then adding up all the values in an array takes about 16ms in total.*/
/* That's 16ns per array entry which looks sort of OK on my little netbook, which boast an Intel Atom CPU N455 running at 1.66GHz with a 512kb*/

/* I'm hoping there's enough complexity here that the compiler actually has to run the program rather than taking short cuts*/

/* But just as a check, here's the code running with gcc in 'do exactly what I say so I can debug it' mode.*/

/* gcc -std=gnu99 -O0 efficiencyandprogress.c -o efficiencyandprogress && time ./efficiencyandprogress */
/* sum=250249749750000000 */

/* real 0m27.850s */
/* user 0m27.692s */
/* sys  0m0.060s */

/* This produces a small constant factor speedup, as expected if the two versions are doing the same work. */

Efficiency and Progress : My God, Clojure is Slow!

A warning: this post is considered stupid and harmful by a large number of people.

From the comments:

Anonymous May 28, 2014 at 7:06 AM

I love that this post shows up so highly in Clojure searches. It's a somber reminder to not write about things I know nothing about (let alone with a proclamation like the author uses).

And my reply:

Refute me then, and I will make another post saying I was wrong. I wrote this after two days trying to get a clojure program to complete while doing an algorithms course. I eventually rewrote it in C and it ran in a few seconds on the same hardware.

Once upon a time I wrote posts claiming that clojure was very fast if you wrote it cleverly. No one complained then and they did get a lot of publicity. Nowadays I've lost the knack. If anything it seems that compromises made to speed up the performance have resulted in un-optimizability and I understand that the official advice is to use Java for the tight loops.

But I have no wish to damage the public reputation of clojure, which I love and use all the time. What search makes this post show up on the front page?

;; Efficiency and Progress
;; Are ours once again
;; Now that we have the neut-ron bomb
;; It's nice and quick and clean and ge-ets things done...


;; When you program in Clojure, you get the raw speed of assembler.

;; Unfortunately, that is, assembler on a ZX81, running a Z80 processor at 4MHz in 1981.

;; If anything, that comparison is unfair to my old ZX81. Does anyone
;; remember '3D Invaders', a fast and exciting first person shooter /
;; flight simulator that ran in 1K of RAM *including memory for the
;; screen*?

;; Once upon a time, I had the knack of making clojure run at the same
;; speed as Java, which is not far off the same speed as C, which is
;; not far off the speed of the sort of hand-crafted machine code which
;; no-one in their right mind ever writes, in these degenerate latter
;; days which we must reluctantly learn to call the future.

;; But I seem to have lost the knack. Can anyone show me what I am doing wrong?

;; At any rate, it isn't too hard to get it to run at something like
;; the real speed of the machine, as long as you're prepared to write
;; code that is more like Java or C than Clojure.

;; So here are some thoughts about how to do this.

;; Which I offer up only as a basis for discussion, and not in any way
;; meaning to stir up controversy, or as flame-bait or ammunition for
;; trolls or anything of that sort.


;; Clojure is very slow:

(time (reduce + (map + (range 1000000) (range 1000000))))
"Elapsed time: 5316.638869 msecs"
;-> 999999000000

;; The greater part of its slowness seems to be do with lazy sequences

(time (def seqa (doall (range 1000000))))
"Elapsed time: 3119.468963 msecs"
(time (def seqb (doall (range 1000000))))
"Elapsed time: 2839.593429 msecs"

(time (reduce + (map + seqa seqb)))
"Elapsed time: 3558.975552 msecs"
;-> 999999000000

;; It looks as though making a new sequence is the expensive bit
(time (doall (map + seqa seqb)))
"Elapsed time: 3612.553803 msecs"
;-> (0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 ...)

;; Just adding things up is way faster
(time (reduce + seqa))
"Elapsed time: 315.717033 msecs"

;; I wondered if there was a way of avoiding lazy-seqs

(time (def veca (vec seqa)))
"Elapsed time: 470.512696 msecs"

(time (def vecb (vec seqb)))
"Elapsed time: 374.796054 msecs"

;; After all, 'use the right data structure for the problem' is pretty much lesson 1, and if vectors are not a good data structure
;; for this problem, then what is?

;; But it seems that despite the speed of making the vectors, it doesn't help much when we do our thing.
;; In fact it's a bit slower
(time (reduce + (mapv + veca vecb)))
"Elapsed time: 4329.070268 msecs"


;; So lets say 3600ms to add together two arrays of 1000000 elements and sum the result.

;; In C on the same machine (my little netbook with its 1.66GHz Atom
;; and 512kb cache) this seems to take 16 ms, being 8ms for the map
;; and 8 ms for the reduce.  I'm assuming that that time is mostly
;; spent waiting on the main memory, but I may be wrong. Who knows how
;; these things are done?

(/ 3600 16) ;-> 225

;; So shall we call this a 225x slowdown for the natural expression in
;; the two languages of mapping and reducing?

(time (reduce + seqa))
"Elapsed time: 358.152249 msecs"

;; If we just look at the reduction, then that's 
(/ 358 8.) ; 44.75

;; So around 50x


Sunday, September 22, 2013

How on earth do you program without *recursion*?

;; Does anyone know what the hell is going on here ?

(clojure-version) "1.5.1"

;; I mean, this is bad enough:
;; Here's a function to sum up all the numbers from 1..n

(def gauss-recurse (fn [n] (if (< n 1) 0 (+ n (gauss-recurse (dec n))))))

(gauss-recurse 3500) ;-> 6126750
(gauss-recurse 4000) ;-> StackOverflowError   clojure.lang.Numbers$LongOps.combine (

;; A maximum stack depth of 3500 or so is completely rubbish, but it's been like that for a while.

;; But I don't remember this:

(def gauss-memoized (memoize (fn [n] (if (< n 1) 0 (+ n (gauss-memoized (dec n)))))))

(gauss-memoized 160) ;-> StackOverflowError   clojure.lang.RT.boundedLength (

(gauss-memoized 100) ;-> 5050   I could probably do this in my head. At least it's the right answer.

;; Why would memoization affect the stack depth anyway?

;; Please please let this be a bug, or some idiocy on my part.

;; Please don't let my favourite language have decided that I can't
;; memoize a recursive function for any non-trivial problem.

;; And if any smartarse is thinking: 'You could use'
(reduce + (range 1 5001)) ;-> 12502500
(* 5000 5001 1/2) ;-> 12502500N

;; or

(defn gauss-tail 
  ([n] (gauss-tail n 0))
  ([n acc] (if (zero? n) acc (recur (dec n) (+ n acc)))))

(gauss-tail 5000) ;-> 12502500

;; etc, etc, etc, then I already know, ok. That's not the point. So fuck off. 

;; Unless you're about to give me a general macro which will transform
;; any collection of mutually recursive functions into an efficient
;; stackless implementation which avoids repeated solutions of the
;; same problem. In which case I will be most interested.

;; How on earth do you program without recursion ?

;; And why would you want to?

Friday, September 20, 2013

Algorithms II

In case it isn't obvious, I'm doing a course in Algorithms.

Tim Roughgarden's Algorithms II from Stanford, made available for free to all comers by the magic of coursera.

It's absolutely fascinating, and I can't recommend it highly enough.

I did the same lecturer's  Algorithms I course a few months ago, and it was also mind-blowing.

It's making me think that for all these years where I thought I was a seasoned expert, actually I haven't been a terribly good programmer at all.

That's a humbling feeling, but also a very nice feeling. There is still a great deal of stuff to learn. Stuff that matters.

So much of this stuff is absolutely foundational. My ideas of what a computer can do are expanding.

There have been several formative experiences in my computer-life:

Seeing my first ever computer program, in BASIC, in an electronics magazine when I was about ten years old, and being stunned to realise that I understood it.

Being allowed to play on the TRS80s in a computer shop in Sheffield (day in day out for weeks. thank you kind salesmen!)

Getting my own computer and really learning how to program in BASIC (thank you kind parents, I know how difficult it must have been to afford it)

Learning Python (thanks Guido)

Learning Lisp (thanks everyone!)

This algorithms course (thanks Tim, and the long list of pioneers whose work you are presenting so well).

Honestly. It's that good. I wish I'd done this when I was twenty.

Actually I wish I'd done this instead of going to school.

Thursday, September 19, 2013

Putting it all together : Using Union-Find in Kruskal's Algorithm

;; Kruskal's Algorithm and Union-Find

;; So I want to try out my version of union-find on the inspiration for it, Kruskal's algorithm

;; Here's my attempt:

(defn make-union-find [elts]
  (apply hash-map (mapcat (fn[x][x [x [x]]]) elts)))

(defn joined? [union-find a b]
  (= (first (union-find a)) (first (union-find b))))

(defn join [union-find a b]
  (let [[leader-a _] (union-find a)
        [leader-b _] (union-find b)]
    (if (= leader-a leader-b) 
      union-find ;; nothing to do
      (let [[_ followers-a] (union-find leader-a)
            [_ followers-b] (union-find leader-b)]
        (if (>= (count followers-a) (count followers-b))  ;; if a's in the bigger group
          (let [new-followers (into followers-a followers-b)] ;; combine follower-lists
            (let [uf (assoc union-find leader-a [leader-a new-followers])] ;; add the new followers to the 'a group leader'
              (reduce (fn[uf x] (assoc uf x [leader-a []])) uf followers-b))) ;; and change every member of the 'b group' to follow the 'a group leader' instead
          (join union-find b a)))))) ;; but if a's in the smaller group do it the other way round

;; And here's the data from the original maximum-spanning-tree problem

(def cities ["London" "Birmingham" "Sheffield" "Bristol" "Leeds" "Liverpool" "Manchester"])

(def link-costs 
   ["London" "Birmingham" 103]
   ["London" "Sheffield"  167]
   ["London" "Leeds" 175]
   ["London" "Bristol" 100]
   ["London" "Liverpool" 178]
   ["London" "Manchester" 181]

   ["Birmingham" "Sheffield"  91]
   ["Birmingham" "Leeds" 92 ]
   ["Birmingham" "Bristol" 79 ]
   ["Birmingham" "Liverpool" 75 ]
   ["Birmingham" "Manchester" 95]

   ["Sheffield" "Bristol" 180]
   ["Sheffield" "Leeds" 33]
   ["Sheffield" "Liverpool" 63]
   ["Sheffield" "Manchester" 37]

   ["Bristol" "Leeds" 171]
   ["Bristol" "Liverpool" 136]
   ["Bristol" "Manchester" 139]

   ["Leeds" "Liverpool" 73]
   ["Leeds" "Manchester" 40]

   ["Liverpool" "Manchester" 27]])

;; Kruskal's Algorithm Again:

;; Order the links by cost
(def links (sort-by (fn[[_ _ b]] b) link-costs))

;; And construct an initial partition of the cities
(def city-uf (make-union-find cities))

;; Kruskal has told us that should go through our list of links in cost order, building
;; only those links which link unlinked things

;; This function takes the partition so far, and the current partially constructed spanning tree
;; and adds a new link to the tree it is building
;; if and only if the link helps
(defn add-link [[uf tree] link]
  (let [a (first link)
        b (second link)]
    (if (joined? uf a b) 
      [uf tree]
      [(join uf a b) (cons link tree)])))

;; By the Power of Kruskal and by that of Union-Find:

(reduce + (map (fn[[_ _ x]] x) (second (reduce add-link [city-uf '()] links)))) ;-> 351
(reduce + (map (fn[[_ _ x]] x) (second (reduce add-link [city-uf '()] (reverse links))))) ;-> 988

;; Well, they are surely the same answers as before. 

;; How about performance?

(time (reduce add-link [(make-union-find (range 0 10)) '()] (take 10 (repeatedly (fn[] (list (rand-int 10) (rand-int 10) (rand-int 10))))))) 
"Elapsed time: 3.388981 msecs"
(time (reduce add-link [(make-union-find (range 0 100)) '()] (take 100 (repeatedly (fn[] (list (rand-int 100) (rand-int 100) (rand-int 100))))))) 
"Elapsed time: 15.278055 msecs"
(time (reduce add-link [(make-union-find (range 0 1000)) '()] (take 1000 (repeatedly (fn[] (list (rand-int 1000) (rand-int 1000) (rand-int 1000))))))) 
"Elapsed time: 96.710761 msecs"
(time (reduce add-link [(make-union-find (range 0 10000)) '()] (take 10000 (repeatedly (fn[] (list (rand-int 10000) (rand-int 10000) (rand-int 10000))))))) 
"Elapsed time: 978.454873 msecs"
(time (reduce add-link [(make-union-find (range 0 100000)) '()] (take 100000 (repeatedly (fn[] (list (rand-int 100000) (rand-int 100000) (rand-int 100000))))))) 
"Elapsed time: 15033.289079 msecs"

;; So far we're looking pretty linear, as I was expecting, but suddenly:

(time (reduce add-link [(make-union-find (range 0 200000)) '()] (take 200000 (repeatedly (fn[] (list (rand-int 200000) (rand-int 200000) (rand-int 200000)))))))
"Elapsed time: 129027.687251 msecs"
(time (reduce add-link [(make-union-find (range 0 1000000)) '()] (take 1000000 (repeatedly (fn[] (list (rand-int 1000000) (rand-int 1000000) (rand-int 1000000)))))))
"Elapsed time: 500747.749151 msecs"

;; Not sure what happened there! Anyone know?

Implementing a Data Structure : Union-Find version II

;; Union-Find II

;; Union-Find is a data structure specifically designed for keeping
;; track of partitions and equivalence relations.

;; A partition is a division of a large set into smaller sets
;; It can also be viewed as an equivalence relation

;; a is joined to b, if and only if a and be are in the same subset in the partition.

;; Consider our set of cities

(def cities ["London" "Birmingham" "Sheffield" "Bristol" "Leeds" "Liverpool" "Manchester"])

;; Our second try at a structure will be to associate with every item a pointer and a vector
;; The pointer will point to the leader of the item's group (the 'canonical element')

;; The vector will record every item which points to this item, which we should be able to use 
;; to speed up the join operation.

;; Our initial union-find for our cities looks like

(defn make-union-find [elts]
  (apply hash-map (mapcat (fn[x][x [x [x]]]) elts)))

(def initial (make-union-find cities))

;; {"Liverpool" ["Liverpool" ["Liverpool"]],
;;  "Sheffield" ["Sheffield" ["Sheffield"]],
;;  "Manchester" ["Manchester" ["Manchester"]], 
;;  "Birmingham" ["Birmingham" ["Birmingham"]], 
;;  "Bristol" ["Bristol" ["Bristol"]],
;;  "London" ["London" ["London"]], 
;;  "Leeds" ["Leeds" ["Leeds"]]}

;; To join Liverpool to London now, we must first decide which is to be the new leader (say Liverpool)
;; And then modify London's entry to be 
;;  "London" ["Liverpool" []]
;; And Liverpool's to be
;; "Liverpool" ["Liverpool" ["Liverpool" "London"]]

;; Here's our new, considerably more complex version of join:

(defn join [union-find a b]
  ;; find the leaders of both groups
  (let [[leader-a _] (union-find a)
        [leader-b _] (union-find b)]
    ;; if a and b are in the same group
    (if (= leader-a leader-b) union-find ;; nothing to do
        ;; but otherwise
        (let [[_ followers-a] (union-find leader-a)
              [_ followers-b] (union-find leader-b)]
          (let [new-followers (into followers-a followers-b)] ;; combine the follower-lists
            (let [uf (assoc union-find leader-a [leader-a new-followers])] ;; add the new followers to the 'a group leader'
              (reduce (fn[uf x] (assoc uf x [leader-a []])) uf followers-b))))))) ;; and change every member of the 'b group' to follow the 'a group leader' instead

(join initial "Liverpool" "London")
;; {"Liverpool" ["Liverpool" ["Liverpool" "London"]],
;;  "Sheffield" ["Sheffield" ["Sheffield"]],
;;  "Manchester" ["Manchester" ["Manchester"]],
;;  "Birmingham" ["Birmingham" ["Birmingham"]],
;;  "Bristol" ["Bristol" ["Bristol"]],
;;  "London" ["Liverpool" []],
;;  "Leeds" ["Leeds" ["Leeds"]]}

;; I can't believe I just wrote that and it worked!

;; Checking whether things are joined is still easy
(defn joined? [union-find a b]
  (= (first (union-find a)) (first (union-find b))))

(def testnet
  (-> initial 
      (join "Liverpool" "London")
      (join "London" "Liverpool")
      (join "Bristol" "Manchester")
      (join "Liverpool" "Leeds")
      (join "Bristol" "Leeds")))

;; {"Liverpool" ["Bristol" []],
;;  "Sheffield" ["Sheffield" ["Sheffield"]],
;;  "Manchester" ["Bristol" []],
;;  "Birmingham" ["Birmingham" ["Birmingham"]],
;;  "Bristol" ["Bristol" ["Bristol" "Manchester" "Liverpool" "London" "Leeds"]],
;;  "London" ["Bristol" []],
;;  "Leeds" ["Bristol" []]}

;; I still don't actually believe that this is working:

(joined? testnet "Liverpool" "Liverpool") ;-> true
(joined? testnet "Liverpool" "London") ;-> true
(joined? testnet "Leeds" "London") ;-> true
(joined? testnet "Bristol" "London") ;-> true
(joined? testnet "Bristol" "Manchester") ;-> true
(for [x cities] (joined? testnet "Sheffield" x)) ;-> (false false true false false false false)

cities ;-> ["London" "Birmingham" "Sheffield" "Bristol" "Leeds" "Liverpool" "Manchester"]
(map (comp count (partial filter identity)) 
     (partition (count cities) 
                (for [x cities y cities] (joined? testnet x y)))) ;-> (5 1 1 5 5 5 5)

;; But I can't break it!

;; Now, neat as this all is, it's still possible that a join will
;; produce as many updates as there are elements in the partition

;; So I want to introduce one final optimization
;; When two groups are joined, the group that gets the updates should be the smaller group.

;; This doesn't sound like much, but it is actually crucial, reducing the time to 
;; join everything together from n^2 to n log n 

;; That's because the big group merges don't happen very often. Look at it from an element's point of view.
;; How many times does it expect to participate in a merge that changes its leader?

;; Unfortunately this makes join almost unimaginably complex:
(defn join [union-find a b]
  (let [[leader-a _] (union-find a)
        [leader-b _] (union-find b)]
    (if (= leader-a leader-b) 
      union-find ;; nothing to do
      (let [[_ followers-a] (union-find leader-a)
            [_ followers-b] (union-find leader-b)]
        (if (>= (count followers-a) (count followers-b))  ;; if a's in the bigger group
          (let [new-followers (into followers-a followers-b)] ;; combine follower-lists
            (let [uf (assoc union-find leader-a [leader-a new-followers])] ;; add the new followers to the 'a group leader'
              (reduce (fn[uf x] (assoc uf x [leader-a []])) uf followers-b))) ;; and change every member of the 'b group' to follow the 'a group leader' instead
          (join union-find b a)))))) ;; but if a's in the smaller group do it the other way round

;; I have a feeling that I should try to factor this function a bit. 
;; But I'm so proud of the fact that I got it working that I don't want to mess it up.
;; I think this feeling makes me what used to be called in my C days 'a three star programmer'.

;; And now I really do feel paranoid
(defn sanity-check [union-find cities]
  (map (comp count (partial filter identity)) 
       (partition (count cities) 
                  (for [x cities y cities] (joined? union-find x y)))))

cities ;-> ["London" "Birmingham" "Sheffield" "Bristol" "Leeds" "Liverpool" "Manchester"]
(sanity-check initial cities) ;-> (1 1 1 1 1 1 1)
(sanity-check (-> initial (join "Liverpool" "London")) cities) ;-> (2 1 1 1 1 2 1)
(sanity-check (-> initial (join "London" "Liverpool")) cities) ;-> (2 1 1 1 1 2 1)
(sanity-check (-> initial (join "London" "Liverpool")(join "Liverpool" "London")) cities) ;-> (2 1 1 1 1 2 1)
(sanity-check (-> initial (join "London" "Liverpool")(join "Bristol" "Manchester")(join "Liverpool" "London")) cities) ;-> (2 1 1 2 1 2 2)
(sanity-check (-> initial (join "Bristol" "London")(join "London" "Liverpool")(join "Bristol" "Manchester")(join "Liverpool" "London")) cities) ;-> (4 1 1 4 1 4 4)
(sanity-check (-> initial (join "Sheffield" "Birmingham")(join "Bristol" "London")(join "London" "Liverpool")(join "Bristol" "Manchester")(join "Liverpool" "London")) cities) ;-> (4 2 2 4 1 4 4)

;; Hmm, maybe I'm convinced.

;; This is actually a terribly old-fashioned implementation of union-find 

;; A quick google turns up:
;; Which looks spiffy, but which I haven't actually tried.

;; I want to use mine, dammit!

Implementing a Data Structure: Union-Find

;; Union-Find I


;; In the last post, I showed Kruskal's algorithm for finding Extremal
;; Spanning Trees in a weighted graph.

;; If we were to try to scale that algorithm, we'd find that it was
;; quadratic in the number of vertices in the graph.

;; The problem is that we're repeatedly searching lists of sets to see
;; whether two things are connected or not

;; So we could speed it up a lot if we could find a data structure
;; that is good at that sort of thing

;; Specifically, we'd like to be able to ask whether a is joined to b quickly,

;; and we'd like to be able to quickly modify the relation when we decide to join a to b


;; Union-Find is a data structure specifically designed for keeping
;; track of partitions and equivalence relations.

;; A partition is a division of a large set into smaller sets
;; It can also be viewed as an equivalence relation

;; a is joined to b, if and only if a and be are in the same subset in the partition.

;; Consider our set of cities

(def cities ["London" "Birmingham" "Sheffield" "Bristol" "Leeds" "Liverpool" "Manchester"])

;; We'll make that into a hash of cities, where each city points to itself

(def initial (apply hash-map (mapcat (fn[x][x x]) cities)))

;; We'll interpret that map by saying 'All things that point to the same thing are in the same component'.
;; So in our initial map, where everything is pointing to itself, nothing is joined.

;; Say we want to assert that Liverpool ~ London. 
;; Then we want to make everything that points to Liverpool point to whatever it is London points to.

(defn join [union-find [a b]]
  (let [leader1 (union-find a)
        leader2 (union-find b)]
    (into {} (map (fn[[k v]] (if (= v leader1) [k leader2] [k v])) union-find))))

;; Let's connect Liverpool to London and London to Bristol, and Manchester to Sheffield
(reduce join initial [["Liverpool" "London"] ["London" "Bristol" ] ["Manchester" "Sheffield"]]) 
;-> {"Liverpool" "Bristol", "Sheffield" "Sheffield", "Manchester" "Sheffield", "Birmingham" "Birmingham", "Bristol" "Bristol", "London" "Bristol", "Leeds" "Leeds"}

;; Notice that Liverpool, Bristol, and London all point to Bristol (call that the Bristol Group)
;; And that Sheffield and Manchester form a group with Sheffield as its leader (call that the Sheffield group)
;; Whilst Leeds and Birmingham stand in splendid isolation, leaders of their own groups.

;; Now we can easily and quickly check which places are connected

(defn joined? [union-find a b]
  (= (union-find a)(union-find b)))

 (reduce join initial [["Liverpool" "London"] ["London" "Bristol" ] ["Manchester" "Sheffield"]])
 "Bristol" "Liverpool") ;-> true

 (reduce join initial [["Liverpool" "London"] ["London" "Bristol" ] ["Manchester" "Sheffield"]])
 "Bristol" "Manchester") ;-> false

;; But that's only half the problem. When joining cities, we still need to scan the whole map.

;; That means that if we're mainly joining things, rather than querying them, our performance is still poor.

;; So we should make each leader keep a list of the things that point to it.

;; Let's start again!

Kruskal's Algorithm for Minimal Spanning Trees

;; Kruskal's Algorithm and Minimal Spanning Trees

;; Suppose, in a wild flight of speculative fantasy, that years of conservative government
;; has completely destroyed the english railway network.

;; And piling absurdity upon absurdity, suppose that an eventual socialist government has
;; decided to rebuild it.

;; Under socialism, there's no money, so it's important to
;; connect the largest cities as cheaply as possible.
(def cities ["London" "Birmingham" "Sheffield" "Bristol" "Leeds" "Liverpool" "Manchester"])

;; Cadres of engineers are employed to pull cost projections out of their asses:
(def link-costs 
   ["London" "Birmingham" 103]
   ["London" "Sheffield"  167]
   ["London" "Leeds" 175]
   ["London" "Bristol" 100]
   ["London" "Liverpool" 178]
   ["London" "Manchester" 181]

   ["Birmingham" "Sheffield"  91]
   ["Birmingham" "Leeds" 92 ]
   ["Birmingham" "Bristol" 79 ]
   ["Birmingham" "Liverpool" 75 ]
   ["Birmingham" "Manchester" 95]

   ["Sheffield" "Bristol" 180]
   ["Sheffield" "Leeds" 33]
   ["Sheffield" "Liverpool" 63]
   ["Sheffield" "Manchester" 37]

   ["Bristol" "Leeds" 171]
   ["Bristol" "Liverpool" 136]
   ["Bristol" "Manchester" 139]

   ["Leeds" "Liverpool" 73]
   ["Leeds" "Manchester" 40]

   ["Liverpool" "Manchester" 27]])

;; First we do two preprocessing steps:

;; We put the potential links in order of cost
(def links (sort-by (fn[[_ _ b]] b) link-costs))

;; And construct an initial partition of the cities
(def city-partition (map (comp set list) cities))

;; We need a function which can merge sets
(defn partition-merge [part a b]
  (conj (filter #(not (or (contains? % a)(contains? % b))) part)
        (apply clojure.set/union 
               (concat (filter #(contains? % a) part)
                       (filter #(contains? % b) part)))))

;; So for instance, should we build a route between London and Liverpool
(partition-merge city-partition "London" "Liverpool") 
;-> (#{"Liverpool" "London"} #{"Birmingham"} #{"Sheffield"} #{"Bristol"} #{"Leeds"} #{"Manchester"})

;; The partition-merge keeps track of the fact that we've connected the two

;; Note that if we build a redundant line, the size of the partition will not reduce
(-> city-partition
    (partition-merge "London" "Liverpool")
    (partition-merge "Bristol" "Liverpool")
    (partition-merge "London" "Bristol")) 
;-> (#{"Liverpool" "Bristol" "London"} #{"Birmingham"} #{"Sheffield"} #{"Leeds"} #{"Manchester"})

;; Kruskal has told us that should go through our list of links in cost order, building
;; only those which increase the connectivity.

;; This function takes the partition so far, and adds a new link to the tree it is building
;; if and only if the link helps
(defn add-link [[partition tree] link]
  (let [new (partition-merge partition (first link) (second link))]
    (if (< (count new) (count partition))
      [new (cons link tree)]
      [new tree])))

;; By the Power of Kruskal:

(def tree (second (reduce add-link [city-partition '()] links)))
;; ["London" "Bristol" 100]
;; ["Birmingham" "Bristol" 79]
;; ["Birmingham" "Liverpool" 75]
;; ["Sheffield" "Manchester" 37]
;; ["Sheffield" "Leeds" 33]
;; ["Liverpool" "Manchester" 27]

;; The people's railway should go 
;; London->Bristol->Birmingham->Liverpool->Manchester->Sheffield->Leeds

(reduce + (map (fn[[_ _ x]] x) tree)) ;-> 351

;; And it will use 351 glorious miles of shining steel.

;; Can we do better?


;; One way to do better, of course, if you were the head of the railway-building department
;; would be increase the size of your empire to the maximum you can get away with

;; Kruskal's Algorithm is equally good at finding maximal spanning trees

(def max-tree (second (reduce add-link [city-partition '()] (reverse links))))

(doseq [i max-tree] (println i))
;; [London Birmingham 103]
;; [Bristol Leeds 171]
;; [London Leeds 175]
;; [London Liverpool 178]
;; [Sheffield Bristol 180]
;; [London Manchester 181]

(reduce + (map (fn[[_ _ x]] x) max-tree)) ;-> 988

;; Now the railway goes in a star topology out from London to the great cities.

;; Ridiculous

Wednesday, September 18, 2013

The Knapsack Problem : Another Classic Puzzle from Computer Science

;; The Knapsack Problem

;; Suppose you've got twelve pounds
(def budget 12)

;; And there's a thing that costs a pound, but is worth 20.
;; And another thing that costs 3, but is worth 30
;; And another thing that costs 3, but is worth 21
;; And a thing that costs 6 but is worth 40
(def things (map (fn[[c v]] {:cost c :value v}) [[1 20][3 30][3 21][6 40]]))

(defn price [things]    (reduce + (map :cost  things)))
(defn evaluate [things] (reduce + (map :value things)))

(evaluate things) ;-> 111
(price things) ;-> 13

;; So there's 111's worth of things going for 13, but you can't buy everything.

;; What do you buy? 

(defn value [sorted-things]
   (let [order   sorted-things
         baskets (reductions conj '() order)]
     (last (take-while #(<= (price %) budget) baskets)))))

;; Well, if you're a cynic
(value (sort-by :cost things)) ;-> 71
;; Then you come away with 71's worth

;; And if you're an idealist
(value (reverse (sort-by :value things))) ;-> 91
;; Then you do way better with 91

;; A more cunning approach is to take things in order of their price/value ratio
(value (reverse (sort-by (fn [{:keys[value cost]}] (/ value cost)) things))) ;-> 71
;; Sadly that does worse than the approach that only pays attention to the value.

;; So it seems that out of the three natural-seeming 'greedy algorithms', the best solution is 91

;; Yet another approach is to exhaustively search the space of possibilities:

(defn subsets [things]
  (if (empty? things) '(())
      (let [srt (subsets (rest things))]
        (concat (map #(cons (first things) %) srt) srt))))

(reverse (sort-by second (for [i (subsets things)] [(price i) (evaluate i)]))) 
;-> ([13 111] [12 91] [10 90] [10 81] [7 71] [9 70] [9 61] [7 60] [6 51] [4 50] [4 41] [6 40] [3 30] [3 21] [1 20] [0 0])

;; Which tells us that the best combination is unaffordable, so we
;; have to settle for the second best, which is paying 12 to get 91,
;; which the idealist has been trying to tell us all along.

;; But the idealistic approach is unlikely to work in the general case. 

;; Consider a thing which is worth a lot, but horribly expensive, and
;; lots of other things which are worth a fair bit and dirt cheap.

;; Personally my money would have been on the 'buy things in order of
;; price/value ratio' approach, but we above that that fails in at
;; least one easy case.

;; So it appears that if we are faced with a problem like this, ( and
;; there are many such problems ), then we are doomed.

;; Exhaustive search is not feasible once you've got more than a very
;; few items, and yet the various greedy algorithms above get the
;; wrong answers.

;; And yet if you write down a knapsack problem like this, you will
;; not find it appallingly difficult to pick the best arrangement.

;; There is a certain tradition at this point of exclaiming 'The HUMAN
;; COMPUTER', and then going on to derive your favourite philosophical
;; position on the nature of consciousness, which will miraculously
;; turn out to be whatever it was you thought before you contemplated
;; the problem in question.

;; But wait ...