Search This Blog

Wednesday, August 21, 2013

Colonial Blotto

;; Everyone's heard of the Prisoner's Dilemma, but I hadn't heard
;; until recently of Colonel Blotto, which was another foundation of
;; Game Theory.

;; Colonel Blotto and his opponent each have 100 troops, and between
;; them are 10 hills.

;; They must allocate troops to the hills, without knowing each
;; other's decisions.

;; On each hill, the larger number of troops wins, and the side which
;; wins the most battles wins overall.

;; Imagine that Blotto sends ten troops to each hill, but his cunning
;; enemy sends 11 troops to nine hills, and sends one poor soldier on
;; a suicide mission to the tenth.

;; Blotto loses 9-1, which is a pearl-handled revolver performance.

;; In the next battle, his successor cunningly sends 12 troops to the
;; first eight hills, four to hill nine, and none to hill 10.

;; The enemy, however, has anticipated this, and sent one man to claim
;; the undefended hill ten, five to the lightly defended hill nine,
;; and thirteen to most of the others. Another stunning victory.

;; Even though it was originally thought of as a simple war game,
;; Blotto is a better model for election campaigning!


;; Suppose that a colonial empire (let us without loss of generality
;; call them "The British") is desperately trying to hang on to its
;; last foreign possession against the heroic guerilla tactics of a
;; bunch of terrorists (the "Rebels"). The terrorists are badly
;; outnumbered, but agile.

;; The British are a slow moving bunch, who will divide their large
;; force equally among the strategic targets that they need to
;; protect.

;; The Rebels have, as well as the second mover advantage of getting
;; to see their enemy's troop distribution, the choice of how many
;; fronts to fight on.

;; They win if they manage to win on a majority of fronts.

;; Suppose that the British have four thousand men
(def colonial-troops 4000)
;; And the rebels have 1500 
(def rebel-troops 1500)

;; How many fronts should the rebels choose to fight on?

;; Suppose 7 is the rebel commander's lucky number
(def fronts 7)

;; The British send 571 men to each front
(quot colonial-troops fronts) ;-> 571

;; Which we might choose to represent as:
(repeat fronts (quot colonial-troops fronts))

;; Which leaves them with a reserve of three
(- colonial-troops (* fronts min-colonial)) ;-> 3

;; Which we might choose to represent as: 
(set! *print-length* 25)
(concat (repeat 3 1) (repeat 0))

;; They distribute their reserve also:
(def colonial-dist (map + 
                        (repeat fronts (quot colonial-troops fronts))
                        (concat (repeat 3 1) (repeat 0))))

;; To leave a final distribution:
colonial-dist ;-> (572 572 572 571 571 571 571)

(assert (= (reduce + colonial-dist) colonial-troops) "grr")

;; We can summarize this in the following function:
(defn colonial-troop-allocation[troops fronts]
  (let [min-colonial (quot troops fronts)
        excess (- troops (* min-colonial fronts))
        excess-dist (concat (repeat excess 1) (repeat 0))
        min-dist (repeat fronts min-colonial)
        colonial-dist (map + min-dist excess-dist)]
    (assert (= (reduce + colonial-dist) troops) "grr")

(colonial-troop-allocation 120 1) ;-> (120)
(colonial-troop-allocation 120 2) ;-> (60 60)
(colonial-troop-allocation 120 3) ;-> (40 40 40)

(colonial-troop-allocation 225 3) ;-> (75 75 75)
(colonial-troop-allocation 225 5) ;-> (45 45 45 45 45)
(colonial-troop-allocation 225 7) ;-> (33 32 32 32 32 32 32)
(colonial-troop-allocation 225 8) ;-> (29 28 28 28 28 28 28 28)

(colonial-troop-allocation 7 8) ;-> (1 1 1 1 1 1 1 0)

;; Now the rebels have an easier task. They know how many fronts they
;; have to win on, and they only have to win by one soldier on each
;; front, so they can work out a winning allocation so:

(defn rebels-allocation [colonial-allocation]
  (let [majority (inc (quot (count colonial-allocation) 2))]
    (map inc (take majority (sort colonial-allocation)))))
(defn rebels-needed-for-rebellion [colonial-troops fronts]
  (let [ca (colonial-troop-allocation colonial-troops fronts)
        rn (rebels-allocation ca)]
    (reduce + rn)))

(map (partial rebels-needed-for-rebellion colonial-troops) (iterate inc 1)) 
;-> (4001 4002 2668 3003 2403 2670 2288 2505 2225 2406 2186 2338 2159 2292 2139 2259 2124 2230 2111 2211 2101 2192 2098 2176 2093 ...)

;; To win on one front, they need 4001 brave comrades. 
;; To win on two, they need 4002 (one win and one draw doesn't quite cut it).  
;; To win on three, they only need 2668.
;; As the number of fronts goes up, the number appears to be settling down 
;; to around 2000, which is a nice result.

;; "A perfectly informed flexible force fighting an enemy with a
;; uniform order-of-battle needs more than half as many troops to win"

;; How would you advise the rebel commander?

;; What advice would you give the Brits?

Tuesday, August 20, 2013

Gibbs Sampler : Clojure vs Julia, Java, C, MATLAB, Scala and Python

I never could resist wading into someone else's fight. To be honest, I think Clojure and I lose this one pretty badly, even though it turns out to be easy enough to make it do the sampling at Javatic speeds.

;; Brother Jackson ( has recently been trying to
;; persuade me to use a new language.

;; He describes Julia ( as being 'like Matlab,
;; but not crap', so it's clear that there are points both for and
;; against it.

;; As part of his campaign, he pointed me to
;; and that lead to:
;; which compares various languages on a simple gibbs sampler.

;; Darren's timings go from 7 minutes (for R) to 8 seconds (for C)
;; The point of Tim's posts is that Matlab *can* be persuaded to run this quickly, 
;; and that Julia just does.

;; Two other languages that also do well are Java and Scala, with timings around
;; the 11 second mark for the simple Gibbs sampler benchmark.

;; This is a bit of a red rag to a bull, as far as I'm concerned. One
;; reason I like clojure so much is that it lets me think at a very
;; abstract level.

;; With most abstract languages, you pay a heavy speed penalty, and
;; Clojure's no exception, but generally, if it goes slowly enough to
;; annoy you, there's a simple way to speed it up.

;; So I'm thinking, 'If Java and Scala can go that quickly then surely
;; Clojure can'.  It's been a while since I last tried to speed
;; something up, and a lot has changed since.

;; Firstly, for comparison, I ran Darren's C program on my little
;; netbook and it took 91 seconds. The Java program took 147 seconds.
;; So I'm guessing his computer's about 11 times faster than mine,
;; give or take, but that the scale factor's about constant.

;; Here's the Java program that did rather well:

;; import java.util.*;
;; import cern.jet.random.tdouble.*;
;; import cern.jet.random.tdouble.engine.*;
;; class Gibbs
;; {
;;     public static void main(String[] arg)
;;     {
;;     int N=50000;
;;     int thin=1000;
;;     DoubleRandomEngine rngEngine=new DoubleMersenneTwister(new Date());
;;     Normal rngN=new Normal(0.0,1.0,rngEngine);
;;     Gamma rngG=new Gamma(1.0,1.0,rngEngine);
;;     double x=0;
;;     double y=0;
;;     System.out.println("Iter x y");
;;     for (int i=0;i<N;i++) {
;;         for (int j=0;j<thin;j++) {
;;         x=rngG.nextDouble(3.0,y*y+4);
;;         y=rngN.nextDouble(1.0/(x+1),1.0/Math.sqrt(2*x+2));
;;         }
;;         System.out.println(i+" "+x+" "+y);
;;     }
;;     }
;; }

;; The first thing to do is to translate this program into clojure.

;; First we want to add the Parallel Colt library on which this
;; program depends Since restarting the repl is pretty much a fate
;; worse than death we can use the admirable pomegranate library
(require 'cemerick.pomegranate)
 :coordinates '[[net.sourceforge.parallelcolt/parallelcolt "0.10.0"]])

;; Using Parallel Colt was a bit of a pain in java, requiring fiddling
;; with jars and classpaths and poms.xml, and the corresponding libgsl
;; for the C program required a weird bit of debugging before I could
;; get it to work. But the combination of pomegranate and maven
;; makes adding libraries child's play.

;; On the other hand, you do have to import every java class that you
;; intend to use in Clojure by hand. There's no way to translate * as
;; far as I know.
(import (cern.jet.random.tdouble.engine DoubleMersenneTwister))
(import (cern.jet.random.tdouble Normal Gamma))
(import java.util.Date)

;; If we care about speed in Clojure, then this is always a cool thing to do:
(set! *warn-on-reflection* true)
;; The point of it is to warn you when Clojure can't find the type of
;; a java object at compile time.  When this happens, it uses
;; reflection, and if that happens in a tight loop, it can be
;; *extremely* slow

;; So, here we go:

(def N 50000)
(def thin 1000)

(def rngEngine (DoubleMersenneTwister. (Date.)))

(def rngN (Normal. 0.0,1.0,rngEngine))
(def rngG (Gamma.  1.0,1.0,rngEngine))

;; I'm translating as literally as possible here, but I'm going to
;; leave the printing out until the end so that I can play with the
;; program without generating screenfuls of crap.

;; The original is very imperative, and doesn't really fit
;; Clojure's natural style. You can't just change the values of things
;; in Clojure, you have to use Software Transactional Memory.  

;; This has good and bad aspects, but when you're trying to make a
;; literal translation of an imperative program it looks very ugly,
;; and it's going to have a huge overhead in a calculation like this.

(defn literal[]
  (let [x (atom 0) y (atom 0)]
    (dotimes [i N]
      (when (zero? (mod i 10)) (println i)) ;; Printing a progress counter
      (dotimes [j thin]
        (swap! x (fn[_] (.nextDouble rngG 3.0 (+ (* @y @y) 4))))
        (swap! y (fn[_] (.nextDouble rngN (/ (+ 1 @x)) (/ 1.0 (Math/sqrt (+ 2 (* 2 @x)))))))))))

;; Notice that as soon as you define this function, you get:
;; Reflection warning, NO_SOURCE_PATH:6 - call to nextDouble can't be resolved.
;; Reflection warning, NO_SOURCE_PATH:7 - call to nextDouble can't be resolved.

;; And the performance is truly appalling! There is no way I am ever
;; going to wait for this to finish, but I calculate that it will take
;; around an hour and a half. That's quite a slowdown, and it puts
;; Clojure in the same category as R and (badly written) MATLAB

(time (literal)) ;-> let's say about 90 minutes or so.

;; Luckily, reflection warnings are easy to fix. 
;; You just tell the compiler the type of the mysterious object:
(defn literal-with-reflections-fixed[]
  (let [x (atom 0) y (atom 0)]
    (dotimes [i N]
      (when (zero? (mod i 10)) (println i))
      (dotimes [j thin]
        (swap! x (fn[_] (.nextDouble ^Gamma rngG 3.0 (+ (* @y @y) 4))))
        (swap! y (fn[_] (.nextDouble ^Normal rngN (/ (+ 1 @x)) (/ 1.0 (Math/sqrt (+ 2 (* 2 @x)))))))))))

;; That gives us a gigantic speedup, but the performance is still poor
(time (literal-with-reflections-fixed)) ;-> about 9 minutes

;; Remembering that Java took 2 1/2 minutes, we have a factor of four
;; to get rid of here.  That's actually much better than I was
;; expecting given this awful abuse of STM. I'll be less wary of using
;; it heavily in future.

;; But the obvious bottleneck is still all that use of the STM
;; mechanism in the middle of the tight inner loop.  Besides, it's
;; just nasty. That's really not what Clojure's heavily guarded
;; mutable state is for.
;; Let's get rid of that and replace it with clojure's natural loop/recur mechanism
;; in the inner loop, where we're just throwing away the intermediate values, 
;;  and use the elegant iterate for the outer loop, where we want them.

(defn gibbs-loop [[x y]]
  (loop [j thin x x y y]
    (let [x (.nextDouble ^Gamma rngG 3.0 (+ (* y y) 4))
          y (.nextDouble ^Normal rngN (/ (+ 1 x)) (/ 1.0 (Math/sqrt (+ 2 (* 2 x)))))]
      (if (> j 0) (recur (dec j) x y )
          [x y]))))

(def samples (iterate gibbs-loop [0 0]))

;; Now, how long does it take us to get our 50000th sample? 
(time (nth samples N))
;; 219 seconds. (vs 147 for Java/Scala and 91 seconds for C )

;; And I think at that point I'm happy.

;; It's probably possible to bum a few more cycles out of this code, but it's hard to think of why you'd care.

;; Clojure's never going to run faster than Java when you're asking
;; them to perform the same computation, So the best we could hope for
;; is to get rid of the 50% slowdown over Java. I'm sure that's
;; possible, but I think the code above is a nice compromise between
;; speed, elegant expression of the idea, and flexibility. All of
;; which are important in real programs.

;; I did say that I'd cover writing all this to a file as well (and numbering the iterations), so

(time (spit "" (apply str (map (fn [i [x y]] (str i " " x " " y "\n")) (range) (take N samples)))))
;; 3 seconds. Told you that wasn't the hard part.

;; What I really have no idea how to do, let alone to do fast, is to
;; make a traditional unix-style command line utility that can be run
;; from the command prompt and write to stdout.

;; I imagine it's possible, but the amount of
;; library/classpath/maven/#! trickery required is going to be
;; immense. And I don't think there's any getting away from the fact
;; that clojure takes ages to start up.

;; That's a real shame, because I really love that style of programming. I miss it.

;; I wish that there was a language like clojure that lived in the traditional UNIX/C world rather than on the JVM. 

;; Pretty much everything about the JVM drives me up the wall. But
;; actually it's not the JVM's fault that clojure takes ages to
;; start. The Java version of this program starts almost instantly.


;; Footnotes

;; If you'd written this sampler in clojure in the first place, your
;; original code would probably have looked something like this:

(defn gibbs [[x y]]
  (let [x (.nextDouble rngG 3.0 (+ (* y y) 4))
        y (.nextDouble rngN (/ (+ 1 x)) (/ 1.0 (Math/sqrt (+ 2 (* 2 x)))))]
    [x y]))

(defn chain [thin [x y]]
  (nth (iterate gibbs [x y]) thin))

(def samples  (iterate (partial chain thin) [0 0]))

;; Notice the reflection warnings!

;; If you then fix the reflections:

(defn gibbs [[x y]]
  (let [x (.nextDouble ^Gamma  rngG 3.0 (+ (* y y) 4))
        y (.nextDouble ^Normal rngN (/ (+ 1 x)) (/ 1.0 (Math/sqrt (+ 2 (* 2 x)))))]
    [x y]))

(defn chain [thin [x y]]
  (nth (iterate gibbs [x y]) thin))

(def samples  (iterate (partial chain thin) [0 0]))

;; Then you get code which isn't too shabby.

(time (nth (iterate (partial chain thin) [0 0]) 100))
;; 540msec

(time (nth (iterate (partial chain thin) [0 0]) 1000))
;; 5542msec

(time (nth (iterate (partial chain thin) [0 0]) 10000))
;; 56353 mseconds

(time (nth (iterate (partial chain thin) [0 0]) 50000))
;; 319 seconds. About 1/2 the speed of Java.

;; But you end up spending about half the program time packing and
;; unpacking [x y] pairs and creating and garbage collecting a lazy
;; seq that you're never going to look at.

;; I think it's worth having the explicit loop/recur in the inner loop

;; On the other hand:

(defn samples-loop [N]
  (loop [x 0.0 y 0.0 j thin i N acc []]
      (let [x (.nextDouble ^Gamma rngG 3 (+ (* y y) 4))
            y (.nextDouble ^Normal rngN (/ (+ 1 x)) (/ 1 (Math/sqrt (+ 2 (* 2 x)))))]
        (if (> j 0) (recur x y (dec j) i acc)
            (if (> i 0) (recur x y thin (dec i) (conj acc [x y])))))))

;; Has clearly left the path of righteousness. 

;; Although it does achieve the speed for which it sold its soul.

(time (last (samples-loop 50000)))
;; 182 seconds

Wednesday, August 14, 2013

The Shield of Perseus

;; The Shield of Perseus

;; It's a perennial problem for me that occasionally I ask my REPL for an infinite sequence.

;; never do this: 
(iterate (fn[s] (list s s)) '()) ;-> death

;; It's not so bad in a terminal, where the REPL just freezes, but in emacs it's a disaster.
;; The whole thing snarls up in a really nasty way and it usually takes several minutes to fix.

;; A real man would dive into the nrepl code and fix it, but life is too short.

;; safety catch on
(set! *print-length* 7)
(set! *print-level* 7)

(iterate (fn[s] (list s s)) '()) ;-> (() (() ()) ((() ()) (() ())) (((() ()) (() ())) ((() ()) (() ()))) ((((() ()) (() ())) ((() ()) (() ()))) (((() ()) (() ())) ((() ()) (() ())))) (((((() ()) (() ())) ((() ()) (() ()))) (((() ()) (() ())) ((() ()) (() ())))) ((((() ()) (() ())) ((() ()) (() ()))) (((() ()) (() ())) ((() ()) (() ()))))) ((((((# #) (# #)) ((# #) (# #))) (((# #) (# #)) ((# #) (# #)))) ((((# #) (# #)) ((# #) (# #))) (((# #) (# #)) ((# #) (# #))))) (((((# #) (# #)) ((# #) (# #))) (((# #) (# #)) ((# #) (# #)))) ((((# #) (# #)) ((# #) (# #))) (((# #) (# #)) ((# #) (# #)))))) ...)

;; of course there is a price to pay

(range 10) ;-> (0 1 2 3 4 5 6 ...)

((apply comp (repeat 10 list)) 'hi) ;-> (((((((#)))))))

Tuesday, August 13, 2013

IQ Test

tldr: I made an 'iq test': in clojure. Have a go. (make sure you've got cookies enabled first)

A while ago I got a bit annoyed with some 'IQ tests'.


It occurred to me that the main problem with them was that they have very strong practise effects.

I'm not sure that you can talk about measuring something if by measuring it you change it for ever.

Most people seem to think that IQ tests are telling you something about the speed of your brain, and something about your ability at abstract reasoning on novel problems.

I don't think they're telling you anything of the sort. They're telling you whether you've seen problems like this before.

From that point of view the reason for the mysterious 'Flynn Effect' (IQ scores are rocketing over time / our ancestors were morons) seems pretty obvious, and it throws all sorts of beliefs about intelligence differences between classes, races and nations into a new light.

Psychologist friends tell me that in order to qualify as an IQ test, a test has to be pretty much immune to practice effects, but I just can't see how this can possibly be true. Certainly I got much better at IQ tests by doing a few, and the same is true for my Mum, who went from barely being able to answer the questions to happily knocking off all the puzzles quickly with a couple of lessons from me.

Anyway, I reckon that an IQ test is measuring a combination of things.

  • Brain Speed
  • Speed of Learning
  • Previous Experience
  • Practice

And I wondered how to get rid of the second two, which aren't that interesting.

Starting from a real IQ test I stripped out everything that I thought was dodgy, and ended up with a simple symmetry spotting game.

It's got huge practice effects. (Once you've figured it out, there's a speed you can do the puzzles at, and you can get about four times faster with minimal practice).

But because it can generate puzzles at random, you can practice as much as you like.

So my hope is that it 'benchmarks the brain', by measuring the speed you can do a task which feels like a model of abstract/mathematical thought.

And now my question is:

"Are there differences between people, or is everyone equally good at this once they've figured it out?"

Please help me decide by having a go at it.

A score between 40 and 50 is good on the first go. If you then have another go the following day, you should be much better at it, and this effect seems to continue until you eventually level out at a very much higher speed.

Also, I have no idea at all how to write a web app, and I'm sure I've left lots of nasty exploits open.

Feel free to hack it and deface it as much as you like, as long as you tell me how you've done it so I can fix it!

Another problem with the idea of IQ testing is a thing called 'stereotype threat', which is when you're so busy worrying about whether 'your kind' are good at this sort of thing that your performance drops off. Using this, people have apparently reversed the strongest results in experimental psychology regarding gender differences.

I can imagine exactly how this might work, and to this day thank the Good Lord that I never realised that the English stereotype of Irish stupidity applied to me. I always felt English when I was growing up. Perhaps I didn't notice because I am half-Irish, and we are a bit dim.

I hope that my thing will be immune to this sort of thing too, because you can just practise peacefully with no-one watching until you know how good you are at it, and play until you've got a score that you think you can't improve on.

Diversity Prediction Theorem

;; Diversity Prediction Theorem

;; I've recently seen in various places the Diversity Predition Theorem, variously expressed as:

;; Crowd Error = Average Error - Diversity
;; The Crowd Beats the Average
;; Being different is as important as being good
;; Encourage Diversity

;; I'm not going to argue with a mathematical tautology, but I'm a bit sceptical about the 
;; happy feelings induced by this law.

;; Consider:

(defn diversity-theorem [truth predictions]
  (let [square (fn[x] (* x x))
        mean (/ (reduce + predictions) (count predictions))
        avg-sq-diff (fn[a] (/ (reduce + (for [x predictions] (square (- x a)))) (count predictions)))]
    {:average-error (avg-sq-diff truth)
     :crowd-error (square (- truth mean))
     :diversity (avg-sq-diff mean)}))

;; We have a thing whose real value is 49, and some experts who guess the value fairly accurately
;; Their average guess is really close!
(diversity-theorem 49 '(48 47 51))    ;-> {:average-error 3, :crowd-error 1/9, :diversity 26/9}

;; We make our population of experts more diverse:
(diversity-theorem 49 '(48 47 51 42)) ;-> {:average-error 29/2, :crowd-error 4, :diversity 21/2}