Learning Clojure

Search This Blog

Friday, March 1, 2019

Anti Fascist Action

This question:

https://stackoverflow.com/questions/2352020/debugging-in-clojure

on Stack Overflow had loads of excellent answers, and every time someone added one, I learnt something from it.

So obviously some twat has closed it.

If you've got the relevant magic powers, go and vote to reopen it.

Let battle commence.




On the meta level, why is Stack Overflow such a bunch of fascist bastards these days? Why would people voluntarily spend their time making things worse?

Tuesday, February 5, 2019

The Unexpected Appearance of Schlemiel, the Painter


;; The Unexpected Appearance of Schlemiel, the Painter

;; So, it is clear that my model of how things are done was badly broken

;; I am doing some statistics, one day, and so I define:

;; the average of a finite sequence
(defn average [sq] (/ (reduce + sq) (count sq)))

;; and the square of a number
(defn square [x] (* x x))

;; and a way of forgetting about all the fiddly little digits at the end
(defn twosf   [x]  (float (/ (Math/round (* x 100.0)) 100))) 

;; but for the variance I am a little torn between:
(defn variance-one [sq]
  (let [av (average sq)]
    (average (map #(square (- % av)) sq))))

;; ;

(defn variance-two [sq]
  (let [sqdiff #(square (- % (average sq)))]
    (average (map  sqdiff sq))))

;; and (I have a regrettable weakness for the terse...) 
(defn variance-one-liner [sq] (average (map #(square (- % (average sq))) sq)))

;; but what I am not expecting, is this: 

(let [s (repeatedly 1000 #(rand))]
  (twosf (reduce + s)) ;; just to force the sequence to be generated before timing things
  [(time (twosf (reduce + s)))
   (time (twosf (average  s)))
   (time (twosf (variance-one s)))
   (time (twosf (variance-two s)))
   (time (twosf (variance-one-liner s)))])

;; "Elapsed time: 0.535715 msecs"
;; "Elapsed time: 0.834523 msecs"
;; "Elapsed time: 1.417108 msecs"
;; "Elapsed time: 251.650722 msecs"
;; "Elapsed time: 248.196331 msecs"
;; [496.83 0.5 0.09 0.09 0.09]


;; It seems that all these functions are correct, in the sense that they are producing
;; correct-looking answers, and yet:

;; It seems that variance-one is doing what I expect, running down the sequence twice and ending up
;; taking about twice as long as averaging it.

;; But that the other two are taking hundreds of times longer, possibly because they are
;; re-calculating the average of the sequence every time.

;; I had a nice hour or so, thinking about what was going on here, and why, and wonder if you might
;; enjoy the same thoughts, dear readers.

Thursday, December 13, 2018

Reinforcement Learning : Exploration vs Exploitation : Multi-Armed Bandits


;; Reinforcement Learning : Exploration vs Exploitation : Multi-Armed Bandits

;; I'm reading the excellent:

;; Reinforcement Learning: An Introduction
;; by Richard S. Sutton and Andrew G. Barto

;; The book's website, on which is available a complete pdf, is here:
;; http://www.incompleteideas.net/book/the-book.html

;; In Chapter 2, they introduce multi-armed bandits as a simplified model problem 

;; On the basis that you don't understand anything you can't explain to a computer, I thought I'd code it up:

;; Here is a 2 armed bandit
(defn bandit [action]
  (case action
    :arms? [:right :left]
    :right (if (< (rand) 0.5) 4 0)
    :left (if (< (rand) 0.2) 5 0)
    :oops!!))

;; We can ask it how many arms it's got, and what they're called
(bandit :arms?) ; [:right :left]

;; And we can pull those arms. Rewards are variable.
(bandit :right) ; 4 ; 4 ; 4 ; 0 ; 0 ; 0 ; 0
(bandit :left) ; 5 ; 0 ; 0 ; 0 ; 5 ; 0 ; 5 ; 0

;; Once we pull an arm, we'll have an action/reward pair
(bandit :right) ; 4
;; the pair would be:
[:right 4]

;; Here's a function that yanks an arm at random, and gives us such a pair
(defn random-yank [bandit]
  (let [a (rand-nth (bandit :arms?))]
    [a (bandit a)]))

(random-yank bandit) ; [:left 0]
(random-yank bandit) ; [:right 4] 

;; And a utility function to take the average of a sequence. We need to be able to provide a default value if the sequence is empty.
(defn average
  ([seq default] (if (empty? seq) default (/ (reduce + seq) (count seq))))
  ([seq] (average seq 0)))

;; with some examples
(average [1 2 3 4 5]) ; 3
(average (list) 10) ; 10
(average (list 1) 2) ; 1
(average [] 100) ; 100


;; If we just pull arms at random we get an average reward of about 1.5

(float (average (map second (repeatedly 1000 #(random-yank bandit))))) ; 1.49

;; Since we can see the code for this particular bandit, we know that
;; the expected value of pulling the right arm is 2 (a half-chance of
;; a reward of 4) and the expected reward for the left arm is 0.2*5 = 1

;; So if we were seeking to maximize reward, we'd probably be best to pull the right arm all the time.

(float (average (map bandit (repeat 10000 :right)))) ; 1.9912 
(float (average (map bandit (repeat 10000 :left )))) ; 0.985


;; The interesting question is, if we don't know how the bandit works, how should we design an algorithm that gets the most reward?
;; (Or at least do better than yanking arms at random!)

;; One thing our algorithm is going to have to do is keep some state to record what happens.
;; Let's start by recording the results of all pulls to date:

;; At first, we know nothing, so we can set up a table to represent that we know nothing
(defn initial-state [bandit]
  (into {} (for [k (bandit :arms?)] [k (list)])))

;; We haven't pulled either arm yet
(initial-state bandit) ; {:right (), :left ()}

;; When we get a new action reward/pair, we'll add the result to our state
(defn update-state [state [action reward]]
  (update-in state [action] #(conj % reward)))

;; here are some examples of using update-state
(update-state {:right (), :left ()} [:right 2]) ; {:right (2), :left ()}
(reduce update-state {:right (), :left ()} [[:right 2] [:left 3]  [:right 4] [:right 5]]) ; {:right (5 4 2), :left (3)}

;; here's how we can use it to record the result of ten random yanks
(reduce update-state
        (initial-state bandit)
        (repeatedly 10 #(random-yank bandit))) ; {:right (4 4 0 0 0), :left (0 0 0 0 5)}


;; Once we actually have some data, we can make estimates of the expected rewards

;; mapvals applies a function to every value in a map, returning a new map with the same keys
(defn mapvals [m f] (into {} (for [[k v] m] [k (f v)]))) 

;; examples
(mapvals {} inc) ; {}
(mapvals {:a 1} inc) ; {:a 2}
(mapvals {:a 1, :b 2} inc) ; {:a 2, :b 3}
(mapvals {:a 1, :b 2, :c 3} #(* % %)) ; {:a 1, :b 4, :c 9}


;; In the book, Q_t(a) is the current estimate (at time t)
;; We'll use as our estimate of the value of an action the average value seen so far, or zero if we have no information

(defn Q [state] (mapvals state #(average % 0)))

;; examples
(Q '{:right (5 4 2), :left (3)}) ; {:right 11/3, :left 3}
(Q '{:right (5 4 2), :left ()}) ; {:right 11/3, :left 0}
(Q (initial-state bandit)) ; {:right 0, :left 0} 
(Q (update-state (initial-state bandit) (random-yank bandit))) ; {:right 0, :left 2}

;; let's check that we get roughly what we expect in the long run
(Q (reduce update-state (initial-state bandit)
        (repeatedly 10000 #(random-yank bandit)))) ; {:right 9832/5015, :left 1027/997}


;; If we have estimates of the value of each arm, then a good way to
;; use them is to pull the arm with the highest estimate.

;; This is called 'exploitation', as opposed to 'exploration', which
;; is when you try things you think may be suboptimal in order to get
;; information

;; The 'greedy' action is the one with the highest expected value. Of
;; course there may be more than one greedy action especially at first.

;; To help with this, another utility function:

;; max-keys finds the keys with the highest value in a map, and returns a list with just these keys and values
(defn max-keys [m]
  (let [slist (reverse (sort-by second m)) 
        [_ max] (first slist)]
    (take-while #(= (second %) max) slist)))

;; examples
(max-keys {}) ; ()
(max-keys {1 0}) ; ([1 0])
(max-keys {1 0, 2 0}) ; ([2 0] [1 0])
(max-keys {1 0, 2 1}) ; ([2 1])
(max-keys {1 0, 2 1, 3 -1 , 4 -3, 5 2, 6 2}) ; ([6 2] [5 2])

;; if there is a tie for the greedy action, we can choose at random between the candidates
;; And so we can go from estimates to greedy action like this:
(defn greedy-action [estimates]
  (first (rand-nth (max-keys estimates))))

;; examples
(greedy-action '{:right 10, :left 3}) ; :right
(greedy-action '{:right 10, :left 3 :centre 20}) ; :centre
(greedy-action '{:right 10, :left 3 :centre 3}) ; :right
(greedy-action '{:right 3, :left 3 :centre 3}) ; :right

(greedy-action (Q '{:right (5 4 2), :left (3)})) ; :right
(greedy-action (Q '{:right (), :left (3)})) ; :left
(greedy-action (Q (initial-state bandit))) ; :left

;; after a lot of random pulls, the greedy action should reliably be the one with the highest expected payoff
(greedy-action (Q (reduce update-state (initial-state bandit)
        (repeatedly 10000 #(random-yank bandit))))) ; :right 

;; OK, so we have our stage set, a way of recording what's happened, and some helpful functions defined.


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; Our first try at a learning algorithm will be 'by hand', as it were.

;; We'll always make the 'greedy' choice.

;; At first, we have no records to go on
(initial-state bandit) ; {:right (), :left ()}

;; expected values for both levers are therefore zero
(Q (initial-state bandit)) ; {:right 0, :left 0}

;; so the greedy action will get chosen at random
(greedy-action (Q (initial-state bandit))) ; :left

;; in this case, we've chosen :left, and the bandit's response is
(bandit :left) ; 0 

;; we record it
(update-state (initial-state bandit) [:left 0]) ; {:right (), :left (0)}

;; and we have a new state
'{:right (), :left (0)}

;; new estimates
(Q '{:right (), :left (0)}) ; {:right 0, :left 0}

;; and again, we choose at random
(greedy-action (Q '{:right (), :left (0)})) ; :left

;; the bandit is not feeling very generous
(bandit :left) ; 0

(update-state '{:right (), :left (0)} [:left 0]) ; {:right (), :left (0 0)}

;; new state:
'{:right (), :left (0 0)}

;; new estimates
(Q '{:right (), :left (0 0)}) ; {:right 0, :left 0}

;; this time we choose :right
(greedy-action (Q '{:right (), :left (0 0)})) ; :right

;; and the bandit pays out! 
(bandit :right) ; 4

(update-state '{:right (), :left (0 0)} [:right 4]) ; {:right (4), :left (0 0)}

;; the greedy action will be :right now, because we have evidence that right is better.
(greedy-action (Q '{:right (4), :left (0 0)})) ; :right

;; You get the idea......

;; Let's automate that....

;; Given a state and a bandit, we decide an action and the bandit
;; responds, producing an action/reward pair, and a new state

(defn greedy-algorithm [bandit state]
  (let [action (greedy-action (Q state))
        reward (bandit action)]
    [[action reward] (update-state state [action reward])]))


(greedy-algorithm bandit (initial-state bandit)) ; [[:left 0] {:right (), :left (0)}]

;; To get something we can iterate:

(defn step [[[a r] state]]
  (greedy-algorithm bandit state))

(iterate step [ [:dummy :dummy] (initial-state bandit)])

;; ([[:dummy :dummy] {:right (), :left ()}]
;;  [[:left 5] {:right (), :left (5)}]
;;  [[:left 0] {:right (), :left (0 5)}]
;;  [[:left 0] {:right (), :left (0 0 5)}]
;;  [[:left 0] {:right (), :left (0 0 0 5)}]
;;  [[:left 0] {:right (), :left (0 0 0 0 5)}]
;;  [[:left 0] {:right (), :left (0 0 0 0 0 5)}]
;;  [[:left 0] {:right (), :left (0 0 0 0 0 0 5)}]
;;  [[:left 0] {:right (), :left (0 0 0 0 0 0 0 5)}]
;;  [[:left 0] {:right (), :left (0 0 0 0 0 0 0 0 5)}]
;;  [[:left 0] {:right (), :left (0 0 0 0 0 0 0 0 0 5)}]
;;  [[:left 0] {:right (), :left (0 0 0 0 0 0 0 0 0 0 5)}]
;;  [[:left 5] {:right (), :left (5 0 0 0 0 0 0 0 0 0 0 5)}]
;;  [[:left 0] {:right (), :left (0 5 0 0 0 0 0 0 0 0 0 0 5)}]
;;  [[:left 0] {:right (), :left (0 0 5 0 0 0 0 0 0 0 0 0 0 5)}]
;;  [[:left 0] {:right (), :left (0 0 0 5 0 0 0 0 0 0 0 0 0 0 5)}]
;;  [[:left 0] {:right (), :left (0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 5)}]

;; In this case, the greedy algorithm happens to get a payout on its
;; first try, and decides that it will pull that arm for ever. It
;; never even tries the other arm.

;; Try again:

(iterate step [ [:dummy :dummy] (initial-state bandit)])
;;([[:dummy :dummy] {:right (), :left ()}]
;;  [[:right 0] {:right (0), :left ()}]
;;  [[:right 0] {:right (0 0), :left ()}]
;;  [[:left 0] {:right (0 0), :left (0)}]
;;  [[:right 4] {:right (4 0 0), :left (0)}]
;;  [[:right 4] {:right (4 4 0 0), :left (0)}]
;;  [[:right 4] {:right (4 4 4 0 0), :left (0)}]
;;  [[:right 4] {:right (4 4 4 4 0 0), :left (0)}]
;;  [[:right 4] {:right (4 4 4 4 4 0 0), :left (0)}]
;;  [[:right 4] {:right (4 4 4 4 4 4 0 0), :left (0)}]
;;  [[:right 0] {:right (0 4 4 4 4 4 4 0 0), :left (0)}]
;;  [[:right 0] {:right (0 0 4 4 4 4 4 4 0 0), :left (0)}]
;;  [[:right 4] {:right (4 0 0 4 4 4 4 4 4 0 0), :left (0)}]
;;  [[:right 0] {:right (0 4 0 0 4 4 4 4 4 4 0 0), :left (0)}]
;;  [[:right 4] {:right (4 0 4 0 0 4 4 4 4 4 4 0 0), :left (0)}]
;;  [[:right 0] {:right (0 4 0 4 0 0 4 4 4 4 4 4 0 0), :left (0)}]

;; In this case, it tried the right arm a couple of times, then had a
;; go with the left arm, then went back to the right arm, won a
;; payout, and then got hung up on pulling the right arm repeatedly.



;; We've got a couple of problems here!

;; First is that the algorithm has clearly got into a state where it
;; always pulls the left arm (in the first case), and the right
;; arm (in the second case).

;; It can't be doing the right thing in both cases.

;; Secondly the state is growing linearly, as the algorithm remembers
;; all previous results. That's giving us algorithmic complexity
;; problems and the calculation will get slower and slower, and
;; eventually run out of memory.













Wednesday, October 5, 2016

Contract Programmer Seeks Job in Cambridge (£500 reward)

Anyone in Cambridge need a programmer? I'll give you £500 if you can find me a job that I take.

CV at http://www.aspden.com

I make my usual promise, which I have paid out on several times:

If, within the next six months, I take a job which lasts longer than one month, and that is not obtained through an agency, then on the day the first cheque from that job cashes, I'll give £500 to the person who provided the crucial introduction.

If there are a number of people involved somehow, then I'll apportion it fairly between them. And if the timing conditions above are not quite met, or someone points me at a shorter contract which the £500 penalty makes not worth taking, then I'll do something fair and proportional anyway.

And this offer applies even to personal friends, and to old contacts whom I have not got round to calling yet, and to people who are themselves offering work, because why wouldn't it?

And obviously if I find one through my own efforts then I'll keep the money. But my word is generally thought to be good, and I have made a public promise on my own blog to this effect, so if I cheat you you can blacken my name and ruin my reputation for honesty, which is worth much more to me than £500.



And I also make the following boast:

I know all styles of programming and many languages, and can use any computer language you're likely to use as it was intended to be used.

I have a particular facility with mathematical concepts and algorithms of all kinds. I can become very interested in almost any problem which is hard enough that I can't solve it easily.

I have a deserved reputation for being able to produce heavily optimised, but nevertheless bug-free and readable code, but I also know how to hack together sloppy, bug-ridden prototypes, and I know which style is appropriate when, and how to slide along the continuum between them.

I've worked in telecoms, commercial research, banking, university research, chip design, server virtualization, university teaching, sports physics, a couple of startups, and occasionally completely alone.

I've worked on many sizes of machine. I've written programs for tiny 8-bit microcontrollers and gigantic servers, and once upon a time every IBM machine in the Maths Department in Imperial College was running my partial differential equation solvers in parallel in the background.

I'm smart and I get things done. I'm confident enough in my own abilities that if I can't do something I admit it and find someone who can.

I know what it means to understand a thing, and I know when I know something. If I understand a thing then I can usually find a way to communicate it to other people. If other people understand a thing even vaguely I can usually extract the ideas from them and work out which bits make sense.

Tuesday, September 13, 2016

Anyone work for Google?

Hi, I don't suppose anyone reading this works for Google?

Years ago, I registered the domain name learningclojure.com through blogger. It's auto-renewed ever since.

They keep e-mailing my personal gmail account to tell me that the domain name has expired and needs renewing, but there's no way to renew it as far as I can tell.

They want me to use the 'administrator account' for the domain. I've never had any such account!

And it seems to be deliberately impossible to get in touch with Google by any method.

If anyone can help I'd really appreciate it!

Sunday, December 6, 2015

Define Macro


;; When I say

(def a (* 4 5)) ;-> #'user/a

;; I'd rather the repl told me that I'd just assigned the value 20 to
;; something rather than that I'd just assigned something to user/a

a ;-> 20

;; A macro for this is easy

(defmacro define [var expr]
  `(let [tmp# ~expr]
     (def ~var tmp#) tmp#))

(define a (* 4 5)) ;-> 20

;; I'd also like to be able to say, in a scheme-like manner

'(define (square x) (* x x))

;; meaning
(defn square [x] (* x x)) ; #'user/square

;; So I can modify my define macro so:

(defmacro define [var expr]
  (cond (symbol? var)  `(let [tmp# ~expr] (def ~var tmp#) tmp#)
        (list? var)  `(defn ~(first var) ~(into [] (rest var)) ~expr)))

(macroexpand '(define a (* 20 20))) ;-> (let* [tmp__1986__auto__ (* 20 20)] (def a tmp__1986__auto__) tmp__1986__auto__)
(macroexpand '(define (square x) (* x x))) ;-> (def square (clojure.core/fn ([x] (* x x))))

(define a (* 20 20)) ; 400
(define (cube x) (* x x x)) ;-> #'user/cube
(cube 20) ;-> 8000

;; I must say, I haven't actually tried this in practice yet, but it looks like it might work

(define (random-error)
  (+ (rand) -0.5)) ;-> #'user/random-error

(random-error) ;-> -0.28442993770155234
(random-error) ;-> 0.2911519817783499
(random-error) ;-> -0.4037254523155406

(define bell (/ (reduce + (repeatedly 10 random-error)) 10)) ;-> 0.015416035491431623

;; I'm sure someone will let me know if it's broken.

;; This is a bit sub-optimal:
(define square (fn[x] (* x x))) ; #function[user/eval8586$tmp--8558--auto----8587]

;; Any suggestions for improvements?

Sunday, November 29, 2015

Contract Programmer Seeks Job in Cambridge (£500 reward)

I'm looking for a job, and as usual I'll pay a commission of £500 to anyone who can find me a good one.

Details here: http://johnlawrenceaspden.blogspot.co.uk/2015/11/contract-programmer-seeks-job-in.html

CV here: http://www.aspden.com

Obviously Clojure is a speciality, and I'd love a Clojure job, but I can program in all styles.

My other favourite languages are C, Python, and ML. All of them seem to hit a sweet spot where they're just the best tool for certain tasks.

If you're a local company using Java, who might be interested in giving Clojure a try, I'd love to try to show you what all the fuss is about.

Followers