Comments on "Learning Clojure: Gibbs Sampler: Clojure vs Julia, Java, C, MATLAB, Scala and Python" (blog by John Lawrence Aspden)

Anonymous (2014-06-07 07:24)

Bit late to this. Once the Clojure loops are written imperatively using pre-allocated double arrays, the Clojure speed is about the same as Java, and in my tests takes about 45% longer than the code posted by Edmund (who pre-allocates the array). The following library has some good Java PRNGs: http://www.iro.umontreal.ca/~simardr/ssj/indexe.html. I timed the following code at about 45% longer than the Julia:

```clojure
(import (umontreal.iro.lecuyer.rng MT19937 MRG32k3a))
(import (umontreal.iro.lecuyer.randvar NormalGen GammaAcceptanceRejectionGen))

(def SSJrngEngine (MT19937. (MRG32k3a.)))

(defn samples-loop [^long N ^long thin]
  (loop [x 0.0 y 0.0 j thin i N acc (make-array Double/TYPE N 2)]
    (let [x (* (GammaAcceptanceRejectionGen/nextDouble SSJrngEngine 3.0 1.0)
               (+ (* y y) 4.0))
          y (+ (/ (NormalGen/nextDouble SSJrngEngine 0.0 1.0)
                  (Math/sqrt (+ 2.0 (* 2.0 x))))
               (/ 1.0 (+ 1.0 x)))]
      (if (> j 0)
        (recur x y (dec j) i acc)
        (if (> i 0)
          (do (let [^doubles dacc (aget ^objects acc (- N i))]
                (aset dacc 0 x)
                (aset dacc 1 y))
              (recur x y thin (dec i) acc))
          acc)))))  ; return the filled array when i reaches 0
```

Since gamma PRNGs call Gaussian and uniform PRNGs, this story largely reflects the speed of pure Java vs C uniform and normal PRNGs (the PRNG calls dominate the numerical loops). The expensive parts of the Julia uniform and Gaussian PRNGs are coded in C: https://github.com/JuliaLang/Rmath/blob/master/src/randmtzig.c.
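(Editor's aside: the point that gamma draws bottom out in Gaussian and uniform draws can be seen in a few lines of plain Java. The Marsaglia-Tsang rejection sampler below is a standard stand-in for SSJ's GammaAcceptanceRejectionGen, not the code the commenter timed; the Gibbs update step matches the one used throughout the thread.)

```java
import java.util.Random;

public class GammaSampler {
    // Marsaglia-Tsang "squeeze" method for Gamma(shape, 1), shape >= 1.
    // Each draw consumes at least one Gaussian and one uniform variate,
    // which is why gamma sampling speed tracks the underlying PRNG speed.
    static double gamma(Random rng, double shape) {
        double d = shape - 1.0 / 3.0;
        double c = 1.0 / Math.sqrt(9.0 * d);
        while (true) {
            double x = rng.nextGaussian();
            double v = 1.0 + c * x;
            if (v <= 0.0) continue;               // reject: cube root undefined region
            v = v * v * v;
            double u = rng.nextDouble();
            if (u < 1.0 - 0.0331 * x * x * x * x) return d * v;   // fast squeeze accept
            if (Math.log(u) < 0.5 * x * x + d * (1.0 - v + Math.log(v))) return d * v;
        }
    }

    public static void main(String[] args) {
        Random rng = new Random(42);              // fixed seed for reproducibility
        // One Gibbs update of the pair (x, y), as in the thread's loops:
        double y = 0.0;
        double x = gamma(rng, 3.0) * (y * y + 4.0);
        y = 1.0 / (x + 1.0) + rng.nextGaussian() / Math.sqrt(2.0 * (x + 1.0));
        System.out.println(x + " " + y);
    }
}
```
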
It's not surprising that the pure Java version takes about 45% longer, which is roughly the original Java vs C comparison. There are also algorithmic differences between e.g. Julia's C Gaussian PRNG and the various Java ones. This one is well regarded in finance: http://home.online.no/~pjacklam/notes/invnorm/#Java

A multivariate Gaussian comparison would be interesting, given that Clojure matrices are a recent development while Julia has a matrix focus.

Anonymous (2014-03-31 13:06)

You probably timed Julia's start-up time too. It's pretty funny how Julia is hyped for performance but the start-up is so slow.

Dmitry Groshev (2013-09-16 12:41)

There is also an additional quirk if you use Leiningen: https://github.com/technomancy/leiningen/wiki/Faster#tiered-compilation By default, Leiningen uses some JVM options that hurt performance.

Dmitry Groshev (2013-09-16 12:39)

You can squeeze a little more performance out with the following:
- (set! *unchecked-math* true)
- ensure that "thin" and "N" are cast to longs instead of being Objects.
  In your last example they are Objects.
- introduce a Pair type instead of constructing a vector every time.

This gives an additional 5-10% in my tests.

Edmund (2013-08-21 09:04)

Indeed, it might sink your boat!

Edmund (2013-08-21 09:04)

It's so heavy to carry you'd never make good your getaway :)

John Lawrence Aspden (2013-08-21 00:49)

Probably should, but there's not a great deal of noise in the signal from that. It would also be a good idea to force garbage collection and to give the code enough runs that the JIT has time to optimize it, both of which are noticeable effects. And there are bound to be other factors.

But to be honest, I don't really care about anything but the order of magnitude! Looks like I can get two of those by stealing Ed's laptop.

John Lawrence Aspden (2013-08-21 00:45)

Is your laptop really 90x faster than my netbook? What the hell?
I thought Moore's Law had stopped?

Anonymous (2013-08-20 18:36)

For comparisons like this, why not seed the RNG with the same constant every time?

Edmund (2013-08-20 17:36)

I get to be the baddie - love it.
On my laptop (time (last (samples-loop 50000))) takes ~1750 msec.

Lifting and updating from http://dmbates.blogspot.co.uk/2012/05/simple-gibbs-example-in-julia.html, here's the equivalent Julia code, i.e. no funny tricks with parallelism:

```julia
using Distributions

function JGibbs(N::Int, thin::Int)
    mat = Array(Float64, (N, 2))
    x = 0.
    y = 0.
    for i = 1:N
        for j = 1:thin
            x = rand(Gamma(3)) * (y*y + 4)
            y = 1/(x + 1) + randn()/sqrt(2(x + 1))
        end
        mat[i,:] = [x, y]
    end
    mat
end

@elapsed JGibbs(50000, 200)
```

That takes 680 msec, which is hard to argue with.

Brother Jackson

Alex Ott (2013-08-20 17:01)

Instead of Parallel Colt you can look at Clatrix, which wraps JBlas...

Baishampayan (2013-08-20 08:44)

Brilliant as always, John. Kudos.
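(Editor's aside: two benchmarking suggestions made in the thread, seeding the RNG with a constant and giving the JIT warm-up runs plus a GC request before timing, can be folded into a small harness. A minimal Java sketch; the `kernel` here is a hypothetical stand-in workload, and the Gibbs loop being compared would be substituted for it.)

```java
import java.util.Random;

public class Bench {
    // Stand-in workload; replace with the Gibbs loop under test.
    static double kernel(Random rng, int n) {
        double acc = 0.0;
        for (int i = 0; i < n; i++) acc += rng.nextGaussian();
        return acc;
    }

    public static void main(String[] args) {
        long seed = 12345L;                                   // same constant seed every run
        for (int w = 0; w < 5; w++)
            kernel(new Random(seed), 1_000_000);              // warm-up so the JIT can optimize
        System.gc();                                          // request GC before timing (best effort)
        long t0 = System.nanoTime();
        double r = kernel(new Random(seed), 1_000_000);
        long ms = (System.nanoTime() - t0) / 1_000_000;
        System.out.println("result=" + r + " ms=" + ms);
    }
}
```

With a fixed seed the timed kernel is deterministic, so run-to-run differences reflect the machine and the JIT rather than different random streams.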