Sunday 27 April 2014

$RANDOM

I recently discovered a variable in BASH which is going to stick as one of my favourites (everybody has one): the $RANDOM variable.

Each time it is read, it returns a pseudorandom integer between 0 and 32767 (inclusive). This is an awkward choice of limits, but the range can be adjusted by taking a modulus:

$ echo $((RANDOM%2))

outputs 0 or 1, or:

$ echo $((RANDOM%10))

outputs a number between 0 and 9.
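
The same trick extends to any range by adding an offset. Here is a minimal sketch; rand_between is a name I made up for illustration, not something BASH provides. One caveat: because 32768 is not a multiple of 10, RANDOM%10 is very slightly biased (the results 0 to 7 each come from 3277 of the 32768 possible values, while 8 and 9 come from 3276), which is negligible for playing around but worth knowing.

```shell
#!/usr/bin/env bash
# rand_between LO HI: print a pseudorandom integer in [LO, HI], both inclusive.
# Hypothetical helper for illustration; assumes HI - LO is smaller than 32768.
rand_between() {
    local lo=$1 hi=$2
    local span=$(( hi - lo + 1 ))     # how many distinct values we want
    echo $(( lo + RANDOM % span ))
}

rand_between 1 6      # a die roll
rand_between 50 59    # a number in the fifties
```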

This inspired me to do some simple calculations on random numbers, which you can find in my other blog post.
If you want the details of what I did and how (apologies if most of it is just quick and dirty scripting), keep reading!


I produced the random distribution of points in 2D by creating two lists of numbers between 0 and 1 (the second is just a shuffled copy of the first), joining them and using gnuplot to plot them:

$ for i in {1..5000} ; do echo "$((RANDOM%10000)) / 10000" | bc -l; done > random.txt
$ sort --random-sort random.txt > random2.txt
$ paste random.txt random2.txt > randomjoin.txt
gnuplot> plot 'randomjoin.txt' using 1:2 pt 7 ps 0.4 notitle
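
Since the y values above are a reshuffle of the x values, the two columns are not independent draws. If that matters, a possible alternative is to draw both coordinates fresh on each line. This is only a sketch, not what I used for the plots; random_points is a made-up name, and printf does the zero padding so bc is not needed.

```shell
#!/usr/bin/env bash
# random_points N: print N lines, each holding two separately drawn
# coordinates in [0, 1) with four decimal places.
# Hypothetical helper for illustration.
random_points() {
    local n=$1 i
    for (( i = 0; i < n; i++ )); do
        # %04d pads the fractional part with zeros, e.g. 42 becomes 0.0042
        printf '0.%04d 0.%04d\n' $(( RANDOM % 10000 )) $(( RANDOM % 10000 ))
    done
}

random_points 3    # print a small sample
```

Redirecting the output of random_points 5000 to a file gives something that can be plotted with the same gnuplot command.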


Funnily enough, the uniform distribution - which I generated by applying a small "wobble" to a grid of evenly spaced points - was more complicated to produce:

$ for i in {0..70} ; do for j in {0..70}; do echo "$i $j" ; done ; done | awk '{print $1/70 " " $2/70}' > uniform.txt
$ sed 's/\./\.00/g' random.txt > wobble1.txt
$ sed 's/\./\.00/g' random2.txt > wobble2.txt
$ paste wobble1.txt wobble2.txt > wobblejoin.txt
$ paste uniform.txt wobblejoin.txt > uniformjoin.txt
$ awk '{print $1-$3 " " $2-$4}' uniformjoin.txt > uniformwobble.txt
gnuplot> plot 'uniformwobble.txt' using 1:2 pt 7 ps 0.4 notitle


The 1D plots were generated from the same data, with some re-binning tricks in gnuplot:

gnuplot> binwidth=0.1
gnuplot> bin(x,width)=width*floor(x/width)
gnuplot> plot 'random.txt' using (bin($1,binwidth)):(1.0) smooth freq with boxes notitle
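
To see the numbers behind the histogram, the same binning can be reproduced outside gnuplot. The sketch below does it with awk; bin_counts is a made-up name, and it assumes non-negative values in the first column. As with any floating-point binning, a value sitting exactly on a bin edge may end up in the neighbouring bin.

```shell
#!/usr/bin/env bash
# bin_counts WIDTH FILE: print "bin_start count" for the first column of FILE,
# using the same rule as the gnuplot function: bin = WIDTH * floor(x / WIDTH).
# Hypothetical helper for illustration; assumes the values are >= 0.
bin_counts() {
    awk -v width="$1" '
        { bins[int($1 / width)]++ }   # int() truncates, which equals floor() for x >= 0
        END { for (b in bins) printf "%g %d\n", b * width, bins[b] }
    ' "$2" | sort -n                  # awk array order is arbitrary, so sort by bin start
}

# Example: bin_counts 0.1 random.txt
```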



For the coin tosses example, I generated the sequences using some BASH:

$ for i in {1..100000} ; do for j in {1..15}; do echo -n $((RANDOM%2)); done | sed 's/0/H/g' | sed 's/1/T/g' && echo; done > tosses.txt

I then counted the sequences containing a run of at least three identical tosses (or rather, counted the sequences without one and subtracted that from the total):

$ echo "100000 - `grep -v 'HHH' tosses.txt | grep -v 'TTT' | wc -l`" | bc -l
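
The subtraction can be avoided by matching the runs directly, which should give the same number. A small sketch (count_runs is a made-up name), which also makes it easy to try other run lengths:

```shell
#!/usr/bin/env bash
# count_runs K FILE: count the lines of FILE that contain at least K
# identical tosses in a row (K heads or K tails).
# Hypothetical helper for illustration.
count_runs() {
    local k=$1 file=$2
    # -E enables the {K} repetition syntax, -c prints the number of matching lines
    grep -c -E "H{$k}|T{$k}" "$file"
}

# Example: count_runs 3 tosses.txt
```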

As a bonus, the same set can also be used to explore the number of heads (or tails) in each sequence. Given that a fair coin has a 50% chance of landing heads (and 50% tails), most sequences should contain about half heads and half tails. Listed below is the number of sequences (first column) containing a given number of heads (second column):

$ while read line ; do echo $line | grep -o H | wc -l ; done < tosses.txt > freq.txt
$ sort -n freq.txt | uniq -c


      3 0
     51 1
    312 2
   1359 3
   4189 4
   9105 5
  15338 6
  19608 7
  19806 8
  15175 9
   9145 10
   4095 11
   1393 12
    362 13
     55 14
      4 15

This is a binomial distribution (n = 15 tosses, p = 0.5) centered around 7.5, as expected.
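
To check how close the counts are to the theory, the expected number of sequences with k heads out of n tosses is trials * C(n, k) / 2^n. The sketch below computes that with plain BASH integer arithmetic; expected_heads is a made-up name, and the results are rounded to the nearest integer.

```shell
#!/usr/bin/env bash
# expected_heads N TRIALS: print "k expected_count" for k = 0..N, where
# expected_count = TRIALS * C(N, k) / 2^N, rounded to the nearest integer.
# Hypothetical helper for illustration; assumes a fair coin.
expected_heads() {
    local n=$1 trials=$2 k
    local c=1                       # C(n, 0)
    local total=$(( 1 << n ))       # 2^n equally likely sequences
    for (( k = 0; k <= n; k++ )); do
        # adding total/2 before the integer division rounds to nearest
        echo "$k $(( (trials * c + total / 2) / total ))"
        c=$(( c * (n - k) / (k + 1) ))   # C(n, k+1) from C(n, k); this division is exact
    done
}

expected_heads 15 100000
```

For 7 heads this gives 19638, against the 19608 I got from the simulated tosses.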
