Saturday 16 March 2013

Irrational approaches

Nobody knows what to call people like me. Bioinformatician? Bioinformaticist? Sir? I'm satisfied enough to be a binfus, which has happy overtones of doofus. To every trade its tool-box, and one of the tools I use almost every day to make sense of our evolutionary world is sequence alignment. Here, for example, are the first 40 amino acids of IL6, an important immune protein, from 3 different species:
[Image: alignment of the first 40 amino acids of IL6 from human, rat and mouse]

You can see that the rat and mouse sequences are almost identical (yellow and grey) and there is only one case where the human state is identical to one of the rodent species (blue) but not the other. You can use this sort of analysis to infer that rat and mouse had a more recent common ancestor than either did with primates like Archbishop Ramsey.

So far, so standard. But what happens if you apply this particular angle-grinder to the troika of famous irrational numbers (FINs)? That's what I did with the first 200 digits of Phi (1.618 etc; the golden section), e (2.718 etc; Euler's number; Napier's constant; the base of natural logarithms) and Pi (3.141 etc; the ratio between the circumference and diameter of a circle):

[Image: alignment of the first 200 digits of Phi, e and Pi]


What I'm asking here is whether the order of decimal digits in these irrational numbers shows a phylogenetic relationship, similar to rats and mice vs humans. And it seems that it does! If you align 100 random digits with another 100 random digits, you expect 1/10 of the positions to be identical. So the expected value for 200 “pair-wise” aligned random digits is 20 identities. The more stringent case, where the digit is identical in all three numbers, is expected at 1/100 of the possible positions (10 possible digits, each with a 1/1000 chance of turning up three times), which makes 2 in 200 and, as expected, we have 2 (grey) examples above. I find that there are 28 cases where we have identity between Phi and Pi; 19 cases where Phi and e are identical; but only 12 where Pi and e are the same. In each case the expectation is 20 IDs, but there are more than twice as many Phi/Pi identities (green) as there are e/Pi identities (blue). Yer, yer, the statisticians cry; but it’s a small sample and you’d expect these stochastic fluctuations. So I did a ChiSq test, with 2 degrees of freedom, and the departure from expectation is more than you’d expect by chance at the 5% level (ChiSq = 6.64, critical ChiSq = 5.99).
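For anyone who wants to check the arithmetic, here is a minimal Python sketch of that test using only the three counts quoted above. It assumes the plain Pearson formula with a flat expectation of 20 identities per pair and keeps the 2 degrees of freedom chosen above; the variable names are mine. On those assumptions the statistic works out at 6.45, close to but not identical to the 6.64 quoted, and still on the far side of 5.99.

```python
import math

# Identity counts reported for the three pairwise comparisons.
observed = {"Phi/Pi": 28, "Phi/e": 19, "e/Pi": 12}

N_POSITIONS = 200              # aligned digit positions
expected = N_POSITIONS / 10    # two random decimal digits agree 1 time in 10

# Pearson's statistic: sum over pairs of (observed - expected)^2 / expected.
chi_sq = sum((o - expected) ** 2 / expected for o in observed.values())

# With 2 degrees of freedom the chi-square tail has a closed form:
# P(X > x) = exp(-x / 2), so the critical value is -2 * ln(alpha).
ALPHA = 0.05
critical = -2 * math.log(ALPHA)
p_value = math.exp(-chi_sq / 2)

# All three digits agree with probability 10 * (1/10)^3 = 1/100.
expected_triples = N_POSITIONS * 10 / 10 ** 3

print(f"expected identities per pair: {expected:.0f}")
print(f"expected three-way identities: {expected_triples:.0f}")
print(f"chi-square = {chi_sq:.2f}, critical = {critical:.2f}, p = {p_value:.3f}")
```

The closed form only holds for 2 degrees of freedom; for any other choice you would need a proper chi-square table or a statistics library.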

So let’s call this Bob’s Conjecture and leave it there for now. A “trivial” analysis (N = 200 × 3) like this is, for a crap programmer like me, more efficient to do by hand. The next step is to crank up the length of the aligned digit sequence to 1000, then 1 million digits, and to see whether the conjecture holds for octal and binary representations of these numbers. Over to you, readers. Goldbach's conjecture has been waiting to be proved, or a counter-example found, since June 1742.
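As a starting point for that next step, here is a rough Python sketch that generates the digits and counts position-by-position identities for any length. It is not the by-hand method used above. It assumes that "the first N digits" means the N digits after the decimal point, it uses the standard library's decimal module (with the Pi series from the recipes in that module's documentation), and the function names are hypothetical. It handles decimal digits only, not octal or binary.

```python
from decimal import Decimal, localcontext
from itertools import combinations

GUARD = 20  # extra working digits so the last digits kept are not rounding noise


def _pi():
    """Pi to the active context's precision (series from the decimal docs recipes)."""
    lasts, t, s, n, na, d, da = 0, Decimal(3), 3, 1, 0, 0, 24
    while s != lasts:
        lasts = s
        n, na = n + na, na + 8
        d, da = d + da, da + 32
        t = (t * n) / d
        s += t
    return s


def digits_of(name, n):
    """Return the first n digits after the decimal point of Phi, e or Pi."""
    with localcontext() as ctx:
        ctx.prec = n + GUARD
        if name == "Phi":
            x = (1 + Decimal(5).sqrt()) / 2
        elif name == "e":
            x = Decimal(1).exp()
        elif name == "Pi":
            x = _pi()
        else:
            raise ValueError(f"unknown constant: {name}")
        return str(x).split(".")[1][:n]


def identities(a, b):
    """Count aligned positions where two digit strings carry the same digit."""
    return sum(x == y for x, y in zip(a, b))


def pairwise_identities(n):
    """Identity counts for each pair of constants over the first n digits."""
    seqs = {name: digits_of(name, n) for name in ("Phi", "e", "Pi")}
    return {(p, q): identities(seqs[p], seqs[q]) for p, q in combinations(seqs, 2)}


for n in (200, 1000):
    print(f"n = {n}, expected identities per pair = {n / 10:.0f}")
    for (p, q), count in pairwise_identities(n).items():
        print(f"  {p}/{q}: {count}")
```

This is comfortable for a thousand digits or so. The Pi series here converges too slowly to be a sensible route to a million digits, where a dedicated arbitrary-precision library would be the better tool.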
