Turing understood the critical differences between the terms in the title. And he understood their differences relevance to the design of cryptanalysis methods.

We already saw how in affine combination spaces one can calculate likelihood functions in terms of log addition. That is, calculate in the algebra of likelihoods (rather than in the algebra of frequency probabilities).

Any gambler should know that in a run of similar results that there is , despite the run, no memory property in the frequencies observed. Thus likelihood is clearly distinguished from frequency.

But what is information and how did Turing apply it to cryptanalysis – knowing it to be yet further distinguished from probabilities and likelihoods both?

The answer lies in the Monte Carlo example. While an unbiased but unbalanced roulette wheel may offer exploitable information about the frequencies observed (the outcomes) it still does not change the likelihood of any particular event.

so what is the exploit value of information?

In the roulette wheel case, it tells you how long you must play the runs before the unbalacedness will show up in the statistics … And improve your return.

The property tells one something about the underlying mechanisms – generating the likelihoods. It tells one about the underlying group and its symmetries ;or lack thereof) used to generate a run, rather than run properties. In roulette wheel terms, it tells about unbalancedness of its mechanics.

In modern cipher design one learns about those boolean functions which are both balanced and non linear (being at a distance as far as possible from specifically affine combination functions.)

Or in pure information science terms, there must be no information from the run about the generators of the likelihoods.

And just as there must be no information flow from the product about the design So there must be no further meta information from the design about the designers (and their intellectual background). The secret of the cipher is not in the algorithm but in how the design impacts complexity.

The easiest way to think about information is to think about the complexity of the machine needed to compute the probability space. Known in the 1930, folks designs ciphers to need more compute power than was available – as measured by the information properties.

Now apply that to 1970s des where dengineers designed resistance to only a certain quantity of mechanical

The last point to note is that the tractability of non linear mappings of spaces (that appear to resist diffrential cryptanalysis) can depend on whether they are generated by linear like vector spaces (keyed by cryptographic parameters). Just as the exponential map links probability spaces to likelihood spaces, so lie groups are related to lie algebras through the exponentiation of (lie) matrices