To understand the argument below we need a clear mental model of p-norms:

Fortunately, it's easy (being entirely about taxi drivers in New York):

The Wikipedia article makes it clear that p-norms (such as the 1-norm…) are just generalizations of the entirely intuitive process of measuring distance, as seen by a taxi driver. While the crow-flies distance is intuitive, a taxi driver in New York obviously cannot fly so, and must go around corners instead. The two street legs along which the taxi actually drives form a triangle whose hypotenuse is the apparent "crow-flies distance": the squared length of that hypotenuse is the sum of the squared lengths of the two streets, whereas the taxi's own L1 distance is simply the sum of the street lengths themselves.
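A minimal numeric sketch of the two ways of measuring (the corner coordinates are made up, in city blocks):

```python
import math

# Two street corners in a Manhattan-style grid (hypothetical coordinates:
# blocks east, blocks north).
a = (0.0, 0.0)
b = (3.0, 4.0)

# Crow-flies (Euclidean, L2) distance: the hypotenuse of the triangle.
crow_flies = math.hypot(b[0] - a[0], b[1] - a[1])

# Taxicab (L1) distance: the yards the wheels actually roll,
# going around corners on the rectilinear streets.
taxicab = abs(b[0] - a[0]) + abs(b[1] - a[1])

print(crow_flies)  # 5.0 (sqrt of 3^2 + 4^2)
print(taxicab)     # 7.0 (3 + 4)
```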

Yes, the set of points at a fixed taxicab distance still makes a "circle" of sorts in all four quadrants (in a funky way), as the diagram of L1 shows. It is an "edgy" circle (a diamond), visibly. And it contrasts with the more traditional smooth circle shown for the L2 norm.

Between p=1 (the edgy circle) and p=2 (the true circle), the line in red shows the p=3/2 norm. Indeed, we see a half-way-house circle that is less edgy but not yet fully round.

By the time we deform the initially edgy circle, through the smooth circle, on to the nth distortion corresponding to the L-infinity norm, we arrive at a square: distance measured by the longest single leg of the journey. (The "actual" taxi-metering measure, distance in terms of how many yards are actually rolled by the wheels on the rectilinear streets, is the L1 norm we started with.)
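To make the deformation concrete, here is a sketch of a general p-norm in plain Python (the vector and the chosen values of p are just illustrative); note how the value slides from the taxicab total down toward the longest single leg as p grows:

```python
def p_norm(v, p):
    """p-norm of a vector; p = float('inf') gives the sup (L-infinity) norm."""
    if p == float('inf'):
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

v = (3.0, 4.0)  # the two street legs from before
for p in (1, 1.5, 2, 4, 16, float('inf')):
    print(p, p_norm(v, p))
# As p grows, the value shrinks monotonically from 7.0 (taxicab)
# toward 4.0 (the longest single leg): diamond -> circle -> square.
```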

So there: to understand p-norms, all we have to do is visualize crows, taxis, and taxi meters. If it helps, I picture the conventional circular distance on the screen used by a 1940s-era naval radar operator on a container ship (a kind of taxi), forced to measure things on arcs of circles (due to the limits of how 1940s radars worked). Which explains a lot, of course, since folks are generally used to the difference between a mile and a nautical mile (which takes curvature into consideration, as reflected in the nautical-taxi rate of how many cycles of its engines actually occurred to get around that arc).

Getting back to Fedya's more advanced norm concept:

it's easy to see now that we have the usual (y − x) convention, but calculated on a special calculator relevant for Cayley graphs (or Typex/Enigma wheel wirings, if that's easier to visualize). At either of x or y, consider its connectivity to its own local neighbors. That connectivity is a local "energy metric" (at x, at y). Now take the difference, to get a gradient between the two (where the denominator might be 1). Or, as my daughter is taught to say (in strange American math teaching), the "rise" of the subtraction over a "run" of 1.
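A toy sketch of that gradient, assuming (purely for illustration) that the local "energy metric" is just the node's degree in a small hand-made graph:

```python
# A toy graph as an adjacency list (hypothetical wiring, loosely in the
# spirit of a Cayley graph / rotor-wheel connectivity).
graph = {
    'a': ['b', 'c'],
    'b': ['a', 'c', 'd'],
    'c': ['a', 'b'],
    'd': ['b'],
}

def local_energy(node):
    # Degree of the node: a crude stand-in for the local "energy metric"
    # (an assumption; Fedya's actual measure may differ).
    return len(graph[node])

def gradient(x, y, run=1):
    # The (y - x) convention: rise of the subtraction over a run of 1.
    return (local_energy(y) - local_energy(x)) / run

print(gradient('a', 'b'))  # (3 - 2) / 1 = 1.0
```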

Fedya's argument goes on to consider a world in which the very "form" of these circles changes. That is, one modulates the norm itself. In particular, as one does a stochastic walk (does encoding), the next transition matrix (for the next symbol) has less energy. Now we can understand just how that energy dissipates on a walk, and moreover how the L1 and L-infinity norms come into play when analyzing the "signaling" of an energy norm.

He points out that the L1 norm is always 1, and that one can "control" the L-infinity norm. That is, the transition matrix's evolution can induce different energy pulses, moving between energy norms as it evolves as a symbol encoder. While its final form may be L-infinity, on the way there we get to signal via "side-effect" energy pulses at each stage, as the norms change. The rate of change of the norm changes is the signaling mechanism.
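A sketch of that observation for row-stochastic transition matrices (the two matrices are invented; "peaky" versus "flat" stands in for successive stages of the evolving encoder):

```python
def row_l1(row):
    return sum(abs(x) for x in row)

def row_linf(row):
    return max(abs(x) for x in row)

# Two row-stochastic transition matrices (hypothetical encoder stages).
# A "peaky" stage concentrates probability; a "flat" stage spreads it.
peaky = [[0.9, 0.1], [0.2, 0.8]]
flat  = [[0.5, 0.5], [0.5, 0.5]]

for m in (peaky, flat):
    for row in m:
        # The L1 norm of every row is always 1 (probabilities sum to 1)...
        assert abs(row_l1(row) - 1.0) < 1e-12
    # ...while the L-infinity norm varies: the controllable "energy pulse".
    print([row_linf(row) for row in m])
```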

If you like, there are different taxi charging schemes on the various segments of the journey. Which is actually true! (The first 1000 yards are at rate x, whereas anything above 1000 yards is at rate y.) A double differential considers the rate of change of (taxi) rates… in different cities, perhaps.

So think of the kernel-based transform as one that is "modulating energy", according to the classical power-square laws (found in RMS, and in the valve circuits of 1940 in particular). One can almost see Turing watching a gas-filled valve change color under certain lighting as it "pulses" under the control of its "network circuit" designed to manipulate its "power output". Of course, we all use the thing every day, when we sharpen or blur photos!
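Since sharpening and blurring are just small convolution kernels, here is a minimal 1-D sketch (the signal and kernels are invented; the blur spreads the pulse's energy, the crude sharpen concentrates it):

```python
def convolve(signal, kernel):
    """Valid-mode 1-D sliding-window filter (the kernels here are
    symmetric enough that convolution vs correlation doesn't matter)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

signal  = [0, 0, 10, 0, 0]     # a single bright "pulse" of energy
blur    = [1/3, 1/3, 1/3]      # box blur: dissipates the energy
sharpen = [-1, 3, -1]          # crude sharpen: concentrates it

print(convolve(signal, blur))     # the pulse smears out evenly
print(convolve(signal, sharpen))  # [-10, 30, -10]: the pulse spikes
```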

Now look at the nice intuition gifted to us by Fedya, who focussed on degree=4. We can see the 4 quarters of the phase space of the resulting type of unit circle. It's then very intuitive to visualize a world of degree=8: obviously, there are 8 arcs (rather than 4). So at large degree, even the distance between two points in the L-infinity norm traces out a circle of sorts, along a rather edgy path.

Now, Fedya’s intuition about how energy is “controlled” is very telling:

He gives a very intuitive argument for why there is a constant loss, given Cauchy-Schwarz-type arguments (tuned up to "power" graphs). Algebraic complexity and geometric complexity have a fundamental relationship to each other, which reduces down to noting that the square of an average relates to the average of the squares. And, pertinent to cryptanalysis, looking at "pairwise" differences now allows one to estimate the relationship, noting that there is a constant difference in the complexity; i.e., the number of "energy steps" computed algebraically relates to the number of energy steps computed geometrically.
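The squaring-averages-versus-averages-of-squares point can be checked directly (the list of "energy steps" is invented):

```python
xs = [1.0, 2.0, 3.0, 4.0]  # hypothetical "energy steps"

mean = sum(xs) / len(xs)
mean_of_squares = sum(x * x for x in xs) / len(xs)

# Cauchy-Schwarz (in its Jensen form): the square of the average
# never exceeds the average of the squares.
print(mean ** 2, mean_of_squares)  # 6.25 vs 7.5
assert mean ** 2 <= mean_of_squares

# The gap between the two is the variance: a toy reading of the
# "constant difference in complexity" between the two counts.
variance = mean_of_squares - mean ** 2
print(variance)  # 1.25
```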

Anyway, if you were going to "stain" a packet SEQUENCE, you would want to leave an energy signature that can be detected by a double-differentiating detector (as it scans the passively tapped packet-capture set). For every two TCP sequence-number changes (a kind of path walk), go look for an energy change in another field. Or compare the sequence numbers used by the return path of the TCP channel to those used by the outgoing path, etc.
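A sketch of such a double-differentiating detector, run over invented sequence-number samples containing one stained jump:

```python
def second_difference(seq):
    """Discrete second difference: the double-differentiating detector."""
    first = [b - a for a, b in zip(seq, seq[1:])]
    return [b - a for a, b in zip(first, first[1:])]

# Hypothetical TCP sequence numbers from a passive capture:
# steady increments, with one "stained" jump in the middle.
seqs = [1000, 2000, 3000, 5000, 6000, 7000]
print(second_difference(seqs))  # [0, 1000, -1000, 0]: the pulse marks the stain
```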

Whereas Fedya was modeling a very general random walk on the nodes of a graph, GCHQ will be modeling path walks of packets through routers, which comes to the same thing. Packets in a TCP flow really don't route on different paths, that much, though perhaps a security-conscious designer would start to make that happen… to defeat staining. One would be actively measuring a routing (i.e., a level-3 packet-forwarding network) for the degree to which it detects/resists staining.