Cool graphic proofs

This post is supposed to collect cool geometric proofs that I’ve found on other websites.

Triangular numbers

In number theory, the sum of the first ncubes is the square of the nth triangular number. That is,

1^3 + 2^3 + 3^3 + \cdots + n^3 = \left( 1 + 2 + 3 + \cdots + n \right)^2.

The same equation may be written more compactly using the mathematical notation for summation:

\sum_{k=1}^{n} k^{3}= \left(\sum_{k=1}^{n}k \right)^{2}.

This identity is sometimes called Nicomachus’s theorem.

A triangular number or triangle number counts objects arranged in an equilateral triangle, as in the diagram on the right. The nth triangular number is the number of dots composing a triangle with n dots on a side, and is equal to the sum of the n natural numbers from 1 to n.

Binomial theorem


For positive values of a and b, the binomial theorem with n = 2 is the geometrically evident fact that a square of side a + b can be cut into a square of side a, a square of side b, and two rectangles with sides a and b. With n = 3, the theorem states that a cube of side a + b can be cut into a cube of side a, a cube of side b, three a×a×b rectangular boxes, and three a×b×b rectangular boxes.


Being stuck together in an elevator brings particles closer

Assuming two particles of masses m_1 and m_2 in the gravitational field of the earth.The earth has a mass of 5,972 \times 10^{24} kg according to Wikipedia. What is the tidal force on the two particles as a function of their distance to the center of the earth?

For point masses, the Newtonian gravity states that they attract each other with a force \vec{F_G}=\frac{GMm}{r^2}\hat{r}. So if we assume two particles with a distance d between them where each particle has a distance R from the center of mass (COM) of the earth, then we can calculate the force on the particles in the direction of each other. we can do so by neglecting their mutual attraction through gravity or we can account for it. This depends on their masses and on the distances involved.

If we do the math (simple euclidean geometry in the plane) we get a very acute angle between the direct lines connecting the particles with the earth and the line which is perpendicular to the line connecting the particles. The angle is \phi = \arcsin(0.5 d/R). Now we can calculate the force on a particle along the line connecting it with the COM of the earth, as given above, and then divide this force into a component in the direction of the direct line connecting the particles and its perpendicular. We get \vec{F}_{attract} = |F_G|\sin(\phi) \cdot \hat{d}.

When one calculates this for two people (70 kg) standing 1m apart on the surface of the earth (remember: all point masses), it turns out that the tidal force (5.39456507 \cdot 10^{-5} kg\cdot m / s^2) is seven orders of magnitude smaller than the the force attracting the people towards earth (-6.87375482\cdot 10^{2} kg\cdot m / s^2). The mutual gravitation (-3.27029920 \cdot 10^{-7} kg\cdot m / s^2) on the other hand is still two orders of magnitude smaller than the tidal force.

Lifting the people into space only makes the distance between the people more insignificant compared to the distance to earth and thus the relative tidal force gets smaller and smaller, making the relative impact of the mutual gravitation in the free-falling elevator more important. Once we have to humans on a spaceship (say 100km above ground) the tidal force is completely irrelevant vs. the mutual gravitation.

Here’s my Python code for that (and btw. astropy is cool):

import matplotlib.pyplot as plt
import numpy as np
import astropy.units as u
from astropy import constants as c

M = 5.972e24 *
m = 70.0 *
earth = {'mass': M, 'loc': np.array([0.0, 0.0]) * u.m}

def gravity(one, two):
m1 = one['mass']
p1 = one['loc']
m2 = two['mass']
p2 = two['loc']

distance_squared = (p1 - p2).dot(p1 - p2)
return c.G*m1*m2/distance_squared*(p1-p2)/np.sqrt(distance_squared)

# These are the numbers
A = [6371000.0*0.1, 6371000.0, 6371000.0*10.0, 6371000.0*100.0] # earth radius
d = [1.0] # distance between particles

for i in range(len(A)):
for j in range(len(d)):
# The Center of Mass (COM) of the earth is supposed to sit at the origin
# Since our problem has rotational symmetry, along the perpendicular towards the COM
# we consider it 2-dimensional
p = [{'mass': m, 'loc': np.array([-d[j]/2.0, A[i]])*u.m},
{'mass': m, 'loc':np.array([d[j]/2.0, A[i]])*u.m}]

print("Gravitational force with earth: {0}".format(gravity(earth, p[0])))
print("Gravitational force with eachother: {0}".format(gravity(p[0], p[1])))

How to write a 21st century paper

tl;dr: In this post I point to two ideas: How to write proofs by Leslie Lamport
and how to write reasearch papers by Simon Peyton Jones.

Last week, we had the Heidelberg Laureate Forum here in … well, Heidelberg. I did not participate but I like to read about the event and in particular its invited laureates, as for example on their blog.

One talk that I liked particularly was the one by Leslie Lamport. (Btw.: Some people like to call him “The Turing award winner and father of LaTeX”, but given his publications list that’s both too unspecific and too narrow.) Anyway, I liked his talk about “How to write a 21st century proof”. There is also a paper about the same idea.

I liked the idea of clearly structured proofs. He demonstrates this idea with a proof of a corollary of the mean value theorem in calculus. He does this by taking a conventionally written prose proof from a text book and adding structure and names to it. Pretty much as one would do when trying to teach a computer (proof assistant) the proof. Eventually (in the paper) he even goes all the way to define the proof in his own specification language (TLA+) which can be used as a proof checking language. So he goes from conventional (mostly prose) proof to structured and better readable proof on to not-readable-for-humans-but-for-computers proof. And the middle step is the one I liked very much.

The particular proof that Leslie Lamport has picked for his talk and paper has a linear structure. Therefore, it is easy to enumerate the steps of the proof as a list. However, some proofs don’t have this linear structure but still lend themselves to a depiction by other directed acyclic graphs (See for example this rewrite of Szemeredis theorem by Terence Tao, page 8). But the general idea of dividing a proof into smaller (how small) steps and specifying all the assumptions and conclusions of each step, linking them to a directed acyclic graph is a very natural way to do this.

The difficult questions are of course how small should the steps be and how much detailed should be added to each step. But Lamport also answers this question: Make it hierarchical and use modern tools (in this case hypertext). This way one can expand or collapse each part of the proof and hide details from the eyes of sophisticated readers. If one cannot use hypertext, the only solution is to make an assumption and imagine a particular reader. This reader can be a little curious kid (as in the paper’s proof from the freshman calculus book) or a fields medalist reading your latest research article.

For me personally, it sounds like an interesting idea to extend the idea of less prose and more structure to scientific articles. But of course, this has already been done by Lamport’s fellow Microsoft employee (and Haskell inventor (again the question: is this fair to him or the other Haskell inventors)) Simon Peyton Jones in his article and talk How to write a great research paper.

One thing that I don’t like about papers can be captured by the following quote from Jones’ talk:

“Computer programs often have bugs. It is very important to eliminate these bugs [1,2]. Many researchers have tried [3,4,5,6]. It really is very important.”


[Better] Example:
“Consider this program, which has an interesting bug. <brief description>. We will show an automatic technique for identifying and removing such bugs”


I also agree with Jones in that I don’t like it when authors put the related work section in the beginning and therefore build a hill or wall over which the reader has to climb to get to the actual results. I think a good approach to this problem would be the same way that Lamport proposed in his talk: Add structure and hide the details in the lower levels of the hierarchy.

If I find the time I will try to exemplify this at few papers from my field.

The destructive starting phase

Every beginning is difficult. I always feel the implications of this sentence when I want to understand something completely new. This something can be a mathematical concept or a piece of code (software or hardware) written by someone else or my former self.

At the beginning you have to de-construct the work of the other person. What they have built with their creativity, you have to persistently un-build with your intelligence to understand their creative process if you later want to be creative with what they’ve done.

That’s a fundamental problem with human creativity (in certain disciplines?) that you have to go through the process that others have already gone through in order to augment their achievements. Unfortunately, you cannot just continue working with the knowledge of the other person that has been there before. There is no magical transfer of state of mind. All you can do is absorb what they’ve done and get into another, similar state of mind that allows you to reproduce and the surpass their results.

And sometimes that just feels hard. But it feels even harder if you allow yourself or other people to put time pressure on your project. This process is never fast and predictable and you never know how long it will take you to understand the creation of other people.

Reinforcement learning I – Temporal difference learning


After I’ve started working with reward-modulated STDP in spiking neural networks, I got curious about the background of research on which it was based. This led me to the book by Richard Sutton and Andrew Barto called “Reinforcement Learning”.  The book is from 1998 and it’s freely readable on the internet! In the book’s Introduction they cover the example of an agent learning to beat a given (imperfect) agent in the game of Tic Tac Toe. Two remarks have to be made: 1. The agent has to be imperfect because a perfect agent in Tic Tac Toe (if it’s the one doing the first move) can never be beaten. 2. The agent does not learn to play Tic Tac Toe, this skill is assumed, but it learns a value map for its policy.

Since I liked the example and wanted to try it out myself, I decided to write this blog post about it. By the way, the code can be found on github (run Continue reading

Reed-Solomon Codes

While trying to understand QR codes, I came across many explanations of Reed-Solomon codes. These codes are used to encode the data in a QR code and make it more robust against wrong pixels. In particular, they allow for recovery of a certain amount of errors. The best intro to Reed-Solomon codes I found so far is this one.