Friday, December 28, 2007

I-mmortality

When interacting on the web, are we fully aware that all of our utterances, the articles we wrote, the posts we sent, the pictures we published, our rants, jokes, praises, mistakes and sins, will likely be navigating the humming vortices of the net for ever after, archived and replicated on servers of ever-increasing capacity, for coming generations to occasionally stumble upon and bring to the surface like mosquitoes trapped in amber?

In Yourcenar's Memoirs of Hadrian, the Emperor comments on the inscriptions scratched by casual visitors all over the stony legs of the Colossus of Memnon; one of them was made by some Eumenes six centuries before the time of Hadrian, and that impulsive sign is all that remains of its author for eternity. To this day, the fraction of humankind that has left some sort of personal imprint (fossil bones, handcraft, inscriptions, intellectual work) after their death is vanishingly small. With the advent of the Internet and our progressive immersion into digital life, this fraction could easily rise to 100% in a few decades, providing every one of us with a cheap and unexpected form of immortality.

Monday, December 24, 2007

λ-fun in metaC++: extensions and term rewriting

So-called λδ-calculi extend λ-calculus with some set C of external terms along with δ-rules describing the reductions by which terms involving the entities of C behave, in addition to those of standard λ-calculus. For instance, instead of relying on the cumbersome and computationally expensive Church numerals, we can embed natural numbers directly:

  • C = {succ,pred,+,*,iszero}+{0,1,2,...}.
  • succ n →δ (n+1) for any natural n.
  • pred n →δ (n−1) for any natural n>0.
  • + n m →δ (n+m) for any natural n, m.
  • * n m →δ (nm) for any natural n, m.
  • iszero 0 →δ TRUE, iszero n →δ FALSE for any natural n>0.

Many nice properties of λ-calculus, like the Church-Rosser Property, are retained in λδ-calculi.

Our translation of λ-calculus to metaC++ allows us to easily accommodate δ extensions: we only have to supply the δ-reductions as additional specializations of the apply metafunction:

template <int N> struct int_;

struct succ;
template<int N>
struct apply<succ,int_<N> >{typedef int_<N+1> type;};

struct pred;
template<int N>
struct apply<pred,int_<N> >{typedef int_<N-1> type;};

struct plus;
template<int M,int N>
struct apply<apply<plus,int_<M> >,int_<N> >{typedef int_<M+N> type;};

struct mult;
template<int M,int N>
struct apply<apply<mult,int_<M> >,int_<N> >{typedef int_<M*N> type;};

struct iszero;
template<int N>
struct apply<iszero,int_<N> >{typedef FALSE type;};
template<>
struct apply<iszero,int_<0> >{typedef TRUE type;};
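
As a quick illustration, the following minimal, self-contained sketch (the primary apply template is repeated from the original embedding, and Boost is used only for the compile-time assertion) checks that one of these δ-reductions actually fires:

#include <boost/static_assert.hpp>
#include <boost/type_traits/is_same.hpp>

template<typename F,typename G>
struct apply{typedef apply type;}; // primary template of the embedding

template<int N> struct int_;

struct plus;
template<int M,int N>
struct apply<apply<plus,int_<M> >,int_<N> >{typedef int_<M+N> type;};

// 2 + 3 δ-reduces to 5 at compile time
BOOST_STATIC_ASSERT((
  boost::is_same<apply<apply<plus,int_<2> >,int_<3> >::type,int_<5> >::value));

int main(){return 0;}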

Our former FAC function using Church numerals can be adapted to work with these by simply replacing the original set of numerical operations and constants with the new ones:

struct FAC;

struct FACIF;
template<typename N>
struct apply<FACIF,N>{typedef int_<1> type;};

struct FACELSE;
template<typename N>
struct apply<FACELSE,N>:
apply2<
mult,N,
typename apply<FAC,typename apply<pred,N>::type>::type
>
{};

template<typename N>
struct apply<FAC,N>:
apply<
typename apply3<
IFTHENELSE,
typename apply<iszero,N>::type,
FACIF,
FACELSE
>::type,
N
>
{};
The complete program is provided.
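
As a sanity check, assuming the definitions above together with the apply2/apply3 helpers and the IFTHENELSE, TRUE and FALSE terms from previous entries, something along these lines should compile:

#include <boost/static_assert.hpp>
#include <boost/type_traits/is_same.hpp>

// 4! = 24, computed entirely at compile time through δ-reductions
BOOST_STATIC_ASSERT((
  boost::is_same<apply<FAC,int_<4> >::type,int_<24> >::value));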

Why was it so easy to accommodate δ-reductions in our C++ rendition of λ-calculus? Well, a basic premise of the embedding procedure is that there is no explicit representation for λ-abstractions; these are rather described operationally according to their reduction behavior: what we really have is a term rewriting engine, which is what C++ template partial specialization ultimately boils down to, and what δ-reduction is all about.

In the end λ-calculus itself is nothing but a kind of Term Rewriting System, where allowable rewritings are described in terms of λ-abstractions. In a way, the theory of TRSs captures the basic essence of λ-calculus and allows for a more general treatment of the subject.

Thursday, December 20, 2007

Within 50 km: correct sources

Thanks to the indications from a reader of the "Within 50 km" entry, we now have more authoritative sources regarding the commitments by the Spanish government on the evolution of the high-speed railroad network. According to these sources, the plans are that

  • (R1) all (continental) province capitals will be directly connected to the network,
  • (R2) 90% of the population will live within 50 km of an AVE station

by 2020, and not 2010 as I had incorrectly assumed. The projection is that the network length will reach 10,000 km by 2020, which is far more than the 2,230 km planned for 2010, and certainly more than enough to satisfy the coverage requirements stated above, as we will see later. These figures come from the so-called Infrastructures and Transport Strategic Plan (PEIT) for the 2005-2020 period. The development of the PEIT implies that the high-speed network will grow at the impressive rate of 777 km per year during 2010-2020. Whether this is feasible looks questionable to me, but the nature of my skepticism is not mathematical, so I leave it at that.

Just for the fun of it, let us estimate the minimum network lengths needed to satisfy requirements R1 and R2. With respect to R1, we can use again the genetic program written for our previous entry on this issue, with only some changes to the initial parameters. After several hundred iterations we obtain an estimate for the minimum length of approximately L1 = 3,870 km. As for R2, according to the National Statistics Institute (INE) data for the municipal register as of January 1st, 2006, the Spanish population was 44,708,964. What is the minimum area the railroad network should cover so as to contain 90% of this population, i.e. 40,238,068 people? Francisco Ruiz provides an amazing resource at his website, a single Excel file including a comprehensive list of Spanish municipalities, along with their populations and areas, according to INE 2006 data: using this, we need only sort the (continental) municipalities by descending population density and look up the aggregated surface of the first n entries totalling 40,238,068 people. This gives us an area A = 262,039 km²; repeating our original analysis with this value of A yields a minimum length L2 = 3,464 km. Anyway, this estimate is rather imprecise, as the analysis assumes the area to cover to be of compact shape, whereas in this case the shape is expected to be elongated along the Spanish coast with isolated dots over the larger inland cities.
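
The aggregation step just described is a simple greedy computation; here is a sketch of it (the municipality type and the function name are mine, standing in for the actual spreadsheet data):

#include <algorithm>
#include <cstddef>
#include <vector>

struct municipality{double population,area;}; // one row of the data file

// densest municipalities first
bool denser(const municipality& a,const municipality& b)
{
  return a.population/a.area>b.population/b.area;
}

// minimum aggregated area containing at least target_population people
double min_covering_area(std::vector<municipality> ms,double target_population)
{
  std::sort(ms.begin(),ms.end(),denser);
  double pop=0,area=0;
  for(std::size_t i=0;i<ms.size()&&pop<target_population;++i){
    pop+=ms[i].population;
    area+=ms[i].area;
  }
  return area; // ~262,039 km² for the 90% target
}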

To satisfy both R1 and R2 we would need a minimum length between max{L1,L2} and L1 + L2, which in any case is well below 10,000 km.

Sunday, December 16, 2007

A paradoxical three-person game: numerical solution

The general form of the N-option Mediocrity game can be depicted as follows:

position: 1 2 3 4 5 6 ... 3N-2 3N-1 3N
player:   A B C A B C ...  A    B    C

Let us assume A's optimum strategy is of mixed type:

SA = a1(A←1) + a2(A←4) + ··· + aN(A←3N-2),

with 0 ≤ ai ≤ 1, ∑ai = 1 (a pure strategy is just a particular case where some ai = 1). The critical point in our analysis is the observation that the roles of A and C are perfectly symmetrical, and thus C's optimum strategy must be the following:

SC = a1(C←3N) + a2(C←3N-3) + ··· + aN(C←3).

This interdependency of A's and C's strategies allows us to regard the game as a two-person one: B against the "combined player" A+C. If the strategy adopted by B is written down as

SB = b1(B←2) + b2(B←5) + ··· + bN(B←3N-1),

with 0 ≤ bi ≤ 1, ∑bi = 1, then B's payoff is

UB = ∑bi(P(A<i)P(i<C) + P(i<A)P(C<i)) =
= ∑bi((a1+···+ai)(1−aN−i+2−···−aN) + (1−a1−···−ai)(aN−i+2+···+aN)),

where aN−i+2+···+aN is by convention taken to be zero when i = 1. We abbreviate this expression as

UB = ∑bisi.

Note that si = sN-i+1.

Case 1: N even. A strategy (a1,...,aN) giving si = 0 for i=1,...,N is clearly optimum (for A+C). We show that such strategy exists and is unique by solving the set of N/2 equations:

s1 = a1 = 0,
s2 = (a1+a2)(1−aN) + (1−a1−a2)aN = 0,
...
sN/2 = (a1+···+aN/2)(1−a(N/2)+2−···−aN)+(1−a1−···−aN/2)(a(N/2)+2+···+aN) = 0.

The first equation implies a1 = 0, so that the second equation reduces to a2(1−aN)+(1−a2)aN = 0, from which it follows that a2 = aN = 0. Going down the rest of the equations in a similar way, we conclude that every ai except a(N/2)+1 must be zero. Hence the unique optimum strategy for A+C has

a(N/2)+1 = 1,
ai = 0 otherwise,

which translates to

SA= A←(3N/2)+1,
SC = C←3N/2.

As UB = 0, B's strategy is immaterial and can be identified with

SB = (1/N)((B←2) + ··· + (B←3N-1)).

The winner is A or C depending on whether B plays above or below the pair of consecutive positions occupied by A and C. Their payoffs are UA = UC = 1/2.

Case 2: N odd, N ≥ 3. Let (a1,...,aN) be an arbitrary strategy for A and s1,...,sN its associated coefficients. We define a derived strategy (a'1,...,a'N) in the following manner:

a'(N+1)/2 = a1 + ··· + a(N+1)/2,
a'(N+3)/2 = a(N+3)/2 + ··· + aN,
a'i = 0 otherwise;

Then the associated coefficients s'i verify the following:

  • s'(N+1)/2 = (a'1 + ··· + a'(N+1)/2)² + (1 − a'1 − ··· − a'(N+1)/2)² =
    = (a'(N+1)/2)² + (1 − a'(N+1)/2)² =
    = (a1 + ··· + a(N+1)/2)² + (1 − a1 − ··· − a(N+1)/2)² =
    = s(N+1)/2.
  • For i < (N+1)/2,
    s'i = (a'1 + ··· + a'i)(1 − a'N−i+2 − ··· − a'N) +
    + (1 − a'1 − ··· − a'i)(a'N−i+2 + ··· + a'N) =
    = 0(1-0) + (1-0)0 = 0, as all the a'i coefficients involved are other than a'(N+1)/2 and a'(N+3)/2.
  • For i > (N+1)/2, s'i = s'N−i+1 = 0, since N−i+1 < (N+1)/2.

So, for any strategy for B of the form (b1,...,bN) we have ∑bisi ≥ b(N+1)/2s(N+1)/2 = ∑bis'i, the inequality being strict if any of a1,..., a(N−1)/2,a(N+5)/2,...,aN is different from zero; hence we conclude that an optimum strategy for A must have all its coefficients set to zero except those at positions (N+1)/2 and (N+3)/2. We are then left with the task of finding positive values of a(N+1)/2 and a(N+3)/2 adding to 1 and minimizing the maximum of the expression

UB = b(N+1)/2s(N+1)/2 = b(N+1)/2((a(N+1)/2)² + (1 − a(N+1)/2)²),

which is obviously reached at b(N+1)/2 = 1. The solution to this trivial problem is a(N+1)/2 = a(N+3)/2 = 1/2. So, the optimum strategies for each player are:

SA= (1/2)(A←(3N−1)/2) + (1/2)(A←(3N+5)/2),
SB= B←(3N+1)/2,
SC = (1/2)(C←(3N−3)/2) + (1/2)(C←(3N+3)/2),

yielding payoffs UA = UC = 1/4, UB = 1/2.
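
As a quick numerical cross-check, not part of the derivation itself, one can simulate the three strategies for a particular odd N, say N = 5, and verify that B's empirical payoff approaches 1/2:

#include <cstdlib>
#include <iostream>

int main()
{
  const int N=5,trials=1000000;
  int b_wins=0;
  for(int t=0;t<trials;++t){
    // A plays (3N-1)/2 or (3N+5)/2 with probability 1/2 each
    int a=(std::rand()%2)?(3*N-1)/2:(3*N+5)/2;
    // C plays (3N+3)/2 or (3N-3)/2 with probability 1/2 each
    int c=(std::rand()%2)?(3*N+3)/2:(3*N-3)/2;
    int b=(3*N+1)/2; // B's pure optimum strategy
    if((a<b&&b<c)||(c<b&&b<a))++b_wins; // B wins when she is in the middle
  }
  std::cout<<"B's empirical payoff: "<<double(b_wins)/trials<<"\n"; // ~0.5
  return 0;
}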

Although the numerical solution of the N-option Mediocrity game seems entirely satisfactory, the argument stating that optimum strategies for A and C must be symmetrical induces, for N odd, N ≥ 3, some psychological difficulties akin to those arising in the analysis of the Prisoner's dilemma. We will examine this issue in a later entry.

Wednesday, December 12, 2007

Within 50 km, revisited

Commenting on the "Within 50 km" blog entry, a friend of mine remarks that the commitment by the Spanish government concerning the coverage of the high-speed railroad network has probably been misreported by the media, and might not apply to every town in the country, but only to the most important cities. So, let us restate such commitment in a plausible way:

Spanish high-speed railroad network will be developed so that every continental province capital will lie within 50 km of a station.

(Please note that I have no source confirming/denying whether this is really the government's statement.) This is a rather weaker commitment, yet the question remains open whether it can be achieved with a network whose total length will be 2,230 km by 2010, according to the government itself.

Let us consider the 47 continental Spanish province capitals and investigate the minimum length of a network spanning the country with the property that each of the 47 cities lies within 50 km of a network node (station); we will call this the coverage property. It is easy to see that the network has to be a Steiner tree. We will mathematically prove in a separate entry that the solutions to the problem are in fact Steiner trees constructed out of 47 base points, each lying on the circle of radius 50 km centered at one of the cities. A basic theorem about Steiner trees tells us that there can be no more than 47−2 additional Steiner points arising in the construction of the tree; this allows us to regard the solution network as a minimum spanning tree on 47+45=92 nodes. If there happens to be a solution with fewer than 92 nodes, we still have one with exactly 92 nodes: just add the remaining points arbitrarily along the original network. Summing up, we can formulate our problem as that of finding meshes of n=92 nodes with the coverage property whose minimum spanning tree length is minimum.

I have implemented a genetic algorithm to solve the problem, along the following lines:

  • Each candidate solution (or individual) of the pool is merely a set of n points. The fitness of an individual is the complementary of its "length", which is calculated as the length of its associated minimum spanning tree, with an additional penalty for each uncovered city (i.e. each city lying more than 50 km away from every point of the solution).
  • Initialization of each individual is implemented with a mix of "base points" lying at 50 km from some city and "free points" resulting from the weighted sum of two randomly selected cities. This mixed strategy ensures that the initial population has a decent proportion of individuals satisfying the coverage property.
  • On each generation, the 25% fittest individuals are selected for survival and breeding.
  • Crossover of two individuals a and b is implemented as follows: A city c is selected at random and the points of a and b are sorted according to their distance to c. The child individual is built with the first m points of a and the n−m last points of b, for some random m between 0 and n. The points of the child individual can also be randomly mutated: a mutated point results from the weighted sum of the original and some randomly selected city (see the sketch after this list).
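
The following is a rough sketch of the crossover step just described (mutation omitted); the point type, the comparison functor and the function names are mine, not those of the actual program:

#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <vector>

struct point{double x,y;};

double dist2(const point& p,const point& q) // squared distance
{
  double dx=p.x-q.x,dy=p.y-q.y;
  return dx*dx+dy*dy;
}

struct closer_to // comparison functor: points nearer to city c come first
{
  point c;
  bool operator()(const point& p,const point& q)const
  {
    return dist2(p,c)<dist2(q,c);
  }
};

// child = first m points of a (sorted by distance to a random city c)
//         followed by the last n-m points of b
std::vector<point> crossover(
  std::vector<point> a,std::vector<point> b,const point& c)
{
  closer_to cmp={c};
  std::sort(a.begin(),a.end(),cmp);
  std::sort(b.begin(),b.end(),cmp);
  std::size_t n=a.size(),m=std::rand()%(n+1); // random cut point in [0,n]
  std::vector<point> child(a.begin(),a.begin()+m);
  child.insert(child.end(),b.begin()+m,b.end());
  return child;
}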

I wrote a C++ program implementing this algorithm (you will need Boost to compile it): the program keeps iterating the population until aborted by the user, and on each iteration it outputs the minimum network length achieved so far and an SVG picture of the best candidate solution. Coordinates of province capitals have been obtained from the Spanish Ministry of the Interior, and translated to UTM (within the same UTM zone, even if this incurs some curvature error) with Gabriel Ortiz's coordinate converter. I ran the program for hours, playing a bit with the initial parameters, and each run stabilized at a minimum length above 2,500 km. The figure below shows the best solution I got, 2,535 km in length, obtained with a population size of 5,000 and a mutation rate of 0.01 after 4,000 iterations. You can observe a characteristic feature of Steiner trees by which confluent edges never form an angle smaller than 120º.

Take into account that in our rendering of the problem we have made some simplifying assumptions that in general underestimate the network length needed to satisfy the coverage commitment in the real world. For instance:

  • Railroad segments between nodes are always assumed to be straight lines.
  • We have laid out the network from scratch, without considering that part of it (Madrid-Sevilla, Madrid-Barcelona, Madrid-Lisbon, etc.) is already in place or planned, which, by definition, increases the length of the optimum compatible solution we can build from there.

So, unless we have made some mistake in our analysis, it is impossible to extend the high-speed railroad network to within 50 km of every province capital with only 2,230 km as planned by 2010.

Saturday, December 8, 2007

Goal crossbar cross-sections

In the game of soccer, it is not infrequent that the ball hits the goal crossbar: sometimes it bounces back into the field or out over the bar, and sometimes one's lucky and the ball enters the goal.

The result depends on the point of the crossbar where the hit occurs and the incidence angle of the ball. The question arises: how does the bar cross-section affect the chances that the ball bounces into the goal? Although goals at professional fields seem to always have crossbars with circular cross-section, the FIFA Laws of the Game allow for other shapes:

Goalposts and crossbars must be made of wood, metal or other approved material. Their shape may be square, rectangular, round or elliptical and they must not be dangerous to players.

The geometry involved is simple: Let α be the incidence angle of the ball and η the inclination of the crossbar surface at the hitting point. As depicted in the diagram, negative values of α would mean that the ball follows an ascending trajectory.

The reflection angle γ, measured clockwise from 12 o'clock, can be easily calculated:

γ = π/2 + 2η + α.

The ball enters the goal when γ > π, hence the minimum incidence angle for which the ball enters the goal at a particular hitting point is

αgoal = π/2 − 2η.

We now traverse the crossbar section in polar coordinates (ρ,θ), with θ growing clockwise from 3 o'clock, and determine the incidence angle η as a function of θ.

For a circular cross-section, η = θ; if the section is an ellipse with eccentricity ε then η = arctan((1−ε²)⁻¹ tan θ), whereas for a rectangle η(θ) consists of two constant segments corresponding to the vertical and horizontal edges.
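
For the idealized zero-radius ball, αgoal can be tabulated directly from these formulas; here is a small sketch (variable names are mine) comparing the circular and elliptical cross-sections:

#include <cmath>
#include <iostream>

const double pi=3.14159265358979323846;

double eta_circle(double theta){return theta;}

double eta_ellipse(double theta,double epsilon) // epsilon: eccentricity
{
  return std::atan(std::tan(theta)/(1.0-epsilon*epsilon));
}

double alpha_goal(double eta){return pi/2-2.0*eta;}

int main()
{
  for(double theta=0.1;theta<pi/2;theta+=0.2){
    std::cout<<theta<<" "
             <<alpha_goal(eta_circle(theta))<<" "        // circle
             <<alpha_goal(eta_ellipse(theta,0.8))<<"\n"; // ellipse, eps=0.8
  }
  return 0;
}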

The analysis so far considers an idealized ball with null radius. This approximation is grossly incorrect in the particular case of a rectangular cross-section, as it fails to model the situation where the ball hits on an edge of the crossbar rather than a flat face. An easy trick to cope with a ball radius r > 0 is to consider the extended cross-section consisting of those points within distance r of the original section:

The extended cross-section for a circle is also a circle, and for an ellipse is something not very different from the original shape (the resulting "eccentricity" is smaller). In the case of a rectangle the abrupt transition between the two values of η(θ) is replaced by a slope which is less steep as the ratio between r and the crossbar width grows; for a square section and considering the typical dimensions of balls and crossbars, the slope value lies around 1.7.

We are now ready to depict αgoal as a function of the hitting angle θ and interpret the results:

For each hitting angle θ, αgoal is the minimum incidence angle that the ball trajectory has to have to enter the goal, measuring the incidence angle from the horizontal line downwards. So, lower values of αgoal mean that it is easier to enter the goal. For instance, when the hit occurs at low values of θ (front part of the crossbar) the ball has to follow an almost vertical descending trajectory if it is to enter the goal, whereas for high values of θ (bottom part of the bar) even ascending trajectories can bounce into the goal. We see that an elliptical cross-section is always more goal-friendly than a circular section. As for the square section, it is worse (less goal-friendly) than the ellipse for low values of θ and better for a smaller region of high values of θ.

So, what cross-section shape should we choose if we want to facilitate goal scoring? Circles are ruled out, and to decide between an elliptical shape and a square we would have to know which hitting zone is more likely. Although I can't justify it, my hunch is that most hits against the crossbar occur at the front part rather than at the bottom, and moreover the interval of values of θ for which the ellipse wins is larger than the winning interval for the square (the intersection point of both η functions happens at some θ0 > π/4). So, it seems like we should go for an elliptical cross-section, the more eccentric (flatter) the better.

Tuesday, December 4, 2007

λ-fun in metaC++: recursion

Let's implement a recursive numeric function in our C++ embedding of λ-calculus. Before delving into the specific problems of recursion, let us recall the standard way of defining the basic logical predicates in λ-calculus, as usual courtesy of Alonzo Church:
TRUE = λxy.x
FALSE = λxy.y
AND = λpq.p q FALSE
OR = λpq.p TRUE q
NOT = λp.p FALSE TRUE
IFTHENELSE = λpfg.pfg
whose translation to metaC++ is straightforward. We will also need a predecessor function (the successor was defined in a prior entry) and a predicate for testing whether a number is zero (numbers being encoded as Church numerals):
PRED = λnfx.n(λgh.h(gf))(λu.x)(λu.u)
ISZERO = λn.n(λx.FALSE)TRUE
So equipped, we could try to "define" the factorial function as follows:
FAC = λn.IFTHENELSE (ISZERO n) 1 (MULT n (FAC(PRED n)))
which is not a true definition as FAC is both the definiendum and part of the definiens. Such a recursive definition is typically emulated in λ-calculus by parameterizing the recursive call:
FACIMPL = λfn.IFTHENELSE (ISZERO n) 1 (MULT n (f(PRED n)))
and adding the folding part by means of some fixed-point operator FP:
FAC = FP FACIMPL
where FP has the property
FP FACIMPL →β FACIMPL(FP FACIMPL)
so that reducing FP FACIMPL one step "executes" FACIMPL once on the very same FP FACIMPL function:
FP FACIMPL →β
FACIMPL(FP FACIMPL) →β
FACIMPL(FACIMPL(FP FACIMPL)) →β ...
There are many (infinitely many, actually) universal fixed-point operators, i.e. λ-terms that act as a fixed-point operator for any arbitrary term. The most famous is the Y combinator:
Y = λf.(λx. f(xx))(λx. f(xx))
Proving that Y is a universal fixed-point operator is trivial:
λf.(λx. f(xx))(λx. f(xx)) →β λf.f((λx. f(xx))(λx. f(xx))) = λf.f(Yf)
At first sight, it looks like we can easily code Y in C++:
template<typename F>
struct YAUX;
template<typename F,typename X>
struct apply<YAUX<F>,X>:apply<F,typename apply<X,X>::type>{};

struct Y;
template<typename F>
struct apply<Y,F>:apply<YAUX<F>,YAUX<F> >{};
but we will obtain a compiler error as soon as we try to use this:
// error 'type' not defined or something similar
typedef apply<Y,I>::type type;
What's happening? The compiler tries to instantiate apply<YAUX<I>,YAUX<I> >, which is self-referential because it derives from apply<I,apply<YAUX<I>,YAUX<I> >::type>. In short, YI has an infinite β-reduction path, which is after all understandable, as this is the raison d'être of a fixed-point operator.

The good news is that we can in fact implement recursive functions in our C++ embedding of λ-calculus without using fixed-point operators: Defining a term f translates to specializing apply for some contexts of use of the type F, and in defining this specialization we can freely use the type F itself, which opens up the possibility of recursion. The following is a definition of the factorial function:

struct FAC;

struct FACIF;
template<typename N>
struct apply<FACIF,N>{typedef ONE type;};

struct FACELSE;
template<typename N>
struct apply<FACELSE,N>:
apply2<
MULT,N,
typename apply<FAC,typename apply<PRED,N>::type>::type
>
{};

template<typename N>
struct apply<FAC,N>:
apply<
typename apply3<
IFTHENELSE,
typename apply<ISZERO,N>::type,
FACIF,
FACELSE
>::type,
N
>
{};
where FACIF and FACELSE provide a level of indirection to avoid infinite β-reduction inside the expansion of FAC. In λ-calculus jargon, this trick corresponds to the so-called η-expansion of the original term.
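
A hypothetical compile-time check of this definition (assuming the Church numeral THREE and the succ/int_ verification trick discussed in the Church numerals entry) would read:

#include <boost/static_assert.hpp>
#include <boost/type_traits/is_same.hpp>

typedef apply<FAC,THREE>::type SIX; // FAC 3 should β-reduce to the numeral 6
BOOST_STATIC_ASSERT((boost::is_same<
  apply<apply<SIX,succ>::type,int_<0> >::type,
  int_<6>
>::value));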

A program with the full code for this entry is provided.

Friday, November 30, 2007

A paradoxical three-person game

I read the other day an article by Douglas Hofstadter on his game Mediocrity: three players choose a number each and the winner is the one who has chosen the number in the middle. Hofstadter describes the game and then goes on to design arrangements for several types of tournaments, but he does not devote much time to the optimum strategy of the basic game itself; this is somewhat surprising, because thinking about how to best play induces some kind of interesting circular reasoning. The following is an attempt at that analysis. The game depends on the number of choices each player is given.

The N-option Mediocrity game: For a fixed N≥1, player A can only choose among 1,4,...,3N-2, player B among 2,5,...,3N-1 and player C among 3,6,...,3N. The winner is the player who has chosen the number in the middle.

Observe that there cannot be ties. For the purposes of our investigation, let's add some constraints:

  • Coalitions are not allowed.
  • The players are assumed to be perfectly rational, i.e. they are motivated exclusively by their own payoff and will be able to pursue any logical argument to that end; moreover, they know that each player knows about their opponents' perfect rationality.

What's paradoxical about the game? Let's follow a plausible line of argument for player A:

"I must choose among 1,4,...,3N-2. Clearly 1 can never be in the middle, so I discard it. Player B, however, knows I won't play 1, so she won't choose 2, and thus player C, who can also deduce this, won't play 3. Hence 4 is not an option for me either, and this implies B won't play 5,..."

So it seems like A cannot rationally choose any number, and similar reasonings can be derived for B and C. But of course in the end they must choose some number and someone has to win. This puzzle is reminiscent of the much debated Surprise Examination Paradox; the difference, however, is that Mediocrity involves no modal statements about belief and such, so maybe it is easier to solve.

When N is small the analysis of the game is in fact straightforward:

The 1-option game

position: 1 2 3
player:   A B C

As each player has only one number to play, the automatic winner is B.

The 2-option game

position: 1 2 3 4 5 6
player:   A B C A B C

A can't play 1, so she chooses 4, and C can't go for 6, so she plays 3. B loses regardless of her choice, so there are two possible outcomes with no particular preference: (A←4,B←2,C←3), where C wins, and (A←4,B←5,C←3), where A wins.

When N≥3 things look considerably harder. We will look at this in a later entry.

Monday, November 26, 2007

All the King's philosophers: ultrafilters and Arrow's Theorem

Given some set A, an ultrafilter on A is a nonempty family F of subsets of A such that
  1. A belongs to F,
  2. X belongs to F if and only if A∖X does not belong to F,
  3. if X belongs to F and Y is a superset of X, Y belongs to F,
  4. if X and Y belong to F, the intersection of X and Y belongs to F as well.
An ultrafilter is said to be principal if it is of the form {X included in A : a belongs to X} for some fixed element a of A.
Ultrafilters are used in Model Theory to construct new models for a logical theory out of a family of preexisting models. Principal ultrafilters are of little use in this context, as they yield models that are isomorphic to one of the elements of the given family. The following result basically tells us that finite ultrafilters are of no interest:
Theorem. If A is finite, every ultrafilter on A is principal.
This is the fundamental difficulty that al-Khwārizmī found when trying to build a Philosopher Rule for the Council. He could only have built a non-trivial rule if the Council were infinite!
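
For the skeptical reader, the theorem can be brute-forced for a small ground set; the following sketch (mine, not part of the original argument) enumerates every family of subsets of a 4-element set, keeps those satisfying the four axioms above and verifies that each of them is principal:

#include <iostream>

int main()
{
  const int NSUBSETS=16;      // subsets of A={0,1,2,3}, encoded as 4-bit masks
  const int FULL=NSUBSETS-1;  // the whole set A
  int total=0,principal=0;
  for(int f=0;f<(1<<NSUBSETS);++f){            // f: candidate family of subsets
    bool ok=((f>>FULL)&1)!=0;                  // axiom 1: A belongs to F
    for(int x=0;ok&&x<NSUBSETS;++x){
      bool in_x=((f>>x)&1)!=0,in_cx=((f>>(FULL&~x))&1)!=0;
      if(in_x==in_cx)ok=false;                 // axiom 2: exactly one of X, A\X
      for(int y=0;ok&&y<NSUBSETS;++y){
        if(in_x&&((x&y)==x)&&!((f>>y)&1))ok=false;     // axiom 3: supersets
        if(in_x&&((f>>y)&1)&&!((f>>(x&y))&1))ok=false; // axiom 4: intersections
      }
    }
    if(!ok)continue;
    ++total;
    for(int a=0;a<4;++a){                      // is F = {X : a belongs to X}?
      bool match=true;
      for(int x=0;x<NSUBSETS;++x)
        if(((f>>x)&1)!=((x>>a)&1)){match=false;break;}
      if(match){++principal;break;}
    }
  }
  std::cout<<total<<" ultrafilters, "<<principal<<" principal\n"; // prints 4, 4
  return 0;
}
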
Arrow's Theorem is a celebrated result by economist Kenneth Arrow that shows some inherent problems of certain voting systems:
Theorem. Let P be a population of voters, each having an associated preference ordering on a set of candidates C. A social welfare function (SWF) is a rule that accepts any combination of individual orderings and generates a global ordering for the candidates. Suppose that the SWF has the following properties:
  1. When all voters prefer candidate a over candidate b, the SWF must also have this preference.
  2. If for a given voter input I the SWF prefers a over b, and we then switch to a different input I' such that the individual voters favoring a over b are a superset of those in I, the new output of the SWF still favors a with respect to b.
Then, if P is finite and there are more than two candidates, it turns out that the SWF necessarily implements a dictatorship, i.e. there is some individual d in P such that the results of the SWF are exactly the preferences expressed by d.
The similarities between Arrow's Theorem and the basic theorem on finite ultrafilters are not coincidental, and a quick search on the net reveals that some have indeed proved that both results are in fact equivalent.

Thursday, November 22, 2007

Within 50 km

Recently the Spanish government announced its goal that the national high-speed railroad network (AVE) be developed so that every town in Spain is within 50 km of an AVE station. On the other hand, the estimates by the government itself are that the AVE network's total length will reach 2,230 km by 2010. Are these statements compatible?

Let us consider the problem of laying out a railroad network with minimum length such that every point within the considered area (the national territory) is not farther than some predefined distance away from the closest network node. Solving this problem for an arbitrary shape like that of Spain can only be done via computer simulation, but we can introduce drastic simplifications that allow us to come up with a reasonable estimate using only some elementary calculations.

Ignoring boundary effects, a hexagonal grid is the most efficient way to distribute points on the plane so as to maximize the area covered by circles with a fixed radius centered at the points.

For our problem, we have

R = 50 km,
r = ((√3)/2) R = 43.30 km (the hexagon's apothem, i.e. half the distance between adjacent nodes),
A = (3(√3)/2) R² = 6,495 km² (the hexagon's area).

The continental area of Spain is 492,173 km², so we need a total of n = 76 hexagons (or nodes) to cover the country. As an approximation to the optimum network, we can simply take a minimum spanning tree connecting all the 76 nodes. Again, finding such a tree is a computationally hard problem, but we need not do it, as we are only interested in its length L: a tree with n nodes has n-1 edges, and using the reasonable assumption that our minimum spanning tree connects nodes exclusively with their neighbors we have

L = 2r(n-1) = 6,495 km,

which is nearly three times the planned length of the AVE network by 2010.
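
The arithmetic above can be replayed in a few throwaway lines, using the entry's own figures:

#include <cmath>
#include <iostream>

int main()
{
  const double R=50.0;                 // coverage radius, km
  const double area_spain=492173.0;    // continental area of Spain, km^2
  double r=std::sqrt(3.0)/2.0*R;       // hexagon apothem: ~43.30 km
  double A=3.0*std::sqrt(3.0)/2.0*R*R; // hexagon area: ~6,495 km^2
  double n=std::ceil(area_spain/A);    // hexagons (nodes) needed: 76
  double L=2.0*r*(n-1.0);              // spanning tree length: ~6,495 km
  std::cout<<"r="<<r<<" A="<<A<<" n="<<n<<" L="<<L<<"\n";
  return 0;
}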

Sunday, November 18, 2007

All the King's philosophers: part III

(See part I, part II.)

The cool breeze rising from the Zeravshan river awoke the mathematician on the morning of the third day. Al-Khwārizmī skipped his breakfast and went directly to his worktable in the royal garden, where, according to his instructions, new loads of paper were already in place. As the hours of the day passed by, the initially light mood of Al-Khwārizmī progressively turned somber, and his writing frantic; around him, scattered on the floor, lay dozens of sheets of paper full of garbled diagrams and tables.

This routine went on during the following days, and the King, a little impatient about the outcome of al-Khwārizmī's quest, began to send messengers to secretly observe the mathematician from a distance. The mathematician would not talk to anyone and abruptly dismissed the servants who approached him with food and wine. He had some lights set up at his place on the veranda so that he could continue working during the night.

One chilly morning Al-Khwārizmī found that he had fallen asleep at his workplace. He then decided that the quest was over and asked for an audience with the King. After explaining to the sovereign how he had devised the principles of his theory for a Philosopher Rule, the mathematician concluded:

"I have spent several days and nights searching for suitable lists of authoritative groups, beginning with some initial disposition and adding and suppressing groups according to the logical principles that I have presented. And any time I obtained a distribution conforming to the principles, I realized to my dismay that it possessed the following defect: there was some particular philosopher such that a group was authoritative inasmuch as that philosopher belonged to it. If I built some other suitable distribution, it was only that particular philosopher that changed, but the troublesome peculiarity remained. What is the usefulness of a Rule if it merely takes into account the opinion of a single philosopher in the Council?"

"My lord, in despair and humiliation I must resign from the task that you graciously assigned me. As far as my investigations have led me to see, there is no way to design a logical Philosopher Rule other than following the decisions of a single designated individual in your Council."

"Great al-Khwārizmī" replied the King, "it is a true sign of the wisest to acknowledge what one does not know. And you have my admiration for this sincerity and for the fine mathematical work you have presented to me."

"Given that your investigations show that having a Council of philosophers is no better than relying on one of them, I cannot think of any other man than you who is more suitable for that position. Would you accept my humble invitation to be the Royal Counselor?"

In the end al-Khwārizmī declined the proposition and promptly left Samarkand to go on his trip to remote countries. History has it that the King dismantled the Council and devoted himself to the study of Science and Philosophy so as to become a better person, a wiser ruler, and his own best counselor.

Wednesday, November 14, 2007

λ-fun in metaC++: Church numerals

The classical way to represent natural numbers in λ-calculus was invented by Church:

0 = λfx.x
1 = λfx.fx
2 = λfx.f(fx)
...
N = λfx.fN(x), i.e. f applied N times to x
This translates to our C++ embedding as follows:
struct ZERO;
template<typename F,typename X>
struct apply<apply<ZERO,F>,X>{typedef X type;};

struct ONE;
template<typename F,typename X>
struct apply<apply<ONE,F>,X>:apply<F,X>{};

struct TWO;
template<typename F,typename X>
struct apply<apply<TWO,F>,X>:
apply<F,typename apply<F,X>::type>{};
...
The corresponding basic arithmetic operations are easy to define:
SUCC = λnfx.f(nfx) (successor function)
PLUS = λmnfx.nf(mfx)
MULT = λmn.m(PLUS n) 0
and to translate to C++:
// syntax sugar to abbreviate notation
template<typename F,typename G,typename H>
struct apply2{
typedef typename apply<typename apply<F,G>::type,H>::type type;
};

struct SUCC;
template<typename N,typename F,typename X>
struct apply<apply<apply<SUCC,N>,F>,X>:
apply<F,typename apply2<N,F,X>::type>{};

struct PLUS;
template<typename M,typename N,typename F,typename X>
struct apply<apply<apply<apply<PLUS,M>,N>,F>,X>:
apply2<N,F,typename apply2<M,F,X>::type>{};

struct MULT;
template<typename M,typename N>
struct apply<apply<MULT,M>,N>:
apply2<M,typename apply<PLUS,N>::type,ZERO>{};

I've written a little C++ program exercising these ideas (you'll need Boost to compile it). How to test our constructs? That is, how to verify for instance that the type NINE defined as:

typedef apply<apply<MULT,THREE>::type,THREE>::type NINE;
actually corresponds to λfx.f(f(f(f(f(f(f(f(fx))))))))? Well, if we define
template <int N>struct int_{};
struct succ;
template<int N> struct apply<succ,int_<N> >{
typedef int_<N+1> type;
};
then we have that
apply<apply<N,succ>::type,int_<0> >::type
is the same type as
int_<n>

as long as the type N stands for the Church numeral representing the natural number n. This verification can easily be done with some metaprogramming machinery. Note that succ is not a proper λ-term even though we have used it inside our apply machinery: this hints at a general technique called λ-calculus extension.
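
That machinery can be as simple as a static assertion; for instance, assuming the definitions above, the check for NINE might read:

#include <boost/static_assert.hpp>
#include <boost/type_traits/is_same.hpp>

BOOST_STATIC_ASSERT((boost::is_same<
  apply<apply<NINE,succ>::type,int_<0> >::type,
  int_<9>
>::value));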


Saturday, November 10, 2007

A hero's job

The most popular two-person non-zero-sum games of cooperation/defection, like the Prisoner's dilemma, the Chicken game, etc., have mutual cooperation as a Pareto-optimal point, i.e. moving from mutual cooperation to defection harms the opponent. But what happens when this is not the case, and yet mutual defection is "Pareto pessimal", i.e. worst for both players?

Jan and Olga are travelling back home from Mars inside their tiny space capsule. The oxygen generator breaks down and reserves are just enough for only one person to make it to Earth. Both space travellers glance at each other and then stare through the window into the infinite void surrounding them.

In this game, a hero is needed for the other to obtain some payoff, but mutual cooperation is as bad as mutual defection:

A \ B    Coop    Def
Coop     0\0     0\1
Def      1\0     0\0

Even more than in the Prisoner's dilemma, the rational strategy for this game is to defect. Some have shown how cooperation can be justified in the iterated version of the Prisoner's dilemma, but in our rendering of the hero game no iteration is possible.

In my opinion, if the matrix above actually corresponds to the payoffs obtained by A and B, then there's little more to be said about the game, and defecting is indeed the expected strategy. But what motivates a hero is the emotional reward he or she obtains from knowing the good that the heroic action is doing to others. A payoff matrix accounting for this moral compensation would look like this:

A \ B    Coop      Def
Coop     aA\aB     aA\1
Def      1\aB      0\0

where aA and aB are the altruism factors of A and B, respectively. Usually each player does not know the altruism factor of the opponent, so the game becomes one of imperfect information. Also, it can be argued that in the case of mutual cooperation the payoff would still be zero, as the altruism factor should multiply the actual outcome of one's heroic action, but in the standard exposition of this type of game players make their choices simultaneously without knowing the opponent's move.

It does not take much Game Theory, anyway, to understand that heroic action is based on intangible payoffs like the love for others or for the country, honor, etc. Raising these altruism factors is arguably one of the goals of the highly ritualistic military culture.

If we extend the hero game to n persons in a way that it still only takes one hero to save the rest, we can see that, from a societal point of view, it is desirable that altruism be not evenly distributed but rather concentrated in a few individuals: otherwise we'd have mass immolation.

Tuesday, November 6, 2007

All the King's philosophers: part II

(See part I.)

Al-Khwārizmī spent the night at the royal palace. Very early the next morning he looked for a cool and quiet place on a veranda by the palace garden, asked for a few sheets of the famous Samarkand paper and some ink and put himself to work on the quest for the Philosopher Rule. The following is a mildly adapted version of the annotations made by al-Khwārizmī during that day.

"To simplify things, let us assume that questions posed to the Council are of the kind that can be answered simply by saying yes or no, as in 'Must we go to war against the Bactrians?' or 'Should taxes be raised?'. To every such question, the Council will be divided into those philosophers answering affirmatively and those whose reply is negative. The Philosopher Rule must then decide whether the final answer is yes or no based on the distribution of individual replies across the Council. The Rule will be complete if it can take care of every possible answer distribution."

"Now, as philosophers pride themselves on having an answer to everything, we know that each of the Council members will reply either yes or no to every conceivable query they might be confronted with, that is, they never remain silent at a question. So, to a given question the group of negative philosophers is exactly the complement of the group of affirmative philosophers, and we can in a sense concentrate only on the affirmative party. The Philosopher Rule can then be regarded as playing the following role: for each group of affirmative philosophers the Rule must decree whether the group is authoritative, that is, whether the final judgment on a question must be taken to be affirmative if it is this precise group of philosophers that reply in the positive."

"We cannot choose at random which groups are authoritative, because such a disposition would likely yield inconsistent results when the Council was consulted over time. Clearly, some constraints must apply to the way in which we choose the list of authoritative groups. I think the art of Logic will help us here; fortunately, I brought with me some books on the subject."

That was how far the mathematician got the first day. The second day, al-Khwārizmī ordered a fresh load of paper and sat on the veranda from dawn till dusk furiously scribbling symbols and consulting a heap of books by the school of the Stoics. When the last light of day vanished beyond the palace walls and the first crickets began chirping in the garden, the mathematician arose from his seat with a triumphal smile, as he thought he had the foundations of the problem basically laid down, and only some routine calculations were necessary to reach a solution. This is a summary of his conclusions that day:

"We explore the constraints imposed on eligible authoritative groups by studying their connection with the basic laws of Logic by which every wise man must abide. For it is our goal that the decisions taken by the Rule be free of contradiction much as those of any philosopher are bound to be."

  • "If the entire Council supports a decision (though according to the King this event has not ever happened), it is only sensible for the Rule to approve the decision as well. So, the group formed by all philosophers in the Council is an authoritative group."
  • "If a group A is deemed authoritative, then the complementary group, i.e. that comprising the philosophers outside A, cannot be authoritative. Otherwise, if we asked a question which is supported by A and then submitted exactly the opposite question, which would be supported by the complementary of A, the Rule would end up holding both a position and its negation."
  • "If a certain group A is considered authoritative, so will be any other group including A: having more philosophers supporting the decision can only make us more confident in the result."
  • "Let us suppose we pose a question, like 'Should we raise taxes?' to which an authoritative group A responds affirmatively, and then some other question, like 'Should we offer a sacrifice to the gods?', which is supported by another authoritative group B. If we had submitted the combined question 'Should we raise taxes and offer a sacrifice?' the laws of Logic teach us that the philosophers answering positively would be exactly those belonging to both A and B. Hence the Rule must have the intersection of authoritative groups as authoritative."

"The only work left to do is finding a list of authoritative groups that satisfy the restrictions. This I will do tomorrow through some fairly easy if tedious calculations."

Al-Khwārizmī had a light dinner and went to sleep with the peace of mind enjoyed by those who feel success within reach of their hand.

Friday, November 2, 2007

λ-fun in metaC++

Let's embed untyped lambda calculus into the metaprogramming sublanguage of C++. Such embedding consists of two components:
  • Syntax: a representation for λ-terms.
  • Semantics: a mechanism for β-reduction.
In our approach, though, both aspects are inextricably intertwined: lambda abstraction (the operation that goes from M to λx.M) and its associated β-reduction rule are described operationally rather than syntactically.

A λ-term is represented by one of the following:

  • An unanalyzed C++ type,
  • A type of the form apply<F,G> for some types F,G,
where the metafunction apply is defined as follows:
template<typename F,typename G>
struct apply{typedef apply type;};

(Note that apply resolves to itself, i.e. apply<F,G>::type is apply<F,G>. The reason for this will be apparent later.)

For instance, if we have types F, G and H standing for λ-terms F, G and H, the derived term
FH(GH)
is represented by the type
apply<apply<F,H>,apply<G,H> >
As we commented before, there is no syntactic facility for expressing lambda abstraction; instead, if a type is to represent an abstraction term we define its behavior under β-reduction by specializing apply. For instance, given the term
I = λx.x
we can represent it like this:
struct I;
template<typename X>
struct apply<I,X>{typedef X type;};
That is, we partially specialize apply so that apply<I,F>::type represents the β-reduction of (λx.x)f, i.e. f. This manner of handling lambda abstraction is particularly appealing in that it dispenses with α-reduction of bound variables by way of simply not representing bound variables at all (their role is played by template parameters).

More examples of representing lambda abstraction:

K = λxy.x
struct K;
template<typename X,typename Y>
struct apply<apply<K,X>,Y>{typedef X type;};
K* = λxy.y
struct KSTAR;
template<typename X,typename Y>
struct apply<apply<KSTAR,X>,Y>{typedef Y type;};

S = λxyz.xz(yz)

struct S;
template<typename X,typename Y,typename Z>
struct apply<apply<apply<S,X>,Y>,Z>{
typedef typename apply<
typename apply<X,Z>::type,
typename apply<Y,Z>::type
>::type type;
};
Note how currying is used in the specializations of apply for binary K and K* and ternary S. Another important aspect is that the application operation for S is deep in the sense that it not only composes the term XZ(YZ), as the following formulation would have done:
// lazy formulation
template<typename X,typename Y,typename Z>
struct apply<apply<apply<S,X>,Y>,Z>{
typedef apply<apply<X,Z>,apply<Y,Z> > type;
};
but it also executes XZ(YZ) by invoking the nested ::type of each of the apply components. This amounts to implementing full β-reduction instead of lazy one-step reduction. In general, full reduction is assumed, that is, each λ-term represented in our embedding mechanism is actually in β-normal form. This rules out terms without a β-nf, such as Ω:

ω = λx.xx
Ω = ωω

struct omega;
template<typename X>
struct apply<omega,X>{typedef typename apply<X,X>::type type;};

typedef apply<omega,omega>::type OMEGA; // compile-time error
Now, when a type X is "primitive" in the sense that it does not have any associated specialization of apply, apply<X,Y>::type evaluates to apply<X,Y> itself. So, if we have some term F and apply it to a succession of primitive types X, Y, Z,... until no more active reductions remain, what we obtain is a syntactic description of the β-nf of F in the form of a composite apply expression with X, Y,... as terminals. For instance:

S →β λxyz.xz(yz)

apply<apply<apply<S,X>::type,Y>::type,Z>::type evaluates to
apply<apply<X,Z>,apply<Y,Z> >

KI →β λxy.y (= K*)

apply<apply<apply<K,I>::type,X>::type,Y>::type evaluates to Y
SKK →β λx.x (= I)
apply<apply<apply<S,K>::type,K>::type,X>::type evaluates to X
This technique can be used to determine whether two λ-terms (as represented in our C++ embedding) are β-convertible.
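
Such a check can be written as a static assertion over a fresh, unanalyzed type standing in for a free variable; a sketch, assuming the definitions above and Boost:

#include <boost/static_assert.hpp>
#include <boost/type_traits/is_same.hpp>

struct x; // fresh, unanalyzed type acting as a variable

// SKKx and Ix both β-reduce to x, so SKK and I are β-convertible
BOOST_STATIC_ASSERT((boost::is_same<
  apply<apply<apply<S,K>::type,K>::type,x>::type,
  apply<I,x>::type
>::value));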

In later entries we will play some more with our lambda calculus embedding.

Sunday, October 28, 2007

A little operations research into TV programming: simulation and remarks

I wrote a little C++ program implementing the TV scheduling algorithm we devised a few entries ago and used it to do several calculations and get some intuition about what the optimum scheduling looks like.


The figure shows in solid beige an audience curve reaching its maximum around the central slot. The competition has placed its resources (amounting to a total C = 165) according to a sinusoidal function with maxima at slots 3-4 and minima at 8-9 (gray line). We have depicted the resulting OTVS distribution for M = 20, 120 and 220. When our resources are much less than the competition's, the optimum distribution concentrates on the slots of higher profitability from our point of view, i.e. those with higher ai/ci. As our resources grow, the distribution progressively approaches the limit shape √(aici). This suggests the following rule of thumb for allocating resources: when you're a weak player, concentrate your resources on the competition's valleys.

What about the competition's optimum behavior, i.e. how to best distribute resources a priori without information on the opponent's scheduling? Assuming the solution to OTVS is such that mi ≥ 0 for all i, we have

S({mi}) = A − (∑√(aici))²/(M+C),

where A = ∑ai. It is then easy to see, using Cauchy's Inequality, that the best policy for the competition is to just follow the audience curve, i.e. set ci = (C/A)ai. In this case, the resulting scenario is disappointingly simple: our corresponding counter-scheduling is simply mi = (M/A)ai and we obtain our fair share of the audience, S({mi}) = AM/(M+C). It is only when the competition deviates from its natural distribution that we can get more than our money can buy, so to speak.

Wednesday, October 24, 2007

All the King's philosophers: part I

On one of the rare occasions when the mathematician Muhammad ibn Mūsā al-Khwārizmī travelled abroad from his homeland, the caravan he was joining made a stop at the ancient city of Samarkand. Upon being informed of his arrival, the King, who was a learned man and knew of the scientific reputation of the traveller, quickly summoned al-Khwārizmī to his palace and made him his guest for the night. After dinner both men engaged in conversation about scientific subjects. Impressed by the erudition of al-Khwārizmī, the King then decided to confide to the mathematician a problem that had been troubling him for long.

"Muhammad ibn Mūsā al-Khwārizmī, as you see I hold the deepest reverence for science, and this devotion led me to the decision of ruling my kingdom according to the teachings of Philosophy. To that purpose, I gathered over the years a Council of Philosophers formed by some of the most learned men on the Earth. When I have some difficult decision to make, I summon the Council and ask them for advice. And here my problems begin: to whichever question I pose to the philosophers, each of them provides me with a different answer, and it is not uncommon that they begin quarreling among themselves about the subject, and I end up a little upset and without answers to my problem."

"I considered establishing some kind of rule by which to determine the conclusion to questions posed to the Council, like for instance taking the opinion held by the majority of philosophers, but I fear that such a rule could lead to contradictory judgements over time. While I understand that each philosopher, taken in isolation, possesses a consistent view of the world and his tenets match that view, it is not clear how to combine all the different doctrines into a global set of precepts that can assist my heavy duties as the ruler of the country."

"I see that your skills in the ways of Mathematics are second to none, and a mathematical problem I have indeed. Would you help the King to find the Rule that will determine the outcome of questions posed to the Council of Philosophers? The compensation for your work would be in accordance with your mathematical eminence and my royal munificence" concluded the King, leaning back and slowly rubbing his jewellery-loaded hands.

The mathematician pondered the proposal for a mere few seconds and then replied "I am your servant, my lord", lowering his eyes, which were somewhat obfuscated by the wine and the glitter of the King's rings.

Sunday, October 21, 2007

Lane inversion

I often observe the following when a highway with two lanes per direction approaches congestion: under light traffic conditions, most vehicles stay on the right lane so that the left lane is used for passing, as it should be; as traffic increases, the chances are higher that one cannot use the left lane to pass a slow vehicle because a third one is passing both, so users of the passing lane perceive there is a penalty in returning to the right, and ultimately most vehicles stay on the left lane; when this happens, the right lane begins to be used as a passing lane (even though in Spain, and I guess in many other countries, passing on the right is forbidden), and it turns out that the left lane becomes the slowest one. I take this phenomenon of lane inversion to be a sure sign that total congestion is coming on.

I wonder how much the throughput (traffic before congestion) of the highway would be raised if passing on the right was allowed and either lane could be used in cruise conditions.

Wednesday, October 17, 2007

A little operations research into TV programming: analytical solution

We derive an analytical solution to the optimum TV scheduling (OTVS) problem posed in a prior entry. In what follows we assume ai,ci > 0 for all i=1,...,n. We begin by solving a generalization of OTVS that is amenable to a more regular treatment following techniques similar to those of Variational Calculus, and then seek ways to constrain this problem so as to find solutions to OTVS itself.

Definition. Given I a subset of {1,...,n}, the generalized OTVS problem on I, GOTVS(I), consists of finding a resource distribution {mi, mi > −ci, i in I} with ∑Imi = M that maximizes S({mi}) = ∑Is(mi) = ∑Iaimi/(mi+ci). Note that in OTVS we imposed the tighter restriction mi ≥ 0.

Lemma. Consider the space of all distributions {mi, mi > −ci, i in I} with ∑Imi = M. The limit of S({mi}) when min({mi+ci}) → 0 is −∞.

Proof. ∑Is(mi) can be split into its positive terms, whose sum is bounded by ∑Iai, and its negative terms, among which at least one tends to −∞.

Corollary. GOTVS(I) has a solution.

Proof. Given the result above, GOTVS(I) is equivalent to finding a maximum value of the continuous function S inside the compact set {mi, mi ≥ −ci+δ,i in I} with ∑Imi = M, for some fixed δ>0.

Theorem. Let di = ∂s(mi)/∂mi = aici/(mi+ci)². If {mi} is a solution to GOTVS(I), then

dj = dk, for all j,k in I.

Proof. If dj > dk, we could then obtain a net gain in S({mi}) by transferring some quantity δ from mk to mj.

Corollary. GOTVS(I) has a unique solution of the form

mi = k√(aici) − ci, i in I,

where

k = (M+∑Icj)/∑I√(ajcj).

Proof. Solve di = aici/(mi+ci)² = 1/k², i in I, with the additional constraint ∑Imi = M.

Now, solving OTVS reduces to finding a solution to GOTVS(I) with no negative components for some maximal subset I of {1,...,n}. It turns out that finding the support of the solution to OTVS is fairly simple:

Theorem. Let {mi} be a solution to OTVS and define ti = (∂s(mi)/∂mi)mi=0 = ai/ci. For all j,p in I, if tj ≥ tp and mp > 0 then mj > 0.

Proof. Let us suppose mj = 0; as tj = (∂s(mj)/∂mj)mj=0 ≥ tp = ap/cp > apcp/(mp+cp)² = ∂s(mp)/∂mp, transferring some quantity δ from mp to mj would yield a solution without negative components and with greater S than {mi}.

Corollary. Let σ: I → I be a permutation of I such that tσ(1) ≥ tσ(2) ≥ ··· ≥ tσ(n). There is some τ in {1,...,n} such that the support of solutions to OTVS is {σ(1),...,σ(τ)}; moreover, tσ(τ) is strictly greater than tσ(τ+1) (except, of course, if τ = n).

Corollary. The solution to OTVS is unique.

The following pseudocode sketches an algorithm for computing OTVS.

OTVS({ai}, {ci}, M)
m1,...,mn ← 0
knum ← M
kden ← 0
τ ← n
compute σ
for i = 1 to n
····if (knum+cσ(i))/(kden+√(aσ(i)cσ(i))) < cσ(i)/√(aσ(i)cσ(i)) then
········τ ← i−1
········break
····else
········knum ← knum + cσ(i)
········kden ← kden + √(aσ(i)cσ(i))
····end if
end for
k ← knum/kden
for i = 1 to τ
····mσ(i) ← k√(aσ(i)cσ(i)) − cσ(i)
end for

The idea is as follows: we keep visiting indices from the highest ti downward and maintain the running numerator and denominator of the value that k would take if the elements visited so far were exactly those comprising the support of the solution to OTVS. When we come upon an element whose computed mi would be negative, we know we have reached the cutoff index τ and that the k calculated so far (excluding that element) is the actual one, so we just use it to compute the final values of {mi}.
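
For completeness, here is a tentative C++ transcription of the pseudocode above (container choices and names are mine, not the original program's):

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// comparison functor: indices sorted by descending t_i = a_i/c_i
struct by_descending_ratio
{
  const std::vector<double> *a,*c;
  bool operator()(std::size_t i,std::size_t j)const
  {
    return (*a)[i]/(*c)[i]>(*a)[j]/(*c)[j];
  }
};

std::vector<double> otvs(
  const std::vector<double>& a,const std::vector<double>& c,double M)
{
  std::size_t n=a.size();
  std::vector<std::size_t> sigma(n);
  for(std::size_t i=0;i<n;++i)sigma[i]=i;
  by_descending_ratio cmp={&a,&c};
  std::sort(sigma.begin(),sigma.end(),cmp);

  double knum=M,kden=0;
  std::size_t tau=n;
  for(std::size_t i=0;i<n;++i){
    std::size_t s=sigma[i];
    double root=std::sqrt(a[s]*c[s]);
    if((knum+c[s])/(kden+root)<c[s]/root){tau=i;break;} // m_s would be negative
    knum+=c[s];kden+=root;
  }
  double k=kden>0?knum/kden:0;
  std::vector<double> m(n,0.0); // slots outside the support get m_i = 0
  for(std::size_t i=0;i<tau;++i){
    std::size_t s=sigma[i];
    m[s]=k*std::sqrt(a[s]*c[s])-c[s];
  }
  return m;
}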