A celebrated result in the theory of random matrices is the connection between the extremal eigenvalue of a random matrix sampled from the Gaussian Unitary Ensemble and the Painlevé II equation. In this post I will give a derivation of this connection using the Hankel composition method. This method is developed in great generality in the paper of Bothner and was used in our joint work to relate the distribution function of the extremal eigenvalue in the elliptic Ginibre ensemble to an integro-differential equation. Seeing how the method works in the case of the GUE should shed some light on how it works in other cases.

Background

The Gaussian Unitary Ensemble (GUE) is an ensemble of $n \times n$ Hermitian random matrices with probability density

$$\frac{1}{Z_{\mathrm{GUE}}} \mathrm{e}^{- \mathrm{tr}(H^2)}$$

where $Z_{\mathrm{GUE}}$ is a normalisation constant. We are interested in the distribution of the extremal (rightmost) eigenvalue. A famous result (see Chapter 24 of Mehta’s Random Matrices) shows that the cumulative distribution converges, under an appropriate scaling, to the Fredholm determinant of the Airy kernel. Let $\lambda_n$ be the rightmost eigenvalue. Then

$$F(t) \equiv \lim_{n \to \infty} \mathbb{P}\left(\lambda_n \leq \sqrt{2n} + \frac{t}{\sqrt{2}\, n^{1/6}}\right) = \det(1 - K)_{L^2(t,\infty)}$$

where $K: L^2(t,\infty) \to L^2(t,\infty)$ is the operator with kernel

$$K(x,y) = \frac{\mathrm{Ai}(x) \mathrm{Ai}^\prime(y) - \mathrm{Ai}^\prime(x)\mathrm{Ai}(y) }{x-y} = \int_0^\infty \mathrm{Ai}(x+s)\mathrm{Ai}(y+s) \, \mathrm{d}s.$$
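The equality of the two expressions for the Airy kernel can be sanity-checked numerically. The following is only a sketch: the function names are mine, it assumes NumPy and SciPy are available, and the upper integration limit `cutoff=40.0` is an arbitrary truncation that relies on the rapid decay of $\mathrm{Ai}$ at $+\infty$.

```python
from scipy.integrate import quad
from scipy.special import airy


def airy_kernel_closed(x, y):
    """Closed form (Ai(x) Ai'(y) - Ai'(x) Ai(y)) / (x - y), valid for x != y."""
    ai_x, aip_x, _, _ = airy(x)
    ai_y, aip_y, _, _ = airy(y)
    return (ai_x * aip_y - aip_x * ai_y) / (x - y)


def airy_kernel_integral(x, y, cutoff=40.0):
    """Integral form: int_0^cutoff Ai(x+s) Ai(y+s) ds (cutoff is an assumed truncation)."""
    value, _ = quad(lambda s: airy(x + s)[0] * airy(y + s)[0], 0.0, cutoff)
    return value


# Compare the two representations at a couple of arbitrary sample points.
for x, y in [(-1.0, 0.5), (0.3, 2.0)]:
    print(x, y, airy_kernel_closed(x, y), airy_kernel_integral(x, y))
```

The integral form is the one that matters for the rest of the post, since it exhibits $K$ as a composition of Hankel-type operators.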

The motivation for studying this is not simply that the GUE is an easy model to study, but also that this Airy kernel is universal (see Deift’s Orthogonal Polynomials and Random Matrices: A Riemann-Hilbert Approach). That is, suppose we have an ensemble of Hermitian matrices with probability density

$$\frac{1}{Z_{V}} \mathrm{e}^{- n \mathrm{tr} V(H)}$$

for some entire function $V$ which grows sufficiently rapidly at $\pm \infty$, e.g. a polynomial. For generic $V$, the eigenvalues will asymptotically ($n \to \infty$) concentrate on disjoint intervals $[\alpha_1, \beta_1], \dots , [\alpha_m, \beta_m]$; and the distribution of the extremal eigenvalue at these endpoints $\alpha_1, \beta_1, \dots, \alpha_m , \beta_m$ will converge after a suitable rescaling, for “typical” $V$, to $\det(1 - K)_{L^2(t,\infty)}$. There is a similar universality in the bulk where the “universal” kernel is the sine kernel. Gap probabilities in the sine point process were found to be related to the Painlevé V equation by the group of Jimbo, Miwa, Môri and Sato in 1980 (see here for an accessible introduction to this work). The work of Tracy and Widom on the Airy kernel was strongly inspired by the work of this group.

The Connection to Painlevé II

Theorem (Tracy and Widom 1993):

$$\boxed{F(t) = \exp\left(-\int_t^\infty (s-t)q(s)^2 \, \mathrm{d}s\right)}$$

where $q$ solves Painlevé II

$$\boxed{q^{\prime \prime}(t) = t q(t)+ 2 q(t)^3}$$

and we have the boundary condition $q(t) \sim \mathrm{Ai}(t)$ as $t \to +\infty$. $\triangle$

We demonstrate the above by showing that

$$\frac{\mathrm{d}^2}{\mathrm{d}t^2} \log F(t) = -q(t)^2 .$$

We then obtain the above formula by integrating twice. To justify this requires showing that $\log F(t)$ and $\frac{\mathrm{d}}{\mathrm{d}t} \log F(t)$ tend to zero at $t = +\infty$. Showing this requires an asymptotic analysis of the Fredholm determinant $\det(1 - K)_{L^2(t,\infty)}$ which is beyond the scope of this post.
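Before going through the derivation, the theorem can be sanity-checked numerically; this is a rough sketch and certainly not a proof. It assumes NumPy and SciPy. The determinant is approximated by Gauss-Legendre quadrature on the truncated interval $(t, 16)$, and Painlevé II is integrated backwards from $s = 8$ with $q(8) = \mathrm{Ai}(8)$ and with $\log F$ and its derivative set to zero there. The names `fredholm_cdf` and `painleve_cdf`, the cutoffs, the node count and the tolerances are all my own choices.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.special import airy


def fredholm_cdf(t, nodes=64, upper=16.0):
    """Quadrature approximation of det(1 - K) on L^2(t, upper)."""
    z, w = np.polynomial.legendre.leggauss(nodes)
    x = 0.5 * (upper - t) * z + 0.5 * (upper + t)
    w = 0.5 * (upper - t) * w
    ai, aip, _, _ = airy(x)
    diff = x[:, None] - x[None, :]
    np.fill_diagonal(diff, 1.0)  # placeholder, the diagonal is overwritten below
    kernel = (ai[:, None] * aip[None, :] - aip[:, None] * ai[None, :]) / diff
    np.fill_diagonal(kernel, aip**2 - x * ai**2)  # limit of K(x, y) as y -> x
    root = np.sqrt(w)
    return np.linalg.det(np.eye(nodes) - root[:, None] * kernel * root[None, :])


def painleve_cdf(t, start=8.0):
    """exp(log F(t)) from q'' = s q + 2 q^3 and (log F)'' = -q^2, integrated backwards."""
    ai0, aip0, _, _ = airy(start)

    def rhs(s, y):
        q, dq, dlog_f, _log_f = y
        return [dq, s * q + 2.0 * q**3, -q**2, dlog_f]

    sol = solve_ivp(rhs, (start, t), [ai0, aip0, 0.0, 0.0],
                    method="DOP853", rtol=1e-12, atol=1e-16)
    return float(np.exp(sol.y[3, -1]))


for t in (-2.0, 0.0, 1.0):
    print(t, fredholm_cdf(t), painleve_cdf(t))
```

The backward integration is only trustworthy on a moderate range of $t$, because the relevant solution of Painlevé II is numerically unstable for large negative $t$.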

The first step is to bring the $t$ dependence into the operator. Let

$$K_t(x,y) = K(x+t,y+t) = \int_t^\infty \mathrm{Ai}(x+s) \mathrm{Ai}(y+s) \, \mathrm{d}s .$$

Then $F(t)=\det(1-K)_{L^2(t,\infty)} = \det(1-K_t)_{L^2(\mathbb{R}_+)}$.

Notation: We let $\tau_t$ be the shift operator, so that $(\tau_t \phi)(x) = \phi(x+t)$, and $D$ be the derivative operator, $(D\phi)(x) = \phi^\prime(x)$. We shall be somewhat careless and not specify the spaces on which these operators act. Let us also abbreviate the Airy function as $A = \mathrm{Ai}$. $\triangle$

We see that $\frac{\mathrm{d}}{\mathrm{d}t} K_t(x,y) = - A(x+t)A(y+t)$. Thus

$$\frac{\mathrm{d}}{\mathrm{d}t} K_t = - \tau_t A \otimes \tau_t A .$$
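The pointwise derivative of the kernel is easy to test with a finite difference. This is a small sketch, assuming NumPy and SciPy; the helper names and the step size `1e-5` are illustrative choices of mine, and the closed form of the kernel is used so the sample points must satisfy $x \neq y$.

```python
from scipy.special import airy


def shifted_kernel(t, x, y):
    """K_t(x, y) = K(x + t, y + t), using the closed form of the Airy kernel (x != y)."""
    ai_x, aip_x, _, _ = airy(x + t)
    ai_y, aip_y, _, _ = airy(y + t)
    return (ai_x * aip_y - aip_x * ai_y) / (x - y)


def kernel_t_derivative(t, x, y, step=1e-5):
    """Central finite difference of K_t(x, y) with respect to t."""
    return (shifted_kernel(t + step, x, y) - shifted_kernel(t - step, x, y)) / (2.0 * step)


def rank_one_value(t, x, y):
    """Kernel of -(tau_t A) tensor (tau_t A), that is -Ai(x + t) Ai(y + t)."""
    return -airy(x + t)[0] * airy(y + t)[0]


# The two printed columns should agree up to finite-difference error.
for t, x, y in [(-1.0, 0.2, 1.5), (0.5, 0.0, 2.0)]:
    print(t, kernel_t_derivative(t, x, y), rank_one_value(t, x, y))
```

Of course this only probes the kernel pointwise; it says nothing about convergence in trace norm.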

Remark: I should signpost a point of rigour. We have calculated the derivative with respect to $t$ pointwise on the kernel, but in fact what we’d like is for the limit implicit in the derivative to exist in the trace norm, and a complete proof would show this. $\triangle$

Then we see by Jacobi’s formula

$$\frac{\mathrm{d}}{\mathrm{d}t} \log F(t) = -\mathrm{tr}_{L^2(\mathbb{R}_+)} \left( (1-K_t)^{-1} \frac{\mathrm{d}K_t}{\mathrm{d}t} \right) = \mathrm{tr}_{L^2(\mathbb{R}_+)} \left( (1-K_t)^{-1} \tau_t A \otimes \tau_t A \right) .$$

Remark: Note that $\mathrm{tr}_{L^2(\mathbb{R}_+)} \left(\psi \otimes \phi\right) = \langle \psi, \phi \rangle_{L^2(\mathbb{R}_+)} = \int_{\mathbb{R}_+} \psi(x) \phi(x) \, \mathrm{d}x$ (we will always take functions to be real valued). Note also that $K_t$, and hence $(1-K_t)^{-1}$, are symmetric operators with respect to this inner product. $\triangle$

Next we use the identity

$$\frac{\mathrm{d}}{\mathrm{d}t} (1-K_t)^{-1} = (1-K_t)^{-1} \frac{\mathrm{d}K_t}{\mathrm{d}t} (1-K_t)^{-1} = - (1-K_t)^{-1} (\tau_t A \otimes \tau_t A) (1-K_t)^{-1} .$$

Observe that $\mathrm{tr}((\alpha \otimes \beta)(\gamma \otimes \delta)) = \mathrm{tr}(\alpha \otimes \delta) \, \mathrm{tr}(\beta \otimes \gamma)$. Differentiating the formula for $\frac{\mathrm{d}}{\mathrm{d}t} \log F(t)$ once more, and using $\frac{\mathrm{d}}{\mathrm{d}t} \tau_t A = D \tau_t A$ together with the symmetry of $(1-K_t)^{-1}$, we obtain

$$\frac{\mathrm{d}^2}{\mathrm{d}t^2} \log F(t) = 2 \, \mathrm{tr}_{L^2(\mathbb{R}_+)} \left( (1-K_t)^{-1} D \tau_t A \otimes \tau_t A\right) - \left(\mathrm{tr}_{L^2(\mathbb{R}_+)} \left( (1-K_t)^{-1} \tau_t A \otimes \tau_t A\right)\right)^2 .$$
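For readers who want the intermediate steps, here is a sketch of where the terms come from. The shorthand $R_t = (1-K_t)^{-1}$ and $a_t = \tau_t A$ is introduced only for this aside, and $\langle \cdot , \cdot \rangle$ is the inner product on $L^2(\mathbb{R}_+)$.

$$\begin{aligned}
\frac{\mathrm{d}^2}{\mathrm{d}t^2} \log F(t) &= \frac{\mathrm{d}}{\mathrm{d}t} \langle R_t a_t, a_t \rangle \\
&= \left\langle \tfrac{\mathrm{d}R_t}{\mathrm{d}t} a_t, a_t \right\rangle + \langle R_t D a_t, a_t \rangle + \langle R_t a_t, D a_t \rangle \\
&= - \langle R_t (a_t \otimes a_t) R_t a_t, a_t \rangle + 2 \langle R_t D a_t, a_t \rangle \\
&= - \langle R_t a_t, a_t \rangle^2 + 2 \langle R_t D a_t, a_t \rangle .
\end{aligned}$$

The third line uses the symmetry of $R_t$ to merge the two terms containing $D a_t$, and the last line uses $(a_t \otimes a_t)\psi = a_t \langle a_t, \psi \rangle$.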

A hierarchy of coupled ODEs

Introduce the following notation,

$$q_n(t) = ((1-K_t)^{-1} D^n \tau_t A)(0) ,$$

$$p_n(t) = \mathrm{tr}_{L^2(\mathbb{R}_+)} \left( (1-K_t)^{-1} D^n \tau_t A \otimes \tau_t A\right).$$

In this notation we have $\frac{\mathrm{d}^2}{\mathrm{d}t^2} \log F(t) = 2 p_1(t) - p_0(t)^2$.

We now compute

$$\frac{\mathrm{d}}{\mathrm{d}t} q_n(t) = q_{n+1}(t)- q_0(t)p_n(t) ,$$

$$\frac{\mathrm{d}}{\mathrm{d}t} p_n(t) = p_{n+1}(t)- p_0(t)p_n(t) + \underbrace{\mathrm{tr}_{L^2(\mathbb{R}_+)} \left( (1-K_t)^{-1} D^n \tau_t A \otimes D\tau_t A\right)}_{(\ast)} .$$

To compute $(\ast)$ we integrate by parts

$$(\ast) = -q_n(t)(\tau_t A)(0) - \underbrace{\mathrm{tr}_{L^2(\mathbb{R}_+)} \left( D(1-K_t)^{-1} D^n \tau_t A \otimes \tau_t A\right)}_{(\ast \ast)} .$$

Next we use the identity $[D,(1-K_t)^{-1}] = (1-K_t)^{-1} [D,K_t] (1-K_t)^{-1}$. We now give the following important exercise.

Exercise: Let $\phi \in L^2(\mathbb{R}_+)$ be a sufficiently nice function (e.g. continuously differentiable with $\phi^\prime \in L^2(\mathbb{R}_+)$). Then

$$([D,K_t]\phi)(x) = - ((\tau_t A \otimes \tau_t A)\phi)(x) + \phi(0) K_t(x,0) .$$

(Hint: integrate by parts in $K_t D \phi$; the boundary term at $0$ produces the last term, and $(\partial_x + \partial_y) K_t(x,y) = -A(x+t)A(y+t)$ produces the first.)

This yields the formula

$$(\ast \ast) = p_{n+1}(t)- p_0(t)p_n(t) + q_n(t)((1-K_t)^{-1} K_t \tau_t A)(0) .$$

Using that $(1-K_t)^{-1} K_t = (1-K_t)^{-1} - 1$, we obtain a formula for $\frac{\mathrm{d}}{\mathrm{d}t} p_n(t)$. We thus obtain an infinite hierarchy of coupled ODEs, $n \in \mathbb{N}$,

$$\boxed{\frac{\mathrm{d}}{\mathrm{d}t} q_n(t) = q_{n+1}(t)- q_0(t)p_n(t), }$$

$$\boxed{\frac{\mathrm{d}}{\mathrm{d}t} p_n(t) = -q_{n}(t)q_0(t).}$$

Exercise: The quantity $C = p_0(t)^2 - q_0(t)^2 - 2 p_1(t)$ is conserved. (There are actually infinitely many such conserved quantities but we only need this one.)
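One possible way to do this exercise is to differentiate $C$ directly and substitute the boxed equations for $n = 0$ and $n = 1$, namely $p_0^\prime = -q_0^2$, $p_1^\prime = -q_1 q_0$ and $q_0^\prime = q_1 - q_0 p_0$:

$$\begin{aligned}
\frac{\mathrm{d}C}{\mathrm{d}t} &= 2 p_0 p_0^\prime - 2 q_0 q_0^\prime - 2 p_1^\prime \\
&= 2 p_0 (-q_0^2) - 2 q_0 (q_1 - q_0 p_0) - 2 (-q_1 q_0) \\
&= -2 p_0 q_0^2 - 2 q_0 q_1 + 2 q_0^2 p_0 + 2 q_0 q_1 = 0 .
\end{aligned}$$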

Corollary: Since the Airy function decays rapidly at $+\infty$, it seems reasonable that $q_n$ and $p_n$ should tend to zero as $t\to +\infty$. It therefore follows that $C=0$, that is, $2 p_1(t) - p_0(t)^2 = -q_0(t)^2$. From this it follows that

$$\frac{\mathrm{d}^2}{\mathrm{d}t^2} \log F(t) = - q_0(t)^2 .$$

Remark: It is “obvious” that since $K_t$ is “small” for $t\to +\infty$

$$q_0(t) \approx (\tau_t A)(0) = \mathrm{Ai}(t) .$$

This explains the boundary condition. This needs to be rigorously justified but is beyond the scope of this post. $\triangle$

Closing up the system

Everything up until now has been “universal” – in that we haven’t used any properties of $A$ – we have only used the Hankel composition structure of $K$. In particular, we haven’t used that $A$ solves the Airy equation, $D^2 A = M A$, where $M$ is the operator such that $(M \phi)(x) = x \phi(x)$. Such a “non-universal” property allows us to close up the system and obtain an ODE for $q_0$. Note that $(M \phi)(0) = 0$.

From this we get

$$q_2(t) = t q_0(t) + ([(1-K_t)^{-1},M] \tau_t A)(0) .$$
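In case the step is not obvious, here is a sketch of it. Shifting the Airy equation gives $D^2 \tau_t A = \tau_t D^2 A = \tau_t M A = (M + t) \tau_t A$, since $(\tau_t M A)(x) = (x+t) A(x+t)$. Writing $R_t = (1-K_t)^{-1}$ and $a_t = \tau_t A$ (shorthand used only in this aside),

$$\begin{aligned}
q_2(t) &= (R_t D^2 a_t)(0) = (R_t (M + t) a_t)(0) \\
&= t q_0(t) + (R_t M a_t)(0) \\
&= t q_0(t) + (M R_t a_t)(0) + ([R_t, M] a_t)(0) \\
&= t q_0(t) + ([R_t, M] a_t)(0) ,
\end{aligned}$$

where the last line uses $(M \phi)(0) = 0$.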

As before $[(1-K_t)^{-1},M] = (1-K_t)^{-1} [ K_t, M](1-K_t)^{-1}$. If we recall our two equivalent formulae for the Airy kernel we see that

$$[ K_t, M] = - \tau_t A \otimes D\tau_t A + D\tau_t A \otimes \tau_t A .$$

This gives

$$q_2(t) = t q_0(t) + q_1(t)p_0(t)- q_0(t)p_1(t) .$$

If we combine this formula with our relation $p_0(t)^2 - q_0(t)^2 - 2 p_1(t) = 0$ we find

$$\boxed{q_0^{\prime \prime}(t) = tq_0(t) + 2 q_0(t)^3}$$

which is Painlevé II.
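For completeness, here is a sketch of the final computation. It uses $q_0^\prime = q_1 - q_0 p_0$, $q_1^\prime = q_2 - q_0 p_1$ and $p_0^\prime = -q_0^2$ from the hierarchy, the formula for $q_2$, and the relation $p_0^2 - 2 p_1 = q_0^2$:

$$\begin{aligned}
q_0^{\prime \prime} &= q_1^\prime - q_0^\prime p_0 - q_0 p_0^\prime \\
&= (q_2 - q_0 p_1) - (q_1 - q_0 p_0) p_0 + q_0^3 \\
&= t q_0 + q_1 p_0 - q_0 p_1 - q_0 p_1 - q_1 p_0 + q_0 p_0^2 + q_0^3 \\
&= t q_0 + q_0 (p_0^2 - 2 p_1) + q_0^3 \\
&= t q_0 + 2 q_0^3 .
\end{aligned}$$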