<p><em>A little mathematics: Alex Little's maths blog. Here I post about my research and stuff that interests me, in particular random matrix theory, statistical mechanics and integrable systems.</em></p>
<h1>Integrable structure in the elliptic Ginibre ensemble</h1>
<p><em>Posted 2022-11-16.</em></p>
<p>This post will provide an accessible introduction to work contained in my <a href="https://arxiv.org/abs/2208.04684">joint paper</a> with Thomas Bothner. In this work we find integrable structure in the elliptic Ginibre ensemble and find interesting connections to the theory of <em>finite temperature</em> Airy processes. Our paper also performs a nonlinear steepest descent analysis of the cumulative distribution function of the rightmost particle, but this is left out of this post to keep the discussion nontechnical.</p>
<h2>Background and motivation</h2>
<p>Suppose you have an ecosystem with \(n\) species, with populations \(x_1, \dots, x_n\). Is this ecosystem stable? This question was raised by Robert May in his article <a href="https://www.gwern.net/docs/sociology/1972-may.pdf">“Will a Large Complex System be Stable?”</a> (1972). Suppose as a first approximation that each species is isolated and the only thing limiting population is competition over some non-organic resources. You could model this by the system of uncoupled ODEs</p>
\[\dot{x}_i = - \mu x_i\]
<p>for \(\mu > 0\). We have shifted the populations so that \(x_i = 0\) is the equilibrium. Now if we wanted to model interactions between species we could write</p>
\[\dot{x}_i = - \mu x_i + \sum_{j=1}^n A_{ij} x_j\]
<p>where \(A\) is some interaction matrix. \(A_{ij}\) represents the effect of the population of species \(j\) on the growth rate of species \(i\). (Note that there is no reason to think this matrix is symmetric.) This could be written in vector form</p>
\[\dot{\mathbf{x}} = (- \mu \mathbb{I} + A)\mathbf{x}\]
<p>It is clear that this system is stable when \(\mu > \max_i \Re \lambda_i\) and unstable when \(\mu < \max_i \Re \lambda_i\), where \(\lambda_1, \dots, \lambda_n\) are the eigenvalues of \(A\). Given that we expect \(n\) to be very large and the interactions between the species very complex, we cannot expect to model \(A\) exactly. Instead we take \(A\) to be random. Note that the probability of stability, \(\mathbb{P}(\max_i \Re \lambda_i < \mu)\), is exactly the cumulative distribution function for \(\max_i \Re \lambda_i\).</p>
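<p>May's criterion is easy to see in action numerically. The sketch below is my own illustration (not part of May's paper): sampling \(A\) with iid complex Gaussian entries of variance \(1/n\), the circular law pins \(\max_i \Re \lambda_i\) near \(1\), so any damping rate \(\mu > 1\) makes the system stable with high probability.</p>

```python
import numpy as np

# Sample a random interaction matrix A with iid complex Gaussian entries of
# variance 1/n, so that its spectrum fills the unit disk for large n, and
# test May's stability criterion mu > max_i Re(lambda_i).
rng = np.random.default_rng(0)
n = 500
A = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
lam = np.linalg.eigvals(A)

mu = 1.5                      # damping rate of the uncoupled system (my choice)
print(lam.real.max())         # close to 1 for large n
print(lam.real.max() < mu)    # the linearised system is stable for this draw
```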
<p>In our work we are interested in \(A \in \mathcal{M}_n(\mathbb{C})\), so it models stability of a linear system over the field of <em>complex</em> numbers. I will list three candidate distribution functions for \(A\). This list is by no means exhaustive – you can give many others – however these three cases are “exactly solvable” in that one can compute their correlation functions explicitly.</p>
<h3>The (complex) Ginibre Ensemble</h3>
<p>This ensemble was introduced by <a href="https://aip.scitation.org/doi/10.1063/1.1704292">Ginibre in 1965</a>. This is an ensemble of \(n \times n\) matrices with independent, identically distributed matrix elements, each given by complex Gaussians. One could write the density as</p>
\[P_{\mathrm{Gin}}(X) \, dX = \pi^{-n^2} e^{-\mathrm{tr}\, X X^\ast} \, d X\]
<p>The eigenvalues have the joint distribution</p>
\[\varrho_n(z_1, \dots, z_n) = C_n e^{-\sum_{k=1}^n |z_k|^2} \prod_{1 \leq i < j \leq n } |z_i - z_j|^2\]
<p>The Ginibre ensemble obeys a “circular law.” If we define the 1-point density as</p>
\[\varrho(x) = \frac{1}{n} \int_{(\mathbb{C})^{n-1}} \varrho_n(x, z_2, \dots, z_n) \, d^2 z_2 \dots d^2 z_n\]
<p>then \(\varrho(\sqrt{n}x ) \to \frac{1}{\pi} \chi_{\lvert x \rvert < 1}\) as \(n \to\infty\) in the weak-\(\ast\) sense. That is, the spectral density tends towards the uniform measure on the unit disk. Furthermore, the eigenvalue with largest real part is asymptotically Gumbel distributed. More precisely</p>
\[\mathbb{P}\left( \max_i \Re \lambda_i \leq \sqrt{n} + \sqrt{\frac{\gamma_n}{4}} + \frac{t}{\sqrt{4 \gamma_n}} \right) \to e^{-e^{-t}}\]
<p>as \(n \to \infty\), where</p>
\[\gamma_n = \frac{1}{2}(\ln n - 5 \ln \ln n - \ln (2\pi^4))\]
<p>This result can be found in <a href="https://arxiv.org/abs/2206.04443">G. Cipolloni, L. Erdős, D. Schröder, and Y. Xu (2022)</a>. Furthermore, we show, in work forthcoming on the arXiv, that the real parts of the eigenvalues converge locally, in the bulk of the spectrum, to a Poisson point process as \(n\to \infty\).</p>
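<p>The circular law is easy to observe numerically. Here is a minimal sketch of mine (not taken from any of the cited papers) sampling from the Ginibre density above and checking that the rescaled eigenvalues fill the unit disk uniformly:</p>

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
# Complex Ginibre matching the density above: iid entries with E|X_ij|^2 = 1
X = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
z = np.linalg.eigvals(X) / np.sqrt(n)      # circular-law rescaling

print(np.mean(np.abs(z) < 1.05))           # nearly all eigenvalues in the disk
print(np.mean(np.abs(z) < 0.5))            # about 1/4: the density is uniform
```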
<h3>The Gaussian Unitary Ensemble</h3>
<p>The Gaussian Unitary Ensemble (GUE) is an ensemble of \(n \times n\) complex Hermitian random matrices. The “unitary” in the name comes from the fact that the ensemble has a unitary symmetry. The density is formally very similar to the Ginibre case</p>
\[P_{\mathrm{GUE}}(X) \, dX = C^\prime_n e^{-\mathrm{tr}(X^2)} \, dX\]
<p>The difference is that this density is restricted to the subspace of matrices such that \(X=X^\ast\). In this case all the eigenvalues lie on \(\mathbb{R}\) and their joint density is given by</p>
\[\varrho_n(x_1, \dots , x_n) = C^{\prime \prime}_n e^{-\sum_{k=1}^n x_k^2} \prod_{1\leq i < j \leq n }|x_i - x_j|^2\]
<p>This looks practically identical to the case of Ginibre, but in fact the restriction to \(\mathbb{R}\) rather than \(\mathbb{C}\) changes things considerably. As is well known, the 1-point density <a href="https://mathworld.wolfram.com/WignersSemicircleLaw.html">converges to a semicircular distribution</a>. Furthermore, its rightmost eigenvalue asymptotically obeys a <a href="https://arxiv.org/abs/hep-th/9211141">Tracy-Widom (1994)</a> law. That is</p>
\[\mathbb{P}\left( \max_{i=1, \dots, n} \lambda_i \leq \sqrt{2n} + \frac{t}{\sqrt{2} n^\frac{1}{6}} \right) \to \exp\left(-\int_t^\infty (s-t) q(s)^2 \, ds\right)\]
<p>as \(n \to \infty\), and where \(q\) satisfies Painlevé II,</p>
\[q^{\prime \prime}(t) = t q(t) + 2 q(t)^3\]
<p>with boundary condition \(q(t) \sim \mathrm{Ai}(t)\) as \(t \to +\infty\). This Tracy-Widom distribution is a highly universal object appearing in many different contexts such as random matrix theory, Ulam’s problem, the KPZ equation, asymmetric exclusion processes, Aztec diamonds etc. This result is also celebrated because it connects random matrix theory to integrable systems (though it was not the first to do so).</p>
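<p>Both the semicircle law and the location of the edge are visible in small simulations. The following sketch (my own illustration) samples from the GUE density above, with the normalisation \(e^{-\mathrm{tr}(X^2)}\) for which the spectrum fills \([-\sqrt{2n}, \sqrt{2n}]\):</p>

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
# GUE draw with density ~ exp(-tr X^2): diagonal entries ~ N(0, 1/2),
# off-diagonal complex entries with E|X_ij|^2 = 1/2
G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
X = (G + G.conj().T) / 2
lam = np.linalg.eigvalsh(X) / np.sqrt(2 * n)   # rescale the support to [-1, 1]

print(lam.max())                    # close to 1 (Tracy-Widom fluctuations)
print(np.mean(np.abs(lam) < 0.5))   # semicircle mass of [-1/2, 1/2], about 0.61
```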
<h3>The Elliptic Ginibre Ensemble</h3>
<p>We start with the following observation. A Ginibre matrix can be generated by</p>
\[X = \frac{1}{\sqrt{2}}(H_1 + i H_2)\]
<p>where \(H_1, H_2\) are independently sampled GUE matrices. Motivated by this, consider the random matrix</p>
\[X = \sqrt{\frac{1+\tau}{2}}H_1 + i \sqrt{\frac{1-\tau}{2}} H_2\]
<p>for \(\tau \in [0,1]\). \(\tau = 0\) yields the Ginibre ensemble, \(\tau = 1\) yields the GUE. Thus we can interpolate between these two ensembles. This is called the “elliptic Ginibre ensemble.” Its density is given by</p>
\[P_{\mathrm{eGin}}(X)\, dX = C_n^{\prime \prime \prime} e^{- \frac{1}{1-\tau^2}\mathrm{tr}\, (X X^\ast - \tau \Re (X^2))} \, dX\]
<p>The ensemble derives its name because, up to a \(\sqrt{n}\) scaling, the spectral density tends (as \(n \to \infty\)) towards a constant on the ellipse</p>
\[\left\{ (x,y) \in \mathbb{R}^2 \, : \frac{x^2}{(1+\tau)^2} + \frac{y^2}{(1-\tau)^2} < 1 \right\} \subset \mathbb{R}^2 \simeq \mathbb{C}\]
<p>and zero outside.</p>
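<p>The elliptic law is also easy to see in simulation. In the sketch below (an illustration of mine) I normalise the GUE factors so that \(\mathbb{E}\lvert H_{ij} \rvert^2 = 1\), so that \(\tau = 0\) reproduces Ginibre with unit-variance entries; this differs by a scale factor from the GUE density written earlier, and with this convention the rescaled spectrum fills the ellipse with semi-axes \(1+\tau\) and \(1-\tau\).</p>

```python
import numpy as np

def gue(n, rng):
    # Hermitian Gaussian matrix normalised so that E|H_ij|^2 = 1; with this
    # convention tau = 0 below gives Ginibre with unit-variance entries
    G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    return (G + G.conj().T) / np.sqrt(2)

rng = np.random.default_rng(3)
n, tau = 800, 0.5
X = np.sqrt((1 + tau) / 2) * gue(n, rng) + 1j * np.sqrt((1 - tau) / 2) * gue(n, rng)
z = np.linalg.eigvals(X) / np.sqrt(n)

# fraction of rescaled eigenvalues inside (a slight enlargement of) the ellipse
inside = (z.real / (1 + tau)) ** 2 + (z.imag / (1 - tau)) ** 2 < 1.1
print(np.mean(inside))   # close to 1
```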
<p>In order to see something interesting for the extremal eigenvalue we need to take a double scaling limit, where \(\tau_n = 1 - \frac{\sigma^2}{n^\frac{1}{3}}\), for \(\sigma > 0\) a fixed parameter. This limit, in which the ellipse is very “flat,” is called “weak non-Hermiticity.” We are interested in the distribution of the largest real part in the limit. This scaling (due to <a href="https://link.springer.com/article/10.1007/s00440-009-0207-9">Bender, 2010</a>) is carefully chosen. Let \(\tau_n\) tend to \(1\) too fast and local correlations look like the GUE, too slow and local correlations look like Ginibre.</p>
<h2>Results</h2>
<p><a href="https://link.springer.com/article/10.1007/s00440-009-0207-9">Bender (2010)</a> defines a rescaled point process at the edge of the ellipse. One zooms in on the edge in a specified way.</p>
\[x_j := \frac{\Re \lambda_j - c_n}{a_n}\]
\[y_j := \frac{\Im \lambda_j}{b_n}\]
<p>The scalings \(a_n, b_n, c_n\) can be found in Bender’s paper, and I will just state the result: that the rescaled point process \(\{ z_j := (x_j , y_j) \}_{j=1}^n\) converges as \(n\to \infty\). This yields a determinantal point process parametrised by \(\sigma > 0\). We now arrive at our main result.</p>
<p><strong>Theorem (Bothner-L.):</strong></p>
\[\boxed{F_\sigma(t) := \lim_{n \to \infty} \mathbb{P}\left( \max_{j=1,\dots, n} x_j \leq t \right) = \exp\left( - \int_t^\infty (s-t) \left[ \int_\mathbb{R} p_\sigma(s,y)^2 \, d\nu_\sigma(y) \right] ds \right)}\]
<p>where \(d\nu_\sigma(\lambda) = \frac{1}{\sigma \sqrt{\pi}} e^{-\left( \frac{\lambda}{\sigma}\right)^2}\, d\lambda\) and where \(p_\sigma\) satisfies <em>integro-differential Painlevé II</em>,</p>
\[\boxed{\frac{\partial^2}{\partial t^2} p_\sigma(t,y) = \left[ t+y+2\int_\mathbb{R} p_\sigma(t,\lambda)^2 \, d \nu_\sigma(\lambda) \right] p_\sigma(t,y)}\]
<p>with boundary condition \(p_\sigma(t,y) \sim \mathrm{Ai}(t+y)\) as \(t \to +\infty\), pointwise in \(y \in \mathbb{R}\). \(\triangle\)</p>
<p>How does one see that this generalises the Tracy-Widom result? If one takes \(\sigma \downarrow 0\) one expects to reduce to the GUE. One sees that \(d\nu_\sigma(\lambda) \to \delta(\lambda) \, d\lambda\), and so</p>
\[\lim_{\sigma \downarrow 0} F_\sigma(t) = \exp\left( - \int_t^\infty (s-t) p_\sigma(s,0)^2 \, ds \right)\]
<p>and our integro-differential equation, when evaluated at \(y =0\), reduces to Painlevé II, with the right boundary condition.</p>
<p>On the other hand, recovering the Gumbel distribution as \(\sigma \to \infty\) is actually harder; we carry this out in our paper by studying the asymptotics of the corresponding Fredholm determinant rather than working with the integro-differential equation. We also look at asymptotics of \(F_\sigma(t)\) under various scalings of \(t\) and \(\sigma\) by means of nonlinear steepest descent techniques.</p>
<p><strong>Remark:</strong> There is work left to be done here, especially for the left tail \(t \to -\infty\), since we only obtain asymptotics in a scaling régime where \(\sigma\) is very small.</p>
<h2>Sketch of proof</h2>
<p>As the point process defined by Bender is determinantal, gap probabilities may be expressed in terms of Fredholm determinants.</p>
\[F_\sigma(t) = \det(1- K^{\sigma}_{\mathrm{Ai}})_{L^2((t,\infty)\times \mathbb{R})}\]
<p>The kernel of \(K^{\sigma}_{\mathrm{Ai}}\) (found by <a href="https://link.springer.com/article/10.1007/s00440-009-0207-9">Bender, 2010</a>) is complicated and is given in Equation 1.10 of <a href="https://arxiv.org/abs/2208.04684">our paper</a>. We make the following observation. Let \(\mathcal{F} : L^2(\mathbb{R}_+ \times \mathbb{R}) \to L^2(\mathbb{R}_+ \times \mathbb{R})\)</p>
\[\mathcal{F}f(x,\eta) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty e^{-i y \eta} f(x,y) \, dy\]
<p>be the Fourier transform in the vertical (imaginary) direction. Then we have</p>
\[K_\sigma = \mathcal{F} K^{\sigma}_{\mathrm{Ai}} \mathcal{F}^{-1}\]
<p>where \(K_{\sigma}\) is an integral operator with kernel</p>
\[K_{\sigma}(z_1, z_2) = \int_0^\infty \phi_\sigma(x_1 + s, y_1 ) \phi_\sigma(x_2 + s, y_2 ) \, ds\]
<p>where \(\phi_\sigma(x,y) = \pi^{-\frac{1}{4}} e^{- \frac{y^2}{2}}\mathrm{Ai}(x+\sigma y)\) and \(z_i \equiv (x_i, y_i)\). By the invariance of the determinant, we then have</p>
\[F_\sigma(t) = \det(1- K_{\sigma})_{L^2((t,\infty)\times \mathbb{R})}\]
<p>Notice that \(K_{\sigma}(z_1, z_2)\) takes the form of a Hankel composition, and so one can use the method given in <a href="https://almaths.github.io/maths_blog/tracy-widom/">my previous post</a> to derive a Tracy-Widom-like result.</p>
<h3>Finite temperature Airy processes</h3>
<p>Our result is curious because the integro-differential equation we derive also appears in the study of <em>finite temperature Airy processes</em> on \(\mathbb{R}\), whereas we are looking at a point process in \(\mathbb{R}^2\). What is the relationship?</p>
<p>Begin by transferring the \(t\) dependence to the operator.</p>
\[F_\sigma(t) = \det(1- K_{\sigma}^t)_{L^2(\mathbb{R}_+\times \mathbb{R})}\]
<p>where</p>
\[K_{\sigma}^t((x_1, y_1),(x_2, y_2)) = K_{\sigma}((x_1+t, y_1),(x_2+t, y_2))\]
<p>Now define the operator \(P_{t,\sigma} : L^2(\mathbb{R}_+ \times \mathbb{R}) \to L^2(\mathbb{R}_+)\) with kernel</p>
\[P_{t,\sigma}(s, (x,y)) = \pi^{-\frac{1}{4}} e^{-\frac{1}{2}y^2} \mathrm{Ai}(s+x+\sigma y + t)\]
<p>Then a short calculation shows that</p>
\[P_{t,\sigma}^\ast P_{t,\sigma} = K_\sigma^t\]
<p>Now consider \(P_{t,\sigma} P_{t,\sigma}^\ast : L^2(\mathbb{R}_+) \to L^2(\mathbb{R}_+)\). A short calculation shows that this operator has kernel</p>
\[P_{t,\sigma} P_{t,\sigma}^\ast(s_1, s_2)=\int_\mathbb{R} \Phi\left( \frac{y}{\sigma} \right) \mathrm{Ai}(s_1+y+t) \mathrm{Ai}(s_2+y+t) \, dy =: N_\sigma^t(s_1, s_2)\]
<p>where \(\Phi(x) = \frac{1}{\sqrt{\pi}}\int_{-\infty}^x e^{-t^2}\, dt\). Undoing the shift by letting \(N_\sigma (s_1 +t , s_2 + t) = N_\sigma^t (s_1 , s_2)\) and using Sylvester’s identity we find</p>
\[F_\sigma(t) = \det(1-N_\sigma)_{L^2(t,\infty)}\]
<p>This means that the largest particle of the finite temperature Airy process with function \(\Phi\) is identically distributed to the largest <em>real</em> part in the elliptic edge process.</p>
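<p>Sylvester's identity \(\det(1-AA^\ast) = \det(1-A^\ast A)\), which was used above to pass between the two factorisations, is transparent in finite dimensions, where it can be checked directly:</p>

```python
import numpy as np

rng = np.random.default_rng(4)
P = rng.standard_normal((30, 50)) / 20   # a "rectangular" operator R^50 -> R^30
d1 = np.linalg.det(np.eye(30) - P @ P.T)
d2 = np.linalg.det(np.eye(50) - P.T @ P)
print(d1, d2)   # equal up to rounding: det(1 - PP*) = det(1 - P*P)
```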
<p><strong>Remark:</strong> Indeed, this observation may be generalised. Let us consider any finite temperature Airy process, with kernel</p>
\[N(x,y) = \int_\mathbb{R} \psi(s) \mathrm{Ai}(x+s) \mathrm{Ai}(y+s) \, ds\]
<p>where \(\psi(x) \geq 0\) is an increasing, continuously differentiable function such that \(\psi(-\infty) = 0\) and \(\psi(+\infty) = 1\). By an identical argument to that presented above we have the identity</p>
\[\det(1-N)_{L^2(t,\infty)} = \det(1-K)_{L^2((t,\infty) \times \mathbb{R})}\]
<p>where</p>
\[K((x_1,y_1), (x_2,y_2)) = \int_0^\infty \phi(x_1 +s, y_1) \phi(x_2 +s, y_2) \, ds\]
<p>where \(\phi(x,y) = \sqrt{\psi^\prime(y)}\mathrm{Ai}(x+y)\). Thus \(K\) has the form of a Hankel composition and so the Fredholm determinant has an associated integro-differential representation (by the method of <a href="https://almaths.github.io/maths_blog/tracy-widom/">my previous post</a>). That such Fredholm determinants have an integro-differential representation is already known in the literature but, to the author’s knowledge, this method of relating finite temperature kernels to Hankel compositions is new. \(\triangle\)</p>
<p><strong>Remark:</strong> Here we make a remark that will be well known to experts, namely the relationship between finite temperature kernels and Riemann-Hilbert problems. It is clear that if we let</p>
\[A : L^2(\mathbb{R}) \to L^2(t,\infty)\]
<p>with kernel</p>
\[A(x,x^\prime) = \mathrm{Ai}(x+x^\prime) \sqrt{\psi(x^\prime)}\]
<p>then \(A A^\ast = N : L^2(t,\infty) \to L^2(t,\infty)\). A simple calculation shows that \(M_t := A^\ast A : L^2(\mathbb{R}) \to L^2(\mathbb{R})\) has kernel</p>
\[M_t (x,y) = \sqrt{\psi(x)}\sqrt{\psi(y)} \frac{\mathrm{Ai}(x+t)\mathrm{Ai}^\prime (y+t) - \mathrm{Ai}^\prime (x+t)\mathrm{Ai}(y+t)}{x-y}\]
<p>This is an integrable Its-Izergin-Korepin-Slavnov type operator (see this <a href="https://www.ams.org/books/trans2/189/">nice introduction</a> to integrable operators by Deift). The IIKS theory relates Fredholm determinants of such operators to Riemann-Hilbert problems. Thus</p>
\[\det(1-N)_{L^2(t,\infty)}= \det(1-M_t)_{L^2(\mathbb{R})}\]
<p>may be related to a Riemann-Hilbert problem. This then allows for a Deift-Zhou steepest descent analysis.</p>
<p><strong>Remark:</strong> Finally, these factorisations are useful for proving that the corresponding operator is trace class, since one need only show that the factors are respectively Hilbert-Schmidt.</p>
<h3>The bulk of the spectrum (addendum)</h3>
<p>The techniques developed above can be used to derive a similar integro-differential representation for gaps between real parts in the <em>bulk</em> of the spectrum (the reader is referred to <a href="https://arxiv.org/abs/2212.00525">our paper on the arXiv</a> for details). In particular, the correlation kernel in the bulk also satisfies a curious factorisation property. Recall that we are considering the régime where \(\tau_n \to 1\) and so “the bulk” is approximately the set \((-2,2)\).</p>
<p>“Weak non-Hermiticity” in the bulk requires a somewhat different scaling than at the edge. In particular, if \(\lambda_0 \in (-2,2)\) is the point we’re going to “zoom in” on and \(\rho_1(x) = \frac{1}{\pi}\sqrt{\left(1-\frac{x^2}{4}\right)_+}\), then we must take</p>
\[\tau_n = 1- \frac{1}{n} \left( \frac{\sigma}{\rho_1(\lambda_0)}\right)^2\]
<p>for \(\sigma \geq 0\). Then, after a suitable rescaling, the point process around the point \(\lambda_0 \in (-2,2)\) converges to a determinantal point process with kernel given by</p>
\[K_{\mathrm{sin}}^{\sigma}(z_1,z_2) = \frac{1}{\sigma\sqrt{\pi}} e^{-\frac{y_1^2 + y_2^2}{2\sigma^2}} \frac{1}{2\pi}\int_{-\pi}^\pi e^{-(\sigma u)^2} \cos(u(z_1 - \overline{z_2})) \, du\]
<p>where \(y_k = \Im z_k\) for \(k=1,2\). This kernel describes a determinantal point process in the plane \(\mathbb{R}^2 \simeq \mathbb{C}\).</p>
<p>Now let \(J_t = (-t,t)\times \mathbb{R} \subset \mathbb{R}^2\) and suppose we are interested in the gap probability given by</p>
\[\det(1-K_{\mathrm{sin}}^{\sigma})_{L^2(J_t)}\]
<p>This corresponds to looking at gaps between real parts (note also that the point process is horizontally translation invariant). We now observe the following factorisation. Let</p>
\[A_\sigma : L^2(J_t) \to L^2(-\pi,\pi)\]
<p>with kernel</p>
\[A_\sigma(a,z) = \frac{1}{\sqrt{2\pi^\frac{3}{2}\sigma}} \exp\left(-\frac{y^2}{2\sigma^2} - \frac{1}{2}(\sigma a)^2 - ia \overline{z}\right)\]
<p>A simple calculation then shows that</p>
\[K_{\mathrm{sin}}^{\sigma} = A_\sigma^\ast A_\sigma : L^2(J_t) \to L^2(J_t)\]
<p>But then, by Sylvester’s identity, we have</p>
\[\det(1-K_{\mathrm{sin}}^{\sigma})_{L^2(J_t)} = \det(1-A_\sigma A_\sigma^\ast)_{L^2(-\pi,\pi)} = \det(1-S_\sigma^t )_{L^2(-t,t)}\]
<p>where \(S_\sigma^t\) is a rescaled version of \(A_\sigma A_\sigma^\ast\) (so that its domain is now \(L^2(-t,t)\)). A simple calculation (see our paper) shows that one can write the kernel of \(S_\sigma^t\) as</p>
\[S_\sigma^t(a,b) = \int_0^\infty \left[ \Phi\left( \frac{t}{\sigma}(z+1) \right) - \Phi\left( \frac{t}{\sigma}(z-1)\right) \right]\cos(\pi(a-b)z) \, dz\]
<p>This is exactly a <em>finite temperature sine kernel</em>. In our paper we show that for any determinantal point process on \(\mathbb{R}\) with kernel of the form</p>
\[K(a,b) = \int_0^\infty w(z)\cos(\pi(a-b)z) \, dz\]
<p>for \(w: \mathbb{R}_+ \to [0,1)\) and tending to \(0\) exponentially fast at \(+\infty\), the gap probability \(\det(1-K)_{L^2(-t,t)}\) can be represented in terms of a solution to an <em>integro-differential Painlevé V equation</em>. The generalises the famous result of Jimbo-Miwa-Môri-Sato (1980).</p>This post will provide an accessible introduction to work contained in my joint paper with Thomas Bothner. In this work we find integrable structure in the elliptic Ginibre ensemble and find interesting connections to the theory of finite temperature Airy processes. Our paper also performs a nonlinear steepest descent analysis of the cumulative distribution function of the rightmost particle, but this is left out of this post to keep the discussion nontechnical.A derivation of the Tracy-Widom distribution by the Hankel composition method2022-09-08T17:50:11+00:002022-09-08T17:50:11+00:00https://almaths.github.io/maths_blog/tracy-widom-distribution<p>A celebrated result in the theory of random matrices is the connection between the extremal eigenvalue of a random matrix sampled from the Gaussian Unitary Ensemble and the Painlevé II equation. In this post I will give a derivation of this that uses the Hankel composition method. This method is given in great generality in the <a href="https://arxiv.org/abs/2205.15007">paper</a> of Bothner and was used in our <a href="https://arxiv.org/abs/2208.04684">joint work</a> to relate the distribution function of the extremal eigenvalue in the elliptic Ginibre ensemble to an integro-differential equation. By showing how this method works in the case of the GUE it should shed some light on how it works in other cases.</p>
<h2>Background</h2>
<p>The Gaussian Unitary Ensemble (GUE) is an ensemble of \(n \times n\) Hermitian random matrices with probability density</p>
\[\frac{1}{Z_{\mathrm{GUE}}} e^{- \mathrm{tr}(H^2)}\]
<p>\(Z_{\mathrm{GUE}}\) is a normalisation constant. We are interested in the distribution of the extremal (rightmost) eigenvalue. A famous result (see Chapter 24 of Mehta’s <em>Random Matrices</em>) shows that the cumulative distribution converges, under an appropriate scaling, to the Fredholm determinant of the <em>Airy kernel.</em> Let \(\lambda_n\) be the rightmost eigenvalue.</p>
\[F(t) \equiv \lim_{n \to \infty} \mathbb{P}\left(\lambda_n \leq \sqrt{2n} + \frac{t}{\sqrt{2}n^\frac{1}{6}}\right) = \det(1 - K)_{L^2(t,\infty)}\]
<p>where \(K: L^2(t,\infty) \to L^2(t,\infty)\) is the operator with kernel</p>
\[K(x,y) = \frac{\mathrm{Ai}(x) \mathrm{Ai}^\prime(y) - \mathrm{Ai}^\prime(x)\mathrm{Ai}(y) }{x-y} = \int_0^\infty \mathrm{Ai}(x+s)\mathrm{Ai}(y+s) \, ds\]
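<p>The equality of these two expressions for the Airy kernel (the second is its "Hankel composition" form) is easy to check numerically. A small sketch of mine using SciPy:</p>

```python
import numpy as np
from scipy.special import airy
from scipy.integrate import quad

def K_ratio(x, y):
    # K(x,y) = (Ai(x)Ai'(y) - Ai'(x)Ai(y)) / (x - y)
    Ax, Axp, _, _ = airy(x)
    Ay, Ayp, _, _ = airy(y)
    return (Ax * Ayp - Axp * Ay) / (x - y)

def K_hankel(x, y):
    # K(x,y) = int_0^infty Ai(x+s) Ai(y+s) ds; Ai decays super-exponentially,
    # so truncating the integral at s = 40 loses nothing in double precision
    val, _ = quad(lambda s: airy(x + s)[0] * airy(y + s)[0], 0, 40)
    return val

print(K_ratio(0.3, 1.1), K_hankel(0.3, 1.1))   # the two expressions agree
```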
<p>The motivation for studying this is not simply that the GUE is an easy model to study, but also that this Airy kernel is <em>universal</em> (see Deift’s <em>Orthogonal Polynomials and Random Matrices: A Riemann-Hilbert Approach</em>). That is, suppose we have an ensemble of Hermitian matrices with probability density</p>
\[\frac{1}{Z_{V}} e^{- n \mathrm{tr} V(H)}\]
<p>for some entire function \(V\) which grows sufficiently rapidly at \(\pm \infty\), e.g. a polynomial. For generic \(V\), the eigenvalues will asymptotically (\(n \to \infty\)) concentrate on disjoint intervals \([\alpha_1, \beta_1], \dots , [\alpha_m, \beta_m]\); and the distribution of the extremal eigenvalue at these endpoints \(\alpha_1, \beta_1, \dots, \alpha_m , \beta_m\) will converge after a suitable rescaling, for “typical” \(V\), to \(\det(1 - K)_{L^2(t,\infty)}\). There is a similar universality in the bulk where the “universal” kernel is the sine kernel. Gap probabilities in the sine point process were found to be related to the Painlevé V equation by the group of Jimbo, Miwa, Môri and Sato in 1980 (see <a href="https://core.ac.uk/download/pdf/25350076.pdf">here</a> for an accessible introduction to this work). The work of Tracy and Widom on the Airy kernel was strongly inspired by the work of this group.</p>
<h2>The Connection to Painlevé II</h2>
<p><strong>Theorem</strong> (<a href="https://arxiv.org/abs/hep-th/9210074">Tracy and Widom 1993</a>)<strong>:</strong></p>
\[\boxed{F(t) = \exp\left(-\int_t^\infty (s-t)q(s)^2 \, ds\right)}\]
<p>where \(q\) solves Painlevé II</p>
\[\boxed{q^{\prime \prime}(t) = t q(t)+ 2 q(t)^3}\]
<p>and we have the boundary condition \(q(t) \sim \mathrm{Ai}(t)\) as \(t \to +\infty\). \(\triangle\)</p>
<p>We demonstrate the above by showing that</p>
\[\frac{d^2}{dt^2} \log F(t) = -q(t)^2\]
<p>We then obtain the above formula by integrating twice. To justify this requires showing that \(\log F(t)\) and \(\frac{d}{dt} \log F(t)\) tend to zero at \(t = +\infty\). Showing this requires an asymptotic analysis of the Fredholm determinant \(\det(1 - K)_{L^2(t,\infty)}\) which is beyond the scope of this post.</p>
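<p>Although that asymptotic analysis is beyond this post, the theorem itself can be probed numerically. The sketch below is my own illustration: it shoots Painlevé II backwards from Airy initial data at \(t_0 = 8\) (where the Hastings-McLeod solution agrees with \(\mathrm{Ai}\) to double precision; backward integration is only stable over moderate ranges, which suffices here) and then evaluates the Tracy-Widom formula for \(F(0)\).</p>

```python
import numpy as np
from scipy.special import airy
from scipy.integrate import solve_ivp, trapezoid

# shoot q'' = t q + 2 q^3 backwards from Airy initial data at t0 = 8
t0 = 8.0
sol = solve_ivp(lambda t, y: [y[1], t * y[0] + 2.0 * y[0] ** 3],
                (t0, -2.0), [airy(t0)[0], airy(t0)[1]],
                rtol=1e-10, atol=1e-13, dense_output=True)
q = lambda t: sol.sol(t)[0]

print(q(-2.0))    # the Hastings-McLeod solution, growing like sqrt(-t/2) on the left

# F(0) from the Tracy-Widom formula, truncating the integral at s = 8
s = np.linspace(0.0, t0, 4001)
F0 = float(np.exp(-trapezoid(s * q(s) ** 2, s)))
print(F0)         # the Tracy-Widom GUE distribution function at t = 0
```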
<p>The first step is to bring the \(t\) dependence into the operator. Let</p>
\[K_t(x,y) = K(x+t,y+t) = \int_t^\infty \mathrm{Ai}(x+s) \mathrm{Ai}(y+s) \, ds\]
<p>Then \(F(t)=\det(1-K)_{L^2(t,\infty)} = \det(1-K_t)_{L^2(\mathbb{R}_+)}\).</p>
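<p>As an aside, Fredholm determinants of this kind can be evaluated numerically to high accuracy by Bornemann's quadrature method: discretise the operator on a Gauss-Legendre grid and take a finite determinant. The sketch below is my own illustration (the truncation length and node count are ad hoc choices, but more than sufficient here):</p>

```python
import numpy as np
from scipy.special import airy

def tracy_widom_cdf(t, m=80, L=14.0):
    # Nystrom / Bornemann-style approximation of det(1 - K) on (t, infinity),
    # truncated to (t, t + L); the Airy kernel decays super-exponentially
    u, w = np.polynomial.legendre.leggauss(m)
    x = t + (u + 1.0) * L / 2.0
    w = w * L / 2.0
    Ax, Axp, _, _ = airy(x)
    num = np.outer(Ax, Axp) - np.outer(Axp, Ax)
    den = np.subtract.outer(x, x)
    np.fill_diagonal(den, 1.0)
    K = num / den
    np.fill_diagonal(K, Axp ** 2 - x * Ax ** 2)   # diagonal limit of the kernel
    sw = np.sqrt(w)
    return float(np.linalg.det(np.eye(m) - sw[:, None] * K * sw[None, :]))

print(tracy_widom_cdf(-2.0), tracy_widom_cdf(0.0))   # increasing towards 1
```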
<p><strong>Notation:</strong> We let \(\tau_t\) be the shift operator, so that \((\tau_t \phi)(x) = \phi(x+t)\), and \(D\) the derivative operator, \((D\phi)(x) = \phi^\prime(x)\). We shall be somewhat careless and not specify the spaces on which these operators act. Let us also denote the Airy function \(\mathrm{Ai}\) by \(A\). \(\triangle\)</p>
<p>We see that \(\frac{d}{dt} K_t(x,y) = - A(x+t)A(y+t)\). Thus</p>
\[\frac{d}{dt} K_t = - \tau_t A \otimes \tau_t A\]
<p><strong>Remark:</strong> I should signpost a point of rigour. We have calculated the derivative with respect to \(t\) pointwise on the kernel, but in fact what we’d like is for the limit implicit in the derivative to exist in the <em>trace norm</em>, and a complete proof would show this. \(\triangle\)</p>
<p>Then we see by Jacobi’s formula</p>
\[\frac{d}{dt} \log F(t) = -\mathrm{tr}_{L^2(\mathbb{R}_+)} \left( (1-K_t)^{-1} \frac{dK_t}{dt} \right) = \mathrm{tr}_{L^2(\mathbb{R}_+)} \left( (1-K_t)^{-1} \tau_t A \otimes \tau_t A \right)\]
<p><strong>Remark:</strong> Note that \(\mathrm{tr}_{L^2(\mathbb{R}_+)} \left(\psi \otimes \phi\right) = \langle \psi, \phi \rangle_{L^2(\mathbb{R}_+)} = \int_{\mathbb{R}_+} \psi(x) \phi(x) \, dx\) (we will always take functions to be real valued). Note also that \(K_t\), and hence \((1-K_t)^{-1}\), are symmetric operators with respect to this inner product. \(\triangle\)</p>
<p>Next we use the identity</p>
\[\frac{d}{dt} (1-K_t)^{-1} = (1-K_t)^{-1} \frac{dK_t}{dt}(1-K_t)^{-1} = - (1-K_t)^{-1} (\tau_t A \otimes \tau_t A) (1-K_t)^{-1}\]
<p>Observe that \(\mathrm{tr}((\alpha \otimes \beta)(\gamma \otimes \delta)) = \mathrm{tr}(\alpha \otimes \delta) \mathrm{tr}(\beta \otimes \gamma)\).</p>
\[\frac{d^2}{dt^2} \log F(t) = 2 \mathrm{tr}_{L^2(\mathbb{R}_+)} \left( (1-K_t)^{-1} D \tau_t A \otimes \tau_t A\right) - \left(\mathrm{tr}_{L^2(\mathbb{R}_+)} \left( (1-K_t)^{-1} \tau_t A \otimes \tau_t A\right)\right)^2\]
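<p>The rank-one trace identity used in this step is immediate to verify in finite dimensions, where \(\alpha \otimes \beta\) is just an outer product:</p>

```python
import numpy as np

rng = np.random.default_rng(5)
a, b, c, d = rng.standard_normal((4, 7))   # four random vectors in R^7

# tr((a x b)(c x d)) = tr(a x d) tr(b x c), since (a x b)(c x d) = <b, c> (a x d)
lhs = np.trace(np.outer(a, b) @ np.outer(c, d))
rhs = np.trace(np.outer(a, d)) * np.trace(np.outer(b, c))
print(abs(lhs - rhs))   # zero up to rounding
```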
<h3>A hierarchy of coupled ODEs</h3>
<p>Introduce the following notation,</p>
\[q_n(t) = ((1-K_t)^{-1} D^n \tau_t A)(0)\]
\[p_n(t) = \mathrm{tr}_{L^2(\mathbb{R}_+)} \left( (1-K_t)^{-1} D^n \tau_t A \otimes \tau_t A\right)\]
<p>In this notation we have \(\frac{d^2}{dt^2} \log F(t) = 2 p_1(t) - p_0(t)^2\).</p>
<p>We now compute</p>
\[\frac{d}{dt} q_n(t) = q_{n+1}(t)- q_0(t)p_n(t)\]
\[\frac{d}{dt} p_n(t) = p_{n+1}(t)- p_0(t)p_n(t) + \underbrace{\mathrm{tr}_{L^2(\mathbb{R}_+)} \left( (1-K_t)^{-1} D^n \tau_t A \otimes D\tau_t A\right)}_{(\ast)}\]
<p>To compute \((\ast)\) we integrate by parts</p>
\[(\ast) = -q_n(t)(\tau_t A)(0) - \underbrace{\mathrm{tr}_{L^2(\mathbb{R}_+)} \left( D(1-K_t)^{-1} D^n \tau_t A \otimes \tau_t A\right)}_{(\ast \ast)}\]
<p>Next we use the identity \([D,(1-K_t)^{-1}] = (1-K_t)^{-1} [D,K_t] (1-K_t)^{-1}\). We now give the following important exercise.</p>
<p><strong>Exercise:</strong> Let \(\phi \in L^2(\mathbb{R}_+)\) be a sufficiently nice function (e.g. continuously differentiable and \(\phi^\prime \in L^2(\mathbb{R}_+)\)). Then</p>
\[([D,K_t]\phi)(x) = - ((\tau_t A \otimes \tau_t A)\phi)(x) + \phi(0) K_t(x,0)\]
<p>This yields the formula</p>
\[(\ast \ast) = p_{n+1}(t)- p_0(t)p_n(t)+q_n(t)((1-K_t)^{-1} K_t \tau_t A)(0)\]
<p>Using that \((1-K_t)^{-1} K_t = (1-K_t)^{-1} - 1\), we obtain a formula for \(\frac{d}{dt} p_n(t)\). We thus obtain an infinite hierarchy of coupled ODEs, \(n \in \mathbb{N}\),</p>
\[\boxed{\frac{d}{dt} q_n(t) = q_{n+1}(t)- q_0(t)p_n(t) }\]
\[\boxed{\frac{d}{dt} p_n(t) = -q_{n}(t)q_0(t)}\]
<p><strong>Exercise:</strong> The quantity \(C = p_0(t)^2 - q_0(t)^2 - 2 p_1(t)\) is conserved. (There are actually infinitely many such conserved quantities but we only need this one.)</p>
<p><strong>Corollary:</strong> Since the Airy function decays rapidly at \(+\infty\), it seems reasonable that \(q_n\) and \(p_n\) should tend to zero as \(t\to +\infty\). It therefore follows that \(C=0\). From this it follows that</p>
\[\frac{d^2}{dt^2} \log F(t) = - q_0(t)^2\]
<p><strong>Remark:</strong> It is “obvious” that since \(K_t\) is “small” for \(t\to +\infty\)</p>
\[q_0(t) \approx (\tau_t A)(0) = \mathrm{Ai}(t)\]
<p>This explains the boundary condition. This needs to be rigorously justified but is beyond the scope of this post. \(\triangle\)</p>
<h3>Closing up the system</h3>
<p>Everything up until now has been “universal” – in that we haven’t used any properties of \(A\) – we have only used the Hankel composition structure of \(K\). In particular, we haven’t used that \(A\) solves the <em>Airy equation</em>, \(D^2 A = M A\), where \(M\) is the operator such that \((M \phi)(x) = x \phi(x)\). Such a “non-universal” property allows us to close up the system and obtain an ODE for \(q_0\). Note that \((M \phi)(0) = 0\).</p>
<p>From this we get</p>
\[q_2(t) = t q_0(t) + ([(1-K_t)^{-1},M] \tau_t A)(0)\]
<p>As before \([(1-K_t)^{-1},M] = (1-K_t)^{-1} [ K_t, M](1-K_t)^{-1}\). If we recall our two equivalent formulae for the Airy kernel we see that</p>
\[[ K_t, M] = - \tau_t A \otimes D\tau_t A + D\tau_t A \otimes \tau_t A\]
<p>This gives</p>
\[q_2(t) = t q_0(t) + q_1(t)p_0(t)- q_0(t)p_1(t)\]
<p>If we combine this formula with our relation \(p_0(t)^2 - q_0(t)^2 - 2 p_1(t) = 0\) we find</p>
\[\boxed{q_0^{\prime \prime}(t) = tq_0(t) + 2 q_0(t)^3}\]
<p>which is Painlevé II.</p>
<h1>Estimating generalised hypergeometric functions</h1>
<p><em>Posted 2022-08-14.</em></p>
<p>In my <a href="https://arxiv.org/abs/2102.08842">work</a> on products of truncated orthogonal matrices (in collaboration with <a href="https://profiles.sussex.ac.uk/p435611-nicholas-simm">N. Simm</a> and <a href="https://www.bristol.ac.uk/people/person/Francesco-Mezzadri-66ca5240-8f45-4ffc-a838-d1f68827bd23/">F. Mezzadri</a>) it became important to estimate the following function for \(x \in (-1,1)\) as \(L,N \to \infty\).</p>
\[f_{N-2,L}(x) = \sum_{k=0}^{N-2} \binom{L+k}{k}^m x^k\]
<p>where \(m \in \mathbb{N}\) is fixed. In this post I will show how this estimate is carried out. In my opinion this is the key estimate of the paper so I would like to draw attention to it. This method can be used for the case \(N = +\infty\) and so can be used to estimate the generalised hypergeometric function</p>
\[f_{\infty,L}(x) = \sum_{k=0}^{\infty} \binom{L+k}{k}^m x^k = {}_m F_{m-1}\left( \begin{matrix} L+1 & \dots & L+1 \\
1 & \dots & 1 \end{matrix} \, \bigg| \, x \right)\]
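<p>For moderate parameters the sum can of course be evaluated directly; the case \(m=1\) has the closed form \(\sum_{k \geq 0} \binom{L+k}{k} x^k = (1-x)^{-(L+1)}\), which gives a convenient sanity check (a small illustration of mine):</p>

```python
from math import comb

def f(N, L, m, x):
    # direct evaluation of f_{N,L}(x); fine for moderate N since comb is exact
    return sum(comb(L + k, k) ** m * x ** k for k in range(N + 1))

L_, x = 5, 0.3
print(f(2000, L_, 1, x))          # partial sum, effectively converged
print((1 - x) ** (-(L_ + 1)))     # the closed form for m = 1
```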
<h2>Background</h2>
<p>Before discussing this, let me briefly outline the context of the problem. Let</p>
\[U_1, \dots, U_m \in O(N+L)\]
<p>be \(m\) matrices sampled independently from the orthogonal group \(O(N+L)\) according to Haar measure. We call the \(N \times N\) upper left corner of \(U_i\), which we denote \(\tilde{U}_i\), a <em>truncated orthogonal matrix</em>. \(\tilde{U}_i\) is thus a random matrix with real matrix elements, whose randomness is inherited from Haar measure.</p>
<p>We are interested in the spectrum of the product</p>
\[X = \tilde{U}_1 \tilde{U}_2 \dots \tilde{U}_m\]
<p>as \(N,L \to \infty\) with \(\frac{L}{N} \to \gamma > 0\). Define the <em>real spectral density</em> as the function \(\rho_{N,L} : \mathbb{R} \to [0, \infty)\) such that</p>
\[\mathbb{E}[| \sigma(X) \cap A |] = \int_A \rho_{N,L}(x) \, dx\]
<p>where \(\sigma(X)\) is the spectrum of \(X\), \(A \subset \mathbb{R}\) is a Lebesgue measurable set and \(\lvert \cdot \rvert\) denotes cardinality. From this we see that</p>
\[\mathbb{E}[ |\sigma(X) \cap \mathbb{R}|] = \int_\mathbb{R} \rho_{N,L}(x) \, dx\]
<p>gives the expected number of real eigenvalues. <a href="https://arxiv.org/abs/1708.00967">Forrester, Ipsen and Kumar</a> supply the following exact formula for the real spectral density.</p>
\[\rho_{N,L}(x)= \int_{[-1,1]} |x-y| w_L(x) w_L(y) f_{N-2,L}(xy) \, dy\]
<p>Here \(w_L\) is the so-called “weight function,” which we will not write out; it can be found in our paper. To estimate \(\rho_{N,L}\) we wish to estimate both \(w_L\) and \(f_{N-2,L}\). The former turns out to be a straightforward application of the Laplace method; the latter is less obvious and is the subject of this post.</p>
<h2>The estimate</h2>
<p><strong>Lemma:</strong> Let \(g_{K}(z) = \sum_{k=0}^{K} a_k z^k\) for \(K \in \mathbb{N}\) and suppose \(\lim_{K \to \infty} g_{K}(z) = g_\infty(z)\) converges on some neighbourhood \(U\) of \(0 \in \mathbb{C}\).</p>
<p>Then</p>
\[\sum_{k=0}^K a_k^m x^k = \frac{1}{(2\pi i)^{m-1}} \oint_{\Gamma^{m-1}} g_{K}\left( \frac{x}{z_1 \dots z_{m-1}}\right) g_{\infty} (z_1) \dots g_{\infty}(z_{m-1}) \frac{dz_1}{z_1} \dots \frac{dz_{m-1}}{z_{m-1}}\]
<p>where \(\Gamma \subset U \setminus \{ 0 \}\) is a closed contour enclosing \(0\). This formula is also valid for \(K = +\infty\) so long as \(\frac{x}{z_1 \dots z_{m-1}} \in U\) throughout the contour \(\Gamma\).</p>
<p><strong>Proof:</strong> Expanding the product gives</p>
\[g_{K}\left( \frac{x}{z_1 \dots z_{m-1}}\right) g_{\infty} (z_1) \dots g_{\infty}(z_{m-1}) = \sum_{k_1, \dots, k_{m-1}=0}^{\infty} \sum_{n=0}^K a_n a_{k_1} \dots a_{k_{m-1}} x^n z_1^{k_1 - n} \dots z_{m-1}^{k_{m-1} - n}\]
<p>Clearly \(\sum_{k=0}^K a_k^m x^k\) is the coefficient of \(z_1^0 \dots z_{m-1}^0\) in the above series, which can be picked out by the residue theorem. The above series is uniformly convergent on compact sets within the radius of convergence so term by term integration is justified. \(\square\)</p>
<p>This means that, so long as we have good estimates in the case \(m=1\), we can extract good estimates for general \(m\).</p>
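<p>The lemma is also easy to check numerically. Below is a minimal sketch for \(m = 2\) with \(a_k = \binom{L+k}{k}\) (the values of \(L\), \(K\), \(x\) and the contour radius are arbitrary illustrative choices); it exploits the fact that the trapezoidal rule on a circle is exponentially accurate for integrands analytic in an annulus.</p>

```python
import numpy as np
from math import comb

L, K, m, x = 3, 20, 2, 0.1
a = [comb(L + k, k) for k in range(K + 1)]

def g_K(z):
    # partial sum of the generating series
    return sum(a[k] * z**k for k in range(K + 1))

def g_inf(z):
    # closed form of the full series, valid for |z| < 1
    return 1.0 / (1.0 - z)**(L + 1)

# sample the contour |z| = R; the trapezoidal rule gives (1/2πi)∮ f(z) dz/z
R, n = 0.5, 512
z = R * np.exp(2j * np.pi * np.arange(n) / n)
hadamard = np.mean(g_K(x / z) * g_inf(z)).real

direct = sum(a[k]**m * x**k for k in range(K + 1))
print(hadamard, direct)   # the two values agree to machine precision
```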
<p><strong>Remark:</strong> Let \(f\) and \(g\) be two analytic functions defined in a neighbourhood of \(0\). Define the convolution</p>
\[(f \ast g)(x) = \frac{1}{2\pi i} \oint_\Gamma f(z) g\left( \frac{x}{z} \right) \frac{dz}{z}\]
<p>where \(\Gamma\) is a positively oriented contour that encloses \(0\). Then our above lemma states that</p>
\[\sum_{k=0}^K a_k^m x^k = g_K \ast \underbrace{g_\infty \ast \dots \ast g_\infty}_{m-1 \text{ times}} (x)\]
<p>One thus sees that our above lemma is nothing other than an instance of the convolution theorem (sometimes going under the name of <a href="https://en.wikipedia.org/wiki/Generating_function_transformation#Hadamard_products_and_diagonal_generating_functions">Hadamard products</a>). \(\triangle\)</p>
<p>The following is well known but we include a proof for completeness.</p>
<p><strong>Lemma:</strong> \(\sum_{k=0}^\infty \binom{L+k}{k} x^k = \frac{1}{(1-x)^{L+1}}\)</p>
<p><strong>Proof:</strong> Using the Cauchy residue theorem write</p>
\[\binom{L+k}{k} = \frac{1}{2\pi i} \oint_\Gamma \frac{(1+z)^{L+k}}{z^{k+1}} \, dz\]
<p>where \(\Gamma\) is a positively oriented contour enclosing \(0\). Then</p>
\[\sum_{k=0}^\infty \binom{L+k}{k} x^k = \frac{1}{2\pi i} \oint_\Gamma \frac{(1+z)^{L}}{z} \sum_{k=0}^\infty \left( \frac{x(1+z)}{z} \right)^k \, dz = \frac{1}{1-x}\frac{1}{2\pi i} \oint_\Gamma (1+z)^{L} \frac{1}{z- \frac{x}{1-x}} \, dz\]
<p>where \(x\) is chosen sufficiently small that \(\left\lvert \frac{x(1+z)}{z} \right\rvert < 1\) on the contour. This implies that the pole at \(z = \frac{x}{1-x}\) is enclosed. Evaluating the residue at this pole gives \(\frac{1}{1-x}\left(1 + \frac{x}{1-x}\right)^{L} = \frac{1}{(1-x)^{L+1}}\). \(\square\)</p>
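<p>A quick numerical sanity check of the lemma (the parameters and the truncation point are arbitrary):</p>

```python
from math import comb

L, x = 5, 0.3
partial = sum(comb(L + k, k) * x**k for k in range(400))  # truncated series
closed = 1.0 / (1.0 - x)**(L + 1)
print(partial, closed)   # agree to machine precision
```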
<p>This immediately gives a formula for the \(N = +\infty\) case,</p>
\[f_{\infty,L}(x) = \frac{1}{(2\pi i )^{m-1}} \oint_{\Gamma^{m-1}} \frac{1}{\left( 1 - \frac{x}{z_1 \dots z_{m-1}} \right)^{L+1}} \prod_{k=1}^{m-1}\frac{dz_k}{(1-z_k)^{L+1} z_k}\]
<p>Applying the method of steepest descent allows one to immediately obtain \(L \to +\infty\) asymptotics of \(f_{\infty,L}\).</p>
<p><strong>Proposition:</strong> Let \(x \in (0,1)\). Then we have the following asymptotics pointwise as \(L \to +\infty\)</p>
\[f_{\infty,L}(x) \sim \frac{1}{\sqrt{m} (2\pi L)^{\frac{m-1}{2}}} \frac{1}{x^\frac{m-1}{2m} \left( 1 - x^\frac{1}{m} \right)^{mL+1}}\]
<p>We also have the following bounds for \(x \in [0,1)\)</p>
\[|f_{\infty,L}(x)| \leq \left( \frac{\pi}{2L} \right)^{\frac{m-1}{2}} \frac{1}{ x^\frac{m-1}{2m} \left( 1 - x^\frac{1}{m} \right)^{m(L+1)} }\]
\[|f_{\infty,L}(-x)| \leq \frac{1}{ \left( 1 - x^\frac{1}{m} \right)^{m(L+1)}} e^{-\frac{Lx^\frac{1}{m}}{2m}}\]
<p><strong>Proof:</strong> The first follows from an application of the steepest descent method; the second follows from the inequality in equation 6.47 of our paper; the third is equation 2.16 of our paper. \(\square\)</p>
<p>Notice that the second estimate is quite good: it differs from the pointwise asymptotics only by an \(\mathcal{O}(1)\) factor.</p>
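<p>One can also sanity-check the leading-order asymptotics against the series summed directly. Here is a sketch for \(m = 2\) (the choices \(L = 80\), \(x = 0.25\) and the truncation at 600 terms are ad hoc, chosen so that every term fits comfortably in floating point):</p>

```python
import math
from math import comb

m, L, x = 2, 80, 0.25
# sum the series directly; for these parameters the terms peak near k ≈ L
direct = sum(comb(L + k, k)**m * x**k for k in range(600))
# leading-order asymptotics from the proposition
asymp = (1.0 / (math.sqrt(m) * (2 * math.pi * L)**((m - 1) / 2))
         / (x**((m - 1) / (2 * m)) * (1 - x**(1 / m))**(m * L + 1)))
print(direct / asymp)   # ratio tends to 1 as L grows
```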
<p>Let \(g_{N-2}(x) = \sum_{k=0}^{N-2} \binom{L+k}{k}x^k\). There are a variety of integral representations of this, e.g. in terms of an incomplete beta function (see page 3 of <a href="https://arxiv.org/abs/1008.2075">Khoruzhenko, Sommers and Zyczkowski</a>). If we write the coefficient \(\binom{L+k}{k} = \frac{1}{2\pi i } \oint_{\Gamma} \frac{1}{z^{k+1}(1-z)^{L+1}} \, dz\) and sum the resulting finite geometric series, we find</p>
\[g_{N-2}(x) = \frac{1}{(1-x)^{L+1}} \chi_{R> |x|} -\frac{x^{N-1 }}{2\pi i} \oint_{|z|=R} \frac{1}{z^{N-1}(1-z)^{L+1}} \frac{dz}{z-x}\]
<p>for any \(R \in (0,1)\) with \(R \neq |x|\). A calculation shows that the steepest descent contour for the integral contained in the second term is \(R = \frac{1}{1+\gamma}\). Putting this all together yields an integral representation of \(f_{N-2,L}\).</p>
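<p>The decomposition of \(g_{N-2}\) is again easy to verify numerically (a sketch with arbitrary small parameters, taking \(R\) between \(|x|\) and \(1\) so that the indicator equals one):</p>

```python
import numpy as np
from math import comb

N, L, x, R, n = 10, 5, 0.3, 0.6, 2048
direct = sum(comb(L + k, k) * x**k for k in range(N - 1))

# contour |z| = R; by the trapezoidal rule (1/2πi)∮ f(z) dz ≈ (1/n) Σ_j f(z_j) z_j
z = R * np.exp(2j * np.pi * np.arange(n) / n)
integrand = 1.0 / (z**(N - 1) * (1 - z)**(L + 1) * (z - x))
correction = x**(N - 1) * np.mean(integrand * z).real

formula = 1.0 / (1 - x)**(L + 1) - correction
print(direct, formula)   # agree to machine precision
```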
<p><strong>Remark:</strong> The technique discussed in this post can also be used to study the asymptotics of other generalised hypergeometric functions. For example, it allows one to obtain asymptotics of</p>
\[\sum_{k=0}^\infty \frac{x^k}{(k!)^m} = {}_0 F_{m-1}\left( \begin{matrix} & - & \\
1 & \dots & 1 \end{matrix} \, \bigg| \, x \right)\]
<p>for any fixed \(m \in \mathbb{N}\) in the régime \(x \to \infty\) in any direction in the complex plane. \(\triangle\)</p>
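<p>For instance, for \(m = 2\) the series above is the Hadamard square of \(e^z\), with contour representation \(\frac{1}{2\pi i}\oint e^{x/z} e^{z} \frac{dz}{z}\) (for \(x > 0\) this equals \(I_0(2\sqrt{x})\)). This representation, the starting point for such asymptotics, can be checked directly; a sketch with arbitrary parameters:</p>

```python
import numpy as np
from math import factorial

x, K, n, R = 2.0, 40, 256, 1.0
direct = sum(x**k / factorial(k)**2 for k in range(K))

# trapezoidal rule on |z| = R for (1/2πi)∮ e^{x/z} e^{z} dz/z
z = R * np.exp(2j * np.pi * np.arange(n) / n)
hadamard = np.mean(np.exp(x / z) * np.exp(z)).real
print(direct, hadamard)   # agree to machine precision
```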