In my recent work I made a connection between the theory of self-dual quaternion random matrices and Riemann-Hilbert problems. As part of the background for this research, I needed to revisit the theory of self-dual quaternion random matrices, in particular the question of how to make sense of the eigenvalues of such matrices. This is not entirely self-explanatory, given that quaternions do not commute. In this post I hope to give an accessible explanation of this.

First let us recall basic facts about quaternions. The algebra of quaternions $\mathbb{H}$ is the real span of 4 linearly independent elements $1, e_1, e_2, e_3$ with the relations

$$\begin{aligned} e_1^2 = e_2^2 = e_3^2 &= -1, \\ e_1 e_2 &= e_3 \quad \text{etc. by cyclic permutations}, \\ e_i e_j &= -e_j e_i \quad \text{for } i \neq j . \end{aligned}$$

It is convenient to identify these with $2 \times 2$ matrices

$$\begin{aligned}1 \simeq \mathbb{I} = \left( \begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix} \right), & & e_1 \simeq \left( \begin{matrix} i & 0 \\ 0 & -i \end{matrix} \right), \\ e_2 \simeq \left( \begin{matrix} 0 & 1 \\ -1 & 0 \end{matrix} \right), & & e_3 \simeq \left( \begin{matrix} 0 & i \\ i & 0 \end{matrix} \right).\end{aligned}$$
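As a quick sanity check, the quaternion relations can be verified numerically for these $2 \times 2$ matrices. The following is a minimal sketch assuming NumPy; the variable names are chosen here purely for illustration.

```python
import numpy as np

# 2x2 complex matrices representing the quaternion units 1, e1, e2, e3
I2 = np.eye(2, dtype=complex)
e1 = np.array([[1j, 0], [0, -1j]])
e2 = np.array([[0, 1], [-1, 0]], dtype=complex)
e3 = np.array([[0, 1j], [1j, 0]])
units = [e1, e2, e3]

# e_i^2 = -1 for i = 1, 2, 3
squares_ok = all(np.allclose(e @ e, -I2) for e in units)

# e1 e2 = e3 and its cyclic permutations
cyclic_ok = (
    np.allclose(e1 @ e2, e3)
    and np.allclose(e2 @ e3, e1)
    and np.allclose(e3 @ e1, e2)
)

# e_i e_j = -e_j e_i for i != j
anticommute_ok = all(
    np.allclose(a @ b, -(b @ a))
    for i, a in enumerate(units)
    for j, b in enumerate(units)
    if i != j
)

print(squares_ok, cyclic_ok, anticommute_ok)
```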

In what follows it will be useful to complexify the quaternions to $\mathbb{H}_{\mathbb{C}}$, so that for $Q \in \mathbb{H}_{\mathbb{C}}$

$$Q = q_0 \mathbb{I} + q_1 e_1 + q_2 e_2 + q_3 e_3 \quad (q_i \in \mathbb{C}).$$

Definition: The dual of a quaternion $Q = q_0 \mathbb{I} + q_1 e_1 + q_2 e_2 + q_3 e_3 \in \mathbb{H}_\mathbb{C}$ is

$$Q^\mathsf{D} = q_0 \mathbb{I} - q_1 e_1 - q_2 e_2 - q_3 e_3.$$

Note that $Q \mapsto Q^\mathsf{D}$ is a $\mathbb{C}$-linear (and not conjugate-linear) operation.

Lemma: Using our $2 \times 2$ matrix representation of a quaternion $Q \in \mathbb{H}_{\mathbb{C}}$ we may write the dual

$$Q^\mathsf{D} = -e_2 Q^\mathsf{T} e_2.$$

Proof: Straightforward calculation. $\square$
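The calculation can also be spot-checked numerically on a random complexified quaternion. This is only a sketch assuming NumPy; `to_matrix` is a hypothetical helper introduced here to build the $2 \times 2$ representative from the coefficients $(q_0, q_1, q_2, q_3)$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
e1 = np.array([[1j, 0], [0, -1j]])
e2 = np.array([[0, 1], [-1, 0]], dtype=complex)
e3 = np.array([[0, 1j], [1j, 0]])

def to_matrix(q):
    """Hypothetical helper: 2x2 matrix of q0*1 + q1*e1 + q2*e2 + q3*e3."""
    q0, q1, q2, q3 = q
    return q0 * I2 + q1 * e1 + q2 * e2 + q3 * e3

rng = np.random.default_rng(0)
q = rng.normal(size=4) + 1j * rng.normal(size=4)  # complex coefficients

Q = to_matrix(q)
# The dual flips the sign of the e1, e2, e3 coefficients (no conjugation)
Q_dual = to_matrix(q * np.array([1, -1, -1, -1]))

# Lemma: Q^D = -e2 Q^T e2 (plain transpose, not conjugate transpose)
lemma_holds = np.allclose(Q_dual, -e2 @ Q.T @ e2)
print(lemma_holds)
```

Since both sides are $\mathbb{C}$-linear in $Q$, checking the four basis elements would already suffice; the random test is just a convenient way to cover them all at once.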

Definition: The adjoint of a quaternion $Q = q_0 \mathbb{I} + q_1 e_1 + q_2 e_2 + q_3 e_3 \in \mathbb{H}_\mathbb{C}$ is

$$Q^\dagger = \overline{q_0} \mathbb{I} - \overline{q_1} e_1 - \overline{q_2} e_2 - \overline{q_3} e_3.$$

Note that $Q \mapsto Q^\dagger$ is a conjugate-linear operation and, given our matrix representation, it is exactly the conjugate transpose of the matrix $Q$.

Corollary: A quaternion $Q$ is real (has real coefficients) if and only if $Q^\dagger = Q^\mathsf{D}$, i.e. $Q^\dagger = -e_2 Q^\mathsf{T} e_2$. Equivalently, a $2 \times 2$ matrix $Q$ is in the real span of $\mathbb{I}, e_1, e_2, e_3$ if and only if $Q^\dagger = -e_2 Q^\mathsf{T} e_2$. $\triangle$
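The corollary gives a practical test for whether a $2 \times 2$ complex matrix represents a real quaternion. A minimal sketch, assuming NumPy, where `to_matrix` and `is_real_quaternion` are hypothetical helper names introduced for this illustration:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
e1 = np.array([[1j, 0], [0, -1j]])
e2 = np.array([[0, 1], [-1, 0]], dtype=complex)
e3 = np.array([[0, 1j], [1j, 0]])

def to_matrix(q):
    """Hypothetical helper: 2x2 matrix of q0*1 + q1*e1 + q2*e2 + q3*e3."""
    q0, q1, q2, q3 = q
    return q0 * I2 + q1 * e1 + q2 * e2 + q3 * e3

def is_real_quaternion(Q):
    """Test the criterion Q^dagger = -e2 Q^T e2 from the corollary."""
    return bool(np.allclose(Q.conj().T, -e2 @ Q.T @ e2))

real_Q = to_matrix([0.5, -1.0, 2.0, 0.25])   # all coefficients real
complex_Q = to_matrix([0.5, 1j, 2.0, 0.25])  # q1 is not real

print(is_real_quaternion(real_Q), is_real_quaternion(complex_Q))
```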

We can now see the advantage of introducing $\mathbb{H}_\mathbb{C}$ even though we are really only interested in $\mathbb{H}$. Given an $n \times n$ (real) quaternion matrix $\mathcal{M}$ we identify this with a $2n \times 2n$ matrix $M$, and the condition that $\mathcal{M}_{ij} = \mathcal{M}_{ji}^\mathsf{D}$ becomes the requirement that

$$M = M^\mathsf{D} = M^\dagger$$

where $M^\mathsf{D} = - J M^\mathsf{T} J$ for $J = \underbrace{e_2 \oplus \dots \oplus e_2}_{n \text{ times}}$.

Remark: Define the non-degenerate skew-symmetric bilinear form $\Omega : \mathbb{C}^{2n} \times \mathbb{C}^{2n} \to \mathbb{C}$ by

$$\Omega(x,y) = x^\mathsf{T} J y .$$

Then $M = M^\mathsf{D}$ is equivalent to $\Omega(Mx,y) = \Omega(x,My)$ for all $x,y \in \mathbb{C}^{2n}$. $\triangle$
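To make the $2n \times 2n$ picture concrete, here is a sketch (assuming NumPy) that builds a random self-dual matrix with real quaternion entries and checks both $M = M^\mathsf{D} = M^\dagger$ and the symmetry of $M$ with respect to $\Omega$. The function `random_self_dual` is a hypothetical helper; it symmetrises a random real quaternion matrix $X$ as $(X + X^\dagger)/2$, which is self-dual because $X^\dagger = X^\mathsf{D}$ for real quaternion entries.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
e1 = np.array([[1j, 0], [0, -1j]])
e2 = np.array([[0, 1], [-1, 0]], dtype=complex)
e3 = np.array([[0, 1j], [1j, 0]])

def random_self_dual(n, rng):
    """Hypothetical helper: 2n x 2n representative of a random
    self-dual matrix with real quaternion entries."""
    coeffs = rng.normal(size=(4, n, n))  # real coefficient matrices
    # np.kron(A, e) places the 2x2 block A[i, j] * e at block position (i, j)
    X = sum(np.kron(coeffs[k], u) for k, u in enumerate([I2, e1, e2, e3]))
    return (X + X.conj().T) / 2

n = 3
rng = np.random.default_rng(1)
M = random_self_dual(n, rng)
J = np.kron(np.eye(n), e2)  # J = e2 (+) ... (+) e2, n times

self_dual_ok = np.allclose(M, -J @ M.T @ J)
hermitian_ok = np.allclose(M, M.conj().T)

def omega(x, y):
    """Skew-symmetric bilinear form x^T J y (no complex conjugation)."""
    return x @ J @ y

x = rng.normal(size=2 * n) + 1j * rng.normal(size=2 * n)
y = rng.normal(size=2 * n) + 1j * rng.normal(size=2 * n)
omega_ok = np.isclose(omega(M @ x, y), omega(x, M @ y))

print(self_dual_ok, hermitian_ok, omega_ok)
```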

Definition: The (non-compact) symplectic group $\mathrm{Sp}(n)$ is the group of $2n \times 2n$ matrices $U$ for which $\Omega(Ux,Uy) = \Omega(x,y)$ for all $x,y \in \mathbb{C}^{2n}$. The (compact) symplectic group is $\mathrm{USp}(n)=\mathrm{Sp}(n) \cap \mathrm{U}(2n)$. $\triangle$

It is easily seen that for $U \in \mathrm{USp}(n)$, $U^{-1}= U^\mathsf{D} = U^\dagger$, so that $U$ may be thought of as an $n \times n$ matrix with real quaternion entries whose dual is its inverse. Note that $\mathrm{USp}(n)$ is exactly the group which, acting by conjugation, preserves (real) quaternion self-duality.

Proposition (Kramers’ degeneracy): Let $M = M^\mathsf{D}$ be a $2n \times 2n$ matrix. Then the characteristic polynomial of $M$ is an exact square. In particular, $M$ has generically $n$ eigenvalues each of multiplicity $2$.

Proof: Because $M = M^\mathsf{D}$ we have that $(JM)^\mathsf{T} = - JM$, so $\zeta J - JM$ is skew-symmetric and

$$\det(\zeta \mathbb{I}- M) = \det(\zeta J- JM) = \left( \mathrm{pf}(\zeta J- JM)\right)^2$$

for $\zeta \in \mathbb{C}$, where $\mathrm{pf}$ denotes the Pfaffian. Here we have used that $\det J = 1$. $\square$
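The proposition can be illustrated numerically for $n = 2$, where the Pfaffian of a $4 \times 4$ skew-symmetric matrix has a short explicit formula. This is a sketch assuming NumPy; `pfaffian_4x4` is a hypothetical helper written out by hand (NumPy itself does not provide a Pfaffian), and the random test matrix is built by symmetrising a random real quaternion matrix.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
e1 = np.array([[1j, 0], [0, -1j]])
e2 = np.array([[0, 1], [-1, 0]], dtype=complex)
e3 = np.array([[0, 1j], [1j, 0]])

def pfaffian_4x4(A):
    """Hypothetical helper: Pfaffian of a 4x4 skew-symmetric matrix."""
    return A[0, 1] * A[2, 3] - A[0, 2] * A[1, 3] + A[0, 3] * A[1, 2]

n = 2
rng = np.random.default_rng(2)
coeffs = rng.normal(size=(4, n, n))
X = sum(np.kron(coeffs[k], u) for k, u in enumerate([I2, e1, e2, e3]))
M = (X + X.conj().T) / 2      # Hermitian and self-dual
J = np.kron(np.eye(n), e2)

# zeta*J - J@M should be skew-symmetric
skew_ok = np.allclose((J @ M).T, -(J @ M))

# det(zeta - M) = pf(zeta*J - J@M)^2 at an arbitrary test point
zeta = 0.3 + 0.7j
lhs = np.linalg.det(zeta * np.eye(2 * n) - M)
rhs = pfaffian_4x4(zeta * J - J @ M) ** 2
square_ok = np.isclose(lhs, rhs)

# eigvalsh returns eigenvalues in ascending order, so pairs are adjacent
eigenvalues = np.linalg.eigvalsh(M)
pairs_ok = np.allclose(eigenvalues[0::2], eigenvalues[1::2])

print(skew_ok, square_ok, pairs_ok)
```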

Remark: Many works, including e.g. the textbooks of M. L. Mehta (Random Matrices) and P. Forrester (Log-Gases and Random Matrices), prefer to work with a so-called “quaternion determinant.” Given a self-dual $n \times n$ quaternion matrix $\mathcal{M}$ with $2n \times 2n$ representative $M$, we define the quaternion determinant

$$\mathrm{Qdet}(\mathcal{M}) = \mathrm{pf}(JM) .$$

Surprisingly, there is a theorem due to Dyson (see Theorem 5.1.2 of Mehta’s textbook) that shows that $\mathrm{Qdet}$ admits a Laplace-type formula in terms of a sum over permutations (ibid, Equation 5.1.5). All of this presumes that the matrix $M$ is self-dual; as far as I understand, $\mathrm{Qdet}$ is not defined for non-self-dual matrices. $\triangle$

Finally, to conclude our discussion, we must give meaning to the notion of diagonalising quaternion self-dual matrices. Let $\mathcal{M}$ be an $n \times n$ self-dual quaternion matrix and $M = M^\dagger = M^\mathsf{D}$ be its $2n \times 2n$ representative. We aim to show that $M$ may be diagonalised by an element of $\mathrm{USp}(n)$. Let us assume for simplicity of exposition that $M$ has exactly $n$ (distinct) eigenvalues $\lambda_1, \dots, \lambda_n \in \mathbb{R}$ each of multiplicity $2$. Let $v_k \in \mathbb{C}^{2n}$ be an eigenvector, $\| v_k \| = 1$, with eigenvalue $\lambda_k$:

$$M v_k = \lambda_k v_k$$

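By self-duality, $w_k := -J \overline{v_k}$ is also an eigenvector with eigenvalue $\lambda_k$. Indeed, $M = M^\dagger = M^\mathsf{D}$ gives $\overline{M} = -JMJ$, hence $M J \overline{v_k} = J \overline{M v_k} = \lambda_k J \overline{v_k}$, using that $\lambda_k$ is real. The vectors $w_k$ and $v_k$ are linearly independent, since $\| w_k \| = 1$ and $\langle w_k , v_k \rangle = w_k^\dagger v_k = v_k^\mathsf{T} J v_k = 0$ by the skew-symmetry of $J$. Then define the matrix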

$$U = \left( \begin{matrix} \vert & \vert &\dots &\vert & \vert \\ v_1 & w_1 &\dots & v_n & w_n \\ \vert & \vert &\dots &\vert & \vert \end{matrix} \right).$$

From the construction it is clear that

$$U^{-1} M U = \mathrm{diag}(\lambda_1, \lambda_1, \dots, \lambda_n, \lambda_n)$$

so $U$ diagonalises $M$. Furthermore we claim $U \in \mathrm{USp}(n)$. This can be seen from the following. Firstly, the columns of $U$ are orthonormal with respect to the standard Hermitian inner product on $\mathbb{C}^{2n}$: eigenvectors of the Hermitian matrix $M$ belonging to distinct eigenvalues are orthogonal, and $w_k \perp v_k$ was shown above. Hence $U$ must be unitary ($U^{-1} = U^\dagger$). Secondly, again by construction $J U J = - \overline{U} = -U^{-\mathsf{T}}$, and hence $U^\mathsf{D} = U^{-1}$. This completes the proof that $U \in \mathrm{USp}(n)$.
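The whole construction can be tried out numerically. The sketch below assumes NumPy and a random test matrix built as $(X + X^\dagger)/2$ from a random real quaternion matrix $X$; it picks one unit eigenvector $v_k$ from each degenerate pair returned by `np.linalg.eigh` and takes the partner vector as $-J\overline{v_k}$, the sign for which the symplectic check $U^\mathsf{T} J U = J$ passes with $J = e_2 \oplus \dots \oplus e_2$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
e1 = np.array([[1j, 0], [0, -1j]])
e2 = np.array([[0, 1], [-1, 0]], dtype=complex)
e3 = np.array([[0, 1j], [1j, 0]])

n = 3
rng = np.random.default_rng(3)
coeffs = rng.normal(size=(4, n, n))
X = sum(np.kron(coeffs[k], u) for k, u in enumerate([I2, e1, e2, e3]))
M = (X + X.conj().T) / 2      # M = M^dagger = M^D
J = np.kron(np.eye(n), e2)

# eigh sorts eigenvalues in ascending order, so each doubly degenerate
# eigenvalue occupies two adjacent slots; keep one eigenvector per pair.
evals, vecs = np.linalg.eigh(M)
columns = []
for k in range(n):
    v = vecs[:, 2 * k]        # unit eigenvector for lambda_k
    w = -J @ v.conj()         # partner eigenvector for the same lambda_k
    columns += [v, w]
U = np.column_stack(columns)

lam = evals[0::2]
unitary_ok = np.allclose(U.conj().T @ U, np.eye(2 * n))
symplectic_ok = np.allclose(U.T @ J @ U, J)
diagonal_ok = np.allclose(U.conj().T @ M @ U, np.diag(np.repeat(lam, 2)))

print(unitary_ok, symplectic_ok, diagonal_ok)
```

Note that the second eigenvector returned by `eigh` for each pair is discarded: it spans the same eigenspace together with the first, but it is in general not equal to $-J\overline{v_k}$, so using it directly would give a unitary matrix that need not be symplectic.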