Integer and Rational Solutions to a Binary Quadratic Equation (1): Legendre equation
In this post, we're interested in finding any or all rational solutions of a binary quadratic equation: \[ ax^2 + bxy + cy^2 + dx + ey + f = 0. \] We assume that the equation is non-degenerate in the sense that the following matrix \[ Q := \begin{pmatrix} 2a & b & d \\ b & 2c & e \\ d & e & 2f \\ \end{pmatrix} \] is non-singular.
Legendre equation and normal form
(Homogenization) The original equation might be heterogeneous. We introduce an additional variable \(z\) to homogenize it by setting \(x \gets x/z, y \gets y/z\). We also write it in the matrix form: \[ \mathbf{x}^\T Q \mathbf{x} = 0 , \quad \mathbf{x} := \begin{pmatrix} x & y & z \end{pmatrix}^\T. \] After homogenization, every rational solution can be converted into an integer solution. We also assume that \(\det Q \neq 0\).
(Diagonalization) The matrix \(Q\) here might not be diagonal. If so, we can diagonalize it by finding an invertible matrix \(P \in \QQ^{3 \times 3}\) and a diagonal matrix \(D \in \ZZ^{3 \times 3}\) such that \[ Q = P^\T D P, \quad \Longleftrightarrow \quad (P \mathbf{x})^\T D (P \mathbf{x}) = 0. \] One possible approach is to use LDL decomposition.
(Normalization) As \(Q\) is diagonal, the equation can be written as \(ax^2 + by^2 + cz^2 = 0\). Clearly, \(abc = \det Q \neq 0\). We refine this further by requiring that \(abc\) be square-free, leading to the normal form:
(Legendre equation, Normal Form) Legendre equation is defined as
\[ \begin{equation} ax^2 + by^2 + cz^2 = 0, \label{eq:legendre} \end{equation} \]
for non-zero integers \(a, b, c\). We say that it is in normal form if \(abc\) is square-free.
The following transformation by Gauss reduces a Legendre equation to normal form: Write \(bc = \alpha^2 s, ca=\beta^2 t, ab=\gamma^2 u\) for square-free integers \(s, t, u\) and integers \(\alpha, \beta, \gamma\), then \[ \frac{a \alpha}{\beta \gamma} (x/\alpha)^2 + \frac{b \beta}{\gamma \alpha} (y/\beta)^2 + \frac{c \gamma}{\alpha \beta} (z/\gamma)^2 = 0, \] and \(\frac{a \alpha}{\beta \gamma}, \frac{b \beta}{\gamma \alpha}, \frac{c\gamma}{\alpha \beta}\) are square-free and pair-wise coprime.
(Solubility) Legendre gave an elegant theorem to determine the solubility of an equation in normal form:
(Legendre) Equation \(\eqref{eq:legendre}\) in normal form has a non-trivial integer solution iff. \(a, b, c\) have different sign, and \(-bc, -ca, -ab\) are quadratic residues modulo \(a, b, c\) respectively.
The if
direction follows by considering modulo \(a, b, c\) individually.
Find a single non-trivial integer solution to Legendre equation \(\eqref{eq:legendre}\) in normal form
Brute-force
Holtz proved that there exists a small non-trivial solution:
(Small solution, Holzer) If Equation \(\eqref{eq:legendre}\) in normal form has a non-trivial solution, then there exists a non-trivial solution \((\hat x, \hat y, \hat z)\) such that:
\[ \hat x^2 \leq |bc|, \quad \hat y^2 \leq |ca|, \quad \hat z^2 \leq |ab|. \]
Additionally, \(|bc|\) is a square if and only if \(|bc| = 1\), since \(bc\) is square-free. The same holds for \(|ca|\) and \(|ab|\).
See [Mordell, Chapter 7, Theorem 5], [CR03] and [CM97] for more details.
Legendre's descent
(This subsection is mostly taken from [Aitken, Section 5].)
Apparently, \(ax^2 + by^2 + cz^2 = 0\) is equivalent to \((-ac) x^2 + (-bc) y^2 = (cz)^2\). Legendre's descent is based on the following observation: \[ax^2 + by^2 = z^2 \quad \Longleftrightarrow \quad a ( ux + z)^2 + b' (b y)^2 = (uz + ax)^2.\] for \(u^2 = a + b'b\). It allows us to reduce the problem \(ax^2 + by^2 = z^2\) to another problem \(ax^2 + b' y^2 = z^2\). When \(|b| \geq \max(2, |a|)\), the other problem is smaller as we can always choose \(|u| \leq |b|/2\) so \(|b'| \leq (u^2 + |a|) / |b| \leq |b|/4+1 < |b|\).
The algorithm runs as follows:
# given a, b, find (x, y, z) such that a x^2 + b y^2 = z^2 |
The key point of the algorithm is that \(u\) must exist during all recursions. It's proved that if \(ax^2 + by^2 + cz^2 = 0\) is in normal form, \(u\) always exists when calling \(\operatorname*{solve}(-ac, -bc)\).
See [CR03, Section 2.3], [Aitken] and [Hillgarter] for more details. [CR03] also optimizes the method but it's beyond the scope of the post.
Lattice-based algorithm
(This section is taken from [CR03, Section 2.6].)
Another idea arises when going deeper in Legendre's theorem.
We first discuss the case where any of \(|ab|, |bc|, |ca|\) is 1. W.L.O.G. we assume \(|ab| = 1\).
- If \(a \neq b\), then \((x, y, z) = (1, 1, 0)\) is a non-trivial solution.
- If \(a = b\), we can use Cornacchia's algorithm to solve \(x^2 + y^2 = -c / a\), and \((x, y, 1)\) is a non-trivial solution.
Now we consider the general case. We define a set of points \(\mathcal{L} \subset \ZZ^3\): \[ \begin{aligned} \mathcal{L} = \{ (x, y, z): \ & by \equiv \lambda_a z \pmod a, \\ & cz \equiv \lambda_b x \pmod b, \\ & ax \equiv \lambda_c y \pmod c,\\ & a x^2 + b y^2 + c z^2 \equiv 0 \pmod{2abc} \}, \end{aligned} \]
where \(\lambda_a\) is an arbitrary solution to \(\lambda^2 \equiv -bc \pmod{a}\), similarly for \(\lambda_b\) and \(\lambda_c\). Let \((\bar x, \bar y, \bar z)\) be the non-zero point in \(\mathcal{L}\) minimizing \(|a| x^2 + |b| y^2 + |c| z^2\). It can be proved that there is a small solution \((\hat x, \hat y, \hat z)\) in \(\mathcal{L}\), which can be used to bound \(|a\bar x^2 + b\bar y^2 + c \bar z^2|\): \[ \left |a \bar x^2 + b \bar y^2 + c \bar z^2 \right| \leq |a| \hat x^2 + |b| \hat y^2 + |c| \hat z^2 < 2|abc|. \] so we conclude that \(a \bar x^2 + b \bar y^2 + c \bar z^2 = 0\) and the minimizer itself satisfies Holzer's condition.
The final step is to determine the minimizer. [CR03, Section 2.6, p.17] shows that \(\mathcal{L}\) is a lattice, that is, \(\mathcal{L} = \{\mathbf{w} \mathbf{B}: \mathbf{w} \in \ZZ^3 \}\) for a (row-based) basis \(\mathbf{B} \in \ZZ^{3 \times 3}\). It also gives an explicit expression for a basis \(\mathbf{B}\). Moreover, the shortest vector in a 3D lattice can be found efficiently using LLL, as long as we have an arbitrary basis \(\mathbf{B}\) for \(\mathcal{L}\):
([CR03, Lemma 2.7, p.16]) Let \(\mathcal{L} = \{\mathbf{w} \mathbf{B}: \mathbf{w} \in \ZZ^3\}\) be a 3D lattice where \(\mathbf{B}\) is LLL-reduced. Then the shortest vector in \(\mathcal{L}\) is in the form of \(\mathbf{w}\mathbf{B}\) where \(\mathbf{w} \in \{-1, 0, 1\}^3\).
Here is the pseudocode:
def solve(a, b, c): |
Check out [CR03, Section 2.6] and [CM97] for more details. [CM97] uses a different lattice and [CR03] improves upon it.
A single non-trivial rational solution to \(\mathbf{x}^\top Q \mathbf{x} = 0\) with \(Q \in \ZZ^{d \times d}\)
Assume \(\det Q \neq 0\) and \(Q\) is symmetric. We here only list a few references below:
- [Simon05] removes the need of diagonalization, which often bloats the size of coefficients. It gives an algorithm where \(d = 3\). It works by reducing \(Q\) repeatedly until \(\det Q = \pm 1\), and then solving the case where \(\det Q = \pm 1\).
- [Simon06] extends [Simon05] and gives an algorithm for \(d > 3\). However, I have not read the paper.
All non-trivial solutions from a single non-trivial solution
Given a single non-trivial solution \(\mathbf{\hat x} \in \QQ^d\) to a quadratic equation \(F(\mathbf{x}) = 0\), we parametrize \(\mathbf{x} \in \QQ^d\) by \(\mathbf{x} = \mathbf{\hat x} - t\mathbf{u}\), where \(\mathbf{u} \in \ZZ^d, t \in \QQ\) and \(\mathbf{u}\) is primitive (i.e., \(\gcd(\mathbf{u}) = 1\)).
Since \(F(\mathbf{\hat x}) = 0\), we solve \(t\) in \(F(\mathbf{\hat x} - t\mathbf{u}) = 0\) using the second-order Taylor expansion of \(F\) at \(\mathbf{\hat x}\): \[ F(\mathbf{\hat x}) - t \mathbf{u}^\T \nabla F(\mathbf{\hat x}) + \frac{t^2}{2} \mathbf{u}^\T \nabla^2 F(\mathbf{\hat x}) \mathbf{u} = 0 \quad \Longrightarrow \quad t = \frac{\mathbf{u}^\T \nabla F(\mathbf{\hat x})}{\frac{1}{2} \mathbf{u}^\T \nabla^2 F(\mathbf{\hat x}) \mathbf{u}}. \] Since \(\mathbf{u}\) and \(-\mathbf{u}\) produce the same solution, we require that \(\mathbf{u}\) is lexicographically larger than \(-\mathbf{u}\). Note this parametrization also generates the initial solution \(\mathbf{\hat x}\) when \(\mathbf{u}^\top \nabla F(\mathbf{\hat x}) = 0\), i.e., \(\mathbf{u}\) is tangent to the curve.
Connection to integer solutions of a homogenous ternary quadratic equation
A homogeneous polynomial \(F(\mathbf{x})\) of degree \(d\) satisfies \(F(\lambda \mathbf{x}) = \lambda^d F(\mathbf{x})\), so every rational solution can be normalized to a primitive integer solution. Furthermore, any primitive integer solution \(\mathbf{x}\) with \(\mathbf{x}_1 \neq 0\) is equivalent to a rational solution of a heterogeneous equation \(F([1, \mathbf{x'}]) = 0\) for \(\mathbf{x}' \in \QQ^{d-1}\).
When the goal is to enumerate integral solutions, a good strategy is to first find a non-trivial rational solution to \(F(\mathbf{x}) = 0\) and then apply the parametrization trick to a heterogeneous equation to avoid duplicates.
Consider the following example. We consider the homogenous equation \(ax^2+bxy+cy^2=d z^2\) in \(\ZZ^3\) with a non-trivial primitive solution \((x_0, y_0, z_0)\) where \(z_0 \neq 0\), which is equivalent rational solutions to the heterogenous equation \(a x^2 + b xy + c y^2 - d = 0\) with a non-trivial solution \(\mathbf{\hat x} = (x_0 / z_0, y_0 / z_0)\). Write \(\mathbf{u} = [u, v]\) and apply the results above: \[ \begin{aligned} \mathbf{u}^\top \nabla F(\mathbf{\hat x}) & = z_0^{-1}(u (2ax_0 + b y_0) + v(b x_0 + 2c y_0) ), \\ \frac{1}{2} \mathbf{u}^\T \nabla^2 F(\mathbf{\hat x}) \mathbf{u} & = a u^2 + b uv + c v^2. \\ \end{aligned} \] After simplification, we have \[ \begin{pmatrix} x \\ y \\ z \end{pmatrix} = A \mathbf{r}, \quad A := \begin{pmatrix} ax_0 + by_0 & 2cy_0 & -cx_0 \\ -ay_0 & 2ax_0 & c y_0 + b x_0 \\ a z_0 & b z_0 & c z_0 \\ \end{pmatrix}, \quad \mathbf{r} := \begin{pmatrix} u^2 \\ uv \\ v^2 \end{pmatrix}, \] which are exactly the Equations (8-10) in Wolfram MathWorld. The equations might not generate primitive solutions directly, but they do produce all primitive tuples after normalization.
To use the equations to generate bounded primitive solutions, the common factor \(g := \gcd(A \mathbf{r})\) needs to be determined. It can be shown that \(g\) divides \(\det A\) by writing \(\mathbf{r}\) using \(A \mathbf{r}\): \[ \mathbf{r} = A^{-1} A \mathbf{r} = \frac{\adj A}{\det A} A \mathbf{r}, \] and observing that if \(g\) doesn't divide \(\det A\), \(\gcd(\mathbf{r})\) can't be 1.
In our case, \(\det A = -\Delta d z_0^3\) where \(\Delta = b^2 - 4ac\). The bound can be slightly improved to \(\Delta d z_0^2\) by finding the LCM of denominators in \(A^{-1}\): \[ A^{-1} = \frac{1}{\Delta d z_0^2} \begin{pmatrix} \Delta x_0 + cT & cS & -2cdz_0 \\ -aS & -cT & b d z_0 \\ aT & \Delta y_0 + a S & -2adz_0 \end{pmatrix}, \quad \begin{cases} S := 2cy_0+bx_0, \\ T := 2ax_0+by_0. \end{cases} \] If \(b = 0\), the denominator can be further reduced to \(2acdz_0^2\) instead of \(4acdz_0^2\).
References
- [Mordell]: Mordell, L.J., 1969. Diophantine Equations: Diophantine Equations (Vol. 30). Academic press.
- CR03: Cremona, J. and Rusin, D., 2003. Efficient solution of rational conics. Mathematics of Computation, 72(243), pp.1417-1441.
- Simon05: Simon, D., 2005. Solving quadratic equations using reduced unimodular quadratic forms. Mathematics of Computation, 74(251), pp.1531-1543.
- Simon06: Simon, D., Quadratic equations in dimensions 4, 5 and more. Preprint, 2005 [online]
- CM97: Cochrane, T. and Mitchell, P., 1998. Small solutions of the Legendre equation. Journal of Number Theory, 70(1), pp.62-66.
- Aitken: Wayne Aitken, Legrendre's Theorem, Legrange's Descent
- Hillgarter: Hillgarter, E., 1996. Rational points on conics. na.
- Holzer: Holzer, L., 1950. Minimal solutions of diophantine equations. Canadian Journal of Mathematics, 2, pp.238-244.
- It's weird that I can't find the first name of Holzer...