Integer and Rational Solutions of a Binary Quadratic Equation (5): Binary quadratic form

In this post, we are interested in finding all integer solutions of the following equation: \[ \begin{equation} ax^2 + bxy + cy^2 = n. %\label{eq:ibqf} \end{equation} \] where \(acn \neq 0\), \(b^2 - 4ac \neq 0\). This equation can certainly be solved using methods from the generalized Pell equation, but here, we explore other methods.

The function \(ax^2 + bxy + cy^2\) is also called the binary quadratic form (BQF), which we denote as \(\left< a,b,c \right>\) for brevity. The discriminant of \(\left< a,b,c \right>\) is defined as \(\Delta := b^2 - 4ac\).

Theory of binary quadratic forms

Binary quadratic forms have been studied for centuries and many textbooks discuss them. The goal of this section is to provide a brief introduction. Advanced topics (e.g., composition) will not be discussed.

(Equivalence) We define an equivalence relation between BQFs: \(\left< a,b,c \right> \sim \left< a',b',c' \right>\) if and only if there exists an \(M \in \ZZ^{2\times 2}\) such that \(\det M = 1\) and \[ \begin{equation} M^\T \begin{pmatrix} a & b/2 \\ b / 2 & c \\ \end{pmatrix} M = \begin{pmatrix} a' & b' / 2 \\ b' / 2 & c' \\ \end{pmatrix}, %\label{eq:equivalence} \end{equation} \] The matrix determinants of both sides show that \(\left< a,b,c \right>\) and \(\left< a',b',c' \right>\) have the same discriminant.

(Representing numbers) One way to understand this equivalence is by examining the set of numbers it represents. We say \(\left< a,b,c \right>\) represents \(n\) if the equation \(ax^2 + bxy + cy^2 = n\) for some \(x, y \in \ZZ\), and it properly represents \(n\) if \(\gcd(x, y) = 1\). If \(\left< a,b,c \right> \sim \left< a',b',c' \right>\) and \(\left< a',b',c' \right>\) represents \(n\) by \([x, y]\), then \(\left< a,b,c \right>\) represents \(n\) through \(M [x, y]\), which is a simple change of variables (under the constraint that \(\det M = 1\)).

(Positive definite) A BQF \(\left< a,b,c \right>\) is positive definite if for any \((x, y) \neq 0\), \(ax^2 + bxy + cy^2 > 0\). An alternative definition states that \(a > 0, c > 0\) and \(\Delta < 0\).

(Lagrange-reduced form) A positive definite BQF \(\left< a,b,c \right>\) is (Lagrange-)reduced if it meets both of the following conditions: (1) \(|b| \leq a \leq c\), (2) if \(|b| = a\) or \(a = c\), then \(b \geq 0\). Every positive definite BQF is equivalent to a unique reduced BQF.

(Reduction) We have two rules to reduce a positive definite BQF, similar to the Euclidean algorithm: \[ \begin{equation} \left< a,b,c \right> \sim \left< c,-b,a \right>, \quad M = \begin{pmatrix} 0 & 1 \\ -1 & 0\end{pmatrix}. \label{R.1} \tag{R.1} \end{equation} \] and for any \(t \in \ZZ\), \[ \begin{equation} \left< a,b,c \right> \sim \left<a, b - 2at, at^2 - bt + c\right>, \quad M = \begin{pmatrix} 1 & -t \\ 0 & 1 \end{pmatrix}, \label{R.2} \tag{R.2} \end{equation} \] A special case of \(\eqref{R.2}\) is \(\left< a,a,c \right> \sim \left< a,-a,c \right>\) for \(t = 1\).

We can always apply \(\eqref{R.1}\) to ensure that \(a \leq c\), and \(\eqref{R.2}\) to ensure \(|b| \leq a\) by setting \(t = \left\lfloor \frac{b}{2a} \right\rceil\). Both \(\eqref{R.1}\) and \(\eqref{R.2}\) reduces either \(a\) or \(|b|\) while keeping the other the same, so the process must terminate with a BQF \(\left< a,b,c \right>\) where \(|b| \leq a \leq c\). Finally, if \(a = c\) or \(a = |b|\), we choose the one with \(b \geq 0\).

See William Stein's lecture notes and [Manguba-Glover18, Chapter 2] for more details.

Normalization

(Normalization) The equation is first normalized as follows:

  • Assume \(\gcd(a, b, c) = 1\). Otherwise, we divide both sides by \(\gcd(a, b, c)\).
  • Assume \(\gcd(a, n) = 1\). Otherwise, we can always find \(u, s, v, r \in \ZZ\) such that \(us - vr = 1\) and \(\gcd(au^2 + buv + cv^2, n) = 1\), and then apply \(\begin{pmatrix} u & r \\ v & s\end{pmatrix}\) to \(\left< a,b,c \right>\). See here for how to find \((u, v)\). The high level idea is that for any prime \(p\), it's enough to try \((1, 0), (0, 1), (1, 1)\), and we use CRT to combine the results.

(Primitive solutions) We only care about the primitive solutions \((x, y)\) where \(g := \gcd(x, y) = 1\). If \(g \neq 1\), we find the primitive solutions of \(ax^2 + bxy + cy^2 = n/g^2\). It's also clear that \(\gcd(y, n) = 1\) for primitive solutions \((x, y)\).

Ellipse case (\(\Delta < 0\))

(This section is mostly taken from [Matthews15].)

We assume the BQF \(\left< a,b,c \right>\) is positive definite. Otherwise, we negate \(a, b, c\) and \(n\).

For a primitive solution \((x, y)\) to \(ax^2 + bxy + cy^2 = n\), define \(\lambda := x y^{-1} \bmod{|n|}\), which certainly satisfies \(a\lambda^2 + b\lambda + c \equiv 0 \pmod{|n|}\). We write \(x = x'n - \lambda y\), hence \[ ax^2 + bxy + cy^2 = n, \quad \Longleftrightarrow \quad an x'^2 + (b-2\lambda a) x'y + \frac{a\lambda^2 + b\lambda + c}{n}y^2 = 1. \] Note that these two BQFs have the same discriminant.

The theorem below determines the minimum positive value of a reduced positive definite BQF:

([Matthews15, Lemma 0.1], [Manguba-Glover18, Lemma 1]) For a reduced positive definite BQF \(\left< a, b, c \right>\), the minimum positive value of \(ax^2 + bxy + cy^2\) for \(x, y \in \ZZ\) is \(a\), and is achieved when \((x, y)\) is:

  • \((\pm 1, 0)\) if \(a < c\);
  • \((\pm 1, 0), (0, \pm 1)\) if \(b < a = c\);
  • \((\pm1, \mp1)\) if \(b = a = c\).

After reducing \(\left<an, b-2 \lambda a, \frac{a\lambda^2+b\lambda+c}{n} \right>\) to \(\left< a', b', c' \right>\) using some \(M_{\lambda}\), we can use the theorem above to determine the solutions of \(a'x^2 + b'xy + c'y^2 = 1\).

Here is an algorithm to solve the case:

def solve_bqf_neg_disc_primitive(a, b, c, n):
"""find primitive solutions to ax^2 + bxy + cy^2 = n.

Assume gcd(a, b, c) = 1 and gcd(c, n) = 1, b^2 - 4ac < 0.
"""

Δ = b*b - 4*a*c
assert Δ < 0

for λ in sqrtmod_all(Δ, n):
λ = invmod(2*a, n) * (λ - b) % n

(a2, b2, c2), M = bqf_neg_disc_reduce((a*n, b - 2*a*λ, (a*λ*λ + b*λ + c) // n))
if a2 != 1:
continue

# transform back to the original equation
M = [[n, -λ], [0, 1]] @ M

if a2 < c2:
sols = [[1, 0], [-1, 0]]
elif b2 < a2:
sols = [[1, 0], [-1, 0], [0, 1], [0, -1]]
else:
sols = [[1, -1], [-1, 1]]

yield from (M @ s for s in sols)

Parabola case (\(\Delta > 0\))

(This section is mostly taken from [Matthews15] and [MRS17].)

Structure and transformation

(Equivalence relation) [Matthews23, Section 2] Let \((x^*, y^*)\) be the fundamental solution to \(x^2 - \Delta y^2 = 4\), and \((x, y)\) be a solution to \(ax^2 + bxy + cy^2 = n\), then any \((x', y')\) such that \[ \begin{pmatrix} x' \\ y' \\ \end{pmatrix} = \pm U^k \begin{pmatrix} x \\ y \end{pmatrix}, \quad U := \begin{pmatrix} \frac{x^* - by^*}{2} & -cy^* \\ ay^* & \frac{x + by^*}{2} \end{pmatrix}. \] for \(k \in \ZZ\) is also a solution to \(ax^2 + bxy + cy^2 = n\). Hence we can use the associated equivalence relation from the group action of a group \(\left\{ \pm U^k: k \in \ZZ \right\}\) on the set of all solutions, which is the Stolt equivalence.

(Partition of solution space) Similarly as the ellipse case, we define \(\lambda := x y^{-1} \bmod{|n|}\) for a primitive solution \((x, y)\) and do the same transformation: write \(x = x'n - \lambda y\), hence \[ ax^2 + bxy + cy^2 = n, \quad \Longleftrightarrow \quad an x'^2 + (b-2\lambda a) x'y + \frac{a\lambda^2 + b\lambda + c}{n}y^2 = 1. \] One can also show ([Stolt57, Theorem 5]) that two primitive solutions are Stolt equivalent if and only if they have the same \(\lambda\), implying that the number of Stolt equivalent classes is the finite.

Find Stolt fundamental solutions

Our goal in this subsection is to find the Stolt fundamental solution for a specific \(\lambda\), which solves the equation \[ an x'^2 + (b-2\lambda a) x'y + \frac{a\lambda^2 + b\lambda + c}{n}y^2 = 1, \] with minimum non-negative \(y\) (and maximum \(x\) for tie-breaking).

With abuse of notation, we study the equation \(ax^2 + bxy + cy^2 = n\) where \(\Delta > 0\) is not a square, \(a > 0\) and \(n = \pm 1\). Additionally, we require \(y > 0\). Since \(\Delta \equiv 1 \pmod 4\) and \(\Delta\) is not a square, we have \(\Delta \geq 5\).

We first present a very interesting result in [Serret66]:

([Serret66]) Let \(0 < n < \sqrt{ \Delta } / 2\), and let \((x, y)\) be a primitive solution to \(ax^2 + bxy + cy^2 = n\) with \(y > 0\) where \(a > 0\). In addition, let \(\omega, \omega' = (-b \pm \sqrt{ \Delta }) / 2a\) be the roots of \(ar^2 + br + c = 0\) where \(\omega < \omega'\). Then \(p/q\) is a convergent to either \(\omega\) or \(\omega'\).

[MRS17] gives a simple proof: Note \(a(x / y - \omega)(x / y - \omega') = n / y^2\). If \(x / y > \omega'\), \[ \frac{x}{y} - \omega' = \frac{n}{a (x / y - \omega) y^2} < \frac{\sqrt{ \Delta } / 2}{a(\omega' - \omega )y^2} = \frac{1}{2y^2}, \] then by Dirichlet's approximation theorem, we conclude that \(x / y\) is a convergent to \(\omega'\). Similarly, if \(x / y < \omega\), it's a convergent to \(\omega\).

However, the theorem is no longer true for negative \(n\). [Pavone86] generalized the theorem for \[ |n| = \mu :=\min\limits_{\substack{(x, y) \in \ZZ^2 \\ (x, y) \neq (0, 0)}} |ax^2 + bxy + cy^2|, \] where he proved that Serret's theorem still holds unless \(\Delta = 5\), \(n < 0\), in which case \(x / y\) can also be a special semiconvergent of both \(\omega\) and \(\omega'\). [MRS17] further generalizes [Pavone86]'s result, allowing \(|n| < \sqrt{ \Delta } / 2\).

Anyway, [Pavone86]'s result is enough for us since we only care about the case \(n = \pm 1\). One way to find the fundamental solution is to examine the candidates from the following categories to see if they solve the equation, and pick the most fundamental one:

  1. Check the smallest convergent of \(\omega\) before the first period ends;
  2. Check the smallest convergent of \(\omega'\) before the first period ends;
  3. If \(\Delta = 5\) and \(n < 0\), also check the special semi-convergent.

[Matthews02] also designed an algorithm based on [Pavone86], but added many optimizations.

References

General:

  • Robertson09: Robertson, J., 2009. Computing in quadratic orders. at jpr2718. org.

Ellipse case (\(\Delta < 0\)):

  • Matthews15: Matthews, K.R., 2015. On a transformation of Lagrange [online]
  • Matthews14: Matthews, K.R., 2014. Lagrange's Algorithm Revisited: Solving \(at^2+ btu+ cu^2= n\) in the case of negative discriminant. Journal of Integer Sequences, 17(11).
    • It uses continued fraction and is more complicated. Also need to handle a few exceptional cases.
  • Manguba-Glover18: Manguba-Glover, K., 2018. Reduction Theory of Binary Quadratic Forms (Doctoral dissertation, University of Hawai ‘i at Manoa).

Parabola case (\(\Delta > 0\)):

  • Matthews23: Matthews, K., 2023. Finding the Fundamental Solutions of \(ax^2+ bxy+ cy^2= n\).
    • Notes on Stolt equivalence.
  • MR21: Matthews, K.R. and Robertson, J.P., 2021. On solving a binary quadratic Diophantine equation. Rocky Mountain Journal of Mathematics, 51(4), pp.1369-1385.
    • Unfortunately it's paywalled so I haven't read it yet. My guess is that it's pretty similar to [Matthews23].
  • Matthews02: Matthews, K., 2002. The diophantine equation \(ax^ 2+ bxy+ cy^ 2= N\), \(D= b^ 2-4ac>0\). Journal de théorie des nombres de Bordeaux, 14(1), pp.257-270.
    • Continued fraction-based algorithm.
  • MRS15: Matthews, K.R., Robertson, J.P. and Srinivasan, A., 2015. On fundamental solutions of binary quadratic form equations.
    • This bounds the scale of the Stolt fundamental solutions.
  • MRS17: Matthews, K.R., Robertson, J.P. and Srinivasan, A., 2017. Extending Theorems of Serret and Pavone. J. Integer Seq., 20(10), pp.17-10.
  • [Serret66] : Serret, J.A., 1866. Cours d'algèbre supérieure (Vol. 1). Gauthier Villars.
  • Pavone86: Pavone, M., 1986. A Remark on a Theorem of Serret. Journal of Number Theory, 23(2), pp.268-278.
  • Stolt57: Stolt, B., 1957. On a Diophantine equation of the second degree. Arkiv för Matematik, 3(4), pp.381-390.