Lemma 3.24 Let $f(x, y) = ax^2 + bxy + cy^2$ be a reduced positive definite form. If for some pair of integers $x$ and $y$ we have g.c.d.$(x, y) = 1$ and $f(x, y) \le c$, then $f(x, y) = a$ or $c$, and the point $(x, y)$ is one of the six points $\pm(1, 0)$, $\pm(0, 1)$, $\pm(1, -1)$. Moreover, the number of proper representations of $a$ by $f$ is
2 if $a < c$,
4 if $0 \le b < a = c$, and
6 if $a = b = c$.
Proof Suppose that g.c.d.$(x, y) = 1$. If $y = 0$ then $x = \pm 1$, and we note that $f(\pm 1, 0) = a$. Now suppose that $y = \pm 1$. If $|x| \ge 2$ then
\[
\begin{aligned}
|2ax + by| &\ge |2ax| - |by| && \text{(by the triangle inequality)} \\
&\ge 4a - |b| \\
&\ge 3a && \text{(since } |b| \le a\text{)}.
\end{aligned}
\]
Then by (3.3) we deduce that
\[
\begin{aligned}
4af(x, y) &= (2ax + by)^2 - dy^2 \\
&\ge 9a^2 - d && \text{(since } y^2 = 1\text{)} \\
&= 9a^2 - b^2 + 4ac \\
&\ge 8a^2 + 4ac && \text{(since } |b| \le a\text{)} \\
&> 4ac && \text{(since } a > 0\text{)}.
\end{aligned}
\]
Thus $f(x, \pm 1) > c$ if $|x| \ge 2$. Now suppose that $|y| \ge 2$. Then by (3.3) we see that
\[
\begin{aligned}
4af(x, y) &= (2ax + by)^2 - dy^2 \\
&\ge -dy^2 \\
&\ge -4d \\
&= 16ac - 4b^2 \\
&> 8ac - 4b^2 && \text{(since } ac > 0\text{)} \\
&\ge 4ac + 4a^2 - 4b^2 && \text{(since } 0 < a \le c\text{)} \\
&\ge 4ac && \text{(since } |b| \le a\text{)}.
\end{aligned}
\]
Thus $f(x, y) > c$ if $|y| \ge 2$. The only points remaining are $\pm(1, 0)$, $\pm(0, 1)$, $\pm(1, -1)$, and $\pm(1, 1)$. As $b > -a$, we find that $f(1, 1) = a + b + c > c$, so that the proper representations of $a$ and of $c$ are obtained by considering the first three pairs of points.
The last assertion of the lemma now follows on observing that $f(1, 0) = a$, $f(0, 1) = c$, and $f(1, -1) = a - b + c$.
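The three counts in Lemma 3.24 can be checked by direct enumeration. The following is a minimal sketch, not taken from the text: the function name `proper_representations` is our own, and the search box is an assumption derived from the identity $4af(x, y) = (2ax + by)^2 - dy^2$ (and its analogue with the roles of $x$ and $y$ exchanged), which forces $y^2 \le 4an/|d|$ and $x^2 \le 4cn/|d|$.

```python
from math import gcd, isqrt

def proper_representations(a, b, c, n):
    """Return all coprime pairs (x, y) with a*x^2 + b*x*y + c*y^2 == n.

    Assumes the form is positive definite (a > 0 and b^2 - 4ac < 0).
    """
    d = b * b - 4 * a * c
    if not (a > 0 and d < 0):
        raise ValueError("form must be positive definite")
    # 4a*f = (2ax + by)^2 - d*y^2 gives y^2 <= 4an/|d|; similarly for x.
    y_max = isqrt(4 * a * n // -d)
    x_max = isqrt(4 * c * n // -d)
    return [(x, y)
            for x in range(-x_max, x_max + 1)
            for y in range(-y_max, y_max + 1)
            if gcd(x, y) == 1 and a * x * x + b * x * y + c * y * y == n]

# One reduced form for each case of the lemma (illustrative choices).
print(len(proper_representations(1, 0, 2, 1)))  # case a < c
print(len(proper_representations(3, 1, 3, 3)))  # case 0 <= b < a = c
print(len(proper_representations(1, 1, 1, 1)))  # case a = b = c
```

For the three sample forms the enumeration returns 2, 4, and 6 points respectively, in agreement with the lemma.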
Theorem 3.25 Let $f(x, y) = ax^2 + bxy + cy^2$ and $g(x, y) = Ax^2 + Bxy + Cy^2$ be reduced positive definite quadratic forms. If $f \sim g$ then $f = g$.
Proof Suppose that $f \sim g$. By Lemma 3.24, the least positive number properly represented by $f$ is $a$, and that by $g$ is $A$. By Theorem 3.17 it follows that $a = A$. We consider first the case $a < c$. Then by Lemma 3.24 there are precisely 2 proper representations of $a$ by $f$. By Theorem 3.17 it follows that there are precisely 2 proper representations of $a$ by $g$, and from Lemma 3.24 we deduce that $C > a$. Thus by Lemma 3.24 we see that $c$ is the least number greater than $a$ that is properly represented by $f$, and $C$ is the least such for $g$. By Theorem 3.17 it follows that $c = C$. To show that $b = B$, we consider the matrices $M \in \Gamma$ that might take $f$ to $g$. Since $\det(M) = m_{11}m_{22} - m_{21}m_{12} = 1$, we know that g.c.d.$(m_{11}, m_{21}) = 1$. Thus by (3.7a), $f(m_{11}, m_{21}) = a$ is a proper representation of $a$. By Lemma 3.24 it follows that the first column of $M$ is $\pm\begin{bmatrix} 1 \\ 0 \end{bmatrix}$. We see similarly that g.c.d.$(m_{12}, m_{22}) = 1$, so that by (3.7c), $f(m_{12}, m_{22}) = c$ is a proper representation of $c$. Hence by Lemma 3.24, the second column of $M$ is $\pm\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ or $\pm\begin{bmatrix} -1 \\ 1 \end{bmatrix}$. Thus we see that the only candidates for $M$ are $\pm I$ and $\pm\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}$. However, in the latter event (3.7b) would give $B = -2a + b$, which is impossible since $b$ and $B$ must both lie in the interval $(-a, a]$. This leaves only $\pm I$, and we see that if $M = \pm I$ then $f = g$.
We now consider the case $a = c$. From Lemma 3.24 we see that $a$ has at least 4 proper representations by $f$. From Theorem 3.17 it follows that the same is true of $g$, and then by Lemma 3.24 we deduce that $C = a = c$.
Thus by Definition 3.8, $0 \le b \le a = c$ and $0 \le B \le A = C = a$. As $b^2 - 4ac = B^2 - 4AC$, it follows that $b = B$, and hence that $f = g$.
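Since every positive definite form is equivalent to a reduced one, Theorem 3.25 yields a practical test: two forms are equivalent exactly when they reduce to the same form. The sketch below is an illustration under stated assumptions, not the text's own procedure. It assumes that "reduced" means $-a < b \le a < c$ or $0 \le b \le a = c$, and it uses two standard reduction moves: the translation $\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$, which sends $(a, b, c)$ to $(a, b + 2ak, ak^2 + bk + c)$, and the swap $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$, which sends $(a, b, c)$ to $(c, -b, a)$. The names `reduce_form` and `equivalent` are hypothetical.

```python
def reduce_form(a, b, c):
    """Return the reduced form equivalent to the positive definite form (a, b, c).

    Assumed notion of reduced: -a < b <= a < c, or 0 <= b <= a == c.
    """
    if not (a > 0 and b * b - 4 * a * c < 0):
        raise ValueError("form must be positive definite")
    while True:
        if not (-a < b <= a):
            # Translate x -> x + k*y so that the middle coefficient lands in (-a, a].
            k = (a - b) // (2 * a)
            b, c = b + 2 * a * k, a * k * k + b * k + c
        elif a > c or (a == c and b < 0):
            # Swap the outer coefficients and negate the middle one.
            a, b, c = c, -b, a
        else:
            return (a, b, c)

def equivalent(f, g):
    """Equivalence test based on uniqueness of the reduced form (Theorem 3.25)."""
    return reduce_form(*f) == reduce_form(*g)

# (13, 17, 6) is 2x^2 + xy + 3y^2 transformed by the matrix [[2, 1], [1, 1]].
print(reduce_form(13, 17, 6))
print(equivalent((2, 1, 3), (2, -1, 3)))
```

Both $2x^2 + xy + 3y^2$ and $2x^2 - xy + 3y^2$ are already reduced and distinct, so by Theorem 3.25 they are inequivalent even though they have the same discriminant.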
In the case $a < c$ considered above, we not only proved that $f = g$, but also established that the only matrices $M \in \Gamma$ that take $f$ to itself are $\pm I$. We now extend this.
Definition 3.10 Let $f$ be a positive definite binary quadratic form. A matrix $M \in \Gamma$ is called an automorph of $f$ if $M$ takes $f$ to itself, that is, if
$f(m_{11}x + m_{12}y, m_{21}x + m_{22}y) = f(x, y)$. The number of automorphs of $f$ is denoted by $w(f)$.
For example, the matrix $\begin{bmatrix} -1 & -1 \\ 1 & 0 \end{bmatrix}$ is an automorph of $x^2 + xy + y^2$,
and of course the identity matrix $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ is an automorph of every form.
Theorem 3.26 Let $f$ and $g$ be equivalent positive definite binary quadratic forms. Then $w(f) = w(g)$, there are exactly $w(f)$ matrices $M \in \Gamma$ that take $f$ to $g$, and there are exactly $w(f)$ matrices $M \in \Gamma$ that take $g$ to $f$. Moreover, the only values of $w(f)$ are 2, 4, and 6. If $f$ is reduced then
$w(f) = 4$ if $a = c$ and $b = 0$, $w(f) = 6$ if $a = b = c$, and $w(f) = 2$ otherwise.
Proof Let $A_1, A_2, \ldots, A_r$ be distinct automorphs of $f$, and let $M$ be a matrix that takes $f$ to $g$. Then $A_1M, A_2M, \ldots, A_rM$ are distinct members of $\Gamma$ that take $f$ to $g$. Conversely, if $M_1, M_2, \ldots, M_s$ are distinct members of $\Gamma$ that take $f$ to $g$, then $M_1M_1^{-1}, M_2M_1^{-1}, \ldots, M_sM_1^{-1}$ are distinct automorphs of $f$. Hence the automorphs of $f$ are in one-to-one correspondence with the matrices $M$ that take $f$ to $g$. If $M$ takes $f$ to $g$, then $M^{-1}$ takes $g$ to $f$, and these matrices $M^{-1}$ are in one-to-one correspondence with the automorphs of $g$. Thus the automorphs of $f$ are in one-to-one correspondence with those of $g$, and consequently $w(f) = w(g)$ if either number is finite. But the number is always finite, because any form is equivalent to a reduced form, and in the next paragraph we show that any reduced form has 2, 4, or 6 automorphs.
Suppose that $f$ is reduced. In the course of proving Theorem 3.25, we showed that $w(f) = 2$ if $a < c$, and we saw that $f(m_{11}, m_{21}) = a$ and $f(m_{12}, m_{22}) = c$ are proper representations of $a$ and $c$. Suppose now that $0 \le b \le a = c$ and that $M$ leaves $f$ invariant (i.e., $M$ takes $f$ to itself).
Then by Lemma 3.24 the columns of $M$ lie in the set $\left\{ \pm\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \pm\begin{bmatrix} 0 \\ 1 \end{bmatrix}, \pm\begin{bmatrix} 1 \\ -1 \end{bmatrix} \right\}$. Of the 36 such matrices, we need consider only those with determinant 1, and thus we have the six pairs $\pm M_1, \pm M_2, \ldots, \pm M_6$ where
\[
M_1 = I, \quad
M_2 = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}, \quad
M_3 = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}, \quad
M_4 = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \quad
M_5 = \begin{bmatrix} 0 & -1 \\ 1 & 1 \end{bmatrix}, \quad
M_6 = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}.
\]
We note that if any one of the four matrices $\pm M^{\pm 1}$ is an automorph, then all four are.
Here $M_1$ is always an automorph. By (3.7b) we see that $M_2$ takes $f$ to $g$ with $B = b - 2a \ne b$, so that $M_2$ is never an automorph. Likewise, by (3.7b), $M_3$ takes $f$ to a form with middle coefficient $b - 2c \ne b$, so that $M_3$ is never an automorph. As $M_4$ takes $f$ to $cx^2 - bxy + ay^2$, $M_4$ is an automorph if and only if $b = 0$ and $a = c$.
Since $M_5$ takes $f$ to $cx^2 + (2c - b)xy + (a - b + c)y^2$, we see that $M_5$ is an automorph if and only if $a = b = c$. Finally, $M_6 = M_5^{-1}$, so that $M_6$ is an automorph if and only if $a = b = c$. This gives the stated result.
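The case analysis above can be replayed mechanically. The sketch below is illustrative only: it assumes that $M$ takes $f$ to the form with coefficients $A = f(m_{11}, m_{21})$, $B = 2am_{11}m_{12} + b(m_{11}m_{22} + m_{21}m_{12}) + 2cm_{21}m_{22}$, $C = f(m_{12}, m_{22})$ (our reading of (3.7a) to (3.7c)), and it searches only the candidate columns supplied by Lemma 3.24, so it is valid for reduced forms only. The names `transform` and `automorphs` are our own.

```python
from itertools import product

def transform(form, M):
    """Coefficients of f(m11*x + m12*y, m21*x + m22*y) for f = (a, b, c)."""
    a, b, c = form
    (m11, m12), (m21, m22) = M
    f = lambda x, y: a * x * x + b * x * y + c * y * y
    A = f(m11, m21)
    B = 2 * a * m11 * m12 + b * (m11 * m22 + m21 * m12) + 2 * c * m21 * m22
    C = f(m12, m22)
    return (A, B, C)

def automorphs(form):
    """List the automorphs of a REDUCED positive definite form.

    By Lemma 3.24 each column must be one of the six points below.
    """
    form = tuple(form)
    columns = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]
    found = []
    for (m11, m21), (m12, m22) in product(columns, repeat=2):
        if m11 * m22 - m12 * m21 != 1:
            continue  # keep only matrices of determinant 1
        M = ((m11, m12), (m21, m22))
        if transform(form, M) == form:
            found.append(M)
    return found

for form in [(1, 0, 1), (1, 1, 1), (2, 1, 2), (2, 1, 3)]:
    print(form, len(automorphs(form)))
```

The sample forms cover each branch of Theorem 3.26: $a = c$ with $b = 0$, $a = b = c$, $a = c$ with $0 < b < a$, and $a < c$.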
We now employ our understanding of automorphs to generalize Theorem 3.21 (which was concerned with the particular form $x^2 + y^2$) to arbitrary positive definite binary quadratic forms $f$ of discriminant $d < 0$.
Extending the notation of the preceding section, we let $R_f(n)$ denote the number of representations of $n$ by $f$. Similarly, we let $r_f(n)$ denote the number of these representations that are proper. Finally, let $H_f(n)$ denote the number of integers $h$, $0 \le h < 2n$, such that $h^2 \equiv d \pmod{4n}$, say
$h^2 = d + 4nk$, with the further property that the form $nx^2 + hxy + ky^2$ is equivalent to $f$.
Theorem 3.27 Let $f$ be a positive definite binary quadratic form with discriminant $d < 0$. Then for any positive integer $n$, $r_f(n) = w(f)H_f(n)$, and $R_f(n) = \sum_{m^2 \mid n} r_f(n/m^2)$.
It may be shown that if a nonzero number $n$ is represented by an indefinite quadratic form whose discriminant is not a perfect square, then $n$ has infinitely many such representations. To construct an analogous theory for indefinite forms one must allow for solutions of Pell's equation $x^2 - dy^2 = \pm 4$.
Proof Let $\mathcal{S}_f(n)$ denote the set of those forms $g(x, y) = nx^2 + hxy + ky^2$ that are equivalent to $f$, and for which $0 \le h < 2n$. From Theorem 3.17 we know that such a form must have the same discriminant as $f$, so that $h^2 - 4nk = d$. Thus there are precisely $H_f(n)$ members of the set $\mathcal{S}_f(n)$.
If $g \in \mathcal{S}_f(n)$, then $g$ is equivalent to $f$, which is to say that there is a matrix $M \in \Gamma$ that takes $f$ to $g$. By Theorem 3.26 it follows that there are precisely $w(f)$ such matrices. Consequently, there are exactly $w(f)H_f(n)$ matrices $M \in \Gamma$ that take $f$ to a member of $\mathcal{S}_f(n)$. We now exhibit a one-to-one correspondence between these matrices $M$ and the proper representations of $n$.
Suppose that $M$ is of the sort described. Then by (3.7a) we see that $f(m_{11}, m_{21}) = n$. As $\det(M) = m_{11}m_{22} - m_{21}m_{12} = 1$, we see that g.c.d.$(m_{11}, m_{21}) = 1$, and thus the representation is proper. Conversely, suppose that $f(x, y) = n$ is a proper representation of $n$. To recover the matrix $M$, we take $m_{11} = x$, $m_{21} = y$. It remains to show that $m_{12}$ and $m_{22}$ are uniquely determined. Let $u$ and $v$ be chosen so that $xv - yu = 1$. In order that $\det(M) = 1$, we must have $m_{12} = u + tx$, $m_{22} = v + ty$ for some integer $t$. For $M$ of this form we see by (3.7b) that
\[
\begin{aligned}
h &= 2ax(u + tx) + bx(v + ty) + by(u + tx) + 2cy(v + ty) \\
&= (2axu + bxv + byu + 2cyv) + 2nt.
\end{aligned}
\]
Thus there is a unique $t$ for which $0 \le h < 2n$. This gives a unique matrix $M$ with the desired properties. The first of the asserted identities is thus established.
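The converse direction of this correspondence is constructive, and can be sketched in code. The following is an illustration under our own naming (`ext_gcd`, `matrix_from_representation`), not the text's procedure: it finds $u, v$ with $xv - yu = 1$ by the extended Euclidean algorithm, then picks the unique shift $t$ that puts the middle coefficient $h$ in the range $0 \le h < 2n$.

```python
def ext_gcd(p, q):
    """Return (g, s, t) with s*p + t*q == g == gcd(p, q) and g >= 0."""
    old_r, r = p, q
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_s, s = s, old_s - k * s
        old_t, t = t, old_t - k * t
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

def matrix_from_representation(form, x, y):
    """Given a proper representation f(x, y) = n, return (M, (n, h, k)).

    M has first column (x, y) and determinant 1, and takes f to
    n*X^2 + h*X*Y + k*Y^2 with 0 <= h < 2n.
    """
    a, b, c = form
    n = a * x * x + b * x * y + c * y * y
    g, s, t = ext_gcd(x, y)
    if g != 1:
        raise ValueError("representation is not proper")
    u, v = -t, s  # then x*v - y*u == s*x + t*y == 1
    h0 = 2 * a * x * u + b * (x * v + y * u) + 2 * c * y * v
    shift = -(h0 // (2 * n))  # the unique t with 0 <= h0 + 2*n*t < 2*n
    m12, m22 = u + shift * x, v + shift * y
    h = h0 + 2 * n * shift
    k = a * m12 * m12 + b * m12 * m22 + c * m22 * m22
    return ((x, m12), (y, m22)), (n, h, k)

M, g = matrix_from_representation((1, 0, 1), 2, 1)
print(M, g)
```

For $f = x^2 + y^2$ and the representation $f(2, 1) = 5$, the construction produces a matrix of determinant 1 and a form $5x^2 + hxy + ky^2$ of discriminant $-4$ with $0 \le h < 10$.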
To establish the second identity, suppose that $x$ and $y$ are integers such that $f(x, y) = n$, and put $m = $ g.c.d.$(x, y)$. Then $m^2 \mid n$, and indeed $f(x/m, y/m) = n/m^2$ is a proper representation of $n/m^2$, since g.c.d.$(x/m, y/m) = 1$. Conversely, if $m^2 \mid n$ and $u$ and $v$ are relatively prime integers such that $f(u, v) = n/m^2$, then $f(mu, mv) = n$ and g.c.d.$(mu, mv) = m$. Summing over the positive integers $m$ with $m^2 \mid n$ gives the second identity.
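Both identities of Theorem 3.27 lend themselves to a brute-force check for small $n$. The sketch below is a numerical illustration only, with hypothetical helper names. It assumes the reduction procedure described in its docstring to decide equivalence (via Theorem 3.25), takes $w(f)$ from the reduced form as in Theorem 3.26, and counts representations inside the box $y^2 \le 4an/|d|$, $x^2 \le 4cn/|d|$.

```python
from math import gcd, isqrt

def reduce_form(a, b, c):
    """Reduced form equivalent to (a, b, c); assumes -a < b <= a < c or 0 <= b <= a == c."""
    while True:
        if not (-a < b <= a):
            k = (a - b) // (2 * a)
            b, c = b + 2 * a * k, a * k * k + b * k + c
        elif a > c or (a == c and b < 0):
            a, b, c = c, -b, a
        else:
            return (a, b, c)

def representations(form, n):
    """All integer pairs (x, y) with f(x, y) == n, for positive definite f."""
    a, b, c = form
    d = b * b - 4 * a * c
    y_max = isqrt(4 * a * n // -d)
    x_max = isqrt(4 * c * n // -d)
    return [(x, y)
            for x in range(-x_max, x_max + 1)
            for y in range(-y_max, y_max + 1)
            if a * x * x + b * x * y + c * y * y == n]

def R(form, n):
    return len(representations(form, n))

def r(form, n):
    return sum(1 for x, y in representations(form, n) if gcd(x, y) == 1)

def w(form):
    """Number of automorphs, read off the reduced form (Theorem 3.26)."""
    a, b, c = reduce_form(*form)
    if a == b == c:
        return 6
    if a == c and b == 0:
        return 4
    return 2

def H(form, n):
    """Count h in [0, 2n) with h^2 = d + 4nk and (n, h, k) equivalent to the form."""
    a, b, c = form
    d = b * b - 4 * a * c
    target = reduce_form(a, b, c)
    count = 0
    for h in range(2 * n):
        if (h * h - d) % (4 * n) == 0:
            k = (h * h - d) // (4 * n)
            if reduce_form(n, h, k) == target:
                count += 1
    return count

def check_theorem_3_27(form, n_max):
    for n in range(1, n_max + 1):
        assert r(form, n) == w(form) * H(form, n)
        assert R(form, n) == sum(r(form, n // (m * m))
                                 for m in range(1, isqrt(n) + 1) if n % (m * m) == 0)
    return True

print(check_theorem_3_27((1, 0, 1), 60), check_theorem_3_27((2, 1, 3), 60))
```

For instance, with $f = x^2 + y^2$ and $n = 5$ one finds $H_f(5) = 2$ (from $h = 4$ and $h = 6$) and $w(f) = 4$, matching the eight proper representations $(\pm 1, \pm 2)$, $(\pm 2, \pm 1)$.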
Continuing our quest to generalize Theorem 3.21, we now let $N_d(n)$ denote the number of integers $h$ for which $h^2 \equiv d \pmod{4n}$ and $0 \le h < 2n$. Since $h$ is a solution of the congruence $u^2 \equiv d \pmod{4n}$ if and only if $h + 2n$ is a solution, it follows that $N_d(n)$ is precisely one-half the total number of solutions of the congruence $u^2 \equiv d \pmod{4n}$. Assuming that $n$ is a positive integer, the value of $N_d(n)$ may be determined by applying the tools of Chapter 2, particularly Theorems 2.20 and 2.23. Let $\mathcal{F}$ denote the set of all reduced positive definite binary quadratic forms of discriminant $d$. If $h^2 \equiv d \pmod{4n}$, say $h^2 = d + 4nk$, and $0 \le h < 2n$, then there is a unique form $f \in \mathcal{F}$ for which $nx^2 + hxy + ky^2$ is equivalent to $f$, that is, lies in the set $\mathcal{S}_f(n)$ introduced in the proof of Theorem 3.27. Hence
\[
\sum_{f \in \mathcal{F}} H_f(n) = N_d(n).
\]
For many discriminants $d$ it happens that $w(f)$ is the same for all $f \in \mathcal{F}$. In that case we let $w$ denote the common value. (In this connection, recall Problem 15 in Section 3.5, and see Problem 6 below.) For such $d$ we may multiply both sides by $w$ and appeal to Theorem 3.27 to see that
\[
\sum_{f \in \mathcal{F}} r_f(n) = wN_d(n).
\]
In this manner we may determine the total number of proper representations of $n$ by reduced forms of discriminant $d$, but unfortunately it is not always so easy to describe the individual numbers $r_f(n)$.
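The summed identity is easy to test numerically for a specific discriminant. The sketch below is an illustration with hypothetical names (`reduced_forms`, `N`, `total_proper`). It assumes that a reduced form satisfies $-a < b \le a < c$ or $0 \le b \le a = c$, which gives $3a^2 \le -d$ and so bounds the search for $a$. The choice $d = -23$, where every reduced form has $a < c$ and hence $w = 2$, is ours.

```python
from math import gcd, isqrt

def reduced_forms(d):
    """All reduced positive definite forms (a, b, c) with b^2 - 4ac == d < 0."""
    forms = []
    for a in range(1, isqrt(-d // 3) + 1):  # reduced forms satisfy 3a^2 <= -d
        for b in range(-a + 1, a + 1):      # -a < b <= a
            if (b * b - d) % (4 * a) == 0:
                c = (b * b - d) // (4 * a)
                if c > a or (c == a and b >= 0):
                    forms.append((a, b, c))
    return forms

def N(d, n):
    """Number of h with 0 <= h < 2n and h^2 congruent to d modulo 4n."""
    return sum(1 for h in range(2 * n) if (h * h - d) % (4 * n) == 0)

def r(form, n):
    """Number of proper representations of n by the positive definite form."""
    a, b, c = form
    d = b * b - 4 * a * c
    y_max = isqrt(4 * a * n // -d)
    x_max = isqrt(4 * c * n // -d)
    return sum(1
               for x in range(-x_max, x_max + 1)
               for y in range(-y_max, y_max + 1)
               if gcd(x, y) == 1 and a * x * x + b * x * y + c * y * y == n)

def total_proper(d, n):
    return sum(r(f, n) for f in reduced_forms(d))

print(reduced_forms(-23))
print([(n, total_proper(-23, n), 2 * N(-23, n)) for n in range(1, 9)])
```

For $n = 2$, for example, the congruence $h^2 \equiv -23 \pmod 8$ has the solutions $h = 1, 3$ in $0 \le h < 4$, so $N_{-23}(2) = 2$, and the four proper representations are $(\pm 1, 0)$ for each of $2x^2 + xy + 3y^2$ and $2x^2 - xy + 3y^2$.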
PROBLEMS
1. Let $f(x, y) = ax^2 + bxy + cy^2$ be a reduced positive definite form. Show that all representations of $a$ by $f$ are proper.
2. Let $f(x, y) = ax^2 + bxy + cy^2$ be a reduced positive definite form. Show that improper representations of $c$ may exist. (H)
3. Show that any positive definite binary quadratic form of discriminant $-3$ is equivalent to $f(x, y) = x^2 + xy + y^2$. Show that a positive integer $n$ is properly represented by $f$ if and only if $n$ is of the form $n = 3^{\alpha} \prod p^{\beta}$, where $\alpha = 0$ or 1 and all the primes $p$ are of the form $3k + 1$. Show that for $n$ of this form, $r_f(n) = 6 \cdot 2^s$, where $s$ is the number of distinct primes $p \equiv 1 \pmod 3$ that divide $n$.
4. Write the canonical factorization of $n$ in the form $n = 3^{\alpha} \prod p^{\beta} \prod q^{\gamma}$ where the primes $p$ are of the form $3k + 1$ and the primes $q$ are of the form $3k + 2$. Show that $n$ is represented by $f(x, y) = x^2 + xy + y^2$ if and only if all the $\gamma$ are even. Show that for such $n$, $R_f(n) = 6 \prod (\beta + 1)$.
5. Show that for any given $d < 0$, the primitive positive definite quadratic forms of discriminant $d$ all have the same number of automorphs.
6. Show that any positive definite quadratic form of discriminant $-23$ is equivalent to exactly one of the forms $f_0(x, y) = x^2 + xy + 6y^2$, $f_1(x, y) = 2x^2 + xy + 3y^2$ or $f_2(x, y) = 2x^2 - xy + 3y^2$. Show that if $\left(\frac{p}{23}\right) = -1$ then $p$ is not represented by any of these forms. Show that if $\left(\frac{p}{23}\right) = 1$ then $p$ has a total of 4 representations by these forms. Show that in this latter case either $p$ has 4 representations by $f_0$ or 2 representations apiece by $f_1$ and $f_2$. Determine which of these cases applies when $p = 139$. (H)
*7. Let $f(x, y) = ax^2 + bxy + cy^2$ be a reduced positive definite form. Suppose that g.c.d.$(x, y) = 1$ and that $f(x, y) \le a + |b| + c$. Show that $f(x, y)$ must be one of the numbers $a$, $c$, $a - |b| + c$ or $a + |b| + c$.
NOTES ON CHAPTER 3
§3.1, 3.2 Fermat characterized those primes for which 2, $-2$, 3, and $-3$ are quadratic residues. His assertions for $\pm 3$ were proved by Euler in 1760, and those for $\pm 2$ by Legendre in 1775. The first part of Theorem 3.1 was proved by Euler in 1755. The last part of Theorem 3.1, first proved by Euler in 1749, is equivalent to Theorem 2.11. We proved Theorem 2.11 by the simpler method discovered by Lagrange in 1773. In 1738 Euler observed that whether the congruence $x^2 \equiv a \pmod p$ has a solution or not is determined by the residue class of $p \pmod{4|a|}$. In 1783, Euler gave a faulty proof of an assertion equivalent to the quadratic reciprocity law. (In retrospect, one can see that even much earlier, Euler was just a short step away from having a complete proof of quadratic reciprocity.) In 1785, Legendre introduced his symbol, stated the general case of quadratic reciprocity without using his symbol, introduced the word "reciprocity," and gave an incomplete proof of the law. (In 1859, Kummer noted that the gap in Legendre's proof is easily filled by appealing to Dirichlet's theorem of 1837 concerning primes in arithmetic progressions.) In ignorance of the earlier work of others, Gauss discovered the quadratic reciprocity law just before his eighteenth birthday. After a year of strenuous effort, Gauss found the first proof, in 1796, at the age of nineteen. This was published in 1801. Gauss discovered "Gauss's lemma" (Theorem 3.2) in 1808. Our proof of quadratic reciprocity (Theorem 3.3) follows Gauss's third proof of the theorem, which is considered to have been Gauss's favorite. Eventually Gauss gave eight proofs of quadratic reciprocity, in the hope of finding one that would generalize to give a proof of the quartic reciprocity law that he had empirically discovered.
For an instructive algebraic interpretation of Gauss's lemma, see W. C. Waterhouse, "A tiny note on Gauss's Lemma," J. Number Theory, 30 (1988), 105-107.
Theorem 3.5 is a variation of a result by P. Hagis, "A note concerning the law of quadratic reciprocity," Amer. Math. Monthly, 77 (1970), 397.
§3.3 In more advanced work, it is useful to extend the Legendre symbol beyond the Jacobi symbol, to the Kronecker symbol.
Let $n_2(p)$ denote the least positive quadratic nonresidue of $p$. Using the inequality (3.2) in a clever way, David Burgess showed that for every $\varepsilon > 0$ there is a $p_0(\varepsilon)$ such that $n_2(p) < p^{c + \varepsilon}$ for $p > p_0(\varepsilon)$, where $c = 1/(4\sqrt{e}) = 0.1516\ldots$.
§3.5 A function $f(z)$ is called a modular function if $f\left(\frac{az + b}{cz + d}\right) = f(z)$ for every $\begin{bmatrix} a & b \\ c & d \end{bmatrix} \in \Gamma$. The study of modular functions, modular forms, and the more general automorphic functions is an active area of research in advanced number theory. If $F$ is a field, then the $n \times n$ matrices with entries in $F$ and nonzero determinant form a group, known as the general linear group of order $n$ over $F$, and denoted $GL(n, F)$. If $R$ is a commutative ring with identity, then the $n \times n$ matrices with coefficients in $R$ and determinant 1 form a group, known as the special linear group of order $n$ over $R$, denoted $SL(n, R)$. In this notation, the modular group $\Gamma$ is $SL(2, \mathbb{Z})$.
Two forms $ax^2 + bxy + cy^2$ and $Ax^2 + Bxy + Cy^2$ of discriminant $d$ lie in the same genus if $aA$ is a square modulo $|d|$. This defines a new equivalence relation on the forms of discriminant $d$. Using the observation made in Problem 9, it may be shown that if two forms are equivalent (in the sense of Definition 3.7) then they lie in the same genus. Thus each genus is the union of one or more equivalence classes of forms. The consideration of these genera allows one to refine Corollary 3.14: If $p$ is represented by some form of discriminant $d$, one may use quadratic reciprocity to determine in which genus this form must lie. An example of this is found in Problem 10, which concerns $d = -20$. In this case it is found that there is only one equivalence class in each genus, and hence we are able to specify precisely which primes are represented by which forms.
However, the discriminant $d = -20$ is one of only finitely many discriminants of this sort: If $d$ is large and negative, then each genus contains a large number of equivalence classes of forms.
The problem of finding all negative discriminants d for which h(d) = 1 has a long and interesting history, which is recounted in the survey article of D. Goldfeld, "Gauss's class number problem for imaginary quadratic fields," Bull. Amer. Math. Soc. 13 (1985), 23-37.
§3.6 Following Fermat, much attention was paid to the problem of giving an explicit formula for the numbers $x$ and $y$ for which $x^2 + y^2 = p$, when $p$ is a prime of the form $4n + 1$. This was first achieved in 1808 by Legendre, using continued fractions. In 1825 Gauss gave a different construction: Since $x$ and $y$ are of opposite parity, we may assume that $x$ is odd. By replacing $x$ by $-x$ if necessary, we may suppose that $x \equiv 1 \pmod 4$. Then $x$ is the unique number for which $|x| < p/2$ and $2x \equiv \binom{2n}{n} \pmod p$ where $p = 4n + 1$. More recently, Jacobsthal discovered that one may express $x$ and $y$ as sums involving the Legendre symbol,
\[
x = \frac{1}{2} \sum_{k=1}^{p-1} \left(\frac{k(k^2 - r)}{p}\right), \qquad
y = \frac{1}{2} \sum_{k=1}^{p-1} \left(\frac{k(k^2 - n)}{p}\right),
\]
where r denotes any quadratic residue of p, and n is any quadratic nonresidue of p. The method of Example 3, though it does not yield an explicit formula for x and y, nevertheless is computationally much more efficient. A similar calculational technique, but using continued fractions instead of the theory of quadratic forms, is found in Problem 6 of Section 7.3.
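Both constructions are short enough to try out directly. The sketch below is illustrative only, with function names of our own choosing. It evaluates the Legendre symbol by Euler's criterion, takes $r = 1$ and the least quadratic nonresidue as the (arbitrary) choices in Jacobsthal's sums, and computes Gauss's $x$ from the binomial coefficient. It assumes Python 3.8 or later for `math.comb` and for `pow(2, -1, p)`.

```python
from math import comb

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def jacobsthal_xy(p):
    """Return (x, y) with x^2 + y^2 == p for a prime p = 4n + 1, from Jacobsthal's sums."""
    residue = 1  # any quadratic residue will do; 1 is the simplest choice
    nonresidue = next(m for m in range(2, p) if legendre(m, p) == -1)
    # For p = 1 (mod 4) the terms for k and p - k agree, so each sum is even.
    x = sum(legendre(k * (k * k - residue), p) for k in range(1, p)) // 2
    y = sum(legendre(k * (k * k - nonresidue), p) for k in range(1, p)) // 2
    return x, y

def gauss_x(p):
    """Gauss's x: the integer with |x| < p/2 and 2x = C(2n, n) (mod p), p = 4n + 1."""
    n = (p - 1) // 4
    x = comb(2 * n, n) * pow(2, -1, p) % p
    if x > p // 2:
        x -= p
    return x

for p in (5, 13, 17, 29):
    print(p, jacobsthal_xy(p), gauss_x(p))
```

For $p = 13$ we have $n = 3$ and $\binom{6}{3} = 20$, so $2x \equiv 20 \pmod{13}$ gives $x = -3$, which is indeed $\equiv 1 \pmod 4$ and satisfies $13 = (-3)^2 + 2^2$. Both routines cost on the order of $p$ arithmetic operations, which is why the text prefers the method of Example 3 for computation.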
§3.7 Theorem 3.25 may be proved by considering the action of the modular group $\Gamma$ on the upper half-plane $\mathcal{H} = \{z \in \mathbb{C} : \operatorname{Im}(z) > 0\}$. A nice account of this is found in Chapter 1 of LeVeque's Topics.
It was noted by Gauss that the theory of quadratic forms may be used to provide a method of factoring numbers. An elegant account of this approach has been given by D. H. and E. Lehmer, "A new factorization technique using quadratic forms," Math. Comp. 28 (1974), 625-635.
In Chapter 1, our treatment of sums of two squares depended on the identity
\[
(x^2 + y^2)(u^2 + v^2) = (xu - yv)^2 + (xv + yu)^2
\]
which reflects a familiar property of complex numbers, namely that if $z = x + iy$ and $w = u + iv$, then $|z||w| = |zw|$. This is the first instance of a type of identity known as a composition formula. Such formulae exist for forms of other discriminants. For example, the reduced quadratic forms of discriminant $-20$ are $f_0(x, y) = x^2 + 5y^2$ and $f_1(x, y) = 2x^2 + 2xy + 3y^2$. By Theorems 3.18 and 3.25 it follows that $H(-20) = 2$. Moreover, it is easy to verify that
\[
\begin{aligned}
f_0(x, y)f_0(u, v) &= f_0(xu - 5yv,\; xv + yu), \\
f_0(x, y)f_1(u, v) &= f_1(xu - yu - 3yv,\; xv + 2yu + yv), \\
f_1(x, y)f_1(u, v) &= f_0(2xu + xv + yu - 2yv,\; xv + yu + yv).
\end{aligned}
\]
Using these formulae, we see that $f_0$ and $f_1$ form a group in which $f_0$ is the identity. More generally, Gauss proved that if $d$ is not a perfect square then there exist composition formulae relating the various equivalence classes of primitive binary quadratic forms of discriminant $d$. These formulae cause the equivalence classes of the primitive forms of discriminant $d$ to form an abelian group. Subsequently it was discovered that this corresponds to the ideal class structure in a quadratic field of discriminant $d$. If in Definition 3.7 we had allowed matrices of determinant $-1$ then some of our equivalence classes would have been joined, the composition formulae would have become muddled, and the group structure destroyed.
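The three composition formulae for discriminant $-20$ can be spot-checked on a grid of integer points. The sketch below is a numerical sanity check, not a proof, and the function name `check_composition` and the grid size are our own choices. Each identity is a polynomial identity of degree 2 in each variable, so agreement on a grid with at least three values per variable already forces it to hold identically.

```python
from itertools import product

def f0(x, y):
    return x * x + 5 * y * y

def f1(x, y):
    return 2 * x * x + 2 * x * y + 3 * y * y

def check_composition(limit=3):
    """Verify the three composition formulae for all |x|, |y|, |u|, |v| <= limit."""
    points = range(-limit, limit + 1)
    for x, y, u, v in product(points, repeat=4):
        assert f0(x, y) * f0(u, v) == f0(x * u - 5 * y * v, x * v + y * u)
        assert f0(x, y) * f1(u, v) == f1(x * u - y * u - 3 * y * v,
                                         x * v + 2 * y * u + y * v)
        assert f1(x, y) * f1(u, v) == f0(2 * x * u + x * v + y * u - 2 * y * v,
                                         x * v + y * u + y * v)
    return True

print(check_composition())
```

The second and third formulae reduce to the first after multiplying by 2 and using $2f_1(x, y) = (2x + y)^2 + 5y^2$, which is one way to verify them by hand.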
For more extensive treatments of the theory of quadratic forms, one should consult the books of Cassels, Jones, and O'Meara.
Some Functions of Number Theory