Fig. M.1. The four Slater–Condon rules (I, II, III, IV), shown pictorially for easy remembering. On the left-hand side we see pictorial representations of the matrix elements of the total Hamiltonian $\hat{H}$. The squares inside the brackets represent the Slater determinants. Vertical lines in a bra stand for those spinorbitals which differ in the bra and the ket. On the right-hand side we have two square matrices collecting the $h_{ij}$'s and the $\langle ij|ij\rangle - \langle ij|ji\rangle$'s for $i, j = 1, \dots, N$. The dots in the matrices symbolize non-zero elements.

and $G_{12} = 0$. This happens because the operators $\hat{F}$ and $\hat{G}$ represent sums of, at most, two-electron operators, which involve at most four spinorbitals, so there will always be an extra overlap integral over orthogonal spinorbitals.⁹ The Slater–Condon rules are schematically depicted in Fig. M.1.

⁹ If the operators were of more than two-particle character, the result would be different.

N. LAGRANGE MULTIPLIERS METHOD

Imagine a Cartesian coordinate system of $n+m$ dimensions with the axes labelled $x_1, x_2, \dots, x_{n+m}$ and a function¹ $E(\mathbf{x})$, where $\mathbf{x} = (x_1, x_2, \dots, x_{n+m})$. Suppose that we are interested in finding the lowest value of $E$, but only among such $\mathbf{x}$ that satisfy $m$ conditions (a conditional extremum):

$$W_i(\mathbf{x}) = 0 \quad (N.1)$$

for $i = 1, 2, \dots, m$. The constraints cause the number of independent variables to be $n$.

¹ The symbol $E$ is chosen to suggest that, in our applications, the quantity will have the meaning of energy.

If we calculated the differential $dE$ at a point $\mathbf{x}_0$ which corresponds to an extremum of $E$, then we would obtain zero:

$$0 = \sum_{j=1}^{n+m} \left(\frac{\partial E}{\partial x_j}\right)_0 dx_j, \quad (N.2)$$

where the derivatives are calculated at the point of the extremum. The quantities $dx_j$ stand for infinitesimally small increments. From (N.2) we cannot draw the conclusion that the $(\partial E/\partial x_j)_0$ are equal to zero. This would be true if the increments $dx_j$ were independent, but they are not. Indeed, we find the relations between them by taking the differentials of the conditions $W_i$:

$$\sum_{j=1}^{n+m} \left(\frac{\partial W_i}{\partial x_j}\right)_0 dx_j = 0 \quad (N.3)$$

for $i = 1, 2, \dots, m$ (the derivatives are calculated at the extremum). This means that the number of truly independent increments is only $n$. Let us try to exploit this. To this end, let us multiply each equation (N.3) by a number $\varepsilon_i$ (a Lagrange multiplier), which will be fixed in a moment. Then let us add together all the conditions (N.3) and subtract the result from eq. (N.2). We get

$$\sum_{j=1}^{n+m} \left[\left(\frac{\partial E}{\partial x_j}\right)_0 - \sum_i \varepsilon_i \left(\frac{\partial W_i}{\partial x_j}\right)_0\right] dx_j = 0,$$

where the summation extends over $n+m$ terms.

Joseph Louis de Lagrange (1736–1813), French mathematician of Italian origin, self-taught; professor at the Artillery School in Turin, then at the École Normale Supérieure in Paris. His main achievements are in variational calculus and mechanics, and also in number theory, algebra and mathematical analysis.

The summation may be carried out in two steps. First, let us sum up the first $n$ terms, and afterwards the remaining ones:

$$\sum_{j=1}^{n} \left[\left(\frac{\partial E}{\partial x_j}\right)_0 - \sum_i \varepsilon_i \left(\frac{\partial W_i}{\partial x_j}\right)_0\right] dx_j + \sum_{j=n+1}^{n+m} \left[\left(\frac{\partial E}{\partial x_j}\right)_0 - \sum_i \varepsilon_i \left(\frac{\partial W_i}{\partial x_j}\right)_0\right] dx_j = 0.$$

The multipliers $\varepsilon_i$ have so far been treated as "undetermined". Well, we may force them to make each of the terms in the second summation vanish²:

$$\left(\frac{\partial E}{\partial x_j}\right)_0 - \sum_i \varepsilon_i \left(\frac{\partial W_i}{\partial x_j}\right)_0 = 0 \quad \text{for } j = n+1, \dots, n+m. \quad (N.4)$$

Hence the first summation alone is zero:

$$\sum_{j=1}^{n} \left[\left(\frac{\partial E}{\partial x_j}\right)_0 - \sum_i \varepsilon_i \left(\frac{\partial W_i}{\partial x_j}\right)_0\right] dx_j = 0,$$

which means that now we have only $n$ increments $dx_j$, and therefore they are independent. Since the sum is zero for any (small) $dx_j$, the only possible reason is that each bracket $[\,\cdot\,]$ individually equals zero:

$$\left(\frac{\partial E}{\partial x_j}\right)_0 - \sum_i \varepsilon_i \left(\frac{\partial W_i}{\partial x_j}\right)_0 = 0 \quad \text{for } j = 1, \dots, n.$$

This set of $n$ equations (called the Euler equations), together with the $m$ conditions (N.1) and the $m$ equations (N.4), gives a set of $n + 2m$ equations with $n + 2m$ unknowns ($m$ epsilons and the $n+m$ components $x_i$ of the vector $\mathbf{x}_0$).

For a conditional extremum, the constraints $W_i(\mathbf{x}) = 0$ have to be satisfied for $i = 1, 2, \dots, m$, and

$$\left(\frac{\partial E}{\partial x_j}\right)_0 - \sum_i \varepsilon_i \left(\frac{\partial W_i}{\partial x_j}\right)_0 = 0 \quad \text{for } j = 1, \dots, n+m.$$

The $x_i$ found from these equations determine the position $\mathbf{x}_0$ of the conditional extremum of $E$. Whether it is a minimum, a maximum or a saddle point is decided by analysing the matrix of second derivatives (the Hessian). If its eigenvalues calculated at $\mathbf{x}_0$ are all positive (negative), it is a minimum³ (maximum); in other cases it is a saddle point.

² This is possible if the determinant built of the coefficients $(\partial W_i/\partial x_j)_0$ for $j = n+1, \dots, n+m$ is non-zero (this is what we have to assume). For example, if several conditions were identical, the determinant would be zero.

³ In this way we find a minimum; no information is available as to whether it is global or local.
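The eigenvalue test just described is easy to carry out numerically. Below is a minimal sketch (an illustration, not part of the original text) that classifies a stationary point by diagonalizing the Hessian; the test function $f(x,y) = x^2 - y^2$ and its stationary point at the origin are assumptions chosen purely for illustration. Note that for a conditional extremum, strictly speaking, only the directions consistent with the constraints matter, so in practice one examines the Hessian restricted to the tangent space of the surfaces $W_i = 0$.

```python
import numpy as np

def classify(hessian_at_x0):
    """Classify a stationary point from the eigenvalues of the (symmetric) Hessian."""
    eigenvalues = np.linalg.eigvalsh(hessian_at_x0)
    if np.all(eigenvalues > 0):
        return "minimum"
    if np.all(eigenvalues < 0):
        return "maximum"
    return "saddle point"

# Illustrative example: f(x, y) = x**2 - y**2 has a stationary point at (0, 0);
# its (constant) Hessian is diag(2, -2), so the origin is a saddle point.
H = np.array([[2.0, 0.0],
              [0.0, -2.0]])
print(classify(H))  # -> saddle point
```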
Example 1. Minimizing a paraboloid moving along a straight line (off centre).

Let us take the paraboloid

$$E(x, y) = x^2 + y^2.$$

This function has, of course, a minimum at $(0, 0)$, but that minimum is of no interest to us. What we want to find is a minimum of $E$, but only when $x$ and $y$ satisfy some conditions. In our case there will be only one:

$$W = \tfrac{1}{2}x - \tfrac{3}{2} - y = 0. \quad (N.5)$$

This means that we are interested in a minimum of $E$ along the straight line $y = \tfrac{1}{2}x - \tfrac{3}{2}$. The Lagrange multipliers method works as follows:

• We differentiate $W$ and multiply by an unknown (Lagrange) multiplier $\varepsilon$, thus getting $\varepsilon(\tfrac{1}{2}\,dx - dy) = 0$.
• This result (i.e. zero) is subtracted⁴ from $dE = 2x\,dx + 2y\,dy = 0$, and we obtain $dE = 2x\,dx + 2y\,dy - \tfrac{1}{2}\varepsilon\,dx + \varepsilon\,dy = 0$.
• In the last expression the coefficients of $dx$ and $dy$ have to equal zero.⁵ In this way we obtain two equations: $2x - \tfrac{1}{2}\varepsilon = 0$ and $2y + \varepsilon = 0$.
• The third equation needed is the constraint $y = \tfrac{1}{2}x - \tfrac{3}{2}$.
• The solution of these three equations gives the $(x, y)$ corresponding to the extremum; it can also be checked symbolically, as in the sketch after this list. We obtain $x = \tfrac{3}{5}$, $y = -\tfrac{6}{5}$, $\varepsilon = \tfrac{12}{5}$. Thus we have obtained not only the position of the minimum ($x = \tfrac{3}{5}$, $y = -\tfrac{6}{5}$), but also the Lagrange multiplier $\varepsilon$. The minimum value of $E$ encountered along the straight line $y = \tfrac{1}{2}x - \tfrac{3}{2}$ is equal to $E(\tfrac{3}{5}, -\tfrac{6}{5}) = (\tfrac{3}{5})^2 + (-\tfrac{6}{5})^2 = \tfrac{9+36}{25} = \tfrac{9}{5}$.

⁴ Or added; it does not matter (in that case we would simply get a different value of $\varepsilon$).

⁵ This is only possible now, after the constraint differential has been accounted for.
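Here is a quick symbolic check of these three equations (a minimal sketch, not part of the original text, assuming the sympy library is available):

```python
import sympy as sp

x, y, eps = sp.symbols('x y epsilon')

E = x**2 + y**2                                    # the paraboloid
W = sp.Rational(1, 2)*x - sp.Rational(3, 2) - y    # the constraint W = 0

# Coefficients of dx and dy in dE - eps*dW must vanish,
# solved together with the constraint itself.
equations = [sp.diff(E, x) - eps*sp.diff(W, x),
             sp.diff(E, y) - eps*sp.diff(W, y),
             W]
solution = sp.solve(equations, [x, y, eps], dict=True)[0]
print(solution)            # {x: 3/5, y: -6/5, epsilon: 12/5}
print(E.subs(solution))    # 9/5
```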
Example 2. Minimizing a paraboloid moving along a circle (off centre).

Let us take the same paraboloid as in Example 1, but impose another constraint:

$$W = (x-1)^2 + y^2 - 1 = 0. \quad (N.6)$$

This condition means that we want to go around a circle of radius 1 centred at $(1, 0)$ and see at which point $(x, y)$ we get the lowest value⁶ of $E$. The example is chosen in such a way that the question can be answered first without any calculations. Indeed, the circle passes through $(0, 0)$; therefore, this point has to be found as the minimum. Besides this, we should find a maximum at $(2, 0)$, because this is the point on the circle most distant from $(0, 0)$.

⁶ Or, in other words, we intersect the paraboloid with a cylindrical surface of radius 1 whose axis (parallel to the symmetry axis of the paraboloid) is shifted to $(1, 0)$.

Well, let us see whether the Lagrange multipliers method gives the same result. After differentiating $W$, multiplying it by the multiplier $\varepsilon$, subtracting the result from $dE$ and rearranging the terms, we obtain

$$dE = \left[2x - \varepsilon(2x - 2)\right] dx + 2y(1 - \varepsilon)\, dy = 0,$$

which (after forcing the coefficients of $dx$ and $dy$ to be zero) gives a set of three equations:

$$2x - \varepsilon(2x - 2) = 0, \qquad 2y(1 - \varepsilon) = 0, \qquad (x-1)^2 + y^2 = 1.$$

Please check that this set has the following solutions: $(x, y, \varepsilon) = (0, 0, 0)$ and $(x, y, \varepsilon) = (2, 0, 2)$. The solution $(x, y) = (0, 0)$ corresponds to the minimum, while the solution $(x, y) = (2, 0)$ corresponds to the maximum.⁷ This is what we expected to get.

⁷ The method itself does not tell us which kind of extremum has been found.

Example 3. Minimizing the mean value of the harmonic oscillator Hamiltonian.

This example is different: it pertains to the extremum of a functional,⁸ something we are going to encounter often in the methods of quantum chemistry. Let us take the energy functional

$$E[\phi] = \int_{-\infty}^{\infty} dx\, \phi^* \hat{H} \phi \equiv \langle \phi | \hat{H} \phi \rangle,$$

where $\hat{H}$ stands for the harmonic oscillator Hamiltonian:

$$\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}kx^2.$$

If we were asked what function $\phi$ ensures the minimum value of $E[\phi]$, such a function could be found right away: it is $\phi = 0$. Yes, indeed, because the kinetic energy integral and the potential energy integral are both positive numbers, except when $\phi = 0$, in which case the result is zero. Wait a minute! This is not what we were thinking of. We want $\phi$ to have a probabilistic interpretation, like any wave function, and therefore $\langle \phi | \phi \rangle = 1$, not zero. Well, in such a case we want to minimize $E[\phi]$ while forcing the normalization condition to be satisfied all the time. Therefore, we search for the extremum of $E[\phi]$ with the condition $W = \langle \phi | \phi \rangle - 1 = 0$. It is easy to foresee what the method has to produce (if it is to be of any value): the normalized ground-state wave function of the harmonic oscillator. How does the Lagrange multipliers method achieve this? The answer is on p. 198.

⁸ The argument of a functional is a function; the functional assigns a number to it.
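One can preview the outcome numerically with a one-parameter trial function. The sketch below (an illustration, not from the original text) minimizes $E[\phi_a]$ over normalized Gaussians $\phi_a(x) = (2a/\pi)^{1/4} e^{-ax^2}$ in the assumed units $\hbar = m = k = 1$, for which the standard closed form is $E(a) = a/2 + 1/(8a)$. The minimum reproduces the exact ground-state energy $\hbar\omega/2 = 0.5$; in the full Lagrange treatment (p. 198) the multiplier of the normalization constraint turns out to be this energy itself.

```python
from scipy.optimize import minimize_scalar

def energy(a):
    """<phi_a|H|phi_a> for the normalized Gaussian trial function phi_a,
    with hbar = m = k = 1: kinetic part a/2, potential part 1/(8a)."""
    return a / 2.0 + 1.0 / (8.0 * a)

result = minimize_scalar(energy, bounds=(1e-3, 10.0), method='bounded')
print(f"optimal exponent a = {result.x:.6f}")    # -> 0.5
print(f"minimal energy     = {result.fun:.6f}")  # -> 0.5 (= hbar*omega/2)
```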
O. PENALTY FUNCTION METHOD

Very often we are interested in the minimization of a ("target") function,¹ i.e. in finding such values of the variables as ensure a minimum of the function while certain constraints are satisfied. Just imagine hiking in the Smoky Mountains: we want to find the point of lowest ground elevation, provided that we hike along a straight line from, say, Gatlinburg to Cherokee.

Suppose the target function for minimization (corresponding to the ground elevation in the Smoky Mountains region) is the function $f(x_1, x_2, \dots, x_{n+m})$, but the variables $x_i$ have to fulfil $m$ equations ("constraints"):

$$\varphi_i(x_1, x_2, \dots, x_{n+m}) = 0 \quad \text{for } i = 1, 2, \dots, m.$$

For such tasks we have at least three possibilities. The first is to eliminate $m$ variables (by using the constraints) and express them through the others. In this way the target function $f$ takes all the constraints into account and depends on only $n$ independent variables; then it is minimized. The second possibility is the Lagrange multipliers method (see Appendix N). In both cases there is, however, the complication that the conditions to be satisfied may be quite complex, and solving the corresponding equations may therefore be difficult. An easier route may be a penalty method.

The idea behind the penalty method is quite simple. Why go to the trouble of trying to satisfy the conditions $\varphi_i = 0$ exactly, when we could propose the following: instead of the function $f$, let us minimize its modification

$$F = f + \sum_{i=1}^{m} K_i \varphi_i^2,$$

where the penalty coefficients $K_i > 0$ are chosen to be large.² When minimizing $F$ we admit that the conditions $\varphi_i = 0$ may be violated, but any attempt to violate them introduces into $F$ a positive contribution $\sum_{i=1}^{m} K_i \varphi_i^2$. This means that, when minimizing $F$, it always pays to explore such points in space (Fig. O.1) for which $\sum_{i=1}^{m} K_i \varphi_i^2 = 0$. If the $K$'s are large enough, the procedure will force $\varphi_i^2 = 0$, i.e. $\varphi_i = 0$ for $i = 1, 2, \dots, m$, which is exactly what has to be satisfied during the minimization. Note that the task would be much more difficult if $\varphi_i^2$ had more than one minimum corresponding to $\varphi_i = 0$.

¹ If we change the sign of the target function, the task is equivalent to maximization.

² This means a high penalty.

Fig. O.1. How does the penalty method work? We have to minimize $f(x, y)$, but under the condition that $x$ and $y$ satisfy the equation $\varphi_1(x, y) = 0$ (black line). The function $f(x, y)$ exhibits a single minimum at point B, but this minimum is of no interest to us, because we are looking for a conditional minimum. To find it we minimize the sum $f(x, y) + K\varphi_1^2$, with the penalty function $K\varphi_1^2 \geqslant 0$ allowing deviations from the black line $\varphi_1(x, y) = 0$. However, going off this line does not pay, because this is precisely what switches the penalty on. As a result, at sufficiently large $K$ we obtain the conditional minimum W. This is what the game is all about.

This penalty method is worth keeping in our toolbox, because it is general and easily applicable. For the method to work, $K$ has to be sufficiently large. However, if $K$ is too large, the numerical results may be of poor quality, since the procedure will first of all take care of the penalty, paying little attention to $f$. It is recommended to try a few values of $K$ and to check whether the results depend on this choice.

As an example of the penalty function method, let us take the docking of two molecules. Our goal is to find such values of the atomic coordinates of both molecules as to ensure contacts of certain particular atoms of the two molecules within precise distance limits. The task sounds trivial, until we try to accomplish it in practice (especially for large molecules). The goal can be achieved rather easily when the penalty function method is used. We do the following: to the existing force field (i.e. an approximate electronic energy, Chapter 7) we simply add a penalty for not satisfying the desired contacts. For a single pair of atoms (a contact), the penalty could be set as

$$K(r - r_0)^2,$$

where $r$ stands for the distance between the atoms, and $r_0$ is the optimum (desired) contact distance. At a chosen starting geometry the atoms are far from the optimum distance, and therefore the force field energy is supplemented by a large distance-dependent penalty. The energy is so high that the minimization procedure tries to remove the penalty and relax the system. Often this can be done in only one way: by docking the molecules so as to achieve the proper contact distance.
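To see the method at work, here is a minimal sketch (an illustration, not part of the original text) that applies the penalty $K\varphi^2$ to the problem of Example 1 from Appendix N. As $K$ grows, the unconstrained minimizer of $F$ approaches the conditional minimum $(3/5, -6/5)$ found there with Lagrange multipliers, while the constraint violation $\varphi$ shrinks toward zero.

```python
import numpy as np
from scipy.optimize import minimize

def f(v):
    """Target function: the paraboloid of Example 1, Appendix N."""
    x, y = v
    return x**2 + y**2

def phi(v):
    """Constraint function: phi = 0 on the line y = x/2 - 3/2."""
    x, y = v
    return 0.5*x - 1.5 - y

for K in (1.0, 10.0, 100.0, 1000.0):
    F = lambda v: f(v) + K * phi(v)**2           # penalized target function
    res = minimize(F, x0=np.array([0.0, 0.0]))   # unconstrained minimization
    print(f"K = {K:7.1f}: (x, y) = ({res.x[0]:.4f}, {res.x[1]:.4f}), "
          f"phi = {phi(res.x):+.5f}")
# The minimizer tends to (0.6, -1.2), and phi -> 0, as K increases.
```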
P. MOLECULAR INTEGRALS WITH GAUSSIAN TYPE ORBITALS 1s

The normalized, spherically symmetric 1s Gaussian Type Orbital (GTO) centred at the point indicated by the vector $\mathbf{R}_p$ reads as

$$\chi_p \equiv \left(\frac{2\alpha_p}{\pi}\right)^{3/4} \exp\left(-\alpha_p |\mathbf{r} - \mathbf{R}_p|^2\right).$$

The molecular integrals usually involve at most four such orbitals: $\chi_p, \chi_q, \chi_r, \chi_s$, with corresponding centres $\mathbf{R}_p, \mathbf{R}_q, \mathbf{R}_r, \mathbf{R}_s$ and exponents $\alpha_p, \alpha_q, \alpha_r, \alpha_s$, respectively. Since any product of two 1s GTOs represents a (non-normalized) 1s GTO centred between the centres of the individual GTOs (see p. 359), let us denote the centre of $\chi_p \chi_q$ by $\mathbf{R}_k = \frac{\alpha_p \mathbf{R}_p + \alpha_q \mathbf{R}_q}{\alpha_p + \alpha_q}$, and the centre of $\chi_r \chi_s$ by $\mathbf{R}_l = \frac{\alpha_r \mathbf{R}_r + \alpha_s \mathbf{R}_s}{\alpha_r + \alpha_s}$. Then all the integrals needed are as follows:¹

overlap integral:
$$S_{pq} = \langle \chi_p | \chi_q \rangle = \left[\frac{4\alpha_p \alpha_q}{(\alpha_p + \alpha_q)^2}\right]^{3/4} \exp\left(-\frac{\alpha_p \alpha_q}{\alpha_p + \alpha_q} |\mathbf{R}_p - \mathbf{R}_q|^2\right); \quad (P.1)$$

kinetic energy integral:
$$T_{pq} = \left\langle \chi_p \left| -\tfrac{1}{2}\Delta \right| \chi_q \right\rangle = \frac{\alpha_p \alpha_q}{\alpha_p + \alpha_q} \left[3 - \frac{2\alpha_p \alpha_q}{\alpha_p + \alpha_q} |\mathbf{R}_p - \mathbf{R}_q|^2\right] S_{pq}; \quad (P.2)$$

nuclear attraction integral:²
$$V^{\alpha}_{pq} = \left\langle \chi_p \left| \frac{1}{|\mathbf{r} - \mathbf{R}_\alpha|} \right| \chi_q \right\rangle = 2\sqrt{\frac{\alpha_p + \alpha_q}{\pi}}\, F_0\!\left((\alpha_p + \alpha_q)|\mathbf{R}_\alpha - \mathbf{R}_k|^2\right) S_{pq}; \quad (P.3)$$

electron repulsion integral:
$$(pr|qs) = (\chi_p \chi_r | \chi_q \chi_s) = \int \chi_p^*(1)\,\chi_q(1)\,\frac{1}{r_{12}}\,\chi_r^*(2)\,\chi_s(2)\, dv_1\, dv_2$$
$$= \frac{2}{\sqrt{\pi}} \sqrt{\frac{(\alpha_p + \alpha_q)(\alpha_r + \alpha_s)}{\alpha_p + \alpha_q + \alpha_r + \alpha_s}}\, F_0\!\left(\frac{(\alpha_p + \alpha_q)(\alpha_r + \alpha_s)}{\alpha_p + \alpha_q + \alpha_r + \alpha_s}\, |\mathbf{R}_k - \mathbf{R}_l|^2\right) S_{pq}\, S_{rs}, \quad (P.4)$$

with $F_0$ defined as³

$$F_0(t) = \frac{1}{\sqrt{t}} \int_0^{\sqrt{t}} \exp(-u^2)\, du. \quad (P.5)$$

Note that for an atom (all the centres coincide) we have $t = 0$ and $F_0(0) = 1$.

¹ S.F. Boys, Proc. Roy. Soc. (London) A200 (1950) 542.

² In order to interpret this integral (in a.u.) as the Coulombic attraction of the electronic charge distribution $\chi_p^*(1)\chi_q(1)$ by a nucleus (of charge $Z$, located at $\mathbf{R}_\alpha$), we have to multiply the integral by $-Z$.

³ The values of $F_0(t)$ are reported in L.J. Schaad, G.O. Morrell, J. Chem. Phys. 54 (1971) 1965.

Do these formulae work?

The formulae look quite complex. If they are correct, they have to work in several simple situations. For example, if the electronic distribution $\chi_p^*(1)\chi_q(1)$ centred at $\mathbf{R}_k$ is far away from the nucleus, then we have to obtain the Coulombic interaction of the charge of $\chi_p^*(1)\chi_q(1)$ with the nucleus. The total charge of the electron cloud $\chi_p^*(1)\chi_q(1)$ is obviously equal to $S_{pq}$, and therefore $S_{pq}/|\mathbf{R}_\alpha - \mathbf{R}_k|$ should be a very good estimate of the nuclear attraction integral, right? What we need is the asymptotic form of $F_0(t)$ as $t \to \infty$. This can be deduced from our formula for $F_0(t)$. The integrand is concentrated close to $u = 0$; for large $u$ the contributions become negligible, so for $t \to \infty$ the integral itself can be replaced by $\int_0^{\infty} \exp(-u^2)\, du = \sqrt{\pi}/2$. This gives

$$\left[F_0(t)\right]_{\text{asympt}} = \frac{\sqrt{\pi}}{2\sqrt{t}}$$

and

$$\left[V^{\alpha}_{pq}\right]_{\text{asympt}} = 2\sqrt{\frac{\alpha_p + \alpha_q}{\pi}} \cdot \frac{\sqrt{\pi}}{2\sqrt{\alpha_p + \alpha_q}\; |\mathbf{R}_\alpha - \mathbf{R}_k|}\, S_{pq} = \frac{S_{pq}}{|\mathbf{R}_\alpha - \mathbf{R}_k|},$$

exactly as we expected. If $\chi_p = \chi_q$, then $S_{pq} = 1$ and we simply get the Coulomb law for unit charges. It works.

Similarly, if in the electron repulsion integral $\chi_p = \chi_q$, $\chi_r = \chi_s$ and the distance $|\mathbf{R}_k - \mathbf{R}_l| = R$ is large, then we should get the Coulomb law for two point-like unit charges at distance $R$. Let us see. Asymptotically,

$$(pr|qs)_{\text{asympt}} = \frac{2}{\sqrt{\pi}} \sqrt{\frac{(\alpha_p + \alpha_q)(\alpha_r + \alpha_s)}{\alpha_p + \alpha_q + \alpha_r + \alpha_s}} \cdot \frac{\sqrt{\pi}}{2} \sqrt{\frac{\alpha_p + \alpha_q + \alpha_r + \alpha_s}{(\alpha_p + \alpha_q)(\alpha_r + \alpha_s)}} \cdot \frac{1}{R} = \frac{1}{R},$$

which is exactly what we should obtain.
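These formulae translate directly into code. The sketch below (an illustration based on eqs. (P.1), (P.3) and (P.5), not part of the original text) expresses $F_0$ through the error function, $F_0(t) = \frac{\sqrt{\pi}}{2\sqrt{t}}\,\mathrm{erf}(\sqrt{t})$, and repeats the first asymptotic check numerically: at large separation, $V^{\alpha}_{pq} \to S_{pq}/|\mathbf{R}_\alpha - \mathbf{R}_k|$. The chosen exponents and centres are arbitrary test values.

```python
import numpy as np
from scipy.special import erf

def F0(t):
    """F0(t) = (1/sqrt(t)) * integral_0^sqrt(t) exp(-u^2) du; the t -> 0 limit is 1."""
    if t < 1e-12:
        return 1.0
    return 0.5 * np.sqrt(np.pi / t) * erf(np.sqrt(t))

def overlap(ap, Rp, aq, Rq):
    """Overlap S_pq between two normalized 1s GTOs, eq. (P.1)."""
    R2 = np.dot(Rp - Rq, Rp - Rq)
    return (4*ap*aq / (ap + aq)**2)**0.75 * np.exp(-ap*aq/(ap + aq) * R2)

def nuclear(ap, Rp, aq, Rq, Ra):
    """Nuclear attraction integral V^alpha_pq, eq. (P.3); multiply by -Z for a nucleus of charge Z."""
    Rk = (ap*Rp + aq*Rq) / (ap + aq)   # centre of the product Gaussian
    S = overlap(ap, Rp, aq, Rq)
    return 2*np.sqrt((ap + aq)/np.pi) * F0((ap + aq)*np.dot(Ra - Rk, Ra - Rk)) * S

# Asymptotic check: a nucleus far away from the charge cloud chi_p * chi_q.
ap, aq = 0.8, 1.3
Rp = np.array([0.0, 0.0, 0.0]); Rq = np.array([0.5, 0.0, 0.0])
Ra = np.array([50.0, 0.0, 0.0])
Rk = (ap*Rp + aq*Rq) / (ap + aq)
S = overlap(ap, Rp, aq, Rq)
print(nuclear(ap, Rp, aq, Rq, Ra))    # exact integral ...
print(S / np.linalg.norm(Ra - Rk))    # ... matches the point-charge estimate S_pq / |R_a - R_k|
```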