Introduction
Vectors are fundamental mathematical entities in classical physics, representing physical quantities that possess both magnitude and direction, such as displacement, velocity, acceleration, and various forces (mechanical, electrical, magnetic, and gravitational). The concept of a vector stands as one of the most significant contributions to mathematics originating from the field of physics.
Geometrical vectors, represented as arrows in two-dimensional and three-dimensional spaces, form real vector spaces through vector addition and scalar multiplication. To describe lengths and angles, which are crucial for many physical applications, these real vector spaces are equipped with the dot product of two vectors. Such vector spaces are therefore called Euclidean spaces.
The theory of real vector spaces can be extended to encompass various sets, including real matrices with m rows and n columns, real polynomials of degree less than n, and real functions sharing the same domain. This extension also includes sets of continuous, differentiable, or integrable functions, as well as the solutions of homogeneous systems of linear equations. Notably, most of these generalized vector spaces are multi-dimensional.
The vector spaces of matrix-columns, denoted \( R^n \) (n = 2, 3, 4, ...), consist of columns with n rows and one column, each component \( x_i \) (for \( i = 1, 2, \ldots, n \)) being a real number. While ordered n-tuples can also be represented by matrix-rows, the matrix-column representation is more suitable for matrix transformations of \( R^n \), which are applied from the left, \( A\bar{x} \), where A is an m×n real matrix. We shall call the elements of \( R^n \) n-vectors.
The vector spaces \( R^n \) for n = 2, 3 play an important role in geometry, describing lines and planes, as well as the areas of triangles and parallelograms and the volume of a parallelepiped.
Vector spaces \( R^n \) for n > 3 lack a geometric interpretation but are crucial for various mathematical problems, such as systems of linear equations. They also play a significant role in physics, particularly in the description of space-time events in the special theory of relativity (n = 4), and are important in economics through linear economic models.
Modern physics, especially Quantum Mechanics and elementary particle theory, relies on complex vector spaces. Initially, Quantum Mechanics developed through two main approaches: Schrödinger's wave mechanics and Heisenberg's matrix mechanics. Von Neumann demonstrated that both approaches are isomorphic to the infinite-dimensional unitary complex vector space known as Hilbert space. Today, the geometry of Hilbert space is widely recognized as the mathematical framework for Quantum Mechanics.
The Standard Model of elementary particles includes various particle sets, one of which is quarks. Initially, quarks were represented by three particles within the SU(3) group, which consists of unitary complex 3×3 matrices with unit determinant. However, this set expanded to include six particles, now described by the SU(6) group.
Geometrical Vectors in a Plane
A geometrical vector in a Euclidean plane \( E^2 \) is defined as a directed line segment (an arrow).
A vector is defined by its initial point A (the tail) and terminal point B (the arrow-head), and is denoted as \( \overrightarrow{AB} \). It possesses two key characteristics: length \( ||\overrightarrow{AB}|| \), a positive number known as the norm, and direction, indicating both the line on which the segment lies and the orientation of the arrow.
Two vectors are deemed "equal" when they share the same length and are aligned along parallel lines in the same direction. Essentially, they can be superimposed on each other through a translation within the plane.
This relation among all vectors in the plane is reflexive, symmetric, and transitive, so it is an equivalence relation. It leads to a partition of the set of vectors into equivalence classes of equal vectors, which we denote as \( V^2 \). To simplify our analysis, we select a representative from each class by choosing a point in \( E^2 \) as the origin O. The representative of each class is then the vector \( \bar{a} \) originating from the point O.
Several vectors from the class \( [\bar{a}] \), represented by the vector \( \bar{a} \) which starts at O.
A binary operation can be defined in \( V^2 \), specifically \( V^2 \times V^2 \to V^2 \), known as the addition of classes. This operation is established by applying the parallelogram rule, where the representatives \( \bar{a} \) and \( \bar{b} \) of two classes \( [\bar{a}] \) and \( [\bar{b}] \) serve as the two sides of a parallelogram.
The diagonal from point O represents the class \( [\bar{a}+\bar{b}] \). This definition of class addition is validated by observing that the sum of any vector from class \( [\bar{a}] \) and any vector from class \( [\bar{b}] \) also belongs to class \( [\bar{a}+\bar{b}] \). Selecting a vector from class \( [\bar{a}] \) and a vector from class \( [\bar{b}] \), we can translate the vector from \( [\bar{b}] \) to the terminal point of the vector from \( [\bar{a}] \), and their sum is a vector representing \( [\bar{a}+\bar{b}] \).
This is the triangle rule for vector addition: place the second vector so that its initial point coincides with the terminal point of the first; the sum is the vector drawn from the initial point of the first vector to the terminal point of the second. The resultant vector so obtained belongs to the class \( [\bar{a}+\bar{b}] \).
The combination of all vectors from the class \( [\bar{a}] \) with those from the class \( [\bar{b}] \) yields all vectors from the class \( [\bar{a}+\bar{b}] \). It has been established that the sum of any vector from \( [\bar{a}] \) and any vector from \( [\bar{b}] \) is a vector from \( [\bar{a}+\bar{b}] \). Furthermore, every vector in \( [\bar{a}+\bar{b}] \) can be represented as such a sum, since it can be decomposed by drawing a vector from \( [\bar{a}] \) at its initial point and then a vector from \( [\bar{b}] \) from the terminal point of the first vector, so that the terminal point coincides with the endpoint of the original vector.
(One denotes both the addition of numbers and the addition of vectors by the same sign “+,” since there is no danger of confusion—one cannot add numbers to vectors).
It is obvious that the above binary operation is defined for every two representa- tives of equivalence classes (it is a closed operation—the first property).
The addition of vectors is commutative, \( \bar{a}+\bar{b} = \bar{b}+\bar{a} \), as can be seen from the diagram. This is the second property.
This operation is also associative (see the diagram), so it can be defined for three or more vectors:
We simply "add" one vector after another. This is the third property.
Each vector \( \bar{a} \) has its unique negative \( -\bar{a} \) (the additive inverse), which has the same length, lies on any of the parallel lines, and has the opposite direction of the arrow:
This is the fourth property.
The sum of a vector \( \bar{a} \) and its opposite \( -\bar{a} \) is the zero vector \( \bar{0} \): \( \bar{a} + (-\bar{a}) = \bar{0} \). The zero vector has length zero and no defined direction, and it serves as the additive identity, since \( \bar{a} + \bar{0} = \bar{a} \). This is the fifth property of vector addition.
The addition of vectors forms an Abelian group, as it meets all five essential properties of this algebraic structure. Vector addition is a closed operation that is both commutative and associative. Additionally, every vector has a unique inverse, and there exists a distinct identity element.
1.3 Vectors in a Cartesian (Analytic) Plane \( R^2 \)
Any pair of perpendicular axes (directed lines) passing through the origin O, with marked unit lengths on them, is called a rectangular coordinate system.
Each point P in the plane now has two coordinates (x, y), determined by the positions of the two orthogonal projections of P on the coordinate axes x and y, respectively.
The Euclidean plane is turned into a Cartesian (analytic) plane by the choice of a rectangular coordinate system. This is the basis of analytic geometry, attributed primarily to the French mathematician René Descartes in the early 17th century.
Every coordinate system establishes a bijection between the points in the plane E² and the ordered pairs of real numbers in R² This means that E² can be identified with R², although this identification varies with each coordinate system.
In our analysis, we take as the most natural representative of each equivalence class of geometrical vectors the position vector \( \bar{a} \), the vector originating from the point O(0,0). This vector is uniquely defined by the coordinates (x, y) of its terminal point P(x, y). We write \( \bar{a} = \overrightarrow{OP} \) and represent it by the matrix-column \( [x\ y]^T \).
This arrangement is more convenient than a matrix-row \( [x\ y] \) when applying various matrix transformations from the left. We call x and y the components of \( \bar{a} \), and say that the matrix-column \( [x\ y]^T \) represents \( \bar{a} \) in the given coordinate system.
From now on, we shall concentrate our attention on the set of all matrix-columns \( [x\ y]^T \).
In mathematics, the set of ordered pairs of real numbers, denoted \( R^2 \), represents two-dimensional vectors, commonly referred to as 2-vectors. Each 2-vector \( [x\ y]^T \) corresponds to a unique equivalence class of geometric vectors in the two-dimensional Euclidean space \( E^2 \).
The addition in \( R^2 \) of two position vectors \( \bar{a} = [x\ y]^T \) and \( \bar{b} = [x'\ y']^T \) is performed component-wise (see the diagram): \( \bar{a}+\bar{b} = [x+x'\ \ y+y']^T \).
(Note that this is the general rule for addition of matrices of the same size).
Since the components are real numbers, and the addition of real numbers makes R an Abelian group, we immediately see that \( R^2 \) is also an Abelian group with respect to this addition of matrix-columns:
R 2 is obviously closed under+, since every two matrix-columns can be added to give the third one;
It is also associative: with \( \bar{c} = [x''\ y'']^T \), we have \( (\bar{a}+\bar{b})+\bar{c} = \bar{a}+(\bar{b}+\bar{c}) \), since the addition of the real components is associative;
There is a unique additive identity (neutral element) called the zero vector \( \bar{0} = [0\ 0]^T \);
Every vector \( \bar{a} = [x\ y]^T \) has a unique additive inverse (the negative vector) \( -\bar{a} = [-x\ -y]^T \).
1.4 Scalar Multiplication (The Product of a Number with a Vector)
The sum \( \bar{a} + \bar{a} \) can naturally be written as \( 2\bar{a} \), which leads to the introduction of the number-vector product. This product generates another vector, defined for every real number c and every vector \( \bar{a} \in R^2 \) as \( c\bar{a} = [cx\ cy]^T \).
The product \( c\bar{a} \) is a vector parallel to \( \bar{a} \), with the same direction if c > 0 and the opposite direction if c < 0. The length of \( \bar{a} \) is \( ||\bar{a}|| = \sqrt{x^2+y^2} \), and consequently the length of the scaled vector is \( ||c\bar{a}|| = \sqrt{c^2x^2+c^2y^2} = |c|\,||\bar{a}|| \). This operation is a mapping \( R \times R^2 \to R^2 \), and it is commonly referred to as scalar multiplication, since in tensor algebra numbers are called scalars.
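The component-wise rules above can be checked numerically; here is a minimal sketch in Python (numpy assumed, the vectors and the scalar are arbitrary illustrations):

```python
# Component-wise addition, scalar multiplication, and ||c a|| = |c| ||a|| in R^2.
import numpy as np

a = np.array([3.0, 4.0])      # position vector a = [x y]^T
b = np.array([-1.0, 2.0])     # position vector b = [x' y']^T
c = -2.5                      # a scalar

print(a + b)                  # component-wise sum: [ 2.  6.]
print(c * a)                  # scaled vector: [ -7.5 -10. ]
print(np.linalg.norm(c * a),  # equals |c| * ||a|| = 2.5 * 5 = 12.5
      abs(c) * np.linalg.norm(a))
```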
Scalar multiplication is a closed operation applicable to every scalar \( c \in \mathbb{R} \) and vector \( \bar{a} \in \mathbb{R}^2 \) As a mapping from \( \mathbb{R} \times \mathbb{R}^2 \) to \( \mathbb{R}^2 \), it connects the operations defined in the field \( \mathbb{R} \) and the Abelian group \( \mathbb{R}^2 \) These operations include the addition and multiplication of real numbers in \( \mathbb{R} \), as well as the addition of vectors in \( \mathbb{R}^2 \).
(i) The distributive property of the addition of numbers with respect to scalar multiplication:
\( (c+d)\bar{a} = [(c+d)x\ \ (c+d)y]^T = [cx+dx\ \ cy+dy]^T = [cx\ \ cy]^T + [dx\ \ dy]^T = c\bar{a} + d\bar{a}. \)
(ii) The associative property of the multiplication of numbers with respect to scalar multiplication: \( (cd)\bar{a} = c(d\bar{a}) \).
(iii) The distributive property of the addition of vectors with respect to scalar multiplication: \( c(\bar{a}+\bar{b}) = c\,[x+x'\ \ y+y']^T = [cx+cx'\ \ cy+cy']^T = c\bar{a} + c\bar{b}. \)
(iv) \( 1\bar{a} = \bar{a} \) (the number 1 is neutral both for the multiplication of numbers, \( 1c = c \), and for the multiplication of numbers with vectors, \( 1\bar{a} = \bar{a} \)).
Definition Vector addition (with the five properties of an Abelian group) and scalar multiplication (with the four properties above) make \( R^2 \) an algebraic structure called a real vector space.
Since R 2 represents V 2, the set of the equivalence classes of equal vectors, it means that V 2is also a real vector space.
In real vector spaces, a linear combination is formed by combining vector addition and scalar multiplication, represented as \( \sum_{i=1}^{n} c_i \mathbf{a}_i \), where \( c_1, c_2, \ldots, c_n \in \mathbb{R} \) and \( \mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n \in \mathbb{R}^2 \) This operation is fundamental to the structure of real vector spaces, encapsulating the essence of their algebraic properties.
Note: This subject is treated in detail in Sect 3.1.
1.5 The Dot Product of Two Vectors (or the Euclidean Inner Product of Two Vectors in \( R^2 \))
By selecting two unit vectors, \( \bar{i} \) and \( \bar{j} \), aligned with the x and y axes, we can express the vector \( \bar{a} \) as a linear combination of its components. Specifically, \( \bar{a} = x\bar{i} + y\bar{j} \), where x and y are the scalar components of the vector in the respective directions.
In \( R^2 \), the expansion of a vector \( \bar{a} \) in terms of the orthogonal unit vectors \( \bar{i} \) and \( \bar{j} \) is unique, so these two vectors form a basis. This basis, denoted \( \{\bar{i}, \bar{j}\} \), is referred to as an orthonormal (ON) basis because its vectors are orthogonal and of unit length.
The scalar projection x of \( \bar{a} \) on the direction of the ort \( \bar{i} \) follows from the obvious formula \( x/||\bar{a}|| = \cos\alpha \Rightarrow x = ||\bar{a}||\cos\alpha \).
Similarly, the scalar projection of \( \bar{a} \) on any other vector \( \bar{b} \) is obtained as \( ||\bar{a}||\cos\theta \), where \( \theta \) is the smaller angle (\( 0^\circ \le \theta \le 180^\circ \)) between \( \bar{a} \) and \( \bar{b} \).
In physics, the work W done by a force \( \bar{a} \) during a displacement \( \bar{b} \) is the product of the scalar projection \( ||\bar{a}||\cos\theta \) of the force onto the direction of the displacement and the length \( ||\bar{b}|| \) of the displacement: \( W = (||\bar{a}||\cos\theta)\,||\bar{b}|| = ||\bar{a}||\,||\bar{b}||\cos\theta \). This expression is called the dot product of the force and the displacement, denoted \( \bar{a}\cdot\bar{b} \).
The dot product is an \( R^2 \times R^2 \to R \) map, since the result is a number.
The principal properties of the dot product are
1. The dot product is commutative: \( \bar{a}\cdot\bar{b} = \bar{b}\cdot\bar{a} \) (obvious);
2. It is distributive with regard to vector addition: \( (\bar{a}+\bar{b})\cdot\bar{c} = \bar{a}\cdot\bar{c} + \bar{b}\cdot\bar{c} \), since the scalar projection of \( \bar{a}+\bar{b} \) along the line of vector \( \bar{c} \) is the sum of the projections of \( \bar{a} \) and \( \bar{b} \);
3. It is associative with respect to scalar multiplication: \( k(\bar{a}\cdot\bar{b}) = (k\bar{a})\cdot\bar{b} = \bar{a}\cdot(k\bar{b}) \). For k > 0 it is obvious, since \( k\bar{a} \) and \( k\bar{b} \) have the same direction as \( \bar{a} \) and \( \bar{b} \), respectively.
For k < 0, both sides change sign, so the equality still holds.
4. It is positive definite: \( \bar{a}\cdot\bar{a} > 0 \) if \( \bar{a} \neq \bar{0} \) and \( \bar{a}\cdot\bar{a} = 0 \) iff \( \bar{a} = \bar{0} \) (obvious), so only the zero vector \( \bar{0} \) has zero length; other vectors have positive lengths.
Note that two nonzero vectors \( \bar{a} \) and \( \bar{b} \) are perpendicular (orthogonal) if and only if their dot product is zero: \( \bar{a}\perp\bar{b} \Leftrightarrow \bar{a}\cdot\bar{b} = 0 \), since \( \cos 90^\circ = 0 \).
Using the properties of dot multiplication and the dot-multiplication table for the unit vectors \( \bar{i} \) and \( \bar{j} \), the dot product of two vectors \( \bar{a} = x\bar{i}+y\bar{j} \) and \( \bar{b} = x'\bar{i}+y'\bar{j} \) can be calculated in terms of their components: \( \bar{a}\cdot\bar{b} = (x\bar{i}+y\bar{j})\cdot(x'\bar{i}+y'\bar{j}) = xx' + yy' \).
Since the defining formula \( \bar{a}\cdot\bar{b} = ||\bar{a}||\,||\bar{b}||\cos\theta \) does not refer to any coordinate system, the value \( xx'+yy' \) is the same in every rectangular coordinate system.
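As a quick numerical illustration of the component formula and of its independence of the coordinate system, here is a hedged sketch (numpy assumed; the vectors and the rotation angle are arbitrary):

```python
# Check that x x' + y y' equals ||a|| ||b|| cos(theta) and is rotation-invariant.
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([2.0, -1.0])

# angle of each vector measured from the x axis, then their difference
theta = np.arctan2(a[1], a[0]) - np.arctan2(b[1], b[0])
geometric = np.linalg.norm(a) * np.linalg.norm(b) * np.cos(theta)
component = a @ b                      # x x' + y y'
print(component, geometric)            # both 2.0

phi = 0.7                              # rotate the coordinate system
R = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])
print((R @ a) @ (R @ b))               # still 2.0: the value is invariant
```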
Note that the dot product for three vectors is meaningless.
Applications of the Dot Product and Scalar Multiplication
A The length (norm) \( ||\bar{a}|| \) of a vector \( \bar{a} = [x\ y]^T = x\bar{i}+y\bar{j} \) can be expressed by the dot product: since \( \bar{a}\cdot\bar{a} = ||\bar{a}||^2\cos 0^\circ = ||\bar{a}||^2 = x^2+y^2 \), it follows that \( ||\bar{a}|| = \sqrt{\bar{a}\cdot\bar{a}} = \sqrt{x^2+y^2} \).
The cosine of the angle \( \theta \) between \( \bar{a} = [x\ y]^T \) and \( \bar{b} = [x'\ y']^T \) is obviously \( \cos\theta = \dfrac{\bar{a}\cdot\bar{b}}{||\bar{a}||\,||\bar{b}||} = \dfrac{xx'+yy'}{\sqrt{x^2+y^2}\sqrt{x'^2+y'^2}} \).
The unit vector (ort) \( \bar{a}_0 \) in the direction of \( \bar{a} \) is obtained by dividing \( \bar{a} \) by its length: \( \bar{a}_0 = \bar{a}/||\bar{a}|| \), so that \( \bar{a} = ||\bar{a}||\,\bar{a}_0 \) and \( ||\bar{a}_0|| = 1 \).
The components (scalar projections) x, y of \( \bar{a} = [x\ y]^T \) are the result of dot-multiplication of \( \bar{a} \) with \( \bar{i} \) and \( \bar{j} \), respectively: \( x = \bar{a}\cdot\bar{i} = x\cdot 1 + y\cdot 0 \), \( y = \bar{a}\cdot\bar{j} = x\cdot 0 + y\cdot 1 \Rightarrow \bar{a} = (\bar{a}\cdot\bar{i})\bar{i} + (\bar{a}\cdot\bar{j})\bar{j} \).
The distance d(A,B) between two points A(x, y) and B(x', y') is the length of the difference \( \bar{a}-\bar{b} = [x-x'\ \ y-y']^T \) of their position vectors \( \bar{a} = [x\ y]^T \) and \( \bar{b} = [x'\ y']^T \): \( d(A,B) = ||\bar{a}-\bar{b}|| = \sqrt{(x-x')^2+(y-y')^2} \).
B The Cauchy–Schwarz inequality is an immediate consequence of the definition of the dot product: since \( |\cos\theta| \le 1 \) for any angle \( \theta \), the relation \( \bar{a}\cdot\bar{b} = ||\bar{a}||\,||\bar{b}||\cos\theta \) implies \( |\bar{a}\cdot\bar{b}| \le ||\bar{a}||\,||\bar{b}|| \).
The triangle inequality is a direct consequence of the Cauchy–Schwarz inequality \( -||\bar{a}||\,||\bar{b}|| \le \bar{a}\cdot\bar{b} \le ||\bar{a}||\,||\bar{b}|| \) (∗):
\( ||\bar{a}+\bar{b}||^2 = (\bar{a}+\bar{b})\cdot(\bar{a}+\bar{b}) = ||\bar{a}||^2 + 2(\bar{a}\cdot\bar{b}) + ||\bar{b}||^2 \overset{(∗)}{\le} ||\bar{a}||^2 + 2||\bar{a}||\,||\bar{b}|| + ||\bar{b}||^2 = (||\bar{a}||+||\bar{b}||)^2 \), which implies \( ||\bar{a}+\bar{b}|| \le ||\bar{a}|| + ||\bar{b}|| \), meaning that the length of a side of a triangle does not exceed the sum of the lengths of the other two sides.
C The cosine rule, a fundamental theorem of trigonometry, can be derived using the dot product. In a triangle with sides represented by vectors \( \bar{a} \), \( \bar{b} \), and \( \bar{c} = \bar{a}+\bar{b} \), we have \( ||\bar{c}||^2 = ||\bar{a}||^2 + ||\bar{b}||^2 + 2(\bar{a}\cdot\bar{b}) \). Since \( \bar{a}\cdot\bar{b} = ab\cos(180^\circ-\gamma) = -ab\cos\gamma \), where \( ||\bar{a}|| = a \), \( ||\bar{b}|| = b \), \( ||\bar{c}|| = c \), this gives \( c^2 = a^2 + b^2 - 2ab\cos\gamma \).
D One can easily prove (using the dot product) that the three altitudes in a triangle ABC are concurrent.
Let altitudes through A and B intersect at H, and let us take H as the origin.
Then let \( \bar{a}, \bar{b}, \bar{c} \) be the position vectors of A, B, C. Since \( \overrightarrow{HA} = \bar{a} \) and \( \overrightarrow{BC} = \bar{c}-\bar{b} \) are perpendicular to each other, we have \( \bar{a}\cdot(\bar{c}-\bar{b}) = 0 \), or \( \bar{a}\cdot\bar{c} = \bar{a}\cdot\bar{b} \). Similarly, \( \overrightarrow{HB}\cdot\overrightarrow{CA} = 0 \Rightarrow \bar{b}\cdot(\bar{a}-\bar{c}) = 0 \), or \( \bar{b}\cdot\bar{c} = \bar{a}\cdot\bar{b} \). Subtracting these equations, one gets \( (\bar{b}-\bar{a})\cdot\bar{c} = 0 \), or \( \overrightarrow{AB}\cdot\overrightarrow{HC} = 0 \), i.e., \( \overrightarrow{AB}\perp\overrightarrow{HC} \). Therefore, H lies on the third altitude (through C), and the three altitudes in ABC are concurrent (at H, which is called the orthocenter).
E (1) As two simple and useful applications of scalar multiplication, let us con- sider the section formula (the ratio theorem) and the position vector of the centroid of a triangle.
The section formula gives us the position vector \( \bar{p} \) of a point P specified by its position ratio with respect to two fixed points A and B: P divides the segment AB so that AP : PB = m : n, i.e., \( AP = \frac{m}{n}\,PB \).
Since the vectors \( \overrightarrow{AP} = \bar{p}-\bar{a} \) and \( \overrightarrow{PB} = \bar{b}-\bar{p} \) lie on the same line, one is a scalar multiple of the other: \( \bar{p}-\bar{a} = \frac{m}{n}(\bar{b}-\bar{p}) \Rightarrow n\bar{p}-n\bar{a} = m\bar{b}-m\bar{p} \Rightarrow (m+n)\bar{p} = m\bar{b}+n\bar{a} \), and finally \( \bar{p} = \dfrac{m\bar{b}+n\bar{a}}{m+n} \) (the section formula).
The mid-point of AB (m = n) has the position vector \( \bar{p} = \dfrac{\bar{a}+\bar{b}}{2} \).
(2) Consider an arbitrary triangle ABC. Let D, E, F be the mid-points of the sides BC, CA, AB, respectively. The medians of the triangle are the lines AD, BE, CF. We shall show, by the methods of vector algebra [see (1) above], that these three lines are concurrent.
Let G be the point on the median AD such that AG : GD = 2 : 1; hence, by the section formula, \( \bar{g} = \dfrac{2\bar{d}+\bar{a}}{3} \).
As D is the mid-point of BC, its position vector is \( \bar{d} = \dfrac{\bar{b}+\bar{c}}{2} \).
Substituting this vector in the expression for \( \bar{g} \), we have \( \bar{g} = \dfrac{\bar{a}+\bar{b}+\bar{c}}{3} \).
The expression for the centroid G is symmetrical in A, B, and C. Consequently, calculating the position vectors of the points dividing the other two medians, BE and CF, in the ratio 2 : 1 yields the same result. This confirms that G lies on all three medians, establishing it as the centroid of triangle ABC.
The line segment connecting the midpoints of two sides of a triangle is parallel to the third side and measures exactly half its length.
\( \bar{d} = \dfrac{\bar{a}+\bar{c}}{2} \) and \( \bar{e} = \dfrac{\bar{b}+\bar{c}}{2} \Rightarrow \bar{d}-\bar{e} = \frac{1}{2}(\bar{a}-\bar{b}) \), so \( \overrightarrow{ED} = \frac{1}{2}\overrightarrow{BA} \). Therefore, ED is parallel to BA, since it is a scalar multiple of \( \overrightarrow{BA} \), and its length \( ||\overrightarrow{ED}|| \) is \( \frac{1}{2} \) of \( ||\overrightarrow{BA}|| \).
Using this result, one can easily prove that the mid-points of the sides of any quadrilateral ABCD are the vertices of a parallelogram:
1.7 Vectors in Three-Dimensional Space (Spatial Vectors)
The notion of a (geometric) vector in three-dimensional Euclidean space E 3is the same as in two-dimensional space (plane)—it is a directed line segment (an arrow).
A vector is defined by its length and direction, represented by a line segment in space with an arrow indicating its orientation, denoted \( \overrightarrow{AB} \) or \( \bar{a} \). Vectors that share the same length and direction on parallel lines are considered equal, leading to the partition of all vectors in \( E^3 \) into equivalence classes, referred to as \( V^3 \). By selecting three mutually perpendicular axes of unit length that intersect at the origin O, a rectangular coordinate system is established in \( E^3 \), creating a one-to-one correspondence between points in \( E^3 \) and ordered triples (x, y, z) of real numbers in \( R^3 \).
(We choose a right-handed coordinate system - x,y,z axes point as the thumb, the index finger and the middle finger of the right hand.)
The natural representative of a class of equal vectors is the vector that originates from the point O(0,0,0). This vector is identified by the coordinates (x, y, z) of its terminal point, arranged as the matrix-column \( \bar{a} = [x\ y\ z]^T \). Here, x, y, z are called the scalar components of \( \bar{a} \).
We can useR 3 to denote both the set of all points in E 3 and the set of their position vectors.
Note that the representative column of \( \bar{a} \) depends on the choice of the rectangular coordinate system (sharing the same origin O). In Section 4.4, we shall derive the corresponding transformation formula: when we pass to another rectangular coordinate system by means of an orthogonal replacement matrix \( \mathcal{R} \) (\( \mathcal{R}^{-1} = \mathcal{R}^T \)), the representative matrix-column of \( \bar{a} \) changes analogously.
In \( V^3 \), we define addition just as in \( V^2 \), by the triangle rule (attaching \( \bar{b} \) at the terminal point of \( \bar{a} \)). Alternatively, the addition can be performed component-wise, by summing the corresponding components of the natural representatives of the classes in the chosen rectangular coordinate system, giving \( \bar{a}+\bar{b} \).
This addition of vectors makes V 3, as well asR 3 , an Abelian group (the properties and proofs are the same as in the two-dimensional cases).
The scalar multiplication and the dot product are also defined by analogy with \( R^2 \): \( c\bar{a} = [cx\ cy\ cz]^T \) and \( \bar{a}\cdot\bar{b} = xx' + yy' + zz' \).
Scalar multiplication and the dot product in three-dimensional space, R³ and V³, exhibit the same fundamental properties as in two-dimensional space Consequently, both R³ and V³ qualify as real vector spaces The presence of these properties in the dot product categorizes R³ and V³ as Euclidean vector spaces.
The rectangular coordinate system is determined by three perpendicular (orthogonal) unit vectors (orts) \( \bar{i} = [1\ 0\ 0]^T \), \( \bar{j} = [0\ 1\ 0]^T \), \( \bar{k} = [0\ 0\ 1]^T \). Every vector \( \bar{a} = [x\ y\ z]^T \) can be written as the unique sum of its vector-components (or unique linear combination of \( \bar{i}, \bar{j}, \bar{k} \))—see the diagram on p. 15: \( \bar{a} = x\bar{i} + y\bar{j} + z\bar{k} \), where \( x = \bar{a}\cdot\bar{i} \), \( y = \bar{a}\cdot\bar{j} \), \( z = \bar{a}\cdot\bar{k} \), and the dot-multiplication table is \( \bar{i}\cdot\bar{i} = \bar{j}\cdot\bar{j} = \bar{k}\cdot\bar{k} = 1 \), \( \bar{i}\cdot\bar{j} = \bar{i}\cdot\bar{k} = \bar{j}\cdot\bar{k} = 0 \).
We say that the vectors ¯i,¯j,¯k form an orthonormal (ON) basis inR 3
1.8 The Cross Product in \( R^3 \)
The cross product is a binary vector operation (a mapping \( R^3 \times R^3 \to R^3 \)), which is only meaningful in \( R^3 \). To every ordered pair of vectors in \( R^3 \), \( \bar{a} = [x\ y\ z]^T = x\bar{i}+y\bar{j}+z\bar{k} \) and \( \bar{b} = [x'\ y'\ z']^T = x'\bar{i}+y'\bar{j}+z'\bar{k} \), we associate a vector \( \bar{c} = \bar{a}\times\bar{b} = x''\bar{i}+y''\bar{j}+z''\bar{k} \) that is perpendicular to each of them: \( \bar{a}\cdot\bar{c} = 0 \) and \( \bar{b}\cdot\bar{c} = 0 \).
This is a system of two homogeneous linear equations in three unknowns x'', y'', z'': \( xx'' + yy'' + zz'' = 0 \) and \( x'x'' + y'y'' + z'z'' = 0 \).
In this scenario, there are fewer equations than unknowns, so we must introduce a free parameter, denoted s, for one of the unknowns, say z''; the unknowns x'' and y'' are then expressed in terms of s. From a geometric perspective, solving this homogeneous system means determining the kernel of the coefficient matrix A.
This is normally done by reducing this matrix to the unique row-echelon Gauss–Jordan modified (GJM) form (see the end of Sect 2.18).
For this system, the last column of the GJM form provides the unique basis vector of ker A, and the general solution of the system, i.e., the kernel of A, is the line in \( R^3 \)
\( [x''\ y''\ z'']^T = s\,\Big[\dfrac{zy'-yz'}{xy'-yx'}\ \ \dfrac{xz'-zx'}{xy'-yx'}\ \ -1\Big]^T, \quad s \in R. \)
To simplify this expression, we can replace the free parameter s by another, \( s = -k(xy'-yx') \), \( k \in R \), and finally get \( x'' = k(yz'-zy') \), \( y'' = k(zx'-xz') \), \( z'' = k(xy'-yx') \).
Obviously k = 1 is the simplest solution, so \( \bar{c} = (yz'-zy')\bar{i} + (zx'-xz')\bar{j} + (xy'-yx')\bar{k} \).
A simpler and more transparent method is to multiply the first equation by (−x') and the second by x. Adding the resulting expressions, we arrive at \( \dfrac{y''}{zx'-xz'} = \dfrac{z''}{xy'-yx'} \).
Then, multiply the first equation by y' and the second by (−y). Adding the expressions so obtained, we have \( \dfrac{x''}{yz'-zy'} = \dfrac{z''}{xy'-yx'} \).
Since these three quotients are equal, but arbitrary, we introduce a free parameter \( k \in R \): \( \dfrac{x''}{yz'-zy'} = \dfrac{y''}{zx'-xz'} = \dfrac{z''}{xy'-yx'} = k \in R \).
Naturally, we have the same situation as with the GJM method, and the simplest solution is again k=1.
The components of the vector \( \bar{c} \) can be written as determinants of 2×2 matrices: \( \bar{c} = \bar{a}\times\bar{b} = \begin{vmatrix} y & z \\ y' & z' \end{vmatrix}\bar{i} - \begin{vmatrix} x & z \\ x' & z' \end{vmatrix}\bar{j} + \begin{vmatrix} x & y \\ x' & y' \end{vmatrix}\bar{k} \).
Note: From now on we shall need several statements from the theory of determinants (see Appendix A).
The expression for \( \bar{c} \) can be viewed as a symbolic determinant whose first row consists of vectors rather than numbers. We can nevertheless use the rule for expanding a determinant along its first row, since vector algebra provides the two operations needed—vector addition and scalar multiplication. Therefore, \( \bar{c} = \bar{a}\times\bar{b} = \begin{vmatrix} \bar{i} & \bar{j} & \bar{k} \\ x & y & z \\ x' & y' & z' \end{vmatrix} \).
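A small numerical check that the 2×2-determinant components agree with a library cross product may be helpful; the following sketch assumes numpy and uses arbitrary example vectors:

```python
# The component formulas of a x b compared with numpy.cross.
import numpy as np

a = np.array([1.0, 2.0, 3.0])      # x, y, z
b = np.array([4.0, 5.0, 6.0])      # x', y', z'

c_manual = np.array([
    a[1] * b[2] - a[2] * b[1],     # y z' - z y'
    a[2] * b[0] - a[0] * b[2],     # z x' - x z'
    a[0] * b[1] - a[1] * b[0],     # x y' - y x'
])
print(c_manual, np.cross(a, b))    # both give [-3.  6. -3.]
print(a @ c_manual, b @ c_manual)  # 0.0 0.0 -- c is orthogonal to a and b
```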
This new operation is not commutative, but instead it is anticommutative: \( \bar{a}\times\bar{b} = -(\bar{b}\times\bar{a}) \), since the interchange of two rows in a determinant changes its sign.
It is also not associative: \( \bar{a}\times(\bar{b}\times\bar{c}) \neq (\bar{a}\times\bar{b})\times\bar{c} \). The last property follows immediately if we calculate \( \bar{a}\times(\bar{b}\times\bar{c}) \) for three arbitrary vectors \( \bar{a} = x\bar{i}+y\bar{j}+z\bar{k} \), \( \bar{b} = x'\bar{i}+y'\bar{j}+z'\bar{k} \), \( \bar{c} = x''\bar{i}+y''\bar{j}+z''\bar{k} \):
\( \bar{a}\times(\bar{b}\times\bar{c}) = \bar{i}\,[\,y(x'y''-y'x'') - z(z'x''-x'z'')\,] + \bar{j}\ \text{and}\ \bar{k}\ \text{components} = \bar{i}\,[\,x'(xx''+yy''+zz'') - x''(xx'+yy'+zz')\,] + \bar{j}\ \text{and}\ \bar{k}\ \text{components} = (\bar{a}\cdot\bar{c})\bar{b} - (\bar{a}\cdot\bar{b})\bar{c}. \)
On the other hand, \( (\bar{a}\times\bar{b})\times\bar{c} = (\bar{a}\cdot\bar{c})\bar{b} - (\bar{c}\cdot\bar{b})\bar{a} \). The first terms \( (\bar{a}\cdot\bar{c})\bar{b} \) agree, while the second terms \( (\bar{a}\cdot\bar{b})\bar{c} \) and \( (\bar{c}\cdot\bar{b})\bar{a} \) do not. This leads to the conclusion that \( \bar{a}\times(\bar{b}\times\bar{c}) \) is in general not equal to \( (\bar{a}\times\bar{b})\times\bar{c} \). Therefore, a cross product \( \bar{a}\times\bar{b}\times\bar{c} \) of three vectors cannot be defined, since the operation is not an associative binary operation.
The relation of the cross product with vector addition is \( \bar{a}\times(\bar{b}+\bar{c}) = \bar{a}\times\bar{b} + \bar{a}\times\bar{c} \) (the distributive law), which can be obtained from \( \begin{vmatrix} \bar{i} & \bar{j} & \bar{k} \\ x & y & z \\ x'+x'' & y'+y'' & z'+z'' \end{vmatrix} = \begin{vmatrix} \bar{i} & \bar{j} & \bar{k} \\ x & y & z \\ x' & y' & z' \end{vmatrix} + \begin{vmatrix} \bar{i} & \bar{j} & \bar{k} \\ x & y & z \\ x'' & y'' & z'' \end{vmatrix}. \)
It follows from an analogous argument that \( (\bar{a}+\bar{b})\times\bar{c} = \bar{a}\times\bar{c} + \bar{b}\times\bar{c} \).
The relationship between the cross product and scalar multiplication can be expressed as k(a × b) = (k a) × b = a × (k b), where k is a real number, demonstrating the associative law This equality holds because multiplying the second or third row of the corresponding determinant by the scalar k yields the same determinant value.
For any vector \( \bar{a} \), the cross product with itself is zero: \( \bar{a}\times\bar{a} = 0\bar{i}+0\bar{j}+0\bar{k} = \bar{0} \) (a determinant with two equal rows vanishes). An important consequence is that two nonzero vectors \( \bar{a} \) and \( \bar{b} \) are parallel iff \( \bar{a}\times\bar{b} = \bar{0} \). This can be proved by observing that if \( \bar{a} \) and \( \bar{b} \) are parallel, then \( \bar{a} = k\bar{b} \), so that \( \bar{a}\times\bar{b} = k(\bar{b}\times\bar{b}) = \bar{0} \).
On the other hand, \( \bar{a}\times\bar{b} = \bar{0} \Rightarrow ||\bar{a}\times\bar{b}|| = ||\bar{a}||\,||\bar{b}||\sin\theta = 0 \Rightarrow \sin\theta = 0 \Rightarrow \theta = 0^\circ \) or \( \theta = 180^\circ \) (since \( 0^\circ \le \theta \le 180^\circ \)) (for \( ||\bar{a}\times\bar{b}|| \) see Sect 1.9).
As far as the three orthogonal unit vectors \( \bar{i}, \bar{j}, \bar{k} \) are concerned, their cross-product table is as follows: \( \bar{i}\times\bar{i} = \bar{j}\times\bar{j} = \bar{k}\times\bar{k} = \bar{0} \); \( \bar{i}\times\bar{j} = \bar{k} \), \( \bar{j}\times\bar{k} = \bar{i} \), \( \bar{k}\times\bar{i} = \bar{j} \); and \( \bar{j}\times\bar{i} = -\bar{k} \), \( \bar{k}\times\bar{j} = -\bar{i} \), \( \bar{i}\times\bar{k} = -\bar{j} \).
This table has zeros on the main diagonal and is skew symmetric with respect to this diagonal since the cross product is anticommutative.
From ¯iׯj=¯k, it follows that the direction of ¯aׯb is such that ¯a,¯b,a¯×¯b form a right-handed system.
The cross product of vectors inR 3 plays an essential role in the theoretical for- mulation of Mechanics and Electromagnetism.
1.9 The Mixed Triple Product in \( R^3 \). Applications of the Cross and Mixed Products
The mixed triple product \( \bar{a}\cdot(\bar{b}\times\bar{c}) \) combines the dot product and the cross product of three vectors: \( \bar{a}\cdot(\bar{b}\times\bar{c}) = x\begin{vmatrix} y' & z' \\ y'' & z'' \end{vmatrix} - y\begin{vmatrix} x' & z' \\ x'' & z'' \end{vmatrix} + z\begin{vmatrix} x' & y' \\ x'' & y'' \end{vmatrix} = \begin{vmatrix} x & y & z \\ x' & y' & z' \\ x'' & y'' & z'' \end{vmatrix}. \)
The mixed triple product is thus a number, the value of a proper determinant, and the expression above is its first-row expansion. The third-row expansion of the same determinant gives \( x''\begin{vmatrix} y & z \\ y' & z' \end{vmatrix} - y''\begin{vmatrix} x & z \\ x' & z' \end{vmatrix} + z''\begin{vmatrix} x & y \\ x' & y' \end{vmatrix} = (\bar{a}\times\bar{b})\cdot\bar{c}. \)
In conclusion, the signs · and × can be interchanged, \( \bar{a}\cdot(\bar{b}\times\bar{c}) = (\bar{a}\times\bar{b})\cdot\bar{c} \), where the cross product must be kept inside the brackets. Consequently, the mixed triple product is frequently denoted as \( [\bar{a}\,\bar{b}\,\bar{c}] \), and
\( [\bar{a}\,\bar{b}\,\bar{c}] = [\bar{c}\,\bar{a}\,\bar{b}] = [\bar{b}\,\bar{c}\,\bar{a}] = -[\bar{c}\,\bar{b}\,\bar{a}] = -[\bar{a}\,\bar{c}\,\bar{b}] = -[\bar{b}\,\bar{a}\,\bar{c}] \), since every interchange of two rows in a determinant changes its sign.
Applications of the Cross and Mixed Products
A The area of a parallelogram and of a triangle.
The length of \( \bar{a}\times\bar{b} \) can be determined as follows:
\( ||\bar{a}\times\bar{b}||^2 = (\bar{a}\times\bar{b})\cdot(\bar{a}\times\bar{b}) = \bar{a}\cdot[\bar{b}\times(\bar{a}\times\bar{b})] = \bar{a}\cdot[(\bar{b}\cdot\bar{b})\bar{a} - (\bar{b}\cdot\bar{a})\bar{b}] = ||\bar{a}||^2||\bar{b}||^2 - (\bar{a}\cdot\bar{b})^2 = ||\bar{a}||^2||\bar{b}||^2 - ||\bar{a}||^2||\bar{b}||^2\cos^2\theta = ||\bar{a}||^2||\bar{b}||^2\sin^2\theta \), and finally \( ||\bar{a}\times\bar{b}|| = ||\bar{a}||\,||\bar{b}||\sin\theta \).
Now, we can calculate the area A of the parallelogram determined by two vectors \( \bar{a} \) and \( \bar{b} \): \( A = ||\bar{a}||\,||\bar{b}||\sin\theta = ||\bar{a}\times\bar{b}|| \).
The area \( A_\Delta \) of the triangle determined by vectors \( \bar{a} \) and \( \bar{b} \) is calculated analogously: \( A_\Delta = \frac{1}{2}||\bar{a}\times\bar{b}|| \).
The sine rule can be derived using the cross product, where the three cross products ¯aׯb, ¯bׯc, and ¯c×a in triangle ABC yield lengths that are each twice the area of triangle ΔABC.
\( 2A_{\Delta ABC} = ||\bar{a}\times\bar{b}|| = ||\bar{b}\times\bar{c}|| = ||\bar{c}\times\bar{a}|| \), or \( ab\sin\gamma = bc\sin\alpha = ca\sin\beta \), where \( a = ||\bar{a}|| \), \( b = ||\bar{b}|| \), \( c = ||\bar{c}|| \).
Finally, \( ab\sin\gamma = ca\sin\beta \Rightarrow \dfrac{b}{\sin\beta} = \dfrac{c}{\sin\gamma} \), and \( ca\sin\beta = bc\sin\alpha \Rightarrow \dfrac{a}{\sin\alpha} = \dfrac{b}{\sin\beta} \), giving the sine rule \( \dfrac{a}{\sin\alpha} = \dfrac{b}{\sin\beta} = \dfrac{c}{\sin\gamma} \).
The vector \( \bar{a}\times\bar{b} \) is orthogonal to both \( \bar{a} \) and \( \bar{b} \), so it is perpendicular to the base of the parallelepiped formed by the vectors \( \bar{a} \), \( \bar{b} \), and \( \bar{c} \). The projection of \( \bar{c} \) onto the line of \( \bar{a}\times\bar{b} \), namely \( ||\bar{c}||\cos\theta \), serves as the height of the parallelepiped.
Since \( 0^\circ \le \theta \le 180^\circ \), the cosine of \( \theta \) lies between −1 and 1, while the height must be a positive number, so \( H = ||\bar{c}||\,|\cos\theta| \). The volume of the parallelepiped is the area of the base times the height: \( V = ||\bar{a}\times\bar{b}||\,||\bar{c}||\,|\cos\theta| = |(\bar{a}\times\bar{b})\cdot\bar{c}| = |[\bar{a}\,\bar{b}\,\bar{c}]| \) (the absolute value of the mixed triple product).
If the three vectors \( \bar{a} \), \( \bar{b} \), and \( \bar{c} \) are coplanar, they cannot form a parallelepiped, and \( [\bar{a}\,\bar{b}\,\bar{c}] = 0 \). This condition is both necessary and sufficient for coplanarity: coplanar vectors are linearly dependent (one of them is a linear combination of the other two), so the determinant of their columns is zero; conversely, \( [\bar{a}\,\bar{b}\,\bar{c}] = 0 \) means that the columns are linearly dependent, i.e., the three vectors lie in the same plane.
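The determinant form of the mixed triple product, the volume formula, and the coplanarity criterion can be illustrated numerically; a minimal sketch (numpy assumed, vectors arbitrary):

```python
# Mixed triple product as a 3x3 determinant, parallelepiped volume, coplanarity.
import numpy as np

a = np.array([1.0, 0.0, 0.0])
b = np.array([1.0, 2.0, 0.0])
c = np.array([0.0, 1.0, 3.0])

triple = np.dot(a, np.cross(b, c))          # a . (b x c)
det    = np.linalg.det(np.array([a, b, c])) # determinant with rows a, b, c
print(triple, det)                          # both 6.0
print(abs(triple))                          # volume of the parallelepiped

d = 2.0 * a - 1.5 * b                       # d lies in the plane of a and b
print(np.isclose(np.dot(a, np.cross(b, d)), 0.0))   # True: coplanar
```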
1.10 Equations of Lines in Three-Dimensional Space
Given a fixed point \( P_0(x_0, y_0, z_0) \) on a line and a direction vector \( \bar{d} \) parallel to the line, we see that \( \overrightarrow{P_0P} = t\bar{d} \), where P(x, y, z) is any point on the line, and the parameter t takes any real value (\( -\infty < t < +\infty \)).

The case n > m is not feasible, as L would map n linearly independent vectors from V onto n linearly independent vectors in W, which can accommodate at most m such vectors. Similarly, the case n < m is impossible because of the existence of the inverse map \( L^{-1} \). Therefore, an isomorphic map L can exist between vector spaces V and W only if they have the same dimension.
The Kernel and the Range of L
Every linmap \( L : V^n \to W^m \) determines two important subspaces—one in the domain \( V^n \) and the other in the codomain \( W^m \).
The kernel of the linear transformation L, denoted ker L, is defined as the set of all vectors \( \bar{x} \) in the domain \( V^n \) that are mapped to the zero vector \( \bar{0}_w \) in the codomain \( W^m \). Formally, \( \ker L = \{\bar{x} \mid \bar{x} \in V^n \text{ and } L(\bar{x}) = \bar{0}_w\} \).
The second subspace consists of images in W m of all vectors from V n It is called the range of L and it is denoted as ran L: ran L={L(¯x)|x¯∈V n }.
We also write concisely ran L=L(V n ).
To demonstrate that these two sets are subspaces, we need to establish that any linear combination of their vectors remains within the set. Consider a collection of vectors \( \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_p\} \) from the kernel of L, so that \( L(\bar{x}_1) = \bar{0}_w, L(\bar{x}_2) = \bar{0}_w, \ldots, L(\bar{x}_p) = \bar{0}_w \). Every linear combination of these vectors is then also contained in the kernel, since \( L(\sum_{i=1}^p a_i\bar{x}_i) = \sum_{i=1}^p a_i L(\bar{x}_i) = \bar{0}_w \).
The kernel of L is never empty since at least ¯0 v belongs to it: L(¯0 v ) =¯0 w
Any set of vectors \( \{\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_q\} \) from ran L has at least one set of preimages in \( V^n \): \( L(\bar{x}_1) = \bar{y}_1, L(\bar{x}_2) = \bar{y}_2, \ldots, L(\bar{x}_q) = \bar{y}_q \). Furthermore, any linear combination \( \sum_{i=1}^q b_i\bar{y}_i \) of these vectors also belongs to ran L, since it has the preimage \( \sum_{i=1}^q b_i\bar{x}_i \).
Since we now know that ker L and ran L are subspaces, we shall investigate their relationship, in particular the connection between their dimensions.
We shall prove the very important relationship:
Theorem (dimension) \( \dim(\ker L) + \dim(\operatorname{ran} L) = \dim(\operatorname{Dom} L) \).
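Before the proof, the statement can be checked numerically for the matrix of a concrete linear map; the sketch below (numpy assumed, the matrix is an arbitrary example) extracts the rank and a kernel basis from the SVD:

```python
# Check dim(ker L) + dim(ran L) = dim(Dom L) for the map x -> A x.
import numpy as np

A = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.0],
              [1.0, 0.0, 1.0, 0.0]])        # matrix of L : R^4 -> R^3

u, s, vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))               # dim(ran L)
kernel_basis = vt[rank:]                    # rows of V^T spanning ker A
print(rank, kernel_basis.shape[0])          # 2 2
print(rank + kernel_basis.shape[0] == A.shape[1])   # True: 2 + 2 = 4
print(np.allclose(A @ kernel_basis.T, 0.0)) # each kernel vector maps to 0
```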
Proof Assume that \( \dim(\ker L) = k \).

Vector spaces \( R^n \) for n > 3 lack a geometrical interpretation but play a crucial role in various fields. They are vital in mathematics for solving systems of linear equations, in physics for certain formulations of the special theory of relativity when n = 4, and in economics for constructing linear economic models.
The generalization of the dot product in R 3 to R n , n>3, is straightforward.
To extend the summation in the coordinate definition from three dimensions to n dimensions, the dot product of two vectors \(\bar{x}, \bar{y} \in \mathbb{R}^n\) is defined as \(\bar{x} \cdot \bar{y} = \sum_{i=1}^{n} x_i y_i\), where \(\bar{x} = [x_1, x_2, \ldots, x_n]^T\) and \(\bar{y} = [y_1, y_2, \ldots, y_n]^T\) Additionally, the dot product can be represented as a matrix product, expressed as \(\bar{x} \cdot \bar{y} = \bar{x}^T \bar{y}\) This notation has numerous practical and beneficial applications.
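A minimal sketch of this matrix-product form of the inner product (numpy assumed; the column vectors are arbitrary examples):

```python
# The inner product in R^n written as the matrix product x^T y.
import numpy as np

x = np.array([[1.0], [2.0], [3.0]])      # x = [x_1 x_2 x_3]^T
y = np.array([[4.0], [-1.0], [0.5]])     # y = [y_1 y_2 y_3]^T

as_sum    = float(np.sum(x * y))         # sum_i x_i y_i
as_matrix = (x.T @ y).item()             # x^T y, a 1x1 matrix
print(as_sum, as_matrix)                 # 3.5 3.5
```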
There are several changes in terminology and notation that we are dealing with inR n
The product \( \sum_{i=1}^n x_iy_i \) is no longer called the dot product, nor is it denoted \( \bar{x}\cdot\bar{y} \).
It is called the inner product and written \( (\bar{x}, \bar{y}) \), which should not be mistaken for the notation for ordered pairs of vectors. To avoid this ambiguity, we shall denote an ordered pair of vectors as \( [\bar{x}, \bar{y}] \).
The other difference in \( R^n \) compared with \( R^3 \) is that the magnitude of the vector \( \bar{x} \) is called its norm and denoted \( ||\bar{x}|| \), but it is defined analogously, as \( ||\bar{x}|| = (\bar{x},\bar{x})^{1/2} = (\sum_{i=1}^n x_i^2)^{1/2} \).
In the theory of tensor multiplication within vector spaces, the expression ∑ n i=1 x i y i exemplifies a tensor product of matrix-column spaces, followed by a contraction process that involves equating two indices and summing over the shared index This type of tensor product is referred to as an inner product.
The inner product in \( \mathbb{R}^n \) exhibits four key properties similar to the dot product in \( \mathbb{R}^3 \): commutativity, distributivity concerning vector addition, associativity in the multiplication of vectors by scalars, and positive definiteness Additionally, the properties of distributivity and associativity can be generalized, indicating that this inner product is bilinear.
As far as other real vector spaces V n (R)are concerned, the inner product in them can be defined axiomatically by taking the above properties of this product inR n as postulates:
Definition. An inner product in a real vector space \( V^n(\mathbb{R}) \) is any scalar function on the Cartesian product \( V^n(\mathbb{R}) \times V^n(\mathbb{R}) \), i.e., any function that maps every ordered pair of vectors \( [\bar{x}, \bar{y}] \) from \( V^n(\mathbb{R}) \) to a unique real number \( (\bar{x}, \bar{y}) \), provided it satisfies the following three axioms:
1. symmetric (commutative)—\( (\bar{x},\bar{y}) = (\bar{y},\bar{x}) \);
2. linear in the first factor—\( (a\bar{x}_1 + b\bar{x}_2, \bar{y}) = a(\bar{x}_1,\bar{y}) + b(\bar{x}_2,\bar{y}) \);
3. positive definite—\( (\bar{x},\bar{x}) > 0 \) for \( \bar{x} \neq \bar{0} \) and \( (\bar{x},\bar{x}) = 0 \) iff \( \bar{x} = \bar{0} \).
Combining the first and second axioms, we can say that every inner product in \( V^n(R) \) is bilinear (linear in both factors).
A real vector space V n (R)with this kind of inner product is called a Euclidean space E n
Remark Some authors reserve this term forR n , while other real spaces of this kind are called real inner-product vector spaces.
There are two possible ways to define different inner products in the same V n (R) andR n
1. Any basis \( v = \{\bar{v}_1, \bar{v}_2, \ldots, \bar{v}_n\} \) in \( V^n(R) \) defines an inner product if we expand vectors from \( V^n(R) \) in that basis, \( \bar{x} = \sum_{i=1}^n x_i\bar{v}_i \), \( \bar{x} \to \mathcal{X} = [x_1, x_2, \ldots, x_n]^T \) (note the difference in notation between the vector \( \bar{x} \) and its representing column \( \mathcal{X} \)), and define the inner product of two vectors \( \bar{x},\bar{y} \in V^n(R) \) as \( (\bar{x},\bar{y})_v = \sum_{i=1}^n x_iy_i = \mathcal{X}^T\mathcal{Y} \).
If we replace this basis v by a new one, \( v' = \{\bar{v}'_1, \bar{v}'_2, \ldots, \bar{v}'_n\} \), by means of an n×n replacement matrix \( \mathcal{R} \),
then the representing column \( \mathcal{X} \) of \( \bar{x} \) will change by the contragredient matrix \( (\mathcal{R}^T)^{-1} \): \( \mathcal{X}' = (\mathcal{R}^T)^{-1}\mathcal{X} \). In the new basis, \( (\bar{x},\bar{y})_{v'} = \mathcal{X}'^T\mathcal{Y}' = \mathcal{X}^T\mathcal{R}^{-1}(\mathcal{R}^T)^{-1}\mathcal{Y} = \mathcal{X}^T(\mathcal{R}^T\mathcal{R})^{-1}\mathcal{Y} \). This will be the same number as \( (\bar{x},\bar{y})_v \) if \( \mathcal{R}^T\mathcal{R} = I_n \) (or \( \mathcal{R}^{-1} = \mathcal{R}^T \)). Matrices of this kind are known as orthogonal matrices.
In summary, the basis v and all other bases in \( V^n(R) \) obtained from v by orthogonal replacement matrices define a single inner product in \( V^n(R) \), since all three axioms are clearly fulfilled.
Orthogonal n×n matrices (\( \mathcal{R}^{-1} = \mathcal{R}^T \)) form a group known as O(n). Declaring two bases in \( V^n(R) \) related if one can be transformed into the other by an orthogonal replacement matrix therefore gives an equivalence relation—reflexive, symmetric, and transitive, thanks to the group properties of O(n). This relation partitions the set of all bases in \( V^n(R) \) into equivalence classes of orthogonally equivalent bases, and each class corresponds to a unique inner product in \( V^n(R) \). In the inner product defined by a class, each basis vector of that class is represented in its own basis by a standard-basis column \( \bar{e}_p = [\delta_{1p}\ \delta_{2p}\ \ldots\ \delta_{np}]^T \), \( p = 1, 2, \ldots, n \), so all the basis vectors have unit norm and are mutually orthogonal, \( (\bar{v}_i,\bar{v}_j)_v = 0 \) for \( i \neq j \). Such a set of orthogonal unit vectors is referred to as an orthonormal (ON) basis.
Thus, each class of orthogonally equivalent bases in V n (R) defines one inner product in V n (R), and all bases from the class (and only they) are orthonormal in that inner product.
2. We have the standard inner product in \( R^n \), \( (\bar{x},\bar{y}) = \bar{x}^T\bar{y} = \sum_{i=1}^n x_iy_i \), defined by the class of bases orthogonally equivalent to the standard basis \( \bar{e}_p = [\delta_{1p}\ \delta_{2p}\ \ldots\ \delta_{np}]^T \), \( p = 1, 2, \ldots, n \). Furthermore, by choosing any n×n positive definite real symmetric matrix \( \mathcal{A} \) (\( \mathcal{A} \) is symmetric if \( \mathcal{A}^T = \mathcal{A} \) and positive definite if \( (\bar{x}, \mathcal{A}\bar{x}) > 0 \) for all \( \bar{x} \neq \bar{0} \), with equality iff \( \bar{x} = \bar{0} \)), we can define a new inner product in \( R^n \): \( (\bar{x},\bar{y})_{\mathcal{A}} = \bar{x}^T\mathcal{A}\bar{y} \).
To verify that it is an inner product, we have to prove only that it is symmetric (commutative). Indeed, \( \bar{x}^T\mathcal{A}\bar{y} = a_{11}x_1y_1 + a_{12}x_2y_1 + \cdots + a_{1n}x_ny_1 + a_{21}x_1y_2 + a_{22}x_2y_2 + \cdots + a_{2n}x_ny_2 + \cdots + a_{n1}x_1y_n + a_{n2}x_2y_n + \cdots + a_{nn}x_ny_n \), and because the matrix \( \mathcal{A} \) is symmetric (\( \mathcal{A}^T = \mathcal{A} \), i.e., \( a_{ij} = a_{ji} \), \( i \neq j \), \( i, j = 1, 2, \ldots, n \)), this result is equal to the same expression with \( \bar{x} \) and \( \bar{y} \) interchanged, i.e., \( (\bar{x},\bar{y})_{\mathcal{A}} = (\bar{y},\bar{x})_{\mathcal{A}} \).
The other two axioms are obviously satisfied due to the linear properties of matrix multiplication, \( (a\mathcal{A}+b\mathcal{B})\mathcal{C} = a\mathcal{A}\mathcal{C} + b\mathcal{B}\mathcal{C} \) (where a, b are numbers and \( \mathcal{A}, \mathcal{B}, \mathcal{C} \) are matrices), i.e.,
\( (a\bar{x}_1 + b\bar{x}_2, \bar{y})_{\mathcal{A}} = (a\bar{x}_1^T + b\bar{x}_2^T)\mathcal{A}\bar{y} = a\bar{x}_1^T\mathcal{A}\bar{y} + b\bar{x}_2^T\mathcal{A}\bar{y} = a(\bar{x}_1,\bar{y})_{\mathcal{A}} + b(\bar{x}_2,\bar{y})_{\mathcal{A}} \), as well as due to the positive definiteness of the matrix \( \mathcal{A} \):
\( (\bar{x},\bar{x})_{\mathcal{A}} = (\bar{x},\mathcal{A}\bar{x}) > 0 \) for \( \bar{x} \neq \bar{0}_n \), and it is zero iff \( \bar{x} = \bar{0}_n \).
Examples of inner products in real vector spaces of matrices and polynomials. a) In the vector space \( R^{m\times n} \) of real m×n matrices, we have the standard inner product: for \( A, B \in R^{m\times n} \), \( (A,B) = \operatorname{tr}(A^TB) = \sum_{i=1}^m\sum_{j=1}^n a_{ij}b_{ij} \). It is:
1. symmetric (commutative): \( (B,A) = \operatorname{tr}(B^TA) = \operatorname{tr}(B^TA)^T = \operatorname{tr}(A^TB) = (A,B) \) [since the trace (the sum of diagonal elements) is invariant under transposition, \( \operatorname{tr}A^T = \operatorname{tr}A \)];
2. linear: \( (A+B,C) = \operatorname{tr}[(A+B)^TC] = \operatorname{tr}(A^TC + B^TC) = \operatorname{tr}(A^TC) + \operatorname{tr}(B^TC) = (A,C) + (B,C) \) (since \( (A+B)^T = A^T + B^T \) and \( \operatorname{tr}(A+B) = \operatorname{tr}A + \operatorname{tr}B \)); \( (aA,B) = \operatorname{tr}[(aA)^TB] = \operatorname{tr}(aA^TB) = a\operatorname{tr}(A^TB) = a(A,B) \) [since \( \operatorname{tr}(aA) = a\operatorname{tr}A \)];
3. positive definite: \( (A,A) = \operatorname{tr}(A^TA) = \sum_{i=1}^m\sum_{j=1}^n a_{ij}^2 > 0 \) for \( A \neq 0_{m\times n} \), and it is zero iff \( A = 0_{m\times n} = [0]_{m\times n} \). b) In the infinite-dimensional vector space \( P_{[a,b]}(R) \) of real polynomials p(x) defined on a closed interval [a, b], the standard inner product is \( (p(x), q(x)) = \int_a^b p(x)q(x)\,dx \).
The three axioms are obviously satisfied. b’) In P [a,b] (R)one can define an inner product with the weight
\( (p(x),q(x))_\rho = \int_a^b \rho(x)p(x)q(x)\,dx \), where the weight \( \rho(x) \) is a nonnegative (\( \rho(x) \ge 0 \)) and continuous function on the interval (a, b) (it cannot be zero on the whole interval).
One may notice that both a) and b) are natural generalizations of the standard inner product in \( R^n \).
The first case is a generalization from one-column matrices with one index (\( R^{n\times 1} \)) to the general type of matrices with two indices (\( R^{m\times n} \)).
The second case is a generalization from a variable with a discrete index, \( [x_1\ x_2\ \ldots\ x_n]^T \), to a variable with a continuous index, p(x), \( x \in [a,b] \), so that the summation is replaced with an integral.
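As an illustration of the matrix inner product of example a), the following sketch (numpy assumed, matrices arbitrary) checks that \( \operatorname{tr}(A^TB) \) equals the sum of products of corresponding entries:

```python
# The standard inner product (A, B) = tr(A^T B) on real m x n matrices.
import numpy as np

A = np.array([[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]])   # 3 x 2
B = np.array([[2.0, 1.0], [1.0,  4.0], [0.0, 2.0]])   # 3 x 2

via_trace = np.trace(A.T @ B)
via_sum   = np.sum(A * B)          # sum over i, j of a_ij * b_ij
print(via_trace, via_sum)          # 2.0 2.0
```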
Unitary Spaces U n (or Complex Inner-product Vector Spaces)
In the complex vector space \( C^n \) of matrix-columns with n complex numbers, defining an inner product is not as straightforward as one might expect. A first idea would be to keep the standard inner product definition used in \( R^n \), but this does not work in the complex space.
If we kept \( (\bar{x},\bar{y}) = \sum_{i=1}^n x_iy_i \), \( x_i, y_i \in C \) for all i, we would immediately see that the norm \( ||\bar{x}|| = (\bar{x},\bar{x})^{1/2} = (\sum_{i=1}^n x_i^2)^{1/2} \) is not, in this case, a real number, since the sum of squares of complex numbers is in general a complex number. However, the norm must be a real number, because in Quantum Mechanics the probabilities of measurement are calculated from the norms of certain vectors.
For this reason, the standard inner product in \( C^n \) is defined as \( (\bar{x},\bar{y}) = \sum_{i=1}^n x_i^*y_i \) (the asterisk ∗ denotes the complex conjugation [1, 2]).
With this inner product the norm of a vector \( \bar{x} \), \( ||\bar{x}|| = (\bar{x},\bar{x})^{1/2} = (\sum_{i=1}^n x_i^*x_i)^{1/2} = (\sum_{i=1}^n |x_i|^2)^{1/2} \), is always a positive real number (it is zero iff \( \bar{x} = \bar{0} \)).
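A short sketch of this convention (numpy assumed): np.vdot conjugates its first argument, which matches the definition used here, so \( (\bar{x},\bar{x}) \) comes out as a nonnegative real number:

```python
# The quantum-mechanics convention (x, y) = x^dagger y in C^n.
import numpy as np

x = np.array([1.0 + 2.0j, 3.0j])
y = np.array([2.0 - 1.0j, 1.0 + 1.0j])

inner = np.vdot(x, y)          # np.vdot conjugates its FIRST argument
norm_sq = np.vdot(x, x)        # sum |x_i|^2 = 5 + 9 = 14
print(inner)                   # (3-8j)
print(norm_sq.real, np.linalg.norm(x) ** 2)   # 14.0 14.0
```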
Note In mathematical literature, it is more usual to define the inner product [3, 1] in \( C^n \) as \( (\bar{x},\bar{y}) = \sum_{i=1}^n x_iy_i^* \), i.e., with the complex conjugation on the second factor.
Our definition is grounded in Quantum Mechanics, which relies on complex vector spaces and employs Dirac notation, necessitating a clear "physical" interpretation.
In matrix notation, the operation combining transposition and complex conjugation, \( (\mathcal{A}^*)^T = \mathcal{A}^\dagger \), is called the adjoint of the matrix \( \mathcal{A} \). Some authors use other symbols, such as \( \mathcal{A}^* \) or \( \mathcal{A}^H \), or call it the Hermitian adjoint.
It should be distinguished from adjA which is the classical adjoint [3] and represents the transposed matrix of the cofactors of a square matrix A.
Our inner product in \( C^n \), \( (\bar{x},\bar{y}) = \bar{x}^\dagger\bar{y} \), has the following three obvious properties:
1. It is skew (Hermitian) symmetric: \( (\bar{x},\bar{y}) = (\bar{y},\bar{x})^* \); [1]
2. It is linear in the second factor: \( (\bar{z}, a\bar{x} + b\bar{y}) = a(\bar{z},\bar{x}) + b(\bar{z},\bar{y}) \);
3. It is positive definite (strictly positive): \( (\bar{x},\bar{x}) > 0 \) for \( \bar{x} \neq \bar{0} \), and \( (\bar{x},\bar{x}) = 0 \) iff \( \bar{x} = \bar{0} \).
From properties 1 and 2, it follows that this inner product is antilinear [1] (skew-linear) or conjugate linear [3] in the first factor: \( (a\bar{x} + b\bar{y}, \bar{z}) = a^*(\bar{x},\bar{z}) + b^*(\bar{y},\bar{z}) \).
Being antilinear in the first factor and linear in the second one, we say that this inner product is conjugate bilinear: \( (a\bar{x}_1 + b\bar{x}_2, c\bar{y}_1 + d\bar{y}_2) = a^*c(\bar{x}_1,\bar{y}_1) + a^*d(\bar{x}_1,\bar{y}_2) + b^*c(\bar{x}_2,\bar{y}_1) + b^*d(\bar{x}_2,\bar{y}_2) \).
When we want to define an inner product in an arbitrary complex vector space
V(C), we can use the above three properties as postulates:
An inner product in a complex vector space \( V(\mathbb{C}) \) is defined as a complex scalar function that maps pairs of vectors from \( V(\mathbb{C}) \) to the complex numbers, denoted as \( V(\mathbb{C}) \times V(\mathbb{C}) \rightarrow \mathbb{C} \) This function assigns a complex number to each ordered pair of vectors \( [\bar{x}, \bar{y}] \) from \( V(\mathbb{C}) \), satisfying three essential properties.
1. \( (\bar{x},\bar{y}) = (\bar{y},\bar{x})^* \)—skew (or Hermitian or conjugate) symmetry;
2. \( (\bar{z}, a\bar{x} + b\bar{y}) = a(\bar{z},\bar{x}) + b(\bar{z},\bar{y}) \)—linearity in the second factor;
3. \( (\bar{x},\bar{x}) > 0 \) for \( \bar{x} \neq \bar{0} \) and \( (\bar{x},\bar{x}) = 0 \) iff \( \bar{x} = \bar{0} \)—positive definiteness.
(This inner product is obviously antilinear [2] in the first factor: \( (a\bar{x} + b\bar{y}, \bar{z}) = a^*(\bar{x},\bar{z}) + b^*(\bar{y},\bar{z}) \).)
Together with 2, this means that it is conjugate bilinear [4]: \( (a\bar{x}_1 + b\bar{x}_2, c\bar{y}_1 + d\bar{y}_2) = a^*c(\bar{x}_1,\bar{y}_1) + a^*d(\bar{x}_1,\bar{y}_2) + b^*c(\bar{x}_2,\bar{y}_1) + b^*d(\bar{x}_2,\bar{y}_2) \).
As in \( V^n(R) \), we can define an inner product in \( V^n(C) \) by choosing any basis \( v = \{\bar{v}_1, \bar{v}_2, \ldots, \bar{v}_n\} \) and expanding vectors from \( V^n(C) \) in that basis: \( \bar{x} = \sum_{i=1}^n x_i\bar{v}_i \) and \( \bar{y} = \sum_{i=1}^n y_i\bar{v}_i \).
We then take \( (\bar{x},\bar{y})_v = \sum_{i=1}^n x_i^*y_i = x^\dagger y \), where \( x = [x_1\ \ldots\ x_n]^T \) and \( y = [y_1\ \ldots\ y_n]^T \) are the representing columns, as their inner product induced by the basis v in the standard form.
In this inner product, the basis v becomes orthonormal [2, 3, 4], \( (\bar{v}_i,\bar{v}_j)_v = \delta_{ij} \), since \( \bar{v}_i = \sum_{k=1}^n \delta_{ik}\bar{v}_k \) and \( \bar{v}_j = \sum_{k=1}^n \delta_{jk}\bar{v}_k \), so \( (\bar{v}_i,\bar{v}_j)_v = \sum_{k=1}^n \delta_{ik}\delta_{jk} = \delta_{ij} \).
Also orthonormal are all bases in \( V^n(C) \) which are obtained from v by unitary replacement matrices \( \mathcal{R} \), i.e., such that \( \mathcal{R}^{-1} = \mathcal{R}^\dagger \).
To prove this, let us consider changing the basis \( v = \{\bar{v}_1, \bar{v}_2, \ldots, \bar{v}_n\} \) in \( V^n(C) \) to another basis \( v' = \{\bar{v}'_1, \bar{v}'_2, \ldots, \bar{v}'_n\} \) by the invertible replacement matrix \( \mathcal{R} \):
Then the representing columns x and x' of a vector \( \bar{x} \in V^n(C) \) in these two bases are connected by the so-called contragredient matrix [4] \( (\mathcal{R}^{-1})^T \): \( x' = (\mathcal{R}^{-1})^Tx \).
The inner product of vectors \( \bar{x} \) and \( \bar{y} \) in the first basis v is defined as \( (\bar{x},\bar{y})_v = \sum_{i=1}^n x_i^*y_i = x^\dagger y \), where x and y are the column representations of \( \bar{x} \) and \( \bar{y} \) in this basis. Applying the same definition in the second basis v' gives \( (\bar{x},\bar{y})_{v'} = x'^\dagger y' = x^\dagger(\mathcal{R}^*)^{-1}(\mathcal{R}^T)^{-1}y \).
This will be the same number if \( (\mathcal{R}^*)^{-1}(\mathcal{R}^T)^{-1} = I_n \), or \( (\mathcal{R}^T\mathcal{R}^*)^{-1} = I_n \), or \( \mathcal{R}^T\mathcal{R}^* = I_n \), or \( \mathcal{R}^* = (\mathcal{R}^T)^{-1} \), or \( (\mathcal{R}^{-1})^T = \mathcal{R}^* \), or \( \mathcal{R}^{-1} = \mathcal{R}^{*T} = \mathcal{R}^\dagger \). Thus, when the replacement matrix \( \mathcal{R} \) is unitary (\( \mathcal{R}^{-1} = \mathcal{R}^\dagger \)), the definition of the inner product in these two bases will be the same.
In the context of V n (C), all bases can be categorized into distinct classes of unitary equivalent bases This classification arises from the fact that unitary n×n matrices constitute the group U(n), which establishes an equivalence relation among all bases.
The relation is reflexive, symmetric, and transitive (RST), adhering to the fundamental axioms of group theory, which include the existence of identity, inversion, and closed operations for group multiplication Each equivalence class corresponds to a unique inner product in V n (C), and only the bases from that class are orthonormal for that specific inner product.
In the context of inner products in the vector space \( V_n(\mathbb{C}) \), it is important to note that all inner products can be categorized by classes of unitary equivalent bases Each arbitrary inner product in \( V_n(\mathbb{C}) \) defines a unique class of orthonormal bases, establishing a bijection between the inner products and these classes To illustrate this, consider an arbitrary inner product in \( V_n(\mathbb{C}) \) and select any basis from the space By applying the Gram-Schmidt orthonormalization process, one can derive the corresponding orthonormal basis for that inner product.
Our first task is to find the expansion coefficients of any vector \( \bar{x} \in V^n(C) \) in this basis, \( \bar{x} = \sum_{j=1}^n x_j\bar{v}_j \). Multiplying this expansion from the left by \( \bar{v}_i \), \( i = 1, 2, \ldots, n \), we get \( (\bar{v}_i,\bar{x}) = \sum_{j=1}^n x_j(\bar{v}_i,\bar{v}_j) = \sum_{j=1}^n x_j\delta_{ij} = x_i \).
These expansion coefficients(v¯ i ,x¯), i=1,2, ,n, are called the Fourier coeffi- cients of ¯x in this ON basis.
The inner product of two vectors \( \bar{x} \) and \( \bar{y} \) is then calculated as \( (\bar{x},\bar{y}) = \sum_{i=1}^n (\bar{v}_i,\bar{x})^*(\bar{v}_i,\bar{y}) \).
It is called Parseval's identity [4] and can be written as \( (\bar{x},\bar{y}) = x^\dagger y \), where \( x = [(\bar{v}_1,\bar{x})\ \ldots\ (\bar{v}_n,\bar{x})]^T \) and \( y = [(\bar{v}_1,\bar{y})\ \ldots\ (\bar{v}_n,\bar{y})]^T \) are the columns of Fourier coefficients.
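The Fourier coefficients and Parseval's identity can be illustrated numerically; in the sketch below (numpy assumed) the ON basis is taken, for illustration, as the columns of a unitary matrix produced by a QR factorization:

```python
# Fourier coefficients x_i = (v_i, x) in an ON basis, and Parseval's identity.
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
V, _ = np.linalg.qr(M)                   # columns of V: an ON basis of C^4

x = rng.normal(size=4) + 1j * rng.normal(size=4)
y = rng.normal(size=4) + 1j * rng.normal(size=4)

coeff_x = V.conj().T @ x                 # Fourier coefficients (v_i, x)
coeff_y = V.conj().T @ y
print(np.allclose(V @ coeff_x, x))                            # expansion recovers x
print(np.allclose(np.vdot(x, y), np.vdot(coeff_x, coeff_y)))  # Parseval: True
```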
In the context of an orthonormal (ON) basis {v̅1, v̅2, , v̅n} and its unitary equivalence class, the inner product can be expressed in its standard form, demonstrating its dependence on the chosen ON basis.
A complex vector space \( V_n(\mathbb{C}) \) equipped with a specific inner product is referred to as a unitary space \( U_n \) This type of space is also known as a complex inner-product vector space, with \( \mathbb{C}^n \) commonly recognized as a complex Euclidean space.
We have already defined the standard inner product in \( C^n \) as \( (\bar{x},\bar{y}) = \bar{x}^\dagger\bar{y} \), \( \bar{x},\bar{y} \in C^n \). Now, we see that it is defined by the class of bases in \( C^n \) unitary equivalent to the usual (standard) basis \( \{\bar{e}_1, \bar{e}_2, \ldots, \bar{e}_n\} \), where \( \bar{e}_p = [\delta_{1p}\ \delta_{2p}\ \ldots\ \delta_{np}]^T \), \( p = 1, 2, \ldots, n \).
Examples of inner products in other unitary spaces.
A) In the space \( C^{m\times n} \) of complex m×n matrices, the standard inner product is defined as (\( A, B \in C^{m\times n} \)) \( (A,B) = \operatorname{tr}(A^\dagger B) = \sum_{i=1}^m\sum_{j=1}^n a_{ij}^*b_{ij} \).
We can easily verify that this inner product satisfies the 3 axioms in the definition:
\( (aA,B) = \operatorname{tr}[(aA)^\dagger B] = \operatorname{tr}(a^*A^\dagger B) = a^*\operatorname{tr}(A^\dagger B) = a^*(A,B) \)—antilinearity in the first factor;
3. \( (A,A) = \operatorname{tr}(A^\dagger A) = \sum_{i=1}^m\sum_{j=1}^n |a_{ij}|^2 > 0 \) if \( A \neq \hat{0} \), and equal to 0 iff \( A = \hat{0} \).
B) If x(t) and y(t) are polynomials in the vector space P(C) of complex polynomials of the real variable \( t \in [a,b] \), then their inner product is defined as \( (x(t), y(t)) = \int_a^b x^*(t)y(t)\,dt \).
The inner products in the examples provided serve as natural extensions of the standard inner product in C n, transitioning from one index to two indices and from discrete to continuous variables.
Orthonormal Bases and the Gram-Schmidt Procedure
Definition Two nonzero vectors \( \bar{x} \) and \( \bar{y} \) in an inner-product (real or complex) vector space are orthogonal, \( \bar{x}\perp\bar{y} \), iff their inner product is zero: \( (\bar{x},\bar{y}) = 0 \Leftrightarrow \bar{x}\perp\bar{y} \) (see Sect 3.1).
For two orthogonal vectors, we can easily prove the Pythagorean theorem: \( ||\bar{x}+\bar{y}||^2 = ||\bar{x}||^2 + ||\bar{y}||^2 \). Similarly, for an orthogonal set of vectors \( \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k\} \), \( ||\sum_{i=1}^k \bar{x}_i||^2 = \sum_{i=1}^k ||\bar{x}_i||^2 \).
In any inner-product vector space, the zero vector \( \bar{0} \) is orthogonal to every vector, since \( (\bar{x},\bar{0}) = 0 \) for all \( \bar{x} \). This property, unique to the zero vector, is used in many proofs.
Definition The set of vectors \( \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k\} \) in any inner-product vector space is called an orthonormal (ON) set if \( (\bar{x}_i,\bar{x}_j) = \delta_{ij} \), i.e., if each vector has unit norm and is orthogonal to all the other vectors in the set.
An essential property of any orthonormal set of vectors is their linear independence. To demonstrate this, consider an ON set of vectors \( \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k\} \). We need to show that a linear combination of these vectors equals the zero vector only when all coefficients are zero: \( \sum_{i=1}^k a_i\bar{x}_i = \bar{0} \Rightarrow \) all \( a_i = 0 \). Indeed, we multiply the above equality from the left with any \( \bar{x}_j \), \( j = 1, 2, \ldots, k \), and get on one side \( (\bar{x}_j, \sum_{i=1}^k a_i\bar{x}_i) = \sum_{i=1}^k a_i(\bar{x}_j,\bar{x}_i) = \sum_{i=1}^k a_i\delta_{ij} = a_j \), while the other side gives \( (\bar{x}_j,\bar{0}) = 0 \), so \( a_j = 0 \), \( j = 1, 2, \ldots, k \). Thus, every orthonormal ordered set of vectors which also spans its vector space is an orthonormal (ON) basis.
If \( \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k\} \) is an ON set of vectors in a complex inner-product vector space and \( \bar{x} \) is any vector from that space, then Bessel's inequality is valid: \( \sum_{i=1}^k |(\bar{x}_i,\bar{x})|^2 \le ||\bar{x}||^2 \).
In a real inner-product vector space this inequality looks simpler, \( \sum_{i=1}^k (\bar{x}_i,\bar{x})^2 \le ||\bar{x}||^2 \), and the proof is analogous.
We will exclusively utilize orthonormal (ON) bases, as they facilitate obtaining meaningful results It will be demonstrated that any basis can be substituted with an equivalent ON basis.
But, first, let us list some useful formulas already obtained with ON bases.
1. If \( \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n\} \) is any ON basis in a vector space (real or complex) with inner product, then the expansion of any vector \( \bar{x} \) in that basis is \( \bar{x} = \sum_{i=1}^n (\bar{x}_i,\bar{x})\bar{x}_i \).
This expansion is called the Fourier expansion, and the expansion coefficients (x¯ i ,x)¯ are usually called the Fourier coefficients The component(x¯ i ,x)¯ x¯ i is the projection of ¯x along the unit vector ¯x i
2. The inner product of two vectors \( \bar{x} \) and \( \bar{y} \) in this ON basis takes the form \( (\bar{x},\bar{y}) = \sum_{i=1}^n (\bar{x}_i,\bar{x})^*(\bar{x}_i,\bar{y}) \).
This expression is called Parseval’s identity [4].
3. The norm of any vector \( \bar{x} \) in this ON basis is \( ||\bar{x}||^2 = \sum_{i=1}^n |(\bar{x}_i,\bar{x})|^2 \) (in a real vector space \( ||\bar{x}||^2 = \sum_{i=1}^n (\bar{x}_i,\bar{x})^2 \)), which shows that when an orthonormal set is a basis, Bessel's inequality becomes an equality.
Here, we can derive two important inequalities relevant for the norm in inner- product vector spaces.
The first example is the well-known Cauchy–Schwarz inequality:
For any two vectors \( \bar{x} \) and \( \bar{y} \) in an arbitrary inner-product vector space, we have \( |(\bar{x},\bar{y})| \le ||\bar{x}||\,||\bar{y}|| \).
There are different proofs, but the simplest one uses Bessel’s inequality:
If \( \bar{y} = \bar{0} \), then both sides are 0 and we get the trivial equality. So we assume \( \bar{y} \neq \bar{0} \) and then use Bessel's inequality \( \sum_{i=1}^k |(\bar{x}_i,\bar{x})|^2 \le ||\bar{x}||^2 \) for k = 1, \( \bar{x}_1 = \bar{y}/||\bar{y}|| \), to obtain \( |(\bar{y}/||\bar{y}||, \bar{x})|^2 \le ||\bar{x}||^2 \). Multiplying both sides with \( ||\bar{y}||^2 \), we get \( |(\bar{x},\bar{y})|^2 \le ||\bar{x}||^2||\bar{y}||^2 \), which implies the Cauchy–Schwarz inequality.
The second example is the triangle inequality in inner-product vector spaces (which is one of the three basic properties of the norm, besides positive definiteness, \( ||\bar{x}|| > 0 \) if \( \bar{x} \neq \bar{0} \), and homogeneity [3], \( ||a\bar{x}|| = |a|\,||\bar{x}|| \)): \( ||\bar{x}+\bar{y}|| \le ||\bar{x}|| + ||\bar{y}|| \), for any \( \bar{x} \) and \( \bar{y} \).
Proof Using the above Cauchy–Schwarz inequality \( |(\bar{x},\bar{y})| \le ||\bar{x}||\,||\bar{y}|| \): \( ||\bar{x}+\bar{y}||^2 = (\bar{x}+\bar{y},\bar{x}+\bar{y}) = ||\bar{x}||^2 + (\bar{x},\bar{y}) + (\bar{y},\bar{x}) + ||\bar{y}||^2 \le ||\bar{x}||^2 + 2||\bar{x}||\,||\bar{y}|| + ||\bar{y}||^2 = (||\bar{x}|| + ||\bar{y}||)^2 \). [In a complex vector space \( (\bar{x},\bar{y}) + (\bar{y},\bar{x}) = (\bar{x},\bar{y}) + (\bar{x},\bar{y})^* = 2\operatorname{Re}(\bar{x},\bar{y}) \le 2|(\bar{x},\bar{y})| \), and in a real one \( 2(\bar{x},\bar{y}) \le 2|(\bar{x},\bar{y})| \).] Taking the square roots, we get the aforementioned triangle inequality.
The Gram-Schmidt procedure is a method used to transform an arbitrary basis \( X = \{ \bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n \} \) in a real or complex inner-product vector space \( V_n \) into a corresponding orthonormal basis \( Y = \{ \bar{y}_1, \bar{y}_2, \ldots, \bar{y}_n \} \) This orthonormal basis is constructed such that each vector \( \bar{y}_m \) for \( m = 1, 2, \ldots, n \) is a linear combination of the initial vectors \( \{ \bar{x}_1, \bar{x}_2, \ldots, \bar{x}_m \} \), ensuring that \( \bar{y}_m \) belongs to the span of these vectors.
Since X is a basis (linearly independent and spanning the set), it follows that every one of its members is a nonzero vector, so that it can be made a unit vector by dividing it by its norm.
So, we take as \( \bar{y}_1 \) the normalized \( \bar{x}_1 \): \( \bar{y}_1 = \bar{x}_1/||\bar{x}_1|| \). Thus, \( \bar{y}_1 \) is of unit norm and it belongs to \( L(\bar{x}_1) \).
To obtain the second vector \( \bar{y}_2 \) [it must be orthogonal to \( \bar{y}_1 \), of unit norm, and it must belong to \( L(\bar{x}_1,\bar{x}_2) \)], we form the linear combination \( \bar{y}'_2 = \bar{x}_2 - a_1\bar{y}_1 \).
To determine the unknown coefficient \( a_1 \), we apply the orthogonality condition \( (\bar{y}_1,\bar{y}'_2) = 0 \). Multiplying \( \bar{y}'_2 \) from the left with \( \bar{y}_1 \), we obtain \( (\bar{y}_1,\bar{y}'_2) = (\bar{y}_1,\bar{x}_2) - a_1(\bar{y}_1,\bar{y}_1) = (\bar{y}_1,\bar{x}_2) - a_1 \), which is zero when \( a_1 = (\bar{y}_1,\bar{x}_2) \). Since \( \bar{y}'_2 \) is a linear combination of the two linearly independent vectors \( \bar{x}_1 \) and \( \bar{x}_2 \) in which at least one coefficient is nonzero, \( \bar{y}'_2 \) cannot be the zero vector. Thus, we take the normalized vector \( \bar{y}_2 = \bar{y}'_2/||\bar{y}'_2|| \).
The vector \( \bar{y}_2 \) is a unit vector orthogonal to \( \bar{y}_1 \), and it is a linear combination of \( \bar{x}_1 \) and \( \bar{x}_2 \). The vector \( \bar{y}'_2 = \bar{x}_2 - (\bar{y}_1,\bar{x}_2)\bar{y}_1 \), the difference between \( \bar{x}_2 \) and its projection onto the unit vector \( \bar{y}_1 \), is called the normal from \( \bar{x}_2 \) onto the line defined by \( \bar{y}_1 \).
The vector ¯y₂ serves as the unit normal of ¯x₂ within the subspace defined by the previously identified vector ¯y₁, which is part of the desired orthonormal (ON) basis The subspace formed by ¯x₁ and ¯x₂ is equivalent to that spanned by ¯y₁ and ¯y₂; however, ¯y₁ and ¯y₂ are normalized unit vectors that are orthogonal to one another.
To determine \( \bar{y}_3 \) and the further vectors \( \bar{y}_4, \bar{y}_5, \ldots, \bar{y}_n \), we follow the same idea: \( \bar{y}_3 \) is the normalized (unit) normal from \( \bar{x}_3 \) onto the subspace spanned by \( \bar{y}_1 \) and \( \bar{y}_2 \) (the previously found vectors from the ON basis).
The normal is the nonzero vector \( \bar{y}'_3 = \bar{x}_3 - [(\bar{y}_1,\bar{x}_3)\bar{y}_1 + (\bar{y}_2,\bar{x}_3)\bar{y}_2] \), where the bracket is the projection of \( \bar{x}_3 \) onto \( L(\bar{y}_1,\bar{y}_2) = L(\bar{x}_1,\bar{x}_2) \), and the normalized normal is \( \bar{y}_3 = \bar{y}'_3/||\bar{y}'_3|| \). Thus, \( \bar{y}_3 \) is of unit norm, it is obviously orthogonal to both \( \bar{y}_1 \) and \( \bar{y}_2 \) [\( (\bar{y}_1,\bar{y}'_3) = (\bar{y}_1,\bar{x}_3) - (\bar{y}_1,\bar{x}_3) = 0 \), and similarly for \( \bar{y}_2 \)], and it belongs to \( L(\bar{x}_1,\bar{x}_2,\bar{x}_3) \).
For the further vectors of the ON basis, we repeat the same construction: \(\bar{y}_i' = \bar{x}_i - [(\bar{y}_1,\bar{x}_i)\,\bar{y}_1 + (\bar{y}_2,\bar{x}_i)\,\bar{y}_2 + \cdots + (\bar{y}_{i-1},\bar{x}_i)\,\bar{y}_{i-1}]\), \(i = 4,5,\ldots,n\), the nonzero normal from \(\bar{x}_i\) onto \(L(\bar{y}_1,\bar{y}_2,\ldots,\bar{y}_{i-1})\), and \(\bar{y}_i = \bar{y}_i' / \|\bar{y}_i'\|\).
So, \(\bar{y}_i\) is the normalized (unit) normal from the corresponding \(\bar{x}_i\) onto the subspace \(L(\bar{y}_1,\bar{y}_2,\ldots,\bar{y}_{i-1})\) spanned by the previously found vectors.
The Gram–Schmidt procedure for orthonormalization enables us to use only orthonormal bases in every (real or complex) inner-product vector space.
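As a computational illustration of the procedure just described, here is a minimal NumPy sketch (the function name gram_schmidt and the tolerance are our own choices); each new vector is obtained as the normalized normal from \(\bar{x}_i\) onto the span of the previously found unit vectors.

```python
import numpy as np

def gram_schmidt(X):
    """Orthonormalize the columns of X (real or complex): each new vector is the
    normalized normal from x_i onto the span of the previously found y_1, ..., y_{i-1}."""
    Y = []
    for x in X.T:
        # subtract the projection onto every previously found unit vector
        y = x - sum(np.vdot(yk, x) * yk for yk in Y)
        norm = np.linalg.norm(y)
        if norm < 1e-12:
            raise ValueError("input vectors are linearly dependent")
        Y.append(y / norm)
    return np.column_stack(Y)

# quick check: Y^dagger Y should be the identity matrix
X = np.random.rand(4, 4) + 1j * np.random.rand(4, 4)
Y = gram_schmidt(X)
print(np.allclose(Y.conj().T @ Y, np.eye(4)))   # True
```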
The Legendre differential equation is a second-order linear differential equation, \((1-t^2)\,y''(t) - 2t\,y'(t) + \nu(\nu+1)\,y(t) = 0\), where \(\nu\) is a real number. Its solutions are known as Legendre functions and play a significant role in many applications across physics and technology.
The Legendre equation has a polynomial solution if \(\nu\) is an integer. The Legendre polynomial \(P_n(t)\) is the solution of the Legendre equation with parameter \(\nu = n \in \{0,1,2,\ldots\}\) and with the property \(P_n(1) = 1\).
The Legendre polynomial \(P_n(t)\) can be obtained by the so-called Rodrigues formula \(P_n(t) = \frac{1}{2^n n!}\,\frac{d^n}{dt^n}(t^2-1)^n\).
This formula gives immediately \(P_0(t) = 1\) and \(P_1(t) = \frac{1}{2\cdot 1!}\,\frac{d}{dt}(t^2-1) = t\). The rest of the polynomials can be more easily calculated by the recurrence formula \((n+1)\,P_{n+1}(t) = (2n+1)\,t\,P_n(t) - n\,P_{n-1}(t)\).
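For illustration, the recurrence can be evaluated directly in code; the short Python sketch below (the helper legendre is our own, not from the text) also confirms the normalization \(P_n(1) = 1\).

```python
import numpy as np

def legendre(n, t):
    """Evaluate P_n(t) by the recurrence (n+1) P_{n+1} = (2n+1) t P_n - n P_{n-1},
    starting from P_0 = 1 and P_1 = t."""
    p_prev, p = np.ones_like(t), t
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * t * p - k * p_prev) / (k + 1)
    return p

t = np.linspace(-1.0, 1.0, 5)
print(legendre(2, t))                # values of 0.5 * (3 t^2 - 1)
print(legendre(3, np.array(1.0)))    # 1.0, i.e. P_n(1) = 1
```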
If we define an inner product in the vector space \(P(\mathbb{R})\) of real polynomials of a real variable \(t\) on the interval \((-1,+1)\) as \((p,q) = \int_{-1}^{+1} p(t)\,q(t)\,dt\), then we can easily deduce that the Legendre polynomials are orthogonal, \((P_m, P_n) = 0\) for \(m \ne n\)
[assume \(m > n\), use the Rodrigues formula for \(P_n(t)\) and \(P_m(t)\), and integrate \(n\) times by parts]. The square of the norm of \(P_n(t)\) is \(\|P_n(t)\|^2 = \int_{-1}^{+1} P_n^2(t)\,dt = \frac{2}{2n+1}\).
Now, we shall show that the normalized Legendre polynomials \(y_n(t) = \sqrt{\frac{2n+1}{2}}\,P_n(t)\), \(n = 0,1,2,\ldots\), form an orthonormal basis in \(P(\mathbb{R})\), as well as in the Hilbert space \(L^2(-1,+1)\) of square-integrable real functions on the interval \((-1,+1)\).
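This orthonormality is easy to verify numerically; the sketch below is our own illustration, using NumPy's Legendre polynomial class and Gauss–Legendre quadrature for the integral defining the inner product.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss

nodes, weights = leggauss(20)          # exact for polynomial integrands of degree <= 39

def inner(p, q):
    """(p, q) = integral over (-1, +1) of p(t) q(t) dt, via Gauss-Legendre quadrature."""
    return np.sum(weights * p(nodes) * q(nodes))

# normalized Legendre polynomials y_n = sqrt((2n+1)/2) * P_n
y = [np.sqrt((2 * n + 1) / 2) * Legendre.basis(n) for n in range(5)]

gram = np.array([[inner(y[m], y[n]) for n in range(5)] for m in range(5)])
print(np.allclose(gram, np.eye(5)))    # True: the y_n are orthonormal
```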
The polynomials \( P_n(t) \) can be derived using the Gram–Schmidt procedure for orthonormalization from the standard basis \(\{1, t, t^2, t^3, \ldots\}\) in \( P(\mathbb{R}) \) with the specified inner product. To keep the discussion brief, we will derive only the first four polynomials in this sequence.
Let us start with the first four vectors of the standard basis \(X = \{x_0(t), x_1(t), x_2(t), x_3(t)\} = \{1, t, t^2, t^3\}\), and apply the Gram–Schmidt procedure.
The first vector of the corresponding orthonormal basis is obviously \(y_0(t) = x_0(t)/\|x_0(t)\| = 1/\sqrt{2}\), since \(\|x_0(t)\|^2 = \int_{-1}^{+1} 1\,dt = 2\).
For the second vector of the ON basis, we calculate the first normal \(y_1'(t) = x_1(t) - (y_0(t), x_1(t))\,y_0(t) = t - \tfrac{1}{2}\int_{-1}^{+1} t\,dt = t - 0 = t\), and then the normalized normal is \(y_1(t) = y_1'(t)/\|y_1'(t)\|\). Since \(\|y_1'(t)\|^2 = \int_{-1}^{+1} t^2\,dt = \tfrac{2}{3}\), we get \(y_1(t) = \sqrt{\tfrac{3}{2}}\,t\).
For the third vector of the corresponding ON basis, we calculate the normal from \(x_2(t)\) onto the subspace spanned by the previously found ON vectors \(y_0(t)\) and \(y_1(t)\): \(y_2'(t) = x_2(t) - [(y_0(t),x_2(t))\,y_0(t) + (y_1(t),x_2(t))\,y_1(t)] = t^2 - \big[\tfrac{1}{\sqrt{2}}\cdot\tfrac{2}{3}\cdot\tfrac{1}{\sqrt{2}} + 0\big] = t^2 - \tfrac{1}{3} = \tfrac{1}{3}(3t^2-1)\).
[Notice that \(P_2(t) = \tfrac{1}{2}(3t^2-1)\) and \(y_2'(t) = \tfrac{1}{3}(3t^2-1)\) differ by a factor, \(P_2(t) = \tfrac{3}{2}\,y_2'(t)\) (collinear vectors), but their normalized vectors must be the same.] The normalized normal is \(y_2(t) = y_2'(t)/\|y_2'(t)\|\), but it is more practical to calculate the square of the norm \(\|y_2'(t)\|^2 = \int_{-1}^{+1}\big(t^2 - \tfrac{1}{3}\big)^2 dt = \tfrac{8}{45}\), so that \(y_2(t) = \sqrt{\tfrac{5}{2}}\cdot\tfrac{1}{2}(3t^2-1)\).
For the fourth vector from the corresponding orthonormal basis, we first calculate the normal from \(x_3(t)\) onto the subspace \(L(y_0(t), y_1(t), y_2(t))\) spanned by the previously found ON vectors \(y_0(t), y_1(t), y_2(t)\): \(y_3'(t) = x_3(t) - [(y_0(t),x_3(t))\,y_0(t) + (y_1(t),x_3(t))\,y_1(t) + (y_2(t),x_3(t))\,y_2(t)] = t^3 - \big[0 + \tfrac{3}{2}\cdot\tfrac{2}{5}\,t + 0\big] = t^3 - \tfrac{3}{5}t = \tfrac{1}{5}(5t^3-3t)\).
[Again, \(P_3(t) = \tfrac{1}{2}(5t^3-3t)\) and \(y_3'(t) = \tfrac{1}{5}(5t^3-3t)\) are collinear vectors, \(P_3(t) = \tfrac{5}{2}\,y_3'(t)\).] The square of the norm of \(y_3'(t)\) is \(\|y_3'(t)\|^2 = \int_{-1}^{+1}\big(t^3 - \tfrac{3}{5}t\big)^2 dt = \tfrac{8}{175}\), so the normalized normal is \(y_3(t) = \sqrt{\tfrac{7}{2}}\cdot\tfrac{1}{2}(5t^3-3t) = \sqrt{\tfrac{7}{2}}\,P_3(t)\), and it is exactly the normalized Legendre polynomial.
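The same calculation can be automated symbolically. The following SymPy sketch (the helper names are our own) repeats the Gram–Schmidt steps above for the basis \(\{1, t, t^2, t^3\}\) and recovers polynomials equal to \(\sqrt{(2n+1)/2}\,P_n(t)\).

```python
import sympy as sp

t = sp.symbols('t')

def inner(p, q):
    """(p, q) = integral over (-1, +1) of p(t) q(t) dt."""
    return sp.integrate(p * q, (t, -1, 1))

basis = [sp.Integer(1), t, t**2, t**3]
ortho = []
for x in basis:
    y = x - sum(inner(yk, x) * yk for yk in ortho)   # normal onto span of previous y's
    ortho.append(sp.simplify(y / sp.sqrt(inner(y, y))))

print(ortho)
# first entries: sqrt(2)/2, sqrt(6)*t/2, ...; each equals sqrt((2n+1)/2) * P_n(t) up to form
```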
3.4 Direct and Orthogonal Sums of Subspaces and the Orthogonal Complement of a Subspace
3.4.1 Direct and Orthogonal Sums of Subspaces
Consider two subspaces \(V_1\) and \(V_2\) of the vector space \(V\) which have in common only the zero vector \(\bar{0}\): \(V_1 \cap V_2 = \{\bar{0}\}\).
The set of all vectors \(\bar{x}\) from \(V\) that can be expressed in the form \(\bar{x} = \bar{x}_1 + \bar{x}_2\), \(\bar{x}_1 \in V_1\), \(\bar{x}_2 \in V_2\), is called the direct sum of \(V_1\) and \(V_2\) and is denoted as \(V_1 + V_2\).
Obviously, the set \(V_1 + V_2\) is a subspace itself, since it is closed under the addition of its vectors, as well as under the multiplication of its vectors by scalars: taking \(\bar{x}_1 + \bar{x}_2\) and \(\bar{y}_1 + \bar{y}_2\) from \(V_1 + V_2\), we have \((\bar{x}_1 + \bar{x}_2) + (\bar{y}_1 + \bar{y}_2) = (\bar{x}_1 + \bar{y}_1) + (\bar{x}_2 + \bar{y}_2) \in V_1 + V_2\) and \(a(\bar{x}_1 + \bar{x}_2) = a\bar{x}_1 + a\bar{x}_2 \in V_1 + V_2\).
In the subspace \(V_1 + V_2\), each vector \(\bar{x}\) can be uniquely expressed as the sum \(\bar{x} = \bar{x}_1 + \bar{x}_2\), where \(\bar{x}_1 \in V_1\) and \(\bar{x}_2 \in V_2\), precisely because the intersection \(V_1 \cap V_2\) equals \(\{\bar{0}\}\). If instead \(V_1 \cap V_2 \ne \{\bar{0}\}\), any nonzero vector \(\bar{y}\) from this intersection would lead to a non-unique decomposition of \(\bar{x}\), since \(\bar{x}\) could also be represented as \(\bar{x} = (\bar{x}_1 + \bar{y}) + (\bar{x}_2 - \bar{y})\). Therefore, the uniqueness of the decomposition of the vector \(\bar{x}\) is guaranteed only when \(V_1 \cap V_2 = \{\bar{0}\}\).
1. In the space \(\mathbb{R}^3\), consider a plane (\(\mathbb{R}^2\)) through the origin and a line (\(\mathbb{R}^1\)) also through the origin, but not lying in the plane. Obviously, \(\mathbb{R}^3\) is the direct sum of this plane and this line.
2. The space \(\mathbb{R}^{n \times n}\) of all square \(n \times n\) real matrices is the direct sum of the subspace of all symmetric (\(A^T = A\)) matrices and the subspace of all skew-symmetric (\(A^T = -A\)) matrices. The set of all symmetric matrices is indeed a subspace, since it is closed under the addition of matrices [\((A+B)^T = A^T + B^T = A + B\)], as well as under the multiplication of matrices by numbers [\((aA)^T = aA^T = aA\)]. And similarly for the set of skew-symmetric matrices (\(A^T = -A\)).
Each matrix \(A \in \mathbb{R}^{n \times n}\) can be uniquely written as the sum of a symmetric and a skew-symmetric matrix: \(A = \frac{1}{2}(A + A^T) + \frac{1}{2}(A - A^T)\).
It is also obvious that only the zero matrix can be symmetric and skew-symmetric at the same time.
The dimension of the space \( \mathbb{R}^{n \times n} \) is \( n^2 \), as its standard basis comprises \( n^2 \) matrices, each having a single element equal to 1 and all other elements zero. In the subspace of symmetric matrices, the standard basis includes \( n \) matrices with one element equal to 1 on the main diagonal, alongside \( (n-1)n/2 \) matrices featuring two elements equal to 1, symmetrically placed with respect to the main diagonal, reflecting the property \( a_{ij} = a_{ji} \). The standard basis for the subspace of skew-symmetric matrices consists of \( (n-1)n/2 \) matrices that have 1 and \(-1\) symmetrically positioned around the main diagonal, in accordance with the defining characteristics of skew-symmetric matrices, \( a_{ij} = -a_{ji} \) and \( a_{ii} = 0 \).
So, the sum of the dimensions of the two subspaces, \(n + \frac{n(n-1)}{2} + \frac{n(n-1)}{2} = n + n(n-1) = n^2\), is equal to the dimension of the space \(\mathbb{R}^{n \times n}\), which is the direct sum of these subspaces. \(\Delta\)
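A short NumPy check of this decomposition and of the dimension count (the variable names are our own illustration, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))

S = (A + A.T) / 2          # symmetric part,      S^T =  S
K = (A - A.T) / 2          # skew-symmetric part, K^T = -K

print(np.allclose(A, S + K))                        # True: A = S + K
print(np.allclose(S, S.T), np.allclose(K, -K.T))    # True True

# dimension count: n(n+1)/2 + n(n-1)/2 = n^2
print(n * (n + 1) // 2 + n * (n - 1) // 2 == n * n)  # True
```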
This statement is generally valid:
The dimension of the subspace \(V_1 + V_2\) is equal to the sum of the dimensions of \(V_1\) and \(V_2\), \(\dim(V_1 + V_2) = \dim V_1 + \dim V_2\), again due to \(V_1 \cap V_2 = \{\bar{0}\}\).
To prove this almost obvious statement, we choose a basis \(\{\bar{f}_1, \bar{f}_2, \ldots, \bar{f}_m\}\) in \(V_1\) (\(\dim V_1 = m\)) and a basis \(\{\bar{g}_1, \bar{g}_2, \ldots, \bar{g}_n\}\) in \(V_2\) (\(\dim V_2 = n\)), and show that the set
\(B = \{\bar{f}_1, \bar{f}_2, \ldots, \bar{f}_m, \bar{g}_1, \bar{g}_2, \ldots, \bar{g}_n\}\) is a basis in \(V_1 + V_2\): a linearly independent and generating set, so that \(\dim(V_1 + V_2) = m + n\). To prove that these vectors are linearly independent (LIND), we make the LIND test \(a_1\bar{f}_1 + a_2\bar{f}_2 + \cdots + a_m\bar{f}_m + b_1\bar{g}_1 + b_2\bar{g}_2 + \cdots + b_n\bar{g}_n = \bar{0}\).
By transferring the linear combination with the coefficients \(b_s\) to the other side of the equation, we see that a vector from \(V_1\) equals a vector from \(V_2\), so both must be the zero vector \(\bar{0}\), since \(V_1 \cap V_2 = \{\bar{0}\}\). This immediately implies that all coefficients \(a_s\) and \(b_s\) are zero, because \(\{\bar{f}_s\}\) and \(\{\bar{g}_s\}\) are bases. Furthermore, the set \(B\) is a generating set for
\(V_1 + V_2\), since every \(\bar{x} \in V_1 + V_2\), i.e., \(\bar{x} = \bar{x}_1 + \bar{x}_2\), \(\bar{x}_1 \in V_1\), \(\bar{x}_2 \in V_2\), is a linear combination of its elements. \(\Delta\)
In an inner-product vector space \(V\), two subspaces \(V_1\) and \(V_2\) are called orthogonal, denoted \(V_1 \perp V_2\), if the inner product \((\bar{x}, \bar{y})\) equals zero for all vectors \(\bar{x}\) in \(V_1\) and \(\bar{y}\) in \(V_2\). This implies that the only common vector is the zero vector (a common vector would be orthogonal to itself, hence zero), so \(V_1 \cap V_2 = \{\bar{0}\}\). The set of vectors in \(V\) that can be expressed as \(\bar{x}_1 + \bar{x}_2\), where \(\bar{x}_1 \in V_1\), \(\bar{x}_2 \in V_2\), and \((\bar{x}_1, \bar{x}_2) = 0\), forms the orthogonal sum, denoted \(V_1 \oplus V_2\). This orthogonal sum has all the properties of the direct sum, with the added requirement that the components are orthogonal; in particular, the decomposition \(\bar{x} = \bar{x}_1 + \bar{x}_2\) is unique and \(\dim(V_1 \oplus V_2) = \dim V_1 + \dim V_2\).
The concepts of direct and orthogonal sums can be applied to any finite collection of subspaces of a vector space \(V\). If \(V^{(1)}, V^{(2)}, \ldots, V^{(k)}\) is a set of mutually disjoint subspaces of \(V\), meaning that the pairwise intersections contain only the zero vector (\(V^{(i)} \cap V^{(j)} = \{\bar{0}\}\) for \(i \ne j\)), then we can form their direct sum \(V^{(1)} + V^{(2)} + \cdots + V^{(k)}\): it consists of all vectors that can be expressed as a sum of \(k\) vectors, one from each of the subspaces, and each vector in this direct sum has unique components in these subspaces.
If \(V\) is an inner-product vector space and if all the subspaces are orthogonal to each other, \(V^{(i)} \perp V^{(j)}\), \(i \ne j\), \(i,j = 1,2,\ldots,k\), then the above direct sum becomes the orthogonal sum \(V^{(1)} \oplus V^{(2)} \oplus \cdots \oplus V^{(k)}\).
A key objective in the theory of unitary spaces, particularly in its applications to Quantum Mechanics, is to break down the entire state space into a specific orthogonal sum of its subspaces.
3.4.2 The Orthogonal Complement of a Subspace
In the context of a real or complex inner-product vector space V, the orthocomplement of a subspace W, denoted as W ⊥, is defined as the set of all vectors in V that are orthogonal to every vector in W.
The orthocomplement \(W^\perp\) is a subspace of \(V\), as follows from the linearity of the inner product in \(V\) (in both real and complex vector spaces). To see this, consider any two vectors \(\bar{x}\) and \(\bar{y}\) from \(W^\perp\).
It is enough to show that any linear combination \(a\bar{x} + b\bar{y}\), where \(a\) and \(b\) are scalars from \(\mathbb{R}\) or \(\mathbb{C}\), also belongs to \(W^\perp\). Since \(\bar{x}\) and \(\bar{y}\) are elements of \(W^\perp\), we have \((\bar{z}, \bar{x}) = 0\) and \((\bar{z}, \bar{y}) = 0\) for every vector \(\bar{z}\) in \(W\). Consequently, \((\bar{z}, a\bar{x} + b\bar{y}) = a(\bar{z}, \bar{x}) + b(\bar{z}, \bar{y}) = 0\), confirming that \(a\bar{x} + b\bar{y}\) is indeed in \(W^\perp\). Furthermore, the orthocomplement of \(W^\perp\) is \(W\) itself: \((W^\perp)^\perp = W\).
To establish that \(W\) is a subspace of \((W^\perp)^\perp\), we note that every vector in \(W\) is orthogonal to all vectors in \(W^\perp\), so it belongs to \((W^\perp)^\perp\). To prove their equality, we compare their dimensions: if the dimensions are equal, then \(W = (W^\perp)^\perp\). We show below that the dimensions of a subspace and its orthocomplement add up to the dimension of the whole space \(V\), which gives \(\dim W + \dim W^\perp = \dim V\) and \(\dim W^\perp + \dim (W^\perp)^\perp = \dim V\). Thus \(\dim (W^\perp)^\perp = \dim V - \dim W^\perp = \dim V - [\dim V - \dim W] = \dim W\).
Theorem. Every vector \(\bar{v} \in V\) can be written in one and only one way as the sum \(\bar{v} = \bar{w} + \bar{w}'\), \(\bar{w} \in W\) and \(\bar{w}' \in W^\perp\) [of course, \((\bar{w}, \bar{w}') = 0\)].
In other words, we say that \(W\) and \(W^\perp\) are orthogonally added to form \(V\): \(V = W \oplus W^\perp\).
As a consequence, dim(W) +dim(W ⊥ ) =dim(V).
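As a numerical illustration of this dimension relation, here is a small sketch of our own that uses the SVD to obtain an orthonormal basis of \(W^\perp\) (the helper name is an assumption, not from the text):

```python
import numpy as np

def orthocomplement(W):
    """Orthonormal basis of W-perp for the subspace spanned by the columns of W.
    The left singular vectors belonging to (numerically) zero singular values
    are orthogonal to the column space of W."""
    U, s, _ = np.linalg.svd(W, full_matrices=True)
    rank = int(np.sum(s > 1e-12))
    return U[:, rank:]

rng = np.random.default_rng(2)
W = rng.standard_normal((5, 2))        # a 2-dimensional subspace of R^5
Wp = orthocomplement(W)

print(W.shape[1] + Wp.shape[1])        # 5 = dim(W) + dim(W-perp) = dim(V)
print(np.allclose(W.T @ Wp, 0.0))      # True: every basis vector of W-perp is orthogonal to W
```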
To establish an orthonormal basis in the subspace \(W\), we apply the Gram–Schmidt procedure to any chosen basis, obtaining an equivalent orthonormal set \(\{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n\}\), where \(n\) is the dimension of \(W\). Then, for every vector \(\bar{v}\) in \(V\), we form the vector \(\bar{w} = \sum_{k=1}^{n} (\bar{x}_k, \bar{v})\,\bar{x}_k\)
( ¯w is the sum of projections of vector ¯v along n unit orthogonal vectors which make the ON basis in W ).
The vector \(\bar{w}\) is a linear combination of the basis vectors of the subspace \(W\), so it belongs to \(W\). The vector \(\bar{w}' = \bar{v} - \bar{w}\) is orthogonal to all basis vectors of \(W\) [\((\bar{x}_m, \bar{w}') = (\bar{x}_m, \bar{v}) - (\bar{x}_m, \bar{v}) = 0\)], which means it belongs to the orthogonal complement \(W^\perp\).
The components \(\bar{w}\) and \(\bar{w}'\) of \(\bar{v}\) are unique, since it can be shown that they do not depend on the particular choice of ON basis \(\{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n\}\) in \(W\).
To show this uniqueness, consider another ON basis in \(W\): \(\bar{y}_m = \sum_{k=1}^{n} r_{mk}\,\bar{x}_k\), \(m = 1,2,\ldots,n\), where \(R = [r_{mk}]_{n \times n}\).
In a complex inner-product vector space, the condition for the set \(\{\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_n\}\) to be an ON basis is \(R^\dagger R = I_n\), while in a real vector space it is \(R^T R = I_n\). Computing the projection of \(\bar{v}\) onto \(W\) with this basis gives \(\bar{w}_{\mathrm{new}} = \sum_{m=1}^{n} (\bar{y}_m, \bar{v})\,\bar{y}_m = \sum_{k,l=1}^{n}\Big[\sum_{m=1}^{n} r_{mk}^{*}\,r_{ml}\Big](\bar{x}_k, \bar{v})\,\bar{x}_l = \sum_{k=1}^{n} (\bar{x}_k, \bar{v})\,\bar{x}_k = \bar{w}\), since \(\sum_m r_{mk}^{*} r_{ml} = (R^\dagger R)_{kl} = \delta_{kl}\); so the projection is indeed basis independent.
We have already called the component ¯w of ¯v the projection of ¯v into the subspace
W , and, as just shown, it is obtained as the sum of projections of ¯v along n unit vectors of any ON basis in W
The other component \(\bar{w}' = \bar{v} - \bar{w} \in W^\perp\) is called the normal of the vector \(\bar{v}\) onto the subspace \(W\), since it is orthogonal to all vectors in \(W\).
The length of the normal \(\bar{w}'\) represents the shortest distance from the vector \(\bar{v}\) to the subspace \(W\): precisely, if \(\bar{z}\) is any vector in \(W\) distinct from \(\bar{w}\), then \(\|\bar{w}'\| = \|\bar{v} - \bar{w}\| < \|\bar{v} - \bar{z}\|\).
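Finally, a small NumPy sketch of the decomposition \(\bar{v} = \bar{w} + \bar{w}'\) and of the shortest-distance property (the names are illustrative assumptions, not from the text); the QR factorization is used only to produce an orthonormal basis of a subspace \(W\).

```python
import numpy as np

rng = np.random.default_rng(3)

# an orthonormal basis of a 2-dimensional subspace W of R^5 (columns of X)
X, _ = np.linalg.qr(rng.standard_normal((5, 2)))
v = rng.standard_normal(5)

w = X @ (X.T @ v)          # projection of v into W: sum_k (x_k, v) x_k
w_perp = v - w             # normal of v onto W, lies in W-perp

print(np.allclose(X.T @ w_perp, 0.0))                    # True: w' is orthogonal to W
z = X @ rng.standard_normal(2)                           # some other vector of W
print(np.linalg.norm(v - w) <= np.linalg.norm(v - z))    # True: the normal is the shortest distance
```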