In mathematics, a proof is a demonstration that, assuming certain axioms, some statement is necessarily true.
A proof is a logical argument, not an empirical one.
One must demonstrate that a proposition is true in all cases before it is considered a theorem of mathematics. An unproven proposition for which there is some sort of empirical evidence is known as a conjecture.
There are five ways of forming new statements from statements P and Q.
Conjunction (P and Q)
Denoted \(P\wedge Q\); true only if both statements are true, and false otherwise.

| P | Q | \(P\wedge Q\) |
|---|---|---|
| T | T | T |
| T | F | F |
| F | T | F |
| F | F | F |
Disjunction (P or Q)
Denoted \(P\vee Q\); true if either statement is true, and false otherwise.

| P | Q | \(P\vee Q\) |
|---|---|---|
| T | T | T |
| T | F | T |
| F | T | T |
| F | F | F |
Negation (Not P)
Denoted \(\neg P\); true if the statement is false, and false otherwise.

| P | \(\neg P\) |
|---|---|
| T | F |
| F | T |
Conditional Connective (If P then Q; P only if Q; Q if P)
Denoted \(P\to Q\); false only in the single case where P (the condition) is true and Q (the result) is false, and true otherwise.
P is called the antecedent and Q is called the consequent.
Think of \(P\to Q\) as a promise that Q is true whenever P is true. When P is false, the promise is kept by default.
Example: suppose your friend promises "if it is sunny tomorrow, I will ride my bike". We say the statement is true if they keep their promise. If it rains and they don't ride their bike, most people would agree that they have still kept their promise.

| P | Q | \(P\to Q\) |
|---|---|---|
| T | T | T |
| T | F | F |
| F | T | T |
| F | F | T |
Variations of \(P\to Q\)
Converse: \(Q\to P\)
Inverse: \(\neg P\to \neg Q\) (If not P, then not Q)
Contrapositive: \(\neg Q \to \neg P\)

| P | Q | \(P\to Q\) | \(Q\to P\) | \(\neg P \to \neg Q\) | \(\neg Q \to \neg P\) |
|---|---|---|---|---|---|
| T | T | T | T | T | T |
| T | F | F | T | T | F |
| F | T | T | F | F | T |
| F | F | T | T | T | T |

Note that \(P\to Q\) always has the same truth value as its contrapositive \(\neg Q\to\neg P\), while the converse and inverse may differ from it.
Biconditional (P if and only if Q)
Denoted \(P \leftrightarrow Q\); true if P and Q have the same truth value, and false otherwise.

| P | Q | \(P\leftrightarrow Q\) |
|---|---|---|
| T | T | T |
| T | F | F |
| F | T | F |
| F | F | T |
Compound Logical Statements
More complicated compound statements can be formed from the five connectives above.
Use parentheses to avoid ambiguity.
Convention: \(\neg\) takes precedence over the other four operations.
Example
\((P\to R)\wedge (Q \vee \neg R)\)
Steps: copy the values of P, Q, R; evaluate the negation \(\neg R\); then evaluate \(P\to R\) and \(Q \vee \neg R\); finally evaluate \(\wedge\).
By truth table:

| P | Q | R | \(P\to R\) | \(\neg R\) | \(Q\vee\neg R\) | \((P\to R)\wedge (Q \vee \neg R)\) |
|---|---|---|---|---|---|---|
| T | T | T | T | F | T | T |
| T | T | F | F | T | T | F |
| T | F | T | T | F | F | F |
| T | F | F | F | T | T | F |
| F | T | T | T | F | T | T |
| F | T | F | T | T | T | T |
| F | F | T | T | F | F | F |
| F | F | F | T | T | T | T |
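Such truth tables can also be generated mechanically. Below is a minimal Python sketch (the function name is ours, not from the notes) that enumerates all eight assignments and evaluates the statement, reproducing the final column above; \(P\to R\) is encoded as \((\neg P)\vee R\).

```python
from itertools import product

def truth_table():
    # Enumerate all 2^3 assignments for P, Q, R and evaluate
    # (P -> R) and (Q or not R); P -> R is (not P) or R.
    print("P Q R | (P->R) and (Q or not R)")
    for P, Q, R in product([True, False], repeat=3):
        value = ((not P) or R) and (Q or (not R))
        print(P, Q, R, "|", value)

truth_table()
```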
Applications
Translating natural language into mathematical symbols.
Example
"You can access the Internet from campus only if you are a computer science major or you are not a freshman."
P = "You can access the Internet from campus"
Q = "You are a computer science major"
R = "You are a freshman"
Recall that \(P\to Q\) reads "P only if Q" (equivalently, "Q if P"), so the sentence becomes:
\(P\to (Q\vee \neg R)\)
"You cannot ride the roller coaster if you are under 4 feet tall unless you are older than 16 years old."
P = "You can ride the roller coaster"
Q = "You are under 4 feet tall"
R = "You are older than 16 years old"
\(\neg P\) = "You cannot ride the roller coaster" --> this is the result.
The condition for the result is "you are under 4 feet tall and you are not older than 16"; in particular, if you are under 4 feet tall but older than 16, you can still ride the roller coaster.
\((Q\wedge \neg R)\to \neg P\)
Tautology and Contradiction
A tautology is a statement that is true by logical necessity, regardless of whether the component statements are true or false.
Example
\(P\vee (\neg P)\)

| P | \(\neg P\) | \(P\vee \neg P\) |
|---|---|---|
| T | F | T |
| F | T | T |

A contradiction is a statement that is always false by logical necessity, regardless of whether the component statements are true or false.
Example
\(P\wedge (\neg P)\)

| P | \(\neg P\) | \(P\wedge \neg P\) |
|---|---|---|
| T | F | F |
| F | T | F |
Try to show whether the following statement is a tautology, a contradiction, or neither:
\(((P\wedge Q)\to R)\to (P\to(Q\to R))\)

| P | Q | R | \(P\wedge Q\) | \((P\wedge Q)\to R\) | \(Q\to R\) | \(P\to(Q\to R)\) | whole statement |
|---|---|---|---|---|---|---|---|
| T | T | T | T | T | T | T | T |
| T | T | F | T | F | F | F | T |
| T | F | T | F | T | T | T | T |
| T | F | F | F | T | T | T | T |
| F | T | T | F | T | T | T | T |
| F | T | F | F | T | F | T | T |
| F | F | T | F | T | T | T | T |
| F | F | F | F | T | T | T | T |

The final column is T in every row, so the statement is a tautology.
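A quick mechanical cross-check in the same Python style (a sketch, not part of the original notes) confirms this:

```python
from itertools import product

# "->" is encoded as (not a) or b.
imp = lambda a, b: (not a) or b
is_tautology = all(
    imp(imp(P and Q, R), imp(P, imp(Q, R)))
    for P, Q, R in product([True, False], repeat=3)
)
print(is_tautology)  # True: the statement is a tautology
```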
1.2 Relations Between
Statements
Meta Statement
A meta statement is a logical statement about
logical statements.
Example
If the statement "Ethel is tall and Agnes is short" is true, then the
statement "Ethel is tall" is true.
Implication
For two compound statements P and Q, an implication \(P\implies Q\) states that Q is true whenever P is true. To verify an implication, check that \(P\to Q\) is a tautology.
Example
My thumb will hurt if I hit it with a hammer.
\(x=2\) implies \(x+1=3\)
\(\neg (P\to Q)\implies P\vee Q\)
To verify the last one, we only need to check whether \((\neg (P\to Q))\to(P\vee Q)\) is a tautology by truth table.
Remark:
The difference between \(P\to Q\) and \(P\implies Q\):
\(P\to Q\) is a compound statement that may or may not be true.
\(P\implies Q\) is a relation stating that the compound statement \(P\to Q\) is true under all instances of the component propositions.
Be careful not to allow contradictions in
logical arguments because, starting from a contradiction, anything can
be proven true.
Example
\(P\wedge \neg P\implies Q\) is a valid logical implication, even though Q does not appear on the LHS.
Thus, a contradiction in your assumptions can lead to a "correct" proof of an arbitrary statement.
Logical implication is NOT always reversible.
Example
\(\neg (P\to Q)\implies P\vee Q\)
But \(P\vee Q\) does not imply \(\neg (P\to Q)\): when P and Q are both true, \(P\vee Q\) is true but \(\neg (P\to Q)\) is false.
An equivalence \(P\Longleftrightarrow Q\) states that P is true if and only if Q is true. To verify it, check that the biconditional \(P\longleftrightarrow Q\) is a tautology.
Use quantifiers to express the definition of the limit of a real-valued function \(f(x)\) of a real variable \(x\) at a point \(a\) in its domain, and express when \(\lim_{x\to a}f(x)\) does not exist.
Sol:
Recall that the statement \(\lim_{x\to a}f(x)=L\) means "for every real number \(\epsilon>0\) there exists a real number \(\delta>0\) such that \(|f(x)-L|<\epsilon\) whenever \(0<|x-a|<\delta\)".
Using quantifiers: \(\exist L,\forall \epsilon>0,\exist \delta>0,\forall x\in D:(|x-a|<\delta\implies|f(x)-L|<\epsilon)\)
Note that by negation of implication: \(\neg[|x-a|<\delta\implies|f(x)-L|<\epsilon]\Longleftrightarrow |x-a|<\delta\wedge|f(x)-L|\geq\epsilon\)
So the limit does not exist when \(\forall L,\exist \epsilon>0,\forall\delta>0,\exist x\in D:(|x-a|<\delta\wedge|f(x)-L|\geq\epsilon)\).
1.4 Strategies for Proofs
A mathematical proof is a convincing argument that
starts from the premises, and logically deduces the
desired conclusion.
A theorem is a statement that can be proved on the
basis of explicitly stated or previously agreed assumptions.
A proposition is a statement not associated
with any particular theorem; this term sometimes connotes a
statement with a simple proof.
A lemma is a proven proposition
which is used as a stepping stone to a larger result
rather than an independent statement in itself.
A corollary is a mathematical statement which
follows easily from a previously proven statement, typically a
mathematical theorem.
Direct
Assume P is true and give steps that lead to Q:
If P is true
... some arguments ...
then Q is true
Contrapositive
Proof of equivalent statement \(\neg Q \to
\neg P\)
\(P\to Q \Longleftrightarrow\neg Q\to\neg
P\)
If \(\neg Q\) is true
... some argument ...
\(\neg P\) is true
\(P\to Q\) is true
Example
Let n be an integer. Prove that if \(n^2\) is odd, then \(n\) is odd.
Sol: Denote P = "\(n^2\) is odd", Q = "\(n\) is odd"; then \(\neg Q\) = "n is even" and \(\neg P\) = "\(n^2\) is even". By proving "if \(\neg Q\) then \(\neg P\)", we prove \(P\to Q\).
Suppose n is even, i.e. \(\exist k \in \N:n=2k\)
\(\implies n^2=(2k)^2=4k^2=2(2k^2)\)
\(\implies n^2\) is an even number
\(\implies \neg P\)
This establishes \(\neg Q\to\neg P\), which is equivalent to \(P\to Q\).
Contradiction
Using \(\neg(P\to Q)\Longleftrightarrow
P\wedge\neg Q\), one supposes that both P and \(\neg Q\) are true and then gives steps
leading to a contradiction.
Since \((P\to Q)\Longleftrightarrow\neg
(\neg (P\to Q))\)
To prove \(P\to Q\) is true is to
prove \(\neg(P\to Q)\) is false.
Since \(\neg(P\to Q)\Longleftrightarrow
\neg(\neg P \vee Q)\Longleftrightarrow P\wedge\neg Q\)
To prove \(P\to Q\) is true is to
prove \(P\wedge\neg Q\) is false.
Then assume \(P\wedge\neg Q\) holds and derive a contradiction.
Example
Prove that if \(x\in \R, x^2=2\), then \(x\) is an irrational number.
Sol:
P = "\(x\in \R, x^2=2\)", Q = "x is an irrational number"
To prove the statement true, we prove \(P\wedge\neg Q\) is false.
Suppose \(x\in \R, x^2=2\) and x is a rational number:
\(\implies \exist n,m\in \Z:x=\frac{n}{m}\), \(m\neq 0\), where m and n have no common factors.
Then \(x^2=\frac{n^2}{m^2}=2\implies n^2=2m^2\), so \(n^2\) is even, hence n is even, say \(n=2k\).
Then \(4k^2=2m^2\implies m^2=2k^2\), so m is also even. Thus m and n share the common factor 2, a contradiction. Hence \(P\wedge\neg Q\) is false, and the statement is proved.
1.5 Set Theory
Basics
A set is an unordered collection of
distinct objects, called elements or members of the
set.
\(a\in A\): A is a set and a is an
element of A
\(a\notin A\): A is a set and a is
not an element of A
Element order is irrelevant
Repeated elements have no effect
Singleton: a set containing exactly
one element.
Empty set \(\emptyset\): (null set) the set
that does not have any elements in it
Standard sets: Integers \(\Z\), Rational numbers \(\Q\), Real numbers \(\R\), Complex numbers \(\C\)
Bounding
Open bounded interval: \((a,b)=\{ x\in \R|a<x<b\}\)
Closed bounded interval: \([a,b]=\{x\in \R|a\leq x\leq b\}\)
Half-open bounded: \((a,b]=\{x\in \R|a<x \leq b\}\) or \([a,b)=\{x\in \R|a\leq x<b\}\)
Open unbounded: \((a,\infty)=\{x\in \R|a<x\}\) or \((-\infty,b)=\{x\in \R|x<b\}\) or \((-\infty,\infty)=\R\)
Closed unbounded: \([a,\infty)=\{x\in \R|x\geq a\}\) or \((-\infty,b]=\{x\in \R|x\leq b\}\)
Subset
Set A is a subset of set B: if \(x\in A\implies x\in B\), we write \(A\subseteq B\)
If set A is not a subset of set B: \(A\not\subseteq B\)\[
A\subseteq B \Longleftrightarrow \forall x((x\in A)\to(x\in B))
\]Example
If \(A=\{a,b,c\}\) then \(a\in A, \{a\}\subseteq A\)
If \(B=\{\{a\},b,c\}\) then \(\{\{a\}\}\subseteq B, \{a\}\in B\)
Lemma: Let A, B and C be sets
\(A\subseteq A\)
\(\emptyset\subseteq A\)
If \(A\subseteq B\), and \(B\subseteq C\), then \(A\subseteq C\)
Equivalence and Proper
Subset
Let A and B be sets:
\(A=B\): If \(A\subseteq B, B\subseteq A\), set A
equals the set B.
\(A\subsetneq B\): If \(A\subseteq B, A\not= B\), set A is a
proper subset of set B \[
A=B\Longleftrightarrow\forall x((x\in A)\longleftrightarrow (x\in B))
\]
Lemma: Let A, B and C be sets
A = A
If A = B, then B = A
If A = B, and B = C, then A = C
Cardinality
For set A, the cardinality |A| is the number of
elements in A.
Countably Infinite: If there is a one-to-one
correspondence between A and the natural numbers \(\N\), then A is called countably
infinite and |A| = \(\infty\)
Example
Set of rational numbers \(\Q=\{\frac{p}{q}|p\in \Z,q\in\N\}\), is
countably infinite.
Uncountably Infinite: If |A| = \(\infty\) but A is not countably infinite, then A is uncountably infinite.
Example
The set of real numbers is uncountably infinite by Cantor's diagonal
argument.
Power Set
Let A be a set. The power set of A, denoted \(P(A)\), is defined by \(P(A)=\{X|X\subseteq A\}\)
Let A and B be sets. A relation R from A to B is
\(R\subseteq A\times B\). If \(a\in A, b\in B\), we write \(aRb\) if \((a,b)\in R\). A relation on \(A\) is a relation from \(A\) to \(A\).
Example
The plane \(\R^2=\{(a,b)|a\in \R, b\in \R\}\) is itself a relation on \(\R\)
\(R_1=\{(a,a+1)|a\in \R\}\subseteq \R^2\) is a relation on \(\R\) (a line)
The symbols \(<\) and \(\leq\) both represent relations on \(\R\)
We have set \(A=\{1,2,3\}\) and
\(B=\{x,y,z\}\)
\(R_1\) is a relation such that
\(R_1=\{(1,x),(1,y),(2,z)\}\subseteq A\times
B\)
Thus we have \(1R_1x\), \(1R_1y\), but not \(1R_1z\)
Relation Class
The relation class of x with respect to R is \(R[x]=\{y\in B|xRy\}=\{y\in B|(x,y)\in R\}\), the set of all y that correspond to x.
Example
We have set \(A=\{1,2,3\}\) and
\(B=\{x,y,z\}\)
\(R_1\) is a relation such that
\(R_1=\{(1,x),(1,y),(2,z)\}\subseteq A\times
B\)
\(R_1[1]=\{x,y\}\), \(R_1[2]=\{z\}\)
For the relation \(R_<=\{(x,y)|x<y\}\subseteq \R^2\),
\(R_<[x]=\{y\in \R|y>x\}=(x,\infty)\)
For the relation \(R_{\leq}=\{(x,y)|x\leq y\}\subseteq \R^2\),
\(R_{\leq}[x]=\{y\in \R|y\geq x\}=[x,\infty)\)
Relation Properties
A relation R on a non-empty set A (\(R\subseteq A\times A\)) is said to be:
Reflexive: if \(xRx\) holds for all \(x\in A\) ---> \((x,x)\in R, \forall x\in A\)
Symmetric: if \(xRy\) implies \(yRx\), for all \(x,y\in A\) ---> \((x,y)\in R\to (y,x)\in R\)
Transitive: if \(xRy\) and \(yRz\), then \(xRz\), for all \(x,y,z\in A\) ---> \((x,y)\in R\wedge(y,z)\in R\to(x,z)\in R\)
Example

|  | \(\leq\) | \(<\) |
|---|---|---|
| Reflexive | Yes, since \(x\leq x\) | No |
| Symmetric | No | No |
| Transitive | Yes | Yes |
Set \(B=\{x,y,z\}\). Let E be the relation defined on B by \(E=\{(x,x),(y,y),(z,z),(x,y),(y,z)\}\); then the relation E is:
Reflexive? Since (x,x),(y,y),(z,z)\(\in E\): Yes.
Symmetric? We have (x,y),(y,z)\(\in E\), but we don't have (y,x),(z,y)\(\in E\). Thus, No.
Transitive? We have (x,y),(y,z)\(\in E\), but we don't have (x,z)\(\in E\). Thus, No.
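For a finite relation such as E, the three properties can be checked by brute force. A small Python sketch (identifiers are ours) reproduces the three answers above:

```python
B = {"x", "y", "z"}
E = {("x", "x"), ("y", "y"), ("z", "z"), ("x", "y"), ("y", "z")}

reflexive  = all((a, a) in E for a in B)
symmetric  = all((b, a) in E for (a, b) in E)
transitive = all((a, d) in E
                 for (a, b) in E for (c, d) in E if b == c)
print(reflexive, symmetric, transitive)  # True False False
```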
Equivalence
A relation is called an equivalence relation if it is
reflexive, symmetric, and transitive.
Example
Suppose X and Y are sets and that \(f:X\to
Y\) is a function. Let us define a relation ~ on X by \(x_1\sim x_2\) if and only if \(f(x_1)=f(x_2)\). Show that ~ is an
equivalence relation.
Solution:
Reflexive: Since \(f(x)=f(x)\) for
every \(x\in X\), then we have \(x\sim x\)
Symmetric: If \(x_1 \sim x_2\), then
\(f(x_1)=f(x_2)\). But then \(f(x_2)=f(x_1)\), hence \(x_2\sim x_1\)
Transitive: If \(x_1\sim x_2\) and
\(x_2\sim x_3\), then \(f(x_1)=f(x_2)\) and \(f(x_2)=f(x_3)\). But then \(f(x_1)=f(x_3)\), hence \(x_1\sim x_3\).
Equivalence Class
The relation classes of a set A with respect to an equivalence relation ~ are called equivalence classes.
Theorem:
Let A be a non-empty set, and let ~ be an equivalence relation on A.
Let \(x,y\in A\). If \(x\sim y\), then \([x]=[y]\). If \(x\not\sim y\), then \([x]\cap[y]=\emptyset\)
\(\bigcup_{x\in A}[x]=A\)
\([x]=[y]\) if and only if \(x\sim y\)
Quotient Set
Let A be a non-empty set, and let ~ be an equivalence relation on A.
The quotient set is \(A/\sim=\{[x]|x\in A\}\)
Example
Let P be the set of all people, and let be the relation on P defined
by 𝑥~ 𝑦 if and only if 𝑥 and 𝑦 are the same age (in years). If person 𝑥
is 19 years old, then the equivalence class of 𝑥 is the set of all
19-year-olds. Each element of the quotient set P/~ is itself a set,
where there is one such set consisting of all 1- yearolds, another
consisting of all 2-year-olds, and so on. Although there are billions of
people in 𝑃𝑃, there are fewer than 125 elements in P/~ ,because no
currently living person has reached the age of 125.
Remark:
The quotient set is in fact a partition of A
Partition
Let A be a non-empty set. A partition of A is a family \(D\) of non-empty subsets of A such that:
If \(P,Q\in D\) and \(P\neq Q\), then \(P\cap Q=\emptyset\)
\(\bigcup_{P\in D}P=A\)
Example
Let \(C=\{[n,n+1)\}_{n\in \Z}\). Then C is a partition of \(\R\).
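Equivalence classes and quotient sets are easy to compute in finite examples. Below is a hedged Python sketch of the earlier age example (the names and ages are invented for illustration): grouping by age yields the classes, and the resulting classes form a partition.

```python
from collections import defaultdict

# People with ages; x ~ y iff age(x) == age(y).
age = {"Ann": 19, "Bob": 19, "Cam": 21, "Dee": 21, "Eve": 30}

classes = defaultdict(set)          # one equivalence class per age
for person, a in age.items():
    classes[a].add(person)

quotient = list(classes.values())   # P/~ : the set of classes
print(quotient)  # e.g. [{'Ann', 'Bob'}, {'Cam', 'Dee'}, {'Eve'}]
# The classes are pairwise disjoint and their union is all of P:
# exactly the definition of a partition.
```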
1.6 Function
Definition
A function from X to Y, denoted \(f:X\to Y\), is a subset \(F\subseteq X\times Y\) such that for each \(x\in X\), there is one and only one pair in F of the form \((x,y)\).
Remark:
Domain: X
Codomain: Y
Range: \(\{f(x)\in Y|x\in X\}\)
Two functions are equal if they have the same domain, codomain, and value for all elements of the domain.
Injective/Surjective/Bijective
Let X and Y be sets, and let \(f:X\to
Y\) be a function:
Injection (One-to-one): If \(x_1\neq x_2\) implies \(f(x_1)\neq f(x_2)\), for all \(x_1,x_2\in X\); equivalently, if \(f(x_1)=f(x_2)\) implies \(x_1=x_2\) for all \(x_1,x_2\in X\)
Surjection (Onto): If for every \(y\in Y\), there exist some \(x\in X\) such that \(f(x)=y\); equivalently, the range equals Y,
which is \(\{f(x)|x\in X\}=Y\)
Bijective: If it is both injective and
surjective
Example
Let \(k:[0,\infty)\to[0,\infty)\)
be defined by \(k(x)=x^2\) for all
\(x\in [0,\infty)\) -->
Bijective
Let \(g:[0,\infty)\to\R\) be
defined by \(g(x)=x^2\) for all \(x\in [0,\infty)\) --> Injective
Let \(h:\R\to[0,\infty)\) be defined by \(h(x)=x^2\) for all \(x\in\R\) --> Surjective
Composition
Let A, B, C be sets, and let \(f:A\to
B\) and \(g:B\to C\) be
functions. The composition of \(f\) and \(g\) is the function \((g\circ f):A\to C\) defined by \[
(g\circ f)(x)=g(f(x))
\]
Remark: The codomain of the first function must equal the domain of the second function for the composition of the two functions to be defined.
Example
\(k:\R\to\R\) is defined by \(k(x)=\sin x\) for all \(x\in \R\).
\(h:(0,\infty)\to \R\) is defined by \(h(x)=\ln x\)
Then \((k\circ h)(x)=k(h(x))=\sin(\ln x)\) is defined for all \(x\in(0,\infty)\)
However, \(h\circ k\) cannot be defined since \(\ln(\sin x)\) is not defined for all \(x\in\R\)
Lemma: Let A, B, C and D be sets, and let \(f:A\to B\), \(g:B\to C\) and \(h:C\to D\) be functions; then composition is associative: \((h\circ g)\circ f=h\circ (g\circ f)\).
Lemma: Let A, B and C be sets, and let \(f:A\to B\) and \(g:B\to C\) be functions:
If \(f\) and \(g\) are injective, then
\(g\circ f\) is
injective
If \(f\) and \(g\) are surjective, then
\(g\circ f\) is
surjective
If \(f\) and \(g\) are bijective, then
\(g\circ f\) is
bijective
Proof:
We need to prove \(\forall x, y\in
A\), if \((g\circ f)(x)=(g\circ
f)(y)\), then \(x=y\)
Given \((g\circ f)(x)=(g\circ
f)(y)\), since \(g\) is
injective, then \(f(x)=f(y)\)
Given \(f(x)=f(y)\), since \(f\) is injective, then \(x=y\).
We need to prove \(\forall c\in C\), there exists \(a\in A\) such that \((g\circ f)(a)=c\)
Since \(g\) is surjective, \(\forall c\in C\), there is \(b\in B\) such that \(g(b)=c\)
Since \(f\) is surjective, for this \(b\in B\), there is \(a\in A\) such that \(f(a)=b\)
Hence \((g\circ f)(a)=g(f(a))=g(b)=c\), so \(g\circ f\) is surjective.
Inverse Function
A bijective function has a unique
inverse function \(f^{-1}:Y\to X\), satisfying: \[
\forall x\in X, f^{-1}(f(x))=x \ and\ \forall y\in Y, f(f^{-1}(y))=y
\] Any one-to-one function \(f:X\to
Y\) defines a bijection \(g:X\to
R\), where R is the range of \(f\) and \(g(x)=f(x),\forall x \in X\)
Example
Suppose we have \(f(x)=\sin x,x\in(-\infty,\infty)\). Restricted to \(x\in[-\frac{\pi}{2},\frac{\pi}{2}]\), the function is bijective and hence has an inverse function.
Image
For a function \(f:A\to B\) and
subset \(P\subseteq A\) the
image of P under \(f\), denoted \(f(P)\), is \[
f(P)=\{f(p)|p\in P\}\subseteq B\ (\text{the codomain})
\] The range of \(f\) is the set \(f(A)\)
Let \(Q\subseteq B\). The inverse image or preimage of Q under \(f\), denoted \(f^{-1}(Q)\), is the set defined by \[
f^{-1}(Q)=\{a\in A|f(a)\in Q\}\subseteq A\ (\text{the domain})
\]
Example
```mermaid
graph TB
subgraph A
a1(1)
a2(2)
a3(3)
end
a1-->b1
a2-->b1
a3-->b2
subgraph B
b1(x)
b2(y)
b3(z)
end
```
Then the following would be:
\(f(A)=\{f(1),f(2),f(3)\}=\{x,y\}\)
\(f(\{1,3\})=\{x,y\}\)
\(f^{-1}(\{x\})=\{1,2\}\)
\(f^{-1}(B)=\{1,2,3\}\)
\(f^{-1}(\{x,y\})=\{1,2,3\}\)
\(f^{-1}(\{z\})=\emptyset\)
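For finite sets, images and preimages can be computed directly. A minimal Python sketch (function names are ours) matching the diagram above:

```python
# f given as a dict on A = {1, 2, 3} with codomain B = {x, y, z},
# matching the diagram: f(1) = f(2) = x, f(3) = y.
f = {1: "x", 2: "x", 3: "y"}

def image(f, P):         # f(P) = {f(p) | p in P}
    return {f[p] for p in P}

def preimage(f, Q):      # f^{-1}(Q) = {a | f(a) in Q}
    return {a for a, b in f.items() if b in Q}

print(image(f, {1, 3}))      # {'x', 'y'}
print(preimage(f, {"x"}))    # {1, 2}
print(preimage(f, {"z"}))    # set(): the empty set
```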
Remark:
In general, we have \[
f(f^{-1}(B))\subseteq B \ and\ f^{-1}(f(A))\supseteq A
\]Example
```mermaid
graph TB
subgraph X
1
2
3
end
1-->a
2-->b
3-->c
subgraph Y
a
b
c
d
e
end
```
We have \(B=\{a,b,c,d\}\)
thus \(f^{-1}(B)=\{1,2,3\}\)
then \(f(f^{-1}(B))=\{f(1),f(2),f(3)\}=\{a,b,c\}\subseteq
B\)
```mermaid
graph TB
subgraph X
1
2
3
end
1-->a
2-->b
3-->b
subgraph Y
a
b
end
```
If \(y\in f(C_1)\) or \(y\in f(C_2)\), then \(\exist x_1\in C_1\) such that \(f(x_1)=y\) or \(\exist x_2\in C_2\) such that \(f(x_2)=y\)
Let the function \(f:\R\to \R\) be defined by \(f(x)=x^2\). Let \(A=[1,2]\), then \(B=f(A)=[1,4]\), and \(f^{-1}(B)=f^{-1}([1,4])=[-2,-1]\cup[1,2]\supseteq A\)
Let the function \(f:\R\to \R\) be
defined by \(f(x)=x^2+1\). Let \(B=[0,2]\) and notice that \(A=f^{-1}(B)=[-1,1]\) Then \(f(A)=f([-1,1])=[1,2]\subseteq B\)
Theorem:
Let \(f:X\to Y\), \(A_i\subseteq X\) and \(B_i\subseteq Y\) for all \(i\in I\). Then \(f(\bigcup_i A_i)=\bigcup_i f(A_i)\), \(f(\bigcap_i A_i)\subseteq\bigcap_i f(A_i)\), \(f^{-1}(\bigcup_i B_i)=\bigcup_i f^{-1}(B_i)\), and \(f^{-1}(\bigcap_i B_i)=\bigcap_i f^{-1}(B_i)\).
The cosine distance is not a metric: \(d_{cos}(\vec a,\vec b)\geq 0\), but \(d_{cos}(\vec a,\vec b)=0\) holds whenever the two directions are the same (\(\cos\theta=1\)), not only when \(\vec a=\vec b\), so "zero iff equal" fails --> False
\(d_{cos}(\vec a,\vec b)=d_{cos}(\vec b,\vec a)\) (symmetry holds)
Triangle inequality --> False, by counterexample
Example of Distances
Given \(x=(0,0)^T,y=(2,1)^T\)
Then
\(d_2(x,y)=\sqrt{2^2+1^2}=\sqrt
5\)
\(d_1(x,y)=|2|+|1|=3\)
\(d_\infty(x,y)=\max(2,1)=2\)
For an interval [a,b], let \(C[a,b]\) be the set of all continuous functions on [a,b]. Show that \[
\rho(x(t),y(t))=\max_{a\leq t\leq b}|x(t)-y(t)|
\] is a distance between two functions \(x(t), y(t)\).
Sol
\(\rho(x(t),y(t))\geq 0\), and \(\rho(x(t),y(t))=\max_{a\leq t\leq b}|x(t)-y(t)|=0\implies x(t)=y(t)\ \forall t\); symmetry is immediate, and the triangle inequality follows from \(|x(t)-z(t)|\leq|x(t)-y(t)|+|y(t)-z(t)|\) pointwise.
Consider a set G of objects and one operation \(\cdot\) on the elements of G. The set and operation \((G,\cdot)\) are called an Abelian group / commutative group if:
Closure:\(\forall a,b\in
G\), then \(a\cdot b\in G\)
Associativity:\(\forall
a,b,c\in G\), then \((a\cdot b)\cdot
c=a\cdot(b\cdot c)\)
Identity element:\(\exist
e\in G\) such that \(a\cdot
e=a\) for \(\forall a\in G\)
Inverse element:\(\forall
a\in G,\exist b\in G\) such that \(a\cdot b=e\)
Commutativity:\(\forall
a,b\in G\) then \(a\cdot b=b\cdot
a\)
Example
The integers under addition "+" form an Abelian group, denoted \((\Z, +)\), while \((\Z,\times)\) is not a group since most integers have no multiplicative inverse in \(\Z\)
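The axioms can be verified exhaustively for a small finite example. Here is a Python sketch (ours, not from the notes) checking that addition modulo 5 makes \(\{0,\dots,4\}\) an Abelian group:

```python
from itertools import product

# Check the abelian-group axioms for (Z_5, +), i.e. addition mod 5.
G = range(5)
op = lambda a, b: (a + b) % 5

closure     = all(op(a, b) in G for a, b in product(G, G))
associative = all(op(op(a, b), c) == op(a, op(b, c))
                  for a, b, c in product(G, G, G))
identity    = all(op(a, 0) == a for a in G)          # e = 0
inverses    = all(any(op(a, b) == 0 for b in G) for a in G)
commutative = all(op(a, b) == op(b, a) for a, b in product(G, G))
print(closure, associative, identity, inverses, commutative)
# True True True True True
```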
Field
Consider a set F of objects and two operations on
the elements of F, addition \(+\) and
multiplication \(\cdot\). The set with
two operation \((F,+,\cdot)\) is a
field if:
Closure:\(\forall a,b\in
F\) then \(a+b\in F,a\cdot b\in
F\)
Associativity:\(\forall
a,b,c\in F\) then \((a+b)+c=a+(b+c),(a\cdot b)\cdot c=a\cdot (b\cdot
c)\)
Identity element:\(\exist
0,1\in F\) such that \(a+0=a,a\cdot
1=a\) for \(\forall a\in F\)
Additive Inverse:\(\forall a\in F\), exist a unique element
\(-a\in F\) such that \(a+(-a)=0\)
Multiplicative Inverse:\(\forall a≠0\in F,\) exist a unique element
\(a^{-1}\in F\) such that \(a\cdot a^{-1}=1\)
Distribution of multiplication over addition:\(\forall a,b,c\in F\) such that \(a\cdot(b+c)=a\cdot b+a\cdot c\)
Remark
If multiplicative inverses are not required, \((F,+,\cdot)\) is called a commutative ring
A vector space (essentially a set with structure) consists of the following:
A field \(F\) of scalars
A set \(V\) of objects, called vectors
A binary operation called vector addition, which assigns to each pair of vectors \(\vec v,\vec w\in V\) a vector \(\vec v+\vec w\in V\) such that:
Commutative: \(v+w=w+v\)
Associative: \(u+(v+w)=(u+v)+w\)
There is a unique vector \(\vec 0\in V\) such that \(v+\vec 0=v,\forall v\in V\)
To each \(v\in V\), there is a unique vector \(-v\in V\), such that \(v+(-v)=\vec 0\)
An operation called scalar
multiplication, which associates with each \(s\in F\) and \(\vec v\in V\) a vector \(s\vec v\in V\) such that:
\(1v=v,\forall v\in V\)
\((s_1s_2)\vec v=s_1(s_2\vec
v)\)
\(s(\vec v+\vec w)=s\vec v+s\vec
w\)
\((s_1+s_2)\vec v=s_1\vec v+s_2\vec
v\)
Example
Standard vector space of functions:
Let \(X\) be a non-empty set and \(Y\) be a vector space over \(F\). Consider the set \(V\) of all functions mapping from \(X\) to \(Y\).
The vector addition of two functions \(f,g\in V\) is the function from \(X\) into \(Y\) defined by: \[
(f+g)(x)=f(x)+g(x)
\] where the RHS uses vector addition from \(Y\).
The scalar product of \(s\in F\) and the function \(f\in V\) is the function \(sf\) defined by: \[
(sf)(x)=sf(x),\forall x\in X
\] where the RHS uses scalar multiplication from \(Y\).
Check whether this is a vector space; for instance \((f+0)(x)=f(x)+0(x)=f(x)\), where \(0(x)\) is the zero function.
Let \(V\) be a vector space over \(F\). A subspace of \(V\) is a subset \(W\subseteq V\) which is itself a vector space over F (with the same operations).
Theorem:
A non-empty subset \(W\subseteq V\) is a subspace of \(V\) if and only if \(W\) satisfies the following conditions:
\(\vec 0\in W\)
\(\forall \vec u,\vec v\in W,\vec u+\vec
v\in W\)
\(\forall s\in F,\vec u\in W,s\vec u\in
W\)
Example
Let \(A\) be an \(m\times n\) matrix over \(F\). The set \(W\) of all \(n\times 1\) column vectors \(\vec v\) such that \(A\vec v=\vec 0\) is a subspace of \(F^{n\times 1}\)
Prove:
\(W=\{\vec v|A\vec v=\vec 0\}\)
\(A\vec 0=\vec 0\), so \(\vec 0\in W\)
\(A(\vec v_1+\vec v_2)=A\vec v_1+A\vec v_2=\vec 0\), so \(\vec v_1+\vec v_2\in W\)
\(A(x\vec v)=xA\vec v=\vec 0\), so \(x\vec v\in W\)
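A numerical spot-check of the three subspace conditions, as a Python/NumPy sketch (the matrix A and the vectors are invented for illustration):

```python
import numpy as np

# Spot-check the subspace conditions for W = {v : A v = 0}.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])      # rank 1, so W is a plane in R^3

v1 = np.array([-2.0, 1.0, 0.0])      # A v1 = 0
v2 = np.array([-3.0, 0.0, 1.0])      # A v2 = 0
print(A @ np.zeros(3))               # zero vector lies in W
print(A @ (v1 + v2))                 # closed under addition -> 0
print(A @ (2.5 * v1))                # closed under scaling  -> 0
```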
The subspaces of \(\R^2\) are precisely \(\{\vec 0\}\), \(\R^2\), and all lines in \(\R^2\) through the origin.
Prove that for \(b\in F\), \(V=\{(x_1,x_2,x_3,x_4)^T\in F^4|x_4=5x_3+b\}\) is a subspace of \(F^4\) if and only if \(b=0\)
Sol:
If \(b=0\), then \(V=\{(x_1,x_2,x_3,x_4)^T\in F^4|x_4=5x_3\}\)
\((0,0,0,0)^T\in V\)
For \(\vec v_1=[x_1,x_2,x_3,x_4]^T,\vec v_2=[y_1,y_2,y_3,y_4]^T\), \(\vec v_1+\vec v_2=[x_1+y_1,x_2+y_2,x_3+y_3,x_4+y_4]^T\), and we have \(x_4=5x_3,y_4=5y_3\implies x_4+y_4=5(x_3+y_3)\)
For \(\lambda \vec v_1\): \(x_4=5x_3\implies \lambda x_4=5(\lambda x_3)\)
Conversely, if \(V\) is a subspace, then \(\vec 0\in V\), so \(0=5\cdot 0+b\), i.e. \(b=0\)
Remark:
\(\{\vec 0\}\) is the
smallest subspace of \(V\)
\(V\) itself is the largest
subspace of \(V\)
The empty set is NOT a subspace of \(V\) since a vector space must contain
at least one element.
Sum Of Subspaces
Suppose \(U_1,U_2,U_3,...,U_m\) are
the subspaces of \(V\). The sum
of subspaces is the set of all possible sums of elements of
\(U_1,..,U_m\): \[
U_1+\cdots+U_m=\{u_1+\cdots+u_m|u_1\in U_1,\cdots u_m\in U_m\}
\]Example
\(U=\{(x,0,0)^T|x\in F\}, W=\{(0,y,0)^T|y\in F\}\), then \(U+W=\{(x,y,0)^T|x,y\in F\}\)
For \(U=\{(x,y,y)^T\in F^3|x,y\in F\}, W=\{(x,x,y)^T\in F^3|x,y\in F\}\), prove that \(U+W=\{(x,y,z)^T|x,y,z\in F\}=F^3\).
Sol:
We have the decomposition \((x,y,z)=(x,y,y)+(0,0,z-y)\), where \((x,y,y)\in U\) and \((0,0,z-y)\in W\).
The decomposition is not unique; for instance \((1,1,1)=(1,1,1)+(0,0,0)\) with \((1,1,1)\in U\), but also \((1,1,1)=(0,0,0)+(1,1,1)\) with \((1,1,1)\in W\), so the sum is not direct.
Remark: The sum of subspaces in the theory of vector spaces is analogous to the union of subsets in set theory.
Theorem:
Suppose \(U_1,U_2,...,U_m\) are the
subspaces of \(V\). Then \(U_1+\cdots+U_m\) is the
smallest subspaces of \(V\) containing \(U_1,...,U_m\)
Proof:
\(0\in U_1+U_2+\cdots+U_m\), and \(U_1+U_2+\cdots+U_m\) is closed under addition and scalar multiplication, thus it is a subspace.
At the same time,
we have \(U_1,U_2,...U_m\subseteq
U_1+U_2+\cdots+U_m\).
And conversely, every subspace of \(V\) containing \(U_1,U_2,...,U_m\) contains \(U_1+U_2+\cdots +U_m\), because subspaces are closed under addition. Thus, \(U_1+U_2+\cdots+U_m\) is the smallest such subspace.
Suppose \(U_1,U_2,...,U_m\) are subspaces of \(V\). The sum \(U_1+\cdots+U_m\) is called a direct sum, denoted by \(U_1\oplus\cdots \oplus U_m\), if each element \(u\in U_1+\cdots +U_m\) can be written in only one way (decomposed uniquely) as \(u=u_1+\cdots+u_m\) where \(u_j\in U_j\)
Theorem:
Suppose \(U_1,U_2,...,U_m\) are subspaces of \(V\). The sum \(U_1+\cdots+U_m\) is a direct sum if and only if the only way to write \(\vec 0\) as a sum \(u_1+\cdots+u_m\) with \(u_j\in U_j\) is by taking each \(u_j\) equal to \(\vec0\)
\(\implies\) Obvious from the definition of a direct sum.
\(\Longleftarrow\) Suppose some \(u\in U_1+\cdots+U_m\) could be represented in two ways:
\(u=u_1+\cdots+u_m=u_1'+\cdots+u_m'\), with \(u_i,u_i'\in U_i\)
Then \((u_1-u_1')+\cdots+(u_m-u_m')=\vec 0\), with \(u_i-u_i'\in U_i\)
Since \(\vec 0\) can only be written as \(0+0+\cdots+0\),
we have \(u_i=u_i'\), so the decomposition is unique.
Example
Let \(U=\{(x,y,0)\in F^3:x,y\in F\},W=\{(0,0,z)\in F^3:z\in F\}\); show that \(F^3=U\oplus W\)
A vector \(w\in V\) is said to be a
linear combination of vectors \(v_1,...,v_n\in V\) provided that there
exist scalars \(s_1,...,s_n\in F\) such
that \[
w=\sum_{i=1}^{n}s_iv_i
\] Let \(S\) and \(T\) be two subsets of a vector space \(V\). S and \(T\) are said to be equivalent if every element of S is a linear combination of elements of T, and every element of T is a linear combination of elements of S.
Example
\((17,-4,2)^T=6(2,1,-3)^T+5(1,-2,4)^T\)
But for \((17,-4,5)\) there do not
exist numbers \(a_1,a_2\in F\) such
that \((17,-4,5)^T=a_1(2,1,-3)^T+a_2(1,-2,4)^T\)
Span
Let \(U\) be a list or set of
vectors in \(V\). Then the
span of \(U\),\(span(U)\) is defined to be the set
of all finite linear combinations of the vectors in U: \[
span(U)=\{v|v=\sum\lambda_i u_i,\lambda_i\in F,u_i\in U\}
\]
Remark:
\(\vec 0\in span(U)\)
Example
\((17,-4,2)^T\in span((2,1,-3)^T,(1,-2,4)^T)\) while \((17,-4,5)^T\not \in span((2,1,-3)^T,(1,-2,4)^T)\)
Span of empty set: \(\{0\}\)
Theorem:
For a vector space \(V\), the span of any list or set of vectors in \(V\) forms a subspace.
Example
For \(U=\vec e_1,\vec e_2\), where
\(\vec e_1=(1,0)^T,\vec
e_2=(0,1)^T\)
\(span(U)=\R^2\)
Theorem:
The span of a list of vectors \(U\) in V is the intersection of all subspaces of \(V\) that contain \(U\), which is the smallest subspace of V containing U.
A vector space is called finite-dimensional if some
finite number of vectors span the space.
A vector space is called infinite-dimensional if it is not finite-dimensional.
Example
A function \(p:F\to F\) is called a
polynomial with coefficients in \(F\)
if there exist \(a_0,\cdots,a_m\in F\)
such that, \[
p(z)=a_0+a_1z+\cdots+a_mz^m
\] For all \(z\in F\)
\(P(F)\), the set of all polynomials with coefficients in F, is infinite-dimensional.
\(P_m(F)\), the set of all polynomials with coefficients in F and degree at most m, is a finite-dimensional vector space for each non-negative integer m.
Linear Dependence and
Independence
Let V be a vector space over F. A list of vectors \(u_1,\cdots,u_n\in V\) is called
linearly dependent if there are scalars \(s_1,...,s_n\in F\), not all equal
to 0, such that \(\sum_{i=1}^{n}s_iu_i=0\)
A list that is not linearly dependent is called linearly
independent
A subset \(U\subseteq V\) is called
linearly dependent if there is a finite list \(u_1,...,u_n\in U\) of distinct vectors that
is linearly dependent.
Example
For \(V=\R^4\), the vectors \(v_1=(1,1,0,0),v_2=(0,1,1,0),v_3=(0,0,1,1)\) are linearly independent.
Remark:
Any subset of a linearly independent set is also linearly
independent
Any set which contains the \(\vec
0\) is linearly dependent.
A set \(U\subseteq V\) is linearly independent if and only if each finite subset of U is linearly independent.
Steinitz Exchange Lemma
Let \(V\) be a vector space over \(F\). Suppose \(\alpha_1,\cdots,\alpha_s\) is a linearly independent set of vectors in \(V\), and each \(\alpha_i\) is a linear combination of \(\beta_1,\cdots,\beta_t\in V\). Then \(s\leq t\),
and \(s\) of the vectors in \(\beta_1,\cdots,\beta_t\) can be exchanged for \(\alpha_1,\cdots,\alpha_s\); that is, after renumbering, \(span(\alpha_1, \cdots, \alpha_s,\beta_{s+1},\cdots, \beta_t)=span(\beta_1,\cdots,\beta_t)\)
Theorem: Every spanning list in a vector space can be reduced to form
a basis of the vector space.
Proof:
If \(V=span(\vec v_1,\vec v_2,...,\vec v_n)\), we want to remove some vectors to form a basis of \(V\).
Let \(B=\{\vec v_1,...,\vec v_n\}\)
Step 1: If \(\vec v_1=\vec 0\), then remove \(\vec v_1\), else keep \(\vec v_{1}\).
Step 2: For each \(j=2,\dots,n\): if \(\vec v_j\) is in \(span(\vec v_1,...,\vec v_{j-1})\), delete \(\vec v_{j}\) from B;
if \(\vec v_j\) is not in \(span(\vec v_1,...,\vec v_{j-1})\), keep \(\vec v_{j}\) in B.
Finally we have a set of linearly independent vectors whose span is still \(span(\vec v_1,...,\vec v_n)=V\), i.e. a basis.
Example
\(V=span\{(1,1),(0,1),(1,0)\}\); we want to find a basis of V.
Check \((1,1)\) is not \(\vec 0\).
Check whether \((1,1)\) and \((0,1)\) are linearly independent --> Yes
Check whether \(\{(1,1),(0,1)\}\) and \((1,0)\) are linearly independent --> No, since \((1,0)=(1,1)-(0,1)\)
Thus \(\{(1,1),(0,1)\}\) is a basis, and \(V=span\{(1,1),(0,1)\}\).
By changing the order of checking, we can find other bases such as \(\{(1,0),(0,1)\}\). (See the sketch below.)
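The deletion procedure can be automated with a rank test. A Python/NumPy sketch (the helper name is ours): a vector is kept only when adding it increases the rank, i.e. when it is not in the span of the vectors kept so far.

```python
import numpy as np

def reduce_to_basis(vectors):
    """Keep v_j only if it is not in the span of the kept vectors,
    detected by a rank increase (the procedure described above)."""
    basis = []
    for v in vectors:
        candidate = basis + [v]
        if np.linalg.matrix_rank(np.array(candidate)) == len(candidate):
            basis.append(v)
    return basis

spanning = [(1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
print(reduce_to_basis(spanning))   # [(1.0, 1.0), (0.0, 1.0)]
```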
Theorem:
Every vector space has a basis.
Theorem:
Every linearly independent finite number of vectors
in a finite-dimensional vector space can be extended to a
basis of the vector space.
Theorem:
Suppose \(V\) is finite-dimensional and \(U\) is a subspace of \(V\). Then there is a subspace \(W\) of \(V\) such that \(V=U\oplus W\).
Proof:
Since \(V\) is finite-dimensional, the subspace \(U\) is also finite-dimensional.
Let \(\vec u_1,...,\vec u_m\) be a basis of \(U\), so \(span(\vec u_1,...,\vec u_m)=U\); by the extension theorem above, \(\exist \vec w_1,...,\vec w_n\in V\) such that \(\vec u_1,...,\vec u_m,\vec w_1,...,\vec w_n\) is a basis of \(V\)
Thus, we define \(W=span(\vec w_1,...,\vec w_n)\); then \(W\subseteq V\) is a subspace.
Next we prove \(U\oplus W=V\):
since \(\vec u_1,...,\vec u_m,\vec w_1,...,\vec w_n\) is a basis of \(V\), every \(\vec v\in V\) is a sum of an element of \(U\) and an element of \(W\), so \(U+W=V\); and since the full list is linearly independent, the only way to write \(\vec 0\) as such a sum is with all coefficients zero, so the sum is direct.
Theorem:
By the Steinitz exchange lemma, if \(V\) is a finite-dimensional vector space, then any two bases of \(V\) have the same number of elements.
Definition
The dimension of a finite-dimensional vector space \(V\) is the number of elements in any basis for \(V\), denoted by \(\dim V\)
Theorem:
Suppose \(V\) is finite-dimensional; then every linearly independent list of vectors in \(V\) with length \(\dim V\) is a basis for \(V\).
Theorem:
If \(V\) is finite-dimensional and \(U\) is a subspace of \(V\), then \(\dim U\leq \dim V\).
Example
Show that \(1,(x-5)^2,(x-5)^3\) is a basis of the subspace \(U\) of \(P_3(\R)\) defined by \(U=\{p\in P_3(\R):p'(5)=0\}\).
Sol:
Expanding \(p\in P_3(\R)\) about 5 (Taylor expansion, exact for cubics): \(p(x)=p(5)+p'(5)(x-5)+\frac{p''(5)}{2}(x-5)^2+\frac{p'''(5)}{6}(x-5)^3\). Since \(p'(5)=0\), every \(p\in U\) has the form \(p=c_0\cdot 1+c_2(x-5)^2+c_3(x-5)^3\),
so \(1,(x-5)^2,(x-5)^3\) span \(U\); they are linearly independent (distinct degrees), hence they are a basis of \(U\).
Theorem (Dimension of a sum)
If \(U_1\) and \(U_2\) are subspaces of a finite-dimensional vector space, then: \[
\dim(U_1+U_2)=\dim U_1+\dim U_2-\dim(U_1\cap U_2)
\]
Suppose \(U_1,U_2,...,U_m\) are the
subspaces of \(V\). The sum \(U_1+\cdots+U_m\) is a direct sum if
and only if\[
dim(U_1+\cdots+U_m)=dimU_1+\cdots+dimU_m
\]
Example
Let \(U=\{p\in P_4(F): p(6)=0\}\). Find a basis of U.
Since \(p(6)=0\), \((x-6)\) is a factor of p, thus \(p=g(x)(x-6)\in P_4\) where \(g(x)\in P_3\), and \(P_3\) is spanned by \(1,x,x^2,x^3\) (the zero function is included by taking g = 0). Finally we have \((x-6),x(x-6),x^2(x-6),x^3(x-6)\) as a basis.
General method:
A general element of \(P_4\) is \(p=a_0+a_1x+a_2x^2+a_3x^3+a_4x^4\)
For \(p(6)=0\) we have \(p(6)=a_0+a_16+a_26^2+a_36^3+a_46^4=0\); the solutions \((a_0,\dots,a_4)^T\) of this equation are spanned by
\((-6,1,0,0,0)^T,(-6^2,0,1,0,0)^T,(-6^3,0,0,1,0)^T,(-6^4,0,0,0,1)^T\).
Plugging these solutions in, we get the basis \(x-6,x^2-6^2,x^3-6^3,x^4-6^4\) of \(U\).
Let A be an \(m\times n\) matrix over F and let B be an \(n\times p\) matrix over F. The matrix product AB is the \(m\times p\) matrix C with: \[
c_{ij}=\sum_{k=1}^na_{ik}b_{kj}
\]
Remark:
The j-th column vector of C is a linear combination of the columns of A:
\((AB)_j=\vec a_1 b_{1j}+\vec a_2 b_{2j}+\cdots+\vec a_nb_{nj}\), where \(\vec a_k\) are the column vectors of A
The i-th row vector of C is a linear combination of the rows of B:
\((AB)_i=a_{i1}\vec b_1^*+a_{i2}\vec b_2^*+\cdots +a_{in}\vec b_n^*\), where \(\vec b_k^*\) are the row vectors of B
\(Rank(A)=Rank(A^T)\)
Lemma:
If \(\vec a=\begin{bmatrix}a_1\\a_2\\ \vdots\\ a_m \end{bmatrix}\in F^m\) and \(\vec b^*=(b_1,b_2,...,b_n)\in F^n\) are two nonzero vectors and \(A=\vec a\,\vec b^*\) (outer product), then \(Rank(A)=1\).
Note: this lemma means that every column of A is a multiple of \(\vec a\).
Then for \(AB\), we have \(AB=\sum_{k=1}^n \vec a_k\vec b_k^*\), where \(\vec a_k=\begin{bmatrix}a_{1k}\\a_{2k}\\ \vdots\\ a_{mk} \end{bmatrix}_{m\times 1}\) and \(\vec b_k^*=(b_{k1},b_{k2},...,b_{kp})_{1\times p}\). For each k, \(Rank(\vec a_k \vec b_k^*)\leq 1\), so \(AB\) is a sum of n rank-1 (or zero) matrices.
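A Python/NumPy sketch (random matrices, for illustration only) verifying that AB equals this sum of rank-1 terms:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(3, 4)).astype(float)
B = rng.integers(-3, 4, size=(4, 2)).astype(float)

# AB = sum_k (k-th column of A)(k-th row of B): n rank-1 terms.
rank_one_sum = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))
print(np.allclose(A @ B, rank_one_sum))            # True
print([np.linalg.matrix_rank(np.outer(A[:, k], B[k, :]))
       for k in range(4)])                          # each term has rank <= 1
```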
Theorem
Let \(A\) be an \(m\times n\) matrix over \(F\) and let \(B\) be an \(n\times p\), then:
\(Rank(AB)\leq \min\{Rank(A), Rank(B)\}\)
\(Rank(A+B)\leq Rank(A)+Rank(B)\)
\(Rank(A^TA)=Rank(AA^T)=Rank(A)=Rank(A^T)\) (over \(\R\))
If \(P\in F^{m\times m}, Q\in F ^{n\times n}\) are invertible, then \(Rank(PAQ)=Rank(PA)=Rank(AQ)=Rank(A)\)
Multiplying an elementary matrix on the left of A performs an elementary row operation on A
Multiplying an elementary matrix on the right of A performs an elementary column operation on A
Elementary matrices are invertible
Elementary operations do not change the rank of a matrix
Lemma:
For any \(m\times n\) matrix A over F, there is an invertible \(m\times m\) matrix \(P\) over \(F\) such that \(R=PA\) is in reduced row echelon form.
Remark:
This gives the factorization \(A=P^{-1}R\), where R is the reduced row echelon form.
If \(Rank(A)=r\) and \(R'\) is the matrix obtained by removing the zero rows of R, then \(A=CR'\), where \(C\in F^{m\times r}\) is the first \(r\) columns of \(P^{-1}\) (the zero rows of the RREF contribute nothing).
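A SymPy sketch of this factorization (the example matrix is invented; as one concrete choice that also satisfies \(A=CR'\), C is taken to be the pivot columns of A itself):

```python
import sympy as sp

A = sp.Matrix([[1, 2, 0, 3],
               [2, 4, 1, 7],
               [3, 6, 1, 10]])

R, pivots = A.rref()                 # reduced row echelon form
r = len(pivots)                      # rank of A
C  = A[:, list(pivots)]              # pivot columns of A  (m x r)
Rp = R[:r, :]                        # nonzero rows of R   (r x n)
print(A == C * Rp)                   # True: A = C R'
```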
Lemma:
Let \(A\) be an \(m\times n\) matrix over \(F\) with \(m\lt n\) (#rows < #columns). Then there exists a length-n column vector \(\vec x\neq 0\) such that \(A\vec x=0\)
Remark: If the number of unknowns n is larger than the number of equations m, there is always a nontrivial solution of \(A\vec x=0\).
Theorem:
Let \(A\in F^{m\times n}\), with
\(Rank(A)=r\), then there exist
invertible matrix \(P\in F^{m\times
m}\) and \(Q\in F^{n\times n}\)
such that \(A=P\begin{bmatrix}I_{r\times r}
& 0_{r\times (n-r) }\\ 0_{(m-r)\times r}&0_{(m-r)\times (n-r)}
\end{bmatrix}Q\). (P as row operation, Q as column operation)
Corollary 1:
A matrix \(A\in F^{n\times n}\) is invertible if and only if it can be written as a product of elementary matrices.
Corollary 2:
A matrix \(A\in F^{n\times n}\) is invertible if and only if A can be transformed into the \(n\times n\) identity matrix by finitely many elementary operations: \(PA_{n\times n}Q=I_{n\times n}\)
Theorem (Rank Factorization): Let \(A\in F^{m\times n}\); then \(Rank(A)=r\geq 1\) if and only if there exist a full-column-rank matrix \(B\in F^{m\times r}\) and a full-row-rank matrix \(C\in F^{r\times n}\) such that \(A_{m\times n}=B_{m\times r}C_{r\times n}\) (B and C are NOT unique)
```mermaid
graph LR
subgraph F^n
v
end
v--Tv-->Av
subgraph F^m
Av
end
```
Let V be the vector space of continuous functions from \(\R\) to \(\R\), and define \(T\) by \(Tf(x)=\int_0^xf(t)dt\). Then T is a linear transformation from V to V, and the function \(Tf\) is continuous and differentiable.
Let \(L(V,W)\) denote the collection of all linear transformations from \(V\) into \(W\), where V and W are vector spaces over a field F.
Suppose \(S,T\in L(V,W)\) and \(\lambda \in F\). The sum \(S+T\) and the product \(\lambda T\) are linear maps from V to W defined by: \((S+T)\vec v=S\vec v+T\vec v\) and \((\lambda T)\vec v=\lambda (T\vec v),\forall \vec v\in V\)
Remark:
\(L(V,W)\) is a vector space. (Identity element: the zero map)
If \(T\) is a linear transformation, then \(T(0)=0\)
If \(a_1,a_2,...,a_k\in V\) are linearly dependent, then under a linear transformation T, \(T(a_1),T(a_2),...,T(a_k)\) are linearly dependent
If \(T(a_1),T(a_2),...T(a_k)\) are linearly independent, then \(a_1,a_2,...a_k\) are linearly independent
If \(T\in L(U,V)\) and \(S\in L(V,W)\), then the product \(ST\in L(U,W)\) is defined by \[
(ST)u=S(Tu)
\] for \(u\in U\). Note that multiplication is not commutative: ST≠TS in general.
Remark:
It is just the composition \(S\circ T\) of two functions.
The product satisfies the following properties:
Associativity: \((T_1T_2)T_3=T_1(T_2T_3)\)
Identity: \(TI=IT=T\)
Distributive: \((S_1+S_2)T=S_1T+S_2T\) and \(S(T_1+T_2)=ST_1+ST_2\) whenever \(T,T_1,T_2\in L(U,V)\) and \(S,S_1,S_2\in L(V,W)\)
Theorem: Suppose \(v_1,...v_n\) is a basis of V and \(w_1,...w_n\in W\). Then there exists a unique linear transformation \(T:V\to W\) such that \[
T\vec v_j=\vec w_j,\ for\ each\ j=1,...n
\] Proof:
Since \(\vec v_1,...\vec v_n\) is a basis of V, \(\forall \vec v\in V,\vec v=\sum_{i=1}^n\lambda_i\vec v_i\)
Then we define \(T(\vec v)=T(\sum_{i=1}^n\lambda_i\vec v_i)=\sum_{i=1}^n \lambda_i\vec w_i\in W\)
In particular, taking \(\lambda_j=1\) and \(\lambda_i=0\) for \(i\neq j\) gives \(T(\vec v_j)=\vec w_j\)
For a linear transformation \(T:V\to W\), the range of T is the set of all vectors \(\vec w\in W\) such that \(\vec w=T\vec v\) for some \(\vec v\in V\): \[
R(T)=\{\vec w\in W|\exist \vec v\in V,s.t.\ T\vec v=\vec w\}=\{T\vec v|\vec v\in V\}
\] Nullspace/Kernel: \[
Null(T)=\{\vec v\in V|T\vec v=\vec 0\}
\]
Example
If \(T\) is the zero transformation from \(V\) to \(W\), i.e. \(T\vec v=0,\forall \vec v\in V\), then \(Null\,T=V,R(T)=\{\vec 0\}\)
Suppose \(T\) is the backward shift on \(F^{\infty}\), defined by \(T(x_1,x_2,...)=(x_2,x_3,...)\). Then \(T(x_1,x_2,...)=0\) iff \(x_2,x_3,...\) are all 0. Thus \(Null\,T=\{(a,0,...)|a\in F\},R(T)=F^{\infty}\)
For the differentiation map \(D:P_4\to P_4\), \(R(D)=P_3\subseteq P_4\) and \(Null(D)=\{c|c\in F\}\), the constant polynomials
Theorem:
For a linear transformation \(T:V\to W\), Null T is a subspace of V and R(T) is a subspace of W
Theorem: For a linear transformation \(T:V\to W\), T is injective
if and only if \(NullT=\{0\}\); T is
surjective if and only if its range \(R(T)=W\)
Example
The differentiation transformation \(D\in L(P_5(\R),P_5(\R))\) is not surjective; the same map viewed as \(S\in L(P_5(\R),P_4(\R))\) is surjective
Assume \(V\) and \(W\) are vector spaces over \(F\) with \(\dim V=n,\dim W=m\), and \(T:V\to W\) is a linear transformation. \(\{\alpha_1,...\alpha_n\}\) and \(\{\beta_1,...\beta_m\}\) are bases of V and W. Since \(T(\alpha_i)\in W\) and \(\{\beta_1,..\beta_m\}\) is a basis of W, each \(T(\alpha_i)=\sum_{j=1}^m a_{ji}\beta_j\); collecting the coefficients into \(A=(a_{ji})\) gives \(T(\alpha_1,\dots,\alpha_n)=(\beta_1,\dots,\beta_m)A\).
The matrix A depends on the basis of V, the basis of W and the linear transformation T. In general, the matrix of T with respect to given bases is denoted \(M(T,(v_1,...v_n),(w_1,...w_m))\), which is uniquely determined by these three elements.
Example
Suppose \(T\in L(F^2,F^3)\) is defined by \(T(x,y)=(x+3y,2x+5y,7x+9y)\). Find the matrix of T with respect to the standard bases of \(F^2\) and \(F^3\).
We have \(\vec e_1=\begin{bmatrix}1\\0\end{bmatrix},\vec e_2=\begin{bmatrix}0\\1\end{bmatrix}\) as the basis of \(F^2\), and \(\beta_1=\begin{bmatrix}1\\0\\0\end{bmatrix},\beta_2=\begin{bmatrix}0\\1\\0\end{bmatrix},\beta_3=\begin{bmatrix}0\\0\\1\end{bmatrix}\) as the basis of \(F^3\).
Since \(T\vec e_1=(1,2,7)^T\) and \(T\vec e_2=(3,5,9)^T\), the matrix is \(A=\begin{bmatrix}1&3\\2&5\\7&9\end{bmatrix}\).
Assume that \(\alpha\in V\) has coordinates \((x_1,x_2,...x_n)\) under the basis \((\alpha_1,\alpha_2,...\alpha_n)\), i.e. \(\alpha=\sum_{i=1}^nx_i\alpha_i\) (\(\alpha\) is a linear combination of the \(\alpha_i\)); then \[
T\alpha=T(x_1\alpha_1+\cdots+x_n\alpha_n)=x_1T\alpha_1+\cdots+
x_nT\alpha_n=T(\alpha_1,\alpha_2,\cdots,\alpha_n)\vec
x=(\beta_1,\beta_2,...\beta_m)A\vec x
\\ \implies T\alpha=(\beta_1,\beta_2,...\beta_m)A\vec x
\]
Theorem
Suppose \(S,T\in L(V,W),\lambda\in F\) (bases are fixed); then \(M(S+T)=M(S)+M(T)\) and \(M(\lambda T)=\lambda M(T)\).
A linear transformation \(T\in L(V,W)\) is called invertible if there exists a linear map \(S\in L(W,V)\) such that \(ST\) equals the identity function on \(V\) and \(TS\) equals the identity function on \(W\). Then \(S\) is called an inverse of \(T\). If \(T\) is invertible, the function \(S\) is unique and is denoted by \(T^{-1}\).
Theorem: A linear transformation is invertible if and only if it is injective and surjective. (\(Null(T)=\{0\},Range(T)=W\))
Example
The multiplication-by-\(x^2\) linear transformation from \(P(\R)\to P(\R)\) is not invertible because it is not surjective. (1 is not in the range)
The backward shift linear map from \(F^{\infty}\to F^{\infty}\) is not invertible because it is not injective. (\((1,0,...)\) is in the null space)
An isomorphism is an invertible linear transformation. Two vector spaces are called isomorphic if there is an isomorphism from one vector space onto the other one.
Theorem:
Two finite-dimensional vector spaces over \(F\) are isomorphic if and only if they have the same dimension
Theorem:
Suppose \(v_1...v_n\) is a basis of \(V\) and \(w_1,...w_m\) is a basis of \(W\). Then \(M\) is an isomorphism between \(L(V,W)\) and \(F^{m\times n}\)
Note:
This means that for \(A=M(T,\vec\alpha,\vec\beta)\), M is the linear map sending \(T\) to \(A\), and it is injective and surjective
Proof:
If \(T\in L(V,W)\) and \(M(T)=0\) (here 0 is the zero matrix), then \(T\vec v_k=\vec 0\) for each basis vector, where \(\vec v_1,...\vec v_n\) is the basis of V.
Thus \(T\vec v=0,\forall v\in V\), so \(T=0\) and \(Null(M)=\{0\}\), i.e. M is injective.
Theorem:
Suppose V and W are finite-dimensional. Then \(\dim L(V,W)=\dim V\cdot \dim W\)
Theorem:
Let \(T\in L(V,W)\) and \(A=M(T,\vec\alpha,\vec\beta)\); then \(Rank\,T=\dim(R(T))=Rank(A)\) and \(Nullity\,T=\dim(Null(T))=n-Rank(A)\)
Let \(T\in L(V,W)\), with \(A=M(T,(\vec\alpha_1,...\vec\alpha_n),(\vec\beta_1,...\vec\beta_m))\) and \(B=M(T,(\vec\alpha_1',...\vec\alpha_n'),(\vec\beta_1',...\vec\beta_m'))\).
Then there are two invertible matrices \(P\in F^{m\times m},Q\in F^{n\times n}\) such that \[
(\vec\alpha_1',...\vec\alpha_n')=(\alpha_1,...\alpha_n)Q\\
(\vec\beta_1,...\vec\beta_m)=(\vec\beta_1',...\vec\beta_m')P
\] Then this means that \[
T(\vec\alpha_1',...\vec\alpha_n')=T(\vec\alpha_1,...\vec\alpha_n)Q=(\vec\beta_1,...\vec\beta_m)AQ=(\vec\beta_1',...\vec\beta_m')PAQ\\B=PAQ
\] i.e. matrices of the same linear transformation T with respect to different bases are equivalent.
Example: let \(D:P_3\to P_2\) be differentiation, with matrix \(A\) in the standard bases \((1,x,x^2,x^3)\) and \((1,x,x^2)\).
For another basis of \(P_3:1,x,x^2,x^3+x^2\), we have \(Q:(1,x,x^2,x^3+x^2)=(1,x,x^2,x^3)\begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&1&1\\0&0&0&1\end{bmatrix}\)
For another basis of \(P_2:1+x,x,x^2\), we have \(P:(1,x,x^2)=(1+x,x,x^2)\begin{bmatrix}1&0&0\\-1&1&0\\0&0&1\end{bmatrix}\)
If we directly write D with the alternative bases, \(D(1,x,x^2,x^3+x^2)=(0,1,2x,2x+3x^2)=(1+x,x,x^2)\begin{bmatrix}0&1&0&0\\0&-1&2&2\\0&0&0&3\end{bmatrix}=(1+x,x,x^2)B\)
And then we can verify that \(B=PAQ\)
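A NumPy sketch verifying \(B=PAQ\) numerically; here A is taken to be the matrix of differentiation in the standard bases (an assumption consistent with the computation above):

```python
import numpy as np

# A: matrix of D: P3 -> P2 in the standard bases (1, x, x^2, x^3), (1, x, x^2)
A = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3]])
Q = np.array([[1, 0, 0, 0],          # new domain basis: 1, x, x^2, x^3 + x^2
              [0, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 1]])
P = np.array([[1, 0, 0],             # change to codomain basis 1+x, x, x^2
              [-1, 1, 0],
              [0, 0, 1]])
B = np.array([[0, 1, 0, 0],
              [0, -1, 2, 2],
              [0, 0, 0, 3]])
print(np.array_equal(B, P @ A @ Q))  # True
```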
Theorem:
Let \(T\in L(V,W)\) with \(A=M(T,\vec\alpha,\vec\beta)\) and \(rank(A)=r\); then there exist a basis \((\vec\alpha')\) of V and a basis \((\vec\beta')\) of W such that \[
T(\vec\alpha')=(\vec\beta')\begin{bmatrix}I_r&0\\0&0\end{bmatrix}
\]
A linear transformation from a vector space to itself is called an operator. \(L(V)=L(V,V)\)
Theorem:
Suppose \(V\) is finite-dimensional
and \(T\in L(V)\). Then the following
are equivalent:
T is invertible
T is injective
T is surjective
Corollary:
Suppose \(V\) is finite-dimensional and \(T\in L(V)\). Then \(T\) is invertible/bijective if and only if \(T\) maps a basis of \(V\) to a basis of \(V\); i.e. if \(v_1,...v_n\) is a basis of \(V\), then \(Tv_1,...Tv_n\) is also a basis of \(V\).
Remark: Suppose \(V\) is finite-dimensional and \(T\in L(V)\). If T is invertible and A is its matrix under the basis \(\{\alpha_1,...\alpha_n\}\), i.e. \[
T(\alpha_1,...\alpha_n)=(\alpha_1,...\alpha_n)A
\] then the matrix of the inverse operator \(T^{-1}\) under the same basis is \(A^{-1}\):\[
M(T,\vec\alpha,\vec\alpha)=A\\
M(T^{-1},\vec\alpha,\vec\alpha)=A^{-1}
\]
If \(\vec v_1,...\vec v_n\) is a
basis of V, then the dual basis of \(\vec v_1,...\vec v_n\) is the list \(\vec\phi_1,...\vec\phi_n\) of the elements
of \(V'\), where each \(\phi_j\) is the linear functional
on V such that \[
\phi_j(\vec v_k)=\begin{cases}1&j=k\\0&j≠k\end{cases}
\]Example
What is the dual basis of the standard basis \(\vec e_1,...\vec e_n\) of \(F^n\).
Sol:
For \(1\leq j\leq n\), define \(\phi_j\) to be the linear functional on \(F^n\) given by \(\phi_j((x_1,x_2,...x_n)^T)=x_j\)
Then \(\phi_j(\vec e_k)=\begin{cases}1&j=k\\0&j\neq k\end{cases}\), so \(\phi_1,...,\phi_n\) is the dual basis.
Theorem:
Suppose V is finite-dimensional. Then the dual basis of a basis of V is a basis of \(V'\)
Proof:
Suppose \(\vec v_1...\vec v_n\) is the basis of V and \(\vec\phi_1...\vec\phi_n\) is the dual basis.
Let \(\lambda_1\phi_1+\cdots+\lambda_n\phi_n=0\) (here 0 is the zero map); then applying both sides to any \(v\in V\) gives \(0\in F\)
In particular, for each \(v_j,j=1...n\): \(\lambda_1\phi_1(v_j)+\lambda_2\phi_2(v_j)+\cdots+\lambda_n\phi_n(v_j)=0\)
Since \(\phi_i(v_j)=0\) for \(i\neq j\) and \(\phi_j(v_j)=1\), this reduces to \(\lambda_j=0\)
So \(\lambda_j=0,j=1...n\); thus \(\phi_1,...,\phi_n\) are linearly independent
Also the list has \(n=\dim(V')\) elements
Therefore, \(\phi_1...\phi_n\) is a basis of \(V'\)
For \(U\subseteq V\), the annihilator of \(U\), denoted \(U^0\), is defined by \[
U^0=\{\phi\in V'|\phi(u)=0,\forall u\in U\}
\] Note that here 0 is a number.
Example
Suppose \(U\) is the subspace of \(P(\R)\) consisting of all polynomial multiples of \(x^2\). If \(\phi\) is the linear functional on \(P(\R)\) defined by \(\phi(p)=p'(0)\), show that \(\phi\in U^0\)
Sol:
We have \(U=\{p(x)\in P(\R)|p(x)=x^2g(x),g(x)\in P(\R)\}\)
We need to show that \(\forall p\in U, \phi(p)=0\): indeed \(p'(x)=2xg(x)+x^2g'(x)\), so \(\phi(p)=p'(0)=0\).
Let \(e_1,e_2,e_3,e_4,e_5\) denote the standard basis of \(\R^5\), and let \(\phi_1,\phi_2,\phi_3,\phi_4,\phi_5\) denote the dual basis of \((\R^5)'\). Suppose \[
U=span(e_1,e_2)
\] Show that \(U^0=span(\phi_3,\phi_4,\phi_5)\)
Sol:
First we show \(span(\phi_3,\phi_4,\phi_5)\subseteq U^0\): for \(\phi=\lambda_3\phi_3+\lambda_4\phi_4+\lambda_5\phi_5\) and \(u=u_1e_1+u_2e_2\in U\), \(\phi(u)=0\) since \(\phi_j(e_k)=0\) for \(j\geq 3,k\leq 2\).
Conversely, if \(\phi=\lambda_1\phi_1+\cdots+\lambda_5\phi_5\in U^0\), then \(0=\phi(e_1)=\lambda_1\) and \(0=\phi(e_2)=\lambda_2\), so \(\phi\in span(\phi_3,\phi_4,\phi_5)\).
A vector space V with a norm \(\lVert\cdot\rVert:V\to \R\) is called a normed vector space.
The concept of norm is closely related to that of a metric. For instance, a metric can be defined from any norm: let \(\lVert\vec v\rVert\) be a norm on a vector space V; then \(d(v,w)=\lVert \vec v-\vec w\rVert\) is the metric (distance) induced by the norm.
Example
Standard norms for \(\R^n\) and
\(\C^n\)
\(l^1\) norm: \(\lVert\vec v \rVert_1=\sum_{i=1}^n|\vec
v_i|\)
\(l^p\) norm: \(\lVert\vec v \rVert_p=(\sum_{i=1}^n|\vec
v_i|^p)^{\frac{1}{p}}\)
\(l^{\infty}\) norm: \(\lVert\vec v \rVert_{\infty}=\max_{i=1,...,n}|v_i|\)
Standard norms for the vector space of function from \([a,b]\) to \(\R\)
Note that here the integral notation refers to the Lebesgue
integral instead of the Riemann integral
Consider vectors in \(\R^n\)
with the Euclidean metric \(d(v,w)=\sqrt{\sum_{i=1}^n(v_i-w_i)^2}\).
Let \(\bar d(v,w)=min\{d(v,w),1\}\) and
define \(f(v)=\bar d(v,0)\). Is \(f\) a norm?
Sol:
We know that \(f(v)=\min(d(v,0),1)\)
Check homogeneity: is \(f(\lambda v)=|\lambda|f(v)\)?
\(f(\lambda v)=\min(d(\lambda v,0),1)=\min(|\lambda|d(v,0),1)\) (because homogeneity holds for \(d(v,w)\))
\(|\lambda| f(v)=|\lambda| \min(d(v,0),1)\)
Let \(\vec v=\vec e\) be a unit vector and \(\lambda=2\); then \(\begin{cases}f(2\vec e)=\min(2d(\vec e,0),1)=1\\2f(\vec e)=2\end{cases}\), so homogeneity fails, meaning that \(f\) is not a norm.
A vector \(\vec v\in V\) is said to be normalized if \(\lVert\vec v\rVert=1\). Any nonzero vector can be normalized: \[
\vec u=\frac{\vec v}{\lVert \vec v\rVert}
\] has norm \(\lVert \vec u\rVert=1\). A normalized vector is also referred to as a unit vector.
Let \(V\) and \(W\) be two normed vector spaces and let \(T:V\to W\) be a linear transformation. The induced operator norm of \(T\) is defined as \[
\lVert T\rVert=\sup_{v\in V\setminus\{0\}}\frac{\lVert Tv\rVert_W}{\lVert v\rVert_V}=\sup_{u\in V,\lVert u\rVert=1}\lVert Tu\rVert_W
\]
For the space \(L(V,V)\) of linear operators on \(V\), a norm is called submultiplicative if \(\lVert TU\rVert\leq\lVert T\rVert\lVert U\rVert,\forall T,U\in L(V,V)\)
Remark:
The induced operator norm inequality shows that all induced operator norms are submultiplicative because \[
\lVert UTu\rVert\leq \lVert U\rVert\lVert Tu\rVert\leq\lVert U\rVert\lVert T\rVert\lVert u\rVert
\]
A norm of a vector space of matrices is called a matrix norm.
For \(A\in F^{m\times n}\), the matrix norm induced by the \(l^p\) vector norm \(\lVert\cdot\rVert_p\) is \[
\lVert A\rVert_p=\max_{\lVert\vec v\rVert_p=1}\lVert A\vec v \rVert_p
\] In special cases:
\(\lVert A\rVert_\infty=\max_{\lVert\vec v\rVert_\infty=1}\lVert A\vec v \rVert_\infty=\max_i\sum_j|a_{ij}|\) (max row sum)
\(\lVert A\rVert_1=\max_{\lVert\vec v\rVert_1=1}\lVert A\vec v \rVert_1=\max_j\sum_i|a_{ij}|\) (max column sum)
\(\lVert A\rVert_2=\sqrt{\rho(A^H
A)}\), where \(\rho(B)\) is the
maximum absolute value of all eigenvalues
For example, \(A=\begin{bmatrix}1&1\\2&1\end{bmatrix}\), then \(A^TA=\begin{bmatrix}5&3\\3&2\end{bmatrix}\), and the eigenvalues of this matrix are \(\sigma_1^2=6.8541,\sigma_2^2=0.1459\), so \(\lVert A\rVert_2=\sqrt{6.8541}\approx 2.618\)
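A NumPy sketch confirming \(\lVert A\rVert_2=\sqrt{\rho(A^TA)}\) for this matrix:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
eigvals = np.linalg.eigvalsh(A.T @ A)   # eigenvalues of A^T A
print(np.sqrt(eigvals.max()))           # 2.6180... = sqrt(6.8541)
print(np.linalg.norm(A, 2))             # same value, computed by NumPy
```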
Let \(\{\vec v_1,\vec v_2...\vec v_n\}\) be a basis for the n-dimensional vector space V; then any vector \(\vec w\in V\) can be expressed uniquely as \[
\vec w=\sum_{i=1}^ns_i\vec v_i
\] Ordering this set as \((v_1,v_2,...v_n)\) allows the first element in the coordinate vector to be associated with the first vector in our basis, and so on.
If \(V\) is a finite-dimensional vector space, an ordered basis for V is a finite sequence of vectors that is linearly independent and spans V.
The ordered basis \(B\), denoted by \((\vec v_1,...\vec v_n)\), defines the set and a specific ordering of the vectors. Based on this ordered basis, a vector \(v\in V\) can be unambiguously represented as an n-tuple \((s_1,...,s_n)\in F^n\) such that \[
\vec v=\sum_{i=1}^n s_i\vec v_i
\] For a finite-dimensional vector space V with ordered basis \(B=(\vec v_1,...\vec v_n)\), the coordinate vector of \(\vec v\in V\) is denoted by \([\vec v]_B\) and equals the unique vector \(\vec s\in F^n\) such that \[
\vec v=\sum_{i=1}^n s_i\vec v_i
\]Example
Suppose that \(A=(\vec v_1,...\vec v_n)\) is an ordered basis for V. Let \(P\) be an \(n\times n\) invertible matrix. Show that there exists an ordered basis \(B=(\vec w_1,...\vec w_n)\) for \(V\) such that \([u]_A=P[u]_B\) for all \(u\in V\)
Sol:
Let \((\vec w_1,...\vec w_n)=(\vec
v_1,...\vec v_n)P\)
then we have \(\vec u=(\vec v_1,...\vec
v_n)[\vec u]_A\\=(\vec w_1,...\vec w_n)P^{-1}[\vec u]_A\)
Since \(\vec u=(\vec w_1,...\vec w_n)[\vec
u]_B\)
We have \([\vec u]_B=P^{-1}[\vec
u]_A\)
Which means \([\vec u]_A=P[\vec
u]_B\)
More specifically:
\(A=(1,x,x^2),B=(1,x+x^2,x^2)\)
We have \((1,x+x^2,x^2)=(1,x,x^2)\begin{bmatrix}1&0&0\\0&1&0\\0&1&1\end{bmatrix}\)
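A NumPy sketch of the coordinate change \([u]_B=P^{-1}[u]_A\) for this concrete P (the test polynomial is invented):

```python
import numpy as np

# Bases of P_2: A = (1, x, x^2) and B = (1, x + x^2, x^2), related by
# (new basis) = (old basis) P with the matrix below.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

u_A = np.array([2.0, 3.0, 5.0])   # u = 2 + 3x + 5x^2 in basis A
u_B = np.linalg.solve(P, u_A)     # [u]_B = P^{-1} [u]_A
print(u_B)                        # [2. 3. 2.]: u = 2*1 + 3*(x+x^2) + 2*x^2
print(P @ u_B)                    # back to [u]_A
```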
Let \(F\) be the field of real numbers or the field of complex numbers, and assume \(V\) is a vector space over \(F\). An inner product on \(V\) is a function which assigns to each ordered pair of vectors \(\vec v,\vec w\in V\) a scalar \(<\vec v,\vec w>\in F\) in such a way that for all \(\vec u,\vec v,\vec w\in V,\forall s\in F\): \(<\vec u+\vec v,\vec w>=<\vec u,\vec w>+<\vec v,\vec w>\) and \(<s\vec u,\vec w>=s<\vec u,\vec w>\) (linearity in the first argument); \(<\vec v,\vec w>=\overline{<\vec w,\vec v>}\) (conjugate symmetry); \(<\vec v,\vec v>\geq 0\) with equality iff \(\vec v=0\) (positive definiteness).
The Euclidean inner product on \(F^n\) is defined by \[
<(w_1,...w_n)^T,(z_1,...z_n)^T>=w_1\overline{z_1}+\cdots+
w_n\overline{z_n}
\] For column vectors, \(<v,w>=w^H\cdot v\)
The inner product on a function space: let V be the vector space of all continuous complex-valued functions on the unit interval \([0,1]\). Then the following defines an inner product \[
<f,g>=\int_0^1 f\overline{g}dx
\]
Let V be a finite-dimensional vector space, and suppose that \(B=(w_1,...w_n)\) is an ordered basis for V. Any inner product on V is determined by the values \(G_{ij}=<w_j,w_i>\) that it takes on pairs of vectors in B (the Gram matrix G).
For any \(\vec x\in F^n,\vec x^HG\vec x\geq 0\), with equality iff \(\vec x=0\) (G is positive definite)
Theorem: If \(G\in F^{n\times n}\) is positive definite and \(A=(\alpha_1,...\alpha_n)\) is a basis of the vector space V, then for any \(\alpha,\beta\in V\) with coordinates x, y under the basis \((\alpha_1,...\alpha_n)\), the function \(<\alpha,\beta>\) defined as \(<\alpha,\beta>=y^HGx\) is an inner product, and under this basis its Gram matrix is G: \[
f(\alpha,\beta)=[\beta]^H_AG[\alpha]_A
\] Remark:
For a finite-dimensional vector space \(V^n\) and basis \((\alpha_1,...\alpha_n)\), we may define the map \(\sigma\) sending an inner product \(<\cdot,\cdot>\) to its Gram matrix \(G\). The previous theorem shows that this map is an injection and surjection (a bijection).
Let V be an inner-product space with inner product \(<\cdot,\cdot>\). This inner product naturally induces the norm \[
\lVert u\rVert^2=<u,u>
\]Projection:
Let \(w,v\) be vectors in an inner-product space V with inner product \(<\cdot,\cdot>\). The projection of \(w\) onto \(v\) (a vector in the direction of \(\vec v\)) is defined to be \[
u=\frac{<w,v>}{\lVert v\rVert}\,(\text{length})\ \cdot\frac{v}{\lVert v\rVert}\,(\text{direction})
\]
Lemma:
Let \(u\) be the projection of \(w\) onto \(v\). Then \(<w-u,u>=0\) and \(\lVert w\rVert^2=\lVert u\rVert^2+\lVert
w-u\rVert^2\)
Cauchy-Schwarz: suppose \(\vec u,\vec v\in V\); then \(|<\vec u,\vec v>|\leq \lVert\vec u\rVert\lVert\vec v\rVert\), i.e. \(|<\vec u,\vec v>|^2\leq <\vec u,\vec u><\vec v,\vec v>\). This inequality is an equality if and only if \(\vec u=\lambda\vec v\) for some \(\lambda \in F\)
Proof:
If \(\vec v=0\) or \(\vec u=0\), then it is trivially true.
If \(\vec v\neq 0,\vec u\neq 0\), then by orthogonal decomposition,
\(\vec u=\frac{<\vec u,\vec v>}{\lVert \vec v\rVert^2}\vec v+\vec w\), where \(\vec w\) is the residual with \(<\vec v,\vec w>=0\); taking norms and applying the Pythagorean identity gives the inequality.
Equality holds exactly when \(\lVert\vec w\rVert^2=0\), i.e. when \(\vec u\) is a multiple of \(\vec v\).
Example:
If \((x_1,x_2,...,x_n),(y_1,...y_n)\in \R^n\), then \[
|<\vec x,\vec
y>|^2=|x_1y_1+...+x_ny_n|^2\leq(x_1^2+...x_n^2)\cdot(y_1^2+...+y_n^2)
\]
If \(f\) and \(g\) are continuous real-valued functions on \([-1,1]\), then \[
|<f,g>|^2=|\int_{-1}^1fgdx|^2\leq|\int_{-1}^1f^2dx||\int_{-1}^1g^2dx|=<f,f>\cdot<g,g>=\lVert
f\rVert^2\lVert g\rVert^2
\]
If \(V\) is an inner-product space over \(F\) and \(\lVert\vec v \rVert=\sqrt{<\vec v,\vec v>}\), then for any \(\vec v,\vec w\in V\) and any \(s\in F\): \(\lVert s\vec v\rVert=|s|\lVert\vec v\rVert\) and \(\lVert\vec v+\vec w\rVert\leq\lVert\vec v\rVert+\lVert\vec w\rVert\) (triangle inequality, via Cauchy-Schwarz), so the induced norm is indeed a norm.
A list of vectors is called orthonormal if each vector in the list has norm 1 and is orthogonal to all the other vectors in the list. In other words, a list of vectors \(e_1,...e_m\) is orthonormal if \[
<e_i,e_j>=\begin{cases}1&if\ i=j\\0&if\ i\neq j\end{cases}
\]
Properties:
If \(e_1,...e_m\) is an orthonormal list of vectors in V, then \(\lVert a_1e_1+\cdots+a_me_m\rVert^2=|a_1|^2+\cdots+|a_m|^2\); in particular the list is linearly independent.
Basis \(\implies\) linearly independent and spans the vector space
Theorem:
Every orthonormal list of vectors in \(V\) with length \(\dim V\) is an orthonormal
basis of V.
Example
Standard orthonormal basis of \(\R^4:(1,0,0,0)^T,(0,1,0,0)^T,(0,0,1,0)^T,(0,0,0,1)^T\)
Let V be the vector space over \(\C\) of continuous complex-valued functions
on the interval \(0\leq x\leq 1\) with
the inner product \[
<f,g>=\int_0^1f\bar g\,dx
\] Let \(f_n=\sqrt{2}\cos 2\pi
nx,g_n=\sqrt{2}\sin 2\pi nx\). Then \(\{1,f_1,g_1,f_2,g_2,\cdots\}\) is a
countably infinite orthonormal set that is a Schauder basis for this
vector space.
Suppose \(\vec e_1,...,\vec e_n\) is
an orthonormal basis of V and \(\vec v\in
V\). Then \[
v=<v,e_1>e_1+<v,e_2>e_2+\cdots+<v,e_n>e_n\\ \text{and}\ \lVert
v\rVert^2=|<v,e_1>|^2+|<v,e_2>|^2+\cdots+|<v,e_n>|^2
\] Proof: Write \(\vec v=a_1\vec e_1+\cdots+a_n\vec e_n\) and take the inner product with \(\vec e_j\); orthonormality gives \(a_j=<\vec v,\vec e_j>\), and expanding \(<\vec v,\vec v>\) gives the norm identity.
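A small NumPy check of both identities, using a random orthonormal basis of \(\R^3\) (taken from a QR factorization, purely for illustration):

```python
import numpy as np

# Columns of E form an orthonormal basis of R^3 (Q factor of a QR factorization).
E = np.linalg.qr(np.random.default_rng(5).standard_normal((3, 3)))[0]
v = np.array([1.0, 2.0, 3.0])

coeffs = E.T @ v                               # <v, e_i> for each basis vector
print(np.allclose(E @ coeffs, v))              # v = sum <v, e_i> e_i
print(np.isclose(np.sum(coeffs ** 2), v @ v))  # ||v||^2 = sum |<v, e_i>|^2
```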
Gram-Schmidt Theorem: Let \(V\) be an inner-product space
and assume \(\vec v_1,...,\vec v_n\) are
linearly independent vectors in \(V\).
Then it is possible to construct an orthogonal sequence of vectors \(\vec w_1,...,\vec w_n\in V\) such that for
each \(k=1,...,n\) the set \(\{\vec w_1,...,\vec w_k\}\) is a basis of
the subspace spanned by \(\vec v_1,...,\vec
v_k\)
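A minimal NumPy sketch of the procedure (classical Gram-Schmidt with normalization; the example matrix is illustrative):

```python
import numpy as np

# Classical Gram-Schmidt: orthonormalize the (linearly independent) columns of V.
def gram_schmidt(V):
    Q = []
    for v in V.T:
        w = v - sum((v @ q) * q for q in Q)  # subtract projections onto earlier e_i
        Q.append(w / np.linalg.norm(w))      # normalize the residual
    return np.column_stack(Q)

V = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q = gram_schmidt(V)
print(np.round(Q.T @ Q, 10))  # identity matrix: the columns are orthonormal
```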
Corollary:
Every finite-dimensional inner-product space has an
orthonormal basis.
Remark: Let \(V\) be an
inner-product space. Then every orthonormal list of vectors in V can be
extended to an orthonormal basis of V
Example
Find an orthonormal basis of \(P_2(\R)\), where the inner product is given
by \(<p,q>=\int_{-1}^1p(x)q(x)dx\).
Sol: The standard basis of \(P_2(\R)\) is \(\vec v_1=1,\vec v_2=x,\vec v_3=x^2\)
Then we use the Gram-Schmidt process to get the orthonormal basis: \(\vec e_1=\sqrt{\tfrac{1}{2}},\ \vec e_2=\sqrt{\tfrac{3}{2}}x,\ \vec e_3=\sqrt{\tfrac{45}{8}}\left(x^2-\tfrac{1}{3}\right)\)
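These three polynomials can be verified symbolically; a short SymPy check (illustrative, computing the Gram matrix of the claimed basis):

```python
import sympy as sp

x = sp.symbols('x')
e = [sp.sqrt(sp.Rational(1, 2)),
     sp.sqrt(sp.Rational(3, 2)) * x,
     sp.sqrt(sp.Rational(45, 8)) * (x**2 - sp.Rational(1, 3))]

# <p, q> = integral of p*q over [-1, 1]; an orthonormal basis gives the identity
G = sp.Matrix(3, 3, lambda i, j: sp.integrate(e[i] * e[j], (x, -1, 1)))
print(G)  # Matrix([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
```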
Let \(V\) be an inner-product space
and \(W\) be any set of vectors in
\(V\). Then the orthogonal
complement of \(W\), denoted
\(W^{\bot}\), is the set of all
vectors in V that are orthogonal to every vector in W\[
W^{\bot}=\{v\in V|<v,w>=0,\forall w\in W\}
\]
Remark:
Let \(W\) be any
subset of a vector space \(V\). Then \(W^{\bot}\) is a closed subspace of \(V\), and any vector in the subspace
spanned by \(W\) is orthogonal to every
vector in \(W^{\bot}\)
Suppose \(U\) is a
finite-dimensional subspace of \(V\).
Then \(V=U\oplus U^{\bot}\)
Proof: Let \(\vec e_1,...,\vec e_m\)
be an orthonormal basis of \(U\); then
it can be extended to an orthonormal basis of \(V\): \(\vec
e_1,...,\vec e_m,\vec e_{m+1},...,\vec e_n\)
Then we need to show \(\vec e_{m+1},...,\vec
e_n\) is a basis of \(U^{\bot}\)
By orthonormality, \(<e_j,e_k>=0\) for \(j> m,k\leq m\implies \vec
e_j\in U^{\bot}\)
This means we have \[
\vec v=<\vec v,\vec e_1>\vec e_1+...+<\vec v,\vec e_m>\vec
e_m+<\vec v,\vec e_{m+1}>\vec e_{m+1}+...+<\vec v,\vec
e_n>\vec e_n=\vec u+\vec w,\vec u\in U,\vec w\in U^{\bot}
\]
Corollary:
Suppose \(V\) is finite-dimensional
and \(U\) is a subspace of \(V\), then \[
\dim V=\dim U+\dim U^{\bot}
\]
Theorem:
Suppose \(U\) is a
finite-dimensional subspace of \(V\).
Then \[
(U^{\bot})^{\bot}=U
\]
Remark:
Suppose \(U\subseteq V\) is a
finite-dimensional subspace.
In general \(V=U\oplus W_1=U\oplus W_2\) with \(W_1\neq W_2\),
i.e. a direct-sum complement is not unique.
However, the decomposition \(V=U\oplus U^{\bot}\)
is unique, i.e. for each \(\vec v\in V\)
there are a unique \(\vec u\in U\) and a unique \(\vec w\in U^{\bot}\) such that
\(\vec v=\vec u+\vec w\)
Proof:
We need to show that if \(V=U\oplus
T_1=U\oplus T_2\), where \(T_1,
T_2\) are both orthogonal to \(U\), then \(T_1=T_2\)
Suppose we have \(V=U\oplus T_2\),
let \(\vec t_1\in T_1\subseteq V\),
then there are \(\vec u\in U,\vec t_2\in
T_2\) such that \(\vec t_1=\vec u+\vec
t_2\)
Then, since \(T_1,T_2\) are orthogonal to \(U\), \(0=<\vec t_1,\vec
u>=<\vec u,\vec u>+<\vec t_2,\vec u>=<\vec u,\vec
u>\Longleftrightarrow \vec u=0\)
Hence \(\vec t_1=\vec t_2\in T_2\), so \(T_1\subseteq T_2\), and by symmetry \(T_1=T_2\)
Both \(U\) and \(U^{\bot}\) are closed, and \((U^{\bot})^{\bot}=U\)
The orthogonal projection of \(V\) onto \(U\) is the operator \(P_U\in L(V)\) defined as follows (it maps \(\vec v\in V\) to \(\vec u\in U\)): \[
\text{for}\ \vec v\in V,\ \vec v=\vec u+\vec w,\ \text{where}\ \vec u\in U,\vec w\in
U^{\bot},\ \text{then}\ P_U\vec v=\vec u
\]
Example
Suppose \(\vec x\in V\) with \(\vec x\neq\vec 0\) and \(U=span(\vec x)\). Show that \[
P_U\vec v=\frac{<\vec v,\vec x>}{\lVert\vec x\rVert^2}\vec
x,\forall \vec v\in V
\]
Sol:
By orthogonal decomposition, \(\vec v\) can be written as \(\vec v=\frac{<\vec v,\vec x>}{\lVert\vec x\rVert^2}\vec x+\vec w\) with \(<\vec w,\vec x>=0\); the first term lies in \(U=span(\vec x)\) and \(\vec w\in U^{\bot}\), so \(P_U\vec v=\frac{<\vec v,\vec x>}{\lVert\vec x\rVert^2}\vec x\)
Suppose \(U\) is a
finite-dimensional subspace of \(V\),
\(\vec v\in V,\vec u\in U\), then \[
\underbrace{\lVert\vec v-P_U\vec v\rVert}_{minimal\ residual}\leq \underbrace{\lVert\vec v-\vec
u\rVert}_{residual}
\] And the inequality above is an equality iff \(\vec u=P_U\vec v\)
Best Approximation (Finite-dimensional)
\(V\) is a vector space with an
inner product (defined), \(\lVert\cdot\rVert\) is the induced
norm, \(U\) is a subspace of
\(V\)
For \(\vec v\in V\), we have \[
\underbrace{\lVert\vec v-P_U\vec v\rVert}_{minimal\ loss}\leq\lVert\vec v-\vec
u\rVert,\forall\vec u\in U
\] Then \(\hat{\vec v}=P_U\vec v\) is
called the best approximation of \(\vec v\) on \(U\): \[
\lVert\vec v-\hat{\vec v}\rVert\leq\lVert\vec v-\vec u\rVert,\forall\vec
u\in U
\] Then, if \(\hat{\vec v}\) is
the best approximation, \[
<\vec v-\hat{\vec v},\vec u>=0,\forall \vec u\in U
\]
Example
Find a polynomial u with real coefficients and degree at most 5 that
approximates \(\sin x\) as well as
possible on the interval \([-\pi,\pi]\)
in the sense that \(\int_{-\pi}^\pi|\sin x-u(x)|^2dx\) is as
small as possible.
Sol:
Let \(C_R[-\pi,\pi]\) denote the
real inner-product space of continuous real-valued functions on \([-\pi,\pi]\) with inner product (derived from
the loss function) \[
<f,g>=\int_{-\pi}^\pi fg\,dx
\] \(U=P_5([-\pi,\pi])\) is the space of polynomials of degree at
most 5, \(U\subseteq
C_R[-\pi,\pi]\); then our problem is to find \(\vec u\in U\) such that \(\lVert\vec v-\vec u\rVert\)
is minimal
Then the best approximation of \(\sin x\) in \(P_5\) is \(P_U(\sin x)=P_U(v)=<v,\vec e_1>\vec
e_1+\cdots+<v,\vec e_6>\vec e_6\)
So we first apply the Gram-Schmidt procedure to the standard basis of \(P_5\): \(1,x,x^2,x^3,x^4,x^5\), producing an
orthonormal basis \(\vec e_1,...,\vec
e_6\) of \(U\)
Then \(P_U(v)=<v,\vec e_1>\vec e_1+\cdots+<v,\vec e_6>\vec
e_6\implies \text{applying}\ <f,g>=\int_{-\pi}^\pi fg\,dx\implies
P_U\vec v=0.987862x-0.155271x^3+0.00564312x^5\)
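The same answer can be reached numerically without building the orthonormal basis, by solving the normal equations \(Gc=b\) with \(G_{ij}=<x^i,x^j>\) and \(b_i=<\sin x,x^i>\); a rough NumPy sketch (trapezoidal quadrature on a fine grid, so the printed coefficients are approximate):

```python
import numpy as np

def trapz(y, x):
    # simple trapezoidal rule, to approximate <f, g> = integral over [-pi, pi]
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2

xs = np.linspace(-np.pi, np.pi, 200001)
basis = [xs**k for k in range(6)]                 # monomials 1, x, ..., x^5

G = np.array([[trapz(p * q, xs) for q in basis] for p in basis])
b = np.array([trapz(np.sin(xs) * p, xs) for p in basis])
c = np.linalg.solve(G, b)                         # coefficients of 1, x, ..., x^5
print(np.round(c, 6))  # ~ [0, 0.987862, 0, -0.155271, 0, 0.005643]
```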
A linear functional on \(V\) is a linear transformation from \(V\) to \(F\) (map vector to number). In other words,
a linear functional is an element of \(L(V,F)\)
Example
The function \(\phi:F^3\to F\)
defined by \[
\phi(z_1,z_2,z_3)=2z_1-5z_2+z_3\\\implies
\phi(z)=<z,u>=<(z_1,z_2,z_3)^T,(2,-5,1)^T>, \forall z\in F^3
\] is a linear functional on \(F^3\).
Suppose \(V\) is finite-dimensional
and \(\phi\) is a linear functional on
\(V\). Then there is a unique \(\vec u\in V\) such that \[
\phi(\vec v)=<\vec v,\vec u>
\]
Suppose \(T\in L(V,W)\). The
adjoint of \(T\) is
the function \(T^*:W\to V\) such that
\(<T\vec v,\vec w>_W=<\vec v,T^*\vec
w>_V,\forall \vec v\in V,\vec w\in W\)
Let \(T\in L(V,W)\). Suppose \(e_1,...,e_n\) is an orthonormal basis of
\(V\) and \(f_1,...,f_m\) is an orthonormal basis of \(W\). Then \[
M(T^*,(f_1...f_m),(e_1...e_n))=A^H
\] is the conjugate transpose of \[
M(T,(e_1...e_n),(f_1...f_m))=A
\]
Example
For \(T:\R^3\to \R^2,
T(x_1,x_2,x_3)^T=(x_2+3x_3,2x_1)^T,\ and \ T^*:\R^2\to \R^3,
T^*(y_1,y_2)^T=(2y_2,y_1,3y_1)^T\)
Let \(e_1=(1,0,0)^T,e_2=(0,1,0)^T,e_3=(0,0,1)^T,f_1=(1,0)^T,f_2=(0,1)^T\)
we have \[
T(e_1,e_2,e_3)=(f_1,f_2)\underbrace{\begin{bmatrix}0&1&3\\2&0&0\end{bmatrix}}_{A}\\T^*(f_1,f_2)=(e_1,e_2,e_3)\underbrace{\begin{bmatrix}0&2\\1&0\\3&0\end{bmatrix}}_{A^H}
\]
Big Picture
Let \(T\) be a bounded linear
transformation from \(V\to W\). The
equation \(T\vec v=\vec w\) has a
solution if and only if \(<\vec w,\vec u>=0,\forall \vec u\in Null\,T^*\)\[
\vec w \in R(T)\Longleftrightarrow \vec w \bot Null\, T^*
\] In matrix notation, \(A\vec v=\vec
w\) has a solution if and only if \(\vec u^H\vec w=0,\forall \vec u \ such\ that\
A^H\vec u=0\)
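A NumPy spot-check of the adjoint identity and of this range condition, using the matrix \(A\) from the example above (the test vectors are random, purely illustrative):

```python
import numpy as np

# A is the matrix of T from the example; since A is real, A^H = A^T.
A = np.array([[0.0, 1.0, 3.0],
              [2.0, 0.0, 0.0]])

rng = np.random.default_rng(1)
v, w = rng.standard_normal(3), rng.standard_normal(2)

# Adjoint identity: <Tv, w> = <v, T*w>
print(np.isclose((A @ v) @ w, v @ (A.T @ w)))  # True

# Range condition: A has rank 2 here, so Null A^H = {0} and every w is in R(A)
print(np.linalg.matrix_rank(A) == 2)           # True: A v = w is always solvable
```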
Let \(T\) be a bounded linear
transformation from \(V\to W\) (\(\dim V<\dim W\)). The vector \(\vec v\in V\) minimizes \(\lVert T\vec v-\vec w\rVert\) if and only
if \[
T^*T\vec v=T^*\vec w,\vec w\in W
\] If A is a matrix such that \(A^HA\) is invertible, the best
approximation minimizing \(\lVert A\vec v-\vec
w\rVert\) is \[
\vec v=(A^HA)^{-1}A^H\vec w
\] Suppose \(T\) is a linear
transformation from \(V\to W\). Suppose the
vector \(\vec v\in V\) minimizes \(\lVert\vec v\rVert\) among all solutions of \(T\vec v=\vec w\). Then \(\vec v\in R(T^*)\), i.e. \[
\exists \vec u\in W,\ \text{such that}\ TT^*\vec u=\vec w,\quad \vec v=T^*\vec u
\] If A is a matrix such that \(AA^H\) is invertible, the
minimum-norm solution minimizing \(\lVert\vec
v\rVert\) is \[
\vec v=A^H(AA^H)^{-1}\vec w
\]
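Both matrix formulas are easy to sanity-check in NumPy against the library solvers (random data, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)

# Overdetermined case: minimize ||A v - w|| via the normal equations.
A = rng.standard_normal((6, 3))                     # tall matrix, full column rank
w = rng.standard_normal(6)
v = np.linalg.solve(A.T @ A, A.T @ w)               # v = (A^H A)^{-1} A^H w
print(np.allclose(v, np.linalg.lstsq(A, w, rcond=None)[0]))  # matches lstsq
print(np.allclose(A.T @ (A @ v - w), 0))            # residual orthogonal to R(A)

# Underdetermined case: minimum-norm solution of A v = w.
A = rng.standard_normal((3, 6))                     # wide matrix, full row rank
w = rng.standard_normal(3)
v = A.T @ np.linalg.solve(A @ A.T, w)               # v = A^H (A A^H)^{-1} w
print(np.allclose(A @ v, w))                        # v solves the system
print(np.allclose(v, np.linalg.pinv(A) @ w))        # agrees with the pseudoinverse
```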
Let \(W\in \C^{m\times n}\) with linearly independent columns and \(\vec v\in \C^m\);
then the orthogonal projection of v onto the column space of W is given by
\[
P_W\vec v= W(W^HW)^{-1}W^H\vec v
\] In particular, if \(W\) is a
matrix with orthonormal columns, then \[
P_W\vec v=WW^H\vec v
\]
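A final NumPy sketch of both projection formulas (random real matrix, so \(W^H=W^T\)):

```python
import numpy as np

rng = np.random.default_rng(4)
W = rng.standard_normal((5, 2))        # full column rank with probability 1
v = rng.standard_normal(5)

u = W @ np.linalg.solve(W.T @ W, W.T @ v)  # P_W v = W (W^H W)^{-1} W^H v
print(np.allclose(W.T @ (v - u), 0))       # residual orthogonal to col(W)

Q, _ = np.linalg.qr(W)                     # same column space, orthonormal columns
print(np.allclose(Q @ (Q.T @ v), u))       # P_W v = W W^H v in the orthonormal case
```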