Faculty of Mathematics, Mechanics and Informatics, University of Science, Vietnam National University, Hanoi.
(1)Subgradients
Hoàng Nam Dũng
(2)Last time: gradient descent
Consider the problem
$\min_x f(x)$
for $f$ convex and differentiable, $\operatorname{dom}(f) = \mathbb{R}^n$.
Gradient descent: choose initial $x^{(0)} \in \mathbb{R}^n$, repeat
$x^{(k)} = x^{(k-1)} - t_k \cdot \nabla f(x^{(k-1)}), \quad k = 1, 2, 3, \ldots$
Step sizes $t_k$ chosen to be fixed and small, or by backtracking line search.
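The update rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming a fixed step size; the function name `gradient_descent`, the parameters `step_size` and `num_iters`, and the quadratic test objective $f(x) = \tfrac{1}{2}\|x - c\|_2^2$ are hypothetical choices, not part of the lecture.

```python
def gradient_descent(grad_f, x0, step_size=0.1, num_iters=200):
    """Iterate x(k) = x(k-1) - t * grad_f(x(k-1)) with a fixed step size t."""
    x = list(x0)
    for _ in range(num_iters):
        g = grad_f(x)
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
    return x

# Hypothetical objective: f(x) = 0.5 * ||x - c||^2, whose gradient is x - c.
c = [1.0, -2.0]
x_min = gradient_descent(lambda x: [xi - ci for xi, ci in zip(x, c)], x0=[0.0, 0.0])
```

For this objective the iterates contract toward `c` by a factor of `1 - step_size` per step, so `x_min` ends up very close to `c`.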
(3)Outline
Today:
• Subgradients
• Examples
• Properties
• Optimality characterizations
(4)Basic inequality
Recall that for convex and differentiable $f$,
$f(y) \ge f(x) + \nabla f(x)^T (y - x), \quad \forall x, y \in \operatorname{dom}(f).$
• The first-order approximation of $f$ at $x$ is a global lower bound.
• $\nabla f(x)$ defines a non-vertical supporting hyperplane to $\operatorname{epi} f$ at $(x, f(x))$, with normal vector $(\nabla f(x), -1)$.
[Figure: graph of $f$ with the supporting line at $(x, f(x))$ and the normal vector $(\nabla f(x), -1)$.]
(5)Subgradients
A subgradient of a convex function $f$ at $x \in \operatorname{dom}(f)$ is any $g \in \mathbb{R}^n$ such that
$f(y) \ge f(x) + g^T (y - x), \quad \forall y \in \operatorname{dom}(f).$
• Always exists (on the relative interior of $\operatorname{dom}(f)$)
• If $f$ is differentiable at $x$, then $g = \nabla f(x)$ uniquely
• Same definition works for nonconvex $f$ (however, subgradients need not exist)
[Figure: graph of $f(y)$ with the affine lower bounds $f(x_1) + g_1^T (y - x_1)$, $f(x_1) + g_2^T (y - x_1)$, and $f(x_2) + g_3^T (y - x_2)$; $g_1$ and $g_2$ are subgradients at $x_1$, $g_3$ is a subgradient at $x_2$.]
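The defining inequality can be probed numerically. The sketch below is a one-dimensional illustration only: the helper `is_subgradient`, the test grid, and the two example functions are hypothetical, and passing the check on a finite grid is evidence, not a proof, that $g$ is a subgradient.

```python
def is_subgradient(f, x, g, test_points, tol=1e-12):
    """Check f(y) >= f(x) + g * (y - x) at every y in test_points (1-D case)."""
    return all(f(y) >= f(x) + g * (y - x) - tol for y in test_points)

# Test points on a grid in [-3, 3].
grid = [i / 10 for i in range(-30, 31)]

square = lambda y: y * y        # convex and differentiable
neg_abs = lambda y: -abs(y)     # not convex

# For the differentiable convex function, the derivative f'(1) = 2 passes...
ok_gradient = is_subgradient(square, 1.0, 2.0, grid)
# ...while a different slope violates the inequality for some y.
ok_other = is_subgradient(square, 1.0, 2.5, grid)
# For the nonconvex -|y| at x = 0, none of the candidate slopes in [-5, 5] passes.
any_for_neg_abs = any(is_subgradient(neg_abs, 0.0, s / 10, grid) for s in range(-50, 51))
```

Here `ok_gradient` is true, `ok_other` is false, and `any_for_neg_abs` is false, matching the second and third bullets above.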
(7)Examples of subgradients
Consider $f : \mathbb{R} \to \mathbb{R}$, $f(x) = |x|$.
[Figure: plot of $f(x) = |x|$ against $x$.]
• For $x \neq 0$, unique subgradient $g = \operatorname{sign}(x)$
• For $x = 0$, subgradient $g$ is any element of $[-1, 1]$
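The two cases for $f(x) = |x|$ can be written as a small function that returns the whole set of subgradients as an interval. This is only a sketch; the name `abs_subdifferential` and the interval representation `(lo, hi)` are hypothetical choices.

```python
def abs_subdifferential(x):
    """Return the set of subgradients of f(x) = |x| at x as an interval (lo, hi)."""
    if x > 0:
        return (1.0, 1.0)      # unique subgradient sign(x) = 1
    if x < 0:
        return (-1.0, -1.0)    # unique subgradient sign(x) = -1
    return (-1.0, 1.0)         # at x = 0, every g in [-1, 1] is a subgradient

# Spot check at x = 0: |y| >= 0 + g * (y - 0) for several g in [-1, 1].
grid = [i / 10 for i in range(-30, 31)]
zero_case_ok = all(abs(y) >= g * y for g in (-1.0, -0.5, 0.0, 0.5, 1.0) for y in grid)
```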
(8)Examples of subgradients
Consider $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = \|x\|_2$.
[Figure: plot of $f(x) = \|x\|_2$ (axis label $x_1$).]
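For the Euclidean norm, a standard fact (stated here as an assumption, since it is not spelled out above) is that for $x \neq 0$ the function is differentiable with $\nabla f(x) = x / \|x\|_2$, while at $x = 0$ any $g$ with $\|g\|_2 \le 1$ satisfies the subgradient inequality by Cauchy-Schwarz. The sketch below returns one such subgradient and checks the inequality at random points; the helper names and the test point are hypothetical.

```python
import math
import random

def norm2(x):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(xi * xi for xi in x))

def l2_subgradient(x):
    """Return one subgradient of f(x) = ||x||_2 at x."""
    n = norm2(x)
    if n > 0:
        return [xi / n for xi in x]   # the gradient, which is the only subgradient
    return [0.0] * len(x)             # at 0, any g with ||g||_2 <= 1 works; pick g = 0

def satisfies_inequality(x, g, y, tol=1e-12):
    """Check ||y||_2 >= ||x||_2 + g^T (y - x)."""
    inner = sum(gi * (yi - xi) for gi, yi, xi in zip(g, y, x))
    return norm2(y) >= norm2(x) + inner - tol

random.seed(0)
points = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(200)]

x = [3.0, 0.0, -4.0]
g = l2_subgradient(x)                 # x / 5, i.e. approximately [0.6, 0.0, -0.8]
all_ok = all(satisfies_inequality(x, g, y) for y in points)
zero_ok = all(satisfies_inequality([0.0] * 3, l2_subgradient([0.0] * 3), y) for y in points)
```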
(9)Examples of subgradients
Consider $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = \|x\|_1$.
[Figure: plot of $f(x) = \|x\|_1$ over $(x_1, x_2)$.]
• For $x_i \neq 0$, unique $i$th component $g_i = \operatorname{sign}(x_i)$
• For $x_i = 0$, $i$th component $g_i$ is any element of $[-1, 1]$
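The componentwise rule for the $\ell_1$ norm translates directly into code. In this sketch the parameter `at_zero`, which selects the value used for components with $x_i = 0$, is a hypothetical device for picking one subgradient out of the many that are valid; the random test points are likewise only illustrative.

```python
import random

def l1_subgradient(x, at_zero=0.0):
    """Return one subgradient of f(x) = ||x||_1 at x.

    at_zero is the value used for components with x_i = 0; it must lie in [-1, 1].
    """
    if not -1.0 <= at_zero <= 1.0:
        raise ValueError("at_zero must lie in [-1, 1]")
    return [1.0 if xi > 0 else -1.0 if xi < 0 else at_zero for xi in x]

def norm1(x):
    """Sum of absolute values of a list of floats."""
    return sum(abs(xi) for xi in x)

x = [2.0, 0.0, -1.5]
g = l1_subgradient(x, at_zero=0.3)    # [1.0, 0.3, -1.0]

# Check ||y||_1 >= ||x||_1 + g^T (y - x) at random points y.
random.seed(0)
points = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(200)]
all_ok = all(
    norm1(y) >= norm1(x) + sum(gi * (yi - xi) for gi, yi, xi in zip(g, y, x)) - 1e-12
    for y in points
)
```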
(10)Examples of subgradients
Consider $f(x) = \max\{f_1(x), f_2(x)\}$, for $f_1, f_2 : \mathbb{R}^n \to \mathbb{R}$ convex, differentiable.
[Figure: plot of $f(x)$ against $x$.]
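For a pointwise maximum, a standard fact (stated here as an assumption, since it is not spelled out above) is that $\nabla f_1(x)$ is a subgradient where $f_1(x) > f_2(x)$, $\nabla f_2(x)$ is one where $f_2(x) > f_1(x)$, and any convex combination of the two gradients is one where $f_1(x) = f_2(x)$. The one-dimensional sketch below uses the hypothetical pair $f_1(x) = (x-1)^2$, $f_2(x) = (x+1)^2$ and a hypothetical `weight` parameter to pick the convex combination at a tie.

```python
def f1(x):
    return (x - 1.0) ** 2

def df1(x):
    return 2.0 * (x - 1.0)

def f2(x):
    return (x + 1.0) ** 2

def df2(x):
    return 2.0 * (x + 1.0)

def f(x):
    return max(f1(x), f2(x))

def max_subgradient(x, weight=0.5):
    """Return one subgradient of f = max{f1, f2} at x.

    weight in [0, 1] selects the convex combination used when f1(x) == f2(x).
    """
    if f1(x) > f2(x):
        return df1(x)
    if f2(x) > f1(x):
        return df2(x)
    return weight * df1(x) + (1.0 - weight) * df2(x)

def check(x, g, tol=1e-12):
    """Check f(y) >= f(x) + g * (y - x) on a grid of y values in [-3, 3]."""
    grid = [i / 10 for i in range(-30, 31)]
    return all(f(y) >= f(x) + g * (y - x) - tol for y in grid)

# One point where f1 is active, the tie at x = 0, and one point where f2 is active.
results = [check(x, max_subgradient(x)) for x in (-2.0, 0.0, 1.5)]
# At the tie, every convex combination of df1(0) = -2 and df2(0) = 2 passes.
tie_ok = all(check(0.0, max_subgradient(0.0, w / 4)) for w in range(5))
```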