Computational Graph
L = ((a × b) + c) × f = (−6 + 10) × (−2) = −8
Nodes (value, gradient ∇ with respect to L):
a = 2.00 | ∇ 6.00
b = -3.00 | ∇ -4.00
e = a × b = -6.00 | ∇ -2.00
c = 10.00 | ∇ -2.00
d = e + c = 4.00 | ∇ -2.00
f = -2.00 | ∇ 4.00
L = d × f = -8.00 | ∇ 1.00
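The node values and gradients in the graph can be reproduced by hand. The following is a minimal sketch in plain Python, assuming the intermediate nodes are e = a × b and d = e + c as the graph suggests; the `grad_*` variable names are illustrative and not taken from the page.

```python
# Forward pass: compute each node of L = ((a * b) + c) * f.
a, b, c, f = 2.0, -3.0, 10.0, -2.0
e = a * b   # product node
d = e + c   # sum node
L = d * f   # output node

# Backward pass: chain rule applied from L back to the inputs.
grad_L = 1.0          # dL/dL
grad_d = grad_L * f   # L = d * f, so dL/dd = f
grad_f = grad_L * d   # L = d * f, so dL/df = d
grad_e = grad_d       # d = e + c passes the gradient through unchanged
grad_c = grad_d
grad_a = grad_e * b   # e = a * b, so de/da = b
grad_b = grad_e * a   # e = a * b, so de/db = a

print(e, d, L)
print(grad_a, grad_b, grad_c, grad_f)
```

The addition node copies its incoming gradient to both inputs, while each multiplication node scales it by the value of the other input.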
Inputs: a = 2.00 | b = -3.00 | c = 10.00 | f = -2.00 | lr = 0.10
Gradients (via chain rule):
∇a = f × b | ∇b = f × a | ∇c = f | ∇f = d
Descent: param − lr × ∇ | Ascent: param + lr × ∇
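To see the two update rules side by side, here is a small sketch that applies one descent step and one ascent step from the starting values with lr = 0.10. The helper names `loss`, `gradients`, and `step` are hypothetical and chosen only for this illustration.

```python
def loss(a, b, c, f):
    """L = ((a * b) + c) * f."""
    return (a * b + c) * f


def gradients(a, b, c, f):
    """Closed-form gradients from the chain rule, with d = a * b + c."""
    d = a * b + c
    return f * b, f * a, f, d


def step(params, lr, sign):
    """sign = -1.0 gives descent (param - lr * grad); +1.0 gives ascent."""
    grads = gradients(*params)
    return tuple(p + sign * lr * g for p, g in zip(params, grads))


start = (2.0, -3.0, 10.0, -2.0)   # a, b, c, f
descended = step(start, lr=0.10, sign=-1.0)
ascended = step(start, lr=0.10, sign=+1.0)

print(loss(*start), loss(*descended), loss(*ascended))
```

With these values, one descent step lowers L below −8 and one ascent step raises it above −8, which matches the direction each rule is meant to move.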
Variables Over Steps
Moving to MINIMIZE L → variables adjust opposite to gradients
Chart legend: L (left), a, b, c, f (right)
Current values: a = 2.0000 | b = -3.0000 | c = 10.0000 | f = -2.0000 | L = -8.0000
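The trajectory that such a chart plots can be approximated with a short loop. This is a sketch only: the function name `run_descent` and the choice of three steps are assumptions for illustration, and the gradients are recomputed from the current values at every step.

```python
def run_descent(a, b, c, f, lr=0.10, steps=3):
    """Record (a, b, c, f, L) before the first step and after each step."""
    history = [(a, b, c, f, (a * b + c) * f)]
    for _ in range(steps):
        d = a * b + c
        # Gradients via the chain rule, evaluated at the current point.
        grad_a, grad_b, grad_c, grad_f = f * b, f * a, f, d
        # Descent: move every variable opposite to its gradient.
        a -= lr * grad_a
        b -= lr * grad_b
        c -= lr * grad_c
        f -= lr * grad_f
        history.append((a, b, c, f, (a * b + c) * f))
    return history


history = run_descent(2.0, -3.0, 10.0, -2.0)
for row in history:
    print(" ".join(f"{value:9.4f}" for value in row))
```

Each printed row holds a, b, c, f, and L in that order, so the last column shows how L changes as the variables move against their gradients.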