Machine Learning for Beginners: An Introduction to Neural Networks - victorzhou.com

3 Mar 2019 · Computer Science, Princeton University


SOURCE: https://victorzhou.com/blog/intro-to-neural-networks/
Summary

Here’s what a simple neural network might look like: this network has 2 inputs, a hidden layer with 2 neurons (h_1 and h_2), and an output layer with 1 neuron (o_1). There can be multiple hidden layers!

Let’s use the network pictured above and assume all neurons have the same weights w = [0, 1], the same bias b = 0, and the same sigmoid activation function. Let h_1, h_2, o_1 denote the outputs of the neurons they represent.

What happens if we pass in the input x = [2, 3]? The output of the neural network for input x = [2, 3] is 0.7216.

Let’s label each weight and bias in our network. Then, we can write the loss as a multivariable function. Imagine we wanted to tweak w_1. All we’re doing is subtracting η ∂L/∂w_1 from w_1. If we do this for every weight and bias in the network, the loss will slowly decrease and our network will improve.

Our training process will look like this. Let’s see it in action! It’s finally time to implement a complete neural network. You can run / play with this code yourself.
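The forward pass described above can be sketched in Python. This is a minimal sketch, not the article's full trainable implementation: the `Neuron` class and `feedforward` name follow the article's conventions, but the exact structure here is an assumption. It reproduces the worked example: with shared weights w = [0, 1], bias b = 0, and sigmoid activations, input x = [2, 3] yields 0.7216.

```python
import numpy as np

def sigmoid(x):
    # Sigmoid activation: squashes any real input into the range (0, 1)
    return 1 / (1 + np.exp(-x))

class Neuron:
    # A single neuron: weighted sum of inputs plus bias, then sigmoid
    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def feedforward(self, inputs):
        return sigmoid(np.dot(self.weights, inputs) + self.bias)

# All neurons share the same weights w = [0, 1] and bias b = 0
weights = np.array([0, 1])
bias = 0

h1 = Neuron(weights, bias)
h2 = Neuron(weights, bias)
o1 = Neuron(weights, bias)

x = np.array([2, 3])
out_h1 = h1.feedforward(x)  # sigmoid(0*2 + 1*3 + 0) = sigmoid(3) ≈ 0.9526
out_h2 = h2.feedforward(x)  # identical weights and input, so also ≈ 0.9526
out_o1 = o1.feedforward(np.array([out_h1, out_h2]))  # sigmoid(0*0.9526 + 1*0.9526) ≈ 0.7216

print(round(out_o1, 4))  # → 0.7216
```

Gradient descent then nudges each parameter against its gradient, e.g. `w1 -= learn_rate * dL_dw1`, which is the update step the Summary describes.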
