# linear

Linear (fully connected) layers and how their weight and bias gradients are computed.
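As background, these are the standard chain-rule results, stated here for the $y = xW^\top + b$ convention that `torch.nn.Linear` uses: for a batch $X \in \mathbb{R}^{N \times d_{\text{in}}}$, weight $W \in \mathbb{R}^{d_{\text{out}} \times d_{\text{in}}}$, and upstream gradient $G = \partial L / \partial Y \in \mathbb{R}^{N \times d_{\text{out}}}$,

$$
\frac{\partial L}{\partial W} = G^{\top} X,
\qquad
\frac{\partial L}{\partial b} = \sum_{n=1}^{N} G_{n,:},
\qquad
\frac{\partial L}{\partial X} = G\,W .
$$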

## How PyTorch backward() Propagates Gradients to Params

A technical explanation of how PyTorch autograd's backward() builds the dynamic computation graph and accumulates gradients into a linear layer's weight and bias for SGD.
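A minimal runnable sketch of that flow (the layer sizes, data, and learning rate below are arbitrary illustrations, not taken from the question): the forward pass records the graph as operations execute, `backward()` walks it in reverse and fills each parameter's `.grad`, and a hand-rolled SGD step consumes and then clears those gradients.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy setup: a single linear layer y = x @ W.T + b, with W of shape
# (out_features, in_features) and b of shape (out_features,).
layer = nn.Linear(3, 2)
x = torch.randn(4, 3)        # batch of 4 inputs
target = torch.randn(4, 2)

# Forward pass: autograd records the dynamic graph as each op runs.
y = layer(x)
loss = ((y - target) ** 2).mean()

# backward() traverses that graph in reverse and *accumulates* into .grad.
loss.backward()
print(layer.weight.grad.shape)  # torch.Size([2, 3]), same shape as weight
print(layer.bias.grad.shape)    # torch.Size([2])

# Cross-check against the closed-form gradients: with G = dL/dy,
# dL/dW = G.T @ x and dL/db = G.sum(0).
G = (2 * (y - target) / y.numel()).detach()
assert torch.allclose(layer.weight.grad, G.T @ x, atol=1e-6)
assert torch.allclose(layer.bias.grad, G.sum(0), atol=1e-6)

# A manual SGD step, then zero the grads so the next backward() call
# doesn't accumulate on top of stale values.
lr = 0.1
with torch.no_grad():
    for p in layer.parameters():
        p -= lr * p.grad
layer.zero_grad()
```

Note the accumulation semantics: PyTorch adds into `.grad` on every `backward()` call rather than overwriting, which is why optimizers (or the manual loop above) must zero gradients between steps.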
