Exploding Gradient
- An exploding gradient occurs when the gradients of the weights grow so large during training that they overflow and become NaN
Why Does an Exploding Gradient Occur?
If the local gradient at each layer is greater than 1.0 and the network is very deep, these factors multiply during backpropagation, so the gradient grows exponentially with depth and reaches an extremely large value, as in the sketch below.
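As a rough numerical illustration (not from the original notes), the sketch below multiplies a gradient of 1.0 by an assumed per-layer factor of 1.5 across an assumed depth of 250 layers; in float32 the product overflows to inf, and a subsequent inf - inf step is one way NaN values then appear.

```python
import numpy as np

# Assumed values for illustration only: a per-layer gradient factor of 1.5
# and a depth of 250 layers.
grad = np.float32(1.0)
per_layer_factor = np.float32(1.5)
depth = 250

for _ in range(depth):
    grad *= per_layer_factor  # 1.5**250 ~ 1e44 exceeds float32 max (~3.4e38)

print(grad)         # inf  -- the gradient has overflowed
print(grad - grad)  # nan  -- inf - inf produces NaN, which then spreads to the weights
```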
How to Identify an Exploding Gradient?
- The model weights grow very large very quickly during training
- The model weights become NaN
- The error gradient stays above 1.0 for each node and layer during training (see the monitoring sketch below)
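A simple way to check these symptoms in practice is to log the global gradient norm after each backward pass. The helper below is a hypothetical PyTorch sketch (the model, loss, and optimizer in the usage comment are assumed); a norm that keeps growing or jumps to inf/NaN is the usual sign of an exploding gradient.

```python
import math
import torch.nn as nn


def global_grad_norm(model: nn.Module) -> float:
    """Return the global L2 norm of all parameter gradients."""
    total = 0.0
    for p in model.parameters():
        if p.grad is not None:
            total += p.grad.detach().norm(2).item() ** 2
    return total ** 0.5


# Usage inside a training loop (model, loss, optimizer are assumed):
#   loss.backward()
#   norm = global_grad_norm(model)
#   if not math.isfinite(norm) or norm > 1e3:
#       print(f"warning: gradient norm = {norm}, likely exploding")
#   optimizer.step()
```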
How to Solve an Exploding Gradient?
- Decrease the network depth
- Use LSTM units in recurrent networks
- Gradient clipping (see the sketch after this list)
- L1 (Lasso) regularization
- L2 (Ridge) regularization
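The sketch below shows gradient clipping in a PyTorch training step; the model, data, clipping threshold, and weight decay value are assumed for illustration. `clip_grad_norm_` rescales all gradients so their global L2 norm does not exceed the chosen maximum, and the optimizer's `weight_decay` argument applies L2 regularization to the weights.

```python
import torch
import torch.nn as nn

# Assumed toy model and data, for illustration only.
model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            weight_decay=1e-4)  # weight_decay = L2 regularization

x, y = torch.randn(32, 10), torch.randn(32, 1)

optimizer.zero_grad()
loss = nn.functional.mse_loss(model(x), y)
loss.backward()

# Clip gradients so their global L2 norm is at most 1.0 before the update.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```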