🤓 Interesting note on weight decay vs L2 regularization
In short, there was a difference when moving from Caffe (which implements weight decay directly in the update rule) to Keras (which implements an L2 penalty on the loss). That led to different results with the same net architecture and the same set of hyperparameters.
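A minimal sketch of the mismatch on a toy quadratic loss with plain SGD (the loss, `grad`, `lr`, and `wd` here are illustrative, not either framework's API). The point: a weight-decay coefficient λ subtracts λw in the update, while a Keras-style `l2(λ)` regularizer adds λ‖w‖² to the loss, whose gradient is 2λw, so the same numeric coefficient regularizes twice as hard:

```python
# Toy quadratic loss L(w) = 0.5 * (w - 3)**2; all names are illustrative.
def grad(w):
    return w - 3.0

lr, wd = 0.1, 0.05  # learning rate and a Caffe-style weight_decay coefficient

# Weight decay: subtract lr * wd * w directly in the parameter update.
w_wd = 5.0
for _ in range(200):
    w_wd -= lr * (grad(w_wd) + wd * w_wd)

# L2 penalty: lam * w**2 is added to the loss, so the extra gradient is 2 * lam * w.
def run_l2(lam):
    w = 5.0
    for _ in range(200):
        w -= lr * (grad(w) + 2 * lam * w)
    return w

print(w_wd, run_l2(wd))      # same numeric coefficient -> different trajectories
print(w_wd, run_l2(wd / 2))  # halved L2 coefficient -> updates match exactly
```

So porting a config verbatim effectively doubles (or halves) the regularization strength, which is enough to change results.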
Link: https://bbabenko.github.io/weight-decay/
#DL #nn #hyperopt #hyperparams