Uber AI introduces a new approach for making neural networks process images faster and more accurately by using JPEG representations.
Link: https://eng.uber.com/neural-networks-jpeg/
Paper: https://papers.nips.cc/paper/7649-faster-neural-networks-straight-from-jpeg
#nn #CV #Uber
🔗 Faster Neural Networks Straight from JPEG
Uber AI Labs introduces a method for making neural networks that process images faster and more accurately by leveraging JPEG representations.
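The gist of the paper: skip most of the JPEG decode and feed the 8x8 block DCT coefficients, which the file already stores, straight into the CNN, so the network starts from a representation that is 8x smaller spatially. Below is a rough sketch of that idea, not Uber's implementation: scipy's dctn stands in for coefficients read from the actual JPEG bitstream, and the tiny model is made up.

```python
import numpy as np
from scipy.fft import dctn
import torch
import torch.nn as nn

def blockwise_dct(gray_img: np.ndarray) -> np.ndarray:
    """Split a grayscale image (H, W) into 8x8 blocks and 2-D DCT each block.

    Returns (H // 8, W // 8, 64): one 64-dim coefficient vector per block,
    i.e. an 8x spatially smaller "image" with 64 channels (roughly what a
    JPEG file already stores before entropy coding).
    """
    h, w = gray_img.shape
    h, w = h - h % 8, w - w % 8
    blocks = gray_img[:h, :w].reshape(h // 8, 8, w // 8, 8).transpose(0, 2, 1, 3)
    coeffs = dctn(blocks, axes=(-2, -1), norm="ortho")
    return coeffs.reshape(h // 8, w // 8, 64).astype(np.float32)

# A toy CNN over the coefficient "image"; because the spatial resolution is
# already 8x smaller, the early convolutional layers have far less work to do.
model = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(128, 10),
)

img = np.random.rand(224, 224).astype(np.float32)          # stand-in grayscale image
x = torch.from_numpy(blockwise_dct(img)).permute(2, 0, 1)   # (64, 28, 28)
logits = model(x.unsqueeze(0))                              # (1, 10)
```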
Generalization in Deep Networks: The Role of Distance from Initialization
Why the distance from initialization has to be taken into account to explain generalization in deep networks.
arXiv: https://arxiv.org/abs/1901.01672
#DL #NN
🔗 Generalization in Deep Networks: The Role of Distance from Initialization
Why does training deep neural networks using stochastic gradient descent (SGD) result in a generalization error that does not worsen with the number of parameters in the network? To answer this question, we advocate a notion of effective model capacity that is dependent on a given random initialization of the network and not just the training algorithm and the data distribution. We provide empirical evidence demonstrating that the model capacity of SGD-trained deep networks is in fact restricted through implicit regularization of the ℓ2 distance from the initialization. We also provide theoretical arguments that further highlight the need for initialization-dependent notions of model capacity. We leave as open questions how and why distance from initialization is regularized, and whether it is sufficient to explain generalization.
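To make the abstract concrete: the quantity studied is just ‖θ_t − θ_0‖₂, the distance between the current weights and their random initialization. A minimal sketch for tracking it during SGD training (my own code, not the paper's; the toy model and random data are made up):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 10))
init_params = [p.detach().clone() for p in model.parameters()]  # snapshot of theta_0
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def distance_from_init() -> float:
    """||theta_t - theta_0||_2 with all parameters flattened into one vector."""
    sq = sum(((p - p0) ** 2).sum() for p, p0 in zip(model.parameters(), init_params))
    return sq.sqrt().item()

x = torch.randn(512, 100)             # made-up data
y = torch.randint(0, 10, (512,))
for step in range(201):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
    if step % 50 == 0:
        print(step, distance_from_init())
```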
🤓 Interesting note on weight decay vs L2 regularization
In short, there was a difference when moving from Caffe (which implements weight decay) to Keras (which implements L2 regularization): the same network architecture with the same set of hyperparameters produced different results.
Link: https://bbabenko.github.io/weight-decay/
#DL #nn #hyperopt #hyperparams
🔗 weight decay vs L2 regularization
one popular way of adding regularization to deep learning models is to include a weight decay term in the updates. this is the same thing as adding an L2 ...
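A small numeric sketch of the mismatch the note describes (my own made-up numbers, plain SGD only): decaying the weight directly in the update uses wd·w, while an L2 penalty λ·w² added to the loss contributes 2λ·w to the gradient, so copying the same coefficient between the two conventions changes the effective regularization strength, and the gap only widens once momentum or adaptive optimizers are involved.

```python
import numpy as np

lr, coeff = 0.1, 0.01
w = np.array([1.0, -2.0, 3.0])
grad = np.array([0.5, 0.5, 0.5])        # pretend gradient of the data loss

# (1) weight decay applied directly in the update rule (the Caffe-style convention)
w_wd = w - lr * (grad + coeff * w)

# (2) L2 penalty coeff * sum(w**2) added to the loss (as a Keras l2 regularizer does),
#     whose gradient is 2 * coeff * w
w_l2 = w - lr * (grad + 2 * coeff * w)

print(w_wd)  # [ 0.949 -2.048  2.947]
print(w_l2)  # [ 0.948 -2.046  2.944]
# Same coefficient, different updates; matching them needs lambda_l2 = wd / 2.
```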
Implementing a ResNet model from scratch.
A well-written, clearly explained note on how to build and train a ResNet model from the ground up.
Link: https://towardsdatascience.com/implementing-a-resnet-model-from-scratch-971be7193718
#ResNet #DL #CV #nn #tutorial
🔗 Implementing a ResNet model from scratch. – Towards Data Science
A basic description of how ResNet works and a hands-on approach to understanding the state-of-the-art network.
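Not the article's code, just a minimal PyTorch sketch of the building block such a walkthrough revolves around: a residual block that computes F(x) + x, with a 1x1 projection on the shortcut whenever the shape changes.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # Identity shortcut unless the spatial size or channel count changes.
        self.shortcut = nn.Identity()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + self.shortcut(x))  # the skip connection

x = torch.randn(1, 64, 56, 56)
print(ResidualBlock(64, 128, stride=2)(x).shape)  # torch.Size([1, 128, 28, 28])
```

A full ResNet then just stacks these blocks between an initial conv/pool stem and a global-average-pool plus linear classifier head.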
Understanding Convolutional Neural Networks through Visualizations in PyTorch
An explanation of how a #CNN works, illustrated with visualizations in PyTorch.
Link: https://towardsdatascience.com/understanding-convolutional-neural-networks-through-visualizations-in-pytorch-b5444de08b91
#PyTorch #nn #DL
🔗 Understanding Convolutional Neural Networks through Visualizations in PyTorch
Getting down to the nitty-gritty of CNNs
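A rough sketch (not the article's code) of one standard visualization trick in PyTorch: register a forward hook on an intermediate layer and plot the feature maps it produces. The choice of resnet18 and its layer1 below is just for illustration.

```python
import torch
import torchvision
import matplotlib.pyplot as plt

model = torchvision.models.resnet18(pretrained=True).eval()  # pretrained, so the maps are meaningful
feature_maps = {}

def save_output(name):
    def hook(module, inputs, output):
        feature_maps[name] = output.detach()
    return hook

model.layer1.register_forward_hook(save_output("layer1"))

x = torch.randn(1, 3, 224, 224)   # stand-in for a normalized input image
with torch.no_grad():
    model(x)

fmap = feature_maps["layer1"][0].cpu().numpy()   # (64, 56, 56)
fig, axes = plt.subplots(2, 4, figsize=(10, 5))
for ax, channel in zip(axes.flat, fmap[:8]):     # first 8 channels, one per panel
    ax.imshow(channel, cmap="viridis")
    ax.axis("off")
plt.show()
```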