OK, thank you for your answer! Did you experiment with different batch sizes and correspondingly scaled learning rates with Adam to see whether the linear scaling rule holds?
Hi! I have a question about the linear learning rate scaling that you are using. In the publication https://arxiv.org/abs/1706.02677 this scaling rule is only justified for SGD, but you are using Adam. Did you run, or do you know of, any experiments that back up this approach?
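For reference, the rule in question can be sketched as follows: when the batch size grows by a factor k, the learning rate is multiplied by the same factor k. This is a minimal illustration only; the base values below are hypothetical and not taken from this repository.

```python
# Sketch of the linear scaling rule from Goyal et al. (2017),
# https://arxiv.org/abs/1706.02677: scale the learning rate
# linearly with the batch size. The paper justifies this for SGD;
# whether it transfers to Adam is exactly what this issue asks.

BASE_BATCH_SIZE = 256   # hypothetical reference batch size
BASE_LR = 1e-3          # hypothetical reference learning rate

def scaled_lr(batch_size: int) -> float:
    """Linearly scale the learning rate with the batch size."""
    return BASE_LR * (batch_size / BASE_BATCH_SIZE)

print(scaled_lr(256))   # same batch size -> same learning rate
print(scaled_lr(1024))  # 4x batch size -> 4x learning rate
```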