The 2x down-sampling is one of the important operations in reference models. However, a convolution or a pooling with `stride=2, padding='SAME'` may produce different outputs across deep learning libraries (e.g., TensorFlow, CNTK, Theano, Caffe, Torch, ...) due to their different padding behaviors.

For example (TensorNets syntax, but it can be regarded as pseudocode for other libraries),
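```
x = conv(x, 64, 7, stride=2, padding='SAME')  # 224 -> 112
```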
produces a `[None, 112, 112, 64]` map. This example can be performed as either of the following two cases:

Case 1 (asymmetric): pad the top-left with `kernel_size // 2 - 1` and the bottom-right with `kernel_size // 2`:
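```
x = pad(x, [[0, 0], [2, 3], [2, 3], [0, 0]])   # 224 -> 229 (7 // 2 - 1 = 2, 7 // 2 = 3)
x = conv(x, 64, 7, stride=2, padding='VALID')  # 229 -> 112
```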
Case 2 (symmetric): pad both the top-left and the bottom-right with `kernel_size // 2`:
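```
x = pad(x, [[0, 0], [3, 3], [3, 3], [0, 0]])   # 224 -> 230
x = conv(x, 64, 7, stride=2, padding='VALID')  # 230 -> 112 with slicing the rightmost 1 pixel
```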
As TensorNets translates the original repositories written in various libraries, it must consider both padding behaviors to reproduce the original results exactly.
Results
I compared the performance differences of the two padding schemes on the 11 ResNet variants. Precisely, the two schemes for the ResNets are:
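(A sketch in the same pseudocode, assuming the standard ResNet stem in which `pool1` is a 3x3 max pooling with stride 2 applied to the 112x112 output of `conv1`; `max_pool` is used here only for illustration.)

```
# Scheme 1 (asymmetric, TensorFlow-style 'SAME'): pad 0 on the top-left, 1 on the bottom-right
x = pad(x, [[0, 0], [0, 1], [0, 1], [0, 0]])   # 112 -> 113
x = max_pool(x, 3, stride=2, padding='VALID')  # 113 -> 56

# Scheme 2 (symmetric, Caffe/Torch-style): pad 1 on every side
x = pad(x, [[0, 0], [1, 1], [1, 1], [0, 0]])   # 112 -> 114
x = max_pool(x, 3, stride=2, padding='VALID')  # 114 -> 56 with slicing the rightmost 1 pixel
```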
The results are summarized as follows:
All variants except ResNet50,101,152v2 showed better performance with symmetric padding than with asymmetric padding. This is because only TensorFlow (as far as I know) uses asymmetric padding, and only ResNet50,101,152v2 were trained with TensorFlow. Note that Caffe definitely uses symmetric padding, and I infer that (Py)Torch does so as well (I'm not familiar with Torch). Thus, in order to reproduce the original results, I changed the current symmetric paddings for the `pool1` in the ResNets to asymmetric paddings, but only in the case of ResNet50,101,152v2. As the `conv1` of ResNet50,101,152v2 transforms 299 to 150, the `SAME` padding is equivalent to the symmetric one, so `conv1/pad` was not touched. Please see the commit :)
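For reference, here is a small self-contained sketch (plain Python, not TensorNets code) of how TensorFlow splits the `SAME` padding, which shows why the `conv1` padding of the 299-input models is already symmetric while the `pool1` padding is not:

```python
def same_pads(size, kernel, stride):
    # TensorFlow puts the smaller half of the total padding on the top-left.
    out = -(-size // stride)  # ceil(size / stride)
    total = max((out - 1) * stride + kernel - size, 0)
    return total // 2, total - total // 2

print(same_pads(299, 7, 2))  # (3, 3): symmetric, so conv1/pad needs no change
print(same_pads(224, 7, 2))  # (2, 3): asymmetric
print(same_pads(112, 3, 2))  # (0, 1): asymmetric, hence the pool1 revision
```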