Why are you padding on the sixth convolutional layer?! #52

Open
TheRevanchist opened this issue Jul 13, 2017 · 2 comments

TheRevanchist commented Jul 13, 2017

Hi,

On the sixth convolutional layer, the code is:

    W6 = utils.weight_variable([7, 7, 512, 4096], name="W6")
    b6 = utils.bias_variable([4096], name="b6")
    conv6 = utils.conv2d_basic(pool5, W6, b6)

At this stage, pool5 has size [batch_size, 7, 7, 512]. As far as I understand, you are using a 7x7 filter in order to make the feature map of size [batch_size, 1, 1, 4096]. However, if you look at conv2d_basic(...), the code is:

    def conv2d_basic(x, W, bias):
        conv = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")
        return tf.nn.bias_add(conv, bias)

The problem is that the output of conv6 actually remains [batch_size, 7, 7, 4096], because conv2d_basic uses SAME padding, and I am not sure that is what we want. If we look at the official Caffe code, the sixth convolutional layer is defined as:

layer {
    name: "fc6"
    type: "Convolution"
    bottom: "pool5"
    top: "fc6"
    param {
        lr_mult: 1
        decay_mult: 1
    }
    param {
        lr_mult: 2
        decay_mult: 0
    }
    convolution_param {
        num_output: 4096
        pad: 0
        kernel_size: 7
        stride: 1
    }
}

They aren't padding in this layer (pad: 0), which means conv6 actually comes out as [batch_size, 1, 1, 4096]. Pretty sure that is the entire point of conv6.
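
For what it's worth, the shape difference is easy to verify. Here is a minimal sketch (dummy tensors, assuming eager TF, just to illustrate the shapes):

    import tensorflow as tf

    x = tf.zeros([1, 7, 7, 512])     # same shape as pool5
    w = tf.zeros([7, 7, 512, 4096])  # same shape as W6

    same = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding="SAME")
    valid = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding="VALID")

    print(same.shape)   # (1, 7, 7, 4096) -- spatial size preserved
    print(valid.shape)  # (1, 1, 1, 4096) -- what fc6 should produce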

Am I missing something, or was that part of the code a mistake on your part?

Anyway, cheers for the code. It is the cleanest TF implementation of FCN I have seen so far.

shekkizh (Owner) commented

Hi @TheRevanchist
It's been a while since I looked into the code, but if it is implemented with padding, I think you are right that this was a mistake on my part (nice catch!). That said, since we are using convolutions all the way through, I believe it might not be affecting the segmentation.
Anyway, thanks for pointing it out. If you have already fixed it, you could open a PR and I will review and merge the changes.
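
For reference, the fix might look like the following (a sketch, not the repository's actual code; conv2d_valid is a made-up name):

    def conv2d_valid(x, W, bias):
        # Same as conv2d_basic, but with no zero padding, so a 7x7 kernel
        # applied to a 7x7 input collapses the spatial dimensions to 1x1.
        conv = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="VALID")
        return tf.nn.bias_add(conv, bias)

    conv6 = conv2d_valid(pool5, W6, b6)  # [batch_size, 1, 1, 4096]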

shekkizh added the bug label Jul 18, 2017
TheRevanchist (Author) commented

I played with it a bit, but the results seem significantly worse than yours (visually at least; the validation error rate seems to be about the same). That made me doubt myself: either there is something I didn't understand and the padding in that layer was actually needed, or I made some mistake in the code.

I will give it another look soon and see if I can fix it.
