Model load issue 2 #3

Closed
miha-skalic opened this issue Nov 15, 2017 · 2 comments
miha-skalic commented Nov 15, 2017

Another issue, this time with python2

Python 2.7.13 |Anaconda 4.3.1 (64-bit)| (default, Dec 20 2016, 23:09:15) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import tensorflow as tf
>>> import tensornets as nets
>>> import numpy as np
>>> 
>>> print(tf.__version__)
1.2.1
>>> print(np.__version__)
1.13.1
>>> 
>>> inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
>>> outputs = tf.placeholder(tf.float32, [None, 50])
>>> model = nets.ResNet152v2(inputs, is_training=True, classes=50)
>>> 
>>> loss = tf.losses.softmax_cross_entropy(outputs, model)
>>> train = tf.train.AdamOptimizer(learning_rate=1e-5).minimize(loss)
>>> 
>>> with tf.Session() as sess:
...     nets.pretrained(model)
... 
2017-11-15 21:42:48.330744: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-15 21:42:48.330814: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-15 21:42:48.330941: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-11-15 21:42:48.595931: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-15 21:42:48.596442: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties: 
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.582
pciBusID 0000:01:00.0
Total memory: 10.91GiB
Free memory: 9.73GiB
2017-11-15 21:42:48.741859: W tensorflow/stream_executor/cuda/cuda_driver.cc:523] A non-primary context 0x121ad770 exists before initializing the StreamExecutor. We haven't verified StreamExecutor works with that.
2017-11-15 21:42:48.742280: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-15 21:42:48.742780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 1 with properties: 
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.582
pciBusID 0000:02:00.0
Total memory: 10.91GiB
Free memory: 10.75GiB
2017-11-15 21:42:48.743591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0 1 
2017-11-15 21:42:48.743609: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   Y Y 
2017-11-15 21:42:48.743619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 1:   Y Y 
2017-11-15 21:42:48.743699: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0)
2017-11-15 21:42:48.743713: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0)
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/shared/miha/py_homebrew/tensornets/pretrained.py", line 49, in pretrained
    __load_dict__[model_name](scope)
  File "/shared/miha/py_homebrew/tensornets/pretrained.py", line 171, in load_resnet152v2
    return load_weights(scopes, weights_path)
  File "/shared/miha/py_homebrew/tensornets/utils.py", line 175, in load_weights
    assert len(weights) == len(values), 'The sizes of symbolic and ' \
AssertionError: The sizes of symbolic and actual weights do not match.
>>> 
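For context on the assertion above: once `minimize()` has been called, Adam registers two slot variables per trainable variable, so the graph's global-variable list is longer than the list of pretrained arrays. A minimal pure-Python illustration of the length mismatch (the variable names are hypothetical, not tensornets' actual ones):

```python
# Illustrative sketch of why the length check fails; names are hypothetical.
model_vars = ["conv1/weights", "conv1/biases", "logits/weights", "logits/biases"]

# Adam adds two slot variables (m and v) per trainable variable.
adam_slots = [v + "/Adam" for v in model_vars] + [v + "/Adam_1" for v in model_vars]

# After minimize() is called, the global variables include the slots...
symbolic_weights = model_vars + adam_slots
# ...but the pretrained checkpoint only carries values for the model weights.
pretrained_values = [object()] * len(model_vars)

try:
    assert len(symbolic_weights) == len(pretrained_values), (
        "The sizes of symbolic and actual weights do not match.")
except AssertionError as e:
    print(e)  # same message as in the traceback above
```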
miha-skalic (Author) commented:
Adding a `tf.device` context helps in this case:

with tf.device('gpu:0'):
    inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
    outputs = tf.placeholder(tf.float32, [None, 50])
    model = nets.Inception4(inputs, is_training=True, classes=50)

However, another error follows. It looks like the model tries to assign the pretrained FC-1000 weights to the new FC-50 layer:

ValueError: Dimension 1 in both shapes must be equal, but are 50 and 1000 for 'Assign_596' (op: 'Assign') with input shapes: [1536,50], [1536,1000].
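The shape clash above can be worked around by restoring only the variables whose shapes match the checkpoint and leaving the resized classifier randomly initialized. A hedged pure-Python sketch of that filtering step (illustrative names only; this is not the tensornets API):

```python
# Hedged sketch: keep only the (variable, value) pairs whose shapes agree,
# so a resized classifier such as [1536, 50] is skipped instead of
# clashing with the pretrained [1536, 1000] matrix.
def assignable_pairs(symbolic, pretrained):
    """symbolic/pretrained: lists of (name, shape) tuples in matching order."""
    pairs = []
    for (name, s_shape), (_, p_shape) in zip(symbolic, pretrained):
        if s_shape == p_shape:
            pairs.append((name, p_shape))
    return pairs

symbolic   = [("conv1/weights", (7, 7, 3, 64)), ("logits/weights", (1536, 50))]
pretrained = [("conv1/weights", (7, 7, 3, 64)), ("logits/weights", (1536, 1000))]
print(assignable_pairs(symbolic, pretrained))
# -> [('conv1/weights', (7, 7, 3, 64))]: only the backbone is restored
```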

taehoonlee (Owner) commented:

The first error was raised because `nets.pretrained` attempts to restore all the global variables, which include the extra weights declared by `tf.train.Optimizer` (the Adam slot variables).
The second one was caused by the shape mismatch in the last layer: the pretrained classifier has 1000 outputs, while the new one has 50.
Thank you for reporting the issue again @miha-skalic; please see the updates to the weight-assignment procedure!
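The simplest way to avoid the first error is to call `nets.pretrained(model)` before declaring the optimizer, so that no slot variables exist yet. Alternatively, the optimizer's variables can be filtered out by name before restoring; the sketch below assumes TF 1.x Adam naming conventions and is illustrative only, not tensornets' implementation:

```python
# Hedged sketch: select only model variables by excluding optimizer-created
# names, mirroring what a scope-restricted restore would do in TF 1.x.
ADAM_MARKERS = ("/Adam", "/Adam_1", "beta1_power", "beta2_power")

def model_variables(global_var_names):
    return [n for n in global_var_names
            if not any(m in n for m in ADAM_MARKERS)]

names = ["resnet152v2/conv1/weights", "resnet152v2/conv1/weights/Adam",
         "resnet152v2/conv1/weights/Adam_1", "beta1_power", "beta2_power"]
print(model_variables(names))  # -> ['resnet152v2/conv1/weights']
```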
