Model load issue 2 #3

Closed
miha-skalic opened this issue Nov 15, 2017 · 2 comments
miha-skalic commented Nov 15, 2017

Another issue, this time with python2

Python 2.7.13 |Anaconda 4.3.1 (64-bit)| (default, Dec 20 2016, 23:09:15) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import tensorflow as tf
>>> import tensornets as nets
>>> import numpy as np
>>> 
>>> print(tf.__version__)
1.2.1
>>> print(np.__version__)
1.13.1
>>> 
>>> inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
>>> outputs = tf.placeholder(tf.float32, [None, 50])
>>> model = nets.ResNet152v2(inputs, is_training=True, classes=50)
>>> 
>>> loss = tf.losses.softmax_cross_entropy(outputs, model)
>>> train = tf.train.AdamOptimizer(learning_rate=1e-5).minimize(loss)
>>> 
>>> with tf.Session() as sess:
...     nets.pretrained(model)
... 
2017-11-15 21:42:48.330744: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-15 21:42:48.330814: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-15 21:42:48.330941: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-11-15 21:42:48.595931: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-15 21:42:48.596442: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties: 
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.582
pciBusID 0000:01:00.0
Total memory: 10.91GiB
Free memory: 9.73GiB
2017-11-15 21:42:48.741859: W tensorflow/stream_executor/cuda/cuda_driver.cc:523] A non-primary context 0x121ad770 exists before initializing the StreamExecutor. We haven't verified StreamExecutor works with that.
2017-11-15 21:42:48.742280: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-15 21:42:48.742780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 1 with properties: 
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.582
pciBusID 0000:02:00.0
Total memory: 10.91GiB
Free memory: 10.75GiB
2017-11-15 21:42:48.743591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0 1 
2017-11-15 21:42:48.743609: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   Y Y 
2017-11-15 21:42:48.743619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 1:   Y Y 
2017-11-15 21:42:48.743699: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0)
2017-11-15 21:42:48.743713: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0)
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/shared/miha/py_homebrew/tensornets/pretrained.py", line 49, in pretrained
    __load_dict__[model_name](scope)
  File "/shared/miha/py_homebrew/tensornets/pretrained.py", line 171, in load_resnet152v2
    return load_weights(scopes, weights_path)
  File "/shared/miha/py_homebrew/tensornets/utils.py", line 175, in load_weights
    assert len(weights) == len(values), 'The sizes of symbolic and ' \
AssertionError: The sizes of symbolic and actual weights do not match.
>>> 
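For context on the assertion above: once `minimize()` has been called, Adam registers two slot variables per trainable variable, so the graph's global-variable list is longer than the list of pretrained arrays. A minimal pure-Python illustration of the length mismatch (the variable names are hypothetical, not tensornets' actual ones):

```python
# Illustrative sketch of why the length check fails; names are hypothetical.
model_vars = ["conv1/weights", "conv1/biases", "logits/weights", "logits/biases"]

# Adam adds two slot variables (m and v) per trainable variable.
adam_slots = [v + "/Adam" for v in model_vars] + [v + "/Adam_1" for v in model_vars]

# After minimize() is called, the global variables include the slots...
symbolic_weights = model_vars + adam_slots
# ...but the pretrained checkpoint only carries values for the model weights.
pretrained_values = [object()] * len(model_vars)

try:
    assert len(symbolic_weights) == len(pretrained_values), (
        "The sizes of symbolic and actual weights do not match.")
except AssertionError as e:
    print(e)  # same message as in the traceback above
```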
miha-skalic (Author) commented:
Adding a `tf.device` context helps in this case:

with tf.device('gpu:0'):
    inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
    outputs = tf.placeholder(tf.float32, [None, 50])
    model = nets.Inception4(inputs, is_training=True, classes=50)

However, another error follows. It looks like the model tries to assign the pretrained FC-1000 weights to the new FC-50 layer:

ValueError: Dimension 1 in both shapes must be equal, but are 50 and 1000 for 'Assign_596' (op: 'Assign') with input shapes: [1536,50], [1536,1000].
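The shape clash above can be worked around by restoring only the variables whose shapes match the checkpoint and leaving the resized classifier randomly initialized. A hedged pure-Python sketch of that filtering step (illustrative names only; this is not the tensornets API):

```python
# Hedged sketch: keep only the (variable, value) pairs whose shapes agree,
# so a resized classifier such as [1536, 50] is skipped instead of
# clashing with the pretrained [1536, 1000] matrix.
def assignable_pairs(symbolic, pretrained):
    """symbolic/pretrained: lists of (name, shape) tuples in matching order."""
    pairs = []
    for (name, s_shape), (_, p_shape) in zip(symbolic, pretrained):
        if s_shape == p_shape:
            pairs.append((name, p_shape))
    return pairs

symbolic   = [("conv1/weights", (7, 7, 3, 64)), ("logits/weights", (1536, 50))]
pretrained = [("conv1/weights", (7, 7, 3, 64)), ("logits/weights", (1536, 1000))]
print(assignable_pairs(symbolic, pretrained))
# -> [('conv1/weights', (7, 7, 3, 64))]: only the backbone is restored
```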

taehoonlee (Owner) commented:

The first error was raised because `nets.pretrained` attempts to restore all the global variables, which include the extra weights declared by `tf.train.Optimizer` (the Adam slot variables).
The second one was caused by the shape mismatch in the last layer: the pretrained classifier has 1000 outputs, while the new one has 50.
Thank you for reporting the issue again @miha-skalic; please see the updates to the weight-assignment procedure!
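The simplest way to avoid the first error is to call `nets.pretrained(model)` before declaring the optimizer, so that no slot variables exist yet. Alternatively, the optimizer's variables can be filtered out by name before restoring; the sketch below assumes TF 1.x Adam naming conventions and is illustrative only, not tensornets' implementation:

```python
# Hedged sketch: select only model variables by excluding optimizer-created
# names, mirroring what a scope-restricted restore would do in TF 1.x.
ADAM_MARKERS = ("/Adam", "/Adam_1", "beta1_power", "beta2_power")

def model_variables(global_var_names):
    return [n for n in global_var_names
            if not any(m in n for m in ADAM_MARKERS)]

names = ["resnet152v2/conv1/weights", "resnet152v2/conv1/weights/Adam",
         "resnet152v2/conv1/weights/Adam_1", "beta1_power", "beta2_power"]
print(model_variables(names))  # -> ['resnet152v2/conv1/weights']
```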
