TF-Slim "Training a model from scratch" fails with an error

I am training with the code from

https://github.com/tensorflow/models/tree/master/slim

following the "Training a model from scratch" section, and an error occurs.

I think it is some kind of GPU vs. CPU problem.

My other code runs fine on this machine,

but this script fails.

I run the following command:

python train_image_classifier.py \
    --train_dir=/home/sk/workspace/slim/datasets/log \
    --dataset_name=imagenet \
    --dataset_split_name=train \
    --dataset_dir=/home/sk/workspace/slim/datasets/imagenet \
    --model_name=inception_v3

and this is the error:

Caused by op u'InceptionV3/Logits/Conv2d_1c_1x1/biases/RMSProp_1', defined at:
  File "/home/sk/workspace/slim/train_image_classifier.py", line 573, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/home/sk/workspace/slim/train_image_classifier.py", line 539, in main
    global_step=global_step)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 446, in apply_gradients
    self._create_slots([_get_variable_for(v) for v in var_list])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/rmsprop.py", line 103, in _create_slots
    self._zeros_slot(v, "momentum", self._name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 766, in _zeros_slot
    named_slots[_var_key(var)] = slot_creator.create_zeros_slot(var, op_name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.py", line 174, in create_zeros_slot
    colocate_with_primary=colocate_with_primary)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.py", line 146, in create_slot_with_initializer
    dtype)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/slot_creator.py", line 66, in _create_slot_var
    validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 1049, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 948, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 356, in get_variable
    validate_shape=validate_shape, use_resource=use_resource)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 341, in _true_getter
    use_resource=use_resource)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 714, in _get_single_variable
    validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 197, in __init__
    expected_shape=expected_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 281, in _init_from_args
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 128, in variable_op_v2
    shared_name=shared_name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 708, in _variable_v2
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Cannot assign a device to node 'InceptionV3/Logits/Conv2d_1c_1x1/biases/RMSProp_1': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
Colocation Debug Info:
Colocation group had the following types and devices: 
ApplyRMSProp: CPU 
Const: CPU 
Assign: CPU 
IsVariableInitialized: CPU 
Identity: CPU 
VariableV2: CPU 
     [[Node: InceptionV3/Logits/Conv2d_1c_1x1/biases/RMSProp_1 = VariableV2[_class=["loc:@InceptionV3/Logits/Conv2d_1c_1x1/biases"], container="", dtype=DT_FLOAT, shape=[3], shared_name="", _device="/device:GPU:0"]()]]


Process finished with exit code 1

1 Answer

The script is trying to place some ops on the GPU, but TensorFlow does not see any GPU device: the error lists /job:localhost/replica:0/task:0/cpu:0 as the only available device. That happens when you are running the CPU-only build of TensorFlow, when the CUDA installation is broken, or when the machine has no GPU at all. It looks like you can pass --clone_on_cpu=True to run on the CPU instead.
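To confirm which of those cases applies, you can ask TensorFlow which devices it sees before launching the training script. The following is only a minimal sketch, not part of TF-Slim: the helper names has_gpu, list_tf_devices and extra_train_flags are made up for illustration, and list_tf_devices assumes a TensorFlow 1.x install where tensorflow.python.client.device_lib is importable.

```python
import re

# Matches names such as "/gpu:0" or "/device:GPU:0", but not ".../cpu:0".
_GPU_NAME = re.compile(r"(?:^|[/:])gpu:\d+$", re.IGNORECASE)


def has_gpu(device_names):
    """Return True if any of the given device names refers to a GPU."""
    return any(_GPU_NAME.search(name) for name in device_names)


def list_tf_devices():
    """Return the names of the devices TensorFlow can see.

    Assumption: TensorFlow is installed; the import is done lazily so the
    other helpers stay usable without it.
    """
    from tensorflow.python.client import device_lib
    return [d.name for d in device_lib.list_local_devices()]


def extra_train_flags(device_names):
    """Hypothetical helper: suggest the extra flag from the answer above."""
    return [] if has_gpu(device_names) else ["--clone_on_cpu=True"]
```

Calling extra_train_flags(list_tf_devices()) on the machine from the question should suggest --clone_on_cpu=True, since only a CPU device is registered there. If a GPU is physically present but is not listed, the more likely fix is to install the GPU build of TensorFlow or repair the CUDA setup rather than fall back to the CPU.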
