Image Analogies not Importing theano_backend from Keras Correctly #31

Open
matthewbahr opened this issue Jul 26, 2016 · 20 comments

@matthewbahr

Here's the error output from running just a basic make image with image-analogies:

Using gpu device 0: GeForce GTX 970 (CNMeM is enabled with initial size: 75.0% of memory, cuDNN 4007)
Traceback (most recent call last):
File "make_image_analogy.py", line 17, in <module>
args = image_analogy.argparser.parse_args()
File "build\bdist.win-amd64\egg\image_analogy\argparser.py", line 101, in parse_args
AttributeError: 'module' object has no attribute '_on_gpu'

I'm able to run a full 12-epoch Keras 1.0.5 test on the Theano backend without problems. I tried adding "--a-scale-mode match", which gets me past the strange module issue, but then it crashes on the first pass with this attribute error:

Convolution2D has no attribute 'get_output'

Not really sure what is going on.

@sdierauf

sdierauf commented Aug 3, 2016

I think I've seen this before... If I had to guess, Theano isn't fully compiled properly. Are you running this inside a virtualenv?

@matthewbahr
Author

No, I compiled Theano on my own rather than through a virtualenv.

When I use Keras to drive the Theano backend, it works just fine, I think.

Which error were you seeing when you saw this before? The first or the second? I'm not convinced that they have the same root issue.

@qazxswedcxzaqws

I fixed this by just removing that bit of code, since it was for CPU mode anyway.

@matthewbahr
Author

The only reason I'm going to all this effort to do this on my Windows machine instead of my Mac is that I want CUDA.

@qazxswedcxzaqws

Open image_analogy\argparser.py in Notepad++, go to line 101, and make it look like this:
[screenshot "capture", not preserved]
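
The screenshot is not preserved here. As a rough sketch only, and not necessarily the change that was actually made, one defensive way to avoid an AttributeError on a private backend helper such as `_on_gpu` is to look it up with `getattr` and fall back to a default when it is missing. The helper name `backend_on_gpu` and the stand-in backend objects below are hypothetical, and the snippet is written for Python 3 rather than the 2.7 used in this thread.

```python
import types


def backend_on_gpu(backend, default=False):
    """Call backend._on_gpu() if the backend has it; otherwise return `default`.

    `backend` is any module-like object (hypothetical stand-in for a Keras
    backend module). `_on_gpu` is a private helper that is not guaranteed
    to exist in every Keras version.
    """
    probe = getattr(backend, "_on_gpu", None)
    if callable(probe):
        return bool(probe())
    return default


# Stand-ins for a backend module with and without the private helper.
old_backend = types.SimpleNamespace(_on_gpu=lambda: True)
new_backend = types.SimpleNamespace()

print(backend_on_gpu(old_backend))  # True
print(backend_on_gpu(new_backend))  # False
```

Falling back to a default keeps argument parsing from crashing, but it also hides whether a GPU is really in use, so the default should be chosen deliberately.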

@qazxswedcxzaqws

There should be one or two more errors after this; tell me what they are when you get them. I had the same problems, so I made a couple of rushed temporary fixes and it was up and running.

@matthewbahr
Author

I'll try that, thanks!

@matthewbahr
Author

@qazxswedcxzaqws, I'm getting a new error, like you expected.

Here is everything I get from the call to the output:

A:\Dev\Deepdream>python image-analogies/build/scripts-2.7/make_image_analogy.py images/greatwave.jpg images/greatwaveprime.jpg images/me.jpg images/out/me
Using Theano backend.
DEBUG: nvcc STDOUT mod.cu
   Creating library C:/Users/Crowbahr/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.12-64/tmprydv0j/265abc51f7c376c224983485238ff1a5.lib and object C:/Users/Crowbahr/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.12-64/tmprydv0j/265abc51f7c376c224983485238ff1a5.exp

Using gpu device 0: GeForce GTX 970 (CNMeM is enabled with initial size: 75.0% of memory, cuDNN 4007)
Theano cuda without cuDNN detected. Forcing a-scale-mode to "match"
Using PatchMatch model
Scale factor 0.25 "A" shape (1L, 3L, 864L, 1296L) "B" shape (1L, 3L, 864L, 1296L)
Building loss...
Precomputing static features...
Traceback (most recent call last):
  File "image-analogies/build/scripts-2.7/make_image_analogy.py", line 27, in <module>
    image_analogy.main.main(args, model_class)
  File "build\bdist.win-amd64\egg\image_analogy\main.py", line 69, in main
  File "build\bdist.win-amd64\egg\image_analogy\models\nnf.py", line 17, in build
  File "build\bdist.win-amd64\egg\image_analogy\models\nnf.py", line 55, in build_loss
  File "build\bdist.win-amd64\egg\image_analogy\models\base.py", line 53, in precompute_static_features
  File "build\bdist.win-amd64\egg\image_analogy\models\base.py", line 61, in get_features
  File "build\bdist.win-amd64\egg\image_analogy\models\base.py", line 72, in get_layer_output
AttributeError: 'Convolution2D' object has no attribute 'get_output'

@qazxswedcxzaqws

qazxswedcxzaqws commented Sep 11, 2016

Open image_analogy\models\base.py in Notepad++, then go to line 72 and change it to this:
[screenshot "capture3", not preserved]
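
The screenshot is not preserved, so the exact edit is unknown. For context, Keras 0.3 layers had a `get_output()` method, while Keras 1.0 and later expose the symbolic output as the `layer.output` attribute, which is why the call fails on a newer install. A minimal sketch of a compatibility shim, assuming that is the only difference that matters here, could look like the following. The function name `layer_output` and the two dummy layer classes are hypothetical and exist only so the snippet runs without Keras installed.

```python
def layer_output(layer):
    """Return a layer's output under either the old or the new Keras API."""
    if hasattr(layer, "get_output"):
        # Keras 0.3-style API: output is obtained through a method call.
        return layer.get_output()
    # Keras >= 1.0-style API: output is a plain attribute.
    return layer.output


class OldStyleLayer:
    """Dummy stand-in for a Keras 0.3 layer."""

    def get_output(self):
        return "old-style output"


class NewStyleLayer:
    """Dummy stand-in for a Keras >= 1.0 layer."""

    output = "new-style output"


print(layer_output(OldStyleLayer()))  # old-style output
print(layer_output(NewStyleLayer()))  # new-style output
```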

@matthewbahr
Author

Alllllll righty it's working so far!

@matthewbahr
Author

matthewbahr commented Sep 11, 2016

Well, I'm getting output now, but it doesn't seem to be utilizing much of my GPU. My CPU is maxing out and it's hitting 8 GB of RAM, but my GPU is idling and never reaches more than 1% load according to GPU-Z.

This means the iterations are taking forever.

It does appear to be using all the memory though, hitting 3686MB
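
As an alternative to watching GPU-Z by hand, load and memory can be sampled from a script. This is only a sketch, assuming the NVIDIA driver's `nvidia-smi` tool is on the PATH and reports both fields as numbers on this card; the function names `parse_gpu_line` and `sample_gpu` are made up for illustration, and the sample string simply reuses the numbers quoted above.

```python
import subprocess

# Ask nvidia-smi for GPU load (%) and used memory (MiB) as bare CSV values.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_line(line):
    """Turn one CSV line such as '1, 3686' into (load_percent, used_mib)."""
    load, used = (field.strip() for field in line.split(","))
    return int(load), int(used)


def sample_gpu():
    """Run nvidia-smi once and return one (load, used) tuple per GPU."""
    out = subprocess.check_output(QUERY).decode()
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]


# Parsing demo on a literal sample, so this runs without a GPU present.
print(parse_gpu_line("1, 3686"))  # (1, 3686)
```

Calling `sample_gpu()` in a loop while the script runs would show whether the load ever rises during an iteration. If the driver reports a field as unsupported, the `int` conversion will fail and would need extra handling.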

@qazxswedcxzaqws

That means the program is running in CPU mode, but from what you have posted it looks like it should be running in GPU mode. Mind posting what your output looks like when you run it now?

@matthewbahr
Author

Well, I'm seeing the warning message that I wrote in while making the first change, the one saying "Theano cuda without cuDNN detected."

DEBUG: nvcc STDOUT mod.cu
   Creating library C:/Users/Crowbahr/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.12-64/tmpy5e1xl/265abc51f7c376c224983485238ff1a5.lib and object C:/Users/Crowbahr/AppData/Local/Theano/compiledir_Windows-10-10.0.14393-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.12-64/tmpy5e1xl/265abc51f7c376c224983485238ff1a5.exp

Using gpu device 0: GeForce GTX 970 (CNMeM is enabled with initial size: 75.0% of memory, cuDNN 4007)
Theano cuda without cuDNN detected. Forcing a-scale-mode to "match"
Using PatchMatch model

But part of that output says that CNMeM and cuDNN are there... It looks like it's using the GPU memory but not the cores for processing.

@qazxswedcxzaqws

Just double-check that your changes are 100% identical to the ones I posted earlier, and also that cuDNN is installed correctly, since I'm running an identical version without any "Theano cuda without cuDNN detected. Forcing a-scale-mode to "match"" messages.

@matthewbahr
Author

Might be something with cuDNN. The code is identical.

You're seeing cuDNN 4007 too?

@qazxswedcxzaqws

qazxswedcxzaqws commented Sep 11, 2016

Actually, I just found that some options I was using made the "theano cuda without cuDNN" message disappear; with a usage similar to yours, I still get the message. But my GPU is still being used properly, so it's not really an issue for me. I'm running CUDA 8.0 and cuDNN 5005 since I have a Pascal GPU, but that shouldn't really be an issue, as your CUDA version is probably more compatible than mine since it is older.

@matthewbahr
Author

kk, looks like cuDNN is only available as the 5xxx series now from NVIDIA, so I'm gonna have to work on this. Might as well leave the CPU version running overnight in the meantime.

@matthewbahr
Author

After getting and replacing cuDNN (it's at 5103 now), it's showing the same low GPU load. Occasionally I'll see a spike to 40 or so, but mainly it's not running.

Memory usage is still high.

@qazxswedcxzaqws

qazxswedcxzaqws commented Sep 11, 2016

Strange. All I can do now is recommend this guide, https://github.com/titu1994/Neural-Style-Transfer/blob/master/Guide.md, specifically the "Setting Up Theano for GPU (on Windows)" section, just in case you are missing any dependencies. As there don't seem to be any more error codes, I'm not really sure what's going wrong. The only thing I can chalk it up to is that this program is quite flaky and outdated in comparison to some newer Theano-based alternatives.

@awentzonline
Owner

Hey, the initial issue looks like you were using keras >= 1.0 with this project, which was originally only compatible with keras 0.3. I've upgraded this project to use keras >= 1.0, so that should fix the get_output and _on_gpu errors.

The GPU usage issue can be a combination of things. The output you posted above

Using PatchMatch model

means the patch matching is done with a different algorithm on the CPU. Use the option --model=brute to run the brute-force patch matcher on the GPU.
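
For reference, here is the earlier invocation from this thread with that option appended, assembled as an argument list in Python so it can be passed to `subprocess` or printed for copy-paste. This is a sketch only: the function name `brute_force_command` is hypothetical, the image paths are the ones used above, and the script location will depend on how the package was installed.

```python
def brute_force_command(a, a_prime, b, out_prefix,
                        script="make_image_analogy.py"):
    """Build the argv list for a run that uses the brute-force matcher."""
    return ["python", script, a, a_prime, b, out_prefix, "--model=brute"]


cmd = brute_force_command(
    "images/greatwave.jpg",
    "images/greatwaveprime.jpg",
    "images/me.jpg",
    "images/out/me",
)

# Print the command line; pass `cmd` to subprocess.call to actually run it.
print(" ".join(cmd))
```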

There was another issue with some combination of keras/theano/whatever where the brute-force GPU patch-matching convolutions are "optimized" by theano to use the CPU instead. I've added a fix to explicitly use the cuDNN operations.
