This repository has been archived by the owner on Dec 22, 2023. It is now read-only.

Added a cell to enable GPU computing in caffe #15

Open
wants to merge 1 commit into master

Conversation

@slock83 commented Jul 10, 2015

On CUDA-capable devices, the difference is HUGE. Tested on:
Linux x86_64, i7-4700MQ, GTX 780M
OS X x86_64, i7, GT 750M

Both became way faster.
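
In pycaffe, switching to the GPU boils down to a single call (an optional second call picks the device):

import caffe

caffe.set_mode_gpu()    # run all Caffe computation on the CUDA device
# caffe.set_device(0)   # optional: choose which CUDA device to use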

@ebbie76 commented Jul 13, 2015

Can you provide install instructions? Did you try it on a PC?

@slock83 (Author) commented Jul 13, 2015

Tried it, and it's working blazing fast on my computer!
To install, you need to have compiled Caffe without setting CPU_ONLY; then it should just work if the compilation went well (see the Makefile.config excerpt below).

I tried it on:
MSI laptop, Linux x86_64, Intel Core i7-4700MQ, NVIDIA GTX 780M
and
Apple MacBook Pro, OS X Yosemite, Intel Core i5, NVIDIA GT 750M + Intel Iris Pro (which doesn't handle Caffe)
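
For context, a GPU-capable build keeps the CPU_ONLY flag commented out in Caffe's Makefile.config and points CUDA_DIR at your toolkit; exact paths depend on your install:

# Makefile.config (excerpt): leave CPU_ONLY commented out for a GPU build
# CPU_ONLY := 1
# CUDA directory (adjust to wherever your toolkit is installed):
CUDA_DIR := /usr/local/cuda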

@ebbie76 commented Jul 13, 2015

I have an Alienware 17. Do you think it will work?

@slock83 (Author) commented Jul 13, 2015

The best way to know is to check whether your GPU has CUDA compute capability > 2.0 (it will not work with AMD/ATI cards).

But beware: the complete installation is quite tricky and will most likely require technical skills. Furthermore, there is no official support for Windows, so you will need to use Linux.
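
If you don't want to dig through spec sheets, and you already have CUDA plus the pycuda package installed (pycuda is not needed by this notebook, it is just a convenient way to query the driver), a few lines will print each device's compute capability:

import pycuda.driver as cuda

cuda.init()
for i in range(cuda.Device.count()):
    dev = cuda.Device(i)
    # compute_capability() returns a (major, minor) tuple, e.g. (3, 0) for a GTX 780M
    print(dev.name(), dev.compute_capability())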

@ebbie76 commented Jul 13, 2015

I have an nvidia card. I may not install it after all but thanks for the information.

@SlimeQ commented Jul 13, 2015

Is installation any different from before? Any hiccups I should be expecting?

@slock83 (Author) commented Jul 13, 2015

Read the previous comments if you have any doubts!
On Linux you may want to make sure that your CUDA installation is working properly (it should be OK if you did not set CPU_ONLY during the Caffe build).
Just try it, it's the fastest way to check :)
And do not hesitate to ask if you run into trouble.
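
A minimal smoke test, if you want one: load the model in GPU mode and run a single forward pass; a broken CUDA setup will fail right here instead of deep inside deepdream(). (The file names below are the GoogLeNet files the notebook already uses; adjust the paths to wherever you keep them.)

import caffe

caffe.set_mode_gpu()
net = caffe.Net('deploy.prototxt', 'bvlc_googlenet.caffemodel', caffe.TEST)
net.forward()   # a single forward pass is enough to exercise the CUDA kernels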

@SlimeQ commented Jul 13, 2015

Oh wow, for some reason I thought I was on a Caffe issue... You mean to tell me that I've actually been dreaming on my 10-year-old dual-core Athlon all week?

...That actually makes a ton of sense; I figured deep nets were just super heavy.

Thanks!

@slock83 (Author) commented Jul 13, 2015

Yup, they actually are super heavy (deep learning, convolutions over huge datasets, ...).
But the Caffe Python wrapper is not well documented, and the API has changed in the past month, rendering the old command to switch to the GPU useless, without any notice (I had to read the sources).
What made me figure it out is simple: my top-notch gaming gear was almost on par with a MacBook Air.
So yeah, enjoy dreaming super fast now! I hope this will get merged eventually; it would ease a lot of people's lives.

@burningion

@slock83 Can you share some numbers? What sort of speed up are you looking at?

@slock83 (Author) commented Jul 13, 2015

No benchmark done, but for my build (the Linux machine I described above) it was quite impressive, something like 30x to 50x.
If you have compatible hardware, don't try to weigh the pros and cons, just do it; you won't regret it!

@SlimeQ commented Jul 14, 2015

Wow, yeah, I'm flying right now. I was averaging ~20-30 min per frame before, and now I'm getting (3+5+10+20) seconds. Wonder how long it'll take for Tumblr to block my bot? :P

@ghost commented Jul 14, 2015

I have 2 video cards and one of them has CUDA. I understand that I simply need caffe.set_mode_gpu() to start Caffe in GPU mode, but can I point it to my other video card there? I don't want to use both video cards, just the second one to do GPU stuff with Caffe.

Would this:

caffe.set_device(gpu_id)

work? What would the GPU id be then?

@burningion

A couple of unscientific benchmarks from my setup: an Intel i7-4770K with 16GB RAM and an NVIDIA GTX 760 2GB, a Hackintosh running OS X 10.10.4.

Caffe has been compiled with cuDNN / GPU support, and the latest NVIDIA web driver has also been installed.

I did this using caffe.set_mode_gpu() and caffe.set_mode_cpu() in the import section. With my 760 I get an out-of-memory error on images bigger than 320x240. For this reason I had to loop over the flowers.jpg file included with this distribution for guided dreaming in order to test my GPU. I'll be ordering a 980 Ti later this week and will follow up if anyone is interested.

import timeit

start_time = timeit.default_timer()
for i in xrange(30):
    # one deepdream() call per loop, stopping at the inception_4b/3x3 layer
    _ = deepdream(net, img, end='inception_4b/3x3')
elapsed = timeit.default_timer() - start_time
print elapsed

With GPU enabled: 46.4347259998 seconds
With CPU enabled: 128.624374151 seconds

And @damarusama, yes, your GPUs will be listable; they start from 0 and count up. Depending on your OS, there is an NVIDIA command-line tool (nvidia-smi) to detect which card is which. Generally, with two cards, it will be 0 or 1.
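
So with the CUDA card showing up as the second device, it would look like this, set before the net is loaded:

import caffe

caffe.set_device(1)    # 0 = first card, 1 = second card (check with `nvidia-smi -L`)
caffe.set_mode_gpu()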

@slock83 (Author) commented Jul 14, 2015

I don't know what your PLL is, but it seems quite good, with sub-nanosecond precision!
More seriously, you could trim your numbers to the millisecond; Unix-based systems (which includes BSD, and thus OS X) don't give you anywhere near that timer precision anyway.
Before ordering a huge GPU, check that your NVIDIA injection is working well! Your 760 should absolutely not get an OOM on images that small. We were able to work on the default sky image (1024x700) with a laptop GT 750M with 1GB of video memory (on the same OS X version).

So I'd tell you not to buy a new card, but instead to check your Hackintosh.

@burningion

@Slock I just rebooted and tried re-running everything. You're right, the sky image works perfectly. I'm getting some glitchy behavior from my GPU, which I suppose is to be expected when you're running a Hackintosh.

I'll try running more tests in Ubuntu later, but thanks for the heads-up.

@slock83 (Author) commented Jul 14, 2015

Hackintoshes have always had problems with GPU support; it's almost a miracle that CUDA works on them!

@burningion

Just in case anyone is curious, I'm able to do image frames up to 1280x720 on 2GB of GPU memory (in OS X), but I run out of GPU memory when trying 1920x1080 images.

@slock83 (Author) commented Jul 16, 2015

If you really want to render high-resolution images, you'll have to split them into three parts:
the left half, the right half, and a middle strip running from the first quarter to the third quarter of the image.
Then modify the deepdream function this way, in the loop that does the iterations:
Render the left strip
Copy the right part of the left strip into the left part of the middle strip
Render the middle strip
Copy the right part of the middle strip into the left part of the right strip
Render the right strip
Drop the middle strip
On the next iteration, sweep the other way:
Render the right strip
Copy the left side of the right strip into the right side of the middle strip
...

Well, I think you get the idea. Ideally you would do this but keep only the untouched parts of each render, so the transitions stay smooth (a rough sketch follows below).

Well, good luck if you are going to do this!
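
A very rough sketch of that loop, for anyone who wants to try it (numpy only; `render` and `dream_in_strips` are hypothetical names, with `render` standing in for a single-iteration wrapper around deepdream(), and the image width is assumed to be divisible by 4):

import numpy as np

def dream_in_strips(render, img, n_iter=10):
    # Split the image into an overlapping left half, right half and a middle
    # strip (first quarter to third quarter), alternating the sweep direction
    # every iteration so the seam gets blended from both sides.
    w = img.shape[1]
    half, quarter = w // 2, w // 4
    left = img[:, :half].copy()      # columns [0, W/2)
    right = img[:, half:].copy()     # columns [W/2, W)

    for i in range(n_iter):
        if i % 2 == 0:               # sweep left -> right
            left = render(left)
            middle = np.concatenate((left[:, quarter:], right[:, :quarter]), axis=1)
            middle = render(middle)                   # columns [W/4, 3W/4)
            right[:, :quarter] = middle[:, quarter:]  # hand the overlap over
            right = render(right)
        else:                        # sweep right -> left
            right = render(right)
            middle = np.concatenate((left[:, quarter:], right[:, :quarter]), axis=1)
            middle = render(middle)
            left[:, quarter:] = middle[:, :quarter]
            left = render(left)
        # the middle strip is dropped here and rebuilt on the next pass

    return np.concatenate((left, right), axis=1)

As described above, a refinement would be to keep only the non-overlapping part of each rendered strip, so the overlap acts purely as context and the seams stay smooth.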

@slock83 (Author) commented Jul 16, 2015

With my latest benchmarks, performance was between 7.6 and 8.2 times faster on the GPU (while said GPU was also being used for display, so it may be even faster).
