CUDA code runs but does not march through time #66

Open
amjokisaari opened this issue Aug 18, 2017 · 13 comments

@amjokisaari
Collaborator

In gpu/cuda, I run ./diffusion ../params.txt

The code appears to execute: PNGs and CSVs are generated. However, it looks like no time marching is occurring. The attached image is the final one in the sequence; they all look the same. Runlog.csv does have data generated out to 100,000 iterations.

[attached image: data0100000]

@tkphd
Collaborator

tkphd commented Aug 18, 2017

As it runs, can you use nvidia-smi to check whether the GPU is doing any work? ( #45 )

tkphd added this to the diffusion milestone Aug 18, 2017
@tkphd
Collaborator

tkphd commented Aug 18, 2017

Also, which graphics card are you using? It's possible the specified -gencode flags are mismatched with your hardware. If this is the cause, fixing it properly will require a much more sophisticated make/cmake solution.
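If the flags are the cause, the failure mode is typically silent: the kernel launch fails with "invalid device function", the device buffers are never written, and the host happily copies back the unchanged initial condition. A minimal sketch of the kind of launch check that exposes this (the kernel here is a placeholder, not the actual diffusion kernel):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

/* Print any pending CUDA error; an arch/-gencode mismatch usually
 * surfaces here as "invalid device function" at the first launch. */
#define CHECK_CUDA(msg) do {                                          \
    cudaError_t err = cudaGetLastError();                             \
    if (err != cudaSuccess)                                           \
        fprintf(stderr, "%s: %s\n", (msg), cudaGetErrorString(err));  \
} while (0)

__global__ void dummy_kernel(void) { /* stand-in for the diffusion kernel */ }

int main(void)
{
    dummy_kernel<<<1, 1>>>();
    CHECK_CUDA("kernel launch");     /* catches launch/arch errors            */
    cudaDeviceSynchronize();
    CHECK_CUDA("kernel execution");  /* catches errors raised while running   */
    return 0;
}
```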

@amjokisaari
Collaborator Author

amjokisaari commented Aug 19, 2017

Ahhh, this is opening a can of worms. My laptop (Dell M5000 series) has integrated Intel graphics as well as an Nvidia card (Quadro M1000M). Currently I'm running my OS graphics on the integrated card, and I have no idea how CUDA code fares in this dual-graphics-card-but-running-the-integrated-one situation. Running nvidia-smi in the terminal gives me "command not found"; some brief foruming brings up this as a potential solution...

@amjokisaari
Collaborator Author

I may need to swap the entire system over to Nvidia, which means making friends with all the Nvidia graphics drivers again....

@tkphd
Collaborator

tkphd commented Aug 19, 2017

This looks like an error stemming from a mismatch between the -gencode flags and the hardware.
For your device (Quadro M, so Maxwell architecture?), try -gencode arch=compute_50,code=sm_50, based on this summary; a quick device query (sketched after the list below) will confirm the architecture. If that doesn't work, remove the -gencode arch=compute_xx,code=sm_xx flag entirely. Only then start messing with drivers. For reference, results on the two cards I can test:

  • Tesla K80 (Kepler architecture):
    • -gencode arch=compute_35,code=sm_35: yields expected diffusion field (see the top-level README) 😃
    • No -gencode flag at all: yields expected diffusion field 😃
  • Tesla C2075 (Fermi architecture):
    • -gencode arch=compute_35,code=sm_35: yields all zeros, despite nvidia-smi showing the GPU under load 😦
    • -gencode arch=compute_20,code=sm_20: yields expected diffusion field 😃
    • No -gencode flag at all: yields expected diffusion field 😃
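To confirm which architecture the runtime actually sees (and hence which -gencode values to use), the compute capability can be queried directly. A minimal standalone sketch using the CUDA runtime API, not part of the repo:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        fprintf(stderr, "No CUDA-capable device found.\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        /* e.g. a Maxwell-era Quadro M1000M should report 5.0,
         * matching -gencode arch=compute_50,code=sm_50 */
        printf("Device %d: %s, compute capability %d.%d\n",
               i, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```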

@amjokisaari
Collaborator Author

Nope, neither changing the flags to 50 nor removing them entirely resolved the issue. Boo :(

@tkphd
Collaborator

tkphd commented Sep 6, 2017

Boo indeed. This might be a hardware/driver issue. Can you test on another machine?
(Not giving up on this machine, just want to know if you can get it running at all.)

@amjokisaari
Collaborator Author

...miiiight be able to try doing this in Windows??

Also, here's another stupid question: after reinstallation, the Nvidia card has the open-source nouveau drivers installed. These wouldn't have a chance in hell of working, would they...?

@tkphd
Collaborator

tkphd commented Sep 6, 2017

Worth a try, yeah? (re. both Windows and nouveau)
There's also apparently a deep incompatibility between CUDA and GCC > 4.9, so, yeah. Lots of complications.

@tkphd
Collaborator

tkphd commented Sep 7, 2017

Is it possible your GPU doesn't support double-precision floats?

@amjokisaari
Collaborator Author

amjokisaari commented Sep 7, 2017 via email

@tkphd
Collaborator

tkphd commented Sep 7, 2017

  1. Google your hardware, or
  2. In common_diffusion/type.h, change typedef double fp_t; to typedef float fp_t; (sketched below), then recompile and run again.
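For option 2, the switch is a one-line edit. A sketch, assuming type.h holds little beyond the typedef (the header-guard names here are illustrative, not the file's actual contents):

```c
/* common_diffusion/type.h -- sketch; the real file may contain more */
#ifndef TYPE_H
#define TYPE_H

/* Working precision for all field data.
 * Was: typedef double fp_t;
 * Using float rules out weak or absent fp64 support on the card. */
typedef float fp_t;

#endif /* TYPE_H */
```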

tkphd added a commit that referenced this issue Oct 3, 2017
Addresses #66, closes #96, closes #102, closes #112.
@tkphd
Collaborator

tkphd commented Oct 3, 2017

I had the same bug crop up on older hardware. Building the CUDA example without the architecture-specific -gencode flags worked for me -- committed in 73294de. Does the bug still affect your machine, @amjokisaari?
