forked from gromacs/gromacs
Planning
AncaSC edited this page Nov 12, 2014
- Goal: To have a working OpenCL version
- Status for Phase 1
- For A1.1 log see A1.1 log
- For A1.2 log see A1.2 log
- T1.2.1 Initialisation - detection of available OpenCL platforms and their setup
- T1.2.2 Choosing the target device
- T1.2.3 Adding structures or modifying existing ones to accommodate OpenCL execution
- T1.2.4 Setup code for the execution of an OpenCL kernel
- T1.2.5 Actual execution of OpenCL code
- For A1.3 log see A1.3 log
- T1.3.1 Separation of CUDA host/device code
- T1.3.2 Translation of device code from CUDA to OpenCL
- ST1.3.2.1 Achieving a basic compilable form of the actual kernel code
- ST1.3.2.2 Adapting the data exchange format between host and device
- ST1.3.2.3 Finalising implementation in a fully functional form
- For A1.4 log see A1.4 log
- T1.4.1 Changing build scripts to accommodate the new code and all OpenCL dependencies
- T1.4.2 Adding necessary flags for switching between CUDA and OpenCL
- T1.4.3 Functional OpenCL version on non-NVIDIA GPUs
- T1.5.1 Checking that the changes do not affect the functionality of the initial implementation
- T1.5.2 Validating the consistency between OpenCL and CUDA code
- T1.5.3 Testing on multiple platforms
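As an illustration of task T1.4.2 (flags for switching between CUDA and OpenCL), the following is a minimal sketch of a compile-time backend switch. The macro names `DEMO_USE_CUDA` and `DEMO_USE_OPENCL`, the `demo_backend` enum, and both helper functions are hypothetical and chosen only for this example; they are not the build options or identifiers used in the actual code base.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical backend identifiers, for illustration only. */
enum demo_backend { DEMO_BACKEND_NONE, DEMO_BACKEND_CUDA, DEMO_BACKEND_OPENCL };

/* Assumption: the build system defines at most one of these macros,
 * e.g. by passing -DDEMO_USE_OPENCL to the compiler. */
#if defined(DEMO_USE_CUDA) && defined(DEMO_USE_OPENCL)
#error "Define at most one of DEMO_USE_CUDA and DEMO_USE_OPENCL"
#elif defined(DEMO_USE_OPENCL)
#define DEMO_BACKEND DEMO_BACKEND_OPENCL
#elif defined(DEMO_USE_CUDA)
#define DEMO_BACKEND DEMO_BACKEND_CUDA
#else
#define DEMO_BACKEND DEMO_BACKEND_NONE
#endif

/* Human-readable name of a backend, e.g. for log output. */
static const char *demo_backend_name(enum demo_backend b)
{
    switch (b) {
        case DEMO_BACKEND_NONE:   return "none";
        case DEMO_BACKEND_CUDA:   return "CUDA";
        case DEMO_BACKEND_OPENCL: return "OpenCL";
    }
    assert(!"unknown backend"); /* unreachable for valid enum values */
    return "unknown";
}

/* Backend selected when this translation unit was compiled. */
static enum demo_backend demo_compiled_backend(void)
{
    return DEMO_BACKEND;
}
```

Compiling with `-DDEMO_USE_OPENCL` would make `demo_compiled_backend()` return `DEMO_BACKEND_OPENCL`; with neither macro defined it returns `DEMO_BACKEND_NONE`. Rejecting the case where both macros are set keeps the two code paths mutually exclusive at build time.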
- Goal: To optimize the existing OpenCL kernels
- Status for Phase 2
- For A2.1 log see A2.1 log
- For A2.2 log see A2.2 log
- For A2.3 log see A2.3 log
- A2.4 NVIDIA-specific optimisation to bring the OpenCL performance up to the level of the CUDA implementation
- For A2.4 log see A2.4 log
- For A2.5 log see A2.5 log
- Goal: To finalize the internal architecture
- Status for Phase 3
- Goal: To identify slow computations that can benefit from GPU execution
- Status for Phase 4
- Goal: To enable multi-GPU execution
- Status for Phase 5
- For A5.1 log see A5.1 log
- T5.1.1 Evaluate performance gain for splitting the work currently done by the GPU between several GPUs
- T5.1.2 Evaluate performance gain for offloading work identified in Phase 4 to another GPU than the one used for non-bonded force calculation
- For A5.2 log see A5.2 log
- For A5.3 log see A5.3 log
- For A5.4 log see A5.4 log