11.12.2015
TODO: [Version 0.3]
1. Implement Resilient Back-Propagation (Rprop).
Once complete, rename `train` to `train_backprop` and create a new `train_rprop` method.
Then parametrise the `epoch` method to take either an enum, or a functor selecting the batch training method (see the first two sketches below this list).
2. OPTIMIZE: the `trainer` class has serious problems: too many copies and allocations.
What I need is a flatter (and more elegant) way of forward-propagating; `trainer::fw_propagate` currently spams `cudaMemcpy` and `cudaMalloc`.
In order to fix this, I need to rethink `trainer::fw_propagate` seriously.
A CRAZY IDEA would be to allocate ALL pattern trainer data beforehand:
allocate device buffers for every training pattern ONCE,
then re-use the same pattern trainer data on every epoch iteration.
This will use A LOT of memory, but avoids the constant re-allocation and copying (see the preallocation sketch below this list).
It should be feasible with normal training sets, but may become a problem with large ones.
3. Average Cross-Entropy Error. This may require changes to `back_prop`, since the update rule changes slightly (see the softmax/cross-entropy sketch below this list).
4. Enable Output Regression (Softmax) to properly scale the output (covered in the same sketch).
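
A rough sketch of one common Rprop variant (iRprop-, after Riedmiller & Braun) for item 1; the function name and the per-weight state layout are illustrative assumptions, not the actual API:

    #include <algorithm>

    // Per-weight iRprop- update. The caller keeps `prev_grad` and `step`
    // for every weight; `step` typically starts at 0.1f.
    void irprop_minus(float& w, float grad, float& prev_grad, float& step)
    {
        const float eta_plus = 1.2f, eta_minus = 0.5f;
        const float step_max = 50.0f, step_min = 1e-6f;

        if (grad * prev_grad > 0.0f)        // same sign: grow the step
            step = std::min(step * eta_plus, step_max);
        else if (grad * prev_grad < 0.0f)   // sign flip: shrink, skip this update
        {
            step = std::max(step * eta_minus, step_min);
            grad = 0.0f;
        }
        if (grad > 0.0f)      w -= step;    // move against the gradient sign
        else if (grad < 0.0f) w += step;
        prev_grad = grad;
    }

Unlike plain back-prop there is no global learning rate; only the sign of the gradient and a per-weight step size are used.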
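
And a sketch of the `epoch` parametrisation from item 1; everything except `epoch`, `train_backprop` and `train_rprop` is hypothetical:

    #include <functional>
    #include <vector>

    struct pattern { std::vector<float> input, target; }; // hypothetical sample type

    class trainer
    {
    public:
        using batch_method = std::function<void(trainer&, const pattern&)>;

        // The batch training method is injected as a functor, so the same
        // epoch loop drives both back-prop and Rprop without a switch.
        void epoch(const std::vector<pattern>& set, const batch_method& method)
        {
            for (const auto& p : set)
                method(*this, p);
        }

        void train_backprop(const pattern&) { /* existing `train`, renamed */ }
        void train_rprop(const pattern&)    { /* new in 0.3 */ }
    };

    // usage:
    //   t.epoch(set, [](trainer& t, const pattern& p) { t.train_rprop(p); });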
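
A sketch of the allocate-once idea from item 2; `pattern_data` and the buffer layout are assumptions about the real trainer data:

    #include <cuda_runtime.h>
    #include <vector>

    struct pattern_data
    {
        float* d_input  = nullptr;   // device-side input vector
        float* d_target = nullptr;   // device-side target vector
    };

    // One cudaMalloc per buffer, done ONCE before training starts.
    std::vector<pattern_data> allocate_all(size_t n_patterns,
                                           size_t in_size, size_t out_size)
    {
        std::vector<pattern_data> data(n_patterns);
        for (auto& p : data)
        {
            cudaMalloc(&p.d_input,  in_size  * sizeof(float));
            cudaMalloc(&p.d_target, out_size * sizeof(float));
        }
        return data;
    }

    // Every epoch then re-uses the same buffers: one cudaMemcpy per pattern
    // up front (the patterns never change), kernel launches after that, and
    // cudaFree only at shutdown.

Memory cost is roughly n_patterns * (in_size + out_size) * 4 bytes, which is what makes very large training sets a problem.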
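
A host-side sketch of Softmax scaling and the average cross-entropy error for items 3 and 4; the function names are illustrative:

    #include <cmath>
    #include <vector>

    // Numerically stable Softmax: subtract the max before exponentiating.
    std::vector<float> softmax(const std::vector<float>& z)
    {
        float m = z[0];
        for (float v : z) if (v > m) m = v;

        std::vector<float> o(z.size());
        float sum = 0.0f;
        for (size_t i = 0; i < z.size(); ++i)
            sum += o[i] = std::exp(z[i] - m);
        for (float& v : o) v /= sum;
        return o;
    }

    // Average cross-entropy for one pattern: -(1/K) * sum_k t_k * log(o_k).
    float avg_cross_entropy(const std::vector<float>& o,
                            const std::vector<float>& t)
    {
        float e = 0.0f;
        for (size_t k = 0; k < o.size(); ++k)
            e -= t[k] * std::log(o[k] + 1e-12f);  // epsilon guards log(0)
        return e / o.size();
    }

This is where `back_prop` changes: with Softmax outputs and cross-entropy, the output-layer delta reduces to (o_k - t_k) instead of the error times the activation derivative (scaled by 1/K if the averaged form is used).
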
NOTE: Profiling with nvprof was revealing: most of the time is spent allocating and copying memory.
This needs to be fixed.
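
The note above should be reproducible with a plain nvprof run; the binary name here is hypothetical:

    nvprof ./nn_trainer diabetes                    # kernel and CUDA API time summary
    nvprof --print-gpu-trace ./nn_trainer diabetes  # per-call memcpy/kernel trace
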
NOTE: Running Version 0.2 on `diabetes`, GPU usage is 25-30% and memory usage 307 MB/2 GB, with memory-bus and PCIe usage at 1-2%.
Therefore, v0.2 did not maximise GPU utilisation.