ModelNet10 dataset downloaded from Princeton ModelNet.
- Original OFF files converted into PLY format.
- Each polygon object is rendered and imaged along three orthogonal axes.
- Each axis is represented by twelve images, which are frames of one rendered video.
- Plain 3D-CNN architecture with 16-filter layer stack.
- Batch normalization is applied, with LeakyReLU as the activation function.
- Features from three axes are fed to the same network.
- The final prediction is made by voting over the three per-axis results.
- Model summary (link)
- Trained for 20 epochs.
- Best test accuracy 91.6%.
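
The OFF-to-PLY conversion step listed above might look roughly like the sketch below. It is not the converter used for this project. The function names `parse_off` and `to_ascii_ply` are hypothetical, and the code assumes plain geometry only (no per-vertex or per-face colors) and writes ASCII PLY. It also tolerates a header fused with the counts on one line (for example `OFF490 518 0`), which occurs in some ModelNet files.

```python
def parse_off(text):
    """Parse OFF text into (vertices, faces). Assumes no color data."""
    # Drop blank lines and '#' comments, then split into whitespace tokens.
    tokens = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            tokens.extend(line.split())
    if not tokens or not tokens[0].startswith("OFF"):
        raise ValueError("not an OFF file")

    # Handle a header fused with the vertex count, e.g. "OFF490 518 0".
    fused = tokens[0][3:]
    rest = ([fused] if fused else []) + tokens[1:]

    n_verts, n_faces = int(rest[0]), int(rest[1])  # rest[2] is the edge count
    pos = 3

    vertices = []
    for _ in range(n_verts):
        vertices.append(tuple(float(t) for t in rest[pos:pos + 3]))
        pos += 3

    faces = []
    for _ in range(n_faces):
        k = int(rest[pos])  # number of vertex indices in this face
        faces.append(tuple(int(t) for t in rest[pos + 1:pos + 1 + k]))
        pos += 1 + k
    return vertices, faces


def to_ascii_ply(vertices, faces):
    """Serialize vertices and faces as an ASCII PLY string."""
    lines = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(vertices)}",
        "property float x",
        "property float y",
        "property float z",
        f"element face {len(faces)}",
        "property list uchar int vertex_indices",
        "end_header",
    ]
    lines += [f"{x} {y} {z}" for x, y, z in vertices]
    lines += [" ".join(str(v) for v in (len(f), *f)) for f in faces]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A single triangle as a minimal OFF input.
    off_text = "OFF\n3 1 0\n0 0 0\n1 0 0\n0 1 0\n3 0 1 2\n"
    print(to_ascii_ply(*parse_off(off_text)))
```

Token-based parsing keeps the sketch short and handles the fused header, but it would misread files that append color values to face lines.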
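
The voting step over the three per-axis results can be illustrated with a small sketch. The notes do not say whether hard or soft voting is used, so this assumes hard (majority) voting on each axis's most probable class. The tie-break for the case where all three axes disagree (highest summed probability) is also an assumption, and the function name `vote` is hypothetical.

```python
from collections import Counter


def vote(per_axis_probs):
    """Combine per-axis class probabilities into one predicted class index.

    per_axis_probs: three equal-length lists of class probabilities,
    one list per rendering axis.
    """
    # Each axis casts one vote for its most probable class.
    votes = [max(range(len(p)), key=p.__getitem__) for p in per_axis_probs]
    label, count = Counter(votes).most_common(1)[0]
    if count > 1:
        return label

    # Assumed tie-break: all axes disagree, so pick the class with the
    # highest probability summed over the axes.
    n_classes = len(per_axis_probs[0])
    totals = [sum(p[c] for p in per_axis_probs) for c in range(n_classes)]
    return max(range(n_classes), key=totals.__getitem__)


if __name__ == "__main__":
    # Two of the three axes favor class 1.
    probs = [[0.1, 0.9, 0.0], [0.2, 0.7, 0.1], [0.8, 0.1, 0.1]]
    print(vote(probs))  # 1
```

With only three voters, a strict majority exists whenever at least two axes agree, so the tie-break only matters when every axis predicts a different class.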