go-cv-simd

This package is the low-level Go assembly layer of go-cv, a wrapper around the Simd Library.

SIMD

The Simd Library is a highly optimized image processing library. It provides many high-performance algorithms for image processing, such as:

  • pixel format conversion
  • image scaling and filtration
  • extraction of statistical information from images
  • motion detection
  • object detection (HAAR and LBP classifier cascades)
  • classification
  • neural network

The algorithms are optimized using different SIMD CPU extensions. In particular, the library supports the following CPU extensions:

  • x86/x64: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX and AVX2
  • ARM: NEON

c2goasm

This wrapper relies on c2goasm to embed the assembly for the individual functions into Go.

See the sse2.sh script for more information on how to invoke it.

Performance compared to OpenCV 2.x

A comparison against go-opencv shows the following results:

                               OpenCV          SSE2
benchmark                   old ns/op     new ns/op      delta
BenchmarkGaussian-8             74338         18481    -75.14%
BenchmarkGaussianRGB-8         186024         57169    -69.27%
BenchmarkBlur-8                110155         16623    -84.91%
BenchmarkBlurRGB-8             293017         53716    -81.67%
BenchmarkMedian3x3-8           129268         23270    -82.00%
BenchmarkMedian3x3RGB-8        169857         65896    -61.21%
BenchmarkMedian5x5-8           883311        131812    -85.08%
BenchmarkMedian5x5RGB-8       1246845        388415    -68.85%
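
The delta column is the relative change in ns/op between the two runs, so a negative value means the SSE2 version is faster. As a minimal sketch (the function names deltaPercent and speedup are illustrative and not part of this package), the arithmetic can be reproduced from any row of the table:

```go
package main

import "fmt"

// deltaPercent returns the relative change from oldNs to newNs in percent.
// A negative result means the new measurement is faster.
// (Hypothetical helper, not part of go-cv-simd.)
func deltaPercent(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

// speedup returns how many times faster the new measurement is.
// (Hypothetical helper, not part of go-cv-simd.)
func speedup(oldNs, newNs float64) float64 {
	return oldNs / newNs
}

func main() {
	// Values taken from the BenchmarkGaussian-8 row of the table.
	oldNs, newNs := 74338.0, 18481.0
	fmt.Printf("delta: %.2f%%, speedup: %.2fx\n",
		deltaPercent(oldNs, newNs), speedup(oldNs, newNs))
	// prints "delta: -75.14%, speedup: 4.02x"
}
```

Read this way, a delta of -75.14% corresponds to roughly a 4x speedup for the grayscale Gaussian filter.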

Performance

Below are the performance figures for SSE2:

BenchmarkFilterRhomb3x3-8                    	  200000	     11114 ns/op
BenchmarkFilterRhomb3x3RGB-8                 	   50000	     30721 ns/op
BenchmarkFilterRhomb5x5-8                    	   20000	     64515 ns/op
BenchmarkFilterRhomb5x5RGB-8                 	   10000	    207618 ns/op
BenchmarkFilterSquare3x3-8                   	   50000	     24205 ns/op
BenchmarkFilterSquare3x3RGB-8                	   20000	     89507 ns/op
BenchmarkFilterSquare5x5-8                   	   10000	    133944 ns/op
BenchmarkFilterSquare5x5RGB-8                	    5000	    373998 ns/op
BenchmarkGaussianBlur-8                      	  100000	     18290 ns/op
BenchmarkBgraToGray-8                        	  100000	     21582 ns/op
BenchmarkMeanFilter-8                        	  100000	     16248 ns/op
BenchmarkMeanFilterRGB-8                     	   20000	     65692 ns/op
BenchmarkFillBgr-8                           	  300000	      5859 ns/op
BenchmarkFillBgra-8                          	  200000	      7691 ns/op
BenchmarkLaplace-8                           	   50000	     23146 ns/op
BenchmarkInt16ToGray-8                       	  100000	     23108 ns/op
BenchmarkGrayToBgra-8                        	  200000	      9651 ns/op
BenchmarkDeinterleaveUv-8                    	  200000	     11442 ns/op
BenchmarkBinarization-8                      	  300000	      3910 ns/op
BenchmarkAveragingBinarization-8             	   20000	     71418 ns/op
BenchmarkBgraToYuv420p-8                     	   30000	     46726 ns/op
BenchmarkBgraToYuv422p-8                     	   30000	     55859 ns/op
BenchmarkBgraToYuv444p-8                     	   20000	     61443 ns/op
BenchmarkAlphaBlending-8                     	   30000	     57316 ns/op
BenchmarkAbsGradientSaturatedSum-8           	  200000	      7354 ns/op
BenchmarkSquaredDifferenceSum-8              	  200000	      6192 ns/op
BenchmarkSquaredDifferenceSumMasked-8        	  200000	      8200 ns/op
BenchmarkAbsDifferenceSum-8                  	  500000	      3169 ns/op
BenchmarkAbsDifferenceSumMasked-8            	  300000	      4378 ns/op
BenchmarkAbsDifferenceSums3x3-8              	  100000	     12497 ns/op
BenchmarkAbsDifferenceSums3x3Masked-8        	  100000	     18695 ns/op
BenchmarkAddFeatureDifference-8              	  200000	      9140 ns/op
BenchmarkBackgroundInitMask-8                	  500000	      3883 ns/op
BenchmarkBackgroundGrowRangeSlow-8           	  200000	      7023 ns/op
BenchmarkBackgroundGrowRangeFast-8           	  300000	      5264 ns/op
BenchmarkBackgroundIncrementCount-8          	  200000	     10248 ns/op
BenchmarkBackgroundAdjustRange-8             	  100000	     11832 ns/op
BenchmarkBackgroundAdjustRangeMasked-8       	  100000	     14080 ns/op
BenchmarkBackgroundShiftRange-8              	  200000	      6310 ns/op
BenchmarkBackgroundShiftRangeMasked-8        	  200000	      8252 ns/op
BenchmarkHistogramAbsSecondDerivative-8      	   30000	     46647 ns/op
BenchmarkHistogramMasked-8                   	   30000	     48566 ns/op
BenchmarkHistogramConditional-8              	   30000	     48975 ns/op
BenchmarkSegmentationFillSingleHoles-8       	   50000	     26726 ns/op
BenchmarkSegmentationChangeIndex-8           	  500000	      3430 ns/op
BenchmarkSegmentationPropagate2x2-8          	   50000	     38551 ns/op
BenchmarkSobelDx-8                           	  100000	     17599 ns/op
BenchmarkSobelDy-8                           	  100000	     17063 ns/op
BenchmarkConditionalCount8u-8                	  500000	      3159 ns/op
BenchmarkConditionalCount16i-8               	  200000	      6549 ns/op
BenchmarkConditionalSum-8                    	  500000	      3284 ns/op
BenchmarkConditionalSquareSum-8              	  300000	      5789 ns/op
BenchmarkConditionalSquareGradientSum-8      	  100000	     15947 ns/op
BenchmarkConditionalFill-8                   	  300000	      4195 ns/op
BenchmarkOperationBinary8u-8                 	  500000	      4055 ns/op
BenchmarkOperationBinary16i-8                	  200000	     11250 ns/op
BenchmarkReduceGray2x2-8                     	  100000	     15383 ns/op
BenchmarkReduceGray3x3-8                     	   20000	     71412 ns/op
BenchmarkReduceGray4x4-8                     	   10000	    134250 ns/op
BenchmarkReorder16bit-8                      	  500000	      3125 ns/op
BenchmarkReorder32bit-8                      	  300000	      5176 ns/op
BenchmarkReorder64bit-8                      	  300000	      5134 ns/op
BenchmarkStretchGray2x2-8                    	  500000	      3548 ns/op
BenchmarkYuv420pToBgra-8                     	   30000	     50043 ns/op
BenchmarkYuv422pToBgra-8                     	   30000	     54433 ns/op
BenchmarkYuv444pToBgra-8                     	   30000	     54661 ns/op
BenchmarkTextureGetDifferenceSum-8           	  300000	      5642 ns/op
BenchmarkTextureBoostedUv-8                  	  300000	      5139 ns/op
BenchmarkTextureBoostedSaturatedGradient-8   	  100000	     22974 ns/op
BenchmarkTexturePerformCompensation-8        	  500000	      2801 ns/op
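
Each line above follows the standard Go benchmark output format: the benchmark name with a GOMAXPROCS suffix (-8 here), the number of iterations, and the average time per iteration. The following is a minimal sketch for reading such lines programmatically; the result type and parseLine function are illustrative names and not part of this package:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// result holds the fields of one benchmark output line.
// (Hypothetical type, not part of go-cv-simd.)
type result struct {
	Name       string  // benchmark name without the -N suffix
	Procs      int     // GOMAXPROCS suffix, 1 if absent
	Iterations int     // number of times the benchmark body ran
	NsPerOp    float64 // average time per iteration in nanoseconds
}

// parseLine splits a line such as
// "BenchmarkFilterRhomb3x3-8  200000  11114 ns/op" into its fields.
func parseLine(line string) (result, error) {
	f := strings.Fields(line)
	if len(f) < 4 || f[3] != "ns/op" {
		return result{}, fmt.Errorf("not a benchmark line: %q", line)
	}
	r := result{Name: f[0], Procs: 1}
	// Strip a trailing "-N" only when N is numeric.
	if i := strings.LastIndex(f[0], "-"); i >= 0 {
		if p, err := strconv.Atoi(f[0][i+1:]); err == nil {
			r.Name, r.Procs = f[0][:i], p
		}
	}
	var err error
	if r.Iterations, err = strconv.Atoi(f[1]); err != nil {
		return result{}, err
	}
	if r.NsPerOp, err = strconv.ParseFloat(f[2], 64); err != nil {
		return result{}, err
	}
	return r, nil
}

func main() {
	r, err := parseLine("BenchmarkFilterRhomb3x3-8 \t 200000\t 11114 ns/op")
	if err != nil {
		panic(err)
	}
	// Report the time per call in microseconds.
	fmt.Printf("%s: %.1f us per call\n",
		strings.TrimPrefix(r.Name, "Benchmark"), r.NsPerOp/1000)
}
```

Converting ns/op to microseconds per call makes it easier to compare the filters at a glance, for example about 11.1 us for FilterRhomb3x3.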

go-cv-simd

See the underlying package go-cv-simd for more information.

License

go-cv-simd is released under the Apache License v2.0. You can find the complete text in the file LICENSE.