diff --git a/go/README.md b/go/README.md
index 488fb2b..9d52d15 100644
--- a/go/README.md
+++ b/go/README.md
@@ -15,7 +15,7 @@ import (
 )
 
 func init() {
-    sppark.Load("poc.cu", "-arch=native")
+    sppark.Load("poc.cu", "-O2")
 }
 
 func CudaFunc() {
@@ -30,3 +30,5 @@ func CudaFunc() {
 In the presented case `sppark.Load()` attempts to load `poc.so`, a shared library with the name derived from the first argument to the method, that is expected to reside next to the **current** executable. If not found, the method will attempt to compile `poc.cu` with `nvcc` and retry to load it. There may be any number of wrappers implemented in the bridge module. And one needs a copy of [`cgo_sppark.h`](cgo_sppark.h) in the same directory.
 
 If so desired, the CUDA module and the Go bridge can be packaged into a Go module for the target application to `import`. The nature of this Go module is such that if a user wants to compile the shared object prior the application being executed for the first time, it's on the user. Because that's where the actual CUDA code is. One way is to implement a test that won't even have to make any CUDA calls, it's sufficient to copy the generated shared object to a directory of your choice. Consider [`poc_test.go`](../poc/go/poc_test.go) as a template.
+
+If you use Windows, recall that on Windows [`cgo`](https://pkg.go.dev/cmd/cgo) depends on [MinGW GCC](https://www.mingw-w64.org/) being available on your %PATH%. Verify with `gcc -dM -E -x c nul: | findstr "MINGW64"`. In addition to which you need to have [`nvcc`](https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64) and even [`cl`](https://learn.microsoft.com/en-us/visualstudio/ide/reference/command-prompt-powershell), the one targeting x64.
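
The Windows note added in the last hunk names three tools that must be reachable through `%PATH%`: `gcc`, `nvcc` and `cl`. A minimal sketch of a pre-flight check for that requirement follows. It is not part of sppark; `missingTools`, `lookupFunc` and `errMissing` are hypothetical names introduced only for illustration, and the only real API relied on is `exec.LookPath` from the Go standard library. The sketch confirms only that each name resolves on the search path. It does not check that `gcc` is a MinGW build or that `cl` targets x64, so the `findstr "MINGW64"` command from the note is still needed for that.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// errMissing is a hypothetical sentinel error, handy for fake lookups in tests.
var errMissing = errors.New("tool not found")

// lookupFunc has the same signature as exec.LookPath, so the real
// lookup can be replaced with a fake one.
type lookupFunc func(name string) (string, error)

// missingTools returns, in input order, every tool that lookup fails to resolve.
func missingTools(tools []string, lookup lookupFunc) []string {
	var missing []string
	for _, tool := range tools {
		if _, err := lookup(tool); err != nil {
			missing = append(missing, tool)
		}
	}
	return missing
}

func main() {
	// The three tools listed in the Windows note.
	required := []string{"gcc", "nvcc", "cl"}

	if missing := missingTools(required, exec.LookPath); len(missing) > 0 {
		fmt.Println("not found on PATH:", missing)
		return
	}
	fmt.Println("gcc, nvcc and cl are all on PATH")
}
```

Passing the lookup as a parameter keeps the check testable on machines that have none of these tools installed. In an application it could run before the first `sppark.Load()` call, so that a missing compiler is reported by name.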