dj_fft: Header-only FFT library

Details

This repository provides a header-only library for computing Fourier transforms in 1D, 2D, and 3D. Its goal is to provide a fast Fourier transform (FFT) implementation that is both quick and easy to use.

Cloning

Clone the repository and all its submodules using the following command:

git clone --recursive git@github.com:jdupuy/dj_fft.git

If you accidentally omitted the --recursive flag when cloning the repository, you can retrieve the submodules like so:

git submodule update --init --recursive

Usage

Each of the 1D, 2D, and 3D FFT routines takes two arguments: a std::vector<std::complex<T>> holding the data to transform, and a value of the enum class dj::fft_dir that selects the direction of the FFT (dj::fft_dir::DIR_FWD for the forward direction, dj::fft_dir::DIR_BWD for the backward direction). Each routine returns the result as a new std::vector<std::complex<T>>.

The input vector must be of size N for a 1D FFT, NxN for a 2D FFT, and NxNxN for a 3D FFT, where N is a power of two. The 2D and 3D data must be stored in a single flat vector in row-major order, so that i varies fastest: the elements (i, j) and (i, j, k) are located at index i + N * j and i + N * (j + N * k), respectively.
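The size and layout rules above can be checked with a few small helpers. This is only a sketch: is_pow2, index2d, and index3d are hypothetical names introduced here for illustration and are not part of dj_fft.h.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration; they are not provided by dj_fft.h.

// True when n is a power of two (1, 2, 4, 8, ...).
constexpr bool is_pow2(std::size_t n)
{
  return n != 0 && (n & (n - 1)) == 0;
}

// Flat index of element (i, j) in an N x N array; i varies fastest.
constexpr std::size_t index2d(std::size_t i, std::size_t j, std::size_t N)
{
  return i + N * j;
}

// Flat index of element (i, j, k) in an N x N x N array; i varies fastest.
constexpr std::size_t index3d(std::size_t i, std::size_t j, std::size_t k,
                              std::size_t N)
{
  return i + N * (j + N * k);
}
```

For instance, with N = 8 the 2D element (3, 2) lands at flat index 3 + 8 * 2 = 19, and an input whose side length fails is_pow2 would need to be padded or resampled before being passed to the FFT routines.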

Below is a minimal C++ sketch that computes a 2D FFT in the forward direction:

#define DJ_FFT_IMPLEMENTATION // define this in exactly *one* .cpp file
#include "dj_fft.h"

#include <complex>
#include <cstdio>
#include <vector>

void some_function()
{
  const int N = 4; // input size per dimension (example value; must be a power of two)
  auto myData = std::vector<std::complex<float>>(N * N); // input data

  // prepare data (example values; replace with your own)
  for (int j = 0; j < N; ++j) {
    for (int i = 0; i < N; ++i) {
      myData[i + N * j] = std::complex<float>(float(i), float(j)); // set element (i, j)
    }
  }

  // compute forward 2D FFT
  auto fftData = dj::fft2d(myData, dj::fft_dir::DIR_FWD);

  // print the data, one row of the 2D result per line
  for (int j = 0; j < N; ++j) {
    for (int i = 0; i < N; ++i) {
      printf("{%f, %f} ", fftData[i + N * j].real(), fftData[i + N * j].imag());
    }
    printf("\n");
  }
}

To see examples that compile, see the examples/ directory.

GPU Acceleration

Additionally, the library provides GPU-accelerated 1D, 2D, and 3D FFTs for std::vector<std::complex<float>> inputs. GPU acceleration is especially relevant for large 2D and 3D datasets. For instance:

for an input of size 4096x4096, a regular 2D FFT completes in roughly 18 seconds on an Intel i7-8086k, and 0.9 seconds on an NVIDIA RTX 2080;
for an input of size 512x512x512, a regular 3D FFT completes in roughly 131 seconds on an Intel i7-8086k, and 8.2 seconds on an NVIDIA RTX 2080.

The following table provides a more comprehensive set of measurements for 2D FFTs:

2D FFT Resolution	256²	512²	1024²	2048²	4096²	8192²
CPU (i7-8086k)	0.05s	0.22s	0.99s	4.32s	18.85s	81.96s
GPU (RTX 2080)	0.01s	0.02s	0.07s	0.24s	0.94s	3.68s
GPU speed-up	x5	x11	x14	x18	x20	x22

The following table provides a more comprehensive set of measurements for 3D FFTs:

3D FFT Resolution	64³	128³	256³	512³
CPU (i7-8086k)	0.19s	1.72s	15.70s	141.18s
GPU (RTX 2080)	0.04s	0.15s	1.03s	8.10s
GPU speed-up	x5	x11	x15	x17
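Timings like those in the two tables can be collected with a small wall-clock wrapper. The sketch below is an assumption about how one might measure, not a description of how the numbers above were obtained; time_seconds and rounded_speedup are hypothetical helpers that are not part of dj_fft.h. The speed-up rows are consistent with the CPU time divided by the GPU time, rounded to the nearest integer.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Hypothetical helper: runs f() once and returns the elapsed wall-clock
// time in seconds, measured with a monotonic clock.
template <typename F>
double time_seconds(F&& f)
{
  const auto t0 = std::chrono::steady_clock::now();
  f();
  const auto t1 = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(t1 - t0).count();
}

// Hypothetical helper: CPU time over GPU time, rounded to the nearest integer.
long rounded_speedup(double cpu_seconds, double gpu_seconds)
{
  return std::lround(cpu_seconds / gpu_seconds);
}
```

A call such as time_seconds([&] { result = dj::fft2d(myData, dj::fft_dir::DIR_FWD); }) would time a single CPU transform; averaging several runs gives steadier numbers.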

Below is a minimal C++ sketch that computes a 1D FFT in the backward direction on the GPU:

#define DJ_FFT_IMPLEMENTATION // define this in exactly *one* .cpp file
#include "dj_fft.h"

#include <complex>
#include <cstdio>
#include <vector>

void some_function()
{
  const int N = 8; // input size (example value; must be a power of two)
  auto myData = std::vector<std::complex<float>>(N); // input data

  // prepare data (example values; replace with your own)
  for (int i = 0; i < N; ++i) {
    myData[i] = std::complex<float>(float(i), 0.0f); // set element (i)
  }

  // compute backward 1D FFT on the GPU
  auto fftData = dj::fft1d_gpu(myData, dj::fft_dir::DIR_BWD);

  // print the data
  for (int i = 0; i < N; ++i) {
    printf("{%f, %f}\n", fftData[i].real(), fftData[i].imag());
  }
}

Note that the values returned by a GPU FFT may differ slightly from those of a regular (CPU) FFT, because floating-point arithmetic is implemented differently on the two devices.
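Because of these small differences, CPU and GPU results are better compared with a tolerance than with exact equality. The following is a minimal sketch: max_abs_diff and nearly_equal are hypothetical helpers that are not part of dj_fft.h, and the default tolerance of 1e-4 is an arbitrary assumption that should be tuned to the size and magnitude of your data.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cvec = std::vector<std::complex<float>>;

// Hypothetical helper: largest magnitude of the element-wise difference
// between two equally sized complex vectors.
float max_abs_diff(const cvec& a, const cvec& b)
{
  assert(a.size() == b.size());
  float worst = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    worst = std::max(worst, std::abs(a[i] - b[i]));
  }
  return worst;
}

// Hypothetical helper: true when no element differs by more than tol.
// The default tolerance is an assumed value, not one taken from the library.
bool nearly_equal(const cvec& a, const cvec& b, float tol = 1e-4f)
{
  return max_abs_diff(a, b) <= tol;
}
```

With this in place, the outputs of dj::fft1d and dj::fft1d_gpu on the same input could be passed to nearly_equal instead of being compared element by element with ==.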

For a complete example that compiles, see the examples/ directory.

GPU Acceleration (Advanced)

By default, the GPU-accelerated routines run on the primary GPU. Users who want to run the FFT on a secondary GPU must create an OpenGL context themselves and use the fftNd_gpu_glready functions. You can create a custom OpenGL context with a cross-platform windowing library such as GLFW (https://www.glfw.org/) and an OpenGL function loader such as glad (https://glad.dav1d.de/). I'll probably add a sample at some point.

License

This library is in the public domain. You can do anything you want with it. You have no legal obligation to do anything else, although I appreciate attribution.

It is also licensed under the MIT open source license, if you have lawyers who are unhappy with public domain. The dj_fft.h source file includes an explicit dual-license for you to choose from.
