-
Notifications
You must be signed in to change notification settings - Fork 154
Generating Bit Reverse Tables for FFTs
Generating bit reversal tables for CMSIS-DSP is a question which is coming back often. In this post, I'll explain how to do it. (I'll use the naming "bit reversal" even if not totally correct).
Pure radix case is the simplest. The reversal formula is obvious and the out of place reversal and in-place reversal are the same.
In CMSIS the bit reversal tables are in bytes and take into account the fact that the FFT is complex. So there is a factor of 8 applied to the sample indexes.
In this post I'll just consider the index. So you'll need to multiply by 8 to generate the final CMSIS-DSP tables.
Let's take the example of the 64 samples CFFT F32.
The index of a sample is written
8 n1 + n2 with 0 <= n1,n2 < 8
The reversal of this index is just a reversal of the values n1, n2 and is written
8 n2 + n1
The correspondence
8 n1 + n2 -> 8 n2 + n1
is giving the transposition required to implement the bit reversing table.
You must of course remove the cases where 8 n1 + n2 is equal to 8 n2 + n1. And if you include the case a -> b, don't include the case b -> a.
This list of transpositions is working for an out of place reversal and also for an in-place one because the transpositions are always disjoint ones. It won't be the case any more with mixed radix as we will see. And since CMSIS-DSP is doing an in-place reversing, it will have to be taken into account.
The radix-4 FFT in CMSIS-DSP are implemented in such a way that the bit reversal is in fact a base 2 one. So, if you decompose your index as:
4 n1 + 2 n2 + n3
and reverse it as
4 n3 + 2 n2 + n1
then you'll get the tables for the radix-4 FFT and radix 4x2 of CMSIS.
Mixed radix is more complex for two reasons:
- The reversal formula is not as obvious.
- The reversal formula is not giving directly the list of transpositions to apply for the in-place algorithm. It is just giving a transformation for the out-of-place algorithm.
Let's take as example a 16 sample FFT.
The reversal formula is:
2 b + a -> 8 a + b
with 0 <= a < 1 and 0 <= b < 7.
If we compute all the values we get:
{0 -> 0, 2 -> 1, 4 -> 2, 6 -> 3, 8 -> 4, 10 -> 5, 12 -> 6, 14 -> 7,
1 -> 8, 3 -> 9, 5 -> 10, 7 -> 11, 9 -> 12, 11 -> 13, 13 -> 14,
15 -> 15}
We see a problem with the transform 8 -> 4 and 4 -> 2. They are not disjoint since both involve the position 4.
If we compute the out-of-order reordering by applying above rules to the sequence [1..15], we get:
{8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15}
To go from [1..15] to the above order, the permutation required is:
(1 2 4 8)(3 6 12 9)(5 10)(7 14 13 11)
Where (a b c d) is a cycle representing the transform a -> b -> c -> d -> a
This permutation can be computed using FindPermutation in Mathematica, or Permutation.from_sequence using the Python symbolic package sympy.
So, we see that the effect of having overlap in the transformation rules is to generate a cycle in the final permutation.
To do this permutation in place we need to decompose the cycle in a sequence of transpositions. There is not an unique way to do it. So you may not (and will not) recover the table of the CMSIS-DSP but you'll get tables with the same length and the same effect on the array of samples.
In addition to that, the disjoint cycles being independent they can be reordered in the permutation. It means that the order in the final CMSIS-DSP table is also a bit arbitrary.
So the best way to check that the resulting table is correct is to compute its equivalent permutation. You don't need to be bit exact with the CMSIS-DSP bit reversal tables to be right.
I choose to decompose a cycle
(a b c d) as (a d)(b d)(c d).
So I finally get the following list of transpositions:
(1 8)(2 8)(4 8)(3 9)(6 9)(12 9)(5 10)(7 11)(14 11)(13 11)
If we flatten the list of numbers and multiply by 8, we get the following table:
8,64, 16,64, 32,64, 24,72, 48,72, 96,72, 40,80, 56,88, 112,88, 104,88
The original CMSIS-DSP table is:
8,64, 24,72, 16,64, 40,80, 32,64, 56,88, 48,72, 88,104, 72,96, 104,112
First obvious difference : the order is different. But it is not the only difference.
In our table we have the pairs (56,88), (112,88) and (104,88). In CMSIS-DSP table, we have (56,88), (88,104) and (104,112).
And you can check that:
(56 88)(112 88)(104 88) = (56 88)(88 104)(104 112)
So both tables are representing the same permutation of the samples.
For a 8192 table, the correspondence formula would be:
1024 e + 128 d + 16 c + 2 b + a -> 4096 a + 512 b + 64 c + 8 d + e
with 0 <= a < 2 and 0 <= b,c,d,e < 8.
For a 32 samples FFT, the correspondence formula is:
4 b + a -> 8 a + b
with 0 <= a < 4 and 0 <= b < 8
For a 2048 FFT, the correspondence formula is:
256 d + 32 c + 4 b + a -> 512 a + 64 b + 8 c + d
with 0 <= a < 4 and 0 <= b,c,d < 8
I have attached the bit reversal table and twiddle table to implement the 8192 CFFT. Note that the index type for bit reversal has to be changed from uint16_t to uint32_t to support 8192. So it impacts the API and other bit reversal tables.
A Python script was contributed by a user of the library to generate some of those tables.
This repository is also providing CMSIS-DSP FFTs with bigger sizes: https://github.com/Treeed/Long_FFTs_for_CMSIS_DSP/tree/master
Code has been duplicated because index type must be changed from uint16_t to uint32_t
Note that external projects have not been checked nor validated.