A few weeks ago, I wrote a post comparing the performance of Flash/AS3 vs. Silverlight/C#. I used a very simple test, element-by-element addition of two vectors of numbers. The C# version was about 5 times faster.
I was curious whether I’d get a similar performance disparity with some less trivial computation, so I ported my “even faster” AS3 FFT implementation to C#. The C# FFT uses only managed code (no pointers) so you can use it in Silverlight. I didn’t attempt to make any C#-specific optimizations, just did the minimal changes to get it to build.
As in my earlier C# vs. Flash performance test, I once again found that the C# implementation is much faster, nearly 4x faster than AS3. For 1000 forward-inverse pairs of 1024 pt FFTs, my C# FFT takes ~85 ms. The same test with the AS3 FFT takes over 300 ms.
I’m releasing the C# FFT code under the MIT license, which means you can pretty much do whatever you like with it, as long as you keep the comment block that says I wrote it and that you can do whatever you like with it
. If you find it useful or have suggestions to improve it, feel free to leave a comment.
Update 3 August 2011: Graeme West reported some strange results using my FFT. I dug into it, and discovered a bug: the sign for the imaginary part of the “twiddle factors” was incorrect. I’ve fixed it now – just required changing the if (inverse) condition to if (inverse == false). I never noticed because for real input data (i.e. where the imaginary component is zero), the only impact on the output was that the sign of the imaginary component was flipped, and that had no effect on magnitude spectra (which is what I normally was computing).
using System;
using System.Net;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Documents;
using System.Windows.Ink;
using System.Windows.Input;
using System.Windows.Media;
using System.Windows.Media.Animation;
using System.Windows.Shapes;
namespace SpeedTest
{
/**
* Performs an in-place complex FFT.
*
* Released under the MIT License
*
* Copyright (c) 2010 Gerald T. Beauregard
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to
* deal in the Software without restriction, including without limitation the
* rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
* sell copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
public class FFT2
{
// Element for linked list in which we store the
// input/output data. We use a linked list because
// for sequential access it's faster than array index.
class FFTElement
{
public double re = 0.0; // Real component
public double im = 0.0; // Imaginary component
public FFTElement next; // Next element in linked list
public uint revTgt; // Target position post bit-reversal
}
private uint m_logN = 0; // log2 of FFT size
private uint m_N = 0; // FFT size
private FFTElement[] m_X; // Vector of linked list elements
/**
*
*/
public FFT2()
{
}
/**
* Initialize class to perform FFT of specified size.
*
* @param logN Log2 of FFT length. e.g. for 512 pt FFT, logN = 9.
*/
public void init(
uint logN )
{
m_logN = logN;
m_N = (uint)(1 << (int)m_logN);
// Allocate elements for linked list of complex numbers.
m_X = new FFTElement[m_N];
for (uint k = 0; k < m_N; k++)
m_X[k] = new FFTElement();
// Set up "next" pointers.
for (uint k = 0; k < m_N-1; k++)
m_X[k].next = m_X[k+1];
// Specify target for bit reversal re-ordering.
for (uint k = 0; k < m_N; k++ )
m_X[k].revTgt = BitReverse(k,logN);
}
/**
* Performs in-place complex FFT.
*
* @param xRe Real part of input/output
* @param xIm Imaginary part of input/output
* @param inverse If true, do an inverse FFT
*/
public void run(
double[] xRe,
double[] xIm,
bool inverse = false )
{
uint numFlies = m_N >> 1; // Number of butterflies per sub-FFT
uint span = m_N >> 1; // Width of the butterfly
uint spacing = m_N; // Distance between start of sub-FFTs
uint wIndexStep = 1; // Increment for twiddle table index
// Copy data into linked complex number objects
// If it's an IFFT, we divide by N while we're at it
FFTElement x = m_X[0];
uint k = 0;
double scale = inverse ? 1.0/m_N : 1.0;
while (x != null)
{
x.re = scale*xRe[k];
x.im = scale*xIm[k];
x = x.next;
k++;
}
// For each stage of the FFT
for (uint stage = 0; stage < m_logN; stage++)
{
// Compute a multiplier factor for the "twiddle factors".
// The twiddle factors are complex unit vectors spaced at
// regular angular intervals. The angle by which the twiddle
// factor advances depends on the FFT stage. In many FFT
// implementations the twiddle factors are cached, but because
// array lookup is relatively slow in C#, it's just
// as fast to compute them on the fly.
double wAngleInc = wIndexStep * 2.0*Math.PI/m_N;
if (inverse == false)
wAngleInc *= -1;
double wMulRe = Math.Cos(wAngleInc);
double wMulIm = Math.Sin(wAngleInc);
for (uint start = 0; start < m_N; start += spacing)
{
FFTElement xTop = m_X[start];
FFTElement xBot = m_X[start+span];
double wRe = 1.0;
double wIm = 0.0;
// For each butterfly in this stage
for (uint flyCount = 0; flyCount < numFlies; ++flyCount)
{
// Get the top & bottom values
double xTopRe = xTop.re;
double xTopIm = xTop.im;
double xBotRe = xBot.re;
double xBotIm = xBot.im;
// Top branch of butterfly has addition
xTop.re = xTopRe + xBotRe;
xTop.im = xTopIm + xBotIm;
// Bottom branch of butterly has subtraction,
// followed by multiplication by twiddle factor
xBotRe = xTopRe - xBotRe;
xBotIm = xTopIm - xBotIm;
xBot.re = xBotRe*wRe - xBotIm*wIm;
xBot.im = xBotRe*wIm + xBotIm*wRe;
// Advance butterfly to next top & bottom positions
xTop = xTop.next;
xBot = xBot.next;
// Update the twiddle factor, via complex multiply
// by unit vector with the appropriate angle
// (wRe + j wIm) = (wRe + j wIm) x (wMulRe + j wMulIm)
double tRe = wRe;
wRe = wRe*wMulRe - wIm*wMulIm;
wIm = tRe*wMulIm + wIm*wMulRe;
}
}
numFlies >>= 1; // Divide by 2 by right shift
span >>= 1;
spacing >>= 1;
wIndexStep <<= 1; // Multiply by 2 by left shift
}
// The algorithm leaves the result in a scrambled order.
// Unscramble while copying values from the complex
// linked list elements back to the input/output vectors.
x = m_X[0];
while (x != null)
{
uint target = x.revTgt;
xRe[target] = x.re;
xIm[target] = x.im;
x = x.next;
}
}
/**
* Do bit reversal of specified number of places of an int
* For example, 1101 bit-reversed is 1011
*
* @param x Number to be bit-reverse.
* @param numBits Number of bits in the number.
*/
private uint BitReverse(
uint x,
uint numBits)
{
uint y = 0;
for (uint i = 0; i < numBits; i++)
{
y <<= 1;
y |= x & 0x0001;
x >>= 1;
}
return y;
}
}
}
Gerry-
I have written a small application in C# using NAudio that will record a sound clip from line-in or MIC. My requirement is to search this recorded sound clip for a 2 second clip. (I know this is very similar to Shazam). Another dev suggested I use FFT, but I have no background at all in signal processing.
Basically, in pseudocode, I want to achieve the following:
If RecordedSound.data contains SoundsClip(myclip.wav)
{
Do this
and this
}
If FFT is the way to go, could you point me in the right direction to get started on this?
Thanks-
Jason
Jason, that’s a challenging problem!
Basically what you need to do is to do is analyze the two clips and generate some “feature vectors” (vectors of numbers that represent their acoustic properties in some time slice) at intervals of 10 – 50 ms or so. Then you need to attempt to match the feature vectors of one clip against another clip. That would require some measure of “distance” between the feature vectors, and some fast matching algorithm that’s smart enough to not do an exhaustive computation of the match at every time position in every clip. In short – quite a big problem.
Fortunately people have done a lot of work in this field already, and some of the stuff has made it into audio standards already. In particular the choice of feature vectors has been gravitating towards a small set thanks to MPEG-7 standardization. Do a Google search for “MPEG7 music features”.
Incidentally, the Fourier transform by itself won’t give a good feature vector – it’s too big, and contains loads of information that can change dramatically even for very small changes in perceived sound. I believe people normally use a cepstrum, which can be computed from the Fourier Transform – the cepstrum’s much smaller and more robust. Look for MFCC (Mel-Frequency Cepstral Coefficients) in the literature on the subject].
-Gerry
Hi
Thanks for this very nice implementation.
It seems that there is something non-standard happening here. Compared to other implementations, the entries seem to be in a rather strange order: call the requisite outputs of the FFT y[0], y[1], …. y[N]. (These are what the outputs would be according to other implementations, and indeed according to the definition i.e. just doing the algebra of multiplying the vector by the requisite square matrix.) But actually your implementation is writing the output out as y[0], y[N], y[N-1], …. y[1].
I’m sure I’m missing something. Can you advise?
Hi,
Your algorithm works very well. I really appreciate it. Is there any limitation as to the size of the data array? If I set my data size greater than 262,144 it does not seem to return a result.
Again, thanks for simple well commented code.
The only hard-coded limitation on the size of the data is due to the size of uint/int, but that would limit the FFT size to 2^31 (or maybe 2^32). If it’s failing for a smaller size, it’s probably because of memory. I suggest just checking the result of each “new” call in init() to ensure it succeeded, i.e. that the value is non-null. Perhaps just modify init() so it returns a bool. If any of the new calls gives a null pointer, immediately return from init() with false; and return true at the end of the the init() function. Then in the code that uses the FFT, just check the return value from init(). If it’s true, it’s good to go. If the return value is false, then you know you’ve hit the limit for memory allocation on the heap.
If the issue is that you’re running out of memory, maybe there’s some way to increase the maximum amount of heap memory. Alternatively, you could use a different FFT algorithm. The C# version I posted is a bit odd, in that it copies all the data into linked lists for processing. It’s a direct port from an ActionScript implementation which had to be written in a strange way to get decent performance. In ActionScript, it turns out that accessing elements of vectors is not very fast, and stepping through linked lists much faster. In C#, it’s probably faster (and definitely less memory intensive) to operate on the arrays/vectors of input data directly, as I did in my original more conventional ActionScript FFT implementation here which you could probably port pretty easily to C#.
i was wondering how could i use this code in my project to determine the frequency of the sound recorded from the mic, plz can you help me , thanks
Hi Gerry, I used your AS3 FFT code and it worked great. I’m moving my project over to C# now though. For some reason, I keep getting Infinity when I try to convert the FFT output (which is in magnitude I think) to dB. I’ll link my stackoverflow post.
http://stackoverflow.com/questions/8696926/magnitude-to-decibel-always-returns-infinity-in-c-sharp
I can’t find anything wrong with my conversion, so I think I’m not seeing something. Thanks!!
If the power in a bin happens to be zero (i.e. re and im parts are both zero), then you’ll be taking the log of zero. The log of zero is minus infinity. A straightforward way to prevent this is to add a very small number to the power before taking the log, as I do in my spectrum analyzer code.
Incidentally, since you’re getting the magnitude (not the power) in each bin by taking the square root of re*re + im*im, to get dB levels you want 20 times the log10, not 10. Alternatively, you can skip the square root so you’re dealing with power, skip the square root, and multiply the log10 by 10.
Hi Gerry, please disregard my question below, I found out what was wrong. Everything is working as it should, but my values are very large (up in the positive 600′s) but when I did the spectrum analyzer in AS3 everything was between -150 and 0. Do you know what might be causing this? Thanks again!!
Thanks for the advice, I think I sort of jumble up things a bit when I was trying to initially fix them. I tried what you advised just now and I keep getting NaN for some reason. I’ve gone back to the spectrum analyzer code and taken from there. I still keep getting NaN though, but I’m not sure why. Here is my code:
…
private double SCALE = (double)20/System.Math.Log(10); // used to convert magnitude from FFT to usable dB
private double MIN_VALUE = (double)System.Double.MinValue;
…
// Perform FFT
fft.run(tempRe, tempIm);
fftCount++;
// Convert from Decibel to Magnitude
for (int i = 0; i < N/2; i++) {
double re = tempRe[i]; // get the Real FFT Number at position i
double im = tempIm[i]; // get the Imaginary FFT Number at position i
m_mag[i] = Math.Sqrt(re * re + im * im); // Convert magnitude to decibels
m_mag[i] = SCALE * Math.Log(m_mag[i] + MIN_VALUE);
if (fftCount == 50 && i == 400) print ("dB @ 399: " + m_mag[399]);
}
I'm new to C# so it could just be something simple but I'm not sure. Thanks!!
Gerry,
Thanks for the useful code! I dropped it into a Visual Studio 2010 C# project and it worked perfectly!
Many thanks. This is really clear and effective. I’m just getting into C# and I learnt a lot from this and wrote a little ‘is my violin in tune’ programme for my son. You can download this if you want from the website. This should take you to it: http://www.moonsch.com/?creator&Administrator&20
There were no changes required, except in the ‘run’ definition my (old 2003) compiler didn’t like ‘inverse’ being defaulted to false for some reason. Easily overcome, but next lesson is probably to look into this and find out why! Not tonight though.
Does the code work for all sizes or for 2^n’s?
It’s only for power-of-2 lengths. The FFT size is specified via the logN argument to init(), logN being the base-2 length of the FFT. For example, for a 512 point FFT, logN = 9. For non-power-of-2 lengths, you could try FFTW; I don’t think it’s available in C#, but people have written C# wrappers to call it.