If you want to write code for signal processing on the Mac or iOS, you really should take advantage of Apple’s Accelerate framework. It provides an extensive library of highly optimized mathematical functions suitable for a wide range of signal processing applications.

While many of the functions are fairly straightforward to use, and well-documented in the vDSP Programming Guide, the FFT functions are a bit maddening, especially when doing FFTs of real signals, which is pretty much always the case when dealing with audio.

The following code example shows how to do a real-to-complex FFT of a real vector; convert from complex representation to magnitude and phase; convert it back to rectangular/complex representation; and do complex-to-real FFT, with the correct scaling to get back to the original signal.

// main.cpp // AccelerateFFT // // Demo of how to use Apple's blazingly fast but maddeningly confusing // real->complex and complex->real FFT functions. // // Created by Gerry Beauregard (g.beauregard [at] ieee.org) on 2013-01-28. // // Use this code however you like. No credit required, but if this code // saves you some hair-pulling, a hat-tip and kind comment is always // appreciated 😉 #include <stdio.h> #include <Accelerate/Accelerate.h> const int LOG_N = 4; // Typically this would be at least 10 (i.e. 1024pt FFTs) const int N = 1 << LOG_N; const float PI = 4*atan(1); int main(int argc, const char * argv[]) { // Set up a data structure with pre-calculated values for // doing a very fast FFT. The structure is opaque, but presumably // includes sin/cos twiddle factors, and a lookup table for converting // to/from bit-reversed ordering. Normally you'd create this once // in your application, then use it for many (hundreds! thousands!) of // forward and inverse FFTs. FFTSetup fftSetup = vDSP_create_fftsetup(LOG_N, kFFTRadix2); // ------------------------------- // Set up a bunch of buffers // Buffers for real (time-domain) input and output signals. float *x = new float[N]; float *y = new float[N]; // Initialize the input buffer with a sinusoid int BIN = 3; for (int k = 0; k < N; k++) x[k] = cos(2*PI*BIN*k/N); // We need complex buffers in two different formats! DSPComplex *tempComplex = new DSPComplex[N/2]; DSPSplitComplex tempSplitComplex; tempSplitComplex.realp = new float[N/2]; tempSplitComplex.imagp = new float[N/2]; // For polar coordinates float *mag = new float[N/2]; float *phase = new float[N/2]; // ---------------------------------------------------------------- // Forward FFT // Scramble-pack the real data into complex buffer in just the way that's // required by the real-to-complex FFT function that follows. vDSP_ctoz((DSPComplex*)x, 2, &tempSplitComplex, 1, N/2); // Do real->complex forward FFT vDSP_fft_zrip(fftSetup, &tempSplitComplex, 1, LOG_N, kFFTDirection_Forward); // Print the complex spectrum. Note that since it's the FFT of a real signal, // the spectrum is conjugate symmetric, that is the negative frequency components // are complex conjugates of the positive frequencies. The real->complex FFT // therefore only gives us the positive half of the spectrum from bin 0 ("DC") // to bin N/2 (Nyquist frequency, i.e. half the sample rate). Typically with // audio code, you don't need to worry much about the DC and Nyquist values, as // they'll be very close to zero if you're doing everything else correctly. // // Bins 0 and N/2 both necessarily have zero phase, so in the packed format // only the real values are output, and these are stuffed into the real/imag components // of the first complex value (even though they are both in fact real values). Try // replacing BIN above with N/2 to see how sinusoid at Nyquist appears in the spectrum. printf("\nSpectrum:\n"); for (int k = 0; k < N/2; k++) { printf("%3d\t%6.2f\t%6.2f\n", k, tempSplitComplex.realp[k], tempSplitComplex.imagp[k]); } // ---------------------------------------------------------------- // Convert from complex/rectangular (real, imaginary) coordinates // to polar (magnitude and phase) coordinates. // Compute magnitude and phase. Can also be done using vDSP_polar. // Note that when printing out the values below, we ignore bin zero, as the // real/complex values for bin zero in tempSplitComplex actually both correspond // to real spectrum values for bins 0 (DC) and N/2 (Nyquist) respectively. vDSP_zvabs(&tempSplitComplex, 1, mag, 1, N/2); vDSP_zvphas(&tempSplitComplex, 1, phase, 1, N/2); printf("\nMag / Phase:\n"); for (int k = 1; k < N/2; k++) { printf("%3d\t%6.2f\t%6.2f\n", k, mag[k], phase[k]); } // ---------------------------------------------------------------- // Convert from polar coordinates back to rectangular coordinates. tempSplitComplex.realp = mag; tempSplitComplex.imagp = phase; vDSP_ztoc(&tempSplitComplex, 1, tempComplex, 2, N/2); vDSP_rect((float*)tempComplex, 2, (float*)tempComplex, 2, N/2); vDSP_ctoz(tempComplex, 2, &tempSplitComplex, 1, N/2); // ---------------------------------------------------------------- // Do Inverse FFT // Do complex->real inverse FFT. vDSP_fft_zrip(fftSetup, &tempSplitComplex, 1, LOG_N, kFFTDirection_Inverse); // This leaves result in packed format. Here we unpack it into a real vector. vDSP_ztoc(&tempSplitComplex, 1, (DSPComplex*)y, 2, N/2); // Neither the forward nor inverse FFT does any scaling. Here we compensate for that. float scale = 0.5/N; vDSP_vsmul(y, 1, &scale, y, 1, N); // Assuming it's all correct, the input x and output y vectors will have identical values printf("\nInput & output:\n"); for (int k = 0; k < N; k++) { printf("%3d\t%6.2f\t%6.2f\n", k, x[k], y[k]); } return 0; }

Note that in order to do anything really interesting with audio on the Mac or iOS (for example the audio time-stretching in my AudioStretch app), there’s loads of other stuff you need to learn how to do: setting up a real-time audio input and output; how to grab and window (Hanning, Hamming, etc.) frames of audio data; how to manipulate signals in the frequency domain; synthesis using overlap-add methods, etc. Time-permitting, I might cover some of that in future posts.

Really helpful tutorial. Thanks!

Glad you like it! Thanks for the comment.

Thank you!

Hi. Really useful tutorial. Was wondering however if you managed to find the time to go into windowing, overlapping and downsampling. I’m currently working on a project that would need all these and some sample code would be highly appreciated as I’m quite new to this. There really isn’t any good documentation/samples on this anywhere.

Pingback: From frequecy domain to time domain | Stackforum.com

Hi. I’ve learned some very useful things with this article. I want to point out, specially for other readers, that the code here doesn’t consider freeing allocated memory. When reusing this routine to perform FFT and iFFT on frames of a longer signal, the buffers would need to be allocated somewhere else to optimize memory usage.

Having said this, there’s a bug in the code with these lines:

tempSplitComplex.realp = mag;

tempSplitComplex.imagp = phase;

The original pointer allocated on realp and imagp are lost, and on subsequent calls on this routine the code will fail here:

vDSP_zvabs(&tempSplitComplex, 1, mag, 1, half);

vDSP_zvphas(&tempSplitComplex, 1, phase, 1, half);

because the buffers are essentially the same address location.

I’m using similar code to comput cepstrum frequency voice pitch analysis and wondering if the polar to rectangular coordinates are actually required. I’ve also been throwing out the phase in my analysis and not sure if that would provide value. Any input would be appreciated. Thanks

You don’t need the phase to compute the cepstrum. The cepstrum is the squared inverse Fourier transform of the log of the power spectrum. The power spectrum is the squared magnitude of the Fourier transform, and you don’t need phase information to compute that.

Awesome – thanks! So, would the best way to go about it then to populate the imaginary part of the input to the inverse transform calculation?

meant to say “populate the imaginary part with zeros”

Yep, that should work. There are some more efficient ways to do it that take advantage of the fact that the vector is all real numbers and has even symmetry, but getting all the indexing correct can get a little confusing. Not worth the bother unless CPU load is a big issue.

Hi Gerry,

Thanks for your help in gaining a better understanding of how the vDSP library works. In case anyone is interested, I wanted to provide my work on the 2D FFT case. I’ve done some testing and this appears to work as expected. If anyone sees any issues, please let me know and I’ll fix.

-Mark

int N = 4;

float log_2_N = log2f((float)N);

// N*N values

float x[16] = {0.66, -0.28, -0.47, 0.22, -0.28, 0.03, -0.28, -0.59, -0.09, 0.34, 0.03, -0.16, 0.72, -0.09, -0.28, 0.53};

FFTSetup fftSetup = vDSP_create_fftsetup(log_2_N, kFFTRadix2);

COMPLEX_SPLIT tempSplitComplex;

tempSplitComplex.realp = (float *)malloc(sizeof(float) * N*N/2);

tempSplitComplex.imagp = (float *)malloc(sizeof(float) * N*N/2);

// pack

vDSP_ctoz((COMPLEX *)x, 2, &tempSplitComplex, 1, N*N/2);

// forward FFT

vDSP_fft2d_zrip(fftSetup, &tempSplitComplex, 1, 0, log_2_N, log_2_N, kFFTDirection_Forward);

printf(“\nSpectrum:\n”);

for (int k = 0; k < N*N/2; k++) {

printf("%3d\t%6.2f\t%6.2f\n", k, tempSplitComplex.realp[k], tempSplitComplex.imagp[k]);

}

// inverse FFT

vDSP_fft2d_zrip(fftSetup, &tempSplitComplex, 1, 0, log_2_N, log_2_N, kFFTDirection_Inverse);

// unpack

vDSP_ztoc(&tempSplitComplex, 1, (COMPLEX *)x, 2, N*N/2);

// scale

float scale = 0.5 / (N*N);

vDSP_vsmul(x, 1, &scale, x, 1, N*N);

printf("\nInput & output:\n");

for (int k = 0; k < N*N; k++) {

printf("%3d\t%6.2f\n", k, x[k]);

}

You saved me a lot of hair-pulling, but I still managed to end up with strands in my fists.

Apparently Accelerate creates some quirks with its implementation. The scramblepack may be useful for data size, but it creates an artifact.

1: vDSP_fft_zrip(fftSetup, &tempSplitComplex, 1, logNumberOfSamples, kFFTDirection_Forward);

2: // Fix the scramblepack error;

3: tempSplitComplex.imagp[0]=0;

without line 3, the reported imaginary part, and subsequently the magnitude and phase of bin 1 are wrong. The reason scramblepacking works is BECAUSE the imaginary portion of bin 0 in a real to complex transformation is guaranteed to be zero, so Accelerate uses it for storage. Unfortunately that carries over to the rest of the calculations.

I don’t like calling people out for minutia, but I got really really confused by the fact that the magnitude we get from this example is not equal to the magnitude from Matlab or Mathematica, etc.

That’s because it’s the absolute value. The magnitude is the square of the absolute value. I know this may be obvious to some people, but for me, when something is off by sqrt(itself) and the value is between 0 and 1, I don’t see it quickly. Well i didn’t before… =p.

One last note, the phase values reported by Accelerate are inverted by -1. They are from the reflected half of the FFT. If the phase matters to you, use

float signswitcher=-1;

vDSP_vsmul(tempSplitComplex.imagp, 1, &signswitcher, tempSplitComplex.imagp, 1, fourierOutputLength);

Accelerate is fast as hell. But it cuts some corners that might get you if you are a newb (I’m looking at me.)

Thanks for the help, I would be working on another project if it weren’t for this post.

You’re correct that for bin 0 of an FFT of a real sequence, the imaginary part is zero. The same is true for the bin at the Nyquist frequency. In the packed format that vDSP_fft_zrip yields, element zero’s real part holds the the real part of bin 0, while element zero’s imaginary part holds the real part of the bin at the Nyquist frequency. This is all mentioned in the section on “Packing for One-Dimensional Arrays” in Apple’s documentation but it is definitely confusing, and also in my code comments “Bins 0 and N/2 both necessarily have zero phase, so in the packed format only the real values are output, and these are stuffed into the real/imag components of the first complex value (even though they are both in fact real values).”

With standard textbook FFT definitions, there’s a 1/N scaling factor applied on the forward FFT. In Apple’s documentation, they state clearly that their FFTs don’t do any scaling, which makes sense, since it’s extra computation and it isn’t always necessary. In my code, I compensate for the lack of scaling factor after doing the inverse FFT. (See my code comment “Neither the forward nor inverse FFT does any scaling. Here we compensate for that.”). If you want the mags vector to be scaled differently, you could of course just apply the scaling earlier, e.g. by scaling the real sequence immediately before doing the forward FFT.

I wish you had more posts like this :O

thanks