Using Apple’s vDSP/Accelerate FFT


If you want to write code for signal processing on the Mac or iOS, you really should take advantage of Apple’s Accelerate framework. It provides an extensive library of highly optimized mathematical functions suitable for a wide range of signal processing applications.

While many of the functions are fairly straightforward to use, and well-documented in the vDSP Programming Guide, the FFT functions are a bit maddening, especially when doing FFTs of real signals, which is pretty much always the case when dealing with audio.

The following code example shows how to do a real-to-complex FFT of a real vector; convert from complex representation to magnitude and phase; convert it back to rectangular/complex representation; and do complex-to-real FFT, with the correct scaling to get back to the original signal.

//  main.cpp
//  AccelerateFFT
//
//	Demo of how to use Apple's blazingly fast but maddeningly confusing
//	real->complex and complex->real FFT functions.
//
//  Created by Gerry Beauregard (g.beauregard [at] ieee.org) on 2013-01-28.
//
//	Use this code however you like. No credit required, but if this code
//	saves you some hair-pulling, a hat-tip and kind comment is always
// 	appreciated 😉

#include <stdio.h>
#include <Accelerate/Accelerate.h>

const int LOG_N = 4; // Typically this would be at least 10 (i.e. 1024pt FFTs)
const int N = 1 << LOG_N;
const float PI = 4*atan(1);

int main(int argc, const char * argv[])
{
	// Set up a data structure with pre-calculated values for
	// doing a very fast FFT. The structure is opaque, but presumably
	// includes sin/cos twiddle factors, and a lookup table for converting
	// to/from bit-reversed ordering. Normally you'd create this once
	// in your application, then use it for many (hundreds! thousands!) of
	// forward and inverse FFTs.
	FFTSetup fftSetup = vDSP_create_fftsetup(LOG_N, kFFTRadix2);

	// -------------------------------
	// Set up a bunch of buffers

	// Buffers for real (time-domain) input and output signals.
	float *x = new float[N];
	float *y = new float[N];

	// Initialize the input buffer with a sinusoid
	int BIN = 3;
	for (int k = 0; k < N; k++)
		x[k] = cos(2*PI*BIN*k/N);

	// We need complex buffers in two different formats!
	DSPComplex *tempComplex = new DSPComplex[N/2];

	DSPSplitComplex tempSplitComplex;
	tempSplitComplex.realp = new float[N/2];
	tempSplitComplex.imagp = new float[N/2];

	// For polar coordinates
	float *mag = new float[N/2];
	float *phase = new float[N/2];

	// ----------------------------------------------------------------
	// Forward FFT

	// Scramble-pack the real data into complex buffer in just the way that's
	// required by the real-to-complex FFT function that follows.
	vDSP_ctoz((DSPComplex*)x, 2, &tempSplitComplex, 1, N/2);

	// Do real->complex forward FFT
	vDSP_fft_zrip(fftSetup, &tempSplitComplex, 1, LOG_N, kFFTDirection_Forward);

	// Print the complex spectrum. Note that since it's the FFT of a real signal,
	// the spectrum is conjugate symmetric, that is the negative frequency components
	// are complex conjugates of the positive frequencies. The real->complex FFT
	// therefore only gives us the positive half of the spectrum from bin 0 ("DC")
	// to bin N/2 (Nyquist frequency, i.e. half the sample rate). Typically with
	// audio code, you don't need to worry much about the DC and Nyquist values, as
	// they'll be very close to zero if you're doing everything else correctly.
	//
	// Bins 0 and N/2 both necessarily have zero phase, so in the packed format
	// only the real values are output, and these are stuffed into the real/imag components
	// of the first complex value (even though they are both in fact real values). Try
	// replacing BIN above with N/2 to see how sinusoid at Nyquist appears in the spectrum.
	printf("\nSpectrum:\n");
	for (int k = 0; k < N/2; k++)
	{
		printf("%3d\t%6.2f\t%6.2f\n", k, tempSplitComplex.realp[k], tempSplitComplex.imagp[k]);
	}

	// ----------------------------------------------------------------
	// Convert from complex/rectangular (real, imaginary) coordinates
	// to polar (magnitude and phase) coordinates.

	// Compute magnitude and phase. Can also be done using vDSP_polar.
	// Note that when printing out the values below, we ignore bin zero, as the
	// real/complex values for bin zero in tempSplitComplex actually both correspond
	// to real spectrum values for bins 0 (DC) and N/2 (Nyquist) respectively.
	vDSP_zvabs(&tempSplitComplex, 1, mag, 1, N/2);
	vDSP_zvphas(&tempSplitComplex, 1, phase, 1, N/2);

	printf("\nMag / Phase:\n");
	for (int k = 1; k < N/2; k++)
	{
		printf("%3d\t%6.2f\t%6.2f\n", k, mag[k], phase[k]);
	}

	// ----------------------------------------------------------------
	// Convert from polar coordinates back to rectangular coordinates.

	tempSplitComplex.realp = mag;
	tempSplitComplex.imagp = phase;

	vDSP_ztoc(&tempSplitComplex, 1, tempComplex, 2, N/2);
	vDSP_rect((float*)tempComplex, 2, (float*)tempComplex, 2, N/2);
	vDSP_ctoz(tempComplex, 2, &tempSplitComplex, 1, N/2);

	// ----------------------------------------------------------------
	// Do Inverse FFT

	// Do complex->real inverse FFT.
	vDSP_fft_zrip(fftSetup, &tempSplitComplex, 1, LOG_N, kFFTDirection_Inverse);

	// This leaves result in packed format. Here we unpack it into a real vector.
	vDSP_ztoc(&tempSplitComplex, 1, (DSPComplex*)y, 2, N/2);

	// Neither the forward nor inverse FFT does any scaling. Here we compensate for that.
	float scale = 0.5/N;
	vDSP_vsmul(y, 1, &scale, y, 1, N);

	// Assuming it's all correct, the input x and output y vectors will have identical values
	printf("\nInput & output:\n");
	for (int k = 0; k < N; k++)
	{
		printf("%3d\t%6.2f\t%6.2f\n", k, x[k], y[k]);
	}

    return 0;
}

Note that in order to do anything really interesting with audio on the Mac or iOS (for example the audio time-stretching in my AudioStretch app), there’s loads of other stuff you need to learn how to do: setting up a real-time audio input and output; how to grab and window (Hanning, Hamming, etc.) frames of audio data; how to manipulate signals in the frequency domain; synthesis using overlap-add methods, etc.  Time-permitting, I might cover some of that in future posts.

Advertisements

About Gerry Beauregard

I'm a Singapore-based Canadian software engineer, inventor, musician, and occasional triathlete. My current work and projects mainly involve audio technology for the web and iOS. I'm the author of AudioStretch, an audio time-stretching/pitch-shifting app for musicians. Past jobs have included writing speech recognition software for Apple, creating automatic video editing software for muvee, and designing ASICs for Nortel. I hold a Bachelor of Applied Science (Electrical Engineering) from Queen's University and a Master of Arts in Electroacoustic Music from Dartmouth College.
This entry was posted in Audio, FFT, iOS, Programming. Bookmark the permalink.

15 Responses to Using Apple’s vDSP/Accelerate FFT

  1. RM says:

    Really helpful tutorial. Thanks!

  2. LM says:

    Thank you!

  3. Andrei Filip says:

    Hi. Really useful tutorial. Was wondering however if you managed to find the time to go into windowing, overlapping and downsampling. I’m currently working on a project that would need all these and some sample code would be highly appreciated as I’m quite new to this. There really isn’t any good documentation/samples on this anywhere.

  4. Pingback: From frequecy domain to time domain | Stackforum.com

  5. Hi. I’ve learned some very useful things with this article. I want to point out, specially for other readers, that the code here doesn’t consider freeing allocated memory. When reusing this routine to perform FFT and iFFT on frames of a longer signal, the buffers would need to be allocated somewhere else to optimize memory usage.
    Having said this, there’s a bug in the code with these lines:
    tempSplitComplex.realp = mag;
    tempSplitComplex.imagp = phase;

    The original pointer allocated on realp and imagp are lost, and on subsequent calls on this routine the code will fail here:
    vDSP_zvabs(&tempSplitComplex, 1, mag, 1, half);
    vDSP_zvphas(&tempSplitComplex, 1, phase, 1, half);
    because the buffers are essentially the same address location.

  6. Quinn Darwin says:

    I’m using similar code to comput cepstrum frequency voice pitch analysis and wondering if the polar to rectangular coordinates are actually required. I’ve also been throwing out the phase in my analysis and not sure if that would provide value. Any input would be appreciated. Thanks

    • You don’t need the phase to compute the cepstrum. The cepstrum is the squared inverse Fourier transform of the log of the power spectrum. The power spectrum is the squared magnitude of the Fourier transform, and you don’t need phase information to compute that.

      • Quinn Darwin says:

        Awesome – thanks! So, would the best way to go about it then to populate the imaginary part of the input to the inverse transform calculation?

      • Quinn Darwin says:

        meant to say “populate the imaginary part with zeros”

      • Yep, that should work. There are some more efficient ways to do it that take advantage of the fact that the vector is all real numbers and has even symmetry, but getting all the indexing correct can get a little confusing. Not worth the bother unless CPU load is a big issue.

  7. markckim says:

    Hi Gerry,

    Thanks for your help in gaining a better understanding of how the vDSP library works. In case anyone is interested, I wanted to provide my work on the 2D FFT case. I’ve done some testing and this appears to work as expected. If anyone sees any issues, please let me know and I’ll fix.

    -Mark

    int N = 4;
    float log_2_N = log2f((float)N);
    // N*N values
    float x[16] = {0.66, -0.28, -0.47, 0.22, -0.28, 0.03, -0.28, -0.59, -0.09, 0.34, 0.03, -0.16, 0.72, -0.09, -0.28, 0.53};

    FFTSetup fftSetup = vDSP_create_fftsetup(log_2_N, kFFTRadix2);
    COMPLEX_SPLIT tempSplitComplex;
    tempSplitComplex.realp = (float *)malloc(sizeof(float) * N*N/2);
    tempSplitComplex.imagp = (float *)malloc(sizeof(float) * N*N/2);

    // pack
    vDSP_ctoz((COMPLEX *)x, 2, &tempSplitComplex, 1, N*N/2);
    // forward FFT
    vDSP_fft2d_zrip(fftSetup, &tempSplitComplex, 1, 0, log_2_N, log_2_N, kFFTDirection_Forward);

    printf(“\nSpectrum:\n”);
    for (int k = 0; k < N*N/2; k++) {
    printf("%3d\t%6.2f\t%6.2f\n", k, tempSplitComplex.realp[k], tempSplitComplex.imagp[k]);
    }
    // inverse FFT
    vDSP_fft2d_zrip(fftSetup, &tempSplitComplex, 1, 0, log_2_N, log_2_N, kFFTDirection_Inverse);
    // unpack
    vDSP_ztoc(&tempSplitComplex, 1, (COMPLEX *)x, 2, N*N/2);
    // scale
    float scale = 0.5 / (N*N);
    vDSP_vsmul(x, 1, &scale, x, 1, N*N);

    printf("\nInput & output:\n");
    for (int k = 0; k < N*N; k++) {
    printf("%3d\t%6.2f\n", k, x[k]);
    }

  8. Joe Brown says:

    You saved me a lot of hair-pulling, but I still managed to end up with strands in my fists.

    Apparently Accelerate creates some quirks with its implementation. The scramblepack may be useful for data size, but it creates an artifact.

    1: vDSP_fft_zrip(fftSetup, &tempSplitComplex, 1, logNumberOfSamples, kFFTDirection_Forward);
    2: // Fix the scramblepack error;
    3: tempSplitComplex.imagp[0]=0;

    without line 3, the reported imaginary part, and subsequently the magnitude and phase of bin 1 are wrong. The reason scramblepacking works is BECAUSE the imaginary portion of bin 0 in a real to complex transformation is guaranteed to be zero, so Accelerate uses it for storage. Unfortunately that carries over to the rest of the calculations.

    I don’t like calling people out for minutia, but I got really really confused by the fact that the magnitude we get from this example is not equal to the magnitude from Matlab or Mathematica, etc.

    That’s because it’s the absolute value. The magnitude is the square of the absolute value. I know this may be obvious to some people, but for me, when something is off by sqrt(itself) and the value is between 0 and 1, I don’t see it quickly. Well i didn’t before… =p.

    One last note, the phase values reported by Accelerate are inverted by -1. They are from the reflected half of the FFT. If the phase matters to you, use

    float signswitcher=-1;
    vDSP_vsmul(tempSplitComplex.imagp, 1, &signswitcher, tempSplitComplex.imagp, 1, fourierOutputLength);

    Accelerate is fast as hell. But it cuts some corners that might get you if you are a newb (I’m looking at me.)

    Thanks for the help, I would be working on another project if it weren’t for this post.

    • You’re correct that for bin 0 of an FFT of a real sequence, the imaginary part is zero. The same is true for the bin at the Nyquist frequency. In the packed format that vDSP_fft_zrip yields, element zero’s real part holds the the real part of bin 0, while element zero’s imaginary part holds the real part of the bin at the Nyquist frequency. This is all mentioned in the section on “Packing for One-Dimensional Arrays” in Apple’s documentation but it is definitely confusing, and also in my code comments “Bins 0 and N/2 both necessarily have zero phase, so in the packed format only the real values are output, and these are stuffed into the real/imag components of the first complex value (even though they are both in fact real values).”

      With standard textbook FFT definitions, there’s a 1/N scaling factor applied on the forward FFT. In Apple’s documentation, they state clearly that their FFTs don’t do any scaling, which makes sense, since it’s extra computation and it isn’t always necessary. In my code, I compensate for the lack of scaling factor after doing the inverse FFT. (See my code comment “Neither the forward nor inverse FFT does any scaling. Here we compensate for that.”). If you want the mags vector to be scaled differently, you could of course just apply the scaling earlier, e.g. by scaling the real sequence immediately before doing the forward FFT.

  9. Miguel Saldaña says:

    I wish you had more posts like this :O
    thanks

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s