An FFT in AS3


I write a lot of audio signal processing code, a fair bit of which works in the frequency domain. A key building block for almost any frequency domain signal processing is the FFT, an efficient implementation of the DFT.

A couple of decades ago, I wrote a FFT in C for a class assignment. Last year, when I started getting into audio signal processing in Flash, I ported that FFT to ActionScript 3. I use it in my AudioStretch mp3 time-stretcher/pitch-shifter, among other things.

You can find my AS3 FFT implementation below, at the end of this post. I’m releasing it under the MIT License, which means that you can use it pretty much however you like as long as you include the copyright notice in the code.

My implementation is pretty fast, at least compared to other pure AS3 FFTs I’ve found.  On my old (circa 2006) MacBook Pro I clock ~1250 ms for 1000 forward-inverse FFT pairs, with FFT length of 1024.  That’s 0.625 ms per FFT. As far as I know, it’s the fastest pure AS3 FFT around. (If not, please let me know – I’d really like to have a faster FFT!).

By comparison, the FFT in as3mathlib takes over 19000 ms for the same task, i.e. it’s over 15 times slower.  Two main factors make mine faster:

  • Using Vector instead of Array for the input/output
  • Caching the indices for the bit reversal step

Unfortunately, even with loads of tweaking, my AS3 FFT is ~7x slower than my C++ implementation, and over 20x slower than the hand-optimized native one in IPP. And that’s for double-precision (64-bit floating point) FFTs. Lower precision FFTs (e.g. 32-bit floating point, or even 16 or 32-bit fixed point) can be even faster. In AS3, however, the only supported floating point type is the 64-bit Number.

Bottom line: if you’re looking for awesome signal processing performance, AS3 isn’t the greatest language to use. Use AS3 for all the other benefits it brings, including super easy deployment via the web, and excellent cross-platform support.

I know of at least two FFT implementations out there that use Adobe’s Alchemy technology, which allows you to compile C++ code to run on the ActionScript Virtual Machine (AVM2).  These Alchemy FFT implementations include one by Eugene Zatepyakin, and ALF from Drexel University. These are almost certainly faster than pure AS3 FFTs, but since Alchemy is still in pre-release, Adobe advises “that Alchemy not be used to generate code for use in production”.

Passing Vector data to and from Alchemy is also a major pain, and really slow, so the two Alchemy FFT implementations mentioned above actually keep the audio data on the Alchemy side of the AS3-Alchemy interface. You therefore can’t call their FFT functions with arbitrary data; instead, you can only perform an FFT on a previously specified buffer of audio data. In any case, I don’t really relish the prospect of creating SWFs using two different programming languages…

No doubt someone will come up with a parallel FFT implementation using Pixel Bender.

Personally I hope that Adobe will implement some native signal processing functions right into the AS3 language… but I’m not holding my breath!

To see the code, click “show source” below:

package
{
	import __AS3__.vec.Vector;

	/**
	 * Performs an in-place complex FFT.
	 *
	 * Released under the MIT License
	 *
	 * Copyright (c) 2010 Gerald T. Beauregard
	 *
	 * Permission is hereby granted, free of charge, to any person obtaining a copy
	 * of this software and associated documentation files (the "Software"), to
	 * deal in the Software without restriction, including without limitation the
	 * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
	 * sell copies of the Software, and to permit persons to whom the Software is
	 * furnished to do so, subject to the following conditions:
	 *
	 * The above copyright notice and this permission notice shall be included in
	 * all copies or substantial portions of the Software.
	 *
	 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
	 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
	 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
	 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
	 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
	 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
	 * IN THE SOFTWARE.
	 */
	public class FFT
	{
		public static const FORWARD:Boolean = false;
		public static const INVERSE:Boolean = true;

		private var m_logN:uint = 0;			// log2 of FFT size
		private var m_N:uint = 0;				// FFT size
		private var m_invN:Number;				// Inverse of FFT length

		// Bit-reverse lookup table
		private var m_bitRevLen:uint;			// Length of the bit reverse table
		private var m_bitRevFrom:Vector.<uint>;	// "From" indices for swap
		private var m_bitRevTo:Vector.<uint>;	// "To" indices for swap

		/**
		 *
		 */
		public function FFT()
		{
		}

		/**
		 * Initialize class to perform FFT of specified size.
		 *
		 * @param	logN	Log2 of FFT length. e.g. for 512 pt FFT, logN = 9.
		 */
		public function init(
			logN:uint ):void
		{
			m_logN = logN
			m_N = 1 << m_logN;
			m_invN = 1.0/m_N;

			//	Create the bit-reverse table
			m_bitRevFrom = new Vector.<uint>;
			m_bitRevTo   = new Vector.<uint>;
			for ( var k:uint = 0; k < m_N; k++ )
			{
				var kRev:uint = BitReverse(k,logN);
				if (kRev > k)
				{
					m_bitRevFrom.push(k);
					m_bitRevTo.push(kRev);
				}
			}
			m_bitRevLen = m_bitRevFrom.length;
		}

		/**
		 * Performs in-place complex FFT.
		 *
		 * @param	xRe		Real part of input/output
		 * @param	xIm		Imaginary part of input/output
		 * @param	inverse	If true (INVERSE), do an inverse FFT
		 */
		public function run(
			xRe:Vector.<Number>,
			xIm:Vector.<Number>,
			inverse:Boolean = false ):void
		{
			var numFlies:uint = m_N >> 1;	// Number of butterflies per sub-FFT
			var span:uint = m_N >> 1;		// Width of the butterfly
			var spacing:uint = m_N;			// Distance between start of sub-FFTs
			var wIndexStep:uint = 1; 		// Increment for twiddle table index

			// For each stage of the FFT
			for ( var stage:uint = 0; stage < m_logN; ++stage )
			{
				// Compute a multiplier factor for the "twiddle factors".
				// The twiddle factors are complex unit vectors spaced at
				// regular angular intervals. The angle by which the twiddle
				// factor advances depends on the FFT stage. In many FFT
				// implementations the twiddle factors are cached, but because
				// vector lookup is relatively slow in ActionScript, it's just
				// as fast to compute them on the fly.
				var wAngleInc:Number = wIndexStep * 2.0*Math.PI/m_N;
				if ( inverse == false ) // Corrected 3 Aug 2011. Previously this condition was backwards!
					wAngleInc *= -1;
				var wMulRe:Number = Math.cos(wAngleInc);
				var wMulIm:Number = Math.sin(wAngleInc);

				for ( var start:uint = 0; start < m_N; start += spacing )
				{
					var top:uint    = start;
					var bottom:uint = top + span;

					var wRe:Number = 1.0;
					var wIm:Number = 0.0;

					// For each butterfly in this stage
					for ( var flyCount:uint = 0; flyCount < numFlies; ++flyCount )
					{
						// Get the top & bottom values
						var xTopRe:Number = xRe[top];
						var xTopIm:Number = xIm[top];
						var xBotRe:Number = xRe[bottom];
						var xBotIm:Number = xIm[bottom];

						// Top branch of butterfly has addition
						xRe[top] = xTopRe + xBotRe;
						xIm[top] = xTopIm + xBotIm;

						// Bottom branch of butterly has subtraction,
						// followed by multiplication by twiddle factor
						xBotRe = xTopRe - xBotRe;
						xBotIm = xTopIm - xBotIm;
						xRe[bottom] = xBotRe*wRe - xBotIm*wIm;
						xIm[bottom] = xBotRe*wIm + xBotIm*wRe;

						// Update indices to the top & bottom of the butterfly
						++top;
						++bottom;

						// Update the twiddle factor, via complex multiply
						// by unit vector with the appropriate angle
						// (wRe + j wIm) = (wRe + j wIm) x (wMulRe + j wMulIm)
						var tRe:Number = wRe;
						wRe = wRe*wMulRe - wIm*wMulIm;
						wIm = tRe*wMulIm + wIm*wMulRe
					}
				}

				numFlies >>= 1; 	// Divide by 2 by right shift
				span >>= 1;
				spacing >>= 1;
				wIndexStep <<= 1;  	// Multiply by 2 by left shift
			}

			// The algorithm leaves the result in a scrambled order.
			// Do bit-reversal to unscramble the result.
			for ( var k:uint = 0; k < m_bitRevLen; k++ )
			{
				var brFr:uint = m_bitRevFrom[k];
				var brTo:uint = m_bitRevTo[k];
				var tempRe:Number = xRe[brTo];
				var tempIm:Number = xIm[brTo];
				xRe[brTo] = xRe[brFr];
				xIm[brTo] = xIm[brFr];
				xRe[brFr] = tempRe;
				xIm[brFr] = tempIm;
			}

			//	Divide everything by n for inverse
			if ( inverse )
			{
				for ( k = 0; k < m_N; k++ )
				{
					xRe[k] *= m_invN;
					xIm[k] *= m_invN;
				}
			}
		}

		/**
		 * Do bit reversal of specified number of places of an int
		 * For example, 1101 bit-reversed is 1011
		 *
		 * @param	x		Number to be bit-reverse.
		 * @param	numBits	Number of bits in the number.
		 */
		private function BitReverse(
			x:uint,
			numBits:uint):uint
		{
			var y:uint = 0;
			for ( var i:uint = 0; i < numBits; i++)
			{
				y <<= 1;
				y |= x & 0x0001;
				x >>= 1;
			}
			return y;
		}
	}
}

[Update: If you found this interesting, do check out the even faster FFT I wrote.]

Advertisements

About Gerry Beauregard

I'm a Singapore-based Canadian software engineer, inventor, musician, and occasional triathlete. My current work and projects mainly involve audio technology for the web and iOS. I'm the author of AudioStretch, an audio time-stretching/pitch-shifting app for musicians. Past jobs have included writing speech recognition software for Apple, creating automatic video editing software for muvee, and designing ASICs for Nortel. I hold a Bachelor of Applied Science (Electrical Engineering) from Queen's University and a Master of Arts in Electroacoustic Music from Dartmouth College.
This entry was posted in Audio, Flash, Programming and tagged , . Bookmark the permalink.

18 Responses to An FFT in AS3

  1. Pingback: Tweets that mention An FFT in AS3 « Gerry Beauregard -- Topsy.com

  2. Andre Michelle suggested to me (via email) that the speed could perhaps be improved by using linked lists of Numbers rather than Vectors. Based on the results of some very preliminary experiments, I think I can probably knock about 20% off the running time by following his suggestion. I’ll elaborate more in a future post.

  3. Rackdoll says:

    Very…VERY nice implementation there!
    Goodjob!

    Do you have implementation example up ?

  4. tolga says:

    Have you considered using azoth? http://www.buraks.com/azoth

    • Looks interesting… quoting from your site: “Azoth is a Windows console (Win32 command-line) application, for AS3 Flash programming, that lets you replace method calls to a certain AS3 class (included with Azoth) with equivalent Alchemy opcodes in a SWF file. This provides superior performance for accessing a ByteArray as memory, than you can achieve with AS3 alone.” I develop on a Mac, so a Win32 command-line app won’t be any help to me, but perhaps it’ll be of interest to other readers of this blog.

  5. One thing that may help the speed: As I understand it, when a variable is declared inside a loop memory is allocated and released every iteration. As opposed to when a variable is declared outside the loop; memory is allocated once then that address is just overwritten every iteration.

    eg:

    var x:Number;
    var i:int;
    for (i = 0; i < 12; i++){

    x = getSomeValue();

    thenDoSomethingWith(x);

    }

    p.s. I'm finding your FFT very useful. Thanks.

    • Hi Utilitariat, glad to hear you’re finding the FFT useful. I’ve found that in ActionScript 3, the speed remains the same regardless of whether variables are declared inside or outside the loops.

      In fact, ActionScript 3 doesn’t seem to care whether variables are declared before they’re used, as long as they’re declared somewhere. For example, this code builds and runs fine:

      for ( x = 1; x <= 10; x++ )
      sum += x;
      trace( "sum:", sum );
      var sum:uint = 0;
      var x:uint;

      I don't recommend it though. To keep my sanity (and that of anyone else reading my code), I like to declare variables before they're used.

      If you'd like better performance for the the FFT, do check out my post "An even faster AS3 FFT".

  6. Alama says:

    Thanks for your work ! i’m verry interesting 😉

  7. v0j says:

    Hi Gerry,

    I was trying to create a pitch detection system in flash that detects pitch from the mic… I have searched an FFT algorithm and my search lead to your post. I have read you articles and I was wondering… how to use your codes? (i’m sorry because i’m just a begginer in flash) what program do I need to use? Please can anyone answer my question? 🙂

    You can use any development tool that allows you to edit and compile ActionScript 3 code. I use Adobe Flex Builder 3, the current version of which is Adobe Flash Builder 4. Adobe Flash CS5 also allows you to use AS3 code. Some people prefer FlashDevelop or FDT. FDT is free. All the others have free trial versions (typically 30 days). -Gerry

  8. v0j says:

    Hi again…

    I was reading and analyzing your codes and I wonder how to make an input in the
    fft.run(xRE,xIm,inverse:Boolean = false );

    what’s the difference between the Real input/output and an Imaginary input/output?
    the xRE was supposed to be the input and it was supposed to be an array right?
    what if my input was a mic? how can I use the fft.run();
    can you please give me an example?
    for(…){
    xRe[i]= micSomething…..; //(is this somehow correct?)
    xIm[i]= micSomething…..;
    }
    My mind just exploded when I see the:
    xRe[i] = Math.cos(…!@#$%); //maybe this syntax is for testing purpose only

    I’m sorry that I cannot fully finish the program without your help… I just can comprehend it anymore and it leads me to a dead end.

    • I suggest you take a look at the Real-Time Spectrum Analysis code here:
      https://gerrybeauregard.wordpress.com/2010/08/06/real-time-spectrum-analysis/

      The FFT decomposes a signal into a sum of sinusoids. In audio applications, normally you deal with real signals, and the imaginary component will be zero. For example, if you’re using the microphone input, you’d take a block of audio data from the mic, multiply it by an analysis window (e.g. Hamming or Hanning window), and assign that to the xRe elements of the input, and set the xIm elements all to zero. After you call fft.run(xRe, xIm), xRe/xIm will contain the spectrum of the signal. Even though the signal is real-only, the spectrum will be complex, i.e. it will have non-zero elements in both xRe and xIm. Together, the real and imaginary components represent the magnitude and phase of the sinusoid at each frequency. Assuming you want to get a magnitude spectrum, you need to combine the real and imaginary components to get the magnitude, mag[k] = Math.sqrt(xRe[k]*xRe[k] + xIm[k]*xIm[k]).

  9. Pingback: AudioStretch – real-time audio time-stretching and pitch-shifting in ActionScript | Gerry Beauregard

  10. Alf says:

    Hi Gerry,
    I am trying to find out how multiplying two signals together using FFT (the original sound file and the IR) in AS3 in order to get a binaural output. IS anything you could help? I am a bit lost.
    Regards

    • Hi Alf, what you’re trying to do is known as convolution. Having an FFT certainly helps, but it’s only part of the solution. There are some subtleties about the lengths of the FFT, zero-padding of the IR and of blocks of the source signal, and overlap-add. If you search for “Convolution using FFT” you’ll find a bunch of stuff on the subject.

      As it happens, along with some classmates, I wrote an FFT-based convolver program a couple of decades ago (!), and in a report about we described the process of convolution using the FFT. Here’s an excerpt:

      Once the binaural HRTF (the impulse response for each ear for a given sound source location) is obtained for a specific sound source location and room, it can be convolved with an acoustically dry sound to give a result that, when listened to over headphones, sounds like the sound emanating from that specific location in he room. Using direct convolution to produce this result is a highly computational process. However, if the HRTF and the dry sound are transformed into the frequency domain using the FFT, they can simply be multiplied together and the result inversely transformed to obtain the result. Even with the computations involved in performing FFTs and inverse FFTs on both the HRTF and the dry sound, the number of computations for the latter method is only in the order of Nlog2N compared to N^2 with direct convolution (where N is the number of points in both signals). Due to some properties of the FFT, however, there are some slight complications with using the FFT for convolution of two signals of varying lengths. We will now talk about these complications and how they are overcome.

      Unlike the continuous time Fourier transformation method which can deal with periodic or non-periodic signals, the FFT method assumes that the signals are all periodic with the period being the length of the FFT. Thus multiplication and division operations on the FFTs of signals are analogous to circular convolution and deconvolution operations on their time domain counterparts respectively (Oppenheim 1975). (Although we speak of deconvolution theoretically, there is not a practical method of this operation in the time domain). In order to perform linear convolution using the FFT, the time domain signals must be padded with zeros until they are twice their original length before the FFT is taken. This eliminates the ‘wrap-around’ effects of circular convolution.

      Another complication with using the FFT for the direct convolution of an HRTF with a dry sound, is that typically, the dry sound is much longer than the HRTF and an FFT of the length of the dry sound would be computationally cumbersome (if the length of the dry sound is known beforehand at all). A method called the overlap-and-save method can be employed to convolve two signals of unequal length (Oppenheim 1975). If two signals, h[n] of length N and x[m] of length M, where N is much shorter than M, are to be convolved, the convolution process can be done in sections of N points using FFTs of length 2N. Successive sections of N points are taken from x[m] and convolved with h[n] (by taking their FFTs, multiplying them together and taking the inverse FFT), each of which yields 2N points. The first N points of a section are added to the last N points of the previous section and so on until the end of x[m] is reached. Figure 5 shows this process. The top of the figure (a) shows h[n]. Figure (b) is x[m] divided into sections of N points. Figure (c) shows the placement (in time) of the sections (after the FFTs have been taken of both signals, the FFTs have been multiplied, and the inverse FFT has been taken) before they are added together. Figure (d) shows the final signal, the linear convolution of h[n] with x[m].

      The “Oppenheim 1975” reference is the classic textbook “Discrete-Time Signal Processing” by Oppenheim and Schafer.

  11. Martin says:

    Hi Gerry. I recently used your FFT as well as your real-time spectrum analyzer in an application I am building for University. I found both packages to be very comprehensive and they have really helped me. Thank you very much for sharing it wish us. I was just wondering if there may be an API reference with either one, or perhaps some rough instructions which I could use to set me on the right path to implementing the packages in my app. Any advice would be greatly appreciated. Martin.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s