An even faster AS3 FFT


Earlier today I posted code for a fairly fast pure-AS3 FFT implementation. Andre Michelle suggested that I could perhaps get better performance by using linked lists instead of Vectors.  He was right!  I have a new FFT implementation that runs at double the speed of the version I posted yesterday.

Interestingly the new code is now in many ways closer to my C++ FFT. In C++, one common technique to boost speed is to use pointer arithmetic. For example, if p points to the first element in a (C++) array or vector, you can advance to the next element by simply incrementing p, i.e. p++. To get the value that’s pointed to, you simply dereference: *p.

AS3 doesn’t have pointer arithmetic, but it does have pointers (sort of).  In effect, every time you create a new object, the variable you assign it to is a pointer.  You can’t increment these pointers, but you can use them to make linked lists… and it turns out that traversing a sequence of value-containing objects organized as a linked list is substantially faster than stepping through a Vector of those values.

In the FFT algorithm, most of the time we need to step through elements in sequence. For that, using a linked list works great. At a few spots, however, we need to jump to specific elements. To get the best of both, we use a Vector of elements which are also organized as a linked list.

So… my new FFT (code below) takes the vectors of real and imaginary input values, and stuffs them into FFTElements. Each FFTElement has a real and imaginary value, a link to the next element, and an index specifying the target position in the output vectors at the bit reversal stage. The FFTElements are also kept in a Vector so that we can jump to specific ones (which is required when initializing the butterfly top & bottom “pointers” at the start of each stage).

This new implementation runs 29x as fast as the FFT in as3mathlib.  That’s virtually the same speedup relative to as3mathlib that the authors of the Audio Processing Library for Flash (ALF) reported that their Alchemy-based FFT gave (see Table 1 here).

Here are the performance numbers from a test on my late 2006 MB Pro (2GHz Intel Core Duo).  Times are for 1000 iterations for forward-inverse 1024 point FFTs (i.e. 2000 FFTs in total).

Implementation Test time (ms) Per FFT (ms)
as3mathlib 19200 9.60
FFT (previous post) 1250 0.625
FFT2 (this post) 660 0.330

So it appears that a pure AS3 FFT can deliver essentially the same performance as a C++ FFT compiled with Alchemy, provided it avoids doing stuff that’s relatively slow in AS3 (accessing Vector elements), and takes advantage of stuff that’s fast (linked lists).  It all kinda makes sense. After all, C++ code compiled with Alchemy runs on the same ActionScript Virtual Machine as code written in AS3. It’d kinda suck if compiling C++ consistently yielded more efficient AVM byte code than compiling AS3!

It’s still 3.5x slower than my C++ FFT (release config, compiled with XCode 4), but I reckon that’s the price one pays for running code in a virtual machine, rather than running native.

Big thanks again to Andre Michelle for that excellent suggestion to use linked lists.

Source code for the new FFT below (click the tiny “show source” links). I’ll upload a test app tomorrow…

package
{
	import __AS3__.vec.Vector;

	/**
	 * Performs an in-place complex FFT.
	 *
	 * Released under the MIT License
	 *
	 * Copyright (c) 2010 Gerald T. Beauregard
	 *
	 * Permission is hereby granted, free of charge, to any person obtaining a copy
	 * of this software and associated documentation files (the "Software"), to
	 * deal in the Software without restriction, including without limitation the
	 * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
	 * sell copies of the Software, and to permit persons to whom the Software is
	 * furnished to do so, subject to the following conditions:
	 *
	 * The above copyright notice and this permission notice shall be included in
	 * all copies or substantial portions of the Software.
	 *
	 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
	 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
	 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
	 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
	 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
	 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
	 * IN THE SOFTWARE.
	 */
	public class FFT2
	{
		public static const FORWARD:Boolean = false;
		public static const INVERSE:Boolean = true;

		private var m_logN:uint = 0;			// log2 of FFT size
		private var m_N:uint = 0;				// FFT size
		private var m_invN:Number;				// Inverse of FFT length

		private var m_X:Vector.<FFTElement>;	// Vector of linked list elements

		/**
		 *
		 */
		public function FFT2()
		{
		}

		/**
		 * Initialize class to perform FFT of specified size.
		 *
		 * @param	logN	Log2 of FFT length. e.g. for 512 pt FFT, logN = 9.
		 */
		public function init(
			logN:uint ):void
		{
			m_logN = logN
			m_N = 1 << m_logN;
			m_invN = 1.0/m_N;

			// Allocate elements for linked list of complex numbers.
			m_X = new Vector.<FFTElement>(m_N);
			for ( var k:uint = 0; k < m_N; k++ )
				m_X[k] = new FFTElement;

			// Set up "next" pointers.
			for ( k = 0; k < m_N-1; k++ )
				m_X[k].next = m_X[k+1];

			// Specify target for bit reversal re-ordering.
			for ( k = 0; k < m_N; k++ )
				m_X[k].revTgt = BitReverse(k,logN);
		}

		/**
		 * Performs in-place complex FFT.
		 *
		 * @param	xRe		Real part of input/output
		 * @param	xIm		Imaginary part of input/output
		 * @param	inverse	If true (INVERSE), do an inverse FFT
		 */
		public function run(
			xRe:Vector.<Number>,
			xIm:Vector.<Number>,
			inverse:Boolean = false ):void
		{
			var numFlies:uint = m_N >> 1;	// Number of butterflies per sub-FFT
			var span:uint = m_N >> 1;		// Width of the butterfly
			var spacing:uint = m_N;			// Distance between start of sub-FFTs
			var wIndexStep:uint = 1; 		// Increment for twiddle table index

			// Copy data into linked complex number objects
			// If it's an IFFT, we divide by N while we're at it
			var x:FFTElement = m_X[0];
			var k:uint = 0;
			var scale:Number = inverse ? m_invN : 1.0;
			while (x)
			{
				x.re = scale*xRe[k];
				x.im = scale*xIm[k];
				x = x.next;
				k++;
			}

			// For each stage of the FFT
			for ( var stage:uint = 0; stage < m_logN; ++stage )
			{
				// Compute a multiplier factor for the "twiddle factors".
				// The twiddle factors are complex unit vectors spaced at
				// regular angular intervals. The angle by which the twiddle
				// factor advances depends on the FFT stage. In many FFT
				// implementations the twiddle factors are cached, but because
				// vector lookup is relatively slow in ActionScript, it's just
				// as fast to compute them on the fly.
				var wAngleInc:Number = wIndexStep * 2.0*Math.PI/m_N;
				if ( inverse == false ) // Corrected 3 Aug 2011. Had this condition backwards before, so FFT was IFFT, and vice-versa!
					wAngleInc *= -1;
				var wMulRe:Number = Math.cos(wAngleInc);
				var wMulIm:Number = Math.sin(wAngleInc);

				for ( var start:uint = 0; start < m_N; start += spacing )
				{
					var xTop:FFTElement = m_X[start];
					var xBot:FFTElement = m_X[start+span];

					var wRe:Number = 1.0;
					var wIm:Number = 0.0;

					// For each butterfly in this stage
					for ( var flyCount:uint = 0; flyCount < numFlies; ++flyCount )
					{
						// Get the top & bottom values
						var xTopRe:Number = xTop.re;
						var xTopIm:Number = xTop.im;
						var xBotRe:Number = xBot.re;
						var xBotIm:Number = xBot.im;

						// Top branch of butterfly has addition
						xTop.re = xTopRe + xBotRe;
						xTop.im = xTopIm + xBotIm;

						// Bottom branch of butterly has subtraction,
						// followed by multiplication by twiddle factor
						xBotRe = xTopRe - xBotRe;
						xBotIm = xTopIm - xBotIm;
						xBot.re = xBotRe*wRe - xBotIm*wIm;
						xBot.im = xBotRe*wIm + xBotIm*wRe;

						// Advance butterfly to next top & bottom positions
						xTop = xTop.next;
						xBot = xBot.next;

						// Update the twiddle factor, via complex multiply
						// by unit vector with the appropriate angle
						// (wRe + j wIm) = (wRe + j wIm) x (wMulRe + j wMulIm)
						var tRe:Number = wRe;
						wRe = wRe*wMulRe - wIm*wMulIm;
						wIm = tRe*wMulIm + wIm*wMulRe;
					}
				}

				numFlies >>= 1; 	// Divide by 2 by right shift
				span >>= 1;
				spacing >>= 1;
				wIndexStep <<= 1;  	// Multiply by 2 by left shift
			}

			// The algorithm leaves the result in a scrambled order.
			// Unscramble while copying values from the complex
			// linked list elements back to the input/output vectors.
			x = m_X[0];
			while (x)
			{
				var target:uint = x.revTgt;
				xRe[target] = x.re;
				xIm[target] = x.im;
				x = x.next;
			}
		}

		/**
		 * Do bit reversal of specified number of places of an int
		 * For example, 1101 bit-reversed is 1011
		 *
		 * @param	x		Number to be bit-reverse.
		 * @param	numBits	Number of bits in the number.
		 */
		private function BitReverse(
			x:uint,
			numBits:uint):uint
		{
			var y:uint = 0;
			for ( var i:uint = 0; i < numBits; i++)
			{
				y <<= 1;
				y |= x & 0x0001;
				x >>= 1;
			}
			return y;
		}
	}
}
package
{
	public class FFTElement
	{
		public var re:Number = 0.0;			// Real component
		public var im:Number = 0.0;			// Imaginary component
		public var next:FFTElement = null;	// Next element in linked list
		public var revTgt:uint;				// Target position post bit-reversal
	}
}
Advertisements

About Gerry Beauregard

I'm a Singapore-based Canadian software engineer, inventor, musician, and occasional triathlete. My current work and projects mainly involve audio technology for the web and iOS. I'm the author of AudioStretch, an audio time-stretching/pitch-shifting app for musicians. Past jobs have included writing speech recognition software for Apple, creating automatic video editing software for muvee, and designing ASICs for Nortel. I hold a Bachelor of Applied Science (Electrical Engineering) from Queen's University and a Master of Arts in Electroacoustic Music from Dartmouth College.
This entry was posted in Programming and tagged , . Bookmark the permalink.

65 Responses to An even faster AS3 FFT

  1. Jeremy says:

    Wow, nice work! Looking forward to seeing the test app. Signal processing is relatively new territory for me, do you have any recommendations for good books or online resources?

    Thanks,
    ~Jeremy

    • The textbook “Discrete-Time Signal Processing” by Oppenheim & Schafer is excellent. More academic than fun though, as textbooks tend to be.

      Stanford prof Julius Smith has excellent online resources on audio DSP topics.

      Processing audio signals is much more fun if you play them back in real time. To do that in Flash, you need to learn about the SampleDataEvent introduced in Flash Player 10. Adobe engineer Tinic Uro has some posts on that.

      For fun and inspiring Flash audio projects, Andre Michelle’s blog and labs pages are great. His post on gapless mp3 looping is easy to understand and fun.

    • azeem says:

      Hello!!
      I was trying to run this code but there appears two errors can any one please solve these errors..
      I run the first link Show source

      [What are the two errors you’re getting? -Gerry]

  2. Taka Kojima says:

    Hey Gerry,

    Maybe try to cut down on the uint type values as well, int type objects perform better in almost all scenarios, even Numbers perform better than uints most of the time.

    As a general rule, any var you have to assign a fractional value to, divide or multiply use Number. Bitwise operations, addition and subtraction use int. Only use uints where you absolutely need them.

    Swapping out things like your iterators with ints might have a small impact, but I guess it’s worth a shot.

    (This is old, but the last time I tested, I got similar results)

    http://www.gskinner.com/blog/archives/2006/06/types_in_as3_in.html

    Taka

    • Hi Taka, thanks for the suggestion. I just tried replacing all the uints with ints, but got no measurable speed improvement. (In fact, on my MB Pro, the all ints version runs very slightly slower – by about 1% – than the version with uints). Perhaps Adobe’s improved their handling of uints over the years…

      • Taka Kojima says:

        Interesting… good to know, one thing I noticed is in the top of your init method you have 3 loops that pretty much do the same loop, you could rewrite it to something like:

        var prev_fftElement:FFTElement;
        var fftElement:FFTElement;
        			
        for(var k:uint = 0; k &lt; m_N; k++){
        	fftElement = new FFTElement();
        	fftElement.revTgt = BitReverse(k, logN);
        	m_X[k] = fftElement;
        				
        	if(prev_fftElement)
        		prev_fftElement.next = fftElement;
        		
        	prev_fftElement = fftElement;
        }
        

        You’ll probably get a minor performance increase from that.

      • Taka Kojima says:

        Just tested that, using your test class you provided… didn’t do much haha… I’ll keep my comments to myself 🙂

      • No surprise really… in the FFTTimingTest program, the call to init() comes *before* the first call to getTimer() 😉

        The time taken for one-time initialization generally isn’t of great concern for FFT functions. What really matters is the time for the FFT operation itself (which I do 1000 times in a loop). That’s why init() is outside the block of code that’s timed.

  3. Nice catch 😉

    If you send me your test files, I would love to check if our FFT implementation is faster.

  4. Pingback: Real-Time Spectrum Analysis | Gerry Beauregard

  5. Pingback: Vectors vs. Linked Lists in AS3 | Gerry Beauregard

  6. Alama says:

    Hello Gerry,

    I began conducting a deck in AS3, after 3 months off, I intend to continue this work. This deck has a Master Tempo feature impossible to achieve without the FFT. thanks to your work, I’m having a lot easier to implement.

    The FFT I will also be useful to make a BPM detector, whatever I have another idea easier, but I’m experimenting.

    Your sample of audioStrtch am interested, just to see how we use the FFT class 🙂
    Only the “speed” interests me, is the master tempo, the other could be called a Master Tone. Would you have an example of minimum code to use the FFT Speed? It would be really super nice.

    Here’s my deck, far from being finished, only function: Power On, Eject (mp3 load), play / pause, pitch and reverse. (I’m busy working on the LCD display ..) but it already gives an overview of the thing and the amount of work involved.

    http://www.covergraph.com/alama/pioneer_projet/flash/cdj1000mk3_vx/index.html

    Désolé pour mon mauvais Anglais :-)))

    Bonne continuation et encore merci pour la partage de ce travail 😉

    Alain Mazy.

  7. daniel says:

    nicee, I’ll be working on a guitar tuner with this, after using your spectrum analyzer I think its possible to make a tuner. sorry about my english, spanish speaker at this side 😀
    Definitely possible to make a guitar tuner, but note that the FFT on its own is unlikely to give you good enough frequency resolution. You may need to use some standard tricks for improving the resolution, for example using 3 or more points around a peak magnitude to estimate the true peak position.

    • daniel says:

      Are you talking about a peak detection in the audio data? and mesure the time difference between peaks, so we can detect the root frecuency?
      I thought something like that but didn’t do anything usefull, I’ll be working on this the next days I think.

      VERY interesting your blog.

      cheers!

  8. Vladimir says:

    Hello Gerry!
    I’ve noticed thant you are using decimation in time FFT algorithm. And according to it you make bit reversal in the end of the program(). I don’t know wether it would be better to use decimation in frequency FFT algorithm, but according to it bit reversal could be managed during class initialization, so “while” cycle in the end of the program could be removed. I’ll try to use that algorithm and test it.

    Vladimir.

    Hey, go for it! I’d love to have a still faster FFT implementation. BTW, I’m pretty sure I’ve implemented decimation-in-frequency, at least going by the descriptions of various FFT implementations in my old Oppenheim & Schafer “Discrete-Time Signal Processing” textbook. With decimation-in-frequency, the input’s in natural order, and the output’s in bit-reversed order, and needs to be unscrambled before it’s copied to the output. With decimation-in-time, the input first needs to be put into bit-reversed order, but the output is in natural order. Probably equal performance either way, but it might be worth giving it a try anyway. What would almost certainly speed things up a lot is to pre-compute the sin/cos twiddle factors for each FFT stage, but store them in linked lists (since accessing Vector elements, even sequentially, is relatively slow in AS3). -Gerry

  9. Matt says:

    Thank you for the FFT, you are a god. I’m trying to calculate frequency for a guitar tuner. Is there anyway to get more granular data out of the FFT or is this limited by Flash’s sound data? Interpolating frequency based on dB doesn’t seem to be accurate.

    Glad you like the FFT! If you’re making a guitar tuner, you might want to check out my real-time spectrum analyzer code, as it shows how to grab the microphone input and apply a Hanning window prior to the FFT.

    The ‘raw’ resolution of an FFT is limited by the number of samples in the analysis window. If the sample rate is SR, and there are N points in the FFT, the spacing between the FFT bins is SR/N. For example, if the sample rate is 44.1 kHz, and N is 2048 (46ms), the spacing between bins is 21.53 Hz. That’s way too coarse for a guitar tuner, as the low E string’s fundamental frequency is 82.4 Hz. Fortunately there are loads of tricks to get better resolution. Here are a few, in no particular order…

    You can use a longer FFT, either using more samples of the audio signal, or ‘zero-padding’ a relatively short buffer. If the computational load of the FFT is an issue, use a lower sample rate by setting up the mic source to use a lower sample rate and/or downsampling the audio from the mic (after applying a suitable low-pass filter to prevent aliasing).

    You can interpolate around the peak in the magnitude spectrum. Using the magnitude of the bin at the peak, and the bin just above and just below the peak bin, you can compute an adjustment by +/- half a bin. Probably works better with the linear magnitude spectrum rather than log (dB) spectrum.

    You can use a technique inspired by phase vocoders: compute FFTs of two windows of samples with a single-sample shift in the time-domain. By looking at the phase shift in the peak FFT bin, you can get extremely accurate estimates of the actual frequency (down to a few millicents).

    You find a peak corresponding to one of the harmonics of the fundamental, estimate the frequency of that harmonic, then divide it by the harmonic number to get a better estimate of the fundamental. -Gerry

    • v0j says:

      I was working on a tone detection program and I used your libraries(FFT2 and the Spectrum analyzer). What I did was I lowered the sample rate of the mic and “minified” the buffer rate of the mic so that it will be much faster and to achieve a more specific decimal values. I combined it with my function that computes one ‘cent’ and produce a value for a certain chord so that it will return a specific value and then voila! I produced a mini tone detection program(with 3 octave starting in E2 – lowest chord of a guitar).

      Can you post some codes that the input is a mp3 file? because I was working on a project that detects the tone and checks if it’s has the same tone as the mp3 file. Your help would be much appreciated.

      • Alama says:

        if you can read a little french, you can read my blog, otherwise, with the example code, you should be fine.
        http://www.covergraph.com/blog/

        Until the release of FP11, write a parser MP3 or use a free framework like AudioFx

        For decompress a mp3 to raw (wav like), use sound.extract() method.

      • v0j says:

        @alama,

        I have read your post(through translate.google) and I liked it but It is not what I wanted to do. All I wanted is that to read the mp3 file the “Gerry’s FFT way” so that I could come up with the same result… Thanks anyway 🙂

      • Hi v0j, you should just be able to use Sound.extract() to get the audio sample data, as Alama suggested in his post. If you want a ‘live’ FFT/spectrum display as the mp3 is playing, that’s also possible, using SampleDataEvent. Kaourantin.net and Andre Michelle have good tutorials on both these subjects.

      • v0j says:

        @Gerry,

        hahaha! I have just studied you codes(again). and Your right and Alama’s right also. So that would be the easiest way to get the sample data. Got it! Thanks again to You and Alama. You’ve been a big help to me…

  10. Gery Teague says:

    Hey Gerry,

    I’ve been going through your code for the last 2 days, and I’m running into a problem that maybe you can help me understand. I’m trying to write code such that I can enter the frequency (ex: 440 for A4) and have it play back by dynamically generating the sounds for the Sound.SampleDataEvent. I have done this by simply using the following sound formula.

    for(var a:int = 0; a<2048; a++)
    {
    e.data.writeFloat(Math.sin(2*Math.PI * i * 440 / 44100));
    e.data.writeFloat(Math.sin(2*Math.PI * i * 440 / 44100));
    }

    The generic formula for generating the samples for a sound is:
    sin ( 2PI * sample# * freq/sampleRate) * amplitude

    Looking through your speed test code I see the following:
    Math.sin(2*Math.PI * i * 440 / 2048);

    This assumes that we're creating a wave at 440hz and sampling 2048 times.

    This works out great because I need 2048 samples, but it doesn't work out great because the points it's returning assume that flash is playing back at 2048 samples per second, which it isn't. So the sound is terrible.

    What I'm hoping to eventually do using your FFT2 is to send it a vector of 0.0 for all the real numbers, and a vector of different frequencies for the imaginary numbers, (ex: sending only 440).
    xIm[0]…. xIm[439] = 0;
    xIm[440] = 1024;
    xIm[441]…xIm[2047] = 0;

    And receiving a new vector of real numbers xRe that have the correct points to send back to the SampleDataEvent object for it's new bytes to play back.

    Could you help me with this? I can be emailed at [edited out for your privacy -Gerry]

    Thank you very much for your time.
    Gery

    [Hi Gery, we’ve corresponded a bit privately already on this, but I thought I’d also respond on my blog too for the benefit of others.

    From what you mentioned in the emails, what you’d really like to do is generate tones. If you want to generate notes, an FFT or IFFT is usually not the the way to do it. The FFT and IFFT certainly have many uses in the audio world, but generally they’re not used for generating notes. In principle, you can compute the spectrum you want then invert it to get a tone, but in practice constructing a self-consistent good-sounding spectrum, or series of spectra if you want a non-stationary signal, is a bit tricky.

    Normally the FFT and IFFT are used for analyzing and modifying existing audio signals. For example, I use my FFT in http://www.audiostretch.com“>AudioStretch, my web-based real-time time-stretcher/pitch-shifter. There’s quite a lot of specialized fiddly math and black art involved in using the FFT/IFFT correctly for audio applications, well beyond understanding the transforms themselves. It’s a bit of a slog learning it (think graduate-level EE courses and beyond). Best to avoid unless you need it for your application.

    To generate notes efficiently, the typical way to do it is to precompute a single period of a waveform, and store that in a vector. (In your case, that waveform would be a sine tone, though normally for musical tones the waveform would be the sum of multiple sine tones in a harmonic series, i.e. with integer multiple of some fundamental frequency). To play a note, you step through the vector grabbing samples, looping back to the beginning of the table when you reach the end. The rate at which you step through the table determines the pitch. The technique is often called “wavetable synthesis”. Google it and you should find lots of references, including a Wikipedia article.

    Writing a wavetable synthesizer from scratch is certainly a good way to learn about audio (and not nearly as complicated as learning how to use an FFT). If you Google “wavetable synthesis AS3”, you’ll find various blogs that have source code, for example at makemachine.net. You might also want to check out André Michelle’s Tonfall library. I haven’t used it myself, but from the description, it looks like it’s got a wavetable synth, plus a lot of other good stuff. André Michelle’s blog is excellent too.

    Good luck and have fun!

    -Gerry]

  11. Pingback: An FFT in C# | Gerry Beauregard

  12. Pingback: rigondo

  13. Pingback: Freak Wenzie « rigondo

  14. Yang says:

    Hello, your FFT seems only able to take the input/output with a size equal to a power of 2, any advices to extend it to any number ?

    • While there are FFT algorithms that work with non-power-of-2 lengths (for example FFTW), I suspect they’re a fair bit more complicated. Often if people are dealing with non-power-of-2 length data, they just use zero-padding to bring the length up to the nearest power of 2. For example, if you’ve got a buffer that’s got 800 samples, you can apply an 800-sample analysis window (Hanning, Hamming, triangular, whatever) to those samples; append 224 samples to bring the total length to 1024 = 2^10; then run the FFT.

      • Yang says:

        Yes, I will first stick to the closer power of two and then if it doesn’t behave as I wanted (but there is no reason why to be honest), I will try to find new solutions. Thanks a lot for being so quick to answer.

  15. charlierun says:

    Hello Gerry
    First of all thanks for this awesome class package Im having fun using this =).Im trying to target a high frequency 20khz I do this with your class no problems but to do so I have to analyze from 0 to 20 khz will be it be possible to target only 20 khz this will boost perfomance in my app.
    Thanks in advance Gerry =)

    • If you mean that you’d like to just get some very small subset of the FFT bins around 20 kHz, the simplest approach is to directly compute the DFT on the bins of interest. The mathematical expression for the DFT (see Wiki) looks daunting unless you’re familiar with complex exponentials, but the code is remarkably simple especially with a real-only input sequence. Basically you multiply-accumulate (dot-product) a block of input samples by cos and sin functions; the resulting sums give you the real and imaginary DFT bin values.

      • Lopoer says:

        Am interested in this as well can you write a more human language version?

      • charlierun says:

        Hi Gerad
        Thanks for the fast response.What I want to do is given a input audio stream wich im getting from a mic, only look for magnitude of 20 khz I know for example using your class, once I increment FFT length to 11 I can target 20khz signal. This magnitude is position on 929th value of the m_tempRe array that gets populated on your fft class. Would there be a dirty way to hack your class to read only for this signal I have been reading dft on wiki and is over my head..=(
        Thanks Gerad
        update I have been trying to do some dft function though I get no where did you mean something like this
        var k1:Number = 2*Math.PI/n;
        var k2:Number;
        for(var p=0; p<11; p++){
        k2 = k1*p;

        for(var k=0; k<929; k++){
        output_real += this.m_tempRe[k]*Math.cos(k2*k);
        output_imag += this.m_tempRe[k]*Math.sin(k2*k);
        }
        m_tempRe[p] = output_real/n;
        m_tempIm[p] = output_imag/n;
        }

      • If you’re using the FFT, you can actually get a bin at (or at least near) 20kHz regardless of the length of the FFT, as long as the sample rate is at least double 20 kHz. Using a longer window just means the bins are more closely spaced, and you’ll be able to find a bin that closer to the exact frequency you want. If the FFT length is N, the spacing between successive bins is SR/N, where SR is the sample rate. Assuming you’re using SR = 44.1 kHz, and N = 2048 (logN = 11), that means the bin spacing will be 21.53 Hz (44.1 kHz/2048), and the nearest bin to 20 kHz will be 929 (round(20000/21.53).

        Note that to get the magnitude at that frequency, you actually need to combine the real and imaginary parts:
        const BIN:uint = 929;
        var re:Number = m_tempRe[BIN];
        var im:Number = m_tempIm[BIN];
        var mag:Number = Math.sqrt(re*re + im*im);
        Don’t get freaked out by the imaginary number stuff. The square root of the sum of squares is just the Pythagoras rule for computing the hypotenuse of a right triangle given the other two sides.

        To compute a single bin of the DFT, you’d do something like this:

        const LOG_N:uint = 11; // Log of the single-bin DFT
        const N:uint = 1 << LOG_N; // Length of the single-bin DFT
        const BIN:uint = 929;
        const SR:Number = 44100;
        const PHASE_INC:Number = -2*Math.PI * SR / BIN;
        var x:Vector. = new Vector.(N);

        // Fill x with N samples of the signal you want to analyze
        // Optional: multiply by a “window function” that tapers to zero at the ends. Looks for “hanning” in my spectrum analyzer code

        // This is the DFT for a single bin:
        var xRe:Number = 0.0;
        var xIm:Number = 0.0;
        for (var k:uint = 0; k < N; k++)
        {
        xRe += x[k] * Math.cos(phase);
        xIm += x[k] * Math.sin(phase);
        phase += PHASE_INCREMENT;
        }
        // Get the magnitude at the bin
        var mag:Number = Math.sqrt(xRe*xRe + xIm*xIm);

        Disclaimers: I haven't compiled or tested the above code. Might have some typos. Also, it'll be slow because of all the sin/cos computations. There are ways to speed it up. And finally: the code can be modified to use any N, not just powers-of-two, and to have the "bin" be centered on any arbitrary frequency.

        BTW, why do you want to get the magnitude at 20 kHz? That's very, very high, and beyond the range of hearing for most people. What's more, while theoretically you can go up to 22.05kHz with a 44.1kHz sample rate, in practice the microphone frequency response and anti-aliasing filters before the A/D converter means you're unlikely to find much energy at all above 20 kHz… so I'm really puzzled what your application is!

        BTW2, if you're working on a commercial project, I'm open to doing some consulting for a fee… maybe I should add a Paypal "donate" button somewhere 😉

  16. charlierun says:

    Hi Gerad
    Thanks for the fast response.What I want to do is given a input audio stream wich im getting from a mic, only look for magnitude of 20 khz I know for example using your class, once I increment FFT length to 11 I can target 20khz signal. This magnitude is position on 929th value of the m_tempRe array that gets populated on your fft class. Would there be a dirty way to hack your class to read only for this signal I have been reading dft on wiki and is over my head..=(
    Thanks Gerad

  17. While your blog posts are incredibly helpful to me I am still struggling to understand how to achieve my desired result. What I am attempting is to precalc the spectrum of an audio file before it plays and perform *stuff* on this spectrum data to make a game that essentially matches the music.

    The main issues I am having are that I cannot find an example of precalcing FFT values before the fact anywhere on the internet at least not in AS3. I am also not sure of the resolution of the data I should be taking. The way my poor game developer mind sees it is that I should be that I can take 60 samples a second if my game runs at 60fps but the other, saner, part of me says that this will not be defined enough to perform accurate enough pattern matching stuff to generate the pretty colours that I want to display.

    Any light you can shed on either topic would be bloody marvelous.

    • You might want to take a look at my real-time spectrum analyzer example. It computes and plots the magnitude spectrum of the microphone input at regular intervals using an FFT. If you want the analysis to be done on recorded music instead, that’s also possible – basically you’d need to get the audio sample data using Sound.extract, analyze it, and also play it using the Sound class’ SampleData event mechanism.

  18. Alama says:

    For this kind of thing, it’s best to probably use the original AS3 FFT computeSpectrum. There are 512 values ​​(256 zones per channel frequencies) ranging from 0 to 1. The byteArray of the spectrum is refreshed at SWF frequency (60 for your game). Would be easier for you?

  19. B. says:

    Hi Gerry,

    Can you please explain to me how I can use this code to apply an FFT on an Array of samples?
    Where do I plug in the input array, and what is the output?

    Thanks in advance.

    • Like most FFT implementations, my FFT takes an input sequence consisting of complex numbers (i.e. numbers with real and imaginary parts), and generates a complex spectrum. In FFT.run(), the arguments xRe and xIm are Vectors of Numbers containing the real and imaginary parts of the sequence (on input) and spectrum (on output).

      If you’ve got an Array of (presumably real?) samples, you’ll need to first create a pair of Vector., say xRe and xIm, with a length corresponding to the FFT size. Copy the values from the Array into xRe, and set the values in xIm to zero. Then call FFT.run(xRe, xIm). xRe and xIm will contain the real and imaginary parts of the spectrum.

      For more on the usage, have a look at the code on the FFT Timing Test and Real-Time Spectrum Analysis posts.

  20. Pingback: Sound Danger | Dormant Blog

  21. David Schäfer says:

    if anyone is still interested in this amazingly fast FFT-implementation, i found a way to boost it’s performance even more.

    well, it wasn’t exactly me to FIND the trick. it was jackson dunstan (http://jacksondunstan.com/articles/2076)

    surprsingly, the vector lookup-time can decreased by nearly 50%, just by type-casting.

    so i added the type-casting to the fft-loops, and got my average run (with 2048 samples) down from 24ms to 17ms on my machine.

    my changes to the run-method:

    #98 x.re = scale * xRe[k];
    #99 x.im = scale * xIm[k];
    >>
    #98 x.re = scale * (xRe[k] as Number);
    #99 x.im = scale * (xIm[k] as Number);

    #122 var xTop:FFTElement = m_X[start];
    #123 var xBot:FFTElement = m_X[start+span];
    >>
    #122 var xTop:FFTElement = m_X[start] as FFTElement;
    #123 var xBot:FFTElement = m_X[start+span] as FFTElement;

    i didnt really expect these tiny changes, to speed it up at all… but oh yeah it did 🙂

    have fun!

    • Gery Teague says:

      Sweet! Your engine is awesome!

    • Nice discovery! I’m looking forward to trying it. Generally I’ve noticed that Flash Player versions that have come out since I first did my FFT performance tests are substantially faster than the versions I used for the FFT timing tests. If vector access has improved enough (especially with the casting trick you found), it may now be quicker to use my original AS3 FFT implementation than the “Even faster” version that uses linked lists.

  22. Tox says:

    I don’t know if someone said it already, but there are two major improvements and one minor one possible with your code:
    a) avoid for-loops and use while-loops instead; they are faster, especially with long iterations
    b) avoid creating variables (using the var keyword) inside of loops. use class internal variables instead. if these are the same across all instances of the class you can use static variables.
    c) use uint only where necessary, the int data type is slightly faster.

    thanks a lot for sharing!

  23. ryan says:

    To save anyone else some time… I tried using this code inside of a Mobile Application with AIR on iOS and Android to do pitch detection (just get a freq of a person singing) and got pretty poor performance.. Android Galaxy S4 got 10/60fps and iPAD2 got 30/60 fps.. Anyone looking to use this for something similar should look else where or build a Native Extension.

  24. Phil says:

    Hi,
    I am writing a Flex application for schools to use a phone/tablet accelerometer to show a simulated earthquake.

    Would I be able to make use of your algorithm for the frquency response (bear in mind the sample rate will not be quite consistant – eg t0=0, t1=50ms, t2=56ms, t3=45ms etc).

    TIA,
    Phil.

    • Hi Phil, if you’re using an FFT to calculate a frequency response, the sample rate needs to be constant. If you have irregularly spaced samples, perhaps you could interpolate to get a regularly-spaced set of samples? Really not sure. In any case, an FFT probably isn’t the best mathematical algo for analysing seismological data (whether real or simulated); I’m pretty sure wavelets are more commonly used, though even then I think the assumption is that the samples are uniformly sampled.

      • Phil says:

        Hi Gerry, thanks for your VERY quick answer.

        Your suggestion for interpolation is very possible (although it does increase the amount of “data”).

        I have tried your function, which returned variables “im” and “re” (I assume I can get the frquency magnitude by the sum of the squares rooted). However I will need to insert the data in, say, 128 values, but there will be some values “left over” from my data vector.

        If that number “left over” were, for example, 25 (from a total vector length of 128*15+25), do I simply need to pad the last values with zeros at the end of the vector to get a vector 128 long?

        Again, thanks,
        Phil.

      • sqrt(re[i]*re[i] + im[i]*im[i]) is indeed how you get the magnitudes. Note that if you do an N point FFT on real-only input data (i.e. where all the im values are zero), the all the magnitude info will be in the first N/2 bins; the upper N/2 will be a mirror image of the lower N/2. Zero-padding your input data to get to a power-of-length is a pretty common thing to do in audio signal processing.

  25. Hi Gerry.

    I am wondering if I could ask your advice, and perhaps the advice of any of your readers.

    I am currently in the final year of my honours degree in Music Technology, in the UK. I have very little experience in coding but decided to give it a go. I am using Flash CC 2014 Professional to code an application for iOS and android devices. The app that I am building is currently reading the information from the microphone on the users device and displaying real time graphical representation of this. However, I need the app to record more detailed information about the frequencies etc… as its function is to monitor the users exposure levels to sound and access this at the end of the session.

    I wondering if your FFT above might be able to help me achieve this, or if you could recommend a place where I might be able to find such an FFT.

    All advice greatly appreciated.

    Martin.

  26. Great, I’ll check it out now. Thanks for that Gerry.

  27. I got what you mean , regards for putting up.Woh I am thankful to find this website through google. “I would rather be a coward than brave because people hurt you when you are brave.” by E. M. Forster.

  28. Hatt says:

    Hello, I build a project with your code and other code in https://github.com/stefanvonderkrone/fourier-transformation/blob/master/src/de/stefanvonderkrone/as3/fourier/core/FourierTransformation.as to generate the image, but the result seems wrong, may I get your help.
    thanks a lot

    • Hi, I can’t look at your code (no time!) but if there’s a problem specifically with using my FFT implementation, I may be able to help.

      • Hatt H says:

        Thank for replay! the result like this

        i had tried some image, the width and height of the image is NOT power of 2 will get this result, but width and height is power of 2, the result seems good

        result:

        ​(i scale original image)

        😱 FFT can only transform the data which is in power of 2 ? ( I’m sorry for my poor math)

        PS: I had tried the code in comment from David Schäfer, but cant get faster

        2016-11-15 6:48 GMT+08:00 Gerry Beauregard :

        > Gerry Beauregard commented: “Hi, I can’t look at your code (no time!) but > if there’s a problem specifically with using my FFT implementation, I may > be able to help.” >

    • The FFT implementation I posted is for one dimension, so not sure whether it would be useful for image processing. It’s a radix-2 implementation, and as such only works with power-of-2-length sequences.

  29. James C says:

    Belated thanks for sharing this gem of coding on FFT.

    One particular thing (of many) that puzzles me in the code is that updateSpectrum() runs on its own timer separately to onMicSampleData(). Ignoring any screen update issues and thinking more about ensuring that no sampled audio data is “missed” by the FFT, isn’t there merit in calling the FFT processing (i.e. updateSpectrum()) at the end of the onMicSampleData() function so that every time new audio data is received, it is analysed?

    • If the UI updates are less frequent than the calls to onMicSampleData(), then you can reduce the CPU load by calling updateSpectrum() only as often as required by the UI.

      • James C says:

        I thought that was probably the reasoning behind the FFT and associated graph work running on a separate (timer) event to the storing of new audio data from the mic.

        However, if the requirement is such (as is with my project) that no audio data is missed even if that means not displaying any frequency graph (or other visual feedback or UI), then isn’t there merit in running the FFT every single time new audio data is provided via onMicSampleData()?

        In recent tests I carried out when onMicSampleData() and updateSpectrum() ran independently there is a risk that previously acquired audio data is overwritten and lost before the FFT and associated analysis is run by updateSpectrum().

        For example, when updateSpectrum() is on a 100ms timer, there’s about an even split of either 2 or 3 onMicSampleData() calls being made in-between each updateSpectrum() call with each onMicSampleData() call providing 1024 input samples – bearing in mind that the cyclic buffer for storing the sample data is on 2048 elements long. So, then updateSpectrum() is called following 3 onMicSampleData() calls that means that 1024 samples have been lost.

        Now, I’m no sound specialist by a long stretch of the imagination and I have no idea if losing 1024 samples of audio data is significant; but I am trying to detect specific occurrences of general noise which may be very short lived and also specific frequencies which may be short lived so losing audio data worries me.

        Further more, my algorithm is sometimes concerned with the persistence of a certain frequency over a period of time. Now, imagining a worse case scenario where the audio data has been lost/overwritten before every FFT and analysis is run, isn’t it possible that a particular pulsating frequency I’m trying to detect happens to be in the audio data that is being lost? Again, this is not my field so I don’t know if what I am asking is quite likely or extremely unlikely to happen in practice. At the moment, detection seems to be fairly reliable and as expected but I would welcome your opinion on this.

        Underlying this general query about updateSpectrum() being driven by a separate (timer) event, is that (I believe for the Flash Player Virtual Machine – AVM2) certain types of events such as mouse movement and mic sampling are primary type citizens and are responded to immediately while other event types such as timer events are secondary type citizens and have to wait their turn even if that means not firing according to their timed scheduled. With this in mind, tying the FFT and analysis into the onMicSampleData() call guarantees no audio data is missed while putting the FFT and analysis on a timer based call introduces a definite risk of parts of the audio data being missed.

        Please don’t take this as any criticism of your code which I am truly grateful for but I would really appreciate your further thoughts on this issue.

  30. The spectrum analyser demo as written is just that – a demo. As is it’s good for getting an idea of what’s going on in sounds that have steady or relatively-slowly changing spectra. Since it certainly will skip analysis on some of the incoming audio, if there are very rapid changes, they may not be displayed.

    Calling updateSpectrum() on very call to onMicSampleData() might help, and there’s no harm at all in doing so provided the computer is fast enough (and these days essentially any computer will be fast enough). Of course if some interesting spectral information only appears for one frame, your eyes may miss it anyway. But if the spectrum is being used as input for some further analysis – say note detection – then you definitely don’t want to miss any of the input data.

    BTW, it looks like this thread of comments ended up on the “An Even Faster AS3 FFT” page instead of the “Real-Time Spectrum Analysis” page! 🙂

    • James C says:

      Thank you for your comments on my query. Most helpful.

      Apologies to all for dropping this discussion point in on the wrong page.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s