This file describes the origins of the various FFTs found in this
directory.  With each source code package we give the name by which
the code is labelled in the benchmark results.

This README file has been superseded by the HTML file "doc/ffts.html".
Please refer to that document for complete information.  The main
purpose of this file is now simply to list which files go with which
package.

Readers should note that we have made various modifications to the
code in order to make it work with the benchmark (e.g. changing
subroutine names, changing to use double precision, and similar minor
alterations).  The biggest change is that all the routines have been
modified to #include <fftw.h> and use FFTW_REAL as their floating-point
type; this allows us to change the precision of the benchmark simply
by changing fftw.h.
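
	As a rough sketch of this convention (not the actual contents of
fftw.h): the stand-in definition of FFTW_REAL and the routine name
example_scale below are assumptions made for illustration only.  In
the benchmark the type comes from #include <fftw.h>.

```c
#include <stddef.h>

/* Stand-in for what #include <fftw.h> provides in the benchmark.
   Assumption: the real header may define FFTW_REAL differently. */
#ifndef FFTW_REAL
#define FFTW_REAL double   /* switch to float for single precision */
#endif

/* Hypothetical routine written against FFTW_REAL instead of a
   hard-coded float or double.  Scales n values in place. */
void example_scale(FFTW_REAL *data, size_t n, FFTW_REAL factor)
{
    size_t i;
    for (i = 0; i < n; ++i)
        data[i] *= factor;
}
```

Because every routine refers only to FFTW_REAL, changing the one
definition changes the precision of all of them at once.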

Except where otherwise noted, FFTs are one-dimensional only.

-----------------------------------------------------------------------------

Frigo-old: matteo_fft.c
	C FFT written by M. Frigo (one of the FFTW authors).  This
code is an early precursor to FFTW.  It is actually written in Cilk,
and is designed for efficient parallel execution.  Optimized for
powers of two, although it will work on any size.

CWP: cwp.h pfafft.c
	C FFT by Dave Hale (1989) in a numerical library from the Colorado
School of Mines. This code works only on a limited set of array sizes
and uses a Prime Factor Algorithm.
	http://risc1.numis.nwu.edu/ftp/pub/transforms/cwplib.tar.gz
	Info: http://risc1.numis.nwu.edu/ftp/pub/transforms/cwplib.doc
	Colorado School of Mines: http://www.mines.colorado.edu/

Krukar: b512.c bitrev.c dint.c dintime.c fft.c fft.h 
        idint.c idintime.c tab.c
	C FFT by Richard H. Krukar (krukar@ectopia.com) (1990). Works
only for powers of two <= 4096.
	http://hpux.ced.tudelft.nl/hpux/Maths/Misc/ffts_in_C-1.0.html
	Author's Home Page: http://206.214.38.35/contact.html

Singleton: go_fft.c go_fft.h
	Fortran FFT (converted via f2c) by R. C. Singleton (Stanford
Research Institute, 1968). This routine handles both one- and
multi-dimensional transforms.
	R. C. Singleton, "An algorithm for computing the mixed radix
fast Fourier transform," IEEE Trans. on Audio and Electroacoustics,
vol. AU-17, no. 2, p. 93-103 (June, 1969).
	http://www.netlib.org/go/fft.f

FFTPACK: cfftb.c cfftb1.c cfftf.c cfftf1.c cffti.c cffti1.c
         passb.c passb2.c passb3.c passb4.c passb5.c passf.c
         passf2.c passf3.c passf4.c passf5.c
	Fortran FFT (converted via f2c) from the popular FFTPACK
package by P. N. Swarztrauber.
	P. N. Swarztrauber, "Vectorizing the FFTs," Parallel
Computations, p. 51-83 (1982).
	http://www.netlib.org/fftpack/index.html

NR (C):  four1.c fourn.c
	C FFTs from Numerical Recipes in C.  fourn is a multi-
dimensional transform. Both of these routines only work for powers of
two.  Note that we changed the names of the routines from four1 and
fourn to nrc_four1 and nrc_fourn.  We also changed the floating point
type to FFTW_REAL. Not in the public domain.
	http://cfata2.harvard.edu/numerical-recipes/

Temperton: gpfa_gpfa.c gpfa_gpfa2f.c gpfa_gpfa3f.c gpfa_gpfa5f.c
      gpfa_gpfft3.c gpfa_setgpfa.c
	Fortran FFT (converted via f2c) by C. Temperton. Works for any
size of the form N = 2^P 3^Q 5^R.  Performs both one- and
multi-dimensional transforms.
	C. Temperton, "A Generalized Prime Factor FFT Algorithm For
Any N = 2^P 3^Q 5^R," SIAM Journal on Scientific and Statistical
Computing, vol. 13, no. 3, p. 676-686 (1992).
	http://www.spektracom.de/~arndt/fxt/gpfa.tgz
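
	A minimal sketch of a size check for this restriction.  The
function name is_gpfa_size is hypothetical and is not part of
Temperton's code; whether the real routines accept the trivial size
N = 1 is not something this sketch settles.

```c
/* Returns 1 if n = 2^p * 3^q * 5^r for some p, q, r >= 0, else 0.
   Hypothetical helper; not taken from the GPFA sources. */
int is_gpfa_size(unsigned long n)
{
    if (n == 0)
        return 0;
    while (n % 2 == 0) n /= 2;   /* strip all factors of 2 */
    while (n % 3 == 0) n /= 3;   /* strip all factors of 3 */
    while (n % 5 == 0) n /= 5;   /* strip all factors of 5 */
    return n == 1;               /* anything left is another prime factor */
}
```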

Green: green.h green.c
	C FFT (includes real-complex transform) by John Green
(green_jt@vsdec.nl.nuwc.navy.mil) (1996).  Optimized for the PowerPC.
	http://hyperarchive.lcs.mit.edu/HyperArchive/Archive/dev/src/ffts-for-risc-121-c.hqx

Mayer: fft_mayer.c trigtbl.h
	C FFT by Ron Mayer (mayer@acuson.com) (1993). Only works for
powers of two.  This code actually performs the FFT by using the Fast
Hartley Transform (FHT).
	http://www.geocities.com/ResearchTriangle/8869/fft_summary.html

Nielsen: mixfft/*
	C FFT by Jens Jorgen Nielsen (1996).
(Mixed-radix Cooley-Tukey transform.) The mixfft/ directory contains
Nielsen's original package (we are not allowed to distribute a
modified version).  To use this FFT in the benchmark, you must
modify the code as described in doc/install-nielsen.html!

NAPACK: NAPACK.c
	Fortran FFT (converted via f2c) from NAPACK package.  Does not
include inverse transform.  Seems to have bugs under gcc and Linux for
sizes that aren't powers of two (works okay elsewhere, though).
	http://math.nist.gov/cgi-bin/gams-serve/
               list-package-components/NAPACK.html

Edelblute: fft_duhamel.c
	C FFT by Dave Edelblute (edelblut@cod.nosc.mil) (1993),
implementing a Duhamel-Hollmann split-radix FFT.  Only works for powers
of two.
	http://risc1.numis.nwu.edu/fft/fft-stuff.tar.gz

Beauregard:  BeauregardFFT.c BeauregardFFT.h ComplexMath.c ComplexMath.h
	A C FFT by Gerry Beauregard (1991).  Can only handle sizes that
are powers of two.
	http://risc1.numis.nwu.edu/fft/fft-stuff.tar.gz

Ooura (C): ooura.c
	C FFT by Takuya Ooura (ooura@mmm.t.u-tokyo.ac.jp) (1996).
Only works for sizes that are powers of two.  There is also a Fortran
version (from the looks of things, the Fortran version was written
first, translated with f2c, and then the C code was cleaned up a bit
by hand).
	http://momonga.t.u-tokyo.ac.jp/~ooura/

Ransom: ransom/*
	C FFT by Scott M. Ransom (ransom@cfa.harvard.edu) (1997). Uses
the "6-step" FFT (a variant of the same method used by Arndt 4-step).
Includes real-complex transform and prototype mass-storage
FFT. Received in personal communication with the author. Only works
for sizes that are powers of two.
	Reference: http://science.nas.nasa.gov/Pubs/TechReports/RNRreports/dbailey/RNR-89-004/RNR-89-004.ps

Valkenburg: fft2_complex.h fourier.c ft.c w.c w.h
	C FFT by Peter Valkenburg (valke@cs.vu.nl) (1987).
	http://www.spektracom.de/~arndt/fxt/fft2.tgz

PDA: pda.h pda_dcfftb.c pda_dcfftf.c pda_dcffti.c pda_dcftb1.c pda_dcftf1.c
     pda_dcfti1.c pda_dnfftb.c pda_dnfftf.c pda_dpssb.c pda_dpssb2.c
     pda_dpssb3.c pda_dpssb4.c pda_dpssb5.c pda_dpssf.c pda_dpssf2.c
     pda_dpssf3.c pda_dpssf4.c pda_dpssf5.c
	Fortran FFTs (converted via f2c) from the Public Domain
Algorithms (PDA) library.  These routines perform both one- and
multi-dimensional transforms.  The one-dimensional transforms are
based on FFTPACK.
	http://www.roe.ac.uk/computing/starlink/docs/sun194.htx/node10.htm

HARM: harmd.c
	Multi-dimensional Fortran FFT (converted via f2c).  Only works
for powers of 2. The author is unknown, but we suspect that it derives
from code by J. W. Cooley himself, judging from mentions of a
"PK HARM" multi-dimensional transform that we found in his book (see below).
	J. W. Cooley, P. A. W. Lewis, and P. D. Welch, "The Fast
Fourier Transform Algorithm and Its Applications" (IBM Research, 1967).
	http://risc1.numis.nwu.edu/fft/harm.f

Arndt *: fxt/*
	Many transforms from the "FXT" package by Joerg Arndt
(arndt@spektracom.de).  This package includes several different FFT
implementations (both complex-complex and real-complex), FHT code, and
number-theoretic transforms.  Not all of the code in FXT was written
by Arndt himself; much was adapted from other sources.  We benchmarked
only those routines from this package that appeared to have been
significantly modified from their original versions; otherwise, we
benchmarked just the original versions.  The FXT package is ostensibly
written in C++.  However, the usage of C++ was purely cosmetic; we
converted it back to C so that we could compile everything in the
benchmark with just a C compiler.
	Arndt DIT is a radix-4 DIT FFT routine.  Arndt DIF is a
radix-4 DIF FFT routine.
	Arndt Split-Radix is a Duhamel-type split-radix transform,
based on the code by D. Edelblute (benchmarked separately). It was
compiled with the USE_SINCOS3 option turned off.  (With the option
turned on, there are no essential differences from Edelblute, and the
performance is the same.  In any case, the code is often faster with
the option turned off.)
	Arndt 4-step is not a new dance craze, but is an FFT based on
the so-called "four-step" FFT.  (This is essentially a Cooley-Tukey
FFT where you use sqrt(n) as the radix and do transpose operations to
keep data sequential.)
	All of these functions only work for transforms whose sizes
are powers of two.
	http://www.spektracom.de/~arndt/joerg.html

Bergland: bergland.c
	A radix-8 FFT, translated by Dr. Richard L. Lachance
(richard.lachance@bomem.com) from a Fortran program by G. D. Bergland
and M. T. Dolan. Works only for powers of two, and does not include a
true inverse transform. The original source can be found in the book:
_Programs for Digital Signal Processing_, edited by the DSP Committee,
IEEE Acoustics, Speech, and Signal Processing Society (IEEE Press,
1979), Chapter 1.2, "Fast Fourier Transform Algorithms."
	(Received in personal communication with Dr. Lachance.)

GSL: gsl/*
	This directory contains FFT routines adapted from the GNU
Scientific Library (GSL).  The routines were written by Brian Gough
(bjg@vvv.lanl.gov).  The original GSL may be downloaded from:
        ftp://nis-ftp.lanl.gov/pub/users/rosalia/

QFT: qft/*
	See qft/README (or benchfft docs).
