StatLib---Applied Statistics algorithms
The Royal Statistical Society has been publishing algorithms in its journal Applied Statistics since 1968. As of 1989, there are over 250
of them. Most are in Fortran, though a few were in Algol, and some recent ones have been in Pascal. The book - Applied Statistics
Algorithms by Griffiths, P. and Hill, I.D., Ellis Horwood: Chichester (1985) contains translations of several algorithms from Algol to
Fortran. A few of the other algorithms have been supplied in Fortran translations, though a few are in Algol or Pascal. The index which
follows indicates which algorithms are not in Fortran.
The full source code is published in the journal. Those available here have been transcribed manually, mainly within CSIRO Division of
Mathematics & Statistics, or have been supplied directly by the authors or by the RSS Algorithms Editor. In some cases, later corrections
or improvements published in Applied Statistics, have been incorporated.
It is the policy of the editors of the algorithms section of Applied Statistics that algorithms do not use double precision. Many of the
algorithms here have been converted to double precision, though users should be careful to check what precision is used. Many of the
algorithms require the use of other algorithms, particularly functions for the gamma function and the normal distribution function. Where
such functions or subroutines are required, an appropriate comment has been added to the algorithm.
In a few cases, alternative algorithms from other sources have been added. For instance, three algorithms are included in the file for as66
for calculating the area under the normal curve, and an alternative random number generator is provided in the file for as183. The Applied
Statistics algorithm for Nelder-Mead simplex minimization (AS 47) does not include the fitting of a quadratic surface; the CSIRO/
Rothamsted implementation which does this is included with AS 47.
Users must consult the original journal articles for details of the calling arguments; they are not included with these algorithms.
*** Warning. In some cases, there are different arguments (usually more) in the Griffiths & Hill versions of these algorithms. Users should
check this.
It has been assumed that the user will be using a Fortran-77 compiler, so functions such as alog and amin1 have been converted to their
generic forms (log and min). This simplifies conversion of code between single and double precision. Also all constants in the code, such
as 1.0, 0.d0, etc., have been replaced with one, zero, etc., and these are defined in either data or parameter statements. Many of the
algorithms have been entered in lower case, which is not acceptable in Fortran, though most compilers accept it.
Some of the algorithms need machine-dependent constants. The user should check this. In most cases, these have been set for compilers which
allow a range of floating-point numbers from about 10**(-37) to 10**(+37), though most modern compilers allow a much wider range in double
precision.
No guarantee is given that the algorithms have been entered correctly, or that they perform as claimed in the journal.
To obtain an algorithm, send an E-mail request of the form: send index from apstat send 207 from apstatto statlib@lib.stat.cmu.edu
The Royal Statistical Society holds the copyright to these routines, but has given its permission for their distribution provided that no
fee is charged.
The full collection of Applied Statistics algorithms is very large. Please only request those algorithms which you need. Requesting large
numbers of algorithms places a great strain on the StatLib system and the underlying mail networks.
-------------------------------------------------------------------------------------------------------------------------------------------
As of the end of 1997 the Applied Statistics journal does NOT accept algorithms
-------------------------------------------------------------------------------------------------------------------------------------------
Listing of available Applied Statistics algorithms
No.
Brief description (volume number/year in brackets)
3
Student's t-distribution. (17/1968) See also AS 27.
5
The non-central t-distribution. (17/1968) See also AS 243.
6
Cholesky decomposition of a symmetric +ve definite matrix. (17/1968)
7
Inversion of a symmetric matrix stored in triangular form. (17/1968)
13
Minimum spanning tree. (18/1969) See also AS 40.
14
Printing the minimum spanning tree. (18/1969)
15
Single-linkage cluster analysis. (18/1969)
22
Calculate treatment effects in a complete factorial experiment for any numbers of levels of factors using the extended Yate's
method. (19/1970)
27
Upper tail area under Student's t-distribution. (19/1970) See also AS 3.
30
Half-normal plotting. (19/1970)
32
The incomplete gamma integral. (19/1970) See also AS 239.
34
Update inverse of symmetric banded matrix. (19/1970)
38
Calculate R-squared for all possible regression subsets using the Gauss-Jordan method. (20/1971) See also AS 268.
40
Update a minimum spanning tree. (20/1971)
41
Updates corrected sums of squares and products matrices. (20/1971) See also AS 240.
45
Histogram plotting. (20/1971)
46
Gram-Schmidt orthogonalization. (20/1971)
47
Nelder & Mead simplex method of unconstrained minimization without requiring derivatives. Does not include the quadratic surface
fitting part of the algorithm. A CSIRO/Rothamsted version of the algorithm, which does include fitting a quadratic surface, is also
included. (20/1971) (Updated, 20/Dec/93)
51
Log-linear fit for contingency tables. (21/1972) See also AS 160.
52
Calculation of sums of powers (up to 4) of deviations from the mean. (21/1972)
53
Wishart variate generator. (21/1972)
57
Printing multi-dimensional tables. (22/1973)
58
Allocates observations to clusters to minimize within-cluster sum of squares. (22/1973) See also AS 136.
60
Eigenvalues/vectors of a symmetric matrix. (22/1973)
62
Distribution of the Mann-Whitney U statistic. (22/1973)
63
Incomplete beta function. (22/1973) See also TOMS algorithm 708. TOMS algorithms are available from netlib.
64
Inverse of the incomplete beta function ratio. (22/1973) The file here is actually the Griffiths & Hill version of AS 109.
65
Expands structure formula to a list of binary integers. This is actually remark R82 which replaces the original AS65. (39/1990)
66
The normal distribution function. Two other algorithms (not from Applied Statistics) have also been included. (22/1973)
75
Algorithms for least-squares calculation using square-root free planar rotations (Morven Gentleman's package). (23/1974)
76
An integral useful in calculating noncentral t and bivariate normal probabilities. (23/1974)
77
Calculate exact null distribution of the largest root of a beta matrix. (23/1974)
78
The mediancentre (i.e. the median in a multi-dimensional space). (23/1974) See also AS 143.
83
Complex discrete fast Fourier transform. (24/1975)
84
Measures of multivariate skewness and kurtosis. (24/1975)
88
Generate all nCr combinations by simulating nested Fortran DO-loops. (Jane Gentleman's routines). (24/1975) See also AS 172.
89
Tail probabilities for Spearman's rho. (24/1975)
91
Percentage points of the chi-squared distribution. (24/1975)
93
Calculates frequency distribution for the Ansari-Bradley test statistic. (25/1976) A routine has been added to return the
distribution function.
95
Maximum likelihood estimation of scale and location parameters from grouped data. User's distribution function. (25/1976)
96
Finding `nice' scales for graphs. (25/1976)
97
Real discrete fast Fourier transform. Series length must be a power of 2. (25/1976) See also AS 117 and AS 176.
99
Fitting Johnson curves by moments. (25/1976)
100
Normal-Johnson and Johnson-Normal transformations. (25/1976)
103
Psi or digamma function. (25/1976)
107
Calculate operating characteristics and average sampling number for a general class of sequential sampling plans. (26/1977)
108
Multiple linear regression minimizing the sum of absolute errors. (26/1977) See also AS 238 (in Pascal).
109
Inverse of the incomplete beta function. (26/1977)
110
LP-Norm fit of straight line by extension of Schlossmacher. (26/1977)
111
Percentage points of the normal distribution. (26/1977) See also AS 241.
114
Compute the numerator of certain ordinal measures of association (Kendall's tau, Somer's d, Goodman and Kruskal's gamma) when the
data are ordered categories. (26/1977)
116
Calculate the tetrachoric correlation and its standard errors. (26/1977)
117
Fast Fourier transform for series of any length using the CHIRP algorithm. (26/1977)
121
Trigamma function. (27/1978)
123
Distribution function of mixtures of beta distributions. (27/1978)
125
Maximum likelihood estimation for censored exponential survival data with covariates. (27/1978)
126
Distribution function of the range for the normal distribution. (27/1978)
127
Generation of random orthogonal matrices. (27/1978)
128
Computes approximate covariance matrix for normal order statistics. (27/1978)
132
Simple regression minimizing the sum of absolute deviations. (27/1978)
133
Finding the global maximum or minimum of a function of 1 variable. (27/1978)
134
Generate random beta variates for alpha < 1 and beta > 1. (28/1979)
135
Min-Max (L-infinity) estimates for linear multiple regression. (28/1979)
136
A K-means clustering algorithm. (28/1979)
138
Maximum likelihood estimates of the mean and standard deviation of the normal distribution with censored or confined
observations. (28/1979)
139
Maximum likelihood estimation in a linear model from confined and censored normal data. (28/1979)
140
Clustering the nodes of a directed graph. (28/1979)
141
Inversion of a symmetric matrix ignoring a specified row/column. (28/1979)
142
Exact tests of significance in binary regression. (28/1979)
143
Calculates the median centre. (28/1979)
145
Exact distribution of the largest multinomial frequency. (28/1979)
147
Incomplete gamma function. (29/1980) See also AS 239.
148
Removal of bias in the jackknife procedure. (This is actually ASR 62) (29/1980)
149
Amalgamation of means in the case of simple ordering ('Up-and-Down Blocks' algorithm for isotonic regression). (29/1980)
150
Computes estimate of spectrum of a point process using a centered moving average of the periodogram of the counting
process. (29/1980)
151
Smoothed spectral estimate for bivariate point processes. (29/1980)
152
Cumulative hypergeometric probabilities. (This is actually AS R77) (29/1980, revised in 38/1989)
153
Distribution of weighted sum of squares of normal variables. (29/1980) Pan's procedure for the tail probabilities of the
Durbin-Watson statistic.
154
Exact maximum likelihood estimation of autoregressive-moving average models by Kalman filtering. (29/1980) See also AS 182.
155
Distribution function of a linear combination of non-central chi- squared random variables. (29/1980) See also AS 204. This is a
Fortran translation supplied by the author.
157
The runs-up and runs-down tests. (30/1981)
158
Calculation of probabilities for inferences under simple order restrictions. (30/1981) See also AS 198.
159
Generate random 2-way table with given marginal totals. (30/1981)
160
Partial and marginal association in multi-dimensional contingency tables. (30/1981)
161
Critical regions of an unconditional non-randomized test of homogeneity in 2 x 2 contingency tables. (30/1981)
162
Multivariate Conditional Logistic Analysis of Stratum-matched Case-control Studies. (30/1981) Includes a Fortran version of CACM
algorithm 382 for generating all combinations of M out of N items.
163
A Givens Algorithm for Moving from one Linear Model to another without Going back to the Data. (30/1981)
164
Least squares subject to linear constraints. (30/1981)
165
Discriminant analysis of categorical data. (30/1981)
166
Calculates the entanglement matrix for a specified design. (30/1981)
167
Calculates efficiencies of estimation and their generalized inverse. (30/1981)
168
Calculates 'neat' values for plotting scales. (30/1981)
169
Produces scatter plots. (30/1981)
170
Computation of probability and non-centrality parameter of a non- central chi-square distribution. (30/1981)
171
Fisher's exact variance test for the Poisson distribution. (31/1982)
172
Generates indices for simulated nested DO-loops. Actually converts (either way) between a single index and a vector of
subscripts. (31/1982)
173
Generates design matrix for balanced factorial experiments. (31/1982)
174
Multivariate rank sum test and median test. (31/1982)
175
Cramer-Wold factorization of self-reciprocal polynomials as the product of two polynomials. (31/1982)
176
Kernel density estimation using the fast Fourier transform. (31/1982) Also contains an alternative set of routines for density
estimation.
177
Expected values of normal order statistics. (31/1982)
178
Gauss-Jordan sweep operator with multi-collinearity detection. (31/1982)
179
Enumeration of all permutations of multi-sets with fixed repetition numbers. (31/1982)
180
Linear rank estimate of the standard deviation after symmetric trimming. (31/1982)
181
Withdrawn. See R94 below
182
Finite-sample prediction from ARIMA processes. (31/1982) Uses AS 154.
183
The Wichmann & Hill random number generator. An alternative is also provided. (31/1982)
184
Non-central studentized maximum and related multiple-t probabilities. (31/1982)
185
Backward elimination procedure to find best-fitting log-linear models for contingency tables. (31/1982) Uses AS 51.
186
Discrete Fast Fourier Transform with data permutation. Series length must be a power of 2. (31/1982)
187
Derivatives of the incomplete gamma integral. (31/1982)
188
Estimation of the order of dependence in sequences. (32/1983)
189
Maximum likelihood estimation for the beta binomial distribution. (32/1983)
190
Distribution function & its inverse, for the studentized range. (32/1983)
191
Approximate likelihood calculation for ARMA and seasonal ARMA models. Includes a routine for the Banachiewicz (or modified
Cholesky) factorization A = LDL'. (32/1983)
192
Calculate approximate percentage points using Pearson curves and the first three or four moments. (32/1983)
193
The Knuth spectral test for congruential random number generators. (32/1983)
194
Test of goodness of fit of ARMA models. (32/1983)
195
Multivariate normal probabilities for a rectangular region in N- dimensional space. A version which calls IMSL routines is also
included. (33/1984) See AS 251 for a special case.
196
Conditional multivariate logistic analysis of stratified case-control studies. (33/1984). Edited for errors by Bill Bardsley
(bill.bardsley@man.ac.uk) on 18/08/2002.
197
Likelihood function for an ARMA process. (33/1984)
198
Calculation of level probabilities for order-restricted inference. (33/1984)
199
Branch and bound algorithm to find the subset which maximizes a quadratic form. (33/1984)
200
Approximate the sum of squares of normal score. (33/1984)
201
Combine predictions about a statistic based on the orderings of a set of means with an F-test of differences between the
means. (33/1984)
202
Data-based nonparametric hazard estimation. (33/1984)
203
Maximum likelihood estimation of mixtures of distributions (normal, exponential, Poisson and binomial). (33/1984) See also
AS221. This is a translation from Algol into Fortran.
204
Distribution of a sum of non-central chi-squared variables. (33/1984) Translation from Algol into Pascal.
205
Enumeration of all R x C contingency tables with given row and column totals, and calculation of hypergeometric probability for
each table. (33/1984)
206
Isotonic regression in two independent variables. (33/1984)
207
Fit a generalized log-linear model. (33/1984)
208
Fit multivariate logistic model by the method of moments. (34/1985)
209
The distribution function of skewness and kurtosis. (34/1985)
210
Fit 5-parameter Johnson SB curves by moments. (34/1985)
211
The F-G diagonalization algorithm. (34/1985) Modifications in ASR 71 & ASR 74 have been added to the file.
212
Fitting the exponential curve by least squares. (34/1985)
213
Generation of correlation matrices with specified eigenvalues. (34/1985)
214
Calculation of Monte Carlo confidence intervals. (34/1985)
215
Maximum likelihood estimation of the parameters of the generalized extreme-value distribution. (34/1985). Updated on [15/Nov/99]
216
Maximum likelihood fitting when model contains a linear part and auxiliary parameters. (34/1985)
217
Computation of the Dip statistic to test for unimodality. (34/1985). Bug fixed by Ferenc Mechler (fmechler@med.cornell.edu). See
README file for details.[5/Sep/02]
218
Elements of the Fisher information matrix for the smallest extreme- value distribution, also allows for censored data. (35/1986)
219
Height balanced trees. (35/1986)
220
Operating characteristics of James-Stein and Efron-Morris estimation. (35/1986)
221
Maximum likelihood estimation of a mixing distribution. (35/1986)
222
Resistant smoothing using the fast Fourier transform. (36/1987)
223
Optimum ridge parameter selection. (36/1987)
224
Combining component designs to form a design with several orthogonal blocking factors. (36/1987)
225
Minimizing linear inequality-constrained Mahalanobis distances. (36/1987)
226
Cumulative probabilities for the non-central beta distribution. (36/1987)
227
Generate all possible N-bit binary codes. (36/1987)
228
Finding I-projections (maximizing cross-entropy) subject to a finite set of linear inequality constraints. (36/1987)
229
Quantile regression. (36/1987)
230
Distribution of customers in M/G/m queues using an approximation due to Hokstad. (36/1987)
231
Distribution of a noncentral chi-squared variable with non-negative degrees of freedom. (36/1987) In Pascal.
232
Computation of population and sample correlation and partial correlation matrices in MARMA(p,q) time series. (37/1988)
233
Branch and bound algorithm for feature subset selection. (37/1988)
234
Approximating the percentage points of simple linear rank statistics with Cornish-Fisher expansions. (37/1988)
235
Tally frequencies of distinct real values. (37/1988)
236
Recursive enumeration of RxC tables for exact likelihood evaluation. (37/1988) In Pascal.
237
The corner method for identifying autoregressive moving average models. (37/1988)
238
Recursive procedure for the L1-norm fitting of a straight line. (37/1988) In Pascal.
239
Incomplete gamma function. (37/1988)
240
Updating the inverse of corrected sums of squares and products. (37/1988)
241
Inverse normal - more accurate than AS 111. (37/1988)
242
The exact likelihood of a vector autoregressive moving average model. (38/1989)
243
Cumulative distribution function of the non-central t-distribution. (38/1989)
244
Decomposability and collapsibility for log-linear models. (38/1989) In Pascal.
245
Log of the gamma function. (38/1989) The file includes a slower but more accurate algorithm which is not an AS algorithm.
246
Repeated measurements analysis of variance with unknown autoregressive parameter. (38/1989)
247
Updating the sufficient configurations for fitting ZPA (zero partial association) models to multi-dimensional contingency
tables. (38/1989) In Pascal.
248
Empirical distribution function goodness-of-fit tests for the uniform, normal and exponential distributions. (38/1989)
249
Mean and covariance estimation for the truncated multivariate normal distribution. (38/1989)
250
Test the equality of dispersion matrices using methods due to Puri & Sen (distribution-free) and Anderson (normal
distribution). (38/1989)
251
Multivariate normal probability integrals with product correlation structure. (38/1989)
252
Generating classes for log-linear models. (39/1990)
253
Maximum likelihood estimation of the RC(M) association model. (39/1990)
254
Fit mixture of normal distributions to grouped & truncated data. (39/1990)
255
Fitting two-way tables by means for rows, columns and cross-term. (39/1990) In Algol.
256
Distribution of a quadratic form in normal variables. (39/1990) In Pascal.
257
Isotonic regression for umbrella orderings. (39/1990)
258
Average run lengths for cumulative sum schemes. (39/1990)
259
Extending confidence intervals by the Robbins-Monro search process. (39/1990)
260
Distribution function of the square (R^2) of the sample multiple- correlation coefficient. (40/1991)
261
Quantiles of the distribution of square (R^2) of the sample multiple- correlation coefficient. (40/1991)
262
The Wei-Lachin two-sample test, the generalized Wilcoxon test and the log-rank test for incomplete multivariate observations with
censoring. (40/1991)
263
Construction of irredundant test sets. (40/1991)
264
Printing of bit patterns. (40/1991)
265
Waiting time distribution for the G/G/1 queue using the Fast Fourier transform. (40/1991)
266
Maximum likelihood estimation of the parameters of the Dirichlet distribution. (40/1991)
267
Calculate lower bound for the probability of correct selection of a subset containing the best populations from a set of
populations. (40/1991)
268
Calculates statistics for all possible subset regressions using an orthogonal decomposition. (40/1991)
269
Calculates the Cornish-Fisher adjustment to distribution functions (from the normal distribution function) using higher
cumulants. (41/1992)
270
Maximum likelihood fitting of a 'key' density function, e.g. normal distribution, multiplied by a series in Hermite
polynomials. (41/1992)
271
Optimal (smallest mis-classification probability) joint classification procedure. (41/1992) See also AS 276.
272
Produce character Box-plots using supplied quartiles. (41/1992)
273
Compare the least-squares fit of two subsets of regression variables using Spj0tvoll's method. (41/1992). Updated [04/01/03].
274
Least squares algorithms to supplement those of Gentleman in AS 75. (41/1992)
274-90
Algorithm 274, translated into Fortran 90.
275
Non-central chi-squared distribution function, number of degrees of freedom must be positive but not necessarily
integer. (41/1992)
276
Optimal combinatoric classification for two normal classes. (41/1992) See also AS 271.
277
The Oja bivariate median. (41/1992)
278
The distribution of quadratic forms of multivariate generalized Student variables. (41/1992)
279
P-values for the generalized/alternative Durbin-Watson statistic with arbitrary lag using a Cholesky transformation
method. (42/1993)
280
The power function for Fisher's exact test. (42/1993)
281
Scaling and rounding regression coefficients to force them to be integers. (42/1993)
282
High breakdown regression and multivariate estimation. (42/1993)
283
Rapid computation of the permutation paired and grouped t-tests. (42/1993) In ANSI Standard C.
284
Null distribution of a statistic for testing sphericity and additivity: a Jacobi polynomial expansion. (42/1993)
285
Calculate multivariate normal probabilities using Monte Carlo methods for a general region defined by a user-supplied
function. (42/1993)
286
Estimation of parameters when there are errors in both the predictor variables and the dependent variable using least squares
with a general model (includes implicit non-linear regression). (42/1993)
287
Adaptive rejection sampling (to generate random variables) from log-concave density functions. (42/1993)
288
P-value calculation for the generalized two-sample Smirnov tests. The tests are conditional on ties in the pooled
sample. (43/1994)
289
Convolves hypergeometric distributions generated by several 2x2 tables (43/1994)
290
Generate a grid of variance ratios for contour plotting of confidence regions for a pair of parameters in non-linear
regression. (43/1994)
291
Calculates overlap capability of subsequence SSEQ of length L. (43/1994)
292
Computes Fisher information matrix elements for time (type I) or failure (type II) censored units from the smallest extreme value
(sev), largest extreme value (lev), normal or logistic distribution. (43/1994)
293
Convolves conditional distributions generated by several 2xK tables (43/1994)
294
Pascal program for ....?
295
Heuristic algorithm to pick N rows of X out of NCAND to maximize the determinant of X'X, using the Fedorov exchange
algorithm. (43/1994)
R94
calculates Shapiro-Wilk normality test and P-value for sample sizes 3 <= n <= 5000 . Handles censored or uncensored
data. Corrects AS 181, which was found to be inaccurate for n > 50. (Uses function POLY in AS 181) by Patrick Royston (44/1995)
R96
Interval for trapezoidal rule integration to limit error using moment generating function bound for weighted sum of
chi-squared(1) random variables.
297
Nonparametric regression using ultraspherical polynomials.
298
Hybrid minimization routine using simulated annealing and any local minimizer supplied by the user.
300
Efficient algorithm for determining a simulated percentile point with a fixed degree of accuracy.
301
Computes the logarithms of F1, P(H) and P(H').
302
Subroutine to compute a multiple isotonic regression for umbrella ordering with known peak.
303
Generation of ordered multinomial frequencies. Given integers N (N >= 0) and K (K >= 1), generates all possible integer arrays NI
of size K, whose elements are non-negative, non-decreasing and sum to N.
304
Fisher's non-parametric randomization test for two small independent random samples.
305
Compute the ARL of a CUSUM mean chart with linear drift.
306
Calculation of a BIB product operation on any two matrices.
307
Calculation of the simplicial depth and the halfspace depth
310
Computes the cumulative distribution function of a non-central beta random variable.
314
Inverts matrix with contents subject to modulo arithmetic.
317
Maximum Likelihood Estimation and Goodness-of-Fit Tests for Mixtures of Distributions. Submitted by Alan Miller
(amiller@bigpond.net.au) [31/Oct/02]
319
A program to implement a quasi-newton method. Using new algorithm Varmet July 1994
-------------------------------------------------------------------------------------------------------------------------------------------
N.B. This library was maintained by Alan Miller, formerly of CSIRO Division of Mathematics and Statistics, Clayton, Victoria 3169.
Fortran 90 translations of some of the above algorithms can be found at http://users.bigpond.net.au/amiller/apstat.html
Recent algorithms are provided by "T.R.Hopkins" . The StatLib collection of Applied Statistics Algorithms is
maintained by Pantelis Vlachos, vlachos@stat.cmu.edu.
-------------------------------------------------------------------------------------------------------------------------------------------
Credit where credit is due
If you use an algorithm, dataset, or other information from StatLib, please acknowledge both StatLib and the original contributor of the
material.
-------------------------------------------------------------------------------------------------------------------------------------------
Last modified: Wed May 11 10:09:22 EDT 2005 by Pantelis Vlachos