Help file for design.mac macros for MacAnova
(C) 1998, 1999, 2000, 2001, 2002, 2003 by Gary W. Oehlert
Updated 940909 GO
Updated 950413 GO
Updated 960713 CB
Updated 970225 GO
Updated 970319 GO
Updated 970520 GO
Updated 970924 GO (revised search keys)
Updated 980506 GO (added mixed() and varcomp())
Updated 980803 CB (added all3anova(), all4anova(), design_index, edited others)
Updated 990831 CB (added '()' or '%' after topic names)
Updated 000209 CB (added entries for choosedef2() and interblock())
Updated 000620 GO (added entries for sidebyside() and interactplot())
Updated 000908 GO (added entry for reml())
Updated 001216 CB (added subtopics)
Updated 010430 CB (minor change to pairwise())
Updated 010801 CB (added () to macro names where needed)
Updated 020221 CB (added some cross references)
Updated 020828 CB (updated boxcoxvec())
Updated 021022 CB (added buildfactor())
Updated 030813 CB (added subtopic titles)
Updated 031030 CB (minor editing)
!!!! Starting marker for message of the day
!!!! Ending marker for message of the day
???? Starting marker for list of up to 32 comma/newline separated keys
Aliasing
Analysis
ANOVA
Confounding
Design
Factorial
Permutation test
Plots
Random effects
???? Ending marker for keys
Note: There are no topics with names longer than 15 characters so as to
allow 5 column output from 'help()' with width >= 80.
Each topic starts with '====topicName' in column 1, followed by an
unlimited number of lines of information. It is helpful, but not
required, that topics be in alphabetical order. Topic names should, if
possible, be no longer than 12 characters, since otherwise they must be
quoted in a help() command. The convention is used that when referring
to functions, but not macros, by name, '()' is appended.
Each topic name that refers to a function or a pre-defined macro has
'()' appended to it. This does not affect output but is intended to
useful in semi-automatic generation of the reference manual from the
help file.
Names of topics that are not to be included in the Reference Manual are
followed by '*'. This does not affect help() output.
Names of topics whose usages are not to be included in the Reference
Manual are followed by '%'. This does not affect help() output.
Keys separated by commas can follow the topic name (and trailing '()',
'*' or '%') if separated from it by '#'. These should be selected from
up to 32 keys listed between lines starting with '????' somewhere before
the first help topic. Case is ignored.
If the next line after '====topicName' starts with '%%%%', all following
lines up to a matching '%%%%' are skipped by help() and printed by
usage(). usage() skips the lines following the second '%%%%'. This
feature implemented 951212.
The file should be terminated by a line starting '_E_O_F_'
====aliases2()#aliasing,design,factorial
%%%%
aliases2(basis [,effect:vec] [,length:j]), REAL matrix basis of alias
generators, REAL vector vec of 0's and 1's, positive integer j
%%%%
@@@@usage#Usage
aliases2(basis) finds all aliases of I in a 2^(k-p) fractional factorial
and returns a CHARACTER vector of these aliases as its value. k must be
no larger than 25, and factors are labeled A, B, C, ..., H, J, ... Z
(skipping I).
The p x k matrix basis contains the generators for the aliasing, one row
for each generator and one column for each factor in the design. The
elements in basis are 0, -1, or 1. A nonzero entry indicates that a
factor is present in the generator for that row. The sign of a generator
is the product of the signs of the nonzero elements of the generator.
For example, 1 0 1 0 0 -1 means -ACF is a generator (alias of I).
If basis is a (column) vector, it is changed to a row vector (1 by k
matrix) before proceding.
@@@@effect_and_length_keywords
aliases2(basis,effect:vec) returns the aliases of vec, a vector of k 0s
and 1s that specifies an effect.
aliases2(basis [,effect:vec],length:j) does the same but returns only
aliases of length j.
@@@@examples
Examples:
Cmd> print(format:"2.0f",b) # Matrix b is 2x5, so 2^(5-2) design
b:
(1,1) 1 1 1 0 0 [ABC is a generator]
(2,1) 0 0 1 1 -1 [-CDE is a generator]
Cmd> aliases2(b) # aliases of I
(1) "I"
(2) "ABC"
(3) "-CDE"
(4) "-ABDE"
Cmd> aliases2(b,length:3) # length 3 aliases of I
(1) "I"
(2) "ABC"
(3) "-CDE"
Cmd> aliases2(b,effect:vector(1,1,0,0,0)) # aliases of AB
(1) "AB"
(2) "C"
(3) "-ABCDE"
(4) "-DE"
@@@@see_also#Cross references
See also aliases3(), confound2(), choosedef2() and choosegen2().
@@@@______
====aliases3()#aliasing,design,factorial
%%%%
aliases3(basis[,effect:vec][,length:j]), REAL matrix basis of alias
generators, REAL vector vec of 0's, 1's and 2's, positive integer j
%%%%
@@@@usage#Usage
aliases3(basis) finds all aliases of I in a 3^(k-p) factional factorial
design and returns a CHARACTER vector of these aliases as its values. k
must be no larger than 25 and factors are labeled A, B, ..., H, J, ...,
Z (skipping I).
basis is p x k REAL matrix of 0s, 1s and 2s which contains the
generators for the aliasing, one row for each generator and one column
for each factor in the design. For example 1 0 2 0 0 1 means AC^2F is a
generator (alias of I).
If basis is a (column) vector, it is changed to a row vector (1 by k
matrix) before proceding.
@@@@effect_and_length_keywords#Keywords effect and length
aliases3(basis,effect:vec) returns the aliases of vec, a vector of k 0s,
1s and 2s representing an effect. When vec:rep(0,k), the result is the
same as aliases3(basis).
aliases3(basis [,effect:vec] , length:j) returns only aliases of length
j.
@@@@examples
Examples:
Cmd> print(c,format:"2.0f") # Matrix c is 2x4, so 3^(4-2)
c:
(1,1) 1 2 0 2 [A B^2 D^2 is a generator]
(2,1) 0 1 2 2 [B C^2 D^2 is a generator]
Cmd> aliases3(c) # all aliases of I
(1) "I"
(2) "A^1 B^2 D^2 "
(3) "A^1 B^2 D^2 "
(4) "B^1 C^2 D^2 "
(5) "A^1 C^2 D^1 "
(6) "A^1 B^1 C^1 "
(7) "B^1 C^2 D^2 "
(8) "A^1 B^1 C^1 "
(9) "A^1 C^2 D^1 "
Cmd> aliases3(c,effect:vector(1,1,0,0)) # aliases of A^1B^1
(1) "A^1 B^1 "
(2) "A^1 D^1 "
(3) "B^1 D^2 "
(4) "A^1 B^2 C^2 D^2 "
(5) "A^1 B^2 C^1 D^2 "
(6) "C^1 "
(7) "A^1 C^1 D^1 "
(8) "A^1 B^1 C^2 "
(9) "B^1 C^1 D^2 "
@@@@see_also#Cross references
See also aliases2(), confound3()
@@@@______
====all3anova()#anova
%%%%
all3anova(y, a, b, c [,s2:val]), REAL vector y, factors a, b, c of same
length as y, REAL scalar val > 0.
%%%%
@@@@usage#Usage
all3anova(y, a, b, c) fits all hierarchical 3 factor ANOVA models. y
must be a REAL vector and a, b and c must be factors (created by
factor() or makefactor()) of the same length as y.
For each of the 18 models fit, all3anova() prints the following, in order
of increasing C(p).
p Number of degrees of freedom in the model, including the
constant term
C(p) Mallow's Cp statistic
Adj R^2 Adjusted R^2
R^2 Unadjusted R^2
Model The right hand side of the fitted model using the symbols
a, b and c for the 3 factors
The estimate of variance used in computing C(p) is the error mean square
from the model fitting all factors and their interactions ("y=a*b*c").
Thus this model must not fit perfectly (have zero residual sums of
squares).
@@@@s2_and_keep_keywords#Keywords s2 and keep
all3anova(y, a, b, c, s2:v), where v > 0 is a REAL scalar does the same,
except v is used in computing C(p) instead of the error mean square from
the model fitting all factors and their interactions.
all3anova(y, a, b, c [, s2:v], keep:T) does the same except nothing is
printed. Instead a structure with the following component is returned:
Component Contents
p Vector of integers containing p for all the models fit
cp REAL vector containing C(p) for all the models fit
adjrsq REAL vector containing adjusted R^2 for all the models fit
rsq REAL vector containing unadjusted R^2 for all the models fit
model CHARACTER vector containing right hand side of each model fit
using a, b and c as factor names.
The models are in increasing order of C(p).
@@@@keep_and_print_keywords#Keywords keep and print
all3anova(y, a, b, c, [, s2:v], keep:T, print:T) returns a structure and
prints results.
@@@@see_also#Cross references
See also all4anova() and screen().
@@@@______
====all4anova()#anova
%%%%
all4anova(y, a, b, c, d [,s2:val]), REAL vector y, factors a, b, c, d
with same length as y, REAL scalar val > 0.
%%%%
@@@@usage#Usage
all4anova(y, a, b, c, d) fits all hierarchical 4 factor ANOVA models. y
must be a REAL vector and a, b, c and d must be factors (created by
factor()) of the same length as y.
For each of the 166 models fit, all4anova() prints the following, in
order of increasing C(p).
p Number of degrees of freedom in the model, including the
constant term
C(p) Mallows' Cp statistic
Adj R^2 Adjusted R^2
R^2 Unadjusted R^2
Model The right hand side of the fitted model using the symbols
a, b, c and d for the 4 factors
The estimate of variance used in computing C(p) is the error mean square
from the model fitting all factors and their interactions ("y=a*b*c*d").
Thus this model must not fit perfectly (have zero residual sums of
squares).
all4anova(y, a, b, c, d, s2:v), where v > 0 is a REAL scalar does the
same, except v is used in computing C(p) instead of the error mean
square from the model fitting all factors and their interactions.
all4anova(y, a, b, c, d [, s2:v], keep:T) does the same except nothing
is printed. Instead a structure with the following component is
returned:
Component Contents
p Vector of integers containing p for all the models fit
cp REAL vector containing C(p) for all the models fit
adjrsq REAL vector containing adjusted R^2 for all the models fit
rsq REAL vector containing unadjusted R^2 for all the models fit
model CHARACTER vector containing right hand side of each model fit
using a, b and c as factor names.
The models are in order of increading C(p).
@@@@keep_print_keywords#Keywords keep and print
all4anova(y, a, b, c, d, [, s2:v], keep:T, print:T) returns a structure
and prints results.
@@@@see_also#Cross references
See also all3anova() and screen().
@@@@______
====allaliases2()#aliasing,design,factorial
%%%%
allaliases2(basis), REAL matrix basis of alias generators
%%%%
@@@@usage#Usage
allaliases2(basis) finds the full set of aliases in a 2^(k-p) fractional
factorial and returns a CHARACTER vector of these aliases as its value.
k must be no larger than 25, and factors are labeled A, B, ..., H, J,
..., Z (skipping I).
The p x k matrix basis contains the generators for the aliasing, one row
for each generator and one column for each factor in the design. The
elements in basis are 0, -1, or 1. A nonzero entry indicates that a
factor is present in the generator for that row. The sign of a generator
is the product of the signs of the nonzero elements of the generator.
For example, 1 0 1 0 0 -1 means -ACF is a generator (alias of I).
@@@@examples
Examples:
Cmd> print(b,format:"2.0f") # Matrix b is 2x5, so 2^(5-2) design
b:
(1,1) 1 1 1 0 0 [ABC is a generator]
(2,1) 0 0 1 1 -1 [-CDE is a generator]
Cmd> allaliases2(b) # alias table
(1) "I = ABC = -CDE = -ABDE"
(2) "A = BC = -ACDE = -BDE"
(3) "B = AC = -BCDE = -ADE"
(4) "AB = C = -ABCDE = -DE"
(5) "D = ABCD = -CE = -ABE"
(6) "AD = BCD = -ACE = -BE"
(7) "BD = ACD = -BCE = -AE"
(8) "ABD = CD = -ABCE = -E"
@@@@______
====boxcoxvec()#ANOVA,analysis
%%%%
boxcoxvec(Model [,powers:pow]), CHARACTER scalar Model, REAL vector pow
boxcoxvec(rhs_model,y[,powers:pow]), CHARACTER scalar rhs_model, REAL
vectors y and pow.
%%%%
@@@@usage#Usage
boxcoxvec(Model, powers:Pow), where Model is a CHARACTER scalar
specifying a GLM model of the form "y = Rhs", computes the error SS for
y transformed to the boxcox powers given in Pow. Example: Model =
"yield = x + a + b", where response yield is a REAL vector, x is a
variate and a and b are factors. For this model Rhs is "x + a + b".
Any factors or variates in Rhs must have the same length as y. In the
example, x, a, and b must all be the same length as yield.
If powers:Pow is omitted, the default is Pow = run(-1,2,.25)
For each power P in Pow, the model "{boxcox(y,P)} = RHS" is fit using
anova() and the error SS is saved. The value returned by boxcoxvec()
is structure(power:Pow, SS:Ss), where Pow and Ss are vectors of powers
and error sums of squares.
boxcoxvec(Rhs,y [,powers:Pow]), where Rhs is a quoted string or
CHARACTER, does the same as boxcoxvec("y = Rhs" [,powers:Pow]). This
somewhat clumsier usage is maintained for backward compatibility.
A power that minimizes the error SS may be chosen to transform the data
for analysis.
@@@@examples
Examples:
Cmd> y <- rnorm(20);ey <- exp(y)
NOTE: random number seeds set to 59622139 and 172924584
Cmd> a <- factor(rep(run(5),4))
Cmd> boxcoxvec("ey = a") # default powers; or boxcoxvec("a", ey)
component: power
(1) -1 -0.75 -0.5 -0.25 0
(6) 0.25 0.5 0.75 1 1.25
(11) 1.5 1.75 2
component: SS
(1) 179.93 72.889 34.971 21.487 17.979
(6) 20.624 31.191 58.929 131.81 332.94
(11) 915.71 2673 8140.8
Cmd> boxcoxvec("ey = a",powers:run(-.25,.25,1/16))#explore powers ~= 0
component: power
(1) -0.25 -0.1875 -0.125 -0.0625 0
(6) 0.0625 0.125 0.1875 0.25
component: SS
(1) 21.487 19.938 18.875 18.236 17.979
(6) 18.083 18.544 19.38 20.624
Cmd> lineplot(boxcoxvec("ey = a"),title:"Resid SS vs Powers")#graph it
@@@@see_also#Cross references
See also boxcox(), lineplot(), 'models'.
@@@@______
====buildfactor()#anova,factorial
%%%%
a <- buildfactor(jsub, dims, [,reverse:T]), integer jsub > 0, dims a
vector of positive integers, length(dims) >= jsub
%%%%
@@@@usage#Usage
fac_j <- buildfactor(j, dims) creates a factor corresponding to the
values of subscript j >= 1 for a balanced factorial design. dims must be a
vector of positive integers with M = length(dims) >= j.
fac_j will be a factor with dims[j] levels, with N = length(fac_j) =
prod(dims), the product of the elements of dims.
fac_j may be used in an ANOVA of a balanced factorial experiment where
each case is identified by m <= M subscripts with the first changing
fastest and the last changing slowest, with subscript i running from 1
to dims[i]. When m < M, there will be N/prod(dims[run(m)]) replications.
fac_j <- buildfactor(j, dims, reverse:T), does the same except first
subscripts change slowest. buildfactor(j, dims, reverse:T) is
equivalent to buildfactor(M+1-j, reverse(dims)).
@@@@example
Example:
Cmd> dims <- vector(2,3,2,2)#for 2 x 3 x 2 in 2 reps
Cmd> fac_1 <- buildfactor(1,dims)
Cmd> fac_2 <- buildfactor(2,dims)
Cmd> fac_3 <- buildfactor(3,dims)
Cmd> print(fac_1,fac_2,fac_3,format:"1.0f")
fac_1:
(1) 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2
fac_2:
(1) 1 1 2 2 3 3 1 1 2 2 3 3 1 1 2 2 3 3 1 1 2 2 3 3
fac_3:
(1) 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 1 1 2 2 2 2 2 2
Cmd> list(fac_1,fac_2,fac_3)
fac_1 REAL 24 FACTOR with 2 levels
fac_2 REAL 24 FACTOR with 3 levels
fac_3 REAL 24 FACTOR with 2 levels
These factors could be used in analysis of two replicates of a 2 by 3 by
2 design in standard order, say by anova("y=fac_1+fac_2+fac_3").
Cmd> print(buildfactor(2,dims,reverse:T),format:"1.0f")
VECTOR:
(1) 1 1 1 1 2 2 2 2 3 3 3 3 1 1 1 1 2 2 2 2 3 3 3 3
Compare this with fac_2 above.
@@@@see_also#Cross references
See also factor(), makefactor(), rep()
@@@@______
====choosedef2()#confounding,design,factorial
%%%%
choosedef2(k,p, all:T or tries:m), k, p, m > 0 integer scalars
%%%%
@@@@usage#Usage
choosedef2(k, p, all:T) searches all potential defining contrasts to
find a set for a 2^k factorial design in 2^p blocks. k must be an
integer between 1 and 32 and p an integer < k.
choosedef2(k,p,tries:m) does the same, except only m random sets of
defining contrasts are searched.
In both cases, the defining contrasts for the minimum aberration design
among those searched is returned.
The return value is
structure(generators:bestgen, aberration:aber, basis:genmat).
bestgen is a CHARACTER vector of length p with the names such as "ABDF"
of the defining contrasts in the set chosen.
aber is a vector of k integers >= 0 with ab[i] containing the number
of contrast with i letters that are confounded with "I".
genmat is a p by k matrix with entries 0 and 1, one row for each element
of bestgen. genmat[i,j] is 1 if and only if factor j is bestgen[i].
For example, if bestgen[i] is "ABDF", genmat[i,] is vector(1,1,0,1,0,1
[,...])'. genmat can be used as an argument to aliases2() to determine
all the aliases of any contrast.
@@@@see_also#Cross references
See also aliases2() choosegen2(), confound2().
@@@@______
====choosegen2()#aliasing,design,factorial
%%%%
choosegen2(k,p, all:T [,res:r]), positive integers k, p, r.
choosegen2(k,p, tries:m [,res:r]), positive integers k, p, m, and r.
%%%%
@@@@usage#Usage
choosegen2(k,p,all:T) , where k, p and r are positive integers, searches
all potential sets of generators for a 2^(k-p) fractional factorial
design. It returns information on a generator with maximum resolution.
choosegen2(k,p, tries:m), where m > 0 is an integer, does the same,
except it searches only m randomly selected sets of generators.
choosegen2(k,p,all:T,res:r) and choosegen2(k,p,tries:m,res:r), where r
is a positive integer, do the same, except they stop the search at the
first design of resolution r. If none is found, they return information
on a set having the highest resolution found.
@@@@value_returned#Value returned
The value returned is
structure(resolution:bestres,generators:bestgen,aberration:aber,\
basis:genmat)
Integer bestres > 0 is the best resolution found.
bestgen is a CHARACTER vector of length p with the names such as "ABDF"
of the defining contrasts in the set chosen.
aber is a vector of k integers >= 0 with ab[i] containing the number
of contrast with i letters that are confounded with "I".
genmat is a p by k matrix with entries 0 and 1, one row for each element
of bestgen. genmat[i,j] is 1 if and only if factor j is bestgen[i].
For example, if bestgen[i] is "ABDF", genmat[i,] is vector(1,1,0,1,0,1
[,...])'. genmat can be used as an argument to aliases2() to determine
all the aliases of any contrast.
@@@@examples
Examples:
Cmd> choosegen2(5,2,all:T) # find best resolution
component: resolution
(1) 3
component: generators
(1) "ABCD"
(2) "BCE"
component: aberration
(1) 0 0 2 1 0
component: basis
(1,1) 1 1 1 1 0
(2,1) 0 1 1 0 1
Cmd> choosegen2(5,2,all:T,res:4) # try to find 4 (we won't)
component: resolution
(1) 3
component: generators
(1) "ABCD"
(2) "BCE"
component: aberration
(1) 0 0 2 1 0
component: basis
(1,1) 1 1 1 1 0
(2,1) 0 1 1 0 1
Cmd> # look for 2^(9-4) resol. 4, just make 1000 tries, since\
there are lots of combinations to explore
Cmd> choosegen2(9,4,tries:1000,res:4)# look for 2^(9-4) resol. 4
component: resolution
(1) 4
component: generators
(1) "CDEF"
(2) "BDEG"
(3) "ABCEH"
(4) "BCDJ"
component: aberration
(1) 0 0 0 7 7
(6) 0 0 0 1
component: basis
(1,1) 0 0 1 1 1
(1,6) 1 0 0 0
(2,1) 0 1 0 1 1
(2,6) 0 1 0 0
(3,1) 1 1 1 0 1
(3,6) 0 0 1 0
(4,1) 0 1 1 1 0
(4,6) 0 0 0 1
@@@@see_also#Cross references
See also aliases2(), choosedef2(), confound2().
@@@@______
====confound2()#confounding,design,factorial
%%%%
confound2(basis), REAL matrix basis containing confounding generators
%%%%
@@@@usage#Usage
confound2(basis) confounds a two series factorial into blocks based on
the generators given in the matrix basis. Results are returned in a
structure with component names 'block1', 'block2', etc. Each component
has a CHARACTER vector of factor/level combinations for that block.
The p x k matrix basis contains the generators for the confounding, one
row for each generator and one column for each factor in the design.
The elements in basis are 0 or 1. A nonzero entry indicates that a
factor is present in the generator for that row.
@@@@examples
Examples:
Cmd> # Matrix d is 2x4 (2 generators so 4 blocks of size 4)
Cmd> print(d,format:"2.0f")
d:
(1,1) 1 1 0 0 [AB is a generator]
(2,1) 0 1 1 1 [BCD is a generator]
Cmd> confound2(d)
component: block1
(1) "(1)"
(2) "abc"
(3) "abd"
(4) "cd"
component: block2
(1) "a"
(2) "bc"
(3) "bd"
(4) "acd"
component: block3
(1) "ab"
(2) "c"
(3) "d"
(4) "abcd"
component: block4
(1) "b"
(2) "ac"
(3) "ad"
(4) "bcd"
@@@@see_also#Cross references
See also aliases2(), choosegen2(), choosedef2().
@@@@______
====confound3()#confounding,design,factorial
%%%%
confound3(basis), REAL matrix basis containing confounding generators
%%%%
@@@@usage#Usage
confound3(basis) confounds a three series factorial into blocks based on
the generators given in the matrix basis. Results are returned in a
structure with component names 'block1', 'block2', etc. Each component
has a CHARACTER vector of factor/level combinations for that block.
The p x k matrix basis contains the generators for the confounding, one
row for each generator and one column for each factor in the design.
The elements in basis are 0, 1, or 2, indicating the exponent of each
factor in the generator.
@@@@example
Example:
Cmd> print(e,format:"2.0f")# Matrix e is 1x2 (one generator in a 3^2)
e:
(1,1) 1 2 [A^1 B^2 is a generator]
Cmd> confound3(e)
component: block1
(1) "00"
(2) "11"
(3) "22"
component: block2
(1) "10"
(2) "21"
(3) "02"
component: block3
(1) "20"
(2) "01"
(3) "12"
@@@@see_also#Cross references
See also aliases3(), confound2()
@@@@______
====design_index*#
%%%%
Type help(design_index) for a list of topics in file "design.hlp".
%%%%
@@@@list_of_topics#List of topics
File design.hlp contains help on the following topics:
aliases2 Macro to determine aliases in fractioned 2 series
aliases3 Macro to aliases in fractioned 3 series
all3anova Macro to fit all hierarchical models in a three factor
ANOVA and sort them by Mallow's Cp statistic
all4anova Macro to fit all hierarchical models in a four factor ANOVA
and sort them by Mallow's Cp statistic
allaliases2 Macro to compute complete aliases structure in fractioned 2
series factorial
boxcoxvec Macro to compute the error SS for several boxcox trans-
formations
buildfactor Macro to construct factor vector for balanced factorial
design in standard order (1st subscript changing fastest,
last changing slowest)
choosedef2 Macro to choose defining contrasts for a blocked 2 series
factorial
choosegen2 Macro to choose generators for a 2 series fractional
factorial
confound2 Macro to confound a 2 series factorial into blocks
confound3 Macro to confound a 3 series factorial into blocks
design_index This list of topics
ems Macro to compute the expected mean squares for the terms in
the ANOVA of a model some of whose terms are random
ffdesign2 Macro to determines the factor/level combinations to use
for fractioned 2 series
interactplot Macro to make interaction plots of marginal means in a
factorial
interblock Macro to recover interblock information in an incomplete
block design
mixed Macro to compute an "ANOVA" table for a model some of whose
factors are random
pairwise Macro to compute paired comparison statistics and generate
an "underline" diagram (previously named pairedcomp)
quadmax Macro to find the location of the maximum of a quadratic
function, with optional linear equality and inequality
constraints on the solution
randsign Macro to carry out a permutation paired t-test
randt Macro to carry out a permutation two-sample t-test using
values from a single vector
randt2 Macro to carry out a permutation two sample t-test
combining values from two fectors (uses randt())
reml Macro to perform restricted maximum likelihood analysis of
a model with fixed and random terms
rscanon Macro to do canonical analysis of 2nd order response
surface design
sidebyside Macro to make a side-by-side plot of effects
varcomp Macro to compute the "ANOVA" estimates of random effects
variances in mixed effects analysis of variance
@@@@______
====ems()#ANOVA,analysis,factorial
%%%%
ems(Model,randomvars[,marg:T] [,restrict:F] [,nonhier:T] [,keep:T]\
[,print:T]), CHARACTER scalar model, CHARACTER vector randomvars
%%%%
@@@@usage#Usage
ems(Model,Randomvars) computes the expected mean squares for the terms
in the ANOVA for the model given in CHARACTER scalar Model. Randomvars
is a CHARACTER vector specifying the names of factors in the model which
are random. Randomvars can also be REAL with integer elements
specifying the index of a factor in the model. If there are no random
factors, Randomvars should be NULL.
In this default use, ems() computes sequential (Type I) sums of squares
for the the restricted (mixed effects add to zero across fixed factors)
model, prints these expected means squares for each term, and returns no
value. Contributions from random terms are shown as multiples of the
variance component (for example, 16V(a.b)); contributions from fixed
terms are shown as a multiple of a quadratic function for the term, for
example, 32Q(c). In a balanced design, the Q() function is the sum of
the squared coefficients divided by degrees of freedom, for example,
sum(c^2)/(K-1). In an unbalanced situation, Q(c) is a more complicated
quantity defined using matrix algebra.
See below for the use of keywords to change the action of ems().
ems() works only for factors -- no variates are allowed in the model.
ems() works for both balanced and unbalanced data.
@@@@details#Details
ems() assumes that if a factor first appears in an interaction, then that
factor is nested in the other terms of the interaction. For example, if
the first appearance of factor c is in the term a.b.c, then c is assumed
nested in the a.b combinations. This nesting is assumed in the
remainder of the model. That is, continuing the example, if there is a
later term c.d, it will be interpreted as a.b.c.d even though a.b.c.d is
not specifically in the model.
When a term contains the first appearance in the model of more than one
factor, ems() assumes that the new factors are merged to make a single
factor, whose number of levels is the product of the numbers of levels
in the factors being merged. For example, if the first appearance of
factors b and c with 5 and 3 levels, respectively is in the term a.b.c,
then b and c together are considered a single factor with 15 levels.
This grouping is assumed in the remainder of the model. That is,
continuing the example, if there is a later term c.d, it will be
interpreted as b.c.d even though b.c.d is not specifically in the model.
This grouped factor is interpreted as random if any of the factors in
the group is random.
ems() uses the "synthesis" method of Hartley, as explained in 10.5.2 of
R. R. Hocking (1985), The Analysis of Linear Models, Brooks/Cole,
Belmont, CA.
@@@@examples1#Examples
The examples below are based on balanced two factor and three factor
models with a total of 64 responses. All factors have 2 levels so two
factor and three factor models have 16 and 8 replications, respectively.
In some examples, one of the responses is set to MISSING to destroy
balance.
A fully nested model, with c fixed and both d and e random.
Cmd> ems("y=c/d/e",vector("d","e"))
EMS(CONSTANT) = V(ERROR1) + 8V(c.d.e) + 16V(c.d) + 64Q(CONSTANT)
EMS(c) = V(ERROR1) + 8V(c.d.e) + 16V(c.d) + 32Q(c)
EMS(c.d) = V(ERROR1) + 8V(c.d.e) + 16V(c.d)
EMS(c.d.e) = V(ERROR1) + 8V(c.d.e)
EMS(ERROR1) = V(ERROR1)
A 3 factor crossed model, with c and d fixed, e random.
Cmd> ems("y=c*d*e",3) # e is factor 3
EMS(CONSTANT) = V(ERROR1) + 32V(e) + 64Q(CONSTANT)
EMS(c) = V(ERROR1) + 16V(c.e) + 32Q(c)
EMS(d) = V(ERROR1) + 16V(d.e) + 32Q(d)
EMS(c.d) = V(ERROR1) + 8V(c.d.e) + 16Q(c.d)
EMS(e) = V(ERROR1) + 32V(e)
EMS(c.e) = V(ERROR1) + 16V(c.e)
EMS(d.e) = V(ERROR1) + 16V(d.e)
EMS(c.d.e) = V(ERROR1) + 8V(c.d.e)
EMS(ERROR1) = V(ERROR1)
A 2 factor crossed model with unbalanced data, c fixed and d random
Cmd> y1 <- y[1]; y[1] <- ? # make data unbalanced
Cmd> ems("y=c*d",2) # d is factor 2
EMS(CONSTANT) = V(ERROR1) + 0.0080645V(c.d) + 31.508V(d) +
0.0079365Q(c) + 63Q(CONSTANT)
EMS(c) = V(ERROR1) + 15.746V(c.d) + 0.0081925V(d) + 31.492Q(c)
EMS(d) = V(ERROR1) + 0.0042316V(c.d) + 31.484V(d)
EMS(c.d) = V(ERROR1) + 15.742V(c.d)
EMS(ERROR1) = V(ERROR1)
@@@@use_of_keywords
Use of Keywords to change the action of ems().
ems(Model,Randomvars,keep:T) suppresses printed output but returns a
structure (described below) containing the results. If you want the
printed output too, use keep:T,print:T.
ems(Model,Randomvars,marg:T) computes expected mean squares based on
adjusted (Type III) sums of squares.
ems(Model,Randomvars,restrict:F) computes expected mean squares assuming
no marginal restrictions on any random effects in the model that have
two or more dimensions. Thus, for example, when the model is
"y=c+D+c.D", where c is fixed and D is random, it is not assumed that
the c.D effects sum to zero for each level of D.
ems(Model,Randomvars,nonhier:T) computes expected mean squares for an
analysis of variance that does not enforce the usual MacAnova hierarchy
assumptions. That is, for example, model "y=a+b+c+a.b.c" does not imply
that the two-way interaction degrees of freedom are part of the "a.b.c"
term. You cannot use anova() to compute such an analysis although it
can be done (if you know how) using swp().
These keywords can be used together. For example, ems(Model,Randomvars,
marg:T,restrict:F) provides answers equivalent to the EMS in SAS PROC
GLM.
@@@@examples2#More examples
These examples still assume y[1] is MISSING.
Cmd> ems("y=c*d",2,marg:T) # crossed with d random
EMS(CONSTANT) = V(ERROR1) + 31.475V(d) + 62.951Q(CONSTANT)
EMS(c) = V(ERROR1) + 15.742V(c.d) + 31.475Q(c)
EMS(d) = V(ERROR1) + 31.475V(d)
EMS(c.d) = V(ERROR1) + 15.742V(c.d)
EMS(ERROR1) = V(ERROR1)
Cmd> ems("y=c*d",2,restrict:F) # crossed with d random
EMS(CONSTANT) = V(ERROR1) + 15.762V(c.d) + 31.508V(d) + 0.0079365Q(c)
+ 63Q(CONSTANT)
EMS(c) = V(ERROR1) + 15.754V(c.d) + 0.0081925V(d) + 31.492Q(c)
EMS(d) = V(ERROR1) + 15.746V(c.d) + 31.484V(d)
EMS(c.d) = V(ERROR1) + 15.738V(c.d)
EMS(ERROR1) = V(ERROR1)
Cmd> ems("y=c*d",2,marg:T,restrict:F) # same as SAS PROC GLM
EMS(CONSTANT) = V(ERROR1) + 15.738V(c.d) + 31.475V(d) + 62.951Q(CONSTANT)
EMS(c) = V(ERROR1) + 15.738V(c.d) + 31.475Q(c)
EMS(d) = V(ERROR1) + 15.738V(c.d) + 31.475V(d)
EMS(c.d) = V(ERROR1) + 15.738V(c.d)
EMS(ERROR1) = V(ERROR1)
Cmd> y[1] <- y1 # restore value for y[1] to regain balance
Cmd> ems("y=c*d+e+c.d.e",3)
EMS(CONSTANT) = V(ERROR1) + 32V(e) + 64Q(CONSTANT)
EMS(c) = V(ERROR1) + 8V(c.d.e) + 32Q(c)
EMS(d) = V(ERROR1) + 8V(c.d.e) + 32Q(d)
EMS(c.d) = V(ERROR1) + 8V(c.d.e) + 16Q(c.d)
EMS(e) = V(ERROR1) + 32V(e)
EMS(c.d.e) = V(ERROR1) + 8V(c.d.e)
EMS(ERROR1) = V(ERROR1)
Cmd> ems("y=c*d+e+c.d.e",3,nonhier:T)
EMS(CONSTANT) = V(ERROR1) + 32V(e) + 64Q(CONSTANT)
EMS(c) = V(ERROR1) + 32Q(c)
EMS(d) = V(ERROR1) + 32Q(d)
EMS(c.d) = V(ERROR1) + 8V(c.d.e) + 16Q(c.d)
EMS(e) = V(ERROR1) + 32V(e)
EMS(c.d.e) = V(ERROR1) + 8V(c.d.e)
EMS(ERROR1) = V(ERROR1)
Note that 'nonhier:T' makes the c.d.e term disappear from the EMS for
the fixed terms; compare with the default hierarchical model that
precedes it.
@@@@keep_keyword
Structure returned with keep:T
When 'keep:T' is an argument, the structure returned has components
'df', 'ss', 'termnames', 'coefs', and 'rterms'.
Component Description
df REAL Vector of degrees of freedom for all terms in model
ss REAL Vector of sums of squares for all terms in model
termnames CHARACTER vector of labels for each term
coefs REAL matrix with coefs[i,j] the coefficient for term j in
the EMS of term i
rterms LOGICAL vector with T indicating that a term is random.
Components ss and df are just those computed from a MacAnova anova()
command (possibly with marg:T as needed), and may not be in conformance
with the model as used by ems() for the following reasons:
1. anova() computes only hierarchical models, while you may specify
nonhierarchical models in ems() by using nonhier:T.
2. ems() enforces nesting and grouping. If b first appears in a.b then b
is nested in a and any appearance of b in a later term implies the
presence of a. anova() does no such enforcing. For example, in
"y=a+a.b+c+b.c", b.c would be interpreted by ems() as a.b.c while
anova() would not include a.b.c in the model. If b and c first
appear together, then "y=b.c+d+c.d" is interpreted in ems() as
"y=b.c+d+b.c.d".
@@@@examples3#Examples
These examples assume a 2^2 design in 16 replicates:
Cmd> ems("y=c*d",2)
EMS(CONSTANT) = V(ERROR1) + 32V(d) + 64Q(CONSTANT)
EMS(c) = V(ERROR1) + 16V(c.d) + 32Q(c)
EMS(d) = V(ERROR1) + 32V(d)
EMS(c.d) = V(ERROR1) + 16V(c.d)
EMS(ERROR1) = V(ERROR1)
Cmd> ems("y=c*d",2,keep:T)
component: df
(1) 1 1 1 1 60
component: ss
(1) 0.76155 0.036871 0.31116 1.623 56.318
component: termnames
(1) "CONSTANT"
(2) "c"
(3) "d"
(4) "c.d"
(5) "ERROR1"
component: coefs
(1,1) 64 0 32 0 1
(2,1) 0 32 0 16 1
(3,1) 0 0 32 0 1
(4,1) 0 0 0 16 1
(5,1) 0 0 0 0 1
component: rterms
(1) F F T T
@@@@______
====ffdesign2()#design,aliasing,factorial
%%%%
ffdesign2(basis), integer matrix basis containing confounding generators
%%%%
@@@@usage#Usage
ffdesign2(basis) finds the set of factor/level combinations used in the
2^(k-p) fractional factorial corresponding to the given generators. The
result is a CHARACTER vector giving the factor/level combinations.
The p x k matrix basis contains the generators for the aliasing, one row
for each generator and one column for each factor in the design. The
elements in basis are 0, -1, or 1. A nonzero entry indicates that a
factor is present in the generator for that row. The sign of a generator
is the product of the signs of the nonzero elements of the generator.
For example, 1 0 1 0 0 -1 means -ACF is a generator (alias of I).
@@@@examples
Examples:
Cmd> print(b, format:"2.0f") # Matrix b is 2x5, so 2^(5-2) design
b:
(1,1) 1 1 1 0 0 [ABC is a generator]
(2,1) 0 0 1 1 -1 [-CDE is a generator]
Cmd> ffdesign2(b)
(1) "ce"
(2) "a"
(3) "b"
(4) "abce"
(5) "dc"
(6) "ade"
(7) "bde"
(8) "abdc"
@@@@______
====interactplot()#factorial,plots
%%%%
interactplot(y,a [,b,c ...] [graphics keywords]), y a REAL vector, a, b,
c, ... vectors of positive integers the same length as y.
interactplot(means [graphics keywords]), means a REAL matrix or array.
%%%%
@@@@usage#Usage_with_data
interactplot(y,a [,b,c ...] [graphics keywords]) makes an interaction
plot of the marginal means of REAL vector y for all combinations of
variables a, b, c, .... The variables a, b, ... must be vectors of
positive integers the same length as y.
The levels of variable a will be put on the horizontal axis and separate
lines drawn for each combination of variables values of b, c, .... When
a is the only factor argument, only one line is drawn. Lines are
numbered 1, 2, 3, ... with the levels of b varying fastest, c varying
more slowly, etc.
@@@@usage#Usage_with_means
interactplot(means [graphics keywords]) where means is a REAL matrix or
array will make an interaction plot with the first dimension of means on
the horizontal axis, and separate lines for each combination of the
other dimensions. The lines are numbered 1, 2, 3, ... with the last
dimension varying slowest. interactplot(tabs(y,a,b,means:T)) makes the
same plot as interactplot(y,a,b).
@@@@graphics_keywords#Graphics keywords
In both usages, any graphics keywords will be passed to function
chplot() which actually makes the plot. In particular, for example,
Cmd> interactplot(y,a,b,symbols:vector("B1","B2","B3"),\
xticklabs:vector("A1","A2","A3","A4"))
will label the curves B1, B2 and B3 instead of 1, 2 and 3 and the
horizontal axis location A1, A2, A3 and A4.
@@@@______
====interblock()#analysis,anova
%%%%
interblock(y,block,trt,[contrast:coefs])
%%%%
@@@@usage#Usage
interblock(y,block,trt,[contrast:coefs]) does recovery of interblock
information in an incomplete block design, where y is the vector of
responses, block is the factor of block levels, trt is the factor of
treatment levels, and coefs is an optional vector of contrast
coefficients (it must be the same length as the number of levels of
trt). If no contrast is specified, the output is the intra-block,
inter-block, and combined estimates of treatment effects and their
standard errors. If a contrast is specified, the output is the
intra-block, inter-block, and combined estimates of the contrast with
their standard errors.
@@@@example
Example:
Cmd> y<-vector(19,17,11,6,26,23,21,19,28,20,7,20,17,26,19,15,23,31,\
20,26,31,16,23,21,13,7,20,20,24,19,17,6,29,14,24,21)
Cmd> session<-factor(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,\
8,9,9,9,10,10,10,11,11,11,12,12,12)
Cmd> trt<-factor(1,2,3,4,5,6,7,8,9,1,4,7,2,5,8,3,6,9,1,5,9,2,6,7,3,4,\
8,1,6,8,2,4,9,3,5,7)
Cmd> interblock(y,session,trt)
intra est intra se inter est inter se combined est combined se
0.33333 0.49414 0.33333 0.91174 0.33333 0.43443
-2.2222 0.49414 -4 0.91174 -2.6259 0.43443
-6.2222 0.49414 -6 0.91174 -6.1718 0.43443
-12.889 0.49414 -13 0.91174 -12.914 0.43443
5.8889 0.49414 6.6667 0.91174 6.0655 0.43443
3.5556 0.49414 4.6667 0.91174 3.8078 0.43443
1.6667 0.49414 0.33333 0.91174 1.3639 0.43443
-0.22222 0.49414 8.8818e-16 0.91174 -0.17177 0.43443
10.111 0.49414 11 0.91174 10.313 0.43443
Cmd> cfs<-vector(.25,.25,.25,.25,-.25,-.25,-.25,-.25,0)
Cmd> interblock(y,session,trt,contrast:cfs)
estimate se
intra -7.9722 0.3706
inter -8.5833 0.68381
combined -8.111 0.32583
@@@@______
====mixed()#anova,analysis,random effects,factorial
%%%%
mixed(Model,randomvars[,marg:T] [,restrict:F] [,nonhier:T] [,useneg:T]\
[,keepmixed:T]), CHARACTER scalar model, CHARACTER vector randomvars
mixed(emsResult [,useneg:T] [,keepmixed:T]), emsResult a structure
returned by ems() with keep:T
%%%%
@@@@usage#Usage
mixed(Model, randomvars) computes and prints an "ANOVA" table
appropriate for the model and random factors given in CHARACTER scalar
Model and CHARACTER vector randomvars. These arguments are exactly the
same as for ems(). You can also use ems() keywords 'marg', 'restrict'
and 'nonhier'.
mixed(emsResult), where emsResult has been computed by emsResult <-
ems(Model, randomvars [,...],keep:T), does the same.
mixed(Model, randvars [,...], keepmixed:T) returns the table as an
matrix with appropriately labelled rows and columns but does not print
it.
The anova table has one row for each term in the model and the following
seven columns.
Col. 1 Label for term
Col. 2 DF = degrees of freedom for term
Col. 3 MS = mean square for term (numerator of F)
Col. 4 Error DF = degrees of freedom for appropriate error term
Col. 5 Error MS = mean square for appropriate error term
(denominator of F)
Col. 6 F = F-statistic = MS/(Error MS)
Col. 7 P value = tail probability for F test
With keepmixed:T, the matrix returned consists of columns 2 through 7 of
the table, with column 1 used to label rows.
@@@@MS_description#Description of numerator and denominator MS
Numerator and denominator MS's are linear combinations of mean squares
whose expectations differ only by a multiple of the variance component
associated with the line. When the numerator or denominator is not a
simple ANOVA mean square, its degrees of freedom are found using the
Satterthwaite approximation.
By default, only linear combinations of mean squares with positive
coefficients are used. This means that the numerator for a term may be
the sum of the mean square for the term and one or more mean squares
from other terms. If the keyword useneg:T is used, then the numerator
for a term will be the mean square for that term, and denominators may
contain differences as well as sums of mean squares.
@@@@example
Example:
Three populations, all crosses between 4 males and 4 females in each
population with six offspring from each mating randomly assigned to
three environments. Male and female are random. First the simple
ANOVA.
Cmd> anova("y=(pop+m.pop+f.pop+m.f.pop)*env")
Model used is y=(pop+m.pop+f.pop+m.f.pop)*env
DF SS MS
CONSTANT 1 5.4299 5.4299
pop 2 2091.4 1045.7
pop.m 9 112.5 12.5
pop.f 9 370.02 41.113
pop.m.f 27 56.774 2.1027
env 2 206.15 103.08
pop.env 4 0.16527 0.041316
pop.m.env 18 3.4185 0.18992
pop.f.env 18 8.2354 0.45752
pop.m.f.env 54 17.117 0.31698
ERROR1 144 30.448 0.21144
Now compute the expected mean squares, and keep the ems() output.
Cmd> emsstuff<-ems("y=(pop+m.pop+f.pop+m.f.pop)*env",vector("m","f"),
keep:T,print:T)
EMS(CONSTANT) = V(ERROR1) + 6V(pop.m.f) + 24V(pop.f) + 24V(pop.m) +
288Q(CONSTANT)
EMS(pop) = V(ERROR1) + 6V(pop.m.f) + 24V(pop.f) + 24V(pop.m) + 96Q(pop)
EMS(pop.m) = V(ERROR1) + 6V(pop.m.f) + 24V(pop.m)
EMS(pop.f) = V(ERROR1) + 6V(pop.m.f) + 24V(pop.f)
EMS(pop.m.f) = V(ERROR1) + 6V(pop.m.f)
EMS(env) = V(ERROR1) + 2V(pop.m.f.env) + 8V(pop.f.env) +
8V(pop.m.env) + 96Q(env)
EMS(pop.env) = V(ERROR1) + 2V(pop.m.f.env) + 8V(pop.f.env) +
8V(pop.m.env) + 32Q(pop.env)
EMS(pop.m.env) = V(ERROR1) + 2V(pop.m.f.env) + 8V(pop.m.env)
EMS(pop.f.env) = V(ERROR1) + 2V(pop.m.f.env) + 8V(pop.f.env)
EMS(pop.m.f.env) = V(ERROR1) + 2V(pop.m.f.env)
EMS(ERROR1) = V(ERROR1)
Now use mixed().
Cmd> mixed(emsstuff)
DF MS Error DF Error MS F P value
CONSTANT 1.914 7.533 14.01 53.61 0.1405 0.8617
pop 2.008 1048 14.01 53.61 19.54 8.745e-05
pop.m 9 12.5 27 2.103 5.945 0.0001412
pop.f 9 41.11 27 2.103 19.55 1.242e-09
pop.m.f 27 2.103 144 0.2114 9.945 0
env 2.012 103.4 30.75 0.6474 159.7 0
pop.env 56.12 0.3583 30.75 0.6474 0.5534 0.9729
pop.m.env 18 0.1899 54 0.317 0.5991 0.8844
pop.f.env 18 0.4575 54 0.317 1.443 0.1496
pop.m.f.env 54 0.317 144 0.2114 1.499 0.03044
ERROR1 144 0.2114 0 0 MISSING MISSING
The test for environment should be
(MS(env)+MS(m.f.env))/(MS(m.env)+MS(f.env)) = (103.08+.32)/(.190+.458) =
(103.4)/(.6474) = 159.7 as reported in the table.
@@@@see_also#Cross references
See also ems(), reml().
@@@@______
====pairwise()#ANOVA,analysis
%%%%
pairwise(factorname,lev [,method:T] [,error:term]), CHARACTER scalar
factorname, positive REAL scalar lev < 1, positive integer or
CHARACTER scalar term, keyword phrase method:T one of 'lsd:T',
'bsd:T', 'snk:T', 'hsd:T', 'regwb:T', or 'regwr:T'
pairwise(factorname,critval:val), positive REAL scalar val
%%%%
@@@@usage#Usage
pairwise(factorname,siglevel) prints a summary of all paired comparisons
between the levels of the factor given in factorname at the level of
significance siglevel. Comparisons are done using the Bonferroni
method. factorname must be a CHARACTER scalar or quoted string naming a
factor in the current GLM model and siglevel must be a REAL scalar
between 0 and 1. It is an error if there is no current GLM model or if
the current GLM model does not contain the named factor.
@@@@methods#Methods
pairwise(factorname,siglevel,METHOD:T), where METHOD:T is one of
'bsd:T', 'lsd:T', 'snk:T', 'hsd:T', 'regwb:T', or 'regwr:T', does the
same, except METHOD specifies the multiple comparison method to be used.
METHOD Description
bsd Bonferroni method (the default)
lsd Least significant difference method
hsd Tukey's honestly significant difference or Studentized
range method
snk Student-Newman-Keuls method
regwb Step down Bonferroni using REGW tail probabilities
regwrs Step down Studentized range using REGW tail probabilities.
The REGW tail probabilities were proposed in papers by Ryan, Einot and
Gabriel, and Welsch.
@@@@critval_keyword#Keyword critval
pairwise(factorname,critval:val) does the same, except it uses val as
the critical value for a t-test between the levels of factorname rather
than a computed cutoff.
@@@@examples
Examples:
After anova("y=trt"),
Cmd> pairwise("trt",.01,hsd:T)
does paired comparisons between the levels of trt at signficance .01
using the HSD method.
Cmd> pairwise("trt",\
critval:invstudrng("trt",1 - .01, max(trt), DF[3])/sqrt(2))
does the same, directly computing the HSD critical value. See
invstudrng().
pairwise() prints only a summary of the results and returns no value. The
printed output consists of one row for each level of the term, sorted
from smallest to largest effect, giving the "underlines" identifying
effects that are not significantly different, level number, and effect.
@@@@error_mean_square#Error mean square
By default, the error mean square used in the comparison tests is taken
from the last error term of the current model (the last line of the
ANOVA table). You may specify a different error term with keyword
phrase error:term. term must be a CHARACTER scalar or positive integer
which specifies the name or number of the line in the ANOVA table to be
used as the error mean square. Examples are 'error:4' (use line 4 as
error term) or 'error:"a.b.c"' (use the ABC interaction as error term).
@@@@use_of_contrast#Use of contrast()
The contrast() command is used to make each comparison. In particular
this implies that the comparisons are adjusted for any other terms in
the model and that there should be no missing degrees of freedom in the
factor.
@@@@examples
Examples:
Cmd> anova("y=a")
Model used is y=a
DF SS MS
CONSTANT 1 15.082 15.082
a 4 67.535 16.884
ERROR1 15 20.132 1.3421
Cmd> pairwise("a",.05,hsd:T) #hsd method
| 5 -1.86
| | 2 -1.18
| | 4 -1.15
| | 3 1.13
| 1 3.07
Cmd> pairwise("a",.05,lsd:T) #lsd w/ alpha=.05
| 5 -1.86
| 2 -1.18
| 4 -1.15
3 1.13
1 3.07
Cmd> pairwise("a",critval:2.13) #lsd w/ alpha=.05 a different way
| 5 -1.86
| 2 -1.18
| 4 -1.15
3 1.13
1 3.07
====quadmax()#analysis
%%%%
quadmax(A,b [,eq:eqmat] [,gte:gtemat] [,ckbounds:F]), square REAL matrix
A, REAL vector b with nrows(b) = nrows(A), REAL matrices eqmat and
gtemat with ncols(eqmat) = ncols(gtemat) = nrows(A) + 1,
%%%%
@@@@usage#Usage
quadmax(A,b) finds the x that maximizes x'Ax + b'x; if the problem is
unbounded then quadmax() returns an error. A is a p by p REAL matrix and
b is a p by 1 REAL vector.
quadmax(A,b,eq:eqmat) does the same, except that eqmat is a q by p+1
REAL matrix that specifies q linear constraints on z. eqmat is the
partitioned matrix [Q y], where Q is q by p and y is q by 1 and the
solution is constrained to satisfy Qx = y. If the constrained problem
is unbounded, then quadmax() returns an error. If the constraints cannot
be met, then quadmax() returns NULL.
quadmax(A,b,gte:gtemat) does the same except that gtemat is a g by p+1
REAL matrix specifying linear inequality constraints on the solution.
gtemat is the partitioned matrix [G z], where G is g by p and z is g by
1 and the solution is constrained to satisfy Gx >= z elementwise. If
the constrained problem is unbounded, then quadmax() returns an error. If
constraints cannot be met, then quadmax() returns NULL.
You can use both eq:eqmat and gte:gtemat to specify both equality and
inequality constraints.
@@@@example#Example
Suppose in a three variable mixture problem you have the equality
constraint that the sum of the x's is 1 and each element of x is at
least .05. Then you can use quadmax(A,b,eq:eqmat,gte:gtemat)
[ 1 0 0 .05 ]
where eqmat = [1 1 1 1] and gtemat = [ 0 1 0 .05 ]
[ 0 0 1 .05 ]
@@@@ckbounds_keyword#Keyword ckbounds
quadmax(A,b [,eq:eqmat] [,gte:gtemat], ckbounds:F) does the same but
there is no test for the solution being unbounded. This can
substantially decrease the computational time when you know the problem
is bounded. This is the case, for example, in a problem where the range
of the x's is totally bounded by inequality constraints, or in a problem
where x'Ax is known to have a unique maximum (A has all negative
eigenvalues). However, ckbounds:F should not be used when you do not
know there is a bounded solution.
@@@@algorithm#Algorithm
The algorithm used by quadmax() can be described as intelligent brute
force and will probably be overwhelmed by too many constraints. When
you are sure the problem has a bounded solution, be sure to use keyword
phrase ckbounds:F.
@@@@______
====randsign()#permutation test,analysis
%%%%
randsign(diffs [,trials:n]), REAL vector diffs, positive integer n
%%%%
@@@@usage#Usage
randsign(diffs) computes sum(s_i * diffs_i) for all 2^length(diffs)
possible combinations of signs s_i and returns these as a REAL vector.
randsign(diffs,trials:n) samples from the distribution of sum(s_i *
diffs_i) based on n sets of random signs. This is appropriate when
diffs is long, as 2^length(diffs) grows quickly!!
You can use the results of randsign to compute p-values for the
randomization equivalent of the paired t-test by finding the fraction of
the total differences based on random signs as extreme or more extreme
than the observed total difference for the two groups.
@@@@examples
Examples:
Cmd> x1 <- vector(1,4,2,4,6,3,7) # data set 1
Cmd> x2 <- vector(3,5,3,6,2,9,8) # data set 2
Cmd> diffs <- x2-x1
Cmd> diffs # the differences
(1) 2 1 1 2 -4
(6) 6 1
Cmd> sum(diffs) # observed total difference
(1) 9
Cmd> out <- randsign(diffs) # all differences with random signs
Cmd> stemleaf(out)
1 -1s|7
4 -1f|555
9 -1t|33333
16 -1*|1111111
24 -0.|99999999
32 -0s|77777777
41 -0f|555555555
52 -0t|33333333333
64 -0*|111111111111
64 +0*|111111111111
52 +0t|33333333333
41 +0f|555555555
32 +0s|77777777
24 +0.|99999999
16 1*|1111111
9 1t|33333
4 1f|555
1 1s|7
1*|1 represents 11 Leaf digit unit = 1
Cmd> sum(out >= 9)/128 # one sided randomization p-value
(1) 0.1875
@@@@see_also#Cross references
See also randt(), randt2().
@@@@______
====randt2()#permutation test,analysis
%%%%
randt2(y1, y2 [,trials:n]), REAL vectors y1 and y2, positive integer n
%%%%
@@@@usage#Usage
randt2(y1, y2) computes all possible values of xbar_1 - xbar_2 where
xbar_1 is the mean of a subset of size M = length(y1) from vector(y1,y2)
and xbar_2 is the mean of the length(y2) remaining elements. y1 and y2
must be REAL vectors. If there are MISSING values in y1 or y2, they are
immediately deleted and a warning message printed.
The returned value is a vector containing all the differences.
Thus randt2() produces the permutation distribution for a test of equality
of means based on the differences of the means of y1 and y2.
randt2(x1, x2, trials:N) does the same except it randomly samples N
values from the permutation distribution of ybar_1 - ybar_2.
@@@@computing_pvalues#Computing P-values
You can use the results of randt2 to compute p-values for the
randomization equivalent of the two sample t-test by finding the
fraction of the differences as extreme or more extreme than the observed
difference for the two groups. You may need to be a little careful in
making the comparison because there may not be an exact match of the
observed difference of means and the values returned by randt2 because
of minor differences in rounding .
@@@@examples
Examples:
Cmd> y1 <- vector(6.5,6,7.1,7.1,3.5,6.1) # group 1 data
Cmd> y2 <- vector(3.9,4.9,2.1,7.7,4.9) # group 2 data
Cmd> d <- sum(y1)/6 - sum(y2)/5; d # difference of means
(1) 1.35
Cmd> out <- randt2(y1,y2)
Cmd> length(out) # how many are there?
(1) 462
Cmd> sum(round(out - d,12) >= 0)/462 #one-sided permutation P-value
(1) 0.20996
@@@@see_also#Cross references
See also randsign(), randt().
@@@@______
====randt()#permutation test,analysis
%%%%
randt(dvec, m [,trials:n]), REAL vector dvec, positive integer n
%%%%
@@@@usage#Usage
randt(dvec,m) computes xbar_1 - xbar_2 for all combinations with m data
values from dvec in group 1 and the remainder in group 2. The returned
value is a vector containing all the differences. dvec must be a REAL
vector with no MISSING values and m > 0 an integer with m < length(dvec).
randt(dvec,m,trials:n) samples from the distribution of xbar_1 - xbar_2
using n independent sets of random assignments to the groups.
@@@@computing_pvalues#Computing P-values
You can use the results of randt() to compute p-values for the
randomization equivalent of the two sample t-test by finding the
fraction of the differences as extreme or more extreme than the observed
difference for the two groups. You may need to be a little careful in
making the comparison because there may not be an exact match of the
observed difference of means and the values returned by randt() because of
minor differences in rounding .
@@@@examples
Examples:
Cmd> y1 <- vector(6.5,6,7.1,7.1,3.5,6.1) # group 1 data
Cmd> y2 <- vector(3.9,4.9,2.1,7.7,4.9) # group 2 data
Cmd> d <- sum(y1)/6 - sum(y2)/5; d # difference of means
(1) 1.35
Cmd> out <- randt2(vector(y1,y2), length(y1))
Cmd> length(out) # how many are there?
(1) 462
Cmd> sum(round(out - d,12) >= 0)/462 #one-sided permutation P-value
(1) 0.20996
Cmd> stemleaf(out) # how do they look
3 -2.|665
12 -2*|222200000
37 -1.|8888888887766666666665555
83 -1*|4444444433333322222222222222111111110000000000
151 -0.|9999999999888888888888888887777777777777777766666666666555*
( 81) -0*|4444444444444444444333333333333333333322222222222221111111*
230 +0*|0000000000001111111111111111111112222222222222222223333333*
154 +0.|5555555555555555566666666666666666677777777778888888889999*
86 1*|00000000000011111111111112222222333333333444444444
36 1.|5555556666667778889999
14 2*|0000001334444
1 2.|8
@@@@see_also#Cross references
See also randsign(), randt2().
@@@@______
====reml()#analysis
%%%%
reml(Model,Randomvars[,restrict:F,nonhier:T,marg:T,maxiter:k,
usemle:T,tolerance:x])
%%%%
@@@@usage#Usage
reml(Model,Randomvars) performs a restricted maximum likelihood analysis
for the model given in CHARACTER scalar Model. Randomvars
is a CHARACTER vector specifying the names of factors in the model which
are random. Randomvars can also be REAL with integer elements
specifying the index of a factor in the model. If there are no random
factors, Randomvars should be NULL.
The return value of reml() is a structure with the following components:
theta: estimates of the fixed effects
phi: estimates of the variance components
thetavar: variance matrix of the fixed effects
phivar: variance matrix of the variance components
phidf: equivalent degrees of freedom for the variance components
L: REML log likelihood
Any variates in the model must be fixed effects.
@@@@assumptions#Assumptions
reml() assumes that if a factor first appears in an interaction, then
that factor is nested in the other terms of the interaction. For
example, if the first appearance of factor c is in the term a.b.c, then
c is assumed nested in the a.b combinations. This nesting is assumed in
the remainder of the model. That is, continuing the example, if there
is a later term c.d, it will be interpreted as a.b.c.d even though
a.b.c.d is not specifically in the model.
reml() works for both balanced and unbalanced data.
@@@@restrict_and_nonhier_keywords#Keywords restrict and nonhier
reml(Model,Randomvars,restrict:F) performs the REML analysis assuming
no marginal restrictions on the random effects in the model.
reml(Model,Randomvars,nonhier:T) performs the REML analysis for an
analysis of variance that does not enforce the usual MacAnova hierarchy
assumptions.
That is, for example, model "y=a+b+c+a.b.c" does not imply
that the two-way interaction degrees of freedom are part of the "a.b.c"
term. You cannot use anova() to compute such an analysis although it
can be done (if you know how) using swp().
@@@@usemle_tolerance_maxiter_keywords#Keywords usemle, tolerance and maxiter
reml(Model,Randomvars,usemle:T) performs a maximum likelihood analysis
instead of the REML analysis.
reml(Model,Randomvars,tolerance:value) uses value as a tolerance for
determining singularity and convergence (default is 1e-10).
reml(Model,Randomvars,maxiter:value) uses value as the maximum number of
iterations in the fitting process (default is 60).
@@@@see_also#Cross references
See also mixed().
@@@@______
====rscanon()#analysis
%%%%
rscanon(y,x1,x2,...,xk [,block:var1,block:var2,...]), REAL vectors y,
x1, ...,xk, factors var1, var2, ...; all should have the same number
of rows.
%%%%
@@@@usage#Usage
rscanon(y,x1,x2,...,xk) performs the canonical analysis for the
quadratic response surface model with response y and predictors x1,
... , xk. y and x1 through xk must be REAL vectors of the same length.
The result is a structure with components 'b0', 'b', 'B', 'x0', 'y0',
'H', and 'lambda', giving the intercept, linear coefficients, the
quadratic/ cross product coefficient matrix, the stationary point, the
predicted response at the stationary point, the matrix of canonical
directions, and the eigenvalues, respectively.
If the design was blocked, you can specify the blocking factors using
one or more keyword phrases of the form 'block:var', where var is a
factor. For example, if the design was blocked by factors date and
analyst, you might use
Cmd> rscanon(yield,time,temperature,block:date,block:analyst).
The output is the same as before, but is block adjusted.
@@@@examples
Examples:
Cmd> #data from example 16-2 of Montgomery
Cmd> x1 <- vector(-1,-1,1,1,0,0,0,0,0,1.414,-1.414,0,0)
Cmd> x2 <- vector(-1,1,-1,1,0,0,0,0,0,0,0,1.414,-1.414)
Cmd> y <- vector(76.5,77.0,78.0,79.5,79.9,80.3,80.0,79.7,\
79.8,78.4,75.6,78.5,77.0)
Cmd> rscanon(y,x1,x2)
component: b0
(1) 79.94
component: b
(1) 0.99505 0.5152
component: B
(1,1) -1.3764 0.125
(2,1) 0.125 -1.0013
component: x0
(1) 0.38923 0.30585
component: y0
(1) 80.212
component: H
(1,1) 0.28972 0.95711
(2,1) 0.95711 -0.28972
component: lambda
(1) -0.9635 -1.4143
@@@@______
====sidebyside()#plots
%%%%
sidebyside([termlaby:y,labels:c,rescale:tf,showconst:tf,boxcut:int])
%%%%
@@@@usage#Usage
sidebyside() produces a side-by-side plot of the effects and residuals
of the current model; there must be an active model. A side-by-side
plot plots the effects for each term against the term number, showing
the relative sizes of the effects. When there are many effects or
residuals, a boxplot is made instead of showing individual effects.
There are no required arguments, but the following arguments alter
the plot; in addition, any graphics arguments are passed through.
@@@@keywords#Keywords
Optional keyword phrase arguments are
termlaby:real Specify y-value for term labels. This can be a
single value or a vector of length equal to number
of terms.
labels:charvector Specify your own term labels.
rescale:logic Should effects be divided by their standard errors.
Default is F. This uses the standard errors as
reported by secoefs() and may not be correct for
all mixed models (secoefs() depends on terms labeled
ERRORX). Residuals are divided by root MSE.
showconst:logic Should the coefficient of the CONSTANT be shown?
Default is F.
boxcut:integer Cutoff for using a boxplot for a term instead of
plotting individual effects. Default is 20.
@@@@example
Example:
Cmd> y <- vector(9,13,12,43,48,57,60,65,70,77,70,91,\
15,13,20,66,58,73,75,78,90,97,108,99)
Cmd> acid<-factor(rep(run(2),rep(12,2)))
Cmd> style<-factor(rep(rep(run(4),rep(3,4)),2))
Cmd> anova("y=acid*style",silent:T)
Cmd> sidebyside() #default plot
Cmd> # specify term locations and labels
Cmd> sidebyside(termlaby:vector(-50,-40,-30,-20),\
labels:vector("a","b","c","d"),boxcut:30)
Cmd> # show the constant and rescale
Cmd> sidebyside(dumb:T,showconst:T,rescale:T)
@@@@______
====typeIIIss()#
%%%%
typeIIIss(model), obsolete, use anova(model,marginal:T)
%%%%
@@@@obsolete_macro
typeIIIss(model) is obsolete, use anova(model,marginal:T) instead.
@@@@______
====varcomp()#anova,analysis,random effects,factorial
%%%%
varcomp(model,randomvars [,marg:T] [,restrict:F] [,nonhier:T]),
CHARACTER scalar model, CHARACTER vector randomvars
varcomp(emsResult), emsResult a structure returned by ems() with keep:T.
%%%%
@@@@usage#Usage
varcomp(model, randomvars) computes the "ANOVA" estimates of the
variances of random effects in mixed effects analysis of variance as
well as estimates of their standard errors. model is a CHARACTER scalar
or a quoted string specifying an ANOVA model and randomvars is a
CHARACTER vector of the names of some or all of the factors in the
model. You can also use keyword phrases marg:T, restrict:F and
nonhier:T exactly as for macro ems().
varcomp(emsResult), where emsResult is computed as ems(model,
randomvars, keep:T) does the same.
The estimates are linear combinations of mean squares for random
effects. They are unbiased but may be negative.
@@@@value
The value of varcomp() is a matrix with one row for each random term and
two columns giving the estimated variance component and its standard
error.
@@@@marg_keyword#Keyword marg
varcomp(model, randomvars, marg:T) and varcomp(emsResult, marg:T) do the
same but use linear combinations of "marginal" EMS, that is the EMS for
each term is computed after adjusting for all other terms in the model.
varcomp() assumes that the EMS for random terms have no contributions from
fixed factors. This is true for balanced data and may be guaranteed in
general by using marg:T.
@@@@example
Example:
Three populations, all crosses between 4 males and 4 females in each
population with six offspring from each mating randomly assigned to
three environments. Male and female are random. First the simple
ANOVA.
Cmd> anova("y=(pop+m.pop+f.pop+m.f.pop)*env")
Model used is y=(pop+m.pop+f.pop+m.f.pop)*env
DF SS MS
CONSTANT 1 5.4299 5.4299
pop 2 2091.4 1045.7
pop.m 9 112.5 12.5
pop.f 9 370.02 41.113
pop.m.f 27 56.774 2.1027
env 2 206.15 103.08
pop.env 4 0.16527 0.041316
pop.m.env 18 3.4185 0.18992
pop.f.env 18 8.2354 0.45752
pop.m.f.env 54 17.117 0.31698
ERROR1 144 30.448 0.21144
Now compute the expected mean squares, and keep the ems() output.
Cmd> emsstuff<-ems("y=(pop+m.pop+f.pop+m.f.pop)*env",vector("m","f"),
keep:T,print:T)
EMS(CONSTANT) = V(ERROR1) + 6V(pop.m.f) + 24V(pop.f) + 24V(pop.m) +
288Q(CONSTANT)
EMS(pop) = V(ERROR1) + 6V(pop.m.f) + 24V(pop.f) + 24V(pop.m) + 96Q(pop)
EMS(pop.m) = V(ERROR1) + 6V(pop.m.f) + 24V(pop.m)
EMS(pop.f) = V(ERROR1) + 6V(pop.m.f) + 24V(pop.f)
EMS(pop.m.f) = V(ERROR1) + 6V(pop.m.f)
EMS(env) = V(ERROR1) + 2V(pop.m.f.env) + 8V(pop.f.env) +
8V(pop.m.env) + 96Q(env)
EMS(pop.env) = V(ERROR1) + 2V(pop.m.f.env) + 8V(pop.f.env) +
8V(pop.m.env) + 32Q(pop.env)
EMS(pop.m.env) = V(ERROR1) + 2V(pop.m.f.env) + 8V(pop.m.env)
EMS(pop.f.env) = V(ERROR1) + 2V(pop.m.f.env) + 8V(pop.f.env)
EMS(pop.m.f.env) = V(ERROR1) + 2V(pop.m.f.env)
EMS(ERROR1) = V(ERROR1)
From the EMS, we see than (MS(pop.m.f.env)-MS(ERROR1))/2 is an unbiased
estimate of V(m.f.env); here, we have (.31698 - .21144)/2 = .05277.
Similarly, (MS(pop.f.env)-MS(pop.m.f.env))/8 is an unbiased estimate of
V(f.env); here we have (.45752 - .31698)/8 = .01757. varcomp()
automates these calculations, as well as providing the standard error.
Cmd> varcomp(emsstuff)
Estimate SE
pop.m 0.43323 0.24668
pop.f 1.6254 0.80788
pop.m.f 0.31521 0.095472
pop.m.env -0.015883 0.010989
pop.f.env 0.017568 0.020532
pop.m.f.env 0.052766 0.032948
ERROR1 0.21144 0.024919
Note that variance component estimates can be negative; varcomp() does
not truncate the estimates at 0. We would get the same output from the
following command.
Cmd> varcomp("y=(pop+m.pop+f.pop+m.f.pop)*env",vector("m","f"))
@@@@see_also#Cross references
See also ems(), mixed().
@@@@______
_E_O_F_#This should be the last line, an internal End Of File marker