95JCGS03\P0155-----------------------------------------------------------
Grand Tour and Projection Pursuit
Dianne Cook, Andreas Buja, Javier Cabrera, & Catherine Hurley
The grand tour and projection pursuit are two methods for exploring
multivariate data. We show how to combine them into a dynamic
graphical tool for exploratory data analysis, called a projection
pursuit guided tour. This tool assists in clustering data when
clusters are oddly shaped and in finding general low-dimensional
structure in high-dimensional and, in particular, sparse data. An
example shows that the method, which is projection-based, can be
quite powerful in situations that may cause grief for methods based
on kernel smoothing. The projection pursuit guided tour is also
useful for comparing and developing projection pursuit indexes and
illustrating some types of asymptotic results.
Key Words: Data visualization; Interactive dynamic graphics; Data projections;
Exploratory data analysis.
95JCGS03\P0173--------------------------------------------------------------
Fast and Accurate Computation of the Exact Confidence Limits
for the Common Odds Ratio in Several $2 \times 2$ Tables
J. G. Liao and Charles B. Hall
Mehta, Patel, and Gray proposed a network algorithm for the exact
confidence limits of the common odds ratio in several $2 \times 2$
tables. Their algorithm was implemented in the StatXact and Egret
statistical packages and further discussed by Vollset, Hirji, and
Elashoff. The need to evaluate polynomials of potentially very
high degrees, however, poses some numerical difficulties. This
article presents a method that cuts the degree of polynomials by
at least one half. Two other modifications to further stabilize
and speed the computation are also proposed.
Key Words: Noncentral hypergeometric distribution.
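
To make the polynomial evaluation mentioned in this abstract concrete, here
is a minimal Python sketch. It is not the network algorithm of Mehta, Patel,
and Gray, nor the degree-halving method of the article. It is the plain
approach of multiplying the per-table polynomials and solving for the lower
limit by bisection. The function names, the (n1, n2, t) table encoding (row
totals and first-column total), and the bracket on log(psi) are assumptions
made for illustration.

```python
from math import comb, exp, log

def table_coeffs(n1, n2, t):
    """Coefficients C(n1, x) * C(n2, t - x) for x = lo..hi; returns (lo, coeffs)."""
    lo, hi = max(0, t - n2), min(n1, t)
    return lo, [comb(n1, x) * comb(n2, t - x) for x in range(lo, hi + 1)]

def combined_coeffs(tables):
    """Multiply the per-table polynomials in psi; tables are (n1, n2, t) triples."""
    lo, coeffs = 0, [1]
    for n1, n2, t in tables:
        l, c = table_coeffs(n1, n2, t)
        prod = [0] * (len(coeffs) + len(c) - 1)
        for i, a in enumerate(coeffs):
            for j, b in enumerate(c):
                prod[i + j] += a * b
        lo, coeffs = lo + l, prod
    return lo, coeffs

def upper_tail(lo, coeffs, s_obs, log_psi):
    """P(S >= s_obs | psi), with terms evaluated in log space to avoid overflow."""
    logs = [log(c) + (lo + i) * log_psi for i, c in enumerate(coeffs)]
    m = max(logs)
    w = [exp(v - m) for v in logs]
    return sum(w[max(s_obs - lo, 0):]) / sum(w)

def lower_limit(tables, s_obs, alpha=0.05):
    """Lower exact limit: the psi at which P(S >= s_obs | psi) equals alpha / 2."""
    lo, coeffs = combined_coeffs(tables)
    if s_obs <= lo:
        return 0.0  # the tail probability is 1 for every psi
    a, b = -50.0, 50.0  # assumed bracket on log(psi)
    for _ in range(200):
        mid = (a + b) / 2
        if upper_tail(lo, coeffs, s_obs, mid) < alpha / 2:
            a = mid
        else:
            b = mid
    return exp((a + b) / 2)
```

The upper limit solves the analogous equation with the lower tail
P(S <= s_obs | psi). The degree of the combined polynomial grows with the
number and size of the tables, which is the numerical difficulty the
abstract refers to.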
95JCGS03\P0180---------------------------------------------------------
A Visualization Technique for Studying the Iterative Estimation
of Mixture Densities
Jeffrey L. Solka, Wendy L. Poston, & Edward J. Wegman
This article focuses on recent work that analyzes the expectation
maximization (EM) evolution of mixtures-based estimators. The goal of
this research is the development of effective visualization techniques
to portray the mixture model parameters as they change in time. This
is an inherently high-dimensional process. Techniques are presented
that portray the time evolution of univariate, bivariate, and trivariate
finite and adaptive mixtures estimators. Adaptive mixtures is a
recently developed variable bandwidth kernel estimator where each of
the kernels is not constrained to reside at a sample location. The
future role of these techniques in developing new versions of the
adaptive mixtures procedure is also discussed.
Key Words: Density estimation; Expectation maximization; Graphics.
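
The kind of parameter trajectory such visualizations would portray can be
sketched as follows. This is a minimal Python example of standard EM for a
univariate finite Gaussian mixture, not the adaptive mixtures procedure and
not the authors' code. The function names, the variance floor, and the toy
data in the usage note are assumptions made for illustration.

```python
from math import exp, log, pi, sqrt

def normal_pdf(x, mu, var):
    return exp(-(x - mu) ** 2 / (2 * var)) / sqrt(2 * pi * var)

def log_likelihood(data, weights, means, variances):
    return sum(
        log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in data
    )

def em_history(data, weights, means, variances, n_iter=50):
    """Run EM and return the (weights, means, variances) state after every iteration."""
    weights, means, variances = list(weights), list(means), list(variances)
    k = len(weights)
    history = [(weights[:], means[:], variances[:])]
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component
        resp = []
        for x in data:
            p = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([v / total for v in p])
        # M-step: re-estimate each component from its weighted points
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # assumed floor to avoid degenerate components
        history.append((weights[:], means[:], variances[:]))
    return history
```

For example, calling em_history on the toy data [-1.0, -0.5, 0.0, 0.5, 1.0,
4.0, 4.5, 5.0, 5.5, 6.0] with starting means 0.5 and 4.5 yields a list of
states, one per iteration, that can be plotted against the iteration number
to show how the estimates evolve.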
95JCGS03\P0199--------------------------------------------------------
Exact Cumulant Calculations for Pearson $\chi^2$ and
Zelterman Statistics for $r$-way Contingency Tables
James E. Stafford
We present a computer algebra procedure that calculates exact cumulants
for Pearson $\chi^2$ and Zelterman statistics for $r$-way contingency
tables. The algorithm is an example of how an overwhelming algebraic
problem can be solved neatly through computer implementation by
emulating tactics that one uses by hand. For inference purposes the
cumulants may be used to assess chi-square approximations or to improve
this approximation via Edgeworth expansions. Edgeworth approximations
are compared to the computer-intensive techniques of Mehta and Patel
that provide exact and arbitrarily close results. Comparisons to
approximations that utilize the gamma distribution (Mielke and Berry)
are also made.
Key Words: Computer algebra; Edgeworth expansion; Expectation; Mathematica;
Nested sums; Partitions; Symbolic computation.
95JCGS03\P0213----------------------------------------------------------
Adaptive Order Polynomial Fitting: Bandwidth Robustification and
Bias Reduction
Jianqing Fan & Irene Gijbels
This article deals with estimation of the regression function and its
derivatives using local polynomial fitting. An important question is
how to determine the order of the polynomial to be fitted in a
particular fixed neighborhood. This depends on the local curvature of
the unknown curve. A higher order fit leads to a possible bias
reduction, but results in an increase of variability. A precise
evaluation of this increase is presented, and from this it is also clear
that it is preferable to choose the order of fit adaptively. In this
article we provide, for a given bandwidth, such a data-driven variable
order selection procedure. The basic idea is to obtain a good estimate
of the mean squared error at each location point and to use this
estimate as a criterion for the order selection. The performance of
the proposed selection procedure is illustrated via simulated examples.
It turns out that the adaptive order fit is more robust against
bandwidth variation; even if the bandwidth varies by a factor of 3, the
resulting estimates are qualitatively indistinguishable. Hence the
issue of choosing the bandwidth becomes less important and a crude
bandwidth selector might suffice. We propose such a simple rule for
selecting the bandwidth, and demonstrate its performance for the
adaptive order fit via some simulated examples.
Key Words: Adaptive order approximation; Bandwidth robustification;
Bandwidth selection; Bias reduction; Estimation of derivatives;
Increase of variability; Nonparametric regression;
Spatial-adaptation.
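
The effect of the order of fit on bias can be illustrated with a minimal
Python sketch of fixed-order local polynomial fitting. It does not implement
the authors' adaptive order selection or their bandwidth rule. The
Epanechnikov kernel, the function names, and the small linear solver are
assumptions made for illustration.

```python
def solve(A, c):
    """Solve A b = c by Gaussian elimination with partial pivoting."""
    n = len(c)
    M = [row[:] + [ci] for row, ci in zip(A, c)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    b = [0.0] * n
    for r in range(n - 1, -1, -1):
        b[r] = (M[r][n] - sum(M[r][k] * b[k] for k in range(r + 1, n))) / M[r][r]
    return b

def local_poly_fit(xs, ys, x0, h, order):
    """Kernel-weighted least squares fit of a polynomial in (x - x0).

    Returns coefficients b, where b[0] estimates the regression function at x0
    and j! * b[j] estimates its j-th derivative there.
    """
    # Epanechnikov kernel weights with bandwidth h (an assumed kernel choice)
    w = [max(0.0, 0.75 * (1 - ((x - x0) / h) ** 2)) for x in xs]
    p = order + 1
    # Normal equations of the weighted least squares problem
    A = [[sum(wi * (x - x0) ** (r + s) for wi, x in zip(w, xs)) for s in range(p)]
         for r in range(p)]
    c = [sum(wi * (x - x0) ** r * y for wi, x, y in zip(w, xs, ys)) for r in range(p)]
    return solve(A, c)
```

On noise-free data from a quadratic curve, a fit of order 2 reproduces the
function value and its derivatives at x0 up to rounding error, whereas a fit
of order 1 with the same bandwidth is biased by the curvature. With noisy
data the higher order fit pays for this bias reduction with increased
variability, which is the trade-off the abstract describes.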