96JCGS01\P0001---------------------------------------------------------
Monte Carlo Filter and Smoother for Non-Gaussian Nonlinear State Space Models
Genshiro Kitagawa
A new algorithm for the prediction, filtering, and smoothing of
non-Gaussian nonlinear state space models is shown. The algorithm is
based on a Monte Carlo method in which the successive prediction,
filtering (and subsequently smoothing) conditional probability density
functions are approximated by many of their realizations. The
particular contribution of this algorithm is that it can be applied to
a broad class of nonlinear non-Gaussian higher dimensional state space
models, provided that the dimensions of the system noise and the
observation noise are relatively low. Several numerical examples are shown.
Key Words: Fixed interval smoothing; Non-Gaussian state space model;
Nonstationary time series; Recursive filtering; Time series modeling.
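The recursion the abstract describes — propagate a cloud of realizations through the system model, then reweight and resample them against each observation — is now widely known as the bootstrap particle filter. The following is a minimal illustrative sketch, not Kitagawa's implementation; the one-dimensional model, its parameters, and the function name are assumptions chosen for clarity:

```python
import numpy as np

def monte_carlo_filter(y, m=1000, sys_sd=1.0, obs_sd=1.0, rng=None):
    """Bootstrap Monte Carlo filter for the illustrative model
        x_t = 0.5 * x_{t-1} + v_t,   v_t ~ N(0, sys_sd^2)
        y_t = x_t + w_t,             w_t ~ N(0, obs_sd^2).
    The filtering density p(x_t | y_1..y_t) is approximated by m
    realizations ("particles"); the filtering means are returned.
    """
    rng = np.random.default_rng(rng)
    particles = rng.normal(0.0, 1.0, size=m)  # draws from the initial density
    means = []
    for yt in y:
        # prediction step: push each particle through the system model
        particles = 0.5 * particles + rng.normal(0.0, sys_sd, size=m)
        # filtering step: weight by the observation likelihood, then resample
        w = np.exp(-0.5 * ((yt - particles) / obs_sd) ** 2)
        w /= w.sum()
        particles = rng.choice(particles, size=m, p=w)
        means.append(particles.mean())
    return np.array(means)
```

Smoothing works the same way, by storing and resampling whole particle trajectories rather than only the current state.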
96JCGS01\P0026------------------------------------------------------
Wavelet Analysis and Synthesis of Stationary Long-Memory Processes
E. J. McCoy & A. T. Walden
The discrete wavelet transform (DWT) can be interpreted as a filtering
of a time series by a set of octave band filters such that the width
of each band as a proportion of its center frequency is constant. A
long-memory process having a power spectrum that plots as a straight
line on log-frequency/log-power scales over many octaves of frequency
is intrinsically related to such a structure. As an example of such
processes, we focus on one class of discrete-time, stationary,
long-memory processes, the fractionally differenced Gaussian white
noise processes (fdGn). We show how the DWT decomposes an fdGn, and
show the exact correlation structure of the resulting coefficients
for different wavelets (Daubechies' minimum-phase and least-asymmetric
and Haar). The DWT is an impressive ``whitening filter.'' A discrete
wavelet-based scheme for simulating fdGn's is discussed and is shown
to be equivalent to a spectral decomposition of the covariance matrix
of the process; however, it can be carried out using only information
on the nature of the spectrum of the process --- that is, time-domain
information is not required. It produces results comparable with the
exact Hosking method. We then show that, using wavelet methods, the
spectral slope parameter $d$ can be estimated as well as, or better
than, with the best Fourier-based method known to us, namely regression
on multitaper spectral ordinates. Since wavelet analysis and synthesis
methods can be applied to a much wider variety of empirical or
theoretical long-memory processes, wavelet methods could prove a
valuable tool in the future in the analysis and synthesis of stochastic
processes.
Key Words: Discrete wavelet transform; Fractionally differenced white noise;
Long-memory processes; Maximum likelihood estimation; Simulation.
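For reference, the exact Hosking method mentioned as the time-domain baseline draws an fdGn path from the Durbin-Levinson recursion on its autocovariances, and a single Haar DWT level is a pair of two-tap filters. A minimal sketch — not the authors' code; function names and the memory-parameter values used below are illustrative assumptions:

```python
import numpy as np
from math import gamma as G

def fdgn_acvf(d, n):
    """Autocovariances gamma(0..n-1) of fractionally differenced
    Gaussian white noise with memory parameter d, |d| < 1/2."""
    acvf = np.empty(n)
    acvf[0] = G(1 - 2 * d) / G(1 - d) ** 2
    for k in range(1, n):
        acvf[k] = acvf[k - 1] * (k - 1 + d) / (k - d)
    return acvf

def hosking_simulate(d, n, rng=None):
    """Draw an exact fdGn path via the Durbin-Levinson recursion
    (Hosking's time-domain method)."""
    rng = np.random.default_rng(rng)
    g = fdgn_acvf(d, n)
    x = np.empty(n)
    v = g[0]           # one-step prediction variance
    phi = np.empty(0)  # linear-prediction coefficients, grown each step
    x[0] = rng.normal(0.0, np.sqrt(v))
    for t in range(1, n):
        kappa = (g[t] - phi @ g[t - 1:0:-1]) / v  # partial autocorrelation
        phi = np.concatenate([phi - kappa * phi[::-1], [kappa]])
        v *= 1.0 - kappa ** 2
        x[t] = phi @ x[t - 1::-1] + rng.normal(0.0, np.sqrt(v))
    return x

def haar_level1(x):
    """One level of the Haar DWT: scaling (s) and detail (w) coefficients."""
    x = np.asarray(x, dtype=float)[: len(x) // 2 * 2]
    s = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    w = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return s, w
```

The detail coefficients of a long-memory path produced this way are far less correlated than the path itself, which is the "whitening filter" effect the abstract refers to.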
96JCGS01\P0057----------------------------------------------------------
Introduction to the Special Section on Design and Implementation
of Data Analysis Systems
William F. Eddy & Guenther Sawitzki
From March 23--26, 1995, StatLab Heidelberg, the statistical laboratory
at Universität Heidelberg, hosted the ``Workshop on
Design and Implementation of Data Analysis Systems.'' Although organized
by a statistics group, the focus was not on the statistical aspects of
such systems but on their design and implementation. This sort of
meeting is in the tradition of the Interface Symposium, although much
smaller: while there is a lot of work to do both in the field of
statistics and computing, sometimes it is necessary to sit on the
fence and have a look at both sides.
The relationship between ``classical'' statistics and interactive
data analysis was an ongoing discussion throughout the workshop. How
are recent developments in statistics --- for example, understanding
resampling methods and bandwidth selection problems --- reflected
in data analysis? What is the contribution of ``classical'' statistics
to data analysis? We believe that a graphical or interactive data
analysis method is only as good as the statistical method that
supports it. The foundations of many programs, and a lot of
graphical and/or interactive methods, need to be investigated, and
classical statistics still needs to learn a lot about the purposes
and intentions of data analysis.
Two related software issues have become important during the current
decade: how do you open and extend an existing program, and how do you
design programs for extensibility? Design for extensibility is a
central issue if you want to augment software to include new methods
or adapt to new ways of analysis. This was one of the recurring
topics in the workshop, and in the SoftStat conference that
immediately followed.
The workshop was held at an old villa of the Universität
Heidelberg. Having all the participants in one place --- starting
with a common breakfast and sufficient time to meet and discuss
outside the lectures --- was important to the success of the
workshop.
A collection of contributions from this workshop will appear in the
\emph{Journal of Computational and Graphical Statistics} over the
next four issues. We thank the editor for the opportunity to include
these contributions in JCGS.
96JCGS01\P0058--------------------------------------------------------------
Pixel-Oriented Visualization Techniques for Exploring Very
Large Data Bases
Daniel A. Keim
An important goal of visualization technology is to support the
exploration and analysis of very large amounts of data. This
article describes a set of pixel-oriented visualization techniques
that use each pixel of the display to visualize one data value
and therefore allow the visualization of the largest amount of data
possible. Most of the techniques have been specifically designed
for visualizing and querying large data bases. The techniques may
be divided into query-independent techniques that directly
visualize the data (or a certain portion of it) and query-dependent
techniques that visualize the data in the context of a specific
query. Examples of the class of query-independent techniques are
the screen-filling curve and recursive pattern techniques. The
screen-filling curve techniques are based on the well-known Morton
and Peano-Hilbert curve algorithms, and the recursive pattern
technique is based on a generic recursive scheme, which generalizes
a wide range of pixel-oriented arrangements for visualizing large
data sets. Examples of the class of query-dependent techniques
are the snake-spiral and snake-axes techniques, which visualize the
distances with respect to a data base query and arrange the most
relevant data items in the center of the display. In addition to
describing the basic ideas of our techniques, we provide example
visualizations generated by the various techniques, which demonstrate
the usefulness of our techniques and show some of their advantages
and disadvantages.
Key Words: Visualizing large data bases; Visualizing multidimensional and
multivariate data.
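The Morton (Z-order) curve underlying one of the screen-filling techniques maps a linear data index to pixel coordinates by de-interleaving its bits, so that consecutive data values land on nearby pixels. A minimal sketch under assumed names, not Keim's implementation:

```python
import numpy as np

def morton_xy(i, bits=8):
    """De-interleave the bits of index i into (x, y) pixel coordinates;
    successive indices trace a Morton (Z-order) curve on the screen."""
    x = y = 0
    for b in range(bits):
        x |= ((i >> (2 * b)) & 1) << b       # even bits -> x
        y |= ((i >> (2 * b + 1)) & 1) << b   # odd bits  -> y
    return x, y

def zorder_layout(values, bits):
    """Place a sequence of data values into a 2^bits x 2^bits pixel grid
    along the Morton curve; unfilled pixels stay NaN."""
    n = 1 << bits
    img = np.full((n, n), np.nan)
    for i, v in enumerate(values[: n * n]):
        x, y = morton_xy(i, bits)
        img[y, x] = v
    return img
```

A Peano-Hilbert layout differs only in the index-to-pixel mapping; it avoids the Morton curve's long diagonal jumps at the cost of a more involved recursion.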
96JCGS01\P0078---------------------------------------------------------
Interactive High-Dimensional Data Visualization
Andreas Buja, Dianne Cook, & Deborah F. Swayne
We propose a rudimentary taxonomy of interactive data visualization
based on a triad of data analytic tasks: finding Gestalt, posing
queries, and making comparisons. These tasks are supported by three
classes of interactive view manipulations: focusing, linking, and
arranging views. This discussion extends earlier work on the
principles of focusing and linking and sets them on a firmer base.
Next, we give a high-level introduction to a particular system for
multivariate data visualization --- XGobi. This introduction is not
comprehensive but emphasizes XGobi tools that are examples of
focusing, linking, and arranging views; namely, high-dimensional
projections, linked scatterplot brushing, and matrices of
conditional plots. Finally, in a series of case studies in data
visualization, we show the powers and limitations of particular
focusing, linking, and arranging tools. The discussion is dominated
by high-dimensional projections that form an extremely well-developed
part of XGobi. Of particular interest are the illustration of
asymptotic normality of high-dimensional projections (a theorem of
Diaconis and Freedman), the use of high-dimensional cubes for
visualizing factorial experiments, and a method for interactively
generating matrices of conditional plots with high-dimensional
projections. Although there is a unifying theme to this article, each
section --- in particular the case studies --- can be read separately.
Key Words: Brushing; High-dimensional projections; Multiple linked views;
Plot matrices; Real-time graphics; Taxonomy of data visualization.
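The Diaconis-Freedman phenomenon illustrated in the case studies — that most one-dimensional projections of high-dimensional data look nearly Gaussian, whatever the data — is easy to check numerically. A hypothetical sketch (the dimensions, sample size, and names below are assumptions for illustration, not taken from the article):

```python
import numpy as np

def random_projection(data, rng=None):
    """Project n x p data onto a uniformly random direction in R^p."""
    rng = np.random.default_rng(rng)
    u = rng.normal(size=data.shape[1])
    u /= np.linalg.norm(u)
    return data @ u

rng = np.random.default_rng(42)
# vertices of a 200-dimensional cube: decidedly non-Gaussian data
cube = rng.choice([-1.0, 1.0], size=(2000, 200))
proj = random_projection(cube, rng=rng)

# yet the 1-D projection is close to normal:
# sample skewness ~ 0 and excess kurtosis ~ 0
z = (proj - proj.mean()) / proj.std()
skew = np.mean(z ** 3)
exkurt = np.mean(z ** 4) - 3.0
```

This is why structure hunting with projections, as in XGobi's grand tour, focuses on the rare directions whose projections look decidedly non-Gaussian.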
96JCGS01\P0100-----------------------------------------------------------
An Interactive Icon Index: Images of the Outer Planets
William F. Eddy & Audris Mockus
We are interested in the exploratory analysis of large collections of
complex objects. As an example, we are studying a large collection of
digital images that has nearly 30,000 members. We regard each image
in the collection as an individual observation. To facilitate our
study we construct an index of the images in the collection. The
index uses a small copy of each image (an icon or a ``thumbnail'')
to represent the full-size version. A large number of these thumbnails
are laid out in a workstation window. We can interactively arrange and
rearrange the thumbnails within the window. For example, we can sort
the thumbnails by the values of a function computed from them or by the
values of data associated with each of them. By the use of specialized
equipment (a single-frame video disk recorder/player), we can instantly
access any individual full-size image in the collection as a video
image. We regard our software as an early development of statistical
exploratory tools for studying collections of images and other complex
objects in the same way we routinely study batches of numbers. We
expect that the concept of a visual index will extend to other
collections of complex objects besides images, for example, time
series, functions, and text.
Key Words: Exploratory data analysis; Image data base.
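The interactive sort the authors describe — ordering thumbnails by the values of a function computed from each image — can be sketched as follows (a toy illustration with assumed names, treating each thumbnail as a pixel array; this is not the authors' software):

```python
import numpy as np

def sort_icons(thumbnails, key=np.mean):
    """Order thumbnail images by a statistic computed from each one
    (here mean brightness by default); returns the permutation of
    indices that a display window would use to lay the icons out."""
    scores = [key(t) for t in thumbnails]
    return np.argsort(scores)
```

Any per-image statistic (variance, edge density, or an associated data value such as acquisition time) can serve as the sort key, giving a different arrangement of the same index.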