

Type of Document: Dissertation
Author: Ribler, Randy L.
URN: etd-1711111139751001
Title: Visualizing Categorical Time Series Data with Applications to Computer and Communications Network Traces
Degree: PhD
Department: Computer Science
Advisory Committee:
- Ehrich, Roger W.
- Foutz, Robert
- Kriz, Ronald D.
- Ribbens, Calvin J.
- Abrams, Marc (Committee Chair)
Keywords:
- visualization
- categorical data
- time series
- data mining
- performance analysis
- information visualization
Date of Defense: 1997-04-04
Availability: unrestricted
Abstract:
Visualization tools allow scientists to
comprehend very large data sets and to
discover relationships which are
otherwise difficult to detect.
Unfortunately, not all types of data can
be visualized easily using existing tools.
In particular, long sequences of
nonnumeric data cannot be visualized
adequately. Examples of this type of
data include trace files of computer
performance information, the
nucleotides in a genetic sequence, a
record of stocks traded over a period
of years, and the sequence of words in
this document. The term categorical
time series is defined and used to
describe this family of data. When
visualizations designed for numerical
time series are applied to categorical
time series, the distortions which result
from the arbitrary conversion of
unordered categorical values to totally
ordered numerical values can be
profound. Examples of this
phenomenon are presented and
explained. Several new, general-purpose
techniques for visualizing
categorical time series data have been
developed as part of this work and
have been incorporated into the Chitra
performance analysis and visualization
system. All of these new visualizations
can be produced in O(n) time. The new
visualizations for categorical time series
provide general-purpose techniques for
visualizing aspects of categorical data
which are commonly of interest. These
include periodicity, stationarity,
cross-correlation, autocorrelation, and
the detection of recurring patterns. The
effective use of these visualizations is
demonstrated in a number of
application domains, including
performance analysis, World Wide
Web traffic analysis, network routing
simulations, document comparison,
pattern detection, and the analysis of
the performance of genetic algorithms.
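
The abstract's point about arbitrary numeric coding can be illustrated with a small sketch. This is not the dissertation's method or any part of Chitra; the trace, the category names, the two codings, and the helper functions `match_rate` and `mean_abs_step` are all hypothetical, chosen only to show that a statistic computed on numerically coded categories depends on the coding, while a simple equality-based measure does not and still runs in a single O(n) pass.

```python
def match_rate(seq, lag):
    """Fraction of positions t where seq[t] == seq[t + lag] (one O(n) pass).

    Uses only equality between categories, so it does not depend on any
    ordering or numeric coding of the values.
    """
    n = len(seq) - lag
    if n <= 0:
        raise ValueError("lag must be smaller than the sequence length")
    return sum(seq[t] == seq[t + lag] for t in range(n)) / n


def mean_abs_step(seq, coding):
    """Mean absolute difference between consecutive values after the
    categories are mapped to numbers by `coding` (an arbitrary choice)."""
    nums = [coding[s] for s in seq]
    return sum(abs(b - a) for a, b in zip(nums, nums[1:])) / (len(nums) - 1)


# Hypothetical trace of event categories that repeats with period 3.
trace = ["read", "idle", "write"] * 4

# Two equally arbitrary numeric codings of the same unordered categories.
coding_a = {"read": 0, "write": 1, "idle": 2}
coding_b = {"read": 0, "idle": 1, "write": 2}

# The numeric summary changes with the coding, although the data did not.
step_a = mean_abs_step(trace, coding_a)
step_b = mean_abs_step(trace, coding_b)

# The equality-based measure exposes the period regardless of labeling.
period_3 = match_rate(trace, 3)   # every element recurs three steps later
period_1 = match_rate(trace, 1)   # no element equals its successor

if __name__ == "__main__":
    print("mean |step| under coding A:", step_a)
    print("mean |step| under coding B:", step_b)
    print("match rate at lag 3:", period_3)
    print("match rate at lag 1:", period_1)
```

The two codings give different values for `mean_abs_step` on the same trace, which is the kind of distortion the abstract describes, whereas `match_rate` gives the same result under any relabeling of the categories.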
Files:
Filename: etd.pdf
Size: 20.92 MB
Approximate Download Time (Hours:Minutes:Seconds):
- 28.8 Modem: 01:36:50
- 56K Modem: 00:49:48
- ISDN (64 Kb): 00:43:34
- ISDN (128 Kb): 00:21:47
- Higher-speed Access: 00:01:51