Title page for ETD etd-07302003-183651


Type of Document Dissertation
Author Lawrence, David E
URN etd-07302003-183651
Title Cluster-Based Bounded Influence Regression
Degree PhD
Department Statistics
Advisory Committee
Advisor Name Title
Birch, Jeffrey B. Committee Chair
Anderson-Cook, Christine M. Committee Member
Smith, Eric P. Committee Member
Terrell, George R. Committee Member
Ye, Keying Committee Member
Keywords
  • High-breakdown
  • Robust
  • Linear
  • Outlier
  • LTS
Date of Defense 2003-07-17
Availability unrestricted
Abstract
In the field of linear regression analysis, a single outlier can dramatically influence ordinary least squares (OLS) estimation, while low-breakdown procedures such as M regression and bounded influence regression may be unable to combat a small percentage of outliers. A high-breakdown procedure such as least trimmed squares (LTS) regression can accommodate up to 50% of the data (in the limit) being outlying with respect to the general trend. Two available one-step improvement procedures based on LTS are Mallows 1-step (M1S) regression and Schweppe 1-step (S1S) regression (the current state-of-the-art method). Issues with these methods include (1) computational approximations and sub-sampling variability, (2) dramatic coefficient sensitivity with respect to very slight differences in initial values, (3) internal instability when determining the general trend, and (4) performance in low-breakdown scenarios. A new high-breakdown regression procedure is introduced that addresses these issues and also offers an insightful summary regarding the presence and structure of multivariate outliers. The proposed method blends a cluster analysis phase with a controlled bounded influence regression phase and is therefore referred to as cluster-based bounded influence regression, or CBI. With the data space represented by a special set of anchor points, a collection of point-addition OLS regression estimators forms the basis of a metric that defines the similarity between any two observations. Cluster analysis then yields a main cluster "halfset" of observations, with the remaining observations becoming one or more minor clusters. An initial regression estimator arises from the main cluster, and a multiple-point-addition DFFITS argument is used to carefully activate the minor clusters through a bounded influence regression framework. CBI achieves a 50% breakdown point; is regression, scale, and affine equivariant; and is asymptotically normal.
Case studies and Monte Carlo studies demonstrate the performance advantage of CBI over S1S and the other high-breakdown methods regarding coefficient stability, scale estimation, and standard errors. A dendrogram of the clustering process is one graphical display available for multivariate outlier detection. Overall, the proposed methodology represents an advance in the field of robust regression, offering a distinct philosophical viewpoint towards data analysis and the marriage of estimation with diagnostic summary.
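The abstract's opening contrast, that one outlier can distort OLS while a trimmed fit tracks the general trend, can be illustrated with a small sketch. This is not CBI, M1S, or S1S, and it is not taken from the dissertation. It is a generic approximation to LTS that uses random two-point starts followed by concentration steps (refit on the h smallest squared residuals). The function names `ols` and `lts`, the synthetic data, the choice h = (n + p + 1) // 2, and all tuning values are assumptions made for illustration only.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients via a least-squares solve."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def lts(X, y, h=None, n_starts=200, n_csteps=20, seed=0):
    """Approximate least trimmed squares (illustrative, not the thesis code).

    Each random start fits p points exactly, then repeatedly refits OLS on
    the h observations with the smallest squared residuals. The fit with
    the smallest trimmed sum of squares is returned.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    if h is None:
        h = (n + p + 1) // 2  # assumed "half-sample" size
    best_beta, best_obj = None, np.inf
    for _ in range(n_starts):
        idx = rng.choice(n, size=p, replace=False)
        beta = ols(X[idx], y[idx])
        for _ in range(n_csteps):
            sq_resid = (y - X @ beta) ** 2
            keep = np.argsort(sq_resid)[:h]  # h best-fitting observations
            beta = ols(X[keep], y[keep])
        obj = np.sort((y - X @ beta) ** 2)[:h].sum()
        if obj < best_obj:
            best_beta, best_obj = beta, obj
    return best_beta

# Synthetic data (assumed): true intercept 1, true slope 2, one gross outlier.
data_rng = np.random.default_rng(42)
n = 50
x = np.linspace(0.0, 10.0, n)
y = 1.0 + 2.0 * x + data_rng.normal(scale=0.5, size=n)
y_bad = y.copy()
y_bad[-1] += 1000.0  # a single contaminated response at the largest x

X = np.column_stack([np.ones(n), x])
beta_ols = ols(X, y_bad)   # pulled strongly toward the outlier
beta_lts = lts(X, y_bad)   # trims the outlier and follows the bulk

print("OLS slope:", beta_ols[1])
print("LTS slope:", beta_lts[1])
```

In this setup the OLS slope moves far from 2 because the contaminated point sits at a high-leverage position, while the trimmed fit stays close to the slope of the remaining 49 observations. The abstract's concerns about sub-sampling variability apply to exactly this kind of random-start approximation: changing `seed` or `n_starts` can change the returned coefficients.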
Files
  Filename       Size        Approximate Download Time (Hours:Minutes:Seconds)
                             28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  Back.pdf       226.00 Kb   00:01:02     00:00:32    00:00:28       00:00:14        00:00:01
  Chapter1.pdf   261.03 Kb   00:01:12     00:00:37    00:00:32       00:00:16        00:00:01
  Chapter2.pdf   285.12 Kb   00:01:19     00:00:40    00:00:35       00:00:17        00:00:01
  Chapter3.pdf   141.81 Kb   00:00:39     00:00:20    00:00:17       00:00:08        < 00:00:01
  Chapter4.pdf   179.23 Kb   00:00:49     00:00:25    00:00:22       00:00:11        < 00:00:01
  Chapter5.pdf   594.90 Kb   00:02:45     00:01:24    00:01:14       00:00:37        00:00:03
  Chapter6.pdf   381.58 Kb   00:01:45     00:00:54    00:00:47       00:00:23        00:00:02
  Chapter7.pdf   451.75 Kb   00:02:05     00:01:04    00:00:56       00:00:28        00:00:02
  Chapter8.pdf   390.93 Kb   00:01:48     00:00:55    00:00:48       00:00:24        00:00:02
  Chapter9.pdf   128.99 Kb   00:00:35     00:00:18    00:00:16       00:00:08        < 00:00:01
  Front.pdf      236.78 Kb   00:01:05     00:00:33    00:00:29       00:00:14        00:00:01
