Title page for ETD etd-33098-142912


Type of Document        Dissertation
Author                  Abdulla, Ghaleb M.S.
Author's Email Address  abdulla@vt.edu
URN                     etd-33098-142912
Title                   Analysis and Modeling of World Wide Web Traffic
Degree                  PhD
Department              Computer Science
Advisory Committee
  Advisor Name          Title
  Fox, Edward Alan      Committee Chair
  Abrams, Marc          Committee Member
  Balci, Osman          Committee Member
  Kafura, Dennis G.     Committee Member
  Nayfeh, Ali H.        Committee Member
Keywords
  • Time Series
  • Modeling
  • Scalability
  • World Wide Web
  • Log analysis
  • Caching
  • Proxy
Date of Defense         1998-04-27
Availability            unrestricted
Abstract

This dissertation deals with monitoring, collecting, analyzing, and
modeling of World Wide Web (WWW) traffic and client interactions. The
rapid growth of WWW usage has not been accompanied by an overall
understanding of models of information resources and their deployment
strategies. Consequently, the current Web architecture often faces
performance and reliability problems. Scalability, latency, bandwidth,
and disconnected operations are some of the important issues that
should be considered when attempting to adjust for the growth in Web
usage. The WWW Consortium launched an effort to design a new protocol
that will be able to support future demands. Before doing that,
however, we need to characterize current users' interactions with the
WWW and understand how it is being used.

We focus on proxies since they provide a good medium for caching,
filtering information, payment methods, and copyright management. We
collected proxy data from our environment over a period of more than
two years. We also collected data from other sources such as schools,
information service providers, and commercial sites. Sampling times
range from days to years. We analyzed the collected data looking for
important characteristics that can help in designing a better HTTP
protocol. We developed a modeling approach that considers Web traffic
characteristics such as self-similarity and long-range dependency. We
developed an algorithm to characterize users' sessions. Finally, we
developed a high-level Web traffic model suitable for sensitivity
analysis.

As a result of this work, we develop statistical models of parameters
such as arrival times, file sizes, file types, and locality of
reference. We describe an approach to model long-range dependent Web
traffic, and we characterize activities of users accessing a digital
library courseware server or Web search tools.

Temporal and spatial locality of reference within examined user
communities is high, so caching can be an effective tool to help
reduce network traffic and to help solve the scalability problem. We
recommend utilizing our findings to promote a smart distribution or
push model to cache documents when there is a likelihood of repeat
accesses.
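
As a rough illustration of why high temporal locality makes caching
pay off, the following minimal Python sketch replays a request trace
through a small least-recently-used (LRU) cache and reports the
fraction of requests served from the cache. The function name
`lru_hit_rate`, the choice of an LRU policy, and the sample trace are
assumptions made for illustration only; they are not taken from the
dissertation.

```python
from collections import OrderedDict


def lru_hit_rate(requests, capacity):
    """Return the fraction of requests served from an LRU cache.

    `requests` is a sequence of URLs (hypothetical trace data) and
    `capacity` is the maximum number of distinct URLs the cache holds.
    """
    cache = OrderedDict()
    hits = 0
    for url in requests:
        if url in cache:
            hits += 1
            cache.move_to_end(url)  # mark as most recently used
        else:
            cache[url] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(requests) if requests else 0.0


# Hypothetical trace in which "/a" is requested repeatedly.
trace = ["/a", "/b", "/a", "/c", "/a", "/b", "/d", "/a"]
for size in (2, 3):
    print(size, lru_hit_rate(trace, size))
```

A trace with more repeat accesses to the same documents yields a
higher hit rate for a given cache size, which is the sense in which
strong locality of reference favors caching.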

Files
  Filename     Size      Approximate Download Time (Hours:Minutes:Seconds)
                         28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  thesis.pdf   2.63 MB   00:12:09     00:06:15    00:05:28       00:02:44        00:00:14
