Basic Terms
A "File" refers to a discrete unit of content which is
recorded on physical media (a hard disk, a floppy disk).
Images, textual content, even entire databases are often
stored as files. Each file has a name ("filename") which
must be unique within its context (i.e. its "path" -- see
below). A filename typically includes a three- or four-
character "suffix" or "extension" which describes the type
of file (examples: .gif for GIF images, .html for text
marked up in HTML format). Valid file (and directory) names
are composed of the letters a-z (upper or lower case) and
the numbers 0 through 9 in any combination. It is also
acceptable to use dashes (-) or underscore (_) characters
in filenames, and a period (.) separates the name from its
extension. It is not acceptable for filenames to contain
spaces or slashes (slashes are reserved to identify
directories within a "path" -- see below).
Sample Filenames:
- index.html
- figure1.gif
- etd.pdf
- process.mov
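The naming rules above can be checked mechanically. The following
is a minimal sketch in Python, not part of our server setup: the
function name is_valid_name is hypothetical, and it assumes an
optional extension of one to four letters or digits after a period.

```python
import re

# Letters, digits, dashes and underscores, followed by an optional
# extension. The 1-4 character extension length is an assumption.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9]{1,4})?")

def is_valid_name(name):
    """Return True if the whole name follows the naming rules."""
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ["index.html", "figure1.gif", "my file.html", "theses/etd.pdf"]:
    print(candidate, is_valid_name(candidate))
```

The first two names pass; the last two are rejected because they
contain a space and a slash respectively.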
A "Directory" is an organizational structure, a
container for files and other directories. A file that is
stored in a directory is completely distinct from any files
that are stored in other directories, whether or not the
files have the same name. Directory naming conventions are
identical to file naming conventions, except that directory
names do not usually include extensions.
A "Path" is a combination of slashes and directory names
that can be used to identify the specific location of a
file. A fully qualified path begins with a top-level (or
"root") directory, typically referred to with a single
slash (we will use / for all of our examples, as we are
dealing with UNIX filesystems). Within a path, we can refer
to any subdirectory using its position relative to the root
directory (examples: /theses, /theses/available,
/theses/submitted). We use genealogical terms to describe
the relationships between directories. In the previous
examples, the /theses directory would be considered a child
of the root directory. The /theses/available and
/theses/submitted directories would be considered children
of the /theses directory, and could also be referred to as
siblings of each other (i.e. /theses/available is a sibling
of /theses/submitted). The /theses directory would be
considered the parent of the /theses/available directory,
and the / directory would be considered the parent of the
/theses directory.
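The parent, child, and sibling relationships described above can
be explored with Python's standard pathlib module. This is only an
illustrative sketch; the helper are_siblings is a hypothetical name
introduced here.

```python
from pathlib import PurePosixPath

available = PurePosixPath("/theses/available")
submitted = PurePosixPath("/theses/submitted")

def are_siblings(a, b):
    """Two different directories are siblings if they share a parent."""
    return a != b and a.parent == b.parent

print(available.parent)                    # /theses
print(available.parent.parent)             # /
print(are_siblings(available, submitted))  # True
```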
The path and filename are two critical components of
every URL, even if they are implied rather than explicitly
specified (as in the case of index.html files -- see
below). When choosing a name for a file or directory, you
want to be sure the final URL will be descriptive, without
being overly wordy or redundant. The following sections
give some guidelines for naming files and directories.
Tip 1: Avoid inserting redundant information in the name
of each subdirectory and file.
Bad File + Directory Name:
/ejournals/JTE/jte-v11n2/jte-v11n2-miscellany.html
Better File + Directory Name:
/ejournals/JTE/v11n2/miscellany.html
In the first example above, we've added 14 extra
characters of useless information to the path and filename,
which will be reflected in the URL users see and may even
have to type into their browser. In short: we're not going
to put issues of any journal but the JTE in the directory
/ejournals/JTE, so we can simply call the directory
/ejournals/JTE/v11n2. We're not going to put articles that
aren't a part of volume 11, number 2 in the directory
/ejournals/JTE/v11n2, so we can simply name the file
miscellany.html. So the better path and filename would be
/ejournals/JTE/v11n2/miscellany.html.
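As a rough illustration of Tip 1, the sketch below strips leading
words from each name when they merely repeat a word already used
higher up in the path. It is a heuristic under stated assumptions
(words are separated by dashes or underscores, and comparison
ignores case); the function name strip_redundant_prefixes is
hypothetical.

```python
import posixpath
import re

def strip_redundant_prefixes(path):
    """Drop leading words that repeat a word from an ancestor directory."""
    seen = set()     # lowercase words already used earlier in the path
    cleaned = []
    for component in path.strip("/").split("/"):
        stem, ext = posixpath.splitext(component)
        while True:
            # Peel off one leading word at a time while it is redundant.
            match = re.match(r"([A-Za-z0-9]+)[-_](.+)", stem)
            if match and match.group(1).lower() in seen:
                stem = match.group(2)
            else:
                break
        seen.update(word.lower() for word in re.split(r"[-_]", stem))
        cleaned.append(stem + ext)
    return "/" + "/".join(cleaned)

print(strip_redundant_prefixes(
    "/ejournals/JTE/jte-v11n2/jte-v11n2-miscellany.html"))
# /ejournals/JTE/v11n2/miscellany.html
```

A path that is already free of repeated words is returned
unchanged.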
Tip 2: Avoid overly general file and directory
names.
Bad File + Directory Name:
/site1/section1/file1.html
Better File + Directory Name:
/exhibits/spring2001/synopsis.html
The first file and directory name given above gives us
no hints as to the larger context for the page we're
looking at. A simple, descriptive name that hints at the
subject, time period, or broad category of materials we're
dealing with is a great help for the next person who has to
update and/or rearrange the files you work on.
Tip 3: Avoid overly terse file and directory names.
Bad File + Directory Name:
/arch/blhist/timln/syn.html
Better File + Directory Name:
/archives/black_history/timeline/synopsis.html
It used to be the case that file and directory names
could only be 8 characters long. This is no longer the
case, and names that are composed of full English words are
easier to remember than odd abbreviations or even
acronyms.
Tip 4: Avoid overly complex capitalization.
Bad File + Directory Name:
/Exhibits/Fall1999/CulinaryCOLL/Cover.html
Better File + Directory Name:
/exhibits/fall1999/culinary/cover.html
File and directory names on our web server are
case-sensitive. Failing to correctly capitalize even one letter
in the first example above results in a "File not found"
message. It's better practice to pick a case and stick to
it. Lowercase is best for this, as it requires less
fiddling with caps lock or shift keys on the part of users
who are typing in the URL.
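A quick way to spot names that break the lowercase convention is
to list every path component containing an uppercase letter. The
sketch below is illustrative only; uppercase_components is a
hypothetical helper name.

```python
def uppercase_components(path):
    """Return the path components that contain any uppercase letter."""
    return [part for part in path.split("/") if part != part.lower()]

print(uppercase_components("/Exhibits/Fall1999/CulinaryCOLL/Cover.html"))
# ['Exhibits', 'Fall1999', 'CulinaryCOLL', 'Cover.html']
print(uppercase_components("/exhibits/fall1999/culinary/cover.html"))
# []
```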
Tip 5: Avoid being too wordy.
Overly Long Directory Name:
/manuscripts_and_guides/smithfield_preston_collection
Better Directory Name:
/manuscripts/smithfield_preston
These days, it's possible to create very descriptive
filenames by incorporating dashes (-) and underscores (_)
in between common English words. Avoid the tendency to
write small essays when creating files and directories.
Stick to no more than two longer words or three short words
when naming a file or directory.
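The "two longer words or three short words" guideline can be
approximated in code. The sketch below is one possible reading,
not a fixed rule: it assumes a word of 7 or more letters counts as
"longer", and that a name containing any longer word is limited to
two words. The name is_too_wordy and the threshold are
hypothetical.

```python
import posixpath
import re

LONG_WORD_LENGTH = 7  # assumption: 7+ letters counts as a "longer" word

def is_too_wordy(name):
    """Return True if the name exceeds the suggested word limit."""
    stem, _ext = posixpath.splitext(name)
    words = [word for word in re.split(r"[-_]", stem) if word]
    has_long_word = any(len(word) >= LONG_WORD_LENGTH for word in words)
    limit = 2 if has_long_word else 3
    return len(words) > limit

for name in ["manuscripts_and_guides", "smithfield_preston_collection",
             "manuscripts", "smithfield_preston"]:
    print(name, is_too_wordy(name))
```

Under these assumptions the two long names from the example above
are flagged and the two shorter replacements are not.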
Tip 6: "index.html" files and why you should use
them
Our web server is configured to recognize a special
filename, index.html. When this file is found in a
directory, the contents of the file are returned instead of
a listing of the directory's contents. The first advantage
of using index.html files is that it keeps users from
inadvertently browsing through outdated versions of pages,
or from trying to make sense of a directory full of images
instead of the page that describes the images. The second
advantage is that URLs referring to an index.html file can
omit the filename index.html at the end, as it is
implied. Example: The URL http://scholar.lib.vt.edu/ejournals/index.html
is equivalent to the shorter URL http://scholar.lib.vt.edu/ejournals/.
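When writing links, the trailing index.html can be dropped
automatically. The following is a minimal sketch using Python's
standard urllib.parse module; shorten_index_url is a hypothetical
helper, and it assumes the server treats index.html as the default
file, as ours does.

```python
from urllib.parse import urlsplit, urlunsplit

def shorten_index_url(url):
    """Remove a trailing index.html from the path portion of a URL."""
    parts = urlsplit(url)
    path = parts.path
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]  # keep the trailing slash
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment))

print(shorten_index_url("http://scholar.lib.vt.edu/ejournals/index.html"))
# http://scholar.lib.vt.edu/ejournals/
```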
Tip 7: Divide larger groups of files using
subdirectories.
It's a good idea to break up larger sites using a
directory structure that corresponds to some common mental
model that users can follow along with. If you're marking
up electronic journals, you'll probably want a subdirectory
for each journal, and then a subdirectory for each issue of
the journal. A directory containing all the images for
every issue, or all of the articles for every issue, would
be difficult to maintain without reading each and every
page of the journal.
Think of the way written text is commonly broken up into
paragraphs. A single long paragraph containing dozens of
different ideas is hard to make sense of, because there are
no stopping points to allow people to digest information. If
you have one major idea for a group of pages, it's probably
OK if the pages share a directory with a single page that
covers another (related) concept. If you have more than two
or three pages on that other topic (whether or not it is
related), it may be time to think about putting those pages
in their own directory.