What is HTML?
HTML stands for HyperText Markup Language. In short, it is a simple way to add formatting, layout and web links to electronic text. There are programs to view pages written in HTML for billions of personal computers and other computing devices, making HTML the lingua franca of the web.
The fundamental concept underlying HTML (and its parent language, SGML) is that blocks of text can be identified as having a particular structure or format by enclosing them in pairs of "tags" (also called "elements"). Below you'll see a sample piece of text in which one word has been highlighted in bold.
Example 1
Sample Code
Get <b>on</b> with it.
What It Looks Like
Get on with it.
In the example above, the start tag <b> identifies where the bold text will begin, and the end tag </b> identifies where the bold text will end. Tags can be nested (as long as certain rules are followed -- keep reading). Observe:
Example 2
Sample Code
Get <b><i>on</i></b> with it.
What It Looks Like
Get on with it.
When nesting tags as in the example above, the important thing to remember is that tags must act like envelopes, i.e. the end tag for an inner pair of tags must occur before the end tag for the outer pair of tags (see below).
Example 3
Right
Get <b><i>on</i></b> with it.
Wrong
Get <b><i>on</b></i> with it.
This concept would seem to be pretty straightforward, but even the simplest HTML document can involve a lot of nested tags, and it's easy to lose track.
The Overall Structure of an HTML Document
As we mentioned in the previous section, there are rules that govern which tags we can use, and how we use them. These standards are defined by the W3C (http://www.w3c.org/). Before we begin using any of the tags we're about to discuss, we have to know how to create the overall structure of an HTML document. See the next example for the most basic structure of an HTML document.
Example 4
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
</head>
<body>
</body>
</html>
The first line of any valid HTML document should indicate the version of the HTML standard the document will conform to. We do this using the <!DOCTYPE> tag at the top of the previous example. You don't have to memorize the innards of the tag. Just know that the example above is our current preference from the three or four existing HTML standards.
After the document type definition, the entire contents of an HTML document must be enclosed in <html> tags. Within the <html> tags, there are two broad sections of the document. The "head" of the document is defined using the <head> tags. In the head of an HTML document, the only things you'll usually find are:
- The Title of the document (discussed later).
- Metadata (discussed much later).
- Global Style Sheet Information (discussed much later).
- Scripting Routines (not discussed in this document).
The majority of the content of an HTML document is included in the "body" of the document, which is defined using the <body> tags. Unlike the head of the document, dozens of tags are supported within the body of the document. In fact, with the exception of the <title> and <meta> tags, all of the tags we will discuss from this point on can only be used between the <body> tags.
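To make the head/body split concrete, here is a minimal sketch of a complete document with something in each section. The title text, the metadata values, and the style rule are hypothetical placeholders, not taken from any real page.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
<!-- Title of the document (hypothetical text) -->
<title>Sample Page</title>
<!-- Metadata (hypothetical values) -->
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<meta name="description" content="A sample page.">
<!-- Global style sheet information (hypothetical rule) -->
<style type="text/css">
p { color: black; }
</style>
</head>
<body>
<!-- Everything the reader sees goes between the body tags -->
<p>Visible content goes here.</p>
</body>
</html>
```

Nothing in the head is displayed in the page itself. The title typically appears in the browser's title bar rather than in the document window.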
A Little More Detail About Tags
The behavior of most tags can be altered by setting "attributes" inside the text of the start tag. An attribute is set using the equal (=) symbol to assign a value to an attribute name (i.e. name="value"). See example below:
Example 5
Sample Code
<p>Here's the first paragraph.</p>
<p align="center">Here's the second paragraph.</p>
What It Looks Like
Here's the first paragraph.
Here's the second paragraph.
Many browsers will allow you to omit the quotes around the value, but it's much better to get used to putting the quotes in. Otherwise, you're really going to be upset when you check your pages with iCab.
The Two Types of Tags
There are two types of tags: inline elements, and block elements. You can see the difference in the example below.
Example 7
Sample Code
<p>Here's the <b>first</b> paragraph.</p>
<p>Here's the <b><i>second</i></b> paragraph.</p>
What It Looks Like
Here's the first paragraph.
Here's the second paragraph.
In the previous example, the inline elements are the bold (<b>) and italic (<i>) tags. The block elements are the paragraph (<p>) tags. As you can see from this simple example, inline elements do not interrupt the natural flow of the text. Block-level elements interrupt the natural flow of text, typically adding vertical space and returning the cursor to the starting position. Inline elements can occur within block elements or within other inline elements. Block elements can occur within block elements, but cannot occur within inline elements. This distinction is very important to remember.
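The nesting rules above can be sketched as follows. This is only an illustration: the sample text is made up, and the <blockquote> tag is used simply as a convenient second block element.

```html
<!-- Allowed: inline elements inside a block element -->
<p>This is <b>bold and <i>italic</i></b> text.</p>

<!-- Allowed: a block element inside another block element -->
<blockquote>
<p>A paragraph inside a blockquote.</p>
</blockquote>

<!-- Not allowed: a block element inside an inline element -->
<b><p>This paragraph is wrongly placed inside a bold tag.</p></b>

<!-- Corrected: the inline element goes inside the block element -->
<p><b>This paragraph is bold.</b></p>
```

Browsers will often display the incorrect version anyway, but a validator should flag it, and the result may differ from one browser to the next.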
Inline Elements
Bold Text (<b>)
See Example 7 above.
Italic Text (<i>)
See Example 7 above.
Subscripted Text (<sub>)
Subscripted text appears smaller and on a slightly lower level than normal text.
Example 8
Sample Code
Chemical Formula for Water:<br>
H<sub>2</sub>O
What It Looks Like
Chemical Formula for Water:
H₂O
Superscripted Text (<sup>)
Superscripted text appears smaller and on a slightly higher level than normal text.
Example 9
Sample Code
Pythagorean theorem:<br>
a<sup>2</sup> + b<sup>2</sup> = c<sup>2</sup>
What It Looks Like
Pythagorean theorem:
a² + b² = c²
<span> Tags (<span>)
<span> tags are generic inline elements. They are almost entirely used to support style sheets. Among other things, <span> tags and styles are used to change the color, font family, and size of text.
Example 10
Sample Code
Bulls are supposedly attracted to the color <span style="color:red">red</span>
What It Looks Like
Bulls are supposedly attracted to the color red
Break Tags (<br>)
Break tags are used to start a new line. Like image tags (see below), there is no closing tag for a break tag. Break tags can be distinguished from paragraph tags in that there is less vertical space involved. See Examples 8 and 9 above and example x below.
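As a minimal sketch of the difference between a break and a paragraph (the sample text is made up for illustration):

```html
<!-- Two paragraphs: the browser adds vertical space between them -->
<p>Roses are red,</p>
<p>Violets are blue.</p>

<!-- One paragraph with a break: a new line, but no extra vertical space -->
<p>Roses are red,<br>
Violets are blue.</p>
```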
Anchor Tags (<a>)
Anchor Tags are commonly called links. Anchor tags are most commonly used to link to external documents, to sections within external documents, and to sections within internal documents.
There are two attributes that are commonly used, the name attribute and the href attribute.
The name attribute is used to define a placeholder, a location within a larger document. There is an example of a placeholder and of a link to a placeholder included below.
The href attribute is set equal to a URL, either relative or absolute. A URL (example: http://scholar.lib.vt.edu/staff/handbook/index.html) is composed of four parts: the protocol (http://), a server name (scholar.lib.vt.edu), a path (/staff/handbook/), and a filename (index.html). A URL which has at least a protocol and a server is an absolute URL. A URL which does not contain a protocol and server name is a relative URL, and any omitted information is implied from the location of the current document. See the examples below for a demonstration of how missing components of a URL are added to a relative URL. It is important to know that many web servers are set to look for a default filename (index.html, index.htm, default.html, default.htm) when a URL includes a path but not a filename.
Example 11
Sample Code
<p><a name="placeholder_example">Placeholder example</a></p>
<p><a href="tips_and_tricks.html">Relative URL -- filename only.</a></p>
<p><a href="/staff/handbook/">Relative URL -- path relative to the web root.</a></p>
<p><a href="../images/">Relative URL -- path relative to the current directory.</a></p>
<p><a href="http://scholar.lib.vt.edu/staff/handbook/index.html">Absolute URL -- no information implied</a></p>
<p><a href="http://scholar.lib.vt.edu/">Absolute URL -- path is "/", filename is implied (index.html)</a></p>
<p><a href="#placeholder_example">Relative URL -- links to internal placeholder defined above.</a></p>
What It Looks Like
Relative URL -- filename only.
Relative URL -- path relative to the web root.
Relative URL -- path relative to the current directory.
Absolute URL -- no information implied
Absolute URL -- path is "/", filename is implied (index.html)
Relative URL -- links to internal placeholder defined above.
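To show how the missing pieces of a relative URL get filled in, here is a sketch that assumes the page containing the links is located at http://scholar.lib.vt.edu/staff/handbook/index.html (the sample URL used earlier in this section). The comments give the absolute URL a browser would request under that assumption.

```html
<!-- Filename only: protocol, server, and path come from the current page.
     Resolves to http://scholar.lib.vt.edu/staff/handbook/tips_and_tricks.html -->
<a href="tips_and_tricks.html">Tips and tricks</a>

<!-- Leading slash: only protocol and server come from the current page.
     Resolves to http://scholar.lib.vt.edu/staff/handbook/ -->
<a href="/staff/handbook/">Handbook</a>

<!-- Two dots: go up one directory from the current path.
     Resolves to http://scholar.lib.vt.edu/staff/images/ -->
<a href="../images/">Images</a>

<!-- Placeholder only: everything comes from the current page.
     Resolves to http://scholar.lib.vt.edu/staff/handbook/index.html#placeholder_example -->
<a href="#placeholder_example">Placeholder</a>
```

If the same links were placed in a page at a different location, the relative URLs would resolve differently, while an absolute URL would always point to the same place.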
Image Tags (<img>)
Image tags are used to indicate where an external image file should be displayed within a document. Like break tags (see above), image tags do not need a closing tag. The src attribute of the image tag is set equal to an absolute or relative URL (see description of URLs in anchor tag section above). The URL should point to the location of an acceptable image. Currently, JPEG and GIF are the image formats of choice (GIF images are used for graphics with a limited range of colors, JPEG images for photos -- see the images section for more information).
The border attribute is used to turn off the outline that appears around an image which is enclosed in an anchor tag (see Example 12).
Example 12
Sample Code
<p>
<img src="/staff/handbook/images/skills.gif" alt="Skills"><br>
Image
</p>
<p>
<a href="/staff/handbook/skills/"><img src="/staff/handbook/images/skills.gif" alt="Skills"></a><br>
Image inside an anchor tag
</p>
<p>
<a href="/staff/handbook/skills/"><img src="/staff/handbook/images/skills.gif" alt="Skills" border="0"></a><br>
Image inside an anchor tag, no border
</p>
What It Looks Like
Image
Block Elements
Blockquote Tags (<blockquote>)
Example 13
Sample Code
<p>Here's some text before the blockquote tags. Note how the text extends all the way to the edge of the screen on both the left and right margins.</p>
<blockquote> Here's some blockquoted text. Note how the text is indented away from both the left and right margin. If you can't tell that the text is indented on the right, try decreasing the size of your browser window. </blockquote>
What It Looks Like
Here's some text before the blockquote tags. Note how the text extends all the way to the edge of the screen on both the left and right margins.
Here's some blockquoted text. Note how the text is indented away from both the left and right margin. If you can't tell that the text is indented on the right, try decreasing the size of your browser window.
<div> tags (<div>)
<div> tags are generic block-level elements used to add style information, and are the block-level equivalent of <span> tags. The only attribute that is typically used (besides the style attribute) is the align attribute. A <div> tag with its align attribute set to center should be used in place of the outdated center tag.
Example 13
Sample Code
<div style="color:green">Here's a div with styles applied.</div>
<div align="center">Here's a div that's centered.</div>
What It Looks Like
Here's a div with styles applied.
Here's a div that's centered.
Heading Tags (<h1> - <h6>)
Heading tags are block-level elements that display bold text on its own line. The size of the heading varies according to the number following the "h" (see example below), ranging from <h1> (the largest) to <h6> (the smallest). Of course, with styles, it's possible to change the appearance of headings, so lower numbered headings may not always appear larger than higher numbered headings.
Example 13
Sample Code
<h1>Heading 1</h1>
<h2>Heading 2</h2>
<h3>Heading 3</h3>
<h4>Heading 4</h4>
<h5>Heading 5</h5>
<h6>Heading 6</h6>
What It Looks Like
Heading 1
Heading 2
Heading 3
Heading 4
Heading 5
Heading 6
Horizontal Rules (<hr>)
A horizontal rule is a block-level element that displays a horizontal line at the current position. The relevant attributes are align and width. The align attribute controls the horizontal alignment, and supports the values "left", "center", and "right". The default alignment (in the absence of style information) is center. The width attribute supports pixel widths as well as percentage widths (see example below). The percentage width and the default width (100%) are based on the innermost enclosing block-level element. In the examples below, this is the table cell containing the horizontal rules.
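Example 13
Sample Code
<hr width="5%" align="left">
<hr width="10%" align="center">
<hr width="15%" align="right">
<hr width="10">
What It Looks Like
(Four horizontal rules: 5% wide and left-aligned, 10% wide and centered, 15% wide and right-aligned, and 10 pixels wide with the default alignment.)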
Lists (<ol> and <ul>)
Lists are self-explanatory: one item appears below another, the items are usually indented, and a number or a bullet appears beside each one. See examples below.
Unordered List
An unordered list simply lists one item below another with the same "bullet" next to each item.
Example 14
Sample Code
<ul>
<li>one</li>
<li>two</li>
<li>three</li>
</ul>
What It Looks Like
- one
- two
- three
Ordered List
An ordered list displays one item below another, and increments the number for each new item.
Example 15
Sample Code
<ol>
<li>dishwashing detergent</li>
<li>onions</li>
<li>ice cream</li>
</ol>
<p>You can continue a numbered list by specifying the starting number.</p>
<ol start="4">
<li>rice</li>
<li>tofu</li>
<li>bacon grease</li>
</ol>
What It Looks Like
1. dishwashing detergent
2. onions
3. ice cream
You can continue a numbered list by specifying the starting number.
4. rice
5. tofu
6. bacon grease
Paragraph Tags (<p>)
Paragraph tags are by far the most common block-level element. As you would expect, paragraph tags simply add space before and after a block of text. See example 6 for a clear demonstration of this.
Tables (<table>)
Tables are the most versatile element in HTML. They can be used in a simple row/column format to display data. They can also be used for layout, i.e. to organize portions of the page seamlessly without using problematic technologies like frames.
A table itself is first enclosed by <table> tags. The <table> tags may contain one or more rows (<tr> tags), each of which may contain one or more cells (<td> or <th> tags).
Table data cells (<td> tags) may contain almost any other block-level or in-line element. It is not uncommon to see nested tables (see example X below).
Table heading cells (<th> tags) are almost identical to table data cells, but are typically centered and bolder. However, it is not appropriate to simply center or highlight text in a normal table data cell to give the appearance of a table header, as screen readers will not be able to tell the difference between table data (<td> tags) and table headers (<th> tags).
Our first example is a simple table, with headings and data.
Example 16
Sample Code
<table>
  <tr>
    <th>1999</th>
    <th>2000</th>
  </tr>
  <tr>
    <td>299,000</td>
    <td>1,234,567</td>
  </tr>
</table>
What It Looks Like
1999       2000
299,000    1,234,567
Our next example is a table with borders and background colors, with cells that span more than one row or column. Please note that spanning proceeds from the left to the right, and from the top to the bottom.
Example 17
Sample Code
<table border="1">
  <tr>
    <td rowspan="2" bgcolor="#ccccff"> </td>
    <td colspan="2" bgcolor="#ccffcc"> </td>
  </tr>
  <tr>
    <td bgcolor="#999999"> </td>
    <td rowspan="2" bgcolor="#ffcccc"> </td>
  </tr>
  <tr>
    <td colspan="2" bgcolor="#ccffff"> </td>
  </tr>
</table>
What It Looks Like
Here's an example of a nested table (based on the above).
Example 18
Sample Code
<table border="1">
  <tr>
    <td rowspan="2" bgcolor="#ccccff"> </td>
    <td colspan="2" bgcolor="#ccffcc"> </td>
  </tr>
  <tr>
    <td bgcolor="#999999">
      <table border="1">
        <tr>
          <td rowspan="2" bgcolor="#ccccff"> </td>
          <td colspan="2" bgcolor="#ccffcc"> </td>
        </tr>
        <tr>
          <td bgcolor="#999999">
            <table border="1">
              <tr>
                <td rowspan="2" bgcolor="#ccccff"> </td>
                <td colspan="2" bgcolor="#ccffcc"> </td>
              </tr>
              <tr>
                <td bgcolor="#999999"> </td>
                <td rowspan="2" bgcolor="#ffcccc"> </td>
              </tr>
              <tr>
                <td colspan="2" bgcolor="#ccffff"> </td>
              </tr>
            </table>
          </td>
          <td rowspan="2" bgcolor="#ffcccc"> </td>
        </tr>
        <tr>
          <td colspan="2" bgcolor="#ccffff"> </td>
        </tr>
      </table>
    </td>
    <td rowspan="2" bgcolor="#ffcccc"> </td>
  </tr>
  <tr>
    <td colspan="2" bgcolor="#ccffff"> </td>
  </tr>
</table>
What It Looks Like
In our final example, a table is used to combine images and text. The image was created with Illustrator, copied into Photoshop, then cut into four pieces. A table was used to combine the four pieces so that text could be used in the center square. If you want to duplicate this effect, you'll need to download the images in the "arrow" directory under this one.
Example 19
Sample Code
<table cellspacing="0" cellpadding="0">
  <tr>
    <td rowspan="3" width="25" height="93">
      <img src="arrow/9.gif" width="25" height="93" alt="arrow">
    </td>
    <td height="26" width="142" valign="bottom">
      <img src="arrow/12.gif" height="26" width="142" alt="arrow">
    </td>
    <td rowspan="3" height="93" width="50">
      <img src="arrow/3.gif" height="93" width="50" alt="arrow">
    </td>
  </tr>
  <tr>
    <td width="142" height="38" align="center">
      <span style="font-size:12px">Text inside graphics</span>
    </td>
  </tr>
  <tr>
    <td width="142" height="29" valign="top">
      <img src="arrow/6.gif" width="142" height="29" alt="arrow">
    </td>
  </tr>
</table>
What It Looks Like
Text inside graphics
This is one type of layout table, and is used extensively on sites like the VT imagebase. This page itself uses a layout table, although the integration of graphics and text is not as seamless. View the source of this page and you'll hopefully see what I mean.
HTML Entities
There are a number of special characters that cannot be displayed normally using plain ASCII text. Many of these characters are from European languages that use accents and diacritical marks. Others are less common punctuation marks. To represent these characters in HTML documents, we use what are called HTML entities.
An HTML entity consists of an ampersand (&) character, followed by a name or by a pound sign (#) and a number, followed by a semicolon (;). As an example, to display an acute e (é), we would use the entity &eacute;. The following table displays the most commonly used HTML entities.
Entity      What It Looks Like
&amp;       &
&quot;      "
&eacute;    é
&ntilde;    ñ
&uuml;      ü
&mdash;     —
For a complete list of HTML entities, refer to the book HTML: The Definitive Guide, which should be available in the office. For a very dry reference (but one that is not likely to go away) please visit the W3C page on entities that are supported in HTML 4.0.
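As a short sketch of entities in running text (the sentence itself is made up for illustration), the paragraph below uses several named entities and one numeric entity. The number 233 is the character code for é, so &#233; and &eacute; display the same character.

```html
<!-- Named entities for accented letters, the ampersand, and quotes -->
<p>Caf&eacute; &amp; Bar &quot;El Ni&ntilde;o&quot; in Z&uuml;rich</p>
<!-- Displays as: Café & Bar "El Niño" in Zürich -->

<!-- The same acute e written as a numeric entity -->
<p>Caf&#233;</p>
<!-- Displays as: Café -->
```

Note that the ampersand itself must be written as &amp; when it appears in text, since a bare ampersand signals the start of an entity.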