1.4. HTML Tutorial

Let’s get started with our own HTML document.

Create a directory on your computer called chapter01. You can create this directory on your desktop so it is easy to find.

Inside of that, create a file named index.html. HTML files should end in .html so the computer knows the file is a web page. If you are on Windows, make sure you have gone through the Enable Display of File Extensions section. Otherwise you will end up with a file named index.html.txt where the .txt is hidden.

When creating web sites, use all lowercase and no spaces in your file names.

Inside that document type a simple phrase like:

Hello there

Open a web browser and drag the file to the web browser window to open it up. You should see something like:

[Screenshot: First Web Page]

Now try the following code with all the weird spaces and blank lines:

Hello there.
How            are
you
doing


today?

Try it out. Notice that everything is on the same line:

[Screenshot: First Web Page]

Any amount of white space, from one space to 100 spaces and carriage returns, is treated as just one space. To tell the web browser we want different paragraphs, we need to use a tag, specifically a paragraph tag. It looks like this:

<p>Hello there.</p>
<p>How are
you doing today?</p>
<p>I am fine.</p>

[Screenshot: First Web Page]

All tags are enclosed in less than and greater than symbols, like <tag>. In this case, <p> is the paragraph tag. The p stands for paragraph.

The content for the tag goes in between the open and close tags. All close tags have a / before the tag name. This is a forward slash because it leans forward. It is the slash under the question mark on your keyboard. In this case, we used </p> to close the paragraph.

Tags can be nested. But you have to close inside tags before closing outside tags. For example, here we use the <em> tag. This is the emphasis tag. It will cause the text to be italicized by default.

<p>I have started a paragraph tag. I will open
a tag signifying <em>emphasis</em>.</p>

<p>This is NOT LEGAL. I will open
a tag signifying <em>emphasis but close it in
the wrong order.</p></em>

<p>This is NOT LEGAL. I will open
a tag signifying <em>emphasis but forget to
close it.</p>

Running this, you won’t see errors:

[Screenshot: First Web Page]

But if you validate the document using CSE Validator or the W3C Markup Validation Service, it will tell you what is wrong:

[Screenshot: HTML Validator]

You’ll also get some errors saying we are missing other important elements in our document. We’ll show you how to fix those errors soon.
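
For comparison, here is a sketch of how the two paragraphs marked NOT LEGAL above could be corrected. The sentence wording is made up for illustration. The only point is that the em tag closes before the p tag that contains it.

```html
<!-- Corrected: the inner em tag closes before the outer p tag. -->
<p>This is legal. I will open
a tag signifying <em>emphasis and close it in
the right order.</em></p>

<!-- Corrected: the em tag is no longer left open. -->
<p>This is also legal. I will open
a tag signifying <em>emphasis</em> and remember to
close it.</p>
```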

Some tags don’t have content. They begin and end at the same time. For example, the line break tag <br /> does not have content.

<p>This is a new tag that doesn't start a new
paragraph, it just goes to a<br />new line.</p>

1.4.1. HTML Comments

Like many languages, HTML has comments. Comments are just for the developer and not displayed to the user. They are a nice way of documenting your code, or disabling sections of code. Comments are created like the following:

<!-- This is my comment -->

Important! Unlike some languages, comments aren’t removed before the page reaches the user. Any user can do a “View Source” in a web browser and see the comments. Therefore, don’t write comments that you wouldn’t want everyone on the web to be able to read.
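
As a small sketch of using a comment to disable a section of code, a comment can span several lines and wrap whole tags. The paragraph text here is invented for illustration.

```html
<p>This paragraph is displayed.</p>

<!--
<p>This paragraph is commented out, so the browser does not display it.
It is still visible to anyone who views the page source.</p>
-->

<p>This paragraph is displayed too.</p>
```

Comments cannot be nested, so a section that is commented out this way should not contain another comment.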

1.4.2. Headings

You can add headings. There are six “levels” to the headings, using tags h1…h6. For example:

<h1>First level heading</h1>
<p>Paragraph.</p>

<h2>Second level heading</h2>
<p>Paragraph.</p>

<h2>Another second level heading</h2>
<p>Paragraph.</p>

<h1>Back to first level heading</h1>
<p>Paragraph.</p>

Here is a list of some of the most commonly used tags:

Tag             Description
<a>             Defines a hyperlink
<address>       Defines contact information for the author/owner of a document
<article>       Defines an article
<aside>         Defines content aside from the page content
<b>             Defines bold text
<body>          Defines the document’s body
<br>            Defines a single line break
<div>           Defines a section in a document
<em>            Defines emphasized text
<figcaption>    Defines a caption for a <figure> element
<figure>        Specifies self-contained content
<footer>        Defines a footer for a document or section
<h1> to <h6>    Defines HTML headings
<head>          Defines information about the document
<header>        Defines a header for a document or section
<hr>            Defines a thematic change in the content
<html>          Defines the root of an HTML document
<i>             Defines italic text
<img>           Defines an image
<kbd>           Defines keyboard input
<li>            Defines a list item
<link>          Defines the relationship between a document and an external resource (most used to link to style sheets)
<main>          Specifies the main content of a document
<meta>          Defines metadata about an HTML document
<nav>           Defines navigation links
<ol>            Defines an ordered list
<p>             Defines a paragraph
<pre>           Defines preformatted text
<script>        Defines a client-side script
<section>       Defines a section in a document
<small>         Defines smaller text
<span>          Defines an inline section in a document
<strong>        Defines important text
<style>         Defines style information for a document
<sub>           Defines subscripted text
<summary>       Defines a visible heading for a <details> element
<sup>           Defines superscripted text
<title>         Defines the title of the document. Should be in the head tag; shows up as the title name but not on the page.
<u>             Defines underlined text

One of the most popular tags is the <div> tag. This tag is used to define any sort of ‘division’ on your web page: basically, anything you’d want to put in a box, even other boxes. Because <div> is so generic, HTML5 added common elements like <article>, <footer>, <section>, and <summary>. They act like <div> tags, but give a better description of their content.
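
To make the comparison concrete, here is a sketch of the same small page fragment written twice, once with only div tags and once with the more descriptive HTML5 elements. The headings and text are placeholders, not taken from any real page.

```html
<!-- Generic version: every box is a div, so the tags say nothing
     about what each box holds. -->
<div>
  <h1>Articles</h1>
  <div>
    <h2>Chicken care</h2>
    <p>Article text goes here.</p>
  </div>
  <div>Copyright notice goes here.</div>
</div>

<!-- Same structure using HTML5 elements that describe their content. -->
<section>
  <h1>Articles</h1>
  <article>
    <h2>Chicken care</h2>
    <p>Article text goes here.</p>
  </article>
  <footer>Copyright notice goes here.</footer>
</section>
```

Without any style sheet, both versions look the same in a browser. The difference is in how clearly the markup describes the page.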

1.4.4. Images

In a manner similar to links, you can also load images in your web page. For this, we use the img tag and the src attribute. For example, this will load the happy_face.png image from the server:

<img src="happy_face.png" />

It is good practice to keep images in a separate folder. For example:

<img src="images/happy_face.png" />

Images can and should have alt text specified. If the image can’t load, this text will be displayed. If the user is blind, the screen reader will read back the alt text.

<img src="images/happy_face.png" alt="Happy Face" />

There are many image file formats. Here’s the abridged version:

  • .jpg: Use JPEGs for photos or photo-like images.
  • .png: Use PNGs for graphic art.
  • .gif: Use GIFs for graphic art. These can be animated too.
  • .svg: Browser support for SVG used to be poor, but is getting better. See SVG Can Do That? SVG can scale to any resolution because it does not store the “dots” that make up the image, but instead stores the drawing commands used to draw the image.

Normally the resolution of the image should match the resolution that you want it to appear on the screen. Otherwise you force the user’s browser to download extra image data, and then scale it. This makes for a slow website and poor image quality.
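
As a rough illustration of matching resolution, the sketch below assumes two hypothetical files: happy_face_200.png saved at 200 by 200 pixels, and happy_face_2000.png saved at 2000 by 2000 pixels. The width and height attributes give the display size in pixels.

```html
<!-- Hypothetical file saved at 200 by 200 pixels and shown at that size. -->
<img src="images/happy_face_200.png" alt="Happy Face" width="200" height="200" />

<!-- Works, but wasteful: the browser downloads the hypothetical
     2000 by 2000 pixel file and then scales it down to 200 by 200. -->
<img src="images/happy_face_2000.png" alt="Happy Face" width="200" height="200" />
```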

Note that each image file requires a separate request to the server to get the image. Each image on a page slows down the web page load time.

1.4.5. HTML Entities and Escape Characters

What happens if we want to display a < and not have it be part of a tag? For this and many other characters, we use HTML entities, also known as HTML escape characters. You can encode any character this way, but most of them don’t need to be encoded.

A full table of HTML entities can be found with a quick web search.
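
Here is a short sketch of a few common entities in use. The sentences are made up for illustration. Each entity starts with an ampersand and ends with a semicolon.

```html
<!-- &lt; and &gt; display the less than and greater than symbols. -->
<p>To start a paragraph, type &lt;p&gt; in your document.</p>

<!-- &amp; displays an ampersand. -->
<p>Fish &amp; chips</p>

<!-- &copy; displays the copyright symbol. -->
<p>&copy; 2099 by Mary Poppins</p>
```

In the browser, the first paragraph reads: To start a paragraph, type <p> in your document.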

To complicate matters, there is a different encoding scheme for URLs. Remember, HTML makes up the structure of web documents. URLs are the links to them. URLs can’t contain spaces or many other characters, so those characters need to be encoded. Check the end of the W3Schools HTML Escape Character table to find the URL escape character table.
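
The sketch below shows both encoding schemes side by side. The file names and query names are hypothetical. In a URL, a space is written as %20. An ampersand that is part of a URL is still written as the HTML entity &amp; when the URL sits inside an HTML attribute.

```html
<!-- Assumes a file named "my page.html" (hypothetical). The space in the
     file name is written as %20 inside the URL. -->
<a href="my%20page.html">My page</a>

<!-- The ampersand separating the two made-up query values is written
     with the HTML entity because the URL is inside an HTML attribute. -->
<a href="search.html?topic=chickens&amp;page=2">Chicken search, page 2</a>
```

Keeping spaces out of file names in the first place avoids the need for the first kind of encoding.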

1.4.6. Example Document Structure

What does a bare-bones HTML5 document look like? Here is an example that will validate without error using HTML Validator:

<!-- Basic HTML document. Remove comments for your own document. -->

<!-- This says the document is an HTML5 document. -->
<!DOCTYPE html>

<!-- All the HTML goes in the html tags. We also tell the browser that the
     language of the document is English because of the "en". Go here:
     https://www.loc.gov/standards/iso639-2/php/code_list.php for a full list. -->
<html lang="en">

<!-- The head has meta-info about the document that doesn't show up on the
     document, but instead is about the document. Confusingly, many
     documents also have a "header" which is at the top of the document
     and totally different than the head. Also, HTTP, the way we transfer
     HTML over the internet, has its own 'head' section. -->
<head>
  <!-- Specify how the letters are encoded. For Cyrillic, Kanji, Spanish
       characters, etc. The world has mostly standardized on UTF-8, so
       you can use it for about anything.
       For more info on character sets, see:
       https://youtu.be/MijmeoH9LT4
       -->
  <meta charset="utf-8">

  <!-- The title is what will be used when you bookmark a site, or the text
       that's on the tab. -->
  <title>Sample HTML5 Document</title>

</head>

<!-- All things that appear in the document are in the body.
     A document should have only one head, and only one body. Don't put the
     head in the body, or the body on top of the head. That's just weird. -->
<body>
  <p>Sample Document</p>
</body>
</html>

There are some new things in this structure. Let’s go through them.

1.4.6.1. HTML Document Type

There have been several versions of HTML. Browsers need to know what version is being given to them. For HTML version 5, all documents start with the following:

<!DOCTYPE html>

It seems odd that there’s no “5” in there, but all your documents should start with that. You might have some comments above, but that’s it.

1.4.6.2. The HTML Tag

An <html> tag should go around all your HTML code. It starts right after the doctype, and the last tag in your document should be the closing </html> tag.

The <html> tag should also tell the browser what language the document is in. This tag says the document is in English:

<html lang="en">

See below for the full list of languages and abbreviations:

https://www.loc.gov/standards/iso639-2/php/code_list.php
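
As a sketch of other values, the made-up document below is marked as Spanish with the code es. The lang attribute can also go on an individual tag to mark a passage in a different language, shown here with fr for French.

```html
<!DOCTYPE html>
<html lang="es">
<head>
  <meta charset="utf-8">
  <title>Documento de ejemplo</title>
</head>
<body>
  <p>Hola.</p>
  <!-- The lang attribute can also mark a single passage. -->
  <p lang="fr">Bonjour.</p>
</body>
</html>
```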

1.4.6.3. HTML Head Section

The <head> has meta-info about the document that doesn’t show up on the document, but instead is about the document.

Confusingly, many documents also have a <header> which is at the top of the document and totally different than <head>.

The head section contains a <meta> tag that specifies how the letters are encoded. This matters for Cyrillic, Kanji, Spanish characters, etc. The world has mostly standardized on UTF-8, so you can use it for just about anything. For more info on character sets, see:

https://youtu.be/MijmeoH9LT4

The <title> is what will be used when you bookmark a site, or the text that’s on the tab. You don’t see this on the main document at all.

No images or paragraphs can go in the head, as it is not displayed in the document.

<head>
  <meta charset="utf-8">
  <title>Sample HTML5 Document</title>
</head>

1.4.6.4. HTML Body Section

HTML has a <body> tag that should contain all of the document that you can see. It should go right after the closing </head> tag, and still inside the <html> tag. Only one body is allowed. Do not put anything after the closing </body> tag except the closing </html> tag.

1.4.7. Lists

Lists can be used for representing any hierarchical data. With CSS formatting, they can look very different than what you might think of as a classic list. For example, drop down menus are often done with lists.

Here is an example list. The main list is surrounded by an unordered list tag, ul. Each list item is surrounded by an li tag.

<ul>
  <li>Item 1</li>
  <li>Item 2</li>
  <li>Item 3</li>
</ul>

You can also have an ordered list, with the ol tag. Style sheets can make a ul look like an ol, but good practice is to use ol for numbered lists.

<ol>
  <li>Item 1</li>
  <li>Item 2</li>
  <li>Item 3</li>
</ol>

You can also have lists inside of lists. Notice how Item 3’s li tag does not end until after the inside list closes.

<ul>
  <li>Item 1</li>
  <li>Item 2</li>
  <li>Item 3
    <ul>
      <li>Sub item A</li>
      <li>Sub item B</li>
      <li>Sub item C</li>
    </ul>
  </li>
  <li>Item 4</li>
  <li>Item 5</li>
</ul>

Here is a slightly more complex HTML5 example that has some lists:

<!DOCTYPE html>

<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Sample HTML5 Document</title>
</head>

<body>
  <header>
    <nav>
      <ul>
        <li>
          <a href="index.html">Home Page</a>
        </li>

        <li>
          <a href="about.html">About</a>
        </li>

        <li>
          <a href="chickens.html">Page about chickens</a>
        </li>
      </ul>
    </nav>
  </header>
  <!-- Instead of section and article, many developers use the div tag. -->
  <section>
    <h1>Main Title</h1>

    <article>
      <h2>Article 1</h2>

      <p>Lorem ipsum dolor sit amet, consectetur adipisicing elit.
      Explicabo ullam vel repellendus molestiae. Corporis
      excepturi, reprehenderit, ex eum voluptatibus quidem
      consequatur natus, facere fuga iste impedit id ipsa dicta
      minus.</p>

      <p>Repudiandae cumque molestias nulla quam, provident
      explicabo, ratione alias tenetur nisi, sed id quae. Repellat
      autem, voluptate sunt explicabo. Sed adipisci officiis
      debitis voluptatum nisi, alias voluptates ducimus ipsam
      illo!</p>

      <p>Ea omnis, pariatur atque mollitia dolorem tempore iste
      illo id ducimus velit ratione tenetur! Fuga distinctio magni
      cumque laboriosam deserunt temporibus, impedit voluptates ex
      tempore, aperiam adipisci nobis libero tenetur!</p>
    </article>

    <article>
      <h2>Article 2</h2>

      <p>Tempore accusamus, modi assumenda enim ut et perferendis,
      deserunt eveniet repellendus, dolorum pariatur cum a. Commodi
      sint harum veritatis? Magni cumque facilis quaerat nemo,
      blanditiis ipsam repudiandae quos facere! Ab!</p>

      <p>Dolore alias rem, veniam in distinctio odit totam commodi
      eligendi dolores fuga. Non, voluptatum, accusantium. Pariatur
      voluptatem veritatis rerum mollitia. Officiis nisi magni
      maxime, cumque nemo inventore totam vel tenetur.</p>

      <p>Tempora ex deleniti adipisci sed, atque ipsum molestias,
      voluptatibus quis, autem quae aliquid earum eos ipsam
      praesentium, quidem ea inventore iure error tenetur laborum
      odit aliquam molestiae! Error, eum, accusamus!</p>
    </article>
  </section>
  <footer>
      Copyright 2099 by Mary Poppins
  </footer>
</body>
</html>

1.4.8. Tables

Tag       Description
<table>   Surrounds the entire table
<tr>      Surrounds cells in a row
<td>      Surrounds a cell. Can use attributes rowspan and colspan to span multiple rows or columns.
<th>      Surrounds a header cell.
<thead>   Optional, usually surrounds the header row(s)
<tbody>   Optional, surrounds the main data rows
<tfoot>   Optional, surrounds the summary/footer rows

<table>
  <thead>
    <tr>
      <th>Column 1</th>
      <th>Column 2</th>
      <th>Column 3</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Cell</td>
      <td>Cell</td>
      <td>Cell</td>
    </tr>
  </tbody>
  <tfoot>
    <tr>
      <td>Summary 1</td>
      <td>Summary 2</td>
      <td>Summary 3</td>
    </tr>
  </tfoot>
</table>
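
The rowspan and colspan attributes are easier to follow with an example. This is a minimal sketch with invented data. A cell that spans rows or columns takes the place of the cells it covers, so the rows it touches list fewer cells.

```html
<table>
  <tr>
    <th>Name</th>
    <th>Day</th>
    <th>Eggs</th>
  </tr>
  <tr>
    <!-- This cell spans two rows, so the next row has one fewer cell. -->
    <td rowspan="2">Henrietta</td>
    <td>Monday</td>
    <td>1</td>
  </tr>
  <tr>
    <td>Tuesday</td>
    <td>2</td>
  </tr>
  <tr>
    <!-- This cell spans two columns. -->
    <td colspan="2">Total</td>
    <td>3</td>
  </tr>
</table>
```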

1.4.9. More Information

How do you learn more?

1.4.9.1. Books, Books, and More Books

What we will learn in this class only scratches the surface of what there is to learn. Take time to look at the resources that are available at Dunn Library. Search for HTML, HTML5, CSS, CSS3, and PHP. Spend some time in the library and bookstores to see what is available.

1.4.9.2. Websites

I really like W3Schools for tutorials and as a reference on HTML and many other web technologies. Do note that they do not have anything to do with the W3C organization, and there is some hate for them on the Internet. Plus the “certifications” they sell for passing tests aren’t really worth anything.

1.4.10. Viewing HTML

Did you know you can see the HTML of any web page? When a web page is up, right-click and select “view source” or type Ctrl-U. Is that too much HTML to wade through? Only want to see the HTML of a particular element? Right-click on it and select “inspect element.” Browsers have a “debug” menu available by hitting the F12 key. Give it a try. Browsers offer many developer add-ons. For example, Firebug was a popular add-on for Firefox before its features were folded into the browser’s built-in tools. The built-in tools are good enough that I rarely use add-ons anymore, but you should know they are out there.

1.4.11. Validating HTML

The CSE HTML Validator is great for HTML work. You can use it to load a URL and check it for correctness. You can have it automatically check as you browse a website. It can even scan an entire website for errors in “batch” mode. CSE Validator has an option under “Tools” for “Pretty Print/Fix.” This will clean up the HTML formatting to make it easier to read. This can be useful in learning techniques to make your HTML readable and structured. If you don’t want to shell out money, you can view a page’s source, copy it, and then paste it into the W3C Markup Validation Service for validation. This isn’t as convenient, and you’ll lose time compared to CSE HTML Validator, but it does work if you are on a budget.

1.4.12. Final HTML Sample Document

<!DOCTYPE html>
<!-- Sample document by Paul Craven
     This is an HTML comment.
     The browser ignores it.
     But a user could see it by doing 'view source'.
-->
<html lang="en">

  <head>
    <meta charset="utf-8">
    <title>Example HTML Document</title>
  </head>

  <body>
    <!-- Our header, that has the navigation, website title, and logo -->
    <header>
      <!-- Nav section -->
      <nav>
        <ul>
          <li>
            <a href="index.html">Home Page</a>
          </li>

          <li>
            <a href="about.html">About</a>
          </li>

          <li>
            <a href="chickens.html">Page about chickens</a>
          </li>
        </ul>
      </nav>
      <!-- Logo and title -->
      <img src="company_logo.png" alt="Company logo" />
      <h1>My Company Title</h1>
    </header>
    <h1>Example HTML Document</h1>
    <p>This document covers several standard elements of an HTML document.</p>
    <p>You can find a good explanation of these elements and more at
    <a href='http://www.w3schools.com/html/'>W3Schools</a>.</p>

    <h2>Basic Tags</h2>
    <p>This<br />shows<br />how<br />to<br />use<br />a<br />line<br />break<br />tag.</p>
    <p>This is <b>bold</b> text. This is <i>italicized</i> text.</p>
    <p>x<sup>2</sup> = 10</p>
    <p>This is formatted <kbd>in keyboard</kbd> style.</p>
    <pre>
    This is preformatted
    text that is displayed
    as-is.
    </pre>

    <h2>Lists</h2>

    <h3>Unordered lists</h3>
    <ul>
      <li>Item 1</li>
      <li>Item 2</li>
      <li>Item 3</li>
    </ul>

    <h3>Ordered lists</h3>
    <ol>
      <li>Item 1</li>
      <li>Item 2</li>
      <li>Item 3</li>
    </ol>

    <h2>Images</h2>
    <img alt='sample local image' src='myimage.jpg'>
    <br />
    <img alt='Simpson Logo' src='http://simpson.edu/wp-content/themes/simpson/assets/img/logo-desktop.png'>
    <br />
    <img alt='Simpson Logo Big'
         src='http://simpson.edu/wp-content/themes/simpson/assets/img/logo-desktop.png'
         width='600'>

    <h2>Tables</h2>
    <table>
      <tr>
        <th>Heading 1</th>
        <th>Heading 2</th>
        <th>Heading 3</th>
      </tr>
      <tr>
        <td>Cell 1</td>
        <td>Cell 2</td>
        <td>Cell 3</td>
      </tr>
      <tr>
        <td colspan='2'>Cell 1</td>
        <td>Cell 3</td>
      </tr>
    </table>

    <h2>HTML Entities</h2>

    <p>Ampersand: &amp;</p>
    <p>Less than: &lt;</p>
    <p>Greater than: &gt;</p>
    <p>Copyright: &copy;</p>
    <p>Non-breaking space: These&nbsp;words&nbsp;won't&nbsp;wrap.</p>

    <h2>Forms</h2>
    <form action='processing_page.html'>
      Text field: <input name='sample_text_field' type='text'><br/>
      Password field: <input name='password' type='password'><br/>
      <input name="buttonset" value="1" type="radio"> Radio button 1<br/>
      <input name="buttonset" value="2" type="radio"> Radio button 2<br/>
      <input name="checkset" value="1" type="checkbox"> Check box 1<br/>
      <input name="checkset" value="2" type="checkbox"> Check box 2<br/>
      <input type="submit">
    </form>
    <footer>
        Copyright 2099 by Tom Baker
    </footer>
  </body>

</html>