INTRODUCTION TO HTML

Tilburg   

1.0  BACKGROUND

HTML was designed as a language for describing document structure. The basic HTML elements specify headings, titles, and paragraphs, but not margins or fonts.

The HyperText Markup Language (HTML) is an SGML application for marking up documents for inclusion in the World Wide Web. HTML allows you to:

- Publish documents to the Internet in a platform-independent format
- Create links to related works from your document
- Include graphics and multimedia data with your document
- Link to non-World Wide Web information resources on the Internet

2.0    WHAT IS MARKUP?

- Markup is the act of inserting additional text into a document. This text is not usually visible to the reader and is not part of the content, but it enhances the document in some way, such as by capturing document structure or adding hypertext capability.
- Markup also refers to the additional text itself, also known as tags, which is inserted in the document.

An example of markup:

Grocery list

<UL>

<LI>Apples

<LI>Oranges

</UL>

Components of HTML Markup

<A HREF="Virginia.gif">More about Virginia</A>
- A is the anchor tag (tags are also loosely referred to as elements).
- HREF is an attribute of the anchor tag.
- ="Virginia.gif" is the value being assigned to the attribute.
- > closes the start tag.
- The phrase More about Virginia is the element's content.
- </A> is the end tag for the anchor.

Some elements are allowed only inside certain other elements. For example, all other HTML elements must appear inside the <HTML> element.

3.0    HTML Levels

There are four official levels, or versions, of HTML conformance. Each level encompasses a set of tags, and each higher level includes all the tags of the levels below it.

Level 0

The minimum set of tags that constitute an HTML document (most tags currently in use). Level 0 tags are usually rendered consistently from browser to browser.

Level 1

Level 0 tags plus tags for highlighting (also called Logical Tags) and images.

Level 2

Level 0 and Level 1 tags plus form tags.

Level 3 (Version 3.2)

Previous levels plus support for client-side image maps and scripts, and table markup elements.

3.1    Elements, Tags, and Attributes

- Tags specify structural elements in a document, such as headings. A tag begins with a left-angle bracket < and ends with a right-angle bracket >. The first word between the angle brackets is the tag's name. Any further words and characters are attributes, e.g. align=right.

A tag is therefore the basic 'item', and an attribute is some extra detail, such as how to align the content.

- An element comprises three parts: a start tag, content, and an end tag. Most tags have a matching closing tag, such as </h2>, which marks the place where the effect of the opening tag should stop.

Note: Tag names are case-insensitive.

 

Tags should nest properly: a tag opened inside an element must also be closed inside it. For example, to put part of a header in italics:

 
<h2>
           Tags <i>and</i> Attributes
</h2>

 

Note: HTML documents are free-format. You can use spaces and tabs however you like, and break lines anywhere.

Browsers allow a great deal of flexibility about which tags you need to put into a web page.
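
As a minimal sketch of the free-format rule, the two fragments below differ only in spacing and line breaks, so a browser should display them identically. The layout of the second fragment is purely illustrative.

```html
<h2>Tags <i>and</i> Attributes</h2>

<h2>
    Tags
        <i>and</i>
    Attributes
</h2>
```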

3.2    Document Structure

An HTML document consists of two main parts: the Head and the Body. The basic document structure is:

<HTML>
<Head> ... </Head>
<Body> ... </Body>
</HTML>

- The Head contains information about the document, such as links to pages that could be preloaded.

- The Body contains the document to be displayed.

The main Head element is the <TITLE> tag. Every document should have a title, which appears as a 'label' on the browser window:

<Title>A Basic Introduction to HTML</Title>

Another useful Head tag is the <META> tag, which helps if you wish to optimize your pages for search engines.

 1: <HTML>
 2: <Head>
 3:   <Title>A Simple Document</Title>
 4:   <Meta Name    = "Keywords"
 5:         Content = "Hypertext">
 6: </Head>
 7: <Body>
 8:   ... This stuff is
        what the user sees ...
 9: </Body>
10: </HTML>

 

The numbers and colons are not part of the HTML file, but serve to associate the following comments with the lines above:

  1. Declares this to be an HTML document.
  2. The Head contains items that are about the document.
  3. The title used in the browser title bar, hotlists, listings, etc.
  4. Meta tags can be used to add information not already specified in the HTML/HTTP system.
  5. Some search engines make use of these keywords, as well as those in the Body.
  6. Closes the Head.
  7. Body contains the document's displayable content.
  8. The text and markup commands that the user sees rendered.
  9. Closes the Body.
  10. Closes the HTML.

HTML also supports interactive forms, "hotspots" in pictures, more versatile formatting choices and styles, and formatted lists. Other improvements include an e-mail URL, which lets a hyperlink send e-mail automatically: choosing an e-mail address in a portion of hypertext opens a mail application, ready to send e-mail to that address.
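
A minimal sketch of such an e-mail link, using the mailto: URL scheme. The address and link text are placeholders, not taken from this document.

```html
<!-- The address below is a hypothetical placeholder -->
<a href="mailto:webmaster@example.com">Send e-mail to the webmaster</a>
```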

3.3   Headers

There are six headers: H1, H2, H3, H4, H5, and H6. H1 is the "main" header, usually used once at the top of the document. H6 is the "smallest" header and is rarely used, though it is often abused to make small bold text.
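
The following sketch shows how header levels might be used to outline a page. The heading text is illustrative only.

```html
<h1>Introduction to HTML</h1>   <!-- main header, used once -->
<h2>Background</h2>             <!-- major section -->
<h3>What Is Markup?</h3>        <!-- subsection -->
<h6>The smallest header</h6>    <!-- rarely needed -->
```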

3.4    Anchors (Links)

The fundamental feature that makes the WWW so powerful is, of course, the hypertext link. The tag that creates these links is called the anchor tag (A). It has one commonly used attribute, HREF, which specifies the URL of the target document.

3.5    Images

Images have made a profound difference in the way the web looks.

The simplest way to make an inline image is an IMG tag with a SRC attribute naming the image file, for example <Img src = "/Icons/graphics.gif">. You can wrap it inside anchor tags, and then it will be a clickable image:

<a    href     = "../../Graphics/">
<Img  src      = "/Icons/graphics.gif"></a>

 

It is important to specify the image dimensions (this allows the browser to lay out the page sooner) and, with the alt attribute, the text to show if the browser doesn't have image support or if the user has image loading turned off:

 

<a    href     = "../../Graphics/">
<Img  src      = "/Icons/graphics.gif"
      width    = 108
      height   = 44
      border   = 0
      hspace   = 16
      alt      = "Graphics"
      align    = left
      ></a>

3.6    Character Styles

- EM is called a logical style: you specify what you're trying to do, rather than how to do it. Another one is STRONG.

Example:

<em>EM</em> and <strong>STRONG</strong>.

- Emphasis is usually indicated with italics:

<I>italics</I>.

- 'Strong' is usually rendered as bold:

<b>bold</b>.

- SAMP is rendered in a teletype font.

Example:

<samp>SAMP</samp> is rendered in a <tt>teletype</tt> font.

Note: To display angle brackets or HTML tags literally, either write &lt; for < and &gt; for >, or try the XMP tag, which renders everything literally until the closing XMP tag.
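
As a small sketch of the escaping rule in the note above, the line below should be displayed with the STRONG tags visible as literal text instead of being interpreted. The sentence itself is illustrative.

```html
<p>To mark strong text, write &lt;strong&gt;text&lt;/strong&gt; around it.</p>
```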

3.7    Paragraphs and Line Breaks

- Except inside special tags such as PRE, the browser ignores white space and line breaks. A line break is marked with: <br>

- A paragraph break (i.e. a line break followed by an empty line between paragraphs) is marked with: <p>

- The paragraph tag has an optional closing tag: </p>
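
A minimal sketch of the difference between the two tags, with illustrative text:

```html
<p>This is the first paragraph.
These two lines are joined, because a line break in the source is ignored.</p>

<p>This is the second paragraph.<br>
This sentence starts on a new line within the same paragraph.</p>
```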

3.8    Lists

There are several kinds of lists. They include:

  1. Ordered.
  2. Unordered.
  3. Definition.

An ordered list has numbered items. To make the above list:

<ol>
<li>  Ordered.
<li>  Unordered.
<li>  Definition.
</ol>

To have it without numbered items, use an unordered list, which is displayed with bullets:

- Ordered.
- Unordered.
- Definition.
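
The markup for the bulleted version is not shown above. A minimal sketch, assuming the standard UL tag is used in place of OL:

```html
<ul>
<li>  Ordered.
<li>  Unordered.
<li>  Definition.
</ul>
```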

A definition list looks like this:

Ordered Lists: The list items are ordered, e.g. by numerals.

Unordered Lists: The list items aren't ordered in any particular way.

Definition Lists: The list items have two parts: a title DT and a description DD.

A definition list is made like this:

 

<dl>
<dt>  Ordered Lists.
<dd>  The list items are ordered,
               e.g. by numerals.
<p>
<dt>  Unordered Lists.
<dd>  The list items aren't ordered.
<p>
<dt>  Definition Lists.
<dd>  The list items have two parts:
      a title DT and a description DD.
</dl>

3.9    Tables

Tables consist of rows containing headers and data cells:

Name     Tag      Typical Appearance
-----    -----    -------------------
Table    TABLE    A table like this
Row      TR       A row like this
Head     TH       Bold, centered
Data     TD       Plain, left aligned

The table tag attributes used here are:

- bgcolor: the table's background color.
- border: the width in pixels of the border (0 for no border).
- cellpadding: how much space to leave between the border and the cell contents.
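
The table above could be marked up roughly as follows. This is a sketch only: the attribute values (color, border width, padding) are assumptions, since the original page's values are not given.

```html
<!-- Attribute values are illustrative assumptions -->
<table bgcolor="#ffffcc" border="1" cellpadding="4">
<tr> <th>Name</th>  <th>Tag</th>   <th>Typical Appearance</th>  </tr>
<tr> <td>Table</td> <td>TABLE</td> <td>A table like this</td>   </tr>
<tr> <td>Row</td>   <td>TR</td>    <td>A row like this</td>     </tr>
<tr> <td>Head</td>  <td>TH</td>    <td>Bold, centered</td>      </tr>
<tr> <td>Data</td>  <td>TD</td>    <td>Plain, left aligned</td> </tr>
</table>
```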

4.0    Information Type Elements

Information type elements are used to mark up textual structures within the body of the document. There are tags for emphasizing important sections of text, for definitions and citations, and for other data. It is up to the browser to decide how to display these structures.

- <EM> indicates that this portion of text should be emphasized (usually italicized).
- <STRONG> indicates stronger emphasis than <EM> (usually bold).
- <ADDRESS> is used to record information that can be used to contact the document author.
- <DFN> is used to mark up a definition.
- <CITE> is used to mark up a citation from another document.
- <CODE> is used to mark up sections of program code.
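
A short sketch that uses each of these elements once. The sentences and the contact address are illustrative placeholders.

```html
<p>A <dfn>start tag</dfn> is the markup that opens an element.
It is <em>usually</em> paired with an end tag, and the two
<strong>should</strong> nest properly.</p>

<p>As <cite>Introduction to HTML</cite> shows, a header is written
<code>&lt;h2&gt;Tags&lt;/h2&gt;</code>.</p>

<!-- Hypothetical contact details -->
<address>Maintained by the document author, author@example.com</address>
```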

5.0    Physical Style Elements

Font Size

- <BIG> displays text in a larger font.

- <FONT> lets you specify how much larger or smaller the contained text should be than the surrounding text. It takes a SIZE attribute:
<FONT SIZE=+3>Three sizes larger</FONT>

- <SMALL> displays text in a smaller font.

6.0    BUILDING LINKS IN HTML

HTML Anchor Tag

Anchors are what make HTML a hypertext language. An anchor consists of a start tag (<A, then one or more attributes naming or describing the anchor, then >), the content that becomes the link, and an end tag </A>:

<A HREF="slide17.html">HTML Anchor Tag</A>


This is displayed as the selectable text HTML Anchor Tag.

There are basically two types of anchors: start and destination. Start anchors are selectable segments of text, such as HTML Anchor Tag above, while destination anchors are portions of text that mark an available destination. Here is an example of a start and a destination, in which the start example points to the destination example:

Start:          <A HREF="#SGML">SGML</A>
Destination:    <A NAME="SGML">Standard Generalized Markup Language</A>

Building Links to HTML Files

Links are built using the HREF attribute of the anchor tag. HREF must be assigned a value that identifies the destination. The target can be within the same document, another document on the same server (filename), a document on a different server (URL), or a portion of text in another document.

 

<A HREF="#SGML">More information about SGML</A>

 provides a link to a target named SGML in the current document.

 <A HREF="http://scholar.lib.vt.edu/html-intro.html">More information about HTML</A>

 links to a document called html-intro.html on the World Wide Web.

 <A HREF="http://scholar.lib.vt.edu/html-intro.html#SGML">More information about SGML</A>

 links to a target named SGML in a file called html-intro.html on the World Wide Web.

 Note: An anchor can be both a link and a target:

 <A NAME="SGML2" HREF="#SGML">if you are still lost</A>

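 can be pointed to with the name SGML2 and points to a target called SGML.

Note: Any value assigned to an attribute should be enclosed in double quotes. Browsers generally accept simple unquoted values, such as numbers or single words, but quoting is always safe.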

7.0    Using Uniform/Universal Resource Locators (URL)

Anchors can link to remote data when a Uniform Resource Locator (URL) is used with the HREF attribute. Any type of data on almost any type of information server on the Internet can be accessed using a URL. The URL has three main components:

  1. Server/Resource Type:
    - file (historically used for files fetched by File Transfer Protocol; ftp is the usual scheme name today)
    - gopher
    - http (World Wide Web)
    - news (Usenet News)
    - telnet
  2. Internet host name (and port, if required)
  3. Filename and path

Most URLs include the characters :// to divide the server type from the Internet address; Usenet news URLs are the exception. The filename can be truncated to a forward slash /, which tells the server to send a default document or directory listing. Telnet does not require a filename or path. Here are example URLs for several common resource types:

 

file://scholar.lib.vt.edu/pub/next/HTML-Editor.FAT.compressed
gopher://vatech.lib.vt.edu/
http://scholar.lib.vt.edu/library.html
news:comp.infosystems.www.providers
telnet://vtls.vt.edu
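
As a sketch of how such URLs are used with HREF, the list below links to several of the resource types above. The URLs are taken from this document; the link text is illustrative. The last entry shows a port number (8001) following the Internet name.

```html
<ul>
<li><a href="http://scholar.lib.vt.edu/library.html">A Web page</a>
<li><a href="gopher://vatech.lib.vt.edu/">A Gopher server (default listing)</a>
<li><a href="news:comp.infosystems.www.providers">A Usenet newsgroup</a>
<li><a href="telnet://vtls.vt.edu">A Telnet session</a>
<li><a href="http://nebula.lib.vt.edu:8001/cgi-bin/marian-gate">A server on port 8001</a>
</ul>
```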

8.0    Interactive Document Content

Forms

Users can provide feedback or use HTML to access databases through forms. Forms can be constructed from five Level 2 HTML tags:

- FORM
- INPUT
- OPTION
- SELECT
- TEXTAREA

They give a user the ability to enter information, which can then be processed on the server as survey information, search terms for a database, an information request, etc. Forms by themselves only allow data entry. They require software, commonly referred to as gateways, that receives the data, processes it, and returns a response to the WWW client. Gateways are custom applications written in a programming language available on the server (such as Perl or C). Some authors use JavaScript to implement client-side form processing.

Form Tags – FORM

The <FORM> tag is placed around a section of an HTML document that includes form elements. Other BODY tags can occur in a form, and multiple forms can occur in a document, but forms cannot be nested.

There are two attributes essential to forms:

- ACTION: indicates the URL of the processing gateway. This URL points to a program rather than a document. The program receives the contents of the form in one of two ways, depending on the value specified for the METHOD attribute.

- METHOD: can be assigned one of two values: GET or POST. With GET, the form data is appended to the ACTION URL as a query string; with POST, it is sent to the gateway separately, in the body of the request. If you are using an existing gateway, refer to its documentation for the correct METHOD.

 

Example:

<FORM METHOD=GET ACTION="http://nebula.lib.vt.edu:8001/cgi-bin/marian-gate">

This sends the contents of a form to a gateway called marian-gate. Form elements should not occur outside the <FORM> start and end tags.

Form Tags – INPUT

The <INPUT> tag provides some type of data entry point in the form depending on the value of its TYPE attribute:

 

checkbox and radio specify selectable options: 

<INPUT TYPE=checkbox>  <INPUT TYPE=radio>

 

reset and submit make the INPUT field a button to clear the form or send its contents to the gateway:

 

<INPUT TYPE=reset> <INPUT TYPE=submit>

 

HIDDEN is used to conceal a field that has a preset value, which the user cannot change but which the gateway would not otherwise know.

 

TEXT specifies that a data input field be displayed:

<INPUT TYPE="TEXT">

 

IMAGE displays an image; when the user selects a spot on it, the coordinates are passed to the gateway.

(INPUT is an empty tag, like IMG, with no end tag.)
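
Putting these pieces together, a complete form might look like the sketch below. The ACTION URL is the one used earlier in this section; every NAME and VALUE is a hypothetical choice made for illustration, and a real gateway such as marian-gate may expect different field names.

```html
<FORM METHOD="GET" ACTION="http://nebula.lib.vt.edu:8001/cgi-bin/marian-gate">
  <!-- All field names and values below are hypothetical -->
  <INPUT TYPE="hidden" NAME="source" VALUE="html-intro">
  Search for: <INPUT TYPE="text" NAME="query"><BR>
  <INPUT TYPE="checkbox" NAME="exact" VALUE="yes"> Exact match<BR>
  <INPUT TYPE="radio" NAME="field" VALUE="title" CHECKED> Title
  <INPUT TYPE="radio" NAME="field" VALUE="author"> Author<BR>
  <INPUT TYPE="submit" VALUE="Search">
  <INPUT TYPE="reset" VALUE="Clear">
</FORM>
```

Radio buttons that share the same NAME form one group, so only one of them can be selected at a time. Checkboxes are independent of each other.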