Transcript Chapter 4 Intro to HTML
Chapter 4
Introduction to XHTML
© 2010, Robert K. Moniot 1
OBJECTIVES
In this chapter, you will learn:
• Characteristics of XHTML vs. older HTML.
• How to write XHTML to create web pages:
  – Controlling document structure
  – Including images in a document
  – Inserting hyperlinks for navigation
  – Making lists and tables
  – Creating forms for user input
  – Including meta information for search engines to use
HTML
• Hyper-Text Markup Language: the foundation of the World-Wide Web
• Design goals:
  – Platform independence: pages can be viewed using a variety of different computers and browsers.
  – Universality: servers can store information in their own data formats, but convert it to HTML for access by browsers.
  – Convenient linking from one page to another (hypertext).
  – HTML conveys the structure of the document, not its precise appearance, allowing for varying display capabilities.
• Openness (not proprietary) was key to the adoption of HTML and growth of the Web
XHTML
• Extensible HTML
  – Based on HTML, but stricter: must be valid XML
  – Adds extensibility for future purposes
  – Improves interoperability with other data formats and user agents
• Allows only document's content and structure, not style, to be specified
  – Formatting is specified with Cascading Style Sheets (CSS)
• Designed to facilitate Semantic Web
• HTML still accepted by browsers
  – Use validators to make sure your XHTML is standard
  – Properly written, XHTML is acceptable to all browsers
Basic form of XHTML
• An XHTML document is all plain text (no binary formatting codes). The contents can be divided into two categories:
  – Content: material which the user sees when visiting the page
  – Meta-information: information about the document: its structure, formatting, etc.
• Meta-information is distinguished from content by using tags. A tag is a tag-name enclosed in angle brackets. Tags usually come in pairs: an opening tag and a closing tag, which is the same tag-name preceded by a slash.
<tag-name>Content affected by tag</tag-name>
Nesting of tags
• Opening and closing tags define regions affected by the tags. These regions must nest, not overlap.
Yes: <tag1>Some text <tag2>more text</tag2> and more.</tag1>
No: <tag1>Some text <tag2>more text</tag1> and more.</tag2>
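As a concrete illustration of the nesting rule, the fragment below substitutes the real tags <p> and <em> (both introduced later in this chapter) for the placeholder names tag1 and tag2. The sentence text is made up, and the incorrect version is kept inside a comment so that the fragment as a whole stays well-formed.

```html
<!-- Correct: <em> opens and closes inside the <p> region -->
<p>Some text <em>more text</em> and more.</p>

<!-- Incorrect: the two regions overlap, so this is not valid XHTML -->
<!-- <p>Some text <em>more text</p> and more.</em> -->
```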
Rules about Tags
• Not all tags need a closing tag
  – For some tags, a closing tag is optional in HTML:
    <p> paragraph. Implies closing of previous paragraph tag.
    For these, the closing tag is mandatory in XHTML.
  – For some tags, a closing tag is never used:
    <br /> line break. Marks a location, not a region.
    For these, XHTML mandates / before the closing >.
• Tag names are case-insensitive in HTML: <br> and <BR> and <Br> are all equivalent. XHTML mandates lower-case only.
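The two sets of rules can be compared side by side. This is only a sketch with placeholder text: the first fragment (kept inside a comment) is the looser style that HTML tolerates, and the second is the same content rewritten to follow the XHTML rules listed above.

```html
<!-- Tolerated in HTML: upper-case names, unclosed <P>, bare <BR>
<P>First line of an address<BR>
Second line of the address
<P>A second paragraph
-->

<!-- XHTML: lower-case names, every <p> closed, <br /> written with a slash -->
<p>First line of an address<br />
Second line of the address</p>
<p>A second paragraph</p>
```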
Rules about Tags, cont’d
• Unknown tags are ignored. This rule allows new tags to be introduced into later XHTML standards without causing problems for older browsers. But it also means you need to be careful to spell tag names correctly!
Tags with attributes
• Some tags can be qualified by attributes that provide needed additional information or change the default properties of the tag.
• Attributes are specified within the angle brackets following the opening tag name. (Attributes are never listed in a closing tag.)
<tag-name attribute="value" attribute="value">Content text</tag-name>
Tags for Document Type
• These tags were not in the original HTML standard; they were introduced in XHTML.
• XML version tag
<?xml version="1.0" encoding="utf-8"?>
• Document Type Definition (DTD)
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
(These can be cut and pasted from examples on the Web. Don’t try to type them in by hand!)
Tags for Document Structure
• Some tags specify the overall structure of the document
<html> ... </html> encloses the entire document
<head> ... </head> encloses the head portion of the document. Everything in the head portion is meta-information, not content.
<body> ... </body> encloses the body portion of the document. The body portion contains the document's content.
Example 1 (ignore the other tags for now)
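For orientation, here is a minimal sketch of a complete XHTML document that puts the document-type tags and the three structural tags together. It is not necessarily the course's Example 1: the title and paragraph text are invented, and the xmlns attribute on <html> is something XHTML 1.0 requires that the slides do not mention.

```html
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <!-- Meta-information: not shown in the content area -->
    <title>My first page</title>
  </head>
  <body>
    <!-- Content: what the visitor sees -->
    <p>Hello, Web!</p>
  </body>
</html>
```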
Tags for Document Head
<title>Document title</title> - Specifies the title that appears in the title-bar of the browser (not in the content area). Browsers will display a page without it, but a valid XHTML document must have one, and it also assists the user in navigating the browser's history list.
<meta name="keywords" content="word, phrase, etc" /> - Specifies keywords to assist indexing of the page by search engines.
Other head tags will be discussed later.
Tags for Document Body
• Heading tags
<h1>Main heading</h1> - Formats the enclosed text as a prominent large, bold heading.
<h2>Sub-heading</h2> - Formats the enclosed text as a large, bold heading but not as prominent as for <h1>.
<h3>Sub-sub heading</h3> - Formats the enclosed text as an even less prominent heading.
... (levels 4 and 5 of decreasing prominence)
<h6>Minor heading</h6> - Formats the enclosed text as a minor heading, only slightly more prominent than ordinary text.
Example 1
Organizational Tags
• Paragraphs and Layout
<p> - start of a new paragraph. The closing </p> tag is optional in HTML but required in XHTML.
<br /> - line break. Use this where the extra line spacing of a paragraph tag is not desirable, as in an address. Never takes a closing tag.
<center>Centered text</center> - (deprecated) changes justification of enclosed text (normally left-justified) to centered.
<blockquote>Quoted text</blockquote> - for large blocks of quoted material.
<pre>Preformatted text</pre> - use this to present a block of text exactly as typed in the document, for instance to show an interactive session with a computer.
Example 2
Tags for Style
• Explicit style tags
<b>Boldfaced text</b>
<i>Italicized text</i>
<u>Underlined text</u> (deprecated)
<tt>Typewriter-font text</tt>
• Logical style tags
<strong>Prominent text</strong> - usually bold
<em>Emphasized text</em> - usually italics
<cite>Cited text</cite> - usually italicized
<code>Computer code</code> - usually in typewriter font
Example 3
Tags for Style
• The <font> tag.
  – In the early days of web design, this tag was introduced to allow the web programmer to control the font family, typeface, color, etc.
  – This tag is now deprecated. Style sheets provide much better control over style and compatibility. However, many web pages still use this tag for simple effects such as text size and color.
Example:
<font size="+1" color="red">Large, red text</font>
Special characters
• Some characters such as angle brackets are reserved for special meanings in XHTML. Others are not available on many keyboards. These characters can be put into content using codes between ampersand and semicolon:
&lt; - the less-than symbol < (left angle bracket)
&gt; - the greater-than symbol > (right angle bracket)
&amp; - the ampersand sign &
&copy; - the copyright symbol ©
• Many more are defined. Note that unlike tags, these codes are case-sensitive in HTML.
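A short sketch of these codes in use follows; the paragraph text is invented for illustration. Each code starts with an ampersand and ends with a semicolon, and the browser replaces it with the corresponding character when it renders the page.

```html
<!-- Displays as: In XHTML, write <br /> for a line break. -->
<p>In XHTML, write &lt;br /&gt; for a line break.</p>

<!-- Displays as: Fruit & Vegetables -->
<p>Fruit &amp; Vegetables</p>

<!-- Displays as: © 2010 -->
<p>&copy; 2010</p>
```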
Hypertext Links
• Hypertext links connect one document to another. A link has two components: the link text, which the user sees to click on, and the link target, which is the location to which the browser is directed when the link is clicked.
• Form of a hypertext link:
<a href="target.html">link text</a>
Hypertext Links
• The link target, or href (hypertext reference), is in the form of a URL: Uniform Resource Locator.
• A URL has 3 components, not all of which need to be supplied in every reference:
  – A protocol
  – An Internet address (either name or IP number)
  – A file path
Example: http://www.fordham.edu/current/index.html
(protocol http, Internet address www.fordham.edu, file path /current/index.html)
URL protocol
• The protocol portion of a URL specifies the way that the web browser will interact with the server to obtain the resource. Protocols supported by most browsers include:
http - Hypertext Transfer Protocol (the default)
https - Secure HTTP (used to exchange confidential information such as credit card numbers)
ftp - File Transfer Protocol, used on some servers that host downloadable materials
file - for URLs that are files on the same computer where the browser is running
mailto - for email addresses
URL address
• The Internet address portion of a URL can be either a name, e.g. www.fordham.edu, or a number, e.g. 150.108.13.113
• If omitted, the address of the URL in the href is taken to be the same as the address in the URL of the document containing the link. Thus if the address is omitted from a link in a web page, the link refers to a document on the same server that served that page.
• A URL without an address portion can be either absolute or relative, as explained next.
URL file path
• The file path portion of a URL optionally specifies the chain of directories (folders) in which the document is located, and the name of the file itself. The directory names in the chain are separated by slash characters.
• If the file name portion of the path is omitted, then it defaults to a value that is defined by the server, typically index.html.
Example: the URL
http://www.myplace.com/shopping/fruit/
lacks a file name, and so it might be equivalent to
http://www.myplace.com/shopping/fruit/index.html
Relative URLs
• If a URL omits the Internet address portion, then the file path portion can be either relative or absolute.
• An absolute file path begins with a slash; a relative one does not.
• In both cases, the Internet address portion is supplied by that of the referencing document.
• For a relative reference, the chain of directories is also supplied by that of the referencing document.
<a href="/somepage.html">Absolute link</a>
<a href="somepage.html">Relative link</a>
Relative URLs
• A relative file path can also specify directories. In this case the chain starts in the same directory as the referencing page.
• Use ../ to indicate the parent directory.
Link referencing a file in a subdirectory of the current page's directory
<a href="assets/somepage.html">Relative link</a>
Link referencing a file in a sibling directory of the current page's directory
<a href="../assets/somepage.html">Relative link</a>
Absolute URLs
• In an absolute file path, the chain of directories starts at the top of the “URL space” of the server.
Absolute link referencing the top-level home page of a specified server
<a href="http://www.somehost.com">Absolute link</a>
Absolute link referencing a file in the shopping directory of the top level of the web server containing the referencing page
<a href="/shopping/somepage.html">Absolute link</a>
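To see how the rules for relative and absolute paths combine, suppose (purely as an assumption for this sketch) that the links below appear in the page http://www.myplace.com/shopping/fruit/index.html. The file names are hypothetical; each comment gives the full URL that the browser would request.

```html
<!-- Relative, same directory:
     http://www.myplace.com/shopping/fruit/apples.html -->
<a href="apples.html">Apples</a>

<!-- Relative, subdirectory:
     http://www.myplace.com/shopping/fruit/assets/photo.html -->
<a href="assets/photo.html">Photo</a>

<!-- Relative, sibling directory via the parent:
     http://www.myplace.com/shopping/vegetables/index.html -->
<a href="../vegetables/index.html">Vegetables</a>

<!-- Absolute path, same server:
     http://www.myplace.com/shopping/somepage.html -->
<a href="/shopping/somepage.html">Some page</a>

<!-- Full URL: used exactly as written -->
<a href="http://www.somehost.com">Another site</a>
```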
Images
• Use the <img> tag to include an image in a page. This tag allows some attributes that specify the size of the image, and alternative text that can be used in place of the image.
• The tag must include a src attribute specifying a URL that gives the location of the image.
Example:
<img src="myface.jpg" height="256" width="300" alt="Picture of me" />
Images
• The image size specification via the height and width attributes of the <img> tag can be used to resize the image, but this is not recommended. Rather, the best use of these attributes is to tell the browser how big the image is, so that it can allow the right amount of space in the rendering of the page before it has downloaded the image.
• The alternative text given by alt can be displayed by the browser while the image is loading or by browsers that suppress images. Some browsers will pop up a balloon with this text when the mouse hovers over the image. This text can also be read aloud by browsers designed for the blind.
Lists
• List types:
  – <ol> - Ordered list: each item is numbered. Use the type attribute (deprecated) to obtain letters or roman numerals: the value of the attribute is simply what the item label should look like. Possibilities for type are 1, a, A, i, I
  – <ul> - Unordered list: each item is marked with a bullet.
  – <dl> - Definition list: the head of each item is a word or phrase, and the rest is a description.
• For <ol> and <ul>, list items are started by <li> (closing tag optional in HTML).
• For <dl>, the head word is marked by <dt> and the description by <dd> (closing tags optional in HTML).
Example 4
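The three list types might be written as follows. This is a sketch with invented item text, not the course's Example 4; all closing tags are included, as XHTML requires.

```html
<!-- Ordered list labelled A, B, ... (type is deprecated but still accepted) -->
<ol type="A">
  <li>Write the page</li>
  <li>Validate it</li>
</ol>

<!-- Unordered (bulleted) list -->
<ul>
  <li>Apples</li>
  <li>Pears</li>
</ul>

<!-- Definition list: <dt> is the head word, <dd> its description -->
<dl>
  <dt>XHTML</dt>
  <dd>Extensible HTML: HTML written as valid XML.</dd>
  <dt>CSS</dt>
  <dd>Cascading Style Sheets, used to specify formatting.</dd>
</dl>
```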
Other XHTML tags
• Superscripts and subscripts: enclose text in <sup>...</sup> or <sub>...</sub> tags.
• Horizontal rule (line): use <hr />. It always appears on a line by itself. This tag takes an optional attribute width (deprecated) such as <hr width="50%" /> to control how far across the page it extends.
• Strikethrough text: enclose in <del>...</del> tags.
• Comments: these consist of any text enclosed within <!-- and -->. Their purpose is to enlighten the web programmer reading the XHTML. They do not appear in the rendered page.
Example 5
Tables
• The W3C says: “Tables should not be used purely as a means to layout document content as this may present problems when rendering to non-visual media. Additionally, when used with graphics, these tables may force users to scroll horizontally to view a table designed on a system with a larger display. To minimize these problems, authors should use style sheets to control layout rather than tables.”
• Be aware that not all your visitors will use the same hardware or software to view the page.
Defining Tables
• A table organizes content into a grid of horizontal rows and vertical columns.
• The table area is enclosed in <table> ... </table>
• A table row is started with <tr> (closing tag optional in HTML).
• Within each row, a table column item is started with <td> (closing tag optional in HTML). Use <th> instead for column headings.
• Use the align attribute (values left, right, center) to align text within a cell.
Example 6
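A small table built from just these tags could look like the sketch below. The contents are invented and this is not the course's Example 6; the closing tags are written out because XHTML requires them.

```html
<table>
  <!-- First row: column headings -->
  <tr>
    <th>Fruit</th>
    <th>Price</th>
  </tr>
  <!-- Data rows: prices are right-aligned within their cells -->
  <tr>
    <td>Apples</td>
    <td align="right">$1.50</td>
  </tr>
  <tr>
    <td>Pears</td>
    <td align="right">$3.50</td>
  </tr>
</table>
```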
Table attributes
summary="text" – provides a description of the table's purpose and structure for non-visual agents (e.g. speech)
width="length" – specifies width of table. Length can be a percentage of the width of the window, or an absolute length in pixels.
Table attributes, cont'd
border="pixels" – specifies width of frame drawn around table. Setting it to 0 suppresses the frame.
cellspacing="pixels" – specifies spacing between cells (above, below, and to either side)
cellpadding="space" – specifies spacing between the boundary of a cell and its contents. Here space can either be a number in pixels, or a percentage of the cell size.
Elements within Table
• Caption: the <caption> tag specifies a caption to appear with the table.
  – Allows an optional attribute align to specify where the caption appears relative to the table. The align position can be one of: top, bottom, left, right
Cells that Span Rows, Columns
• A single cell can span multiple columns or multiple rows.
• This is achieved using the rowspan and colspan attributes of the <th> or <td> element.
Example:
<td colspan="2">This cell spans two columns</td>
Example:
<td rowspan="2">This cell spans two rows</td>
You can also use the nowrap attribute to control automatic wrapping of text within a cell. Normally wrapping is good in order to prevent a cell from becoming too wide.
Elements within Tables
• Row groups: these allow you to structure portions of the table into head, body, and foot.
<thead> ... </thead> encloses rows in head of table.
<tfoot> ... </tfoot> encloses rows in foot of table.
<tbody> ... </tbody> encloses rows in body of table.
  – Note that <tfoot> must come before <tbody> so that the browser can scroll the body independently of head and foot, or repeat head and foot on multi-page printout.
  – Multiple <tbody> elements are allowed, in order to divide the body into groups of rows. The browser may put extra space or other indication of the separation between these groups.
  – The closing tags are optional in HTML.
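The ordering of the row groups is easy to get wrong, so here is a sketch of a table with all three. The contents and the summary text are invented. Note that the foot is written before the body even though the browser displays it at the bottom.

```html
<table border="1" summary="Cost of each item, with a total in the foot">
  <thead>
    <tr><th>Item</th><th>Cost</th></tr>
  </thead>
  <!-- The foot comes before the body in the source -->
  <tfoot>
    <tr><td>Total</td><td>$5.00</td></tr>
  </tfoot>
  <tbody>
    <tr><td>Apples</td><td>$1.50</td></tr>
    <tr><td>Pears</td><td>$3.50</td></tr>
  </tbody>
</table>
```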
Elements within Tables
• Column groups: these allow you to specify width of columns in the table.
Form:
<colgroup>
<col width="width" span="cols" />
...
</colgroup>
width can be a number of pixels or a percentage of the table width.
cols is the number of columns sharing the same attributes.
The HTML standard also allows an align attribute, but Firefox ignores it.
Customizing Tables
• Use the colspan and rowspan attributes of the <th> or <td> tag to span multiple columns or rows, e.g.
<td colspan="3">Multi-column cell</td>
• Use the valign attribute of the <th> or <td> tag for vertical positioning within cells. Values: top, middle, bottom.
• Use the border attribute of the <table> tag to show grid lines.
• Use the <caption> tag inside the table to provide a table caption.
Example 7
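These customizations can be combined in one table. The sketch below uses invented contents and is not the course's Example 7. Every row accounts for three columns once the spans are counted: the "Times" heading covers two columns, and the "Lab" cell covers two rows.

```html
<table border="1">
  <caption>Lab schedule</caption>
  <tr>
    <th>Room</th>
    <th colspan="2">Times</th>
  </tr>
  <tr>
    <!-- Spans this row and the next; text sits at the top of the tall cell -->
    <td rowspan="2" valign="top">Lab</td>
    <td>9:00</td>
    <td>10:00</td>
  </tr>
  <tr>
    <!-- The first column is already occupied by the rowspan cell above -->
    <td>11:00</td>
    <td>12:00</td>
  </tr>
</table>
```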
Forms
• XHTML forms allow the user to supply data to the web server.
• We will focus on creating the form in XHTML so it appears to the user with input fields and descriptive text, ready to be filled in.
• When the form is submitted, the data entered into the input fields is sent back to the web server and processed, for instance by a CGI program. We will not study CGI programming at this time.
Defining A Form
• A form is defined as follows:
<form name="form-name" action="path/to/cgi" method="post">
... (form fields and descriptive text go here)
</form>
The optional name attribute gives the form a name. This is useful when using JavaScript to access the form elements.
The “path/to/cgi” is a URL specifying the location of the CGI program that will process the form. The method attribute can be either "get" or "post".
The region between the opening and closing <form> tags can contain input fields and any sort of normal XHTML content.
Form Submit Methods
• Method "get"
  – The input data are appended to the URL of the request sent to the server when the form is submitted. The server passes the data to the CGI through an environment variable.
  – This method is primitive and suitable only for very small amounts of form data. It should ordinarily not be used for forms.
• Method "post"
  – The input data are sent to the server in a data body following the request headers. The server passes the data to the CGI through an input mechanism.
  – This is the preferred method.
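The difference between the two methods shows up in the URL that the browser requests. In the sketch below, the action path /cgi-bin/search and the field name q are hypothetical, chosen only to make the effect visible.

```html
<form action="/cgi-bin/search" method="get">
  <p>
    Search for: <input name="q" type="text" />
    <input type="submit" value="Go" />
  </p>
</form>
<!-- Typing apples and pressing Go requests the URL
       /cgi-bin/search?q=apples
     With method="post" the URL would stay /cgi-bin/search and
     q=apples would travel in the body of the request instead. -->
```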
Form Input Fields
• The most common type of form input field is defined using the <input> tag, which allows for various types of input elements: text boxes, check boxes, etc. The general form of this tag is
<input name="field-name" type="input-type" ... />
Note that there is never a closing tag, since this defines an element, not a region.
The name attribute gives the input field a name, which is useful for JavaScript and CGI processing. The type attribute allows different kinds of input fields to be defined. Depending on the field type, there may be other attributes to control other properties of the element.
Input Field Types
• The type attribute can take on one of the following values:
text - specifies a small box in which the user can type text.
password - like text, but the text which is typed appears as asterisks, though it is submitted to the server as typed.
checkbox - specifies a box which can be clicked to check or uncheck it.
radio - specifies a “radio button.” Several buttons of this kind are usually defined as a group, each button of the group having the same name. A radio button is like a checkbox except that only one button of the group can be selected at a time.
Special Input Types
• There are some other input types that are special:
submit - specifies a Submit button that sends the completed form data to the server.
reset - specifies a Reset button that restores all input fields of the form to their default initial values.
hidden - used to provide data that the user does not see.
Other <input> Tag Attributes
• size="width" for input fields of type text or password. The width is an integer giving the width of the box in characters. Default width is 20.
• maxlength="length" for fields of type text or password. Specifies the maximum number of characters that can be entered into the box.
• checked="checked" for checkbox and radio buttons ("checked" is the only value allowed). If this attribute is present, the choice is selected by default.
• value="value" specifies a default value for the item or, for submit and reset buttons, a label for the button.
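Putting the input types and attributes together, a small form might look like the sketch below. The form name, the action URL, the field names, and the labels are all hypothetical. The two radio buttons share one name so that only one of them can be selected at a time.

```html
<form name="survey" action="/cgi-bin/survey" method="post">
  <p>Name: <input name="fullname" type="text" size="30" maxlength="60" /></p>
  <p>Password: <input name="pw" type="password" size="12" /></p>
  <p>
    <!-- Checked by default -->
    <input name="newsletter" type="checkbox" checked="checked" />
    Send me the newsletter
  </p>
  <p>
    <!-- Same name on both buttons makes them one group -->
    <input name="level" type="radio" value="beginner" checked="checked" /> Beginner
    <input name="level" type="radio" value="expert" /> Expert
  </p>
  <p>
    <!-- Sent with the form but never shown to the user -->
    <input name="source" type="hidden" value="chapter4" />
    <input type="submit" value="Send" />
    <input type="reset" value="Clear" />
  </p>
</form>
```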
Form Menu Elements
• A menu is defined as follows:
<select name="name">
<option>menu item 1</option>
<option>menu item 2</option>
...
</select>
(The closing </option> tags are optional in HTML but required in XHTML.)
Form Menu Elements
• The default appearance of a select element has the selected item in a box, with a scroll button at one side that pops up the menu to allow selecting a different item.
• To create a menu that displays several options in the window, include the attribute size="length" where length is the number of items to show.
• By default, only one item from the list can be selected. To allow multiple items to be selected, add the attribute multiple="multiple".
• The <option> tag allows the attribute selected="selected" to indicate that the item should be selected by default.
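As an illustration of these attributes, the fragment below (meant to be placed inside a form) defines a menu that shows three of its four items at once, allows several to be chosen, and starts with one already selected. The menu name and the items are made up.

```html
<select name="fruit" size="3" multiple="multiple">
  <option>Apples</option>
  <!-- Selected when the page first loads -->
  <option selected="selected">Pears</option>
  <option>Plums</option>
  <option>Cherries</option>
</select>
```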
The textarea Element
• The text type of input element is intended only for small amounts of text. To provide space for more input text, as well as scrollbars, use this element. Example:
<textarea name="remarks" rows="10" cols="25">Default text can be placed here.</textarea>
Example 8
Anchors
• The <a> tag can be used to create an anchor, a marked location in a document. The anchor is given an ID using the id attribute. (Other tags can be used too.)
• A link can refer to this named location using the # character in the URL. Clicking on the link jumps to the marked location rather than to the top of the referenced document.
Example: Create anchor with
<a id="section1">Section 1.</a>
Reference anchor with
<a href="#section1">See Section 1.</a>
Anchors
• The URL referring to an anchor located in the same document can omit everything except the # and the anchor name.
• Or, the URL can be a relative or absolute URL with the #anchorname at the end.
Example of absolute URL referring to an anchor:
<a href="http://www.mysite.org/pubs/paper.html#section1">See Section 1 of the paper.</a>
Example 9
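Since tags other than <a> can also carry an id, a heading can serve directly as the marked location. The sketch below, with invented ids and text (it is not the course's Example 9), shows a pair of links that jump back and forth within a single page.

```html
<h2 id="contents">Contents</h2>
<p><a href="#section2">Jump to Section 2</a></p>

<!-- many paragraphs of content would go here -->

<h2 id="section2">Section 2</h2>
<p><a href="#contents">Back to contents</a></p>
```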