Introduction to HTML
Contents
Getting Started
What is HTML?
How to Create and View an HTML Document
Basic HTML Document Format
The Basic HTML Tags
HTTP Protocol Headers
Publishing Pages
What the following terms mean:
Web server: a system on the Internet containing one or more web sites
Web site: a collection of one or more web pages
Web page: a single disk file with a single file name
Home page: the first page in a web site
Think about the following before working on your Web pages.
Think about the sort of information (content) you want to put on the Web.
Set the goals for the Web site.
Organize your content into main topics.
Come up with a general structure for pages and topics.
What is HTML?
Telling the browser what to do, and what props to use.
A series of tags that are integrated into a text document.
Tags are surrounded with angle brackets. Most tags come in pairs, with a few exceptions.
The second tag (the off switch) starts with a forward slash.
Tags can be embedded inside one another. The correct order is to close embedded tags in the reverse of the order in which they were opened.
Tags can have attributes. For example, a tag can center the paragraph following it.
Some browsers don't support some tags and some attributes.
Basic HTML Document Format
See what it looks like:
How to Create and View an HTML document?
1. Use a text editor such as Notepad to write the document.
2. Save the file as filename.html on a PC. This is called the Document Source.
3. Open Internet Explorer (or any browser) off-line.
4. Click on File, Open File and select the filename.html document that you just created.
5. Your HTML page should now appear just like any other Web page in the browser.
6. You may now switch back and forth between the Source and the HTML Document:
   switch to Notepad with the Document Source
   make changes
   save the document again
   switch back to Internet Explorer
   click on RELOAD and view the new HTML Document
   switch to Notepad with the Document Source ...
Tags in head
<head>...</head> -- contains information about the document
HTML Characteristics
Just a Text File!
+ Portable
+ Human Readable/Writable
Defines the Structure (not Appearance) of the Document
Client (Browser) defines the appearance
Font preferences, window width, …
Pours into Browser (PDAs, Bigger/Smaller)
Document Structure
<html>
<head><title>My First Web Page</title></head>
<body bgcolor="white">
<p>A Paragraph of Text.</p>
</body>
</html>
Nested Tags
Like a tree, each element is contained inside a parent element.
Each element may have any number of attributes.
For example, bgcolor="white" is an attribute of the body element.
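The tree idea can be made visible with a short script. This is a minimal sketch using Python's standard html.parser module; the class name OutlineParser and the function outline are made up for illustration, and it assumes every tag in the page is explicitly closed, as in the sample page from the Document Structure slide.

```python
from html.parser import HTMLParser

# The sample page from the "Document Structure" slide.
PAGE = """<html>
<head><title>My First Web Page</title></head>
<body bgcolor="white">
<p>A Paragraph of Text.</p>
</body>
</html>"""


class OutlineParser(HTMLParser):
    """Records each start tag with its nesting depth and attributes."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []  # list of (depth, tag, attributes as dict)

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag, dict(attrs)))
        self.depth += 1  # children of this tag sit one level deeper

    def handle_endtag(self, tag):
        self.depth -= 1  # assumes every opened tag is closed


def outline(html_text):
    """Return the list of (depth, tag, attributes) for a page."""
    parser = OutlineParser()
    parser.feed(html_text)
    parser.close()
    return parser.outline


if __name__ == "__main__":
    for depth, tag, attrs in outline(PAGE):
        print("  " * depth + tag, attrs)
```

Each level of indentation in the printed outline is one step down the tree: head and body are children of html, and the bgcolor attribute shows up on body.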
Basic Tags
Preamble which identifies content as HTML
…
H1-6, where a larger number means a smaller heading
Includes vertical whitespace unlike
Basic Tags
<hr> horizontal rule
<br> new line
<b>...</b> bold text in between
<i>...</i> italicize text in between
Unordered Lists
- Apples
- Oranges
- One
- Two
Lists
o Apples
   1. Fuji
   2. Granny Smith
o Oranges
Image Files
JPEG
Best for photos
Public standard
GIF
Best for simple images
Older standard
PNG – Portable Network Graphics
Public standard replacement for GIF
SVG – Scalable Vector Graphics
Series of drawing commands
Uses XML
Tables
Table Example
row 1, cell 1 | row 1, cell 2 |
row 2, cell 1 | row 2, cell 2 |
Comments
<!-- ... -->
Special HTML
&lt; → <
&gt; → >
&amp; → &
&nbsp; → space
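A quick way to check these entities is Python's standard html module. This is a small sketch; the helper names to_entities and from_entities are made up for illustration.

```python
import html


def to_entities(text):
    """Replace <, > and & with their entities so a browser shows them literally."""
    return html.escape(text, quote=False)


def from_entities(text):
    """Turn entities back into the characters they stand for."""
    return html.unescape(text)


if __name__ == "__main__":
    print(to_entities("a < b & c > d"))
    print(repr(from_entities("&lt;p&gt;&nbsp;&amp;")))
```

Note that &nbsp; decodes to the non-breaking space character (U+00A0), not to an ordinary space.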
Anchor Tag (Links)
Absolute HREFs specify fully qualified URLs.
Example link: Yahoo!
Relative HREFs are relative to the directory containing the current HTML file.
Example links: In this directory! In sub-directory a!
Review: Client and Server
User uses an HTTP client (Web browser)
It has a URL (e.g. http://www.yahoo.com/)
Makes a request to the server
Server sends back data (the response)
User clicks on the client side...
Client to Server: request (URL)
Server to Client: response (HTML, …)
Client/Server Timeline
Client
(C1) get IP address & port
(C2) create new socket (socket)
(C3) connect to server IP:port (connect)
(C4) connection successful
(C5) send HTTP request (write)
(C6) wait for HTTP response (read)
(C7) process HTTP response
(C8) close connection (close)

Server
(S1) create new socket (socket)
(S2) bind socket to port 80 (bind)
(S3) permit socket connections (listen)
(S4) wait for connection (accept)
(S5) application notified of connection
(S6) start reading request (read)
(S7) process HTTP request message
(S8) send back HTTP response (write)
(S9) close connection (close)
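The timeline can be sketched with Python's standard socket module. This is an illustration under stated assumptions, not a real web server: the function names, the fixed response body, and the use of 127.0.0.1 with a free port chosen by the operating system (instead of port 80, which usually needs administrator rights) are all choices made for the example. The comments point back to the C and S step labels.

```python
import socket
import threading

RESPONSE_BODY = b"<html><body><p>Hello</p></body></html>"  # made-up page


def serve_once(listener):
    """Server steps S4-S9 for a single connection."""
    conn, _addr = listener.accept()              # S4, S5
    with conn:
        request = b""
        while b"\r\n\r\n" not in request:        # S6: read until end of headers
            data = conn.recv(1024)
            if not data:
                break
            request += data
        # S7: a real server would parse the request line and headers here.
        head = (
            "HTTP/1.0 200 OK\r\n"
            "Content-Type: text/html\r\n"
            f"Content-Length: {len(RESPONSE_BODY)}\r\n"
            "\r\n"
        ).encode("ascii")
        conn.sendall(head + RESPONSE_BODY)       # S8
    # S9: leaving the with-block closes the connection


def fetch(host, port, path="/"):
    """Client steps C2-C8; returns the raw response bytes."""
    with socket.create_connection((host, port), timeout=5) as sock:  # C2-C4
        sock.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))  # C5
        chunks = []
        while True:                              # C6: read until the server closes
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)                      # C8 done; C7 is up to the caller


def demo():
    """Run one server and one client in the same process."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # S1
    listener.settimeout(5)             # do not wait forever for a client
    listener.bind(("127.0.0.1", 0))    # S2: port 0 means any free port
    listener.listen(1)                 # S3
    port = listener.getsockname()[1]   # C1: the client needs address and port
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    try:
        return fetch("127.0.0.1", port)
    finally:
        server.join()
        listener.close()


if __name__ == "__main__":
    print(demo().decode("ascii"))
```

Because the request is HTTP/1.0 and the server closes the connection after one response, the client can simply read until end of file.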
HTTP Request Structure
URL, URN, or URI?
URN is a location-independent resource identifier
urn:ietf:rfc:3187
urn:isbn:0451450523
URL is the location
URI is the superset of URL & URN
URL Structure
< scheme >://
Unsafe Characters
Some characters need to be encoded
~ [ASCII: 126 (0x7E)]
SPACE [ASCII: 32 (0x20)]
% [ASCII: 37 (0x25)]
Examples
http://www.bob.com/%7Ekelly/
http://www.bob.com/my%20home%20page.html
http://www.bob.com/100%25Crankiness.html
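The encoding is just a percent sign followed by the character's ASCII code in hex. A short sketch with Python's standard urllib.parse module; the helper names are made up for illustration. One caveat: the current URI standard (RFC 3986) treats ~ as a character that does not need encoding, so encoders may leave it alone, although %7E still decodes to ~.

```python
from urllib.parse import quote, unquote


def percent_code(char):
    """Return the %XX form of a single ASCII character."""
    return "%{:02X}".format(ord(char))


def encode_path_segment(segment):
    """Percent-encode one path segment (the part between two slashes)."""
    # safe="" means "/" is encoded too, which is right for a single segment.
    return quote(segment, safe="")


def decode_path(path):
    """Turn %XX sequences back into characters."""
    return unquote(path)


if __name__ == "__main__":
    for char in "~ %":
        print(repr(char), percent_code(char))
    print(encode_path_segment("my home page.html"))
    print(decode_path("/%7Ekelly/"))
```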
Empty-String Path
http://www.yahoo.com
Assume the path is "/"
Client should send GET / HTTP/1.0\r\n\r\n
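In code, the rule amounts to one fallback. A minimal sketch using urllib.parse.urlsplit; the function name build_get_request is made up for illustration.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Build the HTTP/1.0 request text for a URL, using "/" for an empty path."""
    parts = urlsplit(url)
    path = parts.path or "/"          # empty-string path becomes "/"
    if parts.query:
        path += "?" + parts.query
    return f"GET {path} HTTP/1.0\r\n\r\n"


if __name__ == "__main__":
    print(repr(build_get_request("http://www.yahoo.com")))
```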
Relative Headers
Client side: given a URL in a file, if it is relative, the client adds the base address to the relative URL
Last requested path is http://foo.com/b/index.html
In index.html, the client sees a link to a.html
Base address is http://foo.com/b/
Client requests http://foo.com/b/a.html
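Python's standard urllib.parse.urljoin performs the same resolution a browser does, so it can be used to check examples like this one. The function name resolve and the extra hrefs below are illustrative.

```python
from urllib.parse import urljoin

LAST_REQUESTED = "http://foo.com/b/index.html"


def resolve(href, base=LAST_REQUESTED):
    """Resolve an href found in the page that was fetched from `base`."""
    return urljoin(base, href)


if __name__ == "__main__":
    for href in ["a.html", "/c.html", "../x.html", "http://www.yahoo.com/"]:
        print(href, "->", resolve(href))
```

A relative href replaces everything after the last slash of the base, an href starting with / replaces the whole path, and an absolute URL ignores the base entirely.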
Request Header
GET / HTTP/1.1
Request Header Example
GET / HTTP/1.1
Host: localhost:8181
Connection: keep-alive
Referer: http://localhost/~ronyeh/
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/124 (KHTML, like Gecko) Safari/125
Accept: */*
Accept-Encoding: gzip, deflate;q=1.0, identity; q=0.5, *;q=0
Accept-Language: en-us, ja;q=0.62, de-de;q=0.93
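A server has to split such a request into its request line and headers, and the q values tell it which alternatives the client prefers (a missing q counts as 1.0). The following is a simplified sketch, not a full HTTP parser; the function names and the shortened SAMPLE request are made up for illustration.

```python
SAMPLE = (
    "GET / HTTP/1.1\r\n"
    "Host: localhost:8181\r\n"
    "Connection: keep-alive\r\n"
    "Accept-Language: en-us, ja;q=0.62, de-de;q=0.93\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a request into (method, target, version, headers)."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")   # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


def by_preference(header_value):
    """Sort the items of an Accept-* header by q value, highest first."""
    prefs = []
    for item in header_value.split(","):
        token, *params = [part.strip() for part in item.split(";")]
        q = 1.0                                # default when no q is given
        for param in params:
            key, _, val = param.partition("=")
            if key.strip() == "q":
                q = float(val)
        prefs.append((token, q))
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    method, target, version, headers = parse_request(SAMPLE)
    print(method, target, version)
    print(by_preference(headers["accept-language"]))
```

Header names are compared case-insensitively in HTTP, which is why the sketch lowercases them.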
General Headers
Used by clients & servers
Seen in both requests and responses
Date: Tue, 3 Oct 2004 02:16:00 GMT
Connection: keep-alive
Request Headers
Client-IP: 192.168.1.12
Host: hostmachine.com
Referer: http://wherefrom.com/
User-Agent: Mozilla/5.0
UA-OS
If-Modified-Since
Request Headers
Accept: */*
Accept: text/html
Accept-Language: en-us, ja
Accept-Encoding: gzip
HTTP Response Structure
Example Response
HTTP Server Response Codes
200 OK
3XX -- Redirection
301 -- File Moved Permanently
302 -- Moved Temporarily
304 -- Not Modified
4XX -- Client Error
400 -- Bad Request (Syntax Error)
401 -- Unauthorized
403 -- Forbidden, Permission Denied
404 -- Not Found!
HTTP Server Response Codes
5XX -- Server Errors
500 -- Internal Server Error
503 -- Service Unavailable
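The first digit of a status code gives its class. Python's standard http.HTTPStatus knows the official reason phrases, so a lookup can be sketched as below; the CLASSES table and the function name describe are made up for illustration.

```python
from http import HTTPStatus

# Class of a status code, keyed by its first digit.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code):
    """Return e.g. '404 Not Found (Client Error)' for a known status code."""
    status = HTTPStatus(code)        # raises ValueError for unknown codes
    return f"{code} {status.phrase} ({CLASSES[code // 100]})"


if __name__ == "__main__":
    for code in (200, 301, 302, 304, 400, 401, 403, 404, 500, 503):
        print(describe(code))
```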
General Headers
Used by clients & servers
Seen in both requests and responses
Date: Tue, 3 Oct 1974 02:16:00 GMT
Connection: close
Server Response Headers
Server: GWS/2.1
Content-Length: 2136
Content-Type: text/html
Location
Expires
Last-Modified
MIME
Multipurpose Internet Mail Extensions
type, subtype, & optional parameters
type/subtype; param1=value1
application/*
audio/*
image/*
   image/jpeg
   image/tiff
text/*
   text/xml
   text/rtf
   text/html
   text/plain
MIME types
video/*
   video/quicktime
   video/mpeg
   video/x-msvideo
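The type/subtype; param=value layout is easy to take apart by hand. A minimal sketch, with made-up function names and a charset parameter chosen only as an example; it ignores quoting edge cases that a full parser would handle.

```python
def parse_mime(value):
    """Split 'type/subtype; param1=value1' into (type, subtype, params)."""
    media, *params = [part.strip() for part in value.split(";")]
    mtype, _, subtype = media.partition("/")
    parsed = {}
    for param in params:
        key, _, val = param.partition("=")
        parsed[key.strip().lower()] = val.strip().strip('"')
    return mtype.lower(), subtype.lower(), parsed


def matches(pattern, value):
    """True if a MIME type fits a pattern such as image/* or */*."""
    ptype, psub, _ = parse_mime(pattern)
    vtype, vsub, _ = parse_mime(value)
    return ptype in ("*", vtype) and psub in ("*", vsub)


if __name__ == "__main__":
    print(parse_mime("text/html; charset=utf-8"))
    print(matches("image/*", "image/jpeg"))
```

The matches helper mirrors how a wildcard in an Accept header (for example */*) is compared against the Content-Type of a response.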
Pages with Multiple Types
Each entity (e.g. an image) is a standalone HTTP request
A page with many pictures creates many connections
Each response therefore has its own appropriate MIME settings
Mapping URL Path
Server can map URLs to any place on the file system.
Doesn't have to be under the Document Root. It's the server's choice!
User names: ~kashaw may map to /users/kashaw/WWW
/a/b/ => maps to a default file: index.html, default.html, index.htm, index.shtml
/a/b/ => if default file doesn't exist, may list the directory's files
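One possible mapping can be sketched as a function from URL path to candidate files. Everything here is an assumption made for illustration: the document root /var/www/html is hypothetical, the user directory pattern follows the ~kashaw example, and a real server would also have to reject paths containing ".." so that requests cannot escape the mapped directories.

```python
import posixpath

DOCUMENT_ROOT = "/var/www/html"            # hypothetical document root
USER_DIR_TEMPLATE = "/users/{user}/WWW"    # pattern from the ~kashaw example
DEFAULT_FILES = ["index.html", "default.html", "index.htm", "index.shtml"]


def map_url_path(url_path):
    """Return the file system paths a server could try, in order."""
    if url_path.startswith("/~"):
        # /~user/rest maps into that user's web directory
        user, _, rest = url_path[2:].partition("/")
        base = USER_DIR_TEMPLATE.format(user=user)
    else:
        base, rest = DOCUMENT_ROOT, url_path.lstrip("/")
    target = posixpath.join(base, rest)
    if url_path.endswith("/"):
        # A directory: try each default file name in turn
        return [posixpath.join(target, name) for name in DEFAULT_FILES]
    return [target]


if __name__ == "__main__":
    print(map_url_path("/~kashaw/"))
    print(map_url_path("/a/b/"))
```

If none of the candidates exists, the server may fall back to listing the directory or return 404.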
Trailing Slash
What if the client asks for /a/b?
Say file b doesn't exist
Utilize the 301 Redirect to /a/b/
Client re-does the request
What happens if the server does NOT issue a 301, but gives the client the right file anyway?
Advanced Topics
Redirection
Caching
Performance
HTTP 1.1
When NOT to Redirect
Client requests /a/b/
Server maps to /a/b/index.html and sends back the HTML file
The file contains a link ("A Link") to c.html
Client takes the base address /a/b/ and concatenates it with c.html
Client requests /a/b/c.html which is correct!
When to Redirect
URL missing trailing slash
No file named /class/cs193i
But there is a directory named /class/cs193i/
If the redirect did NOT happen:
Client thinks the base address is /class/
A relative href="schedule.html" in cs193i will be mapped by the client to /class/schedule.html
Server will return 404 Not Found
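The server-side decision can be sketched as a small function. The FILES and DIRECTORIES sets stand in for a real file system and are made up for the example, as is the choice of index.html as the only default file.

```python
# Hypothetical site contents, standing in for a real file system.
FILES = {"/class/cs193i/index.html", "/class/cs193i/schedule.html"}
DIRECTORIES = {"/", "/class/", "/class/cs193i/"}


def respond(path):
    """Return (status code, extra headers) for a requested URL path."""
    if path in FILES:
        return 200, {}
    if path.endswith("/"):
        # Directory request: serve its default file if there is one
        if path + "index.html" in FILES:
            return 200, {}
        return 404, {}
    if path + "/" in DIRECTORIES:
        # Not a file, but a directory of that name exists: redirect
        return 301, {"Location": path + "/"}
    return 404, {}


if __name__ == "__main__":
    for path in ["/class/cs193i", "/class/cs193i/", "/class/schedule.html"]:
        print(path, respond(path))
```

After the 301 the client repeats the request with the trailing slash, so its base address becomes /class/cs193i/ and relative links resolve to files that exist.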
Why Redirect?
Reliability (Find Live Hosts)
Minimize Delay (Find Shortest Path)
Conserve Network Bandwidth (Spread out Requests Geographically)
Load Balancing (Distribute Requests Temporally)
Load Balancing Example
Clients: Client1, Client2, Client3
Servers: www.goldenretrievers.com, a.goldenretrievers.com, b.goldenretrievers.com, c.goldenretrievers.com

Request (client to www.goldenretrievers.com):
GET / HTTP/1.1
Host: www.goldenretrievers.com

Response (www.goldenretrievers.com to client):
HTTP/1.1 302 Found
Date: Wed, 10 July 2004 16:46:17 GMT
Location: http://c.goldenretrievers.com
Load Balancing Example
[Diagram: Client1, Client2, Client3; www.goldenretrievers.com; a.goldenretrievers.com, b.goldenretrievers.com, c.goldenretrievers.com]
Redirection Tradeoffs
HTTP Redirection
Every request initially goes through the www.goldenretrievers.com machine
Must customize the www Web server
Alternative: DNS Redirection
DNS server decides which IP address to return (from a list of OK IP addresses)
Alternative: Hardware Redirection
NAT box! Packet rewriting!
Caching Motivation
Redundant Data Transfer
Network Bandwidth Bottlenecks
Server Demand
Distance Delays (Latency)
Adding Caching
Web Cache < Traffic Server
Hit, Miss, Revalidate
Revalidate Options
Revalidate request with If-Modified-Since (cache to server):
GET /announce.html HTTP/1.0
If-Modified-Since: Sat, 29 Jun 2002, 14:30:00 GMT

"Still fresh" response (server to cache):
HTTP/1.0 304 Not Modified
Date: Wed, 03 Jul 2002, 19:18:23 GMT
Content-type: text/plain
Content-length: 67
Expires: Fri, 05 Jul 2002, 05:00:00 GMT

The cache may be a browser cache or a proxy cache.
Check via If-Modified-Since...Not Modified
Suffers from 2X latency between cache & server
Just assume, and have a timeout, refresh cache automatically
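Both options come down to comparing dates. The sketch below uses Python's standard email.utils helpers for HTTP dates; the function names are made up for illustration, and the dates are written in the standard HTTP form "Sat, 29 Jun 2002 14:30:00 GMT".

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(moment):
    """Format a UTC datetime the way HTTP headers expect."""
    return format_datetime(moment, usegmt=True)


def revalidate(if_modified_since, last_modified):
    """Server side: 304 if the file is unchanged since the given date, else 200."""
    since = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


def is_fresh(expires, now):
    """Cache side: before the Expires time, reuse the copy without asking."""
    return now < parsedate_to_datetime(expires)


if __name__ == "__main__":
    changed = datetime(2002, 6, 29, 14, 30, tzinfo=timezone.utc)
    print(http_date(changed))
    print(revalidate("Sat, 29 Jun 2002 14:30:00 GMT", changed))
```

Revalidating costs a round trip to the server even when the answer is 304; trusting Expires avoids that trip at the risk of serving a stale copy until the timeout.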
Request / Response Timeline
Time axis:
Client to Server: SYN
Server to Client: SYN+ACK
Client to Server: ACK + GET /food.html HTTP/1.0
Server to Client: HTTP/1.0 200 OK
Web Pages w/ Multiple Requests
Time
Persistent Connections
Pipelining
Connection: Keep-Alive
Persistent Connections
HTTP 1.0 -- Connections close by default
No need for Content-length, end signaled by EOF (in-band signal)
HTTP 1.1 -- Persistent by default
Must use Content-length
Chunked-Transfer Encoding
Problem: Content-length is costly for the server
Solution:
Server omits Content-Length
Transfer-encoding: chunked
Send data in chunks, each prefixed by its length in hex
End is marked with a chunk of length 0 (in-band signal, like in POP)
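The chunk format can be sketched in a few lines. This is an illustration with made-up function names; it handles the basic layout only and ignores optional trailer headers after the last chunk.

```python
def encode_chunked(pieces):
    """Encode an iterable of byte strings as a chunked message body."""
    out = b""
    for piece in pieces:
        if piece:  # an empty chunk would signal the end too early
            out += f"{len(piece):X}\r\n".encode("ascii") + piece + b"\r\n"
    return out + b"0\r\n\r\n"  # chunk of length 0 marks the end


def decode_chunked(data):
    """Decode a complete chunked body back into the original bytes."""
    body, pos = b"", 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size line is hex; anything after ";" is a chunk extension.
        size = int(data[pos:line_end].split(b";")[0].decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            return body
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and the CRLF after it


if __name__ == "__main__":
    wire = encode_chunked([b"Hello, ", b"world"])
    print(wire)
    print(decode_chunked(wire))
```

The server can start sending before it knows the total size, which is the point of omitting Content-Length.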
Publishing pages
Viewing your pages via the Internet
Send pages to a web server
How a page is distributed
Server space maintenance
Common ways (protocols) of sending pages to a web server:
FTP
SSH
Through a mapped drive
Publishing pages
Sending pages via FTP to a web server
File Transfer Protocol
Plain-text data transfer
Microsoft’s FTP client
ftp://username:[email protected]
ftp://[email protected]
Other FTP clients:
WS_FTP
CuteFTP
Publishing pages
Exercise: Publish pages to usiweb.usi.edu through an FTP client
Go to your desktop
Open “My Computer” or “Internet Explorer”
Type the following URL into the address field: Oakland’s ftp address
Publishing pages
Sending pages via SSH
Secure Shell
Encrypted data transfer
More secure than FTP
SSH used with www.oakland.edu access for off-campus
SSH clients:
SSH Secure Shell
PuTTY (Free Win32 SSH client)
Publishing pages
Sending pages through a mapped drive
Method used to publish on campus to www.oakland.edu. Normally labeled “H:” or “the H drive”.
Publishing pages
Creating a mapped drive for www.usi.edu
Make sure you have access to on the web server
If not: have your fiscal agent send an e-mail to Web Services approving access
Go to your computer’s desktop
Right click on the “My Computer” icon
Select “Map Network Drive…”
Enter Q: as the drive letter
Enter \\www\www_usi as the folder
Click on Finish or OK
Traverse to the folder you have access to using FrontPage or another web publishing application