Lecture 13
Dynamic Web Servers &
Common Gateway Interface
CPE 401 / 601
Computer Network Systems
slides are modified from Dave Hollinger
Web Server
Talks HTTP
Looks at METHOD, URI to determine what
the client wants.
For GET, URI often is just the path of a file
relative to some directory on the web server
Dynamic Web Servers
[Figure: the request GET /foo/blah resolved against the web server's directory tree; nodes shown: /, usr, bin, foo, www, fun, etc, gif, blah]
In the good old days...
Years ago
WWW was made up of (mostly) static
documents.
Each URL corresponded to a single file stored on
some hard disk.
Today
Many of the documents on the WWW are
built at request time.
URL doesn’t correspond to a single file.
Dynamic Documents
Dynamic Documents can provide:
automation of web site maintenance
customized advertising
database access
shopping carts
date and time service
…
Web Programming
Writing programs that create dynamic
documents has become very important.
There are a number of general approaches:
Create custom server for each service desired.
• Each is available on different port.
Have web server run external programs.
Develop a real smart web server
• SSI, scripting, server APIs.
Custom Server
Write a TCP server that watches a “well
known” port for requests.
Develop a mapping from HTTP requests to
service requests.
Send back HTML (or whatever) that is
created/selected by the server process.
Have to handle http errors, headers, etc.
An Example Custom Server
We want to provide a time and date service.
Anyone in the world can find out the date and
time according to our computer!!!
We don’t care what is in the http request,
our reply doesn’t depend on it.
We assume the request comes from a
browser that wants the content formatted
as an HTML document.
WWW based time and date server
Listen on a well known TCP port.
Accept a connection.
Find out the current time and date
Convert time and date to a string
Send back some http headers (Content-Type)
Send the string wrapped in HTML formatting.
Close the connection.
loop forever
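The steps above can be sketched in C roughly as follows. This is only a minimal sketch, not the course's reference solution: the function names format_time_reply and serve_time_forever, the timestamp format, and the HTTP/1.0 status line are assumptions made for illustration. As the slides say, the request itself is never read because the reply does not depend on it.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Build the whole HTTP reply for time t in buf; returns its length.
   (Hypothetical helper; the date format is an arbitrary choice.) */
int format_time_reply(char *buf, size_t size, time_t t)
{
    char when[64];
    struct tm tmv;

    localtime_r(&t, &tmv);                        /* find out time and date */
    strftime(when, sizeof when, "%Y-%m-%d %H:%M:%S", &tmv);  /* to a string */
    return snprintf(buf, size,
                    "HTTP/1.0 200 OK\r\n"
                    "Content-Type: text/html\r\n"
                    "\r\n"
                    "<HTML><BODY><H1>%s</H1></BODY></HTML>\n", when);
}

/* Listen on the given TCP port and answer every connection with the time.
   Returns -1 if the listening socket cannot be set up; otherwise never returns. */
int serve_time_forever(unsigned short port)
{
    struct sockaddr_in addr;
    char reply[256];
    int one = 1;
    int ld = socket(AF_INET, SOCK_STREAM, 0);

    if (ld < 0)
        return -1;
    setsockopt(ld, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(ld, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(ld, 5) < 0) {
        close(ld);
        return -1;
    }

    for (;;) {                                    /* loop forever */
        int sd = accept(ld, NULL, NULL);          /* accept a connection */
        if (sd < 0)
            continue;
        int n = format_time_reply(reply, sizeof reply, time(NULL));
        if (n >= (int)sizeof reply)               /* reply was truncated */
            n = (int)sizeof reply - 1;
        if (n > 0 && write(sd, reply, (size_t)n) < 0) {
            /* client went away; nothing useful to do in this sketch */
        }
        close(sd);                                /* close the connection */
    }
}
```

A main() would simply call serve_time_forever(4567), or whatever port the service is advertised on.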
Another Example: Counter
Keep track of how many times our server is
hit each day.
Report on the number of hits our server got
on any day in the past!
The reply now does depend on the request.
We have to remember that the request
comes from an HTTP client,
so we need to accept HTTP requests.
Time & Date Hit Server
Each request comes as a string (URI)
specifying a resource.
Our requests will look like this:
/mm/dd/yyyy
An example URL for our service:
http://www.timedate.com:4567/02/10/2000
We will get a request like:
GET /02/10/2000 HTTP/1.1
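One possible way to pull the month, day and year out of such a request line is sscanf. The function name parse_date_request and the range checks below are assumptions for illustration; the sketch only accepts GET and HTTP/1.0 or HTTP/1.1.

```c
#include <stdio.h>
#include <string.h>

/* Parse a request line of the form "GET /mm/dd/yyyy HTTP/1.x".
   Returns 0 and fills month/day/year on success, -1 otherwise.
   (Hypothetical helper, not part of the original slides.) */
int parse_date_request(const char *line, int *month, int *day, int *year)
{
    char method[8];
    int m, d, y;
    int used = 0;   /* set by %n only if the whole pattern matched */

    if (sscanf(line, "%7s /%2d/%2d/%4d HTTP/1.%*1[01]%n",
               method, &m, &d, &y, &used) != 4)
        return -1;
    if (used == 0 || strcmp(method, "GET") != 0)
        return -1;
    if (m < 1 || m > 12 || d < 1 || d > 31 || y < 1)
        return -1;  /* coarse sanity check; does not validate Feb 30 etc. */

    *month = m;
    *day = d;
    *year = y;
    return 0;
}
```

The request headers that follow the first line are ignored here; a real server would still have to read them up to the blank line.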
New code
Record the “hit” in database.
Read request - parse request to
month,day,year
Lookup hits for month,day,year in database.
Send back some http headers (Content-Type)
Create HTML table and send back to client.
Close the connection.
Drawbacks to Custom Server Approach
We might have lots of ideas for custom services.
Each requires dedicated address (port)
Each needs to include:
• basic TCP server code
• parsing HTTP requests
• error handling
• headers
• access control
Another Approach
Take a general purpose Web server (that can
handle static documents) and
have it process requested documents as it sends
them to the client.
The documents could contain commands that
the server understands
the server includes some kind of interpreter.
Example Smart Server
Have the server read each HTML file as it
sends it to the client.
The server could look for this:
<SERVERCODE> some command </SERVERCODE>
The server doesn’t send this part to the
client, instead it interprets the command and
sends the result to the client.
Everything else is sent normally.
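As a rough sketch of this idea, the server's send loop could scan the document for the made-up SERVERCODE tags and hand each command to an interpreter. Everything here is hypothetical (the tag syntax is the slides' invention, and expand_servercode, command_fn and sample_run are names chosen for illustration); commands longer than 127 characters are simply truncated.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

#define OPEN_TAG  "<SERVERCODE>"
#define CLOSE_TAG "</SERVERCODE>"

/* A command interpreter writes the result of cmd to out. */
typedef void (*command_fn)(const char *cmd, FILE *out);

/* Sample interpreter that understands only "time" and "date". */
void sample_run(const char *cmd, FILE *out)
{
    char text[64];
    time_t now = time(NULL);
    struct tm *tmv = localtime(&now);

    if (strcmp(cmd, "time") == 0 && tmv != NULL) {
        strftime(text, sizeof text, "%H:%M:%S", tmv);
        fputs(text, out);
    } else if (strcmp(cmd, "date") == 0 && tmv != NULL) {
        strftime(text, sizeof text, "%Y-%m-%d", tmv);
        fputs(text, out);
    } else {
        fprintf(out, "[unknown command: %s]", cmd);
    }
}

/* Copy page to out, replacing every <SERVERCODE> cmd </SERVERCODE>
   with whatever run(cmd, out) writes. */
void expand_servercode(const char *page, FILE *out, command_fn run)
{
    const char *p = page;

    for (;;) {
        const char *tag_open = strstr(p, OPEN_TAG);
        const char *tag_close = tag_open ? strstr(tag_open, CLOSE_TAG) : NULL;

        if (tag_close == NULL) {      /* no complete tag left: send the rest */
            fputs(p, out);
            return;
        }
        fwrite(p, 1, (size_t)(tag_open - p), out);   /* text before the tag */

        /* Extract the command and trim surrounding spaces. */
        const char *start = tag_open + strlen(OPEN_TAG);
        const char *end = tag_close;
        while (start < end && *start == ' ')
            start++;
        while (end > start && end[-1] == ' ')
            end--;

        char cmd[128];
        size_t len = (size_t)(end - start);
        if (len >= sizeof cmd)
            len = sizeof cmd - 1;
        memcpy(cmd, start, len);
        cmd[len] = '\0';

        run(cmd, out);                /* send the result, not the tag */
        p = tag_close + strlen(CLOSE_TAG);
    }
}
```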
Example Document
<TITLE>timedate.com Home Page</TITLE>
<H1 ALIGN=CENTER>Welcome to timedate.com</H1>
<SERVERCODE> include fancygraphic </SERVERCODE>
The current time is
<SERVERCODE> time </SERVERCODE>.<P>
Today is <SERVERCODE> date </SERVERCODE>.
Visit our sponsor:
<SERVERCODE> random sponsor </SERVERCODE>
Real Life - Server Side Includes
Many real web servers support this idea
but not the syntax we’ve shown.
Server Side Includes (SSI) provides a set of
commands that a server will interpret.
Typically the server is configured to look for
commands only in specially marked documents
so normal documents aren’t slowed down
SSI Directives
SSI commands are called directives.
Directives are embedded in HTML comments.
A comment looks like this:
<!-- this is an HTML comment -->
A directive looks like this:
<!--#command parameter="arg" -->
Some SSI Directives
SSI servers keep a number of useful things
in environment variables:
DOCUMENT_NAME, DOCUMENT_URI
echo: inserts the value of an environment
variable into the page.
This page is located at
<!--#echo var="DOCUMENT_URI" -->.
SSI Directives
include: inserts the contents of a text file.
<!--#include file="banner.html" -->
flastmod: inserts the time and date that a
file was last modified.
Last modified:
<!--#flastmod file="foo.html" -->
SSI Directives (cont.)
exec: runs an external program and inserts
the output of the program.
Current users:
<!--#exec cmd="/usr/bin/who" -->
Danger! Danger! Danger!
More Power
Some servers support elaborate scripting
languages.
Scripts are embedded in HTML documents,
the server interprets the script:
Microsoft Active Server Pages (ASP)
• JScript, VBScript, PerlScript
Netscape LiveWire
• JavaScript, SQL connection library.
There are others...
Server Mapping and APIs
Some servers include a programming interface
that allows us to extend the capabilities of
the server by writing modules.
Specific URLs are mapped to specific modules
instead of to files.
We could write our timedate.com server as a
module and merge it with the web server.
External Programs
Another approach is to provide a standard
interface between external programs and web
servers.
We can run the same program from any web
server.
The web server handles all the http,
• we focus on the special service only.
It doesn’t matter what language we use to write
the external program.
Common Gateway Interface
CGI is a standard interface to external
programs supported by most (if not all) web
servers.
The interface that is defined by CGI includes:
Identification of the service
• external program
Mechanism for passing the request to the external
program.
CGI Programming
We will focus on CGI programming.
CGI programs are often written in
scripting languages (Perl, Tcl, etc.);
we will concentrate on C.
CGI
CGI Programming
[Figure: the CLIENT talks to the HTTP SERVER, which in turn talks to the CGI Program]
Common Gateway Interface
CGI is a standard mechanism for:
Associating URLs with programs that can be run
by a web server.
A protocol (of sorts) for how the request is
passed to the external program.
How the external program sends the response to
the client.
CGI URLs
There is some mapping between URLs and CGI
programs provided by a web server.
The exact mapping is not standardized
• web server admin can set it up
Typically:
requests that start with /CGI-BIN/ , /cgi-bin/
or /cgi/, etc. refer to CGI programs
• not to static documents.
Request → CGI program
The web server sets some environment
variables with information about the request.
The web server fork()s and the child
process exec()s the CGI program.
The CGI program gets information about the
request from environment variables.
STDIN, STDOUT
Before calling exec(), the child process
sets up pipes so that
stdin comes from the web server and
stdout goes to the web server.
In some cases part of the request is read
from stdin.
Anything written to stdout is forwarded by
the web server to the client.
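From the server's side, the fork/exec and pipe setup described on the last two slides might look roughly like this. It is a simplified sketch under stated assumptions: run_cgi is a hypothetical name, only stdout is wired up (a POST would also need a pipe connected to the child's stdin), and the child's environment contains nothing but the two CGI variables shown.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Run the CGI program at path and collect what it writes to stdout.
   Returns the number of bytes stored in out (always NUL-terminated),
   or -1 on error. Output beyond size-1 bytes is discarded. */
ssize_t run_cgi(const char *path, const char *method, const char *query,
                char *out, size_t size)
{
    int from_cgi[2];               /* [0] = read end, [1] = write end */

    if (size == 0 || pipe(from_cgi) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(from_cgi[0]);
        close(from_cgi[1]);
        return -1;
    }

    if (pid == 0) {                /* child: becomes the CGI program */
        char env_method[64], env_query[1024];
        char *envp[] = { env_method, env_query, NULL };

        /* Information about the request goes into environment variables. */
        snprintf(env_method, sizeof env_method, "REQUEST_METHOD=%s", method);
        snprintf(env_query, sizeof env_query, "QUERY_STRING=%s", query);

        dup2(from_cgi[1], STDOUT_FILENO);   /* stdout goes to the server */
        close(from_cgi[0]);
        close(from_cgi[1]);
        execle(path, path, (char *)NULL, envp);
        _exit(127);                /* only reached if exec failed */
    }

    /* parent: the web server reads the program's output */
    close(from_cgi[1]);
    size_t total = 0;
    ssize_t n;
    while (total < size - 1 &&
           (n = read(from_cgi[0], out + total, size - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(from_cgi[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

A real server would forward the collected output to the client socket instead of returning it in a buffer.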
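[Figure: the HTTP SERVER passes the request to the CGI Program through environment variables and stdin; the CGI Program replies to the HTTP SERVER through stdout]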
Important CGI
Environment Variables
REQUEST_METHOD
QUERY_STRING
CONTENT_LENGTH
Request Method: Get
GET requests can include a query string as
part of the URL:
GET /cgi-bin/login?mgunes HTTP/1.0
[Figure labels: Request Method = GET, Resource Name = /cgi-bin/login, Delimiter = ?, Query String = mgunes]
/cgi-bin/login?mgunes
The web server treats everything before
the ‘?’ delimiter as the resource name
In this case the resource name is the name
of a program.
Everything after the ‘?’ is a string that is
passed to the CGI program.
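The split at the '?' delimiter can be illustrated with a few lines of C. The helper name split_uri is an assumption; it modifies the URI in place.

```c
#include <string.h>

/* Split uri in place at the first '?'.
   Afterwards uri holds only the resource name; the return value points
   to the query string, or to an empty string if there was no '?'. */
char *split_uri(char *uri)
{
    char *q = strchr(uri, '?');

    if (q == NULL)
        return uri + strlen(uri);   /* no query: point at the final '\0' */
    *q = '\0';                      /* terminate the resource name */
    return q + 1;                   /* everything after the '?' */
}
```

For "/cgi-bin/login?mgunes" this leaves "/cgi-bin/login" as the resource name and returns "mgunes".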
Simple GET queries - ISINDEX
You can put an <ISINDEX> tag inside an
HTML document.
The browser will create a text box that
allows the user to enter a single string.
If an ACTION is specified in the ISINDEX
tag, when the user presses Enter,
a request will be sent to the server specified as
the ACTION.
ISINDEX Example
Enter a string:
<ISINDEX ACTION=http://foo.com/search.cgi>
Press Enter to submit your query.
If you enter the string “blahblah”,
the browser will send a request to the http server
at foo.com that looks like this:
GET /search.cgi?blahblah HTTP/1.1
What the CGI sees
The CGI Program gets REQUEST_METHOD
using getenv:
char *method;
method = getenv("REQUEST_METHOD");
if (method==NULL) … /* error! */
Getting the GET
If the request method is GET:
if (strcasecmp(method,"get")==0)
The next step is to get the query string
from the environment variable QUERY_STRING
char *query;
query = getenv("QUERY_STRING");
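Send back HTTP Response and Headers:
The CGI program can send back an HTTP
status line:
printf("HTTP/1.1 200 OK\r\n");
and headers:
printf("Content-type: text/html\r\n");
printf("\r\n");
(Under the CGI specification only "non-parsed header" (NPH)
programs send the raw status line; an ordinary CGI program
sets the status with a "Status: 200 OK" header instead.)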
Important!
CGI program doesn’t have to send a status line;
the HTTP server will do this for you if you don’t.
CGI program must always send back at least
one header line indicating the data type of the
content (usually text/html).
The web server will typically throw in a few
header lines of its own:
Date, Server, Connection
Simple GET handler
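#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *method, *query;
    method = getenv("REQUEST_METHOD");
    query = getenv("QUERY_STRING");
    if (method==NULL || query==NULL) {  /* error: not run by a web server? */
        printf("Status: 500 Internal Server Error\r\n");
        printf("Content-type: text/plain\r\n\r\nMissing CGI environment\n");
        return 1;
    }
    printf("Content-type: text/html\r\n\r\n");
    /* Caution: query is echoed as-is; real code should HTML-escape it. */
    printf("<H1>Your query was %s</H1>\n", query);
    return 0;
}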
URL-encoding
Browsers use an encoding when sending query
strings that include special characters.
Most nonalphanumeric characters are encoded as a
‘%’ followed by 2 ASCII encoded hex digits.
• ‘=‘ (which is hex 3D) becomes “%3D”
• ‘&’ becomes “%26”
The space character ‘ ‘ is replaced by ‘+’.
• Why? (think about project 2 parsing…)
The ‘+’ character is replaced by “%2B”
• “foo=6 + 7” becomes “foo%3D6+%2B+7”
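On the receiving side, a CGI program has to undo this encoding. A minimal decoder could look like the sketch below; url_decode is a hypothetical name, and the caller is assumed to provide a destination buffer of at least strlen(src) + 1 bytes (decoding never makes the string longer).

```c
#include <ctype.h>
#include <stdlib.h>

/* Decode the URL-encoded string src into dst.
   dst must have room for at least strlen(src) + 1 bytes. */
void url_decode(const char *src, char *dst)
{
    while (*src != '\0') {
        if (*src == '+') {                        /* '+' means space */
            *dst++ = ' ';
            src++;
        } else if (src[0] == '%' &&
                   isxdigit((unsigned char)src[1]) &&
                   isxdigit((unsigned char)src[2])) {
            char hex[3] = { src[1], src[2], '\0' };
            *dst++ = (char)strtol(hex, NULL, 16); /* "%3D" -> '=' */
            src += 3;
        } else {                                  /* ordinary character */
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

A '%' that is not followed by two hex digits is copied through unchanged; rejecting such input would be an equally reasonable choice.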
Security!!!
It is a very bad idea to build a command line
containing user input!
What if the user submits:
“ ; rm -r *;”
The command line the server runs becomes:
grep ; rm -r *; /usr/dict/words
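One way to reduce this risk, sketched below, is to accept only input that matches a strict whitelist and to start the program directly with exec rather than through a shell, so the input is never parsed as a command line. The names is_safe_word and grep_dictionary, the 64-character limit, and the letters-and-digits rule are assumptions chosen for this example.

```c
#include <ctype.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Returns 1 if word is non-empty, at most 64 characters long,
   and consists only of letters and digits; 0 otherwise. */
int is_safe_word(const char *word)
{
    size_t len = strlen(word);

    if (len == 0 || len > 64)
        return 0;
    for (size_t i = 0; i < len; i++)
        if (!isalnum((unsigned char)word[i]))
            return 0;
    return 1;
}

/* Search the dictionary for word without involving a shell.
   Returns grep's exit status, or -1 if the input was rejected
   or the child could not be started. */
int grep_dictionary(const char *word)
{
    if (!is_safe_word(word))
        return -1;                    /* refuse anything suspicious */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* word is passed as a single argv entry: ';' and '*' would have
           no special meaning here even if they had slipped through. */
        execlp("grep", "grep", "-e", word, "/usr/dict/words", (char *)NULL);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```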
Beyond ISINDEX - Forms
Many Web services require more than a simple
ISINDEX.
HTML includes support for forms:
lots of field types
user answers all kinds of annoying questions
entire contents of form must be stuck together
and put in QUERY_STRING by the Web server.
Form Fields
Each field within a form has a name and a value.
The browser creates a query that
includes a sequence of “name=value” substrings and
sticks them together separated by the ‘&’ character.
If user types in “Mehmet H.” as the name and
“none” for occupation,
the query would look like this:
“name=Mehmet+H%2E&occupation=none”
HTML Forms
Each form includes a METHOD that
determines what http method is used to
submit the request.
Each form includes an ACTION that
determines where the request is made.
An HTML Form
<FORM METHOD=GET
ACTION=http://foo.com/signup.cgi>
Name:
<INPUT TYPE=TEXT NAME=name><BR>
Occupation:
<INPUT TYPE=TEXT NAME=occupation><BR>
<INPUT TYPE=SUBMIT>
</FORM>
What a CGI will get
The query (from the environment variable
QUERY_STRING) will be
a URL-encoded string containing the name,value
pairs of all form fields.
The CGI must decode the query and separate
the individual fields.
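Decoding and field separation can be combined in one lookup routine. The sketch below is one possible approach, not the only one: get_field and decode_n are hypothetical names, field names are compared without being decoded first, and values that do not fit the output buffer are truncated.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Decode the first n characters of URL-encoded src into dst,
   writing at most dstsize bytes including the terminating '\0'. */
static void decode_n(const char *src, size_t n, char *dst, size_t dstsize)
{
    size_t i = 0, o = 0;

    if (dstsize == 0)
        return;
    while (i < n && o + 1 < dstsize) {
        if (src[i] == '+') {
            dst[o++] = ' ';
            i++;
        } else if (src[i] == '%' && i + 2 < n &&
                   isxdigit((unsigned char)src[i + 1]) &&
                   isxdigit((unsigned char)src[i + 2])) {
            char hex[3] = { src[i + 1], src[i + 2], '\0' };
            dst[o++] = (char)strtol(hex, NULL, 16);
            i += 3;
        } else {
            dst[o++] = src[i++];
        }
    }
    dst[o] = '\0';
}

/* Look up field `name` in a query such as "a=b&c=d".
   On success the decoded value is stored in out and 1 is returned;
   if the field is not present, 0 is returned. */
int get_field(const char *query, const char *name, char *out, size_t outsize)
{
    size_t namelen = strlen(name);
    const char *p = query;

    while (*p != '\0') {
        const char *end = strchr(p, '&');         /* end of this pair */
        if (end == NULL)
            end = p + strlen(p);
        const char *eq = memchr(p, '=', (size_t)(end - p));

        if (eq != NULL && (size_t)(eq - p) == namelen &&
            strncmp(p, name, namelen) == 0) {
            decode_n(eq + 1, (size_t)(end - eq - 1), out, outsize);
            return 1;
        }
        p = (*end == '&') ? end + 1 : end;        /* next pair */
    }
    return 0;
}
```

With the query from the Form Fields slide, looking up "name" yields "Mehmet H." and looking up "occupation" yields "none".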
HTTP Method: POST
The HTTP POST method delivers data from
the browser as the content of the request.
The GET method delivers data (query) as
part of the URI.
HTML Form using POST
Set the form method to POST instead of GET.
<FORM METHOD=POST ACTION=…>
CGI
51
GET vs. POST
When using forms it’s generally better to use
POST:
there are limits on the maximum size of a GET
query string (environment variable)
a POST query string doesn’t show up in the browser
as part of the current URL.
CGI reading POST
If REQUEST_METHOD is POST,
the query comes in on STDIN.
The environment variable CONTENT_LENGTH
tells us how much data to read.
Possible Problem
char buff[100];
char *clen = getenv("CONTENT_LENGTH");
if (clen==NULL)
    … /* handle error */
int len = atoi(clen);
/* Problem: len comes from the client and is never checked against sizeof(buff) */
if (read(0,buff,len)<0)
    … /* handle error */
pray_for(!hacker);
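A more defensive version checks the advertised length against the buffer size before reading anything. The following is a sketch under assumptions: read_post_body is a hypothetical name, and bodies that do not fit the buffer are rejected outright rather than truncated.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Read a POST body of CONTENT_LENGTH bytes from fd into buf.
   clen is the value of the CONTENT_LENGTH environment variable.
   Returns the body length (buf is NUL-terminated), or -1 if the length
   is missing, malformed, negative, too large for buf, or the read fails. */
long read_post_body(int fd, const char *clen, char *buf, size_t bufsize)
{
    char *end;

    if (clen == NULL || bufsize == 0)
        return -1;

    errno = 0;
    long len = strtol(clen, &end, 10);
    if (errno != 0 || end == clen || *end != '\0')
        return -1;                        /* not a clean decimal number */
    if (len < 0 || (size_t)len >= bufsize)
        return -1;                        /* would overflow the buffer */

    long got = 0;
    while (got < len) {                   /* read() may return short counts */
        ssize_t n = read(fd, buf + got, (size_t)(len - got));
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            return -1;                    /* error or premature end of input */
        got += n;
    }
    buf[got] = '\0';
    return got;
}
```

In a CGI program the call would be read_post_body(STDIN_FILENO, getenv("CONTENT_LENGTH"), buff, sizeof buff).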
CGI Method summary
GET:
REQUEST_METHOD is “GET”
QUERY_STRING is the query
POST:
REQUEST_METHOD is “POST”
CONTENT_LENGTH is the size of the query
query can be read from STDIN