Lecture 13 Dynamic Web Servers & Common Gateway Interface CPE 401 / 601 Computer Network Systems slides are modified from Dave Hollinger.

Lecture 13
Dynamic Web Servers &
Common Gateway Interface
CPE 401 / 601
Computer Network Systems
slides are modified from Dave Hollinger
Web Server
• Talks HTTP
• Looks at METHOD, URI to determine what the client wants.
• For GET, URI often is just the path of a file relative to some directory on the web server
[Figure: the request GET /foo/blah mapped onto the server's directory tree. Names shown in the tree: /, usr, bin, foo, www, fun, etc, gif, blah.]
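A minimal sketch of the URI-to-file mapping described above, assuming the server keeps its documents under a single directory. The function name `map_uri_to_path` and the document root `/www` are hypothetical, and the `..` check is only a crude guard, not a complete defense.

```c
#include <stdio.h>
#include <string.h>

/* Join the document root and the request URI into a file path.
 * Returns 0 on success, -1 if the URI is rejected or does not fit. */
int map_uri_to_path(const char *docroot, const char *uri,
                    char *out, size_t cap)
{
    if (uri[0] != '/')                 /* URI must be an absolute path */
        return -1;
    if (strstr(uri, "..") != NULL)     /* crude guard against leaving docroot */
        return -1;
    int n = snprintf(out, cap, "%s%s", docroot, uri);
    if (n < 0 || (size_t)n >= cap)     /* result was truncated */
        return -1;
    return 0;
}
```

With a document root of `/www`, the request `GET /foo/blah` would be served from `/www/foo/blah`.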
In the good old days...
Years ago
• WWW was made up of (mostly) static documents.
  - Each URL corresponded to a single file stored on some hard disk.
Today
• Many of the documents on the WWW are built at request time.
  - URL doesn’t correspond to a single file.
Dynamic Documents
• Dynamic Documents can provide:
  - automation of web site maintenance
  - customized advertising
  - database access
  - shopping carts
  - date and time service
  - …
Web Programming
• Writing programs that create dynamic documents has become very important.
• There are a number of general approaches:
  - Create custom server for each service desired.
    - Each is available on different port.
  - Have web server run external programs.
  - Develop a really smart web server
    - SSI, scripting, server APIs.
Custom Server
• Write a TCP server that watches a “well known” port for requests.
• Develop a mapping from HTTP requests to service requests.
• Send back HTML (or whatever) that is created/selected by the server process.
• Have to handle HTTP errors, headers, etc.
An Example Custom Server
• We want to provide a time and date service.
• Anyone in the world can find out the date and time
  - according to our computer!!!
• We don’t care what is in the HTTP request, our reply doesn’t depend on it.
• We assume the request comes from a browser that wants the content formatted as an HTML document.
WWW based time and date server
• Listen on a well known TCP port.
• Loop forever:
  - Accept a connection.
  - Find out the current time and date.
  - Convert time and date to a string.
  - Send back some HTTP headers (Content-Type).
  - Send the string wrapped in HTML formatting.
  - Close the connection.
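The steps above can be sketched in C roughly as follows. This is an illustration under assumptions, not the course's reference server: the names `build_time_reply` and `serve_time_forever` are made up, the time format is arbitrary, and the request is never read because the reply does not depend on it.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Build the whole reply (status line, header, HTML body) for time `now`.
 * Returns the number of bytes written to `out`, or -1 if it does not fit. */
int build_time_reply(char *out, size_t cap, time_t now)
{
    char when[64];
    struct tm *t = localtime(&now);
    if (t == NULL || strftime(when, sizeof when, "%a %b %d %H:%M:%S %Y", t) == 0)
        return -1;
    int n = snprintf(out, cap,
                     "HTTP/1.0 200 OK\r\n"
                     "Content-Type: text/html\r\n"
                     "\r\n"
                     "<HTML><BODY><H1>%s</H1></BODY></HTML>\n", when);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Listen on `port` and answer every connection with the current time.
 * Returns -1 if setup fails; otherwise it never returns. */
int serve_time_forever(unsigned short port)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;

    int on = 1;
    setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, 16) < 0) {
        close(lfd);
        return -1;
    }

    for (;;) {                                  /* loop forever */
        int cfd = accept(lfd, NULL, NULL);      /* accept a connection */
        if (cfd < 0)
            continue;
        char reply[512];
        int n = build_time_reply(reply, sizeof reply, time(NULL));
        if (n > 0 && write(cfd, reply, (size_t)n) < 0)
            perror("write");
        close(cfd);                             /* close the connection */
    }
}
```

Calling `serve_time_forever(4567)` from `main` would start the service on port 4567, the port used in the example URL later in these slides.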
Another Example: Counter
• Keep track of how many times our server is hit each day.
• Report on the number of hits our server got on any day in the past!
• The reply now does depend on the request.
• We have to remember that the request comes from an HTTP client,
  - so we need to accept HTTP requests.
Time & Date Hit Server
• Each request comes as a string (URI) specifying a resource.
• Our requests will look like this:
/mm/dd/yyyy
• An example URL for our service:
http://www.timedate.com:4567/02/10/2000
• We will get a request like:
GET /02/10/2000 HTTP/1.1
New code
• Record the “hit” in database.
• Read the request and parse it into month, day, year.
• Look up hits for month, day, year in database.
• Send back some HTTP headers (Content-Type).
• Create HTML table and send back to client.
• Close the connection.
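The "parse request" step could look something like this. It is a sketch: `parse_date_request` is a hypothetical name, only GET is accepted, and the range check on the day is deliberately loose.

```c
#include <stdio.h>
#include <string.h>

/* Parse a request line of the form "GET /mm/dd/yyyy HTTP/x.y".
 * Returns 0 and fills month/day/year on success, -1 on bad input. */
int parse_date_request(const char *line, int *month, int *day, int *year)
{
    char method[8];
    int m, d, y;
    if (sscanf(line, "%7s /%2d/%2d/%4d", method, &m, &d, &y) != 4)
        return -1;                      /* not in /mm/dd/yyyy form */
    if (strcmp(method, "GET") != 0)
        return -1;                      /* only GET is handled here */
    if (m < 1 || m > 12 || d < 1 || d > 31)
        return -1;                      /* loose sanity check only */
    *month = m;
    *day = d;
    *year = y;
    return 0;
}
```

For the request `GET /02/10/2000 HTTP/1.1` this yields month 2, day 10, year 2000, which the server can then use as the database key.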
Drawbacks to Custom Server Approach
• We might have lots of ideas for custom services.
  - Each requires dedicated address (port)
  - Each needs to include:
    - basic TCP server code
    - parsing HTTP requests
    - error handling
    - headers
    - access control
Another Approach
• Take a general purpose Web server (that can handle static documents) and
  - have it process requested documents as it sends them to the client.
• The documents could contain commands that the server understands
  - the server includes some kind of interpreter.
Example Smart Server
• Have the server read each HTML file as it sends it to the client.
• The server could look for this:
<SERVERCODE> some command </SERVERCODE>
• The server doesn’t send this part to the client, instead it interprets the command and sends the result to the client.
• Everything else is sent normally.
Example Document
<TITLE>timedate.com Home Page</TITLE>
<H1 ALIGN=CENTER>Welcome to timedate.com</H1>
<SERVERCODE> include fancygraphic </SERVERCODE>
The current time is
<SERVERCODE> time </SERVERCODE>.<P>
Today is <SERVERCODE> date </SERVERCODE>.
Visit our sponsor:
<SERVERCODE> random sponsor </SERVERCODE>
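To make the idea concrete, here is a rough sketch of how a server might expand the made-up `<SERVERCODE>` tags while copying a document. Everything in it is an assumption for illustration: the names `run_command` and `expand_servercode`, the output formats, and the choice to handle only `time` and `date` (any other command expands to nothing).

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

#define OPEN_TAG  "<SERVERCODE>"
#define CLOSE_TAG "</SERVERCODE>"

/* Interpret one command; only "time" and "date" are handled in this sketch. */
static void run_command(const char *cmd, char *out, size_t cap, time_t now)
{
    struct tm *t = localtime(&now);
    out[0] = '\0';
    if (t == NULL)
        return;
    if (strstr(cmd, "time") != NULL)
        strftime(out, cap, "%H:%M:%S", t);
    else if (strstr(cmd, "date") != NULL)
        strftime(out, cap, "%Y-%m-%d", t);
}

/* Copy `doc` to `out`, replacing each tagged command with its result.
 * Returns 0 on success, -1 if `out` is too small or a tag is never closed. */
int expand_servercode(const char *doc, char *out, size_t cap, time_t now)
{
    size_t used = 0;
    while (*doc != '\0') {
        const char *start = strstr(doc, OPEN_TAG);
        size_t plain = start ? (size_t)(start - doc) : strlen(doc);
        if (used + plain >= cap)
            return -1;
        memcpy(out + used, doc, plain);     /* everything else is sent normally */
        used += plain;
        if (start == NULL)
            break;

        const char *cmd = start + strlen(OPEN_TAG);
        const char *end = strstr(cmd, CLOSE_TAG);
        if (end == NULL)
            return -1;                      /* unterminated tag */

        char command[64], result[64];
        size_t clen = (size_t)(end - cmd);
        if (clen >= sizeof command)
            return -1;
        memcpy(command, cmd, clen);
        command[clen] = '\0';

        run_command(command, result, sizeof result, now);
        size_t rlen = strlen(result);
        if (used + rlen >= cap)
            return -1;
        memcpy(out + used, result, rlen);   /* send the result, not the tag */
        used += rlen;
        doc = end + strlen(CLOSE_TAG);
    }
    out[used] = '\0';
    return 0;
}
```

Passing the current time in as a parameter, rather than calling `time()` inside, keeps the expansion step easy to test.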
Real Life - Server Side Includes
• Many real web servers support this idea
  - but not the syntax we’ve shown.
• Server Side Includes (SSI) provides a set of commands that a server will interpret.
• Typically the server is configured to look for commands only in specially marked documents
  - so normal documents aren’t slowed down
SSI Directives
• SSI commands are called directives
• Directives are embedded in HTML comments.
• A comment looks like this:
<!-- this is a comment -->
• A directive looks like this:
<!--#command parameter="argument" -->
Some SSI Directives
• SSI servers keep a number of useful things in environment variables:
DOCUMENT_NAME, DOCUMENT_URI
• echo: inserts the value of an environment variable into the page.
This page is located at
<!--#echo var="DOCUMENT_URI" -->.
SSI Directives
• include: inserts the contents of a text file.
<!--#include file="banner.html" -->
• flastmod: inserts the time and date that a file was last modified.
Last modified:
<!--#flastmod file="foo.html" -->
SSI Directives (cont.)
• exec: runs an external program and inserts the output of the program.
Current users:
<!--#exec cmd="/usr/bin/who" -->
Danger! Danger! Danger!
More Power
• Some servers support elaborate scripting languages.
• Scripts are embedded in HTML documents, the server interprets the script:
  - Microsoft Active Server Pages (ASP)
    - JScript, VBScript, PerlScript
  - Netscape LiveWire
    - JavaScript, SQL connection library.
  - There are others...
Server Mapping and APIs
• Some servers include a programming interface that allows us to extend the capabilities of the server by writing modules.
• Specific URLs are mapped to specific modules instead of to files.
• We could write our timedate.com server as a module and merge it with the web server.
External Programs
• Another approach is to provide a standard interface between external programs and web servers.
  - We can run the same program from any web server.
  - The web server handles all the HTTP,
    - we focus on the special service only.
  - It doesn’t matter what language we use to write the external program.
Common Gateway Interface
• CGI is a standard interface to external programs supported by most (if not all) web servers.
• The interface that is defined by CGI includes:
  - Identification of the service
    - external program
  - Mechanism for passing the request to the external program.
CGI Programming
• We will focus on CGI programming.
• CGI programs are often written in scripting languages (Perl, Tcl, etc.),
  - we will concentrate on C
CGI Programming
[Figure: CLIENT, HTTP SERVER, and CGI Program.]
Common Gateway Interface
• CGI is a standard mechanism for:
  - Associating URLs with programs that can be run by a web server.
  - A protocol (of sorts) for how the request is passed to the external program.
  - How the external program sends the response to the client.
CGI URLs
• There is some mapping between URLs and CGI programs provided by a web server.
  - The exact mapping is not standardized
    - web server admin can set it up
• Typically:
  - requests that start with /CGI-BIN/, /cgi-bin/ or /cgi/, etc. refer to CGI programs
    - not to static documents.
Request -> CGI program
• The web server sets some environment variables with information about the request.
• The web server fork()s and the child process exec()s the CGI program.
• The CGI program gets information about the request from environment variables.
STDIN, STDOUT
• Before calling exec(), the child process sets up pipes so that
  - stdin comes from the web server and
  - stdout goes to the web server.
• In some cases part of the request is read from stdin.
• Anything written to stdout is forwarded by the web server to the client.
[Figure: the HTTP SERVER hands the CGI Program its request through environment variables and stdin; the CGI Program's stdout goes back to the HTTP SERVER.]
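Seen from the server side, the fork/exec/pipe arrangement might be sketched like this. It is a simplified illustration for a GET request only: `run_cgi_get` is a hypothetical name, the environment is handed to the child through `execle` rather than set in the server process, and the stdin pipe is left out because a GET carries no body.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Run `program` as a CGI child for a GET request and capture its stdout.
 * Returns the number of bytes captured (`out` is NUL-terminated), or -1. */
long run_cgi_get(const char *program, const char *query, char *out, size_t cap)
{
    char method_var[] = "REQUEST_METHOD=GET";
    char query_var[1024];
    int n = snprintf(query_var, sizeof query_var, "QUERY_STRING=%s", query);
    if (cap == 0 || n < 0 || (size_t)n >= sizeof query_var)
        return -1;
    char *envp[] = { method_var, query_var, NULL };

    int from_child[2];
    if (pipe(from_child) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(from_child[0]);
        close(from_child[1]);
        return -1;
    }
    if (pid == 0) {                              /* child: becomes the CGI program */
        dup2(from_child[1], STDOUT_FILENO);      /* stdout goes to the web server */
        close(from_child[0]);
        close(from_child[1]);
        execle(program, program, (char *)NULL, envp);
        _exit(127);                              /* only reached if exec failed */
    }

    close(from_child[1]);                        /* parent: the web server */
    size_t used = 0;
    ssize_t got;
    while (used + 1 < cap &&
           (got = read(from_child[0], out + used, cap - 1 - used)) > 0)
        used += (size_t)got;
    out[used] = '\0';
    close(from_child[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (long)used;
}
```

A real server would forward the captured bytes to the client and, for POST, add a second pipe feeding the child's stdin.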
Important CGI
Environment Variables
REQUEST_METHOD
QUERY_STRING
CONTENT_LENGTH
Request Method: GET
• GET requests can include a query string as part of the URL:
GET /cgi-bin/login?mgunes HTTP/1.0
  - Request Method: GET
  - Resource Name: /cgi-bin/login
  - Delimiter: ?
  - Query String: mgunes
/cgi-bin/login?mgunes
• The web server treats everything before the ‘?’ delimiter as the resource name
• In this case the resource name is the name of a program.
• Everything after the ‘?’ is a string that is passed to the CGI program.
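The split at the ‘?’ delimiter is simple enough to show directly. A minimal sketch, with `split_query` as a hypothetical helper name:

```c
#include <string.h>

/* Split `uri` in place at the first '?'.
 * Returns a pointer to the query string, or NULL if there is none;
 * afterwards `uri` holds only the resource name. */
char *split_query(char *uri)
{
    char *q = strchr(uri, '?');
    if (q == NULL)
        return NULL;        /* no query string in this request */
    *q = '\0';              /* terminate the resource name */
    return q + 1;
}
```

Applied to `/cgi-bin/login?mgunes`, the resource name becomes `/cgi-bin/login` and the returned query string is `mgunes`.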
Simple GET queries - ISINDEX
• You can put an <ISINDEX> tag inside an HTML document.
• The browser will create a text box that allows the user to enter a single string.
• If an ACTION is specified in the ISINDEX tag, when the user presses Enter,
  - a request will be sent to the server specified as the ACTION.
ISINDEX Example
Enter a string:
<ISINDEX ACTION=http://foo.com/search.cgi>
Press Enter to submit your query.
• If you enter the string “blahblah”,
  - the browser will send a request to the HTTP server at foo.com that looks like this:
GET /search.cgi?blahblah HTTP/1.1
What the CGI sees
• The CGI Program gets REQUEST_METHOD using getenv:
char *method;
method = getenv("REQUEST_METHOD");
if (method==NULL) … /* error! */
Getting the GET
• If the request method is GET:
if (strcasecmp(method,"get")==0)   /* strcasecmp is declared in <strings.h> */
• The next step is to get the query string from the environment variable QUERY_STRING
char *query;
query = getenv("QUERY_STRING");
Send back HTTP Response and Headers:
• The CGI program can send back an HTTP status line:
printf("HTTP/1.1 200 OK\r\n");
• and headers:
printf("Content-type: text/html\r\n");
printf("\r\n");
Important!
• CGI program doesn’t have to send a status line
  - the HTTP server will do this for you if you don’t.
• CGI program must always send back at least one header line indicating the data type of the content (usually text/html).
• The web server will typically throw in a few header lines of its own
  - Date, Server, Connection
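Simple GET handler
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *method, *query;
    method = getenv("REQUEST_METHOD");
    if (method == NULL) return 1;    /* error: not started by a web server */
    query = getenv("QUERY_STRING");
    if (query == NULL) query = "";   /* no query string supplied */
    printf("Content-type: text/html\r\n\r\n");
    /* NOTE: the query is echoed unescaped; real code must HTML-escape it */
    printf("<H1>Your query was %s</H1>\n", query);
    return 0;
}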
URL-encoding
• Browsers use an encoding when sending query strings that include special characters.
  - Most nonalphanumeric characters are encoded as a ‘%’ followed by 2 ASCII encoded hex digits.
    - ‘=’ (which is hex 3D) becomes “%3D”
    - ‘&’ becomes “%26”
  - The space character ‘ ’ is replaced by ‘+’.
    - Why? (think about project 2 parsing…)
  - The ‘+’ character is replaced by “%2B”
    - “foo=6 + 7” becomes “foo%3D6+%2B+7”
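A CGI program has to undo this encoding before using the query. One possible decoder is sketched below; `url_decode` is a hypothetical name, and leaving a malformed ‘%’ sequence untouched is a design choice of this sketch rather than anything the slides prescribe.

```c
#include <ctype.h>
#include <stdlib.h>

/* Decode a URL-encoded string in place: '+' becomes ' ' and
 * "%XX" becomes the byte with hex value XX. Returns `s`. */
char *url_decode(char *s)
{
    char *src = s, *dst = s;
    while (*src != '\0') {
        if (*src == '+') {
            *dst++ = ' ';
            src++;
        } else if (*src == '%' &&
                   isxdigit((unsigned char)src[1]) &&
                   isxdigit((unsigned char)src[2])) {
            char hex[3] = { src[1], src[2], '\0' };
            *dst++ = (char)strtol(hex, NULL, 16);
            src += 3;
        } else {
            *dst++ = *src++;    /* ordinary character, or a malformed '%' */
        }
    }
    *dst = '\0';
    return s;
}
```

Decoding in place is safe because the decoded text is never longer than the encoded text.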
Security!!!
• It is a very bad idea to build a command line containing user input!
• What if the user submits:
“ ; rm -r *;”
• The command line the program builds becomes:
grep ; rm -r *; /usr/dict/words
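One common defense is to refuse any input that is not on a strict whitelist before it gets anywhere near a command. The check below is a sketch under assumptions: the name `is_safe_word` and the 64-character limit are made up, and "letters and digits only" is just one reasonable policy for a dictionary lookup.

```c
#include <ctype.h>
#include <string.h>

/* Accept only a short, purely alphanumeric search word.
 * Returns 1 if the word is acceptable, 0 otherwise. */
int is_safe_word(const char *s)
{
    size_t n = strlen(s);
    if (n == 0 || n > 64)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (!isalnum((unsigned char)s[i]))
            return 0;       /* rejects ';', '*', spaces, quotes, ... */
    return 1;
}
```

Even with such a check, it is safer to pass the word to the program as a separate argument through one of the exec() calls than to paste it into a string handed to a shell.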
Beyond ISINDEX - Forms
• Many Web services require more than a simple ISINDEX.
• HTML includes support for forms:
  - lots of field types
  - user answers all kinds of annoying questions
  - entire contents of form must be stuck together and put in QUERY_STRING by the Web server.
Form Fields
• Each field within a form has a name and a value.
• The browser creates a query that
  - includes a sequence of “name=value” substrings and
  - sticks them together separated by the ‘&’ character.
• If user types in “Mehmet H.” as the name and “none” for occupation,
  - the query would look like this:
“name=Mehmet+H%2E&occupation=none”
HTML Forms
• Each form includes a METHOD that determines what HTTP method is used to submit the request.
• Each form includes an ACTION that determines where the request is made.
An HTML Form
<FORM METHOD=GET
ACTION=http://foo.com/signup.cgi>
Name:
<INPUT TYPE=TEXT NAME=name><BR>
Occupation:
<INPUT TYPE=TEXT NAME=occupation><BR>
<INPUT TYPE=SUBMIT>
</FORM>
What a CGI will get
• The query (from the environment variable QUERY_STRING) will be
  - a URL-encoded string containing the name,value pairs of all form fields.
• The CGI must decode the query and separate the individual fields.
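Separating and decoding the fields might be done along these lines. This is a sketch: `struct field`, `decode`, and `parse_form` are hypothetical names, and the query is split on ‘&’ and ‘=’ before decoding so that an encoded “%26” or “%3D” inside a value is not mistaken for a separator.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct field { char *name; char *value; };

/* URL-decode `s` in place ('+' to ' ', "%XX" to one byte). */
static void decode(char *s)
{
    char *dst = s;
    while (*s != '\0') {
        if (*s == '+') {
            *dst++ = ' ';
            s++;
        } else if (*s == '%' && isxdigit((unsigned char)s[1]) &&
                   isxdigit((unsigned char)s[2])) {
            char hex[3] = { s[1], s[2], '\0' };
            *dst++ = (char)strtol(hex, NULL, 16);
            s += 3;
        } else {
            *dst++ = *s++;
        }
    }
    *dst = '\0';
}

/* Split `query` in place into at most `max` name/value pairs.
 * Returns the number of fields stored in `out`. */
int parse_form(char *query, struct field *out, int max)
{
    int count = 0;
    char *pair = query;
    while (pair != NULL && *pair != '\0' && count < max) {
        char *next = strchr(pair, '&');
        if (next != NULL)
            *next++ = '\0';                 /* end this pair, step to the next */
        char *eq = strchr(pair, '=');
        char *value;
        if (eq != NULL) {
            *eq = '\0';
            value = eq + 1;
        } else {
            value = pair + strlen(pair);    /* no '=': value is empty */
        }
        decode(pair);
        decode(value);
        out[count].name = pair;
        out[count].value = value;
        count++;
        pair = next;
    }
    return count;
}
```

For the query `name=Mehmet+H%2E&occupation=none` this produces two fields: `name` with value `Mehmet H.` and `occupation` with value `none`.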
HTTP Method: POST
• The HTTP POST method delivers data from the browser as the content of the request.
• The GET method delivers data (query) as part of the URI.
• HTML Form using POST
  - Set the form method to POST instead of GET.
<FORM METHOD=POST ACTION=…>
GET vs. POST
• When using forms it’s generally better to use POST:
  - there are limits on the maximum size of a GET query string (environment variable)
  - a POST query string doesn’t show up in the browser as part of the current URL.
CGI reading POST
• If REQUEST_METHOD is POST,
  - the query is coming in STDIN.
• The environment variable CONTENT_LENGTH tells us how much data to read.
Possible Problem
char buff[100];
char *clen = getenv("CONTENT_LENGTH");
if (clen==NULL)
/* handle error */
int len = atoi(clen);
if (read(0,buff,len)<0)   /* len is never checked against sizeof(buff)! */
… /* handle error */
pray_for(!hacker);
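The problem above is that the client controls CONTENT_LENGTH, so the read can run past the end of the 100-byte buffer. A more careful reader might look like the sketch below; `read_post_body` is a hypothetical name, and rejecting oversized bodies outright (rather than truncating them) is an assumption of this example.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Read exactly the number of bytes announced by `clen` from `fd` into `buf`.
 * Returns the body length (buf is NUL-terminated), or -1 if `clen` is
 * missing, not a number, too large for `buf`, or the read falls short. */
long read_post_body(int fd, const char *clen, char *buf, size_t cap)
{
    if (clen == NULL || cap == 0)
        return -1;

    char *end;
    errno = 0;
    long len = strtol(clen, &end, 10);
    if (errno != 0 || end == clen || *end != '\0')
        return -1;                          /* not a clean decimal number */
    if (len < 0 || (size_t)len >= cap)
        return -1;                          /* would not fit, with room for NUL */

    size_t got = 0;
    while (got < (size_t)len) {
        ssize_t n = read(fd, buf + got, (size_t)len - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;                   /* interrupted: try again */
            return -1;
        }
        if (n == 0)
            return -1;                      /* body shorter than promised */
        got += (size_t)n;
    }
    buf[got] = '\0';
    return (long)got;
}
```

In a CGI program the call would be along the lines of `read_post_body(STDIN_FILENO, getenv("CONTENT_LENGTH"), buff, sizeof buff)`.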
CGI Method summary
• GET:
  - REQUEST_METHOD is “GET”
  - QUERY_STRING is the query
• POST:
  - REQUEST_METHOD is “POST”
  - CONTENT_LENGTH is the size of the query
  - query can be read from STDIN
