
INFO 321

INFO 321 Server Technologies II

Weeks 5-6

Apache

◊ Apache is synonymous with a web server app, but the Apache HTTP Server is just one project of the ten-year-old Apache Software Foundation (ASF)
  • There are dozens of Foundation projects
  • They state “We consider ourselves not simply a group of projects sharing a server, but rather a community of developers and users.”

Material from http://httpd.apache.org/ and notes by Dr. Randy Kaplan

Overview

◊ This set of notes is divided into these sections
  • Web Server functionality
  • Choosing a web server
  • Installing Apache
  • Running Apache
  • Virtual Hosting
  • Authentication
  • Indexing
  • Alias and Redirect
  • Proxying

Web Server functionality

Web Server protocols

◊ The main purpose of a web server is to handle HTTP and related protocols
  • DNS
  • FTP
  • HTTPS
  • Gopher, Telnet, etc. are also possible
◊ For more info on these protocols, see the chapter 2 notes for INFO 330

Web Server protocols

◊ DNS uses UDP as its transport layer protocol
  • Connectionless, unreliable
◊ The other protocols use TCP for transport
  • Connection oriented between host computers
  • Reliable
◊ All protocols work by passing text messages back and forth

Web Server Wish List

◊ Run fast
◊ Handle lots of requests with minimal hardware
◊ Support multitasking
  • Deal with more than one request at a time
  • Need to maintain workload without shutting the server down
◊ Authenticate requestors

Web Server Wish List

◊ Respond to errors in the messages it gets, and tell what is going on
◊ Negotiate a style and language of response with the requestor
◊ Support a variety of formats
◊ Run as a proxy server
◊ Be secure

What Does a Web Server Do?

◊ Translate a URL into a file name or a program name
  • If a file – return the file over the Internet
  • If a program – run the program, and send the output back over the Internet
◊ URL = Uniform Resource Locator
  • Has three parts – <scheme>://<host>/<path>
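To make the three parts concrete, here is a minimal sketch that splits a URL with Python's standard `urllib.parse`. The helper name `url_parts` is made up for illustration; the sample URL is the download page used later in these notes.

```python
from urllib.parse import urlsplit


def url_parts(url: str) -> tuple[str, str, str]:
    """Return the (scheme, host, path) parts of a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


if __name__ == "__main__":
    scheme, host, path = url_parts("http://httpd.apache.org/download.cgi")
    print("scheme:", scheme)
    print("host:  ", host)
    print("path:  ", path)
```

The server only ever sees the path (plus the host name in the request headers); mapping that path to a file or a program is the job described above.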

How Does Apache Work?

◊ Runs under a suitable multitasking operating system
  • Binary is called httpd under Unix
  • Binary is called apache.exe under Win32
◊ Each copy of httpd or apache.exe has its attention directed at a web site
  • For our purposes, the web site is a directory

Apache and TCP/IP

◊ A computer has a connection to the outside world, called an interface
  • Identify interface by a socket or port number
◊ The server can decide how to handle different requests because the four-byte (32-bit) IPv4 address that leads the request to its interface is followed by a two-byte (16-bit) port number

Apache and TCP/IP

◊ Requests arrive on an interface for a number of different services offered by the server using different protocols
  • Network News Transfer Protocol (NNTP)
  • Simple Mail Transfer Protocol (SMTP)
  • Domain Name Service (DNS)
  • HTTP (WWW)

Apache and TCP/IP

◊ Different services attach to different ports
  • NNTP: port number 119
  • SMTP: port number 25
  • DNS: port number 53
  • HTTP: port number 80

Apache and TCP/IP

◊ UNIX/Linux
  • Port numbers below 1024 can only be used by the superuser (root)
  • Prevents other users from running programs masquerading as standard services
◊ Win32
  • Under Win32 there is currently no security directly related to port numbers and no superuser
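The port rule can be written down as a small check. This is only a sketch of the traditional UNIX/Linux convention described above, not of how any particular kernel enforces it; the names `WELL_KNOWN_PORTS` and `needs_root` are made up for illustration.

```python
# Port numbers taken from the list in these notes.
WELL_KNOWN_PORTS = {"NNTP": 119, "SMTP": 25, "DNS": 53, "HTTP": 80}


def needs_root(port: int) -> bool:
    """True if binding this port traditionally requires the superuser on UNIX/Linux."""
    if not 0 < port < 65536:
        raise ValueError(f"not a valid 16-bit port number: {port}")
    return port < 1024


if __name__ == "__main__":
    for service, port in WELL_KNOWN_PORTS.items():
        print(f"{service:5} {port:4} needs root: {needs_root(port)}")
    # Port 8000 is used for the proxy example near the end of the notes.
    print("proxy 8000 needs root:", needs_root(8000))
```

This is why a test server run by an ordinary user is usually put on a high port such as 8000 or 8080.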

How Does Apache Work?

◊ Idling state –
  • Listens to the IP addresses specified in its config files (important foreshadowing…)
  • When a request appears –
    ▪ Apache receives it and analyzes the headers
    ▪ Applies the rules in the config file
    ▪ Takes the appropriate action

How HTTP Clients Work

◊ When a URL (beginning http://) is sent to a browser,
  • The browser reads ‘http:’ and determines it should be using the HTTP protocol to communicate with web servers
  • A name server (DNS) is contacted to translate the host name in a URL to an IP address

Apache and Domain Servers

◊ It is the role of the DNS (Domain Name System) to translate a human readable (and memorable) name into a computer’s telephone number (IP address)

DNS Errors

◊ Suppose Apache is given a URL for a directory which does not have a trailing /
  • Apache adds a trailing / and tells the client to access the URL again (called redirection)
  • The client then uses DNS to resolve the host name in the redirected URL, so that name must resolve correctly

Handling Multiple Web Sites

◊ The utility ifconfig binds IP addresses to physical interfaces (e.g. Ethernet ports)
  • ifconfig also allows binding multiple IP addresses to a single interface
◊ A client can switch from one IP address to another while maintaining service
  • This is known as IP Aliasing

Choosing a web server

Why choose Apache?

◊ Apache has been the dominant web server app since 1996
  • Open source enables its source code to be examined by thousands of eyes
  • Substantially more reliable
  • Apache is extensible
  • Apache is freeware

Other choices

◊ Other web server apps include
  • Microsoft IIS or PWS
  • Google GWS
  • Lighttpd
  • Zeus ZWS
  • nginx
  • Sun (includes Netscape and Netsite variants)

Apache market share

◊ Apache has been the leading web server since March 1996, but is losing ground
◊ According to Netcraft surveys
  • In November 2005, Apache supported 71 percent of domains, more than 50 percentage points ahead of Microsoft IIS (20.2 percent) (N=74.6 million)
  • By June 2009, Apache had 47.12% and Windows (IIS and PWS) had 24.80% of the 238 million domains reporting

Apache as in Indian?

◊ “The name 'Apache' was chosen from respect for the Native American Indian tribe of Apache (Indé), well-known for their superior skills in warfare strategy and their inexhaustible endurance.” (Apache FAQ)

Apache version & platforms

◊ Apache is on version 2.2.17 (released Oct 19, 2010) and changes slowly
  • Most Linux distributions are a little behind the current release
  • Old releases (2.0.x and 1.3.x) are maintained
◊ Apache runs on 32-bit Windows flavors, UNIX/Linux, and even NetWare (!)

Installing Apache

Apache prereqs

◊ To install Apache, you need:
  • An Internet connection helps
  • Disk space – 50 MB to install, about 10 MB to run, depending on options
  • An ANSI-C compiler, such as the GNU C compiler (GCC) from the Free Software Foundation (FSF)
    ▪ The Windows version can be obtained in .exe form

Apache prereqs

  • Accurate time keeping such as the ntpdate or xntpd programs
    ▪ Some parts of HTTP are based on time of day, so some form of NTP support is needed
  • Perl5 is needed for a few options
  • The utilities apr and apr-util need to be version 1.2
    ▪ Upgrade them separately if needed, but they are included with Apache source code

Overview – Apache install

◊ Download
  • $ lynx http://httpd.apache.org/download.cgi
◊ Extract
  • $ gzip -d httpd-NN.tar.gz
  • $ tar xvf httpd-NN.tar
  • $ cd httpd-NN
◊ Configure
  • $ ./configure --prefix=PREFIX

Overview – Apache install

◊ Compile
  • $ make
◊ Install
  • $ make install
◊ Customize
  • $ vi PREFIX/conf/httpd.conf
◊ Test
  • $ PREFIX/bin/apachectl -k start

Overview – Apache install

◊ NN must be replaced with the current version number (e.g. 2.2.17)
◊ PREFIX must be replaced with the file system path under which the server should be installed
  • If PREFIX is not specified, it defaults to /usr/local/apache2
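The NN and PREFIX substitution can be sketched as a small helper that prints the command sequence from the overview for a given version and install path. The function name `install_commands` is hypothetical, and the helper only builds the command strings; it does not run them.

```python
def install_commands(version: str, prefix: str = "/usr/local/apache2") -> list[str]:
    """Fill NN (version) and PREFIX into the install steps from the overview."""
    src = f"httpd-{version}"
    return [
        f"gzip -d {src}.tar.gz",             # Extract: decompress the tarball
        f"tar xvf {src}.tar",                # Extract: unpack the tarball
        f"cd {src}",                         # change to the source directory
        f"./configure --prefix={prefix}",    # Configure
        "make",                              # Compile
        "make install",                      # Install
        f"vi {prefix}/conf/httpd.conf",      # Customize
        f"{prefix}/bin/apachectl -k start",  # Test
    ]


if __name__ == "__main__":
    for command in install_commands("2.2.17"):
        print("$", command)
```

Printing the steps first and running them by hand keeps each stage visible, which is useful because the configure step usually needs extra options.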

Download

◊ Most UNIX/Linux users will want to download Apache and compile it locally
◊ After download, use PGP to verify the download’s integrity, e.g.
  • % pgp -ka KEYS
  • % pgp apache_1.3.24.tar.gz.asc
◊ This verifies against the MD5 or PGP message digest ASCII file

Extract

◊ This set of steps decompresses the tarball, extracts the tarball, and changes to the source code directory
  • $ gzip -d httpd-NN.tar.gz
  • $ tar xvf httpd-NN.tar
  • $ cd httpd-NN
◊ Notice this is using the tar command we saw in the Backup section

Configure

◊ Now things get messy!
◊ The basic configure script, if you’re using the default PREFIX, can be run using
  • $ ./configure
◊ The configure script allows you to select which features are active on your host
  • You can also change where specific files are installed, for example

Apache architecture

◊ Apache is a modular server
  • This implies that only the most basic functionality is included in the ‘core’ server
    ▪ Even core functionality can be disabled
  • Extended features are available through modules which can be loaded into Apache

Apache architecture

◊ By default, a base set of modules is included in the server at compile-time
  • If the server is compiled to use dynamically loaded modules, then modules can be compiled separately and added at any time using the LoadModule directive
  • Otherwise, Apache must be recompiled to add or remove modules

Some types of module status

◊ Base
  • A module having "Base" status is compiled and loaded into the server by default
◊ Extension
  • A module with "Extension" status is not normally compiled and loaded into the server; to enable the module and its functionality, you need to change the server build configuration files and re-compile Apache
◊ External
  • Modules which are not included with the base Apache distribution ("third-party modules") may use the "External" status

Apache architecture

◊ Apache terminology note:
  • Features are implemented by modules, which are installed or not with your copy of Apache
  • Once installed, they can be enabled or disabled to allow them to run or not
  • Dozens of modules are enabled by default, so you’d have to explicitly disable them
    ▪ The most dangerous one is --disable-http

Apache architecture

  • Likewise, many modules are disabled by default, so you have to enable them explicitly
    ▪ For example, --enable-ssl enables support for SSL/TLS provided by mod_ssl
◊ Be very careful, misspelled features are ignored, without error message!
  • --enable-sssl will do nothing
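Since a misspelled option is silently ignored, one defensive habit is to check the flags before running configure. The sketch below is an illustration only: `KNOWN_FEATURES` is a hypothetical, hand-maintained subset of feature names (the authoritative list for your version comes from `./configure --help`), and `check_flags` is a made-up helper, not part of Apache.

```python
import difflib

# Hypothetical subset of feature names, for illustration only.
KNOWN_FEATURES = {"http", "ssl", "proxy", "rewrite", "so"}


def check_flags(flags: list[str]) -> dict[str, list[str]]:
    """Map each unrecognized --enable-/--disable- flag to close-match suggestions."""
    problems = {}
    for flag in flags:
        for prefix in ("--enable-", "--disable-"):
            if flag.startswith(prefix):
                # Strip the prefix and any "=ARG" suffix to get the feature name.
                feature = flag[len(prefix):].split("=", 1)[0]
                if feature not in KNOWN_FEATURES:
                    problems[flag] = difflib.get_close_matches(
                        feature, sorted(KNOWN_FEATURES)
                    )
    return problems


if __name__ == "__main__":
    for flag, hints in check_flags(["--enable-sssl", "--enable-ssl"]).items():
        print(f"unknown flag {flag}; did you mean {hints}?")
```

Flags that do not start with `--enable-` or `--disable-` (such as `--prefix=PREFIX`) are deliberately left alone by this check.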

Configure script vs. file

◊ KEY POINT: Apache has a configure script which enables modules
  • ./configure
◊ And a configuration file (or several) which contain directives
  • PREFIX/conf/httpd.conf
◊ Both are very important and powerful tools, but are completely separate!

Configure

◊ The general syntax for enabling and disabling is
  • --disable-FEATURE
    ▪ Do not include FEATURE; this is the same as --enable-FEATURE=no
  • --enable-FEATURE[=ARG]
    ▪ Include FEATURE; the default value for ARG is yes

Configure

◊ Less often used enabling options include
  • --enable-MODULE=shared
    ▪ The corresponding module will be built as a DSO (dynamic shared object) module; will be enabled if you use the --enable-mods-shared option
  • --enable-MODULE=static
    ▪ By default, enabled modules are linked statically; you can force this explicitly

Packages

◊ The configure script can invoke packages, which are typically third party features
  • --with-PACKAGE[=ARG]
    ▪ Use the package PACKAGE; the default value for ARG is yes
◊ Often these tell where to find specific libraries or databases

Environment variables

◊ The configure script can also set environment variables
◊ These mostly describe what C compiler or flags to use, or the location of compile libraries

./configure summary

◊ So the Apache configure script controls which modules are enabled or not
◊ When an ISP tells you they support SSL, Perl, etc., they are implying which modules they installed (if they’re using Apache)

Build and Install

◊ $ make
◊ $ make install
◊ These are the traditional Unix commands to build and install an app
◊ They’ll take a while, especially make, since it includes compiling all the source code

Customize

◊ The file PREFIX/conf/httpd.conf is a customization focal point for Apache
◊ Apache is configured by placing directives in plain text configuration files
  • Apache configuration files contain one directive per line
    ▪ httpd.conf is the main file, but other config files can be linked from it via an Include directive
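The "one directive per line" layout is simple enough to read with a few lines of code. The sketch below is a deliberately simplified reader, not Apache's own parser: it skips blank lines and `#` comments and ignores quoted arguments, line continuations, and `<Section>` containers. The function name `parse_directives` and the included file name `conf/extra.conf` are made up for illustration.

```python
def parse_directives(text: str) -> list[tuple[str, list[str]]]:
    """Split config text into (directive, arguments) pairs, one per line."""
    directives = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        name, *args = line.split()
        directives.append((name, args))
    return directives


SAMPLE = """\
ServerName myServerName
DocumentRoot /usr/local/apache2/htdocs/
#ErrorLog logs/error_log
Include conf/extra.conf
"""

if __name__ == "__main__":
    for name, args in parse_directives(SAMPLE):
        print(f"{name:14} {' '.join(args)}")
```

A commented-out line such as `#ErrorLog logs/error_log` produces no directive at all, which is why uncommenting it changes the server's behavior.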

Apache configuration

◊ Webmaster’s main control over Apache is through the config file
◊ The webmaster has 412 directives at their disposal
  • We’ll get to this soon…
  • No, not all of them

Apache directory structure

◊ First steps
  • In Apache, what exactly is a “web site”?
  • A web site is a directory somewhere on the server
  • Every Apache web site directory contains at least three (and maybe a fourth) subdirectories

Apache directory structure

◊ Regardless of OS, a site directory has
  • conf
    ▪ Contains the important configuration file httpd.conf
  • htdocs
    ▪ Contains the HTML documents, images, data and other files to be served up to the site’s clients
    ▪ These directories and subdirectories, the web space, are accessible to anyone on the Web

Apache directory structure

  • logs
    ▪ Contains the log files – history of accesses and errors
  • cgi-bin
    ▪ Contains CGI scripts that are needed
    ▪ If you don’t use scripts (CGI) you don’t need this directory

Running Apache

Running Apache from the Command Line

◊ If the conf subdirectory is not in the default location (it usually is not), you need to tell Apache where it is
  httpd -d /usr/www/APACHE3/example.site

When Apache is started

◊ It sits in the background, waiting for a client’s request to arrive
  • After all, it’s a server app!
◊ When a request arrives, Apache attempts to respond to it or generates an error and places this in the log file

Configuration File

◊ Apache has a default configuration file
  • This file covers almost every option that Apache supports
  • It is quite complicated
◊ It is better, at least in the beginning, to create your own, simpler configuration file

Firing up the server

◊ Suppose we have a web site contained in a folder named 321
◊ The command to run Apache hosting this web site would be –
  httpd -d /usr/local/apache2/htdocs/321
◊ If you will use this command a lot it is a good idea to create a script file that contains it

If all goes well …

◊ Look in /usr/local/apache2 (or wherever your PREFIX is) for the new executables
◊ Use ls -l to see the timestamps

Killing Apache

◊ To kill Apache, you must kill the main process and all of its children
◊ One way to accomplish this is to get all processes with the name httpd
  ps awlx | grep httpd
◊ And then kill all of the poor innocent helpless processes –
  killall httpd


Killing the server … gracefully

◊ A utility (program) is supplied with Apache called apachectl (= Apache control?)
◊ It can be used to start and stop Apache and perform other utility operations

apachectl

◊ Syntax is
  /usr/local/apache2/bin/apachectl (start|stop|restart|fullstatus|status|graceful|configtest|help)
◊ start → start httpd
◊ stop → stop httpd
◊ restart → restart httpd if running

apachectl

◊ /usr/local/apache2/bin/apachectl (start|stop|restart|fullstatus|status|graceful|configtest|help)
◊ fullstatus → dumps a full status screen
◊ status → dumps a short status screen
◊ graceful → do a graceful restart or start if not running
◊ configtest → do a configuration syntax test
◊ help → display command listing

Default Problems

◊ If you get the message –
  fopen: No such file or directory
  httpd: could not open error log file …
◊ Then to httpd.conf add the line –
  ErrorLog logs/error_log

Default Problems

◊ If Apache still fails to start, and you get a message in logs/error_log:
  … No such file or directory.: could not open mime types …
◊ In the httpd.conf file add the line –
  TypesConfig conf/mime.types

Default Problems

◊ If Apache still fails to start, and you get this message in the logs/error_log file –
  fopen: no such file or directory
  httpd: could not log pid to file …
◊ In httpd.conf you need to add the line –
  PIDFile logs/httpd.pid

A Small But Complete httpd.conf

User webroot
Group webgroup
ServerName myServerName
DocumentRoot /usr/local/apache2/htdocs/

# to fix common problems, uncomment these
#ServerRoot /usr/local/apache2/htdocs
#ErrorLog logs/error_log
#PIDFile logs/httpd.pid
#TypesConfig conf/mime.types


Testing to See the Server

◊ In a command line, type
  telnet myServerName 80
◊ Response should be –
  Trying to connect to 192.168.2.223
  Connected to myServerName.my.domain
  Escape character is ‘^]’

Testing to See the Server

◊ Type –
  GET / HTTP/1.0
  • Then press Enter twice; the blank line ends the request
◊ You should see –
  HTTP/1.0 200 OK
  Date: Sat, 28 Jan 2006 23:49 GMT
  Server: Apache/1.3
  Connection: close
  Content-Type: text/html
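The same test can be scripted instead of typed into telnet. The sketch below, using only Python's standard `socket` module, builds the request shown above and parses the first line of the reply. The function names are made up for illustration, and `fetch_status` needs a reachable server (for example the `myServerName` host from these notes), so it is not called automatically.

```python
import socket


def build_get_request(path: str = "/") -> bytes:
    """An HTTP/1.0 GET request; the empty line (second CRLF) ends the request."""
    return f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")


def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split a status line such as 'HTTP/1.0 200 OK' into (version, code, reason)."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


def fetch_status(host: str, port: int = 80) -> tuple[str, int, str]:
    """Connect to the server, send the request, and return the parsed status line."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(build_get_request())
        first_line = sock.makefile("r", encoding="latin-1").readline()
    return parse_status_line(first_line)


if __name__ == "__main__":
    print(build_get_request())
    print(parse_status_line("HTTP/1.0 200 OK"))
    # With a running server: print(fetch_status("myServerName"))
```

A status code of 200 means the server found and returned the document; anything else is worth checking against the error log.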

httpd.conf Directives

◊ ServerName
  • Gives the hostname of the server to use when creating redirection URLs
◊ DocumentRoot
  • Directory from which Apache will serve files
  • Default: /usr/local/apache2/htdocs

httpd.conf Directives

◊ ServerRoot
  • Where conf and logs can be found
  • Default: /usr/local/etc/httpd
◊ ErrorLog
  • The name of the file to which the server will log any errors it encounters
  • Default: ErrorLog logs/error_log

httpd.conf Directives

◊ PIDFile
  • Allows the location of the file containing the PID to be changed
  • Default: logs/httpd.pid
◊ TypesConfig
  • Path and filename to find the mime.types file if it is not in the default location
  • Default: conf/mime.types

httpd.conf Directives

◊ LoadModule
  • Links in the specified object file or library
  • Adds the module structure to the list of active modules
◊ AddModule (Apache 1.3 only; removed in 2.x)
  • Enables a module that has been compiled into Apache but is not in use

Virtual Hosting

Virtual Hosts

◊ Let’s make the following assumptions –
  • We run a business that has been running a web site
  • We are ready to expand and have a need for more than one web site
  • As our business has grown we need to set up an Intranet for employees
  • The existing web server (Extranet) is for customers

Virtual Hosts

◊ Two approaches
  • Approach 1
    ▪ Run a single copy of Apache
    ▪ Maintain two web sites as virtual sites
  • Approach 2
    ▪ Run two copies of Apache
    ▪ Each copy maintains a single site
    ▪ Allows optimization of Apache to a web site

Name-based Virtual Hosts

◊ Preferred method of managing virtual hosts
◊ Takes advantage of a feature of HTTP 1.1 compliant browsers
◊ The browser sends a Host header, which specifies the name of the site it wants to access

Sample Config File

User webuser
Group webgroup

# Key directive: tells Apache that requests to the IP will be subdivided by name
NameVirtualHost 192.168.123.2

<VirtualHost 192.168.123.2>
ServerName www.MyCompany.com
ServerAdmin [email protected]
DocumentRoot /usr/local/apache2/site1.virtual/htdocs/extranet
ErrorLog /usr/local/apache2/site1.virtual/htdocs/logs/error_log
TransferLog /usr/local/apache2/site1.virtual/htdocs/logs/access_log
</VirtualHost>

<VirtualHost 192.168.123.2>
ServerName intranet.MyCompany.com
ServerAdmin [email protected]
DocumentRoot /usr/local/apache2/site2.virtual/htdocs/intranet
ErrorLog /usr/local/apache2/site2.virtual/htdocs/logs/error_log
TransferLog /usr/local/apache2/site2.virtual/htdocs/logs/access_log
</VirtualHost>

NameVirtualHost

◊ Key directive tells Apache that requests to that IP number will be subdivided by name
◊ The ServerName directive provides a name for Apache to return to the client
◊ NameVirtualHost allows you to specify –
  • IP addresses of your name-based virtual host
  • A port number can be added if necessary

NameVirtualHost

◊ If an IP address is added it needs to match the IP address at the top of a <VirtualHost> block
◊ A ServerName directive must be included
◊ The ServerName directive must be followed by a registered name

Resolving a Virtual Host

◊ When Apache receives a request to a named host –
  • The <VirtualHost> blocks whose IP address was declared with a NameVirtualHost directive are scanned to find one that includes the requested server name
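The lookup can be sketched in a few lines. This is a simplified model under stated assumptions, not Apache's implementation: the `VirtualHost` class and `resolve_vhost` function are made up for illustration, ports and ServerAlias are ignored, and the sketch relies on Apache's documented behavior that the first block listed for an address acts as the default when no name matches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VirtualHost:
    ip: str             # address given in the <VirtualHost> tag
    server_name: str    # value of the ServerName directive
    document_root: str


def resolve_vhost(vhosts: list[VirtualHost], name_virtual_host_ip: str,
                  request_ip: str, host_header: str) -> Optional[VirtualHost]:
    """Pick the <VirtualHost> block that should serve a request."""
    candidates = [v for v in vhosts if v.ip == request_ip]
    if not candidates:
        return None  # no block for this address: the main server handles it
    if request_ip == name_virtual_host_ip:
        wanted = host_header.split(":", 1)[0].lower()  # drop any ":port"
        for vhost in candidates:
            if vhost.server_name.lower() == wanted:
                return vhost
    return candidates[0]  # first block for the address is the default


VHOSTS = [
    VirtualHost("192.168.123.2", "www.MyCompany.com",
                "/usr/local/apache2/site1.virtual/htdocs/extranet"),
    VirtualHost("192.168.123.2", "intranet.MyCompany.com",
                "/usr/local/apache2/site2.virtual/htdocs/intranet"),
]

if __name__ == "__main__":
    hit = resolve_vhost(VHOSTS, "192.168.123.2", "192.168.123.2",
                        "intranet.MyCompany.com")
    print(hit.document_root)
```

Because the first block is the fallback, the order of the blocks in the config file matters: put the site that should catch unknown names first.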

IP-Based Virtual Hosts

◊ Because the web is primarily IP-address based, it makes sense to be able to do IP-based virtual hosting
◊ The next config file accomplishes this style of virtual hosting

IP-Based Virtual Hosting

User webuser
Group webgroup

<VirtualHost …>
ServerName www.MyCompany.com
ServerAdmin [email protected]
DocumentRoot /usr/local/apache2/site1.virtual/htdocs/extranet
ErrorLog /usr/local/apache2/site1.virtual/htdocs/logs/error_log
TransferLog /usr/local/apache2/site1.virtual/htdocs/logs/access_log
</VirtualHost>

<VirtualHost …>
ServerName intranet.MyCompany.com
ServerAdmin [email protected]
DocumentRoot /usr/local/apache2/site2.virtual/htdocs/intranet
ErrorLog /usr/local/apache2/site2.virtual/htdocs/logs/error_log
TransferLog /usr/local/apache2/site2.virtual/htdocs/logs/access_log
</VirtualHost>

IP-Based Virtual Hosting

◊ What’s Different?
  • No NameVirtualHost directive
  • Need ServerName directive

Mixed Name/IP-Based Virtual Hosts

◊ In this case some of our virtual web sites will be accessed via name and others will be accessed via IP addresses
◊ A useful approach when wanting to set up a web site for testing and limited exposure
  • The typical user will have no need to access a web site by IP address

Mixed Name/IP-Based Virtual Hosts

User webuser
Group webgroup
NameVirtualHost 192.168.123.2

<VirtualHost …>
ServerAdmin [email protected]
DocumentRoot /usr/local/apache2/site1.virtual/htdocs/extranet
ErrorLog /usr/local/apache2/site1.virtual/htdocs/logs/error_log
TransferLog /usr/local/apache2/site1.virtual/htdocs/logs/access_log
</VirtualHost>

<VirtualHost …>
ServerAdmin [email protected]
DocumentRoot /usr/local/apache2/site2.virtual/htdocs/intranet
ErrorLog /usr/local/apache2/site2.virtual/htdocs/logs/error_log
TransferLog /usr/local/apache2/site2.virtual/htdocs/logs/access_log
</VirtualHost>

<VirtualHost …>
ServerName test-new.MyCompany.com
ServerAdmin [email protected]
DocumentRoot /usr/local/apache2/site3.virtual/htdocs/new-test
ErrorLog /usr/local/apache2/site3.virtual/htdocs/logs/error_log
TransferLog /usr/local/apache2/site3.virtual/htdocs/logs/access_log
</VirtualHost>

Authentication

Authentication

◊ Client sends username and password to Apache
  • Apache determines if the user is a valid one for access to the web site
◊ Access to a site or database can be controlled precisely by the web master

Authentication

◊ Access can also be given to groups
  • Groups can be given or denied access as a whole
◊ Let’s make the following assumption –
  • Bill and Ben are in the group directors in our business
  • Betsy and Mike are in the group staff
  • Password will be “password” for all

Authentication

User webuser
Group webgroup
NameVirtualHost 192.168.123.2

<VirtualHost …>
ServerAdmin [email protected]
DocumentRoot /usr/local/apache2/site1.virtual/htdocs/extranet
ErrorLog /usr/local/apache2/site1.virtual/htdocs/logs/error_log
TransferLog /usr/local/apache2/site1.virtual/htdocs/logs/access_log
</VirtualHost>

<VirtualHost …>
ServerAdmin [email protected]
DocumentRoot /usr/local/apache2/site2.virtual/htdocs/intranet
ErrorLog /usr/local/apache2/site2.virtual/htdocs/logs/error_log
TransferLog /usr/local/apache2/site2.virtual/htdocs/logs/access_log
AuthType Basic
AuthName darkness
AuthUserFile /usr/local/apache2/validUsers/intranetUsers
AuthGroupFile /usr/local/apache2/validGroups/intranetGroups
Require valid-user
</VirtualHost>

Authentication

◊ Let’s examine the new part in detail:
  AuthType Basic
  AuthName darkness
  AuthUserFile /usr/local/apache2/validUsers/intranetUsers
  AuthGroupFile /usr/local/apache2/validGroups/intranetGroups
  Require valid-user

Authentication

◊ AuthType Basic –
  • Turns on authentication (a key directive), and specifies the type thereof (Basic, not MD5 Digest)
  • Requires AuthName, AuthUserFile, and AuthGroupFile to be specified as well
◊ AuthName directive
  • Gives the name of the realm in which users’ names and passwords are valid
  • If the realm name is more than one word, enclose it in quotes (“”)

Authentication

◊ AuthUserFile directive
  • Contains usernames and encrypted passwords
◊ AuthGroupFile directive
  • Contains the correspondence between users and groups
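Two pieces of this exchange are easy to show in code: what the browser sends for Basic authentication, and how a group file maps groups to users. This is a sketch using the users and groups assumed above; the function names are made up for illustration, and the group file text follows the usual `group: user user` layout.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the request header a browser sends for Basic authentication.

    The credentials are only base64-encoded, not encrypted.
    """
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Authorization: Basic " + token.decode("ascii")


def parse_group_file(text: str) -> dict[str, set[str]]:
    """Read 'group: user1 user2' lines into {group: {users}}."""
    groups = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        name, members = line.split(":", 1)
        groups[name.strip()] = set(members.split())
    return groups


GROUPS = parse_group_file("directors: bill ben\nstaff: betsy mike\n")

if __name__ == "__main__":
    print(basic_auth_header("bill", "password"))
    print("bill is a director:", "bill" in GROUPS["directors"])
```

Because Basic credentials can be decoded by anyone who sees the request, this scheme is normally paired with SSL/TLS on anything but a trusted network.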

Authentication – Passwords

◊ Passwords are managed by the Apache utility htpasswd
◊ Find the source for this utility in the support subdirectory of the Apache directory tree
◊ Compiled with –
  • make htpasswd

htpasswd

◊ Once compiled we can ask it for some help
  htpasswd -?
◊ This will return (as usual) the use of the command and the options supported in the command line

htpasswd

Usage:
  htpasswd [-cmdps] passwordfile username
  htpasswd -b[cmdps] passwordfile username password

  -c  Create a new file
  -m  Force MD5 encryption of the password
  -d  Force CRYPT encryption of the password (default)
  -p  Do not encrypt the password – plaintext
  -s  Force SHA encryption of the password
  -b  Use the password from the command line rather than prompting for it

htpasswd

◊ Example –
  htpasswd -m -c /usr/local/apache2/validUsers/intranetUsers bill
◊ Once this command is entered you will be prompted for the password twice
  • You might have a look in the password file to see what was entered there
◊ If you use the -c option on an existing password file, the file is replaced by a new one without warning, so be careful when using this option

Other approaches to control access

◊ Apache provides directives to control access precisely
◊ These include –
  • Allow
  • Deny
  • Order

Allow from directive

allow from host host …
Context: directory, .htaccess

◊ Controls access to a directory
◊ Host can be one of the following –
  • all – all hosts are allowed access
  • A partial domain name
    ▪ Hosts whose names match or end in this string are allowed access
  • A full IP address
  • A partial IP address
    ▪ 1 – 3 bytes of the IP are used
    ▪ Used to restrict to subnets
  • Network/netmask pair
  • Network CIDR specification (some number of bits)

Allow from env directive

◊ Controls access by the existence of a named environment variable, for example
  BrowserMatch ^KnockKnock/2.0 let_me_in
  order deny,allow
  deny from all
  allow from env=let_me_in

Allow from env directive

BrowserMatch ^KnockKnock/2.0 let_me_in
◊ This is a directive that sets an environment variable, let_me_in
◊ The pattern to be matched to set the environment variable is ^KnockKnock/2.0
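The effect of BrowserMatch can be modeled as a regular-expression test against the browser's User-Agent header. The sketch below is an illustration, not Apache code: `browser_match` is a made-up function, a plain dict stands in for the request environment, and the value "1" mirrors what Apache assigns when no explicit value is given.

```python
import re


def browser_match(pattern: str, user_agent: str, variable: str, env: dict) -> dict:
    """Set env[variable] to "1" when the User-Agent matches the regular expression."""
    if re.search(pattern, user_agent):
        env[variable] = "1"
    return env


if __name__ == "__main__":
    for agent in ("KnockKnock/2.0 (test client)", "Mozilla/5.0"):
        env = browser_match(r"^KnockKnock/2.0", agent, "let_me_in", {})
        print(f"{agent!r}: let_me_in set = {'let_me_in' in env}")
```

Since any client can send any User-Agent string, this kind of rule is a convenience filter rather than real security.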

Deny from directive

deny from host host …

◊ Controls access by host
◊ host can be one of the following –
  • all – all hosts are denied access
  • A partial domain name – all hosts whose names match or end in this string are denied access
  • A full IP address
  • A partial IP address – the first one to three bytes, for subnet restriction
  • A network/netmask pair – network a.b.c.d and netmask w.x.y.z are denied access

Deny from env directive

◊ Controls access by the existence of a named environment variable, for example
  BrowserMatch ^BadRobot/0.9 go_away
  order allow,deny
  allow from all
  deny from env=go_away

Order directive

◊ Usage
  order ordering
◊ The ordering argument is one word (no space after the comma)
  • Controls the order in which the foregoing allow or deny directives are applied
  • If two order directives apply to the same host, the last one to be evaluated prevails

Order directive

Ordering

◊ deny,allow
  • Deny directives are evaluated before allow directives (default)
◊ allow,deny
  • The allow directives are evaluated before the denys. The user will still be rejected if a deny is encountered

Order directive

Ordering

◊ mutual-failure
  • Hosts that appear on the allow list and do not appear on any deny list are allowed to access

Order directive examples

allow from all
  • Lets everyone in

allow from 123.156
deny from all
  • Denies everyone except those whose IP addresses happen to begin with 123.156
  • Allow is applied last (the default order is deny,allow)

Order directive examples

order allow,deny
allow from 123.156
deny from all

◊ The whole site is closed
◊ Deny is applied last
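The two examples can be checked against a small model of the Order rules. This is a sketch under simplifying assumptions, not Apache's implementation: host specs are limited to `all` and full or partial IP addresses, and the helper names `matches` and `is_allowed` are made up for illustration.

```python
def matches(client_ip: str, specs: list[str]) -> bool:
    """True if any spec is 'all', the full IP, or a leading part of the IP."""
    for spec in specs:
        prefix = spec.rstrip(".") + "."
        if spec == "all" or client_ip == spec or client_ip.startswith(prefix):
            return True
    return False


def is_allowed(order: str, allow: list[str], deny: list[str], client_ip: str) -> bool:
    """Apply Order deny,allow / allow,deny / mutual-failure to one client."""
    allowed = matches(client_ip, allow)
    denied = matches(client_ip, deny)
    if order == "deny,allow":
        # Deny is checked first and allow can override it;
        # a client matching neither list gets in.
        return allowed or not denied
    if order in ("allow,deny", "mutual-failure"):
        # Allow is checked first and deny can override it;
        # a client matching neither list is rejected.
        return allowed and not denied
    raise ValueError(f"unknown ordering: {order}")


if __name__ == "__main__":
    for order in ("deny,allow", "allow,deny"):
        for ip in ("123.156.1.2", "10.0.0.1"):
            result = is_allowed(order, ["123.156"], ["all"], ip)
            print(f"order {order:10} client {ip:12} allowed: {result}")
```

Running it shows why the second example closes the whole site: with allow,deny the final `deny from all` overrides the earlier allow even for 123.156 addresses.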

Indexing

Indexing

◊ An index provides a listing of the files that are in a web site
◊ If no file like index.html is prepared then Apache will prepare its own rudimentary index to access the web site
◊ It is also possible to use Apache to create better indices

Indexing

◊ The directive (in the config file) IndexOptions makes Apache create an index on the fly
◊ The index will be displayed when there is no file index.html


Indexing

◊ The directive for indexing is quite complex (lots of options) but it deserves to be examined as it provides valuable functionality
◊ Assume the latest version of Apache
  IndexOptions [+|-]option [[+|-]option] …

Indexing

◊ Options
  • DescriptionWidth • FancyIndexing • FoldersFirst • IconHeight • IconWidth • NameWidth • ScanHTMLTitles • SuppressColumnSorting • SuppressDescription • SuppressHTMLPreamble • SuppressLastModified • SuppressSize • TrackModified
◊ Related directives
  • IndexOrderDefault • ReadmeName • HeaderName • IndexIgnore • AddIcon • AddAlt • AddDescription • DefaultIcon • AddIconByType • AddAltByType • AddIconByEncoding • AddAltByEncoding

Indexing

◊ With so many options, which ones are important or more useful?

Indexing

◊ The effect of most of these options is apparent from its name
  • DescriptionWidth • FancyIndexing • FoldersFirst • IconHeight • IconWidth • NameWidth • ScanHTMLTitles • SuppressColumnSorting • SuppressDescription • SuppressHTMLPreamble • SuppressLastModified • SuppressSize • TrackModified

Indexing

  • IndexOrderDefault
    ▪ This directive is used to specify the ordering of the entries in the index. You can specify ascending or descending, by name, date, size, or description
  • ReadmeName
    ▪ The ReadmeName is the name of the file that will be appended to the end of the index listing
  • HeaderName
    ▪ Inserts a header, read from a file, at the top of the page

Indexing

◊ These directives deal with specifying the icons that are displayed with index entries and the alternate text that is used
  • AddIcon • AddAlt • AddDescription • DefaultIcon • AddIconByType • AddAltByType • AddIconByEncoding • AddAltByEncoding

Alias and Redirect

Redirection

◊ Two directives allow requests to be shunted around your file system
◊ Directives
  • Alias
  • Redirect
◊ These directives allow HTML files to be moved around a file server

Alias Directive

◊ Alias
  • A legitimate purpose of the Alias directive is to be able to logically place files around the server
  • Files could also be placed on other servers
  • In this way, files can be maintained by their owners

Alias Directive

◊ Alias
  • Useful directive
  • Store documents elsewhere
◊ Demonstration
  • Create a new directory
  • /usr/local/apache2/htdocs/somewhere_else
  • Put a file named lost.txt in this directory with the contents
    ▪ I am somewhere else
  • Add the following line to the conf file
    Alias /somewhere_else /usr/local/apache2/htdocs/somewhere_else

Alias Directive

◊ If you now access this directory via the browser (as a named directory off of the root) you will see the following –
  Index of /somewhere_else
  . Parent Directory
  . lost.txt

Alias Directive

◊ Use –
  Alias url_path directory_or_filename
◊ Map a user’s resource URL to its physical location in the file system
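The mapping Alias performs can be sketched as a prefix substitution on the URL path. This is a simplified model, not Apache's code: `map_alias` is a made-up helper, and it assumes the alias and its target either both end in a slash or both do not.

```python
def map_alias(url_path: str, aliases: dict[str, str], document_root: str) -> str:
    """Translate a URL path to a file system path, honoring Alias prefixes."""
    for prefix, target in aliases.items():
        # Match the alias itself or anything below it, but not
        # a longer name that merely starts with the same characters.
        if url_path == prefix or url_path.startswith(prefix.rstrip("/") + "/"):
            return target + url_path[len(prefix):]
    return document_root.rstrip("/") + url_path


ALIASES = {"/somewhere_else": "/usr/local/apache2/htdocs/somewhere_else"}
DOCROOT = "/usr/local/apache2/htdocs/"

if __name__ == "__main__":
    print(map_alias("/somewhere_else/lost.txt", ALIASES, DOCROOT))
    print(map_alias("/index.html", ALIASES, DOCROOT))
```

Paths that match no alias fall back to the DocumentRoot, which is the behavior the rest of these notes rely on.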

AliasMatch

◊ Use –
  AliasMatch regex directory_or_filename
◊ Like ScriptAliasMatch
◊ Takes a regular expression as the first argument; otherwise it works like Alias

Redirect Directive

◊ Use –
  Redirect [status] url-path url
◊ Maps an old URL to a new one; the new URL is returned to the client
◊ The client attempts to access the information again using the new URL, for example –
  Redirect /service http://foo2.bar.com/service
  • If the user requests http://myserver/service/foo.txt it will be told to access http://foo2.bar.com/service/foo.txt

Redirect Directive

◊ If no status argument is given, the status is temporary (302)
◊ The status argument can be used to return other HTTP status codes
◊ Status
  • permanent: returns a redirect status of 301, indicating the resource has moved permanently
  • temp: returns a redirect status of 302, indicating the resource has moved temporarily
  • seeother: returns a status of 303, indicating the resource has been replaced
  • gone: returns a status of 410, indicating the resource has been permanently removed
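The Redirect behavior, including the status keywords, can be modeled as a small function. This is a sketch under stated assumptions rather than Apache's logic: the name `redirect` and its return shape (a status code paired with the new location) are invented for this example.

```python
# Status keywords accepted by Redirect and the HTTP codes they produce.
REDIRECT_STATUS = {
    "permanent": 301,
    "temp": 302,
    "seeother": 303,
    "gone": 410,
}


def redirect(request_path, old_prefix, new_url=None, status="temp"):
    """Return (status_code, location), or None if the rule does not apply.

    Anything after old_prefix in the request is appended to new_url.
    With status "gone" there is no new location, so location is None.
    """
    below_prefix = request_path.startswith(old_prefix.rstrip("/") + "/")
    if request_path != old_prefix and not below_prefix:
        return None
    code = REDIRECT_STATUS[status]
    if status == "gone":
        return code, None
    return code, new_url + request_path[len(old_prefix):]
```

Calling `redirect("/service/foo.txt", "/service", "http://foo2.bar.com/service")` models the slide's example: the client receives a 302 and the location http://foo2.bar.com/service/foo.txt.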

RedirectMatch Directive

◊ Use: RedirectMatch regex url
◊ Uses a regular expression to specify the resource to be redirected

Proxying


Proxying

◊ Don’t connect a busy web site straight to the web. Why?
◊ Better performance
  • Cache popular web pages
  • Distribute requests among a number of servers
◊ Give the bad guys more defended ground to get past
◊ Give local users protected by a firewall access to the Internet

Proxying

◊ Security
  • Keep the bad guys out of the network
  • To do this, keep the network hidden behind a firewall
  • Doing this shuts off access to the Internet
  • A proxy server is used to restore access to the Internet

Proxying

◊ As with other Apache functionality, directives in the .conf file specify proxy behavior
◊ In this capacity, Apache acts as an agent that sends users’ requests out to the Internet

Proxy Directives

◊ A new site will be created named proxy
◊ This site has three subdirectories
  • cache
  • proxy
  • real

Sample Config

    User webuser
    Group webgroup
    ServerName www.myCompany.com
    Port 8000
    ProxyRequests on
    CacheRoot /usr/local/apache2/proxy/cache
    CacheSize 1000

Sample Config

ProxyRequests on
  • Turns proxy serving on
CacheRoot /usr/local/apache2/proxy/cache
  • Sets the directory to contain cache files
  • Must be writable by Apache
CacheSize 1000
  • Specifies the size of the cache area in KB

Setup

◊ Cache directory
  • Needs to be set up carefully
  • Owner = webuser
  • Group = webgroup
◊ The browser must be told you are going to access the web via a proxy
  • To do this you specify the IP address of the proxy server and the port 8000

Setup

◊ Proxy setting panel from Firefox (see Tools > Options > Advanced > Network tab, Settings)

Proxy Simulation

◊ Four elements needed to test the proxy server functionality
  • A browser configured to access the web via proxy
  • A firewall (real or imaginary)
  • Copy of Apache running the proxy
  • Copy of Apache running the website

Proxy Simulation

◊ One copy of Apache will run with the Proxy configuration file
    User webuser
    Group webgroup
    ServerName www.myCompany.com
    Port 8000
    ProxyRequests on
    CacheRoot /usr/local/apache2/proxy/cache
    CacheSize 1000
◊ Since we are simulating this on a single computer, we will use port 8000 as the port to receive proxy requests

Proxy Simulation

◊ The web server will use the following configuration (we are simulating a site out on the web by running Apache as a web server)
◊ Config for the web site
    User webuser
    Group webgroup
    ServerName www.myCompany.com
    Listen www.myCompany.com:80
    DocumentRoot /usr/local/apache2/real/htdocs

Proxy Simulation

◊ In /etc/hosts we place the following entry:
    192.168.124.1 www.myCompany.com
◊ This simulates DNS registration for www.myCompany.com
◊ Notice this domain will be on a different subnet than the one we have been using
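To make the role of the /etc/hosts entry concrete, here is a small Python sketch of how such a file maps names to addresses, plus a check that two addresses sit on different subnets. The function names `parse_hosts` and `same_subnet` are invented for this illustration, and the /24 prefix length is an assumption; the slides do not state the netmask of the simulated networks.

```python
import ipaddress


def parse_hosts(text):
    """Map each host name in /etc/hosts-style text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Host names are case-insensitive; the first entry for a name wins.
            table.setdefault(name.lower(), ip)
    return table


def same_subnet(address_a, address_b, prefix_length=24):
    """True if both addresses fall in the same network (assumed /24 by default)."""
    network = ipaddress.ip_network(f"{address_a}/{prefix_length}", strict=False)
    return ipaddress.ip_address(address_b) in network
```

Looking up www.mycompany.com in the parsed table returns 192.168.124.1, and under the /24 assumption `same_subnet("192.168.123.2", "192.168.124.1")` is false, which is the "different subnet" point made on the slide.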

Proxy Simulation

◊ Next we need to configure the Ethernet interface for the simulation
◊ We will use the following commands:
    ifconfig eth0 192.168.123.2
    ifconfig eth0 192.168.123.3 alias netmask 0xFFFFFFFF
    ifconfig eth0 192.168.124.1 alias

Proxy Simulation

◊ Start a copy of Apache for each of the config files and sites
◊ At this point you can fire up your configured browser and enter the URL http://192.168.124.1
◊ You should see the site’s web page displayed
◊ But how do you know the site is being proxy served?

Proxy Simulation

◊ Go to the browser and reconfigure it to NOT use a proxy
◊ Now, enter the URL again: http://192.168.124.1
◊ You should get a network error
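The with-proxy and without-proxy check can also be scripted instead of done by hand in the browser. The sketch below uses Python's standard `urllib.request`. The proxy address 192.168.123.2 is an assumption (the slides give only the port, 8000), and the helper names are made up for this example.

```python
import urllib.request

PROXY_HOST = "192.168.123.2"   # assumed address of the proxy copy of Apache
PROXY_PORT = 8000              # port from the sample proxy config
SITE_URL = "http://192.168.124.1"


def make_proxy_opener(host=PROXY_HOST, port=PROXY_PORT):
    """Opener that sends HTTP requests through the proxy, like the configured browser."""
    handler = urllib.request.ProxyHandler({"http": f"http://{host}:{port}"})
    return urllib.request.build_opener(handler)


def make_direct_opener():
    """Opener with proxying switched off, like the reconfigured browser."""
    return urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_status(opener, url=SITE_URL, timeout=5):
    """Return the HTTP status code, or None if the request fails."""
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except OSError:
        return None
```

If the simulation behaves as the slides describe, `fetch_status(make_proxy_opener())` should return 200 and `fetch_status(make_direct_opener())` should return None.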

References

◊ Apache Web Server
  • Apache FAQ
  • Web server 2.2 documentation
  • The configure script
  • apache.conf directives index
◊ Netcraft web server survey
◊ Apache Week (online periodical)