The Linux Operating System

Tonga Institute of Higher Education
The Linux Operating System
Lecture 5: Apache
Apache
● One of the most important things your Linux computer
might do is run a web server.
● More and more companies and individuals run their own
web servers, so it is a good idea to know how to set one
up and make sure it keeps running.
● The most popular web server in the world is called
Apache. It runs on Linux as well as almost every other
operating system, and it is free to download.
● Many tests have shown it to be one of the fastest and
most secure web servers, so it is a good choice.
● It comes installed with RedHat, but whenever you
install a new system you should upgrade Apache, because
older versions might have security holes.
Understanding communication
● The Internet uses what is called the client-server
relationship: a client asks a server for data, and the
server sends it back. Apache works this way as well.
● For the server and client to talk, the server opens
what is called a 'socket'. This is like opening a door
through which two computers can send data.
● A socket needs to know the port, the service and
the protocol to send the data correctly.
● Apache, like most web servers, uses port 80 (but
you can change that), the 'www' service and the
'HTTP' protocol for the sockets it opens.
Web communication
● When you open your browser and type in a website
address, the browser puts together a request for that
webpage using the HTTP protocol, and the operating
system wraps the request in TCP/IP data packets to
send it.
● When a packet arrives at the server, the operating
system on the server works out from the port number
that it is meant for the www service and passes it to
the web server.
● The web server uses the HTTP protocol to figure
out what the client computer wants and then sends
the data back through the socket.
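The request that travels between browser and server is plain text. The Python sketch below is only an illustration, not how Apache itself is written: it builds the kind of request a browser would send and then splits it apart the way a web server must before it can answer. The function names and the host www.website.com are made up for this example.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as the bytes sent over the socket."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_request(raw: bytes) -> dict:
    """Split a request into its request line and headers, as a server would."""
    head = raw.decode("ascii").split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "path": path, "version": version, "headers": headers}


# Hypothetical host and page, used only for illustration.
request = build_get_request("www.website.com", "/index.html")
parsed = parse_request(request)
print(parsed["method"], parsed["path"], parsed["headers"]["host"])
```

The Host header is what lets one server tell apart requests for different website names.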
Apache as a webserver
● To make Apache run, use the chkconfig command to turn
it on at boot, and the service command to start or
restart it now:
– [root@comp root]# chkconfig httpd on
– [root@comp root]# service httpd restart
● Notice that the program name of the web server is httpd
(it stands for HTTP daemon), but everyone calls it
Apache.
● You can also use the program 'apachectl' to start and
stop the server:
– [root@comp root]# apachectl restart
● Our goal is to get the Apache server to do a few
different things, so we need to change the
configuration file. This is how Apache figures out
what to do.
Apache configuration
● The Apache configuration file is usually
/etc/httpd/conf/httpd.conf if you're using RedHat.
● If it's not in that location, use the 'locate'
program to find it.
● Once you open the file using 'pico' or 'vi', you can
change values that affect how the server runs and what
it does. After you save your changes, you must restart
the server (using the 'service' program) for them to
take effect.
● The parts we will look at are the global settings,
Directories, Virtual Hosts and Handlers.
Global Settings
● In the configuration file you will see many lines starting
with '#'. This just means the line is a comment, and you
can skip over it.
● The first real setting you come to will say something
like:
– ServerRoot "/etc/httpd/"
● The comments above this line say:
– # ServerRoot: The top of the directory tree under which
the server's configuration, error, and log files are kept.
● The comments tell us that this is where the program
expects to find all its configuration files, logs and other
important information.
Global Settings
● Other important global settings to change:
● The Listen setting tells Apache which IP address and
port to listen on. Notice the * in front of ':80'. This
means it will listen on any IP address, on port 80.
– Listen *:80
● The User and Group settings tell the web server what user
and group it should run as. Normally a separate user, called
apache, runs the web server.
– User apache
– Group apache
● ServerAdmin gives the email address that should be notified
if there are problems, and ServerName gives the hostname of
the main web server.
– ServerAdmin [email protected]
– ServerName tonga-server.to
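To see how simple the format of these settings is, here is a small Python sketch that reads one-line settings and skips comments. It is only an illustration: the function name is made up, the sample text reuses values from the slides, and Apache's real parser handles much more (sections, continuation lines, repeated directives).

```python
def parse_directives(config_text: str) -> dict:
    """Return {directive: value} for simple one-line settings, skipping comments."""
    settings = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("<"):
            continue  # blank line, comment, or a <Directory>-style section tag
        name, _, value = line.partition(" ")
        settings[name] = value.strip().strip('"')
    return settings


# Sample global settings, copied from the slides for illustration.
sample = '''
# ServerRoot: The top of the directory tree
ServerRoot "/etc/httpd/"
Listen *:80
User apache
Group apache
ServerName tonga-server.to
DocumentRoot "/var/www/html"
'''

settings = parse_directives(sample)
print(settings["Listen"], settings["DocumentRoot"])
```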
Global Settings and Directories
● The last global setting you'll need to know is
DocumentRoot. This tells Apache where the website
files are. It will be a folder on your Linux computer.
– DocumentRoot "/var/www/html"
● This leads into the Directory part of the
configuration file.
● Directories are a way to control how different parts
of the website are accessed. For example, you can
say that the folder /home/user1 on your computer
can be reached at http://www.website.com/user1,
and you can also say who is allowed to access this
folder and see what is inside.
Directories
● The first directory listing looks like this:
<Directory />
Options FollowSymLinks
AllowOverride None
</Directory>
● It says that the root directory of the file system has the
option FollowSymLinks and allows no overriding.
● There are many options like the ones above. The format is that
they go between two tags that say Directory (<Directory> and
</Directory>).
● You might also see an entry for the directory of the main folder,
like the following:
<Directory "/var/www/html">
Options Indexes FollowSymLinks
Order allow,deny
Allow from all
</Directory>
Directories
● Suppose we have a folder at /home/user and we want it
to be part of the website, but the DocumentRoot is
"/var/www/html". How can we include it?
● First we add an Alias setting and then we add the
Directory setting:
Alias /userfolder "/home/user"
<Directory "/home/user">
Options Indexes MultiViews
AllowOverride None
Order allow,deny
Allow from all
</Directory>
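The idea behind Alias can be sketched in a few lines of Python: check whether the start of the address matches an alias, and if not, look under the DocumentRoot. This is a simplified illustration with a made-up function name. It uses the paths from the example and ignores details that Apache handles, such as cleaning up ".." in the address.

```python
import posixpath


def url_to_file(url_path, document_root, aliases):
    """Map a URL path to a file path, checking Alias prefixes before DocumentRoot."""
    for url_prefix, target in aliases.items():
        prefix = url_prefix.rstrip("/")
        if url_path == prefix or url_path.startswith(prefix + "/"):
            rest = url_path[len(prefix):].lstrip("/")
            return posixpath.join(target, rest) if rest else target
    return posixpath.join(document_root, url_path.lstrip("/"))


# One alias, matching the example: /userfolder points at /home/user.
aliases = {"/userfolder": "/home/user"}
print(url_to_file("/userfolder/notes.txt", "/var/www/html", aliases))
print(url_to_file("/index.html", "/var/www/html", aliases))
```

The file name notes.txt is hypothetical. The first address is served from /home/user and the second from the DocumentRoot.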
Directory Options
● Some common directory options that you might want to
use inside the <Directory>:
● You should always include the following for security
reasons. It means that .htaccess files inside the folder
cannot override the settings made here.
– AllowOverride None
● If you have a directory with no web pages, just files,
and you want people to see a list of all the files, use:
– Options Indexes
● If you want to control who accesses this folder you can
use the Order, Deny and Allow settings. The following will
deny access to anyone trying to get to this folder who is
not from computer-tonga.to:
– Order deny,allow
– Deny from all
– Allow from computer-tonga.to
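The Order setting is easy to get backwards, so here is a small Python sketch of the rules as the Apache 2.2 documentation describes them. It is a simplified model, not Apache's code: the function names are made up and hosts are matched by name only. Note that with "Order deny,allow" a host that matches no rule is allowed, so a "Deny from all" line is needed to shut out everyone else.

```python
def host_matches(client_host, pattern):
    """True if pattern is 'all', the host itself, or a parent domain of the host."""
    if pattern == "all":
        return True
    return client_host == pattern or client_host.endswith("." + pattern)


def is_allowed(client_host, order, allow=(), deny=()):
    """Evaluate Order/Allow/Deny for one client (simplified model)."""
    allowed = any(host_matches(client_host, p) for p in allow)
    denied = any(host_matches(client_host, p) for p in deny)
    if order == "deny,allow":
        # Deny is checked first, Allow can override it; the default is to allow.
        return allowed or not denied
    if order == "allow,deny":
        # Allow is checked first, Deny can override it; the default is to deny.
        return allowed and not denied
    raise ValueError(f"unknown order: {order}")


# Hypothetical client names, used only for illustration.
rules = {"order": "deny,allow", "allow": ["computer-tonga.to"], "deny": ["all"]}
print(is_allowed("lab1.computer-tonga.to", **rules))
print(is_allowed("www.example.com", **rules))
```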
Handlers
● Handlers extend the ability of Apache to do
different things. A handler is like an extra program
that Apache uses to handle certain types of
requests for webpages.
● For example, you might want your website to run a
program written in Perl. Since it is a program that
needs to be run and not just a regular file that needs
to be sent, Apache must use a special program that
will itself run the program on your webpage.
● This is the idea of a Handler. It handles
certain types of data.
Handlers
● One common type of handler is the following:
– AddHandler cgi-script .cgi
● This says that any file ending in ".cgi" is a CGI script,
which means it is a program to be run rather than a file to
be sent.
● Another common kind of programming done with websites is
called Server Side Includes (SSI). These are small commands
placed inside webpages whose file names end in .shtml
instead of .html.
● Apache needs to do something special with these files.
To let your web server use SSI, add the handler below
(the folder also needs "Options Includes" in its
<Directory> section):
– AddType text/html .shtml
– AddHandler server-parsed .shtml
Virtual Hosts
● One of the biggest issues that has come up is how
to have more than one website on one computer.
● Apache solved this problem by creating what it
calls a 'Virtual Host'.
● This means that even though your server might be
called usp.to and have a website at that name, you
can ALSO have other domain names point to your
server, and your server will figure out which domain
name is the one the person actually wants.
● There are two types of virtual hosts: name-based
virtual hosts and IP-based virtual hosts.
Virtual Hosts
● Name-based virtual hosting: the website that is
displayed is chosen based on the domain name the client
asks for. It is easier to set up, because you just need
to point all the domain names to the one IP address of
the server.
● IP-based virtual hosting: the website is chosen based
on the IP address the user connects to. This means that
your server will be using more than one IP address, and
Apache figures out which website to display from the
address the user went to.
● This is harder because you need multiple IP
addresses on one computer.
● Generally, it is easiest and best to use name-based
virtual hosts.
Virtual Hosts
● The virtual host section is at the bottom of the Apache
configuration file.
● To enable it, first uncomment the following line:
– NameVirtualHost *
● Then add the Virtual Host that will be the new
website. Example:
<VirtualHost *>
ServerName new-website.to
DocumentRoot /home/new-website
DirectoryIndex index.php
CustomLog /var/log/httpd/new-access_log common
</VirtualHost>
Virtual Hosts
● The options that are important for virtual hosts are:
– ServerName: this decides which virtual host we
are talking about
– DocumentRoot: this decides where the files
for the website are located
– DirectoryIndex: this decides the first page
that comes up when someone types in the
website address
– CustomLog: this lets you keep track of who
came to your website. Each virtual host can have
its own separate log that keeps track of
visitors.
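Name-based virtual hosting comes down to comparing the name the browser asked for with each ServerName. The Python sketch below shows the idea. It is an illustration only: the function name is made up, and listing tonga-server.to as the first virtual host is an assumption for the example. When no name matches, the first virtual host in the list is used.

```python
def pick_virtual_host(host_header, virtual_hosts):
    """Return the virtual host whose ServerName matches the Host header.

    Falls back to the first entry when no ServerName matches.
    """
    requested = host_header.split(":")[0].lower()  # drop an optional :port
    for vhost in virtual_hosts:
        if vhost["ServerName"].lower() == requested:
            return vhost
    return virtual_hosts[0]


# The first entry is an assumed default site; the second is from the example.
virtual_hosts = [
    {"ServerName": "tonga-server.to", "DocumentRoot": "/var/www/html"},
    {"ServerName": "new-website.to", "DocumentRoot": "/home/new-website"},
]
print(pick_virtual_host("new-website.to", virtual_hosts)["DocumentRoot"])
print(pick_virtual_host("unknown.to:80", virtual_hosts)["DocumentRoot"])
```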
Apache Configuration
● As you look through the configuration file, you'll see a
lot more options. Most of them are well commented, so
you can figure out what they are for.
● If you're still unsure, try looking at the manual that is
installed on your own website. It will be at
http://www.your-website.com/manual/
● This is the Apache manual that is included with every
Apache server.
● When you restart the Apache server, you'll know right
away if there is an error, because you'll get a message
that says it could not start. It will even tell you the line
the problem is on so you can fix it.
● If you start Apache and it says OK, then your
configuration file has no syntax problems.
Webpages and HTML
● We should now have an Apache web server set up. If
you type in http://localhost/ you should see a
webpage come up.
● 'localhost' is another way of saying "this computer
that I am using". If you are using another computer,
you can type in the address of your server, and you
will also see a webpage come up.
● Now you need to set up some webpages for your
website. Your webpages are just files in a folder on
your Linux server, so you can make them using 'pico'
or 'vi'.
HTML
● Go to the folder where your website is located, such
as "/var/www/html", and open a file called index.html.
● This is the default file that is sent. The default can
be changed in your Apache configuration file.
● Once you have opened that file, you can add HTML
and make a webpage. As soon as you save the file, it
is available on the web and you can see the changes
you have made.
● What you type inside the index.html file is just
text. There is nothing special about it. You could
even use only text for a webpage, with no HTML,
and it would work just fine.
HTML
● But we can also use HTML to make our webpages
look a little better.
● HTML is the language of websites. It is a markup
language rather than a programming language, which
means it surrounds regular text with little commands
(tags) that change the way the text looks or how it is
formatted.
● Some common examples of HTML:
– To make text bold: <b>Text is bold</b>
– To make text italic: <i>Text is italic</i>
– To underline text: <u>Text is underlined</u>
HTML Parts
● There are two parts to an HTML file, and you should
generally follow this format so your HTML is correct.
● An HTML file always starts at the top with an
<HTML> tag and ends at the bottom with a
</HTML> tag.
● In between, there should be a HEAD part and a BODY
part. The HEAD is for special information (like a
title for the page). The BODY is where everything
that will be displayed is put.
HTML Parts Example
<html>
<head>
<title>My page</title>
</head>
<body>
<b>Hello World</b> This is my page<BR>
<i>Thank you for coming here</i>
</body>
</html>
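One way to check that a page follows this format is to let a program read the tags. The Python sketch below uses the standard html.parser module to list the tags in the example page and pull out its title. The class name PageOutline is made up for this illustration.

```python
from html.parser import HTMLParser


class PageOutline(HTMLParser):
    """Record the opening tags seen and the text inside <title>."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)  # tag names arrive in lowercase
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


page = """<html>
<head>
<title>My page</title>
</head>
<body>
<b>Hello World</b> This is my page<BR>
<i>Thank you for coming here</i>
</body>
</html>"""

outline = PageOutline()
outline.feed(page)
print(outline.title)
print(outline.tags)
```

The HEAD part comes before the BODY part in the list of tags, which is the order the format asks for.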
More HTML
● HTML is a fairly simple language. It always uses the
same format: a command between angle brackets (< >).
● Sometimes there are also attributes within the tag,
like:
– <font size='3'> - changes the font size
– <hr width='400'> - draws a line across the webpage
– <a href='http://www.yahoo.com'> - creates a link to
yahoo.com
– <body bgcolor='blue'> - changes the background color
of the webpage to blue
Using SSI
● SSI (Server Side Includes) is a simple way to put a
little programming into your web page files.
● SSI commands are directives that are placed in HTML
pages and evaluated on the server while the pages are
being sent to the user.
● They let you add content that can change to an existing
HTML page (which usually does not change each time
someone goes to it).
● Pages like this are called dynamic webpages. Normal
HTML pages are thought of as static webpages, because
they do not change.
● SSI lets webpages change depending on something that
happens.
Using SSI
● SSI directives have the following syntax:
– <!--#element attribute=value -->
● A directive is formatted like an HTML comment, so if
you don't have SSI correctly enabled, the browser will
ignore it.
● The element part tells what kind of command
will be run.
● The attribute=value part provides an argument to
the command that the element names.
● Example which outputs the current date:
– <!--#echo var="DATE_LOCAL" -->
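To make the idea of "evaluated on the server" concrete, here is a small Python sketch that does by hand what the server does with an echo directive: it finds the directive in the page and puts the value in its place. It is only an illustration. The function name is made up, the date format is an arbitrary choice, and only the echo element is handled.

```python
import re
from datetime import datetime

# Matches directives of the form <!--#echo var="NAME" -->
SSI_ECHO = re.compile(r'<!--#echo\s+var="([^"]+)"\s*-->')


def expand_echo(page, variables):
    """Replace each <!--#echo var="NAME" --> with the value of NAME."""
    def substitute(match):
        # Unknown variables are shown as "(none)" in this sketch.
        return variables.get(match.group(1), "(none)")
    return SSI_ECHO.sub(substitute, page)


variables = {"DATE_LOCAL": datetime.now().strftime("%A, %d-%b-%Y %H:%M:%S")}
page = 'Today is <!--#echo var="DATE_LOCAL" -->'
print(expand_echo(page, variables))
```

If nothing expands the directive, it stays in the page as an ordinary HTML comment, which is why the browser shows nothing in its place.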
SSI Examples
● Add a line that tells the last modification date of a
file:
– <!--#flastmod file="index.html" -->
● Include part of another file (good if you want all
your webpages to look the same):
– <!--#include virtual="/footer.html" -->
● Execute a command on your computer and then
display the results:
– <!--#exec cmd="/bin/ls" -->
– <!--#exec cmd="/bin/netstat -n" -->
– <!--#exec cmd="/usr/bin/last" -->
Using SSI
● To make sure that Apache realizes you're
using SSI, save your file with ".shtml" at the end.
● Normally, webpages end in ".html" or ".htm".
● To tell Apache to do something different, put
".shtml" at the end.
● Example:
– index.shtml
– page1.shtml
SSI
● SSI is a good way to add "dynamic" content to your
webpage, meaning things that will change depending on
different circumstances.
● It is not a complete replacement for server-side
programming, like Perl, PHP or ASP, which is what
will be covered next.
● While static pages are good for websites that just
want to display information, for websites to interact
with the people who come to them, we need ways
to generate dynamic content.
● SSI is one way to do it, but there are certainly
better ways.
Installing a new Apache
● Every now and then there is a security hole in
Apache, or a major upgrade. Sometimes it can be
troublesome to install the new version.
● The main reason for the problems is that RedHat
puts the configuration files for Apache in /etc/httpd/,
while a new Apache that you install yourself wants to
put everything in /usr/local/apache2/.
● We must then also make sure the startup scripts
are updated so that they run the new Apache and
not an old copy that is still on the system.
Summary
● Apache is one of the most valuable tools for a Linux
server to use.
● Almost 70% of all web servers run Apache. It is
fast, reliable and generally secure.
● To understand the workings of Apache, a good
understanding of the configuration file is necessary.
● Once you are able to control Apache, you will be
able to create a well-maintained and successful
web server.