Web Server Setup

Web Technology
Web Server Setup
Course Overview and Goals
• This course will teach you how to install,
configure, and administer a Web server that
runs on a Unix system and can be used to
deliver dynamic content.
What This Course Is and Is Not
• The purpose of the course is to teach you how to
set up a Web server. This means you will be
learning how to use tools to deliver content for the
World Wide Web, not to create content.
World Wide Web Unix
Administrator Certificate
• This course is one of four required to
receive the World Wide Web Unix
Administrator Certificate.
Prerequisites
• Familiarity with a Web browser such as
Netscape or Internet Explorer.
• You should have user-level experience with
UNIX and must be familiar with the use of
a UNIX text editor such as vi, emacs, or pico.
• Some level of experience with creating
HTML documents may be helpful.
Course Resources
• Textbook: Professional Apache by Peter
Wainwright (Wrox Press, 1999).
• User account on Linux server
iti.rutgers.edu.
How does the World Wide Web
Work?
• Works on a client/server model. The Web
server is the server component. The Web
browser is the client component. Purpose of the
Web server is to provide documents to clients.
• Web servers, Web browsers, and the
information that is shared between them
through the Hypertext Transfer Protocol
(HTTP) make up the World Wide
Web.
History of the World Wide
Web
• Grew out of the Internet, a network of networks
that began in the early 1970s and was
used to support a variety of services (including
telnet, ftp, Usenet, email, and gopher) that
communicated via TCP/IP (Transmission Control
Protocol/Internet Protocol).
• In 1989, Tim Berners-Lee at CERN developed a
new system to simplify document distribution and
to allow documents to be linked together. Called
the “WorldWideWeb.”
Web History, con’t.
• In 1993, the National Center for
Supercomputing Applications (NCSA)
released to the public the NCSA httpd server
software and a GUI Web browser called
Mosaic. Quickly became popular.
• Developers from the Mosaic team went on to found Netscape.
Who is a Webmaster?
• A Webmaster is someone responsible for
the content and/or management of a Web
site and/or a Web server.
What Roles Do Webmasters
Play?
• Web Designers – Create graphical elements
and determine layout of Website.
• Content Providers – Create and edit HTML
documents.
• Web Developers – Write CGI, Java,
JavaScript, ASP, PHP, and other scripts or
programs that are used to deliver dynamic
content.
Webmaster Roles, con’t.
• Administrators – Responsible for
maintaining the Web server software and
often the operating system and hardware
where the Web server is installed.
• For most organizations, these
responsibilities tend to be split over
multiple job positions except for very
small and simple Web sites.
Planning Your Server
• How and where will you host it?
• What kind of hardware will you use?
• What kind of Operating System will the hardware
run?
• What Web server software will you use?
• What domain name will your site use?
• Answers to above questions usually determined by
budget, staffing, and existing infrastructure of
your organization.
Hosting Your Server: Use an
ISP (Internet Service Provider)
• Free Page Site – For personal use, limited space
and tools, adds advertisements. (examples:
Yahoo, Tripod, Xoom, etc.)
• Personal Page Site – For personal use, usually
included with dialup account (about $20 per
month), 2-20 MB disk space, none or limited
access to server-based technologies for delivering
dynamic content, generally under your ISP’s
domain. (Website URL usually looks something
like: http://www.yourisp.com/~yourusername)
Hosting Your Server, con’t.
• Virtual Host – For business or personal use, share
a machine with other domains, can use your own
domain (http://www.yourdomain.com), should
provide a fairly wide range of tools for building
more complex Websites, costs based on disk usage
and traffic, ranges from $10 to several hundreds of
dollars a month. Generally available through all
ISPs and hosting-only providers such as Highway
Technologies (http://www.hway.net) and
YourDomainHost
(http://www.yourdomainhost.com)
Hosting Your Server, con’t.
• Dedicated Server – For business use, ISP owns
and runs the machine, your organization dictates
the configuration and has exclusive access to the
system, expensive.
• Co-Located Server – For business use, your
organization owns the hardware and software and
is responsible for maintaining it, ISP houses the
system and provides a network connection, pricing
determined by bandwidth requirements.
Hosting Your Server: Do It
Yourself: Networking Options
• For an Intranet Server – Need a LAN (local area
network).
• For an Internet Server – Need a dedicated Internet
connection. Internet Connectivity Options:
– POTS (up to 56Kbps) – not practical for business use
– ISDN (128Kbps) – only a good choice if cable or DSL
is not available
– Cable (512Kbps – 10Mbps)
– DSL (128Kbps – 1.54Mbps+)
– T-1 (up to 1.54Mbps) – full, fractional, or burstable
– T-3 (up to 45 Mbps)
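To get a feel for what these speeds mean in practice, the rough sketch below estimates how long a single transfer would take on several of the links listed above. The 1 MB file size and the name transfer_seconds are illustrative assumptions, and the arithmetic ignores protocol overhead, latency, and other traffic sharing the link.

```python
# Rough transfer-time estimate: seconds = bits to send / link speed in bits per second.
# Link speeds are the nominal maximums from the list above; real throughput is lower.
LINK_SPEEDS_BPS = {
    "POTS modem": 56_000,
    "ISDN": 128_000,
    "T-1": 1_540_000,
    "T-3": 45_000_000,
}


def transfer_seconds(size_bytes, bits_per_second):
    """Return the ideal time in seconds to move size_bytes over the link."""
    return size_bytes * 8 / bits_per_second


FILE_SIZE_BYTES = 1_000_000  # assumed example size (1 MB)

for name, speed in LINK_SPEEDS_BPS.items():
    print(f"{name:10s} {transfer_seconds(FILE_SIZE_BYTES, speed):8.1f} s")
```

Multiplying the per-request time by the number of simultaneous visitors you expect gives a first, very approximate idea of whether a link is adequate.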
Finding an ISP
• Setting up an Internet Web site will require
you to purchase some level of services from
an ISP.
• The List – http://thelist.com
Hosting Your Server:
Hardware Options
• Need to select a machine architecture (e.g.,
Intel-compatible PC, Sun, Macintosh G4).
• Processor speed and number of processors.
• RAM and Disk Space.
• Network interface card (NIC).
• Price can range from several hundred
dollars to thousands of dollars.
Hosting Your Server:
Operating System Options
• Commercial Versions of Unix (e.g., Solaris, Irix,
HP-UX, AIX, MacOS X).
• Free Versions of Unix (e.g., Linux, FreeBSD).
• Microsoft Windows (9x, NT, Windows 2000).
• Novell NetWare
• Windows vs. Unix – raises issues of ease of use,
stability, scalability, open source, and pricing.
Hosting Your Server: Web
Server Software Options
• According to the Netcraft Web Server
Survey (http://www.netcraft.com), as of
January 2000, three Web server software
distributions account for nearly 90% of all Web
servers on the Internet:
– Apache 61.66%
– Microsoft Internet Information Server 19.63%
– Netscape Enterprise 7.22%
Web Server Software Options:
Apache
• “The standard” for UNIX web servers.
• Originally based on NCSA httpd code.
• Can be installed under most Unix variants
and Windows. Binary versions available for
many operating systems.
• Uses file-based configuration, although GUI
tools are also available.
Introduction to Apache, con’t.
• Unix versions very stable. Windows version less
mature (beta-level code).
• Very fast and uses resources efficiently.
• Freely distributed source code. Can be modified
for commercial or non-commercial use.
• Price: Free
• See http://www.apache.org for more information.
Web Server Software Options:
Netscape Server
• Sometimes referred to as the iPlanet server
• Distributed through Sun-Netscape Alliance called
iPlanet.
• Server packages: iPlanet/Netscape Enterprise
Server, Netscape Fast-Track Server.
• Runs under Windows NT, Solaris, Irix, HP-UX,
Digital Unix, AIX, Linux (coming soon).
Netscape iPlanet Server, con’t.
• Uses Web-based administration.
• Can be resource intensive.
• Price: $1495 per processor for Enterprise
Server
• See
http://www.iplanet.com/products/infrastructure/web_servers
for more information.
Web Server Software Options:
Microsoft Internet Information
Server
• Most popular for NT-based web servers.
• Runs only under Windows NT Server. IIS
v4 is the most popular release. IIS v5 was
released with Windows 2000 Server.
• GUI-based administration. Web-based
administration available as well.
• May not scale well.
Microsoft IIS, con’t.
• Source code not available. Extendable
through Microsoft’s Internet Server API
(ISAPI).
• Price: Free with NT Server 4.0
• See
http://www.microsoft.com/ntserver/web/default.asp
for more information.
Important Notes about Web
Server Hardware
• Web Servers need fast disk access and a lot
of RAM to handle high volumes of traffic.
– Not unusual to see web servers with 1GB of
RAM and 10,000RPM hard drives.
• Processor speed and performance become
very important when delivering dynamic
content via CGI scripts, Server Side
Includes or other web applications.
How the Internet Works:
Networking Basics
• For a Web server to be useful it will need to
be attached to a network.
• Minimum requirements for a computer
network – at least two computers that have
a medium and a method of communicating.
• All Internet applications use TCP/IP
(Transmission Control Protocol/Internet
Protocol) for low-level communications.
Networking Basics: TCP/IP
• TCP/IP is actually a combination of 2
protocols:
– A transport layer protocol called the
Transmission Control Protocol (TCP)
– A network layer protocol called the
Internet Protocol (IP)
Networking Basics: IP Addresses
• TCP/IP uses IP addresses to identify different
devices. Every computer on the Internet must
have at least one unique IP address.
• IPv4: IP addresses are four 8-bit numbers
separated by dots: 165.230.30.68
– Usually divided in three parts:
– 165.230 is one of Rutgers’ networks – i.e., no one else
has addresses starting with 165.230
– 30 is the subnet portion of the address
– 68 is the particular node, or host portion of the address
• Division not necessarily on octet boundary.
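The split into network, subnet, and host portions can be checked with Python's standard ipaddress module. This is only a sketch: the /16 and /24 prefix lengths are assumptions chosen to match the octet-aligned example above, not the actual configuration of the network.

```python
import ipaddress

address = ipaddress.IPv4Address("165.230.30.68")

# An IPv4 address is one 32-bit number; dotted notation shows its four 8-bit parts.
octets = list(address.packed)
as_integer = int(address)

# Assumed prefix lengths for illustration: /16 for the network, /24 for the subnet.
network = ipaddress.ip_network("165.230.0.0/16")
subnet = ipaddress.ip_network("165.230.30.0/24")

# The host portion is whatever bits the subnet mask leaves uncovered.
host_part = int(address) & int(subnet.hostmask)

print(octets, as_integer)
print(address in network, address in subnet, host_part)
```

Because the division need not fall on an octet boundary, a prefix such as /23 or /26 would be handled by the same calls; only the host portion would change.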
TCP/IP: Two Friends, Working
Together
• IP - An IP address represents a machine’s identity
on the internet and tells other machines how to get
to it – similar to your street address (e.g. 123 Main
Street, Anytown, USA).
• TCP is a mechanism used to ensure that anything
sent to a specific IP address makes it there in one
piece – similar to the Post Office.
• Together, TCP/IP assures that anything sent to a
server on the Internet is delivered to the right
place in one complete piece.
Networking Basics: IP
Addresses
• IP addresses no longer being distributed by
classes – blocks are distributed to ISPs on an
as-needed basis and must be justified.
• IP addresses are hard to come by. How do you
get them?
– Your ISP received an “address space” from ARIN
(http://www.arin.org)
– You receive IP addresses from your ISP.
Networking Basics: Tools
• Network interfaces need to be assigned IP
addresses.
• Interfaces can be configured using ifconfig
command on UNIX machines.
• Type ifconfig -a to view current configuration
settings.
• Additional tools for network monitoring: ping,
traceroute, tcpdump, netstat, arp, snoop.
Networking Basics: DNS
• IP addresses are usually paired with more
human-friendly names: Domain Name
System (DNS).
– Example: internet.rutgers.edu
– internet = hostname
– rutgers = organization
– edu = top-level domain
• Other top-level domains include .com, .gov,
.org, etc. There are also country-specific
domains like .uk, .ca, .jp, etc.
Networking Basics: DNS, con’t.
• Domain name information is maintained through a
distributed database of host name/IP address
pairings.
• The Network Information Center (NIC) manages
the top-level domains, delegates authority for
second-level domains, and maintains a database of
registered name servers for all second-level
domains.
• Host name assignments maintained through zone
files on primary DNS server. Secondary DNS
server gets zone file from primary server.
Networking Basics: DNS, con’t.
• Network Solutions (previously the InterNIC)
registers domain names – See
http://www.networksolutions.com. Other
registrars include Register.com.
• Costs range from $20 to $50 per year.
• ISPs are beginning to offer domain name
registration as part of other packages.
• Need to register primary and secondary
domain name servers for your domain and
arrange to have zone files created on DNS
servers.
DNS Overview: If DNS Servers Could Talk…
Networking Basics: DNS Tools
• There are several tools for monitoring DNS
information:
– whois – tells you the owner and primary DNS servers
associated with a domain (e.g. whois yahoo.com). Also
available via web browser at
www.networksolutions.com.
– nslookup and host – tell you IP address information for
a particular hostname on the internet (e.g. nslookup
www.yahoo.com or host www.rutgers.edu)
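The same kind of lookup that nslookup and host perform can be sketched with Python's standard socket module, which asks the system resolver for the addresses behind a name. The function name lookup is a hypothetical helper, and the sketch only returns IPv4 addresses.

```python
import socket


def lookup(hostname):
    """Return the sorted IPv4 addresses the system resolver reports for hostname."""
    try:
        results = socket.getaddrinfo(
            hostname, None, socket.AF_INET, socket.SOCK_STREAM
        )
    except socket.gaierror:
        return []  # the name could not be resolved
    # Each result ends with a (address, port) pair; keep only unique addresses.
    return sorted({sockaddr[0] for *_rest, sockaddr in results})


# A numeric address needs no DNS query, so this line works even without a network.
print(lookup("127.0.0.1"))
```

With network access, calling lookup("www.rutgers.edu") would return whatever addresses that name currently resolves to, much as nslookup or host would report.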
DNS Exercise
• What are the IP addresses of the DNS servers
that contain information about rutgers.edu?
• What are the IP addresses of:
– www.retaildecisions.com
– abusaday.admin.cju.com
– www.linux.org
Networking Basics: Ports
• Servers tend to run a number of services. A single
NIC can be used to provide multiple services
through ports.
• Servers with Internet-related services listen on
specific ports. Clients contact server by specifying
an IP address and a connection port.
• Common services and port numbers:
– smtp 25, ftp 21, telnet 23, http/web 80, https/ssl 443
– A list of services and ports is contained in the
/etc/services file
• Ports below 1024 are reserved for system services
and can only be used by programs started by root.
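As an illustration of the service-to-port mapping, the sketch below parses a few lines written in the /etc/services format and checks which ports fall in the reserved range. The sample text is typed in by hand rather than read from a real system, and parse_services and is_privileged are hypothetical helper names.

```python
# Sample lines in the /etc/services format: name, port/protocol, optional aliases.
SAMPLE_SERVICES = """\
# service  port/protocol  aliases
ftp      21/tcp
telnet   23/tcp
smtp     25/tcp    mail
http     80/tcp    www
https    443/tcp
"""


def parse_services(text):
    """Map each service name to its port number, skipping comments and blank lines."""
    ports = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, port_proto = line.split()[:2]
        ports[name] = int(port_proto.split("/")[0])
    return ports


def is_privileged(port):
    """Ports below 1024 traditionally require root to bind on UNIX systems."""
    return 0 <= port < 1024


ports = parse_services(SAMPLE_SERVICES)
print(ports["http"], is_privileged(ports["http"]), is_privileged(8080))
```

This is why a test Web server started by an ordinary user is often placed on a high port such as 8080 rather than on port 80.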
Web Servers and UNIX Systems
• Most web servers run on port 80 – the standard
web port
• Web server software usually runs on UNIX
system as some user other than root. It’s
considered a security risk to run the web server
software as root.
• The web server software binary is httpd. Web
server software is often referred to as “the httpd.”
Uniform Resource Locator
(URL)
• URL: a fancy way of saying “web site
address”
• Anatomy of a URL:
http://internet.rutgers.edu:80/ITI520/index.html
– Protocol: http
– Hostname: internet.rutgers.edu
– Port Number: 80
– Path To File: /ITI520/index.html
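The same breakdown can be reproduced with urlsplit from Python's standard urllib.parse module. This is a small sketch for checking the parts of a URL, not something a Web server requires.

```python
from urllib.parse import urlsplit

url = "http://internet.rutgers.edu:80/ITI520/index.html"
parts = urlsplit(url)

print(parts.scheme)    # protocol
print(parts.hostname)  # hostname
print(parts.port)      # port number
print(parts.path)      # path to file
```

When the port is left out of a URL, parts.port is None and the browser falls back to the protocol's standard port (80 for http).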
HTTP – An Introduction
• HTTP – The Hypertext Transfer Protocol
– The protocol used between web clients
(browsers) and web servers.
– Web browsers “ask” for a specific web page
from the server, which returns the content.
HTTP: Example and Exercise
• You can emulate the HTTP conversation
between a browser and a server:
– telnet to the internet.rutgers.edu machine, port
80, e.g. telnet internet.rutgers.edu 80 from the
UNIX command line.
– Type: “GET / HTTP/1.0” Press Enter twice.
– The server returns the HTML (web page code),
which is usually interpreted and displayed by
your web browser.
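The telnet conversation can also be sketched in Python. To keep the example self-contained, it starts a tiny Web server on the local machine as a stand-in for internet.rutgers.edu and then sends the raw request text over a socket. DemoHandler, PAGE, and http_get are hypothetical names invented for this sketch.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Hello</h1></body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the same small HTML page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def http_get(host, port, path="/"):
    """Send a bare HTTP/1.0 GET request and return the raw response text."""
    # Request line, then a blank line: the same text typed into telnet.
    request = f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")


# Port 0 asks the operating system for any free port on the local machine.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

response = http_get("127.0.0.1", server.server_port)
server.shutdown()
server.server_close()

headers, _, body = response.partition("\r\n\r\n")
status_line = headers.splitlines()[0]
print(status_line)
print(body)
```

The response has the same shape as what telnet displays: a status line, some headers, a blank line, and then the HTML itself.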
Unix Tools and Commands
• File Editors: vi, emacs, pico
• File system navigation: cd
• File management: mv, rm, mkdir,rmdir, ls,
chmod, ln
• Archiving and compression: tar, gzip
• Process management: ps, kill
• Man pages are available for all these
commands, e.g., “man rmdir”
UNIX Process Management
• UNIX Processes are managed using the ps
and kill commands
– ps is used to list processes running on the
system
– kill is used to send signals that stop or restart
processes running on the system
• Every time you start a new program (pico,
vi, bash, etc.) a process is created and you
are the owner of that process.
Process Management Exercises
• You can type ps -aux to see all the
processes running on a system. This will
list the process owner, process ID (PID) and
the command being run.
• You can kill any PID, as long as you are the
owner of the process.
• ps -u shows all the processes you are
currently running
Process Management Exercises,
con’t.
• Open up a new terminal window and type vi
foo.txt. This will create a new process on
the system that you own.
• Switch back to your original terminal
window. Locate the process ID for your vi
session and kill it.
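The exercise can also be rehearsed from a script. In the sketch below, a sleeping Python child process stands in for the vi session. The script starts it, notes its process ID, and then sends it the same default signal (SIGTERM) that the kill command sends. It assumes a UNIX system.

```python
import os
import signal
import subprocess
import sys

# Start a long-running child process; it stands in for the vi session.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(300)"])
print("started process", child.pid, "owned by uid", os.getuid())

# Send SIGTERM, the default signal used by the kill command.
os.kill(child.pid, signal.SIGTERM)

# wait() collects the child's exit status; on UNIX a negative value
# means the process was ended by that signal number.
exit_code = child.wait(timeout=10)
print("exit code:", exit_code)
```

As with the kill command itself, this only works on processes you own, unless the script is run by root.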