Transcript Chapter 2

Technology Infrastructure
Internet and the Web
Computer Science 1611
Internet & Web
Learning Objectives
In this section, we will learn about:
• The origin, growth, and current structure of
the Internet
• How packet-switched networks combined to
form the Internet
• Internet protocols and Internet addressing
Technology Overview
• Computer networks and the Internet form the basic
technology structure for what is now the WWW.
• The computers in these networks run such software
as:
• Operating systems, database managers,
encryption software, multimedia creation and
viewing software, and the graphical user interface
Technology Overview
• The Internet includes:
• The hardware that connects the
computers together
• The hardware that connects the
networks together
• Rapid change in these technologies requires
businesses to be flexible
Packet-Switched Networks
• A local area network (LAN) is a network of computers close
together.
• A wide area network (WAN) is a network of computers
connected over a great distance.
• Circuit switching is used in telephone communication.
• The Internet uses packet switching
• Files are broken down into small pieces (called packets) that
are labeled with their origin, sequence, and destination
addresses.
Internet Protocols
http://   World Wide Web
mailto:   E-mail address
ftp://    File Transfer Protocol
telnet:   Telnet
Top Level Domain Names
.edu   Educational Institution (in US)
.ca    Country Codes (two letters such as .ca, .de, .mx, .jp)
.gov   Governmental Agency
.mil   Military Entity
.com   Commercial Entity
.net   Internet Service Provider
.org   Non-Profit Organization
When Computers Communicate
• When two or more computers communicate, they
must have a common way in which to
communicate.
• To do this, computers use protocols.
• A protocol is an agreement by which two or more
computers can communicate.
• Transmission Control Protocol/Internet Protocol
(TCP/IP) is the underlying protocol for the
Internet.
How TCP/IP Works
1) Transmission Control
Protocol (TCP) breaks
data into small pieces
no bigger than about 1,500
bytes each. These
“pieces” are called
packets.
How TCP/IP Works (II)
2) Each packet is inserted
into its own Internet
Protocol (IP) “envelope.”
Each envelope carries the address
of the intended recipient
and has the same header
format as all other
envelopes.
How TCP/IP Works
• A router receives the packets and then
determines the most efficient way to send the
packets to the recipient.
• After traveling along a series of routers, the
packets arrive at their destination.
[Figure: packets travelling between Router 1, Router 2, Router 3, and Router 4]
Packets
• Everything you do on the Internet involves packets. For
example, every Web page that you receive comes as a
series of packets, and every e-mail you send leaves as a
series of packets. Networks that ship data around in small
packets are called packet switched networks. On the
Internet, the network breaks an e-mail message into parts of
a certain size in bytes. These collections of bytes are the
packets. Each packet carries the information that will help it
get to its destination:
– the sender's IP address,
– the intended receiver's IP address,
– something that tells the network how many packets this e-mail
message has been broken into and
– the sequence number of this particular packet
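The four fields listed above can be sketched as a small data structure. This is only a minimal Python illustration, not how a real TCP/IP stack is written; the names `Packet` and `split_message`, the 1,500-byte payload limit, and the example addresses are assumptions chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src: str        # sender's IP address
    dst: str        # intended receiver's IP address
    total: int      # how many packets the message was broken into
    seq: int        # sequence number of this packet (1-based)
    payload: bytes  # this packet's share of the message body


def split_message(message: bytes, src: str, dst: str,
                  max_payload: int = 1500) -> list[Packet]:
    """Break a message into packets of at most max_payload bytes each."""
    chunks = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)]
    return [Packet(src, dst, len(chunks), n, chunk)
            for n, chunk in enumerate(chunks, start=1)]


# A 4,000-byte message becomes three packets: 1,500 + 1,500 + 1,000 bytes.
packets = split_message(b"x" * 4000, "138.73.27.246", "138.73.1.35")
print([(p.seq, p.total, len(p.payload)) for p in packets])
```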
Packets Purpose
• The packets carry the data in the protocols that the Internet
uses: Transmission Control Protocol/Internet Protocol
(TCP/IP). Each packet contains part of the body of your
message. A typical packet contains perhaps 1,000 or 1,500
bytes.
• Each packet is then sent off to its destination by the best
available route -- a route that might be taken by all the other
packets in the message or by none of the other packets in
the message. This makes the network more efficient. First,
the network can balance the load across various pieces of
equipment on a millisecond-by-millisecond basis. Second,
if there is a problem with one piece of equipment in the
network while a message is being transferred, packets can
be routed around the problem, ensuring the delivery of the
entire message.
Packet Design
Most packets are split into three parts:
• Header - The header contains instructions about the data
carried by the packet, such as its length, packet number,
protocol, and the destination and originating addresses.
• Body - Also called the payload or data of a packet. This is
the actual data that the packet is delivering to the
destination. If a packet is fixed-length, then the payload
may be padded with blank information to make it the right
size.
• Footer - sometimes called the trailer, typically contains a
couple of bits that tell the receiving device that it has
reached the end of the packet. It may also have some type
of error checking.
How are Packets Used
• If a message is sent over the Internet, it will be broken into
packets. Each packet's header will contain the proper
protocols, the originating address (the IP address of your
computer), the destination address (the IP address of the
computer to which you are sending the message) and the packet
number (1, 2, 3 or 4 if the message needs 4 packets). Routers in
the network will look at the destination address in the
header and compare it to their lookup table to find out
where to send the packet. Once the packets arrive at their
destination, the receiving computer will strip the header
and footer off each packet and reassemble the message
based on the numbered sequence of the packets.
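The reassembly step can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `reassemble`, the use of (sequence number, payload) pairs, and the sample text are assumptions, and the header and footer are taken to be already stripped.

```python
import random


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Rebuild a message from (sequence_number, payload) pairs in any order."""
    ordered = sorted(packets, key=lambda pair: pair[0])
    return b"".join(payload for _, payload in ordered)


# Four packets, numbered 1 to 4, that may arrive in a different order.
sent = [(1, b"Hello, "), (2, b"this is "), (3, b"a short "), (4, b"message.")]
arrived = sent[:]
random.shuffle(arrived)

print(reassemble(arrived))
```

Sorting by sequence number is what lets each packet take a different route: arrival order no longer matters.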
Packet Header
• Header (contains instructions about the data carried by the packet)
– Length of packet
– Synchronization (a few bits that help the packet match
up to the network)
– Packet number (which packet this is in a sequence of
packets)
– Protocol (on networks that carry multiple types of
information, the protocol defines what type of packet is
being transmitted: e-mail, Web page, streaming video)
– Destination address
– Originating address
Packet Body and Footer
• Body - Also called the payload or data of a
packet. This is the actual data that the packet is
delivering to the destination. If a packet is fixed-length, then the payload may be padded with
blank information to make it the right size.
• Footer - sometimes called the trailer, typically
contains a couple of bits that tell the receiving
device that it has reached the end of the packet. It
may also have some type of error checking.
Error Checking
• The most common error checking used in
packets is Cyclic Redundancy Check (CRC).
• A real CRC treats the payload as one long binary
number, divides it by a fixed generator polynomial, and
stores the remainder in the footer (trailer). As a
simplified illustration, this section instead uses the
count of 1s in the payload as the check value. The
receiving device recomputes the check value from the
payload and compares the result to the value stored in the
trailer. If the values match, the packet is good.
But if the values do not match, the receiving
device sends a request to the originating device
to resend the packet.
Error checking example (simplified CRC)
Suppose you have 4 bytes of data of the form
10101101 00111000 11001011 10010011
– The number of 1s in this data is 17. This
value can be represented in binary form as
00010001
The value 17 (00010001) is the check value
that is inserted into the footer (trailer). A real CRC
would store a polynomial-division remainder instead of a count.
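The worked example can be checked with a short Python sketch. Counting 1 bits is the simplified check used in this example, not an actual CRC, which is the remainder of a polynomial division. The function names `count_ones` and `packet_is_good` are illustrative assumptions; `zlib.crc32` is the standard library's real CRC-32 and is shown only for comparison.

```python
import zlib


def count_ones(payload: bytes) -> int:
    """Simplified check value from the example: the number of 1 bits."""
    return sum(bin(byte).count("1") for byte in payload)


def packet_is_good(payload: bytes, stored_check: int) -> bool:
    """Receiver side: recompute the check value and compare with the trailer."""
    return count_ones(payload) == stored_check


payload = bytes([0b10101101, 0b00111000, 0b11001011, 0b10010011])

check = count_ones(payload)
print(check, format(check, "08b"))   # 17 00010001

# A real CRC for the same four bytes, for comparison.
print(hex(zlib.crc32(payload)))
```

A bit count misses many errors (for example, two bits swapping places), which is one reason real networks use a true CRC.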
Packet Construction
• Suppose you send an e-mail to a friend, that the
e-mail is about 3,500 bits (3.5 kbits) in size, and
that the network you send it over uses fixed-length packets of 1,024 bits (1 kilobit). The header
of each packet is 96 bits long and the footer is 32
bits long, leaving 896 bits for the payload. To
break the 3,500 bits of message into packets, you
will need four packets (divide 3,500 by 896). Three
packets will contain 896 bits of data and the
fourth will have 812 bits.
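The arithmetic in this example can be reproduced directly. The sketch below uses only the numbers given above; the variable names are illustrative, and the padding figure assumes the last fixed-length packet is filled out to full size.

```python
import math

PACKET_BITS = 1024    # fixed packet length
HEADER_BITS = 96
FOOTER_BITS = 32
MESSAGE_BITS = 3500   # size of the e-mail

# Room left for data in each packet.
payload_bits = PACKET_BITS - HEADER_BITS - FOOTER_BITS            # 896

# Round up: a partly filled packet still has to be sent.
packet_count = math.ceil(MESSAGE_BITS / payload_bits)             # 4

# Data carried by the final packet, and the padding needed to fill it.
last_payload = MESSAGE_BITS - (packet_count - 1) * payload_bits   # 812
padding = payload_bits - last_payload                             # 84

print(payload_bits, packet_count, last_payload, padding)
```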
Routing Packets
• The computers that decide how best to forward each packet
in a packet-switched network are called ‘routers’.
• The programs on these routers use ‘routing algorithms’ that
call upon their ‘routing tables’ to determine the best path to
send each packet.
• When packets leave a network to travel on the Internet, they
are translated into a standard format by the router.
• These routers and the telecommunication lines connecting
them are referred to as ‘the Internet backbone’.
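A routing-table lookup can be sketched with Python's standard `ipaddress` module. The table contents and router names below are hypothetical, and the rule used here (choose the most specific matching network, known as longest-prefix match) is one common approach assumed for illustration; the slides do not say which routing algorithm a given router uses.

```python
import ipaddress

# Hypothetical routing table: destination network -> next hop.
ROUTING_TABLE = {
    ipaddress.ip_network("138.73.0.0/16"): "Router 2",
    ipaddress.ip_network("138.73.1.0/24"): "Router 3",
    ipaddress.ip_network("0.0.0.0/0"): "Router 4",   # default route
}


def next_hop(destination: str) -> str:
    """Return the next hop for the most specific (longest) matching prefix."""
    address = ipaddress.ip_address(destination)
    matches = [net for net in ROUTING_TABLE if address in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTING_TABLE[best]


print(next_hop("138.73.1.35"))    # Router 3
print(next_hop("138.73.27.246"))  # Router 2
print(next_hop("8.8.8.8"))        # Router 4
```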
How TCP/IP Works
• Upon arrival at their destination, TCP checks the
data for corruption using the checksum carried in the header of
each packet. A bad packet is discarded and not acknowledged,
which causes the sender to re-transmit it.
IP Addresses
• Since computers process numbers more efficiently
and quickly than characters, each machine directly
connected to the Internet is given an IP Address
• An IP address is a 32-bit address made up of four 8-bit numbers (2^8 = 256 possible values each) separated by periods. Each of the
four numbers has a value between 0 and 255
• Normally, an IP address is given in “dotted
decimal” form, such as 138.73.1.35
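The link between the dotted decimal form and the underlying 32-bit number can be shown with a short Python sketch. The function names `dotted_to_int` and `int_to_dotted` are illustrative assumptions; the standard `ipaddress` module is used only to cross-check the result.

```python
import ipaddress


def dotted_to_int(address: str) -> int:
    """Pack four 8-bit numbers (0 to 255) into one 32-bit integer."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("expected four numbers separated by periods")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError(f"{octet} is outside the range 0 to 255")
        value = (value << 8) | octet   # shift left 8 bits, add the next number
    return value


def int_to_dotted(value: int) -> str:
    """Unpack a 32-bit integer back into dotted decimal form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


as_int = dotted_to_int("138.73.1.35")
print(as_int, format(as_int, "032b"))
print(int_to_dotted(as_int))
print(as_int == int(ipaddress.IPv4Address("138.73.1.35")))
```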
IP Addresses
• Example of an IP Address:
http://138.73.1.35
The IP Address of the
MtA Web Server
IP Addresses vs. URLs
• While numeric IP addresses work very well for computers, most
humans find it difficult to remember long patterns of numbers.
• Instead, humans identify computers using Uniform Resource
Locators (URLs), a.k.a. “Web Addresses”.
• When a human types a URL into a browser, the domain name in it is sent
to a Domain Name System (DNS) server, which translates the name
to an IP address understood by computers.
• The DNS acts like a phonebook.
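The phonebook analogy can be sketched as a simple lookup table. This toy Python version is an illustration only: `PHONEBOOK` and `resolve` are hypothetical names, and the single entry is the MtA Web server address used in this chapter. A real lookup queries DNS servers over the network, which Python exposes through `socket.gethostbyname`.

```python
# Toy "phonebook" with one entry taken from this chapter.
PHONEBOOK = {"www.mta.ca": "138.73.1.35"}


def resolve(name: str) -> str:
    """Look up a domain name; names are not case sensitive."""
    try:
        return PHONEBOOK[name.lower()]
    except KeyError:
        raise LookupError(f"no address known for {name!r}") from None


print(resolve("www.mta.ca"))
```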
Anatomy of a URL
http://www.mta.ca/index.html
http   protocol
www    machine name
mta    subdomain
ca     top level domain name
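The same breakdown can be obtained programmatically. The sketch below uses `urlsplit` from Python's standard `urllib.parse` module on the example URL; splitting the host name on periods to get the individual labels is a simplification assumed for illustration.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.mta.ca/index.html")
labels = parts.hostname.split(".")

print(parts.scheme)    # http
print(parts.hostname)  # www.mta.ca
print(parts.path)      # /index.html
print(labels)          # ['www', 'mta', 'ca']
```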
Internet Protocols
• A protocol is a collection of rules for formatting,
ordering, and error-checking data sent across a
network.
• ARPANET was one of the earliest packet-switched networks.
(ARPA = Advanced Research Projects Agency)
• The open architecture of this experimental network
used Network Control Protocol (NCP), which was later
replaced by TCP/IP, the core of the Internet.
Internet Protocols
• This open architecture has four key rules that have
contributed to the success of the Internet.
– Independent networks should not require any internal changes to be
connected to the network.
– Packets that do not arrive at their destinations must be retransmitted
from their source network.
– Router computers act as receive-and-forward devices; they do not
retain information about the packets that they handle.
– No global control exists over the network
Internet Protocols
• The Transmission Control Protocol (TCP) and the
Internet Protocol (IP) are the two protocols that
support the Internet operation (commonly referred to
as TCP/IP).
• The TCP controls the disassembly of a message into
packets before it is transmitted over the Internet and
the reassembly of those packets when they reach
their destination.
• The IP specifies the addressing details for each
packet being transmitted.
IP Addresses
• IP addresses are based on a 32-bit binary number that
allows over 4 billion unique addresses for computers
to connect to the Internet. (138.73.27.246 is Art Miller’s office
machine)
– Ping 138.73.27.246
• IP addresses appear in ‘dotted decimal’ notation (four
numbers separated by periods).
– Each number is in the range 0…255
– Hex notation (aside)
– IP Addresses in decimal form
• IP addresses are assigned by not-for-profit regional
registries (including ARIN, RIPE NCC, and APNIC).
– Organization of IP numbers
IP Addresses
• Approximately two billion IP addresses are either in
use or unavailable for use.
• Private IP addresses are a series of IP numbers that
have been set aside for subnet use and are not
permitted on the Internet.
• IPv6 is a possible solution that uses a 128-bit
number, usually written in hexadecimal, for addresses.
– A number written using 128 bits can be in the range from 0 to 2^128 - 1.
– Since 2^10 is approximately 10^3 = 1,000, it follows that
2^128 = 2^8 x (2^10)^12 ~ 256 x (10^3)^12 = 256 x 10^36 ~ 2.6 x 10^38
– The exact value of 2^128 is about 3.4 x 10^38.
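Python integers are exact, so the sizes of the 32-bit and 128-bit address spaces can be computed directly rather than estimated. The private-address check uses the standard `ipaddress` module; 192.168.0.1 is an example address chosen for this sketch, not one taken from the slides.

```python
import ipaddress

ipv4_total = 2 ** 32    # addresses available with 32 bits
ipv6_total = 2 ** 128   # addresses available with 128 bits

print(f"{ipv4_total:,}")     # 4,294,967,296
print(f"{ipv6_total:.2e}")   # 3.40e+38

# Private addresses are reserved for internal networks.
print(ipaddress.ip_address("192.168.0.1").is_private)   # True
print(ipaddress.ip_address("138.73.1.35").is_private)   # False
```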
Domain Names
• To make the numbering system easier to use, an
alternative addressing method that uses words was
created.
• An address, such as www.course.com, is called a
domain name.
www.mta.ca =
138.73.1.35
• The last part of a domain name (e.g., ‘.com’) is the
most general identifier in the name and is called a
‘top-level domain’ (TLD).
Top-level Domain Names
History: Before the Web
• History of the Internet
• Before the creation of the World Wide Web (when, and by
whom?), a set of technologies already
constituted the Internet:
– telnet
– ftp
– Gopher
• History of the Web
• Early browsers for the Web were not as capable as
those of today
Web Page Delivery
• Hypertext Transfer Protocol (HTTP) is the set of rules
for delivering Web pages over the Internet.
• HTTP uses the client/server model
• A user’s Web browser opens an HTTP session and sends a
request for a Web page to a remote server.
• In response, the server creates an HTTP response message
that is sent back to the client’s Web browser.
• In particular, this same action can be accomplished without a browser
by running, at a command prompt (such as the DOS command prompt),
TELNET www.mta.ca 80
(80 is the port number) and, once connected, typing the case-sensitive
command GET / followed by two carriage returns. This will return
the same thing that is returned by your web browser when you enter
http://www.mta.ca
• The combination of the protocol name and the domain name is
called a uniform resource locator (URL).
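The telnet experiment can also be reproduced with a plain TCP socket. The sketch below is an illustration under stated assumptions: the helper names `build_get_request` and `fetch` are hypothetical, and the request uses the HTTP/1.0 form with a Host header rather than the bare `GET /` typed in telnet. A present-day server may answer with a redirect to HTTPS instead of the page itself.

```python
import socket


def build_get_request(host: str, path: str = "/") -> bytes:
    """Build the text of a minimal HTTP GET request, ended by a blank line."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def fetch(host: str, port: int = 80) -> bytes:
    """Send the request over a TCP connection and return the raw response."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_get_request(host))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:          # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


print(build_get_request("www.mta.ca"))
# print(fetch("www.mta.ca")[:200])   # uncomment to try it; needs network access
```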
SMTP, POP, MIME, and IMAP
• E-mail sent across the Internet must also be formatted
to a common set of rules, otherwise e-mail created by
one company (or Web site) could not be read by a
person at another company.
• Simple Mail Transfer Protocol (SMTP) specifies the
exact format of a mail message and describes how
mail is to be administered at the Internet and network
level.
SMTP, POP, MIME, and IMAP
• An e-mail program running on a user’s computer can
request mail from the company’s main e-mail
computer using the Post Office Protocol (POP).
• Multipurpose Internet Mail Extensions (MIME) allow
the user to attach binary files to e-mail.
• The Internet Message Access Protocol (IMAP) performs
the same basic functions as POP, but includes
additional features.
Markup Languages and the Web
• Web pages are marked with tags to indicate the
display and formatting of page elements.
• SGML is a meta language (a language that can be
used to define other languages) and historically is the
first markup language
• HTML and XML are both derivatives of SGML.
HTML Tags
• An HTML document contains both document text and
elements.
• Tags are codes that are used to define where an HTML
element starts and (if necessary) where it ends.
• In an HTML document, each tag is enclosed in
angle brackets (<>).
• A two-sided tag set has an opening tag and a closing
tag.
Document Tags
• Document tags are those that divide up a Web page into its
basic sections, such as the header information and the part
of the page which contains the displayed text and graphics.
• HTML
– The first and last tags in a document should always be
the HTML tags. These are the tags that tell a Web
browser where the HTML in your document begins and
ends. The absolute most basic of all possible Web
documents is:
– <HTML> </HTML>
– If we load such a page into a Web browser, it will give us
a blank screen, but it is technically a valid Web page.
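How a program sees the opening and closing tags of this minimal document can be sketched with Python's standard `html.parser` module. The class name `TagLister` is an illustrative assumption; note that the parser reports tag names in lower case.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Record each opening and closing tag in the order it is encountered."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))


lister = TagLister()
lister.feed("<HTML> </HTML>")
lister.close()
print(lister.events)   # [('open', 'html'), ('close', 'html')]
```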
HTML Links
• Hypertext documents differ from regular documents
by offering hyperlinks
• Hyperlinks are bits of text that connect the current
document to:
• another location in the same document
• another document on the same host machine
• another document anywhere on the Internet
Internet Connection Options
• The Internet is a set of interconnected networks.
• Large firms that provide Internet access to other
businesses are called Internet Service Providers
(ISPs).
Connectivity Overview
• The most common connection options that ISPs offer
to the Internet are telephone, broadband, leased-line,
and wireless.
• The Internet grew quickly in North America because
local telephone calls were free, as opposed to Europe,
where local calls were charged by the time unit.
• Bandwidth is the amount of data that can travel
through a communication line per unit of time.
Voice-Grade Telephone Connections
• The most common way to connect to an ISP is
through a modem connected to your local telephone
service provider.
• POTS uses existing telephone lines and an analog
modem to provide a bandwidth of 28-56 Kbps.
• DSL protocol offers high speed bandwidth over
standard phone lines.
Broadband Connections
• Connections that operate at speeds of greater than 200
Kbps are called broadband services.
• ADSL uses the DSL (Digital Subscriber Line) protocol to
provide bandwidths (over standard phone lines) of between
100-640 Kbps upstream and 1.5-9 Mbps downstream.
• Cable modems provide transmission speeds between 300
Kbps-1 Mbps from the client to the server and a
downstream rate as high as 10 Mbps.
• Satellite microwave transmissions handle Internet
downloads at speeds around 500 Kbps.
Networks: Local area and Wide area
Connections
• Large firms can connect to an ISP using higher-bandwidth connections that they can lease from
telecommunications carriers.
• A ‘T1’ line operates at 1.544 Mbps
• A ‘T3’ line operates at 44.736 Mbps.
• Ethernet (local) currently operates at 10 Mbps or 100
Mbps, and there are emerging standards for 10 Gbps
and 100 Gbps Ethernet.
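Because bandwidth is data per unit of time, the time to move a file is its size divided by the line speed. The Python sketch below compares some of the speeds quoted in this chapter. It assumes a 1,000,000-byte file, takes 1 Kbps as 1,000 bits per second, and ignores protocol overhead, so actual transfers would take somewhat longer; the name `transfer_seconds` is illustrative.

```python
# Line speeds quoted in this chapter, in bits per second.
LINE_SPEEDS_BPS = {
    "56 Kbps modem": 56_000,
    "T1": 1_544_000,
    "T3": 44_736_000,
    "100 Mbps Ethernet": 100_000_000,
}


def transfer_seconds(size_bytes: int, bits_per_second: int) -> float:
    """Ideal transfer time: 8 bits per byte, no overhead."""
    return size_bytes * 8 / bits_per_second


for name, bps in LINE_SPEEDS_BPS.items():
    print(f"{name}: {transfer_seconds(1_000_000, bps):.2f} s")
```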
Wireless Connections
• Many researchers and business managers see great
potential for wireless networks and the devices
connected to them.
• The term m-commerce (mobile commerce) is used to
describe the kinds of resources people might want to
access using devices that have wireless connections.
Internet2
• Internet2 is an experimental test bed for new
networking technologies that is separate from the
original Internet.
• 200 universities and a number of corporations joined
together to create this network.
• It has achieved bandwidths of 10 Gbps.
• Internet2 promises to be the proving ground for new
technologies and applications of those technologies
that will eventually find their way to the Internet.