Scalable Web Server Clustering Technologies

J. Wei
Background
Growth of the Internet, dynamic content, and an increasing number of
users force us to find faster Web servers.
In the past, we replaced the Web server with a faster machine
(processor).
Drawbacks:
Short-term (Moore's Law: the number of transistors per integrated
circuit doubles every 18 months);
Expensive: we need to replace almost the whole machine.
Solution: Add more processors or machines to the Web server. (They
are commodity hardware and software, so we can keep the
past investment.)
Requirement

No application state is kept on the server, because
application requests may need to be transferred from one server to
another. (Exception: some protocol-specific services, such as
Secure Sockets Layer.)

Transactions must be relatively short and occur
at high frequency. (Short, because we do not use
special hardware or software to process the request. High
frequency, because a large sample space lets us employ
stochastic methods to distribute the requests. Requests arrive
stochastically, from anywhere at any time.)
OSI vs. TCP/IP

Layer 4 Switch
Special Technologies:

Single IP. Through Network Address Translation (NAT),
the cluster servers appear to be a single server with one
IP address.

Higher-layer address screening. The switch can
make forwarding decisions based on the layer 4 content of the
request.
Terminology

L4/2: Layer 4 switching with layer 2 packet forwarding. The
servers share an identical layer 3 (network) address, and each has a
unique MAC address.

L4/3: Layer 4 switching with layer 3 packet forwarding. The
servers offer identical layer 4 (transport) services, and each has a
unique network address.

Layer 7 Switch: Makes forwarding decisions based on the
content of client requests. It can employ L4/2 or L4/3.
Terminology (cont.)

Client-side Transparency: The whole cluster
appears to clients as a single host because of the dispatcher.

Server-side Transparency: Each cluster server runs
standard Web server software designed for a standalone server. It
serves the requests forwarded from the dispatcher just as if
they came directly from the clients.

Performance Index: Connections per second or bits per
second. (Cluster Maximum Utilization)
L4/2 Clustering

The cluster's IP address (A) is shared by the dispatcher and the
servers through the use of primary and secondary IP addresses.
(Background: each host can have several IP addresses.)

The dispatcher's primary IP address is A.

The servers use A as a secondary address.

All packets destined for A are forwarded to the
dispatcher through the use of the Address Resolution Protocol (ARP)
in the nearest gateway/router.
Technology Specification

Load-Sharing Algorithm: Round-robin or other policies.

Session Map: If a packet belongs to an established connection
in the map, forward it to the previously selected server. If it is a
connection initiation (SYN), select a server and save the
connection in the map. A packet that matches no connection and does
not contain a SYN may or may not be discarded.

Backup method: Protects against failure of the dispatcher and
servers.
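
The session-map logic above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `Dispatcher`, keying the map on (client IP, client port), and the choice to drop unmatched non-SYN packets are not taken from the slides.

```python
from itertools import cycle


class Dispatcher:
    """Toy layer 4 dispatcher: round-robin selection plus a session map.

    Assumption: a connection is identified by (client_ip, client_port).
    """

    def __init__(self, servers):
        self._next_server = cycle(servers)  # round-robin iterator
        self.sessions = {}                  # (client_ip, client_port) -> server

    def dispatch(self, client_ip, client_port, syn):
        """Return the server for this packet, or None if it is dropped."""
        key = (client_ip, client_port)
        if key in self.sessions:
            # Established connection: keep using the same server.
            return self.sessions[key]
        if syn:
            # New connection: pick a server and remember the choice.
            server = next(self._next_server)
            self.sessions[key] = server
            return server
        # No session and no SYN: this sketch chooses to drop the packet.
        return None

    def close(self, client_ip, client_port):
        """Forget a finished connection."""
        self.sessions.pop((client_ip, client_port), None)
```

A real dispatcher would also expire stale entries; that is left out here.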
L4/2 Traffic Flow
L4/2 Traffic Flow (cont.)
1. A client sends a request to A.
2. The router sends the request to the dispatcher.
3. Based on the load-sharing algorithm, the dispatcher selects an
   actual server (server 2) to serve the client.
4. Server 2 replies to the client directly.
Advantage vs. Disadvantage
Advantage:


Servers reply to clients directly, which keeps the dispatcher from
becoming a bottleneck.
No need to recalculate the checksum, because the dispatcher
operates at layer 2.
Disadvantage:

The dispatcher must have a direct physical connection to all
servers.
ONE-IP (Bell Labs, 1996)

Load-Sharing Algorithm
Routing-based dispatching: Hash the incoming client's address
to get a number that indicates which server will service the
request.
ONE-IP (cont.)

Broadcast-based dispatching: Each server has a fixed and
disjoint portion of the client address space.
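
A minimal sketch of both ONE-IP dispatching styles follows. The slides do not give the hash function, so the modulo over the numeric client address used here is an assumption, as are the function names.

```python
import ipaddress


def pick_server(client_ip: str, num_servers: int) -> int:
    """Routing-based dispatching: hash the client address to a server index.

    Assumption: the hash is simply the numeric address modulo the
    number of servers.
    """
    return int(ipaddress.ip_address(client_ip)) % num_servers


def accepts(server_index: int, client_ip: str, num_servers: int) -> bool:
    """Broadcast-based dispatching: a server keeps a packet only if the
    client address falls in its own portion of the address space."""
    return pick_server(client_ip, num_servers) == server_index
```

Because the portions are disjoint, exactly one server accepts any given client address. Because the mapping is fixed, a burst of requests from addresses that hash to the same index lands on one server.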
ONE-IP (cont.)

Drawback: Cannot adapt when client requests are
disproportionately distributed.

Backup: Watchdog daemon, watchd
• Dispatcher failure: The backup dispatcher notices the missing
heartbeat of the primary dispatcher and takes over.
• Server failure: Reconfigure the hash table or the address filters on
the other servers.
Network Dispatcher (IBM, 1996)

It powered the 1998 Olympic Games website with up to 2000
requests/s. Experimental results reached 2200 requests/s.
Network Dispatcher (cont.)

Load-Sharing Algorithm: Weighted round-robin algorithm.

Connection Map: A packet that matches no entry in the map is
discarded if it does not contain a SYN or if no server with a
non-zero allocation weight is available.

Backup:

Dispatcher: Secondary dispatcher. In fact, there may be several extra
dispatchers. The secondary dispatcher takes over the IP of the
failed dispatcher.
Server: High Availability Cluster Multi-Processing for AIX
(HACMP) on the IBM SP-2. The dispatchers are reconfigured to
exclude the node; the failed server reboots automatically;
the dispatchers are then reconfigured to include the node again.
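
To make the weighted round-robin policy concrete, here is a small sketch. It is a generic textbook form with integer weights, not IBM's implementation, and the server names and weights are made up.

```python
from itertools import islice


def weighted_round_robin(weights):
    """Yield server names forever, each in proportion to its weight.

    `weights` maps server name -> non-negative integer weight.
    A server with weight 0 is never selected.
    """
    if not any(w > 0 for w in weights.values()):
        raise ValueError("no server with a non-zero weight")
    while True:
        for name, weight in weights.items():
            for _ in range(weight):
                yield name


# Hypothetical pool: "a" gets twice the share of "b"; "c" is drained.
schedule = weighted_round_robin({"a": 2, "b": 1, "c": 0})
print(list(islice(schedule, 6)))  # ['a', 'a', 'b', 'a', 'a', 'b']
```

Setting a weight to 0 excludes a node without deleting it, which matches the exclude-then-include reconfiguration described for failed servers.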
Network Dispatcher (cont.)

Client affinity:

Background: For services such as FTP and SSL, two connections
from the same client must be assigned to the same server.

Connection requests from the same client that arrive before a given
affinity life span expires are sent to the same server.
“The quality of the load sharing may suffer slightly, but the
overall performance of the system improves.”
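
Client affinity can be pictured as a table of client addresses with expiry times. The sketch below is an assumption-laden illustration: the class name, the injectable clock, and the choice not to extend the life span on each new connection are not from the slides.

```python
import time


class AffinityTable:
    """Map each client to one server until its affinity life span expires."""

    def __init__(self, life_span_s, pick_server, clock=time.monotonic):
        self.life_span_s = life_span_s  # affinity life span in seconds
        self.pick_server = pick_server  # fallback load-sharing policy
        self.clock = clock              # injectable for testing
        self.entries = {}               # client_ip -> (server, expires_at)

    def server_for(self, client_ip):
        now = self.clock()
        entry = self.entries.get(client_ip)
        if entry is not None and now < entry[1]:
            # Affinity still valid: reuse the same server.
            return entry[0]
        # Expired or unknown client: choose again and start a new span.
        server = self.pick_server()
        self.entries[client_ip] = (server, now + self.life_span_s)
        return server
```

While an entry is live, the load-sharing policy is bypassed for that client. This is why load sharing "may suffer slightly" even as services that need the same server keep working.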
Others

LSMAC (University of Nebraska-Lincoln)
“Implement L4/2 clustering as a portable user-space application
running on commodity systems”

Alteon ACEdirector (hardware implementation)

ACEdirector 2's primary focus is on load balancing Internet
services such as HTTP and FTP.
Load-Sharing Algorithm: Round-robin and least-connections
load-sharing policies.
Supports SSL service.

L4/3 Clustering
• The dispatcher appears as a single host to clients and as a
  gateway to the servers (IP address = A).
• Each server has its own IP address, which can be globally unique
  or locally unique (IP addresses = B1, B2, … , Bn).
• Load-sharing algorithm: round robin or other algorithms.
• The dispatcher keeps a session map table.
L4/3 Clustering (cont.)
L4/3 Clustering (cont.)
1. A client sends a request with A as the destination.
2. The packet arrives at the dispatcher.
3. Based on the load-sharing algorithm and session table, the
   dispatcher selects a server, rewrites the destination IP address,
   recalculates the checksums, and forwards the packet to the server.
4. The server sends its reply through the dispatcher (the gateway,
   address A).
5. The dispatcher rewrites the source IP address of the reply to A,
   recalculates the checksums, and forwards it to the client.
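
The rewrite-and-recalculate step can be illustrated for the IP header alone. The sketch assumes a plain 20-byte IPv4 header without options and uses the standard Internet checksum (RFC 1071); the function names are made up. The TCP checksum must be recomputed too, since it covers the IP addresses through the pseudo-header, but that is omitted here.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:   # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_destination(ip_header: bytes, new_dst: str) -> bytes:
    """Return a copy of a 20-byte IPv4 header with a new destination
    address and a freshly computed header checksum."""
    header = bytearray(ip_header)
    header[16:20] = socket.inet_aton(new_dst)  # destination address field
    header[10:12] = b"\x00\x00"                # zero checksum before summing
    struct.pack_into("!H", header, 10, internet_checksum(bytes(header)))
    return bytes(header)
```

The same work is done again on the reply, with the source address field (bytes 12 to 15) rewritten instead. This is the per-packet cost that L4/2 forwarding avoids.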
• Disadvantage:
1. The checksums (IP and TCP) are recalculated twice.
2. All traffic flows through the dispatcher. (Bottleneck)
Magicrouter


University of California at Berkeley, 1996
Fast Packet Interposing and kernel modifications
Load-sharing algorithms:
• Round robin
• Random
• Incremental load
Backup:
• Dispatcher: primary + backup model.
• Server: use ARP to map server IP addresses to MAC addresses
  in order to detect server failures.
LocalDirector (Cisco, 1996)
Load-sharing algorithms:
• Least connections: choose the server with the fewest connections.
• Fastest response: choose the server that responds to the request
  first.
• Round-robin: strict RR policy.
Backup:
• Dispatcher: an extra LocalDirector unit linked to the primary
  one with a special failover cable.
• Server: contact servers periodically; when one fails, remove it
  but keep contacting it; when it is up again, add it back to the
  server pool.
Sticky flag: similar to IBM's client affinity.
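
The least-connections policy and the remove/re-add handling of failed servers can be sketched together. This is a generic illustration, not Cisco's implementation; the class and method names are hypothetical.

```python
class ServerPool:
    """Least-connections selection over the servers currently marked up."""

    def __init__(self, servers):
        self.connections = {s: 0 for s in servers}  # active connection counts
        self.down = set()                           # servers that failed a probe

    def mark(self, server, alive):
        """Record the result of a periodic health probe."""
        if alive:
            self.down.discard(server)
        else:
            self.down.add(server)

    def pick(self):
        """Choose the live server with the fewest connections (ties go to
        the first one listed), or None if every server is down."""
        up = [s for s in self.connections if s not in self.down]
        if not up:
            return None
        server = min(up, key=self.connections.get)
        self.connections[server] += 1
        return server

    def release(self, server):
        """Call when a connection to `server` closes."""
        self.connections[server] -= 1
```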
LSNAT



University of Nebraska-Lincoln
User-space implementation
RFC 2391: Load Sharing using IP Network Address Translation
(LSNAT)

Backup:
• Dispatcher: select one server as the new dispatcher. A Distributed
  State Reconstruction Mechanism rebuilds the map of existing
  connections.
• Server: exclude it from the active server pool. When it is up,
  include it again.
L7 Clustering


Makes dispatch decisions based on request content. (Application Layer)
Content-based dispatching
LARD



Locality-Aware Request Distribution, Rice University
It uses a TCP handoff protocol with a modified kernel.
Different servers process different kinds of requests, which makes
it possible to use specialized servers.
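
The idea of sending each kind of request to the same server can be sketched as below. This is a deliberately simplified illustration, not the LARD algorithm itself: it omits any rebalancing of overloaded servers, and the class name and the use of URL paths as targets are assumptions.

```python
class LocalityAwareDispatcher:
    """Send repeat requests for a target to the server that saw it first."""

    def __init__(self, servers):
        self.load = {s: 0 for s in servers}  # active requests per server
        self.assigned = {}                   # target (e.g. URL path) -> server

    def dispatch(self, target):
        server = self.assigned.get(target)
        if server is None:
            # First request for this target: use the least-loaded server
            # (ties go to the first one listed).
            server = min(self.load, key=self.load.get)
            self.assigned[target] = server
        self.load[server] += 1
        return server

    def done(self, server):
        """Call when a request handled by `server` completes."""
        self.load[server] -= 1
```

Keeping a target on one server lets that server's cache hold the content, at the price of the dispatcher having to inspect each request.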
Web Accelerator (IBM)





“The accelerator can now perform content-based routing in which it
makes intelligent decisions about where to route requests based on
the URL.”
L7 based on L4/2;
Web page caching;
The dispatcher serves as a gateway/router;
All traffic flows through the dispatcher.
ArrowPoint




Content-based dispatching policy;
Caching mechanism is similar to Web Accelerator;
Sticky connection;
Hot standby of the dispatcher and server node fail detection
mechanism.
Conclusion
L4/2 Clustering


Bottleneck: the dispatcher's capacity to process incoming requests;
Advantage: Sustainable request rate.
L4/3 Clustering

Bottleneck: recalculation of checksums.
L7 Clustering


Bottleneck: complexity of content-based dispatching algorithm;
Advantage: Localizing request space and caching request results.
Qualitative comparison
Client-based approach:


Advantage: Reduces the load on the Web server by implementing
the routing service on the client side.
Disadvantage: Not generally applicable, and it needs
server-side cooperation.
Dispatcher-based approach:

Advantage: Full control of client requests, which gives good load
balancing. Easy to implement.

Disadvantage: Risk of dispatcher bottleneck.
Qualitative comparison (cont.)
DNS-based approach:


Advantage: High scalability. No risk of bottleneck.
Disadvantages:

Because of address caching mechanisms, sophisticated
algorithms are needed to achieve load balancing.
Fewer than 32 Web servers for each public URL, because of the
UDP packet size limit.
Server-based approach:

Advantage: No risk of a single point of failure or bottleneck.
Disadvantage: Redirection increases latency for
clients.
Qualitative comparison (cont.)
Quantitative comparison

Cluster Maximum Utilization: At a given instant, the
highest utilization among all servers in the cluster.

Cumulative Frequency:
Exponential Distribution:

Heavy-tailed Distribution:

Quantitative comparison (cont.)

Exponential Distribution Model
Quantitative comparison (cont.)

Dispatcher-based: utilization is below 0.8 almost all of the
time

DNS-based (adaptive TTL): utilization below 0.9

DNS-based (constant TTL): overloaded 20% of the time

Server-based: utilization below 0.9

DNS-RR: overloaded more than 70% of the time
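
Figures like these can be read as the fraction of time the cluster maximum utilization stays below a threshold. The sketch below shows that computation. It assumes the cumulative frequency is measured over sampled instants, and the sample values are invented for illustration, not taken from the study.

```python
def cluster_max_utilization(utilizations):
    """Highest utilization among all servers at one instant."""
    return max(utilizations)


def cumulative_frequency(samples, threshold):
    """Fraction of sampled instants at which the cluster maximum
    utilization is below `threshold`."""
    maxima = [cluster_max_utilization(s) for s in samples]
    return sum(m < threshold for m in maxima) / len(maxima)


# Made-up samples: each inner list holds per-server utilization
# at one instant.
samples = [[0.2, 0.5], [0.9, 0.4], [0.7, 0.79], [1.0, 0.1]]
print(cumulative_frequency(samples, 0.8))  # 0.5
```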
Quantitative comparison (cont.)

Heavy-tailed Distribution Model
Quantitative comparison (cont.)

Dispatcher-based: works well

DNS-based (adaptive TTL): works well, without risk of
bottleneck

Server-based: works well until the load exceeds 0.9, then
performs poorly under high load
Conclusion


The bottleneck will be network throughput.
Making use of wide area network bandwidth can give much
better performance.
