Services in CINEMA


Reliable and Scalable Internet Telephony
Kundan Singh and Henning Schulzrinne
Internet Real Time Lab – Internal Talk
Sept 24, 2004
Telephone reliability
(PSTN: Public Switched Telephone Network)
[Figure: PSTN with a signaling network (SS7) and a "bearer" network]
    Local telephone switch (class 5 switch): 10,000 customers, 20,000 calls/hour
    Regional telephone switch (class 4 switch): 100,000 customers, 150,000 calls/hour
    Signaling router (STP): 1 million customers, 1.5 million calls/hour
    Database (SCP) for freephone, calling card, …: 10 million customers, 2 million lookups/hour
    Telephone switch (SSP)
2
Internet telephony
(SIP: Session Initiation Protocol)
[Figure: SIP call setup between the domains yahoo.com and example.com. Labels: REGISTER, INVITE, DB, DNS; addresses [email protected], 129.1.2.3, [email protected], 192.1.2.4]
3
SIP network architecture
Scalability requirement depends on role
[Figure: SIP network architecture. Labels: Cybercafe, ISP, IP network, IP phones, SIP/MGC, SIP/PSTN, Carrier network, GW, MG, IP, PSTN, PBX, PSTN phones, T1 PRI/BRI]
4
Reliability and scalability
for call routing, registration, conferencing, voicemails

Requirements
    Reliable
        Mean Time Between Failures (MTBF), Mean Time To Recover (MTTR)
    Scalable
        Registration rate, call rate, #requests/s
Proposed solutions
    Server redundancy
        Apply existing web-redundancy designs
        Evaluate quantitatively (future work)
    Peer-to-peer
        Novel P2P-SIP architecture
        Evaluate quantitatively (future work)
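The two reliability metrics above are commonly combined into a steady-state availability figure, MTBF / (MTBF + MTTR). The following is a minimal sketch of that arithmetic, not part of the talk; the function names and the assumption that redundant servers fail independently are illustrative only.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of one server: fraction of time it is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_availability(single: float, servers: int) -> float:
    """Availability of `servers` replicas, ASSUMING independent failures
    and that one working replica is enough to serve requests."""
    return 1.0 - (1.0 - single) ** servers


if __name__ == "__main__":
    a = availability(mtbf_hours=999, mttr_hours=1)
    print(f"one server: {a:.4f}")
    print(f"two servers: {redundant_availability(a, 2):.6f}")
```

Under these assumptions, adding a second server turns each "nine" of availability into roughly two, which is the motivation for the server-redundancy designs that follow.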
5
Server redundancy
The problem: failure or overload
INVITE
REGISTER
6
Server redundancy
Replicate registration or search on call
INVITE
REGISTER
INVITE
REGISTER
7
Server redundancy
Known techniques
    Client-based
        Cisco phones: primary and backup proxy
    DNS
        NAPTR, SRV
    IP address takeover
    Database redundancy
    ...
8
High availability
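Failover in CINEMA
[Figure: two proxy hosts, each with web scripts and a database; the databases replicate to each other]
    P1: phone.cs.columbia.edu, database D1 (master/slave)
    P2: sip2.cs.columbia.edu, database D2 (slave/master)
DNS SRV records (_sip._udp)
    SRV 0 0 5060 phone.cs.columbia.edu
    SRV 1 0 5060 sip2.cs.columbia.edu
Client configuration (REGISTER)
    proxy1 = phone.cs
    backup = sip2.cs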
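The SRV records on this slide give phone.cs.columbia.edu priority 0 and sip2.cs.columbia.edu priority 1, so a client should try the lower-priority value first and fall back to the other. A minimal sketch of that selection follows; the `Srv` type, the function names, and the `is_up` callback are hypothetical, and weight-based selection among equal priorities is deliberately left out.

```python
from typing import Callable, List, NamedTuple, Optional


class Srv(NamedTuple):
    """One DNS SRV record: priority, weight, port, target host."""
    priority: int
    weight: int
    port: int
    target: str


def failover_order(records: List[Srv]) -> List[str]:
    """Hosts in the order a client should try them (lowest priority first).
    Weights are ignored in this sketch."""
    return [r.target for r in sorted(records, key=lambda r: r.priority)]


def pick_server(records: List[Srv], is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first reachable host, or None if every host is down."""
    for target in failover_order(records):
        if is_up(target):
            return target
    return None


if __name__ == "__main__":
    records = [
        Srv(0, 0, 5060, "phone.cs.columbia.edu"),
        Srv(1, 0, 5060, "sip2.cs.columbia.edu"),
    ]
    # Simulate the primary being unreachable.
    print(pick_server(records, lambda host: host != "phone.cs.columbia.edu"))
```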
9
High availability
Time to recover
    Client re-sends INVITE to P2
        Immediately on ICMP error
        Or after 10 s otherwise
    sipd has in-memory cache
        Refresh registration much before expiry
        Registrations are additive
    Measurement of recovery time
    Optimal #servers
10
Scalability
Load sharing: redundant proxies and databases
[Figure: proxies P1, P2, P3 share databases D1 and D2; REGISTER and INVITE requests may arrive at any proxy]
    REGISTER: write to D1 & D2
    INVITE: read from D1 or D2
    Database write/synchronization traffic becomes bottleneck
11
Scalability
Load sharing: divide the user space
[Figure: P1 with D1 serves a-h, P2 with D2 serves i-q, P3 with D3 serves r-z]
    Proxy and database on the same host
    Stateless proxy can become overloaded
    Hashing
        Static vs dynamic
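Dividing the user space can be sketched as a function from user name to server. The ranges a-h, i-q, and r-z and the server names P1 to P3 come from the slide; the hash-based variant is only one possible reading of the "Hashing" bullet, and the choice of SHA-1 and the function names are assumptions.

```python
import hashlib
from typing import List

# Alphabetical ranges as drawn on the slide.
RANGES = [("a", "h", "P1"), ("i", "q", "P2"), ("r", "z", "P3")]


def static_partition(user: str) -> str:
    """Pick the server whose letter range covers the user's first letter."""
    first = user[0].lower()
    for low, high, server in RANGES:
        if low <= first <= high:
            return server
    raise ValueError(f"no range covers user {user!r}")


def hashed_partition(user: str, servers: List[str]) -> str:
    """Pick a server by hashing the user name (SHA-1 is an assumption).
    This spreads users evenly even when names cluster on a few letters."""
    digest = hashlib.sha1(user.lower().encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


if __name__ == "__main__":
    for name in ("alice", "kundan", "zoe"):
        print(name, static_partition(name), hashed_partition(name, ["P1", "P2", "P3"]))
```

The static ranges are easy to configure in DNS but can be unbalanced; a hash gives a more even split at the cost of remapping users when the number of servers changes.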
12
Scalability
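Comparison of the two designs
[Figure: left, proxies P1, P2, P3 sharing replicated databases D1 and D2; right, partitioned design with P1 (a-h), P2 (i-q), P3 (r-z), each with its own database]
Total time per DB
    Replicated databases (write to all, read from one): $\left(\frac{tr}{D} + 1\right) T N = \frac{A}{D} + B$
    Partitioned user space: $\left(\frac{tr + 1}{D}\right) T N = \frac{A}{D} + \frac{B}{D}$
    where $A = t r T N$ and $B = T N$
Symbols
    $D$ = number of database servers
    $N$ = number of writes (REGISTER)
    $r$ = #reads/#writes = (INV+REG)/REG ≈ 2
    $T$ = write latency
    $t$ = read latency/write latency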
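The two expressions for total time per database can be evaluated directly to see how each design behaves as servers are added. This is a small sketch using the slide's symbols; the function names and the sample values of N, T, and t are assumptions chosen only for illustration.

```python
def time_replicated(D: int, N: int, r: float, T: float, t: float) -> float:
    """Every database takes all N writes; reads are spread over D databases."""
    return ((t * r) / D + 1) * T * N


def time_partitioned(D: int, N: int, r: float, T: float, t: float) -> float:
    """Each database takes 1/D of both the writes and the reads."""
    return ((t * r + 1) / D) * T * N


if __name__ == "__main__":
    # Assumed sample values: 1000 REGISTERs, r = 2, unit write latency,
    # and reads costing half as much as writes.
    N, r, T, t = 1000, 2, 1.0, 0.5
    for D in (1, 2, 4, 8):
        print(D, time_replicated(D, N, r, T, t), time_partitioned(D, N, r, T, t))
```

With replication, the per-database time never drops below the write term B = TN no matter how many servers are added, while partitioning divides both terms by D.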
13
Reliability and scalability
Two stage architecture for CINEMA
First stage (stateless proxies): s1, s2, s3
    example.com
    _sip._udp
    SRV 0 0 s1.example.com
    SRV 0 0 s2.example.com
    SRV 0 0 s3.example.com
    SRV 1 0 ex.backup.com
Second stage: one master/slave pair per user group
    a*@example.com: a1 (master), a2 (slave)
        a.example.com
        _sip._udp
        SRV 0 0 a1.example.com
        SRV 1 0 a2.example.com
    b*@example.com: b1 (master), b2 (slave)
        b.example.com
        _sip._udp
        SRV 0 0 b1.example.com
        SRV 1 0 b2.example.com
Example addresses: sip:[email protected], sip:[email protected]
Request-rate = f(#stateless, #groups)
Bottleneck: CPU, memory, bandwidth?
Failover latency: ?
14
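In the two stage architecture, a first-stage proxy only has to map a user to a group domain such as a.example.com; the SRV records of that domain then name the master and the slave. The sketch below illustrates that routing step under stated assumptions: the user names alice and bob, the function names, and the `is_up` callback are hypothetical, and the SRV table simply copies the host names from the slide.

```python
from typing import Callable, Optional

# (priority, host) pairs per second-stage domain, copied from the slide.
SRV = {
    "a.example.com": [(0, "a1.example.com"), (1, "a2.example.com")],
    "b.example.com": [(0, "b1.example.com"), (1, "b2.example.com")],
}


def group_domain(sip_uri: str, domain: str = "example.com") -> str:
    """First stage: derive the group domain from the user's first letter."""
    address = sip_uri.split(":", 1)[-1]   # drop the "sip:" scheme if present
    user = address.split("@", 1)[0]
    return f"{user[0].lower()}.{domain}"


def second_stage_host(sip_uri: str, is_up: Callable[[str], bool]) -> Optional[str]:
    """Second stage: try the group's hosts in SRV priority order."""
    for _priority, host in sorted(SRV[group_domain(sip_uri)]):
        if is_up(host):
            return host
    return None


if __name__ == "__main__":
    print(group_domain("sip:alice@example.com"))
    # Simulate b1 being down so the request falls over to b2.
    print(second_stage_host("sip:bob@example.com", lambda h: h != "b1.example.com"))
```

Because the first-stage mapping needs no per-call state, any of s1, s2, or s3 can handle any request, which is what lets DNS spread load across them.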
Server-based vs peer-to-peer
[Figure: server-based, clients (C) around a central server (S); peer-to-peer, peers (P) connected to one another]
Server-based
    Cost: maintenance, configuration
    Central points of failure
    Controlled infrastructure (e.g., DNS)
Peer-to-peer
    Robust: no central dependency
    Self organizing, no configuration
    Scalability?
15
Related work: Skype
From the KaZaA community
[Figure: overlay of peers (P), some acting as super nodes]
    Host cache of some super nodes
    Bootstrap IP addresses
    Auto-detect NAT/firewall settings
        STUN and TURN
    Protocol among super nodes: ??
    Allows searching a user (e.g., kun*)
    History of known buddies
    All communication is encrypted
    Promote to super node
        Based on availability, capacity
    Conferencing
16
We propose: P2P-SIP


Unlike server-based SIP architecture
Unlike proprietary Skype architecture
    Robust and efficient lookup using DHT
    Interoperability
        DHT algorithm uses SIP communication
    Hybrid architecture
        Lookup in SIP+P2P
Unlike file-sharing applications
    Data storage, caching, delay, reliability
Disadvantages
    Lookup delay and security
17
P2P-SIP
Background: DHT (Chord)
[Figure: identifier circle showing nodes and keys; identifiers shown: 1, 8, 10, 14, 21, 24, 30, 32, 38, 42, 47, 54, 58]
Finger table of node 8
    8+1 = 9     14
    8+2 = 10    14
    8+4 = 12    14
    8+8 = 16    21
    8+16 = 24   32
    8+32 = 40   42
Identifier circle
Keys assigned to successor
Evenly distributed keys and nodes
Finger table: log N
    i-th finger points to the first node that succeeds n by at least 2^(i-1)
Stabilization for join/leave
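The finger-table rule for this Chord slide (the i-th finger of node n is the first node that succeeds n by at least 2^(i-1)) can be checked with a few lines of code. This is a sketch, not the talk's implementation: the node set below is an assumption read off the figure (10, 24, and 30 are treated as keys rather than nodes), and the 6-bit identifier space is inferred from the largest offset, 8+32.

```python
from bisect import bisect_left
from typing import List


def successor(nodes: List[int], ident: int, m: int) -> int:
    """First node at or after `ident` on a ring of 2**m identifiers."""
    ring = sorted(nodes)
    index = bisect_left(ring, ident % (2 ** m))
    return ring[index % len(ring)]   # wrap around past the largest node


def finger_table(nodes: List[int], n: int, m: int) -> List[int]:
    """i-th finger (i = 1..m) is the successor of n + 2**(i-1)."""
    return [successor(nodes, n + 2 ** (i - 1), m) for i in range(1, m + 1)]


if __name__ == "__main__":
    # Assumed node set for the figure; identifiers are 6 bits wide.
    nodes = [1, 8, 14, 21, 32, 38, 42, 47, 54, 58]
    print(finger_table(nodes, 8, 6))   # fingers of node 8
    print(successor(nodes, 30, 6))     # node responsible for key 30
```

Each node stores only m = log2(ring size) fingers, and every hop at least halves the remaining distance to the key, which is where the O(log N) lookup cost comes from.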
18
P2P-SIP
Design Alternatives
[Figure: three overlays, one of servers only, one of all clients with Route(d46a1c) forwarded between nodes with hexadecimal identifiers, and one of super-nodes with attached clients]
    Use DHT in server farm
    Use DHT for all clients, but some are resource limited
    Use DHT among super-nodes
        1. Hierarchy
        2. Dynamically adapt
19
P2P-SIP
Node architecture: registrar, proxy, user agent
[Figure: node block diagram. Labels: User interface (buddy list, etc.); On startup; Discover; Peer found/Detect NAT; ICE; Join; Multicast REG; On reset; Signout, transfer; Leave; Signup, Find buddies; IM, call; User location; Find; DHT (Chord); REG; SIP; REG, INVITE, MESSAGE; Audio devices; Codecs; RTP/RTCP]
DHT communication using SIP REGISTER
    Known node: sip:[email protected]
    Unknown node: sip:[email protected]
    User: sip:[email protected]
20
P2P-SIP
Node Startup
[Figure: a node and the sipd registrar (with DB) at columbia.edu, plus a DHT ring. Labels: [email protected]; REGISTER; Detect peers; REGISTER alice=42; REGISTER bob=12; nodes 12, 14, 32, 42, 58]
Startup
    REGISTER with SIP registrar
    Discover peers: multicast REGISTER
    Join DHT using node-key=Hash(ip)
    REGISTER with DHT using user-key=Hash([email protected])
Dialing out
    Call, instant message, etc.
        INVITE sip:[email protected]
        MESSAGE sip:[email protected]
    Last seen, SIP NAPTR/SRV, DHT
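The startup steps derive a node key from the node's IP address and a user key from the user's address, both by hashing. A minimal sketch of such a key function follows; the slide only says Hash(...), so the use of SHA-1, the 6-bit identifier space, the function name, and the sample IP address and user are all assumptions.

```python
import hashlib


def dht_key(value: str, m: int = 6) -> int:
    """Map a string (IP address or user address) onto a ring of 2**m
    identifiers. SHA-1 is an assumption; the slide only says Hash()."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


if __name__ == "__main__":
    node_key = dht_key("192.0.2.10")          # hypothetical node IP address
    user_key = dht_key("alice@example.com")   # hypothetical user address
    print(f"node-key={node_key} user-key={user_key}")
```

Using the same hash for nodes and users places both on one identifier circle, so a user's registration is stored at the node whose key succeeds the user key.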
21
P2P-SIP
Node Leaves
[Figure: node 42 leaving the DHT. Labels: REGISTER key=42; REGISTER; OPTIONS]
Graceful leave
    Un-REGISTER
    Transfer registrations
Failure
    Attached nodes detect and re-REGISTER
    New REGISTER goes to new super-nodes
    Super-nodes adjust DHT accordingly
22
P2P-SIP
Implementation
[Figure: test DHT; node identifiers shown: 1, 9, 11, 15, 19, 25, 26, 29, 30, 31]
sippeer: C++, Unix (Linux), Chord
    Nodes join and form the DHT
    Node failure is detected and the DHT is updated
    Registrations are transferred on node shutdown
    Co-located sipc can use the sippeer service
23
P2P-SIP
Evaluation
#super-nodes needed depends on
    Registration refresh rate, replication
    Join/leave rate, uptime
    Call arrival rate
    CPU, memory, bandwidth limits
Other metrics
    Call setup latency
    Recovery time after super-node failure
24
P2P-SIP
Advanced services and open issues
Offline messages
    INVITE or MESSAGE fails => Responsible node stores voicemail, instant message.
Conferencing
    Mixer, full mesh, multicast
Open issues
    P2P reputation system
    Motivation to become super node
    Security (SPAM, DOS, spy, …)
    ...
25
Server-based vs peer-to-peer
Reliability, failover latency
    Server-based: DNS-based. Depends on client retry timeout, DB replication latency, registration refresh interval.
    Peer-to-peer: DHT self organization and periodic registration refresh. Depends on client timeout, registration refresh interval.
Scalability, number of users
    Server-based: Depends on number of servers in the two stages.
    Peer-to-peer: Depends on refresh rate, join/leave rate, uptime.
Call setup latency
    Server-based: One or two steps.
    Peer-to-peer: O(log(N)) steps.
Security
    Server-based: TLS, digest authentication, S/MIME.
    Peer-to-peer: Additionally needs a reputation system, working around spy nodes.
Maintenance, configuration
    Server-based: Administrator: DNS, database, middle-box.
    Peer-to-peer: Automatic: one time bootstrap node addresses.
PSTN interoperability
    Server-based: Gateways, TRIP, ENUM.
    Peer-to-peer: Interact with server-based infrastructure or co-locate peer node with the gateway.
26
Summary

Motivation
    PSTN is reliable and scalable
    Can IP telephony do better?
Server-based
    DNS, stateless, DB replication, two stage
Peer-to-peer
    SIP, DHT, soft state, self organizing
27
Beyond proxy/registrar
CINEMA: Columbia InterNet Extensible Multimedia Architecture
[Figure: CINEMA servers and the systems they connect to]
CINEMA servers
    sipd: proxy, redirect, registrar server
    sipconf: conference server
    sipum: unified messaging
    rtspd: media server
    siph323: SIP-H.323 translator
    SQL database
    Web server: web based configuration (cgi)
    vxml
Connected systems
    SIP/PSTN gateway
    Department PBX; internal telephone, Extn: 7040; 713x
    Telephone switch; local/long distance, 1-212-5551212
    PSTN
    RTSP clients: Quicktime
    H.323: NetMeeting
Protocols shown: SIP, RTSP, VXML, H.323
28
Communication to collaboration

Synchronous (tightly coupled)
    Video conference, IM, screen sharing, …
Asynchronous (loosely coupled)
    File sharing, message board, …
    Messaging and notifications
Personalized view
    Per-user calendar, access control, address book
Goal: provide personalized access, alternate between synchronous and asynchronous communication, and access from different devices and clients.
29