I/O Multiplexing
Computer Network Programming
1
Input from multiple sources
[Figure: a process connected to a file, the keyboard, other terminal devices, the screen, and sockets]
A process may have multiple sources of input and may be sending
output to multiple destinations.
I/O multiplexing is used to multiplex the input from multiple
sources into a single process.
2
Where do we use it?
– When a client handles multiple descriptors
    – stdin, a network socket…
– When a client handles multiple sockets at the same time
    – web clients
– When a TCP server handles a listening socket and connected sockets at the same time
– When a server handles both TCP and UDP
– When a server handles multiple services: inetd, for example
3
I/O Models
– Blocking I/O
– Nonblocking I/O
– I/O multiplexing (select() and poll())
– Signal-driven I/O (SIGIO signal)
– Asynchronous I/O (aio_ functions)
• Two phases for an input operation:
– waiting for the data to be ready in the kernel
– copying the data from the kernel to the process
4
Blocking I/O Model
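Assume we want to read a UDP datagram from a UDP socket with the recvfrom function (system call).
[Figure: blocking I/O timeline between application and kernel]
– The application calls recvfrom (system call); no datagram is ready.
– The kernel waits for data; the process blocks in the call to recvfrom.
– A datagram becomes ready; the kernel copies the data from kernel to user.
– The copy completes, recvfrom returns OK, and the application processes the datagram.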
5
Non-blocking I/O Model
[Figure: nonblocking I/O timeline between application and kernel]
– The application calls recvfrom (system call); no datagram is ready, so the kernel returns EWOULDBLOCK immediately.
– The process repeatedly calls recvfrom, waiting for an OK return (polling); each call returns EWOULDBLOCK while the kernel waits for data.
– A datagram becomes ready; the next recvfrom copies the data from kernel to user.
– The copy completes, recvfrom returns OK, and the application processes the datagram.
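The polling pattern described for the nonblocking model can be sketched in C as follows. This is a minimal sketch, not code from the slides: the helper names set_nonblocking, try_recv and demo_polling and the NOT_READY sentinel are hypothetical, and a local AF_UNIX datagram socket pair stands in for a UDP socket so that the example needs no network.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define NOT_READY (-2)   /* hypothetical sentinel: the call would have blocked */

/* Put a descriptor into nonblocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* One polling attempt: bytes read on success, NOT_READY if no datagram is
 * queued yet (EWOULDBLOCK/EAGAIN), -1 on any other error. */
ssize_t try_recv(int fd, void *buf, size_t len)
{
    ssize_t n = recvfrom(fd, buf, len, 0, NULL, NULL);
    if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
        return NOT_READY;
    return n;
}

/* Demo: the first attempt finds nothing, the attempt after a send succeeds.
 * Returns 0 if both calls behaved that way, 1 if not, -1 on setup failure. */
int demo_polling(void)
{
    int sv[2];
    char buf[16];

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    if (set_nonblocking(sv[0]) < 0)
        return -1;
    ssize_t first = try_recv(sv[0], buf, sizeof buf);    /* nothing sent yet */
    if (send(sv[1], "hi", 2, 0) != 2)
        return -1;
    ssize_t second = try_recv(sv[0], buf, sizeof buf);   /* datagram queued */
    close(sv[0]);
    close(sv[1]);
    return (first == NOT_READY && second == 2) ? 0 : 1;
}
```

A real polling loop would call try_recv repeatedly, which wastes CPU time; that cost is what motivates the I/O multiplexing model.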
6
I/O Multiplexing Model
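[Figure: I/O multiplexing timeline between application and kernel]
– The application calls select (system call); no datagram is ready. The process blocks in the call to select, waiting for one of possibly many sockets to become readable, while the kernel waits for data.
– A datagram becomes ready; select returns readable.
– The application calls recvfrom (system call); the process blocks while the data is copied from kernel to user, into the application buffer.
– The copy completes, recvfrom returns OK, and the application processes the datagram.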
7
Signal driven I/O
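[Figure: signal-driven I/O timeline between application and kernel]
– The application establishes a SIGIO signal handler (sigaction system call); the call returns immediately.
– The process continues executing while the kernel waits for data.
– A datagram becomes ready; the kernel delivers SIGIO to the process.
– The signal handler calls recvfrom (system call); the process blocks while the data is copied from kernel to user, into the application buffer.
– The copy completes, recvfrom returns OK, and the application processes the datagram.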
8
Asynchronous I/O Model
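[Figure: asynchronous I/O timeline between application and kernel]
– The application calls aio_read (system call); no datagram is ready, and the call returns immediately.
– The process continues executing while the kernel waits for data.
– A datagram becomes ready; the kernel copies the data from kernel to user.
– The copy completes; the kernel delivers the signal specified in aio_read.
– The signal handler processes the datagram.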
9
Comparison of I/O Models
Model              Phase 1: wait for data          Phase 2: copy data from kernel to user
Blocking           initiate, blocked               blocked, complete
Nonblocking        check, check, check, …          blocked, complete
I/O multiplexing   check, blocked, ready           initiate, blocked, complete
Signal-driven I/O  notification                    initiate, blocked, complete
Asynchronous I/O   initiate                        notification
10
Synchronous versus asynchronous I/O
A synchronous I/O operation causes the requesting process to be blocked until that I/O operation completes (in all four models below, the process blocks at least while the data is copied from the kernel to the process):
– Blocking I/O
– Nonblocking I/O
– I/O multiplexing
– Signal-driven I/O
An asynchronous I/O operation does not cause the requesting process to be blocked:
– Asynchronous I/O
11
select() function
• The process instructs the kernel to wait for any one of multiple events to occur, and to wake up the process only when one or more of these events occurs or when a specified amount of time has passed.
select returns when:
» a descriptor is ready for reading
» a descriptor is ready for writing
» a descriptor has an exception condition pending
» a specified amount of time has passed
int select(int maxfdp1, fd_set *readset, fd_set *writeset,
           fd_set *exceptset, const struct timeval *timeout);
returns: positive count of ready descriptors,
0 on timeout, -1 on error.
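The three return cases of select (ready, timeout, error) can be sketched as below. The helper names wait_readable and demo_select are hypothetical, and a pipe stands in for a socket; this is an illustration under those assumptions, not code from the slides.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long timeout_ms)
{
    fd_set rset;
    struct timeval tv;

    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* first argument is the highest descriptor number plus one */
    int n = select(fd + 1, &rset, NULL, NULL, &tv);
    if (n <= 0)
        return n;                       /* 0: timeout, -1: error */
    return FD_ISSET(fd, &rset) ? 1 : 0;
}

/* Demo on a pipe: an empty pipe times out, a pipe holding one byte is ready.
 * Returns 0 if both calls behaved that way, 1 if not, -1 on setup failure. */
int demo_select(void)
{
    int p[2];

    if (pipe(p) < 0)
        return -1;
    int before = wait_readable(p[0], 50);   /* nothing written yet */
    if (write(p[1], "x", 1) != 1)
        return -1;
    int after = wait_readable(p[0], 50);    /* one byte is queued */
    close(p[0]);
    close(p[1]);
    return (before == 0 && after == 1) ? 0 : 1;
}
```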
12
select()
struct timeval {
    long tv_sec;     /* seconds */
    long tv_usec;    /* microseconds */
};
Wait forever: timeout = NULL;
Wait up to a fixed amount of time: the interval specified by timeout.
Do not wait at all (polling):
    timeout->tv_sec = 0;
    timeout->tv_usec = 0;
void FD_ZERO(fd_set *fdset);           /* clear all bits */
void FD_SET(int fd, fd_set *fdset);    /* turn on the bit for fd */
void FD_CLR(int fd, fd_set *fdset);    /* turn off the bit for fd */
int FD_ISSET(int fd, fd_set *fdset);   /* is the bit for fd on? */
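As an illustration of the four macros and of the zero timeout, the sketch below polls two descriptors without waiting. The names poll_two and demo_fd_set are hypothetical, and pipes are used in place of sockets.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Poll two descriptors without waiting. Returns a bit mask of the readable
 * ones (bit 0 = fd_a, bit 1 = fd_b), or -1 on error. */
int poll_two(int fd_a, int fd_b)
{
    fd_set rset;
    struct timeval tv;
    int maxfdp1 = (fd_a > fd_b ? fd_a : fd_b) + 1;

    tv.tv_sec = 0;                 /* zero timeout: do not wait at all */
    tv.tv_usec = 0;
    FD_ZERO(&rset);                /* clear all bits */
    FD_SET(fd_a, &rset);           /* turn on the bits we are interested in */
    FD_SET(fd_b, &rset);
    if (select(maxfdp1, &rset, NULL, NULL, &tv) < 0)
        return -1;
    /* on return, only the bits of ready descriptors are still on */
    return (FD_ISSET(fd_a, &rset) ? 1 : 0) | (FD_ISSET(fd_b, &rset) ? 2 : 0);
}

/* Returns 0 if the macros and the poll behaved as expected, 1 if not,
 * -1 on setup failure. */
int demo_fd_set(void)
{
    int a[2], b[2];
    fd_set s;

    FD_ZERO(&s);
    FD_SET(3, &s);
    FD_SET(5, &s);
    FD_CLR(3, &s);                 /* turn bit 3 back off */
    if (FD_ISSET(3, &s) || !FD_ISSET(5, &s))
        return 1;

    if (pipe(a) < 0 || pipe(b) < 0)
        return -1;
    if (write(b[1], "x", 1) != 1)  /* only the second pipe has data */
        return -1;
    int mask = poll_two(a[0], b[0]);
    close(a[0]); close(a[1]);
    close(b[0]); close(b[1]);
    return mask == 2 ? 0 : 1;
}
```

Because select modifies the sets in place, a loop must rebuild the set (or copy it from a saved master set) before every call.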
13
When is the descriptor ready for read?
• The number of bytes in the socket receive buffer is greater than or equal to the current size of the low-water mark for the socket receive buffer. The low-water mark defaults to 1 and can be set using the SO_RCVLOWAT socket option.
• The read-half of the connection is closed (TCP received a FIN). A read on the socket returns zero (EOF) without blocking.
• The socket is a listening socket and the number of completed connections for the socket is non-zero.
• A socket error is pending.
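The end-of-file case can be observed with a small experiment. The sketch below is an assumption-laden stand-in: it uses a local AF_UNIX stream socket pair instead of a TCP connection, so closing one end plays the role of the peer's FIN, and the function name demo_eof_readable is hypothetical.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if select reports the socket readable after the peer closes and
 * the following read returns 0 (EOF); 1 otherwise; -1 on setup failure. */
int demo_eof_readable(void)
{
    int sv[2];
    char buf[8];
    fd_set rset;
    struct timeval tv;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    close(sv[1]);                  /* peer closes: our read-half sees EOF */

    tv.tv_sec = 1;                 /* generous upper bound for the wait */
    tv.tv_usec = 0;
    FD_ZERO(&rset);
    FD_SET(sv[0], &rset);
    int ready = select(sv[0] + 1, &rset, NULL, NULL, &tv);

    /* the descriptor is "readable" although no data will ever arrive */
    ssize_t n = (ready == 1) ? read(sv[0], buf, sizeof buf) : -1;
    close(sv[0]);
    return (ready == 1 && n == 0) ? 0 : 1;
}
```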
14
When is the descriptor ready for write?
• The number of bytes of available space in the socket send buffer is greater than or equal to the current size of the low-water mark for the socket send buffer
» for TCP, the socket must also be connected
• The write-half of the TCP connection is closed. A write to the socket generates the SIGPIPE signal.
• A socket error is pending.
15
Example str_cli function
[Figure: the client calls select for readability on either standard input or the socket]
– stdin: data or EOF
– socket: data (TCP data segment), EOF (TCP FIN), or error (TCP RST)
The TCP layer can receive a data segment, a FIN segment, or an RST segment from the peer.
16
str_cli function at the client
void str_cli(FILE *fp, int sockfd)
{
    int     maxfdp1;
    fd_set  rset;
    char    sendline[MAXLINE], recvline[MAXLINE];

    FD_ZERO(&rset);
    for ( ; ; ) {
        FD_SET(fileno(fp), &rset);
        FD_SET(sockfd, &rset);
        maxfdp1 = max(fileno(fp), sockfd) + 1;
        Select(maxfdp1, &rset, NULL, NULL, NULL);

        if (FD_ISSET(sockfd, &rset)) {      /* socket is readable */
            if (Readline(sockfd, recvline, MAXLINE) == 0)
                err_quit("str_cli: server terminated prematurely");
            Fputs(recvline, stdout);
        }

        if (FD_ISSET(fileno(fp), &rset)) {  /* input is readable */
            if (Fgets(sendline, MAXLINE, fp) == NULL)
                return;                     /* all done */
            Writen(sockfd, sendline, strlen(sendline));
        }
    }
}
17
shutdown function
• close()
– decrements the reference count of a socket and closes it only when the count reaches zero
– terminates both directions of data transfer
• shutdown()
– can initiate the close immediately, regardless of the reference count
– can close only the read-half or the write-half of a connection
int shutdown(int sockfd, int howto);
howto:
SHUT_RD: read-half of the connection is closed
SHUT_WR: write-half of the connection is closed
SHUT_RDWR: read-half and write-half of the connection are closed
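A half-close with SHUT_WR can be sketched as follows. As an assumption for the sake of a self-contained example, a local AF_UNIX stream socket pair replaces the TCP connection; demo_half_close is a hypothetical name.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* One end sends "ping" and closes only its write-half. The other end reads
 * the data, then EOF, and can still answer on the half that stays open.
 * Returns 0 if every step behaved that way, 1 if not, -1 on setup failure. */
int demo_half_close(void)
{
    int sv[2];
    char buf[8];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (write(sv[0], "ping", 4) != 4)
        return -1;
    if (shutdown(sv[0], SHUT_WR) < 0)            /* no more writes from sv[0] */
        return -1;

    ssize_t n1 = read(sv[1], buf, sizeof buf);   /* the queued 4 bytes */
    ssize_t n2 = read(sv[1], buf, sizeof buf);   /* 0: end-of-file */
    ssize_t n3 = write(sv[1], "pong", 4);        /* reverse direction is open */
    ssize_t n4 = read(sv[0], buf, sizeof buf);   /* sv[0] can still read */

    close(sv[0]);
    close(sv[1]);
    return (n1 == 4 && n2 == 0 && n3 == 4 && n4 == 4) ? 0 : 1;
}
```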
18
TCP Echo server
– We have written a concurrent TCP echo server using child processes created with fork.
– We can also write a single-process concurrent TCP server using select().
– We will use select to handle any number of clients concurrently.
[Figure: a server with a listening socket and two connected clients]
Data structures at the server:
client[]:  [0] = 4   [1] = 5   [2] = -1   ……..
descriptor set:  fd0 fd1 fd2 fd3 fd4 fd5
                  0   0   0   1   1   1
maxfd + 1 = 6
19
TCP echo server
#include "unp.h"

int
main(int argc, char **argv)
{
    int                 i, maxi, maxfd, listenfd, connfd, sockfd;
    int                 nready, client[FD_SETSIZE];
    ssize_t             n;
    fd_set              rset, allset;
    char                line[MAXLINE];
    socklen_t           clilen;
    struct sockaddr_in  cliaddr, servaddr;

    listenfd = Socket(AF_INET, SOCK_STREAM, 0);

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family      = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port        = htons(SERV_PORT);

    Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));
    Listen(listenfd, LISTENQ);
20
    maxfd = listenfd;                   /* initialize */
    maxi = -1;                          /* index into client[] array */
    for (i = 0; i < FD_SETSIZE; i++)
        client[i] = -1;                 /* -1 indicates available entry */
    FD_ZERO(&allset);
    FD_SET(listenfd, &allset);

    for ( ; ; ) {
        rset = allset;                  /* structure assignment */
        nready = Select(maxfd+1, &rset, NULL, NULL, NULL);

        if (FD_ISSET(listenfd, &rset)) {    /* new client connection */
            clilen = sizeof(cliaddr);
            connfd = Accept(listenfd, (SA *) &cliaddr, &clilen);
#ifdef NOTDEF
            printf("new client: %s, port %d\n",
                   Inet_ntop(AF_INET, &cliaddr.sin_addr, 4, NULL),
                   ntohs(cliaddr.sin_port));
#endif
            for (i = 0; i < FD_SETSIZE; i++)
                if (client[i] < 0) {
                    client[i] = connfd;     /* save descriptor */
                    break;
                }
21
            if (i == FD_SETSIZE)
                err_quit("too many clients");

            FD_SET(connfd, &allset);        /* add new descriptor to set */
            if (connfd > maxfd)
                maxfd = connfd;             /* for select */
            if (i > maxi)
                maxi = i;                   /* max index in client[] array */

            if (--nready <= 0)
                continue;                   /* no more readable descriptors */
        }

        for (i = 0; i <= maxi; i++) {       /* check all clients for data */
            if ( (sockfd = client[i]) < 0)
                continue;
            if (FD_ISSET(sockfd, &rset)) {
                if ( (n = Readline(sockfd, line, MAXLINE)) == 0) {
                    /* connection closed by client */
                    Close(sockfd);
                    FD_CLR(sockfd, &allset);
                    client[i] = -1;
                } else
                    Writen(sockfd, line, n);

                if (--nready <= 0)
                    break;                  /* no more readable descriptors */
            }
        }
    }
}
22
Denial of Service Attack
• The TCP server should be designed so that it doesn’t block on a read operation indefinitely
• otherwise a malicious client can make the server block indefinitely, making the server unavailable for other clients
• in the previous example, Readline may block forever if a malicious client sends some data but never sends an end-of-line character
• therefore the server should do one of the following:
– use nonblocking I/O
– have each client served by a separate thread of control
– place a timeout on the I/O operation.
23
Socket Options
24
• There are options that affect the operation
of the socket.
• There are functions to get and set the values
of these options
• getsockopt() and setsockopt()
• fcntl()
• ioctl() (we will see this later)
25
getsockopt(), setsockopt()
int getsockopt(int sockfd, int level, int optname, void *optval,
               socklen_t *optlen);
int setsockopt(int sockfd, int level, int optname,
               const void *optval, socklen_t optlen);
sockfd must refer to an open socket descriptor.
optval is a pointer to a variable that holds the option value: the new value for setsockopt, the fetched value for getsockopt.
optlen is the size of that variable (a value-result argument for getsockopt).
level specifies the code in the system that interprets the option:
SOL_SOCKET
IPPROTO_IP
IPPROTO_TCP
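A minimal sketch of a getsockopt call at the SOL_SOCKET level is shown below. It fetches SO_TYPE; the helper names socket_type and demo_so_type are hypothetical.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Returns the socket type (SOCK_STREAM, SOCK_DGRAM, ...) or -1 on error. */
int socket_type(int sockfd)
{
    int type;
    socklen_t len = sizeof(type);   /* in: size of type, out: bytes stored */

    if (getsockopt(sockfd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;
}

/* Returns 0 if a freshly created UDP socket reports SOCK_DGRAM. */
int demo_so_type(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    int type = socket_type(fd);
    close(fd);
    return type == SOCK_DGRAM ? 0 : 1;
}
```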
26
Socket options
level        optname        get  set  flag  datatype
SOL_SOCKET   SO_BROADCAST   x    x    x     int
             SO_DEBUG       x    x    x     int
             SO_DONTROUTE   x    x    x     int
             SO_ERROR       x               int
             SO_KEEPALIVE   x    x    x     int
             SO_LINGER      x    x          linger{}
             SO_RCVBUF      x    x          int
             SO_SNDBUF      x    x          int
             SO_RCVLOWAT    x    x          int
             SO_SNDLOWAT    x    x          int
             SO_RCVTIMEO    x    x          timeval{}
             SO_SNDTIMEO    x    x          timeval{}
             SO_REUSEADDR   x    x    x     int
             SO_TYPE        x               int
IPPROTO_IP   IP_HDRINCL     x    x    x     int
             IP_OPTIONS     x    x          int
             IP_TOS         x    x          int
             IP_TTL         x    x          int
IPPROTO_TCP  TCP_KEEPALIVE  x    x          int
             TCP_MAXRT      x    x          int
             TCP_MAXSEG     x    x          int
             TCP_NODELAY    x    x    x     int
27
Default Values for Socket Options
aspendos{korpe}:> checkopts
SO_BROADCAST: default = off
SO_DEBUG: default = off
SO_DONTROUTE: default = off
SO_ERROR: default = 0
SO_KEEPALIVE: default = off
SO_LINGER: default = l_onoff = 0, l_linger = 0
SO_OOBINLINE: default = off
SO_RCVBUF: default = 8192
SO_SNDBUF: default = 8192
SO_RCVLOWAT: getsockopt error: Option not supported by protocol
SO_SNDLOWAT: getsockopt error: Option not supported by protocol
SO_RCVTIMEO: getsockopt error: Option not supported by protocol
SO_SNDTIMEO: getsockopt error: Option not supported by protocol
SO_REUSEADDR: default = off
SO_REUSEPORT: (undefined)
SO_TYPE: default = 2
SO_USELOOPBACK: default = off
IP_TOS: default = 0
IP_TTL: default = 255
TCP_MAXSEG: default = 536
TCP_NODELAY: default = off
28
Generic Socket Options
• SO_BROADCAST
• enables or disables the ability of a process to send broadcast messages
– only supported for datagram sockets
– broadcasting is only supported on broadcast media (Ethernet, token ring, wireless LAN…)
• SO_DEBUG
• the kernel keeps track of every packet sent and received over the socket
29
• SO_ERROR
• obtains the value of the so_error variable and resets it to zero.
• When a socket error occurs, this so_error variable is set to one of the E… error values.
• When such a pending error exists on a socket:
» select returns, reporting the socket as ready
» SIGIO is generated for the process (if the process is using signal-driven I/O).
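Fetching the pending error could look like the sketch below. The helper name pending_error is hypothetical, and the demo only covers the trivial case of a fresh socket with no error pending.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Fetch the pending error of a socket; the kernel resets it to zero.
 * Returns the E... value (0 if none is pending), or -1 if getsockopt fails. */
int pending_error(int sockfd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}

/* Returns 0 if a freshly created TCP socket has no pending error. */
int demo_so_error(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    int err = pending_error(fd);
    close(fd);
    return err == 0 ? 0 : 1;
}
```

A typical use is after select reports a socket ready following a nonblocking connect: a non-zero value from pending_error would then be the reason the connect failed.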
30
• SO_KEEPALIVE
• When no data has been transferred over the connection for 2 hours, TCP at the server (or the client) sends probe segments if this option is set.
– the peer can respond with an ACK
– the peer can respond with an RST (ECONNRESET)
– no response at all (ETIMEDOUT, EHOSTUNREACH)
• SO_LINGER
• specifies how the close function operates for a connection-oriented protocol
31
• SO_LINGER
• by default, close returns immediately. The remaining data in the socket send buffer is sent by TCP to the peer.
struct linger {
    int l_onoff;     /* 0 = off, nonzero = on */
    int l_linger;    /* linger time in seconds */
};
– if l_onoff is 0, the option is turned off.
– if l_onoff is nonzero and l_linger is zero, TCP aborts the connection when close is issued: all remaining data is discarded and an RST is sent to the peer.
– if l_onoff is nonzero and l_linger is nonzero, close will block until all data is transmitted and acknowledged or the linger time expires.
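Setting the option could be sketched as follows. The names set_linger and demo_linger are hypothetical; the demo only sets the option on an unconnected socket and reads it back, it does not exercise the close behaviour itself.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Set SO_LINGER; returns 0 on success, -1 on error. */
int set_linger(int sockfd, int onoff, int seconds)
{
    struct linger lg;

    lg.l_onoff = onoff;      /* nonzero turns lingering on */
    lg.l_linger = seconds;   /* how long close() may block */
    return setsockopt(sockfd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/* Returns 0 if a 5-second linger can be set and read back. */
int demo_linger(void)
{
    struct linger lg;
    socklen_t len = sizeof(lg);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    int rc = set_linger(fd, 1, 5);
    if (rc == 0)
        rc = getsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, &len);
    close(fd);
    if (rc < 0)
        return -1;
    return (lg.l_onoff != 0 && lg.l_linger == 5) ? 0 : 1;
}
```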
32
How to know that the destination process received our data
• By default, close() returns immediately. It does not give any clue whether the destination application has received our data.
• With SO_LINGER set, close() lingers until the ACK of our data and FIN is received. This makes sure that the destination TCP has received all the data (but the destination process may not have read it yet).
• shutdown (SHUT_WR) followed by a read waits until we receive the peer's FIN, thereby making sure that the receiving process has received all the data and closed the socket.
• Use application-level acknowledgements.
33
• SO_RCVBUF and SO_SNDBUF
• receive buffers hold received data until it is read by the application
• the receive buffer size should be set before calling connect (client) or listen (server)
» because the window scale option is exchanged in the SYN segments
• for optimum performance, the socket buffer sizes should be related to the bandwidth x delay product
» we should have at least that much buffer space.
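As a worked example of the bandwidth x delay rule, a 10 Mbit/s path with a 100 ms round-trip time gives 10,000,000 / 8 x 0.1 = 125,000 bytes. The sketch below computes that value and requests it as the receive buffer size; the helper names and the example link parameters are assumptions chosen for illustration.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Bandwidth x delay product in bytes, from bits per second and RTT in ms. */
long long bdp_bytes(long long bits_per_sec, long long rtt_ms)
{
    return bits_per_sec / 8 * rtt_ms / 1000;
}

/* Request a receive buffer size; returns 0 on success, -1 on error. */
int request_rcvbuf(int sockfd, int bytes)
{
    return setsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes));
}

/* Returns 0 if the request is accepted and a positive size is reported. */
int demo_rcvbuf(void)
{
    int actual = 0;
    socklen_t len = sizeof(actual);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    int want = (int) bdp_bytes(10000000LL, 100);   /* 10 Mbit/s, 100 ms RTT */
    int rc = request_rcvbuf(fd, want);             /* before connect/listen */
    if (rc == 0)
        rc = getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len);
    close(fd);
    return (rc == 0 && actual > 0) ? 0 : 1;
}
```

The kernel may round or cap the requested size, so the value read back with getsockopt is not necessarily the value that was requested.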
34
• SO_RCVLOWAT, SO_SNDLOWAT
• used by the select function; they determine when select will return readable or writable on a socket
– the receive low-water mark defaults to 1, the send low-water mark defaults to 2048
• SO_RCVTIMEO, SO_SNDTIMEO
• place a timeout on socket receives and sends.
• Affect the following functions:
» read, readv, recv, recvfrom, recvmsg
» write, writev, send, sendto, sendmsg
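A receive timeout could be sketched as below, assuming a system that supports SO_RCVTIMEO (not every system does). The helper names are hypothetical, and a local AF_UNIX datagram socket pair on which nothing is ever sent stands in for a silent peer.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Place a timeout (in milliseconds) on receives; 0 on success, -1 on error. */
int set_recv_timeout(int sockfd, long ms)
{
    struct timeval tv;

    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Returns 0 if recv gives up with EWOULDBLOCK/EAGAIN after the timeout
 * instead of blocking forever; 1 otherwise; -1 on setup failure. */
int demo_recv_timeout(void)
{
    int sv[2];
    char buf[8];

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    if (set_recv_timeout(sv[0], 100) < 0)
        return -1;
    ssize_t n = recv(sv[0], buf, sizeof buf, 0);   /* nothing is ever sent */
    int timed_out = (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN));
    close(sv[0]);
    close(sv[1]);
    return timed_out ? 0 : 1;
}
```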
35
• SO_REUSEADDR
• when we set this option, the listening server can be restarted on the same port even though a child of the previous instance is still running and using that port
• allows multiple instances of the same server to be started on the same port, as long as each uses a different local IP address
» However, with TCP we cannot start servers that use the same local IP address and local port pair, no matter what we do.
• allows a process to bind the same port number to multiple sockets as long as they have different local IP addresses.
• with UDP, allows completely duplicate bindings: the same IP address and port number can be assigned to multiple sockets.
» This is used for multicasting
» We will see multicasting later
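The usual placement of the option, after socket and before bind, can be sketched as follows. The function name listen_reuseaddr is hypothetical; binding to the loopback address and the backlog of 5 are assumptions chosen to keep the example local, and passing port 0 lets the kernel pick a free port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP listening socket on the loopback address with SO_REUSEADDR
 * set before bind(). Returns the descriptor, or -1 on error. */
int listen_reuseaddr(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    /* the option must be set before bind() to have any effect on it */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0 ||
        bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```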
36
• IPv4 Socket Options
– IP_HDRINCL
» we must build our own IP header if this is set for a raw socket
– IP_OPTIONS
» setting this option allows us to set IP options in the IP header
– IP_RECVDSTADDR
» causes the destination IP address of a received UDP datagram to be returned as ancillary data by recvmsg.
– IP_RECVIF
» causes the index of the network interface on which a UDP datagram was received to be returned as ancillary data by recvmsg.
– IP_TOS, IP_TTL
» set the type-of-service field and the TTL field in outgoing IP datagrams from this socket
37
• TCP Socket Options
• TCP_KEEPALIVE
» specifies the idle time in seconds for a connection before TCP starts sending keepalive probes (if the SO_KEEPALIVE option is set)
• TCP_MAXRT
» specifies the amount of time before a connection is broken once TCP starts retransmitting data.
• TCP_MAXSEG
» allows us to fetch and set the MSS value for a TCP connection (we cannot generate segments larger than the value specified with this option)
• TCP_NODELAY
» disables Nagle’s algorithm.
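Turning TCP_NODELAY on and reading it back could look like the sketch below; the helper names set_nodelay, get_nodelay and demo_nodelay are hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Enable (on != 0) or disable TCP_NODELAY; 0 on success, -1 on error. */
int set_nodelay(int sockfd, int on)
{
    return setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Returns 1 if TCP_NODELAY is on, 0 if it is off, -1 on error. */
int get_nodelay(int sockfd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}

/* Returns 0 if the option is off by default and on after being set. */
int demo_nodelay(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    int before = get_nodelay(fd);   /* Nagle's algorithm is on by default */
    int rc = set_nodelay(fd, 1);
    int after = get_nodelay(fd);
    close(fd);
    return (before == 0 && rc == 0 && after == 1) ? 0 : 1;
}
```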
38
Nagle’s Algorithm
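• Designed to reduce the number of small packets on a wide area network.
• Says that if a connection has outstanding data (data sent but not yet acknowledged), no small packets may be sent on the connection until that data is acknowledged (small means < MSS)
• relevant for interactive traffic such as rlogin and telnet
[Figure: the characters "hello!" are typed one at a time; "h" is sent alone, then "el" and "lo" are sent as the characters that accumulated while earlier data was unacknowledged]
We have a chance of sending more than one character in a single TCP segment.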
39
fcntl function
– Stands for file control and performs various descriptor control operations.
int fcntl(int fd, int cmd, ... /* int arg */);
– Set a descriptor for nonblocking I/O
    – use F_SETFL cmd with O_NONBLOCK flag
– Set a socket for signal-driven I/O
    – use F_SETFL cmd with O_ASYNC flag
– Set the socket owner
    – use F_SETOWN cmd, thereby receiving the SIGIO (data available) and SIGURG (urgent data available) signals
40
Example
int flags;
int sockfd;
….
sockfd = socket(AF_INET, SOCK_STREAM, 0);
/* fetch the current file status flags first, so that only O_NONBLOCK changes */
if ((flags = fcntl(sockfd, F_GETFL, 0)) < 0)
    err_sys("F_GETFL error");
flags = flags | O_NONBLOCK;
if (fcntl(sockfd, F_SETFL, flags) < 0)
    err_sys("F_SETFL error");
41