• Elementary TCP Sockets

– The presentation will provide sufficient information to
build a COMPLETE TCP client and server.
– In addition the topic of concurrency will be developed.
• This is an important design pattern in that it may be adapted
for other software development where concurrent behavior is
required.
– Note that in current practice the concurrent server model
has been somewhat modified: many servers have adopted a
connection 'pool' design.
• The 'fork' is expensive in terms of time and space.
• Apache uses a connection pool architecture.
• JCA employs a connection pool approach.
• socket function
– The first action a process must take to perform network
I/O is to call the 'socket' function.
– This function specifies the type of communication protocol
desired.
• TCP (IPv4)
• UDP
• Unix Domain Stream protocol
#include <sys/socket.h>
int socket (int family, int type, int protocol);
Returns a nonnegative descriptor if OK, -1 (ffff ffff) on error.
family specifies the protocol family. One of the constants below:
AF_INET      IPv4 protocol
AF_INET6     IPv6 protocol
AF_LOCAL     Unix domain protocols
AF_ROUTE     routing sockets
AF_KEY       key sockets - interface into keys for security
type is one of the constants below:
SOCK_STREAM  stream socket
SOCK_DGRAM   datagram socket
SOCK_RAW     raw socket
protocol is set to 0 except for raw sockets (Chapter 25).
• Socket function
– Returns a small, nonnegative integer similar to a file
descriptor. This is called a socket descriptor or sockfd.
– The socket function does not specify the local protocol
address or the foreign protocol address.
• connect Function
– Used by a TCP client to establish a connection with a TCP
server.
#include <sys/socket.h>
int connect (int sockfd, const struct sockaddr *servaddr,
socklen_t addrlen);
– servaddr is a pointer to a socket address structure.
– addrlen is the size of the socket address structure.
– The socket address structure must contain the IP address
and port number of the server.
– The client does NOT have to call bind before calling
connect.
• The kernel will choose both an ephemeral port number and
the source IP address if necessary.
• connect Function
– The connect function initiates TCP's 3-way handshake.
– Function only returns when the connection is established
or an error occurs.
• ETIMEDOUT is returned if no response to SYN segment after
75 seconds (BSD 4.4)
• If server responds with RST then no service is waiting and
the ECONNREFUSED error is returned.
• RST is a TCP segment that is transmitted on an error.
• RST is generated on: SYN to a port that has no listening
server, TCP wants to abort existing connection, TCP receives
a segment for a connection that does not exist.
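• If the SYN elicits an ICMP destination unreachable error from
some intermediate router, this is treated as a soft error: the
kernel keeps retransmitting the SYN and returns EHOSTUNREACH
or ENETUNREACH only if no response arrives before the timeout.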
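The connect behavior described above can be sketched as follows. This is a minimal illustration, not code from the presentation: the helper names connect_to and explain_connect_errno are hypothetical, and the mapping covers only the error cases listed on this slide.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: map the errno values discussed above to a label. */
const char *explain_connect_errno(int err)
{
    switch (err) {
    case ETIMEDOUT:    return "no response to SYN";
    case ECONNREFUSED: return "RST received: no server listening";
    case EHOSTUNREACH:
    case ENETUNREACH:  return "ICMP destination unreachable";
    default:           return "other error";
    }
}

/* Hypothetical helper: connect to an IPv4 address and port.
 * Returns a connected descriptor, or -1 with errno set. */
int connect_to(const char *ip, unsigned short port)
{
    struct sockaddr_in servaddr;
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &servaddr.sin_addr) != 1) {
        close(sockfd);
        errno = EINVAL;             /* not a dotted-decimal IPv4 address */
        return -1;
    }
    /* No bind: the kernel picks the ephemeral port and source address. */
    if (connect(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0) {
        int saved = errno;
        close(sockfd);              /* a failed socket is closed, not reused */
        errno = saved;
        return -1;
    }
    return sockfd;
}
```

A caller would check for -1 and could pass errno to explain_connect_errno to decide whether retrying is worthwhile.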
• bind Function
– Assigns a local protocol address to a socket. (32 bit IPv4
address and 16 bit port number).
#include <sys/socket.h>
int bind (int sockfd, const struct sockaddr *myaddr,
socklen_t addrlen);
Servers bind their well-known port at startup.
Most servers use a well-known port. Clients are assigned an
ephemeral port. RPC is an exception, as RPC servers register
with the RPC port mapper.
A process can bind a specific IP address to its socket.
Normally a TCP client does NOT bind an IP address to its
socket.
• bind Function
– If a TCP server does not bind an IP address to its socket,
the kernel uses the destination IP address of the client's SYN
as the server's source IP address.
– If a port number of 0 is specified, the kernel chooses
an ephemeral port number.
– If an ephemeral port number is chosen, bind does NOT
return the port number (the myaddr argument is const, so
bind cannot store the chosen value in it).
• Must use getsockname to return the protocol address.
– Example of binding a specific IP address: provision of web
servers to multiple organizations.
• Each org has its own domain name.
• Each org name is mapped to an IP (typically same subnet).
• All the IPs are then aliased onto a single network interface
(use the alias option of the ifconfig command).
• Each organization's server then binds only its own IP address.
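The port-0 case above can be sketched like this. The helper name bind_ephemeral is hypothetical; the sketch assumes an IPv4 socket and the wildcard address.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: bind sockfd to the wildcard address with port 0 and
 * return the ephemeral port the kernel picked (host byte order), or -1. */
int bind_ephemeral(int sockfd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);               /* 0 = let the kernel choose */

    if (bind(sockfd, (struct sockaddr *) &addr, sizeof(addr)) < 0)
        return -1;
    /* bind takes a const pointer, so it cannot report the port it chose;
     * getsockname fills in the local protocol address instead. */
    if (getsockname(sockfd, (struct sockaddr *) &addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```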
• listen Function
– Only called by a TCP server
– listen converts an unconnected socket into a passive
socket thereby indicating to the kernel to accept incoming
connections to that socket.
– call to listen moves the socket from CLOSED to LISTEN in
the TCP state diagram.
– The second argument specifies the maximum number of
connections that the kernel should queue for this socket.
int listen (int sockfd, int backlog);
Called after the socket and bind functions and MUST be
called before the accept function.
• Queues:
– For each listening socket the kernel maintains two queues.
– An incomplete connection queue contains an entry for
each SYN that has arrived from a client for which the
server is awaiting the completion of the 3 way handshake.
– These sockets are in the SYN_RCVD state.
– A completed connection queue which contains an entry for
each client with whom the TCP 3-way handshake has been
completed.
– These sockets are in the ESTABLISHED state.
– When a process calls accept the first entry on the
completed queue is returned to the process - if the queue
is empty the process is put to sleep.
• listen Function
– There has never been a formal definition of what backlog
means.
– BSD 4.2 "defines the maximum length the queue of
pending connections may grow to".
– The definition does not say whether a pending
connection is one in the SYN_RCVD state, one in the
ESTABLISHED state (and not yet accepted), or either.
– Common size of backlog is now around 100.
– Stevens has some wrapper code which allows the backlog
value to be set by an environment variable (no recompiling
the server).
– An entry will remain on the incomplete queue for
approximately one RTT (round trip time).
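A wrapper of the kind mentioned above might look like the sketch below. It assumes the environment variable is named LISTENQ; the helper names choose_backlog and listen_env are hypothetical, and the rule of ignoring non-positive values is a choice made for this sketch.

```c
#include <stdlib.h>
#include <sys/socket.h>

/* Return the backlog to use: the LISTENQ environment variable if it is set
 * to a positive number, otherwise the caller's default. */
int choose_backlog(int default_backlog)
{
    const char *ptr = getenv("LISTENQ");
    if (ptr != NULL) {
        int value = atoi(ptr);
        if (value > 0)
            return value;
    }
    return default_backlog;
}

/* listen() with the backlog overridable at run time; returns 0 or -1. */
int listen_env(int sockfd, int default_backlog)
{
    return listen(sockfd, choose_backlog(default_backlog));
}
```

With this in place, running the server as `LISTENQ=256 ./server` changes the backlog without recompiling.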
• Backlog queues:
– If a client SYN arrives when the queues are full TCP
ignores the SYN and does NOT send an RST.
• Condition is considered temporary.
– Data that arrives after the 3-way handshake completes but
before the server calls accept is queued in the connected
socket's receive buffer.
– backlog should specify the maximum number of
completed connections allowed in the queue.
– SYN flooding: an attacker sends a flood of SYNs with
bogus (spoofed) source IP addresses.
– This overruns the incomplete connection queue and
effects a denial of service.
• accept Function
– accept is called by a TCP server to return the next
completed connection from the front of the completed
connection queue.
– If the queue is empty the process is put to sleep
(assuming a blocking socket).
#include <sys/socket.h>
int accept (int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen);
• cliaddr and addrlen arguments are used to return the protocol address of
the connected peer process.
• addrlen is a value-result argument.
• Prior to the call, the integer pointed to by addrlen is set to the size of
the socket address structure pointed to by cliaddr.
• On return, that integer holds the actual number of bytes the kernel
stored in the socket address structure.
• accept Function (continued)
– If accept is successful its return value is a brand new
descriptor that was automatically created by the kernel.
– The fd refers to the TCP connection with the client.
– For discussion purposes the first arg to accept is the
listening socket (fd created by socket)
• Used as an arg in the call to both bind and listen
– The returned value from accept is referred to as the
connected socket.
– The use of value-result arguments is common in Unix
kernel invocations.
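The value-result convention for accept can be sketched as follows. The helper name accept_one is hypothetical, and the sketch assumes an IPv4 listening socket.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: accept one connection on listenfd and store the
 * peer's port (host byte order) in *peer_port. Returns connfd or -1. */
int accept_one(int listenfd, unsigned short *peer_port)
{
    struct sockaddr_in cliaddr;
    socklen_t clilen = sizeof(cliaddr);     /* value: size of our buffer */
    int connfd;

    memset(&cliaddr, 0, sizeof(cliaddr));
    connfd = accept(listenfd, (struct sockaddr *) &cliaddr, &clilen);
    if (connfd < 0)
        return -1;
    /* result: clilen now holds the number of bytes the kernel stored */
    if (clilen >= sizeof(cliaddr) && cliaddr.sin_family == AF_INET)
        *peer_port = ntohs(cliaddr.sin_port);
    return connfd;                          /* the connected socket */
}
```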
• fork Function
– Used to create a new process (the only way in Unix)
pid_t fork(void);
• fork is called once but it returns twice.
• Returns once to the parent with the PID of the newly created
process
• Also returns to the child with a value of zero.
• Therefore the process can determine if it is a parent or a
child.
• Child has only one parent.
• A parent can have many children and therefore must have
the PIDs to distinguish them.
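The "called once, returns twice" behavior can be sketched as below. The helper name run_child is hypothetical; the parent uses the PID returned by fork to collect that particular child's exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with child_status and return
 * the exit status the parent observes, or -1 on error. */
int run_child(int child_status)
{
    int stat;
    pid_t pid = fork();             /* called once, returns twice */

    if (pid < 0)
        return -1;                  /* fork failed: no child was created */
    if (pid == 0)
        _exit(child_status);        /* child: fork returned 0 */

    /* parent: fork returned the child's PID, which identifies this child */
    if (waitpid(pid, &stat, 0) != pid)
        return -1;
    return WIFEXITED(stat) ? WEXITSTATUS(stat) : -1;
}
```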
• fork (continued)
– All descriptors (fd) open in the parent before the fork are
shared with the child after fork() returns.
– This is desirable for our server model in that it allows the
child to access the socket (read/write) while the parent can
still close its own copy.
– A server is using the fork() to allow another process to
handle the connection while the parent can return to
handling requests for connections.
– A fork() can be used to execute another program.
• This approach requires the use of the exec() function.
• exec function
– The only way in Unix to load an executable program file
on disk is via the exec function to be called by an extant
process.
– exec replaces the current process image with the new
program file (arguments).
• This executable will start at its main function.
• PID does NOT change on the invocation of an exec.
– Reference to the exec function is generic as there are six
exec functions.
• exec function (continued)
– Difference in the six exec functions:
• whether the called program is referenced by its pathname or
filename.
• whether the arguments to the new program are listed
sequentially or referenced through an array of pointers (argv,
argc).
• Whether the environment of the calling process is passed to
the new program or whether a new environment is specified.
• Normally only execve is a system call within the kernel; the
other five exec functions are library functions that call execve.
• Must terminate any argv arrays with a null pointer.
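A fork followed by exec can be sketched as follows. The helper name run_shell is hypothetical, and the sketch assumes /bin/sh exists on the system. It uses execv, the variant that takes a pathname and an argv array and passes the caller's environment through.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "/bin/sh -c command" in a child process and
 * return its exit status, or -1 on error. */
int run_shell(const char *command)
{
    int stat;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* argv-style call: the array must end with a null pointer */
        char *argv[] = { "sh", "-c", (char *) command, NULL };
        execv("/bin/sh", argv);     /* replaces the process image; PID is unchanged */
        _exit(127);                 /* reached only if exec failed */
    }
    if (waitpid(pid, &stat, 0) != pid)
        return -1;
    return WIFEXITED(stat) ? WEXITSTATUS(stat) : -1;
}
```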
• Concurrent Server
int sockfd;                    /* listening socket */
struct sockaddr_in myaddr;     /* local server address */
sockfd = socket(AF_INET, SOCK_STREAM, 0);
bzero(&myaddr, sizeof(myaddr));              /* initialize the structure */
myaddr.sin_family = AF_INET;                 /* address family */
myaddr.sin_port = htons(80);                 /* port number */
myaddr.sin_addr.s_addr = htonl(INADDR_ANY);  /* take any interface */
if (bind(sockfd, (struct sockaddr *) &myaddr, sizeof(myaddr)) < 0)
    error_handling();
if (listen(sockfd, 100) < 0)   /* choose a large value here */
    error_handling();
• Concurrent server (continued)
/* loop to accept connections and process requests concurrently */
while (1)
{
    struct sockaddr_in client_addr;
    socklen_t client_len = sizeof(client_addr);   /* value-result argument */
    int newSockfd = accept(sockfd, (struct sockaddr *) &client_addr, &client_len);
    if (newSockfd < 0)
        error_handling();
    if (fork() == 0) {          /* child: handle this request */
        close(sockfd);          /* close the listening socket (decrease the reference count) */
        process_request(newSockfd);
        exit(0);
    } else {                    /* parent: continue to accept connections */
        close(newSockfd);       /* decrease the reference count */
    }
}   /* end while */
• Reference Counts
– Every file or socket has a reference count.
– The reference count is maintained in the file table entry.
– This is a count of the number of descriptors that are
currently open that refer to the particular file or socket.
– After socket returns the file table entry associated with
listenfd has a reference count of 1.
– After accept returns the file table entry associated with
connfd has a reference count of 1.
– After fork() returns both descriptors are shared
(duplicated) between the parent and the child.
• This means that the file table entries for both have a
reference count of 2
• Reference Counts
– This means that when the parent closes connfd the kernel
decrements the reference count from 2 to 1.
– A real close on the socket does NOT take place until the
reference count is 0.
[Figure: status of client-server after return from accept. The client's connect() is joined by a connection to the server's connfd; the server also holds listenfd.]
• Status of client-server after fork returns.
[Figure: the client's connect() is joined by a connection to connfd, which is now open in both the server parent and the server child; both also hold listenfd.]
• Status of client-server after parent and child close appropriate sockets.
[Figure: the parent holds only listenfd; the child holds only connfd, which carries the connection to the client.]
This is the desired final state of the sockets. The child is handling the
connection with the client and the parent can call accept again on the
listening socket to handle the next client connection.
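The reference-count behavior can be observed with the sketch below. It is an illustration under stated assumptions: a Unix domain socketpair stands in for the TCP connection (sv[0] plays the client, sv[1] plays connfd), and the function name echo_survives_parent_close is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a child could still echo a byte over a descriptor that the
 * parent had already closed, 0 if not, -1 on setup error. */
int echo_survives_parent_close(void)
{
    int sv[2], ok;
    char c = 'x', reply = 0;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;                  /* sv[1] has reference count 1 */
    if ((pid = fork()) < 0)
        return -1;
    if (pid == 0) {                 /* child: sv[1] now has count 2 */
        close(sv[0]);
        if (read(sv[1], &reply, 1) == 1 && write(sv[1], &reply, 1) != 1)
            _exit(1);
        _exit(0);                   /* exit closes sv[1]: count reaches 0 */
    }
    close(sv[1]);                   /* parent: count drops from 2 to 1 */
    /* The "client" sends only after the parent's close, so the echo below
     * proves the connection outlived that close. */
    ok = (write(sv[0], &c, 1) == 1 &&
          read(sv[0], &reply, 1) == 1 && reply == 'x');
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return ok;
}
```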
• Concurrent Server
– close Function
int close (int sockfd);
• The default action of close with a TCP socket is to mark
the socket as closed and return to the process
immediately.
• This renders the socket descriptor unusable.
• However close does NOT guarantee that a TCP FIN will be
sent: while the reference count is still greater than 0
(another process holds the descriptor) no FIN goes out.
• The only way to ensure a FIN is to call shutdown().
• It is imperative that the server parent close each connected
socket after the fork; otherwise it is possible to exceed the
maximum allowable number of fds for a given process.
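The difference between close and shutdown can be sketched as below. Assumptions: a Unix domain socketpair stands in for the TCP connection, so the peer observes end-of-file rather than a literal FIN; MSG_DONTWAIT is used to probe the peer without blocking and is available on Linux and the BSDs; the function name close_vs_shutdown is hypothetical.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if closing one of two references sends nothing to the peer
 * while shutdown(SHUT_WR) delivers end-of-file; 0 otherwise; -1 on error. */
int close_vs_shutdown(void)
{
    int sv[2], dupfd;
    char buf;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if ((dupfd = dup(sv[0])) < 0)   /* reference count is now 2 */
        return -1;

    close(sv[0]);                   /* count 2 -> 1: no end-of-stream sent */
    if (recv(sv[1], &buf, 1, MSG_DONTWAIT) != -1 ||
        (errno != EAGAIN && errno != EWOULDBLOCK))
        return 0;                   /* peer unexpectedly saw data or EOF */

    if (shutdown(dupfd, SHUT_WR) < 0)   /* half-close, whatever the count */
        return -1;
    n = recv(sv[1], &buf, 1, MSG_DONTWAIT);
    close(dupfd);
    close(sv[1]);
    return n == 0;                  /* 0 bytes = end-of-file at the peer */
}
```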
• getsockname & getpeername Functions
– getsockname returns the local protocol address associated
with a socket.
– getpeername returns the foreign protocol address
associated with a socket.
int getsockname (int sockfd, struct sockaddr *localaddr,
socklen_t *addrlen);
int getpeername (int sockfd, struct sockaddr *peeraddr,
socklen_t *addrlen);
• Both functions fill in the socket address structure pointed
to by localaddr or peeraddr.
• getsockname & getpeername
– raison d'être
– After a connect without a bind, getsockname returns the
local IP address/port number assigned by the kernel.
– After a bind with a port number of 0 (tells kernel to choose a
local port), getsockname returns the local port number
chosen.
– getsockname can be used to determine the address family of
a socket.
– In a TCP server that binds the wildcard IP, getsockname
(called on the connected socket) can be used to obtain the
local IP address assigned to the connection.
– When a server is exec'ed by the process that calls accept
the ONLY way a server can obtain the identity of the client
is to call getpeername (inetd).
• Assignment:
– Undergraduates: Problems 4.1, 4.2, 4.3
– Graduates: Problems 4.1 through 4.5
– Due next week.
– Printed, stapled, name on each sheet.
– ALL: read Chapter Five. Be prepared.
• Chapter Five
– Developing an echo server
• Client reads a line of text from stdin and writes the line to
the server
• Server reads the line from network input and echoes the line
back to the client.
• Client reads the echo'ed line and prints it on stdout.
– While this is simplistic the problem covers all the
components necessary to build a 'real' server.
– Can use this model to examine boundary conditions:
• Startup
• Client crash
• Server crash
• Chapter Five
– The code on slides 19 and 20 represents the basic
structure of a TCP server with the following change:
while (1) {
    clilen = sizeof (cliaddr);
    connfd = accept (listenfd, (SA *) &cliaddr, &clilen);
    if ( (childpid = fork ()) == 0) {   /* child process */
        close (listenfd);               /* close listening socket */
        str_echo (connfd);              /* process the request */
        exit (0);
    }
    close (connfd);                     /* parent closes connected socket */
}
• TCP Server
– str_echo function (Figure 5.3)
void str_echo(int sockfd)
{
    ssize_t n;
    char line[MAXLINE];
    for ( ; ; ) {
        if ( (n = Readline(sockfd, line, MAXLINE)) == 0)
            return;                     /* connection closed by other end */
        Writen(sockfd, line, n);
    }
}
• TCP Echo Client processing loop
void str_cli(FILE *fp, int sockfd)
{
    char sendline[MAXLINE], recvline[MAXLINE];
    while (Fgets(sendline, MAXLINE, fp) != NULL)
    {
        Writen(sockfd, sendline, strlen(sendline));
        if (Readline(sockfd, recvline, MAXLINE) == 0)
            err_quit("str_cli: server terminated prematurely");
        Fputs(recvline, stdout);
    }
}
• TCP Echo Client
int main(int argc, char **argv)
{
    int sockfd;
    struct sockaddr_in servaddr;
    if (argc != 2)
        err_quit("Usage: tcpcli <IPaddress>");
    sockfd = Socket(AF_INET, SOCK_STREAM, 0);
    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERV_PORT);
    Inet_pton(AF_INET, argv[1], &servaddr.sin_addr);
    Connect(sockfd, (SA *) &servaddr, sizeof(servaddr));
    str_cli(stdin, sockfd);
    exit(0);
}
• Startup of server/client
– start server.
– run netstat to verify the server's listening socket
• use the -a option so that listening sockets are also shown (by default netstat omits them).
– Start client on same host.
– After the 3 way handshake:
• The client calls str_cli which blocks in the call to fgets (no
input on stdin)
• When accept returns, server calls fork and the child calls
str_echo, which calls readline, which calls read, which blocks
waiting for a line to be sent from the client.
• Server parent is now blocking on accept.
• How many processes?
– Do netstat -a
• TCP Client/Server
– Normal Termination
– Type in two lines followed by a terminal EOF character
(Control-D)
• Upon receipt of the EOF fgets returns a null pointer and
str_cli returns.
• Then client main calls exit().
• exit closes all open descriptors; the client socket's reference count
drops to 0, hence the client TCP sends a FIN to the server, which the
server TCP acknowledges.
– At this point server socket is in the CLOSE_WAIT state and client
socket is in FIN_WAIT_2 state.
• When server receives the FIN the server child is blocked in a
call to readline. Readline then returns a 0. str_echo now
returns to server child main.
• server child terminates by calling exit.
• All open descriptors in the server child are closed. This
causes the final two segments of the TCP termination to fire.
– FIN from server and ACK from client.
• TCP Client/Server termination
– When the server child terminates, SIGCHLD is sent to the parent.
– The parent does not catch the signal or wait for the child, so the
child enters the zombie state.
• Posix Signal Handling.
– A signal is an indication that an event has happened (in the old world
they were called software interrupts).
• Signals are usually asynchronous.
• Signals can be sent from one process to another
• Or by the kernel to the process.
– Every signal has a disposition which is the action associated with a
signal (like the vector table on Intel hardware interrupts).
– SIGCHLD is sent by the kernel whenever a process terminates to
the parent of the terminating process
• POSIX signal handling
– In Unix a function can be tied to a specific signal. This function is
called the signal handler which ‘catches’ the signal.
– void handler ( int signumber)
• A signal can be ignored by setting its disposition to SIG_IGN.
– SIGKILL and SIGSTOP cannot be ignored.
• The default disposition for a signal is achieved by setting its disposition to
SIG_DFL.
– The default is normally to terminate the process on receipt.
– This is how you can get core dumps (abend). Some signals have a
default action of generating a core image of the process in its current
working directory.
– A few signals (SIGCHLD and SIGURG) have a default action of being
ignored.
• POSIX signal handling
• Stevens provides a nifty way to set a signal disposition, meet Posix
standards and maintain backward compatibility.
– Uses a defined function called ‘signal’ which calls the Posix sigaction
function.
– First arg to signal is the signal name and the second arg is either a pointer
to a function or one of the constants SIG_IGN or SIG_DFL.
– Avoids some of the trickiness of calling sigaction directly.
• Signal masks:
sigemptyset(&act.sa_mask); // part of the struct used by sigaction.
The mask allows the specification of a set of signals that will be blocked
while the signal handler is executing. A blocked signal cannot be delivered to
the process until it is unblocked. The example uses the empty set so that no
additional signals are blocked while the handler runs.
Posix guarantees that the signal being handled is blocked during execution
of the signal handler.
• POSIX signal semantics
– Once a signal handler is installed it remains installed.
– While a signal handler is executing the signal being delivered is blocked.
– If a signal is generated one or more times while it is blocked it is NOT queued
but is delivered once after unblocking.
• POSIX 1003.1b defines a set of reliable signals that are queued (not used in
this course).
– Sets of signals can be selectively masked and unmasked to protect critical
regions. This is a technique commonly used in the world of designing
and implementing software that will run directly on hardware without
benefit of some OS or ‘kernel’.
• To block or unblock selectively use the sigprocmask function.
• Back to zombies and our hanging child.
– Whenever we fork children we must wait for them to prevent
them from becoming zombies.
– To implement the wait, establish a signal handler to catch
SIGCHLD and then call wait within it.
Signal (SIGCHLD, sig_chld);
and the function
void sig_chld (int signo)
{
    pid_t pid;
    int stat;
    pid = wait (&stat);
    return;
}
• A significant problem with the example of Figure 5.2 (pg
122).
– The parent blocks in its call to accept when the SIGCHLD signal
occurs. The signal handler executes (sig_chld), wait fetches the
child’s PID, and the signal handler returns.
– Since the signal was caught by the parent while the parent was
blocked in a slow system call (accept) the kernel causes the
accept to return an error (EINTR - interrupted system call).
• Slow system calls are any call that can block forever.
– The parent then aborts.
– Therefore must be aware of interrupted system calls and must
provide a means to handle them.
– This is the purpose of the SA_RESTART flag; to automatically
restart interrupted system calls.
• Handling Interrupted System Calls
– Basic rule: when a process is blocked in a slow system call and
the process catches a signal and the signal handler returns, the
system call can return an EINTR.
– Some kernels automatically restart some interrupted system calls.
– To handle the interrupted accept
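for ( ; ; ) {
    if ( (connfd = accept (……)) < 0) {
        if (errno == EINTR)
            continue;           /* back to for () */
        else
            err_sys ("accept error");
    }
    /* ... fork and handle connfd as before ... */
}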
• Handling Interrupted System Calls
• An interrupted connect cannot simply be called again (restarted); use select
to wait for the connection to complete.
• The select can be used to check for a successful or unsuccessful
completion of the connection.
– select takes a timeout, so the code can put a time limit on connection
establishment (or pass no timeout and block indefinitely).
– This is typically used with a non-blocking TCP socket on which connect
is called.
– Non-blocking connects allow multiple connections to be
established at the same time; used with some Web browsers.
• Wait function
– The wait function returns two values; one through a return and one
through a value-result pair.
• One value is the process id of the terminated child (returned).
• The other value is an integer (value-result) that represents the
termination status of the child (terminated normally, killed by a signal,
or stopped by job control).
– The waitpid function provides more control allowing deterministic
choice of which process to wait for. Also a variety of options are
available for further definition of the wait state behavior.
• waitpid addresses a shortcoming of establishing a signal
handler that simply calls wait: that approach does not prevent
all zombies.
• In a multiple child termination process, a number of
termination signals can be generated prior to the signal
handler executing.
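– Unix normally does not queue signals, so several SIGCHLDs
generated while one is pending are delivered as one.
• Hence with N children terminating, the signal handler may execute
only one or two times (depending on timing, e.g. whether client and
server run on the same machine), leaving N-1 or N-2 zombies.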
• The solution is to run waitpid in a loop which will obtain the
status of any children to be terminated.
– Must use the WNOHANG option (3rd argument). This tells waitpid
not to block if there are running children that have not yet terminated.