11 TCP: Data Structures And Input Processing
11.1 Introduction
TCP is the most complex of all protocols in the suite of Internet protocols. It provides reliable, flow-controlled, end-to-end, stream service between two machines of arbitrary processing speed using the unreliable IP mechanism for communication. Like most reliable transport protocols, TCP uses timeout with retransmission to achieve reliability. However, unlike most other transport protocols, TCP is carefully constructed to work correctly even if datagrams are delayed, duplicated, lost, delivered out of order, or delivered with the data corrupted or truncated. Furthermore, TCP allows communicating machines to reboot and reestablish connections at arbitrary times without causing confusion about which connections are open and which are new.
This chapter examines the global organization of TCP software and describes the data structures TCP uses to manage information about connections. Chapter 12 describes the details of connection management and implementation of the TCP finite state machine used for input. Chapter 13 discusses output and the finite state machine used to control it. Chapters 14 through 16 discuss the details of timer management, estimation of round trip times, retransmission, and miscellaneous details such as urgent data processing.
11.2 Overview Of TCP Software
Recall from Chapter 2 that our implementation of TCP uses three processes. One process handles incoming segments, another manages outgoing segments, and the third is a timer that manages delayed events such as retransmission timeout. In theory, using separate processes isolates the input, output, and event timing parts of TCP and permits us to design each piece independently. In practice, however, the processes interact closely. For example, the input and output processes must cooperate to match incoming acknowledgements with outgoing segments and cancel the corresponding timer retransmission event. Similarly, the output and timer processes interact when the output process schedules a retransmission event or when the timer triggers a retransmission.
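To make the interaction concrete, the following is a minimal sketch of a shared table of retransmission events, written under simplifying assumptions. It is not the code used in our implementation: the names rexmt_event, rexmt_schedule, rexmt_ack, rexmt_tick, and seq_leq, as well as the table size, are hypothetical and chosen only for illustration. The sketch also omits the mutual exclusion that real processes need when they share the table.

```c
#include <assert.h>
#include <stdint.h>

#define MAXEVENTS 8              /* hypothetical table size */

typedef uint32_t seq_t;          /* 32-bit TCP sequence number */

/* One pending retransmission event (hypothetical layout). */
struct rexmt_event {
    int   in_use;                /* nonzero while the event is scheduled */
    seq_t end_seq;               /* sequence just past the segment's last octet */
    int   ticks;                 /* clock ticks left before retransmission */
};

static struct rexmt_event events[MAXEVENTS];

/* Modular comparison: true when a is at or before b in sequence space. */
static int seq_leq(seq_t a, seq_t b)
{
    return (int32_t)(a - b) <= 0;
}

/* Output side: schedule a retransmission; returns slot index or -1 if full. */
int rexmt_schedule(seq_t end_seq, int ticks)
{
    assert(ticks > 0);
    for (int i = 0; i < MAXEVENTS; i++) {
        if (!events[i].in_use) {
            events[i].in_use  = 1;
            events[i].end_seq = end_seq;
            events[i].ticks   = ticks;
            return i;
        }
    }
    return -1;
}

/* Input side: an ACK arrived; cancel every event it covers.
 * Returns the number of events cancelled. */
int rexmt_ack(seq_t ack)
{
    int cancelled = 0;
    for (int i = 0; i < MAXEVENTS; i++) {
        if (events[i].in_use && seq_leq(events[i].end_seq, ack)) {
            events[i].in_use = 0;
            cancelled++;
        }
    }
    return cancelled;
}

/* Timer side: advance the clock one tick.
 * Returns the number of events that expired and need retransmission. */
int rexmt_tick(void)
{
    int expired = 0;
    for (int i = 0; i < MAXEVENTS; i++) {
        if (events[i].in_use && --events[i].ticks == 0) {
            events[i].in_use = 0;
            expired++;
        }
    }
    return expired;
}
```

In this sketch the output side calls rexmt_schedule after sending a segment, the input side calls rexmt_ack with the acknowledgement number of each arriving segment, and the timer calls rexmt_tick once per clock tick. The modular comparison in seq_leq keeps the cancellation correct when sequence numbers wrap around.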
11.3 Transmission Control Blocks
TCP coordinates the activities of transmission, reception, and retransmission for each TCP connection through a data structure shared by all processes. The data structure is known as a transmission control block or TCB. TCP maintains one TCB for each active connection. The TCB contains all information about the TCP connection, including the addresses and port numbers of the connection endpoints, the current round-trip time estimate, data that has been sent or received, whether acknowledgement or retransmission is needed, and any statistics TCP gathers about the use of the connection.
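Because each active connection has exactly one TCB, and the TCB records both endpoints, a TCB can be located from the addresses and ports carried in a segment. The sketch below illustrates the idea with a deliberately reduced structure. The names mini_tcb, tcbtab, tcb_alloc, and tcb_find are hypothetical, the table size is arbitrary, and the linear search is an assumption made for brevity rather than a description of how our implementation finds a TCB.

```c
#include <stddef.h>
#include <stdint.h>

#define NTCB 4                   /* hypothetical number of TCB slots */

/* Reduced TCB holding only the endpoint pair (hypothetical layout). */
struct mini_tcb {
    int      in_use;             /* nonzero while the slot holds a connection */
    uint32_t lip;                /* local IP address */
    uint16_t lport;              /* local TCP port */
    uint32_t rip;                /* remote IP address */
    uint16_t rport;              /* remote TCP port */
};

static struct mini_tcb tcbtab[NTCB];

/* Claim a free slot for a new connection; returns NULL if the table is full. */
struct mini_tcb *tcb_alloc(uint32_t lip, uint16_t lport,
                           uint32_t rip, uint16_t rport)
{
    for (int i = 0; i < NTCB; i++) {
        if (!tcbtab[i].in_use) {
            tcbtab[i].in_use = 1;
            tcbtab[i].lip    = lip;
            tcbtab[i].lport  = lport;
            tcbtab[i].rip    = rip;
            tcbtab[i].rport  = rport;
            return &tcbtab[i];
        }
    }
    return NULL;
}

/* Find the TCB whose endpoints match all four values; NULL if none does. */
struct mini_tcb *tcb_find(uint32_t lip, uint16_t lport,
                          uint32_t rip, uint16_t rport)
{
    for (int i = 0; i < NTCB; i++) {
        struct mini_tcb *p = &tcbtab[i];
        if (p->in_use && p->lip == lip && p->lport == lport &&
            p->rip == rip && p->rport == rport) {
            return p;
        }
    }
    return NULL;
}
```

All four values take part in the comparison because two connections may share a local address and port, as happens when several clients contact the same server, and differ only in the remote endpoint.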
Although the protocol standard defines the notion of the TCB and suggests some of the contents, it does not dictate all the details. Thus, a designer must choose the exact contents. Our example implementation places the information in structure tcb. In most cases field names match the names used in the protocol standard.
/* tcb.h */
/* TCP endpoint types */
#define TCPT_SERVER 1
#define TCPT_CONNECTION 2
#define TCPT_MASTER 3
/* TCP process info */
extern PROCESS tcpinp();
#define TCPISTK 4096 /* stack size for TCP input */
#define TCPIPRI 100 /* TCP runs at high priority */
#define TCPINAM "tcpinp" /* name of TCP input process */
#define TCPIARGC 0 /* count of args to tcpinp */
extern PROCESS tcpout();
#define TCPOSTK 4096 /* stack size for TCP output */
#define TCPOPRI 100 /* TCP runs at high priority */
#define TCPONAM "tcpout" /* name of TCP output process */
#define TCPOARGC 0 /* count of args to tcpout */
#define TCPQLEN 20 /* TCP process port queue length */
/* TCP exceptional conditions */
#define TCPE_RESET -1
#define TCPE_REFUSED -2
#define TCPE_TOOBIG -3
#define TCPE_TIMEDOUT -4
#define TCPE_URGENTMODE -5
#define TCPE_NORMALMODE -6
/* string equivalents of TCPE_*, in "tcpswitch.c" */
extern char *tcperror[];
#define READERS 1
#define WRITERS 2
/* tcb_flags */
#define TCBF_NEEDOUT 0x01 /* we need output */
#define TCBF_FIRSTSEND 0x02 /* no data to ACK */
#define TCBF_GOTFIN 0x04 /* no more to receive */
#define TCBF_RDONE 0x08 /* no more receive data to process */
#define TCBF_SDONE 0x10 /* no more send data allowed */
#define TCBF_DELACK 0x20 /* do delayed ACK's */
#define TCBF_BUFFER 0x40 /* do TCP buffering (default no) */
#define TCBF_PUSH 0x80 /* got a push; deliver what we have */
#define TCBF_SNDFIN 0x100 /* user process has closed; send a FIN */
#define TCBF_RUPOK 0x200 /* receive urgent pointer is valid */
#define TCBF_SUPOK 0x400 /* send urgent pointer is valid */
/* aliases, for user programs */
#define TCP_BUFFER TCBF_BUFFER
#define TCP_DELACK TCBF_DELACK
/* receive segment reassembly data */
#define NTCPFRAG 10
struct tcb {
short tcb_state; /* TCP state */
short tcb_ostate; /* output state */
short tcb_type; /* TCP type (one of the TCPT_ values) */
int tcb_mutex; /* tcb mutual exclusion */
short tcb_code; /* TCP code for next packet */
short tcb_flags; /* various TCB state flags */
short tcb_error; /* return error for user side */
IPaddr tcb_rip; /* remote IP address */
u_short tcb_rport; /* remote TCP port */
IPaddr tcb_lip; /* local IP address */
u_short tcb_lport; /* local TCP port */
struct netif *tcb_pni; /* pointer to our interface */
tcpseq tcb_suna; /* send unacked */
tcpseq tcb_snext; /* send next */
tcpseq tcb_slast; /* sequence of FIN, if TCBF_SNDFIN */
u_long tcb_swindow; /* send window size (octets) */
tcpseq tcb_lwseq; /* sequence of last window update */
tcpseq tcb_lwack; /* ack seq of last window update */
u_int tcb_cwnd; /* congestion window size (octets) */
u_int tcb_ssthresh; /* slow start threshold (octets) */
u_int tcb_smss; /* send max segment size (octets) */
tcpseq tcb_iss; /* initial send sequence */
int tcb_srt; /* smoothed Round Trip Time */
int tcb_rtde; /* Round Trip deviation estimator */
int tcb_persist; /* persist timeout value */
int tcb_keep; /* keepalive timeout value */
int tcb_rexmt; /* retransmit timeout value */
int tcb_rexmtcount; /* number of rexmts sent */
tcpseq tcb_rnext; /* receive next */
tcpseq tcb_rupseq; /* receive urgent pointer */
tcpseq tcb_supseq; /* send urgent pointer */
int tcb_lqsize; /* listen queue size (SERVERs) */
int tcb_listenq; /* listen queue port (SERVERs) */
struct tcb *tcb_pptcb; /* pointer to parent TCB (for ACCEPT) */
int tcb_ocsem; /* open/close semaphore */
int tcb_dvnum; /* TCP slave pseudo device number */
int tcb_ssema; /* send semaphore */