3.7 The Completed Implementation
3.7.3 iSCSI Initiator
The UNH-iSCSI initiator consists of two main threads of control.
1. The initiator (per-connection) transmit thread.
2. The initiator (per-connection) receive thread.

[Figure: iSCSI Layer and iSER Layer, with legend entries iSCSI Thread, iSER Thread, iSER Thread Within iSCSI Layer, and iSCSI Thread Within iSER Layer]
Figure 3-15: iSCSI/iSER Layer Interaction

19 A similar interaction also occurs between the iSCSI and the SCSI layer; however, this document does not cover that interaction in great detail.
The initiator transmit thread handles the transmission of iSCSI PDUs. Since the initiator registers itself as a Host Bus Adaptor with the operating system, it is required to implement a queue command function that allows the SCSI mid-level to pass it new SCSI commands. Each new command is added to the initiator’s queue by the queue command function, and the transmit thread is woken up in order to transmit the new command to the iSCSI target. The iSCSI initiator’s transmit thread uses the Send Control operational primitive to give the local iSER layer each new iSCSI PDU to transmit to the target.
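The hand-off between the queue command function and the transmit thread can be sketched as a producer/consumer queue. This is only an illustrative Python model, not the UNH-iSCSI code: the class name, the `send_control` callable standing in for the Send Control primitive, the dictionary used as a PDU, and the `None` shutdown sentinel are all assumptions made for the sketch.

```python
import queue
import threading


class InitiatorTransmitSketch:
    """Hypothetical model of the per-connection transmit path."""

    def __init__(self, send_control):
        # send_control is an assumed stand-in for the iSER layer's
        # Send Control operational primitive.
        self._send_control = send_control
        self._commands = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def queue_command(self, scsi_command):
        # Called on behalf of the SCSI mid-level. Putting an item on the
        # queue wakes the transmit thread if it is blocked in get().
        self._commands.put(scsi_command)

    def _run(self):
        while True:
            command = self._commands.get()  # sleeps until a command arrives
            if command is None:             # assumed shutdown sentinel
                break
            # Wrap the command in a (greatly simplified) iSCSI PDU and
            # hand it to the local iSER layer.
            pdu = {"opcode": "SCSI Command", "cdb": command}
            self._send_control(pdu)

    def close(self):
        self._commands.put(None)
        self._thread.join()


sent = []
tx = InitiatorTransmitSketch(send_control=sent.append)
tx.queue_command("READ(10)")
tx.queue_command("WRITE(10)")
tx.close()
print([pdu["cdb"] for pdu in sent])
```

Because a single thread drains a FIFO queue, PDUs reach the stand-in iSER layer in the order the commands were queued.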
The initiator receive thread waits for new incoming iSCSI PDUs on the network. The initiator provides the Control Notify and Data Completion Notify primitives to the iSER layer so that it can receive notification of new incoming iSCSI PDUs or data from the RDMA-capable connection. Upon receiving an iSCSI response PDU indicating that the iSCSI target has completed servicing the request for a previous iSCSI Command PDU, the initiator receive thread will signal the SCSI mid-level that the command has completed.

[Figure: Initiator Transmit Thread, Target Transmit Thread, Initiator Receive Thread, /proc filesystem or command interpreter thread, Linux SCSI Midlevel, iSER Layer]
Figure 3-16: iSCSI Initiator Thread Interactions
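The receive path described above can be modeled in the same spirit. The sketch below is an assumption-laden illustration rather than the actual implementation: `control_notify` stands in for the Control Notify primitive, `scsi_done` for the completion call into the SCSI mid-level, the hand-off through an internal queue is a modeling choice, and matching responses by `task_tag` is a simplification. Data Completion Notify is left out for brevity.

```python
import queue
import threading


class InitiatorReceiveSketch:
    """Hypothetical model of the per-connection receive path."""

    def __init__(self, scsi_done):
        self._scsi_done = scsi_done  # assumed mid-level completion callback
        self._pending = {}           # task tag -> outstanding SCSI command
        self._lock = threading.Lock()
        self._inbound = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def track(self, task_tag, command):
        # Remember a command that has been sent and awaits its response.
        with self._lock:
            self._pending[task_tag] = command

    def control_notify(self, pdu):
        # Assumed stand-in for Control Notify: the iSER layer calls this
        # to deliver a newly arrived iSCSI PDU to the initiator.
        self._inbound.put(pdu)

    def _run(self):
        while True:
            pdu = self._inbound.get()  # waits for the next incoming PDU
            if pdu is None:            # assumed shutdown sentinel
                break
            if pdu["opcode"] != "SCSI Response":
                continue               # other PDU types are ignored here
            with self._lock:
                command = self._pending.pop(pdu["task_tag"], None)
            if command is not None:
                # Signal the SCSI mid-level that the command has completed.
                self._scsi_done(command, pdu["status"])

    def close(self):
        self._inbound.put(None)
        self._thread.join()


completed = []
rx = InitiatorReceiveSketch(
    scsi_done=lambda command, status: completed.append((command, status))
)
rx.track(1, "READ(10)")
rx.control_notify({"opcode": "NOP-In", "task_tag": 0, "status": None})
rx.control_notify({"opcode": "SCSI Response", "task_tag": 1, "status": "GOOD"})
rx.close()
print(completed)
```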
In order to create the initiator’s transmit and receive threads for a new connection, the /proc file system is used in kernel-space mode and a command interpreter thread is used in user-space mode. Writing certain commands to the /proc file system entries that correspond to the iSCSI initiator (or to the command interpreter, in the case of user-space) will cause a new connection to be created. In iSER-assisted mode, a new connection is created by invoking some of the extended operational primitives, similar to the ones used in the target. Once the local iSER layer has completed making the connection to the target, the initiator receive and transmit threads are created and the new iSCSI connection begins. Figure 3-16 shows the interaction between the iSCSI initiator threads. As with the iSCSI target, the interaction between the iSCSI and iSER layer happens on behalf of the
CHAPTER 4
RESULTS
Our implementation is designed to support all of the protocols shown in Figure 3-1. It is also designed to run in both user-space and kernel-space and to use the three UNH-iSCSI IO modes (DISKIO, MEMORYIO and FILEIO). This design makes a large variety of test combinations possible; Table 4.1 lists them. FILEIO mode, while useful in development, is not interesting with respect to performance results, since it tells us nothing useful about the real-world use of our implementation. DISKIO mode is interesting since it demonstrates the system while using a real disk drive, and MEMORYIO is useful to demonstrate the performance of just the protocol stack (without the disk overhead). Note also that evaluation runs with a user-space target cannot use DISKIO mode because of the previously mentioned restrictions on using the Linux SCSI mid-level.
4.1 Tests Performed
Due to time constraints, not all of the tests shown in Table 4.1 were performed. The experiments performed for this thesis used the following subset of the possible configurations:
• iSER-assisted kernel-space target and initiator over iWARP with MEMORYIO mode.
• Traditional iSCSI kernel-space target and initiator with MEMORYIO mode.
• iSER-assisted user-space target and initiator over iWARP with MEMORYIO mode.
Initiator    Target       Initiator      Target
Transport    Transport    Environment    Environment    IO Mode
----------   ----------   ------------   ------------   --------
TCP          TCP          User-space     User-space     MEMORYIO
TCP          TCP          Kernel-space   User-space     MEMORYIO
TCP          TCP          User-space     Kernel-space   MEMORYIO
TCP          TCP          User-space     Kernel-space   DISKIO
TCP          TCP          Kernel-space   Kernel-space   MEMORYIO
TCP          TCP          Kernel-space   Kernel-space   DISKIO
OSC          OSC          User-space     User-space     MEMORYIO
iWARP        OSC          User-space     User-space     MEMORYIO
iWARP        OSC          Kernel-space   User-space     MEMORYIO
OSC          iWARP        User-space     User-space     MEMORYIO
OSC          iWARP        User-space     Kernel-space   MEMORYIO
OSC          iWARP        User-space     Kernel-space   DISKIO
iWARP        iWARP        User-space     User-space     MEMORYIO
iWARP        iWARP        User-space     Kernel-space   MEMORYIO
iWARP        iWARP        User-space     Kernel-space   DISKIO
iWARP        iWARP        Kernel-space   Kernel-space   MEMORYIO
iWARP        iWARP        Kernel-space   Kernel-space   DISKIO
Infiniband   Infiniband   User-space     User-space     MEMORYIO
Infiniband   Infiniband   User-space     Kernel-space   MEMORYIO
Infiniband   Infiniband   User-space     Kernel-space   DISKIO
Infiniband   Infiniband   Kernel-space   Kernel-space   MEMORYIO
Infiniband   Infiniband   Kernel-space   Kernel-space   DISKIO

TCP: Traditional iSCSI without iSER with software TCP.
OSC: iSCSI with iSER and software OSC iWARP/TCP.
iWARP: iSCSI with iSER, OFA CMA and hardware iWARP/TCP.
Infiniband: iSCSI with iSER, OFA CMA and hardware Infiniband.
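As a cross-check, the rows of Table 4.1 can be written out as data and tested against the stated restriction that a user-space target cannot use DISKIO mode. The sketch below is illustrative only; the `Config` type, its field names, and the helper function are assumptions introduced here, while the rows and the three performed configurations are taken from the text.

```python
from collections import namedtuple

# Field names are assumptions chosen to mirror the table's column headers.
Config = namedtuple(
    "Config",
    "initiator_transport target_transport initiator_env target_env io_mode",
)

U, K = "User-space", "Kernel-space"
M, D = "MEMORYIO", "DISKIO"

TABLE_4_1 = [
    Config("TCP", "TCP", U, U, M),
    Config("TCP", "TCP", K, U, M),
    Config("TCP", "TCP", U, K, M),
    Config("TCP", "TCP", U, K, D),
    Config("TCP", "TCP", K, K, M),
    Config("TCP", "TCP", K, K, D),
    Config("OSC", "OSC", U, U, M),
    Config("iWARP", "OSC", U, U, M),
    Config("iWARP", "OSC", K, U, M),
    Config("OSC", "iWARP", U, U, M),
    Config("OSC", "iWARP", U, K, M),
    Config("OSC", "iWARP", U, K, D),
    Config("iWARP", "iWARP", U, U, M),
    Config("iWARP", "iWARP", U, K, M),
    Config("iWARP", "iWARP", U, K, D),
    Config("iWARP", "iWARP", K, K, M),
    Config("iWARP", "iWARP", K, K, D),
    Config("Infiniband", "Infiniband", U, U, M),
    Config("Infiniband", "Infiniband", U, K, M),
    Config("Infiniband", "Infiniband", U, K, D),
    Config("Infiniband", "Infiniband", K, K, M),
    Config("Infiniband", "Infiniband", K, K, D),
]

# The three configurations actually evaluated, per the list above the table.
PERFORMED = [
    Config("iWARP", "iWARP", K, K, M),
    Config("TCP", "TCP", K, K, M),
    Config("iWARP", "iWARP", U, U, M),
]


def violates_diskio_rule(config):
    # A user-space target cannot use DISKIO (no Linux SCSI mid-level access).
    return config.io_mode == D and config.target_env == U


bad_rows = [c for c in TABLE_4_1 if violates_diskio_rule(c)]
missing = [c for c in PERFORMED if c not in TABLE_4_1]
print(len(TABLE_4_1), "rows;", len(bad_rows), "rule violations;",
      len(missing), "performed configurations missing from the table")
```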
The MEMORYIO mode was chosen in order to demonstrate the throughput of the iSER protocol without the overhead of a disk drive. For this project, the main interest was in demonstrating the benefits of using RDMA hardware to assist an iSCSI connection. The overhead of a disk drive was ignored, for these evaluations, since it is independent of the performance of the iSCSI protocol itself.
While these configurations cover only a small part of the set of possible test configurations, they do succeed in demonstrating the benefits of iSER over traditional iSCSI. Using these results, it was possible to compare the throughput of data transfers in both the read and write directions, with and without RDMA, over a 10GigE network connection.