Asynchronous input/output - Transferring data using sockets

Writing a chat application

3.5 Transferring data using sockets

3.5.2 Asynchronous input/output

Nim supports many abstractions that make working with asynchronous I/O simple.

This is achieved in part by making asynchronous I/O very similar to synchronous I/O, so your I/O code doesn’t need to be particularly complex.

Figure 3.13 The steps needed to start accepting new connections on a server socket (diagram labels: server socket, bindAddr with 7687.Port and "localhost", listen, client socket, thread blocked)

Let’s first look at the accept procedure in more detail. This procedure takes one parameter, a server socket, which is used to retrieve new clients that have connected to the specified server socket.

The fundamental difference between the synchronous and asynchronous versions of the accept procedure is that the synchronous accept procedure blocks the thread it’s called in until a new socket has connected to the server socket, whereas the asynchronous accept procedure returns immediately after it’s called.

But what does the asynchronous version return? It certainly can’t return the accepted socket immediately, because a new client may not have connected yet.

Instead, it returns a Future[AsyncSocket] object. To understand asynchronous I/O, you’ll need to understand what a future is, so let’s look at it in more detail.
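As a minimal sketch of that behavior, the following code calls the asynchronous accept from the asyncnet module and checks that the returned future is still empty. Binding to port 0 (so the operating system picks a free port) and the variable names are assumptions made for this sketch; the chapter's own server uses port 7687.

```nim
import asyncdispatch, asyncnet

# Create a server socket and start listening. Port 0 is an assumption for
# this sketch: it asks the operating system for any free port.
let server = newAsyncSocket()
server.bindAddr(Port(0))
server.listen()

# The asynchronous accept returns at once, without waiting for a client.
let clientFut: Future[AsyncSocket] = server.accept()

# No client has connected yet, so the future is still empty.
doAssert(not clientFut.finished)
echo("accept returned immediately; future finished? ", clientFut.finished)
```

The program doesn't block on accept: it reaches the echo call even though no client ever connects.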

THE FUTURE TYPE

A Future is a special type that goes by many names in other languages, including promise, delay, and deferred. This type acts as a proxy for a result that’s initially unknown, usually because the computation of its value is not yet complete.

You can think of a future as a container; initially it’s empty, and while it remains empty you can’t retrieve its value. At some unknown point in time, something is placed in the container and it’s no longer empty. That’s where the name future comes from.

Every asynchronous operation in Nim returns a Future[T] object, where T corresponds to the type of value that the Future promises to store in the future.

The Future[T] type is defined in the asyncdispatch module, and you can easily experiment with it without involving any asynchronous I/O operations. The next listing shows the behavior of a simple Future[int] object.

Listing 3.14 Simple Future[int] example

# The asyncdispatch module needs to be imported because it defines the Future[T] type.
import asyncdispatch

# A new future can be initialized with the newFuture constructor.
var future = newFuture[int]()
# A future starts out empty; when a future isn’t empty, the finished procedure will return true.
doAssert(not future.finished)

# A callback can be set, and it will be called when the future’s value is set.
# The callback is given the future whose value was set as a parameter.
future.callback =
  proc (future: Future[int]) =
    # The read procedure is used to retrieve the value of the future.
    echo("Future is no longer empty, ", future.read)

# A future’s value can be set by calling the complete procedure.
future.complete(42)

Futures can also store an exception in case the computation of the value fails. Calling read on a Future that contains an exception will result in an error.

To demonstrate the effects of this, modify the last line of listing 3.14 to future.fail(newException(ValueError, "The future failed")). Then compile and run it.

The application should crash with the following output:

Traceback (most recent call last)
system.nim(2510)          ch3_futures
asyncdispatch.nim(242)    fail
asyncdispatch.nim(267)    :anonymous
ch3_futures.nim(8)        :anonymous
asyncdispatch.nim(289)    read
Error: unhandled exception: The future failed
  unspecified's lead up to read of failed Future:
    Traceback (most recent call last)
    system.nim(2510)          ch3_futures
    asyncdispatch.nim(240)    fail [Exception]

As you can see, the error message attempts to include as much information as possible. But the way it’s presented isn’t ideal. The error messages produced by futures are still being worked on and should improve with time. It’s a good idea to get to know what they look like currently, as you’ll undoubtedly see them when writing asynchronous applications in Nim.

The preceding exception is caused by calling read on a future that had an exception stored inside it. To prevent that from occurring, you can use the failed procedure, which returns a Boolean that indicates whether the future completed with an exception.
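A small sketch of that check might look like the following. The future name "demoFuture" is arbitrary and chosen only for this example, and readError is used here to inspect the stored exception without raising it.

```nim
import asyncdispatch

# "demoFuture" is an arbitrary name picked for this sketch; it would appear
# in error output in place of "unspecified".
var future = newFuture[int]("demoFuture")
future.fail(newException(ValueError, "The future failed"))

if future.failed:
  # readError returns the stored exception instead of raising it.
  echo("Future failed: ", future.readError().msg)
else:
  echo("Future value: ", future.read)
```

Because read is only called when failed returns false, the program reports the failure instead of crashing.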

One important thing to keep in mind when working with futures is that unless they’re explicitly read, any exceptions that they store may silently disappear when the future is deallocated. As such, it’s important not to discard futures but to instead use the asyncCheck procedure to ensure that any exceptions are reraised in your program.
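The following sketch shows one way asyncCheck might be used in place of discard. The notify procedure and the notified variable are hypothetical, and sleepAsync merely stands in for a real asynchronous operation.

```nim
import asyncdispatch

var notified = false

# Hypothetical async procedure whose result the caller doesn't need.
proc notify() {.async.} =
  await sleepAsync(10)
  notified = true

# Rather than `discard notify()`, hand the future to asyncCheck so that a
# stored exception would be raised instead of silently lost.
asyncCheck notify()

# Run the event loop until nothing is left pending.
while hasPendingOperations():
  poll()

echo("notified: ", notified)
```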

THE DIFFERENCE BETWEEN SYNCHRONOUS AND ASYNCHRONOUS EXECUTION

Hopefully, by now you understand how futures work. Let’s go back to learning a little bit more about asynchronous execution in the context of the accept procedure. Figure 3.14 shows the difference between calling the synchronous version of accept and the asynchronous version.

As mentioned earlier, the asynchronous accept returns a Future object immediately, whereas the synchronous accept blocks the current thread. While the thread is blocked in the synchronous version, it’s idle and performs no useful computational work. The asynchronous version, on the other hand, can perform computational work as long as this work doesn’t require the client socket. It may involve client sockets that have connected previously, or it may involve calculating the 1025th decimal digit of π. In figure 3.14, this work is masked beneath a doWork procedure, which could be doing any of the tasks mentioned.

In the error output shown earlier, unspecified is the name of the Future. It’s called unspecified because the future was created with no name. You can name futures for better debugging by specifying a string in the newFuture constructor.

The asynchronous version performs many more calls to doWork() than the synchronous version. It also retains the call to doWork(socket), leading to the same code logic but very different performance characteristics.

It’s important to note that the asynchronous execution described in figure 3.14 has a problem. It demonstrates what’s known as busy waiting, which is repeatedly checking whether the Future is empty or not. This technique is very inefficient because CPU time is wasted on a useless activity.

To solve this, each Future stores a callback that can be overridden with a custom procedure. Whenever a Future is completed with a value or an exception, its callback is called. Using a callback in this case would prevent the busy waiting.
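As a rough sketch of the callback approach, the code below registers a callback once instead of checking the future in a loop. Here sleepAsync stands in for any asynchronous operation, and the callbackRan variable is an assumption of the example.

```nim
import asyncdispatch

var callbackRan = false

# sleepAsync stands in for any asynchronous operation; it returns a
# Future[void] that completes after roughly 50 ms.
let fut = sleepAsync(50)

# Register interest once, rather than testing `fut.finished` repeatedly.
fut.callback =
  proc (f: Future[void]) =
    callbackRan = true
    echo("Future completed; failed? ", f.failed)

# Run the event loop until the future completes; the callback is invoked
# as part of that completion.
waitFor fut
doAssert(callbackRan)
```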

EXAMPLE OF ASYNCHRONOUS I/O USING CALLBACKS

The term callback provokes a feeling of horror in some people. But not to worry. You won’t be forced to use callbacks in Nim. Although the most basic notification mechanism Futures expose is a callback, Nim provides what’s known as async await, which hides these callbacks from you. You’ll learn more about async await later.

But although you’re not forced to use callbacks in Nim, I’ll first explain asynchronous I/O by showing you how it works with callbacks. That’s because you’re likely more familiar with callbacks than with async await. Let’s start with a comparison between Node and Nim, and not a comparison involving sockets but something much simpler: the reading of a file asynchronously.

Listing 3.15 Reading files asynchronously in Node

var fs = require('fs');

fs.readFile('/etc/passwd', function (err, data) {
  if (err) throw err;
  console.log(data);
});

Figure 3.14 The difference between synchronous and asynchronous accept (diagram: in the synchronous case the thread is blocked until a client socket connects, and then doWork(socket) is called; in the asynchronous case accept returns a Future, doWork() is called while the Future is empty, and once it isn’t, Future.read is followed by doWork(socket))

The code in the preceding listing is taken straight from Node’s documentation.² It simply reads the contents of the /etc/passwd file asynchronously. When this script is executed, the readFile function tells the Node runtime to read the file specified by the path in the first argument, and once it’s finished doing so, to call the function specified in the second argument. The readFile function itself returns immediately, and control is given back implicitly to the Node runtime.

Now compare it to the Nim version.

Listing 3.16 Reading files asynchronously in Nim

import asyncdispatch, asyncfile

# Opens the "/etc/passwd" file asynchronously and binds it to the file variable
var file = openAsync("/etc/passwd")
# Asks for all of the contents of the file to be read, and assigns the
# resulting Future[string] to the dataFut variable
let dataFut = file.readAll()
# Assigns a new callback to be called when the future completes
dataFut.callback =
  proc (future: Future[string]) =
    # Inside the callback, reads the contents of the future that should now be present
    echo(future.read())

# Explicitly runs the event loop that’s defined in the asyncdispatch module
runForever()

The Nim version may seem more complex at first, but that’s because Nim’s standard library doesn’t define a single readFile procedure, whereas Node’s standard library does. Instead, you must first open the file using the openAsync procedure to get an AsyncFile object, and then you can read data from that object.³

Other than that difference in standard library APIs, the Nim version also differs in one more important way: the readAll procedure doesn’t accept a callback. Instead, it returns a new instance of the Future type. The callback is then stored in the Future and is called once the future completes.

THE EVENT LOOP

In a Node application, the runtime is a form of event loop: it uses native operating system APIs to check for various events. One of these might be a file being successfully read or a socket receiving data from the server that it’s connected to. The runtime dispatches these events to the appropriate callbacks.

Nim’s event loop is defined in the asyncdispatch module. It’s similar to Node’s runtime in many ways, except that it needs to be explicitly executed. One way to do this is to call the runForever procedure. Figure 3.15 shows the behavior of the runForever procedure.

2 See the Node.js fs.readFile documentation: https://nodejs.org/api/fs.html#fs_fs_readfile_file_options_callback.

3 Creating a single readFile procedure would be a fairly trivial undertaking. I leave the challenge of creating such a procedure to you.

The Nim event loop puts you in control. The runForever procedure is simply a wrapper around the poll procedure, which the runForever procedure calls in an infinite loop. You can call the poll procedure yourself, which will give you greater control over the event loop. The poll procedure waits for events for a specified number of milliseconds (500 ms by default), but it doesn’t always take 500 ms to finish because events can occur much earlier than that. Once an event is created, the poll procedure processes it and checks each of the currently pending Future objects to see if the Future is waiting on that event. If it is, the Future’s callback is called, and any appropriate values that are stored inside the future are populated.

In contrast to synchronous I/O, which can block for an unlimited amount of time, the poll procedure also blocks, but only for a finite amount of time, which can be freely specified. This allows you to commit a certain amount of time to I/O processing and the rest to other tasks, such as drawing a GUI or performing a CPU-intensive calculation. I’ll show you how to utilize this procedure later in the client module, so that async sockets can be mixed with the readLine procedure that reads the standard input stream in another thread.
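A minimal sketch of driving the event loop by hand could look like this. The 50 ms timeout, the otherWork counter, and the use of sleepAsync as a stand-in for pending I/O are all assumptions made for illustration.

```nim
import asyncdispatch

var otherWork = 0

# sleepAsync stands in for pending I/O (an assumption of this sketch).
let pending = sleepAsync(200)

while not pending.finished:
  # Spend at most 50 ms waiting for events...
  poll(50)
  # ...and use the remaining time for other tasks, such as redrawing a GUI.
  inc otherWork

echo("Performed other work ", otherWork, " time(s) while waiting")
```

Choosing a shorter timeout gives the other tasks more frequent turns, at the cost of calling poll more often.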

ASYNC AWAIT

There’s a big problem with using callbacks for asynchronous I/O: for complex application logic, they’re not flexible, leading to what’s aptly named callback hell. For example, suppose you want to read another file after a first one has been read. To do so, you’re forced to nest callbacks, and you end up with code that becomes ugly and unmaintainable.
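To make the nesting concrete, here is a sketch of two sequential reads written with callbacks only. The fakeRead procedure is a hypothetical stand-in for an asynchronous file read (it completes with made-up contents after a short delay), and the file names are placeholders.

```nim
import asyncdispatch

# Hypothetical stand-in for an asynchronous file read: it completes with
# fake contents after a short delay.
proc fakeRead(name: string): Future[string] =
  let fut = newFuture[string]("fakeRead")
  sleepAsync(10).callback =
    proc (timer: Future[void]) =
      fut.complete("contents of " & name)
  return fut

var done = false

fakeRead("first.txt").callback =
  proc (first: Future[string]) =
    echo(first.read)
    # The second read can only be started inside the first callback...
    fakeRead("second.txt").callback =
      proc (second: Future[string]) =
        echo(second.read)
        # ...and a third step would need yet another level of nesting.
        done = true

# Drive the event loop until the innermost callback has run.
while not done:
  poll()
```

Each additional step adds one more level of indentation, and error handling would have to be repeated at every level.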

Nim has a solution to this problem: the await keyword. It eliminates callback hell completely and makes asynchronous code almost identical to synchronous code.

The await keyword can only be used inside procedures marked with the {.async.} pragma. The next listing shows how to read and write files using an async procedure.

Figure 3.15 Nim’s asyncdispatch event loop (diagram: runForever calls poll in a loop; poll is blocked for up to 500 ms when there are no events, and a read/write event leads to a call to Future.callback())

Listing 3.17 Reading files and writing to them in sequence using await

import asyncdispatch, asyncfile

# The {.async.} pragma is used to specify that the readFiles procedure is asynchronous.
proc readFiles() {.async.} =
  # Opens the ~/test.txt file asynchronously in fmReadWrite mode so that the file can be read and written to
  var file = openAsync("/home/profile/test.txt", fmReadWrite)
  # The await keyword signifies that readFiles should be paused until the file is fully read.
  let data = await file.readAll()
  # Displays the contents of the file
  echo(data)
  # Writes some data to the file. The procedure is paused until the data is successfully written to the file.
  await file.write("Hello!\n")
  file.close()

# Runs the event loop until readFiles finishes
waitFor readFiles()

Listing 3.17 performs the same actions and more than the code in listing 3.16. Every time the await keyword is used, the execution of the readFiles procedure is paused until the Future that’s awaited is completed. Then the procedure resumes its execution, and the value of the Future is read automatically. While the procedure is paused, the application continues running, so the thread is never blocked. This is all done in a single thread.

Multiple async procedures can be paused at any point, waiting for an event to resume them, and callbacks are used in the background to resume these procedures.

Every procedure marked with the {.async.} pragma must return a Future[T] object. In listing 3.17, the procedure might seem like it returns nothing, but it returns a Future[void]; this is done implicitly to avoid the pain of writing Future[void] all the time. Any procedure that returns a Future[T] can be awaited. Figure 3.16 shows what the execution of listing 3.17 looks like.

The waitFor procedure that’s used instead of runForever runs the event loop until the readFiles procedure finishes its execution. Table 3.2 compares all the different async keywords you’ve seen so far.

Table 3.2 Comparison of common async keywords

Procedure | Controls event loop directly | Use case | Description
runForever | Yes | Usually used for server applications that need to stay alive indefinitely. | Runs the event loop forever.
waitFor | Yes | Usually used for applications that need to quit after a specific asynchronous procedure finishes its execution. | Runs the event loop until the specified future completes.
poll | Yes | For applications that need precise control of the event loop. The runForever and waitFor procedures call this. | Listens for events for the specified amount of time.
asyncCheck | No | Used for discarding futures safely, typically to execute an async proc without worrying about its result. | Sets the specified future’s callback property to a procedure that will handle exceptions appropriately.
await | No | Used to execute another async proc whose result is needed in the line of code after the await. | Pauses the execution of an async proc until the specified future completes.

Figure 3.16 The execution of listing 3.17 (diagram: waitFor readFiles() calls openAsync(...) and await readAll(), which pauses readFiles; poll() runs while 30%, 80%, and then 100% of the file is read; readFiles resumes with echo(data) and await write(...) and is paused again; poll() runs until 100% of the file is written; readFiles finishes and the program exits)

WARNING: PROCEDURES THAT CONTROL THE EVENT LOOP Typically, runForever, waitFor, and poll shouldn’t be used within async procedures, because they control the event loop directly.
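The following sketch illustrates that division of labor: await is used inside async procedures, and waitFor appears only at the top level of the program. The fetchAnswer and addOne procedures are hypothetical and exist only for this example.

```nim
import asyncdispatch

# Hypothetical async procedures used only for illustration.
proc fetchAnswer(): Future[int] {.async.} =
  await sleepAsync(10)
  return 42

proc addOne(): Future[int] {.async.} =
  # Inside an async procedure, use await rather than waitFor or poll.
  let answer = await fetchAnswer()
  return answer + 1

# Only the top level of the program drives the event loop.
let total = waitFor addOne()
echo(total)
```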

Now, I’ll show you how to use await and asynchronous sockets to finish the implementation of the server.

In document Nim in Action (Page 103-111)