
Avoiding Common Pitfalls

The two common pitfalls you’re likely to encounter with the PUSH/PULL pattern in Node.js are the first-joiner problem and the limited-resource problem.

The First-Joiner Problem

The first-joiner problem is the result of ØMQ being so fast at sending messages and Node.js being so fast at accepting them. Since it takes time to establish a connection, the first puller to successfully connect will pull many or all of the available messages before the second joiner even has a chance to get into the rotation.

To fix this problem, the pusher needs to wait until all of the pullers are ready to receive messages before pushing any. Let’s consider a real-world scenario and how we’d solve it.

Say you have a Node cluster, and the master process plans to PUSH a bunch of jobs to three worker processes. Before the master can start pushing, the workers need a way to signal back to the master that they’re ready to start pulling jobs. They also need a way to communicate the results of the jobs that they’ll eventually complete.

Figure 9, A Node.js cluster that pushes work to a pool of workers, on page 61, shows this setup.

Figure 9—A Node.js cluster that pushes work to a pool of workers

As in previous diagrams, rectangles are Node processes, ovals are resources, and heavy arrows point in the direction the connection is established. The job, ready, and result messages are shown as light arrow boxes pointing in the direction they are sent.

In the top half of the figure, we have the main communication channel—the master’s PUSH socket hooked up to the workers’ PULL sockets. This is how jobs will be sent to workers.

In the bottom half of the figure, we have a backchannel. This time the master has a PULL socket connected to each worker’s PUSH sockets. The workers can PUSH messages like their readiness to work or job results back to the master.
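To make the worker's side of this exchange concrete, here is a minimal sketch, not a definitive implementation. The helper name `startWorker`, the `doJob` callback, and the JSON message shape with a `type` field are all assumptions made for illustration. The `puller` and `pusher` arguments stand in for the worker's PULL and PUSH sockets; the sketch only assumes they offer `on('message', ...)` and `send(...)`.

```javascript
'use strict';

// Hypothetical helper: wires up one worker.
// `puller` stands in for the worker's PULL socket (jobs come in here).
// `pusher` stands in for the worker's PUSH socket (the backchannel).
// `doJob` is an assumed synchronous function that performs one job.
function startWorker(puller, pusher, doJob) {
  puller.on('message', function (data) {
    // Assumed message shape: { type: 'job', job: ... }
    const msg = JSON.parse(data);
    const result = doJob(msg.job);

    // Report the outcome over the backchannel.
    pusher.send(JSON.stringify({
      type: 'result',
      pid: process.pid,
      result: result
    }));
  });

  // Announce readiness only after the job handler is attached,
  // so no job can arrive before the worker is listening for it.
  pusher.send(JSON.stringify({ type: 'ready', pid: process.pid }));
}
```

One limitation to keep in mind: sending the ready message right away does not prove that the PULL socket's own connection has finished, so a real worker may want to delay that message slightly or confirm the connection some other way.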

The master process is the stable part of the architecture, so it binds while the workers connect. Since all of the processes are local to the same machine, it makes sense to use IPC for the transport.
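On the master's side, the fix amounts to counting ready messages and holding back the jobs until every expected worker has checked in. The following is a rough sketch under stated assumptions, not the book's solution: `startMaster`, `expectedWorkers`, `onResult`, and the `type` field on each message are hypothetical names chosen for illustration, and `pusher` and `puller` stand in for the master's bound PUSH and PULL sockets.

```javascript
'use strict';

// Hypothetical helper: holds all jobs until `expectedWorkers` ready
// messages have arrived on the backchannel, then pushes them.
// `pusher` stands in for the master's PUSH socket (jobs go out here).
// `puller` stands in for the master's PULL socket (the backchannel).
function startMaster(pusher, puller, jobs, expectedWorkers, onResult) {
  let readyCount = 0;
  let started = false;

  puller.on('message', function (data) {
    // Assumed message shapes: { type: 'ready' } and { type: 'result', ... }
    const msg = JSON.parse(data);

    if (msg.type === 'ready') {
      readyCount += 1;
      // Start pushing exactly once, when the last worker checks in.
      if (!started && readyCount >= expectedWorkers) {
        started = true;
        jobs.forEach(function (job) {
          pusher.send(JSON.stringify({ type: 'job', job: job }));
        });
      }
    } else if (msg.type === 'result') {
      onResult(msg);
    }
  });
}
```

Because the helper only relies on `send` and `on('message', ...)`, it can be exercised with plain stand-in objects before any real sockets are involved. In the cluster described here, the master would bind both sockets to IPC endpoints before forking the workers.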

The bonus challenge in Bidirectional Messaging, on page 63, will ask you to implement the PUSH/PULL cluster described earlier. Note, however, that this is just one solution to the first-joiner problem. You could choose to use a different messaging pattern for the backchannel, like request/reply.

The Limited-Resource Problem

The other common pitfall is the limited-resource problem. Node.js is at the mercy of the operating system with respect to the number of resources it can have open at the same time. In Unix-speak, each open resource is tracked by a file descriptor, and every process is allowed only so many of them.

Whenever your Node program opens a file or a TCP connection, it uses one of its available file descriptors. When there are none left, Node will start failing to connect to resources when asked. This is an extremely common problem for Node.js developers. I have yet to meet a Node developer who hasn’t grappled with the limited-resource problem at one time or another.

Strictly speaking, this problem isn’t limited to the PUSH/PULL scenario, but it’s very likely to happen there, and here’s why. Since Node.js is asynchronous, the puller process can start working on many jobs simultaneously. Every time a message event comes in, the Node process invokes the handler and starts working on the job. If these jobs require accessing system resources (and they almost certainly will), you’re liable to exhaust the pool of available file descriptors. Then jobs will quickly start failing.
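One way to keep a puller from starting too many jobs at once is to put a small concurrency limiter between the message handler and the work itself. The sketch below is an illustration only, and it is a different technique from connection pooling: `createLimiter`, `run`, and `stats` are hypothetical names, and each task is assumed to call the `done` callback it receives when its work is finished.

```javascript
'use strict';

// Hypothetical helper: runs at most `limit` tasks at a time and
// queues the rest in memory until a running task finishes.
function createLimiter(limit) {
  let active = 0;
  const waiting = [];

  function next() {
    if (active >= limit || waiting.length === 0) {
      return;
    }
    active += 1;
    const task = waiting.shift();

    // Guard against a task calling `done` more than once.
    let finished = false;
    task(function done() {
      if (finished) {
        return;
      }
      finished = true;
      active -= 1;
      next(); // a slot opened up, so start the next queued task
    });
  }

  return {
    // Queue a task; it starts immediately if a slot is free.
    run: function (task) {
      waiting.push(task);
      next();
    },
    // Report how many tasks are running and how many are queued.
    stats: function () {
      return { active: active, queued: waiting.length };
    }
  };
}
```

A puller's message handler could hand each incoming job to `run` instead of starting it directly. Note that this only bounds how many jobs touch system resources at once. Messages still arrive as fast as they are delivered, and the ones that are waiting sit in memory.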

We’ll explore this topic and its solutions in more detail in Chapter 5, Accessing Databases, on page 65, when we limit the number of concurrent database connections through a technique called connection pooling.

Wrapping Up

This chapter brought us out of the Node.js core and into the larger world of npm. We discussed how to install and use third-party modules with binary components. In particular, we covered how to use ØMQ.

ØMQ supports a number of message-passing patterns; we got to know several of them. We saw how ØMQ does the publish/subscribe pattern, the request/reply pattern, and the PUSH/PULL pattern. These patterns are now tools at your disposal for designing networked applications in Node.js, even if you choose not to use the ØMQ library itself.

We also explored Node’s clustering capabilities. Using features of Node’s cluster module, we spun up a number of worker processes and distributed requests to them. We’ll use these capabilities again in Chapter 6, Scalable Web Services, on page 87.

The following bonus questions ask you to modify and create new Node.js and ØMQ programs using what you learned from the chapter.

In the next chapter, we’ll cover how to use Node to interact asynchronously with several popular databases.
