Synchronous Event Demultiplexing
Sidebar 12: Evaluating Synchronization Mechanism Performance
Operating systems provide a range of synchronization mechanisms to handle the needs of different applications. Although the performance of the synchronization mechanisms discussed in this chapter varies according to OS implementations and hardware, the following are general issues to consider:
• Condition variables and semaphores often have higher overhead than mutexes since their implementations are more complicated. Native OS mechanisms nearly always perform better than user-created substitutes, however, since they can take advantage of internal OS characteristics and hardware-specific tuning.
• Mutexes generally have lower overhead than readers/writer locks because they don't need to manage multiple waiter sets. Multiple reader threads can proceed in parallel, however, so readers/writer locks can scale better on multiprocessors when they are used to access data that are read much more than they are updated.
• Nonrecursive mutexes are more efficient than recursive mutexes. Moreover, recursive mutexes can cause subtle bugs if programmers forget to release them (But97), whereas nonrecursive mutexes reveal these problems by deadlocking immediately.
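The difference between recursive and nonrecursive locking can be observed without hanging a program. The following is a minimal sketch that assumes a POSIX Pthreads platform rather than the UI threads API used elsewhere in this chapter. It swaps in the Pthreads error-checking mutex type, which reports a same-thread reacquisition as EDEADLK instead of blocking, so the outcome can be inspected. The helper name relock_result is hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Hypothetical helper: locks a mutex of the given Pthreads type twice from
// the same thread and returns the result of the second lock attempt.
// Returns -1 if the mutex could not be set up.
// Do not pass PTHREAD_MUTEX_NORMAL: the second lock would block forever.
int relock_result(int mutex_type) {
  pthread_mutexattr_t attr;
  pthread_mutex_t lock;

  if (pthread_mutexattr_init(&attr) != 0)
    return -1;
  if (pthread_mutexattr_settype(&attr, mutex_type) != 0 ||
      pthread_mutex_init(&lock, &attr) != 0) {
    pthread_mutexattr_destroy(&attr);
    return -1;
  }
  pthread_mutexattr_destroy(&attr);

  int first = pthread_mutex_lock(&lock);   // First acquisition succeeds.
  assert(first == 0);
  (void) first;

  int second = pthread_mutex_lock(&lock);  // Reacquire in the same thread.

  // Release once per successful acquisition so the mutex can be destroyed.
  if (second == 0)
    pthread_mutex_unlock(&lock);
  pthread_mutex_unlock(&lock);
  pthread_mutex_destroy(&lock);
  return second;
}
```

Calling relock_result(PTHREAD_MUTEX_RECURSIVE) lets the second acquisition succeed silently, which is how a forgotten release can go unnoticed. Calling relock_result(PTHREAD_MUTEX_ERRORCHECK) surfaces the same mistake as an error code at the point where it occurs.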
6.5 Limitations with OS Concurrency Mechanisms
Developing networked applications using the native OS concurrency mechanisms outlined above can cause portability and reliability problems. To illustrate some of these problems, consider a function that uses a UI threads mutex to solve the auto-increment serialization problem we observed with request_count on page 131 in Section 6.4:
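typedef u_long COUNTER;
static COUNTER request_count; // File scope global variable.
static mutex_t m; // Protects request_count (initialized to zero).

virtual int handle_data (ACE_SOCK_Stream *) {
  while (logging_handler_.log_record () != -1) {
    // Keep track of number of requests.
    mutex_lock (&m);   // Acquire lock
    ++request_count;   // Count # of requests
    mutex_unlock (&m); // Release lock
  }
  mutex_lock (&m);
  int count = request_count;
  mutex_unlock (&m);
  ACE_DEBUG ((LM_DEBUG, "request_count = %d\n", count));
  return 0;
}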
In the code above, m is a variable of type mutex_t, which is automatically initialized to 0. In UI threads, any synchronization variable that's set to zero is initialized implicitly with its default semantics. For example, the variable m is a static variable that's initialized by default in the unlocked state. The first time the mutex_lock() function is called, it will therefore acquire ownership of the lock. Any other thread that attempts to acquire the lock must wait until the thread owning lock m releases it.
Although the code above solves the original synchronization problem, it suffers from the following drawbacks:
• Obtrusive. The solution requires changing the source code to add the mutex and its associated C functions. When developing a large software system, making these types of modifications manually will cause maintenance problems if changes aren't made consistently.
• Error-prone. Although the handle_data() method is relatively simple, it's easy for programmers to forget to call mutex_unlock() in more complex methods. Omitting this call will cause starvation for other threads that are blocked trying to acquire the mutex. Moreover, since a mutex_t is nonrecursive, deadlock will occur if the thread that owns mutex m tries to reacquire it before releasing it. In addition, we neglected to check the return value of mutex_lock() to make sure it succeeded, which can yield subtle problems in production applications.
• Unseen side-effects. It's also possible that a programmer will forget to initialize the mutex variable. As mentioned above, a static mutex_t variable is implicitly initialized on UI threads. No such guarantees are made, however, for mutex_t fields in objects allocated dynamically. Moreover, other OS thread implementations, such as Pthreads and Win32 threads, don't support these implicit initialization semantics; that is, all synchronization objects must be initialized explicitly.
• Non-portable. This code will work only with the UI threads synchronization mechanisms. Porting the handle_data() method to use Pthreads and Win32 threads will therefore require changing the locking code to use different synchronization APIs.
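To make the porting cost concrete, the following is a minimal sketch of the same counter protected with Pthreads instead of UI threads. It assumes a POSIX platform, and the names Request_Stats, stats_init, stats_increment, stats_read, count_requests, and count_with_threads are hypothetical. The mutex lives in a dynamically allocated object, so it is initialized explicitly, and every locking call has its return value checked.

```cpp
#include <cassert>
#include <pthread.h>
#include <vector>

typedef unsigned long COUNTER;

// Hypothetical holder for the counter and the mutex that protects it.
struct Request_Stats {
  COUNTER request_count;
  pthread_mutex_t lock;
};

// Explicit initialization: nothing is implied for dynamically allocated
// objects. Returns 0 on success or a Pthreads error number.
int stats_init(Request_Stats *s) {
  s->request_count = 0;
  return pthread_mutex_init(&s->lock, nullptr);
}

// Increment the counter, propagating any locking error to the caller.
int stats_increment(Request_Stats *s) {
  int rc = pthread_mutex_lock(&s->lock);
  if (rc != 0) return rc;
  ++s->request_count;
  return pthread_mutex_unlock(&s->lock);
}

// Copy the counter out while holding the lock.
int stats_read(Request_Stats *s, COUNTER *out) {
  int rc = pthread_mutex_lock(&s->lock);
  if (rc != 0) return rc;
  *out = s->request_count;
  return pthread_mutex_unlock(&s->lock);
}

struct Worker_Args { Request_Stats *stats; int iterations; };

// Thread entry point: counts a fixed number of simulated requests.
extern "C" void *count_requests(void *p) {
  Worker_Args *args = static_cast<Worker_Args *>(p);
  for (int i = 0; i < args->iterations; ++i)
    if (stats_increment(args->stats) != 0) break;
  return nullptr;
}

// Runs n_threads workers concurrently and returns the final count.
COUNTER count_with_threads(int n_threads, int iterations) {
  Request_Stats *stats = new Request_Stats;
  int rc = stats_init(stats);
  assert(rc == 0);
  (void) rc;

  Worker_Args args = { stats, iterations };
  std::vector<pthread_t> ids;
  for (int i = 0; i < n_threads; ++i) {
    pthread_t id;
    if (pthread_create(&id, nullptr, count_requests, &args) != 0) break;
    ids.push_back(id);
  }
  for (pthread_t id : ids)
    pthread_join(id, nullptr);

  COUNTER total = 0;
  stats_read(stats, &total);
  pthread_mutex_destroy(&stats->lock);
  delete stats;
  return total;
}
```

Every call in this sketch differs in name and type from its UI threads counterpart, and the initialization and cleanup steps have no equivalent in the original listing. A Win32 port would change them all again.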
In general, native OS concurrency APIs exhibit many of the same types of problems associated with the Socket API in Section 2.3. In addition, concurrency APIs are even less standardized across OS platforms. Even where the APIs are similar, there are often subtle syntactic and semantic variations from one implementation to another. For example, functions in different drafts of the Pthreads standard implemented by different operating system providers have different parameter lists and return different error indicators.
Sidebar 13 outlines some of the differences between error propagation strategies for different concurrency APIs. This lack of portability increases the accidental complexity of concurrent networked applications. Therefore, it's important to design higher-level programming abstractions, such as those in ACE, to help developers avoid problems with nonportable and nonuniform APIs.
Sidebar 13: Concurrency API Error Strategies
Different concurrency APIs report errors to callers differently. For example, some APIs, such as UI threads and Pthreads, return 0 on success and a non-0 error value on failure. Other APIs, such as Win32 threads, return 0 on failure and indicate the error value via thread-specific storage. This diversity of behaviors is confusing and nonportable. In contrast, the ACE concurrency wrapper facades define and enforce a uniform approach that always returns -1 if a failure occurs and sets a thread-specific errno value to indicate the cause of the failure.
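One way to picture a uniform error strategy is a thin adapter over an API that returns error numbers directly. The sketch below assumes POSIX Pthreads and maps its results onto a -1 plus errno convention. The names to_uniform, uniform_mutex_lock, uniform_mutex_trylock, uniform_mutex_unlock, and trylock_while_held are hypothetical and are not taken from ACE.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Hypothetical adapter: Pthreads functions return 0 on success or an error
// number on failure. Convert that into 0 / -1 with the cause left in errno.
inline int to_uniform(int pthreads_result) {
  if (pthreads_result == 0)
    return 0;
  errno = pthreads_result;
  return -1;
}

int uniform_mutex_lock(pthread_mutex_t *m) {
  return to_uniform(pthread_mutex_lock(m));
}

int uniform_mutex_trylock(pthread_mutex_t *m) {
  return to_uniform(pthread_mutex_trylock(m));
}

int uniform_mutex_unlock(pthread_mutex_t *m) {
  return to_uniform(pthread_mutex_unlock(m));
}

// Demonstration helper: tries to lock a mutex that is already held and
// returns the errno value reported for the failed attempt (0 if the attempt
// unexpectedly succeeded, -1 if setup failed).
int trylock_while_held() {
  pthread_mutex_t m;
  if (to_uniform(pthread_mutex_init(&m, nullptr)) == -1)
    return -1;

  int rc = uniform_mutex_lock(&m);
  assert(rc == 0);
  (void) rc;

  errno = 0;
  int second = uniform_mutex_trylock(&m);  // Fails: the mutex is held.
  int saved_errno = errno;                 // Save before further calls.

  if (second == 0)
    uniform_mutex_unlock(&m);
  uniform_mutex_unlock(&m);
  pthread_mutex_destroy(&m);
  return second == -1 ? saved_errno : 0;
}
```

Callers of the adapted functions can test for -1 everywhere and consult errno only on failure, regardless of how the underlying API reported the error.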
6.6 Summary
Operating systems provide concurrency mechanisms that manage multiple processes on an end system and manage multiple threads within a process.
Any decent general-purpose OS allows multiple processes to run concurrently. Modern general-purpose and real-time operating systems also allow multiple threads to run concurrently. When used in conjunction with the appropriate patterns and application configurations, concurrency helps to improve performance and simplify program structure.
The concurrency wrapper facade classes provided by ACE are described in the following four chapters:
• Chapter 7 describes ACE classes for synchronous event demultiplexing.
• Chapter 8 describes ACE classes for OS process mechanisms.
• Chapter 9 describes ACE classes for OS threading mechanisms.
• Chapter 10 describes ACE classes for OS synchronization mechanisms.
Throughout these four chapters we'll examine how ACE uses C++ features and the Wrapper Facade pattern to overcome problems with native OS concurrency APIs and improve the functionality, portability, and robustness of concurrent networked applications. Where appropriate, we'll show ACE concurrency wrapper facade implementations to illustrate how they are mapped onto underlying OS concurrency mechanisms. We also point out where the features of OS platforms differ and how ACE shields developers from these differences.
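As a rough preview of the idea, the sketch below shows how C++ constructors and destructors can hide mutex initialization and guarantee release. It assumes POSIX Pthreads underneath. The classes Sketch_Mutex and Sketch_Guard and the functions count_request and current_count are hypothetical illustrations, not ACE's actual classes or implementation.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Hypothetical wrapper facade over a Pthreads mutex. The constructor always
// initializes the lock and the destructor always destroys it.
class Sketch_Mutex {
public:
  Sketch_Mutex() {
    int rc = pthread_mutex_init(&lock_, nullptr);
    assert(rc == 0);
    (void) rc;
  }
  ~Sketch_Mutex() { pthread_mutex_destroy(&lock_); }

  // Both methods return 0 on success, or -1 with errno set on failure.
  int acquire() { return adapt(pthread_mutex_lock(&lock_)); }
  int release() { return adapt(pthread_mutex_unlock(&lock_)); }

  Sketch_Mutex(const Sketch_Mutex &) = delete;
  Sketch_Mutex &operator=(const Sketch_Mutex &) = delete;

private:
  static int adapt(int result) {
    if (result == 0) return 0;
    errno = result;
    return -1;
  }
  pthread_mutex_t lock_;
};

// Hypothetical scoped guard: acquires in the constructor and releases in the
// destructor, so no return path can forget to unlock.
class Sketch_Guard {
public:
  explicit Sketch_Guard(Sketch_Mutex &m)
    : mutex_(m), owner_(m.acquire() == 0) {}
  ~Sketch_Guard() { if (owner_) mutex_.release(); }
  bool locked() const { return owner_; }

  Sketch_Guard(const Sketch_Guard &) = delete;
  Sketch_Guard &operator=(const Sketch_Guard &) = delete;

private:
  Sketch_Mutex &mutex_;
  bool owner_;
};

typedef unsigned long COUNTER;
static COUNTER request_count = 0;
static Sketch_Mutex request_count_lock;  // Protects request_count.

// Count one request. Returns 0 on success, -1 if the lock wasn't acquired.
int count_request() {
  Sketch_Guard guard(request_count_lock);
  if (!guard.locked()) return -1;
  ++request_count;
  return 0;
}  // The guard releases the lock here.

// Read the counter under the lock.
COUNTER current_count() {
  Sketch_Guard guard(request_count_lock);
  return request_count;
}
```

Application code in this sketch names no Pthreads function, so the platform-specific calls are confined to one class, and the lock is released on every path out of count_request().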
CHAPTER 7