THE WORLD OF DISTRIBUTED COMPUTING

What is COM?

COM stands for Component Object Model. To understand what COM is, recall how you built a car or a house from Lego or Meccano blocks when you were a kid (if not at present)! Such blocks are like components: several of them make up the whole car or house. COM is just like that. Using COM, your program becomes a collection of components. If you need to upgrade a particular component, you simply replace that component instead of rebuilding and recompiling the whole program!

Does it make sense? Not clear? OK, think about it in a different way. You already know how to make DLLs. You also know that a DLL written in C can be used in VB, and that an OCX control created in VB can run in Visual C++, Oracle or even AutoCAD! These are components. A component is itself binary code, so you can attach it to any program as long as that program supports COM.

Fortunately, most recent programming languages and applications support COM. Every COM object is complete by itself: you just add it to your program and it provides its service. COM components are not limited to DLLs and OCXs; even EXEs can be COM components.

From OLE to COM

OLE stands for Object Linking and Embedding. Object linking enables you to add a reference to another document from within your application's open document. Whenever the data in the original document changes, the contents of every document that contains the linked data change as well.

With object embedding, an actual copy of the source data is placed into the document. If you change the embedded data, nothing happens to the original.

After creating OLE, Microsoft decided that it should be extended so that applications could share not only data but also functionality. This extension was known as OLE2, and COM is the foundation on which it was built.

COM is a specification for creating binary objects that can communicate with each other. It specifies a strict set of rules that programmers must follow when creating such objects.

The internal architecture of COM

Interfaces are everything in COM! For COM, an interface is a specific memory structure containing an array of function pointers. Each array element contains the address of a function implemented by the component. To be precise, COM states what a block of memory must look like to be considered an interface. Note that the memory layout of a COM interface is the same as the one a C++ compiler generates for an abstract base class.

Figure 12-1 shows the memory layout for the abstract base class defined by the following declaration.

// 'interface' is a macro for 'struct' in the COM headers
interface IX
{
    virtual void __stdcall Fx1() = 0;
    virtual void __stdcall Fx2() = 0;
    virtual void __stdcall Fx3() = 0;
    virtual void __stdcall Fx4() = 0;
};

Figure 12-1

The ‘Virtual Function Table’ (vtbl) is an array of pointers to the implementations of the virtual functions. For example, the first entry in the vtbl above contains the address of function Fx1 as it is implemented in the derived class.

You may wonder why I am talking such gibberish! Well, things really are pretty complicated. In fact, to know COM thoroughly you must understand all these nasty details! If you try reading any COM book that explains the internal architecture of COM, I’m sure you’ll find it mind-boggling.
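To make this layout concrete, here is a minimal, portable sketch that builds such a table by hand instead of letting the compiler do it. It is only an illustration: the names IXVtbl, IXObject, make_CA and call_Fx2 are my own assumptions, the functions return strings so that we can see which slot was called, and real COM code on Windows would use __stdcall and a compiler-generated vtbl.

```cpp
#include <cassert>
#include <string>

// Hand-built equivalent of what a C++ compiler generates for an abstract
// base class with four virtual functions. All names are illustrative.
struct IXObject;  // the object an interface pointer points to

// The vtbl: a block of function pointers, one slot per method.
struct IXVtbl {
    std::string (*Fx1)(IXObject* self);
    std::string (*Fx2)(IXObject* self);
    std::string (*Fx3)(IXObject* self);
    std::string (*Fx4)(IXObject* self);
};

// The first member of the object is the vtbl pointer, so an interface
// pointer (pIX) effectively points at a pointer to the table.
struct IXObject {
    const IXVtbl* vtbl;  // vtbl pointer
    std::string name;    // per-instance data of the component
};

// Implementations supplied by a hypothetical component "CA".
std::string CA_Fx1(IXObject* self) { return self->name + "::Fx1"; }
std::string CA_Fx2(IXObject* self) { return self->name + "::Fx2"; }
std::string CA_Fx3(IXObject* self) { return self->name + "::Fx3"; }
std::string CA_Fx4(IXObject* self) { return self->name + "::Fx4"; }

// One shared table per component class: &Fx1, &Fx2, &Fx3, &Fx4.
const IXVtbl kCAVtbl = { CA_Fx1, CA_Fx2, CA_Fx3, CA_Fx4 };

IXObject make_CA() { return IXObject{ &kCAVtbl, "CA" }; }

// A call through the interface: pIX -> vtbl pointer -> slot -> function.
std::string call_Fx2(IXObject* pIX) { return pIX->vtbl->Fx2(pIX); }
```

A client that only knows the table layout can call call_Fx2(&obj) on an object returned by make_CA() without knowing anything else about the component, which is exactly why a fixed binary layout makes components interchangeable.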

(Figure 12-1 layout: pIX points to the vtbl pointer, which points to the Virtual Function Table (vtbl); its entries &Fx1, &Fx2, &Fx3 and &Fx4 are pointers to the member functions.)

Interfaces are similar to the timbers in a frame house. The timbers determine the house’s structure. If you don’t remove the timbers, the structure of the house remains as it is. You may change the walls from brick to log, but the structure remains the same. Similarly, components can be replaced to give the application different behavior, though architecturally the application remains the same. Thus, carefully designed interfaces can produce highly reusable architectures!

However, such designing is not an easy task!

To find out whether the component supports a particular interface, the client asks the component for that interface at runtime. For this purpose, the client uses the IUnknown interface. IUnknown declares a function named QueryInterface. The client calls QueryInterface to determine whether the component supports an interface. Remember that all COM interfaces are required to inherit from IUnknown. Every interface therefore has QueryInterface, AddRef and Release as the first three functions in its virtual table. See Figure 12-2.
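The idea can be sketched in portable C++ as follows. This is a simplified stand-in, not the real Windows API: IUnknownLike, InterfaceId, CreateInstance and the string identifiers are assumptions of mine (real COM uses IUnknown, GUID-based IIDs, HRESULT return values and __stdcall), but the pattern of asking for an interface at runtime and counting references is the same.

```cpp
#include <cassert>
#include <string>

// Simplified stand-ins for COM types; all names are illustrative.
using InterfaceId = std::string;
const InterfaceId IID_Unknown = "IUnknownLike";
const InterfaceId IID_X = "IX";
const InterfaceId IID_Y = "IY";  // an interface that CA does not support

struct IUnknownLike {
    // The three functions every interface starts with.
    virtual bool QueryInterface(const InterfaceId& iid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual ~IUnknownLike() = default;
};

struct IX : IUnknownLike {
    virtual int Fx() = 0;
};

// Component CA implements IX (and therefore IUnknownLike), but not IY.
class CA : public IX {
public:
    bool QueryInterface(const InterfaceId& iid, void** ppv) override {
        if (iid == IID_Unknown) {
            *ppv = static_cast<IUnknownLike*>(this);
        } else if (iid == IID_X) {
            *ppv = static_cast<IX*>(this);
        } else {
            *ppv = nullptr;  // unsupported interface
            return false;
        }
        AddRef();            // every pointer handed out holds a reference
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;  // the component destroys itself
        return left;
    }
    int Fx() override { return 42; }

private:
    unsigned long refs_ = 0;
};

// Hypothetical factory: returns the component with one reference
// already held on behalf of the caller.
IUnknownLike* CreateInstance() {
    IUnknownLike* p = new CA;
    p->AddRef();
    return p;
}
```

A client would call CreateInstance(), ask for IID_X through QueryInterface, use the returned IX pointer, and finally call Release once for every pointer it received. Asking for IID_Y simply fails, and the client can fall back to something else.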

Figure 12-2

I am not going to plunge you further into COM details, because a COM book typically contains more than 1000 pages! But I strongly advise you to read a good book dedicated to this subject for a better grasp of the whole thing.

(Figure 12-2 layout: the client holds pIX, which points to the vtbl pointer of component CA; the Virtual Function Table (vtbl) contains pointers to the member functions QueryInterface, AddRef, Release and Fx.)

How do I write COM?

Anything you write as a component in VB is automatically COM (whew)! Anything written in Visual C++ can also be COM if you tell it to be! To make your life easier (or more difficult), Microsoft ships a library called the Active Template Library (ATL) with Visual Studio. ATL helps you write COM objects from scratch. You can also write COM objects using the wizards in MFC.

ATL is a lightweight library of templates designed to make it easy to build small, fast ActiveX controls. Since ATL is implemented as a set of templates, there is very little runtime overhead for interface queries and passing. But using ATL is quite difficult (really).

ATL is not intended to be a general-purpose solution for writing any kind of program. ATL is optimized for use with COM.

What ATL does can also be done with MFC. But MFC applications tend to be bulky. However, MFC has a much broader range of applications.

In Visual Studio, you have a program called ‘OLE-COM Object Viewer’. Try opening a DLL file created in VB by clicking on the ‘View Type Lib’ icon. You will see the IDL created by VB for you. You may also try opening a sample ATL file such as the one created later in this chapter.

What is IDL?

IDL stands for ‘Interface Definition Language’. All COM components are joined to each other, or to a client application, through interfaces, and IDL is the language in which those interfaces are described. It is much like the way different machine parts are attached to each other using screws, nuts and bolts.

DCOM (Distributed COM) extends COM across process and machine boundaries. It provides network transparency and communication automation, so that communication can take place between objects without one object needing to be aware of another object's location. The objects can be in different processes on the same machine, or in separate processes on different machines.

What is CORBA?

CORBA means ‘Common Object Request Broker Architecture’. It is similar to COM (strictly speaking, it is more similar to DCOM). COM is Microsoft's technology, whereas CORBA is a vendor-neutral standard defined by the Object Management Group (OMG). Though I said they are similar, their internal architectures are quite different. A detailed discussion of CORBA is beyond the scope of this book. If you are interested in the details, please consult books dedicated to the topic.

Figure 12-3

The IDL interface definitions tell the clients of an object exactly what operations the object supports, the types of their parameters and what return type to expect. A client programmer needs only the IDL to write client code that is ready to invoke operations on a remote object. The client uses the data types defined in IDL through a language mapping. The mapping defines the programming language constructs (data types, classes etc.) that will be generated by the IDL compiler supplied by an ORB vendor.

The IDL compiler also generates stub code that the client links to; this translates, or marshals, the programming language data types into a wire format for transmission as a request message to an object implementation. The implementation of the object has similar marshaling code linked to it, called a skeleton, that unmarshals the request into programming language data types. A different IDL compiler with a different language mapping can generate the skeleton. In this way the object's method implementation can be invoked and the results returned by the same means.

(Figure 12-3 layout: Client → Client Proxy (Stub Code) → Object Request Broker → Skeleton Code → Object Implementation.)
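The division of labour between stub and skeleton can be sketched by hand in a few lines of C++. Please treat this as a toy under stated assumptions: a real ORB generates this code from IDL and moves the bytes over a network, whereas here the "transport" is a plain function call, and Calculator, CalculatorStub, skeleton_dispatch, OP_ADD and the big-endian wire format are all names and choices I made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using WireBuffer = std::vector<std::uint8_t>;

// Write a 32-bit integer in a fixed, platform-independent (big-endian)
// wire format.
void put_i32(WireBuffer& buf, std::int32_t v) {
    std::uint32_t u = static_cast<std::uint32_t>(v);
    for (int shift = 24; shift >= 0; shift -= 8)
        buf.push_back(static_cast<std::uint8_t>((u >> shift) & 0xFF));
}

// Read a 32-bit integer back and advance the read position.
std::int32_t get_i32(const WireBuffer& buf, std::size_t& pos) {
    std::uint32_t u = 0;
    for (int i = 0; i < 4; ++i) u = (u << 8) | buf.at(pos++);
    return static_cast<std::int32_t>(u);
}

// The object implementation living on the "server" side.
struct Calculator {
    std::int32_t add(std::int32_t a, std::int32_t b) { return a + b; }
};

enum : std::int32_t { OP_ADD = 1 };  // hypothetical operation code

// Skeleton: unmarshals the request, invokes the implementation and
// marshals the result into a reply message.
WireBuffer skeleton_dispatch(Calculator& impl, const WireBuffer& request) {
    std::size_t pos = 0;
    std::int32_t op = get_i32(request, pos);
    WireBuffer reply;
    if (op == OP_ADD) {
        std::int32_t a = get_i32(request, pos);
        std::int32_t b = get_i32(request, pos);
        put_i32(reply, impl.add(a, b));
    }
    return reply;
}

// Stub (client proxy): looks like the real object to the client, but
// only marshals arguments and unmarshals the reply.
class CalculatorStub {
public:
    explicit CalculatorStub(Calculator& remote) : remote_(remote) {}
    std::int32_t add(std::int32_t a, std::int32_t b) {
        WireBuffer request;
        put_i32(request, OP_ADD);
        put_i32(request, a);
        put_i32(request, b);
        // Stand-in for the ORB carrying the request to the server.
        WireBuffer reply = skeleton_dispatch(remote_, request);
        std::size_t pos = 0;
        return get_i32(reply, pos);
    }

private:
    Calculator& remote_;
};
```

The client simply writes stub.add(2, 3) and never sees the byte buffer. Because both sides agree only on the wire format, the stub and the skeleton could just as well have been generated for two different programming languages.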

Summary of the CORBA development process:

1. Write some IDL that describes the interface to the object(s) that we will use or implement.

2. Compile the IDL using the IDL compiler provided by the particular ORB. This produces the stub and skeleton code. At runtime this code converts an object reference into a network connection to a remote server, marshals the arguments we pass to an operation on the object reference, conveys them to the correct method in the object denoted by that reference, executes the method, and returns the results.

3. Identify the classes (header and implementation files) generated by the IDL compiler that we need to use or specialize in order to invoke or implement operations.

4. Write code to initialize the ORB and inform it of any CORBA objects we have created.

5. Compile all the generated code and our application code with the C++ (or other language) compiler.

6. Run the distributed application.

Sample ATL COM/DCOM project

1. In Visual C++, open a new ATL project named ATLquadratic.

2. In Step 1 of 1 of the wizard, select the server type as EXE.

3. Right-click on ‘ATLquadratic classes’ and create a New ATL Object…

4. Choose ‘Simple Object’ and give short name ‘Quad’. (In ATL Object Wizard Properties dialog box)

5. By right clicking on ‘IQuad’ on class view, add a new method.

6. The ‘Add Method to Interface’ dialog box opens.

7. Specify method name as ‘solve_quad’.

8. Write parameters as [in] int a, [in] int b, [in] int c, [out] double *root1, [out] double *root2, [out,retval] double *result

9. Expand ‘CQuad’ class in Class View and then ‘IQuad’ interface. By double clicking on ‘solve_quad’ method, open up ‘Quad.cpp’ source code file.

10. Add the following code. Visual C++ generated code is shown pink.

Code 12-1

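// Quad.cpp : Implementation of CQuad
#include "stdafx.h"
#include "ATLquadratic.h"
#include "Quad.h"
#include <math.h>

/////////////////////////////////////////////////////////////////////////////
// CQuad

// The [out,retval] parameter (*result) must be the last parameter.
STDMETHODIMP CQuad::solve_quad(int a, int b, int c, double * root1, double * root2, double * result)
{
    // TODO: Add your implementation code here

    // Reject null output pointers before writing through them.
    if (root1 == NULL || root2 == NULL || result == NULL)
        return E_POINTER;

    // a == 0 is not a quadratic and would cause a division by zero below.
    if (a == 0)
        return E_INVALIDARG;

    double determ;
    determ = (double)b * b - 4.0 * a * c;   // discriminant

    // A negative discriminant means there are no real roots.
    if (determ < 0)
        return E_INVALIDARG;

    *root1 = ((-b) + sqrt(determ)) / (2.0 * a);
    *root2 = ((-b) - sqrt(determ)) / (2.0 * a);
    *result = 1;   // value handed back to the caller as the return value
    return S_OK;
}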

11. Now compile the code.

12. Thus we have made a server COM object. To test it we need a client. We can use anything as front end. In this example we shall use VB.

13. Open a new project in VB. Add a reference to ‘ATLquadratic 1.0 Type Library’. Make a command button. Add the following code in the button's click event.

Code 12-2

Private Sub Command1_Click()
    On Error GoTo Hell

    Dim MyQuad As Quad
    Set MyQuad = New Quad
    Dim x As Double
    Dim r1 As Double
    Dim r2 As Double

    x = MyQuad.solve_quad(1, -5, 6, r1, r2)
    MsgBox "Root1= " & r1 & vbCrLf & "Root2= " & r2
    Set MyQuad = Nothing
    Exit Sub

Hell:
    MsgBox "No real solution"
End Sub

14. Run the VB project. On clicking the button it should show the roots of the specified quadratic equation.

15. This is just a very simple example of COM! For further (and real world) discussion of COM/DCOM/COM+/CORBA etc. I again strongly advise you to consult books dedicated to these subjects!

16. After step 8, you may note that Visual C++ will itself create an ‘idl’ file (with lots of other files as well).
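If you do not have Visual C++ and VB at hand, the arithmetic behind solve_quad can still be tried as plain, portable C++ without any COM plumbing. The sketch below is my own stand-in, not wizard-generated code: solve_quad_plain and the Status enum are hypothetical names replacing the COM method and the S_OK / E_INVALIDARG results, and the check for a equal to 0 is an extra safeguard I assumed to avoid a division by zero.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical status codes standing in for S_OK and E_INVALIDARG.
enum class Status { Ok, InvalidArg };

// Solves a*x^2 + b*x + c = 0 for real roots.
// Returns InvalidArg when the output pointers are null, when a == 0
// (not a quadratic), or when the discriminant is negative.
Status solve_quad_plain(int a, int b, int c, double* root1, double* root2) {
    if (root1 == nullptr || root2 == nullptr || a == 0)
        return Status::InvalidArg;

    // Discriminant, computed in double to avoid integer overflow.
    double determ = static_cast<double>(b) * b - 4.0 * a * c;
    if (determ < 0)
        return Status::InvalidArg;  // no real solution

    double s = std::sqrt(determ);
    *root1 = (-b + s) / (2.0 * a);
    *root2 = (-b - s) / (2.0 * a);
    return Status::Ok;
}
```

With the same input as the VB client (1, -5, 6) the discriminant is 25 - 24 = 1, so the roots come out as (5 + 1) / 2 = 3 and (5 - 1) / 2 = 2. An input such as (1, 0, 1) has a negative discriminant and is rejected, which corresponds to the "No real solution" message in the VB code.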

Summary

Undoubtedly, this is one of the most difficult chapters in this book. Indeed, the underlying concepts of COM/CORBA are quite dense. So, in this section I shall try to recapitulate the whole thing.

What is the definition of COM?

COM is a specification for building software components that can be assembled into new programs or add functionality to existing programs. COM components can be written in a variety of computer languages and can be updated and reinstalled without requiring changes to other parts of the program.

What is marshaling?

You already know that you can pass a parameter to a function either by value or by reference. Passing by value is easy, but passing by reference creates a problem when the caller and the called function live in different processes.

In Windows, an application can modify the contents of memory allocated to its own process. (If you're wondering what a process is, take it for granted for the time being that each executable running in memory is a process.) However, an application can't modify data stored in memory that has been allocated to other processes. This is where COM comes to the rescue!

Figure 12-4

When you call a function/procedure on a COM component that is running in a separate process, COM handles the inter-process communication by packaging the parameter data and passing it across the process boundary. This is called ‘marshaling’. Suppose you want to pass a parameter by reference. COM first passes it by value (by making a copy) to the called procedure. Once this procedure is complete, COM copies the new value back to the calling procedure. So, in effect, the procedure residing in a separate process has accessed and modified data contained in the calling process.

(Figure 12-4 layout: COM copies the data from Process A to Process B; the data is changed inside Process B; the modified data is copied back to the same address space in Process A.)
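This copy-in, copy-out trick can be imitated in a small C++ sketch. It is only a model under my own assumptions: the Process struct stands in for a private address space, call_across_boundary plays the role of COM, and double_it is a hypothetical remote procedure. Real marshaling goes through proxies, stubs and the operating system.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each "process" owns a private block of memory that the other one
// never touches directly.
struct Process {
    std::vector<int> memory;
};

// The remote procedure: it runs in process B and takes its parameter
// by reference, but only ever sees B's own copy.
void double_it(int& value) { value *= 2; }

// Illustrative stand-in for COM marshaling a by-reference parameter.
void call_across_boundary(Process& a, std::size_t addr_in_a, Process& b) {
    // 1. Copy the data from process A into process B.
    b.memory.push_back(a.memory.at(addr_in_a));

    // 2. The data is changed inside process B.
    double_it(b.memory.back());

    // 3. The modified data is copied back to the same address in A.
    a.memory.at(addr_in_a) = b.memory.back();

    // B's temporary copy is no longer needed.
    b.memory.pop_back();
}
```

If process A holds the values 10 and 21 and we call call_across_boundary(a, 1, b), the second value becomes 42 in A's memory even though double_it never had access to A's memory at all.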

Why do I sometimes get “ActiveX can’t create Object” error message in VB applications?

This is most commonly due to version incompatibility. When compiling your ActiveX component in VB, select Project – Properties – Component tab – Binary Compatibility, and specify the file name of the compiled component with which it must stay compatible.

From COM to CLR – what is the hype all about?

In the long run, Microsoft intends to replace COM with the CLR (Common Language Runtime), which is the base of its .NET technology. All code written for the .NET platform runs under the control of the CLR. However, to ensure compatibility, COM will continue to run without problems alongside the CLR.

According to Microsoft, code written to run exclusively under the control of the CLR is called ‘managed code’ (just look at the terminology). All code that relies on COM or the Win32 API is termed ‘unmanaged code’ (so as to persuade you to write for .NET; good business tactics, huh)! Now, what actually is this CLR?

The CLR was designed to allow a very high level of integration among all languages of the .NET platform, namely Visual Basic .NET, C# etc. Here the executable instructions compiled into DLLs and EXEs are in the form of Microsoft Intermediate Language (MSIL). It is similar to assembly code in the sense that it contains low-level instructions for pushing values onto a stack, popping them off and moving them around. However, it contains no dependencies on any particular operating system or hardware platform. (Sounds similar to the Java runtime?) This means that after an EXE or DLL containing MSIL is deployed on a target machine, it must undergo a final round of just-in-time (JIT) compilation to transform it into machine-specific native instructions.

Though Microsoft currently plans to ship the CLR on all its Windows platforms, MSIL gives you the potential of running your programs on other platforms as well.

(I wonder whether it will actually happen, because it might break Microsoft’s monopoly in PC operating system market.)

However, this concept of intermediate code before a machine-language executable is not entirely new. VB has included an option to compile to p-code from the very beginning. Programs compiled to p-code are usually about 50% smaller than their native-code equivalents. But, like Java, p-code executes around 10 times slower than native code! The term p-code originates from ‘pseudo-code’, which is an intermediate step between the high-level instructions in your VB program and the low-level native code executed by your computer’s processor. At runtime, VB translates each p-code statement to native code. If Microsoft ever develops a VB virtual machine for p-code on different platforms, then VB will be just as portable (i.e. platform independent) as Java!
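To get a feel for what 'intermediate code' means, here is a toy stack machine in C++. I must stress that the instruction set (Push, Add, Mul) is invented purely for illustration; it is neither real VB p-code nor real MSIL. It only shows how a program can be stored as platform-neutral instructions and executed one statement at a time by a small runtime.

```cpp
#include <cassert>
#include <vector>

// A made-up intermediate instruction set.
enum class Op { Push, Add, Mul };

struct Instr {
    Op op;
    int operand;  // used only by Push
};

// The "runtime": walks the instructions and carries each one out
// using an evaluation stack.
int run(const std::vector<Instr>& program) {
    std::vector<int> stack;
    for (const Instr& in : program) {
        switch (in.op) {
        case Op::Push:
            stack.push_back(in.operand);
            break;
        case Op::Add:
        case Op::Mul: {
            assert(stack.size() >= 2);  // both operands must be present
            int rhs = stack.back(); stack.pop_back();
            int lhs = stack.back(); stack.pop_back();
            stack.push_back(in.op == Op::Add ? lhs + rhs : lhs * rhs);
            break;
        }
        }
    }
    assert(stack.size() == 1);  // a well-formed program leaves one result
    return stack.back();
}
```

The expression (2 + 3) * 4 would be stored as Push 2, Push 3, Add, Push 4, Mul. The same five instructions could run unchanged on any machine for which somebody has written the run function, which is the whole point of an intermediate language. The price is the extra work done per instruction at runtime.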

Whether the CLR is really better than COM is still a subject of argument among the experts. Its internal architecture is noticeably different from that of COM. It is a new technology. If you are further interested, you may find MSDN journals/magazines helpful.

One machine can do the work of fifty ordinary men. No machine can do the work of one extraordinary man.

In document Better Understanding of Computer Programming (Page 81-92)
