Threading Models for In-Process Components

In-process components work differently than executable components in that they do not call CoInitializeEx on start-up because their client thread will already have initialized COM+ by the time they are loaded. Instead, in-process components declare their supported threading model using registry settings. In the component's CLSID\InprocServer32 registry key, the ThreadingModel named value can be set to Apartment, Neutral, Free, or Both to indicate support for the various threading models; the table below translates these names into modern STA, NA, and MTA terminology.

ThreadingModel Value    Description
Not present             Single-threaded legacy component that runs only in the main STA
Apartment               STA
Neutral                 NA
Free                    MTA
Both                    Supports the STA, NA, and MTA models

If no ThreadingModel value is specified, the in-process component is assumed to be a thread-oblivious legacy component. It is legal for different coclasses provided by a single in-process component to have different ThreadingModel values; in other words, the ThreadingModel value is set on a per-CLSID basis, not a per-DLL basis. The RegisterServer function we used in Chapter 2 to perform component self-registration can be used to set the ThreadingModel value in the registry. The last parameter of this function should contain one of the four strings from the preceding table. For example, the following statement registers a coclass, and the last parameter, "Apartment", specifies that it supports the STA model:

RegisterServer("component.dll", CLSID_InsideCOM, 
    "Inside COM+ Component", "Component.InsideCOM", 
    "Component.InsideCOM.1", "Apartment");

If you are using the alternative table-driven version of the registration code (RegisterServerEx), you should add an extra entry for the ThreadingModel value, as shown in the third entry of the following array.

const REG_DATA g_regData[] = {
    { "CLSID\\{10000002-0000-0000-0000-000000000001}", 0, 
        "Inside COM+ Component" },
    { "CLSID\\{10000002-0000-0000-0000-000000000001}\\"
        "InprocServer32", 0, (const char*)-1 }, 
    { "CLSID\\{10000002-0000-0000-0000-000000000001}\\"
        "InprocServer32", "ThreadingModel", "Apartment" },  
    { "CLSID\\{10000002-0000-0000-0000-000000000001}\\ProgID", 
        0, "Component.InsideCOM.1" }, 
    { "CLSID\\{10000002-0000-0000-0000-000000000001}\\"
        "VersionIndependentProgID", 0, 
        "Component.InsideCOM" },
    { "Component.InsideCOM", 0, 
        "Inside COM+ Component" },
    { "Component.InsideCOM\\CLSID", 0, 
        "{10000002-0000-0000-0000-000000000001}" },
    { "Component.InsideCOM\\CurVer", 0, 
        "Component.InsideCOM.1" },
    { "Component.InsideCOM.1", 0, 
        "Inside COM+ Component" },
    { "Component.InsideCOM.1\\CLSID", 0, 
        "{10000002-0000-0000-0000-000000000001}" },
    { 0, 0, 0 }
};

Apartment Interactions

In this section, we'll look at how COM+ is able to support interoperability among all possible combinations of components, even when those components use different threading models. To recapitulate the basic principles: client and executable component threads enter an apartment by calling CoInitializeEx, whereas in-process components declare their supported threading model through the ThreadingModel registry value.

The interaction of clients and components is relatively straightforward when both parties use the same threading models. When a client instantiates an object, COM+ compares the threading models supported by the client and the object. In the ideal case in which the two parties support the same threading model, COM+ allows direct calls from the client to the object; this yields the best performance for a given model. Obviously, executable components and in-process components running in a surrogate require marshaling code for use during cross-process or remote method calls. If both parties do not support the same threading models, however, COM+ must interpose itself between the client and the object—even if everything is running in the same process—to ensure that none of the threading rules for each party's apartment type is violated. It does this with a proxy/stub pair: the client makes calls through a proxy, which uses COM+ to deliver method calls to the stub; the stub in turn calls the actual object. As part of the marshaling process, a thread switch from the caller's thread to a thread in the object's apartment occurs.

Imagine that a client thread running in an MTA invokes an in-process component supporting the STA model. COM+ cannot allow the client's calls to proceed directly to the STA-based in-process object because concurrent access might occur. In such cases, when the interface pointer of the object is marshaled back to the client, COM+ loads the proxy/stub pair and provides the client with a pointer to the proxy. For this reason, even in-process components must provide marshaling code (usually in the form of a proxy/stub DLL) for any custom interfaces they implement if they are to support interoperability among apartment types.
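The rule described above can be summarized in a few lines of portable C++. This is only a toy model of the decision, not the way COM+ implements it: the Apartment, ApartmentKind, and CallPath types and the callPath function are hypothetical names introduced here, and the NA is left out because it has not been discussed yet.

```cpp
// Toy model only: these types are hypothetical and are not part of COM+.
enum class ApartmentKind { STA, MTA };

struct Apartment {
    ApartmentKind kind;
    int id;  // tells one STA from another; ignored for the single MTA
};

enum class CallPath { Direct, ProxyStub };

// Two threads share an apartment if both are in the MTA, or if both are
// in the same STA.
constexpr bool sameApartment(Apartment a, Apartment b) {
    return a.kind == b.kind &&
           (a.kind == ApartmentKind::MTA || a.id == b.id);
}

// Calls within one apartment are direct; calls between apartments go
// through a proxy/stub pair.
constexpr CallPath callPath(Apartment caller, Apartment object) {
    return sameApartment(caller, object) ? CallPath::Direct
                                         : CallPath::ProxyStub;
}

// The MTA-client-to-STA-object case from the text is marshaled.
static_assert(callPath(Apartment{ApartmentKind::MTA, 0},
                       Apartment{ApartmentKind::STA, 1}) ==
                  CallPath::ProxyStub,
              "an MTA caller reaches an STA object through a proxy");
```

Note that two different STAs count as different apartments in this model, so a call from one STA to an object in another is marshaled just like a call from the MTA.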

As you've learned, in-process objects that do not have a ThreadingModel value in the registry are considered legacy components written prior to the advent of apartments in COM+. Such objects expect that the client will have only a single thread of execution, from which the object will be created and accessed. To ensure that these components continue to operate correctly, COM+ deals with thread-oblivious objects by creating them in the main STA of the client process, regardless of which thread instantiates the coclass. The main STA is a special designation awarded to the first STA created in a process. Calls from all other apartments, be they STAs, an NA, or an MTA, are marshaled to the thread belonging to the main STA; only then can a call be executed. Such calls are received by the proxy of one apartment and sent to the stub in the main STA via interapartment marshaling before being delivered to the object.

Compared with direct calls, interapartment marshaling is slow; therefore you should rewrite legacy in-process components to support the STA model. Another partial solution to the performance problem is to access legacy in-process components from the main STA only. The thread of the main STA can access the legacy component directly, thereby avoiding the interapartment marshaling performance penalty. Figure 4-5 shows a legacy component being called by two STA threads—one is the main STA, so it can access the object directly, while the auxiliary STA thread must be content to access the object through a proxy.

Figure 4-5. Two STA threads accessing an in-process object that does not have a ThreadingModel value specified.

If you have no alternative but to use legacy in-process components, you should create the main STA in the main thread of the client process because once the main STA terminates, all in-process objects created there are destroyed. If the client carelessly spawns several threads, each of which calls CoInitializeEx with the COINIT_APARTMENTTHREADED flag, without first explicitly creating the main STA, one of those threads will randomly become the main STA through which all calls to legacy in-process objects will be marshaled.

The danger here is that the thread running the main STA might terminate at some point, taking with it all the objects created there. Another difficulty arises when an MTA-based client that does not have an STA instantiates a coclass from an in-process component that supports the STA model. In such cases, COM+ automatically creates the main STA by spawning a new thread and calling CoInitializeEx(NULL, COINIT_APARTMENTTHREADED) in that thread. The object is then created in the newly created host STA, and the interface pointer is marshaled back to the client thread in the MTA. Note that the term host STA describes an STA created automatically by the system; the main STA and the host STA might be one and the same when COM+ creates the first STA in a process.

Objects That Support the MTA Model

In-process objects that support the MTA model (ThreadingModel = Free) implement their own synchronization and are designed for use from an MTA only. MTA-based client threads that instantiate an object can use the object directly; this yields superior performance compared with the STA model. Initially, it might seem that any object supporting the MTA model can also be run safely in an STA. After all, the idea is to protect STA-based objects from concurrent access, and that is obviously not a concern for objects that support the MTA. Unfortunately, things get a bit more complicated. Imagine that a client thread belonging to an STA instantiates an object supporting the MTA model. One or more STAs present no threat to an object that supports the MTA because it manages its own synchronization. However, even in this deceptively simple case, the system must still perform thread synchronization.

This time, it is not the component but the client that needs protection. Although the component has declared its thread independence, the client has not. Pure clients and components are a rare phenomenon; it is much more common for clients and components to have a two-way conversation using connection points or some other private callback interface. In such cases, an MTA object might feel justified in calling the client back from any thread at any time. In effect, a component supporting the MTA model declares, "You can call me from any thread, so I can call you right back from any other thread." The hapless client, which created the object from an STA, is totally unprepared for this turn of events (pun intended). The possibility of multiple concurrent calls into an STA violates STA rules and would probably result in errors. For this reason, even when an STA-based client calls an object designed for MTA access, COM+ creates the object in the MTA and marshals all calls into and out of it.

Figure 4-6 shows how an in-process object designed for use from an MTA is created in the MTA even if it is instantiated by an STA-based thread. The result is that MTA-based threads can call directly into the object, while the STA-based thread that instantiated the object must make its calls via a proxy. Perhaps you are wondering how COM+ can create the object in an MTA if the client instantiates the object from an STA-based thread. If no MTA exists, COM+ first creates one by calling CoInitializeEx(NULL, COINIT_MULTITHREADED). However, if the MTA already exists in the client process when the MTA-supporting object is instantiated, COM+ creates the object there. All calls to and from the object are marshaled from the MTA back to the STA-based client thread.

Figure 4-6. When an STA-based thread instantiates an object designed for execution in the MTA, COM+ creates the MTA if necessary and instantiates the object there.

Objects That Support All Apartment Models

If even an in-process object designed for MTA access incurs the overhead of interapartment marshaling when accessed by an STA-based client thread, what's a conscientious developer to do? Well, to avoid the wrath of COM+, an in-process component can declare that it supports all apartment models (ThreadingModel = Both). (The term Both is an anachronism that dates back to the time when only two threading models were available.) COM+ permits this type of object to be instantiated directly in an STA, the NA, or the MTA, resulting in a big performance advantage over MTA-only components called from an STA. Like MTA-only objects, an object that supports all the apartment models must provide its own synchronization because it can be accessed concurrently by multiple client threads.

In return for the privilege of being instantiated directly in any apartment, an in-process component supporting all three threading models must not make direct calls back to a client on any thread. Instead, it can only call the client back on the thread that received the interface pointer to the callback object. So the implicit promise made by every object supporting the three apartment models to all clients is: "You can call me from any thread, but I will call you back only on the one thread that received the interface pointer to the callback object." This is a component that would make any developer proud.

Although a coclass can support all the apartment types, at run time each object is still instantiated in an STA, the NA, or the MTA—not all three. If an object marked ThreadingModel = Both is instantiated by an STA thread, it belongs to that STA and all calls by the threads of other apartments must still be marshaled, as shown in Figure 4-7. Another option is to instantiate the object in the MTA, enabling all threads of the MTA to access the object directly but requiring that STAs access the object through a proxy. Finally, if an object marked ThreadingModel = Both is instantiated by an object in the NA, the new object will also be created in the NA.

Figure 4-7. Calls to an in-process object in a different apartment, even one registered with ThreadingModel = Both, must still be marshaled.

In-process objects that support the three threading models (ThreadingModel = Both) never know whether they'll be created in an STA, the NA, or the MTA. However, it might be useful for the object to determine which threading model is in operation at run time. You can use the isMTA function, shown in the following code, to obtain this information. The isMTA function returns true if it is called from an MTA thread, false if the code is running in an STA thread. Note that although a coclass registered as ThreadingModel = Both can be instantiated in the NA, it is always executed on either an STA or MTA thread, depending on the threading model of the calling thread.

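bool isMTA()
{
    // Ask to join the MTA; this fails with RPC_E_CHANGED_MODE if the
    // calling thread is already bound to an STA.
    HRESULT hr = CoInitializeEx(NULL, COINIT_MULTITHREADED);
    if(hr == RPC_E_CHANGED_MODE)
        return false;

    // Balance the call only if it actually succeeded.
    if(SUCCEEDED(hr))
        CoUninitialize();
    return true;
}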

It is also interesting to note that proxy/stub DLLs generated by the MIDL compiler support the three apartment models and therefore register themselves using the Both value for ThreadingModel. This is done because proxies must always be instantiated in the apartment of the creator so that calls can be made directly. If the proxy were created in another apartment, the caller would need a proxy to talk to the proxy!

The Free-Threaded Marshaler

In-process objects declared as ThreadingModel = Both are designed to be instantiated in any type of apartment; at run time, they are always instantiated in the apartment of the creator. This guarantees that the creating thread, as well as any other threads in the creating thread's apartment, can make direct calls to the object without having to go through a proxy/stub mechanism. Unfortunately, in order for the threads of other apartments in the process to call the object, the marshaling infrastructure must still be invoked.

Since in-process objects declared as ThreadingModel = Both are designed to handle concurrent access by multiple clients, it might seem that the client need not marshal the object's interface pointers between apartments. For example, if an object that supports all threading models is instantiated by an MTA thread, the object is created in the MTA. This means that all threads in the MTA can access the object directly. The client, however, must follow the threading rules of COM+ and always marshal interface pointers between the threads of different apartments. So any STA threads in the process still need to have the object's interface pointers marshaled for use in their apartment. The STA threads will then access the object via a proxy even though the object itself is perfectly capable of accepting the direct STA call with no danger to the STA thread. In this example, the need for client threads to remain independent of an object's internal threading model comes into conflict with the developer's desire to achieve the best performance.

The free-threaded marshaler (FTM) is an optimization technique designed for just such occasions. The FTM enables an in-process object to pass a direct pointer into any client apartment. When an object uses the FTM, all client threads, regardless of whether they are STA or MTA threads, call the object directly instead of through a proxy. Figure 4-8 shows an STA thread (STA1) that instantiates an in-process object that supports all the threading models. STA1 calls CoMarshalInterThreadInterfaceInStream to marshal the interface pointer to another STA thread (STA2). STA2 then calls the CoGetInterfaceAndReleaseStream function to obtain an apartment-relative interface pointer for use in calling the object.

Normally, STA2 now has a pointer to a proxy that marshals calls to the thread of STA1 before the object is called. However, if the object in question uses the FTM, STA2 receives a direct pointer to the object from the CoGetInterfaceAndReleaseStream function. Any threads running in the MTA also have direct access to the object. In this example, the decision to have STA1 instantiate the object is an arbitrary one. A thread belonging to any apartment in the process could instantiate the object; the result is nearly identical: all in-process calls are direct regardless of the calling thread's apartment type. The only difference is the apartment in which the object is actually created. But this is immaterial if all client calls can be made directly, irrespective of their apartment.

Figure 4-8. An object that uses the FTM allows threads in different apartments but in the same process to access the object directly.

Notice that the situation shown in Figure 4-8 is fundamentally illegal: STA1, which is legally permitted to have only one thread executing its objects, now has multiple threads from STA2 and the MTA making direct calls to the object. In the pursuit of improved performance, the FTM enables us to break the threading rules of COM+. To use the FTM safely, you must completely understand the COM+ threading models as well as all the situations from which your object might be called. It is easy to crash the process through careless use of the FTM.

How the FTM Works

The FTM works by providing an implementation of the IMarshal interface that the object exposes as its own via aggregation, as shown in Figure 4-9. When a client thread marshals an interface pointer using CoMarshalInterThreadInterfaceInStream, the system queries the object for the IMarshal interface. If the object aggregates the FTM, the FTM's implementation of the IMarshal interface is returned. The FTM's implementation of IMarshal first checks to see what type of marshaling is taking place.

Figure 4-9. The InsideCOM object aggregating the FTM.

If marshaling takes place between apartments or contexts of a single process (indicated by the marshal context flags MSHCTX_INPROC or MSHCTX_CROSSCTX, passed to CoMarshalInterface), the FTM simply copies the object's actual interface pointer into the marshaling stream. When a client thread in another apartment later unmarshals the interface pointer by calling CoGetInterfaceAndReleaseStream, it does not receive a pointer to an interface proxy; instead, the FTM simply retrieves the interface pointer to the real object that was placed in the stream. The second client thread can now make direct calls to the object even though it is running in a different apartment.

If the marshaling context is not MSHCTX_INPROC or MSHCTX_CROSSCTX, this indicates that marshaling is taking place between processes or possibly even between computers, so the FTM delegates the marshaling work to the standard marshaler obtained by calling CoGetStandardMarshal. The standard marshaler loads the appropriate interface proxy and stub and returns a pointer to the proxy object to the client thread. The equivalent pseudocode for the main portion of the FTM is shown here:

HRESULT CFreeThreadedMarshaler::MarshalInterface(
    IStream* pStream, REFIID riid, void* pv,
    DWORD dwDestContext, void* pvDestContext, DWORD dwFlags)
{
    // If cross-apartment or cross-context marshaling,...
    if(dwDestContext == MSHCTX_INPROC ||
        dwDestContext == MSHCTX_CROSSCTX)
        // simply store the pointer value itself in the stream.
        return pStream->Write(&pv, sizeof(pv), 0);

    // Otherwise, delegate the work to the standard marshaler.
    IMarshal* pMarshal = 0;
    HRESULT hr = CoGetStandardMarshal(riid, (IUnknown*)pv,
        dwDestContext, pvDestContext, dwFlags, &pMarshal);
    if(FAILED(hr))
        return hr;
    hr = pMarshal->MarshalInterface(pStream, riid, pv,
        dwDestContext, pvDestContext, dwFlags);
    pMarshal->Release();
    return hr;
}
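To make the in-process branch concrete, here is a small portable model of what "copying the interface pointer into the stream" amounts to. It is a sketch under stated assumptions rather than the FTM's actual code: ByteStream, writeRawPointer, and readRawPointer are hypothetical stand-ins (a byte vector plays the role of the stream), and reference counting is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a marshaling stream: a growable byte buffer.
using ByteStream = std::vector<unsigned char>;

// Marshal side: append the pointer value itself (not the object it
// points to) to the stream.
inline void writeRawPointer(ByteStream& stream, void* object) {
    unsigned char bytes[sizeof(object)];
    std::memcpy(bytes, &object, sizeof(object));
    stream.insert(stream.end(), bytes, bytes + sizeof(object));
}

// Unmarshal side: read the pointer value back, starting at offset.
inline void* readRawPointer(const ByteStream& stream,
                            std::size_t offset = 0) {
    assert(offset + sizeof(void*) <= stream.size());
    void* object = nullptr;
    std::memcpy(&object, stream.data() + offset, sizeof(object));
    return object;
}
```

The pointer that comes out is the same address that went in, which is meaningful only inside a single address space. That is why this shortcut applies only to the in-process marshaling contexts and everything else is handed to the standard marshaler.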

Aggregating the FTM

The FTM is an object implemented as part of COM+; you instantiate it by calling the CoCreateFreeThreadedMarshaler function. The first parameter of the CoCreateFreeThreadedMarshaler function takes a pointer to the controlling object's IUnknown interface, and the second parameter returns a pointer to the FTM's implementation of IUnknown. The FTM is usually aggregated in an object's constructor, as shown below.

CInsideCOM::CInsideCOM() : m_cRef(1)
{
    InterlockedIncrement(&g_cComponents);
    CoCreateFreeThreadedMarshaler(this, &m_pFTM);
}

The pointer to the FTM returned by the CoCreateFreeThreadedMarshaler function is stored here as m_pFTM. This pointer is required in the object's implementation of IUnknown::QueryInterface, which must delegate any requests for the IMarshal interface to the FTM. COM+ automatically calls QueryInterface to request the IMarshal interface when an interface pointer on this object is marshaled or unmarshaled, for example, when the client calls CoMarshalInterThreadInterfaceInStream or CoGetInterfaceAndReleaseStream. Most objects normally return E_NOINTERFACE, causing COM+ to default to its own standard marshaler. In this case, however, we intercept the request for the IMarshal interface and forward it to the FTM. The FTM then returns a pointer to its implementation of IMarshal, which has the special behavior discussed previously. Here's how the object's implementation of IUnknown::QueryInterface should look; the addition is the branch that handles IID_IMarshal:

HRESULT CInsideCOM::QueryInterface(REFIID riid, void** ppv)
{
    if(riid == IID_IUnknown)
        *ppv = (IUnknown*)this;
    else if(riid == IID_ISum)
        *ppv = (ISum*)this;
    else if(riid == IID_IMarshal)
        return m_pFTM->QueryInterface(riid, ppv);
    else 
    {
        *ppv = NULL;
        return E_NOINTERFACE;
    }
    AddRef();
    return S_OK;
}

Before the object is destroyed, the FTM must be released as well. This call is typically made in the object's destructor, as shown here:

CInsideCOM::~CInsideCOM()
{
    InterlockedDecrement(&g_cComponents);
    m_pFTM->Release();
}

Problems with the FTM

Because the FTM enables threads in different apartments of a process to share direct interface pointers rather than use pointers to proxies—a calculated violation of the COM+ threading rules—an object that uses the FTM must abide by certain restrictions to ensure that the application doesn't crash:

Imagine an in-process object (OBJ1) that supports all the threading models (ThreadingModel = Both), aggregates the FTM, and holds a pointer to some other object (OBJ2). This is a recipe for disaster because objects that aggregate the FTM are not permitted to hold apartment-relative resources such as object references. If a client thread running in an STA (STA1) marshals an interface pointer to OBJ1 to another apartment (STA2), the FTM, as expected, sees to it that a direct interface pointer is received by STA2.

Here's the problem: if STA2 calls a method of OBJ1 and that method in turn calls a method of OBJ2, this breaks the threading semantics expected by OBJ2. Unless OBJ2 also happens to aggregate the FTM, it cannot be called directly from STA2 without first being marshaled for use in that apartment. Any attempt to do this would probably result in random program faults or, if OBJ2 were only a proxy in STA1 to an out-of-process object, the infamous RPC_E_WRONG_THREAD error. Figure 4-10 shows that the threads of STA1 and STA2 can access OBJ1 directly, but only the thread of STA1 can access OBJ2.

Figure 4-10. COM+ threading rules are violated if STA2 calls OBJ1 and then OBJ1 calls OBJ2 directly.
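The OBJ1/OBJ2 scenario can be reduced to a toy model in portable C++. Everything in this sketch is hypothetical: apartments are represented by plain integers, ApartmentBoundObject stands in for OBJ2, FreeThreadedObject stands in for OBJ1, and the violation is reported as a tidy error value, whereas a real process might simply fault.

```cpp
// Toy model only; the names are hypothetical and no COM+ API is used.
enum class CallResult { Ok, WrongApartment };

// Stands in for OBJ2: usable only from the apartment that owns it.
struct ApartmentBoundObject {
    int ownerApartment;

    constexpr CallResult call(int callerApartment) const {
        return callerApartment == ownerApartment
                   ? CallResult::Ok
                   : CallResult::WrongApartment;
    }
};

// Stands in for OBJ1: because it aggregates the FTM, its methods run in
// whatever apartment the caller is in, and it forwards to OBJ2 through
// the raw reference it holds.
struct FreeThreadedObject {
    ApartmentBoundObject heldObject;

    constexpr CallResult method(int callerApartment) const {
        return heldObject.call(callerApartment);
    }
};

// OBJ2 lives in apartment 1 (STA1). A call arriving from apartment 2
// (STA2) reaches OBJ2 from the wrong apartment.
static_assert(FreeThreadedObject{{1}}.method(2) ==
                  CallResult::WrongApartment,
              "OBJ2 is reached from an apartment that does not own it");
```

The model makes the root cause visible: OBJ1 has no apartment of its own as far as callers are concerned, so any apartment-relative reference it holds is valid only for some of the threads that end up using it.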

The problem presented in Figure 4-10 is typical, but it is not easy to solve. One possible solution is to call CoMarshalInterThreadInterfaceInStream after you instantiate OBJ2. This call returns a stream-based, apartment-neutral representation of the OBJ2 interface pointer that can later be unmarshaled using the CoGetInterfaceAndReleaseStream function to obtain a valid interface pointer for use in STA2. The problem with this approach, however, is that CoGetInterfaceAndReleaseStream unmarshals the interface pointer and frees the stream object containing the apartment-neutral representation of the interface pointer, which means that the interface pointer can be unmarshaled from the stream only once! All future attempts to unmarshal the interface pointer from the stream will fail because it has already been released.

The Global Interface Table to the Rescue

The Global Interface Table (GIT) is a processwide holding table for interface pointers. Interface pointers can be checked into the GIT, where they are available to any apartment in the process. When a thread requests an interface pointer from the GIT, the interface pointer supplied is guaranteed to be usable by that thread. If necessary, the GIT automatically performs the necessary interapartment marshaling. Because the GIT permits an interface pointer to be unmarshaled as many times as desired, it can help solve problems that arise in objects that use the FTM.

The system-provided GIT is accessed through the IGlobalInterfaceTable interface, shown here in IDL notation:

interface IGlobalInterfaceTable : IUnknown
{
    // Voluntarily check an interface pointer into the GIT.
    HRESULT RegisterInterfaceInGlobal
    (
        [in]  IUnknown* pUnk,
        [in]  REFIID    riid,
        [out] DWORD*    pdwCookie
    );

    // Remove an interface pointer from the GIT.
    HRESULT RevokeInterfaceFromGlobal
    (
        [in] DWORD      dwCookie
    );

    // Unmarshal an interface pointer from the GIT to the
    // caller's apartment.
    HRESULT GetInterfaceFromGlobal
    (
        [in]  DWORD                dwCookie,
        [in]  REFIID               riid,
        [out, iid_is(riid)] void** ppv
    );
};

The GIT provides a perfectly adequate implementation of IGlobalInterfaceTable, so there is really no need to implement this interface yourself. To instantiate the GIT, you call CoCreateInstance with the CLSID parameter set to CLSID_StdGlobalInterfaceTable, as shown in the following code. Only one instance of the GIT exists per process, so multiple calls to this function return the same instance. (This type of object is known as a singleton, and is discussed further in Chapter 13.)

IGlobalInterfaceTable* m_pGIT;
CoCreateInstance(CLSID_StdGlobalInterfaceTable, NULL, 
    CLSCTX_INPROC_SERVER, IID_IGlobalInterfaceTable, 
    (void**)&m_pGIT);

You check interface pointers into the GIT by calling the IGlobalInterfaceTable::RegisterInterfaceInGlobal method. This method stores an apartment-neutral object reference in the GIT and returns a processwide cookie value that any apartment can use to obtain this interface pointer, as shown here:

DWORD m_cookie;
m_pGIT->RegisterInterfaceInGlobal(pMyInterface, 
    IID_IMyInterface, &m_cookie);

You call the IGlobalInterfaceTable::GetInterfaceFromGlobal method in the thread of another apartment to retrieve an apartment-relative interface pointer from a cookie returned previously by RegisterInterfaceInGlobal, as shown in Figure 4-11. The interface pointer must also be released later, as shown in the code below, but any apartment in the process that wants to obtain an interface pointer can reuse the cookie. Thus, the GIT overcomes the limitation of the CoMarshalInterThreadInterfaceInStream and CoGetInterfaceAndReleaseStream pair, which allow unmarshaling to occur only once.

// Possibly called by a client thread in another apartment
m_pGIT->GetInterfaceFromGlobal(m_cookie, IID_IMyInterface, 
    (void**)&pMyInterface);

// Use pMyInterface here.

pMyInterface->Release();
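The difference between a one-shot stream and a reusable cookie is easy to see in a toy model. The following portable sketch is not the GIT: ToyInterfaceTable and its methods are hypothetical names, the class performs no marshaling at all, and it is not thread-safe. It shows only the cookie bookkeeping, in which a lookup leaves the entry in place so that it can be repeated.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for the GIT; it does no interapartment marshaling.
class ToyInterfaceTable {
public:
    // Check a pointer in and get back a cookie that identifies it.
    unsigned long registerPointer(void* object) {
        assert(object != nullptr);
        unsigned long cookie = nextCookie_++;
        entries_[cookie] = object;
        return cookie;
    }

    // Look a pointer up. The entry stays in the table, so the same
    // cookie can be used again; unknown cookies yield nullptr.
    void* getPointer(unsigned long cookie) const {
        auto it = entries_.find(cookie);
        return it == entries_.end() ? nullptr : it->second;
    }

    // Remove the entry; returns false if the cookie was unknown.
    bool revokePointer(unsigned long cookie) {
        return entries_.erase(cookie) == 1;
    }

private:
    std::map<unsigned long, void*> entries_;
    unsigned long nextCookie_ = 1;
};
```

In this model, calling getPointer twice with the same cookie returns the same pointer both times, and only revokePointer ends the cookie's life. The real GIT adds the part the toy omits: each retrieval yields a pointer that is valid for the calling thread's apartment.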

Figure 4-11. Using the GIT allows an object that aggregates the FTM to hold apartment-neutral object references across method invocations.

One problem with the GIT, however, is that an object that aggregates the FTM never knows which thread will call it, so it is not allowed to hold apartment-relative resources across method calls. Each method invoked by the client must therefore call IGlobalInterfaceTable::GetInterfaceFromGlobal to obtain the interface pointer and then IUnknown::Release to release it before returning to the client. Unmarshaling interface pointers from the GIT can be a slow process, resulting in impaired performance if it is done often.

Before exiting, you should remove the interface pointer from the GIT by calling the IGlobalInterfaceTable::RevokeInterfaceFromGlobal method, as shown in the following code. There are no restrictions on which threads in the process can call this method. Remember to release the GIT object itself after you finish using it.

m_pGIT->RevokeInterfaceFromGlobal(m_cookie);

// Now release the GIT.
m_pGIT->Release();

Note that the application is responsible for coordinating access to the GIT in such a way that no thread in the process will call RevokeInterfaceFromGlobal while another is calling GetInterfaceFromGlobal for the same cookie; the GIT does not automatically provide this type of synchronization.
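One way an application might provide that coordination is to route every use of a cookie through a small guard object. The sketch below is portable C++ and purely illustrative: GuardedCookie is a hypothetical helper, not part of COM+, and the callbacks passed to it are where the application would place its own calls to GetInterfaceFromGlobal and RevokeInterfaceFromGlobal.

```cpp
#include <cassert>
#include <functional>
#include <mutex>

// Hypothetical helper: serializes use and revocation of a single cookie.
class GuardedCookie {
public:
    explicit GuardedCookie(unsigned long cookie) : cookie_(cookie) {}

    // Runs useCookie(cookie) under the lock if the cookie is still valid.
    // Returns false, without calling useCookie, after revocation.
    bool use(const std::function<void(unsigned long)>& useCookie) {
        assert(useCookie != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        if (revoked_)
            return false;
        useCookie(cookie_);
        return true;
    }

    // Runs revokeCookie(cookie) at most once; later calls return false.
    bool revoke(const std::function<void(unsigned long)>& revokeCookie) {
        assert(revokeCookie != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        if (revoked_)
            return false;
        revokeCookie(cookie_);
        revoked_ = true;
        return true;
    }

private:
    std::mutex mutex_;
    unsigned long cookie_;
    bool revoked_ = false;
};
```

Because both operations take the same mutex, a revocation can never overlap a retrieval for that cookie, and a retrieval that arrives after revocation is refused instead of being attempted. The lock is held while the callback runs, which keeps the sketch simple but may not suit every application.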

In addition to solving the problems commonly associated with objects that aggregate the FTM, the GIT offers an easier way for client applications to marshal interface pointers between the threads of different apartments. The GIT can replace calls to the CoMarshalInterThreadInterfaceInStream and CoGetInterfaceAndReleaseStream functions in components that do not make use of the FTM.

Neutral Apartments

Developers have toiled long and hard in their quest to build apartment-neutral objects that achieve the best performance for whatever threading model the client employs. As you've seen, marking your in-process coclasses ThreadingModel = Both, aggregating with the FTM, and using the GIT is a difficult way to achieve this goal. The NA was designed to make it easier for developers to create objects that can be called directly from any thread of any apartment in a process. Instead of breaking the rules, the NA offers a legal way to achieve the goals of objects that are marked ThreadingModel = Both, that aggregate the FTM, and that use the GIT. The advantages of using the NA model are so compelling that this is now the recommended model for nonvisual components in COM+.

Unlike an STA, which has only one thread, or the MTA, which can have multiple threads, the NA does not have any resident threads. Instead, STA threads (synchronized by window message queues) and MTA threads (unsynchronized) always make direct calls to objects running in the NA, as shown in Figure 4-12. In other words, objects instantiated in the NA are always executed on their caller's thread. The NA itself provides no built-in synchronization: when an object runs on an STA thread, synchronization is provided by the modal message loop built into the STA model; when it is called from an MTA thread, no synchronization is provided. Also note that, as with the MTA, only one NA exists in a process. All objects marked ThreadingModel = Neutral that are instantiated in a process share the NA.

Figure 4-12. Threads belonging to any apartment can call objects running in the NA without a thread switch.

The CoInitializeEx function is used only to bind threads to a particular threading model, either STA or MTA; it cannot be used to create the NA. The only way to instantiate an object in the NA is to call CoGetClassObject or CoCreateInstance(Ex) for a coclass that is marked in the registry as ThreadingModel = Neutral. As with objects running in the MTA, objects running in the NA are not permitted to have thread affinity because different threads might be used to execute their code. However, objects running in the NA are apartment-relative; that is, they belong to an apartment (the NA) and can hold apartment-relative resources such as object references as part of their state. In this way, they differ from objects that aggregate the FTM and must use the GIT to hold apartment-neutral object references across method invocations; objects running in the NA have no need for the FTM or the GIT.

This all sounds good, but perhaps you're wondering how to actually build objects that run in the NA. The first step in preparing an object for execution in the NA is to have the self-registration code set the ThreadingModel value to Neutral. The NA, like the other two COM+ apartment types, has some rules that govern the code you write. Fundamentally, objects designed to execute in the NA face implementation issues similar to those of objects that are marked ThreadingModel = Both. This means that the object must be prepared to deal with concurrent access by multiple threads. It also means that the object cannot use TLS or have any other form of thread affinity.
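As a minimal sketch of what these rules mean in practice, the portable C++ below models an object whose methods may be entered on any caller's thread. The names CNeutralCounter and CallFromManyThreads are hypothetical, and std::mutex is an assumption standing in for whatever Win32 synchronization primitive (a critical section, for example) a real component would use. All state lives in the object and is guarded by a lock; nothing is kept in TLS.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical object written to the NA rules: several threads may
// enter it at once, so every access to its state is guarded.
class CNeutralCounter
{
public:
    // Runs on whatever thread the caller happens to be using.
    long Increment()
    {
        std::lock_guard<std::mutex> lock(m_lock);
        return ++m_value;
    }

    long Value() const
    {
        std::lock_guard<std::mutex> lock(m_lock);
        return m_value;
    }

private:
    mutable std::mutex m_lock;  // Protects m_value
    long m_value = 0;           // Per-object state, never thread-local
};

// Hypothetical driver: simulates several caller threads entering
// the same object concurrently and returns the final count.
long CallFromManyThreads(int cThreads, int cCallsPerThread)
{
    CNeutralCounter counter;
    std::vector<std::thread> threads;
    for(int i = 0; i < cThreads; i++)
        threads.emplace_back([&counter, cCallsPerThread]
        {
            for(int j = 0; j < cCallsPerThread; j++)
                counter.Increment();
        });
    for(std::thread& t : threads)
        t.join();
    return counter.Value();
}
```

Because the lock is a member of the object rather than of any thread, the object behaves the same no matter which apartment's thread makes the call.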

Contexts

Contexts are used in COM+ to keep track of the run-time properties (such as an object's transaction requirements) associated with configured components that take advantage of COM+ component services. Contexts and component services are covered in volume 2 of Inside COM+. Unconfigured components, the type covered by this book, normally run in the default context of every apartment unless they are created by a configured component, in which case they run in the context of the caller. When you consider the issue of contexts, the difference between objects that are marked ThreadingModel = Both, that aggregate the FTM, and that use the GIT and objects that run in the NA (ThreadingModel = Neutral) becomes clearer. Objects that aggregate the FTM can be called directly from any context of any apartment on any thread in the process. Objects that run in the NA can be called directly (without a thread switch) from any thread of any apartment in the process, but they are subject to the COM+ rules on apartments and contexts.

Comparing the Apartment Models

Although a variety of complex situations can arise when clients and in-process components of different threading models attempt to play together, COM+ handles all of these situations with aplomb. While the system enables all forms of interoperability using any combination of threading models between clients and components, a performance penalty results when the threading models of the two parties do not match. In most cases, you can circumvent this overhead with a little creative thinking. Imagine a situation in which an MTA thread accesses an object that supports only the STA model. In this case, COM+ has little choice but to introduce synchronization into the equation to protect the object. If you are aware that the coclass supports only the STA model, however, it makes sense for the client to create an STA from which to call this object. When an STA-based client calls an object that also supports the STA model, less overhead results.

How can the client determine which threading model is supported by a component? For in-process components, you need only examine the object's ThreadingModel registry entry to obtain this information. For executable components, you cannot determine programmatically what threading model is supported. It is generally less important for clients and executable components to use the same threading model because COM+ must load a proxy/stub pair for cross-process marshaling purposes in any case. The table below shows the variety of threading models that can be supported by clients and in-process components and describes how COM+ handles each unique situation. The first column on the left shows the creator's apartment type; the top row shows the ThreadingModel value for the coclass that is being instantiated.

Creator's apartment | Not Specified | Apartment | Free | Both | Neutral
Main STA | Created in the main STA. Direct access. | Created in the main STA. Direct access. | Created in the MTA. The MTA is created by the system if necessary. Proxy access. | Created in the main STA. Direct access. | Created in the NA. Lightweight proxy access (no thread switch).
STA | Created in the main STA. Proxy access. | Created in the caller's STA. Direct access. | Created in the MTA. The MTA is created by the system if necessary. Proxy access. | Created in the caller's STA. Direct access. | Created in the NA. Lightweight proxy access (no thread switch).
MTA | Created in the main STA. The main STA is created by the system if necessary. Proxy access. | Created in a host STA. Proxy access. | Created in the MTA. Direct access. | Created in the MTA. Direct access. | Created in the NA. Lightweight proxy access (no thread switch).
Neutral (on an STA thread) | Created in the main STA. Proxy access. | Created in the caller's STA. Lightweight proxy access (no thread switch). | Created in the MTA. The MTA is created by the system if necessary. Proxy access. | Created in the NA. Direct access. | Created in the NA. Direct access.
Neutral (on an MTA thread) | Created in the main STA. Proxy access. | Created in a host STA. Proxy access. | Created in the MTA. Lightweight proxy access (no thread switch). | Created in the NA. Direct access. | Created in the NA. Direct access.
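The table can also be read as a simple lookup keyed by the creator's apartment and the coclass's ThreadingModel value. The sketch below encodes it that way in portable C++. The enumerations, the Placement structure, and the Lookup function are hypothetical names invented for illustration; they are not part of COM+, which makes this decision internally.

```cpp
// Hypothetical encoding of the preceding table; none of these names
// exist in COM+ itself.
enum class Creator { MainSTA, STA, MTA, NeutralOnSTA, NeutralOnMTA };
enum class Model   { NotSpecified, Apartment, Free, Both, Neutral };
enum class Home    { MainSTA, CallersSTA, HostSTA, MTA, NA };
enum class Access  { Direct, Proxy, LightweightProxy };

struct Placement
{
    Home home;      // Apartment in which the object is created
    Access access;  // How the creator reaches the object
};

// Rows follow the Creator enumeration; columns follow Model.
Placement Lookup(Creator creator, Model model)
{
    static const Placement table[5][5] =
    {
        // Main STA
        { {Home::MainSTA, Access::Direct}, {Home::MainSTA, Access::Direct},
          {Home::MTA, Access::Proxy}, {Home::MainSTA, Access::Direct},
          {Home::NA, Access::LightweightProxy} },
        // STA
        { {Home::MainSTA, Access::Proxy}, {Home::CallersSTA, Access::Direct},
          {Home::MTA, Access::Proxy}, {Home::CallersSTA, Access::Direct},
          {Home::NA, Access::LightweightProxy} },
        // MTA
        { {Home::MainSTA, Access::Proxy}, {Home::HostSTA, Access::Proxy},
          {Home::MTA, Access::Direct}, {Home::MTA, Access::Direct},
          {Home::NA, Access::LightweightProxy} },
        // Neutral (on an STA thread)
        { {Home::MainSTA, Access::Proxy},
          {Home::CallersSTA, Access::LightweightProxy},
          {Home::MTA, Access::Proxy}, {Home::NA, Access::Direct},
          {Home::NA, Access::Direct} },
        // Neutral (on an MTA thread)
        { {Home::MainSTA, Access::Proxy}, {Home::HostSTA, Access::Proxy},
          {Home::MTA, Access::LightweightProxy}, {Home::NA, Access::Direct},
          {Home::NA, Access::Direct} }
    };
    return table[static_cast<int>(creator)][static_cast<int>(model)];
}
```

For example, Lookup(Creator::MTA, Model::Apartment) reports a host STA reached through a proxy, which is the mismatch penalty described earlier in this section.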

Writing Thread-Safe Components

Now that you have a thorough grounding in the theory behind the COM+ apartment models, let's translate this knowledge into something more concrete. This section presents the coding techniques required to make in-process components thread-safe; issues unique to creating thread-safe executable components are covered in Chapter 13. Several samples that demonstrate these techniques are available on the companion CD in the Samples\Apartments folder.

In-process objects that support the STA model (ThreadingModel = Apartment) expect to be accessed by the same client thread that created the object; this is similar to the way thread-oblivious components work. However, objects that support the STA model can be created in multiple STAs of the client process, while objects of a legacy component are always created in the main STA. Because multiple threads might access different objects in the component simultaneously, a component that supports the STA model must code its DllGetClassObject and DllCanUnloadNow entry points to allow for the possibility of concurrent access by multiple client STAs.

Making DllGetClassObject and DllCanUnloadNow Thread-Safe

Let's assume that two different STAs in the client process create instances of the same class simultaneously. It follows that the DllGetClassObject function might be called concurrently by two different threads. Fortunately, most typical implementations of DllGetClassObject, such as the one shown in the following code, are inherently thread-safe since a new class factory is instantiated for each caller and no global or static data is accessed. If your component is not a typical implementation, you must rewrite the DllGetClassObject function to be thread-safe.

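HRESULT __stdcall DllGetClassObject(REFCLSID clsid,
    REFIID riid, void** ppv)
{
    // This component serves only one coclass.
    if(clsid != CLSID_InsideCOM)
        return CLASS_E_CLASSNOTAVAILABLE;

    // Each caller gets its own class factory; no global or
    // static data is touched, so concurrent calls are safe.
    CFactory* pFactory = new CFactory;
    if(pFactory == NULL)
        return E_OUTOFMEMORY;

    // Hand the requested interface to the caller.
    HRESULT hr = pFactory->QueryInterface(riid, ppv);
    pFactory->Release();
    return hr;
}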
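An atypical implementation might hand every caller the same class factory instead of creating a new one each time, and the first-use creation of that shared object is then a race. The portable sketch below shows one way to make it safe, using the standard C++ std::call_once facility as a stand-in for Win32 synchronization. CSharedFactory, GetSharedFactory, and AllThreadsSeeSameFactory are hypothetical names, and the factory here is a bare placeholder rather than a real IClassFactory implementation.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a class factory shared by all callers.
struct CSharedFactory
{
    static std::atomic<int> s_cCreated;  // Counts constructions
    CSharedFactory() { ++s_cCreated; }
};
std::atomic<int> CSharedFactory::s_cCreated(0);

static std::once_flag g_factoryOnce;
static CSharedFactory* g_pFactory = nullptr;

// Safe to call from any number of threads: exactly one of them
// constructs the factory, and the others wait until it is ready.
// The factory is deliberately never deleted in this sketch.
CSharedFactory* GetSharedFactory()
{
    std::call_once(g_factoryOnce, [] { g_pFactory = new CSharedFactory; });
    return g_pFactory;
}

// Hypothetical check: simulates concurrent callers and confirms that
// they all received the same, singly constructed factory.
bool AllThreadsSeeSameFactory(int cThreads)
{
    std::vector<CSharedFactory*> results(cThreads, nullptr);
    std::vector<std::thread> threads;
    for(int i = 0; i < cThreads; i++)
        threads.emplace_back([&results, i] { results[i] = GetSharedFactory(); });
    for(std::thread& t : threads)
        t.join();
    for(CSharedFactory* p : results)
        if(p != results[0])
            return false;
    return CSharedFactory::s_cCreated == 1;
}
```

A per-caller factory, as in the listing above, avoids the problem entirely, which is why it is the simpler choice when the factory holds no shared state.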

DllCanUnloadNow can also be the root cause of a particularly nasty race condition. Since in-process components do not manage their own lifetimes, the DllCanUnloadNow function is designed to enable a client to determine whether the DLL can be unloaded. If DllCanUnloadNow returns S_OK, the DLL can be unloaded; if it returns S_FALSE, the DLL should not be unloaded at that point. The DllCanUnloadNow function is called by the CoFreeUnusedLibraries function, which is designed to be invoked by clients in their spare time. Here is a typical implementation of DllCanUnloadNow:

HRESULT __stdcall DllCanUnloadNow()
{
    if(g_cServerLocks == 0 && g_cComponents == 0)
        return S_OK;
    else
        return S_FALSE;
}

Now suppose the client calls the IUnknown::Release method, as shown here, decrementing the reference count to 0 and thus destroying the object through the delete this statement:

ULONG CInsideCOM::Release()
{
    if(--m_cRef != 0)
        return m_cRef;
    delete this;
    return 0;
}

As part of its cleanup duties, the destructor decrements the g_cComponents global variable, as shown in the code below. If this is the last object to be destroyed, g_cComponents is decremented to 0. The object unwittingly enters a deadly race to exit the destructor before DllCanUnloadNow is called. If another thread in the client process calls CoFreeUnusedLibraries, DllCanUnloadNow returns S_OK, leading COM+ to believe that it can unload the DLL. However, unloading the DLL while the destructor's cleanup code is still executing results in a fault.

CInsideCOM::~CInsideCOM()
{
    g_cComponents--;
    // Pray nobody calls CoFreeUnusedLibraries now...
    // Some other cleanup code here...
}

To fix this race condition, Microsoft has modified the implementation of the CoFreeUnusedLibraries function. Instead of immediately unloading a DLL when DllCanUnloadNow returns S_OK, CoFreeUnusedLibraries waits about 10 minutes. If CoFreeUnusedLibraries is called again after 10 minutes and DllCanUnloadNow again returns S_OK, then and only then is the DLL unloaded. This heuristic approach, while not foolproof, works very well. After all, how many destructors do you know of that take 10 minutes to execute?
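The two-pass heuristic can be modeled in a few lines of portable C++. This is only a sketch of the behavior described above, not the system's implementation: the CDelayedUnloader class and its method are hypothetical, the current time is passed in so that the logic can be exercised without waiting, and the choice to restart the waiting period whenever DllCanUnloadNow reports S_FALSE is an assumption.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical model of the delayed-unload heuristic for one DLL.
class CDelayedUnloader
{
public:
    explicit CDelayedUnloader(std::chrono::minutes delay) : m_delay(delay) {}

    // fCanUnload models DllCanUnloadNow returning S_OK (true) or
    // S_FALSE (false). Returns true only when the DLL would be unloaded.
    bool OnFreeUnusedLibraries(bool fCanUnload, Clock::time_point now)
    {
        if(!fCanUnload)
        {
            m_fPending = false;  // In use again (assumed to reset the wait)
            return false;
        }
        if(!m_fPending)
        {
            m_fPending = true;   // First S_OK: start the waiting period
            m_firstOk = now;
            return false;
        }
        // Still idle on a later pass: unload only if the delay has elapsed.
        return (now - m_firstOk) >= m_delay;
    }

private:
    std::chrono::minutes m_delay;
    bool m_fPending = false;
    Clock::time_point m_firstOk;
};
```

With a 10-minute delay, a pass at time 0 and another at 5 minutes both leave the DLL loaded; a pass at 11 minutes, with DllCanUnloadNow still returning S_OK, unloads it.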