My question is very similar to this:
https://social.msdn.microsoft.com/Forums/vstudio/en-US/aca162b6-ebe3-49a6-83b6-338e71d706a0/access-violation-inside-of-clsidfileopendialog-com-object?forum=windowsuidevelopment
But instead of necroing an old post that already has a lot of irrelevant noise in it, I wanted to focus on my strange solution which I would like to find an explanation for.
To reproduce: using VS2015 on Windows 7 Enterprise SP1, I created a new project from the Win32 template and selected Windows application, which generates a simple window with a menu and a message loop in WinMain. Simply replacing
the Exit menu handler with the following code results in an access violation exactly two minutes after the file open dialog is closed:
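switch (wmId)
{
case IDM_EXIT:
{
    IFileOpenDialog* pfd = nullptr;
    HRESULT hr = CoCreateInstance(CLSID_FileOpenDialog, NULL, CLSCTX_INPROC_SERVER, IID_IFileOpenDialog, reinterpret_cast<void**>(&pfd));
    if (SUCCEEDED(hr))
    {
        pfd->Show(0); // no owner window; the result is ignored in this repro
        pfd->Release();
    }
    break;
}
// remaining cases from the template are unchanged
}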
I initialise COM in STA mode in WinMain like so:
CoInitializeEx(NULL, COINIT_APARTMENTTHREADED | COINIT_DISABLE_OLE1DDE);
// Main message loop:
while (GetMessage(&msg, nullptr, 0, 0))
{
    if (!TranslateAccelerator(msg.hwnd, hAccelTable, &msg))
    {
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }
}
CoUninitialize();
This happens on at least two different machines (with two different daily users) that I have tested. Just before crashing, most often (but not always), one or multiple instances of this message appear in the debug output:
Exception thrown at 0x76BDC54F (KernelBase.dll) in FileDialogTest.exe: 0x80010108: The object invoked has disconnected from its clients.
Now, the "solution" provided in the other thread is to disable the offending Shell extension, which seems to be doing something and crashing on cleanup after 2 minutes. However, what bothers me about this solution is that many other applications
seem to work just fine with all the shell extensions enabled, so what is the difference? If it works for other apps, it should work for mine, right? Also, I don't want to force the users of my app to hunt down and disable shell extensions on their systems.
Firstly, I am still not sure if the exception is directly related to the access violation or whether it is just a coincidence. The common explanation for the 0x80010108 error is that the main STA thread is missing the message pump. But as you can see, that
is clearly not the case, since WinMain continues to pump messages well after the dialog is closed, so I don't understand why the 0x80010108 error is ever reported in the first place.
Secondly, I found an alternative solution that works on both tested machines: initialising COM in multithreaded mode, although according to all the documentation that should be wrong for any kind of work that involves UI. If I pass COINIT_MULTITHREADED
to CoInitializeEx, I never see the above exception reported and I never get a crash. (Since the crash is not 100% reproducible I cannot say for sure, but I have opened the dialog and then waited 2 minutes up to 10 times in a row, whereas in
STA mode it would consistently take at most 3 attempts to crash.) What is it about the multithreaded model that somehow handles the offending Shell extension gracefully? I feel that if I understood that, I could implement the same graceful handling
while retaining STA mode on the main thread. This is important because I also want to support drag and drop, which requires my main thread to call OleInitialize, and that sets up an STA, not an MTA.
The documentation at https://support.microsoft.com/en-us/kb/150777 says that if an MTA thread attempts to create a COM object with an "Apartment" ThreadingModel (which I'm assuming CLSID_FileOpenDialog has), the object will instead be created in a new STA that
COM "spins up". I still don't understand what exactly "spins up" refers to here. Does COM run a new thread that calls CoInitializeEx in STA mode?
To dig deeper, I tried calling OleInitialize on the main thread and then, in the menu message handler, creating a new thread (using CreateThread) that initialises COM, shows the dialog, uninitialises COM, and returns. The access violation still occurred,
regardless of the COM threading model used in the new thread. Since the multithreaded model had fixed the main-thread case, I thought the problem might be that my MTA thread calls CoUninitialize before the 2-minute cleanup
happens, so I removed the CoUninitialize call from the thread function. The access violation still occurred.
The next thing I tried was to make my dialog thread initialise COM in MTA mode and run a message loop after showing the dialog, essentially making the thread never exit. Lo and behold, no access violation. However, I then put a breakpoint in the message loop
and realised that the thread was never actually receiving any messages (somewhat expected, given that the docs say no message pump is required in an MTA). So I suspected that what really matters is that the MTA thread never exits. I created a semaphore and replaced
the message loop in the MTA thread with WaitForSingleObject(mySemaphore, INFINITE), and that worked just as well: no access violation. It seems that when the thread exits, COM detects that and does some kind of cleanup even though the thread never called
CoUninitialize, which again makes some shell extensions run into a wall at the 2-minute mark.
So there you have it: to avoid the access violation, you need to show the file dialog from an MTA thread that never exits. That could be the main thread or another thread. Obviously, in my naive setup above I ended up with multiple zombie threads stuck waiting
on the semaphore, so my final solution is to reuse a single thread: it waits on the semaphore in a loop, and I signal the semaphore each time a dialog is needed. I use a critical section to pass data to and from the file dialog thread.
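For anyone who wants to try the same workaround, here is a minimal sketch of the reusable dialog thread described above. It is not my exact code: DialogThread, Job and Invoke are names made up for this illustration, and it uses std::mutex and std::condition_variable in place of the Win32 semaphore and critical section. The COM calls are only compiled on Windows; elsewhere the class is just a plain worker thread. It also assumes a single caller (the UI thread) that blocks until the job finishes, which is a simplification.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#ifdef _WIN32
#include <windows.h>
#include <objbase.h>
#endif

// Hypothetical helper: one long-lived worker thread that runs every dialog request.
class DialogThread {
public:
    using Job = std::function<std::string()>;

    DialogThread() : worker_([this] { Run(); }) {}

    // Stops the worker. For the workaround to hold, this should only happen
    // at process shutdown, so the thread outlives any delayed cleanup.
    ~DialogThread() {
        {
            std::lock_guard<std::mutex> lock(m_);
            quit_ = true;
        }
        cv_.notify_all();
        worker_.join();
    }

    // Runs the job on the worker thread and blocks until its result is ready.
    // Assumes only one thread calls Invoke at a time.
    std::string Invoke(Job job) {
        assert(static_cast<bool>(job));
        std::unique_lock<std::mutex> lock(m_);
        job_ = std::move(job);
        has_job_ = true;
        done_ = false;
        cv_.notify_all();
        cv_.wait(lock, [this] { return done_; });
        return std::move(result_);
    }

private:
    void Run() {
#ifdef _WIN32
        // The worker joins the MTA, as in the workaround from the post.
        HRESULT hr = CoInitializeEx(nullptr, COINIT_MULTITHREADED);
#endif
        std::unique_lock<std::mutex> lock(m_);
        for (;;) {
            cv_.wait(lock, [this] { return has_job_ || quit_; });
            if (quit_) break;
            Job job = std::move(job_);
            has_job_ = false;
            lock.unlock();
            std::string r = job();  // e.g. create and show the file dialog here
            lock.lock();
            result_ = std::move(r);
            done_ = true;
            cv_.notify_all();
        }
#ifdef _WIN32
        if (SUCCEEDED(hr)) CoUninitialize();
#endif
    }

    std::mutex m_;
    std::condition_variable cv_;
    Job job_;
    std::string result_;
    bool has_job_ = false;
    bool done_ = false;
    bool quit_ = false;
    std::thread worker_;  // declared last so the other members exist before Run starts
};
```

In the real application the job passed to Invoke would do the CoCreateInstance / Show / Release sequence and return the selected path. Note that blocking the UI thread inside Invoke means the main window does not pump messages while the dialog is open, so a production version would more likely post the result back to the window instead.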
I would still like to get an explanation for why all this works the way it does, and ultimately, I would like to figure out if there's another, more sane way to handle the access violation, without requiring the app users to disable shell extensions.