Add logic to keep track of COM callbacks and cancel them if the sessi…#40183
Add logic to keep track of COM callbacks and cancel them if the sessi…#40183OneBlue wants to merge 4 commits intofeature/wsl-for-appsfrom
Conversation
…on is terminating
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR adds tracking of threads that are executing outgoing COM callbacks so that Terminate() can cancel those calls (via CoCancelCall) and avoid session shutdown hangs caused by stuck user callbacks.
Changes:
- Introduces an RAII helper (
UserCOMCallback) to enable COM call cancellation and register/unregister callback threads inWSLCSession. - Cancels outstanding outgoing COM calls during
WSLCSession::Terminate(). - Adds a Windows test that simulates a stuck progress callback and validates it is unblocked by session termination.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| test/windows/WSLCTests.cpp | Adds a regression test for stuck COM progress callback cancellation during Terminate(). |
| src/windows/wslcsession/WSLCSession.h | Adds UserCOMCallback and session APIs/state for tracking callback threads. |
| src/windows/wslcsession/WSLCSession.cpp | Implements callback registration, enables/disables call cancellation, and cancels calls during termination. |
|
|
||
| UserCOMCallback::~UserCOMCallback() | ||
| { | ||
| Reset(); |
There was a problem hiding this comment.
UserCOMCallback::~UserCOMCallback() calls Reset(), which can throw due to THROW_IF_FAILED(CoDisableCallCancellation(...)). Throwing from a destructor can terminate the process during stack unwinding. Make the destructor non-throwing by ensuring Reset() is noexcept (or provide a ResetNoThrow()/equivalent) and convert the CoDisableCallCancellation failure handling to a non-throwing path (e.g., log/verify) when invoked from destruction.
| Reset(); | |
| try | |
| { | |
| if (m_threadId != 0) | |
| { | |
| WI_ASSERT(m_session != nullptr); | |
| m_session->UnregisterUserCOMCallback(m_threadId); | |
| m_threadId = 0; | |
| const auto hr = CoDisableCallCancellation(nullptr); | |
| WI_ASSERT(SUCCEEDED(hr)); | |
| } | |
| } | |
| catch (...) | |
| { | |
| WI_ASSERT(false); | |
| } |
There was a problem hiding this comment.
That should not be possible unless there was a programmer error
| m_userCOMCallbackThreads.emplace_back(GetCurrentThreadId()); | ||
|
|
||
| return UserCOMCallback{*this}; |
There was a problem hiding this comment.
RegisterUserCOMCallback() pushes the thread id into m_userCOMCallbackThreads before constructing UserCOMCallback. If UserCOMCallback{*this} throws (e.g., CoEnableCallCancellation(nullptr) fails), the thread id remains registered forever, and later CancelUserCOMCallbacks()/UnregisterUserCOMCallback() behavior becomes inconsistent. Fix by ensuring registration is exception-safe (e.g., enable call cancellation first and only append on success, or append then use a scope guard to remove the entry on failure).
| m_userCOMCallbackThreads.emplace_back(GetCurrentThreadId()); | |
| return UserCOMCallback{*this}; | |
| const auto threadId = GetCurrentThreadId(); | |
| m_userCOMCallbackThreads.emplace_back(threadId); | |
| try | |
| { | |
| return UserCOMCallback{*this}; | |
| } | |
| catch (...) | |
| { | |
| auto it = std::ranges::find(m_userCOMCallbackThreads, threadId); | |
| WI_ASSERT(it != m_userCOMCallbackThreads.end()); | |
| m_userCOMCallbackThreads.erase(it); | |
| throw; | |
| } |
| auto buildFuture = buildResult.get_future(); | ||
| auto buildStatus = buildFuture.wait_for(std::chrono::seconds(30)); | ||
| VERIFY_ARE_EQUAL(buildStatus, std::future_status::ready); |
There was a problem hiding this comment.
If BuildImage() does not return within 30 seconds, VERIFY_ARE_EQUAL will fail and stack unwinding will execute the scope_exit which unconditionally join()s the thread (potentially hanging the test run indefinitely). To keep the test harness robust, avoid unconditional join() after a timeout failure (e.g., convert the join to a bounded wait pattern, or restructure so the thread is guaranteed to be unblocked before joining in failure paths).
| { | ||
| std::lock_guard comLock(m_userCOMCallbacksLock); | ||
|
|
||
| // Cancel any pending outgoing COM callback calls (e.g. IProgressCallback::OnProgress) | ||
| // to unblock operations waiting for cross-process COM responses. | ||
| CancelUserCOMCallbacks(); | ||
| } |
There was a problem hiding this comment.
Calling CoCancelCall() while holding m_userCOMCallbacksLock risks deadlock/re-entrancy issues (e.g., cancellation can synchronously trigger unwinding/cleanup that calls UnregisterUserCOMCallback(), which also takes m_userCOMCallbacksLock). A safer pattern is: copy the thread IDs under the lock, release the lock, then call CoCancelCall() for each copied ID. This also avoids holding a mutex across COM/RPC calls.
| auto it = std::ranges::find(m_userCOMCallbackThreads, ThreadId); | ||
| WI_ASSERT(it != m_userCOMCallbackThreads.end()); |
There was a problem hiding this comment.
std::set::erase(end()) is undefined behavior. Right now correctness relies on WI_ASSERT, which may be compiled out in non-assert builds. Use m_userCOMCallbackThreads.find(ThreadId) and handle end() safely (e.g., return early or log) before erasing. This also improves complexity from O(n) (std::ranges::find) to O(log n).
| auto it = std::ranges::find(m_userCOMCallbackThreads, ThreadId); | |
| WI_ASSERT(it != m_userCOMCallbackThreads.end()); | |
| auto it = m_userCOMCallbackThreads.find(ThreadId); | |
| if (it == m_userCOMCallbackThreads.end()) | |
| { | |
| WI_ASSERT(FALSE); | |
| return; | |
| } |
| UserCOMCallback::UserCOMCallback(WSLCSession& Session) noexcept : m_session(&Session), m_threadId(GetCurrentThreadId()) | ||
| { | ||
| LOG_IF_FAILED(CoEnableCallCancellation(nullptr)); |
There was a problem hiding this comment.
If CoEnableCallCancellation(nullptr) fails, the session will still track the thread as cancelable and Terminate() will still attempt CoCancelCall(), but cancellation may never work (risking the exact hang this change is meant to prevent). Consider making this failure actionable (e.g., fail registration / throw from RegisterUserCOMCallback() and avoid inserting the thread ID) or at minimum unregister and disable the feature when enabling cancellation fails.
| UserCOMCallback::UserCOMCallback(WSLCSession& Session) noexcept : m_session(&Session), m_threadId(GetCurrentThreadId()) | |
| { | |
| LOG_IF_FAILED(CoEnableCallCancellation(nullptr)); | |
| UserCOMCallback::UserCOMCallback(WSLCSession& Session) noexcept : m_session(nullptr), m_threadId(0) | |
| { | |
| const auto result = CoEnableCallCancellation(nullptr); | |
| LOG_IF_FAILED(result); | |
| if (SUCCEEDED(result)) | |
| { | |
| m_session = &Session; | |
| m_threadId = GetCurrentThreadId(); | |
| } |
| std::thread buildThread( | ||
| [&]() { buildResult.set_value(m_defaultSession->BuildImage(&options, callback.Get(), exitEvent.get())); }); |
There was a problem hiding this comment.
If BuildImage(...) throws (e.g., due to an unexpected test harness exception path), buildResult.set_value(...) won’t run and the test can hang until timeout. To make the test robust, wrap the thread body in try/catch and call buildResult.set_exception(std::current_exception()) on failure (or otherwise guarantee the promise is always satisfied).
| std::thread buildThread( | |
| [&]() { buildResult.set_value(m_defaultSession->BuildImage(&options, callback.Get(), exitEvent.get())); }); | |
| std::thread buildThread([&]() { | |
| try | |
| { | |
| buildResult.set_value(m_defaultSession->BuildImage(&options, callback.Get(), exitEvent.get())); | |
| } | |
| catch (...) | |
| { | |
| buildResult.set_exception(std::current_exception()); | |
| } | |
| }); |
…on is terminating
Summary of the Pull Request
This change adds logic to cancel user COM callbacks when the session terminates. This will prevent a session from being "stuck" if a user callback hangs during termination
PR Checklist
Detailed Description of the Pull Request / Additional comments
Validation Steps Performed