
Add logic to keep track of COM callbacks and cancel them if the sessi… #40183

Draft
OneBlue wants to merge 4 commits into feature/wsl-for-apps from user/oneblue/com-cancel

OneBlue (Collaborator) commented Apr 14, 2026

…on is terminating

Summary of the Pull Request

This change adds logic to cancel user COM callbacks when the session terminates. This prevents a session from getting "stuck" if a user callback hangs during termination.

PR Checklist

  • Closes: Link to issue #xxx
  • Communication: I've discussed this with core contributors already. If work hasn't been agreed, this work might be rejected
  • Tests: Added/updated if needed and all pass
  • Localization: All end user facing strings can be localized
  • Dev docs: Added/updated if needed
  • Documentation updated: If checked, please file a pull request on our docs repo and link it here: #xxx

Detailed Description of the Pull Request / Additional comments

Validation Steps Performed

Copilot AI review requested due to automatic review settings April 14, 2026 23:49
Copilot AI (Contributor) left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR adds tracking of threads that are executing outgoing COM callbacks so that Terminate() can cancel those calls (via CoCancelCall) and avoid session shutdown hangs caused by stuck user callbacks.

Changes:

  • Introduces an RAII helper (UserCOMCallback) to enable COM call cancellation and register/unregister callback threads in WSLCSession.
  • Cancels outstanding outgoing COM calls during WSLCSession::Terminate().
  • Adds a Windows test that simulates a stuck progress callback and validates it is unblocked by session termination.
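
The tracking idea behind the RAII helper can be sketched in portable C++. This is a minimal illustration rather than the PR's code: `Session`, `CallbackGuard`, and `ActiveCallbacks` are hypothetical names, `std::thread::id` stands in for the Windows thread ID, and the COM calls that enable and disable call cancellation are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the session: it only records which threads are
// currently inside a user callback. No COM calls are made here.
class Session
{
public:
    // RAII guard: registers the current thread on construction and removes
    // it again on destruction, even if the callback exits via an exception.
    class CallbackGuard
    {
    public:
        explicit CallbackGuard(Session& session) : m_session(&session), m_thread(std::this_thread::get_id())
        {
            std::lock_guard<std::mutex> lock(m_session->m_lock);
            m_session->m_threads.push_back(m_thread);
        }

        CallbackGuard(const CallbackGuard&) = delete;
        CallbackGuard& operator=(const CallbackGuard&) = delete;

        ~CallbackGuard()
        {
            std::lock_guard<std::mutex> lock(m_session->m_lock);
            auto it = std::find(m_session->m_threads.begin(), m_session->m_threads.end(), m_thread);
            if (it != m_session->m_threads.end())
            {
                m_session->m_threads.erase(it);
            }
        }

    private:
        Session* m_session;
        std::thread::id m_thread;
    };

    // Number of threads currently registered as running a callback.
    std::size_t ActiveCallbacks()
    {
        std::lock_guard<std::mutex> lock(m_lock);
        return m_threads.size();
    }

private:
    std::mutex m_lock;
    std::vector<std::thread::id> m_threads;
};
```

A caller would create a `CallbackGuard` on the stack immediately before invoking the user callback, so the registration lasts exactly as long as the outgoing call.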

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| test/windows/WSLCTests.cpp | Adds a regression test for stuck COM progress callback cancellation during Terminate(). |
| src/windows/wslcsession/WSLCSession.h | Adds UserCOMCallback and session APIs/state for tracking callback threads. |
| src/windows/wslcsession/WSLCSession.cpp | Implements callback registration, enables/disables call cancellation, and cancels calls during termination. |


UserCOMCallback::~UserCOMCallback()
{
    Reset();

Copilot AI Apr 14, 2026

UserCOMCallback::~UserCOMCallback() calls Reset(), which can throw due to THROW_IF_FAILED(CoDisableCallCancellation(...)). Throwing from a destructor can terminate the process during stack unwinding. Make the destructor non-throwing by ensuring Reset() is noexcept (or provide a ResetNoThrow()/equivalent) and convert the CoDisableCallCancellation failure handling to a non-throwing path (e.g., log/verify) when invoked from destruction.

Suggested change
- Reset();
+ try
+ {
+     if (m_threadId != 0)
+     {
+         WI_ASSERT(m_session != nullptr);
+         m_session->UnregisterUserCOMCallback(m_threadId);
+         m_threadId = 0;
+         const auto hr = CoDisableCallCancellation(nullptr);
+         WI_ASSERT(SUCCEEDED(hr));
+     }
+ }
+ catch (...)
+ {
+     WI_ASSERT(false);
+ }

OneBlue (Collaborator, Author)

That should not be possible unless there was a programmer error
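
For reference, the non-throwing destructor pattern discussed in this thread can be sketched without any COM dependency. Everything here is hypothetical: `CleanupGuard` is an illustrative name, the `cleanup` function stands in for a call such as CoDisableCallCancellation and returns 0 on success, and the string log stands in for whatever logging the project uses.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical guard whose destructor reports a failed cleanup instead of
// throwing. `cleanup` returns 0 on success, any other value on failure.
class CleanupGuard
{
public:
    CleanupGuard(std::function<int()> cleanup, std::vector<std::string>& log) :
        m_cleanup(std::move(cleanup)), m_log(log)
    {
    }

    CleanupGuard(const CleanupGuard&) = delete;
    CleanupGuard& operator=(const CleanupGuard&) = delete;

    ~CleanupGuard() noexcept
    {
        try
        {
            const int result = m_cleanup();
            if (result != 0)
            {
                // Record the failure; do not throw from the destructor.
                m_log.push_back("cleanup failed: " + std::to_string(result));
            }
        }
        catch (...)
        {
            // Nothing may escape: an exception leaving a noexcept destructor
            // would call std::terminate.
        }
    }

private:
    std::function<int()> m_cleanup;
    std::vector<std::string>& m_log;
};
```

Whether a failure here should be logged, asserted, or treated as fatal is a design decision for the PR; the sketch only shows that the choice can be made without letting an exception leave the destructor.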

Comment on lines +2229 to +2231
m_userCOMCallbackThreads.emplace_back(GetCurrentThreadId());

return UserCOMCallback{*this};
Copilot AI Apr 14, 2026

RegisterUserCOMCallback() pushes the thread id into m_userCOMCallbackThreads before constructing UserCOMCallback. If UserCOMCallback{*this} throws (e.g., CoEnableCallCancellation(nullptr) fails), the thread id remains registered forever, and later CancelUserCOMCallbacks()/UnregisterUserCOMCallback() behavior becomes inconsistent. Fix by ensuring registration is exception-safe (e.g., enable call cancellation first and only append on success, or append then use a scope guard to remove the entry on failure).

Suggested change
- m_userCOMCallbackThreads.emplace_back(GetCurrentThreadId());
- return UserCOMCallback{*this};
+ const auto threadId = GetCurrentThreadId();
+ m_userCOMCallbackThreads.emplace_back(threadId);
+ try
+ {
+     return UserCOMCallback{*this};
+ }
+ catch (...)
+ {
+     auto it = std::ranges::find(m_userCOMCallbackThreads, threadId);
+     WI_ASSERT(it != m_userCOMCallbackThreads.end());
+     m_userCOMCallbackThreads.erase(it);
+     throw;
+ }
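
The other option mentioned in the comment, enabling first and appending only on success, might look like the following portable sketch. `RegisterThread` and the `enable` parameter are hypothetical; `enable` stands in for the step that can fail, and a plain `std::vector` stands in for the session's thread list.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical registration helper. `enable` stands in for the fallible
// step (for example, turning on call cancellation) and may throw.
template <typename Enable>
void RegisterThread(std::vector<unsigned long>& threads, unsigned long threadId, Enable enable)
{
    // Run the fallible step first: if it throws, nothing has been
    // registered yet, so there is no stale entry to clean up.
    enable();

    // Only a successfully enabled thread is tracked.
    // Note: if push_back itself throws, real code would also need to undo
    // whatever `enable` did; that is omitted from this sketch.
    threads.push_back(threadId);
}
```

Compared with the try/catch rollback, this ordering needs no cleanup path for the list, at the cost of having to undo the enable step if the append fails.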

Comment on lines +2572 to +2574
auto buildFuture = buildResult.get_future();
auto buildStatus = buildFuture.wait_for(std::chrono::seconds(30));
VERIFY_ARE_EQUAL(buildStatus, std::future_status::ready);
Copilot AI Apr 14, 2026

If BuildImage() does not return within 30 seconds, VERIFY_ARE_EQUAL will fail and stack unwinding will execute the scope_exit which unconditionally join()s the thread (potentially hanging the test run indefinitely). To keep the test harness robust, avoid unconditional join() after a timeout failure (e.g., convert the join to a bounded wait pattern, or restructure so the thread is guaranteed to be unblocked before joining in failure paths).
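
One way to structure such a test so that the join can never hang is sketched below. It is an illustration under assumptions, not the test from this PR: `RunWithTimeout` is a hypothetical helper, and `unblock` is a hypothetical hook that forces the worker to return (in the real test that role would be played by whatever releases the stuck callback).

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Runs `work` on a separate thread and waits up to `timeout` for it.
// If the wait times out, `unblock` is called before join() so the join
// cannot block forever. Returns true if `work` finished in time.
// Assumes `work` does not throw.
template <typename Work, typename Unblock>
bool RunWithTimeout(Work work, Unblock unblock, std::chrono::milliseconds timeout)
{
    std::promise<void> done;
    auto doneFuture = done.get_future();

    std::thread worker([&] {
        work();
        done.set_value();
    });

    const bool finished = doneFuture.wait_for(timeout) == std::future_status::ready;
    if (!finished)
    {
        // Failure path: release the worker first, then join.
        unblock();
    }

    worker.join();
    return finished;
}
```

The caller asserts on the returned flag after the thread has been joined, so a timeout produces a test failure instead of an indefinite hang during unwinding.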

Copilot AI review requested due to automatic review settings April 15, 2026 03:44
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Comment on lines +1976 to +1982
{
    std::lock_guard comLock(m_userCOMCallbacksLock);

    // Cancel any pending outgoing COM callback calls (e.g. IProgressCallback::OnProgress)
    // to unblock operations waiting for cross-process COM responses.
    CancelUserCOMCallbacks();
}

Copilot AI Apr 15, 2026

Calling CoCancelCall() while holding m_userCOMCallbacksLock risks deadlock/re-entrancy issues (e.g., cancellation can synchronously trigger unwinding/cleanup that calls UnregisterUserCOMCallback(), which also takes m_userCOMCallbacksLock). A safer pattern is: copy the thread IDs under the lock, release the lock, then call CoCancelCall() for each copied ID. This also avoids holding a mutex across COM/RPC calls.
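
The copy-then-cancel pattern described above can be sketched in portable C++. `CallbackTracker` and its members are hypothetical names, and the `cancel` function passed to `CancelAll` stands in for CoCancelCall; the point is only that `cancel` runs after the mutex has been released, so it may safely call back into `Unregister`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical tracker of threads that are inside a user callback.
class CallbackTracker
{
public:
    void Register(unsigned long threadId)
    {
        std::lock_guard<std::mutex> lock(m_lock);
        m_threads.push_back(threadId);
    }

    void Unregister(unsigned long threadId)
    {
        std::lock_guard<std::mutex> lock(m_lock);
        m_threads.erase(std::remove(m_threads.begin(), m_threads.end(), threadId), m_threads.end());
    }

    // `cancel` stands in for CoCancelCall. It is invoked on a snapshot of
    // the IDs, after m_lock has been released.
    void CancelAll(const std::function<void(unsigned long)>& cancel)
    {
        std::vector<unsigned long> snapshot;
        {
            std::lock_guard<std::mutex> lock(m_lock);
            snapshot = m_threads; // copy under the lock
        }

        for (const auto threadId : snapshot)
        {
            cancel(threadId); // no lock held here
        }
    }

    std::size_t Count()
    {
        std::lock_guard<std::mutex> lock(m_lock);
        return m_threads.size();
    }

private:
    std::mutex m_lock;
    std::vector<unsigned long> m_threads;
};
```

One consequence of the snapshot is that an ID may already have been unregistered by the time `cancel` runs, so the cancel step has to tolerate a thread that no longer has an outstanding call.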

Comment on lines +2239 to +2240
auto it = std::ranges::find(m_userCOMCallbackThreads, ThreadId);
WI_ASSERT(it != m_userCOMCallbackThreads.end());
Copilot AI Apr 15, 2026

std::set::erase(end()) is undefined behavior. Right now correctness relies on WI_ASSERT, which may be compiled out in non-assert builds. Use m_userCOMCallbackThreads.find(ThreadId) and handle end() safely (e.g., return early or log) before erasing. This also improves complexity from O(n) (std::ranges::find) to O(log n).

Suggested change
- auto it = std::ranges::find(m_userCOMCallbackThreads, ThreadId);
- WI_ASSERT(it != m_userCOMCallbackThreads.end());
+ auto it = m_userCOMCallbackThreads.find(ThreadId);
+ if (it == m_userCOMCallbackThreads.end())
+ {
+     WI_ASSERT(FALSE);
+     return;
+ }
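
If the container is a `std::set`, as this comment assumes, the key-based `erase` overload avoids the iterator entirely: it is well defined for a missing key and returns the number of elements removed. A minimal sketch, with `UnregisterThread` as a hypothetical free function:

```cpp
#include <cassert>
#include <set>

// Returns true if the ID was registered and has been removed.
// std::set::erase(key) returns 0 when the key is absent, so there is no
// end() iterator to guard against.
bool UnregisterThread(std::set<unsigned long>& threads, unsigned long threadId)
{
    return threads.erase(threadId) == 1;
}
```

The caller can still assert or log on a `false` result to flag the programmer error, without the release build depending on that assertion for memory safety.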

Comment on lines +224 to +226
UserCOMCallback::UserCOMCallback(WSLCSession& Session) noexcept : m_session(&Session), m_threadId(GetCurrentThreadId())
{
    LOG_IF_FAILED(CoEnableCallCancellation(nullptr));

Copilot AI Apr 15, 2026

If CoEnableCallCancellation(nullptr) fails, the session will still track the thread as cancelable and Terminate() will still attempt CoCancelCall(), but cancellation may never work (risking the exact hang this change is meant to prevent). Consider making this failure actionable (e.g., fail registration / throw from RegisterUserCOMCallback() and avoid inserting the thread ID) or at minimum unregister and disable the feature when enabling cancellation fails.

Suggested change
- UserCOMCallback::UserCOMCallback(WSLCSession& Session) noexcept : m_session(&Session), m_threadId(GetCurrentThreadId())
- {
-     LOG_IF_FAILED(CoEnableCallCancellation(nullptr));
+ UserCOMCallback::UserCOMCallback(WSLCSession& Session) noexcept : m_session(nullptr), m_threadId(0)
+ {
+     const auto result = CoEnableCallCancellation(nullptr);
+     LOG_IF_FAILED(result);
+     if (SUCCEEDED(result))
+     {
+         m_session = &Session;
+         m_threadId = GetCurrentThreadId();
+     }

Comment on lines +2556 to +2557
std::thread buildThread(
    [&]() { buildResult.set_value(m_defaultSession->BuildImage(&options, callback.Get(), exitEvent.get())); });

Copilot AI Apr 15, 2026

If BuildImage(...) throws (e.g., due to an unexpected test harness exception path), buildResult.set_value(...) won’t run and the test can hang until timeout. To make the test robust, wrap the thread body in try/catch and call buildResult.set_exception(std::current_exception()) on failure (or otherwise guarantee the promise is always satisfied).

Suggested change
- std::thread buildThread(
-     [&]() { buildResult.set_value(m_defaultSession->BuildImage(&options, callback.Get(), exitEvent.get())); });
+ std::thread buildThread([&]() {
+     try
+     {
+         buildResult.set_value(m_defaultSession->BuildImage(&options, callback.Get(), exitEvent.get()));
+     }
+     catch (...)
+     {
+         buildResult.set_exception(std::current_exception());
+     }
+ });
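
The same idea can be factored into a small helper so every test thread satisfies its promise one way or the other. This is a sketch with a hypothetical name (`Fulfill`), independent of the WSL test code:

```cpp
#include <cassert>
#include <exception>
#include <future>
#include <stdexcept>

// Runs `work` and stores either its return value or the exception it threw
// in `result`, so a waiter on the matching future is always released.
template <typename T, typename Work>
void Fulfill(std::promise<T>& result, Work work)
{
    try
    {
        result.set_value(work());
    }
    catch (...)
    {
        // Forward the failure to the waiting side instead of losing it.
        result.set_exception(std::current_exception());
    }
}
```

On the waiting side, `future.get()` then rethrows the stored exception, which turns a silent hang into an immediate, diagnosable test failure.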
