Infinite loop in CefResourceManager::RemoveAllProviders when provider has deletion_pending_ state #4173

@maurovisonaqlik

Description

CefResourceManager::DeleteProvider does not advance the iterator when it encounters a ProviderEntry that already has deletion_pending_ == true. This causes RemoveAllProviders (and RemoveProviders) to spin in an infinite while loop, consuming 100% CPU on the IO thread indefinitely.

The bug is triggered when RemoveProviders (or any path that sets deletion_pending_) is called on a provider that has pending requests, and RemoveAllProviders is called before the asynchronous cleanup tasks (ContinueOnIOThread / StopOnIOThread) have drained from the IO thread message queue.

Affected code

libcef_dll/wrapper/cef_resource_manager.cc, DeleteProvider method:

void CefResourceManager::DeleteProvider(ProviderEntryList::iterator& iterator,
                                        bool stop) {
  ProviderEntry* current_entry = *(iterator);

  if (current_entry->deletion_pending_) {
    return;  // BUG: iterator is not advanced
  }
  // ...
}

RemoveAllProviders calls DeleteProvider in a loop:

void CefResourceManager::RemoveAllProviders() {
  // ...
  ProviderEntryList::iterator it = providers_.begin();
  while (it != providers_.end()) {
    DeleteProvider(it, true);  // if deletion_pending_, iterator never advances → infinite loop
  }
}

The same bug also affects RemoveProviders:

void CefResourceManager::RemoveProviders(const std::string& identifier) {
  // ...
  while (it != providers_.end()) {
    if ((*it)->identifier_ == identifier) {
      DeleteProvider(it, false);  // same issue: no ++it if deletion_pending_
    } else {
      ++it;
    }
  }
}

Root cause

When DeleteProvider is called on a provider that has pending requests, it:

  1. Sets deletion_pending_ = true
  2. Calls request->Stop() or request->Continue(nullptr) on each pending request — but these always post an asynchronous task back to TID_IO (even when already on TID_IO), via CefPostTask(TID_IO, BindOnce(&StopOnIOThread, std::move(state_)))
  3. Advances ++iterator

The asynchronous task (StopOnIOThread → StopRequest → DetachRequestFromProvider) is the only code that removes the request from pending_requests_ and eventually erases the ProviderEntry from the list.

If DeleteProvider is called again on the same entry (e.g., by RemoveAllProviders iterating the list), it finds deletion_pending_ == true and returns without advancing the iterator. Since RemoveAllProviders does not advance the iterator either, this creates an infinite busy-loop.

The cleanup task that would resolve deletion_pending_ is queued on TID_IO, but TID_IO is stuck in the loop and will never process it — a logical deadlock.
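
The spin can be seen in isolation with a small standalone model. This is a sketch, not CEF code: Entry, DeleteProviderModel and RemoveAllModel are hypothetical names, only the deletion_pending flag is modeled, and the loop is capped at max_calls so the buggy variant terminates.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Simplified stand-in for ProviderEntry (assumption: only the flag that
// matters for the bug is modeled).
struct Entry {
  bool deletion_pending = false;
};

using EntryList = std::list<Entry>;

// Mirrors the early-return branch of DeleteProvider. When
// advance_when_pending is false the iterator is left untouched, which is
// the behavior described in this report.
void DeleteProviderModel(EntryList& entries, EntryList::iterator& it,
                         bool advance_when_pending) {
  assert(it != entries.end());
  if (it->deletion_pending) {
    if (advance_when_pending) {
      ++it;
    }
    return;
  }
  // Model an entry that still has pending requests: mark it and move on.
  it->deletion_pending = true;
  ++it;
}

// Same loop shape as RemoveAllProviders, capped at max_calls so the buggy
// variant cannot hang. Returns the number of DeleteProviderModel calls made.
std::size_t RemoveAllModel(EntryList& entries, bool advance_when_pending,
                           std::size_t max_calls) {
  std::size_t calls = 0;
  EntryList::iterator it = entries.begin();
  while (it != entries.end() && calls < max_calls) {
    DeleteProviderModel(entries, it, advance_when_pending);
    ++calls;
  }
  return calls;
}
```

With a list whose first entry is already pending, the variant that does not advance uses up the whole call budget on that one entry, while the advancing variant finishes in one call per entry.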

Steps to reproduce

The following sequence triggers the bug. It can happen in any application that calls RemoveProviders followed by RemoveAllProviders from a non-IO thread (both calls post to TID_IO as separate tasks):

Thread X (not TID_IO):
  resource_manager->RemoveProviders("my-provider");   // posts Task A to TID_IO
  resource_manager->RemoveAllProviders();              // posts Task B to TID_IO

TID_IO executes Task A:
  RemoveProviders("my-provider")
    → provider has pending requests
    → deletion_pending_ = true
    → request->Continue(nullptr) posts cleanup Task C to TID_IO
    → ++iterator

TID_IO executes Task B (before Task C):
  RemoveAllProviders()
    → while (it != end) { DeleteProvider(it, true); }
    → DeleteProvider finds deletion_pending_ == true → return (no ++it)
    → while re-checks: it != end → true → infinite loop

Task C is never executed (TID_IO is stuck in the loop)
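
The ordering of Tasks A, B and C can be illustrated with a toy FIFO queue. This is a sketch under one assumption: TID_IO runs tasks strictly in the order they were posted. TaskRunner and RunScenario are hypothetical names, not CEF APIs.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy single-threaded FIFO task runner standing in for TID_IO.
class TaskRunner {
 public:
  void Post(std::function<void()> task) {
    assert(task);  // reject empty tasks
    queue_.push_back(std::move(task));
  }

  void RunUntilIdle() {
    while (!queue_.empty()) {
      std::function<void()> task = std::move(queue_.front());
      queue_.pop_front();
      task();
    }
  }

 private:
  std::deque<std::function<void()>> queue_;
};

// Task A and Task B are posted back to back from "Thread X". While it runs,
// Task A posts the cleanup Task C, which lands behind Task B in the queue.
std::vector<std::string> RunScenario() {
  TaskRunner io;
  std::vector<std::string> order;
  io.Post([&] {
    order.push_back("A");
    io.Post([&] { order.push_back("C"); });
  });
  io.Post([&] { order.push_back("B"); });
  io.RunUntilIdle();
  return order;
}
```

In this model RunScenario() yields the order A, B, C. In the real code Task B never returns, so the cleanup in Task C never gets a chance to run.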

Minimal reproducer (unit test)

#include "include/wrapper/cef_resource_manager.h"
#include "include/wrapper/cef_helpers.h"
#include "include/base/cef_logging.h"  // LOG(ERROR)
#include <atomic>
#include <chrono>
#include <cstdlib>  // std::abort
#include <thread>

// A provider that returns true from OnRequest (claims to handle the request)
// but defers the actual Continue call, keeping the request "pending".
class DeferredProvider : public CefResourceManager::Provider {
 public:
  bool OnRequest(scoped_refptr<CefResourceManager::Request> request) override {
    // Hold the request without calling Continue/Stop.
    // This keeps it in the provider's pending_requests_ list.
    pending_request_ = request;
    return true;
  }

  void CompletePending() {
    if (pending_request_) {
      pending_request_->Continue(nullptr);
      pending_request_ = nullptr;
    }
  }

  scoped_refptr<CefResourceManager::Request> pending_request_;
};

// Test must run on TID_IO to directly observe the bug.
// In a real CEF application, wrap with CefPostTask(TID_IO, ...).
void ReproduceInfiniteLoop() {
  CEF_REQUIRE_IO_THREAD();

  CefRefPtr<CefResourceManager> manager = new CefResourceManager();

  // Add a provider that will hold a pending request.
  auto* provider = new DeferredProvider();
  manager->AddProvider(provider, 100, "test-provider");

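  // Simulate an incoming resource request so the provider gets a pending
  // request. (In a real scenario this happens via OnBeforeResourceLoad.)
  // NOTE: that step is not shown here. Without a pending request,
  // RemoveProviders erases the provider immediately and the hang below
  // does not occur.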

  // 1. First call: RemoveProviders marks the provider as deletion_pending_
  //    because it has a pending request (provider returned true from OnRequest
  //    and request->Continue has not been called yet, or its cleanup task
  //    hasn't drained).
  manager->RemoveProviders("test-provider");

  // 2. Second call: RemoveAllProviders hits the infinite loop.
  //    WARNING: Without the fix, this call NEVER returns.
  //    Use a watchdog thread to detect the hang.
  std::atomic<bool> completed{false};

  std::thread watchdog([&completed]() {
    std::this_thread::sleep_for(std::chrono::seconds(5));
    if (!completed.load()) {
      LOG(ERROR) << "FAIL: RemoveAllProviders did not return within 5 seconds "
                    "(infinite loop detected)";
      std::abort();
    }
  });

  manager->RemoveAllProviders();  // Hangs without the fix
  completed.store(true);

  watchdog.join();
}

Proposed fix

Add ++iterator in the deletion_pending_ early-return branch of DeleteProvider:

 void CefResourceManager::DeleteProvider(ProviderEntryList::iterator& iterator,
                                         bool stop) {
   CEF_REQUIRE_IO_THREAD();

   ProviderEntry* current_entry = *(iterator);

   if (current_entry->deletion_pending_) {
+    // Already pending deletion (e.g., from a prior RemoveProviders call).
+    // Advance the iterator so the caller does not spin on this entry.
+    // The entry will be cleaned up by DetachRequestFromProvider once all
+    // pending requests have completed asynchronously.
+    ++iterator;
     return;
   }

This is safe because:

  • The entry is already marked for deletion and its pending requests have already been told to Stop/Continue.
  • DetachRequestFromProvider will erase the entry from the list once all pending requests drain.
  • The destructor's safety net (~CefResourceManager force-deletes remaining entries) handles any entries that haven't drained yet.
  • GetNextValidProvider already skips deletion_pending_ entries, so advancing past them is consistent with the rest of the design.
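
The first two points can be checked against a standalone model of the fixed behavior. This is a sketch, not the CEF implementation: ManagerModel and its members are hypothetical, pending requests are reduced to a counter, io_queue stands in for TID_IO, and the branch for entries without pending requests is assumed to erase them immediately.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <list>
#include <utility>

struct Entry {
  int pending_requests = 0;
  bool deletion_pending = false;
};

struct ManagerModel {
  std::list<Entry> entries;
  std::deque<std::function<void()>> io_queue;  // stands in for TID_IO

  // DeleteProvider with the proposed fix applied.
  void DeleteProvider(std::list<Entry>::iterator& it) {
    assert(it != entries.end());
    if (it->deletion_pending) {
      ++it;  // the fix: skip entries that are already draining
      return;
    }
    if (it->pending_requests == 0) {
      it = entries.erase(it);  // assumption: nothing pending, erase now
      return;
    }
    it->deletion_pending = true;
    Entry* entry = &*it;  // std::list keeps element addresses stable
    for (int i = 0; i < it->pending_requests; ++i) {
      // One asynchronous cleanup task per pending request.
      io_queue.push_back([this, entry] { DetachRequest(entry); });
    }
    ++it;
  }

  // Models DetachRequestFromProvider: the last detached request erases an
  // entry that is marked for deletion.
  void DetachRequest(Entry* entry) {
    if (--entry->pending_requests == 0 && entry->deletion_pending) {
      entries.remove_if([entry](const Entry& e) { return &e == entry; });
    }
  }

  void RemoveAll() {
    std::list<Entry>::iterator it = entries.begin();
    while (it != entries.end()) {
      DeleteProvider(it);
    }
  }

  // Runs the queued cleanup tasks, as TID_IO would after RemoveAll returns.
  void Drain() {
    while (!io_queue.empty()) {
      std::function<void()> task = std::move(io_queue.front());
      io_queue.pop_front();
      task();
    }
  }
};
```

In this model, calling DeleteProvider on an entry with one pending request and then calling RemoveAll returns with that entry still in the list. A later Drain runs the cleanup task and the list ends up empty.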

Impact

  • Severity: High — causes permanent 100% CPU usage on TID_IO, freezing all resource loading and browser I/O.
  • Affected versions: All versions of cef_resource_manager.cc (the wrapper has had this code since its introduction).
  • Workaround: None at the API level. The only workaround is patching the source.

Environment

  • CEF version: 143.0.13+g30cb3bd+chromium-143.0.7499.170 (bug exists in all versions)
  • OS: Linux (observed in production), also reproducible on Windows
