Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -479,31 +479,52 @@ public List<Long> getDatabaseNames() {
return ret;
}

// Close the store and environment
// Close the store and environment.
// Two-phase close: detach resources under the write lock, then close them outside the lock
// to avoid blocking openDatabase() callers during potentially slow close() calls.
public void close() {
for (Database db : openedDatabases) {
List<Database> databasesToClose;
Database epochDBToClose = null;
ReplicatedEnvironment envToClose = null;

lock.writeLock().lock();
try {
databasesToClose = new ArrayList<>(openedDatabases);
openedDatabases.clear();

if (epochDB != null) {
epochDBToClose = epochDB;
epochDB = null;
}

if (replicatedEnvironment != null) {
envToClose = replicatedEnvironment;
replicatedEnvironment = null;
}
} finally {
lock.writeLock().unlock();
}
Comment on lines +490 to 506
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Holding the class write-lock while calling into external resource cleanup (db.close(), epochDB.close(), replicatedEnvironment.close()) can block unrelated threads for a long time and increases deadlock risk if those close paths indirectly contend on the same lock or other shared locks. A safer pattern is a two-phase close: under the write-lock, atomically “detach” resources (copy/swap openedDatabases to a local list, set fields like epochDB/replicatedEnvironment to null, optionally set a closing/closed flag to reject new opens), then release the lock and perform the actual close() calls outside the lock.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Adopted two-phase close: detach resources under writeLock, then close outside the lock. Also fixed the misleading log messages.


// Close resources outside the lock
for (Database db : databasesToClose) {
try {
db.close();
} catch (DatabaseException exception) {
LOG.error("Error closing db {} will exit", db.getDatabaseName(), exception);
LOG.error("Error closing db {}", db.getDatabaseName(), exception);
}
}
openedDatabases.clear();

if (epochDB != null) {
if (epochDBToClose != null) {
try {
epochDB.close();
epochDB = null;
epochDBToClose.close();
} catch (DatabaseException exception) {
LOG.error("Error closing db {} will exit", epochDB.getDatabaseName(), exception);
LOG.error("Error closing epoch db {}", epochDBToClose.getDatabaseName(), exception);
}
}

if (replicatedEnvironment != null) {
if (envToClose != null) {
try {
// Finally, close the store and environment.
replicatedEnvironment.close();
replicatedEnvironment = null;
envToClose.close();
} catch (DatabaseException exception) {
LOG.error("Error closing replicatedEnvironment", exception);
}
Expand Down
Loading