Commit a9df648

[fix](editlog) Fix BDBEnvironment.close() not holding writeLock causing ConcurrentModificationException
BDBEnvironment.close() iterates openedDatabases without acquiring lock.writeLock(), while openDatabase() concurrently modifies the same ArrayList under the write lock. This causes ConcurrentModificationException when a RollbackException triggers close() on the replayer thread while HTTP/heartbeat/RPC handler threads call openDatabase() via getMaxJournalId().

The race window: the replayer catches RollbackException in getDatabaseNames() and calls bdbEnvironment.close(), which iterates openedDatabases with for-each. Concurrently, a SHOW FRONTENDS or heartbeat thread calls getMaxJournalId() → openDatabase() → openedDatabases.add(). The ArrayList iterator detects the structural modification and throws ConcurrentModificationException. This kills the replayer thread, causing the follower FE to fall permanently behind the master.

Fix: wrap the entire close() body in lock.writeLock() so that it is mutually exclusive with openDatabase() and removeDatabase().
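The failure mode described above comes from the fail-fast iterator of ArrayList. The following is a minimal sketch, not taken from the Doris code: the class and method names are hypothetical, strings stand in for Database handles, and the concurrent openDatabase() call is simulated by an add() on the iterating thread so that the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical stand-in for the unguarded close() loop; not Doris code.
class FailFastSketch {

    // Returns true if adding to the list during for-each iteration
    // raises ConcurrentModificationException.
    static boolean iterationFailsOnAdd() {
        List<String> opened = new ArrayList<>();
        opened.add("db1");
        opened.add("db2");
        try {
            for (String name : opened) {
                // Simulates openDatabase() adding an entry while
                // close() is still walking the list.
                opened.add(name + "-reopened");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // The iterator noticed the structural modification.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("iteration failed: " + iterationFailsOnAdd());
    }
}
```

With two real threads the exception is only raised on a best-effort basis, which is consistent with the bug appearing as an intermittent race rather than on every close().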
1 parent 9ecfd40 commit a9df648

1 file changed

Lines changed: 26 additions & 21 deletions

fe/fe-core/src/main/java/org/apache/doris/journal/bdbje/BDBEnvironment.java

@@ -481,32 +481,37 @@ public List<Long> getDatabaseNames() {
 
     // Close the store and environment
     public void close() {
-        for (Database db : openedDatabases) {
-            try {
-                db.close();
-            } catch (DatabaseException exception) {
-                LOG.error("Error closing db {} will exit", db.getDatabaseName(), exception);
+        lock.writeLock().lock();
+        try {
+            for (Database db : openedDatabases) {
+                try {
+                    db.close();
+                } catch (DatabaseException exception) {
+                    LOG.error("Error closing db {} will exit", db.getDatabaseName(), exception);
+                }
             }
-        }
-        openedDatabases.clear();
+            openedDatabases.clear();
 
-        if (epochDB != null) {
-            try {
-                epochDB.close();
-                epochDB = null;
-            } catch (DatabaseException exception) {
-                LOG.error("Error closing db {} will exit", epochDB.getDatabaseName(), exception);
+            if (epochDB != null) {
+                try {
+                    epochDB.close();
+                    epochDB = null;
+                } catch (DatabaseException exception) {
+                    LOG.error("Error closing db {} will exit", epochDB.getDatabaseName(), exception);
+                }
             }
-        }
 
-        if (replicatedEnvironment != null) {
-            try {
-                // Finally, close the store and environment.
-                replicatedEnvironment.close();
-                replicatedEnvironment = null;
-            } catch (DatabaseException exception) {
-                LOG.error("Error closing replicatedEnvironment", exception);
+            if (replicatedEnvironment != null) {
+                try {
+                    // Finally, close the store and environment.
+                    replicatedEnvironment.close();
+                    replicatedEnvironment = null;
+                } catch (DatabaseException exception) {
+                    LOG.error("Error closing replicatedEnvironment", exception);
+                }
             }
+        } finally {
+            lock.writeLock().unlock();
         }
     }

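The locking pattern applied in this diff can be sketched in isolation. This is an illustrative stand-in rather than the Doris implementation: it assumes the lock is a ReentrantReadWriteLock, as the lock.writeLock() calls suggest, and GuardedRegistrySketch, open, closeAll and stress are hypothetical names, with strings in place of Database handles.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of the fixed close()/openDatabase() pair; not Doris code.
class GuardedRegistrySketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> opened = new ArrayList<>();

    // Mirrors openDatabase(): mutates the list under the write lock.
    void open(String name) {
        lock.writeLock().lock();
        try {
            opened.add(name);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Mirrors the fixed close(): iterate and clear under the same write lock,
    // so no open() can interleave with the for-each loop.
    int closeAll() {
        lock.writeLock().lock();
        try {
            int closed = 0;
            for (String ignored : opened) {
                closed++; // a real implementation would close the handle here
            }
            opened.clear();
            return closed;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Opens n entries on one thread while the caller closes concurrently;
    // returns how many entries were closed in total.
    static int stress(int n) {
        GuardedRegistrySketch registry = new GuardedRegistrySketch();
        Thread opener = new Thread(() -> {
            for (int i = 0; i < n; i++) {
                registry.open("db" + i);
            }
        });
        opener.start();
        int closed = 0;
        try {
            while (opener.isAlive()) {
                closed += registry.closeAll();
            }
            opener.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return closed + registry.closeAll();
    }

    public static void main(String[] args) {
        System.out.println("closed: " + stress(10_000));
    }
}
```

Because every mutation and the whole iteration share one write lock, each opened entry is counted exactly once and the iterator never observes a concurrent add. The unlock sits in a finally block, as in the diff, so an exception inside the loop cannot leave the lock held.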