What is the problem?
When upgrading from dotCMS 24.12.27 to main (for example, Docker image 26.04.02-01_f09f5ba), the startup upgrade task Task250604UpdateFolderInodes hangs indefinitely. A cloud customer reported more than 20 hours without completion, which blocks the upgrade entirely.
Root Cause Analysis
Two bugs compound to cause this:
Bug 1 — N×M query pattern in FixTask00090RecreateMissingFoldersInParentPath
executeFix() issues one SELECT COUNT(1) DB query per path segment per distinct parent_path in the identifier table. For a large database with many identifiers and deep folder paths, this results in hundreds of thousands of individual DB round-trips under startup load.
Example: 100K distinct paths × avg depth 3 = 300K individual SELECT queries.
Bug 2 — ALTER TABLE deadlock in fixFolderIds()
fixFolderIds() runs:
ALTER TABLE folder DROP CONSTRAINT IF EXISTS folder_identifier_fk;
ALTER TABLE folder ADD CONSTRAINT folder_identifier_fk ... DEFERRABLE;
This DDL requires an exclusive lock on the folder table. However, executeFix() leaves a Hibernate thread-local transaction open (idle in transaction) after calling HibernateUtil.save() in createFixAudit(). The locks held by this idle transaction prevent the ALTER TABLE from ever acquiring its exclusive lock, so the task hangs forever and the connection pool is exhausted.
Confirmed via pg_stat_activity:
- PID A: idle in transaction, last query insert into fixes_audit
- PID B: ALTER TABLE folder DROP CONSTRAINT, waiting on relation lock held by PID A
Fix
1. FixTask00090RecreateMissingFoldersInParentPath
Pre-load all existing folder identifier keys into a HashSet<String> with a single query upfront. Replace the per-segment SELECT COUNT(1) DB calls with O(1) Set.contains() lookups. Update the Set when new folders are created to prevent duplicate creation in the same run.
Before: N×M DB queries (N = distinct paths, M = avg path depth)
After: 1 query upfront + in-memory lookups
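The difference between the two strategies can be sketched in isolation. The following is a minimal, in-memory illustration and not the code in this PR: the class FolderLookupSketch, its method names, and the simulated round-trip counter are all hypothetical, and it assumes parent_path values look like "/a/b/c/".

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: compares per-segment existence queries with a
// single pre-load into a HashSet. A Set stands in for the database.
public class FolderLookupSketch {

    private final Set<String> folderTable; // simulated folder identifier rows
    private int roundTrips = 0;            // simulated read queries issued

    public FolderLookupSketch(Collection<String> existingFolders) {
        this.folderTable = new HashSet<>(existingFolders);
    }

    // Stand-in for one SELECT COUNT(1) per folder path.
    private boolean selectCount(String folderPath) {
        roundTrips++;
        return folderTable.contains(folderPath);
    }

    // Stand-in for the single upfront query that loads every folder key.
    private Set<String> selectAllFolderPaths() {
        roundTrips++;
        return new HashSet<>(folderTable);
    }

    // Expands "/a/b/c/" into "/a/", "/a/b/", "/a/b/c/".
    public static List<String> ancestors(String parentPath) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder("/");
        for (String segment : parentPath.split("/")) {
            if (segment.isEmpty()) {
                continue;
            }
            current.append(segment).append('/');
            result.add(current.toString());
        }
        return result;
    }

    // Before: one simulated query per segment per distinct parent path.
    public List<String> recreatePerSegment(List<String> parentPaths) {
        List<String> created = new ArrayList<>();
        for (String parentPath : parentPaths) {
            for (String folder : ancestors(parentPath)) {
                if (!selectCount(folder)) {
                    folderTable.add(folder);
                    created.add(folder);
                }
            }
        }
        return created;
    }

    // After: one simulated query upfront, then in-memory lookups.
    public List<String> recreateWithPreload(List<String> parentPaths) {
        Set<String> known = selectAllFolderPaths();
        List<String> created = new ArrayList<>();
        for (String parentPath : parentPaths) {
            for (String folder : ancestors(parentPath)) {
                if (!known.contains(folder)) {
                    folderTable.add(folder);
                    known.add(folder); // prevents duplicate creation in the same run
                    created.add(folder);
                }
            }
        }
        return created;
    }

    public int roundTrips() {
        return roundTrips;
    }

    public static void main(String[] args) {
        List<String> paths = List.of("/a/b/c/", "/a/b/d/", "/x/y/z/");

        FolderLookupSketch before = new FolderLookupSketch(List.of("/a/"));
        before.recreatePerSegment(paths);
        System.out.println("per-segment round-trips: " + before.roundTrips());

        FolderLookupSketch after = new FolderLookupSketch(List.of("/a/"));
        after.recreateWithPreload(paths);
        System.out.println("preload round-trips: " + after.roundTrips());
    }
}
```

With three paths of depth 3, the per-segment variant issues 9 simulated reads and the pre-load variant issues 1, while both create the same 6 missing folders. The trade-off is memory: the pre-load holds every folder key in the heap for the duration of the task.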
2. Task250604UpdateFolderInodes
After executeFix() returns, explicitly call HibernateUtil.commitTransaction() and DbConnectionFactory.closeSilently() to commit the open Hibernate transaction and release its locks before fixFolderIds() runs its DDL.
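Why the ordering matters can be shown with a small in-memory analogy. This is not dotCMS or PostgreSQL code: the class CommitBeforeDdlSketch and its methods are hypothetical, and a ReentrantReadWriteLock only approximates table-level locking. The read lock stands in for the locks held by the open transaction, and the write lock stands in for the exclusive lock that ALTER TABLE needs.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical analogy: an open transaction holds a shared lock on the
// folder table, and DDL needs the exclusive lock on the same table.
public class CommitBeforeDdlSketch {

    private final ReentrantReadWriteLock folderTableLock = new ReentrantReadWriteLock();
    private boolean transactionOpen = false;

    // Models executeFix() returning with its transaction still open.
    public void runFixWithoutCommit() {
        folderTableLock.readLock().lock();
        transactionOpen = true;
    }

    // Models the explicit commit and connection close added by the fix.
    public void commitAndClose() {
        if (transactionOpen) {
            folderTableLock.readLock().unlock();
            transactionOpen = false;
        }
    }

    // Models the ALTER TABLE. Uses tryLock() so the sketch reports the
    // conflict instead of blocking forever as the real DDL does.
    public boolean tryAlterTable() {
        boolean acquired = folderTableLock.writeLock().tryLock();
        if (acquired) {
            folderTableLock.writeLock().unlock();
        }
        return acquired;
    }

    public static void main(String[] args) {
        CommitBeforeDdlSketch task = new CommitBeforeDdlSketch();

        task.runFixWithoutCommit();
        System.out.println("DDL before commit: " + task.tryAlterTable()); // false

        task.commitAndClose();
        System.out.println("DDL after commit: " + task.tryAlterTable());  // true
    }
}
```

In the sketch the same thread holds the shared lock and requests the exclusive one, which mirrors the reported situation: nothing else will ever release the first lock, so a blocking request would wait indefinitely.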
Steps to Reproduce
- Have a dotCMS database with a large number of identifiers (100K+) and folders where inode ≠ identifier
- Upgrade from 24.12.27 to current main
- Observe Task250604UpdateFolderInodes in startup logs: it never completes
Impact
- Upgrade from 24.12.27 is completely blocked for large customers
- Customer's DB before fix: 20+ hours without completing
- After fix: completes in seconds
Files Changed
dotCMS/src/main/java/com/dotmarketing/fixtask/tasks/FixTask00090RecreateMissingFoldersInParentPath.java
dotCMS/src/main/java/com/dotmarketing/startup/runonce/Task250604UpdateFolderInodes.java