Current Behavior
Currently, there is only an index COMPONENT_PURL_IDX, which is created on the COMPONENT.PURL field.
However, when you do a search for components via a PURL, the SQL query searches for components based on lower("PURL") instead of the raw value.
https://github.com/DependencyTrack/dependency-track/blob/master/src/main/java/org/dependencytrack/persistence/ComponentQueryManager.java#L283
According to the database statistics, a component search triggers the following queries:
SELECT COUNT("A0"."ID") FROM "COMPONENT" "A0" WHERE LOWER("A0"."PURL") LIKE $1 ESCAPE $2;
SELECT $1 AS "DN_TYPE","A0"."AUTHORS","A0"."BLAKE2B_256","A0"."BLAKE2B_384","A0"."BLAKE2B_512","A0"."BLAKE3","A0"."CLASSIFIER","A0"."COPYRIGHT","A0"."CPE","A0"."DESCRIPTION","A0"."DIRECT_DEPENDENCIES","A0"."EXTENSION","A0"."EXTERNAL_REFERENCES","A0"."FILENAME","A0"."GROUP","A0"."ID","A0"."INTERNAL","A0"."LAST_RISKSCORE","A0"."LICENSE","A0"."LICENSE_EXPRESSION","A0"."LICENSE_URL","A0"."MD5","A0"."NAME","A0"."TEXT","B0"."ACTIVE","B0"."AUTHORS","B0"."CLASSIFIER","B0"."COLLECTION_LOGIC","B0"."COLLECTION_TAG","B0"."CPE","B0"."DESCRIPTION","B0"."DIRECT_DEPENDENCIES","B0"."EXTERNAL_REFERENCES","B0"."GROUP","B0"."ID","B0"."IS_LATEST","B0"."LAST_BOM_IMPORTED","B0"."LAST_BOM_IMPORTED_FORMAT","B0"."LAST_RISKSCORE","B0"."LAST_VULNERABILITY_ANALYSIS","B0"."MANUFACTURER","B0"."NAME","B0"."PUBLISHER","B0"."PURL","B0"."SUPPLIER","B0"."SWIDTAGID","B0"."UUID","B0"."VERSION","A0"."PUBLISHER","A0"."PURL","A0"."PURLCOORDINATES","C0"."ISCUSTOMLICENSE","C0"."FSFLIBRE","C0"."ID","C0"."LICENSEID","C0"."NAME","C0"."ISOSIAPPROVED","C0"."UUID","A0"."SHA1","A0"."SHA_256","A0"."SHA_384","A0"."SHA3_256","A0"."SHA3_384","A0"."SHA3_512","A0"."SHA_512","A0"."SUPPLIER","A0"."SWIDTAGID","A0"."UUID","A0"."VERSION","D0"."LICENSEID" AS "NUCORDER0" FROM "COMPONENT" "A0" INNER JOIN "PROJECT" "B0" ON "A0"."PROJECT_ID" = "B0"."ID" LEFT OUTER JOIN "LICENSE" "C0" ON "A0"."LICENSE_ID" = "C0"."ID" LEFT OUTER JOIN "LICENSE" "D0" ON "A0"."LICENSE_ID" = "D0"."ID" WHERE LOWER("A0"."PURL") LIKE $2 ESCAPE $3 ORDER BY "NUCORDER0" FETCH NEXT $4 ROWS ONLY
Adding this custom index helped reduce the query times from several seconds to 100 milliseconds:
CREATE INDEX xt_component_purl2 ON public."COMPONENT" USING gin (lower(("PURL")::text) gin_trgm_ops)
However, the index uses https://www.postgresql.org/docs/current/pgtrgm.html which needs to be enabled on the database server.
Steps to Reproduce
- Fill the database with test data
- Switch on query statistics for the database server (not sure how, our database guy did it)
- Trigger a query for a component by purl on
/api/v1/component/identity
- Check the query time
Expected Behavior
The database indexes need to be changed/extended to also work on lowercase searches.
lower() is used on a lot of fields, for example also on group, name, version, CPE and SWID. And Component searches are probably not the only ones affected by the incorrect indexes.
Not sure if the existing indexes should be changed to lower() or if there are still VARCHAR searches somewhere in the code which do not use lower(). (.toLowerCase().matches(...))
Dependency-Track Version
4.14.1
Dependency-Track Distribution
Container Image
Database Server
PostgreSQL
Database Server Version
17.5
Browser
N/A
Checklist
Current Behavior
Currently, there is only an index
COMPONENT_PURL_IDX, which is created on the COMPONENT.PURL field.However, when you do a search for components via a PURL, the SQL query searches for components based on
lower("PURL")instead of the raw value.https://github.com/DependencyTrack/dependency-track/blob/master/src/main/java/org/dependencytrack/persistence/ComponentQueryManager.java#L283
According to the database statistics, a component search triggers the following queries:
Adding this custom index helped reduce the query times from several seconds to 100 milliseconds:
However, the index uses https://www.postgresql.org/docs/current/pgtrgm.html which needs to be enabled on the database server.
Steps to Reproduce
/api/v1/component/identityExpected Behavior
The database indexes need to be changed/extended to also work on lowercase searches.
lower()is used on a lot of fields, for example also on group, name, version, CPE and SWID. And Component searches are probably not the only ones affected by the incorrect indexes.Not sure if the existing indexes should be changed to
lower()or if there are still VARCHAR searches somewhere in the code which do not uselower(). (.toLowerCase().matches(...))Dependency-Track Version
4.14.1
Dependency-Track Distribution
Container Image
Database Server
PostgreSQL
Database Server Version
17.5
Browser
N/A
Checklist