bluedynamics
diff --git a/.github/workflows/docs.yaml b/.github/workflows/docs.yaml
Lines changed: 1 addition & 1 deletion
diff --git a/.github/workflows/qa.yaml b/.github/workflows/qa.yaml
Lines changed: 1 addition & 1 deletion
diff --git a/.github/workflows/tests.yaml b/.github/workflows/tests.yaml
Lines changed: 2 additions & 2 deletions
diff --git a/CHANGES.md b/CHANGES.md
Lines changed: 10 additions & 0 deletions
diff --git a/docs/plans/2026-04-01-skip-transforms-tika-design.md b/docs/plans/2026-04-01-skip-transforms-tika-design.md
Lines changed: 85 additions & 0 deletions
@@ -28,7 +28,7 @@ jobs:
         with:
           python-version: "3.13"
 
-      - uses: astral-sh/setup-uv@v8
+      - uses: astral-sh/setup-uv@v7
 
       - name: Install docs dependencies
         working-directory: docs
 
@@ -14,7 +14,7 @@ jobs:
     steps:
       - uses: actions/checkout@v5
 
-      - uses: astral-sh/setup-uv@v8
+      - uses: astral-sh/setup-uv@v7
 
       - name: Run ruff check
         run: uvx ruff check .
 
@@ -54,7 +54,7 @@ jobs:
         with:
           fetch-depth: 0
 
-      - uses: astral-sh/setup-uv@v8
+      - uses: astral-sh/setup-uv@v7
         with:
           enable-cache: true
           cache-dependency-glob: "pyproject.toml"
@@ -93,7 +93,7 @@ jobs:
     steps:
       - uses: actions/checkout@v5
 
-      - uses: astral-sh/setup-uv@v8
+      - uses: astral-sh/setup-uv@v7
         with:
           enable-cache: true
           cache-dependency-glob: "pyproject.toml"
 
@@ -1,5 +1,15 @@
 # Changelog
 
+## 1.0.0b24
+
+### Changed
+
+- `clearFindAndRebuild` now uses PG-driven iteration instead of
+  `ZopeFindAndApply`. It queries `object_state` directly and filters
+  out known non-content classes (~96% of rows). With no acquisition
+  parent chains on the call stack, `cacheMinimize()` can ghost all
+  objects, so memory stays flat on large sites. Fixes #39.
+
 ## 1.0.0b23
 
 ### Fixed
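
Reviewer note on the 1.0.0b24 entry above: the idea of letting the database drive the iteration can be illustrated with a small, self-contained sketch. It is not the package's implementation. It uses an in-memory SQLite table as a stand-in for PostgreSQL, and the column names (`zoid`, `class_name`), the contents of `NON_CONTENT_CLASSES`, and the helper `iter_content_batches` are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical examples of classes treated as non-content; the real
# filter list used by the package is not shown in this change.
NON_CONTENT_CLASSES = (
    "BTrees.OOBTree.OOBTree",
    "persistent.mapping.PersistentMapping",
)


def iter_content_batches(conn, batch_size=1000):
    """Yield lists of (zoid, class_name) rows, skipping non-content classes.

    Uses keyset pagination on zoid, so each batch is an independent query
    and nothing from earlier batches has to stay referenced.
    """
    placeholders = ", ".join("?" for _ in NON_CONTENT_CLASSES)
    query = (
        "SELECT zoid, class_name FROM object_state "
        f"WHERE zoid > ? AND class_name NOT IN ({placeholders}) "
        "ORDER BY zoid LIMIT ?"
    )
    last_zoid = -1
    while True:
        params = (last_zoid, *NON_CONTENT_CLASSES, batch_size)
        rows = conn.execute(query, params).fetchall()
        if not rows:
            return
        yield rows
        # Continue after the highest zoid seen so far.
        last_zoid = rows[-1][0]


# Sample data (assumed schema: one row per stored object).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object_state (zoid INTEGER PRIMARY KEY, class_name TEXT)"
)
conn.executemany(
    "INSERT INTO object_state VALUES (?, ?)",
    [
        (1, "plone.app.contenttypes.content.Document"),
        (2, "BTrees.OOBTree.OOBTree"),
        (3, "plone.app.contenttypes.content.File"),
        (4, "persistent.mapping.PersistentMapping"),
        (5, "plone.app.contenttypes.content.Folder"),
    ],
)

batches = list(iter_content_batches(conn, batch_size=2))
for batch in batches:
    # A real rebuild would load and reindex each object here, then
    # release cached objects (e.g. via cacheMinimize()) before the
    # next batch.
    print([zoid for zoid, _ in batch])
```

Because each batch comes from a flat query rather than a recursive tree walk, no parent objects need to stay alive between batches, which is the property the changelog entry credits for the flat memory profile.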
 
@@ -0,0 +1,85 @@
+# Skip portal_transforms for IFile when Tika is active
+
+**Date:** 2026-04-01
+**Status:** Approved
+**Issue:** N/A (performance improvement for Tika-enabled sites)
+
+## Problem
+
+When Plone indexes a `File` object, the `SearchableText_file` indexer
+(from `plone.app.contenttypes`) calls `portal_transforms` to extract
+text from the blob's binary data (PDF, DOCX, etc.).  This is:
+
+1. **Expensive:** spawns external processes (pdftotext, wv, etc.)
+   synchronously during the request.
+2. **Redundant when Tika is configured:** the async Tika worker
+   already extracts text from blobs and merges it into
+   `searchable_text` via `pgcatalog_merge_extracted_text`.
+3. **Wasteful even when transforms are missing:** `_findPath()` does a
+   full BFS graph traversal of the transform registry before
+   concluding no path exists — not a cheap dict lookup.
+
+## Scope
+
+Only `SearchableText_file` (registered for `IFile`) calls
+`portal_transforms`.  All other Plone SearchableText indexers
+(IDocument, INewsItem, ICollection, IFolder, ILink) only concatenate
+text fields — no transforms involved.
+
+`IImage` does NOT extend `IFile` and has no transform-based indexer.
+
+## Design
+
+### New file: `src/plone/pgcatalog/indexers.py`
+
+A `SearchableText` indexer adapter registered for `IFile`:
+
+- **When `PGCATALOG_TIKA_URL` is set:** return `SearchableText(obj)`
+  (Title + Description only).  No `_findPath`, no blob I/O, no
+  transform call.  The Tika worker fills in the blob text
+  asynchronously as weight 'C' in the tsvector.
+- **When `PGCATALOG_TIKA_URL` is NOT set:** delegate to the original
+  `plone.app.contenttypes.indexers.SearchableText_file` so the full
+  transform pipeline runs as before.
+
+### ZCML registration
+
+Register in `overrides.zcml` to override the `plone.app.contenttypes`
+registration for `IFile`.
+
+### What doesn't change
+
+- `portal_transforms` is untouched (no unregister/re-register).
+- The Tika enqueue pipeline in `processor.py` already works as is.
+- Custom SearchableText indexers for other interfaces are unaffected
+  (adapter specificity ensures more specific registrations win).
+- Tsvector weighting: Title 'A', Description 'B', body 'D',
+  Tika-extracted text 'C'.
+
+### Fallback behavior
+
+When `PGCATALOG_TIKA_URL` is NOT set, the override delegates to the
+original indexer.  Zero impact for sites not using Tika.
+
+## Custom types with blob fields
+
+The override only covers `IFile`.  If a custom content type has blob
+fields and uses its own `SearchableText` indexer that calls
+`portal_transforms`, it will NOT be automatically short-circuited.
+
+Developers with such custom types should either:
+
+1. Make their type provide `IFile` (then the override applies), or
+2. Register a similar conditional indexer for their custom interface
+   that checks `PGCATALOG_TIKA_URL` and skips transforms when set.
+
+This should be documented in the package's how-to section.
+
+## Implementation
+
+1. Create `src/plone/pgcatalog/indexers.py` with the conditional
+   indexer function.
+2. Add the adapter registration to `overrides.zcml`.
+3. Add tests: with Tika URL set (returns Title+Description only),
+   without Tika URL (delegates to original).
+4. Add documentation section about custom blob types.
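
Reviewer note on the design above: the two branches of the conditional indexer can be sketched without a Plone installation. This is a minimal illustration, not the contents of `indexers.py`. The names `make_conditional_indexer`, `metadata_only`, `with_transforms`, and `FakeFile` are hypothetical, and the two callables stand in for `SearchableText(obj)` and the original `SearchableText_file`. Reading the variable at call time and treating an empty value as unset are also assumptions.

```python
import os

TIKA_URL_ENV = "PGCATALOG_TIKA_URL"


def make_conditional_indexer(metadata_indexer, transform_indexer, environ=None):
    """Build an indexer that skips transforms while a Tika URL is configured.

    metadata_indexer: callable returning Title + Description text only.
    transform_indexer: callable running the full transform pipeline.
    environ: mapping to read the setting from (defaults to os.environ).
    """

    def searchable_text(obj):
        env = os.environ if environ is None else environ
        # Assumption: an empty string counts as "not set".
        if env.get(TIKA_URL_ENV):
            # Tika extracts the blob text asynchronously, so only the
            # cheap metadata text is indexed here.
            return metadata_indexer(obj)
        # No Tika configured: behave exactly like the original indexer.
        return transform_indexer(obj)

    return searchable_text


class FakeFile:
    """Stand-in for a File content object (hypothetical)."""

    title = "Report"
    description = "Q1 numbers"


def metadata_only(obj):
    # Stand-in for the Title + Description indexer.
    return f"{obj.title} {obj.description}"


def with_transforms(obj):
    # Stand-in for the transform-based indexer.
    return metadata_only(obj) + " text extracted from blob"


tika_on = make_conditional_indexer(
    metadata_only, with_transforms, environ={TIKA_URL_ENV: "http://tika:9998"}
)
tika_off = make_conditional_indexer(metadata_only, with_transforms, environ={})

print(tika_on(FakeFile()))
print(tika_off(FakeFile()))
```

The same two calls map onto the test cases listed in the implementation steps (Tika URL set versus not set). A factory of this shape could also serve option 2 for custom blob types, with the custom type's own indexers passed in and the result registered for the custom interface.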