20 May 13:09

kirkryan

02137de

Latest

NetApp NEO 4.1.4

Container Images

| Service | Image | Pull Command |
| --- | --- | --- |
| API | `ghcr.io/netapp/netapp-neo-api` | `docker pull ghcr.io/netapp/netapp-neo-api:4.1.4` |
| Worker | `ghcr.io/netapp/netapp-neo-worker` | `docker pull ghcr.io/netapp/netapp-neo-worker:4.1.4` |
| Extractor (CPU) | `ghcr.io/netapp/netapp-neo-extractor-full` | `docker pull ghcr.io/netapp/netapp-neo-extractor-full:4.1.4` |
| Extractor (CUDA) | `ghcr.io/netapp/netapp-neo-extractor-cuda-full` | `docker pull ghcr.io/netapp/netapp-neo-extractor-cuda-full:4.1.4` |
| Extractor (ROCm) | `ghcr.io/netapp/netapp-neo-extractor-rocm-full` | `docker pull ghcr.io/netapp/netapp-neo-extractor-rocm-full:4.1.4` |
| NER | `ghcr.io/netapp/netapp-neo-ner-full` | `docker pull ghcr.io/netapp/netapp-neo-ner-full:4.1.4` |

All images are available for linux/amd64 and linux/arm64, except the CUDA and ROCm variants, which are amd64 only.

Quick Start

Download the deployment ZIP from this release

Extract and configure your environment:

```
unzip netapp-neo-4.1.4.zip
cp .env.example .env
# Edit .env with your settings (see comments for required values)
```

Start all services:
```
docker compose up -d
```
Access the web console at http://localhost:8081
API available at http://localhost:8000

GPU Support

For NVIDIA CUDA acceleration (extractor + NER):

```
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d
```

For AMD ROCm GPUs, edit docker-compose.gpu.yml and change the extractor image to:

```
ghcr.io/netapp/netapp-neo-extractor-rocm-full:4.1.4
```

What's New

Pre-baked ML Model Images

Full image variants -- Extractor and NER images now ship with all ML models pre-baked into the container, eliminating the need for runtime model downloads. This significantly reduces first-start time and enables fully air-gapped deployments.

Capacity-Based Licensing

TB-based license limits -- License keys can now enforce storage capacity limits, enabling more granular entitlement management per deployment.

OCR Engine Fallback

Automatic OCR fallback -- When the primary OCR engine returns no content for a document, the extractor now automatically falls back to alternate OCR engines, improving extraction success rates for difficult documents.
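
As a rough illustration of the fallback behaviour described above, the following sketch tries a list of OCR engines in order and stops at the first one that returns content. The engine names, the `extract_with_fallback` function, and the callable interface are all hypothetical; the release notes do not say which engines Neo uses or how they are invoked.

```python
from typing import Callable, Optional, Sequence, Tuple

# An OCR engine is modelled as a (name, callable) pair; both are hypothetical.
OcrEngine = Tuple[str, Callable[[bytes], str]]


def extract_with_fallback(
    document: bytes, engines: Sequence[OcrEngine]
) -> Tuple[Optional[str], str]:
    """Try each engine in order; return (engine_name, text) for the first non-empty result.

    Returns (None, "") when every engine yields no content, which a caller
    could treat as a failed work item.
    """
    for name, run in engines:
        text = run(document).strip()
        if text:
            return name, text
    return None, ""


# Stand-in engines: the primary returns nothing, the alternate succeeds.
engines = [
    ("primary", lambda doc: ""),
    ("alternate", lambda doc: "scanned invoice text"),
]
print(extract_with_fallback(b"%PDF-...", engines))
```

Returning `(None, "")` rather than raising keeps the "no content" case explicit, so the caller decides whether to mark the item as failed.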

Extractor Queue Enhancements

Improved queue throughput -- The extraction work queue has been optimized with better tail handling, reducing idle time and improving overall extraction throughput on large crawl jobs.
Failed extraction handling -- Work items that return no content after extraction are now explicitly marked as failed, giving operators clear visibility into extraction issues.

Stability & Reliability

Event-loop starvation fixes -- Resolved multiple scenarios where long-running database operations could starve the async event loop, causing liveness probe failures and worker restarts. Database calls in the work queue and ACL backfill paths are now offloaded to the thread pool.
Database locking improvements -- Read-only queries now properly commit to prevent PostgreSQL idle-in-transaction locking. Index creation retries on lock timeouts during startup. Stale service coordination records are cleaned up during maintenance windows.
OpenShift compatibility -- Fixes for running Neo in OpenShift-managed Kubernetes environments.
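
The event-loop starvation fix listed above can be pictured with a minimal sketch, assuming a synchronous database call that would otherwise block the loop. `blocking_query` and `heartbeat` are stand-ins invented for this example; only `asyncio.to_thread` is a real standard-library call (Python 3.9+), and Neo's actual offloading mechanism may differ.

```python
import asyncio
import time


def blocking_query() -> str:
    """Stand-in for a long-running synchronous database call (hypothetical)."""
    time.sleep(0.3)
    return "done"


async def heartbeat(ticks: list) -> None:
    """Simulates a liveness probe handler that must keep getting loop time."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def main() -> tuple:
    ticks: list = []
    probe = asyncio.create_task(heartbeat(ticks))
    # The blocking call runs in the default thread pool, so the loop stays free.
    result = await asyncio.to_thread(blocking_query)
    probe.cancel()
    return result, len(ticks)


result, tick_count = asyncio.run(main())
print(result, tick_count)
```

If `blocking_query()` were called directly inside `main`, the heartbeat would not run at all until the query returned, which is the failure mode a liveness probe would detect.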

Dependency Updates

Docling updated to v2.92.0 with improved document parsing and layout detection.

Bug Fixes

Fixed incorrect column reference in user listing query.
Fixed SQL pattern escaping in file count estimation.
Standardised user table schema for consistent password hashing.


30 Apr 14:42

github-actions

netapp-neo-26.4.4

02137de

netapp-neo-26.4.4

NetApp Neo v4.x — context lake microservice architecture for AI services via MCP service


21 Apr 22:02

kirkryan

netapp-neo-4.1.2

595ebee

NetApp Project Neo v4.1.2

Scale-performance release. All nine changes are additive at the API contract level — defaults preserve 4.1.0/4.1.1 behaviour; new parameters are opt-in. Response-model fields become optional to support low-overhead modes. Validated against a 2.099 billion-row file_metadata dataset across 2,100 LIST partitions.

What's New

New API parameters

?after_modified_time=<ISO-8601> on /api/v1/shares/{share_id}/files and /api/v1/files — keyset pagination ordered by modified_time DESC. Returns rows strictly older than the cursor; responses include next_cursor to walk further. ~300× faster than OFFSET at 1 B rows, flat as the dataset grows. Prefer this for any listing where the page number is meaningful at scale.
?include_counts=false on the same endpoints — skips the cross-partition COUNT(*) and SUM(size) aggregates that previously ran on every request. When omitted, unfiltered listings now use a fast n_live_tup partition-stats estimate for total_count and omit total_size. Filtered listings preserve exact counts (slow path).
?limit= and ?offset= on /api/v1/shares — previously silently ignored; now honoured. The default (no params) still returns the full list for backward compatibility; multi-tenant deployments should cap explicitly.
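
To make the keyset semantics of `?after_modified_time=` concrete, here is a small in-memory simulation rather than a real client. The `list_files` and `walk` helpers and the row shape are assumptions made for this sketch; only the ordering (modified_time DESC), the "strictly older than the cursor" rule, and the `next_cursor` name come from the notes above.

```python
from datetime import datetime
from typing import List, Optional, Tuple

Row = Tuple[str, datetime]  # (file name, modified_time)


def list_files(rows: List[Row], after_modified_time: Optional[datetime], limit: int):
    """Simulates one keyset page: rows strictly older than the cursor, newest first."""
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    if after_modified_time is not None:
        ordered = [r for r in ordered if r[1] < after_modified_time]
    page = ordered[:limit]
    # next_cursor is the modified_time of the last row, or None when exhausted.
    next_cursor = page[-1][1] if len(ordered) > limit else None
    return page, next_cursor


def walk(rows: List[Row], limit: int) -> List[str]:
    """Follows next_cursor until the listing is exhausted."""
    names: List[str] = []
    cursor: Optional[datetime] = None
    while True:
        page, cursor = list_files(rows, cursor, limit)
        names.extend(name for name, _ in page)
        if cursor is None:
            return names


rows = [(f"file{i}", datetime(2025, 1, i)) for i in range(1, 8)]
print(walk(rows, limit=3))
```

In this simplified model, rows that share the cursor's exact timestamp would be skipped by the strict comparison. How the real endpoint handles such ties is not stated in the notes.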

API runtime

UVICORN_WORKERS environment variable on the API service — configures uvicorn --workers N from deployment values. Default is 1 (matches 4.1.0/4.1.1). Bump to 4 for read-heavy deployments; each worker is an independent Python process with its own DB pool, so scaling comes at the cost of N× memory.
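
A minimal sketch of how a launcher might translate `UVICORN_WORKERS` into the `uvicorn --workers N` command line, assuming a default of 1 as described above. The `uvicorn_command` helper and the application path `app.main:app` are placeholders invented for this example, not Neo's real entrypoint.

```python
from typing import List, Mapping


def uvicorn_command(env: Mapping[str, str], app: str = "app.main:app") -> List[str]:
    """Builds a uvicorn argv from UVICORN_WORKERS, defaulting to one worker.

    The application path "app.main:app" is a placeholder, not Neo's real module.
    """
    workers = int(env.get("UVICORN_WORKERS", "1"))
    if workers < 1:
        raise ValueError("UVICORN_WORKERS must be >= 1")
    return ["uvicorn", app, "--workers", str(workers)]


def estimated_memory_mb(workers: int, per_worker_mb: int) -> int:
    """Each worker is a separate process, so memory scales roughly N times."""
    return workers * per_worker_mb


print(uvicorn_command({}))
print(uvicorn_command({"UVICORN_WORKERS": "4"}))
print(estimated_memory_mb(4, 512))
```

The memory estimate is deliberately crude: it only restates the N× relationship from the note, and the per-worker figure is whatever you measure in your own deployment.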

Companion chart release

Helm chart updates ship alongside on NetApp/Innovation-Labs: sensible API resource defaults, nodeSelector / tolerations / extraArgs support on the Postgres StatefulSet, and a chart-default idle_in_transaction_session_timeout=60s on Postgres that works together with the application-side lock_timeout to keep share-creation latency bounded.

Container Images

All images are available at ghcr.io/netapp. Pull with docker pull ghcr.io/netapp/<image>:4.1.2.

| Image | Platforms | Description |
| --- | --- | --- |
| `netapp-neo-api:4.1.2` | amd64, arm64 | REST API + MCP transport |
| `netapp-neo-worker:4.1.2` | amd64, arm64 | Background processing (crawl, ACL, upload) |
| `netapp-neo-extractor:4.1.2` | amd64, arm64 | Content extraction (CPU) |
| `netapp-neo-extractor-cuda:4.1.2` | amd64, arm64 | Content extraction (NVIDIA GPU) |
| `netapp-neo-extractor-rocm:4.1.2` | amd64 | Content extraction (AMD GPU) |
| `netapp-neo-ner:4.1.2` | amd64, arm64 | Named Entity Recognition (GLiNER2) |

Quick Start

```
docker pull ghcr.io/netapp/netapp-neo-api:4.1.2
docker pull ghcr.io/netapp/netapp-neo-worker:4.1.2
docker pull ghcr.io/netapp/netapp-neo-extractor:4.1.2
docker pull ghcr.io/netapp/netapp-neo-ner:4.1.2
```

Use the attached docker-compose.yml (or docker-compose.gpu.yml for NVIDIA) plus .env.example to bring up a local deployment. For multi-billion-file deployments, also tune Postgres via postgresql.extraArgs in the companion Helm chart (see PERFORMANCE_TUNING_GUIDE).

Upgrade notes

From 4.1.0 / 4.1.1: drop-in replacement; no data migration required. The new idx_file_metadata_modified_time is created idempotently at startup.
Clients that read total_count / total_size / total_pages: these fields are now Optional[int]. They remain populated with the exact value for callers written against 4.1.0/4.1.1 (who don't pass the new query params); they are null when the caller opts into ?include_counts=false.
Scale recommendation: for shares with more than ~100K files, migrate paginated file listing to keyset (?after_modified_time=). OFFSET remains supported but degrades linearly with offset.
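
For clients affected by the now-optional totals, a defensive parsing sketch might look like the following. The `ListingTotals` dataclass and both helpers are hypothetical names for this example; only the three field names and their `Optional[int]` type come from the upgrade notes.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ListingTotals:
    total_count: Optional[int]
    total_size: Optional[int]
    total_pages: Optional[int]


def parse_totals(payload: Dict[str, Any]) -> ListingTotals:
    """Reads the totals, treating missing or null fields as None."""
    return ListingTotals(
        total_count=payload.get("total_count"),
        total_size=payload.get("total_size"),
        total_pages=payload.get("total_pages"),
    )


def describe(totals: ListingTotals) -> str:
    """Formats a summary without assuming the counts are present."""
    if totals.total_count is None:
        return "count not requested"
    return f"{totals.total_count} files"


print(describe(parse_totals({"total_count": 42, "total_size": 1024, "total_pages": 5})))
print(describe(parse_totals({"total_count": None})))
```

Checking `is None` rather than truthiness matters here, because a legitimate `total_count` of 0 should still be reported as a count.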

Quality

2.099 billion-row validation across 2,100 LIST partitions on a tuned Postgres 17 deployment (K3s, 2 TB NVMe-backed PVC, shared_buffers=32 GB)
Read-side SLOs at 2 B: file_by_id 37 ms p95, keyset listing 18 ms p95, cross_share_agg 63 ms p95, share_listing 14 ms p95
Chaos recovery at 2 B: graceful Postgres pod kill 6 s, force kill 13 s, API pod kill 14 s — WAL-bounded, not data-volume-bounded
Cascade share-delete at 1 M-row / 776 MB partition: 5.98 s clean with DROP … CASCADE (pre-fix: silent partition leak)
All unit and integration suites pass on CI


21 Apr 15:59

github-actions

netapp-neo-26.4.3

595ebee

netapp-neo-26.4.3

NetApp Neo v4.x — context lake microservice architecture for AI services via MCP service


21 Apr 13:12

github-actions

netapp-neo-26.4.2

8ce41fb

netapp-neo-26.4.2

NetApp Neo v4.x — context lake microservice architecture for AI services via MCP service


20 Apr 17:28

kirkryan

netapp-neo-4.1.0

aab5762

NetApp Project Neo v4.1.0

A maintenance and capability release focused on enterprise scale, MCP/dataset features, and operational reliability. Drop-in upgrade from 4.0.x.

What's New

Datasets, MCP, and Search

Virtual datasets — query files by share, schedule, type, or pattern without persisting an enumeration. Includes cursor pagination and rollup aggregates for billion-scale deployments.
count_entity_mentions MCP tool — fast aggregated entity-mention counts across files, served from ner_entity_aggregates rollups instead of scanning raw entities.
~20× faster full-text search — ts_headline is now deferred until after LIMIT, eliminating the dominant cost on result pages with snippets.
/api/v1/files filename filter restored — silently ignored since a refactor; now honoured as documented.
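
The full-text search speed-up above comes from doing the expensive snippet work only for rows that survive the LIMIT. The sketch below illustrates that idea in plain Python rather than SQL; `costly_headline` is a stand-in for `ts_headline`, and the `search` function and data shape are invented for this example, not taken from Neo's query.

```python
from typing import Callable, List, Tuple


def search(
    docs: List[Tuple[float, str]], limit: int, headline: Callable[[str], str]
) -> List[str]:
    """Rank first, cut to `limit`, and only then build snippets."""
    top = sorted(docs, key=lambda d: d[0], reverse=True)[:limit]
    return [headline(text) for _, text in top]


calls = 0


def costly_headline(text: str) -> str:
    """Stand-in for ts_headline; counts how often it runs."""
    global calls
    calls += 1
    return text[:20]


docs = [(i / 100, f"document body number {i}") for i in range(1000)]
snippets = search(docs, limit=10, headline=costly_headline)
print(len(snippets), calls)
```

With 1,000 matching documents and a limit of 10, the snippet function runs 10 times instead of 1,000, which is the shape of saving the release note describes.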

Performance & Scale

Batched NER aggregate upserts — single statement per file replaces per-row upserts; large reductions in DB round-trips during NER fan-in.
Partition lifecycle on share delete — ner_entities and ner_entity_aggregates partitions are dropped alongside file partitions, preventing schema bloat in long-lived deployments.
FTS retry window — extends FTS readiness checks during large bulk loads so initial indexing doesn't time out.
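
The batched NER aggregate upsert described above can be sketched as one multi-row `INSERT ... ON CONFLICT` per file. This example uses SQLite from the standard library as a stand-in for PostgreSQL so that it runs anywhere; the column names (`file_id`, `entity`, `mentions`) and the `upsert_aggregates` helper are assumptions, and only the table name `ner_entity_aggregates` appears in the notes.

```python
import sqlite3
from typing import Dict


def upsert_aggregates(conn: sqlite3.Connection, file_id: int, counts: Dict[str, int]) -> None:
    """Writes all aggregate rows for one file in a single INSERT ... ON CONFLICT."""
    rows = [(file_id, entity, n) for entity, n in sorted(counts.items())]
    if not rows:
        return
    placeholders = ", ".join(["(?, ?, ?)"] * len(rows))
    params = [value for row in rows for value in row]
    conn.execute(
        "INSERT INTO ner_entity_aggregates (file_id, entity, mentions) VALUES "
        + placeholders
        + " ON CONFLICT (file_id, entity) DO UPDATE SET mentions = excluded.mentions",
        params,
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ner_entity_aggregates ("
    "file_id INTEGER, entity TEXT, mentions INTEGER, PRIMARY KEY (file_id, entity))"
)
upsert_aggregates(conn, 1, {"NetApp": 3, "Postgres": 1})
upsert_aggregates(conn, 1, {"NetApp": 5})  # re-running updates the row in place
print(conn.execute(
    "SELECT entity, mentions FROM ner_entity_aggregates ORDER BY entity"
).fetchall())
```

The point of the single statement is one database round-trip per file regardless of how many entities it contains, instead of one round-trip per entity.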

Crawl & Worker Reliability

VARCHAR(50) overflow on long file extensions — fixed; some Office/legacy extensions now stored without truncation errors during crawl.
Manual crawl_schedule no longer triggers cron parse error — manual schedules are detected and bypass the cron parser.
Worker env var names aligned with docker-compose.yml and docs — NUM_*_WORKERS set in compose are honoured by the worker service.
NFS host guidance updated in shipped configs — clearer separation of in-cluster vs. host-mounted NFS configurations.

Sizing & Operations

Sizing API exposes ACL/NER worker counts — deployment sizing recommendations now include the full set of horizontally-scalable worker pools.
Per-protocol E2E pipelines — focused tests added for SMB, NFS, and S3 paths with cascade-cleanup verification for share deletion.

Bug Fixes

NER stats endpoint returned zero when share_id was supplied — SQL syntax bug fixed.
Test refresh for procedural drift (worker health via exec, FTS page_size, content field handling).
NER engine tests updated for the counted-entity response format.

Build & Dependencies

Security floors hardened across pyproject.toml; PyJWT removed where unused.
Tier 1 + Tier 2 dependency bumps for security and freshness; floors harmonised with root requirements pins.
--internal build flag for non-public image builds.
arm64 CUDA builds skip onnxruntime-gpu (not published for that platform).

Container Images

All images are available at ghcr.io/netapp. Pull with docker pull ghcr.io/netapp/<image>:4.1.0.

| Image | Platforms | Description |
| --- | --- | --- |
| `netapp-neo-api:4.1.0` | amd64, arm64 | REST API + MCP transport |
| `netapp-neo-worker:4.1.0` | amd64, arm64 | Background processing (crawl, ACL, upload) |
| `netapp-neo-extractor:4.1.0` | amd64, arm64 | Content extraction (CPU) |
| `netapp-neo-extractor-cuda:4.1.0` | amd64, arm64 | Content extraction (NVIDIA GPU) |
| `netapp-neo-extractor-rocm:4.1.0` | amd64 | Content extraction (AMD GPU) |
| `netapp-neo-ner:4.1.0` | amd64, arm64 | Named Entity Recognition (GLiNER2) |

Quick Start

```
# Pull all core images
docker pull ghcr.io/netapp/netapp-neo-api:4.1.0
docker pull ghcr.io/netapp/netapp-neo-worker:4.1.0
docker pull ghcr.io/netapp/netapp-neo-extractor:4.1.0
docker pull ghcr.io/netapp/netapp-neo-ner:4.1.0
```

Use the attached docker-compose.yml (or docker-compose.gpu.yml for NVIDIA) and .env.example to bring up a local deployment.

Quality

All unit tests and the full test suite run green on release/4.1.0 at tag time.
E2E pipelines validated across NFS, SMB, and S3 protocols, including cascade-cleanup of share deletion.


16 Apr 06:37

github-actions

netapp-neo-26.4.1

aab5762

netapp-neo-26.4.1

NetApp Neo v4.x — context lake microservice architecture for AI services via MCP service


10 Apr 12:50

github-actions

netapp-neo-26.3.2

c15ca00

netapp-neo-26.3.2

NetApp Neo v4.x — context lake microservice architecture for AI services via MCP service


07 Apr 13:47

kirkryan

netapp-neo-4.0.3p9

612e53b

NetApp Project Neo v4.0.3p9

What's New

Improved Service Resilience

Admin user creation is now decoupled from worker initialization, ensuring authentication always works even if background workers fail to start. Worker initialization also now automatically retries on failure instead of leaving the service in a broken state.

Independent admin account creation -- The admin user is created as a standalone step before worker components initialize, so API authentication is available immediately after setup completes
Automatic worker retry -- If worker initialization fails (e.g., due to a transient Graph API or database issue), the service automatically retries instead of requiring a manual restart
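
The automatic retry behaviour above could be sketched as a loop with exponential backoff. The notes do not specify Neo's retry policy, so the attempt count, the delays, and the `init_with_retry` and `flaky_init` names are all assumptions made for illustration.

```python
import time
from typing import Callable


def init_with_retry(init: Callable[[], None], attempts: int = 5, base_delay: float = 0.01) -> int:
    """Calls `init` until it succeeds, doubling the delay after each failure.

    Returns the number of attempts used; re-raises the last error if all fail.
    """
    for attempt in range(1, attempts + 1):
        try:
            init()
            return attempt
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be >= 1")


state = {"calls": 0}


def flaky_init() -> None:
    """Fails twice, then succeeds, like a transient database outage."""
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("database not ready")


attempts_used = init_with_retry(flaky_init)
print(attempts_used)
```

Bounding the number of attempts keeps a permanently broken dependency visible as an error instead of an endless silent loop.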

MCP & Search Fixes

Resolves multiple issues with the Model Context Protocol (MCP) integration, ACL-based access control, and NER entity search.

ACL filtering fix -- Shares configured with acl_override_mode=everyone now correctly grant access instead of denying when resolved principals don't match the user
Auth persistence -- MCP OAuth RSA signing keys are now persisted to the database, so authentication tokens survive service restarts
Group-based access control -- User group memberships are now fetched via Microsoft Graph at token validation time, enabling group-based ACL matching through MCP
NER search improvements -- Fixed entity search 422 error, added relevance ranking (exact match, entity density, text length), pagination support, and per-file deduplication
Share status transitions -- NER worker now correctly transitions share status from PROCESSING → READY when all work completes
OAuthProvider abstraction -- Introduced OAuthProvider ABC for future Keycloak/generic OIDC provider support
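
The corrected ACL behaviour for `acl_override_mode=everyone` can be summarised in a short sketch: the override is checked before any principal matching. The `can_access` function, its parameters, and the `"none"` mode value used in the usage lines are hypothetical; only the `everyone` value is named in the notes.

```python
from typing import Iterable, Set


def can_access(
    acl_override_mode: str,
    resolved_principals: Iterable[str],
    user_principals: Iterable[str],
) -> bool:
    """Grants access when the share is overridden to 'everyone'; otherwise needs a principal match."""
    if acl_override_mode == "everyone":
        return True
    allowed: Set[str] = set(resolved_principals)
    # user_principals would include the user's own id and any group memberships.
    return any(p in allowed for p in user_principals)


print(can_access("everyone", ["group:finance"], ["user:alice"]))
print(can_access("none", ["group:finance"], ["user:alice", "group:finance"]))
print(can_access("none", ["group:finance"], ["user:bob"]))
```

Passing group memberships in `user_principals` mirrors the group-based matching item above, where memberships are fetched at token validation time.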

Container Images

All images are available at ghcr.io/netapp. Pull with docker pull ghcr.io/netapp/<image>:4.0.3p9.

| Image | Platforms | Description |
| --- | --- | --- |
| `netapp-neo-api:4.0.3p9` | amd64, arm64 | REST API + MCP transport |
| `netapp-neo-worker:4.0.3p9` | amd64, arm64 | Background processing |
| `netapp-neo-extractor:4.0.3p9` | amd64, arm64 | Content extraction (CPU) |
| `netapp-neo-extractor-cuda:4.0.3p9` | amd64, arm64 | Content extraction (NVIDIA GPU) |
| `netapp-neo-extractor-rocm:4.0.3p9` | amd64 | Content extraction (AMD GPU) |
| `netapp-neo-ner:4.0.3p9` | amd64, arm64 | Named Entity Recognition |

Quick Start

```
docker pull ghcr.io/netapp/netapp-neo-api:4.0.3p9
docker pull ghcr.io/netapp/netapp-neo-worker:4.0.3p9
docker pull ghcr.io/netapp/netapp-neo-extractor:4.0.3p9
docker pull ghcr.io/netapp/netapp-neo-ner:4.0.3p9
```

Quality

Full end-to-end testing passed on both CPU and GPU (CUDA) builds
Validated across S3, NFS, and SMB storage backends
1,467+ files processed with NER entity detection (67,000+ entities on CPU, 8,700+ on GPU)
Zero import errors across all Cython-compiled services
Zero CUDA errors on NVIDIA RTX PRO 4000 Blackwell SFF


09 Apr 07:54

kirkryan

netapp-neo-4.0.3p10

612e53b

NetApp Project Neo v4.0.3p10

What's New

Fix: Worker Startup Hang on Large Datasets

Resolved a critical issue where all worker containers would hang indefinitely during initialization on systems with large file inventories (100k+ files), preventing all data ingestion, ACL resolution, and file processing.

Root cause -- The ACL resolution backfill query used a correlated LIKE subquery on cast JSON text (metadata::text LIKE '%' || id || '%'), resulting in O(n*m) complexity that could take hours on large datasets. With all worker replicas running this query simultaneously, database contention compounded the problem.
Fix -- The backfill is now deferred to a non-blocking background task that runs after workers are fully initialized. The query has been rewritten to use an efficient JSONB key lookup ((metadata::jsonb)->>'file_id') that is indexable and orders of magnitude faster.
Impact -- Workers now start in seconds regardless of dataset size, immediately beginning file processing, ACL resolution, and Graph uploads.
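
To show why the rewritten backfill query scales better, the sketch below contrasts the two shapes in plain Python instead of SQL. Both functions and the sample metadata are invented for illustration: the first mirrors the per-id substring scan of the old LIKE subquery, the second reads the `file_id` key once per row, as the JSONB lookup does.

```python
import json
from typing import Dict, List


def backfill_by_substring(ids: List[str], metadata_rows: List[str]) -> Dict[str, int]:
    """Old shape: for every id, scan every metadata string for a substring match."""
    return {i: sum(1 for text in metadata_rows if i in text) for i in ids}


def backfill_by_key(ids: List[str], metadata_rows: List[str]) -> Dict[str, int]:
    """New shape: read the file_id key once per row, then count via a dict."""
    counts: Dict[str, int] = {i: 0 for i in ids}
    for text in metadata_rows:
        file_id = json.loads(text).get("file_id")
        if file_id in counts:
            counts[file_id] += 1
    return counts


rows = [json.dumps({"file_id": "12"}), json.dumps({"file_id": "1"})]
print(backfill_by_substring(["1", "12"], rows))
print(backfill_by_key(["1", "12"], rows))
```

Besides doing n × m comparisons, the substring version also over-matches in this toy data (id "1" is found inside "12"), whereas the keyed version compares exact values. Whether the original query suffered from such false matches is not stated in the notes.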

Fix: Admin User Creation Decoupled from Worker Init

Admin user creation now runs independently of worker initialization with retry logic, ensuring authentication works even if worker startup encounters transient errors.

Container Images

All images are available at ghcr.io/netapp. Pull with docker pull ghcr.io/netapp/<image>:4.0.3p10.

| Image | Platforms | Description |
| --- | --- | --- |
| `netapp-neo-api:4.0.3p10` | amd64, arm64 | REST API + MCP transport |
| `netapp-neo-worker:4.0.3p10` | amd64, arm64 | Background processing |
| `netapp-neo-extractor:4.0.3p10` | amd64, arm64 | Content extraction (CPU) |
| `netapp-neo-extractor-cuda:4.0.3p10` | amd64, arm64 | Content extraction (NVIDIA GPU) |
| `netapp-neo-extractor-rocm:4.0.3p10` | amd64 | Content extraction (AMD GPU) |
| `netapp-neo-ner:4.0.3p10` | amd64, arm64 | Named Entity Recognition |

Quick Start

```
docker pull ghcr.io/netapp/netapp-neo-api:4.0.3p10
docker pull ghcr.io/netapp/netapp-neo-worker:4.0.3p10
docker pull ghcr.io/netapp/netapp-neo-extractor:4.0.3p10
docker pull ghcr.io/netapp/netapp-neo-ner:4.0.3p10
```

Quality

1940/1940 end-to-end test work items passing (100% pass rate)
Validated across SMB and NFS storage backends (CPU and GPU builds)
ACL backfill verified: 10/10 manually-cleared files re-resolved after worker restart


Releases: NetApp/Innovation-Labs
