Data Model

This document describes the persistence model implemented by entropy-processor, based on:

src/main/resources/db/migration/V1__initial_schema.sql
Entities under src/main/java/com/ammann/entropy/model

1. Persistence Approach

Database: PostgreSQL with TimescaleDB extension.
Schema lifecycle: Flyway migration at startup.
ORM: Hibernate ORM + Panache entities.
Hibernate schema mode: validate (migration SQL is the schema source of truth).

2. Table Inventory

| Table | Role | Time-Series Type |
| --- | --- | --- |
| `entropy_data` | Primary event storage for ingested entropy events | Hypertable |
| `nist_test_results` | SP 800-22 per-test outcomes | Hypertable |
| `nist_90b_results` | SP 800-90B aggregate assessment outcomes | Hypertable |
| `nist_90b_estimator_results` | SP 800-90B estimator-level detail rows | Regular table |
| `data_quality_reports` | Persisted quality reports | Regular table |
| `nist_validation_jobs` | Async validation job tracking | Regular table |
| `entropy_comparison_run` | Comparison run metadata | Regular table |
| `entropy_comparison_result` | Source-level comparison outcomes | Hypertable |

3. Structural Relationships

%%{init: {"flowchart": {"curve": "linear"}}}%%
flowchart LR
    ED[entropy_data]
    NTR[nist_test_results]
    N90[nist_90b_results]
    NE[nist_90b_estimator_results]
    JV[nist_validation_jobs]
    CR[entropy_comparison_run]
    RS[entropy_comparison_result]

    ED -->|windowed bitstream source| NTR
    ED -->|windowed bitstream source| N90
    N90 -->|assessment_run_id| NE
    JV -->|test_suite_run_id| NTR
    JV -->|assessment_run_id| N90
    CR -->|comparison_run_id| RS

Note: nist_90b_estimator_results intentionally does not enforce a DB foreign key to nist_90b_results because of Timescale/partitioning constraints documented in migration comments.
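
Because the database does not enforce this link, any consistency check between the two tables has to run as a query. The following is a minimal sketch of such an audit, using SQLite from Python as a stand-in for PostgreSQL. The reduced column set, the sample run IDs, and the idea of running an audit at all are assumptions for illustration, not something the migration or service code is known to do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, illustrative versions of the two tables (no FK between them).
    CREATE TABLE nist_90b_results (
        assessment_run_id TEXT NOT NULL,
        is_run_summary INTEGER NOT NULL
    );
    CREATE TABLE nist_90b_estimator_results (
        assessment_run_id TEXT NOT NULL,
        estimator_name TEXT NOT NULL
    );
    INSERT INTO nist_90b_results VALUES ('run-a', 1);
    INSERT INTO nist_90b_estimator_results VALUES ('run-a', 'estimator-1');
    INSERT INTO nist_90b_estimator_results VALUES ('run-b', 'estimator-1');
""")

# Estimator rows whose assessment_run_id has no row at all in nist_90b_results.
orphans = conn.execute("""
    SELECT DISTINCT e.assessment_run_id
    FROM nist_90b_estimator_results AS e
    WHERE NOT EXISTS (
        SELECT 1 FROM nist_90b_results AS r
        WHERE r.assessment_run_id = e.assessment_run_id
    )
    ORDER BY e.assessment_run_id
""").fetchall()
```

Here `run-b` has estimator rows but no result row, so it is the only run the audit reports.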

4. Table-Level Design Notes

4.1 `entropy_data`

Purpose:

Stores one row per ingested entropy event.
Supports time-window analytics and interval computations.

Key columns:

hw_timestamp_ns, server_received, sequence, whitened_entropy, source_address.

Key behavior:

Converted to a hypertable partitioned by server_received with 1-day chunks.
Composite primary key (id, server_received), because TimescaleDB requires every unique constraint on a hypertable to include the partitioning column.

4.2 `nist_test_results`

Purpose:

Stores SP 800-22 individual test results.
Supports chunked validation runs via chunk_index and chunk_count.

Key columns:

test_suite_run_id, test_name, passed, p_value, executed_at, details.

Time-series behavior:

Hypertable partitioned by executed_at (7-day chunk interval).

4.3 `nist_90b_results`

Purpose:

Stores SP 800-90B assessment outcomes, with the is_run_summary flag distinguishing two kinds of row. A run summary row (is_run_summary = TRUE) is the canonical result for a completed run; a per-chunk row (is_run_summary = FALSE) records a single chunk processed during the assessment.
Links all rows of a run through assessment_run_id.

Key columns:

assessment_run_id, min_entropy, passed, assessment_details, executed_at, is_run_summary, chunk_index, chunk_count.

Row discrimination:

Run summary row (is_run_summary = TRUE): Written once after all chunks have been processed. Contains min_entropy = MIN(all chunks), passed = AND(all chunks), and chunk_index = NULL. The assessment_details JSON includes aggregation metadata such as chunk count, aggregation rule, and the index of the estimator source chunk.
Per-chunk row (is_run_summary = FALSE): Written during chunk processing. Contains the chunk_index (zero-based) and the assessment outcome for that individual chunk. These rows are retained for forensic diagnosis but are not treated as canonical results.
Incomplete run: If the chunk loop fails before completion, no summary row is written. The absence of a summary row for a given assessment_run_id indicates that the run did not complete successfully. Partial per-chunk rows may remain.
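
The aggregation rule described above can be sketched in a few lines of Python. This is an illustration only: `ChunkResult`, `RunSummary`, and `summarize_run` are hypothetical names, and detecting an incomplete run by comparing the number of chunks against an expected count is an assumption standing in for the real chunk loop failing partway.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ChunkResult:
    """One per-chunk row (is_run_summary = FALSE)."""
    chunk_index: int  # zero-based
    min_entropy: float
    passed: bool


@dataclass(frozen=True)
class RunSummary:
    """The single summary row (is_run_summary = TRUE) of a completed run."""
    min_entropy: float
    passed: bool
    chunk_count: int
    chunk_index: Optional[int] = None  # NULL on the summary row


def summarize_run(chunks: Sequence[ChunkResult], expected_chunks: int) -> Optional[RunSummary]:
    """Return the summary, or None when the run is incomplete (hypothetical helper)."""
    if expected_chunks <= 0 or len(chunks) != expected_chunks:
        return None  # incomplete run: no summary row is written
    return RunSummary(
        min_entropy=min(c.min_entropy for c in chunks),  # MIN(all chunks)
        passed=all(c.passed for c in chunks),            # AND(all chunks)
        chunk_count=len(chunks),
    )
```

With three chunks of which one failed, the summary reports the lowest per-chunk entropy and `passed = False`; with only two of three chunks available, no summary is produced.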

Partial unique index:

uq_nist_90b_run_summary ON nist_90b_results (assessment_run_id) WHERE is_run_summary = TRUE enforces that at most one summary row exists per assessment run.
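
To show what the partial unique index allows and rejects, here is a small sketch that uses SQLite from Python as a stand-in for PostgreSQL. The table is reduced to the columns that matter for the index, the run IDs are made up, and `is_run_summary` is stored as 0/1 because SQLite has no boolean type. The actual migration SQL may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nist_90b_results (
        id INTEGER PRIMARY KEY,
        assessment_run_id TEXT NOT NULL,
        is_run_summary INTEGER NOT NULL,  -- 1 = TRUE, 0 = FALSE
        chunk_index INTEGER               -- NULL on summary rows
    );
    CREATE UNIQUE INDEX uq_nist_90b_run_summary
        ON nist_90b_results (assessment_run_id)
        WHERE is_run_summary = 1;
""")

insert = (
    "INSERT INTO nist_90b_results (assessment_run_id, is_run_summary, chunk_index) "
    "VALUES (?, ?, ?)"
)
conn.execute(insert, ("run-a", 0, 0))     # per-chunk rows are not covered by the index
conn.execute(insert, ("run-a", 0, 1))
conn.execute(insert, ("run-a", 1, None))  # first summary row for run-a
conn.execute(insert, ("run-b", 1, None))  # a different run may have its own summary

try:
    conn.execute(insert, ("run-a", 1, None))  # second summary row for run-a
    second_summary_rejected = False
except sqlite3.IntegrityError:
    second_summary_rejected = True

summaries_for_run_a = conn.execute(
    "SELECT COUNT(*) FROM nist_90b_results "
    "WHERE assessment_run_id = 'run-a' AND is_run_summary = 1"
).fetchone()[0]
```

Any number of per-chunk rows can share an `assessment_run_id`, while a second summary row for the same run fails with an integrity error.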

Time-series behavior:

Hypertable partitioned by executed_at (7-day chunk interval).

4.4 `nist_90b_estimator_results`

Purpose:

Stores detailed estimator outputs (IID and NON_IID categories).
Preserves semantics for non-entropy estimators via nullable entropy_estimate.

Key constraints:

Unique key on (assessment_run_id, test_type, estimator_name).
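
The interplay of the unique key and the nullable entropy_estimate can be illustrated as follows, again with SQLite from Python as a stand-in for PostgreSQL. The estimator names and values are invented for the example, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE nist_90b_estimator_results (
        assessment_run_id TEXT NOT NULL,
        test_type TEXT NOT NULL,       -- 'IID' or 'NON_IID'
        estimator_name TEXT NOT NULL,
        entropy_estimate REAL,         -- NULL for estimators that yield no entropy value
        UNIQUE (assessment_run_id, test_type, estimator_name)
    )
""")

insert = "INSERT INTO nist_90b_estimator_results VALUES (?, ?, ?, ?)"
conn.execute(insert, ("run-a", "NON_IID", "example-entropy-estimator", 7.1))
conn.execute(insert, ("run-a", "IID", "example-pass-fail-test", None))
# The same estimator name under the other test_type is a distinct key.
conn.execute(insert, ("run-a", "IID", "example-entropy-estimator", 7.3))

try:
    conn.execute(insert, ("run-a", "NON_IID", "example-entropy-estimator", 6.8))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

null_estimates = conn.execute(
    "SELECT COUNT(*) FROM nist_90b_estimator_results WHERE entropy_estimate IS NULL"
).fetchone()[0]
```

A row without an entropy value is stored with `NULL` rather than a placeholder number, and a repeated (run, test type, estimator) combination is rejected.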

4.5 `data_quality_reports`

Purpose:

Stores quality assessment summaries generated by quality analysis logic.

Key columns:

report_timestamp, window_start, window_end, total_events, overall_quality_score, recommendations.

4.6 `nist_validation_jobs`

Purpose:

Tracks asynchronous validation workflow state.
Supports API polling for progress and completion.

Lifecycle states (encoded by a database constraint and by enums):

QUEUED, RUNNING, COMPLETED, FAILED.

Key columns:

validation_type, status, progress_percent, current_chunk, total_chunks, test_suite_run_id, assessment_run_id.
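
As a rough illustration of how these columns could fit together, the sketch below models the four states and derives a progress value from the chunk counters. The state names come from this document. The transition table and the formula for `progress_percent` are assumptions made for the example and are not taken from the service code.

```python
from enum import Enum


class JobStatus(str, Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Assumed transitions: the document lists the states but not the allowed moves.
ALLOWED_TRANSITIONS = {
    JobStatus.QUEUED: {JobStatus.RUNNING, JobStatus.FAILED},
    JobStatus.RUNNING: {JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


def can_transition(current: JobStatus, target: JobStatus) -> bool:
    """True if the assumed lifecycle permits moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]


def progress_percent(current_chunk: int, total_chunks: int) -> int:
    """Whole-number progress from chunk counters (assumed formula), clamped to 0..100."""
    if total_chunks <= 0:
        return 0
    return max(0, min(100, (100 * current_chunk) // total_chunks))
```

A polling client would typically read `status` and `progress_percent` until the job reaches one of the two terminal states.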

4.7 Comparison Tables

entropy_comparison_run:

One row per comparison execution, including sample sizes and mixed-source traceability metadata.

entropy_comparison_result:

One or more rows per run, one per source type (BASELINE, HARDWARE, MIXED), with NIST and entropy metric outputs.
Implemented as a hypertable partitioned by created_at.

5. Data Flow Through Persistence

graph TD
    Ingest[gRPC ingest] --> entropy_data
    entropy_data --> N22[SP800-22 execution]
    entropy_data --> N90[SP800-90B execution]

    N22 --> nist_test_results
    N90 --> nist_90b_results
    N90 --> nist_90b_estimator_results

    AsyncJobs[Validation job orchestration] --> nist_validation_jobs
    Quality[Quality analysis] --> data_quality_reports
    Compare[Comparison workflow] --> entropy_comparison_run
    Compare --> entropy_comparison_result

6. Query Boundary Observations

From entity and service code:

Time-window access is the dominant access pattern (server_received and executed_at).
Interval analytics are performed using native SQL window functions over entropy_data.
Validation retrieval uses run identifiers (test_suite_run_id, assessment_run_id) for aggregation and API responses. SP 800-90B result retrieval filters on is_run_summary to distinguish canonical run-summary rows from forensic per-chunk rows.
Job tracking is independent from test-result tables and linked by run IDs.
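
The interval-analytics pattern mentioned above can be sketched with a window function. The example runs against SQLite from Python (window functions require SQLite 3.25 or newer) rather than PostgreSQL. Using `LAG` over `hw_timestamp_ns` is an assumption about how intervals are computed, and the sample timestamps are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entropy_data (
        id INTEGER PRIMARY KEY,
        sequence INTEGER NOT NULL,
        hw_timestamp_ns INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO entropy_data (sequence, hw_timestamp_ns) VALUES (?, ?)",
    [(1, 1_000), (2, 1_450), (3, 2_100), (4, 2_300)],
)

# Interval between each event and the previous one, ordered by hardware timestamp.
# The first event has no predecessor, so its interval is NULL.
intervals = conn.execute("""
    SELECT sequence,
           hw_timestamp_ns - LAG(hw_timestamp_ns) OVER (ORDER BY hw_timestamp_ns)
               AS interval_ns
    FROM entropy_data
    ORDER BY hw_timestamp_ns
""").fetchall()
```

Computing the differences in SQL keeps the per-event arithmetic in the database, so only the derived intervals need to reach the service layer.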
