
# CRD reference

Every custom resource Hoptimator installs, with field-by-field detail. All CRDs are in the `hoptimator.linkedin.com` API group at version `v1alpha1`, which is subject to change; pin deliberately.

The CRD YAMLs live in `hoptimator-k8s/src/main/resources/` and are applied by `make deploy` along with the operator.

If you're modifying a CRD, regenerate the Java model classes after your change:

```sh
make generate-models
```

The script invokes the upstream Kubernetes Java client's `crd-model-gen` Docker image to produce the typed `V1alpha1*` classes the operator and deployers consume. Commit the regenerated files with your CRD change.

## At a glance

| Kind | Plural | Short names | What it is |
|---|---|---|---|
| `Database` | `databases` | `db`, `dbs` | Registers an external system with the catalog (a JDBC URL + schema name). |
| `View` | `views` | | A logical SQL view. |
| `Pipeline` | `pipelines` | `pip`, `pips` | The deployable unit produced by a materialized view (sources, sink, job). |
| `TableTemplate` | `tabletemplates` | `tabt` | Template for source/sink resource generation, scoped by database and access method. |
| `JobTemplate` | `jobtemplates` | `jobt` | Template for job resource generation, scoped by database. |
| `TableTrigger` | `tabletriggers` | | Fires a Job when an upstream table changes or on a cron schedule. |
| `Subscription` | `subscriptions` | `sub`, `subs` | YAML-native equivalent of `CREATE MATERIALIZED VIEW`. |
| `LogicalTable` | `logicaltables` | `lt` | One named entity bound to multiple physical tier backends. |
| `Engine` | `engines` | `eng` | Registers a query-execution runtime. Optional; see concepts. |
| `SqlJob` | `sqljobs` | `sql`, `sj` | Primitive consumed by an SqlJob operator to deploy Flink or Flink-Beam SQL jobs. |

## Database

Registers an external system in the catalog. The JDBC URL drives schema discovery; the `schema` field is what the system shows up as in SQL.

```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: Database
metadata:
  name: ads-database
spec:
  schema: ADS
  url: jdbc:demodb://names=ads
  dialect: Calcite
```

### Spec fields

| Field | Type | Required | Description |
|---|---|---|---|
| `url` | string | yes | JDBC connection URL. |
| `schema` | string | | Schema name as rendered in the catalog (e.g. `ADS`). |
| `catalog` | string | | JDBC catalog name. Used for hierarchical sources (e.g. MySQL). |
| `dialect` | enum | | `ANSI`, `MySQL`, or `Calcite`. Affects how the planner generates SQL for this source. |
| `driver` | string | | Fully qualified class name of the JDBC driver, if it isn't auto-discovered. |

### Printer columns

`CATALOG`, `SCHEMA`, `URL`.


## View

A logical view. Each CREATE VIEW statement produces one of these. Views are stored definitions; nothing is deployed to materialize them.

```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: View
metadata:
  name: ads-audience
spec:
  schema: ADS
  view: AUDIENCE
  sql: |
    SELECT FIRST_NAME, LAST_NAME
    FROM ADS.PAGE_VIEWS NATURAL JOIN PROFILE.MEMBERS
  materialized: false
```

### Spec fields

| Field | Type | Required | Description |
|---|---|---|---|
| `view` | string | yes | View name. |
| `sql` | string | yes | The view's SQL. |
| `schema` | string | | Schema the view belongs to. |
| `catalog` | string | | Catalog the view belongs to. |
| `materialized` | boolean | | Whether the view should be materialized (i.e. paired with a Pipeline). |

### Status fields

| Field | Type | Description |
|---|---|---|
| `watermark` | date-time | Timestamp of the last data change event affecting this view. |

### Printer columns

`CATALOG`, `SCHEMA`, `VIEW`, `WATERMARK`.


## Pipeline

The deployable unit produced by a materialized view: a job plus its sources and sink. The `sql` field is the auto-generated `INSERT INTO` statement; the `yaml` field is the rendered specs of every component.

```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: Pipeline
metadata:
  name: ads-audience
spec:
  sql: INSERT INTO `ADS`.`AUDIENCE` SELECT ...
  yaml: |
    apiVersion: flink.apache.org/v1beta1
    kind: FlinkSessionJob
    ...
```

### Spec fields

| Field | Type | Description |
|---|---|---|
| `sql` | string | The `INSERT INTO` statement this pipeline implements. |
| `yaml` | string | The concatenated YAML of every object that makes up the pipeline. |

### Status fields

| Field | Type | Description |
|---|---|---|
| `ready` | boolean | Whether the entire pipeline is ready. |
| `failed` | boolean | Whether any element of the pipeline has failed. |
| `message` | string | Free-text status message, typically the last error, if any. |
| `jobs` | object | Map of external jobs this pipeline triggers, with their state. |

### Printer columns

`SQL`, `STATUS`.


## TableTemplate

Generates source/sink YAML and connector configs when a matching table becomes part of a pipeline. The `databases` and `methods` fields gate which tables a template applies to.

```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: TableTemplate
metadata:
  name: kafka-template
spec:
  databases:
    - kafka-database
  methods:
    - Scan
  yaml: |
    apiVersion: kafka.strimzi.io/v1
    kind: KafkaTopic
    metadata:
      name: {{name}}
    spec:
      topicName: {{table}}
      partitions: {{kafka.partitions:1}}
  connector: |
    connector = kafka
    topic = {{table}}
    properties.bootstrap.servers = ...
```

### Spec fields

| Field | Type | Description |
|---|---|---|
| `yaml` | string | YAML template used to generate K8s specs. Placeholders use `{{var}}` syntax. |
| `connector` | string | Java-properties-style template used to generate the engine's connector config. |
| `databases` | array | Database names this template matches. If null/empty, matches every database. |
| `methods` | array | Access methods to match: `Scan` (read), `Modify` (write). If null/empty, matches all. |

A template can contribute `yaml`, `connector`, or both. A template that provides only `connector` is useful for adapters that don't need to deploy new infrastructure (the upstream resource already exists) but still need to declare how to read or write it.

See Templates and configuration for placeholder syntax and full examples.
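The placeholder expansion described above can be sketched in a few lines. This is a hypothetical illustration of the documented `{{var}}` / `{{var:default}}` syntax, not Hoptimator's actual implementation; the error behavior for an unset variable without a default is an assumption.

```python
import re

def expand(template: str, env: dict) -> str:
    """Expand {{var}} and {{var:default}} placeholders (illustrative sketch)."""
    def repl(match):
        var, sep, default = match.group(1).partition(":")
        if var in env:
            return str(env[var])
        if sep:  # a ":default" part was present, e.g. {{kafka.partitions:1}}
            return default
        # Assumed behavior: an unset variable with no default is an error.
        raise KeyError(f"no value for placeholder {{{{{var}}}}}")
    return re.sub(r"\{\{([^{}]+)\}\}", repl, template)

print(expand("topic = {{table}}", {"table": "existing-topic-1"}))
# topic = existing-topic-1
print(expand("partitions: {{kafka.partitions:1}}", {}))
# partitions: 1
```

The same syntax applies to both the `yaml` and `connector` templates, and to JobTemplate below.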


## JobTemplate

Generates the YAML for a job (Flink session job, Beam runner, etc.) when a pipeline targets a matching database.

```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: JobTemplate
metadata:
  name: flink-template
spec:
  yaml: |
    apiVersion: flink.apache.org/v1beta1
    kind: FlinkSessionJob
    metadata:
      name: {{name}}
    spec:
      deploymentName: basic-session-deployment
      job:
        entryClass: com.linkedin.hoptimator.flink.runner.FlinkRunner
        args:
        - {{flinksql}}
        jarURI: file:///opt/{{flink.app.name}}.jar
        parallelism: {{flink.parallelism:1}}
```

### Spec fields

| Field | Type | Description |
|---|---|---|
| `yaml` | string | YAML template. Has access to `{{flinksql}}`, `{{flinkconfigs}}`, plus everything in TableTemplate's environment. |
| `databases` | array | Database names this template matches. If null/empty, matches every database. |

## TableTrigger

Runs a Kubernetes job when an upstream table changes or on a cron schedule. See Triggers for operational guidance.

```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: TableTrigger
metadata:
  name: refresh-audience
spec:
  schema: KAFKA
  table: existing-topic-1
  schedule: "@hourly"
  yaml: |
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: refresh-audience-job
    spec:
      template:
        spec:
          containers:
            - name: backfill
              image: ...
              command: [...]
          restartPolicy: Never
```

### Spec fields

| Field | Type | Required | Description |
|---|---|---|---|
| `schema` | string | yes | Schema of the table the trigger watches (e.g. `KAFKA`). |
| `table` | string | yes | Table name the trigger watches. |
| `yaml` | string | | The Job (or other resource) to (re)create when the trigger fires. |
| `jobProperties` | object | | Extra source-specific properties available to the job at runtime. |
| `schedule` | string | | Cron schedule. If set, the trigger fires on a schedule. If null, it fires on status patches. |
| `paused` | boolean | | When true, the trigger does not fire (status updates are ignored). |

### Status fields

| Field | Type | Description |
|---|---|---|
| `timestamp` | date-time | When the trigger was last fired. Patching this fires the trigger. |
| `watermark` | date-time | Timestamp of the last successfully processed event. |
| `jobs` | object | Per-job state; useful for tracking the status of jobs the trigger spawned. |
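Since patching `status.timestamp` fires the trigger, a manual fire amounts to building a small status patch. A minimal sketch, assuming an RFC 3339 timestamp is accepted; apply the resulting body with your client of choice (a recent `kubectl patch` against the status subresource, for example).

```python
import json
from datetime import datetime, timezone

def fire_patch(now=None):
    """Build a status patch body that fires a TableTrigger by bumping its timestamp."""
    ts = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {"status": {"timestamp": ts}}

# Example body; e.g. (assuming kubectl >= 1.24 with --subresource support):
#   kubectl patch tabletrigger refresh-audience --subresource=status \
#     --type=merge -p "$(python fire.py)"
print(json.dumps(fire_patch(datetime(2024, 1, 1, tzinfo=timezone.utc))))
# {"status": {"timestamp": "2024-01-01T00:00:00Z"}}
```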

### Printer columns

`PAUSED`, `SCHEMA`, `TABLE`, `SCHEDULE`, `TIMESTAMP`, `WATERMARK`.


## Subscription

YAML-native way to declare a materialized view. Equivalent to running `CREATE MATERIALIZED VIEW <database>.<sink> AS <sql>` against the JDBC driver, but useful in GitOps workflows where pipelines should live next to the rest of your manifests.

```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: Subscription
metadata:
  name: my-feature
spec:
  database: VENICE
  sql: |
    SELECT m.id AS KEY, m.first_name FROM PROFILE.MEMBERS m
  hints:
    flink.parallelism: "2"
```
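The equivalence to `CREATE MATERIALIZED VIEW` can be made concrete. A hypothetical sketch only: the rule for deriving the sink name from `metadata.name` (uppercasing and replacing hyphens here) is an assumption for illustration, not Hoptimator's documented naming scheme.

```python
def to_ddl(name: str, spec: dict) -> str:
    """Render the CREATE MATERIALIZED VIEW a Subscription stands in for (sketch)."""
    # Assumed naming rule, for illustration only.
    sink = name.replace("-", "_").upper()
    return (f"CREATE MATERIALIZED VIEW `{spec['database']}`.`{sink}` AS "
            f"{spec['sql'].strip()}")

print(to_ddl("my-feature", {
    "database": "VENICE",
    "sql": "SELECT m.id AS KEY, m.first_name FROM PROFILE.MEMBERS m",
}))
```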

### Spec fields

| Field | Type | Required | Description |
|---|---|---|---|
| `sql` | string | yes | A single SQL query. |
| `database` | string | yes | The database in which to create the output sink table. |
| `hints` | object | | Hints to adapters and the planner. May be ignored if a different plan is chosen. |

### Status fields

| Field | Type | Description |
|---|---|---|
| `ready` | boolean | Whether the subscription is ready to be consumed. |
| `failed` | boolean | Whether the operator was unable to deploy a pipeline. |
| `message` | string | Free-text status, typically the last error. |
| `sql` | string | The SQL the pipeline ended up implementing (may be planner-rewritten). |
| `hints` | object | The hints that survived planning. |
| `attributes` | object | Physical attributes of the job and sink/output table. |
| `resources` | array | All YAML generated to implement the pipeline. |
| `jobResources` | array | YAML for the job specifically. |
| `downstreamResources` | array | YAML for the sink/output table. |

### Printer columns

`STATUS`, `DB`, `SQL`.


## LogicalTable

One named entity backed by multiple physical tiers. See Logical tables in concepts for the bigger picture.

```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: LogicalTable
metadata:
  name: audience
spec:
  tableName: audience
  tiers:
    nearline:
      database: kafka-database
    online:
      database: venice
    offline:
      database: hdfs-database
```

### Spec fields

| Field | Type | Description |
|---|---|---|
| `tableName` | string | Original table name as declared in `CREATE TABLE` (e.g. `audience`). |
| `tiers` | object | Map of tier name (`nearline`, `online`, `offline`) to a tier binding. |

Each tier binding has one field:

| Field | Type | Required | Description |
|---|---|---|---|
| `database` | string | yes | Name of the Database CRD backing this tier. |

The `LogicalTableDeployer` runs at create time to deploy physical tier resources, the implicit inter-tier sync pipelines, and the offline-tier backfill trigger when an offline tier is bound.


## Engine (optional)

Registers a query-execution runtime. See Engines in concepts for what this surface is and isn't. The short version: pipeline materialization does not require an Engine resource.

```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: Engine
metadata:
  name: flink
spec:
  url: jdbc:flink://gateway.example/
  dialect: Flink
```

### Spec fields

| Field | Type | Required | Description |
|---|---|---|---|
| `url` | string | yes | JDBC URL Hoptimator uses to submit queries to the engine. |
| `dialect` | enum | | `ANSI` or `Flink`. |
| `driver` | string | | Fully qualified JDBC driver class name. |
| `databases` | array | | Databases this engine supports. If null/empty, supports all of them. |

### Printer columns

`URL`.


## SqlJob

A declarative spec for a SQL job (Flink or Flink-Beam, streaming or batch) that an SqlJob operator picks up and deploys. Hoptimator itself doesn't reconcile SqlJob resources; an external operator paired with this CRD does.

```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: SqlJob
metadata:
  name: my-sql-job
spec:
  dialect: Flink
  executionMode: Streaming
  sql:
    - "CREATE TABLE input ... WITH ('connector' = 'kafka', ...);"
    - "CREATE TABLE output ... WITH ('connector' = 'venice', ...);"
    - "INSERT INTO output SELECT * FROM input;"
  configs:
    flink.parallelism: "4"
```

### Spec fields

| Field | Type | Required | Description |
|---|---|---|---|
| `sql` | array | yes | One or more SQL statements run as a single job. |
| `dialect` | enum | | `Flink` (default) or `FlinkBeam`. |
| `executionMode` | enum | | `Streaming` (default) or `Batch`. |
| `configs` | object | | Job-level configuration passed through to the engine. |

### Status fields

| Field | Type | Description |
|---|---|---|
| `ready` | boolean | Whether the job is running or has completed. |
| `failed` | boolean | Whether the job has failed. |
| `message` | string | Status message, typically the last error if any. |
| `sql` | string | The SQL the operator is implementing. |
| `configs` | object | The configs the operator is using. |

### Printer columns

`DIALECT`, `MODE`, `STATUS`.