Every custom resource Hoptimator installs, with field-by-field detail. All
CRDs are in the hoptimator.linkedin.com/v1alpha1 API group. The apiVersion
is v1alpha1 and subject to change — pin deliberately.
The CRD YAMLs live in `hoptimator-k8s/src/main/resources/` and are applied by `make deploy` along with the operator.

If you're modifying a CRD, regenerate the Java model classes after your change:

```sh
make generate-models
```

The script invokes the upstream Kubernetes Java client's `crd-model-gen` Docker image to produce the typed `V1alpha1*` classes the operator and deployers consume. Commit the regenerated files with your CRD change.
| Kind | Plural | Short names | What it is |
|---|---|---|---|
| Database | databases | db, dbs | Registers an external system with the catalog (a JDBC URL + schema name). |
| View | views | — | A logical SQL view. |
| Pipeline | pipelines | pip, pips | The deployable unit produced by a materialized view (sources, sink, job). |
| TableTemplate | tabletemplates | tabt | Template for source/sink resource generation, scoped by database and access method. |
| JobTemplate | jobtemplates | jobt | Template for job resource generation, scoped by database. |
| TableTrigger | tabletriggers | — | Fires a Job when an upstream table changes or on a cron schedule. |
| Subscription | subscriptions | sub, subs | YAML-native equivalent of CREATE MATERIALIZED VIEW. |
| LogicalTable | logicaltables | lt | One named entity bound to multiple physical tier backends. |
| Engine | engines | eng | Registers a query-execution runtime. Optional; see concepts. |
| SqlJob | sqljobs | sql, sj | Primitive consumed by an SqlJob operator to deploy Flink or Flink-Beam SQL jobs. |
Registers an external system in the catalog. The JDBC URL drives schema
discovery; the schema field is the name under which the system appears in SQL.
```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: Database
metadata:
  name: ads-database
spec:
  schema: ADS
  url: jdbc:demodb://names=ads
  dialect: Calcite
```

| Field | Type | Required | Description |
|---|---|---|---|
| url | string | yes | JDBC connection URL. |
| schema | string | | Schema name as rendered in the catalog (e.g. ADS). |
| catalog | string | | JDBC catalog name. Used for hierarchical sources (e.g. MySQL). |
| dialect | enum | | ANSI, MySQL, or Calcite. Affects how the planner generates SQL for this source. |
| driver | string | | Fully qualified class name of the JDBC driver, if it isn't auto-discovered. |
Printer columns: CATALOG, SCHEMA, URL.
A logical view. Each CREATE VIEW statement produces one of these. Views
are stored definitions; nothing is deployed to materialize them.
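For instance, a DDL statement of this shape, issued through the JDBC driver, yields the View resource shown below:

```sql
CREATE VIEW ADS.AUDIENCE AS
SELECT FIRST_NAME, LAST_NAME
FROM ADS.PAGE_VIEWS NATURAL JOIN PROFILE.MEMBERS
```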
```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: View
metadata:
  name: ads-audience
spec:
  schema: ADS
  view: AUDIENCE
  sql: |
    SELECT FIRST_NAME, LAST_NAME
    FROM ADS.PAGE_VIEWS NATURAL JOIN PROFILE.MEMBERS
  materialized: false
```

| Field | Type | Required | Description |
|---|---|---|---|
| view | string | yes | View name. |
| sql | string | yes | The view's SQL. |
| schema | string | | Schema the view belongs to. |
| catalog | string | | Catalog the view belongs to. |
| materialized | boolean | | Whether the view should be materialized (i.e. paired with a Pipeline). |
Status fields:

| Field | Type | Description |
|---|---|---|
| watermark | date-time | Timestamp of the last data change event affecting this view. |
Printer columns: CATALOG, SCHEMA, VIEW, WATERMARK.
The deployable unit produced by a materialized view: a job plus its
sources and sink. The sql field is the auto-generated INSERT INTO
statement; the yaml field contains the rendered spec of every component.
```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: Pipeline
metadata:
  name: ads-audience
spec:
  sql: INSERT INTO `ADS`.`AUDIENCE` SELECT ...
  yaml: |
    apiVersion: flink.apache.org/v1beta1
    kind: FlinkSessionJob
    ...
```

| Field | Type | Description |
|---|---|---|
| sql | string | The INSERT INTO statement this pipeline implements. |
| yaml | string | The concatenated YAML of every object that makes up the pipeline. |
Status fields:

| Field | Type | Description |
|---|---|---|
| ready | boolean | Whether the entire pipeline is ready. |
| failed | boolean | Whether any element of the pipeline has failed. |
| message | string | Free-text status message; typically the last error, if any. |
| jobs | object | Map of external jobs this pipeline triggers, with their state. |
Printer columns: SQL, STATUS.
Generates source/sink YAML and connector configs when a matching table
becomes part of a pipeline. The databases and methods fields gate which
tables a template applies to.
```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: TableTemplate
metadata:
  name: kafka-template
spec:
  databases:
    - kafka-database
  methods:
    - Scan
  yaml: |
    apiVersion: kafka.strimzi.io/v1
    kind: KafkaTopic
    metadata:
      name: {{name}}
    spec:
      topicName: {{table}}
      partitions: {{kafka.partitions:1}}
  connector: |
    connector = kafka
    topic = {{table}}
    properties.bootstrap.servers = ...
```

| Field | Type | Description |
|---|---|---|
| yaml | string | YAML template used to generate K8s specs. Placeholders use {{var}} syntax. |
| connector | string | Java-properties-style template used to generate the engine's connector config. |
| databases | array | Database names this template matches. If null/empty, matches every database. |
| methods | array | Access methods to match: Scan (read), Modify (write). If null/empty, matches all. |
A template can contribute yaml, connector, or both. A template that
provides only connector is useful for adapters that don't need to deploy
new infrastructure (the upstream resource already exists) but still need to
declare how to read or write it.
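For instance, a connector-only template for topics that already exist upstream might look like this (a sketch; the database name and connector properties are illustrative):

```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: TableTemplate
metadata:
  name: kafka-readonly-template
spec:
  databases:
    - kafka-database
  methods:
    - Scan
  # No yaml field: the topic already exists, so nothing is deployed.
  connector: |
    connector = kafka
    topic = {{table}}
    properties.bootstrap.servers = ...
```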
See Templates and configuration for placeholder syntax and full examples.
Generates the YAML for a job (Flink session job, Beam runner, etc.) when a pipeline targets a matching database.
```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: JobTemplate
metadata:
  name: flink-template
spec:
  yaml: |
    apiVersion: flink.apache.org/v1beta1
    kind: FlinkSessionJob
    metadata:
      name: {{name}}
    spec:
      deploymentName: basic-session-deployment
      job:
        entryClass: com.linkedin.hoptimator.flink.runner.FlinkRunner
        args:
          - {{flinksql}}
        jarURI: file:///opt/{{flink.app.name}}.jar
        parallelism: {{flink.parallelism:1}}
```

| Field | Type | Description |
|---|---|---|
| yaml | string | YAML template. Has access to {{flinksql}}, {{flinkconfigs}}, plus everything in TableTemplate's environment. |
| databases | array | Database names this template matches. If null/empty, matches every database. |
Runs a Kubernetes job when an upstream table changes or on a cron schedule. See Triggers for operational guidance.
```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: TableTrigger
metadata:
  name: refresh-audience
spec:
  schema: KAFKA
  table: existing-topic-1
  schedule: "@hourly"
  yaml: |
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: refresh-audience-job
    spec:
      template:
        spec:
          containers:
            - name: backfill
              image: ...
              command: [...]
          restartPolicy: Never
```

| Field | Type | Required | Description |
|---|---|---|---|
| schema | string | yes | Schema of the table the trigger watches (e.g. KAFKA). |
| table | string | yes | Table name the trigger watches. |
| yaml | string | | The Job (or other resource) to (re)create when the trigger fires. |
| jobProperties | object | | Extra source-specific properties available to the job at runtime. |
| schedule | string | | Cron schedule. If set, the trigger fires on a schedule. If null, it fires on status patches. |
| paused | boolean | | When true, the trigger does not fire (status updates are ignored). |
Status fields:

| Field | Type | Description |
|---|---|---|
| timestamp | date-time | When the trigger was last fired. Patching this fires the trigger. |
| watermark | date-time | Timestamp of the last successfully processed event. |
| jobs | object | Per-job state, useful for tracking the status of jobs the trigger spawned. |
Printer columns: PAUSED, SCHEMA, TABLE, SCHEDULE, TIMESTAMP, WATERMARK.
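Since patching the status timestamp fires the trigger, a trigger without a schedule can be fired manually. A sketch of the merge patch (the timestamp value is illustrative), applied to the status subresource, e.g. with kubectl patch --subresource=status --type=merge:

```yaml
# Merge patch against the TableTrigger's status subresource.
status:
  timestamp: "2024-01-01T00:00:00Z"  # updating this fires the trigger
```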
YAML-native way to declare a materialized view. Equivalent to running
CREATE MATERIALIZED VIEW <database>.<sink> AS <sql> against the JDBC
driver, but useful in GitOps workflows where pipelines should live next to
the rest of your manifests.
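Concretely, a Subscription like the one below corresponds to a statement of this shape through the JDBC driver (the sink table name here is illustrative):

```sql
CREATE MATERIALIZED VIEW VENICE.MY_FEATURE AS
SELECT m.id AS KEY, m.first_name FROM PROFILE.MEMBERS m
```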
```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: Subscription
metadata:
  name: my-feature
spec:
  database: VENICE
  sql: |
    SELECT m.id AS KEY, m.first_name FROM PROFILE.MEMBERS m
  hints:
    flink.parallelism: "2"
```

| Field | Type | Required | Description |
|---|---|---|---|
| sql | string | yes | A single SQL query. |
| database | string | yes | The database in which to create the output sink table. |
| hints | object | | Hints to adapters and the planner. May be ignored if a different plan is chosen. |
Status fields:

| Field | Type | Description |
|---|---|---|
| ready | boolean | Whether the subscription is ready to be consumed. |
| failed | boolean | Whether the operator was unable to deploy a pipeline. |
| message | string | Free-text status, typically the last error. |
| sql | string | The SQL the pipeline ended up implementing (may be planner-rewritten). |
| hints | object | The hints that survived planning. |
| attributes | object | Physical attributes of the job and sink/output table. |
| resources | array | All YAML generated to implement the pipeline. |
| jobResources | array | YAML for the job specifically. |
| downstreamResources | array | YAML for the sink/output table. |
Printer columns: STATUS, DB, SQL.
One named entity backed by multiple physical tiers. See Logical tables in concepts for the bigger picture.
```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: LogicalTable
metadata:
  name: audience
spec:
  tableName: audience
  tiers:
    nearline:
      database: kafka-database
    online:
      database: venice
    offline:
      database: hdfs-database
```

| Field | Type | Description |
|---|---|---|
| tableName | string | Original table name as declared in CREATE TABLE (e.g. audience). |
| tiers | object | Map of tier name (nearline, online, offline) to a tier binding. |
Each tier binding has one field:

| Field | Type | Required | Description |
|---|---|---|---|
| database | string | yes | Name of the Database CRD backing this tier. |
The LogicalTableDeployer runs at create time to deploy physical tier
resources, the implicit inter-tier sync pipelines, and the offline-tier
backfill trigger when an offline tier is bound.
Registers a query-execution runtime. See
Engines in concepts
for what this surface is and isn't — short version: pipeline
materialization does not require an Engine resource.
```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: Engine
metadata:
  name: flink
spec:
  url: jdbc:flink://gateway.example/
  dialect: Flink
```

| Field | Type | Required | Description |
|---|---|---|---|
| url | string | yes | JDBC URL Hoptimator uses to submit queries to the engine. |
| dialect | enum | | ANSI or Flink. |
| driver | string | | Fully qualified JDBC driver class name. |
| databases | array | | Databases this engine supports. If null/empty, supports all of them. |
Printer columns: URL.
A declarative spec for a SQL job — Flink or Flink-Beam, streaming or
batch — that an SqlJob operator picks up and deploys. Hoptimator itself
doesn't reconcile SqlJob resources; an external operator paired with this
CRD does.
```yaml
apiVersion: hoptimator.linkedin.com/v1alpha1
kind: SqlJob
metadata:
  name: my-sql-job
spec:
  dialect: Flink
  executionMode: Streaming
  sql:
    - "CREATE TABLE input ... WITH ('connector' = 'kafka', ...);"
    - "CREATE TABLE output ... WITH ('connector' = 'venice', ...);"
    - "INSERT INTO output SELECT * FROM input;"
  configs:
    flink.parallelism: "4"
```

| Field | Type | Required | Description |
|---|---|---|---|
| sql | array | yes | One or more SQL statements run as a single job. |
| dialect | enum | | Flink (default) or FlinkBeam. |
| executionMode | enum | | Streaming (default) or Batch. |
| configs | object | | Job-level configuration passed through to the engine. |
Status fields:

| Field | Type | Description |
|---|---|---|
| ready | boolean | Whether the job is running or has completed. |
| failed | boolean | Whether the job has failed. |
| message | string | Status message; typically the last error, if any. |
| sql | string | The SQL the operator is implementing. |
| configs | object | The configs the operator is using. |
Printer columns: DIALECT, MODE, STATUS.