A Math Reference (MathDataReference) is a derived data series whose values are computed from a Python expression evaluated over other series in the catalog. Rather than storing a pre-computed result, it evaluates on demand — every call to getData() fetches the underlying source data and applies the expression freshly.
Math References are first-class catalog entries. They appear alongside raw DataReference objects in the table, can be plotted, downloaded, and used as inputs to other Math References (chaining).
When a Math Reference evaluates its expression it resolves each identifier in priority order:
- `variable_map`: an explicit `{alias: DataReference}` binding provided at construction time.
- `search_map`: each alias is looked up in the catalog at `getData()` time using attribute criteria (e.g. `variable=wind_speed, interval=hourly`). This is the recommended approach because it keeps expressions portable.
- Catalog name-lookup: if the identifier matches a catalog key directly, that reference is used. This works for simple cases and for chaining one Math Reference into another.
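The priority order above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `resolve_identifier`, the plain-dict catalog, and the exact-equality attribute check are all assumptions made for the example.

```python
def resolve_identifier(name, variable_map, search_map, catalog):
    """Hypothetical resolver mirroring the documented priority order.

    The catalog is modelled as {key: {"attrs": {...}, "data": ...}}.
    """
    # 1. Explicit binding supplied at construction time.
    if name in variable_map:
        return variable_map[name]
    # 2. Attribute criteria, checked against the catalog at call time.
    if name in search_map:
        criteria = search_map[name]
        for entry in catalog.values():
            if all(entry["attrs"].get(k) == v for k, v in criteria.items()):
                return entry["data"]  # first match wins
        raise KeyError(f"no catalog entry matches {criteria!r}")
    # 3. Direct catalog-key lookup.
    if name in catalog:
        return catalog[name]["data"]
    raise KeyError(f"unresolved identifier: {name}")


catalog = {
    "A_wind": {"attrs": {"variable": "wind_speed", "interval": "hourly"}, "data": [1.0, 2.0]},
    "A_temp": {"attrs": {"variable": "temperature", "interval": "hourly"}, "data": [10.0, 11.0]},
}
search_map = {"obs": {"variable": "wind_speed", "interval": "hourly"}}

by_search = resolve_identifier("obs", {}, search_map, catalog)
by_name = resolve_identifier("A_temp", {}, search_map, catalog)
by_binding = resolve_identifier("obs", {"obs": [9.0]}, search_map, catalog)
```

Because the explicit binding is checked first, `by_binding` ignores the catalog entirely even though `obs` also has search criteria.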
Every DataReference carries a ref_type class attribute:
| Class | ref_type value |
|---|---|
| `DataReference` | `"raw"` |
| `MathDataReference` | `"math"` |
| Custom subclass | Override with ref_type = "my_type" in the class body |
DataCatalog.to_dataframe() always includes a ref_type column. Subclasses
that need to behave differently for derived vs. raw entries can branch on this
value without using isinstance checks:

    df = catalog.to_dataframe()
    math_rows = df[df["ref_type"] == "math"]

Expressions run inside a safe NumPy namespace. The following are available without any import:
| Category | Available names |
|---|---|
| Trig | sin, cos, tan, arcsin, arccos, arctan, arctan2 |
| Exponential / log | exp, log, log2, log10, sqrt |
| Rounding | ceil, floor, round |
| Array ops | abs, clip, where, diff, cumsum |
| Aggregates | min, max, sum, mean, std |
| Constants | pi, e, nan, inf |
| Libraries | np (NumPy), pd (pandas), math |
Pandas Series/DataFrame methods are also available directly on resolved variables (e.g. obs.mean(), ws.mean(axis=1)).
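To make the idea of a restricted namespace concrete, here is a rough sketch of evaluating an expression with only a fixed set of names in scope. `SAFE_NAMES` and `evaluate` are hypothetical names, the namespace is a small subset of the table above, and the approach (plain `eval` with builtins hidden) is an assumption, not necessarily how the library does it.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical subset of the documented namespace; the real one is larger.
SAFE_NAMES = {
    "sqrt": np.sqrt, "log": np.log, "exp": np.exp,
    "abs": np.abs, "clip": np.clip, "where": np.where,
    "diff": np.diff, "cumsum": np.cumsum,
    "mean": np.mean, "std": np.std,
    "pi": np.pi, "nan": np.nan,
    "np": np, "pd": pd, "math": math,
}


def evaluate(expression, variables):
    """Evaluate an expression with only SAFE_NAMES and the given variables in scope."""
    namespace = {"__builtins__": {}}  # hide Python builtins such as open()
    namespace.update(SAFE_NAMES)
    namespace.update(variables)
    return eval(expression, namespace)


obs = pd.Series([1.0, 2.0, 3.0])
converted = evaluate("obs * 2.0", {"obs": obs})
running = evaluate("cumsum(obs)", {"obs": obs})
```

Hiding builtins keeps accidental calls such as `open(...)` out of scope, but it is not a security boundary on its own, so expressions should still come from trusted users.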
The most portable way to define Math References is with a YAML file. Each entry is a list item with at minimum a name and an expression. All other keys become metadata attributes (shown as table columns in the UI).
Expression tokens are the full catalog key names of other references. Good for quick one-offs; breaks if the upstream reference is renamed.
Every DataReference has a .name property that is its catalog key — the string used to store and retrieve it. There are two ways a name is set:
- Assigned explicitly: in code you set `ref.name = "my_key"` before calling `catalog.add(ref)`.
- Generated from `ref_key()`: the default `DataReference.ref_key()` joins all string-representable attribute values with `_` separators, sanitising spaces and special characters to underscores:

      # ref with attributes: station="A", variable="wind", interval="hourly"
      ref.ref_key()  # → "A_wind_hourly"

  Subclasses can override `ref_key()` to produce a more domain-specific key from a chosen subset of attributes. For example, `ex_tsdataui.py` overrides it to use `station_name__variable__interval`:

      def ref_key(self) -> str:
          name = self.get_attribute("station_name", "").replace(" ", "_")
          variable = self.get_attribute("variable", "")
          interval = self.get_attribute("interval", "")
          return f"{name}__{variable}__{interval}"  # → "Station_A__wind_speed__hourly"
When the catalog contains any Math References, the table gains an expression column. For raw DataReference rows this column is intentionally filled with the reference's catalog key name, so users can see at a glance exactly what token to paste into an expression:

    # Inside get_data_catalog() in ex_tsdataui.py:
    mask = df["expression"].isna() | (df["expression"].str.strip() == "")
    df.loc[mask, "expression"] = df.loc[mask, "name"]  # show catalog key for raw refs

This means the expression column doubles as a "name hint" guide: raw rows show the token you'd use in a direct-name expression; Math Reference rows show their actual expression formula.
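The effect of that fill can be reproduced on a toy table. The DataFrame below is made up for illustration; only the two masking lines come from the snippet above.

```python
import pandas as pd

# Toy catalog table: one raw reference (no expression) and one Math Reference.
df = pd.DataFrame(
    {
        "name": ["Station_A__wind_speed__hourly", "wind_speed_mph__A__hourly"],
        "ref_type": ["raw", "math"],
        "expression": [None, "obs * 2.23694"],
    }
)

# Rows with a missing or blank expression are treated as raw references.
mask = df["expression"].isna() | (df["expression"].str.strip() == "")
df.loc[mask, "expression"] = df.loc[mask, "name"]

filled = df["expression"].tolist()
```

After the fill, the raw row shows its own catalog key in the expression column, while the Math Reference row keeps its formula.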
You can also list all names in code:

    catalog.list_names()
    # ['Station_A__wind_speed__hourly', 'Station_B__wind_speed__hourly', ...]

For example, a YAML file in the direct catalog-name style:

    # math_refs_tsdataui.yaml
    - name: Station_A__wind_speed_mph__hourly
      expression: Station_A__wind_speed__hourly * 2.23694
      station_id: '1'
      station_name: Station A
      variable: wind_speed_mph
      unit: mph
      interval: hourly
    - name: Station_B__precip_cumulative__hourly
      expression: cumsum(Station_B__precipitation__hourly)
      station_id: '2'
      station_name: Station B
      variable: precip_cumulative
      unit: mm
      interval: hourly
    - name: wind_diff_A_minus_C__hourly
      expression: Station_A__wind_speed__hourly - Station_C__wind_speed__hourly
      station_id: '1'
      station_name: Station A
      variable: wind_diff_A_minus_C
      unit: m/s
      interval: hourly

Each expression variable is a short alias resolved by catalog attribute criteria at getData() time. The expression itself stays the same even if catalog key names change; only the search criteria need to match.

    # math_refs_search_map.yaml
    # Unit conversion: obs → first match for station_name=Station A, variable=wind_speed, interval=hourly
    - name: wind_speed_mph__A__hourly
      expression: obs * 2.23694
      station_name: Station A
      variable: wind_speed_mph
      unit: mph
      interval: hourly
      search_map:
        obs:
          station_name: Station A
          variable: wind_speed
          interval: hourly
    # Cross-station difference: two independent aliases
    - name: wind_diff_B_minus_A__hourly
      expression: ws_b - ws_a
      variable: wind_diff_B_minus_A
      unit: m/s
      interval: hourly
      search_map:
        ws_b:
          station_name: Station B
          variable: wind_speed
          interval: hourly
        ws_a:
          station_name: Station A
          variable: wind_speed
          interval: hourly
    # Normalised anomaly (z-score)
    - name: precip_anomaly__B__hourly
      expression: (obs - obs.mean()) / obs.std()
      variable: precip_anomaly
      unit: ''
      interval: hourly
      search_map:
        obs:
          station_name: Station B
          variable: precipitation
          interval: hourly
    # Multi-station mean: match_all: true concatenates ALL matches into a DataFrame
    - name: mean_wind_speed__all_stations__hourly
      expression: ws.mean(axis=1)
      variable: mean_wind_speed
      unit: m/s
      interval: hourly
      search_map:
        ws:
          variable: wind_speed
          interval: hourly
          match_all: true

`match_all: true`: when set inside a `search_map` criteria block, all catalog entries matching those criteria are fetched and concatenated column-wise (axis=1 join by timestamp index) so the alias resolves to a DataFrame. Use this for multi-station aggregates such as `ws.mean(axis=1)`.
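The column-wise join that `match_all: true` describes behaves like a plain pandas concatenation. The sketch below uses made-up station data, and it assumes the default outer join of `pd.concat`; how the library treats timestamps missing from one station is not stated here.

```python
import pandas as pd

index = pd.date_range("2024-01-01", periods=3, freq="D")
station_a = pd.Series([1.0, 2.0, 3.0], index=index, name="Station A")
# Station B lacks the last timestamp, so the joined frame holds NaN there.
station_b = pd.Series([3.0, 4.0], index=index[:2], name="Station B")

# Column-wise concatenation aligned on the timestamp index.
ws = pd.concat([station_a, station_b], axis=1)

# Row-wise mean across stations; NaN values are skipped by default.
mean_ws = ws.mean(axis=1)
```

Each matching series becomes one column, so `ws.mean(axis=1)` yields one value per timestamp.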
By default, every attr=value criterion in a search_map block (or in a catalog.search() call) performs an exact equality check. You can switch to a case-insensitive regular expression by using ~ instead of = as the operator, or by passing a tilde-prefixed string in code.
| Rule | Detail |
|---|---|
| Operator | ~ in the editor/YAML; "~pattern" (tilde-prefixed string) in Python |
| Matching | re.fullmatch — the pattern must match the entire attribute value |
| Case | Case-insensitive (re.IGNORECASE) |
| Partial match | Use .* to allow leading/trailing characters (e.g. ~EC.*) |

    search_map:
      # Exact match (default): only "wind_speed" qualifies, not "wind_speed_mph"
      obs:
        variable: wind_speed
        interval: hourly
      # Regex fullmatch: matches "wind_speed" AND "wind_speed_mph" (starts with "wind_speed")
      obs_any:
        variable: ~wind_speed.*
        interval: hourly
      # Case-insensitive: "EC", "ec", "Ec" all match
      salinity:
        variable: ~ec
        interval: ~15min.*
      # Match any of several known values with alternation
      flow_or_stage:
        variable: ~(flow|stage)
        interval: hourly

The same patterns in Python:

    # Regex via set_search(): matches "EC", "ec_hourly", "EC_daily", etc.
    m = (MathDataReference("obs * 1.0", name="ec_copy")
         .set_search("obs", variable="~EC.*", interval="hourly")
         .set_catalog(catalog))

    # Regex in a search_map dict passed to the constructor
    m = MathDataReference(
        "a + b",
        search_map={
            "a": {"variable": "~flow.*", "location": "upstream"},
            "b": {"variable": "~flow.*", "location": "downstream"},
        },
        name="net_flow",
    )

    # catalog.search() with regex: find all refs whose name starts with "station"
    refs = catalog.search(name="~station.*")
    # catalog.search() with regex attribute: find all EC-family variables
    refs = catalog.search(variable="~EC.*")

Type criteria using ~ instead of = to indicate regex:

    variable~EC.*
    variable~EC.*, interval=hourly
The +attr picker always appends attr=. To switch to regex, manually replace = with ~ in the criteria field after picking the attribute name.
re.fullmatch is used, so the pattern must match the entire attribute value, not just a substring:
| Pattern | Attribute value | Matches? |
|---|---|---|
| `~EC` | `"EC"` | ✅ yes: exact fullmatch |
| `~EC` | `"EC_daily"` | ❌ no: `EC` alone doesn't cover `_daily` |
| `~EC.*` | `"EC_daily"` | ✅ yes: `.*` absorbs the suffix |
| `~ec` | `"EC"` | ✅ yes: case-insensitive |
| `~ec` | `"EC_DAILY"` | ❌ no: same fullmatch rule applies |
| `~ec.*` | `"EC_DAILY"` | ✅ yes |
| `~(flow\|stage)` | `"flow"` | ✅ yes |
| `~(flow\|stage)` | `"velocity"` | ❌ no |
If a metadata attribute value legitimately starts with ~ (unusual), it cannot be matched in Python by passing the tilde-prefixed string as the expected value, because any string beginning with ~ is interpreted as a regex pattern. As a workaround, use a callable predicate in Python:

    # Match the literal value "~special" exactly
    refs = catalog.search(variable=lambda v: v == "~special")

This limitation does not apply to the YAML or editor format, where you would simply use variable=~special to trigger regex mode.
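The matching rules in this section (exact by default, `~` for a case-insensitive `re.fullmatch`, callables as predicates) can be summarised in one small function. `matches` is a hypothetical helper written for illustration; the library's own matcher may be structured differently.

```python
import re


def matches(expected, actual):
    """Hypothetical matcher: '~' prefix means regex fullmatch, a callable is a predicate."""
    if callable(expected):
        return bool(expected(actual))
    if isinstance(expected, str) and expected.startswith("~"):
        # Strip the tilde; the pattern must cover the entire value, ignoring case.
        return re.fullmatch(expected[1:], str(actual), re.IGNORECASE) is not None
    return expected == actual


# (pattern, attribute value, expected outcome) taken from the table above.
cases = [
    ("~EC", "EC", True),
    ("~EC", "EC_daily", False),
    ("~EC.*", "EC_daily", True),
    ("~ec", "EC", True),
    ("~ec", "EC_DAILY", False),
    ("~ec.*", "EC_DAILY", True),
    ("~(flow|stage)", "flow", True),
    ("~(flow|stage)", "velocity", False),
]
results = [matches(pattern, value) for pattern, value, _ in cases]

# A callable sidesteps regex mode for values that really start with "~".
literal_tilde = matches(lambda v: v == "~special", "~special")
```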
A Math Reference can reference another Math Reference, either by catalog key name or via search_map attributes. All references in the file are created before any getData() call, so ordering within the file does not matter.

    # Chain by name: cumulative sum of the mph-converted series defined above
    - name: wind_speed_mph__A__hourly__cumsum
      expression: cumsum(wind_speed_mph__A__hourly)
      variable: wind_speed_mph_cumsum
      unit: mph
      interval: hourly
    # Chain by search_map attributes: more portable than using the catalog key
    - name: wind_speed_mph__A__hourly__pct_of_mean
      expression: mph_wind / mph_wind.mean() * 100
      variable: wind_speed_mph_pct_of_mean
      unit: '%'
      interval: hourly
      search_map:
        mph_wind:
          station_name: Station A
          variable: wind_speed_mph
          interval: hourly

To load a YAML file into a catalog, register a `MathDataCatalogReader` and add the file as a source:

    from dvue.catalog import DataCatalog
    from dvue.math_reference import MathDataCatalogReader

    catalog = DataCatalog()
    # ... add raw DataReference objects ...
    reader = MathDataCatalogReader(parent_catalog=catalog)
    catalog.add_reader(reader)
    catalog.add_source("math_refs_search_map.yaml")

To write the catalog's Math References back to a YAML file:

    from dvue.math_reference import save_math_refs

    save_math_refs(catalog, "math_refs_saved.yaml")

The Math Ref button in the UI opens an inline editor panel for creating, editing, and managing Math References without restarting the kernel.
Click the Math Ref button (orange/warning style) in the action toolbar above the catalog table. A new tab opens inside the display panel.
The editor is laid out top-to-bottom as follows.
The catalog key used to store and retrieve this reference. Must be unique within the catalog.
A Python expression string. Use the short aliases you define in the Search Map (or full catalog key names for direct-lookup style). See Available Functions for what is in scope.
A row of clickable buttons appears immediately below the Expression field, one per alias currently defined in the Search Map. Clicking a button appends that alias name to the end of the expression — no typing needed. The buttons update live as you add or rename aliases.
One key: value pair per line. These become metadata attributes shown as columns in the catalog table (e.g. variable, unit, interval, station_name).
One row per expression variable. Rows can be added with + Add variable and removed with the ✕ button.
| Field | Purpose |
|---|---|
| Alias | Short identifier used in the Expression (e.g. obs) |
| Match all | When checked, all matching catalog entries are concatenated into a DataFrame (equivalent to match_all: true in YAML) |
| Catalog criteria | Comma-separated attr=val (exact) or attr~regex (regex fullmatch) pairs used to search the catalog (e.g. variable=wind_speed, interval=hourly or variable~EC.*, interval=hourly) |
| + attr | Drop-down picker; selecting an attribute name appends attr= to the criteria field so you only need to fill in the value |
| ▶ | Per-row test button — runs the criteria against the live catalog and shows an inline badge: ✅ 1 match, ⚠️ N matches, or ❌ 0 matches. If exactly one match is found and the Attributes field is empty, it is auto-filled with the match's identifying attributes |
Criteria pre-fill from the table — if you have a row selected in the catalog table when you click + Add variable, the new row's criteria field is pre-populated with the identifying attributes of the selected row (station ID, variable, interval, etc.), saving manual typing.
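As a rough illustration of the criteria syntax, the sketch below turns an editor-style string into the tilde-prefixed dictionary form used in Python. `parse_criteria` is a hypothetical helper, not the editor's actual parser, and it assumes values contain no commas (a regex such as `a{1,3}` would be split incorrectly).

```python
import re


def parse_criteria(text):
    """Hypothetical parser for 'attr=val, attr~regex' criteria strings."""
    criteria = {}
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        # Split on the first '=' or '~', whichever comes first.
        match = re.match(r"([^=~]+)([=~])(.*)", part)
        if match is None:
            raise ValueError(f"expected attr=value or attr~regex, got {part!r}")
        attr, operator, value = match.groups()
        value = value.strip()
        # Regex criteria are stored as tilde-prefixed strings.
        criteria[attr.strip()] = "~" + value if operator == "~" else value
    return criteria


exact = parse_criteria("variable=wind_speed, interval=hourly")
regex = parse_criteria("variable~EC.*, interval=hourly")
```

Here `exact` holds plain values for equality checks, while `regex` carries `"~EC.*"` for the `variable` attribute.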
The Test Expression button (▶ Play icon) evaluates the full expression against the live catalog and shows the result immediately below the button: its shape and the first five rows as a table. Any errors (unresolved aliases, type mismatches, etc.) are reported inline.
A read-only section listing every attribute column currently in the catalog and its unique values — use this as a reference when writing search criteria.
- Upload YAML: choose a `.yaml`/`.yml` file from your computer and click Load Uploaded YAML. All `MathDataReference` entries in the file are merged into the live catalog and the table refreshes. No server-side path is required.
- Download YAML: downloads all current Math References in the catalog as a single YAML file to your browser's downloads folder. The file is in the canonical flat format and can be re-uploaded later.
Click Save to add the reference to the live catalog and refresh the table. After saving, the Name, Expression, and Attributes fields are cleared so you can immediately create the next reference. The Search Map rows are preserved — useful when creating a series of similar references that share the same variable criteria.
If a reference with the same name already exists it is replaced in-place.
Select a row in the catalog table that corresponds to a Math Reference before clicking Math Ref. The editor pre-populates Name, Expression, Attributes, and all Search Map rows from the selected reference.
Once a Math Reference is in the catalog it can be selected and plotted just like any raw data series. The expression is evaluated on demand against the live catalog data.
You can also construct MathDataReference objects directly in code:

    from dvue.catalog import DataCatalog, DataReference
    from dvue.math_reference import MathDataReference

    # Explicit variable_map
    a = DataReference(df_a, name="A")
    b = DataReference(df_b, name="B")
    derived = MathDataReference("A + B * 2", variable_map={"A": a, "B": b}, name="sum_AB")

    # Operator overloading shorthand (produces a MathDataReference automatically)
    combined = a + b * 2
    combined.name = "sum_AB"

    # search_map style: resolves against a catalog at getData() time
    cat = DataCatalog()
    # ... populate cat ...
    m = MathDataReference(
        "inflow - outflow",
        search_map={
            "inflow": {"variable": "discharge", "location": "upstream"},
            "outflow": {"variable": "discharge", "location": "downstream"},
        },
        catalog=cat,
        name="net_discharge",
    )
    cat.add(m)

The same kind of reference can be built with chained set_search() and set_catalog() calls:

    m = (MathDataReference("obs * 2.23694", name="wind_mph")
         .set_search("obs", variable="wind_speed", interval="hourly", station_name="Station A")
         .set_catalog(catalog))
    catalog.add(m)

TimeSeriesDataUIManager includes the Math Ref editor by default, so no extra wiring is required. The button is shown automatically whenever show_math_ref_editor is True (the default).
To hide it for a specific manager instance:

    mgr.show_math_ref_editor = False
    # or pass it to the constructor:
    mgr = MyManager(catalog, show_math_ref_editor=False)

To customise the suggested download filename or add the action to a manager that does not inherit from TimeSeriesDataUIManager:

    from dvue.math_ref_editor import MathRefEditorAction

    class MyManager(TimeSeriesDataUIManager):
        @property
        def data_catalog(self):
            return self._catalog  # required: expose the DataCatalog

        def get_data_actions(self):
            actions = super().get_data_actions()
            actions.append(dict(
                name="Math Ref",
                button_type="warning",
                icon="math-function",
                action_type="display",
                callback=MathRefEditorAction(default_yaml_filename="my_math_refs.yaml").callback,
            ))
            return actions

default_yaml_filename sets the suggested filename when the user clicks Download YAML in the editor.
The Transform → Ref button (green, in the action toolbar) converts the currently-active
transforms into a MathDataReference and saves it to the live catalog in a single click —
no expression authoring needed.
- Select one or more rows in the catalog table.
- Apply at least one transform in the Transform tab (see Transforms).
- Click Transform → Ref.
A notification confirms each new reference added. If a reference with the same derived name already exists it is silently replaced with the updated one.
| Section | Controls |
|---|---|
| Data Cleanup | fill_gap — fill missing values |
| Resampling | Period string (1D, 1H, 15min, …) + aggregation (mean, max, min, sum, std) |
| Smoothing | Tidal filter (cosine-Lanczos 40 h) + Rolling window (offset string 24H, 7D, …) + aggregation |
| Derived | Differencing (diff_periods) + cumulative sum |
| Scaling | Scale factor (multiplicative constant; 1.0 = no-op) |
Transforms are combined into a single chained pandas expression applied to the variable x:
| Active transforms | Expression |
|---|---|
| Resample 1D mean | x.resample('1D').mean() |
| Tidal filter then resample | cosine_lanczos(x, '40h').resample('1D').mean() |
| Rolling 24H mean | x.rolling('24H').mean() |
| Differencing (1 period) | x.diff(1) |
| Cumulative sum | x.cumsum() |
| Scale by 0.3048 | x * 0.3048 |
| Resample + scale | x.resample('1D').mean() * 0.3048 |
Pipeline order: tidal filter → resample → rolling → diff → cumsum → scale. The tidal filter is always applied on the original sub-daily series; resampling comes after.
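The expression table and pipeline order can be mimicked with a small string builder. This is a sketch under assumptions: `build_expression` and its `tidal_filter` flag are hypothetical, and the other parameter names are borrowed from the transform parameters documented for the manager.

```python
def build_expression(tidal_filter=False, resample_period="", resample_agg="mean",
                     rolling_window="", rolling_agg="mean",
                     do_diff=False, diff_periods=1, do_cumsum=False, scale_factor=1.0):
    """Hypothetical builder that chains the active transforms onto the variable x."""
    expr = "x"
    # Order follows the documented pipeline: tidal filter, resample, rolling,
    # diff, cumsum, scale.
    if tidal_filter:
        expr = f"cosine_lanczos({expr}, '40h')"
    if resample_period:
        expr += f".resample('{resample_period}').{resample_agg}()"
    if rolling_window:
        expr += f".rolling('{rolling_window}').{rolling_agg}()"
    if do_diff:
        expr += f".diff({diff_periods})"
    if do_cumsum:
        expr += ".cumsum()"
    if scale_factor != 1.0:
        expr += f" * {scale_factor}"
    return expr


daily_mean = build_expression(resample_period="1D")
tidal_daily = build_expression(tidal_filter=True, resample_period="1D")
daily_scaled = build_expression(resample_period="1D", scale_factor=0.3048)
```

Because each step appends to the same string, the order of the `if` blocks is what fixes the order of operations in the final expression.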
[f{n}_]{identity_key}__{tag}
| Part | Meaning |
|---|---|
| `f{n}_` | File index prefix, only in multi-file catalogs (e.g. `f0_`, `f1_`) |
| `{identity_key}` | Short identity of the source reference (e.g. `RSAC054_FLOW`) |
| `__` | Unambiguous separator: the sanitiser never produces `__` inside an identity key |
| `{tag}` | Short code for the active transforms (see table below) |
| Active transform | Tag |
|---|---|
| Tidal filter | tf |
| Resample 1D mean | 1D_mean |
| Resample 1H max | 1H_max |
| Rolling 24H mean | r24H_mean |
| Differencing (1 period) | diff |
| Differencing (N periods) | diffN |
| Cumulative sum | cumsum |
| Scale by 2.0 | x2.0 |
Multiple active transforms are joined with _:
resample_period=1D + scale_factor=0.3048 → name: RSAC054_FLOW__1D_mean_x0.3048
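The naming scheme can be sketched as two small helpers. Both `build_tag` and `derived_name` are hypothetical, and two details are assumptions rather than documented behaviour: that tags are joined in pipeline order, and that "N periods" renders as `diff` followed by the number.

```python
def build_tag(tidal_filter=False, resample_period="", resample_agg="mean",
              rolling_window="", rolling_agg="mean",
              do_diff=False, diff_periods=1, do_cumsum=False, scale_factor=1.0):
    """Hypothetical tag builder: one short code per active transform, joined by '_'."""
    parts = []
    if tidal_filter:
        parts.append("tf")
    if resample_period:
        parts.append(f"{resample_period}_{resample_agg}")
    if rolling_window:
        parts.append(f"r{rolling_window}_{rolling_agg}")
    if do_diff:
        parts.append("diff" if diff_periods == 1 else f"diff{diff_periods}")
    if do_cumsum:
        parts.append("cumsum")
    if scale_factor != 1.0:
        parts.append(f"x{scale_factor}")
    return "_".join(parts)


def derived_name(identity_key, tag, file_index=None):
    """Assemble [f{n}_]{identity_key}__{tag}; the prefix is used only when an index is given."""
    prefix = f"f{file_index}_" if file_index is not None else ""
    return f"{prefix}{identity_key}__{tag}"


name = derived_name("RSAC054_FLOW", build_tag(resample_period="1D", scale_factor=0.3048))
multi_file_name = derived_name("RSAC054_FLOW", build_tag(tidal_filter=True), file_index=0)
```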
The identity part of the name comes from one of three sources, checked in priority order:
- `ref.set_key_attributes([...])`: set directly on the original `DataReference`. The ref itself advertises which attributes constitute its identity.

      ref.set_key_attributes(["B", "C"])  # DSS station (B) + variable (C)
      # → identity key: "RSAC054_FLOW"

- `manager.identity_key_columns = [...]`: a catalog-level default on the manager. All refs in this catalog use the same attribute names for their identity.

      mgr.identity_key_columns = ["B", "C"]
      # → every Transform → Ref uses B+C as the identity part

- Fallback: neither source is set. The full `ref_key()` is used. This is verbose (includes all non-source attributes), but always produces a valid, unique name.
The new MathDataReference inherits all attributes from the original ref except source
(the file path). This means it is groupable and filterable by the same station, variable, and
interval criteria as the source:

    Original ref:  B=RSAC054, C=FLOW, E=1HOUR, F=STUDY_V1, filename=base.dss
    New math ref:  B=RSAC054, C=FLOW, E=1HOUR, F=STUDY_V1+1D_mean
                                                 ↑        ↑
                                        original F        transform tag appended

When the original ref has no F attribute, a transform attribute is added instead:

    New math ref:  B=RSAC054, C=FLOW, transform=1D_mean
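The inheritance rule can be written out as a short function. `derive_attributes` is a hypothetical helper, and the names of the dropped file-path attributes (`source`, `filename`) are assumptions based on the text and example above.

```python
def derive_attributes(original, tag, path_keys=("source", "filename")):
    """Hypothetical helper: copy attributes, drop the file path, record the transform tag."""
    derived = {k: v for k, v in original.items() if k not in path_keys}
    if "F" in derived:
        # Keep the original F value and append the tag after a '+'.
        derived["F"] = f"{derived['F']}+{tag}"
    else:
        # No F attribute: record the tag under a separate 'transform' attribute.
        derived["transform"] = tag
    return derived


with_f = derive_attributes(
    {"B": "RSAC054", "C": "FLOW", "E": "1HOUR", "F": "STUDY_V1", "filename": "base.dss"},
    "1D_mean",
)
without_f = derive_attributes({"B": "RSAC054", "C": "FLOW"}, "1D_mean")
```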
To hide the Transform → Ref button:

    mgr.show_transform_to_catalog = False

TimeSeriesDataUIManager applies an in-memory transform pipeline to every plotted series.
Transforms are controlled from the Transform tab in the sidebar.
Note: transforms affect only what is plotted and what Transform → Ref captures. They do not modify the underlying data in the catalog.
raw data
→ fill gap
→ tidal filter (cosine-Lanczos 40 h)
→ resample (period + aggregation)
→ rolling window (offset + aggregation)
→ differencing (n periods)
→ cumulative sum
→ scale factor
| Param | Type | Default | Notes |
|---|---|---|---|
| `resample_period` | `str` | `""` | Pandas offset string (`"1D"`, `"1H"`, `"15min"`, …). Empty = disabled |
| `resample_agg` | selector | `"mean"` | `mean`, `max`, `min`, `sum`, `std` |
| `rolling_window` | `str` | `""` | Pandas offset string (`"24H"`, `"7D"`, …). Empty = disabled |
| `rolling_agg` | selector | `"mean"` | `mean`, `max`, `min`, `std` |
| `do_diff` | `bool` | `False` | Apply `Series.diff(diff_periods)` |
| `diff_periods` | `int` | `1` | Number of periods for differencing |
| `do_cumsum` | `bool` | `False` | Apply `Series.cumsum()` |
| `scale_factor` | `float` | `1.0` | Multiply every value; `1.0` is a no-op |
| `show_transform_to_catalog` | `bool` | `True` | Show "Transform → Ref" action button |
| `identity_key_columns` | `list[str]` | `[]` | Attribute names forming the identity key for derived ref names |
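To see how these parameters combine, the sketch below applies the same steps to a pandas Series in the documented order. `apply_transforms` is a hypothetical stand-in for the manager's pipeline; gap filling and the tidal filter are left out because their implementations are not shown here.

```python
import pandas as pd


def apply_transforms(series, resample_period="", resample_agg="mean",
                     rolling_window="", rolling_agg="mean",
                     do_diff=False, diff_periods=1, do_cumsum=False, scale_factor=1.0):
    """Apply resample, rolling, diff, cumsum and scale in that order (sketch only)."""
    out = series
    if resample_period:
        out = out.resample(resample_period).agg(resample_agg)
    if rolling_window:
        out = out.rolling(rolling_window).agg(rolling_agg)
    if do_diff:
        out = out.diff(diff_periods)
    if do_cumsum:
        out = out.cumsum()
    if scale_factor != 1.0:
        out = out * scale_factor
    return out


# Four samples at 12-hour spacing, i.e. two per day.
index = pd.date_range("2024-01-01", periods=4, freq=pd.Timedelta(hours=12))
raw = pd.Series([1.0, 3.0, 5.0, 7.0], index=index)

daily_scaled = apply_transforms(raw, resample_period="1D", scale_factor=2.0)
diff_then_cumsum = apply_transforms(raw, do_diff=True, do_cumsum=True)
```

The input series is never modified; each call returns a new object, which matches the note that transforms do not change the underlying catalog data.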
See examples/ex_tsdataui.py for a complete end-to-end demonstration:

    conda run -n pydsm panel serve examples/ex_tsdataui.py --show

The example includes:

- A `DataCatalog` with 18 raw station/variable/interval references
- Two ready-made YAML files in `examples/data/`:
  - `math_refs_tsdataui.yaml`: direct catalog-name lookup style
  - `math_refs_search_map.yaml`: `search_map` alias style (6 examples including chaining and `[multi]`)
- The Math Ref editor action pre-wired and ready to use



