Commit bda6084

Authored by ELin2025, Yicong-Huang, and chenlica
feat: add a new ternary contour plot operator (#4193)
### What changes were proposed in this PR?

<img width="1912" height="1027" alt="image" src="https://github.com/user-attachments/assets/536262ea-7541-4aeb-8b10-33dcfd73e72c" />

This PR adds a ternary contour plot operator, which visualizes how a scalar value varies as a function of three normalized components that sum to a constant (typically 1 or 100%).

In a ternary contour plot:
- Each vertex of the triangular plot represents 100% of one component.
- Any interior point represents a mixture of the three components.
- Contour lines or color gradients indicate equal values of the measured quantity across different mixtures.

This visualization is useful for identifying regions where the output is optimized or insensitive to changes in the component proportions, as well as for understanding trade-offs between the three variables.

The operator takes four inputs. The first three are the components, and the fourth is the measured value that corresponds to each mixture of the three components.

### Any related issues, documentation, discussions?

Requires the Python library scikit-image, which can be installed with `pip install scikit-image`.

### How was this PR tested?

Tested with existing test cases.

### Was this PR authored or co-authored using generative AI tooling?

No

---------

Co-authored-by: Elliot <36275109+Falcons-Royale@users.noreply.github.com>
Co-authored-by: Yicong Huang <17627829+Yicong-Huang@users.noreply.github.com>
Co-authored-by: Chen Li <chenli@gmail.com>
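The ternary geometry described in the PR summary can be illustrated with a small sketch that is not part of this commit. It rescales three raw component amounts so they sum to 1 and maps the result to a point inside an equilateral triangle. The function names `normalize_mixture` and `to_cartesian` and the vertex layout are hypothetical choices made for illustration only; Plotly may place the vertices differently.

```python
import math


def normalize_mixture(a, b, c):
    """Rescale three non-negative amounts so that they sum to 1."""
    total = a + b + c
    if total <= 0 or min(a, b, c) < 0:
        raise ValueError("components must be non-negative with a positive sum")
    return a / total, b / total, c / total


def to_cartesian(a, b, c):
    """Map fractions (a, b, c) that sum to 1 to a point in the triangle.

    Assumed layout (illustrative only): vertex A at (0, 0),
    vertex B at (1, 0), vertex C at (0.5, sqrt(3) / 2).
    """
    x = b + 0.5 * c
    y = (math.sqrt(3) / 2) * c
    return x, y


# A mixture given in percent: 20% A, 30% B, 50% C.
fractions = normalize_mixture(20, 30, 50)
point = to_cartesian(*fractions)
print(fractions, point)
```

Each vertex corresponds to 100% of one component, so `to_cartesian(1, 0, 0)` lands on vertex A, and an equal three-way mixture lands on the centroid of the triangle.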
1 parent ef66364 commit bda6084

4 files changed

Lines changed: 154 additions & 0 deletions

amber/operator-requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -24,3 +24,4 @@ torch==2.8.0
 scikit-learn==1.5.0
 transformers==4.57.3
 boto3==1.40.53
+scikit-image==0.25.2

common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/LogicalOp.scala

Lines changed: 2 additions & 0 deletions
@@ -132,6 +132,7 @@ import org.apache.texera.amber.operator.visualization.sankeyDiagram.SankeyDiagra
 import org.apache.texera.amber.operator.visualization.scatter3DChart.Scatter3dChartOpDesc
 import org.apache.texera.amber.operator.visualization.scatterplot.ScatterplotOpDesc
 import org.apache.texera.amber.operator.visualization.tablesChart.TablesPlotOpDesc
+import org.apache.texera.amber.operator.visualization.ternaryContour.TernaryContourOpDesc
 import org.apache.texera.amber.operator.visualization.ternaryPlot.TernaryPlotOpDesc
 import org.apache.texera.amber.operator.visualization.parallelCoordinatesPlot.ParallelCoordinatesPlotOpDesc
 import org.apache.texera.amber.operator.visualization.polarChart.PolarChartOpDesc
@@ -260,6 +261,7 @@ trait StateTransferFunc
     new Type(value = classOf[TablesPlotOpDesc], name = "TablesPlot"),
     new Type(value = classOf[ContinuousErrorBandsOpDesc], name = "ContinuousErrorBands"),
     new Type(value = classOf[FigureFactoryTableOpDesc], name = "FigureFactoryTable"),
+    new Type(value = classOf[TernaryContourOpDesc], name = "TernaryContour"),
     new Type(value = classOf[TernaryPlotOpDesc], name = "TernaryPlot"),
     new Type(value = classOf[DendrogramOpDesc], name = "Dendrogram"),
     new Type(value = classOf[NestedTableOpDesc], name = "NestedTable"),
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.texera.amber.operator.visualization.ternaryContour
+
+import com.fasterxml.jackson.annotation.{JsonProperty, JsonPropertyDescription}
+import com.kjetland.jackson.jsonSchema.annotations.JsonSchemaTitle
+import org.apache.texera.amber.core.tuple.{AttributeType, Schema}
+import org.apache.texera.amber.core.workflow.OutputPort.OutputMode
+import org.apache.texera.amber.pybuilder.PythonTemplateBuilder.PythonTemplateBuilderStringContext
+import org.apache.texera.amber.pybuilder.PyStringTypes.EncodableString
+import org.apache.texera.amber.core.workflow.{InputPort, OutputPort, PortIdentity}
+import org.apache.texera.amber.operator.PythonOperatorDescriptor
+import org.apache.texera.amber.operator.metadata.annotations.AutofillAttributeName
+import org.apache.texera.amber.operator.metadata.{OperatorGroupConstants, OperatorInfo}
+import org.apache.texera.amber.pybuilder.PythonTemplateBuilder
+
+/**
+  * Visualization Operator for Ternary Contour Plots.
+  *
+  * This operator uses three data fields as the components of a ternary plot
+  * and a fourth data field as the measured value drawn as contours.
+  */
+
+class TernaryContourOpDesc extends PythonOperatorDescriptor {
+
+  // Add annotations for the first variable
+  @JsonProperty(value = "firstVariable", required = true)
+  @JsonSchemaTitle("Variable 1")
+  @JsonPropertyDescription("First variable data field")
+  @AutofillAttributeName var firstVariable: EncodableString = ""
+
+  // Add annotations for the second variable
+  @JsonProperty(value = "secondVariable", required = true)
+  @JsonSchemaTitle("Variable 2")
+  @JsonPropertyDescription("Second variable data field")
+  @AutofillAttributeName var secondVariable: EncodableString = ""
+
+  // Add annotations for the third variable
+  @JsonProperty(value = "thirdVariable", required = true)
+  @JsonSchemaTitle("Variable 3")
+  @JsonPropertyDescription("Third variable data field")
+  @AutofillAttributeName var thirdVariable: EncodableString = ""
+
+  // Add annotations for the fourth variable (the measured value)
+  @JsonProperty(value = "fourthVariable", required = true)
+  @JsonSchemaTitle("Measured Value")
+  @JsonPropertyDescription("Measured value data field")
+  @AutofillAttributeName var fourthVariable: EncodableString = ""
+
+  // OperatorInfo instance describing the ternary contour plot
+  override def operatorInfo: OperatorInfo =
+    OperatorInfo(
+      userFriendlyName = "Ternary Contour",
+      operatorDescription =
+        "Shows how a measured value changes across all mixtures of three components that sum to a constant",
+      operatorGroupName = OperatorGroupConstants.VISUALIZATION_SCIENTIFIC_GROUP,
+      inputPorts = List(InputPort()),
+      outputPorts = List(OutputPort(mode = OutputMode.SINGLE_SNAPSHOT))
+    )
+
+  override def getOutputSchemas(
+      inputSchemas: Map[PortIdentity, Schema]
+  ): Map[PortIdentity, Schema] = {
+    val outputSchema = Schema()
+      .add("html-content", AttributeType.STRING)
+    Map(operatorInfo.outputPorts.head.id -> outputSchema)
+  }
+
+  /** Returns a Python string that drops tuples with missing values, negative components, or a zero component sum */
+  def manipulateTable(): PythonTemplateBuilder = {
+    // Check for any empty data field names
+    assert(
+      firstVariable.nonEmpty && secondVariable.nonEmpty && thirdVariable.nonEmpty && fourthVariable.nonEmpty
+    )
+    pyb"""
+       |        # Remove any tuples that contain missing values
+       |        table.dropna(subset=[$firstVariable, $secondVariable, $thirdVariable, $fourthVariable], inplace = True)
+       |
+       |        # Remove rows where any of the first three variables are negative
+       |        table = table[(table[[$firstVariable, $secondVariable, $thirdVariable]] >= 0).all(axis=1)]
+       |
+       |        # Remove zero-sum rows
+       |        s = table[$firstVariable] + table[$secondVariable] + table[$thirdVariable]
+       |        table = table[s > 0]
+       |"""
+  }
+
+  /** Returns a Python string that creates the ternary contour plot figure */
+  def createPlotlyFigure(): PythonTemplateBuilder = {
+    pyb"""
+       |        A = table[$firstVariable].to_numpy()
+       |        B = table[$secondVariable].to_numpy()
+       |        C = table[$thirdVariable].to_numpy()
+       |        Z = table[$fourthVariable].to_numpy()
+       |        fig = ff.create_ternary_contour(np.array([A,B,C]), Z, pole_labels=[$firstVariable, $secondVariable, $thirdVariable], interp_mode='cartesian')
+       |"""
+  }
+
+  /** Returns a Python string that yields the html content of the ternary contour plot */
+  override def generatePythonCode(): String = {
+    val finalCode =
+      pyb"""
+         |from pytexera import *
+         |
+         |import plotly.io
+         |import plotly.figure_factory as ff
+         |import numpy as np
+         |
+         |class ProcessTableOperator(UDFTableOperator):
+         |
+         |    # Generate custom error message as html string
+         |    def render_error(self, error_msg):
+         |        return '''<h1>TernaryContour is not available.</h1>
+         |                  <p>Reasons are: {} </p>
+         |               '''.format(error_msg)
+         |
+         |    @overrides
+         |    def process_table(self, table: Table, port: int) -> Iterator[Optional[TableLike]]:
+         |        if table.empty:
+         |            yield {'html-content': self.render_error("Input table is empty.")}
+         |            return
+         |        ${manipulateTable()}
+         |        if table.empty:
+         |            yield {'html-content': self.render_error("No valid rows left (every row has a missing value, a negative component, or a zero component sum).")}
+         |            return
+         |        ${createPlotlyFigure()}
+         |        # Convert fig to html content
+         |        html = plotly.io.to_html(fig, include_plotlyjs = 'cdn', auto_play = False)
+         |        yield {'html-content':html}
+         |"""
+    finalCode.encode
+  }
+
+}
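The row filtering that `manipulateTable` emits can be tried outside Texera with a short pandas sketch. This is an illustration rather than code from the commit: the function name `clean_ternary_rows` and the column names are hypothetical. The last step, which rescales each row so the three components sum to 1, is an assumption added here and is not in the operator. It reflects my understanding that Plotly's `create_ternary_contour` documents the three coordinates as expected to sum to 1.

```python
import pandas as pd


def clean_ternary_rows(table, a, b, c, z):
    """Apply the same three filters as manipulateTable, then rescale rows."""
    # 1. Drop rows with a missing value in any of the four fields.
    out = table.dropna(subset=[a, b, c, z])
    # 2. Keep only rows whose three components are all non-negative.
    out = out[(out[[a, b, c]] >= 0).all(axis=1)]
    # 3. Drop rows whose three components sum to zero.
    total = out[a] + out[b] + out[c]
    out = out[total > 0].copy()
    # 4. Assumption, not in the commit: rescale each row to sum to 1.
    total = out[[a, b, c]].sum(axis=1)
    out[[a, b, c]] = out[[a, b, c]].div(total, axis=0)
    return out


# Hypothetical sample data: one row per filter case.
demo = pd.DataFrame(
    {
        "sand": [20.0, 50.0, None, -5.0, 0.0],
        "silt": [30.0, 25.0, 10.0, 55.0, 0.0],
        "clay": [50.0, 25.0, 90.0, 50.0, 0.0],
        "yield": [1.2, 3.4, 5.6, 7.8, 9.0],
    }
)
cleaned = clean_ternary_rows(demo, "sand", "silt", "clay", "yield")
print(cleaned)
```

In the sample data, the third row is dropped for a missing value, the fourth for a negative component, and the fifth for a zero sum, so only the first two rows survive.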
164 KB