feat(python-notebook-migration): add LLM client for notebook-to-workflow conversion#5260
feat(python-notebook-migration): add LLM client for notebook-to-workflow conversion#5260zyratlo wants to merge 18 commits into
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5260 +/- ##
============================================
+ Coverage 54.08% 54.15% +0.07%
Complexity 2809 2809
============================================
Files 1101 1106 +5
Lines 42597 42777 +180
Branches 4584 4599 +15
============================================
+ Hits 23039 23168 +129
- Misses 18219 18265 +46
- Partials 1339 1344 +5
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
/request-review @mengw15 |
Automated Reviewer SuggestionsBased on the
|
mengw15
left a comment
There was a problem hiding this comment.
A couple of cleanups on migration-prompts.ts
| print("Minimum values:\n", data.min()) | ||
| print("\nMaximum values:\n", data.max()) | ||
| print("\nMean values:\n", data.mean()) | ||
| # END CELL 4 |
There was a problem hiding this comment.
Two small things while we're in here:
1. The 4 // https://github.com/Texera/texera/... reference URLs (lines 22, 95, 122, 153) are stale on two axes:
- Org rename:
Texera/texera→apache/texera. The plain wiki link at line 22 still redirects, but the three pinned-commitblob/...URLs (lines 95, 122, 153) likely 404 because those commits don't exist in the apache fork's git history. - Path drift: the URLs say
core/amber/src/main/python/core/models/X.py, but the actual path onapache/texeraisamber/src/main/python/core/models/X.py(nocore/prefix). Verified viagit ls-tree.
Suggest swapping to the current apache/texera-rooted paths and either dropping the commit pin (link to main/HEAD) or refreshing it.
2. Line 257: # END CELL 4 (with a space) — every other one of the 18 cell markers in EXAMPLE_OF_MULTIPLE_UDF_CONVERSION is the no-space form # END CELLN. This is the canonical pattern the LLM should learn from (and that custom.js parses on the iframe side too). A stray space in the example can teach the LLM that the marker format is loose, producing inconsistent cell IDs when it generates the workflow + mapping. One-character fix: drop the space.
mengw15
left a comment
There was a problem hiding this comment.
Two more comments on migration-llm.ts
| public async convertNotebookToWorkflow(notebook: Notebook): Promise<string> { | ||
| this.assertEnabled(); | ||
| if (!this.initialized) { | ||
| throw new Error("LLM session not initialized"); | ||
| } | ||
|
|
||
| const codeCells = notebook.cells.filter(cell => cell.cell_type === "code"); | ||
| const notebookString = codeCells | ||
| .map(cell => { | ||
| const uuid = String(cell.metadata.uuid); | ||
| return `# START ${uuid}\n${cell.source}\n# END ${uuid}`; | ||
| }) | ||
| .join("\n\n"); | ||
|
|
||
| const workflow = await this.sendPrompt(`${WORKFLOW_PROMPT}\n${notebookString}`); | ||
| const mapping = await this.sendPrompt(MAPPING_PROMPT); |
There was a problem hiding this comment.
this.messages accumulates user/assistant pairs across every sendPrompt call. A single convertNotebookToWorkflow adds 4 messages (workflow prompt + response + mapping prompt + response) on top of the 8 doc system messages from initialize. If a consumer calls convertNotebookToWorkflow twice without re-initialize-ing or calling close() between (e.g., user uploads notebook A, then B, in the same session), the second call sends the entire first conversation back to the LLM — tokens balloon, context window can blow, and the prior workflow/mapping output can pollute the second response.
The class is built for a one-shot lifecycle (initialize → verifyConnection → convertNotebookToWorkflow → close) but doesn't enforce it. Three options:
- Reset at the top: at the start of
convertNotebookToWorkflow, resetthis.messagesto just the documentation prelude (same shape as whatinitializebuilds). This makes each conversion stateless w.r.t. prior conversions. - Throw on re-use: track whether this session has already converted once; on second call, throw an error directing callers to
close()+initialize()first. - Document the one-shot lifecycle clearly in the class doc-comment so the next consumer (and the spec) doesn't trip on it.
Option 1 is probably most forgiving for callers; option 3 is the cheapest. The current spec uses a fresh makeLLM() per test so it doesn't catch the multi-call case, which would be a useful test to add either way.
There was a problem hiding this comment.
I implemented Option 1 in 84e87f1 and added test cases to verify previous conversation history is discarded.
| Object.entries(udfLLMResponse.code).forEach(([udfId, udfCode], i) => { | ||
| let udfOutputColumns: { attributeName: string; attributeType: string }[] = []; | ||
| if (udfLLMResponse.outputs && udfLLMResponse.outputs[udfId]) { | ||
| udfOutputColumns = udfLLMResponse.outputs[udfId].map((attr: string) => ({ | ||
| attributeName: attr, | ||
| attributeType: "binary", | ||
| })); | ||
| } |
There was a problem hiding this comment.
Follow-up to C2 (binary attributeType): thanks for explaining that binary makes sense for UDF-to-UDF pickled-object data flow. That covers internal flow, but I'd want to also flag the user-facing implication:
- Result panel viewing: the final UDF in any LLM-generated chain emits binary columns, so Texera's result panel renders them as opaque blobs (
<binary data>/ base64) rather than the dataframe / values the user expects to see after running their migrated notebook. The user has to manually edit every output column'sattributeTypebefore they can view results. - Non-UDF downstream operators: if the user (post-generation) adds a Visualize / Filter / Aggregate / SQL operator after the chain, those operators can't interpret binary columns either.
For an MVP this is a reasonable simplification (LLM prompt stays simple), but the friction shows up every time the user runs a generated workflow. Two cheap mitigations short of "make the LLM infer types" (which I agree is bigger work):
- Terminal-UDF default: treat the operator(s) with no outgoing edge as the "result-facing" one and default its outputs to
string(or some non-binary type) instead ofbinary— at least the final result renders. - Documented limitation in the class doc-comment + a frontend toast/modal after generation noting "Output columns default to binary; edit per-operator to view typed results."
Either way is fine; even just (2) would close the "user confused why results don't render" UX gap. Up to you whether to scope this here or treat it as a follow-up issue.
There was a problem hiding this comment.
Pull request overview
Adds a frontend notebook-migration LLM client that calls Texera’s LiteLLM proxy to convert Jupyter notebooks into Texera workflow JSON, alongside a prompt library and unit tests.
Changes:
- Introduces
NotebookMigrationLLMAngular service to run a 2-step LLM flow (workflow JSON + cell↔operator mapping) and assemble a Texera workflow payload. - Adds a prompt-constant library used to seed the LLM session with Texera/UDF documentation and conversion instructions.
- Adds Vitest unit tests for JSON parsing and deterministic workflow/mapping construction, and wires in
@ai-sdk/openai.
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| frontend/package.json | Adds @ai-sdk/openai dependency used by the new migration client. |
| frontend/yarn.lock | Locks the added @ai-sdk/openai dependency. |
| frontend/src/app/workspace/service/notebook-migration/migration-prompts.ts | Adds the prompt/documentation constants for notebook-to-workflow conversion. |
| frontend/src/app/workspace/service/notebook-migration/migration-llm.ts | Implements the notebook migration LLM session, prompting, parsing, and workflow/mapping assembly. |
| frontend/src/app/workspace/service/notebook-migration/migration-llm.spec.ts | Adds unit tests for parsing and conversion behavior with mocked LLM transport. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| import { Injectable } from "@angular/core"; | ||
| import { GuiConfigService } from "../../../common/service/gui-config.service"; | ||
| import { createOpenAI } from "@ai-sdk/openai"; | ||
| import { generateText, type ModelMessage } from "ai"; | ||
| import { AppSettings } from "../../../common/app-setting"; | ||
| import { v4 as uuidv4 } from "uuid"; | ||
| import { WorkflowUtilService } from "../workflow-graph/util/workflow-util.service"; | ||
| import { OperatorPredicate } from "../../types/workflow-common.interface"; |
| public initialize(modelType: string = "gpt-5-mini", apiKey: string = "dummy"): void { | ||
| this.assertEnabled(); | ||
| this.model = createOpenAI({ | ||
| baseURL: new URL(`${AppSettings.getApiEndpoint()}`, document.baseURI).toString(), | ||
| // apiKey is required by the library for creating the OpenAI compatible client; | ||
| // For security reason, we store the apiKey at the backend, thus the value is dummy here. | ||
| apiKey: apiKey, | ||
| }).chat(modelType); | ||
|
|
| settings: { | ||
| dataTransferBatchSize: number; | ||
| }; | ||
| } |
| const workflowJSON: WorkflowJSON = { | ||
| operators: [], | ||
| operatorPositions: {}, | ||
| links: [], | ||
| commentBoxes: [], | ||
| settings: { | ||
| dataTransferBatchSize: 400, | ||
| }, | ||
| }; |
| interface Cell { | ||
| cell_type: string; | ||
| metadata: { [key: string]: any }; | ||
| source: string; | ||
| } |
| private parseJsonResponse(raw: string, context: string): any { | ||
| // Trim first, then strip optional markdown code fences (```json ... ``` or ``` ... ```) | ||
| const cleaned = raw | ||
| .trim() | ||
| .replace(/^```[a-zA-Z]*\s*/, "") | ||
| .replace(/\s*```$/, "") | ||
| .trim(); | ||
| try { | ||
| return JSON.parse(cleaned); | ||
| } catch (err) { | ||
| throw new Error(`Failed to parse LLM ${context} response as JSON: ${(err as Error).message}`); | ||
| } | ||
| } |
| const llm = new NotebookMigrationLLM(stubConfig, stubUtil); | ||
| llm.initialize(); | ||
| return llm; |
| it("produces a link with an undefined endpoint when an edge references an unknown UDF id", async () => { | ||
| const notebook: Notebook = { cells: [codeCell("CELL1", "a")] }; | ||
| mockResponses( | ||
| JSON.stringify({ code: { UDF1: "c1" }, edges: [["UDF1", "UDFX"]], outputs: {} }), | ||
| JSON.stringify({ UDF1: ["CELL1"] }) | ||
| ); | ||
|
|
||
| const { workflowJSON } = JSON.parse(await makeLLM().convertNotebookToWorkflow(notebook)); | ||
|
|
||
| // Documents current behavior: udfMappingToUUID["UDFX"] is undefined. | ||
| expect(workflowJSON.links[0].source.operatorID).toBe("PythonUDFV2-0"); | ||
| expect(workflowJSON.links[0].target.operatorID).toBeUndefined(); | ||
| }); |
| it("emits the 'undefined' cell marker in the prompt when a code cell lacks metadata.uuid", async () => { | ||
| const notebook: Notebook = { cells: [codeCell(undefined, "print(1)")] }; | ||
| mockResponses( | ||
| JSON.stringify({ code: { UDF1: "c1" }, edges: [], outputs: {} }), | ||
| JSON.stringify({ UDF1: ["CELL1"] }) | ||
| ); | ||
|
|
||
| await makeLLM().convertNotebookToWorkflow(notebook); | ||
|
|
||
| // The notebook string (embedded in the workflow prompt) is sent to generateText. | ||
| // messages is a shared, mutated array, so search every message content rather than | ||
| // assuming a fixed index. | ||
| const allPromptContent = mockGenerateText.mock.calls | ||
| .flatMap(call => call[0].messages.map((m: any) => m.content)) | ||
| .join("\n"); | ||
| expect(allPromptContent).toContain("# START undefined"); | ||
| }); |
| it("does not strip a fence preceded by prose (documents current limitation)", () => { | ||
| expect(() => parse('Here is the JSON: ```json\n{"a":1}\n```\nThanks!')).toThrow( | ||
| "Failed to parse LLM workflow response as JSON" | ||
| ); | ||
| }); |
mengw15
left a comment
There was a problem hiding this comment.
Left a follow-up note on Copilot's apiKey/JWT comment.
| public initialize(modelType: string = "gpt-5-mini", apiKey: string = "dummy"): void { | ||
| this.assertEnabled(); | ||
| this.model = createOpenAI({ | ||
| baseURL: new URL(`${AppSettings.getApiEndpoint()}`, document.baseURI).toString(), | ||
| // apiKey is required by the library for creating the OpenAI compatible client; | ||
| // For security reason, we store the apiKey at the backend, thus the value is dummy here. | ||
| apiKey: apiKey, | ||
| }).chat(modelType); |
There was a problem hiding this comment.
Strengthening Copilot's apiKey/JWT note on this initialize() block: this is now a hard blocker rather than a forward-looking suggestion, because of #5421.
Before #5421, AccessControlResource.scala:251 (the /chat/* LiteLLM proxy in access-control-service) was @PermitAll. The JwtAuthFilter skipped it, and the resource body explicitly stripped the incoming Authorization header and substituted the LiteLLM master key — so apiKey: "dummy" worked because the value was discarded server-side. That's where the "for security reason, we store the apiKey at the backend, thus the value is dummy here" comment in this file came from.
After #5421 (fix(auth): require REGULAR/ADMIN role on LiteLLM proxy endpoints, merged 2026-06-07), the same resource is @RolesAllowed(Array("REGULAR", "ADMIN")). The JwtAuthFilter registered at AUTHENTICATION priority in AccessControlService.scala:79 now runs on every /api/chat/* request, parses Authorization: Bearer <…> as a JWT, and rejects malformed tokens with 401 before the request reaches the resource body.
Net effect once this PR rebases on current main:
convertNotebookToWorkflow→ 401verifyConnection→ 401- The whole migration feature is non-functional until the apiKey wiring changes.
Vercel AI SDK puts the apiKey string into the Authorization header verbatim as a Bearer token, which is the exact shape JwtAuthFilter parses, so passing the current user's JWT here should be a small wire-up (e.g. pulling the token from UserService at the call site and threading it into initialize()). The resource body still substitutes the master key before forwarding to LiteLLM, so the JWT is used only for the Texera-side auth check.
The "// apiKey is required by the library… the value is dummy here" code comment should be updated alongside, otherwise it will mislead the next reader.
What changes were proposed in this PR?
Introduces the frontend LLM session class that converts a Jupyter notebook into a Texera workflow JSON plus a bidirectional cell to operator mapping, along with the prompt library it uses. Two files under
frontend/src/app/workspace/service/notebook-migration/, totalling ~700 lines (~410 of which is prompt text).migration-llm.ts— definesNotebookMigrationLLM, an@Injectableclass wrapping a Vercel AI SDK chat session against the LiteLLM proxy already exposed onmainat/api/chat/completion.initialize(modelType, apiKey)— builds an OpenAI-compatible chat client viacreateOpenAI({ baseURL: AppSettings.getApiEndpoint() }), seeds the message history with Texera documentation assystemmessages.verifyConnection()— does a 10-tokenpingcall to validate that the API key works against the configured model.convertNotebookToWorkflow(notebook)— extracts code cells (each tagged with a UUID inmetadata.uuid), sendsWORKFLOW_PROMPT+ the notebook to get a JSON of UDF operators / edges, then sendsMAPPING_PROMPTto get the cell↔operator mapping. Assembles a complete Texera workflow JSON (PythonUDFV2operators with stub input/output ports, links derived from the LLM's edge list, default settings) plus a bidirectionaloperator_to_cell/cell_to_operatormapping. Returns both as a JSON string.close()— clears the message history and the model reference.migration-prompts.ts— string constants used bymigration-llm.ts:TEXERA_OVERVIEW,TUPLE_DOCUMENTATION,TABLE_DOCUMENTATION,OPERATOR_DOCUMENTATION,UDF_INPUT_PORT_DOCUMENTATION,EXAMPLE_OF_GOOD_CONVERSION,VISUALIZER_DOCUMENTATION,EXAMPLE_OF_MULTIPLE_UDF_CONVERSION,WORKFLOW_PROMPT,MAPPING_PROMPT.Any related issues, documentation, discussions?
Closes #5259
Parent issue #4301
How was this PR tested?
No unit tests were included for these reasons:
migration-llm.tsis parsing a response.However I am open to writing tests based on review feedback.
Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.7)