Move placeholder and reshape recognition from getShapeSourceCalls to TensorGenerator dispatch #548

@khatchad

Description

Summary

PythonTensorAnalysisEngine.getShapeSourceCalls is a legacy recognition mechanism used at three sites in performAnalysis:

  1. set_shape (line 880): finds x.set_shape(s) sites; pins shape on receiver via SetShapeOp edge transfer.
  2. placeholder (line 870, via handleShapeSourceOp): finds tf.placeholder(dtype, shape) sites; seeds call's def with the shape arg via init.put.
  3. reshape (line 921, via handleShapeSourceOp): finds tf.reshape(t, s) sites; installs ReshapeOp edge transfer for dim-inference.

For (2) and (3), the recognition could move into the standard TensorGenerator dispatch path that handles every other tensor op in the codebase. The case for (1) is harder: set_shape mutates the receiver rather than its own def, which doesn't fit the standard "generator seeds the call's def" pattern. So (1) needs to keep some form of special-casing (tracked separately as #509).

Proposed Refactor

placeholder (direct fit)

tf.placeholder(dtype, shape) produces a fresh tensor with the supplied shape and dtype. A Placeholder TensorGenerator that reads the shape arg in getDefaultShapes and the dtype arg in getDefaultDTypes is the natural fit, because the seed via init.put(callDef, generator.getTensorTypes()) already happens at the standard dispatch. The special handleShapeSourceOp(placeholder, ...) block in performAnalysis then becomes redundant.
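To make the "direct fit" concrete, here is a minimal sketch of the shape of the idea. Every type in it (TensorType, Generator, Placeholder, seed) is a hypothetical stand-in written for this illustration and is not the project's real API. It only shows that, once a generator can report a tensor type from the call's own arguments, the seeding step needs no placeholder-specific branch.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins only: these are not the project's real classes or signatures.
final class PlaceholderSeedSketch {
    private PlaceholderSeedSketch() {}

    /** A (dtype, dims) pair standing in for the analysis's tensor type. */
    static final class TensorType {
        final String dtype;
        final List<Integer> dims;

        TensorType(String dtype, List<Integer> dims) {
            this.dtype = dtype;
            this.dims = dims;
        }
    }

    /** Minimal generator contract assumed for this sketch. */
    interface Generator {
        Set<TensorType> getTensorTypes();
    }

    /** Builds its tensor type directly from the call's dtype and shape arguments. */
    static final class Placeholder implements Generator {
        private final String dtypeArg;
        private final List<Integer> shapeArg;

        Placeholder(String dtypeArg, List<Integer> shapeArg) {
            this.dtypeArg = dtypeArg;
            this.shapeArg = shapeArg;
        }

        @Override
        public Set<TensorType> getTensorTypes() {
            return Collections.singleton(new TensorType(dtypeArg, shapeArg));
        }
    }

    /** The single seeding step shared by every generator; nothing here is placeholder-specific. */
    static void seed(Map<Integer, Set<TensorType>> init, int callDef, Generator generator) {
        init.put(callDef, generator.getTensorTypes());
    }
}
```

In this sketch, calling seed(init, callDef, new Placeholder("float32", Arrays.asList(32, 784))) records a single tensor type for the call's def, which is the job the separate handleShapeSourceOp block performs today.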

reshape (partial fit)

Reshape already has a TensorGenerator (Reshape.java). Its getShapes parses the shape arg and resolves -1 dimensions when the input shape is fully known. The legacy ReshapeOp edge transfer in TensorTypeAnalysis does the same -1 resolution, but as a dataflow transfer, so it fires whenever the input tensor's type changes during analysis, not just at the call site's first evaluation.
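For readers unfamiliar with the -1 rule, the following is a small standalone sketch of the arithmetic, not a copy of Reshape.java or ReshapeOp. The class and method names are hypothetical, and the convention that a negative input dimension means "unknown" is an assumption made for the example.

```java
// Hypothetical helper for illustration; not the project's Reshape.java.
final class ReshapeDims {
    private ReshapeDims() {}

    /**
     * Resolves at most one -1 entry in target so that its element count matches input.
     * Returns null when the input is not fully known or the target is inconsistent.
     */
    static long[] resolve(long[] input, long[] target) {
        long total = 1;
        for (long d : input) {
            if (d < 0) {
                return null; // assumption: a negative dim marks an unknown input dimension
            }
            total *= d;
        }

        long known = 1;
        int wildcard = -1;
        for (int i = 0; i < target.length; i++) {
            if (target[i] == -1) {
                if (wildcard >= 0) {
                    return null; // more than one -1 cannot be resolved
                }
                wildcard = i;
            } else if (target[i] < 0) {
                return null; // any other negative dim is invalid
            } else {
                known *= target[i];
            }
        }

        long[] out = target.clone();
        if (wildcard < 0) {
            return known == total ? out : null; // nothing to infer, only check consistency
        }
        if (known == 0 || total % known != 0) {
            return null; // the remaining elements do not divide evenly
        }
        out[wildcard] = total / known;
        return out;
    }
}
```

For example, an input of shape (2, 3, 4) reshaped to (-1, 6) has 24 elements, so the -1 resolves to 4. The null result for a not-fully-known input is the case where the timing of the computation starts to matter.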

That's a real semantic difference: a one-time computation at the call site in the generator vs. a propagation-driven update in the transfer. The transfer catches cases where the input's tensor type becomes known only after the reshape call site has been initially evaluated. Moving everything into the generator might lose those cases.

Suggested approach: move Reshape's recognition out of getShapeSourceCalls, but keep the ReshapeOp edge transfer (or convert it into a per-node post-processing step). The generator handles the call-site-local computation; the transfer handles propagation-driven refinement.
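The split can be illustrated with a toy model, under the assumption that the generator runs once per call site and the transfer re-runs whenever the input's type changes. The class below is hypothetical and deliberately simplified: it assumes at most one -1 in the target and non-negative input dimensions, and it does not model the real TensorTypeAnalysis.

```java
import java.util.Arrays;

// Hypothetical sketch; none of these names come from the project.
final class ReshapeRefinementSketch {
    private final long[] target; // requested shape; assumed to contain at most one -1
    private long[] result;       // null until the output shape is known

    ReshapeRefinementSketch(long[] target) {
        this.target = target.clone();
    }

    /** Generator role: a single evaluation at the call site; input may still be unknown (null). */
    void evaluateAtCallSite(long[] inputOrNull) {
        result = compute(inputOrNull);
    }

    /** Transfer role: re-run when the input's type changes; returns true if the result changed. */
    boolean onInputChanged(long[] input) {
        long[] updated = compute(input);
        boolean changed = !Arrays.equals(result, updated);
        result = updated;
        return changed;
    }

    long[] result() {
        return result == null ? null : result.clone();
    }

    private long[] compute(long[] input) {
        if (input == null) {
            return null; // nothing to infer from yet
        }
        long total = 1;
        for (long d : input) {
            total *= d;
        }
        long known = 1;
        int wildcard = -1;
        for (int i = 0; i < target.length; i++) {
            if (target[i] == -1) {
                wildcard = i;
            } else {
                known *= target[i];
            }
        }
        long[] out = target.clone();
        if (wildcard >= 0) {
            if (known == 0 || total % known != 0) {
                return null;
            }
            out[wildcard] = total / known;
        } else if (known != total) {
            return null;
        }
        return out;
    }
}
```

If the input is unknown when the call site is first evaluated, result() stays null until onInputChanged supplies a shape. That later update is the refinement that would be lost if the -1 resolution lived only in the generator.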

Out of Scope

set_shape's receiver-mutating semantics don't fit the generator framework; that's tracked in #509. This issue is specifically about extracting the placeholder and reshape arms from the legacy getShapeSourceCalls path.

Background

Surfaced during the #509 design discussion. getShapeSourceCalls predates the modern TensorGenerator factory dispatch; the cleanup tightens the recognition boundary for ops that already have generators.
