Question
I would like to do some mech interp on a generative video model (a diffusion model with temporal attention blocks). Note that this model would have no text conditioning; it would just be a transformer predicting the next video frame (like Sora and other SOTA generative video models). It will be video-to-video (V2V). As far as I know, the current version of TransformerLens does not support a model like this, but I would like to get started on my research quite soon and hence would like to tailor TransformerLens to handle it. Two questions:
- Does it make sense for me to expand TransformerLens to this case? (i.e., would it be better to just start from scratch and not use TransformerLens for this?)
- How would I go about doing this?
While question 1 may have a simple answer, I realize that question 2 may not. I am happy to have a longer conversation about this and get working on it after that.
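For concreteness, here is a very rough sketch of the kind of temporal attention block I would want to instrument, wrapped in TransformerLens-style hook points. Everything here (`TemporalAttentionBlock`, `HookedVideoDenoiser`, the shapes and hyperparameters) is a placeholder I made up to illustrate what I mean, not the real model:

```python
# Rough sketch only -- all module names, shapes, and the toy wrapper below are
# placeholders for illustration, not the actual video model.
import torch
import torch.nn as nn
from transformer_lens.hook_points import HookedRootModule, HookPoint


class TemporalAttentionBlock(nn.Module):
    """Self-attention over the frame axis of a video latent [batch, frames, tokens, d_model]."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.hook_attn_in = HookPoint()   # activations entering temporal attention
        self.hook_attn_out = HookPoint()  # activations leaving temporal attention

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, f, t, d = x.shape
        # Fold spatial tokens into the batch so attention only mixes information across frames.
        x = x.permute(0, 2, 1, 3).reshape(b * t, f, d)
        x = self.hook_attn_in(x)
        out, _ = self.attn(x, x, x)
        out = self.hook_attn_out(out)
        return out.reshape(b, t, f, d).permute(0, 2, 1, 3)


class HookedVideoDenoiser(HookedRootModule):
    """Toy stand-in for one denoising pass, just to show the hook plumbing."""

    def __init__(self, d_model: int = 64, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.blocks = nn.ModuleList(
            TemporalAttentionBlock(d_model, n_heads) for _ in range(n_layers)
        )
        self.setup()  # registers every HookPoint so run_with_cache / run_with_hooks work

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            x = x + block(x)  # residual stream over video latents
        return x


model = HookedVideoDenoiser()
latents = torch.randn(1, 8, 16, 64)  # [batch=1, frames=8, tokens=16, d_model=64]
_, cache = model.run_with_cache(latents)
print(cache["blocks.0.hook_attn_out"].shape)  # torch.Size([16, 8, 64])
```

The appeal of building on TransformerLens rather than starting from scratch would be getting `run_with_cache` / `run_with_hooks` and the surrounding tooling for free and only having to add the video/diffusion-specific pieces, but I am not sure how much of the existing `HookedTransformer` machinery would actually carry over.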