Fine tuning ditto on one avatar image #82

@adityaVR

If we fine-tune Ditto on a single avatar image and then generate a real-time talking head for that avatar from audio, will the resulting model be lighter, and will it respond faster? If not, what are the best ways to achieve low latency with Ditto in a known, single-avatar scenario?
Does Ditto skip its identity resemblance / identity extraction step if we train it directly on the single avatar image that will also be used at inference time?
Can we precompute the identity features (and keep the appearance features in memory) and reuse them to make the process faster?
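
To make the last question concrete, here is a minimal sketch of the caching pattern I have in mind. It does not use Ditto's actual API: `AppearanceCache`, `fake_extract`, `render_frame`, and `stream` are hypothetical names, and the two placeholder functions only stand in for the appearance/identity encoder and the per-chunk generation step. The idea is that the avatar-dependent features are extracted once, kept in memory, and reused for every audio chunk.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


class AppearanceCache:
    """Hypothetical helper (not part of Ditto): caches per-avatar features in memory."""

    def __init__(self, extract: Callable[[bytes], List[float]]) -> None:
        self._extract = extract
        self._store: Dict[str, List[float]] = {}
        self.extract_calls = 0  # counts how often the expensive step actually ran

    def get(self, image_bytes: bytes) -> List[float]:
        # Key the cache on the image content so the same avatar always hits.
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._store:
            self.extract_calls += 1
            self._store[key] = self._extract(image_bytes)
        return self._store[key]


def fake_extract(image_bytes: bytes) -> List[float]:
    # Placeholder for the expensive appearance/identity encoder (assumption).
    return [b / 255.0 for b in image_bytes[:4]]


def render_frame(features: List[float], audio_chunk: str) -> Tuple[int, str]:
    # Placeholder for the audio-driven generation step (assumption).
    return (len(features), audio_chunk)


def stream(cache: AppearanceCache, image_bytes: bytes,
           audio_chunks: List[str]) -> List[Tuple[int, str]]:
    # Features are looked up once per call, then reused for every chunk.
    features = cache.get(image_bytes)
    return [render_frame(features, chunk) for chunk in audio_chunks]


cache = AppearanceCache(fake_extract)
avatar = b"\x10\x20\x30\x40 avatar image bytes"
first = stream(cache, avatar, ["chunk-0", "chunk-1"])
second = stream(cache, avatar, ["chunk-2"])
```

In this sketch the second `stream` call never reaches the extractor, so only the audio-dependent work stays on the latency-critical path. Whether Ditto's pipeline can be split this way is exactly what I am asking.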
