[Feature] Support batched generation (inference) in evo2. #1152

@jstjohn

Problem & Motivation

  • Customers who need to generate from multiple prompts have observed significant speedups with batched generation.
  • Currently the FIR/IIR state is maintained without a batch index, so supporting batching requires introducing a batch index in inference_context.fir_state etc. in the inference kernels.

BioNeMo Framework Version

cd74c2b

Category

Inference

Proposed Solution

  • Add a batch index to the FIR/IIR state (inference_context.fir_state etc.) in NeMo, which is currently maintained without one.
  • Add test coverage for batched inference.

Expected Benefits

  • Significant (10x+) performance gains for many shorter generations.

Code Example
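
The following is an illustrative sketch only, not the evo2 or NeMo implementation. It shows what carrying a leading batch dimension in a cached FIR state could look like for a causal depthwise filter, and how a test might check that batched decoding matches per-prompt decoding. NumPy stands in for the real tensors, and the names fir_step, init_fir_state, and run_fir are hypothetical; only the idea of a per-sequence fir_state comes from this issue.

```python
import numpy as np


def init_fir_state(batch, channels, k, dtype=np.float64):
    """Hypothetical helper: zeroed cache holding the previous k-1 inputs per sequence."""
    return np.zeros((batch, channels, k - 1), dtype=dtype)


def fir_step(x_t, fir_state, weight):
    """One decoding step of a causal depthwise FIR filter with a batched state.

    x_t:       (batch, channels)         newest input for each sequence
    fir_state: (batch, channels, k - 1)  previous inputs, oldest first
    weight:    (channels, k)             filter taps; weight[:, -1] multiplies x_t
    """
    # Append the newest input to each sequence's window: (batch, channels, k).
    window = np.concatenate([fir_state, x_t[..., None]], axis=-1)
    # Apply each channel's taps independently for every batch element.
    y_t = np.einsum("bck,ck->bc", window, weight)
    # Drop the oldest input so the state keeps exactly k-1 entries.
    new_state = window[..., 1:]
    return y_t, new_state


def run_fir(x, weight):
    """Decode step by step. x has shape (batch, seq_len, channels)."""
    batch, seq_len, channels = x.shape
    state = init_fir_state(batch, channels, weight.shape[1], dtype=x.dtype)
    outputs = []
    for t in range(seq_len):
        y_t, state = fir_step(x[:, t], state, weight)
        outputs.append(y_t)
    return np.stack(outputs, axis=1)


# Equivalence check: one batched run versus one run per prompt.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 3))   # 4 prompts, 6 steps, 3 channels
w = rng.standard_normal((3, 4))      # filter length k = 4
batched = run_fir(x, w)
single = np.concatenate([run_fir(x[i:i + 1], w) for i in range(4)], axis=0)
assert np.allclose(batched, single)
```

A test of this shape (batched output equals the stacked single-prompt outputs) is one way the proposed batched-inference coverage could be structured. Prompts of different lengths would additionally need padding or per-sequence position tracking, which this sketch does not address.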
