1.1x prefill and decode speedup (attention/activations) #4624
