Add has_bos to Jetstream decode request #1692

Merged
copybara-service[bot] merged 1 commit into main from bos_jetstream on May 7, 2025
@SurbhiJainUSC (Collaborator) commented May 6, 2025

Description

This PR sets has_bos=True in the JetStream decode request, because generate_distillation_data.py applies the chat template to the prompt before sending it to JetStream for decoding. Previously, JetStream unconditionally prepended a BOS token to every prompt before running inference, so a prompt that had already been templated could reach the model with a second BOS token. This was corrupting our distillation results. A PR in JetStream fixes the issue by adding a new argument to the DecodeRequest proto.
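The effect of the flag can be illustrated with a minimal, self-contained sketch. It does not use the real JetStream client or proto: `DecodeRequestSketch`, `apply_chat_template`, `server_prepare`, and the `<bos>` string are hypothetical stand-ins, and the sketch assumes the chat template itself emits the BOS token. Only the `has_bos` name comes from this PR.

```python
from dataclasses import dataclass

BOS = "<bos>"  # hypothetical BOS marker; a real tokenizer would use a token id


@dataclass
class DecodeRequestSketch:
    """Hypothetical stand-in for the DecodeRequest proto (not the real message)."""
    text: str
    has_bos: bool = False  # True means the client already added BOS


def apply_chat_template(prompt: str) -> str:
    """Toy chat template; assumed here to emit BOS itself."""
    return f"{BOS}<user>{prompt}</user>"


def server_prepare(request: DecodeRequestSketch) -> str:
    """Toy server-side step: prepend BOS only if the client has not done so."""
    if request.has_bos:
        return request.text
    return BOS + request.text


templated = apply_chat_template("Hello")

# Without the flag, the server adds a second BOS on top of the template's BOS.
without_flag = server_prepare(DecodeRequestSketch(text=templated))

# With has_bos=True, the templated prompt is passed through unchanged.
with_flag = server_prepare(DecodeRequestSketch(text=templated, has_bos=True))

print(without_flag.count(BOS), with_flag.count(BOS))
```

In this toy setup the unflagged request ends up with two BOS markers and the flagged request with one, which is the duplication the description refers to.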

Notice 1: Once all tests pass, the "pull ready" label will automatically be assigned.
This label is used for administrative purposes. Please do not add it manually.

Notice 2: For external contributions, our settings currently require an approval from a MaxText maintainer to trigger CI tests.

Tests

Tested by running generate_distillation_data.py

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed.

This reverts commit ebc49030f1790a0ff2a7176586f23c14c3ff51c6.
@copybara-service[bot] merged commit 0add45d into main May 7, 2025
28 of 35 checks passed
@copybara-service[bot] deleted the bos_jetstream branch May 7, 2025 23:09
3 participants