fix: keep Replicate private endpoint output as one Message #1836

Open
fallintoplace wants to merge 1 commit into NVIDIA:main from fallintoplace:fix/replicate-private-endpoint-output

Conversation

@fallintoplace
Contributor

@fallintoplace fallintoplace commented Jun 4, 2026

the private Replicate endpoint path already joins prediction.output into one string, but it then iterates over that string when building the return value.

since iterating over a Python string yields its characters, a response like "hello" becomes a list of five one-character messages instead of a single Message("hello").
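A minimal sketch of the failure mode, for illustration only. The `Message` dataclass and both helper functions below are hypothetical stand-ins, not garak's actual code; the only assumption taken from the description is that the chunks are joined into one string before the return value is built.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical stand-in for garak's Message; only the text is modeled.
    text: str


def build_messages_old(output_chunks):
    """Mimics the described bug: join the chunks, then iterate the string."""
    response = "".join(output_chunks)
    # Iterating over a str yields single characters, one Message each.
    return [Message(char) for char in response]


def build_messages_new(output_chunks):
    """Mimics the described fix: wrap the joined string exactly once."""
    response = "".join(output_chunks)
    return [Message(response)]


chunks = ["hel", "lo"]
old_result = build_messages_old(chunks)
new_result = build_messages_new(chunks)
print(len(old_result), len(new_result))
```

With the chunks `["hel", "lo"]`, the old-style helper produces one message per character, while the new-style helper produces a single message holding the full response.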

this PR:

  • returns one Message for the joined response
  • adds a regression test for chunked prediction.output
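The shape of such a regression test might look roughly like the sketch below. It does not reproduce the test added in this PR: `call_private_endpoint` is a hypothetical stand-in for the private endpoint path, the fake prediction object is built with `SimpleNamespace`, and plain strings are used in place of Message objects to keep the example self-contained.

```python
from types import SimpleNamespace


def call_private_endpoint(prediction):
    """Hypothetical stand-in for the fixed private endpoint path."""
    # prediction.output is assumed to be an iterable of string chunks.
    response = "".join(prediction.output)
    # Return a one-element list rather than iterating over the string.
    return [response]


def test_chunked_output_yields_single_message():
    # Simulate a prediction whose output arrives in several chunks.
    prediction = SimpleNamespace(output=["hel", "lo", " world"])
    result = call_private_endpoint(prediction)
    assert len(result) == 1
    assert result[0] == "hello world"


test_chunked_output_yields_single_message()
```

Asserting on the list length as well as the content is what distinguishes the two behaviors: a per-character result would still join back to the right text but would fail the length check.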

verification

  • ran PYTHONPATH=/Users/hoangvu/Code/OSS/garak /tmp/garak-pr-venv/bin/python -m pytest tests/generators/test_replicate.py
  • test passes with this change
  • the same test fails against the old behavior because _call_model() returns one message per character

I didn't run a live private Replicate endpoint check here, and I didn't run the full tests/ suite in this local environment.

Signed-off-by: Minh Vu <vuhoangminh97@gmail.com>
@fallintoplace fallintoplace changed the title fix: return one message from Replicate private endpoints fix: keep Replicate private endpoint output as one Message Jun 4, 2026
@fallintoplace fallintoplace marked this pull request as ready for review June 4, 2026 18:23
