Extend expiring logit bias to other sampling parameters #1770

Open
dungquixote42 wants to merge 21 commits into ikawrakow:main from dungquixote42:elb-multisampler-1

@dungquixote42
Contributor

dungquixote42 commented May 10, 2026

Previous PR: #1731

This PR extends the previous expiring logit bias to other sampling parameters such as temp, top_k and top_p. They can be modified and restored during generation based on specific phrases or durations. It also includes some minor bug fixes.

Syntax + Mechanism

(DURATION : PHRASE ... PHRASE : SPARAM ~ VALUE, ...)

  • If unspecified, DURATION defaults to 1 for unnested parentheses and -1 (near infinite) for nested parentheses.
  • Each PHRASE is matched as a plain string; phrases are not tokenized.
  • VALUE is added to SPARAM on every odd-numbered match of a PHRASE (1st, 3rd, ...) and subtracted on every even-numbered match (2nd, 4th, ...).
  • If no PHRASE is specified, VALUE is added to SPARAM once at the beginning of the phase and subtracted on expiry.
  • SPARAM is restored when the duration runs out or when an exitword match triggers a phase change.
  • "EXITWORD" >> jumps to the phase that follows it as soon as EXITWORD is matched. It is mainly intended for the end-of-thinking token.
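
The add-on-odd, subtract-on-even rule can be illustrated with a small Python model. This is only a sketch of the behavior described in the list above, not the code in this PR: `ToggleRule`, `on_text` and `expire` are hypothetical names, matching is approximated with `str.endswith` on the text generated so far, and DURATION counting is left to the caller, which is assumed to call `expire` when the duration runs out or the phase changes.

```python
from dataclasses import dataclass, field


@dataclass
class ToggleRule:
    """Hypothetical model of one (DURATION : PHRASE ... : SPARAM ~ VALUE, ...) entry."""
    phrases: list[str]
    deltas: dict[str, float]                      # SPARAM name -> VALUE
    match_counts: dict[str, int] = field(default_factory=dict)

    def on_text(self, text: str, params: dict[str, float]) -> None:
        """Toggle the deltas for each phrase that the generated text now ends with."""
        for phrase in self.phrases:
            if text.endswith(phrase):
                n = self.match_counts.get(phrase, 0) + 1
                self.match_counts[phrase] = n
                # Odd-numbered match adds VALUE, even-numbered match subtracts it.
                sign = 1.0 if n % 2 == 1 else -1.0
                for name, value in self.deltas.items():
                    params[name] += sign * value

    def expire(self, params: dict[str, float]) -> None:
        """Restore SPARAM: undo every delta that is still applied (odd match count)."""
        for n in self.match_counts.values():
            if n % 2 == 1:
                for name, value in self.deltas.items():
                    params[name] -= value
        self.match_counts.clear()


# Baseline values here are made up for the illustration.
params = {"temp": 0.8, "min_p": 0.05}
rule = ToggleRule(phrases=["<think>", "`"], deltas={"temp": -0.2, "min_p": 0.01})

rule.on_text("<think>", params)           # 1st "<think>" match: temp drops to about 0.6
rule.on_text("<think>Use `", params)      # opening backtick stacks: temp about 0.4
rule.on_text("<think>Use `foo`", params)  # closing backtick reverts its own change
rule.expire(params)                       # expiry undoes the still-active "<think>" change
```

Match counts are kept per phrase in this sketch, so a change triggered by one phrase is only reverted by a second match of the same phrase or by expiry.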

See X_COMMON_PARAMS_SAMPLING in sampling.h for the full list of supported SPARAM.

Changed Behaviors

  • Nested entries can now be defined multiple times, and the definitions are combined. Previously, the last definition overwrote the earlier ones.
  • Nested entries now apply to the current phase and all following phases, including the last one. To exclude them from the last phase, put (()) before the last exitword.

Use Cases

Productivity, coding, etc:

# the begin-think tag and an opening backtick each reduce temp by 0.2 and increase min_p by 0.01
(-1: "<think>", "`": temp ~ -0.2, min_p ~ 0.01)

# begin-think tag != end-think tag, so the change will stick around until expiration by duration or exitword
# but closing backtick will revert temp and min_p changes caused by previous opening backtick
# changes from different PHRASE will stack if one is triggered before the other is expired

Hot -> Cold -> Neutral

######## paragraph 1

(-1: temp ~ 0.4, top_p ~ 0.05, xtc_probability ~ -0.2)

######## paragraph 1 end
\n\n
######## paragraph 2

(-1: temp ~ -0.4, top_p ~ -0.05, min_p ~ -0.05)

######## paragraph 2 end
\n\n
######## paragraph 3

Chill -> Warm -> Hot

######## paragraph 1

(-1: temp ~ -0.1)

######## paragraph 1 end
\n\n
######## paragraph 2

(-1: temp ~ 0.1, dynatemp_range ~ 0.2)

######## paragraph 2 end
\n\n
######## paragraph 3

(-1: temp ~ -0.3, xtc_probability ~ 0.4, top_n_sigma ~ 1.5)
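
For entries without a PHRASE, the value is added at the start of the phase and subtracted when the phase ends, so the offsets of successive phases do not accumulate. The sketch below works through the Chill -> Warm -> Hot example under that reading. `per_phase_values` is a hypothetical helper and the baseline values are made up for the illustration.

```python
def per_phase_values(
    base: dict[str, float],
    phase_deltas: list[dict[str, float]],
) -> list[dict[str, float]]:
    """Return the effective sampling parameters inside each phase.

    Assumes every phase starts from the restored baseline, i.e. the previous
    phase's changes were subtracted again when that phase ended.
    """
    result = []
    for deltas in phase_deltas:
        active = dict(base)                      # copy, so the baseline stays untouched
        for name, value in deltas.items():
            active[name] = active.get(name, 0.0) + value
        result.append(active)
    return result


# Hypothetical baseline; the three dicts mirror the three paragraphs above.
base = {"temp": 0.8, "dynatemp_range": 0.0, "xtc_probability": 0.0, "top_n_sigma": 1.0}
phases = per_phase_values(
    base,
    [
        {"temp": -0.1},
        {"temp": 0.1, "dynatemp_range": 0.2},
        {"temp": -0.3, "xtc_probability": 0.4, "top_n_sigma": 1.5},
    ],
)
for index, values in enumerate(phases, start=1):
    print(f"paragraph {index}: temp is about {values['temp']:.2f}")
```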

Early exit with >>

\n\n
(-1: temp ~ 0.2)
\n\n
(-1: temp ~ 0.1)
\n\n
(-1: temp ~ -0.1)
\n\n
(-1: temp ~ -0.3)
("In conclusion," : 999)
\n\n
(-1: temp ~ -0.1)
\n\n
(-1: temp ~ 0.1)
\n\n
(-1: temp ~ 0.3)
\n\n
(-1: temp ~ 0.1)

######## jump here if end of thinking
"<channel|>" >>
######## response
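
The early-exit behavior can be sketched as a small phase-selection function. This is an illustration of one reading of the description, not the PR's implementation: `Exit` and `next_phase` are hypothetical names, an ordinary exitword is assumed to be checked only at the end of its own phase, a `>>` exitword is assumed to be checked from every earlier phase, and matching is approximated with `str.endswith` on the text generated so far.

```python
from dataclasses import dataclass


@dataclass
class Exit:
    word: str
    jump: bool = False   # True models the `"EXITWORD" >>` form


def next_phase(current: int, exits: list[Exit], text: str) -> int:
    """Return the phase index to use after `text` has been generated.

    exits[i] is the exitword that separates phase i from phase i + 1.
    """
    for i in range(current, len(exits)):
        is_own_boundary = i == current
        if (is_own_boundary or exits[i].jump) and text.endswith(exits[i].word):
            return i + 1
    return current


# Three paragraph separators followed by a jump exitword, as in the example above.
exits = [Exit("\n\n"), Exit("\n\n"), Exit("\n\n"), Exit("<channel|>", jump=True)]

phase = next_phase(0, exits, "First paragraph.\n\n")       # ordinary advance to phase 1
phase = next_phase(phase, exits, "Short answer.<channel|>")  # skips ahead to the response phase
```

With this model, a short reasoning trace that ends before all paragraph phases are used still lands in the response phase, which is the purpose of `>>` described above.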

Hybrid use

# opening backtick decreases temp
(("`" : temp ~ -0.4))
# opening double quote increases temp
(("\"" : temp ~ 0.3))

Inline comments are now supported.

\n\n    # paragraph 2 below
\n\n    # paragraph 3 below
("</think>" : 999)    # trigger end of thinking at paragraph 3
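
One way to handle inline comments without breaking phrases that contain `#` is to scan each line and treat `#` as a comment start only outside double quotes. The function below is a sketch under that assumption; `strip_inline_comment` is a hypothetical name, and the actual parser in this PR may apply different rules.

```python
def strip_inline_comment(line: str) -> str:
    """Drop a trailing `# ...` comment, ignoring `#` inside double-quoted phrases."""
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes and ch == "\\":
            i += 2                      # skip an escaped character such as \" in a phrase
            continue
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "#" and not in_quotes:
            return line[:i].rstrip()    # everything before the comment, without padding
        i += 1
    return line.rstrip()


# Raw strings keep the backslashes exactly as they appear in the file.
print(strip_inline_comment(r'\n\n    # paragraph 2 below'))
print(strip_inline_comment('("</think>" : 999)    # trigger end of thinking at paragraph 3'))
```

A line that consists only of a comment, such as the `########` separators used in the examples, reduces to an empty string under this scheme.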

Honorable Mention for Gemma 4:

# Think-in-the-middle pattern (after paragraph 2)
\n\n
\n\n
("<think>\nOkay, let's check if I am following the user instruction closely." : 999)

I suspect this works for Gemma 4 because its tool call uses <think>. Perhaps it is drawing from other models' reasoning traces?

@Ph0rk0z

Ph0rk0z commented May 15, 2026

For the previous version, I notice it sometimes doesn't fire, but reasoning budget has been working surprisingly well. I also wanted <think> as the token to start it off, but that may be getting added via the template :(

@dungquixote42
Contributor Author

> For the previous version, I notice it sometimes doesn't fire, but reasoning budget has been working surprisingly well. I also wanted <think> as the token to start it off, but that may be getting added via the template :(

Would you share your file content? I can take a look.

@Ph0rk0z

Ph0rk0z commented May 16, 2026

Yeah, it's what I had before. I tried to change the first \n\n to think, but then it caught nothing.


\n\n
\n\n
\n\n
\n\n
\n\n
# trigger end of thinking after paragraph 3, sentence 1
("</think>\n\n" : 999)
(())

@dungquixote42
Contributor Author

> Yeah, it's what I had before. I tried to change the first \n\n to think, but then it caught nothing.
>
> \n\n
> \n\n
> \n\n
> \n\n
> \n\n
> # trigger end of thinking after paragraph 3, sentence 1
> ("</think>\n\n" : 999)
> (())

\n\n    # paragraph 2 below
\n\n    # paragraph 3 below
("</think>" : 999)    # trigger end of thinking at paragraph 3

Try this with this PR. It seems to work fine for me.

@dungquixote42 dungquixote42 marked this pull request as ready for review May 16, 2026 21:44