Commit df427ba

LiqinruiG and liqinrui authored
[Docs] add request params (PaddlePaddle#5207)
* [BugFix] rollback max_tokens and min_tokens when continue to infer
* [BugFix] rollback max_tokens and min_tokens when continue to infer
* [fix] add more logger info: max_tokens
* [Docs] add request params

---------

Co-authored-by: liqinrui <liqinrui@baidu.com>
1 parent cead6b2 commit df427ba

2 files changed

Lines changed: 159 additions & 0 deletions

docs/online_serving/README.md

Lines changed: 85 additions & 0 deletions
@@ -130,6 +130,17 @@ user: Optional[str] = None
 metadata: Optional[dict] = None
 # Additional metadata, used for passing custom information (such as request ID, debug markers, etc.).

+n: Optional[int] = 1
+# Number of candidate outputs to generate (i.e., return multiple independent text completions). Default 1 (return only one result).
+
+seed: Optional[int] = Field(default=None, ge=0, le=922337203685477580)
+# Random seed for controlling deterministic generation (same seed + input yields identical results).
+# Must be in range `[0, 922337203685477580]`. Default None means no fixed seed.
+
+stop: Optional[Union[str, List[str]]] = Field(default_factory=list)
+# Stop generation conditions - can be a single string or list of strings.
+# Generation terminates when any stop string is produced (default empty list means disabled).
+
 ```

 ### Additional Parameters Added by FastDeploy
@@ -160,6 +171,11 @@ bad_words_token_ids: Optional[List[int]] = None

 repetition_penalty: Optional[float] = None
 # Repetition penalty coefficient, reducing the probability of repeating already generated tokens (`>1.0` suppresses repetition, `<1.0` encourages repetition, default None means disabled).
+
+stop_token_ids: Optional[List[int]] = Field(default_factory=list)
+# Stop generation token IDs - list of token IDs that trigger early termination when generated.
+# Typically used alongside `stop` for complementary stopping conditions (default empty list means disabled).
+
 ```

 The following extra parameters are supported:
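The `stop` and `stop_token_ids` fields documented above describe two complementary stopping conditions: one on the decoded text, one on raw token IDs. The following is a minimal sketch of how a decoding loop might combine them. It is not FastDeploy's implementation; the function name `stop_reason` and the returned labels are hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def stop_reason(
    text: str,
    last_token_id: int,
    stop: List[str],
    stop_token_ids: List[int],
) -> Optional[str]:
    """Return why generation should stop, or None to keep generating.

    Hypothetical helper: the name and return labels are assumptions,
    not part of the FastDeploy API.
    """
    # Token-level check: the most recent token is one of the stop IDs.
    if last_token_id in stop_token_ids:
        return "stop_token_id"
    # Text-level check: any non-empty stop string appears in the decoded text.
    for s in stop:
        if s and s in text:
            return "stop_string"
    # Empty lists (the documented defaults) never trigger a stop.
    return None
```

Checking token IDs first is an arbitrary choice in this sketch; a real engine may order or report the two conditions differently.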
@@ -202,6 +218,19 @@ temp_scaled_logprobs: Optional[bool] = False

 top_p_normalized_logprobs: Optional[bool] = False
 # Whether to perform top-p normalization when calculating logprobs (default is False, indicating that top-p normalization is not performed).
+
+include_draft_logprobs: Optional[bool] = False
+# Whether to return log probabilities during draft stages (e.g., pre-generation or intermediate steps)
+# for debugging or analysis of the generation process (default False means not returned).
+
+logits_processors_args: Optional[Dict] = None
+# Additional arguments for logits processors, enabling customization of generation logic
+# (e.g., dynamically adjusting probability distributions).
+
+mm_hashes: Optional[list] = None
+# Hash values for multimodal (e.g., image/audio) inputs, used for verification or tracking.
+# Default None indicates no multimodal input or hash validation required.
+
 ```

 ### Differences in Return Fields
@@ -351,6 +380,39 @@ max_tokens: Optional[int] = None

 presence_penalty: Optional[float] = None
 # Presence penalty coefficient, reducing the probability of generating new topics (unseen topics) (`>1.0` suppresses new topics, `<1.0` encourages new topics).
+
+echo: Optional[bool] = False
+# Whether to include the input prompt in the generated output (default: `False`, i.e., exclude the prompt).
+
+n: Optional[int] = 1
+# Number of candidate outputs to generate (i.e., return multiple independent text completions). Default 1 (return only one result).
+
+seed: Optional[int] = Field(default=None, ge=0, le=922337203685477580)
+# Random seed for controlling deterministic generation (same seed + input yields identical results).
+# Must be in range `[0, 922337203685477580]`. Default None means no fixed seed.
+
+stop: Optional[Union[str, List[str]]] = Field(default_factory=list)
+# Stop generation conditions - can be a single string or list of strings.
+# Generation terminates when any stop string is produced (default empty list means disabled).
+
+stream: Optional[bool] = False
+# Whether to enable streaming output (return results token by token), default `False` (returns complete results at once).
+
+stream_options: Optional[StreamOptions] = None
+# Additional configurations for streaming output (such as chunk size, timeout, etc.), refer to the specific definition of `StreamOptions`.
+
+temperature: Optional[float] = None
+# Temperature coefficient, controlling generation randomness (`0.0` for deterministic generation, `>1.0` for more randomness, default `None` uses model default).
+
+top_p: Optional[float] = None
+# Nucleus sampling threshold, only retaining tokens whose cumulative probability exceeds `top_p` (default `None` disables).
+
+response_format: Optional[AnyResponseFormat] = None
+# Specifies the output format (such as JSON, XML, etc.), requires passing a predefined format configuration object.
+
+user: Optional[str] = None
+# User identifier, used for tracking or distinguishing requests from different users (default `None` does not pass).
+
 ```

 ### Additional Parameters Added by FastDeploy
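To make the constraints in the hunk above concrete, here is a small client-side sketch that assembles a completion request body using `echo`, `n`, `seed`, `stop`, and `stream`. The helper `build_completion_payload` is hypothetical and not part of FastDeploy. The seed bounds are copied from the `Field(ge=0, le=922337203685477580)` declaration, and the sketch assumes the server accepts a JSON body with these field names. Other fields a real request may need, such as a model name, are omitted.

```python
import json
from typing import List, Optional, Union

# Upper bound copied from the Field(...) constraint shown in the diff.
SEED_MAX = 922337203685477580


def build_completion_payload(
    prompt: str,
    n: int = 1,
    seed: Optional[int] = None,
    stop: Optional[Union[str, List[str]]] = None,
    echo: bool = False,
    stream: bool = False,
) -> dict:
    """Assemble a request body (illustrative helper, not a FastDeploy API)."""
    # Mirror the documented range check before sending the request.
    if seed is not None and not 0 <= seed <= SEED_MAX:
        raise ValueError(f"seed must be in [0, {SEED_MAX}]")
    # `stop` may be a single string or a list; normalise it to a list.
    if stop is None:
        stop_list: List[str] = []
    elif isinstance(stop, str):
        stop_list = [stop]
    else:
        stop_list = list(stop)
    return {
        "prompt": prompt,
        "n": n,
        "seed": seed,
        "stop": stop_list,
        "echo": echo,
        "stream": stream,
    }


payload = build_completion_payload("Hello", n=2, seed=42, stop="\n\n")
body = json.dumps(payload)  # JSON string ready to send as the request body
```

Validating on the client only duplicates the server-side check, but it surfaces an out-of-range seed before a request is made.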
@@ -379,6 +441,10 @@ bad_words: Optional[List[str]] = None
 bad_words_token_ids: Optional[List[int]] = None
 # List of forbidden token ids that the model should avoid generating (default None means no restriction).

+stop_token_ids: Optional[List[int]] = Field(default_factory=list)
+# Stop generation token IDs - list of token IDs that trigger early termination when generated.
+# Typically used alongside `stop` for complementary stopping conditions (default empty list means disabled).
+
 repetition_penalty: Optional[float] = None
 # Repetition penalty coefficient, reducing the probability of repeating already generated tokens (`>1.0` suppresses repetition, `<1.0` encourages repetition, default None means disabled).
 ```
@@ -402,6 +468,25 @@ return_token_ids: Optional[bool] = None

 prompt_token_ids: Optional[List[int]] = None
 # Directly passes the token ID list of the prompt, skipping the text encoding step (default None means using text input).
+
+temp_scaled_logprobs: Optional[bool] = False
+# Whether to divide the logits by the temperature coefficient when calculating logprobs (default is False, meaning the logits are not divided by the temperature coefficient).
+
+top_p_normalized_logprobs: Optional[bool] = False
+# Whether to perform top-p normalization when calculating logprobs (default is False, indicating that top-p normalization is not performed).
+
+include_draft_logprobs: Optional[bool] = False
+# Whether to return log probabilities during draft stages (e.g., pre-generation or intermediate steps)
+# for debugging or analysis of the generation process (default False means not returned).
+
+logits_processors_args: Optional[Dict] = None
+# Additional arguments for logits processors, enabling customization of generation logic
+# (e.g., dynamically adjusting probability distributions).
+
+mm_hashes: Optional[list] = None
+# Hash values for multimodal (e.g., image/audio) inputs, used for verification or tracking.
+# Default None indicates no multimodal input or hash validation required.
+
 ```

 ### Overview of Return Parameters
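The `temp_scaled_logprobs` and `top_p_normalized_logprobs` flags added above change how reported log probabilities are computed. The sketch below shows one plausible reading of each flag using only the standard library: dividing logits by the temperature before the log-softmax, and renormalizing over the nucleus of tokens kept by top-p. The function names are hypothetical, and FastDeploy's actual computation may differ in detail.

```python
import math
from typing import Dict, List


def log_softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Log-softmax over logits, optionally temperature-scaled.

    Assumed reading of `temp_scaled_logprobs=True`: pass the sampling
    temperature here; requires temperature > 0.
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def top_p_renormalize(logprobs: List[float], top_p: float) -> Dict[int, float]:
    """Renormalize log probabilities over the top-p nucleus.

    Assumed reading of `top_p_normalized_logprobs=True`: keep the smallest
    set of highest-probability tokens whose cumulative probability reaches
    `top_p`, then rescale so the kept probabilities sum to 1.
    """
    probs = [math.exp(lp) for lp in logprobs]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Map token index -> renormalized log probability.
    return {i: math.log(probs[i] / mass) for i in kept}
```

With both flags left at their default of `False`, the reported values would correspond to `log_softmax(logits)` with no scaling or truncation under this reading.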

docs/zh/online_serving/README.md

Lines changed: 74 additions & 0 deletions
@@ -130,6 +130,15 @@ user: Optional[str] = None
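 metadata: Optional[dict] = None
 # Additional metadata, used for passing custom information (such as request ID, debug markers, etc.).

+n: Optional[int] = 1
+# Number of candidate outputs to generate (i.e., how many independently generated texts to return). Default 1 (return only one result).
+
+seed: Optional[int] = Field(default=None, ge=0, le=922337203685477580)
+# Random seed for controlling the determinism of generation (the same seed and input yield the same result). Must be in the range `[0, 922337203685477580]`. Default None means no fixed seed.
+
+stop: Optional[Union[str, List[str]]] = Field(default_factory=list)
+# Stop conditions for generation; can be a single string or a list of strings. Generation terminates early when the model produces any stop string (default empty list means disabled).
+
 ```

 ### Additional Parameters Added by FastDeploy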
@@ -160,6 +169,10 @@ bad_words_token_ids: Optional[List[int]] = None
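
 repetition_penalty: Optional[float] = None
 # Repetition penalty coefficient, reducing the probability of repeating already generated tokens (>1.0 suppresses repetition, <1.0 encourages repetition, default None means disabled).
+
+stop_token_ids: Optional[List[int]] = Field(default_factory=list)
+# List of stop token IDs; generation terminates early when the model produces any of the specified tokens (default empty list means disabled). Typically used to complement the `stop` parameter.
+
 ```
 The following extra parameters are supported:
 ```python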
@@ -201,6 +214,16 @@ temp_scaled_logprobs: Optional[bool] = False
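
 top_p_normalized_logprobs: Optional[bool] = False
 # Whether to perform top_p normalization when calculating logprobs (default False means top_p normalization is not performed).
+
+include_draft_logprobs: Optional[bool] = False
+# Whether to return log probabilities from pre-generation or intermediate steps, for debugging or analyzing the generation process (default False means not returned).
+
+logits_processors_args: Optional[Dict] = None
+# Additional arguments passed to logits processors, used to customize generation logic (e.g., dynamically adjusting probability distributions).
+
+mm_hashes: Optional[list] = None
+# List of hash values for multimodal inputs, used for verifying or tracking input content (e.g., images, audio). Default None means no multimodal input or no hash validation required.
+
 ```

 ### Differences in Return Fields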
@@ -350,6 +373,37 @@ max_tokens: Optional[int] = None
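
 presence_penalty: Optional[float] = None
 # Presence penalty coefficient, reducing the probability of generating new topics (topics that have not appeared) (`>1.0` suppresses new topics, `<1.0` encourages new topics).
+
+echo: Optional[bool] = False
+# Whether to include the input prompt in the output (default False, i.e., the prompt is not output).
+
+n: Optional[int] = 1
+# Number of candidate outputs to generate (i.e., how many independently generated texts to return). Default 1 (return only one result).
+
+seed: Optional[int] = Field(default=None, ge=0, le=922337203685477580)
+# Random seed for controlling the determinism of generation (the same seed and input yield the same result). Must be in the range `[0, 922337203685477580]`. Default None means no fixed seed.
+
+stop: Optional[Union[str, List[str]]] = Field(default_factory=list)
+# Stop conditions for generation; can be a single string or a list of strings. Generation terminates early when the model produces any stop string (default empty list means disabled).
+
+stream: Optional[bool] = False
+# Whether to enable streaming output (return results token by token), default `False` (return the complete result at once).
+
+stream_options: Optional[StreamOptions] = None
+# Additional configuration for streaming output (such as chunk size, timeout, etc.); refer to the definition of `StreamOptions`.
+
+temperature: Optional[float] = None
+# Temperature coefficient, controlling generation randomness (`0.0` for deterministic generation, `>1.0` for more randomness, default `None` uses the model default).
+
+top_p: Optional[float] = None
+# Nucleus sampling threshold, retaining only tokens whose cumulative probability exceeds `top_p` (default `None` disables).
+
+response_format: Optional[AnyResponseFormat] = None
+# Specifies the output format (such as JSON, XML, etc.); requires passing a predefined format configuration object.
+
+user: Optional[str] = None
+# User identifier, used for tracking or distinguishing requests from different users (default `None` means not passed).
+
 ```

 ### Additional Parameters Added by FastDeploy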
@@ -375,6 +429,12 @@ include_stop_str_in_output: Optional[bool] = False
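 bad_words: Optional[List[str]] = None
 # List of forbidden words (e.g., sensitive words) that the model should avoid generating (default None means no restriction).

+bad_words_token_ids: Optional[List[int]] = None
+# List of forbidden token ids that the model should avoid generating (default None means no restriction).
+
+stop_token_ids: Optional[List[int]] = Field(default_factory=list)
+# List of stop token IDs; generation terminates early when the model produces any of the specified tokens (default empty list means disabled). Typically used to complement the `stop` parameter.
+
 repetition_penalty: Optional[float] = None
 # Repetition penalty coefficient, reducing the probability of repeating already generated tokens (>1.0 suppresses repetition, <1.0 encourages repetition, default None means disabled).
 ```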
@@ -398,6 +458,20 @@ return_token_ids: Optional[bool] = None
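 prompt_token_ids: Optional[List[int]] = None
 # Directly passes the token ID list of the prompt, skipping the text encoding step (default None means using text input).

+temp_scaled_logprobs: Optional[bool] = False
+# Whether to divide the logits by the temperature coefficient when calculating logprobs (default False means the logits are not divided by the temperature coefficient).
+
+top_p_normalized_logprobs: Optional[bool] = False
+# Whether to perform top_p normalization when calculating logprobs (default False means top_p normalization is not performed).
+
+include_draft_logprobs: Optional[bool] = False
+# Whether to return log probabilities from pre-generation or intermediate steps, for debugging or analyzing the generation process (default False means not returned).
+
+logits_processors_args: Optional[Dict] = None
+# Additional arguments passed to logits processors, used to customize generation logic (e.g., dynamically adjusting probability distributions).
+
+mm_hashes: Optional[list] = None
+# List of hash values for multimodal inputs, used for verifying or tracking input content (e.g., images, audio). Default None means no multimodal input or no hash validation required.
 ```

 ### Overview of Return Parameters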
