8 changes: 4 additions & 4 deletions docs/version3.x/pipeline_usage/PP-StructureV3.en.md
@@ -1536,7 +1536,7 @@ If not set, the default is <code>True</code>.</td>
<tr>
<td><code>format_block_content</code></td>
<td><b>Meaning:</b> Whether to format the content in <code>block_content</code> as Markdown.<br/>
<b>Description:</b> If not set, the initialized default value will be used, which is <code>False</code> by default.</td>
<b>Description:</b> If not set, the initialized default value will be used, which is <code>False</code> by default. When set to <code>True</code>, the <code>block_content</code> of image-type blocks will contain image path information (e.g., <code>&lt;img src="..." /&gt;</code>). When set to <code>False</code> (default), the <code>block_content</code> of image-type blocks will only contain OCR-recognized text content without image paths. To include image paths in JSON output, set this parameter to <code>True</code>.</td>
<td><code>bool</code></td>
<td></td>
</tr>
@@ -2302,7 +2302,7 @@ If set to <code>None</code>, the default value is <code>True</code>.</td>
<tr>
<td><code>format_block_content</code></td>
<td><b>Meaning:</b> Whether to format the content in <code>block_content</code> as Markdown.<br/>
<b>Description:</b> If set to <code>None</code>, the default value is <code>False</code>.</td>
<b>Description:</b> If set to <code>None</code>, the default value is <code>False</code>. When set to <code>True</code>, the <code>block_content</code> of image-type blocks will contain image path information (e.g., <code>&lt;img src="..." /&gt;</code>). When set to <code>False</code> (default), the <code>block_content</code> of image-type blocks will only contain OCR-recognized text content without image paths. To include image paths in JSON output, set this parameter to <code>True</code>.</td>
<td><code>bool|None</code></td>
<td></td>
</tr>
@@ -2480,7 +2480,7 @@ If set to <code>None</code>, the instantiation value is used; otherwise, this pa
</tr>
<tr>
<td><code>format_block_content</code></td>
<td>Whether to format the content in <code>block_content</code> as Markdown. If set to <code>None</code>, the instantiation value is used; otherwise, this parameter takes precedence.</td>
<td>Whether to format the content in <code>block_content</code> as Markdown. If set to <code>None</code>, the instantiation value is used; otherwise, this parameter takes precedence. When set to <code>True</code>, the <code>block_content</code> of image-type blocks will contain image path information (e.g., <code>&lt;img src="..." /&gt;</code>). When set to <code>False</code> (default), the <code>block_content</code> of image-type blocks will only contain OCR-recognized text content without image paths. To include image paths in JSON output, set this parameter to <code>True</code>.</td>
<td><code>bool|None</code></td>
<td></td>
</tr>
@@ -2778,7 +2778,7 @@ If enabled, the cell detection model will not be used, and only the table struct
<li><code>use_seal_recognition</code>: <code>(bool)</code> Whether to enable seal text recognition sub-pipeline</li>
<li><code>use_table_recognition</code>: <code>(bool)</code> Whether to enable table recognition sub-pipeline</li>
<li><code>use_formula_recognition</code>: <code>(bool)</code> Whether to enable formula recognition sub-pipeline</li>
<li><code>format_block_content</code>: <code>(bool)</code> Controls whether to format the <code>block_content</code> into Markdown format</li>
<li><code>format_block_content</code>: <code>(bool)</code> Controls whether to format the <code>block_content</code> into Markdown format. When set to <code>True</code>, the <code>block_content</code> of image-type blocks will contain image path information (e.g., <code>&lt;img src="..." /&gt;</code>). When set to <code>False</code> (default), the <code>block_content</code> of image-type blocks will only contain OCR-recognized text content without image paths. To include image paths in JSON output, set this parameter to <code>True</code>.</li>
<li><code>markdown_ignore_labels</code>: <code>(List[str])</code> Labels of layout regions that need to be ignored in Markdown</li>
</ul>
</li>
8 changes: 4 additions & 4 deletions docs/version3.x/pipeline_usage/PP-StructureV3.md
@@ -1508,7 +1508,7 @@ paddleocr pp_structurev3 -i ./pp_structure_v3_demo.png --device gpu
</tr>
<tr>
<td><code>format_block_content</code></td>
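<td><b>Meaning:</b> Whether to format the content in <code>block_content</code> as Markdown.<br><b>Description:</b> If not set, the value from pipeline initialization is used, which is <code>False</code> by default.</br></td>
<td><b>Meaning:</b> Whether to format the content in <code>block_content</code> as Markdown.<br><b>Description:</b> If not set, the value from pipeline initialization is used, which is <code>False</code> by default. When set to <code>True</code>, the <code>block_content</code> of image-type blocks contains image path information (e.g., <code>&lt;img src="..." /&gt;</code>); when set to <code>False</code> (default), the <code>block_content</code> of image-type blocks contains only OCR-recognized text, without image paths. To get image paths in the JSON output, set this parameter to <code>True</code>.</br></td>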
<td><code>bool</code></td>
<td></td>
</tr>
@@ -2225,7 +2225,7 @@ for item in markdown_images:
</tr>
<tr>
<td><code>format_block_content</code></td>
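<td>Whether to format the content in <code>block_content</code> as Markdown. If set to <code>None</code>, the value from pipeline initialization is used, which is <code>False</code> by default.</td>
<td>Whether to format the content in <code>block_content</code> as Markdown. If set to <code>None</code>, the value from pipeline initialization is used, which is <code>False</code> by default. When set to <code>True</code>, the <code>block_content</code> of image-type blocks contains image path information (e.g., <code>&lt;img src="..." /&gt;</code>); when set to <code>False</code> (default), the <code>block_content</code> of image-type blocks contains only OCR-recognized text, without image paths. To get image paths in the JSON output, set this parameter to <code>True</code>.</td>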
<td><code>bool|None</code></td>
<td></td>
</tr>
@@ -2393,7 +2393,7 @@ for item in markdown_images:
<tr>
<td><code>format_block_content</code></td>
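<td><b>Meaning:</b> Whether to format the content in <code>block_content</code> as Markdown.
<br><b>Description:</b> Setting it to <code>None</code> means using the instantiation parameter; otherwise, this parameter takes precedence.</br></td>
<br><b>Description:</b> Setting it to <code>None</code> means using the instantiation parameter; otherwise, this parameter takes precedence. When set to <code>True</code>, the <code>block_content</code> of image-type blocks contains image path information (e.g., <code>&lt;img src="..." /&gt;</code>); when set to <code>False</code> (default), the <code>block_content</code> of image-type blocks contains only OCR-recognized text, without image paths. To get image paths in the JSON output, set this parameter to <code>True</code>.</br></td>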
<td><code>bool|None</code></td>
<td></td>
</tr>
@@ -2664,7 +2664,7 @@ for item in markdown_images:
<li><code>use_seal_recognition</code>: <code>(bool)</code> Controls whether to enable the seal text recognition sub-pipeline</li>
<li><code>use_table_recognition</code>: <code>(bool)</code> Controls whether to enable the table recognition sub-pipeline</li>
<li><code>use_formula_recognition</code>: <code>(bool)</code> Controls whether to enable the formula recognition sub-pipeline</li>
<li><code>format_block_content</code>: <code>(bool)</code> Controls whether to format the content in <code>block_content</code> as Markdown</li>
<li><code>format_block_content</code>: <code>(bool)</code> Controls whether to format the content in <code>block_content</code> as Markdown. When set to <code>True</code>, the <code>block_content</code> of image-type blocks contains image path information (e.g., <code>&lt;img src="..." /&gt;</code>); when set to <code>False</code> (default), the <code>block_content</code> of image-type blocks contains only OCR-recognized text, without image paths. To get image paths in the JSON output, set this parameter to <code>True</code>.</li>
<li><code>markdown_ignore_labels</code>: <code>(List[str])</code> Layout labels to ignore in Markdown</li>
</ul>
</li>
12 changes: 6 additions & 6 deletions docs/version3.x/pipeline_usage/PaddleOCR-VL.en.md
@@ -484,7 +484,7 @@ If not set, the initialized default value will be used, which is initialized to
<td><code>format_block_content</code></td>
<td><b>Meaning:</b> Controls whether to format the content in <code>block_content</code> as Markdown. <br/>
<b>Description:</b>
If not set, the initialized default value will be used, which is <code>False</code> by default.</td>
If not set, the initialized default value will be used, which is <code>False</code> by default. When set to <code>True</code>, the <code>block_content</code> of image-type blocks will contain image path information (e.g., <code>&lt;img src="..." /&gt;</code>). When set to <code>False</code> (default), the <code>block_content</code> of image-type blocks will only contain OCR-recognized text content without image paths. To include image paths in JSON output, set this parameter to <code>True</code>.</td>
<td><code>bool</code></td>
<td></td>
</tr>
@@ -939,7 +939,7 @@ If set to <code>None</code>, the initialized default value will be used, which i
<td><code>format_block_content</code></td>
<td><b>Meaning:</b> Controls whether to format the content in <code>block_content</code> as Markdown. <br/>
<b>Description:</b>
If set to <code>None</code>, the initialized default value will be used, which is <code>False</code> by default.</td>
If set to <code>None</code>, the initialized default value will be used, which is <code>False</code> by default. When set to <code>True</code>, the <code>block_content</code> of image-type blocks will contain image path information (e.g., <code>&lt;img src="..." /&gt;</code>). When set to <code>False</code> (default), the <code>block_content</code> of image-type blocks will only contain OCR-recognized text content without image paths. To include image paths in JSON output, set this parameter to <code>True</code>.</td>
<td><code>bool|None</code></td>
<td><code>None</code></td>
<td></td>
@@ -1181,8 +1181,8 @@ Setting it to <code>None</code> means using the instantiation parameter; otherwi
<tr>
<td><code>format_block_content</code></td>
<td><b>Meaning:</b> The meaning is basically the same as that of the instantiation parameter. <br/>
<b>Description:</b>
Setting it to <code>None</code> means using the instantiation parameter; otherwise, this parameter takes precedence.</td>
<b>Description:</b>
Setting it to <code>None</code> means using the instantiation parameter; otherwise, this parameter takes precedence. When set to <code>True</code>, the <code>block_content</code> of image-type blocks will contain image path information (e.g., <code>&lt;img src="..." /&gt;</code>). When set to <code>False</code> (default), the <code>block_content</code> of image-type blocks will only contain OCR-recognized text content without image paths. To include image paths in JSON output, set this parameter to <code>True</code>.</td>
<td><code>bool|None</code></td>
<td><code>None</code></td>
</tr>
@@ -1411,7 +1411,7 @@ Setting it to <code>None</code> means using the instantiation parameter; otherwi
- `use_doc_preprocessor`: `(bool)` Controls whether to enable the document preprocessing sub-pipeline.
- `use_layout_detection`: `(bool)` Controls whether to enable the layout detection module.
- `use_chart_recognition`: `(bool)` Controls whether to enable the chart recognition function.
- `format_block_content`: `(bool)` Controls whether to save the formatted markdown content in `JSON`.
- `format_block_content`: `(bool)` Controls whether to save the formatted markdown content in `JSON`. When set to `True`, the `block_content` of image-type blocks will contain image path information (e.g., `<img src="..." />`). When set to `False` (default), the `block_content` of image-type blocks will only contain OCR-recognized text content without image paths. To include image paths in JSON output, set this parameter to `True`.
- `markdown_ignore_labels`: `(List[str])` Labels of layout regions that need to be ignored in Markdown

- `doc_preprocessor_res`: `(Dict[str, Union[List[float], str]])` A dictionary of document preprocessing results, which exists only when `use_doc_preprocessor=True`.
@@ -1438,7 +1438,7 @@ Setting it to <code>None</code> means using the instantiation parameter; otherwi
- `use_doc_preprocessor`: `(bool)` Controls whether to enable the document preprocessing sub-pipeline.
- `use_layout_detection`: `(bool)` Controls whether to enable the layout detection module.
- `use_chart_recognition`: `(bool)` Controls whether to enable the chart recognition function.
- `format_block_content`: `(bool)` Controls whether to save the formatted markdown content in `JSON`.
- `format_block_content`: `(bool)` Controls whether to save the formatted markdown content in `JSON`. When set to `True`, the `block_content` of image-type blocks will contain image path information (e.g., `<img src="..." />`). When set to `False` (default), the `block_content` of image-type blocks will only contain OCR-recognized text content without image paths. To include image paths in JSON output, set this parameter to `True`.

- `doc_preprocessor_res`: `(Dict[str, Union[List[float], str]])` A dictionary of document preprocessing results, which exists only when `use_doc_preprocessor=True`.
- `input_path`: `(str)` The image path accepted by the document preprocessing sub-pipeline. When the input is a `numpy.ndarray`, it is saved as `None`; here, it is `None`.
10 changes: 5 additions & 5 deletions docs/version3.x/pipeline_usage/PaddleOCR-VL.md
@@ -474,7 +474,7 @@ paddleocr doc_parser -i ./paddleocr_vl_demo.png --use_layout_detection False
<tr>
<td><code>format_block_content</code></td>
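<td><b>Meaning:</b> Controls whether to format the content in <code>block_content</code> as Markdown.<br/>
<b>Description:</b> If not set, the initialized default value is used, which is <code>False</code> by default.</td>
<b>Description:</b> If not set, the initialized default value is used, which is <code>False</code> by default. When set to <code>True</code>, the <code>block_content</code> of image-type blocks contains image path information (e.g., <code>&lt;img src="..." /&gt;</code>); when set to <code>False</code> (default), the <code>block_content</code> of image-type blocks contains only OCR-recognized text, without image paths. To get image paths in the JSON output, set this parameter to <code>True</code>.</td>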
<td><code>bool</code></td>
<td></td>
</tr>
@@ -909,7 +909,7 @@ output = pipeline.predict(["imgs/file1.png", "imgs/file2.png", "imgs/file3.png"]
<tr>
<td><code>format_block_content</code></td>
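<td><b>Meaning:</b> Controls whether to format the content in <code>block_content</code> as Markdown.<br/>
<b>Description:</b> If set to <code>None</code>, the initialized default value is used, which is <code>False</code> by default.</td>
<b>Description:</b> If set to <code>None</code>, the initialized default value is used, which is <code>False</code> by default. When set to <code>True</code>, the <code>block_content</code> of image-type blocks contains image path information (e.g., <code>&lt;img src="..." /&gt;</code>); when set to <code>False</code> (default), the <code>block_content</code> of image-type blocks contains only OCR-recognized text, without image paths. To get image paths in the JSON output, set this parameter to <code>True</code>.</td>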
<td><code>bool|None</code></td>
<td><code>None</code></td>
</tr>
@@ -1142,7 +1142,7 @@ output = pipeline.predict(["imgs/file1.png", "imgs/file2.png", "imgs/file3.png"]
<td><code>format_block_content</code></td>
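<td><b>Meaning:</b> The meaning is basically the same as that of the instantiation parameter.<br/>
<b>Description:</b>
Setting it to <code>None</code> means using the instantiation parameter; otherwise, this parameter takes precedence.</td>
Setting it to <code>None</code> means using the instantiation parameter; otherwise, this parameter takes precedence. When set to <code>True</code>, the <code>block_content</code> of image-type blocks contains image path information (e.g., <code>&lt;img src="..." /&gt;</code>); when set to <code>False</code> (default), the <code>block_content</code> of image-type blocks contains only OCR-recognized text, without image paths. To get image paths in the JSON output, set this parameter to <code>True</code>.</td>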
<td><code>bool|None</code></td>
<td><code>None</code></td>
</tr>
@@ -1363,7 +1363,7 @@ output = pipeline.predict(["imgs/file1.png", "imgs/file2.png", "imgs/file3.png"]
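<li><code>use_doc_preprocessor</code>: <code>(bool)</code> Controls whether to enable the document preprocessing sub-pipeline</li>
<li><code>use_layout_detection</code>: <code>(bool)</code> Controls whether to enable the layout detection module</li>
<li><code>use_chart_recognition</code>: <code>(bool)</code> Controls whether to enable the chart recognition function</li>
<li><code>format_block_content</code>: <code>(bool)</code> Controls whether to save the formatted Markdown content in <code>JSON</code></li>
<li><code>format_block_content</code>: <code>(bool)</code> Controls whether to save the formatted Markdown content in <code>JSON</code>. When set to <code>True</code>, the <code>block_content</code> of image-type blocks contains image path information (e.g., <code>&lt;img src="..." /&gt;</code>); when set to <code>False</code> (default), the <code>block_content</code> of image-type blocks contains only OCR-recognized text, without image paths. To get image paths in the JSON output, set this parameter to <code>True</code>.</li>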
</ol>
</li>
<li><code>doc_preprocessor_res</code>: <code>(Dict[str, Union[str, Dict[str, bool], int]])</code> Output of the document preprocessing sub-pipeline. Present only when <code>use_doc_preprocessor=True</code>
@@ -1399,7 +1399,7 @@ output = pipeline.predict(["imgs/file1.png", "imgs/file2.png", "imgs/file3.png"]
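<li><code>use_doc_preprocessor</code>: <code>(bool)</code> Controls whether to enable the document preprocessing sub-pipeline</li>
<li><code>use_layout_detection</code>: <code>(bool)</code> Controls whether to enable the layout detection module</li>
<li><code>use_chart_recognition</code>: <code>(bool)</code> Controls whether to enable the chart recognition function</li>
<li><code>format_block_content</code>: <code>(bool)</code> Controls whether to save the formatted Markdown content in <code>JSON</code></li>
<li><code>format_block_content</code>: <code>(bool)</code> Controls whether to save the formatted Markdown content in <code>JSON</code>. When set to <code>True</code>, the <code>block_content</code> of image-type blocks contains image path information (e.g., <code>&lt;img src="..." /&gt;</code>); when set to <code>False</code> (default), the <code>block_content</code> of image-type blocks contains only OCR-recognized text, without image paths. To get image paths in the JSON output, set this parameter to <code>True</code>.</li>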
</ol>
</li>
<li><code>doc_preprocessor_res</code>: <code>(Dict[str, Union[str, Dict[str, bool], int]])</code> Output of the document preprocessing sub-pipeline. Present only when <code>use_doc_preprocessor=True</code>
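The behavior documented in these changes can be illustrated from the consumer side. The following is a minimal sketch, not code from this PR: it assumes each block in the JSON result is a dictionary with a `block_content` string, and that with `format_block_content=True` image-type blocks carry an `<img src="..." />` tag as the descriptions above state. The helper name `image_paths`, the `block_label` values, and the sample paths are hypothetical.

```python
import re
from typing import Dict, List

# Matches the src attribute of an <img ...> tag, as in the docs' example `<img src="..." />`.
IMG_SRC = re.compile(r'<img\b[^>]*\bsrc="([^"]*)"', re.IGNORECASE)


def image_paths(blocks: List[Dict[str, str]]) -> List[str]:
    """Collect every <img src="..."> path found in the blocks' block_content."""
    paths: List[str] = []
    for block in blocks:
        # Blocks without block_content contribute nothing.
        paths.extend(IMG_SRC.findall(block.get("block_content", "")))
    return paths


# Hypothetical blocks, shaped like a result produced with format_block_content=True.
formatted = [
    {"block_label": "text", "block_content": "Figure 1 shows the layout."},
    {"block_label": "image", "block_content": '<img src="imgs/fig_1.jpg" />'},
]

# Hypothetical blocks for the default (False): OCR text only, no image path.
unformatted = [
    {"block_label": "image", "block_content": "caption text from OCR"},
]

print(image_paths(formatted))    # ['imgs/fig_1.jpg']
print(image_paths(unformatted))  # []
```

An empty list from a result that is known to contain images is a hint that `format_block_content` was left at its default of `False`.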