This document describes all available scope types for content extraction in the manifest-driven content generator.
The scope parameter defines what type of content to collect from the pattern match point until the next heading. Collection automatically stops at any heading level (# through ######) or the end of the document.
Pattern Match → Collect content by type → Stop at next heading → Combine all results
Collects: Only the matched line, keeping all markdown formatting.
Use Case: Extract headings with their markdown markers intact.
Example:
Source:
## Introduction to AEM
## Advanced TopicsConfiguration:
pattern: "## "
scope: selfOutput:
- ## Introduction to AEM
- ## Advanced TopicsCollects: Only the matched line, removing markdown formatting.
Use Case: Extract heading text without markdown markers.
Example:
Source:
## Introduction to AEM
## Advanced Topics
### Subtopic NameConfiguration:
pattern: "## "
scope: self_plainOutput:
- Introduction to AEM
- Advanced TopicsNote: ### Subtopic Name is not matched because pattern is "## " (H2 only).
Collects: All bullet points (lines starting with - or *) from match until next heading.
Use Case: Extract list items, objectives, prerequisites, etc.
Example:
Source:
#### Objectives
- Learn content modeling
- Understand GraphQL
- Build queries
## Next SectionConfiguration:
pattern: "#### Objectives"
scope: bulletsOutput:
- Learn content modeling
- Understand GraphQL
- Build queriesComplex Example with Multiple Headings:
Source:
# Module 1
#### Objectives
- Learn X
- Learn Y
## Activity 1-1
- Step 1
- Step 2
# Module 2
#### Objectives
- Learn ZConfiguration:
pattern: "#### Objectives"
scope: bulletsOutput (bullets from BOTH matches combined):
- Learn X
- Learn Y
- Step 1
- Step 2
- Learn ZNote: Collects ALL bullets between "#### Objectives" and the next H1 heading for each match.
Collects: All bullet points with the bullet marker (- or *) removed.
Use Case: Get bullet content as plain text without markers.
Example:
Source:
#### Prerequisites
- Experience with JavaScript
- Basic knowledge of ReactConfiguration:
pattern: "#### Prerequisites"
scope: bullets_plainOutput:
- Experience with JavaScript
- Basic knowledge of ReactNote: Output is still formatted as markdown-list by default, but original bullet markers are stripped from source content.
Collects: Only paragraph text (excludes headings, bullets, code blocks).
Use Case: Extract descriptive paragraphs, introductions, explanations.
Example:
Source:
## Overview
This module introduces Adobe Experience Manager. It covers the basics of content management and digital asset organization.
- Some bullet
- Another bullet
More descriptive text here.
## Next SectionConfiguration:
pattern: "## Overview"
scope: textOutput:
- This module introduces Adobe Experience Manager. It covers the basics of content management and digital asset organization.
- More descriptive text here.Collects: Paragraph text with all markdown formatting removed (no bold, italic, links, etc.).
Use Case: Extract plain text without any markdown styling.
Example:
Source:
## Description
This guide covers **AEM Sites** and [Edge Delivery Services](https://example.com). It includes *important* concepts.
## Next SectionConfiguration:
pattern: "## Description"
scope: text_plainOutput:
- This guide covers AEM Sites and Edge Delivery Services. It includes important concepts.Collects: Code blocks (content between ``` markers).
Use Case: Extract code examples, snippets, configurations.
Example:
Source:
## Implementation
```javascript
function init() {
console.log('Hello');
}Some text here.
.button { color: red; }
Configuration:
```yaml
pattern: "## Implementation"
scope: code
Output:
- ```javascript
function init() {
console.log('Hello');
}.button { color: red; }
Collects: Code blocks without the ``` markers.
Use Case: Extract raw code content.
Example:
Source:
## Example
```javascript
const x = 10;
Configuration:
```yaml
pattern: "## Example"
scope: code_plain
Output:
- const x = 10;Collects: All content between the match and next heading (bullets, text, code, everything).
Use Case: Extract complete sections including all content types.
Example:
Source:
## Setup Instructions
Follow these steps carefully.
- Install Node.js
- Clone repository
```bash
npm installThat's it!
Configuration:
```yaml
pattern: "## Setup Instructions"
scope: all
Output:
- Follow these steps carefully.
- Install Node.js
- Clone repository
- ```bash
npm install- That's it!
---
### `all_plain`
**Collects:** All content with markdown formatting removed.
**Use Case:** Extract complete sections as plain text.
**Example:**
Source:
```markdown
## Summary
This covers **important** topics:
- Topic 1
- Topic 2
See [documentation](url) for more.
## Next
Configuration:
pattern: "## Summary"
scope: all_plainOutput:
- This covers important topics:
- Topic 1
- Topic 2
- See documentation for more.The format parameter controls how collected items are output (default: markdown-list):
Outputs each item as a markdown bullet:
- Item 1
- Item 2
- Item 3Outputs each item on its own line without bullets:
Item 1
Item 2
Item 3Outputs items comma-separated on one line:
Item 1, Item 2, Item 3Collection ALWAYS stops at:
- Next heading of ANY level (
#,##,###,####,#####,######) - End of document
You cannot specify custom boundaries in v1.
- Matches at start of line after trimming whitespace
- Case-sensitive
- Exact string match (no regex in v1)
- Partial matches work (e.g.,
"## "matches any H2)
Examples:
"## "→ Matches all H2 headings"## Activity"→ Matches only H2 starting with "Activity""#### Objectives"→ Matches exact H4 heading "Objectives"
- Use
self_plainfor titles: Extract heading text without markdown markers - Use
bulletsfor lists: Extract objectives, prerequisites, steps - Use
textfor descriptions: Extract paragraph content - Use
allfor complete sections: When you need everything - Add
_plainsuffix: When you want content without formatting
pattern: "# "
scope: self_plainpattern: "## Activity"
scope: self_plainpattern: "#### Objectives"
scope: bulletspattern: "## Prerequisites"
scope: bulletspattern: "## Introduction"
scope: text