feat(DOC-2008): document automatic topic creation for redpanda_migrator#395

Merged: Feediver1 merged 7 commits into main from feat/doc-2008-auto-topic-creation on Jun 4, 2026

@mfernest mfernest commented Mar 20, 2026

Summary

Resolves DOC-2008. Documents the new sync_topic_interval field on the redpanda_migrator output, which implements the engineering change in ENG-1051 (upstream Connect PR redpanda-data/connect#4059).

This PR was originally opened by @mfernest, who is no longer with the team. @Feediver1 is taking it over to finalize.

What changed on this page

  • New "How it works" content explaining the sync_topic_interval behavior: topics sync from source on startup and every 5 minutes by default, including source topics that have no current data (for example, after retention cleanup). Setting sync_topic_interval: 0s disables the periodic sync; topics are still created on demand when the first message arrives.
  • Consumer-groups default behavior fix: the page previously said "Only Empty state groups migrated". It now says "all groups except those in Dead state migrate by default", with consumer_groups.only_empty: true as the opt-in.
  • Promoted bold pseudo-headings (*Topics*, *Schema Registry*, *Consumer Groups*) under "Synchronization details" to real H3 headings so they're anchorable.
  • Multiple-migrator-pairs section clarified to spell out that labels must match exactly and that mismatched labels prevent the input/output from coordinating (addresses CodeRabbit nit from 2026-03-20).
  • Active-voice prose pass through manually-written sections, with plural subject-verb agreement on bullet lists.
  • xref from prose to field row: the prose mention of sync_topic_interval now anchors to its field-row entry in the auto-gen partial.
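
A minimal sketch of how the two settings described in the list above might appear in an output config. Only `sync_topic_interval` and `consumer_groups.only_empty` come from this PR description. The label value is hypothetical, and connection fields are omitted rather than guessed.

```yaml
output:
  label: example_migrator        # hypothetical label, for illustration only
  redpanda_migrator:
    # Connection settings omitted.
    sync_topic_interval: 5m      # the stated default, written out explicitly
    consumer_groups:
      only_empty: true           # opt-in named in this PR; assumed to limit migration to Empty-state groups
```

Leaving both fields unset should, per the summary above, sync topics on startup and every 5 minutes, and migrate all groups except those in Dead state.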

Engineering scope (ENG-1051)

The customer-facing scenario is: source topics emptied by retention cleanup were previously not re-created at the destination because Connect only created topics on first data flow. With sync_topic_interval, Connect proactively syncs the topic set on a configurable schedule (default 5 minutes) so destination topology matches source even when some source topics are empty. The sync_topic_interval field row is in the auto-generated field partial after the bot's run in PR #428.

Preview pages

Test plan

  • Netlify preview builds without errors
  • The sync_topic_interval field row renders in the configuration field listing with the canonical description, default 5m, and three examples
  • The prose xref <<sync_topic_interval>> in the "How it works" Topics bullet resolves to the field row
  • H3 headings under "Synchronization details" render in the page TOC

🤖 Generated with Claude Code

@mfernest mfernest requested a review from a team as a code owner March 20, 2026 01:42
netlify Bot commented Mar 20, 2026

Deploy Preview for redpanda-connect ready!

Name Link
🔨 Latest commit ce563af
🔍 Latest deploy log https://app.netlify.com/projects/redpanda-connect/deploys/6a21b93b232aff00084bdfe5
😎 Deploy Preview https://deploy-preview-395--redpanda-connect.netlify.app
coderabbitai Bot commented Mar 20, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6022cb0d-d285-49ab-b4fb-94c15091d879

Walkthrough

This pull request updates the redpanda_migrator component documentation in AsciiDoc format. Changes include reframing the component as a migration mechanism between Kafka and Redpanda clusters, restructuring the synchronization details section with field-driven statements, clarifying configuration defaults, refining performance-tuning guidance around max_in_flight and buffer-related parameters, and updating the metrics table formatting. The documentation maintains the same core information while improving organization and clarity around configuration fields, synchronization behavior, and guarantees.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • mmatczuk
  • JakeSCahill
  • mihaitodor

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title references documenting automatic topic creation, which aligns with the main change of updating documentation to describe topic synchronization behavior via the new sync_topic_interval field. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description check | ✅ Passed | The PR description clearly describes the changeset: documenting a new sync_topic_interval field, fixing consumer group default behavior documentation, promoting pseudo-headings to H3, and applying prose style edits. |


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
modules/components/pages/outputs/redpanda_migrator.adoc (1)

47-47: Consider clarifying that labels must match exactly.

The paired input component's documentation states that "labels must match exactly" and explains the failure consequence. This output documentation is less explicit. Consider either adding "exactly" or cross-referencing the input documentation for the full explanation.

✏️ Proposed clarification
-Each migrator pair requires a unique `label`. Set the same label value on both the input and output within a pair.
+Each migrator pair requires a unique `label`. The label must match exactly between the paired input and output components.
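
To illustrate the proposed wording, here is a hedged sketch of one migrator pair whose labels match exactly. The label value `orders_migration` is hypothetical, and all component settings are left out rather than guessed.

```yaml
input:
  label: orders_migration      # hypothetical label
  redpanda_migrator: {}        # source cluster settings omitted

output:
  label: orders_migration      # identical to the input label, character for character
  redpanda_migrator: {}        # destination cluster settings omitted
```

A second pair in the same deployment would use a different label shared by its own input and output.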
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/components/pages/outputs/redpanda_migrator.adoc` at line 47, The
output docs for Redpanda migrator currently state "Each migrator pair requires a
unique `label`" but don't specify that the input and output must match exactly;
update the sentence in redpanda_migrator.adoc to explicitly say that the label
value must match exactly between the paired input and output (e.g., "Set the
same label value on both the input and output within a pair; labels must match
exactly"), and optionally add a short cross-reference note pointing readers to
the paired input component's documentation for the failure behavior details.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0f6285f1-0e2e-4435-8f81-755f02013645

📥 Commits

Reviewing files that changed from the base of the PR and between 840cc2f and 6eb7979.

📒 Files selected for processing (1)
  • modules/components/pages/outputs/redpanda_migrator.adoc

@mfernest mfernest requested a review from Jeffail March 26, 2026 17:52
@Feediver1 Feediver1 requested a review from JakeSCahill May 28, 2026 14:06
mfernest and others added 3 commits May 28, 2026 10:16
Update the redpanda_migrator output page to document the new
sync_topic_interval field (ENG-1051/CON-334): topics now sync from
source on startup and every 5 minutes by default, including empty
source topics with no message flow.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Eliminate passive voice throughout
- Fix subject-verb agreement on plural subjects after colons
- Use natural subjects as actors where possible (carry over, sync, rewind)
- Trim redundant "The migrator" where intro sentence establishes subject
- Fix consumer groups default behavior (all non-Dead groups, not only Empty)
- Promote bold pseudo-headings to H3 in Synchronization details
- Tighten metrics table column widths and remove redundant "metrics" from group headers
- Add framing sentence before config tabs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rame retention scenario

- Spell out the failure mode for mismatched migrator-pair labels
  (resolves CodeRabbit nit from 2026-03-20).
- Link the prose mention of sync_topic_interval to its field-row anchor
  so readers can jump to the canonical definition.
- Replace the bare engineering claim "Data flows as messages are read
  from the input" with the more accurate behavior from the migrator
  source: topics are still created on demand when the first message
  arrives. Frame the empty-topic scenario with the retention-cleanup
  use case from ENG-1051.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@Feediver1 Feediver1 force-pushed the feat/doc-2008-auto-topic-creation branch from 6eb7979 to 6135d35 Compare May 28, 2026 14:17
@Feediver1 Feediver1 self-assigned this May 28, 2026
@Feediver1 Feediver1 requested a review from mmatczuk May 28, 2026 14:18
@Feediver1

@mmatczuk — picking up this PR from @mfernest, who is no longer with the team. This documents the sync_topic_interval field you added in redpanda-data/connect#4059 (ENG-1051). Since the sync_topic_interval row is now in the auto-generated field partial after the bot's regen, this PR focuses on the manual narrative around it.

I rebased on main and made a few small follow-ups in 6135d35:

  1. Label-matching clarification (resolves CodeRabbit's nit from 2026-03-20): the "Multiple migrator pairs" section now spells out that labels must match exactly and that mismatched labels prevent input/output coordination.
  2. xref to the field row: the prose mention of sync_topic_interval in "How it works" now anchors to its auto-gen field entry, so readers can jump to the canonical definition.
  3. Behavior wording adjusted to match migrator_topic.go's "topics are still created on first message" comment — previous wording said "Data flows as messages are read from the input", which is accurate but harder to map to the field semantics. The new wording is: "When periodic sync is disabled, topics are still created on demand when the first message arrives."
  4. Retention-cleanup framing added to align the prose with the customer-facing scenario in ENG-1051 (source topics emptied by retention being absent at the destination).
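
A hedged sketch of the setting that item 3 refers to: turning the periodic sync off. The `0s` value is the one given in the PR summary. Connection fields are omitted.

```yaml
output:
  redpanda_migrator:
    # Connection settings omitted.
    sync_topic_interval: 0s   # disables periodic sync; topics are still created on demand when the first message arrives
```

This thread does not say whether the startup sync still runs when the interval is `0s`, so treat the behavior for empty source topics under this setting as unconfirmed.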

Could you check (a) whether the "How it works > Topics" bullet is technically accurate, and (b) whether anything else from the engineering work needs surfacing in the manual prose that the auto-gen field partial doesn't cover? Thanks.

(Also addressing CodeRabbit's COMMENTED review from 2026-03-20 — the label-matching clarification is now applied; rest of the nit was on the original pseudo-heading structure which has since been promoted to H3.)

@micheleRP micheleRP left a comment

lgtm!

@Feediver1

@copilot resolve the merge conflicts in this pull request

Copilot AI commented Jun 3, 2026

@copilot resolve the merge conflicts in this pull request

Resolved in ae2494a.

Copilot AI requested a review from Feediver1 June 3, 2026 19:38
@Feediver1 Feediver1 merged commit ca2c89a into main Jun 4, 2026
5 checks passed
@Feediver1 Feediver1 deleted the feat/doc-2008-auto-topic-creation branch June 4, 2026 17:48