Commit e39a850

TovRudyy, claude, and estherk15 authored and committed
Update Flaky Tests Management notifications documentation (#35763)
* Update Flaky Tests Management notifications documentation

  Add notification types table and document the new "New flaky test detected" notification type. Restructure the Receive notifications section for clarity.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Clarify code owners AND matching behavior for notifications

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix description for successful remediation notification

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Use "state" instead of "status" for flaky test states

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Replace "status" with "state" for flaky test states throughout

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Update reference link

* Address PR review feedback

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Esther Kim <esther.kim@datadoghq.com>
1 parent c2360bc commit e39a850

2 files changed, with 29 additions and 13 deletions

content/en/tests/flaky_management/_index.md

Lines changed: 28 additions & 12 deletions
@@ -19,24 +19,24 @@ further_reading:

 ## Overview

-The [Flaky Tests Management][1] page provides a centralized view to track, triage, and remediate flaky tests across your organization. You can view every test's status along with key impact metrics like number of pipeline failures, CI time wasted, and failure rate.
+The [Flaky Tests Management][1] page provides a centralized view to track, triage, and remediate flaky tests across your organization. You can view every test's state along with key impact metrics like number of pipeline failures, CI time wasted, and failure rate.

 From this UI, you can act on flaky tests to mitigate their impact. Quarantine or disable problematic tests to keep known flakes from breaking builds, and create cases and Jira issues to track work toward fixes.

 {{< img src="tests/flaky_management-2.png" alt="Overview of the Flaky Tests Management UI" style="width:100%;" >}}

-## Change a flaky test's status
+## Change a flaky test's state

-Use the status drop-down to change how a flaky test is handled in your CI pipeline. This can help reduce CI noise while retaining traceability and control. Available statuses are:
+Use the state drop-down to change how a flaky test is handled in your CI pipeline. This can help reduce CI noise while retaining traceability and control. Available states are:

-| Status | Description |
+| State | Description |
 | ----------- | ----------- |
 | **Active** | The test is known to be flaky and is running in CI. |
 | **Quarantined** | Keep the test running in the background, but failures don't affect CI status or break pipelines. This is useful for isolating flaky tests without blocking merges. Datadog tags test run events with `@test.test_management.is_quarantined:true` when quarantined. |
 | **Disabled** | Skip the test entirely in CI. Use this when a test is no longer relevant or needs to be temporarily removed from the pipeline. Datadog tags test run events with `@test.test_management.is_disabled:true` when disabled. |
-| **Fixed** | The test has passed consistently and is no longer flaky. If supported, use the [remediation flow](#confirm-fixes-for-flaky-tests) to confirm the fix and automatically apply this status after it is merged into the default branch. |
+| **Fixed** | The test has passed consistently and is no longer flaky. If supported, use the [remediation flow](#confirm-fixes-for-flaky-tests) to confirm the fix and automatically apply this state after it is merged into the default branch. |

-<div class="alert alert-info">Status actions have minimum version requirements for each programming language's instrumentation library. See <a href="#compatibility">Compatibility</a> for details.</div>
+<div class="alert alert-info">State actions have minimum version requirements for each programming language's instrumentation library. See <a href="#compatibility">Compatibility</a> for details.</div>

 ## Configure policies to automate the flaky test lifecycle

@@ -61,7 +61,7 @@ Configure automated Flaky Test Policies to govern how flaky tests are handled in
 <p>Toggle to allow flaky tests to be quarantined for this repository.</p>
 <p>Customize automation rules based on:</p>
 <ul>
-<li><strong>Time</strong>: Quarantine a test if its status is <code>Active</code> for a specified number of days. The rule is triggered every day at 12:15 UTC.</li>
+<li><strong>Time</strong>: Quarantine a test if its state is <code>Active</code> for a specified number of days. The rule is triggered every day at 12:15 UTC.</li>
 <li><strong>Branch</strong>: Quarantine an <code>Active</code> test if it flakes in one or more specified branches.</li>
 <li><strong>Failure rate</strong>: Quarantine an <code>Active</code> test if its failure rate over the last 7 days is greater or equal to the specified threshold. The rule is triggered every 15 minutes.</li>
 </ul>
@@ -73,7 +73,7 @@ Configure automated Flaky Test Policies to govern how flaky tests are handled in
 <p>Toggle to allow flaky tests to be disabled for this repository. You may want to do this after quarantining or to protect specific branches from flakiness.</p>
 <p>Customize automation rules based on:</p>
 <ul>
-<li><strong>Status and time</strong>: Disable a test if it has a specified status for a specified number of days. The rule is triggered every day at 12:30 UTC.</li>
+<li><strong>State and time</strong>: Disable a test if it has a specified state for a specified number of days. The rule is triggered every day at 12:30 UTC.</li>
 <li><strong>Branch</strong>: Disable an <code>Active</code> or <code>Quarantined</code> test if it flakes in one or more specified branches.</li>
 <li><strong>Failure rate</strong>: Disable an <code>Active</code> or <code>Quarantined</code> test if its failure rate over the last 7 days is greater or equal to the specified threshold. The rule is triggered every 15 minutes.</li>
 </ul>
@@ -85,7 +85,7 @@ Configure automated Flaky Test Policies to govern how flaky tests are handled in
 </tr>
 <tr>
 <td><strong>Fixed</strong></td>
-<td>If a flaky test no longer flakes for 30 days, it is automatically moved to Fixed status. This automation is default behavior and can't be customized.</td>
+<td>If a flaky test no longer flakes for 30 days, it is automatically moved to the Fixed state. This automation is default behavior and can't be customized.</td>
 </tr>
 </tbody>
 </table>
@@ -128,7 +128,7 @@ When you fix a flaky test, Test Optimization's remediation flow can confirm the
 - If all retries pass, marks the fix as **in progress** in the Flaky Tests Management UI, associates it with the branch used for the fix, and waits for that branch to be merged.
 - Tags the last test retry with `@test.test_management.attempt_to_fix_passed:true` in test run events.
 - Starts a 14-day [grace period](#grace-period-mechanism) to give time for the fix to propagate everywhere in the repository.
-- If any retry fails, keeps the test's current status (`Active`, `Quarantined`, or `Disabled`).
+- If any retry fails, keeps the test's current state (`Active`, `Quarantined`, or `Disabled`).
 - Tags the last test retry with `@test.test_management.attempt_to_fix_passed:false` in test run events.

 ### Track fixes that are in progress
@@ -201,9 +201,25 @@ Flaky Tests Management uses AI to automatically assign a root cause category to

 ## Receive notifications

-Set up notifications to track changes to your flaky tests. Whenever a user or a policy changes the state of a flaky test, a message is sent to your selected recipients. You can send notifications to email addresses or Slack channels (see the [Datadog Slack integration][5]), and route messages based on test code owners. If no code owners are specified, all selected recipients are notified of all flaky test changes in the repository. Configure notification for each repository from the [**Flaky Test Policies**][13] page in Software Delivery settings.
+Set up notifications to track changes to your flaky tests. Notifications are sent when:
+- A new flaky test is detected on the default branch of the repository.
+- A user or policy changes the state of a flaky test.
+- The remediation flow for a flaky test succeeds or fails.

-Notifications are not sent immediately; they are batched every few minutes to reduce noise.
+You can send notifications to email addresses or Slack channels (see the [Datadog Slack integration][5]), and route messages based on test code owners. When multiple code owners are specified, a flaky test must be owned by all specified code owners for the notification rule to match. If no code owners are specified, all selected recipients are notified of all flaky test changes in the repository. Configure notifications for each repository from the [**Flaky Test Policies**][13] page in Software Delivery settings.
+
+Notifications are bundled over a short period to reduce noise.
+
+### Notification types
+
+| Notification type | Description |
+|---|---|
+| **New flaky test detected** | A new flaky test is detected on the default branch of the repository. |
+| **Test quarantined** | A test is quarantined by an automated policy rule (time-based, branch-based, or failure rate). |
+| **Test disabled** | A test is disabled by an automated policy rule (time-based, branch-based, or failure rate). |
+| **Fix successful** | A test passes all retries in the remediation flow and is marked as "fix in progress". |
+| **Fix failed** | A test fails during the remediation flow. |
+| **Manual state change** | A user manually changes the state of a flaky test. |

 {{< img src="tests/flaky_management_notifications_settings-2.png" alt="Notifications settings UI" style="width:100%;" >}}
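Reviewer note: the code-owner matching rule added to the "Receive notifications" section (a rule with several code owners matches only tests owned by all of them; a rule with no code owners matches every flaky test in the repository) can be illustrated with a minimal sketch. The `NotificationRule` class, the function names, and the sample owners and recipients below are hypothetical and exist only for illustration; they are not part of any Datadog API, and the actual matching logic may differ in details the documentation does not state.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class NotificationRule:
    """Hypothetical model of one notification rule (not a Datadog object)."""
    recipients: tuple[str, ...]
    code_owners: frozenset[str] = frozenset()


def rule_matches(rule: NotificationRule, test_owners: set[str]) -> bool:
    """Return True if the rule applies to a test with the given owners."""
    # No code owners on the rule: it matches every flaky test in the repository.
    if not rule.code_owners:
        return True
    # AND semantics: every owner listed on the rule must own the test.
    return rule.code_owners <= test_owners


def recipients_for(rules: list[NotificationRule], test_owners: set[str]) -> list[str]:
    """Collect recipients of all matching rules, without duplicates, in rule order."""
    selected: list[str] = []
    for rule in rules:
        if rule_matches(rule, test_owners):
            for recipient in rule.recipients:
                if recipient not in selected:
                    selected.append(recipient)
    return selected


# Illustrative data only: owners and recipients are made up.
rules = [
    NotificationRule(recipients=("#ci-alerts",)),
    NotificationRule(
        recipients=("backend@example.com",),
        code_owners=frozenset({"@org/backend", "@org/platform"}),
    ),
]
```

In this sketch, a test owned only by `@org/backend` reaches just the catch-all rule, while a test owned by both `@org/backend` and `@org/platform` reaches both rules. Modeling the owners as sets makes the "owned by all specified code owners" condition a plain subset check.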

content/en/tests/guides/setup_new_flaky_pr_gate.md

Lines changed: 1 addition & 1 deletion
@@ -139,4 +139,4 @@ For more information, see the GitHub documentation for [status checks][11].
 [9]: /tests/flaky_management
 [10]: /tests/setup/junit_xml/
 [11]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/collaborating-on-repositories-with-code-quality-features/about-status-checks
-[12]: /tests/flaky_management/#change-a-flaky-tests-status
+[12]: /tests/flaky_management/#change-a-flaky-tests-state
