Skip to content

ref(seer): Remove should_run_automation from get_issue_summary#112219

Closed
trevor-e wants to merge 11 commits intomasterfrom
trevor-e/ref/refactor-get-issue-summary
Closed

ref(seer): Remove should_run_automation from get_issue_summary#112219
trevor-e wants to merge 11 commits intomasterfrom
trevor-e/ref/refactor-get-issue-summary

Conversation

@trevor-e
Copy link
Copy Markdown
Member

@trevor-e trevor-e commented Apr 3, 2026

get_issue_summary was doing too much — generating the summary, caching it,
and optionally triggering automation via a should_run_automation boolean flag.
This made it misleading as a "get" function and hid a significant side effect
behind a default parameter.

This PR moves the run_automation call out of get_issue_summary and into
generate_summary_and_run_automation, the task that actually wants automation.
Callers now explicitly decide whether to run automation after getting a summary.

get_issue_summary now returns a 3-tuple (summary_data, status_code, event)
so callers that need the event for automation have access to it.

Also renames generate_issue_summary_onlygenerate_summary_and_fixability_score
to better describe what the task does (the Celery task name string is unchanged).

Note: the GroupAiSummaryEndpoint previously passed should_run_automation=True
(the default) but used source=SeerAutomationSource.ISSUE_DETAILS. run_automation
has an early return for that source (if source == ISSUE_DETAILS: return), so
automation was never actually triggered from that endpoint — the flag was a no-op
there. This refactor doesn't change that behavior.

Move the run_automation call out of get_issue_summary and into the
generate_summary_and_run_automation task, where the intent is explicit.
This removes a hidden side effect from what should be a summary-only
function and lets callers decide whether to trigger automation.

Also renames generate_issue_summary_only to
generate_summary_and_fixability_score to better reflect what it does.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Apr 3, 2026
)

return summary_dict, 200
return summary_dict, 200, event
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the automation responsibility to the calling spot, it was getting too complicated tracing all the various paths and implicit behavior.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't remove this here. The old Seer SKU (not per seat orgs) still use this flow.
It is the "old Seer" in this diagram - https://miro.com/app/board/uXjVJqn1-fQ=/?focusWidget=3458764648325594380

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that should still be preserved.

def kick_off_seer_automation(job: PostProcessJob) -> None:
  ...
    if not is_seer_seat_based_tier_enabled(group.organization):
        generate_summary_and_run_automation.delay(group.id, trigger_path="old_seer_automation")

def generate_summary_and_run_automation(group_id: int, **kwargs) -> None:
    if status_code == 200 and event is not None:
        run_automation(group, AnonymousUser(), event, SeerAutomationSource.POST_PROCESS)

The generate_summary_and_run_automation() func now explicitly calls run_automation() instead of relying on get_issue_summary() to call it.

retry=Retry(times=3, delay=3, on=(Exception,)),
)
def generate_issue_summary_only(group_id: int) -> None:
def generate_summary_and_fixability_score(group_id: int) -> None:
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can revert the naming if you find it confusing to not match the task name, but the _only suffix was really confusing to read when it clearly does not only generate the summary lol.

This error handler was never triggered in production. Removing it lets
the task retry mechanism handle failures instead of silently swallowing them.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
trevor-e and others added 2 commits April 3, 2026 15:49
Two early returns in get_issue_summary (feature flag disabled, AI features
hidden) were still returning 2-tuples after the signature changed to 3-tuples.
Also fixes two test mocks for _generate_summary with the same issue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…loss

On retry, the summary is cached so get_issue_summary returns event=None,
causing the automation guard to skip entirely. Catching the exception prevents
a wasted retry and logs the failure instead of silently dropping automation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
get_issue_summary no longer calls run_automation, so the
_trigger_autofix_task and get_autofix_state patches and assertions
were stale.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Missed caller of get_issue_summary that was still unpacking 2 values
instead of 3. Would have caused ValueError at runtime for Slack alert
notifications.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 3, 2026

Backend Test Failures

Failures on 88054dd in this run:

tests/sentry/integrations/utils/test_issue_summary_for_alerts.py::FetchIssueSummaryTest::test_fetch_issue_summary_with_hide_ai_features_disabledlog
[gw0] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/integrations/utils/test_issue_summary_for_alerts.py:74: in test_fetch_issue_summary_with_hide_ai_features_disabled
    assert result == mock_summary
E   AssertionError: assert None == {'headline': 'Test AI Summary', 'possibleCause': 'Test cause', 'whatsWrong': 'Something went wrong'}
tests/sentry/seer/endpoints/test_group_ai_summary.py::GroupAiSummaryEndpointTest::test_endpoint_with_error_responselog
[gw0] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/seer/endpoints/test_group_ai_summary.py:61: in test_endpoint_with_error_response
    assert response.status_code == 400
E   assert 500 == 400
E    +  where 500 = <Response status_code=500, "application/json">.status_code
tests/sentry/integrations/slack/test_message_builder.py::BuildGroupAttachmentTest::test_compact_alerts_with_ai_summarylog
[gw1] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/integrations/slack/test_message_builder.py:1205: in test_compact_alerts_with_ai_summary
    assert found_initial_guess, "Initial Guess context block not found"
E   AssertionError: Initial Guess context block not found
E   assert False
tests/sentry/integrations/slack/test_message_builder.py::BuildGroupAttachmentTest::test_build_group_block_with_ai_summarylog
[gw1] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/integrations/slack/test_message_builder.py:957: in test_build_group_block_with_ai_summary
    assert "Identity not found" in blocks["blocks"][0]["text"]["text"]
E   AssertionError: assert 'Identity not found' in ':red_circle: <http://testserver/organizations/baz/issues/84/?referrer=slack|*IntegrationError*>'
tests/sentry/integrations/slack/test_message_builder.py::BuildGroupAttachmentTest::test_build_group_block_with_ai_summary_text_truncationlog
[gw0] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/integrations/slack/test_message_builder.py:1044: in test_build_group_block_with_ai_summary_text_truncation
    assert "First line of text..." in title_text
E   AssertionError: assert 'First line of text...' in ':red_circle: <http://testserver/organizations/baz/issues/72/?referrer=slack|*IntegrationError*>'

Several test files were still mocking get_issue_summary with 2-tuple
return values instead of 3-tuples: the endpoint error test, Slack message
builder tests, and issue_summary_for_alerts tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@trevor-e trevor-e marked this pull request as ready for review April 3, 2026 20:45
@trevor-e trevor-e requested review from a team as code owners April 3, 2026 20:45
Comment on lines +77 to +79
from django.contrib.auth.models import AnonymousUser

from sentry.seer.autofix.issue_summary import get_issue_summary, run_automation
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do these imports need to be inlined or can they go to top of file?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

post_process.py is very picky about imports which is why this is all inlined.

Copy link
Copy Markdown
Contributor

@Mihir-Mavalankar Mihir-Mavalankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kick_off_seer_automation still needs to support Seer customers on the old plan. I think these changes break that

)

return summary_dict, 200
return summary_dict, 200, event
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't remove this here. The old Seer SKU (not per seat orgs) still use this flow.
It is the "old Seer" in this diagram - https://miro.com/app/board/uXjVJqn1-fQ=/?focusWidget=3458764648325594380

try:
run_automation(group, AnonymousUser(), event, SeerAutomationSource.POST_PROCESS)
except Exception:
logger.exception("seer.automation.run_automation_failed", extra={"group_id": group.id})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this issue warden caught is correct.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell this issue is present in the old code as well. get_issue_summary() checks:

    # 1. Check cache first
    if cached_summary := cache.get(cache_key):
        return convert_dict_key_case(cached_summary, snake_to_camel_case), 200

which early returns before the automation gets triggered. We can fix if you think this is a bug but trying to keep this PR having no behavior changes.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although to Warden's point maybe this should be named generate_summary_and_maybe_run_automation()

@Mihir-Mavalankar
Copy link
Copy Markdown
Contributor

Also maybe this PR is worth breaking down into 2 things:

  1. Renames and small non logic changes
  2. Any potential changes to logic of autofix automations.

source was only used to pass to run_automation which was removed from
this function. get_issue_summary still accepts source for billing logging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…e_summary

The refactor to return event from get_issue_summary removed the
run_automation call that was previously embedded in _generate_summary.
This restores the ALERT automation path by calling run_automation
explicitly after a successful summary generation.

Co-Authored-By: Claude Opus 4.6 <noreply@example.com>
Copy link
Copy Markdown
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 901ed71. Configure here.

logger.exception(
"seer.automation.run_automation_failed",
extra={"group_id": group.id},
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alert path runs automation without timeout protection

Medium Severity

The run_automation call, which makes HTTP requests to the Seer API, now runs synchronously on the main thread after the timed issue summary generation. This work is no longer covered by the timeout, allowing slow Seer responses to block the alert notification thread.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 901ed71. Configure here.

@trevor-e trevor-e closed this Apr 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants