Skip to content

frontend: stream /backend/update/ response to prevent Apache 504#4280

Closed
xsuchy wants to merge 1 commit into
fedora-copr:mainfrom
xsuchy:2928
Closed

frontend: stream /backend/update/ response to prevent Apache 504#4280
xsuchy wants to merge 1 commit into
fedora-copr:mainfrom
xsuchy:2928

Conversation

@xsuchy
Copy link
Copy Markdown
Member

@xsuchy xsuchy commented Apr 23, 2026

When update_state_from_dict triggers slow cascading deletes (e.g. removing large projects), Apache times out waiting for the response. Stream the response so newlines are sent after each db.session.commit(), keeping the connection alive.

Fixes: #2928

Assisted-By: Claude Opus 4.6 noreply@anthropic.com

@xsuchy
Copy link
Copy Markdown
Member Author

xsuchy commented Apr 23, 2026

Disclaimer, I actually did not try it as I do not have the big repo that I can delete.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces response streaming to the /update/ endpoint to prevent Apache 504 timeouts during potentially slow database operations, such as cascading deletes. The feedback suggests improving the effectiveness of this mechanism by yielding an initial newline at the start of the stream and yielding again immediately before the database commit. Additionally, the reviewer recommended adding validation to raise a ValueError if build IDs are missing and refactoring list constructions to use standard for-loops for consistency with the existing codebase.

Comment on lines +408 to +428
def _stream():
result = {}
for typ, logic_cls in [("actions", actions_logic.ActionsLogic),
("builds", BuildsLogic)]:
if typ not in request_data:
continue
to_update = {}
for obj in request_data[typ]:
to_update[obj["id"]] = obj
existing = {}
for obj in logic_cls.get_by_ids(to_update.keys()).all():
existing[obj.id] = obj
non_existing_ids = list(set(to_update.keys()) - set(existing.keys()))
for i, obj in existing.items():
logic_cls.update_state_from_dict(obj, to_update[i])
db.session.commit()
result[f"updated_{typ}_ids"] = list(existing.keys())
result[f"non_existing_{typ}_ids"] = non_existing_ids
# Keep the connection alive while the DB commits.
yield "\n"
yield flask.json.dumps(result)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current implementation yields a newline after the db.session.commit() call. If the cascading delete triggered by update_state_from_dict is slow, Apache might still time out before the first newline is ever sent.

To effectively prevent the 504 timeout, it is better to:

  1. Yield an initial newline at the very start of the generator to ensure headers are sent and the connection is marked as active.
  2. Yield the newline before the db.session.commit() call to reset the Apache timeout timer just before the potentially slow operation starts.

Additionally, the code has been updated to comply with repository rules: raising a ValueError for empty build_ids and using standard for-loops for list construction instead of shorthand constructors or comprehensions.

    def _stream():
        # Send an initial newline to open the connection and prevent timeout
        # during the first potentially slow database operation.
        yield "\n"
        result = {}
        for typ, logic_cls in [("actions", actions_logic.ActionsLogic),
                               ("builds", BuildsLogic)]:
            if typ not in request_data:
                continue
            to_update = {}
            for obj in request_data[typ]:
                to_update[obj["id"]] = obj

            if typ == "builds" and not to_update:
                raise ValueError("The list of build_ids cannot be empty.")

            existing = {}
            for obj in logic_cls.get_by_ids(to_update.keys()).all():
                existing[obj.id] = obj

            non_existing_ids = []
            for id_key in to_update.keys():
                if id_key not in existing:
                    non_existing_ids.append(id_key)

            for i, obj in existing.items():
                logic_cls.update_state_from_dict(obj, to_update[i])

            # Keep the connection alive while the DB commits.
            yield "\n"
            db.session.commit()

            updated_ids = []
            for key in existing.keys():
                updated_ids.append(key)

            result[f"updated_{typ}_ids"] = updated_ids
            result[f"non_existing_{typ}_ids"] = non_existing_ids
        yield flask.json.dumps(result)
References
  1. When querying for content, if the list of build_ids is empty, raise a ValueError as it is considered an invalid state.
  2. For consistency with similar functions, prefer using a standard for-loop to build a list over a list comprehension.

@nikromen nikromen self-requested a review April 27, 2026 08:13
@praiskup
Copy link
Copy Markdown
Member

/packit test

@praiskup
Copy link
Copy Markdown
Member

Trying to fix CI in #4285

@praiskup
Copy link
Copy Markdown
Member

Can you please rebase on top of main?

When update_state_from_dict triggers slow cascading deletes (e.g.
removing large projects), Apache times out waiting for the response.
Stream the response so newlines are sent after each db.session.commit(),
keeping the connection alive.

Fixes: fedora-copr#2928

Assisted-By: Claude Opus 4.6 <noreply@anthropic.com>
@nikromen
Copy link
Copy Markdown
Member

rebased, let's see

/packit test

@praiskup
Copy link
Copy Markdown
Member

praiskup commented May 5, 2026

When update_state_from_dict triggers slow cascading deletes (e.g. removing large projects), Apache times out waiting for the response. Stream the response so newlines are sent after each db.session.commit(), keeping the connection alive.

Are we really talking about connection timeouts (as configured in wsgi)? If so, I do not believe this streaming response would help here (in a sense that it would make the response fit the limit).

@praiskup
Copy link
Copy Markdown
Member

praiskup commented May 5, 2026

WSGIDaemonProcess backend user=copr-fe group=copr-fe {{ develizer(2, 15) }} display-name="httpd backend" maximum-requests=8000 graceful-timeout=20

Hmm, does this mean we have the default 60s timeout?

result[f"updated_{typ}_ids"] = list(existing.keys())
result[f"non_existing_{typ}_ids"] = non_existing_ids
# Keep the connection alive while the DB commits.
yield "\n"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about streamed_json_array_response ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(sending just one byte is going to likely just fill some buffer somewhere, we should be flushing)

@praiskup
Copy link
Copy Markdown
Member

praiskup commented May 5, 2026

Otherwise, sorry @xsuchy for the delay here, this is indeed rather a simple change (still not sure if correct) - especially since we are in /backend/ namespace where it's not that likely we would be facing problems like from #4283.

@praiskup
Copy link
Copy Markdown
Member

praiskup commented May 5, 2026

The real problem probably lies here:

for build in to_delete:
db.session.delete(build)

@praiskup
Copy link
Copy Markdown
Member

praiskup commented May 5, 2026

Because:

for chunk in batched(builds, 1000):
# Don't send delete action for each build, rather send an action to delete
# a whole project as a part of CoprsLogic.delete_unsafe() method.
BuildsLogic.delete_builds(user, chunk, send_delete_action=False)

@praiskup
Copy link
Copy Markdown
Member

praiskup commented May 5, 2026

Check the last comment in #4280 (the update was just a transitive symptom of DoSed frontend).
The real problem was sub-optimal build removal, which was fixed.

Unless we still see those /update/ tracebacks in Sentry?

@praiskup
Copy link
Copy Markdown
Member

praiskup commented May 5, 2026

Snímek obrazovky_20260505_205625

There don't seem to be related tracebacks in backend_ns.update now. Closing as fixed.

@praiskup praiskup closed this May 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Copr tracebacks (httpd 504) on requests to remove large projects

3 participants