feat(api): cap banned IPs per jail in /api/summary response#136

Merged
swissmakers merged 1 commit into
swissmakers:mainfrom
michael-ferioli:feat/cap-summary-banned-ips
May 17, 2026

Conversation

@michael-ferioli
Contributor

Why

The dashboard's /api/summary handler enumerates every active jail and returns the full banned-IP list for each via collectJailInfos. On hosts with large jails this becomes a hard scaling problem.

In our environment we have ~17 active jails and one of them (nginx-property-scraper) holds 4,500+ active bans at any given moment due to a sustained MLS-scraping campaign. With those numbers, /api/summary takes 3+ minutes to respond. Every reasonable HTTP gateway/proxy returns 504 long before that — for us it was AWS ALB's default 60s idle_timeout. Result: the dashboard's "Loading summary data…" panel never resolves and the JS console shows SyntaxError: Unexpected token '<', '<html><h'... is not valid JSON (the parser choking on the gateway's HTML 504).

What this changes

Adds a small cap on BannedIPs per jail in the JSON returned by /api/summary. The accurate TotalBanned count is always preserved — only the BannedIPs slice is truncated.

  • Default cap: 100 IPs per jail
  • Configurable via env var FAIL2BAN_UI_SUMMARY_MAX_IPS
  • Set to 0 to disable the cap and restore current behaviour (full list)
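
The capping rule above can be sketched in Go roughly as follows. This is an illustrative sketch, not the code from the patch: the `JailInfo` struct, `summaryMaxIPs`, and `newJailInfo` are hypothetical names, and the fallback to the default on an unset, non-numeric, or negative value is an assumption. Only the env var name, the default of 100, the meaning of 0, and the JSON field names come from this description.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultSummaryMaxIPs mirrors the default cap stated in the description.
const defaultSummaryMaxIPs = 100

// JailInfo is a hypothetical stand-in for the per-jail summary entry.
type JailInfo struct {
	JailName    string   `json:"jailName"`
	TotalBanned int      `json:"totalBanned"`
	BannedIPs   []string `json:"bannedIPs"`
}

// summaryMaxIPs reads FAIL2BAN_UI_SUMMARY_MAX_IPS. Falling back to the
// default on unset, non-numeric, or negative input is an assumption.
func summaryMaxIPs() int {
	raw, ok := os.LookupEnv("FAIL2BAN_UI_SUMMARY_MAX_IPS")
	if !ok {
		return defaultSummaryMaxIPs
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n < 0 {
		return defaultSummaryMaxIPs
	}
	return n
}

// newJailInfo records the true ban count first, then truncates only the
// returned slice. maxIPs == 0 disables the cap.
func newJailInfo(name string, ips []string, maxIPs int) JailInfo {
	info := JailInfo{JailName: name, TotalBanned: len(ips), BannedIPs: ips}
	if maxIPs > 0 && len(ips) > maxIPs {
		info.BannedIPs = ips[:maxIPs]
	}
	return info
}

func main() {
	ips := []string{"192.0.2.1", "192.0.2.2", "192.0.2.3"}
	info := newJailInfo("example-jail", ips, 2)
	fmt.Println(info.TotalBanned, len(info.BannedIPs)) // prints "3 2"
	fmt.Println("cap from environment:", summaryMaxIPs())
}
```

The count is taken before the slice is cut, which is what keeps totalBanned accurate regardless of the cap.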

Why 100 is enough for the dashboard

pkg/web/static/js/dashboard.js's renderBannedIPs shows only the first 5 IPs per jail by default (maxVisible = 5); the rest are hidden behind a "show more" toggle. 100 leaves comfortable headroom for the expanded view without forcing the backend to materialise tens of thousands of strings into a single JSON response.

Reproduction

# inside a container with a large jail
$ fail2ban-client status nginx-property-scraper | head -10
Status for the jail: nginx-property-scraper
   |- Currently banned:	4632
   ...
$ time curl -s http://localhost:8080/api/summary?serverId=local >/dev/null
# >180s, often times out at the gateway with 504

After this patch (with default FAIL2BAN_UI_SUMMARY_MAX_IPS=100):

$ time curl -s http://localhost:8080/api/summary?serverId=local | jq '.jails[] | {jailName, totalBanned, returnedIPs: (.bannedIPs | length)}'
# responds well under 1s
{ "jailName": "nginx-property-scraper", "totalBanned": 4632, "returnedIPs": 100 }
...

Trade-offs and follow-ups

  • Users who genuinely need the entire ban list in /api/summary can keep current behaviour via FAIL2BAN_UI_SUMMARY_MAX_IPS=0.
  • A cleaner long-term direction is to expose the full per-jail list via a separate paginated endpoint (e.g. /api/jails/:jail/banned?limit=&offset=) and have the dashboard fetch it on-demand when the user expands a jail. Happy to follow up with that as a separate PR if you'd like — keeping this one minimal so it can land quickly.

Diff

+23 / -1 lines, single file internal/fail2ban/connector_global.go.

Thanks for fail2ban-ui — it's been a great fit for our infrastructure.

Made with Cursor

The dashboard's /api/summary handler enumerates every active jail and
returns the FULL banned-IP list for each. On hosts with large jails
(e.g. 4,500+ active bans on one jail across 17 active jails), this
single endpoint can take 3+ minutes to respond, exceeding any
reasonable HTTP gateway/proxy timeout (we hit a 60s ALB idle_timeout
default that returned 504). The dashboard renders only the first 5
IPs per jail by default (renderBannedIPs, maxVisible=5), so returning
a few hundred is more than enough for the UI's needs.

This patch caps BannedIPs per jail at 100 (configurable via the
FAIL2BAN_UI_SUMMARY_MAX_IPS env var; set to 0 to disable). The
accurate TotalBanned count is always preserved -- only the BannedIPs
slice is truncated.

Reproduction:

    fail2ban-client status nginx-property-scraper  # ~4500 banned IPs
    curl http://localhost:8080/api/summary?serverId=local
    # times out at >180s

After the patch (FAIL2BAN_UI_SUMMARY_MAX_IPS=100):

    curl http://localhost:8080/api/summary?serverId=local
    # responds in <1s, 100 IPs per jail, accurate TotalBanned

Trade-offs: a future user who actually wants the full IP list per jail
in /api/summary can opt out via FAIL2BAN_UI_SUMMARY_MAX_IPS=0. A
follow-up could expose the full list via a paginated /api/jails/:jail
endpoint, but that's out of scope for this fix.
@swissmakers swissmakers self-assigned this Apr 29, 2026
@swissmakers
Owner

Hi @michael-ferioli, thanks a lot for your detailed write-up, the reproduction steps, and your careful analysis of renderBannedIPs. The 504 timeout against /api/summary on large jails looks like a real problem and is worth fixing.

A couple of things before we can merge this:

1. Please retarget the PR to dev, not main.
We merge feature changes through dev first and only promote them to main after functional and security testing. Could you please change the base branch of this PR to dev?

2. The hard cap of 100 has UI side effects we need to address.

  • The "Show more" button (inside Overview active Jails and Blocks) is a client-side reveal of IPs already loaded from /api/summary (initial maxVisible = 5, rest hidden). With the cap, "Show more" can only expose what is in the truncated slice, so by default only 100 IPs per jail would be reachable and the rest could not be shown.
  • The unban action is rendered per row in the DOM. IPs outside the returned slice have no row and therefore no unban button. The dashboard would no longer be able to surface or release those bans (they would only be reachable via the CLI or the manual form, which requires knowing the exact IP).
  • The dashboard search (Search Banned IPs) filters the client-side dataset. Once we cap the payload, search can no longer find IPs above the cap, and it fails silently. From a user's perspective the IP "doesn't exist", even though it is still actively banned.

So while the cap removes the timeout, it creates new problems.

3. Preferred direction

Your own follow-up proposal in the description, a separate paginated endpoint like /api/jails/:jail/banned?limit=&offset= with the dashboard fetching on demand when the user expands a jail (plus a server-side search query), would be the right architectural fix. It would solve the timeout and keep "Show more", unban, and search working at any scale.

Would you be willing to fold that into this PR (or replace this PR with one that does it)? Concretely, I think the following would make sense:

  • a paginated /api/jails/:jail/banned endpoint (with limit, offset, and ideally a q filter),
  • /api/summary returning only the count plus a small preview slice (I would keep that really small by default, e.g. 10 or even 5, since the full list is reachable via the new endpoint),
  • the dashboard switching "Show more" to a lazy fetch against the new endpoint, and search hitting it server-side rather than filtering the cached payload.
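
A rough Go sketch of the server-side paging and search logic such an endpoint could use is shown below. Everything here is hypothetical and for illustration only: `BannedPage`, `pageBanned`, the substring semantics of the q filter, and treating limit <= 0 as "no limit" are assumptions, not an agreed design or existing fail2ban-ui code.

```go
package main

import (
	"fmt"
	"strings"
)

// BannedPage is a hypothetical response shape for the paginated endpoint.
type BannedPage struct {
	Total  int      // number of IPs matching q, before paging
	Offset int      // offset actually applied after clamping
	IPs    []string // the requested window of matching IPs
}

// pageBanned filters all by substring q (empty q matches everything),
// then returns the window [offset, offset+limit). limit <= 0 means
// "no limit" in this sketch.
func pageBanned(all []string, q string, limit, offset int) BannedPage {
	matched := all
	if q != "" {
		matched = make([]string, 0, len(all))
		for _, ip := range all {
			if strings.Contains(ip, q) {
				matched = append(matched, ip)
			}
		}
	}
	// Clamp the offset so out-of-range requests yield an empty page.
	if offset < 0 {
		offset = 0
	}
	if offset > len(matched) {
		offset = len(matched)
	}
	end := len(matched)
	if limit > 0 && limit < end-offset {
		end = offset + limit
	}
	return BannedPage{Total: len(matched), Offset: offset, IPs: matched[offset:end]}
}

func main() {
	ips := []string{"192.0.2.10", "192.0.2.11", "198.51.100.7", "203.0.113.5"}
	page := pageBanned(ips, "192.0.2.", 1, 1)
	fmt.Println(page.Total, page.IPs) // prints "2 [192.0.2.11]"
}
```

Because the filter runs before the window is cut, Total reflects the number of matches, so the dashboard could still show an accurate count and find any banned IP while fetching only one page at a time.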

Alternatively, we could think about a Redis cache: fetch all blocked IPs and jails into Redis once, then serve them from there. After that we would update only single entries on ban/unban events, or refresh the whole dataset in the background on server enable/disable events.
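
To make the cache idea concrete, here is a minimal sketch that uses an in-process map guarded by a mutex instead of Redis, so it runs without external dependencies. The type and method names (`BanCache`, `Replace`, `Ban`, `Unban`, `Count`, `List`) are hypothetical; a Redis-backed version would presumably keep one set per jail with the same three operations.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// BanCache is a hypothetical in-memory stand-in for the proposed cache:
// one set of banned IPs per jail.
type BanCache struct {
	mu    sync.RWMutex
	jails map[string]map[string]struct{}
}

func NewBanCache() *BanCache {
	return &BanCache{jails: make(map[string]map[string]struct{})}
}

// Replace swaps in a full list for one jail (the initial fetch, or a
// background refresh after a server enable/disable event).
func (c *BanCache) Replace(jail string, ips []string) {
	set := make(map[string]struct{}, len(ips))
	for _, ip := range ips {
		set[ip] = struct{}{}
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.jails[jail] = set
}

// Ban adds a single entry without refetching the whole jail.
func (c *BanCache) Ban(jail, ip string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.jails[jail] == nil {
		c.jails[jail] = make(map[string]struct{})
	}
	c.jails[jail][ip] = struct{}{}
}

// Unban removes a single entry; unknown jails or IPs are a no-op.
func (c *BanCache) Unban(jail, ip string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.jails[jail], ip)
}

// Count returns the number of cached bans for a jail.
func (c *BanCache) Count(jail string) int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return len(c.jails[jail])
}

// List returns the cached IPs for a jail in lexicographic string order.
func (c *BanCache) List(jail string) []string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := make([]string, 0, len(c.jails[jail]))
	for ip := range c.jails[jail] {
		out = append(out, ip)
	}
	sort.Strings(out)
	return out
}

func main() {
	cache := NewBanCache()
	cache.Replace("example-jail", []string{"192.0.2.2", "192.0.2.1"})
	cache.Ban("example-jail", "192.0.2.3")
	cache.Unban("example-jail", "192.0.2.1")
	fmt.Println(cache.Count("example-jail"), cache.List("example-jail"))
	// prints "2 [192.0.2.2 192.0.2.3]"
}
```

With something like this in place, /api/summary could answer from the cache rather than querying every jail on each request; keeping the cache consistent with bans made outside the UI is left open here.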

If that's too large to land quickly and you'd prefer to keep this PR as a stopgap, you can also just switch the destination branch to dev as a first step. We will then add the rest to our roadmap.

Happy either way, your call on which path you want to take. Thanks again for the contribution!

@swissmakers swissmakers merged commit 33d4738 into swissmakers:main May 17, 2026
@swissmakers
Owner

Change of plan: I will merge it, then move it to dev and take a look at it there. Thanks again for your contribution! 👍
