fix(threats): correct RFC1918 source IP attribution #670

Open
jedis00 wants to merge 1 commit into Ozark-Connect:dev from jedis00:fix/threats-attribution

Contributor

jedis00 commented May 26, 2026

Fix destination ASN attribution on source-IP groupings; add categorized filters, self-detection, and configurable display limits

Summary

The "Top Threat Sources" table on audit PDFs displayed misleading attribution: internal source IPs were tagged with destination ASNs. RFC1918 internal hosts that ran outbound DNS-over-HTTPS queries to NextDNS were labeled NextDNS, Inc.; an internal workstation that mostly talked to Google services was labeled Google LLC. Neither host owns a public ASN; the values belonged to the destinations of their flows, not to the sources.

This PR fixes the root cause, makes the audit report honor the same noise filters as the threat dashboard, adds an Infrastructure / TrustedUser filter category so suppressed activity is visible in dedicated sub-tables rather than silently dropped, surfaces categorized rows on the dashboard with a Category badge, and exposes the audit display limits as user-editable settings.

Root cause

GeoEnrichmentService.EnrichEvents redirected the MaxMind lookup to the destination IP for RFC1918 sources, then wrote the result back onto evt.CountryCode / evt.AsnOrg. The event row carried the destination's ASN under fields that callers read as the source's ASN. ThreatRepository.GetTopSourcesAsync then did GroupBy(SourceIp).Select(g => new { ..., AsnOrg = g.First().AsnOrg }), surfacing the destination ASN under a source-IP label.

A secondary issue: the audit PDF call path (AuditService.BuildThreatSummaryAsync) never called SetNoiseFilters on the scoped repository, so any noise filters configured via the dashboard UI were ignored in the audit report.
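
The mechanism can be reproduced in a few lines. This is a minimal Python sketch of the behavior described above, not the project's C# code; the `Event` class, the `FAKE_ASN_DB` lookup table, and the sample addresses are hypothetical stand-ins for ThreatEvent and the MaxMind database.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional

# Hypothetical stand-in for a MaxMind ASN lookup; values are illustrative only.
FAKE_ASN_DB = {"203.0.113.10": "Example DNS, Inc."}

RFC1918 = [ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_rfc1918(ip: str) -> bool:
    addr = ip_address(ip)
    return any(addr in net for net in RFC1918)


@dataclass
class Event:
    source_ip: str
    dest_ip: str
    asn_org: Optional[str] = None  # callers read this as the *source* ASN


def buggy_enrich(evt: Event) -> None:
    # Pre-fix behavior as described: private sources redirect the lookup to the
    # destination IP, but the result still lands in the source-side field.
    lookup_ip = evt.dest_ip if is_rfc1918(evt.source_ip) else evt.source_ip
    evt.asn_org = FAKE_ASN_DB.get(lookup_ip)


def top_sources(events: list) -> dict:
    # Mirrors GroupBy(SourceIp) + First().AsnOrg: the first event per source wins.
    out = {}
    for evt in events:
        out.setdefault(evt.source_ip, evt.asn_org)
    return out


events = [Event("192.168.1.50", "203.0.113.10")]
for e in events:
    buggy_enrich(e)
print(top_sources(events))  # the private source carries the destination's ASN org
```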

Changes

Bug fixes

  • ThreatEvent - split source and destination geo enrichment into distinct fields. CountryCode / AsnOrg / etc. now reflect the source IP only (null for RFC1918, which is the correct answer). New DestCountryCode / DestAsn / DestAsnOrg / etc. carry destination enrichment.
  • GeoEnrichmentService.EnrichEvents - rewritten to enrich the source IP into source fields and the destination IP into destination fields. No more cross-contamination.
  • AuditService.BuildThreatSummaryAsync - now loads enabled noise filters and applies them via SetNoiseFilters before querying, matching dashboard behavior.
  • BackfillGeoDataAsync - new GeoEnriched flag drives the predicate so RFC1918 events (which legitimately have null source geo) are not re-processed forever.
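
The split-field layout and the GeoEnriched gate can be sketched as follows. This is an illustrative Python model under stated assumptions, not the actual C# change: `lookup_asn_org`, `needs_backfill`, and `FAKE_ASN_DB` are hypothetical names, and only the ASN org field is modeled.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional

# Hypothetical lookup table standing in for the MaxMind database.
FAKE_ASN_DB = {"203.0.113.10": "Example DNS, Inc.", "198.51.100.7": "Example Transit LLC"}

NON_ROUTABLE = [ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC1918
    "127.0.0.0/8",                                     # loopback
    "169.254.0.0/16",                                  # link-local
)]


def lookup_asn_org(ip: str) -> Optional[str]:
    # Private, loopback, and link-local addresses have no public ASN.
    addr = ip_address(ip)
    if any(addr in net for net in NON_ROUTABLE):
        return None
    return FAKE_ASN_DB.get(ip)


@dataclass
class Event:
    source_ip: str
    dest_ip: str
    asn_org: Optional[str] = None       # source side only
    dest_asn_org: Optional[str] = None  # destination side only
    geo_enriched: bool = False


def enrich(evt: Event) -> None:
    # Each IP is enriched into its own field; nothing crosses over.
    evt.asn_org = lookup_asn_org(evt.source_ip)
    evt.dest_asn_org = lookup_asn_org(evt.dest_ip)
    evt.geo_enriched = True


def needs_backfill(evt: Event) -> bool:
    # Keyed on the flag rather than on "asn_org is None", so a private source
    # with legitimately empty source geo is not picked up again.
    return not evt.geo_enriched


evt = Event("192.168.1.50", "203.0.113.10")
enrich(evt)
```

After `enrich(evt)`, the source field stays empty, the destination field carries the ASN org, and `needs_backfill(evt)` is false even though the source geo is null.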

Feature: categorized filters

  • ThreatFilterCategory (new enum) - Noise / Infrastructure / TrustedUser.
  • ThreatNoiseFilter - new Category, Label, IsSystem fields. IsSystem entries are locked from delete/disable in the UI.
  • ThreatRepository.GetSourcesByCategoryAsync (new) - returns the source-IP rollups that match enabled filters of the given category, bypassing the noise-filter exclusion so suppressed activity can be re-surfaced for the audit report.
  • AuditService.BuildThreatSummaryAsync - populates InfrastructureSources and TrustedUserSources and a SuppressedEventCount total.
  • PdfReportGenerator / MarkdownReportGenerator - render "Known Infrastructure Activity" and "Trusted User Activity" sub-tables beneath the main Top Threat Sources table. A footnote on the main table calls out the suppressed counts.
  • ThreatDashboard.razor - the existing noise-filter form gets Category dropdown and optional Label input. The existing filter table shows Category / Label columns and disables delete/disable controls on IsSystem entries.
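
A rough Python model of the per-category rollup, assuming filters match on an exact IP or a CIDR as the existing Matches() contract does. The `Filter` class and `sources_by_category` function are hypothetical, and the enum values other than Noise = 0 are assumptions.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from ipaddress import ip_address, ip_network
from typing import Optional


class Category(Enum):
    NOISE = 0            # the PR states 0 = Noise
    INFRASTRUCTURE = 1   # assumed value
    TRUSTED_USER = 2     # assumed value


@dataclass
class Filter:
    source: str                       # exact IP or CIDR
    category: Category = Category.NOISE
    label: Optional[str] = None
    enabled: bool = True
    is_system: bool = False

    def matches(self, ip: str) -> bool:
        # strict=False lets a bare IP parse as a single-address network.
        return ip_address(ip) in ip_network(self.source, strict=False)


def sources_by_category(source_ips, filters, category, limit):
    # Only enabled filters of the requested category count; the result is a
    # per-source event count, largest first, trimmed to the display limit.
    active = [f for f in filters if f.enabled and f.category is category]
    counts = Counter(ip for ip in source_ips if any(f.matches(ip) for f in active))
    return counts.most_common(limit)


ips = ["192.0.2.7", "192.0.2.7", "192.0.2.9", "198.51.100.1"]
filters = [
    Filter("192.0.2.0/24", Category.INFRASTRUCTURE, "Lab subnet"),
    Filter("198.51.100.1", Category.TRUSTED_USER, "Admin", enabled=False),
]
infra = sources_by_category(ips, filters, Category.INFRASTRUCTURE, limit=10)
```

Here `infra` contains the two sources inside 192.0.2.0/24, while the disabled TrustedUser filter contributes nothing to its category.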

Feature: dashboard category badges

  • ThreatDashboardService.GetDashboardDataAsync - the dashboard's main Top Threat Sources table now applies only Noise-category filters (when the master toggle is on). Infrastructure and TrustedUser sources remain in the main table with a Category column badge so they are visible at a glance. The kill chain, ports, and pattern queries keep the full-filter behavior so categorized noise does not dominate charts.
  • SourceIpSummary.MatchedFilterCategory / Label are now attached during enrichment in the dashboard service for badge rendering.
  • The audit PDF keeps the original "main excludes categorized, sub-tables surface them" layout. Dashboard and PDF have different presentations of the same data: badge-in-main for interactive use, sub-tables for the static report.
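
The two presentations differ only in what happens to a matched row. A small Python sketch of that difference follows; the filter tuples, the "Admin laptop" label, and the function names are hypothetical, and how Noise rows are shown when the master toggle is off is an assumption.

```python
from ipaddress import ip_address, ip_network

# (pattern, category, label) tuples; all values are hypothetical examples.
FILTERS = [
    ("192.0.2.0/24", "Noise", None),
    ("10.0.0.5", "Infrastructure", "Network Optimizer (self)"),
    ("10.0.0.20", "TrustedUser", "Admin laptop"),
]


def match_filter(ip, filters):
    # Returns (category, label) of the first matching filter, or None.
    addr = ip_address(ip)
    for pattern, category, label in filters:
        if addr in ip_network(pattern, strict=False):
            return category, label
    return None


def dashboard_rows(source_ips, filters, noise_filtering_on=True):
    # Dashboard: drop Noise rows, keep categorized rows and attach a badge.
    rows = []
    for ip in dict.fromkeys(source_ips):  # unique IPs, original order
        hit = match_filter(ip, filters)
        if hit and hit[0] == "Noise":
            if noise_filtering_on:
                continue
            hit = None  # assumption: Noise rows get no badge when shown
        rows.append({
            "source_ip": ip,
            "category": hit[0] if hit else None,
            "label": hit[1] if hit else None,
        })
    return rows


def audit_main_rows(source_ips, filters):
    # Audit PDF main table: every categorized row is excluded; the
    # Infrastructure and TrustedUser rows go to their own sub-tables.
    return [ip for ip in dict.fromkeys(source_ips) if match_filter(ip, filters) is None]


ips = ["192.0.2.9", "10.0.0.5", "10.0.0.20", "203.0.113.4"]
```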

Feature: auto-detect self

  • ThreatCollectionService.EnsureSelfInfrastructureFilterAsync (new) - on startup, detects the host's primary IPv4 (UDP-socket-route trick, NIC enumeration fallback) and ensures an enabled, IsSystem Infrastructure-category filter exists for it with Label = "Network Optimizer (self)". If the IP changes (DHCP renewal), the prior system entry is demoted (IsSystem = false, Enabled = false) but kept in the table for audit history; the user can delete it via the UI when ready. If the same IP later returns, the existing demoted entry is re-promoted to system rather than a duplicate being created.
  • IThreatRepository.DemoteAndDisableSystemFilterAsync / PromoteToSystemFilterAsync (new) support the demote-on-change lifecycle.
  • Self-detection is IPv4 only (AddressFamily.InterNetwork); IPv6-primary hosts will not be auto-detected. Manual filter entries still work for IPv6.
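
For readers unfamiliar with the UDP-socket-route trick, here is a minimal Python sketch together with a model of the demote/re-promote lifecycle. It is an illustration, not the service's C# code: the NIC enumeration fallback is omitted, the probe target 192.0.2.1 is an arbitrary documentation address, and the dictionary-based filter rows and `ensure_self_filter` are hypothetical.

```python
import socket
from typing import Optional

SELF_LABEL = "Network Optimizer (self)"


def detect_primary_ipv4() -> Optional[str]:
    # Connecting a UDP socket sends no packets; it only asks the OS which
    # local address would be used to route toward the target.
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.connect(("192.0.2.1", 9))
            ip = s.getsockname()[0]
            return None if ip == "0.0.0.0" else ip
    except OSError:
        return None  # no route available; the real service falls back to NICs


def ensure_self_filter(filters: list, current_ip: str) -> None:
    target = None
    for f in filters:
        if f["label"] != SELF_LABEL:
            continue
        if f["source"] == current_ip:
            target = f                 # existing entry, possibly demoted
        elif f["is_system"]:
            f["is_system"] = False     # demote the stale entry but keep it
            f["enabled"] = False
    if target is None:
        target = {"source": current_ip, "category": "Infrastructure", "label": SELF_LABEL}
        filters.append(target)
    target["is_system"] = True         # create-or-promote in one place
    target["enabled"] = True


filters = []
ensure_self_filter(filters, "10.0.0.5")  # first start
ensure_self_filter(filters, "10.0.0.9")  # IP changed: old entry demoted
ensure_self_filter(filters, "10.0.0.5")  # IP returned: old entry re-promoted
```

After the three calls the table holds two rows, not three: the 10.0.0.5 entry is a system filter again and the 10.0.0.9 entry is demoted and disabled.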

Feature: configurable audit display limits

  • Three new settings keys with defaults 5, 10, and 20 respectively:
    • threats.top_sources_main_limit - rows in the main Top Threat Sources table
    • threats.top_sources_category_limit - rows in each categorized sub-table
    • threats.sources_by_category_query_limit - candidate fetch size from the DB per category before trimming to the display limit
  • AuditService.BuildThreatSummaryAsync reads them via a new GetIntSettingAsync helper; redundant .Take(N) calls on the PDF/MD renderers are removed.
  • Settings.razor Threat Intelligence section gains three numeric input fields with form-help text. SaveThreatSettings clamps to sane ranges (1-100, 1-100, 1-500) before persisting in case of direct DB writes.
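
A sketch of the clamping and the integer read, in Python for brevity. The pairing of defaults to keys follows the listing order above and is an assumption, as is the fallback-to-default behavior of the hypothetical `get_int_setting`; the PR does not spell out what GetIntSettingAsync does with a malformed value.

```python
# key -> (default, minimum, maximum); the default-to-key pairing is assumed
# from the order in which the PR lists the keys.
LIMITS = {
    "threats.top_sources_main_limit": (5, 1, 100),
    "threats.top_sources_category_limit": (10, 1, 100),
    "threats.sources_by_category_query_limit": (20, 1, 500),
}


def clamp_limit(key: str, value: int) -> int:
    # Save path: force the value into the allowed range before persisting.
    _, lo, hi = LIMITS[key]
    return max(lo, min(hi, value))


def get_int_setting(store: dict, key: str) -> int:
    # Read path: a missing or non-numeric stored value falls back to the default.
    default = LIMITS[key][0]
    try:
        return int(store.get(key, default))
    except (TypeError, ValueError):
        return default


store = {"threats.top_sources_main_limit": "12", "threats.top_sources_category_limit": "abc"}
main_limit = get_int_setting(store, "threats.top_sources_main_limit")
```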

CrowdSec hardening

  • CrowdSecLookupOutcome.NotApplicable (new enum value) for RFC1918 / loopback / link-local IPs.
  • CrowdSecEnrichmentService.GetReputationAsync now has a centralized private-IP guard at the canonical entry point. Returns NotApplicable before touching cache or the API. Closes a quota-burn gap on the previously-unguarded public ThreatDashboardService.GetCrowdSecReputationAsync. Existing inline guards at higher call sites are kept as defense-in-depth.
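
The shape of the guard, as a Python sketch. Only NotApplicable comes from the PR; the other outcome names, the dictionary cache, and the `fetch` callback are hypothetical placeholders for the real cache and CrowdSec API client.

```python
from enum import Enum
from ipaddress import ip_address, ip_network


class Outcome(Enum):
    NOT_APPLICABLE = "NotApplicable"  # from the PR
    CACHED = "Cached"                 # hypothetical
    FETCHED = "Fetched"               # hypothetical


NON_ROUTABLE = [ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC1918
    "127.0.0.0/8",                                     # loopback
    "169.254.0.0/16",                                  # link-local
)]


def is_non_routable(ip: str) -> bool:
    addr = ip_address(ip)
    return any(addr in net for net in NON_ROUTABLE)


def get_reputation(ip: str, cache: dict, fetch):
    # Guard first: private addresses never reach the cache or the API,
    # so they cannot consume lookup quota.
    if is_non_routable(ip):
        return Outcome.NOT_APPLICABLE, None
    if ip in cache:
        return Outcome.CACHED, cache[ip]
    cache[ip] = fetch(ip)
    return Outcome.FETCHED, cache[ip]


calls = []


def fake_fetch(ip):
    # Records each API call so the example can show when the API is reached.
    calls.append(ip)
    return {"ip": ip, "reputation": "unknown"}


cache = {}
private_result = get_reputation("192.168.1.10", cache, fake_fetch)
public_first = get_reputation("203.0.113.4", cache, fake_fetch)
public_second = get_reputation("203.0.113.4", cache, fake_fetch)
```

Only the first public lookup reaches `fake_fetch`; the private address returns before the cache is consulted, and the repeated public lookup is served from the cache.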

Schema

Migration 20260526120000_AddDestGeoAndFilterCategory:

  • ThreatEvents: adds DestCountryCode, DestCity, DestAsn, DestAsnOrg, DestLatitude, DestLongitude, GeoEnriched. One-time SQL backfill sets GeoEnriched = 1 for rows with non-null CountryCode; a follow-up UPDATE nulls source-geo and clears GeoEnriched for RFC1918 / loopback / link-local source rows so the backfill loop re-enriches them with the corrected logic (closes the data-integrity gap on pre-fix rows).
  • ThreatNoiseFilters: adds Category (int, default 0 = Noise), Label (text, nullable), IsSystem (bool, default false). Indexes IX_ThreatNoiseFilters_Category_Enabled and IX_ThreatNoiseFilters_Category_SourceIp.
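
The two-step backfill on ThreatEvents can be tried against an in-memory SQLite database. This Python sketch uses a trimmed table and prefix matching on SourceIp; the actual migration SQL is not shown in this PR, so the statements below are an assumption about its shape rather than a copy, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ThreatEvents (
    Id INTEGER PRIMARY KEY,
    SourceIp TEXT NOT NULL,
    CountryCode TEXT,
    AsnOrg TEXT,
    GeoEnriched INTEGER NOT NULL DEFAULT 0
);
INSERT INTO ThreatEvents (SourceIp, CountryCode, AsnOrg) VALUES
    ('192.168.1.50', 'US', 'Example DNS, Inc.'),   -- pre-fix row with destination geo
    ('198.51.100.7', 'DE', 'Example Transit LLC'), -- public source, already correct
    ('172.20.3.4',   'US', 'Example Cloud LLC'),   -- RFC1918 inside 172.16.0.0/12
    ('172.32.0.1',   'FR', 'Example ISP SA');      -- public, just outside that range
""")

# Step 1: mark every row that already carries source geo as enriched.
conn.execute("UPDATE ThreatEvents SET GeoEnriched = 1 WHERE CountryCode IS NOT NULL")

# Step 2: scrub private, loopback, and link-local sources and clear the flag
# so the runtime backfill picks them up again under the corrected logic.
conn.execute("""
UPDATE ThreatEvents
SET CountryCode = NULL, AsnOrg = NULL, GeoEnriched = 0
WHERE SourceIp LIKE '10.%'
   OR SourceIp LIKE '192.168.%'
   OR SourceIp LIKE '127.%'
   OR SourceIp LIKE '169.254.%'
   OR (SourceIp LIKE '172.%'
       AND CAST(substr(SourceIp, 5, instr(substr(SourceIp, 5), '.') - 1) AS INTEGER)
           BETWEEN 16 AND 31)
""")

rows = conn.execute(
    "SELECT SourceIp, CountryCode, AsnOrg, GeoEnriched FROM ThreatEvents ORDER BY Id"
).fetchall()
```

In this run the two private rows end up with null source geo and GeoEnriched = 0, while both public rows keep their values and stay marked as enriched.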

Behavior on existing data

  • Pre-fix RFC1918 source rows have their source-geo fields nulled by the migration scrub and GeoEnriched reset, so the backfill re-enriches them with the corrected logic (source ends up null, which is the truthful answer).
  • Old events with public-IP sources keep their existing CountryCode / AsnOrg values (migration marks them GeoEnriched = 1).
  • Old noise filters default to Category = Noise, preserving prior behavior.
  • The self-IP entry is created on the next service start. If the host IP later changes, the prior entry is demoted-not-deleted (see auto-detect-self section above) so audit history is preserved.

Known limitations (decided, not deferred)

  • Self-IP detection runs at service start only. If the host gets a new DHCP lease mid-run, the system filter is stale until the next service restart. Accepted trade-off (LXC IP rarely changes; periodic re-check adds complexity for a corner case).
  • Self-detection is the optimizer LXC only. Other on-network infrastructure (local DNS proxies, internal services) requires manual UI setup. Auto-detection of those is intentionally out of scope - kept user-driven.
  • Self-detection is IPv4 only. IPv6-primary hosts need a manual filter entry.

Tests

318 tests pass (190 Threats + 128 Storage). New coverage:

  • GeoEnrichmentServiceTests - private/loopback/link-local/invalid IPs return GeoInfo.Empty; no-DB-loaded case is a true no-op (no fields set, GeoEnriched stays false so backfill is correctly gated by IsCityAvailable upstream).
  • ThreatNoiseFilterTests - default Category is Noise, default IsSystem is false; existing Matches() contract (exact + CIDR) still works after the new fields are added.
  • ThreatNoiseFilterPersistenceTests - new Category, Label, IsSystem fields round-trip through SaveNoiseFilterAsync / GetNoiseFiltersAsync; legacy filters without an explicit Category land as Noise. DemoteAndDisableSystemFilterAsync strips IsSystem and disables but keeps the row+label for audit history; PromoteToSystemFilterAsync restores both.
  • CrowdSecEnrichmentServiceTests - private IPs return NotApplicable without touching cache or API (verified with a strict mock that throws on any unconfigured method call); public IPs still hit the cache layer first.
  • ThreatRepositoryCategoryTests - the direct regression coverage:
    • GetTopSourcesAsync_PrivateSource_DoesNotInheritDestinationAsn is the regression test for the actual bug. Constructs an event with the corrected enrichment field layout (RFC1918 source, source geo null, DestAsnOrg="NextDNS, Inc.") and asserts the table no longer surfaces the destination ASN under the source.
    • GetSourcesByCategoryAsync coverage: empty filters, exact IP, CIDR (192.0.2.0/24), disabled-filter exclusion, bypass of noise-filter exclusion (the audit-report path), and category isolation.
    • GetTopSourcesAsync_NoiseOnlyFilterSet_AllowsInfrastructureRowsThrough - the data-layer guarantee the dashboard's Category-badge layout depends on. With only Noise-category filters applied, Infrastructure rows remain visible in the main table.

Full E2E MaxMind enrichment (with a real .mmdb) is not included - the behavior is exercised indirectly via the repository tests that consume the post-enrichment field layout. Migration SQL data scrub is not exercised in tests because the EF Core InMemory provider ignores migrationBuilder.Sql(); manual verification was done on a production-data copy before merge.

…tegories

GeoEnrichmentService was redirecting MaxMind lookups to the destination
IP for RFC1918 sources but writing the result back onto the source geo
fields. ThreatRepository.GetTopSourcesAsync then surfaced destination
ASNs under source IP labels, causing private IPs to appear as Google,
NextDNS, Cloudflare, etc. in dashboards and audit PDFs.

Fix splits source/dest geo on ThreatEvent and adds a GeoEnriched flag.
Source enrichment writes to source fields; dest enrichment writes to
dest fields. Private/loopback/link-local sources no longer receive
source-side enrichment. CrowdSec enrichment short-circuits private IPs
at the service entry point via a NotApplicable outcome.

Adds ThreatFilterCategory enum (Noise/Infrastructure/TrustedUser) and
Category/Label/IsSystem columns on ThreatNoiseFilters with supporting
indexes. EnsureSelfInfrastructureFilterAsync auto-detects the
optimizer's primary IPv4 at startup and registers a system
Infrastructure filter so the optimizer's own scanning traffic surfaces
in a categorized sub-table rather than the main Top Sources. On IP
change the old entry is demoted to non-system and disabled (kept for
audit history); if the prior IP returns, its entry is re-promoted.

AuditService.BuildThreatSummaryAsync applies Noise-only filtering to
Top Sources (Infrastructure/TrustedUser rows surface in their own
sub-tables) and reads three new configurable display limits from
settings: threats.top_sources_main_limit,
threats.top_sources_category_limit,
threats.sources_by_category_query_limit. ThreatDashboard adds a
Category column with badges (purple Infrastructure, teal TrustedUser);
Settings exposes the three limits as numeric inputs.

Migration 20260526120000_AddDestGeoAndFilterCategory adds the new
columns, indexes, and a backfill that sets GeoEnriched=1 for existing
enriched rows then clears it (and nulls source geo) for
RFC1918/loopback/link-local source rows so the runtime backfill loop
re-attributes them under corrected logic.

Tests: 5 new files covering the private-IP guards, category
persistence and Demote/Promote lifecycle, CrowdSec NotApplicable
behavior, and the GetTopSourcesAsync regression. All 190 Threats and
128 Storage tests pass.
@tvancott42
Collaborator

I've done an initial review on this one... nice work, to start. I should have time to wrap up the review and make a couple of small housekeeping fixes next week.
