fix(threats): correct RFC1918 source IP attribution#670
Open
jedis00 wants to merge 1 commit into
Open
Conversation
…tegories GeoEnrichmentService was redirecting MaxMind lookups to the destination IP for RFC1918 sources but writing the result back onto the source geo fields. ThreatRepository.GetTopSourcesAsync then surfaced destination ASNs under source IP labels, causing private IPs to appear as Google, NextDNS, Cloudflare, etc. in dashboards and audit PDFs. Fix splits source/dest geo on ThreatEvent and adds a GeoEnriched flag. Source enrichment writes to source fields; dest enrichment writes to dest fields. Private/loopback/link-local sources no longer receive source-side enrichment. CrowdSec enrichment short-circuits private IPs at the service entry point via a NotApplicable outcome. Adds ThreatFilterCategory enum (Noise/Infrastructure/TrustedUser) and Category/Label/IsSystem columns on ThreatNoiseFilters with supporting indexes. EnsureSelfInfrastructureFilterAsync auto-detects the optimizer's primary IPv4 at startup and registers a system Infrastructure filter so the optimizer's own scanning traffic surfaces in a categorized sub-table rather than the main Top Sources. On IP change the old entry is demoted to non-system and disabled (kept for audit history); if the prior IP returns, its entry is re-promoted. AuditService.BuildThreatSummaryAsync applies Noise-only filtering to Top Sources (Infrastructure/TrustedUser rows surface in their own sub-tables) and reads three new configurable display limits from settings: threats.top_sources_main_limit, threats.top_sources_category_limit, threats.sources_by_category_query_limit. ThreatDashboard adds a Category column with badges (purple Infrastructure, teal TrustedUser); Settings exposes the three limits as numeric inputs. Migration 20260526120000_AddDestGeoAndFilterCategory adds the new columns, indexes, and a backfill that sets GeoEnriched=1 for existing enriched rows then clears it (and nulls source geo) for RFC1918/loopback/link-local source rows so the runtime backfill loop re-attributes them under corrected logic. Tests: 5 new files covering the private-IP guards, category persistence and Demote/Promote lifecycle, CrowdSec NotApplicable behavior, and the GetTopSourcesAsync regression. All 190 Threats and 128 Storage tests pass.
Collaborator
|
I've done an initial review on this one... nice work, to start. I should have time to wrapping up review and a couple small housekeeping fixes next week. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix destination ASN attribution on source-IP groupings; add categorized filters, self-detection, and configurable display limits
Summary
The "Top Threat Sources" table on audit PDFs displayed misleading attribution: internal source IPs were tagged with destination ASNs. RFC1918 internal hosts that ran outbound DNS-over-HTTPS queries to NextDNS were labeled
NextDNS, Inc.; an internal workstation that mostly talked to Google services was labeledGoogle LLC. Neither owns a public ASN; the values were the destinations of their flows, not the sources.This PR fixes the root cause, makes the audit report honor the same noise filters as the threat dashboard, adds an Infrastructure / TrustedUser filter category so suppressed activity is visible in dedicated sub-tables rather than silently dropped, surfaces categorized rows on the dashboard with a Category badge, and exposes the audit display limits as user-editable settings.
Root cause
GeoEnrichmentService.EnrichEventsredirected the MaxMind lookup to the destination IP for RFC1918 sources, then wrote the result back ontoevt.CountryCode/evt.AsnOrg. The event row carried the destination's ASN under fields that callers read as the source's ASN.ThreatRepository.GetTopSourcesAsyncthen didGroupBy(SourceIp).Select(g => new { ..., AsnOrg = g.First().AsnOrg }), surfacing the destination ASN under a source-IP label.A secondary issue: the audit PDF call path (
AuditService.BuildThreatSummaryAsync) never calledSetNoiseFilterson the scoped repository, so any noise filters configured via the dashboard UI were ignored in the audit report.Changes
Bug fixes
ThreatEvent- split source and destination geo enrichment into distinct fields.CountryCode/AsnOrg/ etc. now reflect the source IP only (null for RFC1918, which is the correct answer). NewDestCountryCode/DestAsn/DestAsnOrg/ etc. carry destination enrichment.GeoEnrichmentService.EnrichEvents- rewrites to enrich source IP into source fields and destination IP into destination fields. No more cross-contamination.AuditService.BuildThreatSummaryAsync- now loads enabled noise filters and applies them viaSetNoiseFiltersbefore querying, matching dashboard behavior.BackfillGeoDataAsync- newGeoEnrichedflag drives the predicate so RFC1918 events (which legitimately have null source geo) are not re-processed forever.Feature: categorized filters
ThreatFilterCategory(new enum) -Noise/Infrastructure/TrustedUser.ThreatNoiseFilter- newCategory,Label,IsSystemfields.IsSystementries are locked from delete/disable in the UI.ThreatRepository.GetSourcesByCategoryAsync(new) - returns the source-IP rollups that match enabled filters of the given category, bypassing the noise-filter exclusion so suppressed activity can be re-surfaced for the audit report.AuditService.BuildThreatSummaryAsync- populatesInfrastructureSourcesandTrustedUserSourcesand aSuppressedEventCounttotal.PdfReportGenerator/MarkdownReportGenerator- render "Known Infrastructure Activity" and "Trusted User Activity" sub-tables beneath the main Top Threat Sources table. A footnote on the main table calls out the suppressed counts.ThreatDashboard.razor- the existing noise-filter form gets Category dropdown and optional Label input. The existing filter table shows Category / Label columns and disables delete/disable controls onIsSystementries.Feature: dashboard category badges
ThreatDashboardService.GetDashboardDataAsync- the dashboard's main Top Threat Sources table now applies onlyNoise-category filters (when the master toggle is on). Infrastructure and TrustedUser sources remain in the main table with a Category column badge so they are visible at a glance. The kill chain, ports, and pattern queries keep the full-filter behavior so categorized noise does not dominate charts.SourceIpSummary.MatchedFilterCategory/Labelare now attached during enrichment in the dashboard service for badge rendering.Feature: auto-detect self
ThreatCollectionService.EnsureSelfInfrastructureFilterAsync(new) - on startup, detects the host's primary IPv4 (UDP-socket-route trick, NIC enumeration fallback) and ensures an enabled,IsSystemInfrastructure-category filter exists for it withLabel = "Network Optimizer (self)". If the IP changes (DHCP renewal), the prior system entry is demoted (IsSystem = false,Enabled = false) but kept in the table for audit history; user can delete it via the UI when ready. If the same IP later returns, the existing demoted entry is re-promoted to system rather than creating a duplicate.IThreatRepository.DemoteAndDisableSystemFilterAsync/PromoteToSystemFilterAsync(new) support the demote-on-change lifecycle.AddressFamily.InterNetwork); IPv6-primary hosts will not be auto-detected. Manual filter entries still work for IPv6.Feature: configurable audit display limits
5,10,20:threats.top_sources_main_limit- rows in the main Top Threat Sources tablethreats.top_sources_category_limit- rows in each categorized sub-tablethreats.sources_by_category_query_limit- candidate fetch size from the DB per category before trimming to the display limitAuditService.BuildThreatSummaryAsyncreads them via a newGetIntSettingAsynchelper; redundant.Take(N)calls on the PDF/MD renderers are removed.Settings.razorThreat Intelligence section gains three numeric input fields with form-help text.SaveThreatSettingsclamps to sane ranges (1-100, 1-100, 1-500) before persisting in case of direct DB writes.CrowdSec hardening
CrowdSecLookupOutcome.NotApplicable(new enum value) for RFC1918 / loopback / link-local IPs.CrowdSecEnrichmentService.GetReputationAsyncnow has a centralized private-IP guard at the canonical entry point. ReturnsNotApplicablebefore touching cache or the API. Closes a quota-burn gap on the previously-unguarded publicThreatDashboardService.GetCrowdSecReputationAsync. Existing inline guards at higher call sites are kept as defense-in-depth.Schema
Migration
20260526120000_AddDestGeoAndFilterCategory:ThreatEvents: addsDestCountryCode,DestCity,DestAsn,DestAsnOrg,DestLatitude,DestLongitude,GeoEnriched. One-time SQL backfill setsGeoEnriched = 1for rows with non-nullCountryCode; a follow-upUPDATEnulls source-geo and clearsGeoEnrichedfor RFC1918 / loopback / link-local source rows so the backfill loop re-enriches them with the corrected logic (closes the data-integrity gap on pre-fix rows).ThreatNoiseFilters: addsCategory(int, default 0 = Noise),Label(text, nullable),IsSystem(bool, default false). IndexesIX_ThreatNoiseFilters_Category_EnabledandIX_ThreatNoiseFilters_Category_SourceIp.Behavior on existing data
GeoEnrichedreset, so the backfill re-enriches them with the corrected logic (source ends up null, which is the truthful answer).CountryCode/AsnOrgvalues (migration marks themGeoEnriched = 1).Category = Noise, preserving prior behavior.Known limitations (decided, not deferred)
Tests
318 tests pass (190 Threats + 128 Storage). New coverage:
GeoEnrichmentServiceTests- private/loopback/link-local/invalid IPs returnGeoInfo.Empty; no-DB-loaded case is a true no-op (no fields set,GeoEnrichedstays false so backfill is correctly gated byIsCityAvailableupstream).ThreatNoiseFilterTests- defaultCategoryisNoise, defaultIsSystemis false; existingMatches()contract (exact + CIDR) still works after the new fields are added.ThreatNoiseFilterPersistenceTests- newCategory,Label,IsSystemfields round-trip throughSaveNoiseFilterAsync/GetNoiseFiltersAsync; legacy filters without an explicit Category land asNoise.DemoteAndDisableSystemFilterAsyncstripsIsSystemand disables but keeps the row+label for audit history;PromoteToSystemFilterAsyncrestores both.CrowdSecEnrichmentServiceTests- private IPs returnNotApplicablewithout touching cache or API (verified with a strict mock that throws on any unconfigured method call); public IPs still hit the cache layer first.ThreatRepositoryCategoryTests- the direct regression coverage:GetTopSourcesAsync_PrivateSource_DoesNotInheritDestinationAsnis the regression test for the actual bug. Constructs an event with the corrected enrichment field layout (RFC1918 source, source geo null,DestAsnOrg="NextDNS, Inc.") and asserts the table no longer surfaces the destination ASN under the source.GetSourcesByCategoryAsynccoverage: empty filters, exact IP, CIDR (192.0.2.0/24), disabled-filter exclusion, bypass of noise-filter exclusion (the audit-report path), and category isolation.GetTopSourcesAsync_NoiseOnlyFilterSet_AllowsInfrastructureRowsThrough- the data-layer guarantee the dashboard's Category-badge layout depends on. With only Noise-category filters applied, Infrastructure rows remain visible in the main table.Full E2E MaxMind enrichment (with a real
.mmdb) is not included - the behavior is exercised indirectly via the repository tests that consume the post-enrichment field layout. Migration SQL data scrub is not exercised in tests because the EF Core InMemory provider ignoresmigrationBuilder.Sql(); manual verification was done on a production-data copy before merge.