Commit cf58bd3

Merge pull request #58 from Prescott-Data/feat/observability-and-hardening
Feat/observability and hardening
2 parents 76d64af + ae2d28d commit cf58bd3

22 files changed

Lines changed: 1871 additions & 146 deletions

CHANGELOG.md

Lines changed: 30 additions & 0 deletions
@@ -5,6 +5,36 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [0.2.1] - 2026-05-10
+
+### Changed
+- **Service Layer**: Refactored `connection_part2.go` into `credential.go`, separating credential capture, token refresh, and credential validation by responsibility.
+- **HTTP Client**: `validateCredentials`, `refreshTokens`, and `executeExchange` now use the centrally injected `httpClient` instead of creating inline clients, ensuring the configured transport is respected across all outbound calls.
+- **Audit Interface**: `ConnectionService` now accepts the `audit.Logger` interface instead of a concrete `*audit.Service` pointer, enabling proper mocking in unit tests.
+- **Method Promotion**: `validateCredentials` and `refreshTokens` promoted from standalone functions to methods on `connectionService` to allow struct field access.
+
+### Added
+- **Service Layer Tests**: 7 new unit tests covering the previously untested `SaveCredential`, `Refresh`, and `ExchangeCodeForTokens` methods, including OAuth2 flows validated against `httptest` mock servers.
+- **SOC 2 Integration Tests**: Enterprise-grade compliance test suite (`soc_test.go`, `soc_livedb_test.go`) verifying encryption at rest (SOC-CTRL-01), immutable audit trail (SOC-CTRL-02), API key enforcement (SOC-CTRL-03), IP allowlisting (SOC-CTRL-04), and defense-in-depth middleware (SOC-CTRL-05).
+- **Architecture Enforcement**: `TestSeparationOfConcerns` statically analyzes import paths via `go/parser` to enforce layer boundaries at CI time.
+- **Docker Compose**: Local PostgreSQL and Redis containers for running live integration tests against a real database schema.
+
+---
+
+## [0.2.0] - 2026-05-05
+
+### Added
+- **Security-as-Code CLI**: Declarative provider manifest management via YAML (`nexus apply`, `nexus plan`, `nexus diff`), with field-level diff output and concurrent provider fetching.
+- **Audit Subsystem**: Structured audit event logging to `audit_events` table with caller IP, User-Agent, and JSON event data.
+- **Secret Masking**: CLI masks sensitive fields in plan output to prevent credential exposure in logs.
+
+### Changed
+- **CI/CD**: Removed CI workflow from the open repository; internal Azure deployment pipeline secured behind manual trigger.
+- **Documentation**: All registry examples standardized to `localhost:8090` to support OSS adoption without exposing internal infrastructure.
+- **Providers Endpoint**: Fixed path references (`/v1/providers` → `/providers`, `/v1/audit` → `/audit`) throughout documentation and code.
+
+---
+
 ## [0.1.0] - 2026-02-19

 ### Added

nexus-broker/README.md

Lines changed: 33 additions & 15 deletions
@@ -208,13 +208,38 @@ curl -X POST -H "X-API-Key: $API_KEY" \

 ## Metrics and Logging

-Prometheus at `/metrics`:
-- `oauth_consents_created_total`
-- `oauth_consents_with_openid_total`
-- `oauth_token_exchanges_total{status=success|error}`
-- `oauth_exchange_duration_seconds`
-- `oauth_id_tokens_returned_total`
-- `oauth_token_get_total{provider,has_id_token}`
+Prometheus at `/metrics`. All metrics are registered on startup.
+
+#### Token Flows
+| Metric | Type | Labels | Description |
+|:---|:---|:---|:---|
+| `oauth_consents_created_total` | Counter || Consent specs issued |
+| `oauth_consents_with_openid_total` | Counter || Consents requesting OpenID scope |
+| `oauth_token_exchanges_total` | Counter | `status={success,error}` | Code-for-token exchanges |
+| `oauth_exchange_duration_seconds` | Histogram || Duration of token exchange |
+| `oauth_id_tokens_returned_total` | Counter || Exchanges that returned an `id_token` |
+| `oauth_token_refreshes_total` | Counter | `status={success,error}` | On-demand refresh attempts |
+| `oauth_refresh_duration_seconds` | Histogram || Duration of token refresh |
+| `oauth_credential_captures_total` | Counter | `status={success,error}` | API-key credential captures (SaveCredential) |
+
+#### Token Retrieval
+| Metric | Type | Labels | Description |
+|:---|:---|:---|:---|
+| `oauth_token_get_total` | Counter | `provider`, `has_id_token` | Token retrievals by provider |
+
+#### OIDC Infrastructure
+| Metric | Type | Labels | Description |
+|:---|:---|:---|:---|
+| `oidc_verifications_total` | Counter | `result={success,error}` | ID token verifications |
+| `oidc_verification_duration_seconds` | Histogram || ID token verification latency |
+| `oidc_discovery_total` | Counter | `result={success,error}` | OIDC discovery attempts |
+| `oidc_discovery_duration_seconds` | Histogram || OIDC discovery latency |
+
+#### System Health
+| Metric | Type | Labels | Description |
+|:---|:---|:---|:---|
+| `nexus_connections_total` | Gauge | `status` | Live count of connections by status (polled every 30s) |
+| `nexus_db_operation_duration_seconds` | Histogram | `operation` | Repository-level DB operation latency |

 Access logs are structured; audit events are recorded in `audit_events`.

@@ -231,7 +256,7 @@ Access logs are structured; audit events are recorded in `audit_events`.

 See `docs/SECURITY.md` for detailed guardrails and operations.

-OIDC hardening (id_token verification via JWKS, nonce, discovery) is deferred. See `docs/TECH_DEBT.md`.
+OIDC hardening (id_token verification via JWKS, nonce, discovery) is fully implemented. See `pkg/oidc` for the validator and `pkg/discovery` for provider discovery.

 ---

@@ -271,10 +296,3 @@ go build -o nexus-broker ./cmd/nexus-broker
 - `docs/PROVIDERS.md` – registry and templates for supported providers
 - `docs/TECH_DEBT.md` – OIDC hardening plan and acceptance criteria

-Touching to test pipeline
-
-<!-- trigger build Mon Jan 26 11:07:00 EAT 2026 -->
-<!-- trigger build Tue Jan 27 12:39:45 EAT 2026 env update -->
-
-Triggering broker build
-

nexus-broker/cmd/nexus-broker/main.go

Lines changed: 7 additions & 2 deletions
@@ -7,13 +7,15 @@ import (
 	"time"

 	"github.com/Prescott-Data/nexus-framework/nexus-broker/internal/audit"
+	"github.com/Prescott-Data/nexus-framework/nexus-broker/internal/repository/instrumented"
 	"github.com/Prescott-Data/nexus-framework/nexus-broker/internal/repository/postgres"
 	"github.com/Prescott-Data/nexus-framework/nexus-broker/internal/service"
 	"github.com/Prescott-Data/nexus-framework/nexus-broker/pkg/caching"
 	"github.com/Prescott-Data/nexus-framework/nexus-broker/pkg/config"
 	"github.com/Prescott-Data/nexus-framework/nexus-broker/pkg/handlers"
 	"github.com/Prescott-Data/nexus-framework/nexus-broker/pkg/provider"
 	"github.com/Prescott-Data/nexus-framework/nexus-broker/pkg/server"
+	"github.com/Prescott-Data/nexus-framework/nexus-broker/pkg/telemetry"
 	"github.com/go-chi/chi/v5"
 	"github.com/go-redis/redis/v8"
 	"github.com/jmoiron/sqlx"
@@ -65,8 +67,8 @@ func main() {

 	providersHandler := handlers.NewProvidersHandler(store, auditSvc)

-	connRepo := postgres.NewConnectionRepository(db)
-	tokenRepo := postgres.NewTokenRepository(db)
+	connRepo := instrumented.NewConnectionRepository(postgres.NewConnectionRepository(db))
+	tokenRepo := instrumented.NewTokenRepository(postgres.NewTokenRepository(db))

 	connSvc := service.NewConnectionService(
 		connRepo,
@@ -123,6 +125,9 @@ func main() {
 	defer cleanupCancel()
 	go handlers.StartOrphanTokenCleanup(cleanupCtx, db, 1*time.Hour)

+	// Start connection health gauge (polls every 30s)
+	telemetry.NewConnectionGaugeCollector(connRepo, 30*time.Second)
+
 	log.Printf("Starting OAuth Broker server on port %s", cfg.Port)
 	log.Printf("Version: %s", Version)
 	log.Printf("Base URL: %s", cfg.BaseURL)
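
The body of `telemetry.NewConnectionGaugeCollector` is not among the lines shown in this diff, so the following is only a sketch of how a collector of this kind could poll `CountByStatus` on an interval. The names `statusCounter`, `statusGauge`, `pollOnce`, `run`, `fakeRepo`, and `demo` are hypothetical, and a plain map stands in for a Prometheus gauge vector. The context-based shutdown is likewise an assumption, not something the call in `main.go` shows.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// statusCounter mirrors the CountByStatus method this commit adds to
// ConnectionRepository; it is the only dependency the sketch needs.
type statusCounter interface {
	CountByStatus(ctx context.Context) (map[string]int64, error)
}

// statusGauge is a hypothetical stand-in for a gauge vector keyed by status.
type statusGauge struct {
	mu     sync.Mutex
	values map[string]int64
}

// replace swaps in a fresh snapshot, so a status that no longer has any rows
// (and is therefore absent from a GROUP BY result) stops being reported.
func (g *statusGauge) replace(counts map[string]int64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.values = counts
}

// get returns the last polled count for a status, or 0 if it was not seen.
func (g *statusGauge) get(status string) int64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.values[status]
}

// pollOnce refreshes the gauge; on error the previous snapshot is kept.
func pollOnce(ctx context.Context, src statusCounter, g *statusGauge) error {
	counts, err := src.CountByStatus(ctx)
	if err != nil {
		return err
	}
	g.replace(counts)
	return nil
}

// run polls immediately, then on every tick, until ctx is cancelled.
// Errors are dropped here; a real collector would presumably log them.
func run(ctx context.Context, src statusCounter, g *statusGauge, every time.Duration) {
	_ = pollOnce(ctx, src, g)
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			_ = pollOnce(ctx, src, g)
		}
	}
}

// fakeRepo is an in-memory statusCounter used only for this demonstration.
type fakeRepo map[string]int64

func (f fakeRepo) CountByStatus(context.Context) (map[string]int64, error) {
	out := make(map[string]int64, len(f))
	for k, v := range f {
		out[k] = v
	}
	return out, nil
}

// demo runs the poller against fakeRepo for a short, bounded time.
func demo() *statusGauge {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	g := &statusGauge{}
	run(ctx, fakeRepo{"active": 3, "pending": 1}, g, 10*time.Millisecond)
	return g
}

func main() {
	g := demo()
	fmt.Println(g.get("active"), g.get("pending"), g.get("revoked"))
}
```

Replacing the whole snapshot on each poll is one way to keep statuses that have dropped to zero rows from lingering at a stale value. Whether the real collector resets its gauge in a similar way cannot be determined from this diff.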

nexus-broker/docker-compose.yml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ services:
       - "5432:5432"
     volumes:
       - postgres_data:/var/lib/postgresql/data
-      - ./migrations/00_create_tables.sql:/docker-entrypoint-initdb.d/00_create_tables.sql
+      - ./migrations:/docker-entrypoint-initdb.d
     healthcheck:
       test: ["CMD-SHELL", "pg_isready -U oauth_user -d oauth_broker"]
       interval: 5s

nexus-broker/internal/architecture_test.go

Lines changed: 16 additions & 0 deletions
@@ -56,6 +56,22 @@ func TestSeparationOfConcerns(t *testing.T) {
 			},
 			Description: "HTTP Handlers must not bypass the Service layer to talk directly to Repositories.",
 		},
+		{
+			Package: "pkg/provider",
+			ShouldNot: []string{
+				modulePrefix + "/pkg/handlers",
+				modulePrefix + "/internal/service",
+			},
+			Description: "The Provider store infrastructure must not depend on HTTP handlers or business logic.",
+		},
+		{
+			Package: "pkg/telemetry",
+			ShouldNot: []string{
+				modulePrefix + "/pkg/handlers",
+				modulePrefix + "/internal/service",
+			},
+			Description: "Telemetry collectors must be independent of HTTP handlers and business logic.",
+		},
 	}

 	basePath := ".." // We are inside internal, so .. is the broker root
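
The checking loop of `TestSeparationOfConcerns` lies outside this hunk. As a rough illustration of how a rule's `ShouldNot` list could be enforced with `go/parser`, the sketch below parses only the import block of a source string and reports forbidden paths. The function `forbiddenImports`, the `sample` source, and the `example.com/app` module prefix are all hypothetical.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// sample is a made-up file that breaks a "telemetry must not import service" rule.
const sample = `package telemetry

import (
	"context"

	"example.com/app/internal/repository"
	"example.com/app/internal/service"
)
`

// forbiddenImports parses only the package clause and imports of src and
// returns every import path that equals, or lives under, a forbidden package.
func forbiddenImports(src string, forbidden []string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var hits []string
	for _, imp := range file.Imports {
		// Import paths are stored as quoted string literals.
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		for _, bad := range forbidden {
			if path == bad || strings.HasPrefix(path, bad+"/") {
				hits = append(hits, path)
				break
			}
		}
	}
	return hits, nil
}

func main() {
	hits, err := forbiddenImports(sample, []string{
		"example.com/app/pkg/handlers",
		"example.com/app/internal/service",
	})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(hits)
}
```

Using `parser.ImportsOnly` stops parsing after the import declarations, which keeps a check like this cheap even when it walks every file in a package directory.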

nexus-broker/internal/domain/models.go

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ type Connection struct {
 // ConnectionWithProvider joins connection and basic provider info
 type ConnectionWithProvider struct {
 	Connection
+	ProviderName string
 	AuthType string
 	AuthHeader string
 	APIBaseURL string
Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+package instrumented
+
+import (
+	"context"
+	"time"
+
+	"github.com/google/uuid"
+	"github.com/prometheus/client_golang/prometheus"
+
+	"github.com/Prescott-Data/nexus-framework/nexus-broker/internal/domain"
+	"github.com/Prescott-Data/nexus-framework/nexus-broker/internal/repository"
+)
+
+var (
+	dbOpDuration = prometheus.NewHistogramVec(prometheus.HistogramOpts{
+		Name: "nexus_db_operation_duration_seconds",
+		Help: "Duration of database operations by repository and method",
+		Buckets: []float64{0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0},
+	}, []string{"repo", "method"})
+)
+
+func init() {
+	prometheus.MustRegister(dbOpDuration)
+}
+
+func observe(repo, method string, start time.Time) {
+	dbOpDuration.WithLabelValues(repo, method).Observe(time.Since(start).Seconds())
+}
+
+// --- ConnectionRepository decorator ---
+
+// ConnectionRepository wraps repository.ConnectionRepository with latency instrumentation.
+type ConnectionRepository struct {
+	inner repository.ConnectionRepository
+}
+
+// NewConnectionRepository wraps a ConnectionRepository with Prometheus latency histograms.
+func NewConnectionRepository(inner repository.ConnectionRepository) repository.ConnectionRepository {
+	return &ConnectionRepository{inner: inner}
+}
+
+func (r *ConnectionRepository) Create(ctx context.Context, conn *domain.Connection) error {
+	defer observe("connection", "Create", time.Now())
+	return r.inner.Create(ctx, conn)
+}
+
+func (r *ConnectionRepository) GetPending(ctx context.Context, id uuid.UUID) (*domain.Connection, error) {
+	defer observe("connection", "GetPending", time.Now())
+	return r.inner.GetPending(ctx, id)
+}
+
+func (r *ConnectionRepository) GetWithProvider(ctx context.Context, id uuid.UUID) (*domain.ConnectionWithProvider, error) {
+	defer observe("connection", "GetWithProvider", time.Now())
+	return r.inner.GetWithProvider(ctx, id)
+}
+
+func (r *ConnectionRepository) GetReturnURL(ctx context.Context, id uuid.UUID) (string, error) {
+	defer observe("connection", "GetReturnURL", time.Now())
+	return r.inner.GetReturnURL(ctx, id)
+}
+
+func (r *ConnectionRepository) UpdateStatus(ctx context.Context, id uuid.UUID, status string) error {
+	defer observe("connection", "UpdateStatus", time.Now())
+	return r.inner.UpdateStatus(ctx, id, status)
+}
+
+func (r *ConnectionRepository) CountByStatus(ctx context.Context) (map[string]int64, error) {
+	defer observe("connection", "CountByStatus", time.Now())
+	return r.inner.CountByStatus(ctx)
+}
+
+// --- TokenRepository decorator ---
+
+// TokenRepository wraps repository.TokenRepository with latency instrumentation.
+type TokenRepository struct {
+	inner repository.TokenRepository
+}
+
+// NewTokenRepository wraps a TokenRepository with Prometheus latency histograms.
+func NewTokenRepository(inner repository.TokenRepository) repository.TokenRepository {
+	return &TokenRepository{inner: inner}
+}
+
+func (r *TokenRepository) Upsert(ctx context.Context, token *domain.Token) error {
+	defer observe("token", "Upsert", time.Now())
+	return r.inner.Upsert(ctx, token)
+}
+
+func (r *TokenRepository) Get(ctx context.Context, connectionID uuid.UUID) (*domain.Token, error) {
+	defer observe("token", "Get", time.Now())
+	return r.inner.Get(ctx, connectionID)
+}
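
The decorators above rely on `defer observe(..., time.Now())` measuring the whole call. That works because Go evaluates a deferred call's arguments when the `defer` statement executes, while the call itself runs when the surrounding function returns. The standard-library-only sketch below shows the same pattern; `Store`, `mapStore`, `recorder`, and `instrumentedStore` are hypothetical names, and the recorder stands in for the Prometheus histogram.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Store is a hypothetical stand-in for a repository interface.
type Store interface {
	Get(key string) (string, error)
}

// mapStore is a trivial in-memory Store.
type mapStore map[string]string

func (m mapStore) Get(key string) (string, error) {
	v, ok := m[key]
	if !ok {
		return "", fmt.Errorf("key %q not found", key)
	}
	return v, nil
}

// recorder collects observed durations per "repo/method" key.
type recorder struct {
	mu   sync.Mutex
	seen map[string][]time.Duration
}

func newRecorder() *recorder {
	return &recorder{seen: make(map[string][]time.Duration)}
}

// observe records the time elapsed since start under the repo/method key.
func (r *recorder) observe(repo, method string, start time.Time) {
	r.mu.Lock()
	defer r.mu.Unlock()
	key := repo + "/" + method
	r.seen[key] = append(r.seen[key], time.Since(start))
}

// count reports how many observations were recorded for repo/method.
func (r *recorder) count(repo, method string) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.seen[repo+"/"+method])
}

// instrumentedStore decorates a Store and times every call.
type instrumentedStore struct {
	inner Store
	rec   *recorder
}

func (s *instrumentedStore) Get(key string) (string, error) {
	// time.Now() is evaluated here, when the defer statement executes;
	// observe runs when Get returns, so it covers the full inner call.
	defer s.rec.observe("store", "Get", time.Now())
	return s.inner.Get(key)
}

func main() {
	rec := newRecorder()
	var s Store = &instrumentedStore{inner: mapStore{"a": "1"}, rec: rec}
	v, err := s.Get("a")
	fmt.Println(v, err, rec.count("store", "Get"))
}
```

Because the constructor returns the interface type, callers such as `main.go` can swap the plain repository for the wrapped one without any other change. Failed calls are observed as well, so a latency histogram built this way includes error paths.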

nexus-broker/internal/repository/interfaces.go

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ type ConnectionRepository interface {
 	GetWithProvider(ctx context.Context, id uuid.UUID) (*domain.ConnectionWithProvider, error)
 	GetReturnURL(ctx context.Context, id uuid.UUID) (string, error)
 	UpdateStatus(ctx context.Context, id uuid.UUID, status string) error
+	CountByStatus(ctx context.Context) (map[string]int64, error)
 }

 // TokenRepository handles database operations for tokens

nexus-broker/internal/repository/postgres/connection.go

Lines changed: 21 additions & 2 deletions
@@ -44,12 +44,12 @@ func (r *connectionRepository) GetWithProvider(ctx context.Context, id uuid.UUID
 	var conn domain.ConnectionWithProvider
 	err := r.db.QueryRowContext(ctx, `
 		SELECT c.id, c.provider_id, c.status, c.scopes, c.return_url,
-		p.auth_type, COALESCE(p.auth_header, ''), COALESCE(p.api_base_url, ''), COALESCE(p.user_info_endpoint, ''), p.params
+		p.name, p.auth_type, COALESCE(p.auth_header, ''), COALESCE(p.api_base_url, ''), COALESCE(p.user_info_endpoint, ''), p.params
 		FROM connections c
 		JOIN provider_profiles p ON p.id = c.provider_id
 		WHERE c.id = $1`, id).
 		Scan(&conn.ID, &conn.ProviderID, &conn.Status, pq.Array(&conn.Scopes), &conn.ReturnURL,
-			&conn.AuthType, &conn.AuthHeader, &conn.APIBaseURL, &conn.UserInfoEndpoint, &conn.ProviderParams)
+			&conn.ProviderName, &conn.AuthType, &conn.AuthHeader, &conn.APIBaseURL, &conn.UserInfoEndpoint, &conn.ProviderParams)
 	if err != nil {
 		return nil, err
 	}
@@ -66,3 +66,22 @@ func (r *connectionRepository) UpdateStatus(ctx context.Context, id uuid.UUID, s
 	_, err := r.db.ExecContext(ctx, "UPDATE connections SET status = $1, updated_at = NOW() WHERE id = $2", status, id)
 	return err
 }
+
+func (r *connectionRepository) CountByStatus(ctx context.Context) (map[string]int64, error) {
+	rows, err := r.db.QueryContext(ctx, "SELECT status, COUNT(*) FROM connections GROUP BY status")
+	if err != nil {
+		return nil, err
+	}
+	defer rows.Close()
+
+	counts := make(map[string]int64)
+	for rows.Next() {
+		var status string
+		var count int64
+		if err := rows.Scan(&status, &count); err != nil {
+			return nil, err
+		}
+		counts[status] = count
+	}
+	return counts, rows.Err()
+}
