This document covers logging, error tracking, and monitoring in the cloud portal.
The observability stack consists of:
| Component | Purpose | Local | Production |
|---|---|---|---|
| Sentry | Error tracking | Optional | Required |
| OpenTelemetry | Distributed tracing | Jaeger | Grafana Tempo |
| Prometheus | Metrics collection | Local | Cloud |
| Grafana | Dashboards | Local | Cloud |
Use structured logging in development:

```typescript
// Simple logging
console.log('User logged in', { userId, orgId });

// Error logging
console.error('Failed to fetch zones', { error, params });
```

- `console.log` - General information
- `console.warn` - Warnings, non-critical issues
- `console.error` - Errors, exceptions
Server logs are captured by the Hono server and forwarded to OTEL:

```typescript
// In loaders/actions
export async function loader({ request }: LoaderFunctionArgs) {
  console.log('Loading zones', { url: request.url });
  // ...
}
```

Set up in `.env`:

```shell
SENTRY_DSN=https://xxx@sentry.io/xxx
SENTRY_ORG=datum
SENTRY_PROJECT=cloud-portal
```

Sentry automatically captures:
- Unhandled exceptions
- Promise rejections
- React error boundaries
- Network errors
- API errors (via axios interceptors)
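The automatic capture works roughly like this (a sketch; `toApiErrorReport`, `apiClient`, and the minimal error shape are illustrations, not the portal's actual interceptor code):

```typescript
// Minimal shape of the fields the interceptor reads off an AxiosError
interface ApiErrorLike {
  message: string;
  config?: { method?: string; url?: string };
  response?: { status?: number };
}

// Build the payload handed to captureApiError (from '@/modules/sentry')
function toApiErrorReport(error: ApiErrorLike) {
  return {
    error,
    method: (error.config?.method ?? 'get').toUpperCase(),
    url: error.config?.url ?? 'unknown',
    status: error.response?.status ?? 0,
    message: error.message,
  };
}

// The response interceptor reports every failed call before rejecting:
// apiClient.interceptors.response.use(
//   (res) => res,
//   (error) => {
//     captureApiError(toApiErrorReport(error));
//     return Promise.reject(error);
//   },
// );
```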
The `@/modules/sentry` module provides centralized Sentry integration:

```typescript
import {
  // Context - hierarchical enrichment
  setSentryUser,
  setSentryOrgContext,
  setSentryProjectContext,
  setSentryResourceContext,
  // Breadcrumbs - user journey tracking
  trackFormSubmit,
  trackFormSuccess,
  trackFormError,
  // Capture - error reporting
  captureError,
  captureApiError,
  captureMessage,
} from '@/modules/sentry';
```

Context is set automatically at different levels:
```typescript
// User context (set on login)
setSentryUser({ id: 'user-123', email: 'user@example.com' });

// Organization context (set in org layout)
setSentryOrgContext({ name: 'acme-corp', uid: 'org-abc' });

// Project context (set in project layout)
setSentryProjectContext({ name: 'my-project', uid: 'proj-xyz' });

// Resource context (set automatically from API responses)
setSentryResourceContext({
  kind: 'DNSZone',
  apiVersion: 'dns.networking.miloapis.com/v1alpha1',
  metadata: { name: 'example.com', namespace: 'default' },
});
```

Filter issues in the Sentry dashboard using these tags:
| Tag | Description | Example |
|---|---|---|
| `user.id` | User identifier | `user-123` |
| `org.id` | Organization name | `acme-corp` |
| `project.id` | Project name | `my-project` |
| `resource.kind` | K8s resource kind | `DNSZone` |
| `resource.apiGroup` | API group | `dns.networking.miloapis.com` |
| `resource.type` | Resource type (from URL) | `dnszones` |
| `resource.name` | Resource name | `example.com` |
API errors are automatically captured with resource context:

```typescript
// Automatic capture via axios interceptors
// Errors include: fingerprint, resource context, method, URL, status

// Manual capture
captureApiError({
  error: axiosError,
  method: 'GET',
  url: '/apis/dns.networking.miloapis.com/v1alpha1/dnszones/my-zone',
  status: 404,
  message: 'Not Found',
});
```

Error Grouping: Errors are grouped by resource type + API group + status code:

- `API 404: GET dnszones` (instead of a generic "AxiosError")
- `API 401: POST projects`
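A plausible sketch of how such a grouping fingerprint could be derived from the request (the actual logic inside `captureApiError` may differ):

```typescript
// Group API errors by method + API group + resource type + status instead of
// letting Sentry group by the generic "AxiosError" message
function apiErrorFingerprint(method: string, url: string, status: number): string[] {
  // URL shape: /apis/<apiGroup>/<version>/<resourceType>/<name>
  const segments = url.split('/').filter(Boolean);
  const apiGroup = segments[1] ?? 'unknown';
  const resourceType = segments[3] ?? 'unknown';
  return ['api-error', method, apiGroup, resourceType, String(status)];
}
```

For `GET /apis/dns.networking.miloapis.com/v1alpha1/dnszones/my-zone` with status 404, this yields `['api-error', 'GET', 'dns.networking.miloapis.com', 'dnszones', '404']`, so all such failures collapse into one `API 404: GET dnszones` issue.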
Forms automatically track user interactions as breadcrumbs:

```tsx
// Add name prop to forms for better tracking
<Form.Root name="dns-zone-create" schema={schema} onSubmit={handleSubmit}>
  ...
</Form.Root>
```

Tracked events:
- Form submit attempts
- Validation errors (field names only, not values)
- Submission success/failure
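A submit handler could wire the breadcrumb helpers up roughly like this (a sketch: the stand-in functions mimic the real `@/modules/sentry` exports so the snippet is self-contained, and the field-name extraction is an assumption):

```typescript
// Stand-ins for trackFormSubmit / trackFormError / trackFormSuccess
// from '@/modules/sentry'; here they just record what would become breadcrumbs
const breadcrumbs: string[] = [];
const trackFormSubmit = (form: string) => breadcrumbs.push(`submit:${form}`);
const trackFormError = (form: string, fields: string[]) =>
  breadcrumbs.push(`error:${form}:${fields.join(',')}`);
const trackFormSuccess = (form: string) => breadcrumbs.push(`success:${form}`);

// Only field NAMES go into breadcrumbs, never values
function validationErrorFields(errors: Record<string, string>): string[] {
  return Object.keys(errors).sort();
}

// Simplified (synchronous) submit flow; real handlers would be async
function handleSubmit(formName: string, errors: Record<string, string>): boolean {
  trackFormSubmit(formName);
  if (Object.keys(errors).length > 0) {
    trackFormError(formName, validationErrorFields(errors));
    return false;
  }
  trackFormSuccess(formName);
  return true;
}
```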
Sentry tracks:
- Page load times
- Route transitions
- API call durations
- React component renders
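How much of this is sampled is controlled at `Sentry.init` time; a minimal sketch of a sampler (the rate and the `/healthz` route name are examples, not the portal's real settings):

```typescript
// Shape of the context Sentry passes to a tracesSampler
type SamplingContext = { name: string };

// Drop health-check noise entirely, sample everything else at 10%
function tracesSampler(ctx: SamplingContext): number {
  if (ctx.name.includes('/healthz')) return 0;
  return 0.1;
}

// Sentry.init({ dsn: process.env.SENTRY_DSN, tracesSampler, ... });
```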
```
Browser → Hono Server → Control Plane APIs
   │           │                │
   └───────────┴────────────────┘
               │
         Trace Context
               │
               ▼
  Jaeger (local) / Tempo (prod)
```
```shell
# .env
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
OTEL_SERVICE_NAME=cloud-portal
OTEL_ENABLED=true
```

The following are automatically traced:
- HTTP requests (incoming and outgoing)
- Route handlers (loaders, actions)
- Database queries
- External API calls
For custom tracing:

```typescript
import { trace } from '@opentelemetry/api';

const tracer = trace.getTracer('cloud-portal');

async function complexOperation() {
  return tracer.startActiveSpan('complex-operation', async (span) => {
    try {
      span.setAttribute('custom.attribute', 'value');

      // Nested span
      const result = await tracer.startActiveSpan('sub-operation', async (subSpan) => {
        try {
          return await doSomething();
        } finally {
          subSpan.end();
        }
      });

      return result;
    } finally {
      span.end();
    }
  });
}
```

Local (Jaeger):

- Start the observability stack: `bun run dev:otel`
- Open http://localhost:16686
- Select "cloud-portal" service
- Search for traces
Finding a Trace:
- By trace ID from logs
- By operation name
- By tags (user ID, route, etc.)
| Metric | Type | Description |
|---|---|---|
| `http_requests_total` | Counter | Total HTTP requests |
| `http_request_duration_seconds` | Histogram | Request latency |
| `http_requests_in_flight` | Gauge | Concurrent requests |
| `nodejs_heap_size_bytes` | Gauge | Memory usage |
Define custom metrics with `prom-client`:

```typescript
import { Counter, Histogram } from 'prom-client';

// Counter
const zoneCreations = new Counter({
  name: 'dns_zone_creations_total',
  help: 'Total DNS zones created',
  labelNames: ['org_id'],
});
zoneCreations.inc({ org_id: orgId });

// Histogram
const queryDuration = new Histogram({
  name: 'dns_query_duration_seconds',
  help: 'DNS query duration',
  buckets: [0.1, 0.5, 1, 2, 5],
});
const timer = queryDuration.startTimer();
await performQuery();
timer();
```

Metrics are exposed at `/metrics`:

```shell
curl http://localhost:3000/metrics
```

```shell
# Start all observability services
bun run dev:otel

# Or with docker-compose
docker-compose -f docker-compose.otel.yml up -d
```

| Service | Port | URL |
|---|---|---|
| Jaeger UI | 16686 | http://localhost:16686 |
| Prometheus | 9090 | http://localhost:9090 |
| Grafana | 3001 | http://localhost:3001 |
| OTEL Collector | 4318 | (HTTP receiver) |
| OTEL Collector | 4317 | (gRPC receiver) |
Pre-configured dashboards:
- Application Overview - Request rate, error rate, latency
- Node.js Runtime - Memory, CPU, event loop
- API Performance - Per-endpoint metrics
Default credentials: admin/admin
Stop the stack when done:

```shell
docker-compose -f docker-compose.otel.yml down
```

Sentry production setup:

- Create project in Sentry
- Configure DSN in deployment
- Set up release tracking
- Configure alerts

OTEL production setup:

- Configure OTEL exporter endpoint
- Set up Tempo for traces
- Configure Prometheus remote write
- Import dashboards
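A production `.env` would mirror the local OTEL config with the Tempo-facing endpoint (the hostname below is an example, not real infrastructure):

```shell
# .env (production) — endpoint is illustrative
OTEL_EXPORTER_OTLP_ENDPOINT=https://tempo.example.com:4318
OTEL_SERVICE_NAME=cloud-portal
OTEL_ENABLED=true
```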
Configure alerts for:
- Error rate > threshold
- P99 latency > threshold
- Memory usage > threshold
- Failed health checks
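Using the standard metrics listed earlier, the first two alerts could be expressed as Prometheus rules like this (a sketch; the thresholds and the `status` label are assumptions, not this app's actual configuration):

```yaml
# Illustrative Prometheus alert rules
groups:
  - name: cloud-portal
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
      - alert: HighP99Latency
        expr: |
          histogram_quantile(0.99,
            sum(rate(http_request_duration_seconds_bucket[5m])) by (le)) > 1
        for: 10m
        labels:
          severity: warning
```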
Tracing a Request:

- Get trace ID from logs or network tab
- Search in Jaeger/Tempo
- Examine span timeline
- Check span attributes and logs

From Error to Trace:

- Find error in Sentry
- Get trace ID from error context
- View full trace
- Identify root cause

Investigating Performance:

- Open Grafana dashboard
- Identify slow endpoints
- View traces for slow requests
- Check span breakdown
Use tags to filter issues in the Sentry dashboard:

```shell
# Find all errors for a specific organization
org.id:acme-corp

# Find errors in a specific project
project.id:my-project

# Find all DNS Zone errors
resource.type:dnszones

# Find HTTP Proxy errors with 404 status
resource.type:httpproxies status:404

# Find all errors for a resource API group
resource.apiGroup:dns.networking.miloapis.com

# Combine filters for specific customer issues
org.id:acme-corp project.id:production resource.kind:HTTPProxy
```
- Get customer org ID from support ticket
- Filter in Sentry: `org.id:<customer-org>`
- Check breadcrumbs for user journey (form submissions, API calls)
- View resource context to see what resource they were working on
- Correlate with trace ID for full request flow
Do:

- Add context to errors (user ID, org ID, resource ID)
- Use structured logging
- Add custom spans for complex operations
- Set meaningful span names
- Use `captureApiError()` for API errors (automatic fingerprinting)
- Add `name` prop to forms for better tracking
- Filter errors by resource tags in Sentry dashboard

Don't:

- Log sensitive data (tokens, passwords)
- Create too many custom metrics
- Use high-cardinality labels (e.g. user IDs as metric label values)
- Skip error context
- Use `Sentry.captureException()` directly for API errors (use `captureApiError()`)
- Track form field values (only track field names)
- Troubleshooting - Common issues
- Deployment - Production setup
- Local Development - Dev setup