- Learners integrating external monitoring or telemetry APIs into Python data workflows.
- Teams needing reliable API-based reporting and downstream dashboards.
- Read-only ingestion jobs for external monitoring APIs.
- Shared database cache tables for dashboard-friendly access.
- A documented field mapping worksheet and data contract.
- SQL ETL baseline from 06_SQL.md.
- API credentials (key or token) for your monitoring platform.
- Database write access to cache schema.
Before any code:
- confirm API scopes,
- enforce read-only endpoints,
- define timeout and retry standards.
No write operations are allowed until validation checklist passes.
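The timeout and retry standards above can be sketched as a small retry helper. The specific values (10 s timeout, 3 attempts, 1 s exponential backoff) are assumptions, not values from this document; adjust them to your own standard:

```python
import time

TIMEOUT_S = 10        # per-request timeout (assumed standard)
MAX_ATTEMPTS = 3      # total tries before giving up (assumed standard)
BACKOFF_BASE_S = 1.0  # backoff schedule: 1 s, 2 s, 4 s, ...

def call_with_retries(fetch, *, attempts=MAX_ATTEMPTS, sleep=time.sleep):
    """Run a read-only fetch callable, retrying transient failures
    with exponential backoff. Raises the last error if all tries fail."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except (TimeoutError, ConnectionError):
            if attempt == attempts:
                raise
            sleep(BACKOFF_BASE_S * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the backoff testable without real waits.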
Use any monitoring or telemetry API you have access to. Good free options for learning:
- OpenWeatherMap API (weather monitoring)
- GitHub API (repository/event monitoring)
- UptimeRobot API (uptime monitoring)
Initial pulls:
- active alerts or events,
- resource/node health status,
- key metrics or utilization data.
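A minimal sketch of splitting one API response into the three initial pulls. The payload shape here (`alerts`, `nodes`, `metrics` keys and their fields) is hypothetical; real field names differ per platform and belong in your mapping worksheet:

```python
# Hypothetical response shape for illustration only.
SAMPLE_RESPONSE = {
    "alerts": [{"id": "a-1", "severity": "critical", "resource": "web-01"}],
    "nodes": [{"name": "web-01", "status": "down"}],
    "metrics": [{"name": "cpu_pct", "value": 93.5, "resource": "web-01"}],
}

def extract_pulls(resp: dict) -> dict:
    """Split one API response into the three initial pull sets."""
    return {
        "active_alerts": [a["id"] for a in resp.get("alerts", [])],
        "node_health": {n["name"]: n["status"] for n in resp.get("nodes", [])},
        "key_metrics": {m["name"]: m["value"] for m in resp.get("metrics", [])},
    }
```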
Add a second data source to practice multi-source integration:
- a different API endpoint from the same platform,
- or a complementary monitoring service.
Initial pulls:
- monitored instances or resources,
- health and performance metrics,
- alert/event summary.
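When two sources feed the same pipeline, tag every record with its origin so rows stay distinguishable downstream. A sketch, with hypothetical source names:

```python
def tag_source(rows: list[dict], source_system: str) -> list[dict]:
    """Stamp each record with its originating system."""
    return [{**row, "source_system": source_system} for row in rows]

# Hypothetical source-system labels; use your own identifiers.
combined = (
    tag_source([{"id": "a-1"}], "monitoring_api")
    + tag_source([{"id": "i-9"}], "performance_api")
)
```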
Create cache tables with source metadata:
- cache_monitoring_alerts
- cache_monitoring_nodes
- cache_perf_instances
- cache_perf_metrics
Required columns:
- source_system
- collected_at_utc
- entity_key
- payload_hash
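The required columns can be produced by one wrapper function. Hashing a canonical JSON serialization means the same payload always yields the same `payload_hash`, which lets downstream jobs skip unchanged records; the function name is illustrative:

```python
import hashlib
import json
from datetime import datetime, timezone

def to_cache_row(source_system: str, entity_key: str, payload: dict) -> dict:
    """Wrap a raw payload with the required cache-table metadata columns."""
    # sort_keys=True gives a canonical serialization, so identical
    # payloads hash identically regardless of key order.
    canonical = json.dumps(payload, sort_keys=True).encode()
    return {
        "source_system": source_system,
        "collected_at_utc": datetime.now(timezone.utc).isoformat(),
        "entity_key": entity_key,
        "payload_hash": hashlib.sha256(canonical).hexdigest(),
        "payload": payload,
    }
```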
Maintain this worksheet in your project docs:
| Entity | Source system | Collection cadence | Destination table | Owner |
|---|---|---|---|---|
| Active alerts | Monitoring API | Every 15 min | cache_monitoring_alerts | Monitoring team |
| Node status | Monitoring API | Every 15 min | cache_monitoring_nodes | Monitoring team |
| Instance health | Performance API | Every 30 min | cache_perf_instances | DBA team |
| Performance metrics | Performance API | Hourly | cache_perf_metrics | DBA team |
Outputs:
- output/monitoring_daily.xlsx
- output/monitoring_daily.html
- logs/monitoring_run_YYYYMMDD.log
Include:
- top critical alerts,
- down resource summary,
- instance health snapshot,
- stale data warnings.
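A stdlib-only sketch of the HTML report plus run log (the .xlsx output would need an extra library such as openpyxl, so it is omitted here). Section titles and file locations follow the output list above; everything else is an assumption:

```python
from datetime import date, datetime, timezone
from pathlib import Path

def write_daily_report(sections: dict, out_dir="output", log_dir="logs"):
    """Render report sections as a simple HTML page and append a run log."""
    Path(out_dir).mkdir(exist_ok=True)
    Path(log_dir).mkdir(exist_ok=True)
    html = ["<h1>Monitoring daily report</h1>"]
    for title, lines in sections.items():
        html.append(f"<h2>{title}</h2><ul>")
        html.extend(f"<li>{line}</li>" for line in lines)
        html.append("</ul>")
    out = Path(out_dir) / "monitoring_daily.html"
    out.write_text("\n".join(html))
    stamp = date.today().strftime("%Y%m%d")
    log = Path(log_dir) / f"monitoring_run_{stamp}.log"
    with log.open("a") as f:
        f.write(f"{datetime.now(timezone.utc).isoformat()} wrote {out}\n")
    return out
```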
- Stable read-only ingestion from monitoring APIs.
- Reliable cache tables for dashboards.
- Traceable field mappings and data ownership.
- Force token/auth failure and verify sanitized error logging.
- Simulate API timeout and validate retry/backoff.
- Remove required field from payload transform and verify reject behavior.
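The reject-behavior check can be exercised against a small validator like this sketch, which uses the required cache columns defined earlier:

```python
REQUIRED_FIELDS = {"source_system", "collected_at_utc", "entity_key", "payload_hash"}

def validate_row(row: dict) -> dict:
    """Reject any transformed row missing a required cache column."""
    missing = REQUIRED_FIELDS - row.keys()
    if missing:
        raise ValueError(f"rejected row, missing fields: {sorted(missing)}")
    return row
```

Removing a field from the transform should then fail loudly instead of writing a partial row.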
- auth errors:
- validate credential scope,
- test endpoint manually,
- confirm clock/time sync for token validity.
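Auth failures tend to echo the credential back in error messages, so sanitize before logging. A minimal sketch of the masking step; the function name is illustrative:

```python
def sanitize(message: str, secrets: list) -> str:
    """Mask credential material before it reaches error logs."""
    for secret in secrets:
        if secret:  # skip empty strings, which would corrupt the message
            message = message.replace(secret, "***")
    return message
```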
- schema drift:
- compare payload keys against mapping worksheet,
- route unknown fields to audit logs.
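Both schema-drift steps fit in one function: compare payload keys against the known set from the mapping worksheet, and route anything unexpected to an audit bucket. The known-key values below are hypothetical placeholders for your worksheet contents:

```python
# Known keys per entity, sourced from the mapping worksheet (hypothetical values).
KNOWN_KEYS = {"active_alerts": {"id", "severity", "resource"}}

def split_known_unknown(entity: str, payload: dict):
    """Separate mapped fields from drift candidates destined for audit logs."""
    known = KNOWN_KEYS.get(entity, set())
    mapped = {k: v for k, v in payload.items() if k in known}
    audit = {k: v for k, v in payload.items() if k not in known}
    return mapped, audit
```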
- stale dashboards:
- verify cache job schedule,
- run collected_at_utc freshness checks.
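A freshness check only needs the collected_at_utc column and a maximum age. The 30-minute default below is an assumption; match it to the cadence in your mapping worksheet:

```python
from datetime import datetime, timedelta, timezone

def is_stale(collected_at_utc: str, max_age_minutes: int = 30) -> bool:
    """True if a cache row is older than its allowed collection cadence.
    Expects a timezone-aware ISO-8601 timestamp, as written at ingest."""
    collected = datetime.fromisoformat(collected_at_utc)
    return datetime.now(timezone.utc) - collected > timedelta(minutes=max_age_minutes)
```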
Advance when you can:
- describe your API data contracts,
- prove read-only ingestion reliability,
- explain fallback when source APIs are unavailable.
- Play: vary polling intervals and compare freshness/cost tradeoffs.
- Build: implement both ingestion jobs and cache writes.
- Dissect: annotate one end-to-end payload transform.
- Teach-back: present source-to-cache architecture to a peer.
- Use this schema pack for cache and downstream marts: