Traditional FinOps is reactive: it relies on dashboards showing last month's cloud waste. This project introduces Predictive Runtime FinOps, an intelligent Layer-7 API gateway that shifts multi-cloud traffic in real time based on fluctuating spot-instance pricing and network telemetry.
- Eliminates Bill Shock: Throttles runaway API usage and breaks recursive LLM loops before a request ever reaches an upstream provider.
- Automated Arbitrage: Shifts traffic between cloud regions (AWS/Azure) to capture the lowest available compute pricing, with no manual DevOps intervention.
- No Latency Penalty: Policy evaluation runs asynchronously, off the request path, so financial governance does not degrade end-user API response times.
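The throttling idea behind the bill-shock guard can be sketched as a plain token bucket. This is a generic illustration only: the project's actual limiter, capacity, and refill rate are not specified in the source, so the numbers below are assumptions.

```python
import time


class TokenBucket:
    """Minimal token-bucket limiter: refuse requests once the budget is spent."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)      # start with a full budget
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical budget: 5-request burst, refilling one token every 2 seconds.
bucket = TokenBucket(capacity=5, refill_per_sec=0.5)
results = [bucket.allow() for _ in range(7)]  # a burst of 7 back-to-back calls
```

A recursive LLM loop shows up as exactly this kind of back-to-back burst: the first few calls pass, then the gateway starts rejecting before any network request executes.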
- The Market Engine: A mock pricing API simulating real-time compute cost fluctuations across AWS and Azure spot instances.
- The Telemetry Mesh: Simulated network latency metrics representing regional health and availability.
- The AI Gateway: A FastAPI reverse proxy integrated with an air-gapped, locally hosted Llama-3 model. The LLM acts as an autonomous "Cloud Economist," weighing cost-per-request against latency to make millisecond-scale routing decisions.
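The cost-versus-latency trade-off the "Cloud Economist" evaluates can be illustrated with a simple weighted score over normalised axes. The region names, prices, latencies, weight, and field names below are hypothetical, not taken from the project:

```python
# Hypothetical telemetry snapshot: price per 1k requests and observed latency.
REGIONS = {
    "aws-us-east-1": {"price_per_1k": 0.012, "latency_ms": 38.0},
    "azure-eastus":  {"price_per_1k": 0.009, "latency_ms": 55.0},
}


def pick_region(regions: dict, cost_weight: float = 0.6) -> str:
    """Return the region with the lowest weighted cost/latency score.

    Both axes are normalised to [0, 1] so cost_weight is comparable across them.
    """
    max_price = max(r["price_per_1k"] for r in regions.values())
    max_lat = max(r["latency_ms"] for r in regions.values())

    def score(r: dict) -> float:
        return (cost_weight * r["price_per_1k"] / max_price
                + (1 - cost_weight) * r["latency_ms"] / max_lat)

    return min(regions, key=lambda name: score(regions[name]))


best = pick_region(REGIONS)
```

Turning the weight up or down changes the winner: a cost-heavy weight favours the cheaper Azure region, a latency-heavy weight favours the faster AWS one. In the real gateway this choice is delegated to the LLM rather than a fixed formula.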
- Start the mock Cloud Providers (Terminals 1 and 2):

  ```shell
  python services/cloud_aws.py
  python services/cloud_azure.py
  ```

- Start the Pricing Market (Terminal 3):

  ```shell
  python services/market_api.py
  ```

- Start the AI Router (Terminal 4):

  ```shell
  python app/router.py
  ```

- Send traffic (Terminal 5):

  ```shell
  curl http://localhost:8000/process
  ```
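For intuition, the mock pricing market started in Terminal 3 could model spot-price drift as a bounded random walk. This is a sketch under assumptions: the function name, volatility, and seed are illustrative and are not the actual `services/market_api.py` implementation.

```python
import random


def spot_price_walk(start: float, steps: int,
                    volatility: float = 0.05, seed: int = 42) -> list:
    """Mock spot-price series: multiplicative random walk with a positive floor."""
    rng = random.Random(seed)   # seeded so the simulation is reproducible
    price, series = start, []
    for _ in range(steps):
        # Each tick moves the price by up to +/- volatility.
        price = max(0.001, price * (1 + rng.uniform(-volatility, volatility)))
        series.append(round(price, 5))
    return series


prices = spot_price_walk(start=0.012, steps=10)
```

Feeding a series like this to the router is enough to exercise the arbitrage path: whenever one provider's walk dips below the other's, traffic should shift toward the cheaper region.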
