Date: February 5, 2026
Problem: The backend was receiving 403 errors when trying to call Vertex AI because the service account didn't have the required IAM role.
Files Modified:
- backend/config/settings.py - Updated validation to not require an API key when using Vertex AI
- backend/services/gemini_service.py - Improved error handling for Vertex AI authentication
- setup-gcp.ps1 - Added Vertex AI APIs and IAM roles
- setup-gcp.sh - Added Vertex AI APIs and IAM roles
What was changed:
- Settings Validation (config/settings.py)
  - gemini_api_key validator now allows an empty key when Vertex AI is enabled
  - validate_settings() function now checks the use_vertex_ai flag before requiring an API key
  - Better error messages when credentials are missing
- Gemini Service (services/gemini_service.py)
  - Improved authentication flow to fail clearly if Vertex AI initialization fails
  - Removed silent fallback to the public API that was causing scope errors
  - Added detailed error messages about required IAM roles
- Setup Scripts
  - Added aiplatform.googleapis.com to enabled APIs
  - Added generativeai.googleapis.com to enabled APIs
  - Added roles/aiplatform.user to service account roles
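The settings change described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Settings` dataclass, the `google_cloud_project` field, the project check, and the wording of the messages are assumptions. Only `gemini_api_key`, `use_vertex_ai`, and `validate_settings()` are named in this document.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    """Hypothetical stand-in for the backend settings object."""
    gemini_api_key: str = ""
    use_vertex_ai: bool = False
    google_cloud_project: str = ""  # assumed field, mirrors GOOGLE_CLOUD_PROJECT


def validate_settings(settings: Settings) -> list[str]:
    """Return human-readable problems; an empty list means the settings are usable."""
    errors: list[str] = []
    if settings.use_vertex_ai:
        # Vertex AI authenticates through the service account, so no API key is required.
        if not settings.google_cloud_project:
            errors.append("USE_VERTEX_AI is true but GOOGLE_CLOUD_PROJECT is not set.")
    elif not settings.gemini_api_key.strip():
        # Only the public API path needs a key.
        errors.append("An API key is required when USE_VERTEX_AI is false.")
    return errors
```

Returning a list of messages rather than raising on the first problem lets startup code report every missing credential at once.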
The backend uses proper async/await patterns in the lifespan manager to handle startup/shutdown without blocking.
Key components:
- ✅ Health check endpoint at /api/health
- ✅ Timeout handling with 10-second initialization window
- ✅ Proper async cleanup on shutdown
- ✅ Python 3.11-slim Docker image for faster cold starts
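A lifespan manager with a bounded startup window might look something like the sketch below. It uses only the standard library and is framework-agnostic; `init_services` and `close_services` are hypothetical coroutine functions standing in for whatever the backend actually initializes, and the real lifespan function may be shaped differently.

```python
import asyncio
from contextlib import asynccontextmanager

INIT_TIMEOUT_SECONDS = 10  # mirrors the 10-second initialization window


@asynccontextmanager
async def lifespan(init_services, close_services, timeout=INIT_TIMEOUT_SECONDS):
    """Run async startup under a deadline, then always run async cleanup."""
    try:
        # wait_for cancels the startup coroutine if it exceeds the deadline.
        await asyncio.wait_for(init_services(), timeout=timeout)
    except asyncio.TimeoutError as exc:
        raise RuntimeError(
            f"Service initialization exceeded {timeout} seconds"
        ) from exc
    try:
        yield  # the application serves requests while suspended here
    finally:
        # Cleanup runs on normal shutdown and when serving raises.
        await close_services()
```

Because startup is awaited rather than run synchronously, the event loop stays free to answer the health check while slower dependencies finish initializing.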
On Windows (PowerShell):
./fix-vertex-ai-permissions.ps1 -ProjectId "legalmind-486106"
On Linux/Mac:
chmod +x fix-vertex-ai-permissions.sh
./fix-vertex-ai-permissions.sh legalmind-486106
# Set your project
gcloud config set project legalmind-486106
# Enable required APIs
gcloud services enable aiplatform.googleapis.com
gcloud services enable generativeai.googleapis.com
# Grant Vertex AI User role
gcloud projects add-iam-policy-binding legalmind-486106 \
--member="serviceAccount:legalmind-backend@legalmind-486106.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"
# From the project root
docker build -t gcr.io/legalmind-486106/legalmind-backend:latest .
docker push gcr.io/legalmind-486106/legalmind-backend:latest
# Deploy with proper configuration
gcloud run deploy legalmind-backend \
--image gcr.io/legalmind-486106/legalmind-backend:latest \
--platform managed \
--region us-central1 \
--allow-unauthenticated \
--memory 1Gi \
--cpu 1 \
--timeout 60 \
--min-instances 1 \
--set-env-vars "GOOGLE_CLOUD_PROJECT=legalmind-486106,USE_VERTEX_AI=true,DEBUG=false"
# Check service status
gcloud run services describe legalmind-backend --region=us-central1
# Check logs
gcloud run services logs read legalmind-backend --region=us-central1 --limit=20
# Test health endpoint
curl https://legalmind-backend-YOUR-ID.us-central1.run.app/api/health
- fix-vertex-ai-permissions.ps1 - PowerShell script to fix permissions
- fix-vertex-ai-permissions.sh - Bash script to fix permissions
- docs/DEPLOYMENT_TROUBLESHOOTING.md - Comprehensive troubleshooting guide
- backend/config/settings.py
  - Line ~70: Updated validate_api_key() to allow empty key with Vertex AI
  - Line ~145: Updated validate_settings() to check use_vertex_ai flag
- backend/services/gemini_service.py
  - Line ~203: Improved _configure_api() error handling and messaging
  - Removed silent fallback that was causing 403 errors
- setup-gcp.ps1
  - Added aiplatform.googleapis.com and generativeai.googleapis.com to APIs
  - Added roles/aiplatform.user to service account roles
- setup-gcp.sh
  - Added aiplatform.googleapis.com and generativeai.googleapis.com to APIs
  - Added roles/aiplatform.user to service account roles
# Real-time logs
gcloud run services logs read legalmind-backend --region=us-central1 --follow
# Service metrics
gcloud run services describe legalmind-backend --region=us-central1 --format=json | jq '.status'
# Check request metrics
gcloud monitoring metrics-descriptors list --filter="metric.type:run.googleapis.com"
- --min-instances 1 keeps one instance always running (costs ~$10/month)
- Prevents cold start errors
- Ensures API is always responsive
For production reliability:
--memory 1Gi # 1 GB RAM (prevents OOM crashes)
--cpu 1 # 1 vCPU (proper concurrency)
--timeout 60 # 60-second request timeout
--min-instances 1 # Always keep 1 warm
--max-instances 10 # Scale up to 10 if needed
- ✅ Run the fix script to add Vertex AI permissions
- ✅ Redeploy the backend
- ✅ Monitor logs for any startup errors
- ✅ Test the health endpoint
- ✅ Verify the frontend can access the backend
If you encounter any issues during deployment:
- Check the troubleshooting guide: docs/DEPLOYMENT_TROUBLESHOOTING.md
- Review logs: gcloud run services logs read legalmind-backend --limit=50
- Verify IAM permissions: gcloud projects get-iam-policy legalmind-486106
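When verifying IAM permissions, reading the full policy by eye is error-prone. The helper below is a hypothetical sketch (the `has_role` function is not part of the repository) that checks a policy exported with `gcloud projects get-iam-policy legalmind-486106 --format=json` for a specific role binding. It relies only on the standard `bindings` / `role` / `members` layout of an IAM policy.

```python
import json


def has_role(policy_json: str, member: str, role: str) -> bool:
    """Return True if the IAM policy JSON binds `role` to `member`."""
    policy = json.loads(policy_json)
    return any(
        binding.get("role") == role and member in binding.get("members", [])
        for binding in policy.get("bindings", [])
    )


if __name__ == "__main__":
    import sys

    # Usage: gcloud projects get-iam-policy legalmind-486106 --format=json | python check_role.py
    member = "serviceAccount:legalmind-backend@legalmind-486106.iam.gserviceaccount.com"
    ok = has_role(sys.stdin.read(), member, "roles/aiplatform.user")
    print("binding present" if ok else "binding missing")
```

The script name `check_role.py` in the usage comment is illustrative only.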
- Initialization Gap: The service account was created with basic roles but Vertex AI wasn't included
- Silent Fallback: When Vertex AI init failed, the code tried to fall back to the public API
- Scope Mismatch: Service account tokens don't have scopes for the public Gemini API
- Result: 403 "ACCESS_TOKEN_SCOPE_INSUFFICIENT" error
- Explicit Roles: Service account now has roles/aiplatform.user
- Clear Errors: Code fails with helpful message instead of silent fallback
- Required Setup: Setup scripts now include Vertex AI configuration
- Better Validation: Settings properly handle Vertex AI credentials
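The move from a silent fallback to a clear failure can be illustrated with the sketch below. It is not the actual `_configure_api()` implementation: `configure_api`, `VertexAIConfigurationError`, and the injected `init_vertex` / `init_public_api` callables are hypothetical names used so the example stays self-contained.

```python
class VertexAIConfigurationError(RuntimeError):
    """Raised when Vertex AI cannot be initialized (hypothetical name)."""


def configure_api(use_vertex_ai, init_vertex, init_public_api,
                  service_account="<service-account>"):
    """Initialize exactly one backend; never fall back from Vertex AI silently."""
    if use_vertex_ai:
        try:
            return init_vertex()
        except Exception as exc:
            # Deliberately no fallback: service account tokens lack the scopes
            # the public API expects, so retrying there only yields a 403.
            raise VertexAIConfigurationError(
                "Vertex AI initialization failed. Verify that "
                f"{service_account} has roles/aiplatform.user and that "
                "aiplatform.googleapis.com is enabled."
            ) from exc
    return init_public_api()
```

Chaining with `from exc` keeps the original error in the traceback, so the underlying cause still shows up in the Cloud Run logs next to the actionable message.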
Cloud Run Service (legalmind-backend)
↓
Service Account (legalmind-backend@legalmind-486106.iam.gserviceaccount.com)
↓
Application Default Credentials (ADC)
↓
Vertex AI API (with proper IAM role: aiplatform.user)
Refer to:
- GCP IAM Documentation: https://cloud.google.com/iam/docs
- Vertex AI Documentation: https://cloud.google.com/vertex-ai/docs
- Cloud Run Documentation: https://cloud.google.com/run/docs