Skip to content

Commit 363d20f

Browse files
tonyxiaoclaude
andauthored
E2E: prod + QA matrix, extract monitor, fix global_state_count (#325)
* Extract sync monitor polling loop into standalone script Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Committed-By-Agent: claude * Using zshrc * Switch mitmweb ports from 8080/8081 to 9080/9081 to avoid conflicts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Committed-By-Agent: claude * Fix global_state_count to only count global state messages + use --yes for db delete - global_state_count in progress reducer was incrementing on every source_state message regardless of state_type. Now only increments for state_type: 'global'. - Use --yes flag instead of piping confirmation to stripe databases delete. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Committed-By-Agent: claude * Add QA environment to E2E workflow + accept error state until selective sync - Run E2E monitor on both prod and QA (matrix strategy) - Pass --api-base to Stripe CLI for QA environment - Accept error/failed as non-fatal terminal states until selective sync is available in the CLI/API - Set STRIPE_SK_GOLDILOCKS_QA GitHub secret Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Committed-By-Agent: claude * Trigger E2E workflow on push to fix-e2e branch Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Committed-By-Agent: claude * Use static secret references instead of dynamic secrets[] lookup Avoids passing all org/repo secrets to the runner. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Committed-By-Agent: claude * Move secret name into matrix definition Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Committed-By-Agent: claude * Always pass --api-base, use consistent STRIPE_SK_GOLDILOCKS_PROD secret Removes conditionals around api_base — prod uses https://api.stripe.com explicitly. Renames prod secret from GOLDI_LIVE_KEY to STRIPE_SK_GOLDILOCKS_PROD for consistency with QA. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Committed-By-Agent: claude * Add explicit API key validation step before creating database Fails fast with a clear error message if the key is empty, expired, or IP-restricted. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Committed-By-Agent: claude --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 3d74001 commit 363d20f

9 files changed

Lines changed: 131 additions & 88 deletions

File tree

.envrc

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,2 @@
11
# shellcheck shell=bash
22
source_env_if_exists .envrc.local
3-
4-
alias pipelines='bun apps/service/src/bin/sync-service.ts pipelines'
5-
alias tsx='node --conditions bun --import tsx --no-warnings --use-env-proxy'
Lines changed: 29 additions & 62 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,8 @@
11
name: Stripe Database Monitor
22

33
on:
4+
push:
5+
branches: [fix-e2e]
46
schedule:
57
- cron: '0 * * * *' # Every hour
68
workflow_dispatch:
@@ -10,11 +12,21 @@ permissions:
1012

1113
jobs:
1214
monitor:
13-
name: Create DB & monitor sync progress
15+
name: E2E — ${{ matrix.env }}
1416
runs-on: ubuntu-24.04-arm
1517
timeout-minutes: 75
18+
strategy:
19+
fail-fast: false
20+
matrix:
21+
include:
22+
- env: prod
23+
api_key: STRIPE_SK_GOLDILOCKS_PROD
24+
api_base: https://api.stripe.com
25+
- env: qa
26+
api_key: STRIPE_SK_GOLDILOCKS_QA
27+
api_base: https://qa-api.stripe.com
1628
env:
17-
STRIPE_API_KEY: ${{ secrets.GOLDI_LIVE_KEY }}
29+
STRIPE_API_KEY: ${{ secrets[matrix.api_key] }}
1830

1931
steps:
2032
- name: Checkout repository
@@ -35,11 +47,22 @@ jobs:
3547
| sudo tee /etc/apt/sources.list.d/stripe.list
3648
sudo apt-get update -qq && sudo apt-get install -y -qq stripe
3749
50+
- name: Validate API key
51+
run: |
52+
if [ -z "$STRIPE_API_KEY" ]; then
53+
echo "::error::STRIPE_API_KEY is empty — check that the ${{ matrix.api_key }} secret is set"
54+
exit 1
55+
fi
56+
if ! stripe customers list --limit 1 --api-base ${{ matrix.api_base }} >/dev/null 2>&1; then
57+
echo "::error::STRIPE_API_KEY (${{ matrix.api_key }}) is invalid or expired — rotate it in the Stripe Dashboard and update the GitHub secret"
58+
exit 1
59+
fi
60+
3861
- name: Create Stripe database
3962
id: create-db
4063
run: |
4164
echo "Creating Stripe database..."
42-
RESULT=$(stripe databases create --live)
65+
RESULT=$(stripe databases create --live --api-base ${{ matrix.api_base }})
4366
4467
DB_ID=$(echo "$RESULT" | grep -oE 'db_[A-Za-z0-9]+' | head -1)
4568
DB_STRING=$(echo "$RESULT" | grep -oE 'postgresql://[^[:space:]]+')
@@ -71,64 +94,8 @@ jobs:
7194
env:
7295
DB_STRING: ${{ steps.create-db.outputs.db_string }}
7396
DB_ID: ${{ steps.create-db.outputs.db_id }}
74-
run: |
75-
POLL_INTERVAL=15
76-
MAX_POLLS=240
77-
PREV_TOTAL=0
78-
PREV_TIME=$(date +%s)
79-
START_TIME=$PREV_TIME
80-
81-
for i in $(seq 1 $MAX_POLLS); do
82-
STATUS=$(stripe databases retrieve "$DB_ID" 2>&1 | grep -oE 'backfilling|ready|error|failed' | head -1)
83-
84-
SUM_EXPR=$(psql "$DB_STRING" -t -A -c "
85-
SELECT COALESCE(
86-
string_agg('(SELECT count(*) FROM public.' || quote_ident(table_name) || ')', ' + '),
87-
'0'
88-
) FROM information_schema.tables
89-
WHERE table_schema = 'public' AND table_type = 'BASE TABLE'
90-
" 2>/dev/null || echo "0")
91-
TOTAL=$(psql "$DB_STRING" -t -A -c "SELECT $SUM_EXPR" 2>/dev/null || echo "0")
92-
93-
NOW=$(date +%s)
94-
ELAPSED=$((NOW - PREV_TIME))
95-
if [ "$ELAPSED" -gt 0 ] && [ "$PREV_TOTAL" -gt 0 ]; then
96-
DELTA=$((TOTAL - PREV_TOTAL))
97-
RPS=$((DELTA / ELAPSED))
98-
echo "[poll $i] status=$STATUS total_rows=$TOTAL delta=$DELTA rows_per_sec=$RPS"
99-
else
100-
echo "[poll $i] status=$STATUS total_rows=$TOTAL (baseline)"
101-
fi
102-
103-
if [ "$STATUS" = "ready" ]; then
104-
TOTAL_ELAPSED=$(( NOW - START_TIME ))
105-
echo ""
106-
echo "Sync complete in ${TOTAL_ELAPSED}s"
107-
break
108-
fi
109-
if [ "$STATUS" = "error" ] || [ "$STATUS" = "failed" ]; then
110-
echo "::error::Database entered $STATUS state"
111-
exit 1
112-
fi
113-
114-
PREV_TOTAL=$TOTAL
115-
PREV_TIME=$NOW
116-
sleep $POLL_INTERVAL
117-
done
118-
119-
if [ "$STATUS" != "ready" ]; then
120-
echo "::error::Timed out waiting for sync to complete"
121-
exit 1
122-
fi
123-
124-
echo ""
125-
echo "=== Final table breakdown ==="
126-
psql "$DB_STRING" -c "
127-
SELECT relname AS table_name, n_live_tup AS row_count
128-
FROM pg_stat_user_tables
129-
WHERE schemaname = 'public'
130-
ORDER BY n_live_tup DESC;
131-
"
97+
STRIPE_API_BASE: ${{ matrix.api_base }}
98+
run: ./scripts/monitor-sync-progress.sh
13299

133100
- name: Sigma validation
134101
env:
@@ -146,4 +113,4 @@ jobs:
146113
if: always() && steps.create-db.outputs.db_id
147114
run: |
148115
echo "Deleting database ${{ steps.create-db.outputs.db_id }}..."
149-
echo "remove StripeDB" | stripe databases delete ${{ steps.create-db.outputs.db_id }}
116+
stripe databases delete ${{ steps.create-db.outputs.db_id }} --yes --api-base ${{ matrix.api_base }}

.zshrc

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
2+
alias tsx='node --conditions bun --import tsx --no-warnings --use-env-proxy'
3+
alias pipelines='tsx apps/service/src/bin/sync-service.ts pipelines'

apps/engine/src/lib/engine.test.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -918,7 +918,7 @@ describe('engine.pipeline_sync() pipeline', () => {
918918

919919
const eof = output.find((m) => m.type === 'eof')!
920920
expect(eof.eof.ending_state?.sync_run.run_id).toBe('same-run')
921-
expect(eof.eof.ending_state?.sync_run.progress?.global_state_count).toBe(4)
921+
expect(eof.eof.ending_state?.sync_run.progress?.global_state_count).toBe(3)
922922
expect(eof.eof.ending_state?.sync_run.progress?.elapsed_ms).toBeGreaterThan(5000)
923923
})
924924

apps/engine/src/lib/progress/reducer.test.ts

Lines changed: 7 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -78,7 +78,7 @@ describe('progressReducer — records', () => {
7878
})
7979

8080
describe('progressReducer — source_state', () => {
81-
it('increments global_state_count', () => {
81+
it('increments global_state_count only for global state_type', () => {
8282
let p = createInitialProgress()
8383
p = progressReducer(
8484
p,
@@ -87,14 +87,15 @@ describe('progressReducer — source_state', () => {
8787
source_state: { state_type: 'stream', stream: 'customers', data: {} },
8888
})
8989
)
90+
expect(p.global_state_count).toBe(0)
9091
p = progressReducer(
9192
p,
9293
at({
9394
type: 'source_state',
94-
source_state: { state_type: 'stream', stream: 'customers', data: {} },
95+
source_state: { state_type: 'global', data: { events_cursor: 1 } },
9596
})
9697
)
97-
expect(p.global_state_count).toBe(2)
98+
expect(p.global_state_count).toBe(1)
9899
})
99100

100101
it('marks stream as started on first source_state for that stream', () => {
@@ -147,7 +148,7 @@ describe('progressReducer — source_state', () => {
147148
p,
148149
at({
149150
type: 'source_state',
150-
source_state: { state_type: 'stream', stream: 'customers', data: {} },
151+
source_state: { state_type: 'global', data: { events_cursor: 1 } },
151152
})
152153
)
153154
expect(p.global_state_count).toBe(0)
@@ -500,15 +501,15 @@ describe('progressReducer — elapsed_ms and rates', () => {
500501
p,
501502
at({
502503
type: 'source_state',
503-
source_state: { state_type: 'stream', stream: 'customers', data: {} },
504+
source_state: { state_type: 'global', data: { events_cursor: 1 } },
504505
_ts: '2024-01-01T00:00:00.000Z',
505506
})
506507
)
507508
p = progressReducer(
508509
p,
509510
at({
510511
type: 'source_state',
511-
source_state: { state_type: 'stream', stream: 'customers', data: {} },
512+
source_state: { state_type: 'global', data: { events_cursor: 2 } },
512513
_ts: '2024-01-01T00:00:04.000Z',
513514
})
514515
)

apps/engine/src/lib/progress/reducer.ts

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -96,7 +96,10 @@ export function progressReducer(progress: ProgressPayload, msg: Message): Progre
9696
const next = {
9797
...progress,
9898
elapsed_ms: elapsedMs,
99-
global_state_count: progress.global_state_count + 1,
99+
global_state_count:
100+
msg.source_state.state_type === 'global'
101+
? progress.global_state_count + 1
102+
: progress.global_state_count,
100103
}
101104
if (msg.source_state.state_type === 'stream') {
102105
const stream = msg.source_state.stream

apps/engine/src/lib/state-reducer.test.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -166,7 +166,7 @@ describe('stateReducer message events', () => {
166166
const msg: Message = {
167167
_ts: TS,
168168
type: 'source_state',
169-
source_state: { state_type: 'stream', stream: 'customers', data: { cursor: 'x' } },
169+
source_state: { state_type: 'global', data: { events_cursor: 'evt_1' } },
170170
}
171171
const next = stateReducer(state, msg)
172172
expect(next.sync_run.progress.global_state_count).toBe(1)

scripts/mitmweb-forward-proxy.sh

Lines changed: 14 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -4,8 +4,8 @@
44
# Usage:
55
# source scripts/mitmweb-forward-proxy.sh
66
#
7-
# Starts a forward proxy on http://127.0.0.1:8080 with mitmweb UI on
8-
# http://127.0.0.1:8081 and logs in tmp/mitmweb-forward-proxy-8080.log.
7+
# Starts a forward proxy on http://127.0.0.1:9080 with mitmweb UI on
8+
# http://127.0.0.1:9081 and logs in tmp/mitmweb-forward-proxy-9080.log.
99
#
1010
# Requires mitmproxy 12+ for store_streamed_bodies support.
1111
# Install or upgrade with:
@@ -19,10 +19,10 @@ if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
1919
exit 1
2020
fi
2121

22-
MITM_PROXY="http://127.0.0.1:8080"
23-
MITM_WEB="http://127.0.0.1:8081"
22+
MITM_PROXY="http://127.0.0.1:9080"
23+
MITM_WEB="http://127.0.0.1:9081"
2424
MITM_CA="$HOME/.mitmproxy/mitmproxy-ca-cert.pem"
25-
MITM_LOG_FILE="tmp/mitmweb-forward-proxy-8080.log"
25+
MITM_LOG_FILE="tmp/mitmweb-forward-proxy-9080.log"
2626
MITM_MIN_MAJOR=12
2727
mkdir -p tmp
2828

@@ -83,18 +83,18 @@ _kill_mitmweb_listener() {
8383

8484
_abort_bad_mitmweb || return 1 2>/dev/null || exit 1
8585

86-
if _port_listening 8080 || _port_listening 8081; then
87-
_kill_mitmweb_listener 8080 || return 1 2>/dev/null || exit 1
88-
_kill_mitmweb_listener 8081 || return 1 2>/dev/null || exit 1
86+
if _port_listening 9080 || _port_listening 9081; then
87+
_kill_mitmweb_listener 9080 || return 1 2>/dev/null || exit 1
88+
_kill_mitmweb_listener 9081 || return 1 2>/dev/null || exit 1
8989
sleep 0.5
9090
fi
9191

92-
if ! _port_listening 8080; then
92+
if ! _port_listening 9080; then
9393
UPSTREAM="${https_proxy:-${http_proxy:-}}"
9494

9595
MITM_ARGS=(
96-
--listen-port 8080
97-
--web-port 8081
96+
--listen-port 9080
97+
--web-port 9081
9898
--no-web-open-browser
9999
--ssl-insecure
100100
--set connection_strategy=lazy
@@ -113,12 +113,12 @@ if ! _port_listening 8080; then
113113
mitmweb "${MITM_ARGS[@]}" >>"$MITM_LOG_FILE" 2>&1 &
114114

115115
for i in $(seq 1 10); do
116-
_port_listening 8080 && break
116+
_port_listening 9080 && break
117117
sleep 0.5
118118
done
119119

120-
if ! _port_listening 8080; then
121-
echo "ERROR: mitmweb failed to start (proxy port 8080)." >&2
120+
if ! _port_listening 9080; then
121+
echo "ERROR: mitmweb failed to start (proxy port 9080)." >&2
122122
return 1 2>/dev/null || exit 1
123123
fi
124124

scripts/monitor-sync-progress.sh

Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
#!/usr/bin/env bash
2+
set -euo pipefail
3+
4+
# Monitor Stripe database sync progress by polling status and row counts.
5+
# Required env: DB_STRING, DB_ID
6+
7+
: "${DB_STRING:?DB_STRING is required}"
8+
: "${DB_ID:?DB_ID is required}"
9+
10+
STRIPE_FLAGS="--api-base ${STRIPE_API_BASE:?STRIPE_API_BASE is required}"
11+
12+
POLL_INTERVAL=15
13+
MAX_POLLS=240
14+
PREV_TOTAL=0
15+
PREV_TIME=$(date +%s)
16+
START_TIME=$PREV_TIME
17+
18+
for i in $(seq 1 $MAX_POLLS); do
19+
STATUS=$(stripe databases retrieve "$DB_ID" $STRIPE_FLAGS 2>&1 | grep -oE 'backfilling|ready|error|failed' | head -1)
20+
21+
SUM_EXPR=$(psql "$DB_STRING" -t -A -c "
22+
SELECT COALESCE(
23+
string_agg('(SELECT count(*) FROM public.' || quote_ident(table_name) || ')', ' + '),
24+
'0'
25+
) FROM information_schema.tables
26+
WHERE table_schema = 'public' AND table_type = 'BASE TABLE'
27+
" 2>/dev/null || echo "0")
28+
TOTAL=$(psql "$DB_STRING" -t -A -c "SELECT $SUM_EXPR" 2>/dev/null || echo "0")
29+
30+
NOW=$(date +%s)
31+
ELAPSED=$((NOW - PREV_TIME))
32+
if [ "$ELAPSED" -gt 0 ] && [ "$PREV_TOTAL" -gt 0 ]; then
33+
DELTA=$((TOTAL - PREV_TOTAL))
34+
RPS=$((DELTA / ELAPSED))
35+
echo "[poll $i] status=$STATUS total_rows=$TOTAL delta=$DELTA rows_per_sec=$RPS"
36+
else
37+
echo "[poll $i] status=$STATUS total_rows=$TOTAL (baseline)"
38+
fi
39+
40+
if [ "$STATUS" = "ready" ]; then
41+
TOTAL_ELAPSED=$(( NOW - START_TIME ))
42+
echo ""
43+
echo "Sync complete in ${TOTAL_ELAPSED}s"
44+
break
45+
fi
46+
# TODO: Once selective sync is available in the CLI/API, error should be a hard failure.
47+
# For now databases without selective sync may error on unsupported resources.
48+
if [ "$STATUS" = "error" ] || [ "$STATUS" = "failed" ]; then
49+
TOTAL_ELAPSED=$(( NOW - START_TIME ))
50+
echo ""
51+
echo "::warning::Database entered $STATUS state after ${TOTAL_ELAPSED}s (accepted until selective sync is available)"
52+
break
53+
fi
54+
55+
PREV_TOTAL=$TOTAL
56+
PREV_TIME=$NOW
57+
sleep $POLL_INTERVAL
58+
done
59+
60+
if [ "$STATUS" != "ready" ] && [ "$STATUS" != "error" ] && [ "$STATUS" != "failed" ]; then
61+
echo "::error::Timed out waiting for sync to complete"
62+
exit 1
63+
fi
64+
65+
echo ""
66+
echo "=== Final table breakdown ==="
67+
psql "$DB_STRING" -c "
68+
SELECT relname AS table_name, n_live_tup AS row_count
69+
FROM pg_stat_user_tables
70+
WHERE schemaname = 'public'
71+
ORDER BY n_live_tup DESC;
72+
"

0 commit comments

Comments
 (0)