
Remove GA4 Data API data source #109

Merged: marevol merged 4 commits into main from refactor/remove-ga4-data-api on May 25, 2026

Conversation

@marevol (Collaborator) commented May 25, 2026

Summary

Removes the GA4 Data API data source (source.type: ga4). The GA4 Data API cannot return a stable user identifier (userId) suitable for collaborative-filtering training, so the source has no viable use.

What was removed

  • src/recotem/datasource/ga4.py, src/recotem/_metrics_ga4.py
  • config.get_ga4_max_pages + the RECOTEM_GA4_MAX_PAGES env var
  • The ga4 entry point and the ga4 optional-dependency extra (google-analytics-data); uv.lock regenerated
  • All GA4-Data-API tests (unit, metrics, fuzz)
  • docs/data-sources/ga4.md, examples/ga4-data-api/, the --extra ga4 CI step + GA4-SDK smoke-test import, and GA4 mentions in README / CLAUDE.md / recipe-reference

What was preserved (intentional)

  • examples/ga4-bigquery/ — uses type: bigquery (NOT the ga4 source); the BigQuery export carries userId, so this remains the supported way to use GA4 data
  • docs/data-sources/bigquery.md GA4 events_* query patterns and related GCP_PROJECT notes
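
As a rough illustration of the preserved BigQuery path, the sketch below builds the kind of `events_*` query a `type: bigquery` source could run against a GA4 export. It is not taken from this repository: the function name, the project and dataset values, and the choice of the `view_item` event are assumptions, and the column names (`user_id`, `items.item_id`, `event_timestamp`) are those of the standard GA4 BigQuery export schema rather than anything this PR defines.

```python
def ga4_export_interactions_sql(project: str, dataset: str, start: str, end: str) -> str:
    """Build a query returning (user_id, item_id, event_timestamp) rows.

    Hypothetical helper: `project` and `dataset` are placeholders for a
    GA4 BigQuery export location; `start` and `end` are YYYYMMDD suffixes
    of the daily `events_*` tables.
    """
    for value in (start, end):
        # Reject anything that is not an 8-digit date suffix.
        if not (len(value) == 8 and value.isdigit()):
            raise ValueError(f"expected YYYYMMDD, got {value!r}")
    return f"""
SELECT
  user_id,
  item.item_id AS item_id,
  event_timestamp
FROM `{project}.{dataset}.events_*`, UNNEST(items) AS item
WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'
  AND event_name = 'view_item'  -- assumed interaction event
  AND user_id IS NOT NULL       -- keep only rows with a stable identifier
""".strip()


if __name__ == "__main__":
    print(ga4_export_interactions_sql("my-project", "analytics_000000", "20260101", "20260131"))
```

The `user_id IS NOT NULL` filter is the point of the example: the export exposes the identifier that the removed Data API source could not provide.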

Behaviour after removal

recotem validate on a type: ga4 recipe now fails with Unknown DataSource type 'ga4'. Known types: ['bigquery', 'csv', 'parquet', 'sql'] (exit code 2).
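
The failure mode above can be sketched as a registry lookup. This is a hypothetical stand-in, not recotem's actual code: the `_SOURCES` dict, `UnknownDataSourceError`, `resolve_source`, and `validate` are names invented for illustration, and the real project resolves sources through entry points. Only the message text and the exit code come from this PR.

```python
import sys

# Hypothetical registry standing in for recotem's entry-point lookup.
# The keys mirror the "Known types" list in the error message above.
_SOURCES = {"bigquery": object, "csv": object, "parquet": object, "sql": object}


class UnknownDataSourceError(ValueError):
    """Raised when a recipe names a source type that is not registered."""


def resolve_source(source_type: str) -> type:
    try:
        return _SOURCES[source_type]
    except KeyError:
        raise UnknownDataSourceError(
            f"Unknown DataSource type {source_type!r}. "
            f"Known types: {sorted(_SOURCES)}"
        ) from None


def validate(recipe: dict) -> int:
    """Return a process exit code: 0 if the source type resolves, 2 otherwise."""
    try:
        resolve_source(recipe["source"]["type"])
    except UnknownDataSourceError as exc:
        print(exc, file=sys.stderr)
        return 2
    return 0


if __name__ == "__main__":
    sys.exit(validate({"source": {"type": "ga4"}}))
```

Because `ga4` is simply absent from the registry, no GA4-specific error path is needed; the generic unknown-type handling covers it.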

Verification

  • uv run pytest tests → 1785 passed, 4 deselected
  • uv run ruff check src tests / uv run ruff format --check src tests → clean
  • uv sync --frozen --all-extras → lock consistent; no google-analytics-data
  • Repo-wide sweep (including non-ga4 SDK tokens google.analytics / data_v1beta) → only intentional BigQuery-export references remain
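
A repo-wide sweep of this kind could look roughly like the sketch below. The PR does not say which tool was used, so this is an assumption throughout: the `sweep` function, its token list, and the allow-list of intentional BigQuery-export paths are illustrative only.

```python
from pathlib import Path

# Tokens named in the verification notes, plus the source name itself.
TOKENS = ("google.analytics", "data_v1beta", "ga4")

# Paths where GA4 references are intentional (assumed from the PR text).
ALLOWED = ("examples/ga4-bigquery", "docs/data-sources/bigquery.md")


def sweep(root, tokens=TOKENS, allowed=ALLOWED):
    """Return (relative_path, line_number, token) for each unexpected hit."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or ".git" in path.parts:
            continue
        rel = path.relative_to(root).as_posix()
        # Skip files that are allow-listed or live under an allow-listed directory.
        if any(rel == a or rel.startswith(a + "/") for a in allowed):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file
        for lineno, line in enumerate(text.splitlines(), start=1):
            lowered = line.lower()
            for token in tokens:
                if token in lowered:
                    hits.append((rel, lineno, token))
    return hits


if __name__ == "__main__":
    for rel, lineno, token in sweep("."):
        print(f"{rel}:{lineno}: {token}")
```

An empty result would correspond to the "only intentional BigQuery-export references remain" outcome reported above.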

🤖 Generated with Claude Code

marevol added 4 commits May 25, 2026 23:00

The GA4 Data API cannot return a stable userId suitable for
collaborative-filtering training, so the ga4 source is removed.
GA4 via the BigQuery export (examples/ga4-bigquery, type: bigquery)
is unaffected.

Drops docs/data-sources/ga4.md, examples/ga4-data-api/, the --extra ga4
CI step, and GA4 mentions in README/CLAUDE.md/recipe-reference.
Keeps examples/ga4-bigquery (type: bigquery) and the BigQuery-export
GA4 query-pattern docs.

@marevol marevol added this to the 2.0.0a1 milestone May 25, 2026
@marevol marevol merged commit 5df6760 into main May 25, 2026
11 checks passed
marevol added a commit to codelibs/recotem-docs that referenced this pull request May 26, 2026
Reflects codelibs/recotem#109, which removed the `source.type: ga4`
data source. Drops the GA4 source page, recipe `source.type: ga4`
section, `recotem[ga4]` install extra, `RECOTEM_GA4_MAX_PAGES` env var,
and the GA4 sidebar entry, in both English and Japanese v2 docs.

BigQuery-based GA4 usage (`events_*` export via `type: bigquery`) is
preserved.

Also repoints a pre-existing dead link in environment-variables.md
(`./data-sources/` -> `./recipe-reference#source`) so the build passes.