[codex] Fix shared embeddings and string sparse validation#551

Merged
shenweichen merged 5 commits into master from codex/fix-shared-embedding-esmm
Apr 25, 2026

@shenweichen (Owner) commented Apr 17, 2026

Summary

  • Reuse embedding layers by embedding_name for both SparseFeat and VarLenSparseFeat instead of recreating and overwriting duplicate entries.
  • Preserve sequence masking when a shared embedding is used by varlen sparse features.
  • Raise a clear ValueError when a string SparseFeat is configured without use_hash=True, so users do not hit TensorFlow's lower-level string-to-int embedding lookup failure.
  • Add regression coverage for shared embeddings and ESMM string sparse feature configuration.
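
The reuse-by-embedding_name behavior in the first two bullets can be sketched without TensorFlow. This is a minimal illustration under stated assumptions, not the code from this PR: FeatureSpec, FakeEmbedding, and build_embedding_dict are hypothetical stand-ins for DeepCTR's feature columns and embedding layers, and turning on mask_zero when any sharing feature is variable-length is only one plausible way to preserve sequence masking.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FeatureSpec:
    """Hypothetical stand-in for SparseFeat / VarLenSparseFeat."""
    name: str
    vocabulary_size: int
    embedding_dim: int
    embedding_name: Optional[str] = None  # falls back to the feature name
    is_varlen: bool = False

    @property
    def table_name(self) -> str:
        return self.embedding_name or self.name


@dataclass
class FakeEmbedding:
    """Hypothetical stand-in for an embedding layer."""
    name: str
    vocabulary_size: int
    embedding_dim: int
    mask_zero: bool = False


def build_embedding_dict(features: List[FeatureSpec]) -> Dict[str, FakeEmbedding]:
    """Create one embedding per embedding_name and reuse it on repeats."""
    tables: Dict[str, FakeEmbedding] = {}
    for feat in features:
        existing = tables.get(feat.table_name)
        if existing is None:
            # First feature to ask for this name creates the table.
            tables[feat.table_name] = FakeEmbedding(
                name=feat.table_name,
                vocabulary_size=feat.vocabulary_size,
                embedding_dim=feat.embedding_dim,
                mask_zero=feat.is_varlen,
            )
        else:
            # Later features reuse the table instead of overwriting it.
            # Assumption: masking stays on if any sharer is a sequence feature.
            existing.mask_zero = existing.mask_zero or feat.is_varlen
    return tables
```

With a feature item_id and a variable-length feature hist_item_id that sets embedding_name="item_id", the sketch returns a single table keyed by item_id with masking enabled, rather than two tables where the second silently replaces the first.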

Why

Fixes #419 by making shared embedding semantics explicit and avoiding repeated initialization for the same embedding_name. Fixes #548 by clarifying the supported path for string sparse features in ESMM and other models: use use_hash=True, or pre-encode values to integer ids before passing them to DeepCTR.
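
The two supported paths for string sparse features can be sketched in plain Python. This is a hedged illustration, not DeepCTR's implementation: check_string_sparse and encode_to_ids are hypothetical helpers, the error message wording is invented for the example, and ids start at 0 only by assumption.

```python
from typing import Dict, Iterable, List, Tuple


def check_string_sparse(name: str, dtype: str, use_hash: bool) -> None:
    """Hypothetical early validation for a string-typed sparse feature."""
    if dtype == "string" and not use_hash:
        # Fail at configuration time instead of deep inside the lookup.
        raise ValueError(
            f"Feature {name!r} has dtype 'string' but use_hash is False; "
            "set use_hash=True or pre-encode the values to integer ids."
        )


def encode_to_ids(values: Iterable[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map each distinct string to an integer id in first-seen order."""
    vocab: Dict[str, int] = {}
    ids: List[int] = []
    for value in values:
        if value not in vocab:
            vocab[value] = len(vocab)  # assumption: ids start at 0
        ids.append(vocab[value])
    return ids, vocab
```

After pre-encoding, len(vocab) gives the number of distinct ids in the column, which is the figure a vocabulary size would need to cover.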

Validation

  • git diff --check
  • python -m compileall -q deepctr tests
  • python -m pytest -q tests/feature_test.py tests/models/MTL_test.py was attempted locally, but this environment does not have TensorFlow installed (ModuleNotFoundError: No module named 'tensorflow').

codecov Bot commented Apr 17, 2026

Codecov Report

❌ Patch coverage is 93.75000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.23%. Comparing base (b6a623a) to head (242ec66).
⚠️ Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
deepctr/feature_column.py 77.77% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #551      +/-   ##
==========================================
+ Coverage   78.11%   78.23%   +0.11%     
==========================================
  Files          61       61              
  Lines        3304     3331      +27     
==========================================
+ Hits         2581     2606      +25     
- Misses        723      725       +2     
Flag Coverage Δ
pytest 78.23% <93.75%> (+0.11%) ⬆️

@codacy-production

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

🟢 Metrics 21 complexity · 0 duplication

Metric Results
Complexity 21
Duplication 0

@shenweichen shenweichen marked this pull request as ready for review April 24, 2026 16:52
@shenweichen shenweichen merged commit a827877 into master Apr 25, 2026
13 of 17 checks passed