refactor: parameterize uuid generation#6961
Conversation
Rust Dependency DiffHead: ✅ OK: Within budget.
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6961 +/- ##
========================================
Coverage 75.68% 75.68%
========================================
Files 1146 1146
Lines 161232 161391 +159
========================================
+ Hits 122032 122154 +122
- Misses 39200 39237 +37
🚀 New features to boost your workflow:
|
Use the stdlib Self type on Python 3.11+ while keeping the existing typing_extensions fallback for Python 3.10.
Greptile SummaryThis PR consolidates UUIDv4 and UUIDv7 generation into a single
Confidence Score: 4/5Safe to merge; all changes are additive or internal consolidation with no effect on public output types. The refactor is clean and the new tests are thorough. The only loose ends are a dead import sys in series.py and the asymmetric v4/v7 call paths in misc.py — neither affects correctness or output. The uuidv7 function name and its uuid_v7 alias are no longer registered, so any serialized query plans referencing those names would fail to deserialize. daft/series.py (unused import), daft/functions/misc.py (asymmetric dispatch paths) Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["Python: uuid(version='v4'|'v7')"] --> B{version?}
B -->|v4| C["_call_builtin_scalar_fn('uuid')"]
B -->|v7| D["_call_builtin_scalar_fn('uuid', 'v7')"]
C --> E["Rust: Uuid ScalarUDF"]
D --> E
E --> F["UuidArgs::try_into"]
F --> G{UuidVersion}
G -->|V4 default| H["RustUuid::new_v4()"]
G -->|V7| I["RustUuid::now_v7()"]
H --> J["uuid_kernel"]
I --> J
J --> K["UuidArray Series"]
|
| import sys | ||
| from typing import TYPE_CHECKING, Any, Literal, TypeVar |
There was a problem hiding this comment.
The
sys module is imported here but never referenced anywhere in series.py. This appears to be an accidental leftover from development — no sys. call exists in the file.
| import sys | |
| from typing import TYPE_CHECKING, Any, Literal, TypeVar | |
| from typing import TYPE_CHECKING, Any, Literal, TypeVar |
| if version == "v4": | ||
| return Expression._call_builtin_scalar_fn("uuid") | ||
| if version == "v7": | ||
| return Expression._call_builtin_scalar_fn("uuidv7") | ||
| return Expression._call_builtin_scalar_fn("uuid", version) | ||
| raise ValueError("`version` must be 'v4' or 'v7'") |
There was a problem hiding this comment.
The v4 and v7 dispatch paths are asymmetric: v4 omits the version argument (relying on the Rust default), while v7 passes it explicitly. This works correctly today, but the two paths can be unified to make the intent clearer and ensure both versions are handled identically at the call site. An invalid version string will still be caught by
raise ValueError before either call is reached.
| if version == "v4": | |
| return Expression._call_builtin_scalar_fn("uuid") | |
| if version == "v7": | |
| return Expression._call_builtin_scalar_fn("uuidv7") | |
| return Expression._call_builtin_scalar_fn("uuid", version) | |
| raise ValueError("`version` must be 'v4' or 'v7'") | |
| if version in ("v4", "v7"): | |
| return Expression._call_builtin_scalar_fn("uuid", version) | |
| raise ValueError("`version` must be 'v4' or 'v7'") |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ce88822625
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| @@ -74,6 +74,5 @@ impl FunctionModule for MiscFunctions { | |||
| parent.add_fn(ToStructFunction); | |||
| parent.add_fn(Slice); | |||
| parent.add_fn(Uuid); | |||
There was a problem hiding this comment.
Preserve
uuidv7 function aliases in registry
Registering only Uuid here removes the previously registered uuidv7/uuid_v7 entry points, so existing SQL or planner expressions that call those names now fail with a function-not-found error after this refactor. The new parameterized uuid(version := 'v7') path is good, but backward-compatible aliases should remain registered (or be exposed via Uuid::aliases) to avoid breaking existing workloads.
Useful? React with 👍 / 👎.
Summary
uuid, parameterized by a version literal.uuidv7registration with a sharedUuidVersiondispatch path backed by theuuidcrate.Validation
cargo fmt --all --check.venv/bin/ruff check daft/functions/misc.py tests/dataframe/test_uuid.py.venv/bin/ruff format --check daft/functions/misc.py tests/dataframe/test_uuid.pycargo test -p daft-functions uuid --libDAFT_RUNNER=native make test EXTRA_ARGS="-v tests/dataframe/test_uuid.py"make doctestsgit diff --check origin/main...HEAD