
feat: make polars-lazy optional behind 'lazy' feature #12

Open
0xDaizz wants to merge 7 commits into main from feat/optional-lazy-dep



@0xDaizz 0xDaizz commented Apr 4, 2026

Summary

  • Make the `polars-lazy` dependency optional in `cudf-polars`, gated behind the `lazy` feature
  • The `collect_gpu` function is only available when the `lazy` feature is enabled
  • This breaks the circular dependency that arises when `polars-lazy` depends on `cudf-polars` for GPU engine dispatch
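
As a rough illustration of the gating described in the summary, the sketch below shows how an entry point can be compiled only when a `lazy` feature is on. Only the `collect_gpu` name and the `lazy` feature name come from this PR; the function body, the `entry_points` helper, and the `execute_plan` name are hypothetical. It also assumes the manifest pairs an optional `polars-lazy` dependency with a feature that enables it, which the PR text implies but does not show.

```rust
/// Only compiled when the crate is built with `--features lazy`.
/// Placeholder body: the real function presumably takes a LazyFrame.
#[cfg(feature = "lazy")]
pub fn collect_gpu() -> &'static str {
    "collect_gpu: lazy feature enabled"
}

/// Lists the entry points available in this build (hypothetical helper).
pub fn entry_points() -> Vec<&'static str> {
    // `execute_plan` is a made-up name for an API that never needs polars-lazy.
    let mut names = vec!["execute_plan"];
    if cfg!(feature = "lazy") {
        names.push("collect_gpu");
    }
    names
}
```

With `--no-default-features` the `collect_gpu` item does not exist at all, so nothing in the crate can reference `polars-lazy` types by accident.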

Test plan

  • cargo check -p cudf-polars --no-default-features should compile without polars-lazy
  • cargo check -p cudf-polars --features lazy should compile with polars-lazy

🤖 Generated with Claude Code

0xDaizz and others added 7 commits April 5, 2026 01:24
…ness fixes

- engine: reimplement collect_gpu on top of the lf._collect_post_opt() callback,
  wiring up the GPU execution pipeline without changing polars internals (IR → DataFrameScan substitution)
- expr: fix First/Last agg to use slice+repeat so that a single value is
  broadcast correctly; add empty column/group guards (shared via null_column_for_type)
- column: add try_data_type(), a fallible version of data_type() that
  returns a Result instead of panicking on an FFI type_id mismatch
- convert: add debug_assert_eq size checks at the 4 Arrow FFI ptr::read sites and
  expand the SAFETY comments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
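
The `try_data_type()` item in the commit above follows a common Rust pattern: a fallible lookup that returns `Result`, with the panicking accessor kept as a thin wrapper. A minimal sketch of that pattern is below; the `DataType` variants and the numeric ids are invented for illustration and do not reflect the actual cudf type ids or the crate's real signatures.

```rust
/// Hypothetical subset of column types that might be reported over FFI.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DataType {
    Int32,
    Int64,
    Float64,
}

/// Fallible lookup: an unknown id becomes an error the caller can handle.
/// The ids 1..=3 are arbitrary placeholders.
pub fn try_data_type(raw_id: u32) -> Result<DataType, String> {
    match raw_id {
        1 => Ok(DataType::Int32),
        2 => Ok(DataType::Int64),
        3 => Ok(DataType::Float64),
        other => Err(format!("unknown FFI type id: {other}")),
    }
}

/// Panicking wrapper for callers that treat a mismatch as a bug.
pub fn data_type(raw_id: u32) -> DataType {
    try_data_type(raw_id).expect("FFI type id mismatch")
}
```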
- Not expression: Bool→logical not, Int→bitwise invert (Polars semantics)
- IsIn expression: values.contains + nulls_equal propagation
  (null in col → null output when nulls_equal=false)
- GroupBy Quantile: extract q via DynLiteralValue/Scalar for Polars 0.53
  LiteralValue API (Dyn/Scalar/Series/Range variants)
- Enable `is_in` feature in polars-plan dependency
- Address 3 items from the Codex rescue review: LiteralValue variant mismatch,
  Not integer dispatch, IsIn null semantics

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
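
The `Not` and `IsIn` rules listed in the commit above can be modelled on plain scalars. The sketch below is only that: the `Value` enum and both function signatures are hypothetical, and it covers just the behaviour the commit message states (logical not for booleans, bitwise invert for integers, and a null input yielding null when `nulls_equal` is false).

```rust
/// Hypothetical scalar type standing in for a column element.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Value {
    Bool(bool),
    Int(i64),
}

/// Bool -> logical not, Int -> bitwise invert.
pub fn not(v: Value) -> Value {
    match v {
        Value::Bool(b) => Value::Bool(!b),
        Value::Int(i) => Value::Int(!i),
    }
}

/// Membership test for one nullable value against a nullable set.
pub fn is_in(value: Option<i64>, set: &[Option<i64>], nulls_equal: bool) -> Option<bool> {
    match value {
        // A non-null value is a plain containment check.
        Some(_) => Some(set.contains(&value)),
        // With nulls_equal, a null input matches a null in the set.
        None if nulls_equal => Some(set.contains(&None)),
        // Otherwise a null input propagates to a null output.
        None => None,
    }
}
```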
engine.rs: Replace Vec::position() linear scan with HashMap for column
name lookup in HStack, eliminating O(n²) behavior on wide DataFrames.

expr.rs: Replace host-allocated vec![v; height] + from_slice with
cudf::filling::sequence GPU-native constant fill for Int32/Int64/
Float32/Float64 scalar broadcasts, avoiding host→device round-trip.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
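
The `engine.rs` change above swaps a per-column linear scan for a name index built once. A minimal sketch of that idea follows, assuming columns can be modelled as `(name, values)` pairs; the `hstack` function here is a stand-in written for illustration, not the crate's actual HStack handler.

```rust
use std::collections::HashMap;

/// Merge `new_cols` into `columns`: replace a column when the name already
/// exists, otherwise append it.
pub fn hstack(
    mut columns: Vec<(String, Vec<i64>)>,
    new_cols: Vec<(String, Vec<i64>)>,
) -> Vec<(String, Vec<i64>)> {
    // Build the name -> position index once, instead of scanning per lookup.
    let mut index: HashMap<String, usize> = columns
        .iter()
        .enumerate()
        .map(|(i, (name, _))| (name.clone(), i))
        .collect();

    for (name, values) in new_cols {
        // Copy the position out so the map is free to be mutated below.
        let existing = index.get(&name).copied();
        match existing {
            Some(pos) => columns[pos].1 = values,
            None => {
                index.insert(name.clone(), columns.len());
                columns.push((name, values));
            }
        }
    }
    columns
}
```

Each lookup is expected O(1), so stacking m new columns onto n existing ones costs roughly O(n + m) rather than O(n · m).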
Add Date/Datetime/Duration type mapping in types.rs with explicit
rejection of timezone-aware Datetime. Support temporal null columns
(i32 for DurationDays/TimestampDays, i64 for all other temporal types)
and scalar broadcast pass-through. Add temporal pass-through in
arithmetic_output_type to let cudf handle native type promotion.

Incorporates fixes from Codex rescue (5 blocking issues resolved).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
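
The temporal rules in the commit above (day-resolution types stored as i32, all other temporal types as i64, timezone-aware Datetime rejected) can be sketched as below. The enum names loosely mirror the type names in the commit message but are hypothetical, and mapping a naive Datetime to microseconds is a simplifying assumption; the real mapping presumably depends on the Polars time unit.

```rust
/// Hypothetical subset of GPU-side temporal types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TemporalType {
    TimestampDays,
    TimestampMicroseconds,
    DurationDays,
    DurationMicroseconds,
}

/// Physical integer width backing a temporal column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Physical {
    I32,
    I64,
}

/// Day-resolution types use i32; every other temporal type uses i64.
pub fn physical_repr(t: TemporalType) -> Physical {
    match t {
        TemporalType::TimestampDays | TemporalType::DurationDays => Physical::I32,
        _ => Physical::I64,
    }
}

/// Map a Datetime, explicitly rejecting any timezone.
pub fn map_datetime(time_zone: Option<&str>) -> Result<TemporalType, String> {
    match time_zone {
        None => Ok(TemporalType::TimestampMicroseconds),
        Some(tz) => Err(format!("timezone-aware Datetime is not supported: {tz}")),
    }
}
```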
Implement per-row broadcast for window aggregations via groupby→left_join→
sort_by_key pipeline. Left join output order is not guaranteed by libcudf,
so sort by left_indices to restore original row alignment.

- Add AExpr::Over handler with GroupsToRows dispatch
- Explicitly reject order_by (not yet supported)
- Promote extract_agg_info/map_ir_agg to pub(crate) for reuse

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
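
The groupby→left_join→sort pipeline described in the commit above can be traced on the host with plain vectors. The sketch below is an illustration only: `sum_over` is a hypothetical name, the aggregation is fixed to a sum, and the join output is deliberately emitted in reverse to stand in for a join whose row order is not guaranteed.

```rust
use std::collections::HashMap;

/// Per-row broadcast of a grouped sum, i.e. the shape of `sum(values).over(keys)`.
pub fn sum_over(keys: &[&str], values: &[i64]) -> Vec<i64> {
    assert_eq!(keys.len(), values.len());

    // 1. Group-by: one aggregated value per distinct key.
    let mut sums: HashMap<&str, i64> = HashMap::new();
    for (k, v) in keys.iter().zip(values) {
        *sums.entry(*k).or_insert(0) += *v;
    }

    // 2. Left join: pair each left row index with its group's aggregate.
    //    Reversed on purpose to mimic an unordered join result.
    let mut joined: Vec<(usize, i64)> = keys
        .iter()
        .enumerate()
        .rev()
        .map(|(left_idx, k)| (left_idx, sums[k]))
        .collect();

    // 3. Sort by the left row index to restore the original alignment.
    joined.sort_by_key(|&(left_idx, _)| left_idx);
    joined.into_iter().map(|(_, agg)| agg).collect()
}
```

Calling `sum_over(&["a", "b", "a"], &[1, 10, 2])` yields `[3, 10, 3]`: every row receives its group's total, in the input row order.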
This breaks the circular dependency when polars-lazy depends on
cudf-polars for GPU engine dispatch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…oral types, performance optimizations

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>