
feat: add Parquet output and improve Windows compatibility & resilience #252

Open
4x wants to merge 4 commits into Leo4815162342:master from 4x:feat/parquet-output-new-187328269887713976

4x commented May 3, 2026

This PR includes Parquet output support and adds several quality-of-life improvements for Windows users and large-scale data downloads.

Changes:
Parquet Support:

  • Finalized Parquet implementation for CLI and BatchStreamWriter.
  • Added parquet to CLI help.

Windows Compatibility:

  • Replaced Unix-specific rm -rf build step with cross-platform tsup --clean.
  • Resolved TypeScript 6.0 deprecation warnings in tsconfig.json.
  • Normalized line endings and increased timeouts in tests to prevent Windows-specific failures.

Resilience & Error Handling:

  • Refactored BufferFetcher to always use a retry loop to catch network exceptions.
  • Enhanced error messages to include the failing URL and status code (critical for diagnosing rate limits like 503s).

Style:

  • Standardized formatting and linting across the project.
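For reviewers, the retry behavior described under Resilience & Error Handling could look roughly like the sketch below. This is not the PR's actual BufferFetcher code: `fetchWithRetry`, `FetchLike`, `RetryOptions`, and the linear back-off are hypothetical names and choices made for illustration. It only shows the general idea of one loop that handles both thrown network exceptions and non-OK responses, and that reports the failing URL and status code.

```typescript
// Hypothetical sketch; none of these names come from the PR itself.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}>;

interface RetryOptions {
  retries: number; // extra attempts after the first one
  pauseMs: number; // base delay between attempts (assumed linear back-off)
  fetchFn: FetchLike; // injected so the loop can be exercised without a network
}

async function fetchWithRetry(url: string, opts: RetryOptions): Promise<ArrayBuffer> {
  let lastError = new Error(`No attempt made for ${url}`);

  for (let attempt = 0; attempt <= opts.retries; attempt++) {
    try {
      const res = await opts.fetchFn(url);
      if (res.ok) {
        return await res.arrayBuffer();
      }
      // Non-OK response (e.g. 503 rate limit): record URL and status, then retry.
      lastError = new Error(`Request failed: ${url} (status ${res.status})`);
    } catch (err) {
      // Thrown network exception (DNS failure, reset connection, ...): retry too.
      const reason = err instanceof Error ? err.message : String(err);
      lastError = new Error(`Request failed: ${url} (network error: ${reason})`);
    }

    if (attempt < opts.retries) {
      const delay = opts.pauseMs * (attempt + 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }

  // All attempts exhausted: surface the most recent failure with its URL.
  throw lastError;
}
```

On Node 18 or later, passing the global `fetch` as `fetchFn` should be enough to try it out. Injecting the fetch function is a design choice for this sketch only, so that the loop can be tested with a fake that fails a set number of times.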

Verification:

  • Verified pnpm build and pnpm test pass on Windows (125 tests).
  • Successfully downloaded 1 year of tick data (19M+ rows) in Parquet format.

google-labs-jules Bot and others added 4 commits April 6, 2026 03:28
Replaces previous parquet libraries with `parquetjs-lite` to ensure a lightweight and compatible dependency tree without native binding compilation issues or git dependencies that cause problems in environments like Colab. Implemented Parquet output format in `src/stream-writer/index.ts` and updated CLI configuration. Fixed file extension logic in CLI to use `.parquet`. Verified build, test execution, type checking, and linting.

Co-authored-by: 4x <2730109+4x@users.noreply.github.com>
AnMakc commented May 16, 2026

By the way, this would mostly close #222.

@4x parquetjs-lite is no longer maintained, and it seems to write uncompressed output by default. Could we switch to an actively maintained Parquet library?
