Commit 4f4ca32

Merge branch 'master' of github.com:aetperf/aetperf.github.io
2 parents c984396 + 8b23c1b commit 4f4ca32

2 files changed

Lines changed: 3 additions & 3 deletions

_posts/2024-01-08-Streaming-data-from-PostgreSQL-to-a-CSV-file.md

Lines changed: 1 addition & 1 deletion
@@ -390,7 +390,7 @@ assert check_df.l_orderkey.is_monotonic_increasing
 ## FastBCP

-While our focus here is on Python tools, we added FastBCP as a reference regarding CPU and memory usage. FastBCP has been developed in-house by Romain Ferraton at [Architecture & Performance](https://www.architecture-performance.fr/). It is a command line tool, written in C#, that is compatible with any operating system where dotnet is installed. We used dotnet on Linux in the present case.
+While our focus here is on Python tools, we added FastBCP as a reference regarding CPU and memory usage. [FastBCP](https://www.arpe.io/fastbcp) has been developed at [Architecture & Performance](https://www.architecture-performance.fr/ap-logiciels/). It is a command line tool, written in C#, that is compatible with any operating system where dotnet is installed. We used dotnet on Linux in the present case.

 FastBCP employs parallel threads, reading data through multiple connections by partitioning SQL on the 'l_orderkey' column, using the "random" method. This approach results in distinct CSV files, later merged into a final output. It's worth mentioning that due to its parallel settings, the resulting data in the CSV file may not be sorted. This is why the ORDER BY clause is removed from the query in this particular case. Also, the returned elapsed time take the merging phase into account.

_posts/WP_2024-01-08-Streaming-data-from-PostgreSQL-to-a-CSV-file.md

Lines changed: 2 additions & 2 deletions
@@ -370,7 +370,7 @@ assert check_df.l_orderkey.is_monotonic_increasing
 ## FastBCP

-While our focus here is on Python tools, we added FastBCP as a reference regarding CPU and memory usage. FastBCP has been developed in-house by Romain Ferraton at [Architecture & Performance](https://www.architecture-performance.fr/). It is a command line tool, written in C#, that is compatible with any operating system where dotnet is installed. We used dotnet on Linux in the present case.
+While our focus here is on Python tools, we added FastBCP as a reference regarding CPU and memory usage. [FastBCP](https://www.arpe.io/fastbcp) has been developed at [Architecture & Performance](https://www.architecture-performance.fr/ap-logiciels/). It is a command line tool, written in C#, that is compatible with any operating system where dotnet is installed. We used dotnet on Linux in the present case.

 FastBCP employs parallel threads, reading data through multiple connections by partitioning SQL on the 'l_orderkey' column, using the "random" method. This approach results in distinct CSV files, later merged into a final output. It's worth mentioning that due to its parallel settings, the resulting data in the CSV file may not be sorted. This is why the ORDER BY clause is removed from the query in this particular case. Also, the returned elapsed time take the merging phase into account.

@@ -426,4 +426,4 @@ While we briefly included FastBCP for a reference comparison, we did not delve i
 Additionally, it's worth mentioning that we did not manage to employ [Polars](https://pola.rs/) for the streaming extraction, leading to an out-of-memory error.

-Also, it appears that [ConnectorX](https://sfu-db.github.io/connector-x/intro.html) currently lacks support for [retrieving results as Arrow batches or any type of chunks](https://github.com/sfu-db/connector-x/issues/264), making it unfit for this task.
+Also, it appears that [ConnectorX](https://sfu-db.github.io/connector-x/intro.html) currently lacks support for [retrieving results as Arrow batches or any type of chunks](https://github.com/sfu-db/connector-x/issues/264), making it unfit for this task.
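
For readers of this diff, the partition-and-merge idea in the FastBCP paragraph can be illustrated with a small Python sketch. It is not FastBCP's implementation. It assumes the "random" method behaves like a modulo split on `l_orderkey`, and it uses a hypothetical in-memory `ROWS` list in place of PostgreSQL connections. The function names (`export_partition`, `parallel_export`, `read_keys`) are made up for illustration.

```python
import csv
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the lineitem table: (l_orderkey, l_comment).
ROWS = [(k, f"comment-{k}") for k in range(1, 21)]


def export_partition(part_id, n_parts, out_dir):
    """Write the rows of one partition to its own CSV file and return its path.

    Assumption: with a real database, each worker would open its own
    connection and add a predicate such as `l_orderkey % n_parts = part_id`.
    """
    path = os.path.join(out_dir, f"part_{part_id}.csv")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in ROWS:
            if row[0] % n_parts == part_id:
                writer.writerow(row)
    return path


def parallel_export(n_parts, out_path):
    """Export all partitions in parallel threads, then merge the part files."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        with ThreadPoolExecutor(max_workers=n_parts) as pool:
            part_paths = list(
                pool.map(lambda i: export_partition(i, n_parts, tmp_dir), range(n_parts))
            )
        # Merge phase: concatenate the part files in partition order.
        with open(out_path, "w", newline="") as out:
            for path in part_paths:
                with open(path, newline="") as part:
                    out.write(part.read())


def read_keys(path):
    """Return the l_orderkey column of a CSV file as a list of integers."""
    with open(path, newline="") as f:
        return [int(row[0]) for row in csv.reader(f)]
```

Calling `parallel_export(4, "merged.csv")` and then `read_keys("merged.csv")` yields every key exactly once, but grouped by partition rather than in ascending order. This is consistent with the post's remark that the merged CSV may not be sorted and that an ORDER BY clause would serve no purpose in this setup.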
