[core] FormatTable supports Blob Format by steFaiz · Pull Request #8191 · apache/paimon

steFaiz · 2026-06-10T07:14:21Z

Purpose

Adds Blob format support to FormatTable.
The goal is to replace an object store with Paimon on DFS, unifying storage engines. Consider this scenario:

Users parse large videos, splitting each one into hundreds of images.
This is typically done in a UDF: the input is a video and the output is a JSON map of <ImageIdentifier, ImageURL>. The results are then exported to structured storage, e.g. ODPS.
Image splitting and upload both happen inside the UDF. Previously those images were uploaded to OSS. Now we can store them in a Paimon FormatTable and obtain each BlobDescriptor easily through BlobConsumers.

The key advantages are:

Partition-level management: drop/overwrite partitions to manage blob lifecycle natively
Drastically fewer files: N blobs packed into one file instead of N separate objects.
BlobDescriptor output: each written blob returns a descriptor (path + offset + length) that downstream structured tables (e.g., ODPS) can consume via UDF for random access.
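
For illustration, the packing and descriptor-based random access described above could look roughly like the sketch below. It is not Paimon's BlobDescriptor or blob file layout: `BlobPackSketch`, `Descriptor`, `Packer`, and the file name are hypothetical, and an in-memory byte array stands in for the packed file on DFS.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration only; not Paimon's blob file format. */
final class BlobPackSketch {

    /** Hypothetical descriptor: where one blob lives inside a packed file. */
    static final class Descriptor {
        final String path;
        final long offset;
        final long length;

        Descriptor(String path, long offset, long length) {
            this.path = path;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Appends blobs back to back and returns a descriptor for each one. */
    static final class Packer {
        private final String path;
        private final ByteArrayOutputStream out = new ByteArrayOutputStream();

        Packer(String path) {
            this.path = path;
        }

        Descriptor append(byte[] blob) {
            // The current size of the output is the offset of the next blob.
            Descriptor descriptor = new Descriptor(path, out.size(), blob.length);
            out.write(blob, 0, blob.length);
            return descriptor;
        }

        byte[] toBytes() {
            return out.toByteArray();
        }
    }

    /** Random access: slice one blob out of the packed bytes without scanning the others. */
    static byte[] read(byte[] packedFile, Descriptor descriptor) {
        int from = Math.toIntExact(descriptor.offset);
        int to = Math.toIntExact(descriptor.offset + descriptor.length);
        return Arrays.copyOfRange(packedFile, from, to);
    }

    public static void main(String[] args) {
        Packer packer = new Packer("ds=20260610/frames-0.blob"); // hypothetical file name
        packer.append("frame-1".getBytes(StandardCharsets.UTF_8));
        Descriptor second = packer.append("frame-02".getBytes(StandardCharsets.UTF_8));

        byte[] blob = read(packer.toBytes(), second);
        System.out.println(second.path + " @ " + second.offset + " + " + second.length);
        System.out.println(new String(blob, StandardCharsets.UTF_8));
    }
}
```

A downstream UDF only needs the three descriptor fields to fetch one image, which is why N blobs can share one file without forcing a full scan.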

Restriction

Currently, a Blob format table may have only one non-partition column.

Tests

See org.apache.paimon.table.format.FormatTableBlobTest

JingsongLi · 2026-06-10T09:03:28Z

    enum Format {
        ORC,
        PARQUET,
+        BLOB,


Adding BLOB here also exposes format-table projection paths. For a table like (payload BLOB, ds INT) PARTITIONED BY (ds), projecting only ds makes FormatReadBuilder remove partition columns before creating the file reader, so the projectedRowType passed to BlobFileFormat is empty. BlobFileFormat currently requires a BLOB field and throws, whereas other format tables can satisfy partition-only projections. Please handle this case, for example by reading only the blob file metadata to get the row count and then appending partition columns, or by adding an explicit supported projection path with a test.
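
A minimal sketch of the first suggestion (take the row count from file metadata, then emit partition-only rows) is shown below. It assumes the row count is available without decoding any blob. `PartitionOnlyProjectionSketch` and `partitionOnlyRows` are hypothetical names and do not reflect the actual FormatReadBuilder or BlobFileFormat code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical illustration only; not Paimon's reader code. */
final class PartitionOnlyProjectionSketch {

    /**
     * Produces one row per blob in the file, containing only the partition values.
     * The blob payloads are never read; only the row count is needed.
     */
    static Iterator<Object[]> partitionOnlyRows(final long rowCount, final Object[] partitionValues) {
        return new Iterator<Object[]>() {
            private long emitted = 0;

            @Override
            public boolean hasNext() {
                return emitted < rowCount;
            }

            @Override
            public Object[] next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                emitted++;
                // Copy so callers cannot mutate the shared partition values.
                return partitionValues.clone();
            }
        };
    }

    public static void main(String[] args) {
        // Assumed inputs: a blob file holding 3 rows in partition ds=20260610.
        Iterator<Object[]> rows = partitionOnlyRows(3, new Object[] {20260610});
        while (rows.hasNext()) {
            System.out.println(Arrays.toString(rows.next()));
        }
    }
}
```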

JingsongLi · 2026-06-10T09:03:28Z

            }
+            if (writer instanceof FileAwareFormatWriter) {
+                FileAwareFormatWriter fileAwareFormatWriter = (FileAwareFormatWriter) writer;
+                fileAwareFormatWriter.setFile(path);


setFile(path) is not enough for the withBlobConsumer path. BlobFormatWriter invokes the consumer while writing and the emitted descriptor points at this target path, but this writer is backed by a TwoPhaseOutputStream, so the target file is not visible until FormatTableCommit commits it; if a later write/commit fails, abort()/FormatTableCommit.abort() discards it anyway. This violates the TableWrite.withBlobConsumer contract that these files are left for the caller to clean up, and leaves already-emitted descriptors dangling. Please either make the consumer path use visible/non-deleted files like SingleFileWriter does with deleteFileUponAbort(), or defer/avoid emitting descriptors until the file has actually been committed, and add a failure-path test.
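
To make the second option concrete, here is a minimal sketch of deferring descriptor emission until the commit outcome is known. It is an illustration under stated assumptions, not Paimon code: `DeferredDescriptorEmitter` and its method names are hypothetical, and the real hook points in BlobFormatWriter and FormatTableCommit would still have to be wired up by the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper; not part of Paimon. Buffers descriptors until commit or abort. */
final class DeferredDescriptorEmitter<D> {
    private final Consumer<D> downstream;
    private final List<D> pending = new ArrayList<>();
    private boolean finished;

    DeferredDescriptorEmitter(Consumer<D> downstream) {
        this.downstream = downstream;
    }

    /** Called by the writer for each blob; nothing reaches the caller yet. */
    void onBlobWritten(D descriptor) {
        ensureOpen();
        pending.add(descriptor);
    }

    /** Called only after the commit succeeded, so every emitted path now exists. */
    void onCommitSucceeded() {
        ensureOpen();
        finished = true;
        pending.forEach(downstream);
        pending.clear();
    }

    /** Called on failure: the uncommitted file is discarded, so its descriptors are dropped too. */
    void onAbort() {
        finished = true;
        pending.clear();
    }

    private void ensureOpen() {
        if (finished) {
            throw new IllegalStateException("already committed or aborted");
        }
    }

    public static void main(String[] args) {
        List<String> received = new ArrayList<>();
        DeferredDescriptorEmitter<String> emitter = new DeferredDescriptorEmitter<>(received::add);
        emitter.onBlobWritten("frames-0.blob@0+7"); // hypothetical descriptor rendering
        System.out.println("before commit: " + received);
        emitter.onCommitSucceeded();
        System.out.println("after commit: " + received);
    }
}
```

The trade-off is that callers receive descriptors later and in a batch, but none of them can point at a file that was never committed.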

steFaiz · 2026-06-11

@JingsongLi Thanks for your review! This scenario is a little tricky.
Currently FormatTable on DFS uses RENAME to implement two-phase commit, so the path that was set is not real yet; it only exists after commit. In that case, if the commit fails and is aborted, retaining the written files is meaningless, because they sit in a temp dir and do not match the path stored in the BlobDescriptors.
(In Python, however, no two-phase commit is implemented, so I still retain written files on abort.)

Here are my thoughts:

Maybe we could explicitly warn users that, in FormatTable, returned BlobDescriptors are only valid after commit? Or maybe introduce a PendingBlobDescriptor for format tables, identical to BlobDescriptor except that BlobRef could warn users that the descriptor is still pending, rather than throwing a path-not-exists error.

I think this "visible after commit" behavior is acceptable for batch scenarios. For example, in Spark/Ray the FormatTable commit is part of the job, so exported descriptors become visible only after the job has finished successfully.

Or maybe we should not use two-phase commit for Blob format tables at all, and just filter out broken files on read?

Thanks again for your review! I'll close this PR and find another way if you think this scenario is not suitable for Paimon FormatTable.

JingsongLi · 2026-06-11T05:36:36Z

@steFaiz Why not just use a Paimon table to store objects?

steFaiz · 2026-06-11T06:17:36Z

Why not just use a Paimon table to store objects?

@JingsongLi Thanks for your question! Let me explain. My scenario is:

A Spark/Flink UDF takes images as input and immediately outputs a JSON Map<String, BlobDescriptor>, i.e. each image (blob) is written out and the UDF directly produces the descriptor (path + offset + length) for downstream systems (ODPS). Previously this was done by uploading each image to an individual OSS file; I'm trying to replace OSS with Paimon directly on DFS.

Why is an append table not suitable?

With a regular Paimon table, each UDF needs to commit on close(), so each UDF instance commits once. For Spark jobs, that can mean hundreds of concurrent commits! A format table's commit is much more lightweight.

I'm exploring using a Paimon Format Table to replace OSS, acting purely as an archive for blobs. Users always refer to blobs by descriptor only (not by full scan) and can still benefit from Paimon's blob packing, partition management and table management.

steFaiz added 2 commits June 10, 2026 15:01

[core] FormatTable supports Blob Format

2040729

add pypaimon

0a5ebb0

JingsongLi reviewed Jun 10, 2026


steFaiz wants to merge 2 commits into apache:master from steFaiz:format_table_blob
