
docs: Incorporate writing table provider blog post to user documentation #21398

Merged
alamb merged 4 commits into apache:main from buraksenn:incorporate-blog-post-to-docs on Apr 9, 2026


@buraksenn
Contributor

Which issue does this PR close?

Rationale for this change

The existing custom table providers doc was minimal, while the blog post by @timsaucer covers the same topic in much more depth. @alamb therefore opened an issue to incorporate the blog post into the docs.

What changes are included in this PR?

I've replaced the existing content with the blog post, with some changes:

  • fixed several broken URLs (e.g. the ParquetSource URL was broken)
  • made some examples compilable
  • fixed the ExecutionPlan API
  • kept the "table constraints" and "using your table provider" sections
  • removed the "acknowledgements" and "get involved" sections

Are these changes tested?

Ran cargo test -p datafusion --doc custom_table_providers — 3 passed, 7 ignored.

Then built the docs following the docs/README.md instructions. Screenshots of the rendered docs:
image
image

Are there any user-facing changes?

documentation only

@github-actions bot added the documentation label on Apr 6, 2026
Member

@timsaucer left a comment

I feel a little uncomfortable with this because it seems like most of the content I wrote has been copy+pasted into this document.

I would feel more comfortable if, instead of saying "This content is based on the blog post [Writing Custom Table Providers in Apache DataFusion]", we said "Most of the content of this page was originally posted in the DataFusion blog [Writing Custom Table Providers in Apache DataFusion]" or something similar.

@buraksenn
Contributor Author

buraksenn commented Apr 6, 2026

> I feel a little uncomfortable with this because it seems like most of the content I wrote has been copy+pasted into this document.
>
> I would feel more comfortable if, instead of saying "This content is based on the blog post [Writing Custom Table Providers in Apache DataFusion]", we said "Most of the content of this page was originally posted in the DataFusion blog [Writing Custom Table Providers in Apache DataFusion]" or something similar.

Actually, my understanding from #21304 was that the content should be copy-pasted. I can change the part above to what you've described, or close this PR so that you can incorporate it yourself. Please tell me which you would prefer.

Edited: I'm also really sorry for making you uncomfortable. Since the issue came from alamb and there is a comment in your PR, I thought this was the way to do it. I probably misunderstood the "incorporating" part.

Member

@timsaucer left a comment

I'm okay with approving after this change is made.

Co-authored-by: Tim Saucer <timsaucer@gmail.com>
@buraksenn
Contributor Author

> I'm okay with approving after this change is made.

Thanks @timsaucer, I was waiting for your reply to know whether I should close this PR or not. I've applied your change and will continue. Sorry again for causing this.

@timsaucer
Member

Thanks for the work on this @buraksenn

Contributor

@alamb left a comment

I checked this page out locally and it looks great

Image

Thank you @buraksenn and @timsaucer

data: HashMap<u8, User>,
bank_account_index: BTreeMap<u64, u8>,
}
The majority of this content was originally posted in the blog
Contributor

👍

A [TableProvider] represents a queryable data source. For a minimal read-only
table, you need four methods:

```rust,ignore
Contributor

It would be great (as a follow-on PR) to update this example so that it is actually compiled/tested (and thus does not drift out of sync with the code).

In other words, change this to `rust` and then ensure `cargo test -p datafusion --doc` passes.

It currently fails like this:

--- a/docs/source/library-user-guide/custom-table-providers.md
+++ b/docs/source/library-user-guide/custom-table-providers.md
@@ -157,7 +157,7 @@ stream path, which gives you complete control and applies to any data source.
 A [TableProvider] represents a queryable data source. For a minimal read-only
 table, you need four methods:

-```rust,ignore
+```rust
 impl TableProvider for MyTable {
     fn as_any(&self) -> &dyn Any { self }

diff --git a/testing b/testing
index 7df2b70baf..0d60ccae40 160000
--- a/testing
+++ b/testing
@@ -1 +1 @@
(venv) andrewlamb@Andrews-MacBook-Pro-3:~/Software/datafusion3$ cargo test -p datafusion --doc -- library_user_guide_custom_table_providers
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.41s
   Doc-tests datafusion

running 10 tests
test datafusion/core/src/lib.rs - library_user_guide_custom_table_providers (line 1365) ... ignored
test datafusion/core/src/lib.rs - library_user_guide_custom_table_providers (line 1441) ... ignored
test datafusion/core/src/lib.rs - library_user_guide_custom_table_providers (line 1498) ... ignored
test datafusion/core/src/lib.rs - library_user_guide_custom_table_providers (line 1512) ... ignored
test datafusion/core/src/lib.rs - library_user_guide_custom_table_providers (line 1551) ... ignored
test datafusion/core/src/lib.rs - library_user_guide_custom_table_providers (line 2058) ... ignored
test datafusion/core/src/lib.rs - library_user_guide_custom_table_providers (line 1285) ... FAILED
test datafusion/core/src/lib.rs - library_user_guide_custom_table_providers (line 1913) ... ok
test datafusion/core/src/lib.rs - library_user_guide_custom_table_providers (line 1631) ... ok
test datafusion/core/src/lib.rs - library_user_guide_custom_table_providers (line 1772) ... ok

failures:

---- datafusion/core/src/lib.rs - library_user_guide_custom_table_providers (line 1285) stdout ----
error[E0405]: cannot find trait `TableProvider` in this scope
    --> datafusion/core/src/lib.rs:1286:6
     |
1286 | impl TableProvider for MyTable {
     |      ^^^^^^^^^^^^^ not found in this scope
     |
help: consider importing one of these traits
     |
1285 + use datafusion::catalog::TableProvider;
     |
1285 + use datafusion_catalog::TableProvider;
...

@alamb added this pull request to the merge queue on Apr 9, 2026
@alamb
Contributor

alamb commented Apr 9, 2026

Let's get it shipped! We can keep iterating on the docs in follow-on PRs!

Merged via the queue into apache:main with commit 44af0a1 on Apr 9, 2026
9 checks passed
Development

Successfully merging this pull request may close these issues.

Incorporate "Writing Custom Table Providers in Apache DataFusion" into the main datafusion docs

3 participants