title: TabPFN-Plus Preview Program
description: Preview TabPFN-Plus and TabPFN-Plus with enhanced fit mode, then run focused evaluations on your real workloads.
This page is part of the **TabPFN-Plus API preview program**.

What you get

We're running the preview with a small group of API users before public release. You get early access to two model options:

| Model | Best for | Key differences |
| :---- | :------- | :-------------- |
| `TabPFN-Plus` | Strong default quality with standard fit behavior | Better prediction quality than the regular model across classification and regression, with native text support and native multi-class handling |
| `TabPFN-Plus` + `enhanced_fit_mode=True` | Highest prediction quality when extra fit time is acceptable | During `.fit()`, the model is optimized further for your use case; participants typically see **10% to 20% better quality** than the regular model |

What to test

Run both options on your real data and share direct comparisons.

  • Re-run datasets you've already sent through the API and compare quality and latency against your current setup.
  • If you previously subsampled your data because of size limitations, run larger samples through the model and compare the results.
  • Try tasks that were hard to model before, especially mixed tabular + text data.
  • Test multi-class classification, including problems with many classes, without installing our multiclass extension.
  • Test regression workloads and report where quality changed, even when changes are small.
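
One way to structure these comparisons is a small harness that times `.fit()` and `.predict()` and scores every candidate on the same held-out split. The sketch below is illustrative only: `compare_models` and `MajorityBaseline` are hypothetical names that are not part of tabpfn-client, the stand-in baseline exists so the snippet runs without API access, and accuracy is just one possible quality measure.

```python
import time
from collections import Counter


class MajorityBaseline:
    """Hypothetical stand-in estimator: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def compare_models(models, X_train, y_train, X_test, y_test):
    """Fit and score each estimator on the same split, recording wall-clock times."""
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(X_train, y_train)
        fit_s = time.perf_counter() - start

        start = time.perf_counter()
        preds = model.predict(X_test)
        predict_s = time.perf_counter() - start

        results[name] = {
            "accuracy": accuracy(y_test, preds),
            "fit_s": fit_s,
            "predict_s": predict_s,
        }
    return results


# Toy data so the sketch runs offline; swap in your own split and estimators.
X_train, y_train = [[0], [1], [2], [3]], [0, 0, 0, 1]
X_test, y_test = [[4], [5]], [0, 1]
report = compare_models(
    {"baseline": MajorityBaseline()}, X_train, y_train, X_test, y_test
)
```

Any object with `.fit()` and `.predict()` can be passed in the `models` dictionary, so your current model and the preview estimators can be scored side by side on identical data.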

Install

Install the preview build of tabpfn-client directly from GitHub. The command below pins the `main` branch:

uv add "tabpfn-client @ git+https://github.com/PriorLabs/tabpfn-client.git@main"

Using TabPFN-Plus (v3)

TabPFN-Plus is selected via create_default_for_version(ModelVersion.V3). This returns a regular scikit-learn-compatible estimator — .fit() / .predict() / .predict_proba() behave as usual.

Classification

from tabpfn_client import TabPFNClassifier
from tabpfn_client.constants import ModelVersion

clf = TabPFNClassifier.create_default_for_version(ModelVersion.V3)
clf.fit(X_train, y_train)

preds = clf.predict(X_test)
probs = clf.predict_proba(X_test)

Regression

from tabpfn_client import TabPFNRegressor
from tabpfn_client.constants import ModelVersion

reg = TabPFNRegressor.create_default_for_version(ModelVersion.V3)
reg.fit(X_train, y_train)

preds = reg.predict(X_test)

Pass any TabPFNClassifier / TabPFNRegressor kwarg (e.g. n_estimators, random_state) directly to create_default_for_version to override the defaults:

clf = TabPFNClassifier.create_default_for_version(
    ModelVersion.V3,
    n_estimators=16,
    random_state=0,
)

Enhanced fit mode

enhanced_fit_mode=True makes .fit() spend extra time optimizing for your data, up to 40 minutes depending on dataset size. That extra time is used to select the best TabPFN configuration. Prediction is unchanged from the caller's perspective: call .predict() / .predict_proba() as usual.

Usage

Classification

from tabpfn_client import TabPFNClassifier
from tabpfn_client.constants import ModelVersion

clf = TabPFNClassifier.create_default_for_version(
    ModelVersion.V3,
    enhanced_fit_mode=True,
)
clf.fit(X_train, y_train)

preds = clf.predict(X_test)
probs = clf.predict_proba(X_test)

Regression

from tabpfn_client import TabPFNRegressor
from tabpfn_client.constants import ModelVersion

reg = TabPFNRegressor.create_default_for_version(
    ModelVersion.V3,
    enhanced_fit_mode=True,
)
reg.fit(X_train, y_train)

preds = reg.predict(X_test)

`.fit()` can take up to 40 minutes depending on dataset size. Plan for this when wiring it into pipelines, or lower the ceiling with `enhanced_fit_mode_time_limit_s` (see below).
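
If a blocking `.fit()` of this length is awkward in your pipeline, one option is to submit it to a worker thread and collect the result later. The following is a minimal sketch using only the standard library. `SlowEstimator` and `fit_in_background` are hypothetical names, and the stand-in sleeps briefly instead of calling the API. Whether the real client is safe to use from multiple threads is not covered here, so treat this as a pattern to evaluate rather than a recommendation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class SlowEstimator:
    """Hypothetical stand-in whose fit() blocks briefly, mimicking a long remote fit."""

    def fit(self, X, y):
        time.sleep(0.05)  # placeholder for a fit that may take many minutes
        self.fitted_ = True
        return self


def fit_in_background(executor, estimator, X, y):
    """Submit estimator.fit(X, y) to the executor and return a Future immediately."""
    return executor.submit(estimator.fit, X, y)


with ThreadPoolExecutor(max_workers=1) as executor:
    future = fit_in_background(executor, SlowEstimator(), [[0], [1]], [0, 1])
    # Other pipeline steps can run here while the fit is in progress.
    fitted = future.result()  # blocks until fit() returns, re-raising any error
```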

Choosing the metric

By default, the enhanced mode optimizes for a task-appropriate metric (log_loss for multiclass classification, root_mean_squared_error for regression, etc.). You can override this with enhanced_fit_mode_metric — the metric drives what we optimize for during the sweep.

Classification

clf = TabPFNClassifier.create_default_for_version(
    ModelVersion.V3,
    enhanced_fit_mode=True,
    enhanced_fit_mode_metric="f1",  # or "accuracy", "log_loss", "roc_auc", "balanced_accuracy", "precision", "recall"
)

Regression

reg = TabPFNRegressor.create_default_for_version(
    ModelVersion.V3,
    enhanced_fit_mode=True,
    enhanced_fit_mode_metric="mae",  # or "rmse", "r2", "mape"
)

`enhanced_fit_mode_metric` is **not** the same as the `eval_metric` parameter described on the [Metric Tuning](/capabilities/metric-tuning) and [Model Parameters](/improving-performance/model-parameters) pages. Those configure decision-threshold and temperature tuning on the standalone TabPFN classifier. `enhanced_fit_mode_metric` only takes effect when `enhanced_fit_mode=True` and is consumed entirely by the enhanced-fit sweep.
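
To see why the choice of metric can change which configuration wins, consider an imbalanced binary problem. The sketch below computes accuracy and F1 by hand for two made-up sets of predictions. The data and helper functions are illustrative assumptions and do not reflect how the enhanced-fit sweep is implemented.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true, y_pred, positive=1):
    """F1 score for the positive class; returns 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Nine negatives and one positive: a heavily imbalanced toy problem.
y_true = [0] * 9 + [1]

# Candidate A never predicts the positive class.
always_negative = [0] * 10
# Candidate B finds the positive but raises two false alarms.
finds_positive = [0] * 7 + [1, 1, 1]

acc_a, f1_a = accuracy(y_true, always_negative), f1_binary(y_true, always_negative)
acc_b, f1_b = accuracy(y_true, finds_positive), f1_binary(y_true, finds_positive)
# Accuracy ranks A above B, while F1 ranks B above A.
```

In a case like this, optimizing for accuracy would favor the candidate that ignores the rare class, which is why it is worth picking the metric that matches how you will judge the model.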

Adjusting the time budget

enhanced_fit_mode_time_limit_s sets the ceiling on the enhanced-fit sweep in seconds. The default, which is also the maximum, is 2400s (40 minutes); lower it if you need a tighter budget, for example on smaller datasets. Passing a larger value raises a ValueError at .fit().

clf = TabPFNClassifier.create_default_for_version(
    ModelVersion.V3,
    enhanced_fit_mode=True,
    enhanced_fit_mode_time_limit_s=600,  # 10 minutes
)

This parameter is only consulted when enhanced_fit_mode=True. Passing None uses the maximum (2400s).
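
Because an out-of-range value only fails at `.fit()`, it can be convenient to check the budget earlier in your own configuration code. The helper below is a hypothetical client-side sketch of the rules described in this section (None means the maximum, anything above 2400 is rejected). It is not the library's implementation, and it does not cover behavior this page leaves unspecified, such as zero or negative values.

```python
MAX_TIME_LIMIT_S = 2400  # documented default and maximum (40 minutes)


def resolve_time_limit(time_limit_s=None):
    """Return the effective enhanced-fit budget in seconds.

    Mirrors the documented rules only: None falls back to the maximum,
    and values above the maximum raise ValueError.
    """
    if time_limit_s is None:
        return MAX_TIME_LIMIT_S
    if time_limit_s > MAX_TIME_LIMIT_S:
        raise ValueError(
            f"time limit {time_limit_s}s exceeds the maximum of {MAX_TIME_LIMIT_S}s"
        )
    return time_limit_s


default_budget = resolve_time_limit()     # no value given, uses the maximum
ten_minutes = resolve_time_limit(600)     # within range, returned unchanged
```

The validated value can then be passed as `enhanced_fit_mode_time_limit_s` when creating the estimator.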

Notes

  • All other TabPFNClassifier / TabPFNRegressor parameters (e.g. n_estimators, random_state) work as usual — pass them as kwargs to create_default_for_version.