microsoft · thinkall · Jan 13, 2026
diff --git a/flaml/version.py b/flaml/version.py
@@ -1 +1 @@
-__version__ = "2.4.0"
+__version__ = "2.4.1"
diff --git a/setup.py b/setup.py
@@ -116,14 +116,14 @@
             "scikit-learn",
         ],
         "hf": [
-            "transformers[torch]==4.26",
+            "transformers[torch]>=4.26",
             "datasets",
             "nltk<=3.8.1",
             "rouge_score",
             "seqeval",
         ],
         "nlp": [  # for backward compatibility; hf is the new option name
-            "transformers[torch]==4.26",
+            "transformers[torch]>=4.26",
             "datasets",
             "nltk<=3.8.1",
             "rouge_score",

diff --git a/website/docs/Best-Practices.md b/website/docs/Best-Practices.md
@@ -0,0 +1,132 @@
+````markdown
+# Best Practices
+
+This page collects practical guidance for using FLAML effectively across common tasks.
+
+## General tips
+
+- Start simple: set `task`, `time_budget`, and keep `metric="auto"` unless you have a strong reason to override.
+- Use appropriate splits: make sure your evaluation strategy matches your data (time series vs. i.i.d., grouped data, etc.).
+- Keep estimator lists explicit when debugging: start with a small `estimator_list` and expand.
+- Use built-in discovery helpers to avoid stale hardcoded lists:
+
+```python
+from flaml import AutoML
+from flaml.automl.task.factory import task_factory
+
+automl = AutoML()
+print("Built-in sklearn metrics:", sorted(automl.supported_metrics[0]))
+print("classification estimators:", sorted(task_factory("classification").estimators.keys()))
+```
+
+## Classification
+
+- **Metric**: for binary classification, `metric="roc_auc"` is common; for multiclass, `metric="log_loss"` is often robust.
+- **Imbalanced data**:
+  - pass `sample_weight` to `AutoML.fit()`;
+  - consider setting class weights via `custom_hp` / `fit_kwargs_by_estimator` for specific estimators (see [FAQ](FAQ)).
+- **Probability vs label metrics**: use `roc_auc` when you care about how well the predicted scores rank the classes, and `log_loss` when you care about calibrated probabilities.
+
+## Regression
+
+- **Default metric**: `metric="r2"` (minimizes `1 - r2`).
+- If your target scale matters (e.g., dollar error), consider `mae`/`rmse`.
+
+## Learning to rank
+
+- Use `task="rank"` with group information (`groups` / `groups_val`) so metrics like `ndcg` and `ndcg@k` are meaningful.
+- If you pass `metric="ndcg@10"`, also pass `groups` so FLAML can compute group-aware NDCG.
+
+## Time series forecasting
+
+- Use time-aware splitting. For holdout validation, set `eval_method="holdout"` and use a time-ordered dataset.
+- Prefer supplying a DataFrame with a clear time column when possible.
+- Optional time-series estimators depend on optional dependencies. To list what is available in your environment:
+
+```python
+from flaml.automl.task.factory import task_factory
+
+print("forecast:", sorted(task_factory("forecast").estimators.keys()))
+```
+
+## NLP (Transformers)
+
+- Install the optional dependency: `pip install "flaml[hf]"`.
+- When you provide a custom metric, ensure it returns `(metric_to_minimize, metrics_to_log)` with stable keys.
+
+## Speed, stability, and tricky settings
+
+- **Time budget vs convergence**: if you see warnings about not all estimators converging, increase `time_budget` or reduce `estimator_list`.
+- **Memory pressure / OOM**:
+  - set `free_mem_ratio` (e.g., `0.2`) to keep free memory above a threshold;
+  - set `model_history=False` to reduce stored artifacts.
+- **Reproducibility**: set `seed` and keep `n_jobs` fixed; expect some runtime variance.
+
+## Persisting models
+
+FLAML supports **both** MLflow logging and pickle-based persistence. For production deployment, MLflow logging is typically the better fit because it plugs into the MLflow ecosystem (tracking, model registry, serving, governance). For quick local reuse, persisting the whole `AutoML` object via pickle is often the most convenient.
+
+### Option 1: MLflow logging (recommended for production)
+
+When you run `AutoML.fit()` inside an MLflow run, FLAML can log metrics/params automatically (disable via `mlflow_logging=False` if needed). To persist the trained `AutoML` object as a model artifact and reuse MLflow tooling end-to-end:
+
+```python
+import mlflow
+import numpy as np
+from sklearn.datasets import load_iris
+from sklearn.model_selection import train_test_split
+from flaml import AutoML
+
+
+X, y = load_iris(return_X_y=True, as_frame=True)
+X_train, X_test, y_train, y_test = train_test_split(
+    X, y, test_size=0.2, random_state=42
+)
+
+automl = AutoML()
+mlflow.set_experiment("flaml")
+with mlflow.start_run(run_name="flaml_run") as run:
+    automl.fit(X_train, y_train, task="classification", time_budget=3, retrain_full=False, eval_method="holdout")
+
+run_id = run.info.run_id
+
+# Later (or in a different process)
+automl2 = mlflow.sklearn.load_model(f"runs:/{run_id}/model")
+assert np.array_equal(automl2.predict(X_test), automl.predict(X_test))
+```
+
+### Option 2: Pickle the full `AutoML` instance (convenient / Fabric)
+
+Pickling stores the *entire* `AutoML` instance (not just the best estimator). This is useful when you prefer not to rely on MLflow or when you want to reuse additional attributes of the AutoML object without retraining.
+
+In Microsoft Fabric scenarios, this is particularly important for re-plotting visualization figures without requiring model retraining.
+
+```python
+import mlflow
+import numpy as np
+from sklearn.datasets import load_iris
+from sklearn.model_selection import train_test_split
+from flaml import AutoML
+
+
+X, y = load_iris(return_X_y=True, as_frame=True)
+X_train, X_test, y_train, y_test = train_test_split(
+    X, y, test_size=0.2, random_state=42
+)
+
+automl = AutoML()
+mlflow.set_experiment("flaml")
+with mlflow.start_run(run_name="flaml_run") as run:
+    automl.fit(X_train, y_train, task="classification", time_budget=3, retrain_full=False, eval_method="holdout")
+
+automl.pickle("automl.pkl")
+automl2 = AutoML.load_pickle("automl.pkl")
+assert np.array_equal(automl2.predict(X_test), automl.predict(X_test))
+assert automl.best_config == automl2.best_config
+assert automl.best_loss == automl2.best_loss
+assert automl.mlflow_integration.infos == automl2.mlflow_integration.infos
+```
+
+See also: [Task-Oriented AutoML](Use-Cases/Task-Oriented-AutoML) and [FAQ](FAQ).
+
+````
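The new Best Practices page asks custom metrics to return `(metric_to_minimize, metrics_to_log)` with stable keys, but does not show one. A minimal sketch of such a function follows, assuming a binary classifier with labels `0`/`1` and an estimator that exposes `predict_proba`. The argument list follows the custom-metric signature described in FLAML's Task-Oriented AutoML documentation; the log-loss computation and the key names `val_loss` / `train_loss` are illustrative choices, not part of this change.

```python
import math


def custom_metric(
    X_val, y_val, estimator, labels,
    X_train, y_train,
    weight_val=None, weight_train=None,
    *args,
):
    """Return (metric_to_minimize, metrics_to_log) for a binary classifier."""

    def mean_log_loss(X, y):
        # Assumption: predict_proba returns rows of [P(class 0), P(class 1)].
        proba = estimator.predict_proba(X)
        eps = 1e-15
        total = 0.0
        for row, target in zip(proba, y):
            p = min(max(row[1], eps), 1 - eps)  # clip away from 0 and 1
            total -= math.log(p) if target == 1 else math.log(1 - p)
        return total / len(y)

    val_loss = mean_log_loss(X_val, y_val)
    train_loss = mean_log_loss(X_train, y_train)
    # "Stable keys": the logged dict uses the same names on every call.
    return val_loss, {"val_loss": val_loss, "train_loss": train_loss}
```

Such a function would be passed as `metric=custom_metric` to `AutoML.fit()`. The first return value is what the search minimizes; the dictionary is only logged alongside each trial.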
diff --git a/website/docs/Contribute.md b/website/docs/Contribute.md
@@ -62,10 +62,10 @@ There is currently no formal reviewer solicitation process. Current reviewers id
 
 ```bash
 git clone https://github.com/microsoft/FLAML.git
-pip install -e FLAML[notebook,autogen]
+pip install -e ".[notebook]"
 ```
 
-In case the `pip install` command fails, try escaping the brackets such as `pip install -e FLAML\[notebook,autogen\]`.
+In case the `pip install` command fails, try escaping the brackets such as `pip install -e .\[notebook\]`.
 
 ### Docker
 

diff --git a/website/docs/FAQ.md b/website/docs/FAQ.md
@@ -115,3 +115,57 @@ from sklearn.ensemble import RandomForestClassifier
 model = RandomForestClassifier(**best_params)
 model.fit(X, y)
 ```
+
+### How to save and load an AutoML object? (`pickle` / `load_pickle`)
+
+FLAML provides `AutoML.pickle()` / `AutoML.load_pickle()` as a convenient and robust way to persist an AutoML run.
+
+```python
+from flaml import AutoML
+
+automl = AutoML()
+automl.fit(X_train, y_train, task="classification", time_budget=60)
+
+# Save
+automl.pickle("automl.pkl")
+
+# Load
+automl_loaded = AutoML.load_pickle("automl.pkl")
+pred = automl_loaded.predict(X_test)
+```
+
+Notes:
+
+- If you used Spark estimators, `AutoML.pickle()` externalizes Spark ML models into an adjacent artifact folder and keeps
+  the pickle itself lightweight.
+- If you want to skip re-loading externalized Spark models (e.g., in an environment without Spark), use:
+
+```python
+automl_loaded = AutoML.load_pickle("automl.pkl", load_spark_models=False)
+```
+
+### How to list all available estimators for a task?
+
+The available estimator set is task-dependent and can vary with optional dependencies. You can list the estimator keys
+that FLAML currently has registered in your environment:
+
+```python
+from flaml.automl.task.factory import task_factory
+
+print(sorted(task_factory("classification").estimators.keys()))
+print(sorted(task_factory("regression").estimators.keys()))
+print(sorted(task_factory("forecast").estimators.keys()))
+print(sorted(task_factory("rank").estimators.keys()))
+```
+
+### How to list supported built-in metrics?
+
+```python
+from flaml import AutoML
+
+automl = AutoML()
+sklearn_metrics, hf_metrics, spark_metrics = automl.supported_metrics
+print(sorted(sklearn_metrics))
+print(sorted(hf_metrics))
+print(spark_metrics)
+```
diff --git a/website/docs/Getting-Started.md b/website/docs/Getting-Started.md
@@ -8,53 +8,17 @@ and optimizes their performance.
 
 ### Main Features
 
-- FLAML enables building next-gen GPT-X applications based on multi-agent conversations with minimal effort. It simplifies the orchestration, automation and optimization of a complex GPT-X workflow. It maximizes the performance of GPT-X models and augments their weakness.
 - For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It is easy to customize or extend.
 - It supports fast and economical automatic tuning, capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping.
 
 FLAML is powered by a series of [research studies](/docs/Research) from Microsoft Research and collaborators such as Penn State University, Stevens Institute of Technology, University of Washington, and University of Waterloo.
 
 ### Quickstart
 
-Install FLAML from pip: `pip install flaml`. Find more options in [Installation](/docs/Installation).
+Install FLAML from pip: `pip install flaml` (**requires Python >= 3.10**). Find more options in [Installation](/docs/Installation).
 
 There are several ways of using flaml:
 
-#### (New) [AutoGen](https://microsoft.github.io/autogen/)
-
-Autogen enables the next-gen GPT-X applications with a generic multi-agent conversation framework.
-It offers customizable and conversable agents which integrate LLMs, tools and human.
-By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code. For example,
-
-```python
-from flaml import autogen
-
-assistant = autogen.AssistantAgent("assistant")
-user_proxy = autogen.UserProxyAgent("user_proxy")
-user_proxy.initiate_chat(
-    assistant,
-    message="Show me the YTD gain of 10 largest technology companies as of today.",
-)
-# This initiates an automated chat between the two agents to solve the task
-```
-
-Autogen also helps maximize the utility out of the expensive LLMs such as ChatGPT and GPT-4. It offers a drop-in replacement of `openai.Completion` or `openai.ChatCompletion` with powerful functionalites like tuning, caching, error handling, templating. For example, you can optimize generations by LLM with your own tuning data, success metrics and budgets.
-
-```python
-# perform tuning
-config, analysis = autogen.Completion.tune(
-    data=tune_data,
-    metric="success",
-    mode="max",
-    eval_func=eval_func,
-    inference_budget=0.05,
-    optimization_budget=3,
-    num_samples=-1,
-)
-# perform inference for a test instance
-response = autogen.Completion.create(context=test_instance, **config)
-```
-
 #### [Task-oriented AutoML](/docs/Use-Cases/task-oriented-automl)
 
 With three lines of code, you can start using this economical and fast AutoML engine as a scikit-learn style estimator.
@@ -140,9 +104,10 @@ Then, you can use it just like you use the original `LGMBClassifier`. Your other
 
 ### Where to Go Next?
 
-- Understand the use cases for [AutoGen](https://microsoft.github.io/autogen/), [Task-oriented AutoML](/docs/Use-Cases/Task-Oriented-Automl), [Tune user-defined function](/docs/Use-Cases/Tune-User-Defined-Function) and [Zero-shot AutoML](/docs/Use-Cases/Zero-Shot-AutoML).
-- Find code examples under "Examples": from [AutoGen - AgentChat](/docs/Examples/AutoGen-AgentChat) to [Tune - PyTorch](/docs/Examples/Tune-PyTorch).
+- Understand the use cases for [Task-oriented AutoML](/docs/Use-Cases/Task-Oriented-Automl), [Tune user-defined function](/docs/Use-Cases/Tune-User-Defined-Function) and [Zero-shot AutoML](/docs/Use-Cases/Zero-Shot-AutoML).
+- Find code examples under "Examples": from [AutoML - Classification](/docs/Examples/AutoML-Classification) to [Tune - PyTorch](/docs/Examples/Tune-PyTorch).
 - Learn about [research](/docs/Research) around FLAML and check [blogposts](/blog).
+- Apply practical guidance in [Best Practices](/docs/Best-Practices).
 - Chat on [Discord](https://discord.gg/Cppx2vSPVP).
 
 If you like our project, please give it a [star](https://github.com/microsoft/FLAML/stargazers) on GitHub. If you are interested in contributing, please read [Contributor's Guide](/docs/Contribute).

diff --git a/website/docs/Installation.md b/website/docs/Installation.md
@@ -2,7 +2,7 @@
 
 ## Python
 
-FLAML requires **Python version >= 3.7**. It can be installed from pip:
+FLAML requires **Python version >= 3.10**. It can be installed from pip:
 
 ```bash
 pip install flaml
@@ -16,12 +16,6 @@ conda install flaml -c conda-forge
 
 ### Optional Dependencies
 
-#### [Autogen](Use-Cases/Autogen)
-
-```bash
-pip install "flaml[autogen]"
-```
-
 #### [Task-oriented AutoML](Use-Cases/Task-Oriented-AutoML)
 
 ```bash