Add batch inference benchmarking notebooks #279

Open
sfc-gh-jiehuang wants to merge 2 commits into main from jh_6_23_benchmarks_batch

sfc-gh-jiehuang commented Jun 23, 2026

Collaborator

Summary

Adds two batch inference benchmark notebooks under samples/ml/model_serving/batch_inference_benchmarking/:

  • run_batch_xgboost_10b.ipynb — XGBoost 10B-row batch inference benchmark
  • run_batch_st_10m.ipynb — Sentence Transformer 10M-row batch inference benchmark

Add XGBoost 10B-row and Sentence Transformer 10M-row batch inference
benchmark notebooks under samples/ml/model_serving/batch_inference_benchmarking.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
"\n",
"# --- Generate input data ---\n",
"SENTENCE_TEMPLATES = [\n",
" \"Machine learning models require diverse training data for optimal performance across domains.\",\n",

Collaborator

Can we use a standard dataset?

Collaborator Author

We would need to rerun the benchmark for all platforms if we update the datasets. I do not feel like it is worth the effort at this point.

"execution_count": null,
"metadata": {},
"outputs": [],
"source": "# ╔════════════════════════════════════════════════════════════╗\n# ║ USER CONFIGURATION — fill these in before running ║\n# ╚════════════════════════════════════════════════════════════╝\nCONNECTION_NAME = \"<connection>\" # Snowflake connection name (from ~/.snowflake/connections.toml)\nDB_NAME = \"ST_BENCHMARK\" # Database to create/use for this benchmark\nWAREHOUSE_SIZE = \"4X-LARGE\" # Warehouse size for data generation\nIMAGE_REPO = \"<db>.<schema>.<repo>\" # Image repository for SPCS containers\nEVENT_TABLE = \"<db>.<schema>.<table>\" # Event table for platform metrics (set to None to skip metrics)\n\n# ╔════════════════════════════════════════════════════════════╗\n# ║ BENCHMARK DEFAULTS — change only to explore alternatives ║\n# ╚════════════════════════════════════════════════════════════╝\nNUM_NODES = 2\nINSTANCE_FAMILY = \"GPU_NV_S\"\nNUM_WORKERS = 2\nMAX_BATCH_ROWS = 256\nREPLICAS = 2\nFUNCTION_NAME = \"encode\"\nINPUT_ROWS = 10_000_000\nGPU_REQUESTS = \"1\" # GPUs per worker\nREPEAT = 3\nWARMUP_ROW_COUNT = 1_000\n\nMODEL_ID = \"all-MiniLM-L6-v2\" # HuggingFace model to download\nMODEL_NAME = \"all_minilm_l6_v2\"\nMODEL_VERSION = \"V1\""

Collaborator

I feel like some of these don't need to be changed by customers. Of course, this is code and they can do whatever they want, but we should draw less attention to them.

Collaborator Author

Users do not need to touch these to run the notebook.

Is the concern that we do not want to draw attention to them? We can move the non-mandatory parameters to other sections. What do you think?
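
One way to act on the suggestion above is to keep the values a user must supply apart from the benchmark defaults, so the defaults draw less attention. The sketch below is only an illustration: the `UserConfig` and `BenchmarkDefaults` class names and the `validate` helper are hypothetical and do not come from the notebook. The field names and default values mirror the configuration cell shown above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class UserConfig:
    """Values a user fills in before running (hypothetical grouping)."""
    connection_name: str
    db_name: str = "ST_BENCHMARK"
    warehouse_size: str = "4X-LARGE"
    event_table: Optional[str] = None  # None means platform metrics are skipped


@dataclass(frozen=True)
class BenchmarkDefaults:
    """Defaults copied from the config cell; change only to explore alternatives."""
    num_nodes: int = 2
    instance_family: str = "GPU_NV_S"
    num_workers: int = 2
    max_batch_rows: int = 256
    replicas: int = 2
    function_name: str = "encode"
    input_rows: int = 10_000_000
    gpu_requests: str = "1"
    repeat: int = 3
    warmup_row_count: int = 1_000


def validate(user: UserConfig) -> None:
    """Reject placeholder values such as "<connection>" that were never filled in."""
    for f in fields(user):
        value = getattr(user, f.name)
        if isinstance(value, str) and value.startswith("<") and value.endswith(">"):
            raise ValueError(f"{f.name} still holds the placeholder {value!r}")
```

With this split, a notebook could show only the `UserConfig(...)` call in its first cell and leave `BenchmarkDefaults()` in a later, less prominent cell.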

"execution_count": null,
"metadata": {},
"outputs": [],
"source": "# ╔════════════════════════════════════════════════════════════╗\n# ║ USER CONFIGURATION — fill these in before running ║\n# ╚════════════════════════════════════════════════════════════╝\nCONNECTION_NAME = \"<connection>\" # Snowflake connection name (from ~/.snowflake/connections.toml)\nDB_NAME = \"ST_BENCHMARK\" # Database to create/use for this benchmark\nWAREHOUSE_SIZE = \"4X-LARGE\" # Warehouse size for data generation\nIMAGE_REPO = \"<db>.<schema>.<repo>\" # Image repository for SPCS containers\nEVENT_TABLE = \"<db>.<schema>.<table>\" # Event table for platform metrics (set to None to skip metrics)\n\n# ╔════════════════════════════════════════════════════════════╗\n# ║ BENCHMARK DEFAULTS — change only to explore alternatives ║\n# ╚════════════════════════════════════════════════════════════╝\nNUM_NODES = 2\nINSTANCE_FAMILY = \"GPU_NV_S\"\nNUM_WORKERS = 2\nMAX_BATCH_ROWS = 256\nREPLICAS = 2\nFUNCTION_NAME = \"encode\"\nINPUT_ROWS = 10_000_000\nGPU_REQUESTS = \"1\" # GPUs per worker\nREPEAT = 3\nWARMUP_ROW_COUNT = 1_000\n\nMODEL_ID = \"all-MiniLM-L6-v2\" # HuggingFace model to download\nMODEL_NAME = \"all_minilm_l6_v2\"\nMODEL_VERSION = \"V1\""

Collaborator

Why is IMAGE_REPO needed?

removed

" EVENT_TABLE = None\n",
" print(\"EVENT_TABLE not configured -- platform metrics will be skipped.\")\n",
"\n",
"WAREHOUSE_NAME = f\"{DB_NAME}_WH\"\n",

Collaborator

Should we take this as input?
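
A minimal sketch of what taking the warehouse name as input could look like: an optional override that falls back to the derived `f"{DB_NAME}_WH"` name from the diff. The `resolve_warehouse_name` helper and its `override` parameter are hypothetical and are not part of the notebook.

```python
from typing import Optional


def resolve_warehouse_name(db_name: str, override: Optional[str] = None) -> str:
    """Return the user-supplied warehouse name, or derive one from the database name.

    Hypothetical helper: the diff under review always derives the name.
    """
    if override is not None and override.strip():
        return override.strip()
    # Same derivation as the line under review: WAREHOUSE_NAME = f"{DB_NAME}_WH"
    return f"{db_name}_WH"


derived = resolve_warehouse_name("ST_BENCHMARK")
supplied = resolve_warehouse_name("ST_BENCHMARK", "SHARED_WH")
print(derived, supplied)
```

Keeping the derived name as the fallback means existing runs behave the same, while users who already have a warehouse can reuse it.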

" print(f\"Registered: {MODEL_NAME}/{MODEL_VERSION}\")\n",
"\n",
"# --- Generate input data ---\n",
"SENTENCE_TEMPLATES = [\n",

Collaborator

We should probably use a more standard dataset from HF.

done

"execution_count": null,
"metadata": {},
"outputs": [],
"source": "# ╔════════════════════════════════════════════════════════════╗\n# ║ USER CONFIGURATION — fill these in before running ║\n# ╚════════════════════════════════════════════════════════════╝\nCONNECTION_NAME = \"<connection>\" # Snowflake connection name (from ~/.snowflake/connections.toml)\nDB_NAME = \"ST_BENCHMARK\" # Database to create/use for this benchmark\nWAREHOUSE_SIZE = \"4X-LARGE\" # Warehouse size for data generation\nIMAGE_REPO = \"<db>.<schema>.<repo>\" # Image repository for SPCS containers\nEVENT_TABLE = \"<db>.<schema>.<table>\" # Event table for platform metrics (set to None to skip metrics)\n\n# ╔════════════════════════════════════════════════════════════╗\n# ║ BENCHMARK DEFAULTS — change only to explore alternatives ║\n# ╚════════════════════════════════════════════════════════════╝\nNUM_NODES = 2\nINSTANCE_FAMILY = \"GPU_NV_S\"\nNUM_WORKERS = 2\nMAX_BATCH_ROWS = 256\nREPLICAS = 2\nFUNCTION_NAME = \"encode\"\nINPUT_ROWS = 10_000_000\nGPU_REQUESTS = \"1\" # GPUs per worker\nREPEAT = 3\nWARMUP_ROW_COUNT = 1_000\n\nMODEL_ID = \"all-MiniLM-L6-v2\" # HuggingFace model to download\nMODEL_NAME = \"all_minilm_l6_v2\"\nMODEL_VERSION = \"V1\""

Collaborator

Why do you need IMAGE_REPO?

removed
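
For context on how `REPEAT` and `WARMUP_ROW_COUNT` in the configuration cell might be used, here is a rough sketch of a benchmark loop. It assumes `REPEAT` is the number of timed runs and `WARMUP_ROW_COUNT` is the size of one untimed warm-up run; both readings are guesses from the names, not something the diff confirms. `run_batch` is a hypothetical stand-in for whatever submits the batch inference job.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(
    run_batch: Callable[[int], None],
    input_rows: int,
    repeat: int = 3,
    warmup_row_count: int = 1_000,
) -> Dict[str, float]:
    """Time `repeat` runs of `run_batch(input_rows)` after one untimed warm-up."""
    if repeat < 1:
        raise ValueError("repeat must be at least 1")

    # Warm-up run: not timed, so one-off start-up costs stay out of the results.
    run_batch(warmup_row_count)

    durations: List[float] = []
    for _ in range(repeat):
        start = time.perf_counter()
        run_batch(input_rows)
        durations.append(time.perf_counter() - start)

    median_s = statistics.median(durations)
    return {
        "median_seconds": median_s,
        "min_seconds": min(durations),
        "max_seconds": max(durations),
        # Guard against a zero duration from a trivially fast stand-in.
        "rows_per_second": input_rows / median_s if median_s > 0 else float("inf"),
    }
```

Reporting the median alongside the minimum and maximum makes it easier to see whether one of the repeated runs was an outlier.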

" CREATE OR REPLACE TABLE {table_name} AS\n",
" SELECT\n",
" ARRAY_CONSTRUCT({array_literal})[MOD(SEQ4(), {num_templates})]::VARCHAR AS SENTENCE\n",
" FROM TABLE(GENERATOR(ROWCOUNT => {row_count}))\n",

Collaborator

done
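
The generator query in the diff above cycles through a fixed list of sentences by indexing an array with `MOD(SEQ4(), {num_templates})`. A minimal sketch of how such a statement could be assembled in Python follows. The `build_generator_sql` helper and the `ST_INPUT` table name are hypothetical; only the SQL shape and the placeholder names (`table_name`, `array_literal`, `num_templates`, `row_count`) come from the diff, and the single-quote escaping is an added assumption about how the literals are formed.

```python
from typing import Sequence


def build_generator_sql(table_name: str, templates: Sequence[str], row_count: int) -> str:
    """Build a CREATE TABLE statement that repeats `templates` for `row_count` rows."""
    if not templates:
        raise ValueError("at least one template sentence is required")
    if row_count < 0:
        raise ValueError("row_count must be non-negative")

    # Double any single quote so each sentence is a valid SQL string literal.
    array_literal = ", ".join("'" + t.replace("'", "''") + "'" for t in templates)
    num_templates = len(templates)

    # table_name is interpolated as-is, so it must come from trusted configuration.
    return (
        f"CREATE OR REPLACE TABLE {table_name} AS\n"
        f"SELECT\n"
        f"    ARRAY_CONSTRUCT({array_literal})[MOD(SEQ4(), {num_templates})]::VARCHAR AS SENTENCE\n"
        f"FROM TABLE(GENERATOR(ROWCOUNT => {row_count}))"
    )


sql = build_generator_sql("ST_INPUT", ["It's a test.", "Second sentence."], 10)
print(sql)
```

This sketch only handles single quotes; sentences containing backslashes would likely need their own escaping before being placed in a SQL literal.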

Replace run_batch_st_10m.ipynb and run_batch_xgboost_10b.ipynb with the
latest versions from the source benchmarking notebooks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>