Add batch inference benchmarking notebooks #279
Add XGBoost 10B-row and Sentence Transformer 10M-row batch inference benchmark notebooks under samples/ml/model_serving/batch_inference_benchmarking.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| "\n", | ||
| "# --- Generate input data ---\n", | ||
| "SENTENCE_TEMPLATES = [\n", | ||
| " \"Machine learning models require diverse training data for optimal performance across domains.\",\n", |
Can we use a standard dataset?
We would need to rerun the benchmark for all platforms if we update the datasets. I do not feel like it is worth the effort at this point.
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": "# ╔════════════════════════════════════════════════════════════╗\n# ║ USER CONFIGURATION — fill these in before running ║\n# ╚════════════════════════════════════════════════════════════╝\nCONNECTION_NAME = \"<connection>\" # Snowflake connection name (from ~/.snowflake/connections.toml)\nDB_NAME = \"ST_BENCHMARK\" # Database to create/use for this benchmark\nWAREHOUSE_SIZE = \"4X-LARGE\" # Warehouse size for data generation\nIMAGE_REPO = \"<db>.<schema>.<repo>\" # Image repository for SPCS containers\nEVENT_TABLE = \"<db>.<schema>.<table>\" # Event table for platform metrics (set to None to skip metrics)\n\n# ╔════════════════════════════════════════════════════════════╗\n# ║ BENCHMARK DEFAULTS — change only to explore alternatives ║\n# ╚════════════════════════════════════════════════════════════╝\nNUM_NODES = 2\nINSTANCE_FAMILY = \"GPU_NV_S\"\nNUM_WORKERS = 2\nMAX_BATCH_ROWS = 256\nREPLICAS = 2\nFUNCTION_NAME = \"encode\"\nINPUT_ROWS = 10_000_000\nGPU_REQUESTS = \"1\" # GPUs per worker\nREPEAT = 3\nWARMUP_ROW_COUNT = 1_000\n\nMODEL_ID = \"all-MiniLM-L6-v2\" # HuggingFace model to download\nMODEL_NAME = \"all_minilm_l6_v2\"\nMODEL_VERSION = \"V1\"" |
I feel like some of these don't need to be changed by customers. Of course this is code and they can do whatever they want, but we should draw less attention to these.
Users do not need to touch these to run the notebook.
Is the concern that we do not want to draw attention to them? We can move the non-mandatory parameters to other sections. What do you think?
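One possible shape for that split, sketched in Python: the values a user must fill in stay in the first cell, and the benchmark defaults move into a frozen dataclass defined in a later cell. `BenchmarkDefaults`, `is_placeholder`, and `resolve_event_table` are hypothetical names for illustration, not code from this PR, and the placeholder check is an assumption about how unfilled values might be detected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BenchmarkDefaults:
    """Values copied from the BENCHMARK DEFAULTS block in the diff above."""
    num_nodes: int = 2
    instance_family: str = "GPU_NV_S"
    num_workers: int = 2
    max_batch_rows: int = 256
    replicas: int = 2
    function_name: str = "encode"
    input_rows: int = 10_000_000
    gpu_requests: str = "1"  # GPUs per worker
    repeat: int = 3
    warmup_row_count: int = 1_000


def is_placeholder(value: Optional[str]) -> bool:
    """Hypothetical helper: True if the value is unset or still holds a <...> placeholder."""
    return value is None or ("<" in value and ">" in value)


def resolve_event_table(event_table: Optional[str]) -> Optional[str]:
    """Assumed behavior: an unfilled EVENT_TABLE disables platform metrics."""
    return None if is_placeholder(event_table) else event_table


# Only the values a user must fill in remain in the first cell.
CONNECTION_NAME = "<connection>"
DB_NAME = "ST_BENCHMARK"
EVENT_TABLE = resolve_event_table("<db>.<schema>.<table>")
WAREHOUSE_NAME = f"{DB_NAME}_WH"  # derived from DB_NAME, as in the notebook

# Defaults live in one object that later cells read from.
defaults = BenchmarkDefaults()
```

Later cells would then read `defaults.max_batch_rows` and so on, and anyone exploring alternatives could construct `BenchmarkDefaults(replicas=4)` without editing the first cell.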
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": "# ╔════════════════════════════════════════════════════════════╗\n# ║ USER CONFIGURATION — fill these in before running ║\n# ╚════════════════════════════════════════════════════════════╝\nCONNECTION_NAME = \"<connection>\" # Snowflake connection name (from ~/.snowflake/connections.toml)\nDB_NAME = \"ST_BENCHMARK\" # Database to create/use for this benchmark\nWAREHOUSE_SIZE = \"4X-LARGE\" # Warehouse size for data generation\nIMAGE_REPO = \"<db>.<schema>.<repo>\" # Image repository for SPCS containers\nEVENT_TABLE = \"<db>.<schema>.<table>\" # Event table for platform metrics (set to None to skip metrics)\n\n# ╔════════════════════════════════════════════════════════════╗\n# ║ BENCHMARK DEFAULTS — change only to explore alternatives ║\n# ╚════════════════════════════════════════════════════════════╝\nNUM_NODES = 2\nINSTANCE_FAMILY = \"GPU_NV_S\"\nNUM_WORKERS = 2\nMAX_BATCH_ROWS = 256\nREPLICAS = 2\nFUNCTION_NAME = \"encode\"\nINPUT_ROWS = 10_000_000\nGPU_REQUESTS = \"1\" # GPUs per worker\nREPEAT = 3\nWARMUP_ROW_COUNT = 1_000\n\nMODEL_ID = \"all-MiniLM-L6-v2\" # HuggingFace model to download\nMODEL_NAME = \"all_minilm_l6_v2\"\nMODEL_VERSION = \"V1\"" |
Why is IMAGE_REPO needed?
| " EVENT_TABLE = None\n", | ||
| " print(\"EVENT_TABLE not configured -- platform metrics will be skipped.\")\n", | ||
| "\n", | ||
| "WAREHOUSE_NAME = f\"{DB_NAME}_WH\"\n", |
| " print(f\"Registered: {MODEL_NAME}/{MODEL_VERSION}\")\n", | ||
| "\n", | ||
| "# --- Generate input data ---\n", | ||
| "SENTENCE_TEMPLATES = [\n", |
We should probably take a more standard dataset from HF.
| "execution_count": null, | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": "# ╔════════════════════════════════════════════════════════════╗\n# ║ USER CONFIGURATION — fill these in before running ║\n# ╚════════════════════════════════════════════════════════════╝\nCONNECTION_NAME = \"<connection>\" # Snowflake connection name (from ~/.snowflake/connections.toml)\nDB_NAME = \"ST_BENCHMARK\" # Database to create/use for this benchmark\nWAREHOUSE_SIZE = \"4X-LARGE\" # Warehouse size for data generation\nIMAGE_REPO = \"<db>.<schema>.<repo>\" # Image repository for SPCS containers\nEVENT_TABLE = \"<db>.<schema>.<table>\" # Event table for platform metrics (set to None to skip metrics)\n\n# ╔════════════════════════════════════════════════════════════╗\n# ║ BENCHMARK DEFAULTS — change only to explore alternatives ║\n# ╚════════════════════════════════════════════════════════════╝\nNUM_NODES = 2\nINSTANCE_FAMILY = \"GPU_NV_S\"\nNUM_WORKERS = 2\nMAX_BATCH_ROWS = 256\nREPLICAS = 2\nFUNCTION_NAME = \"encode\"\nINPUT_ROWS = 10_000_000\nGPU_REQUESTS = \"1\" # GPUs per worker\nREPEAT = 3\nWARMUP_ROW_COUNT = 1_000\n\nMODEL_ID = \"all-MiniLM-L6-v2\" # HuggingFace model to download\nMODEL_NAME = \"all_minilm_l6_v2\"\nMODEL_VERSION = \"V1\"" |
Why do you need IMAGE_REPO?
| " CREATE OR REPLACE TABLE {table_name} AS\n", | ||
| " SELECT\n", | ||
| " ARRAY_CONSTRUCT({array_literal})[MOD(SEQ4(), {num_templates})]::VARCHAR AS SENTENCE\n", | ||
| " FROM TABLE(GENERATOR(ROWCOUNT => {row_count}))\n", |
Can we use the wine dataset directly? https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_wine.html
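For reference, the hunk under discussion builds the input table by cycling through a fixed template list. A minimal Python sketch of how such a statement might be assembled follows; `build_generator_sql`, its quote escaping, and the table name are assumptions for illustration, and only the SQL shape (`ARRAY_CONSTRUCT`, `MOD(SEQ4(), ...)`, `GENERATOR(ROWCOUNT => ...)`) comes from the diff.

```python
from typing import Sequence


def build_generator_sql(table_name: str, templates: Sequence[str], row_count: int) -> str:
    """Hypothetical helper mirroring the SQL shape shown in the diff."""
    if not templates:
        raise ValueError("at least one template is required")
    # Double any single quotes so each template is a valid SQL string literal.
    array_literal = ", ".join("'" + t.replace("'", "''") + "'" for t in templates)
    return (
        f"CREATE OR REPLACE TABLE {table_name} AS\n"
        f"SELECT\n"
        f"    ARRAY_CONSTRUCT({array_literal})[MOD(SEQ4(), {len(templates)})]::VARCHAR AS SENTENCE\n"
        f"FROM TABLE(GENERATOR(ROWCOUNT => {row_count}))"
    )


sql = build_generator_sql(
    "ST_BENCHMARK.PUBLIC.INPUT_SENTENCES",  # assumed table name, not from the PR
    ["It's a test sentence.", "Another template."],
    10_000_000,
)
```

Because each row indexes into the template array, the generated table holds at most as many distinct sentences as there are templates, which is the trade-off behind the dataset questions in this thread.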
Replace run_batch_st_10m.ipynb and run_batch_xgboost_10b.ipynb with the latest versions from the source benchmarking notebooks.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Adds two batch inference benchmark notebooks under samples/ml/model_serving/batch_inference_benchmarking/:
- run_batch_xgboost_10b.ipynb: XGBoost 10B-row batch inference benchmark
- run_batch_st_10m.ipynb: Sentence Transformer 10M-row batch inference benchmark