From 98a07caf6fbfe221739460092e826fb6c1198868 Mon Sep 17 00:00:00 2001
From: jddocs
Date: Thu, 17 Apr 2025 09:17:25 -0400
Subject: [PATCH] [Updated] App Platform LLM and RAG Pipeline guides

---
 .../deploy-llm-for-ai-inferencing-on-apl/index.md   | 10 +++++++---
 .../deploy-rag-pipeline-and-chatbot-on-apl/index.md |  9 +++++++--
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/docs/guides/kubernetes/deploy-llm-for-ai-inferencing-on-apl/index.md b/docs/guides/kubernetes/deploy-llm-for-ai-inferencing-on-apl/index.md
index 82ed4c40f48..f01069c6f09 100644
--- a/docs/guides/kubernetes/deploy-llm-for-ai-inferencing-on-apl/index.md
+++ b/docs/guides/kubernetes/deploy-llm-for-ai-inferencing-on-apl/index.md
@@ -5,6 +5,7 @@ description: "This guide includes steps and guidance for deploying a large langu
 authors: ["Akamai"]
 contributors: ["Akamai"]
 published: 2025-03-25
+modified: 2025-04-17
 keywords: ['ai','ai inference','ai inferencing','llm','large language model','app platform','lke','linode kubernetes engine','llama 3','kserve','istio','knative']
 license: '[CC BY-ND 4.0](https://creativecommons.org/licenses/by-nd/4.0)'
 external_resources:
@@ -66,11 +67,14 @@ If you prefer to manually install an LLM and RAG Pipeline on LKE rather than usi
 
 - Enrollment into the Akamai App Platform's [beta program](https://cloud.linode.com/betas).
 
-- An provisioned and configured LKE cluster with App Platform enabled. We recommend an LKE cluster consisting of at least 3 RTX4000 Ada x1 Medium [GPU](https://techdocs.akamai.com/cloud-computing/docs/gpu-compute-instances) plans.
+## Set Up Infrastructure
 
-To learn more about provisioning a LKE cluster with App Platform, see our [Getting Started with App Platform for LKE](https://techdocs.akamai.com/cloud-computing/docs/getting-started-with-akamai-application-platform) guide.
+### Provision an LKE Cluster
 
-## Set Up Infrastructure
+We recommend provisioning an LKE cluster with [App Platform](https://techdocs.akamai.com/cloud-computing/docs/application-platform) enabled and the following minimum requirements:
+
+- 3 **8GB Dedicated CPUs** with [autoscaling](https://techdocs.akamai.com/cloud-computing/docs/manage-nodes-and-node-pools#autoscale-automatically-resize-node-pools) turned on
+- A second node pool consisting of at least 2 **RTX4000 Ada x1 Medium [GPU](https://techdocs.akamai.com/cloud-computing/docs/gpu-compute-instances)** plans
 
 Once your LKE cluster is provisioned and the App Platform web UI is available, complete the following steps to continue setting up your infrastructure.
 
diff --git a/docs/guides/kubernetes/deploy-rag-pipeline-and-chatbot-on-apl/index.md b/docs/guides/kubernetes/deploy-rag-pipeline-and-chatbot-on-apl/index.md
index d48806d4b9c..d8349c08a48 100644
--- a/docs/guides/kubernetes/deploy-rag-pipeline-and-chatbot-on-apl/index.md
+++ b/docs/guides/kubernetes/deploy-rag-pipeline-and-chatbot-on-apl/index.md
@@ -5,6 +5,7 @@ description: "This guide expands on a previously built LLM and AI inferencing ar
 authors: ["Akamai"]
 contributors: ["Akamai"]
 published: 2025-03-25
+modified: 2025-04-17
 keywords: ['ai','ai inference','ai inferencing','llm','large language model','app platform','lke','linode kubernetes engine','rag pipeline','retrieval augmented generation','open webui','kubeflow']
 license: '[CC BY-ND 4.0](https://creativecommons.org/licenses/by-nd/4.0)'
 external_resources:
@@ -50,9 +51,13 @@ If you prefer a manual installation rather than one using App Platform for LKE,
 
 ## Prerequisites
 
-- Complete the deployment in the [Deploy an LLM for AI Inferencing with App Platform for LKE](/docs/guides/deploy-llm-for-ai-inferencing-on-apl) guide. An LKE cluster consisting of at least 3 RTX4000 Ada x1 Medium [GPU](https://techdocs.akamai.com/cloud-computing/docs/gpu-compute-instances) nodes is recommended for AI inference workloads.
+- Complete the deployment in the [Deploy an LLM for AI Inferencing with App Platform for LKE](/docs/guides/deploy-llm-for-ai-inferencing-on-apl) guide. Your LKE cluster should include the following minimum hardware requirements:
 
-- [Python3](https://www.python.org/downloads/) and the [venv](https://docs.python.org/3/library/venv.html) Python module installed on your local machine.
+    - 3 **8GB Dedicated CPUs** with [autoscaling](https://techdocs.akamai.com/cloud-computing/docs/manage-nodes-and-node-pools#autoscale-automatically-resize-node-pools) turned on
+
+    - A second node pool consisting of at least 2 **RTX4000 Ada x1 Medium [GPU](https://techdocs.akamai.com/cloud-computing/docs/gpu-compute-instances)** plans
+
+- [Python3](https://www.python.org/downloads/) and the [venv](https://docs.python.org/3/library/venv.html) Python module installed on your local machine
 
 ## Set Up Infrastructure
 