For development and experimentation purposes, the Jupyter notebooks provide guidance for building knowledge-augmented chatbots.
The following Jupyter notebooks are provided with the AI workflow for the default canonical RAG example:
This notebook demonstrates how to use a client to stream responses from an LLM deployed to NVIDIA Triton Inference Server with NVIDIA TensorRT-LLM (TRT-LLM). This deployment format optimizes the model for low-latency, high-throughput inference.
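As a rough illustration of the streaming pattern only: the real notebook uses the Triton/TRT-LLM client libraries, and the JSON chunk format below is an assumption for demonstration. A streaming client typically surfaces partial outputs as they arrive and accumulates them into the full response:

```python
import json

def assemble_stream(chunks):
    """Accumulate a streamed LLM response token by token.

    Each chunk is assumed (hypothetically) to be a JSON string carrying a
    partial 'text_output' field; the real Triton/TRT-LLM client returns
    structured results, so this only shows the accumulation pattern.
    """
    pieces = []
    for raw in chunks:
        token = json.loads(raw)["text_output"]
        print(token, end="", flush=True)  # surface tokens as they arrive
        pieces.append(token)
    return "".join(pieces)

# Simulated stream standing in for responses from the inference server.
simulated = ['{"text_output": "Hello"}', '{"text_output": ", world!"}']
assert assemble_stream(simulated) == "Hello, world!"
```

Streaming matters for chat UIs because the first tokens reach the user long before the full completion finishes.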
This notebook demonstrates how to use LangChain to build a chatbot that references a custom knowledge base. LangChain provides a simple framework for connecting LLMs to your own data sources. The notebook shows how to integrate TensorRT-LLM with LangChain using a custom wrapper.
This notebook demonstrates how to use LlamaIndex to build a chatbot that references a custom knowledge base. It contains the same functionality as the previous notebook, but uses some LlamaIndex components instead of LangChain components. It also shows how the two frameworks can be used together.
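To make the retrieval idea behind these notebooks concrete, here is a minimal, framework-free sketch of retrieval-augmented prompting. The toy bag-of-words `embed` stands in for the real embedding model, and `retrieve`/`build_prompt` are hypothetical helpers for illustration, not LangChain or LlamaIndex APIs:

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real deployment would call the
    # embedding model served by the workflow instead.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, docs, k=1):
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(question, docs, k=2):
    context = "\n".join(retrieve(question, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Triton Inference Server serves models over HTTP and gRPC.",
    "FAISS is a library for vector similarity search.",
    "LangChain connects LLMs to external data sources.",
]
print(build_prompt("What does LangChain do?", docs))
```

The frameworks replace each piece here with a production component: a learned embedding model, a vector store such as FAISS, and prompt templates.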
This notebook demonstrates how to use LlamaIndex to build a more complex retrieval for a chatbot. The retrieval method shown in this notebook works well for code documentation; it retrieves more contiguous document blocks that preserve both code snippets and explanations of code.
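A simplified sketch of that chunking idea: keep each fenced code snippet attached to the paragraph that explains it, so a retrieved block carries both. The `chunk_markdown` helper is hypothetical, not the LlamaIndex retriever used in the notebook:

```python
def chunk_markdown(text):
    """Split markdown into retrieval blocks, re-attaching each fenced
    code snippet to the explanation paragraph that precedes it."""
    blocks, current, in_code = [], [], False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("```"):
            if not in_code and not current and blocks:
                # A fence right after a paragraph break: pull the
                # preceding explanation back into this block.
                current = [blocks.pop(), ""]
            current.append(line)
            if in_code:
                blocks.append("\n".join(current))
                current = []
            in_code = not in_code
        elif stripped == "" and not in_code:
            if current:
                blocks.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        blocks.append("\n".join(current))
    return blocks
```

Naive fixed-size chunking can split a code sample from its explanation; keeping them contiguous is what makes retrieval work well for code documentation.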
This notebook demonstrates how to use the REST FastAPI server to upload the knowledge base and then ask a question both with and without the knowledge base.
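A sketch of the two calls the notebook makes. The endpoint paths, port, and field names below are assumptions for illustration; check the FastAPI server's auto-generated API docs for the real routes:

```python
# Hypothetical base URL and endpoint paths -- consult the server's
# generated OpenAPI docs (e.g. http://host-ip:8081/docs) for the real ones.
BASE_URL = "http://localhost:8081"

def upload_request(file_path):
    """Describe the knowledge-base upload call (a multipart file POST)."""
    return {"method": "POST", "url": f"{BASE_URL}/uploadDocument",
            "files": {"file": file_path}}

def generate_request(question, use_knowledge_base):
    """Describe the question call, toggling retrieval on or off."""
    return {"method": "POST", "url": f"{BASE_URL}/generate",
            "json": {"question": question,
                     "use_knowledge_base": use_knowledge_base}}

# Ask the same question without and then with the knowledge base to
# compare a plain LLM answer against a retrieval-augmented one.
baseline = generate_request("What is Triton?", use_knowledge_base=False)
augmented = generate_request("What is Triton?", use_knowledge_base=True)
```

Comparing the two responses is the quickest way to see what retrieval adds.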
- NVIDIA AI Endpoint Integration with LangChain: This notebook demonstrates how to build a Retrieval Augmented Generation (RAG) example using the NVIDIA AI endpoint integrated with LangChain, with FAISS as the vector store.
- RAG with LangChain and a local LLM model from Hugging Face Hub: This notebook demonstrates how to plug in a local LLM from the Hugging Face Hub and build a simple RAG app using LangChain.
- NVIDIA AI Endpoint with LlamaIndex and LangChain: This notebook demonstrates how to plug in the NVIDIA AI Endpoint model mixtral_8x7b and the nvolveqa_40k embedding, and bind them into LlamaIndex with the necessary customizations.
- Locally deployed model from Hugging Face integrated with LlamaIndex and LangChain: This notebook demonstrates how to plug in the local LLM Llama-2-13b-chat-hf and the all-MiniLM-L6-v2 embedding from the Hugging Face Hub, and bind them into LlamaIndex with the necessary customizations.
- LangChain agent with tools plugging in multiple models from NVIDIA AI Endpoints: This notebook demonstrates how to use multiple NVIDIA AI endpoint models such as mixtral_8x7b, Deplot, and Neva.
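The agent-with-tools pattern behind that last notebook can be sketched without any framework: a router picks a tool for each query and returns its result. The `calculator` and `lookup` tools and the rule-based routing below are stand-ins; in the notebook, the NVIDIA AI endpoint models drive tool selection through LangChain instead:

```python
import re

# Stand-in tools; the notebook binds real NVIDIA AI endpoint models
# (mixtral_8x7b for text, Deplot for charts, Neva for images) instead.
def calculator(expression):
    # Restricted eval over arithmetic only (input is pre-screened below).
    return str(eval(expression, {"__builtins__": {}}, {}))

def lookup(term):
    facts = {"FAISS": "a vector similarity search library"}
    return facts.get(term, "unknown")

TOOLS = {"calculator": calculator, "lookup": lookup}

def run_agent(query):
    """Minimal rule-based router: pick a tool from the query text.
    A real LangChain agent lets the LLM choose the tool instead."""
    if re.fullmatch(r"[\d\s+*/().-]+", query):
        return TOOLS["calculator"](query)
    return TOOLS["lookup"](query.strip("?").split()[-1])

print(run_agent("2 + 3 * 4"))        # routed to the calculator tool
print(run_agent("What is FAISS?"))   # routed to the lookup tool
```

The point of an agent framework is replacing the brittle `re.fullmatch` routing with an LLM that decides which tool fits the query.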
If a JupyterLab server needs to be built and started manually for development purposes, run the following commands:
- [Optional] Notebooks 7-9 require GPUs. If you have a GPU and are trying out notebooks 7-9, update the jupyter-server service in the docker-compose.yaml file to use ./notebooks/Dockerfile.gpu_notebook as the Dockerfile:
jupyter-server:
  container_name: notebook-server
  image: notebook-server:latest
  build:
    context: ../../
    dockerfile: ./notebooks/Dockerfile.gpu_notebook
- [Optional] Notebooks 7-9 may need multiple GPUs. Update docker-compose.yaml to use multiple GPU IDs in the device_ids field below, or set count: all
jupyter-server:
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            device_ids: ['0', '1']
            capabilities: [gpu]
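Alternatively, to let the container claim every available GPU, replace the device_ids line with count: all, as mentioned above. A sketch of that variant:

```yaml
jupyter-server:
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]
```

Use explicit device_ids when other services on the host need to keep their own GPUs.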
- Build the container
source deploy/compose/compose.env
docker compose -f deploy/compose/docker-compose.yaml build jupyter-server
- Run the container, which starts the notebook server
source deploy/compose/compose.env
docker compose -f deploy/compose/docker-compose.yaml up jupyter-server
- Using a web browser, access the notebooks at the following URL:
http://host-ip:8888