# 🧪 Tutorial: Urban Mapper + AutoDDG + Jupyter Pipeline

This tutorial shows how to **stack three MCPs**:

- **Urban Mapper** (urban computing analysis using the official Urban Mapper library)
- **AutoDDG** (dataset description & context generation via Large Language Models)
- **Jupyter** (reproducible notebook analysis)

About Urban Mapper:

* Urban Mapper official repository: https://github.com/VIDA-NYU/UrbanMapper
* Urban Mapper documentation: https://urbanmapper.readthedocs.io/en/latest/
* Urban Mapper MCP: https://github.com/MCP-Pipeline/mcpstack-urbanmapper

About AutoDDG:

* AutoDDG official repository: https://github.com/VIDA-NYU/AutoDDG
* AutoDDG MCP: https://github.com/MCP-Pipeline/mcpstack-autoddg

You’ll learn how to:

1. Build a pipeline with the three tools
2. Ask in natural language for a reproducible urban analysis workflow built with Urban Mapper, letting the LLM first explore the dataset with AutoDDG to gather context and better inform the analysis
3. Export code and results into a Jupyter notebook for reproducible Python analysis

---

## 🎥 Video Walkthrough

<iframe width="860" height="515" src="https://www.youtube.com/embed/-kgIi7GzN6o?si=aWRBoNgeqIzJjew_" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

---

## Prerequisites

```bash
# If you prefer another Python package manager, adapt these commands accordingly (e.g. `pip install <package>`).

uv init --python 3.10
uv add mcpstack
uv add mcpstack-jupyter
uv add mcpstack-urbanmapper
uv add mcpstack-autoddg

# Verify that all tools are registered
uv run mcpstack list-tools
```

## 🔧 Step 1 — Build The Pipeline With Urban Mapper Defaults

By default, Urban Mapper loads datasets from Hugging Face for its urban pipeline analysis.
You can configure this differently, but starting with the default is preferable.

```bash
uv run mcpstack pipeline urbanmapper --new-pipeline my_pipeline.json
```

## 🔧 Step 2 — Create a Jupyter `ToolConfig`

The Jupyter MCP connects the LLM to a running Jupyter instance via a
URL and a token. Hence, the need for a `ToolConfig`.

```bash
uv run mcpstack tools jupyter configure \
    --token YOUR_JUPYTER_TOKEN

# This creates a `jupyter_config.json` file
# Example token: 1117bf468693444a5608e882ab3b55d511f354a175a0df02
```
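
If you do not already have a token handy, you can mint one yourself before starting Jupyter. A minimal sketch, assuming you want a hex token of the same shape as the example above (Jupyter itself generates similar tokens at startup):

```python
import secrets

# Generate a 48-character hex token, matching the shape of the example token above
token = secrets.token_hex(24)
print(token)
```

You can then start Jupyter with that token (the exact flag name depends on your Jupyter version).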

## 🔧 Step 3 — Add The Jupyter Tool To The Pipeline

```bash
uv run mcpstack pipeline jupyter --to-pipeline my_pipeline.json --tool-config jupyter_config.json
```

## 🔧 Step 4 — Configure AutoDDG

```bash
uv run mcpstack tools autoddg configure
```

This prompts an interactive shell to set at least one mandatory environment variable: `AUTO_DDG_OPENAI_API_KEY`.
It is required to let the tool use OpenAI large language models. To generate a key,
please refer to: https://platform.openai.com/account/api-keys.
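
You can also set the key for the current shell session before running the configure step (the value below is a placeholder, not a real key):

```shell
# Export the AutoDDG OpenAI key for this shell session (placeholder value)
export AUTO_DDG_OPENAI_API_KEY="sk-your-key-here"
```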

## 🔧 Step 5 — Add The AutoDDG Tool To The Pipeline

```bash
uv run mcpstack pipeline autoddg --to-pipeline my_pipeline.json --tool-config autoddg_config.json
```

## 🔧 Step 6 — Compose & Run The Pipeline On Claude Desktop

```bash
uv run mcpstack build --pipeline my_pipeline.json --config-type claude
```

Now you can ask the LLM to explore a dataset, run an Urban Mapper pipeline analysis, and export results into Jupyter.
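
For reference, Claude Desktop discovers MCP servers through its `claude_desktop_config.json`, whose entries follow the standard `mcpServers` shape. The command and arguments below are illustrative assumptions, not the exact output of `mcpstack build`:

```json
{
  "mcpServers": {
    "mcpstack": {
      "command": "uv",
      "args": ["run", "mcpstack", "run", "--pipeline", "my_pipeline.json"]
    }
  }
}
```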

## 📣 Prompts Used During The Demo Video

### Initial Prompt
```text
Hi there! I am interested in a *per-street* `Urban Mapper (UM)` *pipeline analysis* in **Downtown Brooklyn, NYC, USA** using the `oscur/nyc_311` dataset, which, broadly speaking, contains complaints in the streets of New York City. This dataset can be obtained via `UM` actions.

What I am genuinely interested in first is exploring the dataset's context & description, please, using `AutoDDG` actions. Do not add any manual analysis of the sort.

**Shall we? Additionally, do download the full `oscur/nyc_311` data, but stream/sample the dataset up to 30k rows maximum when using `AutoDDG`.**

Note, if you have to download any packages, do so via `!uv add <package_name>`
```
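
The 30k-row cap requested above can be emulated ahead of time with plain pandas. A minimal sketch; the toy frame merely stands in for `oscur/nyc_311`:

```python
import pandas as pd

# Toy stand-in for the oscur/nyc_311 dataset (100 rows)
df = pd.DataFrame({"complaint_type": ["Noise", "Illegal Parking"] * 50})

MAX_ROWS = 30_000
# Cap the frame at 30k rows before handing it to a profiler such as AutoDDG
sample = df.head(MAX_ROWS)
```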

### Follow-up Prompt
````text
So, prior to even building a UM pipeline analysis with nyc_311, can you explore what `UM` is, with examples, and then, based on what you have seen from `UM` and what you have gathered from AutoDDG about the dataset, can you list out here (not in the notebook) the various `enrichers` you believe would be interesting to extract from the dataset and apply to the streets of Downtown Brooklyn, please? Be straightforward-thinking.

Attaching a bit more documentation about enrichers.
````

### Follow-up Prompt – Optional
````text
nunique is not available, and mean on a categorical-based feature will most probably crash.
You have to pass custom lambda functions to deal with all that; very easy, straightforward custom lambda functions.
````
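
The point made in this prompt can be illustrated outside Urban Mapper: when a string-keyed aggregation such as `nunique` or `mean` is unavailable or crashes on categorical data, a custom lambda works everywhere. The street names and data below are purely illustrative:

```python
import pandas as pd

# Toy per-street complaint data
df = pd.DataFrame({
    "street": ["Fulton St", "Fulton St", "Jay St"],
    "complaint_type": ["Noise", "Illegal Parking", "Noise"],
})

# A custom lambda replaces aggregations that may be unsupported on categoricals
n_unique = df.groupby("street")["complaint_type"].agg(lambda s: len(set(s)))
```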

### Follow-up Prompt
````text
Let's build the pipeline with the FULL data and all those very interesting stacked enrichers
(
no need to create the pipeline multiple times; stack the enrichers,
and then, straight after building, call .preview(.), then .composed_transform(.)
(by the way, no need to keep the output of this function for the time being),
and most importantly, follow with nothing else than .visualise([<the names of all of our created output columns>])
).

Shall we :) ? Output everything in the Jupyter notebook.

Recall, for any packages needed: `!uv add <package_name>`.
````

!!! tip
    Try chaining additional tools to build research-ready urban analysis (e.g. ML) workflows.