[](https://discord.gg/pathway)
10
-
[](https://x.com/intent/follow?screen_name=pathway_com)
11
-
</div>
12
5
13
6
Pathway's **AI Pipelines** allow you to quickly put in production AI applications that offer **high-accuracy RAG and AI enterprise search at scale** using the most **up-to-date knowledge** available in your data sources. It provides you ready-to-deploy **LLM (Large Language Model) App Templates**. You can test them on your own machine and deploy on-cloud (GCP, AWS, Azure, Render,...) or on-premises.
14
7
@@ -21,26 +14,26 @@ The application templates provided in this repo scale up to **millions of pages
-|[`Question-Answering RAG App`](templates/question_answering_rag/)| Basic end-to-end RAG app. A question-answering pipeline that uses the GPT model of choice to provide answers to queries to your documents (PDF, DOCX,...) on a live connected data source (files, Google Drive, Sharepoint,...). You can also try out a [demo REST endpoint](https://pathway.com/solutions/rag-pipelines#try-it-out). |
-|[`Live Document Indexing (Vector Store / Retriever)`](templates/document_indexing/)| A real-time document indexing pipeline for RAG that acts as a vector store service. It performs live indexing on your documents (PDF, DOCX,...) from a connected data source (files, Google Drive, Sharepoint,...). It can be used with any frontend, or integrated as a retriever backend for a [Langchain](https://pathway.com/blog/langchain-integration) or [Llamaindex](https://pathway.com/blog/llamaindex-pathway) application. You can also try out a [demo REST endpoint](https://pathway.com/solutions/ai-contract-management#try-it-out). |
-|[`Multimodal RAG pipeline with GPT4o`](templates/multimodal_rag/)| Multimodal RAG using GPT-4o in the parsing stage to index PDFs and other documents from a connected data source files, Google Drive, Sharepoint,...). It is perfect for extracting information from unstructured financial documents in your folders (including charts and tables), updating results as documents change or new ones arrive.|
-|[`Unstructured-to-SQL pipeline + SQL question-answering`](templates/unstructured_to_sql_on_the_fly/)| A RAG example which connects to unstructured financial data sources (financial report PDFs), structures the data into SQL, and loads it into a PostgreSQL table. It also answers natural language user queries to these financial documents by translating them into SQL using an LLM and executing the query on the PostgreSQL table. |
-|[`Adaptive RAG App`](templates/adaptive_rag/)| A RAG application using Adaptive RAG, a technique developed by Pathway to reduce token cost in RAG up to 4x while maintaining accuracy. |
-|[`Private RAG App with Mistral and Ollama`](templates/private_rag/)| A fully private (local) version of the `question_answering_rag` RAG pipeline using Pathway, Mistral, and Ollama. |
-|[`Slides AI Search App`](templates/slides_ai_search/)| An indexing pipeline for retrieving slides. It performs multi-modal of PowerPoint and PDF and maintains live index of your slides."|
+|[`Question-Answering RAG App`]| Basic end-to-end RAG app. A question-answering pipeline that uses the GPT model of choice to provide answers to queries to your documents (PDF, DOCX,...) on a live connected data source (files, Google Drive, Sharepoint,...). You can also try out a [demo REST endpoint]. |
+|[`Live Document Indexing (Vector Store / Retriever)`]| A real-time document indexing pipeline for RAG that acts as a vector store service. It performs live indexing on your documents (PDF, DOCX,...) from a connected data source (files, Google Drive, Sharepoint,...). It can be used with any frontend, or integrated as a retriever backend for a [Langchain] or [Llamaindex] application. You can also try out a [demo REST endpoint]. |
+|[`Multimodal RAG pipeline with GPT4o`]| Multimodal RAG using GPT-4o in the parsing stage to index PDFs and other documents from a connected data source files, Google Drive, Sharepoint,...). It is perfect for extracting information from unstructured financial documents in your folders (including charts and tables), updating results as documents change or new ones arrive.|
+|[`Unstructured-to-SQL pipeline + SQL question-answering`]| A RAG example which connects to unstructured financial data sources (financial report PDFs), structures the data into SQL, and loads it into a PostgreSQL table. It also answers natural language user queries to these financial documents by translating them into SQL using an LLM and executing the query on the PostgreSQL table. |
+|[`Adaptive RAG App`]| A RAG application using Adaptive RAG, a technique developed by Pathway to reduce token cost in RAG up to 4x while maintaining accuracy. |
+|[`Private RAG App with Mistral and Ollama`]| A fully private (local) version of the `question_answering_rag` RAG pipeline using Pathway, Mistral, and Ollama. |
+|[`Slides AI Search App`]| An indexing pipeline for retrieving slides. It performs multi-modal of PowerPoint and PDF and maintains live index of your slides."|


 ## How do these AI Pipelines work?

 The apps can be run as **Docker containers**, and expose an **HTTP API** to connect the frontend. To allow quick testing and demos, some app templates also include an optional Streamlit UI which connects to this API.

-The apps rely on the [Pathway Live Data framework](https://github.com/pathwaycom/pathway) for data source synchronization and for serving API requests (Pathway is a standalone Python library with a Rust engine built into it). They bring you a **simple and unified application logic** for back-end, embedding, retrieval, LLM tech stack. There is no need to integrate and maintain separate modules for your Gen AI app: ~Vector Database (e.g. Pinecone/Weaviate/Qdrant) + Cache (e.g. Redis) + API Framework (e.g. Fast API)~. Pathway's default choice of **built-in vector index** is based on the lightning-fast [usearch](https://github.com/unum-cloud/usearch) library, and **hybrid full-text indexes** make use of [Tantivy](https://github.com/quickwit-oss/tantivy) library. Everything works out of the box.
+The apps rely on the [Pathway Live Data framework] for data source synchronization and for serving API requests (Pathway is a standalone Python library with a Rust engine built into it). They bring you a **simple and unified application logic** for back-end, embedding, retrieval, LLM tech stack. There is no need to integrate and maintain separate modules for your Gen AI app: ~Vector Database (e.g. Pinecone/Weaviate/Qdrant) + Cache (e.g. Redis) + API Framework (e.g. Fast API)~. Pathway's default choice of **built-in vector index** is based on the lightning-fast [usearch] library, and **hybrid full-text indexes** make use of [Tantivy] library. Everything works out of the box.

 ## Getting started

-Each of the [App templates](templates/) in this repo contains a README.md with instructions on how to run it.
+Each of the [App templates] in this repo contains a README.md with instructions on how to run it.

-You can also find [more ready-to-run code templates](https://pathway.com/developers/templates/) on the Pathway website.
+You can also find [more ready-to-run code templates] on the Pathway website.


 ## Some visual highlights
@@ -49,40 +42,27 @@ Effortlessly extract and organize table and chart data from PDFs, docs, and more
 

-(Check out [`Multimodal RAG pipeline with GPT4o`](templates/multimodal_rag/) to see the whole pipeline in the works. You may also check out the [`Unstructured-to-SQL pipeline`](templates/unstructured_to_sql_on_the_fly/) for a minimal example that works with non-multimodal models as well.)
+(Check out [`Multimodal RAG pipeline with GPT4o`] to see the whole pipeline in the works. You may also check out the [`Unstructured-to-SQL pipeline`] for a minimal example that works with non-multimodal models as well.)


 Automated real-time knowledge mining and alerting:

 

-(Check out the [`Alerting when answers change on Google Drive`](https://github.com/pathwaycom/llm-app/tree/main/templates/drive_alert) app example.)
-
-
-### Do-it-Yourself Videos
-
-▶️ [An introduction to building LLM apps with Pathway](https://www.youtube.com/watch?v=kcrJSk00duw) - by [Jan Chorowski](https://scholar.google.com/citations?user=Yc94070AAAAJ)
-
-▶️ [Let's build a real-world LLM app in 11 minutes](https://www.youtube.com/watch?v=k1XGo7ts4tI) - by [Pau Labarta Bajo](https://substack.com/@paulabartabajo)
-
+(Check out the [`Alerting when answers change on Google Drive`] app example.)

 ## Troubleshooting

-To provide feedback or report a bug, please [raise an issue on our issue tracker](https://github.com/pathwaycom/pathway/issues).
+To provide feedback or report a bug, please [raise an issue on our issue tracker].

 ## Contributing

-Anyone who wishes to contribute to this project, whether documentation, features, bug fixes, code cleanup, testing, or code reviews, is very much encouraged to do so. If this is your first contribution to a GitHub project, here is a [Get Started Guide](https://docs.github.com/en/get-started/quickstart/contributing-to-projects).
+Anyone who wishes to contribute to this project, whether documentation, features, bug fixes, code cleanup, testing, or code reviews, is very much encouraged to do so. If this is your first contribution to a GitHub project, here is a [Get Started Guide].

-If you'd like to make a contribution that needs some more work, just raise your hand on the [Pathway Discord server](https://discord.com/invite/pathway) (#get-help) and let us know what you are planning!
+If you'd like to make a contribution that needs some more work, just raise your hand on the [Pathway Discord server] (#get-help) and let us know what you are planning!