Commit e7524dd

Add reverse image search docs (#19)
1 parent 16af8c8 commit e7524dd

4 files changed

Lines changed: 132 additions & 8 deletions

python-sdk/folders.mdx

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ All the core document operations available on the main Morphik client are also a
 - `ingest_file` - Ingest a file into this folder
 - `ingest_files` - Ingest multiple files into this folder
 - `ingest_directory` - Ingest all files from a directory into this folder
-- `retrieve_chunks` - Retrieve chunks matching a query from this folder
+- `retrieve_chunks` - Retrieve chunks matching a query from this folder (supports [reverse image search](/python-sdk/retrieve_chunks#reverse-image-search))
 - `retrieve_docs` - Retrieve documents matching a query from this folder
 - `query` - Generate a completion using context from this folder (supports `llm_config` parameter for custom LLM configuration)
 - `list_documents` - List all documents in this folder

python-sdk/retrieve_chunks.mdx

Lines changed: 64 additions & 3 deletions
@@ -7,43 +7,46 @@ description: "Retrieve relevant chunks from Morphik"
 <Tab title="Sync">
 ```python
 def retrieve_chunks(
-    query: str,
+    query: Optional[str] = None,
     filters: Optional[Dict[str, Any]] = None,
     k: int = 4,
     min_score: float = 0.0,
     use_colpali: bool = True,
     folder_name: Optional[Union[str, List[str]]] = None,
     padding: int = 0,
     output_format: Optional[str] = None,
+    query_image: Optional[str] = None,
 ) -> List[FinalChunkResult]
 ```
 </Tab>
 <Tab title="Async">
 ```python
 async def retrieve_chunks(
-    query: str,
+    query: Optional[str] = None,
     filters: Optional[Dict[str, Any]] = None,
     k: int = 4,
     min_score: float = 0.0,
     use_colpali: bool = True,
     folder_name: Optional[Union[str, List[str]]] = None,
     padding: int = 0,
     output_format: Optional[str] = None,
+    query_image: Optional[str] = None,
 ) -> List[FinalChunkResult]
 ```
 </Tab>
 </Tabs>
 
 ## Parameters
 
-- `query` (str): Search query text
+- `query` (str, optional): Search query text. Mutually exclusive with `query_image`.
 - `filters` (Dict[str, Any], optional): Optional metadata filters
 - `k` (int, optional): Number of results. Defaults to 4.
 - `min_score` (float, optional): Minimum similarity threshold. Defaults to 0.0.
 - `use_colpali` (bool, optional): Whether to use ColPali-style embedding model to retrieve the chunks (only works for documents ingested with `use_colpali=True`). Defaults to True.
 - `folder_name` (str | List[str], optional): Optional folder scope. Accepts a single folder name or a list of folder names.
 - `padding` (int, optional): Number of additional chunks/pages to retrieve before and after matched chunks (ColPali only). Defaults to 0.
 - `output_format` (str, optional): Controls how image chunks are returned. Set to `"url"` to receive presigned URLs; omit or set to `"base64"` (default) to receive base64 content.
+- `query_image` (str, optional): Base64-encoded image for reverse image search. Mutually exclusive with `query`. Requires `use_colpali=True`.
 
 ## Metadata Filters
 
@@ -140,3 +143,61 @@ The `FinalChunkResult` objects returned by this method have the following proper
 - The `download_url` field may be populated for image chunks. When using `output_format="url"`, it will typically match `content` for those chunks.
 
 Tip: To download the original raw file for a document, use [`get_document_download_url`](./get_document_download_url).
+
+## Reverse Image Search
+
+You can search using an image instead of text by providing `query_image` with a base64-encoded image. This enables finding visually similar content in your documents.
+
+<Tabs>
+<Tab title="Sync">
+```python
+import base64
+from morphik import Morphik
+
+db = Morphik()
+
+# Load and encode your query image
+with open("query_image.png", "rb") as f:
+    image_b64 = base64.b64encode(f.read()).decode("utf-8")
+
+# Search using the image
+chunks = db.retrieve_chunks(
+    query_image=image_b64,
+    use_colpali=True, # Required for image queries
+    k=5,
+)
+
+for chunk in chunks:
+    print(f"Score: {chunk.score}")
+    print(f"Document ID: {chunk.document_id}")
+    print("---")
+```
+</Tab>
+<Tab title="Async">
+```python
+import base64
+from morphik import AsyncMorphik
+
+async with AsyncMorphik() as db:
+    # Load and encode your query image
+    with open("query_image.png", "rb") as f:
+        image_b64 = base64.b64encode(f.read()).decode("utf-8")
+
+    # Search using the image
+    chunks = await db.retrieve_chunks(
+        query_image=image_b64,
+        use_colpali=True, # Required for image queries
+        k=5,
+    )
+
+    for chunk in chunks:
+        print(f"Score: {chunk.score}")
+        print(f"Document ID: {chunk.document_id}")
+        print("---")
+```
+</Tab>
+</Tabs>
+
+<Note>
+Reverse image search requires documents to be ingested with `use_colpali=True`. You must provide either `query` or `query_image`, but not both.
+</Note>
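
Review note: the rule in the added `<Note>` (exactly one of `query` or `query_image`, with `use_colpali=True` for images) can be checked on the caller's side before any request is sent. The sketch below is only an illustration of that rule; `build_retrieve_kwargs` and its `image_bytes` parameter are hypothetical names that are not part of the Morphik SDK, and the SDK or server may enforce the rule differently.

```python
import base64
from typing import Any, Dict, Optional


def build_retrieve_kwargs(
    query: Optional[str] = None,
    image_bytes: Optional[bytes] = None,
    k: int = 4,
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for retrieve_chunks.

    Mirrors the documented rule that `query` and `query_image` are
    mutually exclusive and that image queries need use_colpali=True.
    """
    # Reject both "neither given" and "both given".
    if (query is None) == (image_bytes is None):
        raise ValueError("Provide exactly one of query or image_bytes")

    if query is not None:
        return {"query": query, "k": k}

    # Image path: base64-encode the raw bytes and set use_colpali explicitly.
    return {
        "query_image": base64.b64encode(image_bytes).decode("utf-8"),
        "use_colpali": True,
        "k": k,
    }
```

Assuming a client `db` as in the examples in this diff, the result could be passed as `db.retrieve_chunks(**build_retrieve_kwargs(image_bytes=raw, k=5))`.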

python-sdk/retrieve_chunks_grouped.mdx

Lines changed: 66 additions & 3 deletions
@@ -7,7 +7,7 @@ description: "Retrieve relevant chunks with grouping for UI display"
 <Tab title="Sync">
 ```python
 def retrieve_chunks_grouped(
-    query: str,
+    query: Optional[str] = None,
     filters: Optional[Dict[str, Any]] = None,
     k: int = 4,
     min_score: float = 0.0,
@@ -20,13 +20,14 @@ description: "Retrieve relevant chunks with grouping for UI display"
     graph_name: Optional[str] = None,
     hop_depth: int = 1,
     include_paths: bool = False,
+    query_image: Optional[str] = None,
 ) -> GroupedChunkResponse
 ```
 </Tab>
 <Tab title="Async">
 ```python
 async def retrieve_chunks_grouped(
-    query: str,
+    query: Optional[str] = None,
     filters: Optional[Dict[str, Any]] = None,
     k: int = 4,
     min_score: float = 0.0,
@@ -39,14 +40,15 @@ description: "Retrieve relevant chunks with grouping for UI display"
     graph_name: Optional[str] = None,
     hop_depth: int = 1,
     include_paths: bool = False,
+    query_image: Optional[str] = None,
 ) -> GroupedChunkResponse
 ```
 </Tab>
 </Tabs>
 
 ## Parameters
 
-- `query` (str): Search query text
+- `query` (str, optional): Search query text. Mutually exclusive with `query_image`.
 - `filters` (Dict[str, Any], optional): Optional metadata filters
 - `k` (int, optional): Number of results. Defaults to 4.
 - `min_score` (float, optional): Minimum similarity threshold. Defaults to 0.0.
@@ -59,6 +61,7 @@ description: "Retrieve relevant chunks with grouping for UI display"
 - `graph_name` (str, optional): Name of the graph to use for knowledge graph-enhanced retrieval
 - `hop_depth` (int, optional): Number of relationship hops to traverse in the graph. Defaults to 1.
 - `include_paths` (bool, optional): Whether to include relationship paths in the response. Defaults to False.
+- `query_image` (str, optional): Base64-encoded image for reverse image search. Mutually exclusive with `query`. Requires `use_colpali=True`.
 
 ## Returns
 
@@ -182,3 +185,63 @@ Each `ChunkGroup` in `groups` has:
 - The `groups` list organizes results with their padding context, ideal for building search result UIs.
 - When `padding` is specified, surrounding chunks are included in `padding_chunks` for each group.
 - Knowledge graph parameters (`graph_name`, `hop_depth`, `include_paths`) enable graph-enhanced retrieval.
+
+## Reverse Image Search
+
+You can search using an image instead of text by providing `query_image` with a base64-encoded image:
+
+<Tabs>
+<Tab title="Sync">
+```python
+import base64
+from morphik import Morphik
+
+db = Morphik()
+
+# Load and encode your query image
+with open("query_image.png", "rb") as f:
+    image_b64 = base64.b64encode(f.read()).decode("utf-8")
+
+# Search using the image with grouped results
+response = db.retrieve_chunks_grouped(
+    query_image=image_b64,
+    use_colpali=True, # Required for image queries
+    k=5,
+    padding=1,
+)
+
+for group in response.groups:
+    print(f"Main chunk score: {group.main_chunk.score}")
+    print(f"Document: {group.main_chunk.document_id}")
+    print("---")
+```
+</Tab>
+<Tab title="Async">
+```python
+import base64
+from morphik import AsyncMorphik
+
+async with AsyncMorphik() as db:
+    # Load and encode your query image
+    with open("query_image.png", "rb") as f:
+        image_b64 = base64.b64encode(f.read()).decode("utf-8")
+
+    # Search using the image with grouped results
+    response = await db.retrieve_chunks_grouped(
+        query_image=image_b64,
+        use_colpali=True, # Required for image queries
+        k=5,
+        padding=1,
+    )
+
+    for group in response.groups:
+        print(f"Main chunk score: {group.main_chunk.score}")
+        print(f"Document: {group.main_chunk.document_id}")
+        print("---")
+```
+</Tab>
+</Tabs>
+
+<Note>
+Reverse image search requires documents to be ingested with `use_colpali=True`. You must provide either `query` or `query_image`, but not both.
+</Note>
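
Review note: the grouped examples in this diff print each group's `main_chunk` in turn. For a results UI it may be handy to collapse groups to one best score per document. The sketch below runs without the SDK by using stand-in classes; `StubChunk`, `StubGroup`, and `best_score_per_document` are hypothetical names, and only the attribute names (`main_chunk`, `padding_chunks`, `score`, `document_id`) are taken from this page.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StubChunk:
    """Stand-in for a chunk result; not the SDK's class."""
    document_id: str
    score: float


@dataclass
class StubGroup:
    """Stand-in for a group with a main chunk and its padding context."""
    main_chunk: StubChunk
    padding_chunks: List[StubChunk] = field(default_factory=list)


def best_score_per_document(groups: List[StubGroup]) -> Dict[str, float]:
    """Keep the highest main-chunk score seen for each document."""
    best: Dict[str, float] = {}
    for group in groups:
        doc_id = group.main_chunk.document_id
        score = group.main_chunk.score
        # Padding chunks are context only, so they are ignored here.
        if doc_id not in best or score > best[doc_id]:
            best[doc_id] = score
    return best


groups = [
    StubGroup(StubChunk("doc-a", 0.62)),
    StubGroup(StubChunk("doc-b", 0.91), [StubChunk("doc-b", 0.0)]),
    StubGroup(StubChunk("doc-a", 0.75)),
]
print(best_score_per_document(groups))
```

With a real response, the same function would presumably be called on `response.groups`, provided the returned objects expose those attributes as the examples in this diff suggest.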

python-sdk/users.mdx

Lines changed: 1 addition & 1 deletion
@@ -90,7 +90,7 @@ The UserScope class provides the same document operations as the main Morphik cl
 - `ingest_file` - Ingest a file for this user
 - `ingest_files` - Ingest multiple files for this user
 - `ingest_directory` - Ingest all files from a directory for this user
-- `retrieve_chunks` - Retrieve chunks matching a query from this user's documents
+- `retrieve_chunks` - Retrieve chunks matching a query from this user's documents (supports [reverse image search](/python-sdk/retrieve_chunks#reverse-image-search))
 - `retrieve_docs` - Retrieve documents matching a query from this user's documents
 - `query` - Generate a completion using context from this user's documents (supports `llm_config` parameter for custom LLM configuration)
 - `list_documents` - List all documents owned by this user
