Commit e82a914

Add changes for compatibility with WASM components and collocated UDF servers (#121)
Clean up Python SDK for WASM compatibility and add new Plugin API server for UDFs. This includes various changes such as:

* Lazy load pandas, polars, pyarrow, numpy
* Move other imports such as JWT inside functions that use them
* Add experimental Plugin API server
1 parent 36387aa commit e82a914

37 files changed

Lines changed: 6086 additions & 655 deletions

.github/workflows/fusion-docs.yml

Lines changed: 3 additions & 4 deletions
@@ -1,7 +1,6 @@
 on:
-  push:
-    tags:
-      - 'v*.*.*'
+  release:
+    types: [published]
 
 name: Generate Fusion docs

@@ -36,6 +35,6 @@ jobs:
 
       - name: Upload release asset
         run: |
-          gh release upload ${{ github.ref_name }} fusion-docs.zip
+          gh release upload ${{ github.event.release.tag_name }} fusion-docs.zip
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

.github/workflows/publish.yml

Lines changed: 23 additions & 1 deletion
@@ -7,6 +7,12 @@
 name: Publish packages
 
 on:
+  push:
+    tags:
+      - 'v*-rc*'
+      - 'v*-test*'
+      - 'v*-alpha*'
+      - 'v*-beta*'
   release:
     types: [published]
   workflow_dispatch:
@@ -159,13 +165,15 @@ jobs:
     permissions:
       id-token: write # Required for OIDC trusted publishing
       actions: read # Required for actions/download-artifact
-      contents: read # Required for repository access
+      contents: write # Required for gh release create/upload
 
     environment:
       name: publish
       url: https://pypi.org/p/singlestoredb
 
     steps:
+      - uses: actions/checkout@v3
+
       - name: Download Linux wheels and sdist
         uses: actions/download-artifact@v4
         with:
@@ -184,6 +192,20 @@ jobs:
           name: artifacts-macOS
           path: dist
 
+      - name: Create GitHub Release
+        if: ${{ github.event_name == 'push' }}
+        run: |
+          gh release create ${{ github.ref_name }} dist/* --prerelease --title "${{ github.ref_name }}"
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Upload release assets
+        if: ${{ github.event_name == 'release' }}
+        run: |
+          gh release upload ${{ github.event.release.tag_name }} dist/* --clobber
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+
       - name: Publish to PyPI
         if: ${{ github.event_name == 'release' || github.event.inputs.publish_pypi == 'true' }}
         uses: pypa/gh-action-pypi-publish@release/v1
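
The `push` trigger added to publish.yml fires for tags matching four glob patterns, and those pushes take the `gh release create ... --prerelease` path. As a rough illustration of which tag names would qualify, the sketch below approximates the patterns with Python's `fnmatch`. GitHub's filter syntax is not identical to `fnmatch` (for instance, `*` in a GitHub filter does not match `/`), so this is an approximation only, and `is_prerelease_tag` is a hypothetical helper rather than anything in the repository.

```python
from fnmatch import fnmatchcase

# Patterns copied from the `on.push.tags` list in the workflow diff.
PRERELEASE_PATTERNS = ["v*-rc*", "v*-test*", "v*-alpha*", "v*-beta*"]


def is_prerelease_tag(tag: str) -> bool:
    """Return True if the tag matches any prerelease pattern (case-sensitive)."""
    return any(fnmatchcase(tag, pattern) for pattern in PRERELEASE_PATTERNS)


if __name__ == "__main__":
    for tag in ["v1.2.0-rc1", "v1.2.0", "v2.0.0-beta.3", "1.2.0-rc1"]:
        print(tag, is_prerelease_tag(tag))
```

A plain `v1.2.0` tag matches none of the patterns, so pushing it does not trigger the workflow. It would be published when a GitHub release is published or through `workflow_dispatch`.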

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -83,7 +83,7 @@ dev-docs
 **/.ipynb_checkpoints
 **/.benchmarks
 *.ipynb
-test*.py
+/test*.py
 certs
 **/*.prof
 **/*.pprof

ARCHITECTURE.md

Lines changed: 65 additions & 21 deletions
@@ -99,7 +99,14 @@ singlestoredb/
 │   ├── mmap.py            # Memory-mapped execution
 │   ├── json.py            # JSON serialization
 │   ├── rowdat_1.py        # ROWDAT_1 format
-│   └── arrow.py           # Apache Arrow format
+│   ├── arrow.py           # Apache Arrow format
+│   └── plugin/            # Plugin UDF server (Unix socket)
+│       ├── __main__.py    # CLI entry point
+│       ├── server.py      # Socket server with thread/process pool
+│       ├── connection.py  # Connection handling
+│       ├── control.py     # Control protocol
+│       ├── registry.py    # Function registry & discovery
+│       └── wasm.py        # WASM component interface
 
 ├── ai/                    # AI/ML integration
 │   ├── chat.py            # Chat completion factory
@@ -603,9 +610,9 @@ Located in `singlestoredb/fusion/handlers/`:
 ## External Functions (UDFs)
 
 The functions module (`singlestoredb/functions/`) enables deploying Python functions
-as SingleStore external functions. UDF servers can be deployed as HTTP using
-an ASGI application or a collocated socket server that uses mmap files to
-transfer data.
+as SingleStore external functions. UDF servers can be deployed in three modes: as HTTP using an ASGI application
+(remote), a memory-mapped collocated server (lowest latency), or as a plugin
+server using Unix sockets with a thread/process pool (CLI-driven).
 
 ### Architecture

@@ -629,6 +636,7 @@ transfer data.
 ├─────────────────────────────────────────────────────────────────────┤
 │ asgi.py    │ HTTP server via ASGI (Uvicorn; JSON or ROWDAT_1)       │
 │ mmap.py    │ Memory-mapped shared memory (collocated; ROWDAT_1)     │
+│ plugin/    │ Plugin UDF server (Unix socket + thread/process pool)  │
 │ json.py    │ JSON serialization over HTTP                           │
 │ rowdat_1.py│ ROWDAT_1 binary format                                 │
 │ arrow.py   │ Apache Arrow columnar format                           │
@@ -659,6 +667,33 @@ python -m singlestoredb.functions.ext.asgi \
     my_functions
 ```
 
+### Plugin Server CLI
+
+The plugin server can be launched via the CLI for Unix socket-based UDF serving:
+
+```bash
+# Launch the plugin server
+python -m singlestoredb.functions.ext.plugin \
+    --plugin-name myfuncs \
+    --search-path /home/user/libs \
+    --socket /tmp/my-udf.sock
+
+# Or use the console_scripts entry point
+python-udf-server --plugin-name myfuncs
+```
+
+**CLI Arguments:**
+
+| Argument | Env Variable | Default | Description |
+|----------|-------------|---------|-------------|
+| `--plugin-name` | `PLUGIN_NAME` | (required) | Python module to import |
+| `--search-path` | `PLUGIN_SEARCH_PATH` | `""` | Colon-separated search dirs for the module |
+| `--socket` | `PLUGIN_SOCKET_PATH` | auto-generated | Unix socket path |
+| `--n-workers` | `PLUGIN_N_WORKERS` | `0` (CPU count) | Worker threads/processes |
+| `--max-connections` | `PLUGIN_MAX_CONNECTIONS` | `32` | Socket backlog |
+| `--log-level` | `PLUGIN_LOG_LEVEL` | `info` | Logging level (debug/info/warning/error) |
+| `--process-mode` | `PLUGIN_PROCESS_MODE` | `process` | Concurrency mode: `thread` or `process` |
+
 ### Type Mapping
 
 The `signature.py` module maps Python types to SQL types:
@@ -682,29 +717,30 @@ The `signature.py` module maps Python types to SQL types:
 ┌─────────────────────────────────────────────────────────────────────┐
 │                        SingleStore Database                         │
 └─────────────────────────────────────────────────────────────────────┘
-                        │                    │
-                        │ ASGI/HTTP          │ Memory-mapped
-                        │ (remote)           │ (collocated)
-                        ▼                    ▼
-                 ┌─────────────┐      ┌─────────────┐
-                 │   asgi.py   │      │   mmap.py   │
-                 │   Uvicorn   │      │   Shared    │
-                 │   HTTP/2    │      │   Memory    │
-                 └─────────────┘      └─────────────┘
-                        │                    │
-                        └────────────────────┘
-
-
-                         ┌─────────────────┐
-                         │   Python UDF    │
-                         │    Functions    │
-                         └─────────────────┘
+              │                    │
+              │ ASGI/HTTP          │ Memory-mapped      │ Plugin
+              │ (remote)           │ (collocated)       │ (Unix socket)
+              ▼                    ▼
+       ┌─────────────┐      ┌─────────────┐      ┌─────────────┐
+       │   asgi.py   │      │   mmap.py   │      │   plugin/   │
+       │   Uvicorn   │      │   Shared    │      │  Unix sock  │
+       │   HTTP/2    │      │   Memory    │      │   + pool    │
+       └─────────────┘      └─────────────┘      └─────────────┘
+              │                    │
+              └────────────────────┴────────────────────
+
+
+                          ┌─────────────────┐
+                          │   Python UDF    │
+                          │    Functions    │
+                          └─────────────────┘
 ```
 
 | Mode | File | Use Case |
 |------|------|----------|
 | ASGI | `asgi.py` | Remote execution via HTTP, scalable |
 | Memory-mapped | `mmap.py` | Collocated execution, lowest latency |
+| Plugin | `plugin/` | Unix socket server, thread/process pool, CLI-driven |
 | JSON | `json.py` | Simple serialization, debugging |
 | ROWDAT_1 | `rowdat_1.py` | Binary format, efficient |
 | Arrow | `arrow.py` | Columnar format, analytics |
@@ -1043,6 +1079,13 @@ with free_tier.start() as server:
 | `SINGLESTOREDB_MANAGEMENT_TOKEN` | Management API token | None |
 | `SINGLESTORE_LICENSE` | License key for Docker | None |
 | `USE_DATA_API` | Use HTTP API for tests | 0 |
+| `PLUGIN_NAME` | Plugin server: Python module to import | None |
+| `PLUGIN_SEARCH_PATH` | Plugin server: colon-separated module search dirs | `""` |
+| `PLUGIN_SOCKET_PATH` | Plugin server: Unix socket path | auto-generated |
+| `PLUGIN_N_WORKERS` | Plugin server: worker count (0 = CPU count) | `0` |
+| `PLUGIN_MAX_CONNECTIONS` | Plugin server: socket backlog | `32` |
+| `PLUGIN_LOG_LEVEL` | Plugin server: logging level | `info` |
+| `PLUGIN_PROCESS_MODE` | Plugin server: `thread` or `process` | `process` |
 
 ### B. Cursor Type Matrix

@@ -1096,4 +1139,5 @@ Feature options:
 | Management API | `singlestoredb/management/workspace.py` |
 | Fusion handlers | `singlestoredb/fusion/handler.py` |
 | UDF decorator | `singlestoredb/functions/decorator.py` |
+| Plugin UDF server | `singlestoredb/functions/ext/plugin/server.py` |
 | Test fixtures | `singlestoredb/pytest.py` |
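
The CLI table added to ARCHITECTURE.md pairs each flag with an environment variable and a default. A minimal sketch of how that flag/environment/default layering is commonly built with `argparse` follows. The flag names, variable names, and defaults mirror the table, but `build_parser` and `resolve_workers` are hypothetical helpers and this is not the code in `plugin/__main__.py`.

```python
import argparse
import os
from typing import Mapping, Optional


def build_parser(env: Optional[Mapping[str, str]] = None) -> argparse.ArgumentParser:
    """Build a parser whose defaults come from environment variables."""
    env = os.environ if env is None else env
    name = env.get("PLUGIN_NAME")
    parser = argparse.ArgumentParser(prog="python-udf-server")
    # The flag is only mandatory when the environment does not supply a value.
    parser.add_argument("--plugin-name", default=name, required=name is None)
    parser.add_argument("--search-path", default=env.get("PLUGIN_SEARCH_PATH", ""))
    # None here means "let the server auto-generate a socket path".
    parser.add_argument("--socket", default=env.get("PLUGIN_SOCKET_PATH"))
    parser.add_argument("--n-workers", type=int,
                        default=int(env.get("PLUGIN_N_WORKERS", "0")))
    parser.add_argument("--max-connections", type=int,
                        default=int(env.get("PLUGIN_MAX_CONNECTIONS", "32")))
    parser.add_argument("--log-level", default=env.get("PLUGIN_LOG_LEVEL", "info"),
                        choices=["debug", "info", "warning", "error"])
    parser.add_argument("--process-mode", default=env.get("PLUGIN_PROCESS_MODE", "process"),
                        choices=["thread", "process"])
    return parser


def resolve_workers(n_workers: int) -> int:
    """Interpret 0 as 'use the CPU count', as the table describes."""
    return n_workers if n_workers > 0 else (os.cpu_count() or 1)
```

With this layering an explicit flag overrides the environment, and the environment overrides the built-in default, which is the precedence the table implies.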

CONTRIBUTING.md

Lines changed: 4 additions & 0 deletions
@@ -182,6 +182,10 @@ pytest -v --cov=singlestoredb.connection singlestoredb/tests/test_connection.py
 # Test UDF functionality
 pytest singlestoredb/tests/test_udf.py
 
+# Manual testing of the plugin UDF server
+python -m singlestoredb.functions.ext.plugin \
+    --plugin-name myfuncs --search-path /path/to/modules
+
 # Test against specific server (skips Docker)
 SINGLESTOREDB_URL=admin:pass@localhost:3306 pytest -v singlestoredb/tests

README.md

Lines changed: 59 additions & 1 deletion
@@ -18,7 +18,7 @@ analytics and vector search.
 - **Vector Store**: Pinecone-compatible vector database API for similarity search
   applications with built-in connection pooling
 - **User-Defined Functions**: Deploy Python functions as SingleStore UDFs with
-  automatic type mapping
+  automatic type mapping (HTTP/ASGI or plugin-mode via CLI)
 - **SQLAlchemy Support**: Integrate with SQLAlchemy through the optional
   `sqlalchemy-singlestoredb` adapter
 - **Fusion SQL**: Extend SQL with custom client-side command handlers
@@ -53,6 +53,9 @@ pip install 'singlestoredb[rsa]'
 # Ed25519 key authentication
 pip install 'singlestoredb[ed22519]'
 
+# Pytest plugin with Docker test containers
+pip install 'singlestoredb[pytest,docker]'
+
 # Multiple extras can be combined
 pip install 'singlestoredb[vectorstore,sqlalchemy]'
 ```
@@ -66,6 +69,8 @@ pip install 'singlestoredb[vectorstore,sqlalchemy]'
 | `kerberos` / `gssapi` | Kerberos/GSSAPI authentication support |
 | `rsa` | RSA key exchange for encrypted connections |
 | `ed22519` | Ed25519 key authentication |
+| `docker` | Docker SDK for automated test container management |
+| `pytest` | Pytest plugin with SingleStoreDB Docker test fixtures |
 
 ## Documentation

@@ -250,6 +255,59 @@ conn.execute("""
 See [singlestoredb/fusion/README.md](singlestoredb/fusion/README.md)
 for details on writing custom Fusion SQL handlers.
 
+## Pytest Plugin
+
+The SDK includes a pytest plugin that automatically manages SingleStoreDB Docker
+containers for integration testing. Install with:
+
+```bash
+pip install 'singlestoredb[pytest,docker]'
+```
+
+The plugin provides these fixtures:
+
+- `singlestoredb_test_container` (session-scoped): Starts and stops a
+  SingleStoreDB Dev container, or reuses an existing server if
+  `SINGLESTOREDB_URL` is set
+- `singlestoredb_connection` (session-scoped): Database connection to the
+  test container
+- `singlestoredb_tempdb` (function-scoped): Cursor with a fresh temporary
+  database, dropped after the test
+
+The plugin supports `pytest-xdist` parallel execution with leader/follower
+coordination for container lifecycle.
+
+```python
+def test_query(singlestoredb_tempdb):
+    singlestoredb_tempdb.execute('CREATE TABLE t (id INT)')
+    singlestoredb_tempdb.execute('INSERT INTO t VALUES (1)')
+    singlestoredb_tempdb.execute('SELECT * FROM t')
+    assert singlestoredb_tempdb.fetchone() == (1,)
+```
+
+## Plugin UDF Server
+
+The SDK ships a high-performance plugin-mode UDF server that runs as a
+standalone process, communicating with SingleStoreDB over a Unix socket.
+
+```bash
+# Via the installed CLI entry point
+python-udf-server --plugin-name myfuncs --search-path /path/to/modules
+
+# Or as a Python module
+python -m singlestoredb.functions.ext.plugin --plugin-name myfuncs
+```
+
+| Option | Env Variable | Default | Description |
+|--------|-------------|---------|-------------|
+| `--plugin-name` | `PLUGIN_NAME` | (required) | Python module to import |
+| `--search-path` | `PLUGIN_SEARCH_PATH` | `""` | Colon-separated module search dirs |
+| `--socket` | `PLUGIN_SOCKET_PATH` | auto-generated | Unix socket path |
+| `--n-workers` | `PLUGIN_N_WORKERS` | `0` (CPU count) | Worker threads/processes |
+| `--max-connections` | `PLUGIN_MAX_CONNECTIONS` | `32` | Socket backlog |
+| `--log-level` | `PLUGIN_LOG_LEVEL` | `info` | Logging level |
+| `--process-mode` | `PLUGIN_PROCESS_MODE` | `process` | `thread` or `process` concurrency |
+
 ## Advanced Options
 
 ### SSL/TLS Configuration
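
The README addition describes a standalone server that talks to SingleStoreDB over a Unix socket and dispatches work to a pool. The wire protocol is not part of this diff, so the sketch below only illustrates the general shape: a Unix-domain listener handing accepted connections to a thread pool. The newline-terminated "uppercase" exchange, the function names, and the bounded request count are all made up for illustration, and the code assumes a Unix-like platform.

```python
import os
import socket
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List


def handle(conn: socket.socket) -> None:
    """Toy handler: read one line and reply with it upper-cased."""
    with conn, conn.makefile("rb") as rfile:
        conn.sendall(rfile.readline().upper())


def serve(path: str, n_requests: int, ready: threading.Event, backlog: int = 32) -> None:
    """Accept n_requests connections on a Unix socket, then clean up."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
        srv.bind(path)
        srv.listen(backlog)  # backlog plays the role of --max-connections
        ready.set()
        with ThreadPoolExecutor(max_workers=os.cpu_count() or 1) as pool:
            for _ in range(n_requests):
                conn, _addr = srv.accept()
                pool.submit(handle, conn)
        # Leaving the pool's `with` block waits for in-flight handlers.
    os.unlink(path)


def call(path: str, payload: bytes) -> bytes:
    """Client side: send one line and return the one-line reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as cli:
        cli.connect(path)
        cli.sendall(payload + b"\n")
        with cli.makefile("rb") as rfile:
            return rfile.readline()


def demo() -> List[bytes]:
    """Start the toy server in a thread and make two requests."""
    path = os.path.join(tempfile.mkdtemp(), "udf.sock")
    ready = threading.Event()
    server = threading.Thread(target=serve, args=(path, 2, ready))
    server.start()
    ready.wait(timeout=5)
    replies = [call(path, b"hello"), call(path, b"udf")]
    server.join(timeout=5)
    return replies
```

Swapping `ThreadPoolExecutor` for a process pool would correspond loosely to the `--process-mode process` option. That variant needs a different hand-off strategy because sockets cannot simply be submitted to another process, so it is left out of this sketch.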
