
Commit 4f5ec2c

Update: Updated connection to NB and application description
1 parent a978431 commit 4f5ec2c

15 files changed

Lines changed: 488 additions & 224 deletions

.gitignore

Lines changed: 16 additions & 1 deletion
@@ -3,4 +3,19 @@ values.yaml
 
 __pycache__/
 *.pyc
-*.log
+*.log
+
+### macOS ###
+# General
+.DS_Store
+.AppleDouble
+.LSOverride
+
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/

mlconnector/README.md

Lines changed: 0 additions & 2 deletions
@@ -34,8 +34,6 @@ This is used for internal communication of the various services. You can setup
 - `POSTGRES_DB`: PostgreSQL database name (default, `mlmodel`)
 - `POSTGRES_USER`: PostgreSQL username (default, `postgres`)
 - `POSTGRES_PASSWORD`: PostgreSQL password (default, `strongpassword`)
-- `PGADMIN_DEFAULT_EMAIL`: pgAdmin default login email (default, `user@mail.com`)
-- `PGADMIN_DEFAULT_PASSWORD`: pgAdmin default login password (default, `strongpassword`)
 - `DB_HOST_NAME`: Database host (e.g., `database`; this corresponds to the name of the container)
 - `DB_PORT`: Database port (default: `5432`)
 - `DB_DRIVER`: Database driver string (default, `postgresql+asyncpg`) **NOTE:** Only use an async driver
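For orientation (not part of the commit), a rough sketch of how these variables might be assembled into the async connection URL the services expect; the driver must stay async (e.g. `postgresql+asyncpg`), and the variable names and defaults are the ones listed above:

```python
import os

from dotenv import load_dotenv
from sqlalchemy.ext.asyncio import create_async_engine

load_dotenv()

# Illustrative only: build the async SQLAlchemy URL from the README variables.
DATABASE_URL = (
    f"{os.getenv('DB_DRIVER', 'postgresql+asyncpg')}://"
    f"{os.getenv('POSTGRES_USER', 'postgres')}:{os.getenv('POSTGRES_PASSWORD', 'strongpassword')}"
    f"@{os.getenv('DB_HOST_NAME', 'database')}:{os.getenv('DB_PORT', '5432')}"
    f"/{os.getenv('POSTGRES_DB', 'mlmodel')}"
)

engine = create_async_engine(DATABASE_URL, echo=False)  # requires an async driver
```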

mlconnector/api_full_documentation.md

Lines changed: 58 additions & 15 deletions
@@ -647,11 +647,12 @@ reg = Ridge(alpha=1.0, random_state=0)
 reg.fit(X, y)
 ...
 
-# It is important that all models are saved with a .pkl extension
-# Serialize with pickle to a .pkl file
+# Serialize with pickle to a .pkl file or any other format
 output_path = "diabetes_ridge.pkl"
 with open(output_path, "wb") as f:
     pickle.dump(reg, f)
+# Alternatively, serialize with joblib: joblib.dump(reg, output_path)
+# or load the model inside a custom predict function (see the inference section)
 
 ```
 ## 2. Register ML model with
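To make the joblib alternative mentioned above concrete, here is a minimal sketch (assuming joblib is installed and `reg` is the fitted estimator from the example):

```python
import joblib

# Sketch: serialize the fitted estimator with joblib instead of pickle.
joblib.dump(reg, "diabetes_ridge.joblib")

# Later (for instance inside a custom predict function), load it back:
reg_loaded = joblib.load("diabetes_ridge.joblib")
```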
@@ -719,17 +720,20 @@ The above step should return a model_id that will be used in the next steps. Her
 - Model file (pickled file saved in step one above)
 - Training data. This will be used for explainability and drift detection. (Note: it has to be the exact same data used to train the model, otherwise you will get wrong results)
 - Requirements file that defines the environment the model was trained in.
+- A custom predict function, if you will use a different one (see the inference section).
 
 Upload these one by one using the example below.
 Note: file_kind can be `model`, `data`, `code`, and `env`
+
+
 ```python
 import requests
 
 files = {
     "file": open("model.pkl", "rb"),
     "file_kind": (None, "model")
 }
-resp = requests.post("BASE_URL/model/1234/upload", files=files)
+resp = requests.post("BASE_URL/model/{model_id}/upload", files=files)
 print(resp.json())
 ```
 ## 3. Deployment
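For illustration only, looping over all four artifact kinds could look like the sketch below (file names are hypothetical; `model_id` is the ID returned by the registration step):

```python
import requests

# Sketch: upload each artifact with its corresponding file_kind, one by one.
artifacts = {
    "model": "model.pkl",         # pickled model from step 1
    "data": "training_data.csv",  # the exact data the model was trained on
    "env": "requirements.txt",    # training environment
    "code": "predict.py",         # optional custom predict function
}

for file_kind, path in artifacts.items():
    with open(path, "rb") as fh:
        files = {"file": fh, "file_kind": (None, file_kind)}
        resp = requests.post(f"BASE_URL/model/{model_id}/upload", files=files)
    print(file_kind, resp.status_code)
```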
@@ -775,20 +779,59 @@ curl -X GET "BASE_URL/deployment/get/status/dep-iris-001"
 
 ## 4. Inference Endpoint (including Explainability)
 
-### 4.1 Predict Call
-
-Assuming deployment created with `deployment_id = dep-iris-001`:
+### 4.1 Inference
 
-```bash
-curl -X POST "BASE_URL/deployment/dep-iris-001/predict" \
-  -H "Content-Type: application/json" \
-  -d '{
-        "data": [[5.1, 3.5, 1.4, 0.2]],
-        "explain": true
-      }'
-```
+Once the ML application is ready, the response will contain the inference endpoint.
 
-**Response:**
+```python
+import requests
+
+url = "BASE_URL/prediction"
+headers = {
+    "accept": "application/json",
+    "Content-Type": "application/json",
+}
+payload = {
+    "data": [{…}],
+    "is_fun": False,
+    "explanation": False
+}
+resp = requests.post(url, json=payload, headers=headers)
+```
+- `data` is a list of dictionaries in the format `feature: value`
+- `is_fun`: if set to `True`, the inference application will use a custom predict function. This has to be specified by the application owner. See the example below.
+
+<table style="width:100%; border-collapse:collapse; font-size:12px;">
+<tr>
+<th style="text-align:left; border:1px solid #e0e0e0; padding:6px;">scikit-learn</th>
+<th style="text-align:left; border:1px solid #e0e0e0; padding:6px;">pytorch</th>
+</tr>
+<tr>
+<td style="vertical-align:top; border:1px solid #e0e0e0; padding:6px;">
+<pre><code>import joblib
+
+def predict(path, df):
+    &quot;&quot;&quot;Minimal sklearn: load bundle &amp; predict.&quot;&quot;&quot;
+    b = joblib.load(path)  # {&#x27;pipeline&#x27;: fitted_estimator, ...}
+    return b[&quot;pipeline&quot;].predict(df).tolist()
+</code></pre>
+</td>
+<td style="vertical-align:top; border:1px solid #e0e0e0; padding:6px;">
+<pre><code>import torch, numpy as np
+
+def predict(path, df, feats=None, mean=None, scale=None):
+    &quot;&quot;&quot;Minimal PyTorch (TorchScript).&quot;&quot;&quot;
+    m = torch.jit.load(path, map_location=&quot;cpu&quot;).eval()  # one-file scripted model
+    X = df[feats].to_numpy(np.float32) if feats else df.to_numpy(np.float32)
+    if mean is not None and scale is not None:  # optional scaling
+        X = (X - np.asarray(mean, np.float32)) / np.asarray(scale, np.float32)
+    with torch.no_grad():
+        y = m(torch.from_numpy(X)).argmax(1).cpu().numpy()
+    return y.tolist()
+</code></pre>
+</td>
+</tr>
+</table>
+
+- `explanation`: if set to `True`, the response includes explanations.
+
+**Example response:**
 ```json
 {
   "prediction": [0],

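As a purely illustrative request against the new `/prediction` endpoint described above (feature names are hypothetical, iris-style; `BASE_URL` is a placeholder as in the other examples):

```python
import requests

# Sketch: one record as a feature:value dictionary, plain predict call.
payload = {
    "data": [
        {"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}
    ],
    "is_fun": False,       # True only if a custom predict function was uploaded
    "explanation": False,  # True to include explanations in the response
}
resp = requests.post("BASE_URL/prediction", json=payload,
                     headers={"accept": "application/json", "Content-Type": "application/json"})
print(resp.json().get("prediction"))
```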
mlconnector/db/Dockerfile

Lines changed: 9 additions & 5 deletions
@@ -1,5 +1,9 @@
-FROM harbor.nbfc.io/proxy_cache/library/postgres
-USER root
-RUN export LANGUAGE=en_US.UTF-8
-COPY configs/init-my-db.sh /docker-entrypoint-initdb.d/init-user-db.sh
-# COPY configs/drift_metrics_mmd.csv /docker-entrypoint-initdb.d/drift_metrics_mmd.csv
+FROM postgres:16-bookworm
+
+ENV LANG=en_US.UTF-8 LANGUAGE=en_US:en LC_ALL=en_US.UTF-8
+
+# Copy init assets. Number prefix enforces order if you have multiple files.
+COPY configs/init-my-db.sh /docker-entrypoint-initdb.d/10-init-my-db.sh
+
+# Normalize Windows line endings just in case (no harm if already LF)
+RUN sed -i 's/\r$//' /docker-entrypoint-initdb.d/10-init-my-db.sh

mlconnector/db/configs/data.py

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+import pandas as pd
+from sqlalchemy import text
+from dotenv import load_dotenv
+import os
+
+# Load environment variables
+load_dotenv(override=True)
+
+# Database config
+db_config = {
+    "DB_DRIVER": "postgresql+psycopg2",  # e.g. postgresql+asyncpg
+    "DB_USER": os.getenv("POSTGRES_USER"),
+    "DB_PASSWORD": os.getenv("POSTGRES_PASSWORD"),
+    "DB_HOST": "localhost",
+    "DB_PORT": os.getenv("DB_PORT"),
+    "DB_NAME": os.getenv("POSTGRES_DB")
+}
+
+# Build connection string
+DATABASE_URL = (
+    f"{db_config['DB_DRIVER']}://{db_config['DB_USER']}:{db_config['DB_PASSWORD']}"
+    f"@{db_config['DB_HOST']}:{db_config['DB_PORT']}/{db_config['DB_NAME']}"
+)
+print(f"Connecting to database at {DATABASE_URL}")
+"""# Create async engine and session
+engine = create_async_engine(DATABASE_URL, echo=False)
+AsyncSessionLocal = sessionmaker(bind=engine, expire_on_commit=False, class_=AsyncSession)
+
+# Main async logic
+async def insert_drift_metrics():
+    df = pd.read_csv("drift_metrics_mmd.csv")
+
+    # Add required fields
+    df["rowid"] = [str(uuid.uuid4()) for _ in range(len(df))]
+    df["timestamp"] = datetime.utcnow()
+
+    async with AsyncSessionLocal() as session:
+        for _, row in df.iterrows():
+            await session.execute(text(""
+                INSERT INTO drift_metrics (
+                    rowid, feature, type, statistic, p_value,
+                    method, drift_detected, timestamp, modelid
+                ) VALUES (
+                    :rowid, :feature, :type, :statistic, :p_value,
+                    :method, :drift_detected, :timestamp, :modelid
+                )
+            ""), {
+                "rowid": row["rowid"],
+                "feature": row["feature"],
+                "type": row["type"],
+                "statistic": float(row["statistic"]),
+                "p_value": float(row["p_value"]),
+                "method": row["method"],
+                "drift_detected": str(row["drift_detected"]),
+                "timestamp": row["timestamp"],
+                "modelid": row["modelid"]
+            })
+        await session.commit()
+
+# Entry point
+if __name__ == "__main__":
+    asyncio.run(insert_drift_metrics())
+"""

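Not part of the commit, but a quick way to sanity-check the synchronous `DATABASE_URL` assembled in `data.py` above (assuming SQLAlchemy and psycopg2 are installed and the database is reachable on `localhost`):

```python
from sqlalchemy import create_engine, text

# Sketch: open a connection with the URL built in data.py and run a trivial query.
engine = create_engine(DATABASE_URL)
with engine.connect() as conn:
    print(conn.execute(text("SELECT 1")).scalar())  # prints 1 if the connection works
```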