
Commit f138227

Merge branch 'develop' into feat/inference-cohort-selection
2 parents 60f46b2 + d688be3 commit f138227

30 files changed

Lines changed: 3235 additions & 735 deletions

.devcontainer/devcontainer.json

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -15,11 +15,11 @@
1515
"vscode": {
1616
"extensions": [
1717
"hashicorp.terraform",
18-
"ms-python.black-formatter"
18+
"charliermarsh.ruff"
1919
],
2020
"settings": {
2121
"[python]": {
22-
"editor.defaultFormatter": "ms-python.black-formatter",
22+
"editor.defaultFormatter": "charliermarsh.ruff",
2323
"editor.formatOnSave": true
2424
}
2525
}

.vscode/launch.json

Lines changed: 15 additions & 0 deletions
@@ -29,6 +29,21 @@
2929
"env": {
3030
"ENV_FILE_PATH": "${workspaceFolder}/src/worker/.env"
3131
}
32+
},
33+
{
34+
"name": "pytest (current file)",
35+
"type": "debugpy",
36+
"request": "launch",
37+
"module": "pytest",
38+
"args": [
39+
"${file}",
40+
"-v",
41+
"-s"
42+
],
43+
"cwd": "${workspaceFolder}",
44+
"env": {
45+
"ENV_FILE_PATH": "${workspaceFolder}/src/webapp/.env"
46+
}
3247
}
3348
],
3449
}

CONTRIBUTING.md

Lines changed: 4 additions & 4 deletions
@@ -10,8 +10,8 @@ To get an overview of the project, please read the [README](README.md) and our [
1010
## Getting started
1111
### Creating Issues
1212

13-
If you spot a problem, [search if an issue already exists](https://github.com/datakind/sst-app-api/issues). If a related issue doesn't exist,
14-
you can open a new issue using a relevant [issue form](https://github.com/datakind/sst-app-api/issues/new).
13+
If you spot a problem, [search if an issue already exists](https://github.com/datakind/edvise-api/issues). If a related issue doesn't exist,
14+
you can open a new issue using a relevant [issue form](https://github.com/datakind/edvise-api/issues/new).
1515

1616
As a general rule, we don’t assign issues to anyone. If you find an issue to work on, you are welcome to open a PR with a fix.
1717

@@ -28,7 +28,7 @@ poetry install --no-interaction
2828
As many other open source projects, we use the famous [gitflow](https://nvie.com/posts/a-successful-git-branching-model/) to manage our branches.
2929

3030
Summary of our git branching model:
31-
- Get all the latest work from the upstream `datakind/sst-app-api` repository
31+
- Get all the latest work from the upstream `datakind/edvise-api` repository
3232
(`git checkout main`)
3333
- Create a new branch off with a descriptive name (for example:
3434
`feature/new-test-macro`, `bugfix/bug-when-uploading-results`). You can
@@ -107,7 +107,7 @@ You can type `pytest` to run your tests, no matter which type of test it is.
107107

108108
## Continuous Integration
109109

110-
We use [GitHub Actions](https://github.com/datakind/sst-app-api/actions)
110+
We use [GitHub Actions](https://github.com/datakind/edvise-api/actions)
111111
for continuous integration.
112112
See [here](https://docs.github.com/en/actions) for GitHub's documentation.
113113

README.md

Lines changed: 17 additions & 6 deletions
@@ -2,14 +2,25 @@
22

33
This repo contains:
44

5-
* [src/webapp/](https://github.com/datakind/sst-app-api/tree/develop/src/webapp): The source code for the SST API (which is called by the SST frontend and by any direct API callers)
6-
* [src/worker/](https://github.com/datakind/sst-app-api/tree/develop/src/worker): The source code for the SFTP Worker (which calls the SST API)
5+
* [src/webapp/](https://github.com/datakind/edvise-api/tree/develop/src/webapp): The source code for the SST API (which is called by the SST frontend and by any direct API callers)
6+
* [src/worker/](https://github.com/datakind/edvise-api/tree/develop/src/worker): The source code for the SFTP Worker (which calls the SST API)
77
* [terraform/]
8-
(https://github.com/datakind/sst-app-api/tree/develop/terraform): The Terraform configuration for the SST API/Frontend and other GCP resources including Cloud SQL setup, networking setup, secrets setup
8+
(https://github.com/datakind/edvise-api/tree/develop/terraform): The Terraform configuration for the SST API/Frontend and other GCP resources including Cloud SQL setup, networking setup, secrets setup
99
* .devcontainer/ and .vscode/: which allow easy setup if you are using VSCode as your IDE.
10-
* [devtools/](https://github.com/datakind/sst-app-api/tree/develop/devtools): is a place to put utility scripts
11-
* .github/: contains mostly copied over files when this directory was forked from the student-success-tool repo, so likely much of it is outdated. The only Github action we've added is the [webapp-and-worker-precommit](https://github.com/datakind/sst-app-api/blob/develop/.github/workflows/webapp-and-worker-precommit.yml) which is run on every push to develop. This action contains a python linter (we use [black](https://black.readthedocs.io/en/stable/)), and automated runs of the unit tests in the src/webapp/ and src/worker/ directories.
12-
* Additionally, [pyproject.toml](https://github.com/datakind/sst-app-api/blob/develop/pyproject.toml) and [uv.lock](https://github.com/datakind/sst-app-api/blob/develop/uv.lock) are important for dependency management. At time of writing, the worker is just skeleton code so there's no separate dependency management. In the long-term consider separating out the dependency management for the two programs.
10+
* [devtools/](https://github.com/datakind/edvise-api/tree/develop/devtools): is a place to put utility scripts
11+
* .github/: contains mostly copied over files when this directory was forked from the student-success-tool repo, so likely much of it is outdated. The only Github action we've added is the [webapp-and-worker-precommit](https://github.com/datakind/edvise-api/blob/develop/.github/workflows/webapp-and-worker-precommit.yml) which is run on every push to develop. This action contains a python linter (we use [black](https://black.readthedocs.io/en/stable/)), and automated runs of the unit tests in the src/webapp/ and src/worker/ directories.
12+
* Additionally, [pyproject.toml](https://github.com/datakind/edvise-api/blob/develop/pyproject.toml) and [uv.lock](https://github.com/datakind/edvise-api/blob/develop/uv.lock) are important for dependency management. At time of writing, the worker is just skeleton code so there's no separate dependency management. In the long-term consider separating out the dependency management for the two programs.
1313

1414

1515
NOTE: this repo was forked from the https://github.com/datakind/student-success-tool repo, which means some of the static files (e.g. CONTRIBUTING.md) may be outdated or may include irrelevant information from that repo. Please update those as you see fit. For information about the specific items listed above, defer to the specific readmes in the relevant directory.
16+
17+
## Local edvise development override
18+
19+
Production uses a pinned Git reference for `edvise`. For local development, use an
20+
editable install after syncing the environment.
21+
22+
1. Clone `edvise` alongside `edvise-api` (so `../edvise` exists).
23+
2. Run `uv sync`.
24+
3. Override locally: `uv pip install -e ../edvise`
25+
26+
To revert back to the pinned Git dependency, run `uv sync --reinstall-package edvise`.
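
The override steps above leave it easy to forget which `edvise` is active in the environment. The following is a minimal sketch, not part of this commit, that classifies an installed distribution from its PEP 610 `direct_url.json` metadata. The helper names `classify_direct_url` and `install_origin` are hypothetical, and the sketch assumes the installer records `direct_url.json` for editable and Git installs.

```python
import json
from importlib import metadata
from typing import Optional


def classify_direct_url(raw: Optional[str]) -> str:
    """Classify an install from the text of direct_url.json (PEP 610)."""
    if raw is None:
        # No direct URL recorded, e.g. installed from a package index.
        return "index"
    info = json.loads(raw)
    if info.get("dir_info", {}).get("editable"):
        # Local directory installed in editable mode (pip install -e).
        return "editable"
    if "vcs_info" in info:
        # Installed from a version control reference such as a Git rev.
        return "vcs"
    return "direct"


def install_origin(dist_name: str) -> str:
    """Return how dist_name was installed, or 'not installed'."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return "not installed"
    return classify_direct_url(dist.read_text("direct_url.json"))


if __name__ == "__main__":
    # Expected to report "editable" after step 3 and "vcs" after reverting,
    # under the assumption stated above.
    print(install_origin("edvise"))
```

Running this before and after `uv sync --reinstall-package edvise` is one way to confirm that the revert took effect.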

SECURITY.md

Lines changed: 1 addition & 1 deletion
@@ -8,4 +8,4 @@ If we verify a reported security vulnerability, our policy is:
88

99
## Reporting a Security Issue
1010

11-
To report any security issues, please [raise an issue](https://github.com/datakind/sst-app-api/issues/new/choose) and select **Security issue*
11+
To report any security issues, please [raise an issue](https://github.com/datakind/edvise-api/issues/new/choose) and select **Security issue*

cloudbuild-webapp.yaml

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
# Cloud Build config for webapp (dev-webapp trigger).
2+
# _REGION and _ENVIRONMENT are set by the trigger (Terraform).
3+
steps:
4+
- name: gcr.io/cloud-builders/docker
5+
args:
6+
- build
7+
- '-f'
8+
- src/webapp/Dockerfile
9+
- '-t'
10+
- '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/webapp:$COMMIT_SHA'
11+
- '-t'
12+
- '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/webapp:latest'
13+
- .
14+
- name: gcr.io/cloud-builders/docker
15+
args:
16+
- push
17+
- '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/webapp:$COMMIT_SHA'
18+
- name: gcr.io/cloud-builders/docker
19+
args:
20+
- push
21+
- '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/webapp:latest'
22+
- name: gcr.io/cloud-builders/gcloud
23+
args:
24+
- run
25+
- deploy
26+
- '${_ENVIRONMENT}-webapp'
27+
- '--image'
28+
- '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/webapp:$COMMIT_SHA'
29+
- '--region'
30+
- '${_REGION}'
31+
timeout: 600s
32+
options:
33+
logging: CLOUD_LOGGING_ONLY
34+
dynamicSubstitutions: true
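
To see what the image references in this config expand to, the sketch below mimics the substitution with Python's `string.Template`. It is only an illustration: Cloud Build performs its own substitution, and the region, project ID, and function name `render_image_tag` used here are made-up values, not ones taken from the trigger.

```python
from string import Template

# Same pattern as the image reference used in the build, push, and deploy steps.
IMAGE_TAG = Template(
    "${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/webapp:$COMMIT_SHA"
)


def render_image_tag(region: str, project_id: str, commit_sha: str) -> str:
    """Fill in the three substitution variables the config relies on."""
    return IMAGE_TAG.substitute(
        _REGION=region,          # user-defined substitution, set by the trigger
        PROJECT_ID=project_id,   # built-in substitution
        COMMIT_SHA=commit_sha,   # built-in substitution for triggered builds
    )


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(render_image_tag("us-central1", "example-project", "f138227"))
```

The deploy step uses the `$COMMIT_SHA` tag rather than `latest`, so the deployed revision is tied to the commit that triggered the build.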

cloudbuild-worker.yaml

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
# Cloud Build config for worker (dev-worker trigger).
2+
# _REGION and _ENVIRONMENT are set by the trigger (Terraform).
3+
steps:
4+
- name: gcr.io/cloud-builders/docker
5+
args:
6+
- build
7+
- '-f'
8+
- src/worker/Dockerfile
9+
- '-t'
10+
- '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/worker:$COMMIT_SHA'
11+
- '-t'
12+
- '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/worker:latest'
13+
- .
14+
- name: gcr.io/cloud-builders/docker
15+
args:
16+
- push
17+
- '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/worker:$COMMIT_SHA'
18+
- name: gcr.io/cloud-builders/docker
19+
args:
20+
- push
21+
- '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/worker:latest'
22+
- name: gcr.io/cloud-builders/gcloud
23+
args:
24+
- run
25+
- deploy
26+
- '${_ENVIRONMENT}-worker'
27+
- '--image'
28+
- '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/worker:$COMMIT_SHA'
29+
- '--region'
30+
- '${_REGION}'
31+
timeout: 600s
32+
options:
33+
logging: CLOUD_LOGGING_ONLY
34+
dynamicSubstitutions: true

pyproject.toml

Lines changed: 12 additions & 3 deletions
@@ -8,7 +8,7 @@ dependencies = [
88
"databricks-sdk~=0.38.0",
99
"pydantic~=2.10",
1010
"fastapi[standard]~=0.115.4",
11-
"google-cloud-storage~=2.18.2",
11+
"google-cloud-storage==2.19.0",
1212
"paramiko~=3.5.0",
1313
"cloud-sql-python-connector[pymysql]~=1.14.0",
1414
"sqlalchemy~=2.0.36",
@@ -28,13 +28,14 @@ dependencies = [
2828
"thefuzz[speedup]~=0.22.1",
2929
"databricks-sql-connector~=3.5.0",
3030
"pandera~=0.13",
31-
"mlflow~=2.15.0",
31+
"mlflow~=2.22",
3232
"cachetools",
3333
"types-cachetools",
34+
"edvise",
3435
]
3536

3637
[project.urls]
37-
Repository = "https://github.com/datakind/sst-app-api"
38+
Repository = "https://github.com/datakind/edvise-api"
3839

3940
[dependency-groups]
4041
dev = [
@@ -52,9 +53,17 @@ dev = [
5253
requires = ["hatchling"]
5354
build-backend = "hatchling.build"
5455

56+
[tool.hatch.metadata]
57+
allow-direct-references = true
58+
5559
[tool.uv]
5660
default-groups = ["dev"]
5761

62+
[tool.uv.sources]
63+
# Install edvise from GitHub (branch, tag, or commit).
64+
# Use rev = "main" for default branch, or rev = "v1.0.0" for a release tag.
65+
edvise = { git = "https://github.com/datakind/edvise.git", rev = "feat/eda_summary_class" }
66+
5867
[tool.ruff]
5968
line-length = 88
6069
indent-width = 4
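
The specifier changes in this file are larger than they look, because `~=` (the PEP 440 compatible-release operator) drops the last component of the version when deciding the allowed prefix. The sketch below is a simplified, standard-library-only illustration for plain numeric versions. It is not how `uv` resolves dependencies, and the function names are hypothetical.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Parse a plain numeric version such as '2.15.0' (no pre/post/dev parts)."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(candidate: str, spec: str) -> bool:
    """Approximate '~=spec': candidate >= spec and same prefix minus last part."""
    spec_t = parse(spec)
    if len(spec_t) < 2:
        raise ValueError("~= needs at least two version components")
    cand_t = parse(candidate)
    # Pad with zeros so that 2.22 and 2.22.0 compare as equal.
    width = max(len(cand_t), len(spec_t))
    cand_p = cand_t + (0,) * (width - len(cand_t))
    spec_p = spec_t + (0,) * (width - len(spec_t))
    prefix = spec_t[:-1]
    return cand_p >= spec_p and cand_p[: len(prefix)] == prefix


if __name__ == "__main__":
    # Old mlflow pin: ~=2.15.0 means >=2.15.0, ==2.15.*
    print(is_compatible("2.15.4", "2.15.0"), is_compatible("2.22.0", "2.15.0"))
    # New mlflow pin: ~=2.22 means >=2.22, ==2.*
    print(is_compatible("2.30.1", "2.22"), is_compatible("3.0.0", "2.22"))
```

Under this rule, `google-cloud-storage~=2.18.2` would not have admitted 2.19.0, which is consistent with the specifier being changed to an exact `==2.19.0` pin.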

src/webapp/README.md

Lines changed: 2 additions & 2 deletions
@@ -51,7 +51,7 @@ In the long-term, look into a way to have the API key --> token conversion be ha
5151

5252
## Databases
5353

54-
All data is stored in MySQL databases for dev/staging/prod, these are databases in GCP's Cloud SQL. In the local environment, the database is sqlite. The main file you'll want to look at for database table definitions is [src/webapp/database.py](https://github.com/datakind/sst-app-api/blob/develop/src/webapp/database.py).
54+
All data is stored in MySQL databases for dev/staging/prod, these are databases in GCP's Cloud SQL. In the local environment, the database is sqlite. The main file you'll want to look at for database table definitions is [src/webapp/database.py](https://github.com/datakind/edvise-api/blob/develop/src/webapp/database.py).
5555

5656
At time of writing, the databases the API cares about and tracks, are as follows:
5757

@@ -112,7 +112,7 @@ Enter into the root directory of the repo.
112112

113113
You're now in your virtual env with all your dependencies added.
114114

115-
For all of the following, the steps above are pre-requisites and you should be in the root folder of `sst-app-api/`.
115+
For all of the following, the steps above are pre-requisites and you should be in the root folder of `edvise-api/`.
116116

117117
### Spin up the app locally:
118118

src/webapp/database.py

Lines changed: 54 additions & 1 deletion
@@ -63,6 +63,9 @@ class Base(DeclarativeBase):
6363
# Test institution - same ID as DEV USC Beaufort for testing
6464
TEST_INST_UUID = uuid.UUID("942d4b0e-12e7-4d2a-9187-9508ae3cef7c")
6565
TEST_BATCH_UUID = uuid.UUID("3182f472-e079-4678-a0a1-9ca5ead6c49a")
66+
# Montana Tech test institution + batch (matches DEV data for local EDA testing)
67+
MONTANA_TECH_INST_UUID = uuid.UUID("1e2c628cabda4088b900cfd9ced44268")
68+
MONTANA_TECH_BATCH_UUID = uuid.UUID("1df31913e1fe4c458f5039527c967b13")
6669

6770

6871
@event.listens_for(Mapper, "before_insert")
@@ -224,6 +227,57 @@ def init_db(env: str) -> None:
224227
session.merge(test_file_student)
225228
session.merge(test_file_course)
226229
session.merge(test_batch)
230+
231+
# Montana Tech - matches DEV data for local EDA testing
232+
session.merge(
233+
InstTable(
234+
id=MONTANA_TECH_INST_UUID,
235+
name="Montana Tech",
236+
state="MT",
237+
schemas=["COURSE", "STUDENT"],
238+
created_at=DATETIME_TESTING,
239+
updated_at=DATETIME_TESTING,
240+
created_by=LOCAL_USER_UUID,
241+
)
242+
)
243+
montana_student_file = FileTable(
244+
id=uuid.UUID("a2b0f7b2-0c57-4aa9-8dd0-8e1a49d1be01"),
245+
inst_id=MONTANA_TECH_INST_UUID,
246+
name="1764013161378_AO1600pdp_AO1600_AR_DEIDENTIFIED_STUDYID_20251028121051.csv",
247+
source="MANUAL_UPLOAD",
248+
uploader=LOCAL_USER_UUID,
249+
sst_generated=False,
250+
valid=True,
251+
schemas=["STUDENT"],
252+
created_at=DATETIME_TESTING,
253+
updated_at=DATETIME_TESTING,
254+
)
255+
montana_course_file = FileTable(
256+
id=uuid.UUID("d8b0f2c3-2d5a-4d7b-8e33-7e4c3b2609d2"),
257+
inst_id=MONTANA_TECH_INST_UUID,
258+
name="1764013161379_AO1600pdp_AO1600_COURSE_LEVEL_AR_DEIDENTIFIED_STUDYID_20251028121051.csv",
259+
source="MANUAL_UPLOAD",
260+
uploader=LOCAL_USER_UUID,
261+
sst_generated=False,
262+
valid=True,
263+
schemas=["COURSE"],
264+
created_at=DATETIME_TESTING,
265+
updated_at=DATETIME_TESTING,
266+
)
267+
montana_batch = BatchTable(
268+
id=MONTANA_TECH_BATCH_UUID,
269+
inst_id=MONTANA_TECH_INST_UUID,
270+
name="Batch_2025-10-28_1764013161000",
271+
completed=True,
272+
created_by=LOCAL_USER_UUID,
273+
created_at=DATETIME_TESTING,
274+
updated_at=DATETIME_TESTING,
275+
)
276+
montana_batch.files.add(montana_student_file)
277+
montana_batch.files.add(montana_course_file)
278+
session.merge(montana_student_file)
279+
session.merge(montana_course_file)
280+
session.merge(montana_batch)
227281
session.commit()
228282
except Exception as e:
229283
session.rollback()
@@ -781,7 +835,6 @@ def init_connection_pool_local() -> sqlalchemy.engine.base.Engine:
781835
"""Creates a local sqlite db for local env testing."""
782836
return sqlalchemy.create_engine(
783837
"sqlite://",
784-
echo=True,
785838
echo_pool="debug",
786839
connect_args={"check_same_thread": False},
787840
poolclass=StaticPool,
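
The new Montana Tech constants are written as 32 hex digits without hyphens, while the existing test UUIDs use the hyphenated form. A short sketch, using only the standard library, shows that `uuid.UUID` accepts both spellings and normalizes them to the same canonical string:

```python
import uuid

# Values copied from the constants added in this diff.
MONTANA_TECH_INST_UUID = uuid.UUID("1e2c628cabda4088b900cfd9ced44268")
MONTANA_TECH_BATCH_UUID = uuid.UUID("1df31913e1fe4c458f5039527c967b13")

# str() always yields the hyphenated 8-4-4-4-12 form.
inst_canonical = str(MONTANA_TECH_INST_UUID)
batch_canonical = str(MONTANA_TECH_BATCH_UUID)

# Parsing the canonical form gives back an equal UUID object.
assert uuid.UUID(inst_canonical) == MONTANA_TECH_INST_UUID
assert uuid.UUID(batch_canonical) == MONTANA_TECH_BATCH_UUID

if __name__ == "__main__":
    print(inst_canonical)
    print(batch_canonical)
```

Either spelling can therefore be used when looking up these rows locally, as long as the value is parsed into a `uuid.UUID` before comparison.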
