Closed
391 commits
5763b79
feat: model version retrieval testing
Mesh-ach Oct 23, 2025
de8b250
feat: model version retrieval testing
Mesh-ach Oct 23, 2025
c0f7250
fix: formatting style
Mesh-ach Oct 23, 2025
5a97168
fix: formatting style
Mesh-ach Oct 23, 2025
2b51021
fix: formatting style
Mesh-ach Oct 23, 2025
7b8a777
fix: formatting style
Mesh-ach Oct 23, 2025
167112a
fix: formatting style
Mesh-ach Oct 23, 2025
bd44916
fix: formatting style
Mesh-ach Oct 23, 2025
3e39347
fix: formatting style
Mesh-ach Oct 23, 2025
87d2854
fix: formatting style
Mesh-ach Oct 23, 2025
d213a8c
fix: formatting style
Mesh-ach Oct 23, 2025
1969f73
fix: linting
Mesh-ach Oct 23, 2025
3d83f7f
Feat: Added backfill endpoint
Mesh-ach Oct 23, 2025
e1a687c
Feat: Added backfill endpoint
Mesh-ach Oct 23, 2025
1fe495d
Feat: Added backfill endpoint
Mesh-ach Oct 23, 2025
d92cea1
Fix: linting
Mesh-ach Oct 23, 2025
a176d62
added func description
Mesh-ach Oct 23, 2025
03f0275
Merge pull request #178 from datakind/BackfillEndpoint
Mesh-ach Oct 23, 2025
bdf1d47
added func description
Mesh-ach Oct 24, 2025
280df44
Merge pull request #179 from datakind/BackfillEndpoint
Mesh-ach Oct 24, 2025
ca3c4e5
added func description
Mesh-ach Oct 24, 2025
a391bcb
added func description
Mesh-ach Oct 24, 2025
36ec01e
added func description
Mesh-ach Oct 24, 2025
903e9d8
added func description
Mesh-ach Oct 24, 2025
d400f26
added func description
Mesh-ach Oct 24, 2025
9dc2513
feat: adjusted run output endpoint to return model_run_id
Mesh-ach Oct 27, 2025
f389b7d
Delete .DS_Store
Mesh-ach Oct 27, 2025
cd8189f
Delete src/.DS_Store
Mesh-ach Oct 27, 2025
20bd5f5
Delete terraform/.DS_Store
Mesh-ach Oct 27, 2025
dbb00ff
Merge pull request #180 from datakind/AdjustModelRunOutput
vishpillai123 Oct 27, 2025
94824d1
feat: added model deletion endpoint
Mesh-ach Nov 4, 2025
bd6cafe
feat: added model deletion endpoint
Mesh-ach Nov 4, 2025
8305181
feat: added model deletion endpoint
Mesh-ach Nov 4, 2025
9b9d8cd
fix: linting
Mesh-ach Nov 4, 2025
9792046
fix: linting
Mesh-ach Nov 4, 2025
5cfad35
fix: linting
Mesh-ach Nov 4, 2025
6d7682e
fix: linting
Mesh-ach Nov 4, 2025
1feabc7
fix: linting
Mesh-ach Nov 4, 2025
d2130b3
fix: linting
Mesh-ach Nov 4, 2025
b0f69a9
fix: linting
Mesh-ach Nov 4, 2025
3e0cb4b
Merge pull request #181 from datakind/ModelDeletionEndpoint
Mesh-ach Nov 5, 2025
5b0d590
fixed model name malformation
Mesh-ach Nov 5, 2025
82a2452
Merge pull request #182 from datakind/ModelDeletionEndpoint
Mesh-ach Nov 5, 2025
314ef2c
fix: removed databricks deletion functionality
Mesh-ach Nov 5, 2025
5647200
Merge pull request #183 from datakind/ModelDeletionEndpoint
Mesh-ach Nov 5, 2025
2319d7a
fix: removed query results not needed
Mesh-ach Nov 5, 2025
a41b71b
fix: removed query results not needed
Mesh-ach Nov 5, 2025
809d1db
fix: added status
Mesh-ach Nov 5, 2025
ae5fe8f
fix: added status
Mesh-ach Nov 5, 2025
b10ed71
fix: formatting fix
Mesh-ach Nov 5, 2025
3f076f3
fix: added query to retrieve model id
Mesh-ach Nov 5, 2025
ef3ee89
fix: added passive delete to db cascade so deleting the model ensures…
Mesh-ach Nov 5, 2025
8e2b09f
fix: removed extra db query for model id, since db now handles passiv…
Mesh-ach Nov 5, 2025
41b4253
fix: formatting fix
Mesh-ach Nov 5, 2025
c0382bb
fix: removed db mapping framework
Mesh-ach Nov 5, 2025
3ed550a
fix: removed db mapping framework
Mesh-ach Nov 5, 2025
edff369
fix: removed db mapping framework
Mesh-ach Nov 5, 2025
bea0425
fix: removed db mapping framework
Mesh-ach Nov 5, 2025
de11e17
feat: changed endpoint parameter name from experiment_run_id -> model…
Mesh-ach Nov 17, 2025
dbb442f
fix: type check errors
Mesh-ach Nov 17, 2025
35510db
Merge pull request #185 from datakind/RenameEndpointParameter
vishpillai123 Nov 17, 2025
c617008
test batch and file data
Nov 24, 2025
1c9ee92
eda endpoints
Nov 24, 2025
dddc0a5
test data
Nov 25, 2025
225427c
eda calculations
Dec 3, 2025
abac377
eda year and term, course enrollments
Dec 3, 2025
ab432f6
eda degree types
Dec 3, 2025
c501d7c
fix: divide data category into a separate front end table section
Mesh-ach Dec 4, 2025
7d6b3fa
fix: linting
Mesh-ach Dec 4, 2025
467d616
feat: developed function for adding custom jobs with institution and …
Mesh-ach Dec 8, 2025
70da6e2
fix: linting errors
Mesh-ach Dec 8, 2025
c453783
fix: linting errors
Mesh-ach Dec 8, 2025
131eb21
Merge pull request #186 from datakind/AddCustomJob
Mesh-ach Dec 8, 2025
f3fd6fc
fix: changed route from GET to POST
Mesh-ach Dec 8, 2025
5275e6b
fix: added output filename definition
Mesh-ach Dec 8, 2025
1ec5d62
fix: linting errors
Mesh-ach Dec 8, 2025
1a37ab1
eda test institution data
Dec 8, 2025
f2cbf9a
Merge branch 'develop' into EDAEnpoints
Dec 8, 2025
8ffea4a
eda test institution
Dec 9, 2025
2807313
eda data
Dec 9, 2025
94108d6
eda test data
Dec 9, 2025
f814620
allow missing eda data
Dec 9, 2025
1650d64
eda enrollment type by intensity
Dec 9, 2025
12d3b4a
eda pell recipient by 1st gen
Dec 9, 2025
fcfe9cc
eda student age by gender
Dec 9, 2025
bf8b288
eda pell status by race
Dec 9, 2025
3428887
eda tests
Dec 9, 2025
aef0b96
cache eda
Dec 9, 2025
fd17625
tidy up
Dec 9, 2025
1ca1a36
remove LOCAL test bucket setup
Dec 9, 2025
2560c9c
return List from get_term_counts
Dec 9, 2025
22c427c
import pandas
Dec 9, 2025
decd869
remove unused variable
Dec 9, 2025
4844bd1
tidy up
Dec 9, 2025
6b1d4fb
eda bucket names
Dec 10, 2025
ecadf07
fix: type check errors
Mesh-ach Dec 15, 2025
2450559
fix: type check errors
Mesh-ach Dec 15, 2025
336c558
fix: type check errors
Mesh-ach Dec 15, 2025
ccf54c9
fix: formatting errors
Mesh-ach Dec 15, 2025
894b181
fix: type check errors
Mesh-ach Dec 15, 2025
4710268
fix: type check errors
Mesh-ach Dec 15, 2025
a5d58b4
fix: batch name renewal
Mesh-ach Dec 15, 2025
ddca88a
fix: batch name renewal
Mesh-ach Dec 15, 2025
7034c13
Merge pull request #188 from datakind/feature/EDAEndpoint
vishpillai123 Dec 16, 2025
4a789cf
fix: changed output_valid to true
Mesh-ach Dec 17, 2025
3861f21
Merge branch 'develop' of github.com-work:datakind/sst-app-api into d…
Mesh-ach Dec 17, 2025
501b99e
fix: adjusted model card file path
Mesh-ach Dec 17, 2025
ff5dc68
Merge pull request #190 from datakind/AdjustModelCardEndpoint
Mesh-ach Dec 17, 2025
8d64483
fix: ensuring we are grabbing the most recent run for a model id
Mesh-ach Dec 17, 2025
1f5a65c
Merge pull request #191 from datakind/AdjustModelCardEndpoint
vishpillai123 Dec 18, 2025
af75ace
remove colors from /eda endpoint
Dec 19, 2025
bbc3191
return count and percentage in /eda degree_types
Dec 19, 2025
fa68e08
tidy up
Dec 19, 2025
8317b8f
fix: fix file format
Mesh-ach Dec 22, 2025
13b4c82
Merge pull request #192 from datakind/fix/EdaEndpoint
vishpillai123 Dec 22, 2025
6f1f6ea
fix: retrieve by model_run_id instead
Mesh-ach Dec 23, 2025
9bb2201
fix: formatting
Mesh-ach Dec 23, 2025
4fd5492
fix: validation error for worwic
Mesh-ach Dec 23, 2025
48a2cdb
Merge pull request #194 from datakind/FixValidationError
Mesh-ach Dec 23, 2025
3456904
fix: changed model name to model_run_id parameter
Mesh-ach Dec 23, 2025
4fdd655
fix: added function to retrieve config.toml from select catalog
Mesh-ach Dec 23, 2025
1020ac5
manually initialized course mappings
Mesh-ach Jan 6, 2026
3047461
feat: added validation mapping
Mesh-ach Jan 6, 2026
96d75aa
fix: formatting
Mesh-ach Jan 6, 2026
8c4b4a6
fix: pylint
Mesh-ach Jan 6, 2026
aafb10b
Ignore .cursor folder for personal cursor preferences
chapmanhk Jan 11, 2026
c375808
feat(schema): add Edvise schema definition
chapmanhk Jan 11, 2026
19ec732
feat(institutions): add Edvise schema support
chapmanhk Jan 12, 2026
6b71053
fix: resolve CI/CD test failures
chapmanhk Jan 12, 2026
0715084
fix: resolve unique constraint conflicts in SchemaRegistryTable
chapmanhk Jan 12, 2026
e7971be
fix: resolve mypy type errors
chapmanhk Jan 12, 2026
8a28672
fix: add missing type annotations to test function parameters
chapmanhk Jan 12, 2026
8dde462
feat: Implement Phase 3 Edvise schema validation logic
chapmanhk Jan 12, 2026
3f38953
fix: Resolve Edvise test failures and improve test reliability
chapmanhk Jan 12, 2026
f5fb979
fix: Update Edvise test filenames to include descriptive keywords
chapmanhk Jan 12, 2026
8bfd945
style: Format data_test.py with ruff
chapmanhk Jan 12, 2026
467d1ff
fix(validation): return proper HTTP status codes for institution errors
chapmanhk Jan 13, 2026
eb5e2a9
fix: handle filename inference errors and extension schema deactivation
chapmanhk Jan 13, 2026
4fc3a35
Merge branch 'develop' into feat/EdviseSchema
chapmanhk Jan 13, 2026
a436c5e
fix: remove unused imports from validation_error_formatter_snapshot_test
chapmanhk Jan 13, 2026
5aa3037
fix: resolve test failures and configuration issues
chapmanhk Jan 13, 2026
e04d010
fix: resolve Ruff and Mypy linting errors
chapmanhk Jan 13, 2026
6dc9604
fix: align database constraints with production schema and fix Edvise…
chapmanhk Jan 14, 2026
57821f1
fix: handle parameterized Pandera check types in validation error for…
chapmanhk Jan 15, 2026
f922b62
style: format validation_error_formatter files with ruff
chapmanhk Jan 15, 2026
85e787a
feat: add case-insensitive institution name lookup
chapmanhk Jan 26, 2026
15147c1
fix: add missing return type annotations to test functions
chapmanhk Jan 26, 2026
eb6e5d8
style: apply ruff formatting to test file
chapmanhk Jan 26, 2026
9bb2895
Merge pull request #197 from datakind/feat/case-insensitive-inst-lookup
chapmanhk Jan 26, 2026
bc67ae2
Merge branch 'develop' into feat/EdviseSchema
chapmanhk Jan 26, 2026
7f8a3e7
fix(test): update institutions test for edvise_id API changes
chapmanhk Jan 26, 2026
77e1b5b
fix(validation): pass institution_id so Edvise/PDP/custom use correct…
chapmanhk Jan 28, 2026
822b740
Merge pull request #196 from datakind/feat/EdviseSchema
chapmanhk Jan 28, 2026
e27deab
Apply Black formatting to institutions_test.py
chapmanhk Jan 29, 2026
8b5a3b6
Apply ruff format to institutions_test.py
chapmanhk Jan 29, 2026
50cbee7
Merge pull request #198 from datakind/feat/EdviseSchema
chapmanhk Jan 29, 2026
708bed3
Fix institutions_test assert for Black and Ruff format compatibility
chapmanhk Jan 29, 2026
a1d2d98
Merge pull request #199 from datakind/feat/EdviseSchema
chapmanhk Jan 29, 2026
c653a39
Fix pylint E1135 in data_test: use .get() instead of membership test …
chapmanhk Jan 29, 2026
51136e2
Apply ruff format to data_test.py
chapmanhk Jan 29, 2026
9e5a5e6
Merge pull request #200 from datakind/feat/EdviseSchema
chapmanhk Jan 29, 2026
06b8669
feat(validation): schema validation during upload with PDP/edvise rep…
chapmanhk Feb 6, 2026
a33c311
feat(validation): write normalized data to validated/, archive raw to…
chapmanhk Feb 6, 2026
84f2ea2
refactor(validation): align with universal principles, add tests, fix…
chapmanhk Feb 6, 2026
6194804
feat(validation): use edvise read for PDP uploads and add PDP path tests
chapmanhk Feb 9, 2026
fc9ff7d
move cloud build config to repo
Feb 12, 2026
a190603
sst-app-api -> edvise-api
Feb 12, 2026
82def86
quiet down sqlalchemy
Feb 12, 2026
83dc5ae
use EdaSummary from edvise
Feb 12, 2026
fd87407
use ruff formatter
Feb 12, 2026
2c1deab
test a file
Feb 12, 2026
1629f57
tidy up
Feb 12, 2026
a0052ab
Add return type annotations for mypy in main_test and users_test
Feb 12, 2026
00e6dd7
tidy up
Feb 12, 2026
18ab791
move cache check after batch result check
Feb 17, 2026
4888b75
fix test_execute_pdp_pull
Feb 17, 2026
fbfc9df
Merge pull request #203 from datakind/feat/move-eda-functions
chapmanhk Feb 17, 2026
dd21d3c
Merge branch 'develop' into feat/schema-validation-during-upload
chapmanhk Feb 17, 2026
d688be3
Merge pull request #202 from datakind/feat/schema-validation-during-u…
chapmanhk Feb 17, 2026
0a2e186
install git
Feb 18, 2026
be59dcd
install git in correct Dockerfile
Feb 18, 2026
e87baed
install git in worker
Feb 18, 2026
21a809b
update edvise branch
Feb 18, 2026
79639f4
use develop branch for edvise
Feb 18, 2026
215307b
Merge pull request #204 from datakind/fix/cloudbuild
chapmanhk Feb 18, 2026
0f0cdbe
install edvise in build
Feb 19, 2026
1ec7d08
cloudbuild with edvise
Feb 19, 2026
92473af
Merge pull request #205 from datakind/fix/cloudbuild-install-edvise
chapmanhk Feb 19, 2026
0857e5d
fix(validation): resolve pylint used-before-assignment error
chapmanhk Feb 20, 2026
4694ed6
feat(api): add legacy school type with any-format uploads
chapmanhk Feb 26, 2026
d2750cf
feat(api): legacy PII check, principles compliance, and test coverage
chapmanhk Feb 27, 2026
1e3ad38
docs(api): use Edvise Schema (ES) naming to reduce confusion
chapmanhk Feb 27, 2026
ad39fa1
feat(data): allow legacy institutions to upload files with any filename
chapmanhk Feb 27, 2026
ed15d8a
chore: remove PR_DESCRIPTION.md
chapmanhk Feb 27, 2026
15b26c6
fix(validation): run PII check for header-only legacy CSVs
chapmanhk Feb 27, 2026
3ad6b86
fix(test): align validation error snapshot with non-PII student_id di…
chapmanhk Feb 27, 2026
e2f2e9c
feat(validation): use PDP cohort converter and support custom converters
chapmanhk Mar 2, 2026
cbe7afa
fix(validation): satisfy mypy for PDP validation and tests
chapmanhk Mar 2, 2026
928453c
Merge pull request #209 from datakind/feat/schema-validation-during-u…
chapmanhk Mar 2, 2026
05a54a1
chore: remove real institution names
mrmaloof Mar 4, 2026
bdbd543
chore: ruff format
mrmaloof Mar 4, 2026
3a96073
Merge pull request #210 from datakind/fix/remove_institution_names
vishpillai123 Mar 4, 2026
37f791e
fix: use latest edvise EdaSummary
mrmaloof Mar 4, 2026
cd98e59
fix: use edvise develop branch
mrmaloof Mar 4, 2026
9f01339
Merge branch 'develop' into feat/legacy-school-classifier
chapmanhk Mar 6, 2026
e46c664
chore(deps): pin edvise to develop
mrmaloof Mar 6, 2026
564593a
feat(ci): notify slack channel on deployment
mrmaloof Mar 6, 2026
55d3616
fix: lock file was out of sync
Mar 6, 2026
baf53b0
Merge pull request #211 from datakind/fix/eda_endpoint_updates
vishpillai123 Mar 6, 2026
5a0f2a4
chore: bump edvise version to 0.1.12
Mar 6, 2026
15e079f
Merge pull request #213 from datakind/chore/upgrade_edvise_0.1.12
vishpillai123 Mar 6, 2026
cfab036
Merge pull request #208 from datakind/feat/legacy-school-classifier
mrmaloof Mar 6, 2026
7c8f3bb
Revert "feat: legacy school type with any-format uploads, PII check, …
vishpillai123 Mar 6, 2026
1d7824d
Merge pull request #215 from datakind/revert-208-feat/legacy-school-c…
vishpillai123 Mar 6, 2026
a0906d5
Revert "Revert "feat: legacy school type with any-format uploads, PII…
chapmanhk Mar 7, 2026
be64dd1
feat(config): add optional local inst/batch/file seed from config for…
mrmaloof Mar 11, 2026
faebf9e
Merge pull request #216 from datakind/revert-215-revert-208-feat/lega…
kaylawilding Mar 11, 2026
652f59b
style: ruff format
mrmaloof Mar 11, 2026
a316993
fix(validation): pass schema_type to handling_duplicates for PDP cour…
chapmanhk Mar 19, 2026
652df89
style: ruff format PDP course read path test
chapmanhk Mar 19, 2026
7106313
Merge pull request #217 from datakind/fix/pdp-course-handling-duplicates
chapmanhk Mar 19, 2026
4161f83
fix(deps): upgrade databricks-sql-connector for pyarrow>=17 (edvise)
chapmanhk Mar 19, 2026
b1b34de
Merge pull request #218 from datakind/fix/pdp-course-handling-duplicates
chapmanhk Mar 19, 2026
9d40253
fix(storage): reduce peak memory during upload validation
chapmanhk Mar 24, 2026
0814e38
chore(storage): log errno on temp download/to_csv OSError
chapmanhk Mar 24, 2026
bdf2f1c
test(storage): cover temp cleanup and OSError logging for validate up…
chapmanhk Mar 24, 2026
fc24586
refactor(storage): extract temp download/unlink helpers for clarity
chapmanhk Mar 24, 2026
6610b35
style: apply black/ruff format to gcsutil_test.py
chapmanhk Mar 24, 2026
0c7a43f
Merge pull request #221 from datakind/fix/validation-upload-memory
chapmanhk Mar 25, 2026
15817ac
chore: bump edvise v0.2.0
Apr 2, 2026
482c2eb
Merge pull request #225 from datakind/chore/bump_edvise_v020
vishpillai123 Apr 2, 2026
a16ec3c
fix(pdp-validation): default cohort converter to none
chapmanhk Apr 7, 2026
d6dafa0
feat(api): remove custom institution path; require school type; legac…
chapmanhk Apr 7, 2026
3959567
feat(api): harden institutions API after custom-institution removal
chapmanhk Apr 8, 2026
0b19aef
docs(api): revert broad custom wording; keep upload docs accurate
chapmanhk Apr 8, 2026
4984892
fix(institutions): reject POST duplicate when existing row lacks scho…
chapmanhk Apr 8, 2026
93ebced
test(institutions): cover duplicate POST, PATCH flags, allowed_schema…
chapmanhk Apr 8, 2026
7dbc362
Merge pull request #227 from datakind/fix/remove-cohort-converter-val…
vishpillai123 Apr 8, 2026
83b8f61
refactor(institutions): extract PATCH helpers and DRY school-type errors
chapmanhk Apr 8, 2026
3d1c57e
fix(lint): satisfy ruff and mypy on databricks and institutions
chapmanhk Apr 10, 2026
485ac01
style(institutions): apply ruff format to router and tests
chapmanhk Apr 10, 2026
55c2009
Merge pull request #228 from datakind/feat/remove-custom-institutions
vishpillai123 Apr 13, 2026
2fdda5a
Merge branch 'develop' into feat/local_inst_testing
mrmaloof Apr 20, 2026
60283c8
refactor: simplify local_inst_data
mrmaloof Apr 20, 2026
5b53af4
docs: Update local_inst_data instructions
mrmaloof Apr 20, 2026
8e05298
chore: remove unused import
mrmaloof Apr 20, 2026
bea4abf
fix: make pdp_id and state optional
mrmaloof May 3, 2026
e3564ee
Merge pull request #229 from datakind/feat/local_inst_testing
mrmaloof May 3, 2026
4db72c9
chore: bumping pyproject and uv.lock
May 7, 2026
ca41320
Merge pull request #232 from datakind/chore/bump_edvise_0_2_1
vishpillai123 May 7, 2026
4 changes: 2 additions & 2 deletions .devcontainer/devcontainer.json
@@ -15,11 +15,11 @@
"vscode": {
"extensions": [
"hashicorp.terraform",
-"ms-python.black-formatter"
+"charliermarsh.ruff"
],
"settings": {
"[python]": {
-"editor.defaultFormatter": "ms-python.black-formatter",
+"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnSave": true
}
}
6 changes: 6 additions & 0 deletions .gitignore
@@ -97,6 +97,9 @@ ENV/
env.bak/
venv.bak/

+# Local config / dev data
+config/local_inst_data.json
+
# mkdocs documentation
/site

@@ -114,3 +117,6 @@ dmypy.json
# terraform
**/.terraform/*
**/terraform.tfvars
+
+# Cursor rule files
+.cursor/
15 changes: 15 additions & 0 deletions .vscode/launch.json
@@ -29,6 +29,21 @@
"env": {
"ENV_FILE_PATH": "${workspaceFolder}/src/worker/.env"
}
},
+{
+"name": "pytest (current file)",
+"type": "debugpy",
+"request": "launch",
+"module": "pytest",
+"args": [
+"${file}",
+"-v",
+"-s"
+],
+"cwd": "${workspaceFolder}",
+"env": {
+"ENV_FILE_PATH": "${workspaceFolder}/src/webapp/.env"
+}
+}
],
}
8 changes: 4 additions & 4 deletions CONTRIBUTING.md
@@ -10,8 +10,8 @@ To get an overview of the project, please read the [README](README.md) and our [
## Getting started
### Creating Issues

-If you spot a problem, [search if an issue already exists](https://github.com/datakind/sst-app-api/issues). If a related issue doesn't exist,
-you can open a new issue using a relevant [issue form](https://github.com/datakind/sst-app-api/issues/new).
+If you spot a problem, [search if an issue already exists](https://github.com/datakind/edvise-api/issues). If a related issue doesn't exist,
+you can open a new issue using a relevant [issue form](https://github.com/datakind/edvise-api/issues/new).

As a general rule, we don’t assign issues to anyone. If you find an issue to work on, you are welcome to open a PR with a fix.

@@ -28,7 +28,7 @@ poetry install --no-interaction
As many other open source projects, we use the famous [gitflow](https://nvie.com/posts/a-successful-git-branching-model/) to manage our branches.

Summary of our git branching model:
-- Get all the latest work from the upstream `datakind/sst-app-api` repository
+- Get all the latest work from the upstream `datakind/edvise-api` repository
(`git checkout main`)
- Create a new branch off with a descriptive name (for example:
`feature/new-test-macro`, `bugfix/bug-when-uploading-results`). You can
@@ -107,7 +107,7 @@ You can type `pytest` to run your tests, no matter which type of test it is.

## Continuous Integration

-We use [GitHub Actions](https://github.com/datakind/sst-app-api/actions)
+We use [GitHub Actions](https://github.com/datakind/edvise-api/actions)
for continuous integration.
See [here](https://docs.github.com/en/actions) for GitHub's documentation.

12 changes: 6 additions & 6 deletions README.md
@@ -2,14 +2,14 @@

This repo contains:

-* [src/webapp/](https://github.com/datakind/sst-app-api/tree/develop/src/webapp): The source code for the SST API (which is called by the SST frontend and by any direct API callers)
-* [src/worker/](https://github.com/datakind/sst-app-api/tree/develop/src/worker): The source code for the SFTP Worker (which calls the SST API)
+* [src/webapp/](https://github.com/datakind/edvise-api/tree/develop/src/webapp): The source code for the SST API (which is called by the SST frontend and by any direct API callers)
+* [src/worker/](https://github.com/datakind/edvise-api/tree/develop/src/worker): The source code for the SFTP Worker (which calls the SST API)
* [terraform/]
-(https://github.com/datakind/sst-app-api/tree/develop/terraform): The Terraform configuration for the SST API/Frontend and other GCP resources including Cloud SQL setup, networking setup, secrets setup
+(https://github.com/datakind/edvise-api/tree/develop/terraform): The Terraform configuration for the SST API/Frontend and other GCP resources including Cloud SQL setup, networking setup, secrets setup
* .devcontainer/ and .vscode/: which allow easy setup if you are using VSCode as your IDE.
-* [devtools/](https://github.com/datakind/sst-app-api/tree/develop/devtools): is a place to put utility scripts
-* .github/: contains mostly copied over files when this directory was forked from the student-success-tool repo, so likely much of it is outdated. The only Github action we've added is the [webapp-and-worker-precommit](https://github.com/datakind/sst-app-api/blob/develop/.github/workflows/webapp-and-worker-precommit.yml) which is run on every push to develop. This action contains a python linter (we use [black](https://black.readthedocs.io/en/stable/)), and automated runs of the unit tests in the src/webapp/ and src/worker/ directories.
-* Additionally, [pyproject.toml](https://github.com/datakind/sst-app-api/blob/develop/pyproject.toml) and [uv.lock](https://github.com/datakind/sst-app-api/blob/develop/uv.lock) are important for dependency management. At time of writing, the worker is just skeleton code so there's no separate dependency management. In the long-term consider separating out the dependency management for the two programs.
+* [devtools/](https://github.com/datakind/edvise-api/tree/develop/devtools): is a place to put utility scripts
+* .github/: contains mostly copied over files when this directory was forked from the student-success-tool repo, so likely much of it is outdated. The only Github action we've added is the [webapp-and-worker-precommit](https://github.com/datakind/edvise-api/blob/develop/.github/workflows/webapp-and-worker-precommit.yml) which is run on every push to develop. This action contains a python linter (we use [black](https://black.readthedocs.io/en/stable/)), and automated runs of the unit tests in the src/webapp/ and src/worker/ directories.
+* Additionally, [pyproject.toml](https://github.com/datakind/edvise-api/blob/develop/pyproject.toml) and [uv.lock](https://github.com/datakind/edvise-api/blob/develop/uv.lock) are important for dependency management. At time of writing, the worker is just skeleton code so there's no separate dependency management. In the long-term consider separating out the dependency management for the two programs.


NOTE: this repo was forked from the https://github.com/datakind/student-success-tool repo, which means some of the static files (e.g. CONTRIBUTING.md) may be outdated or may include irrelevant information from that repo. Please update those as you see fit. For information about the specific items listed above, defer to the specific readmes in the relevant directory.
2 changes: 1 addition & 1 deletion SECURITY.md
@@ -8,4 +8,4 @@ If we verify a reported security vulnerability, our policy is:

## Reporting a Security Issue

-To report any security issues, please [raise an issue](https://github.com/datakind/sst-app-api/issues/new/choose) and select **Security issue*
+To report any security issues, please [raise an issue](https://github.com/datakind/edvise-api/issues/new/choose) and select **Security issue*
55 changes: 55 additions & 0 deletions cloudbuild-webapp.yaml
@@ -0,0 +1,55 @@
+# Cloud Build config for webapp (dev-webapp trigger).
+# _REGION and _ENVIRONMENT are set by the trigger (Terraform).
+steps:
+  - name: ghcr.io/astral-sh/uv:debian
+    entrypoint: bash
+    args:
+      - -c
+      - |
+        set -e
+        apt-get update && apt-get install -y --no-install-recommends git
+        uv lock --upgrade-package edvise
+  - name: gcr.io/cloud-builders/docker
+    args:
+      - build
+      - '-f'
+      - src/webapp/Dockerfile
+      - '-t'
+      - '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/webapp:$COMMIT_SHA'
+      - '-t'
+      - '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/webapp:latest'
+      - .
+  - name: gcr.io/cloud-builders/docker
+    args:
+      - push
+      - '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/webapp:$COMMIT_SHA'
+  - name: gcr.io/cloud-builders/docker
+    args:
+      - push
+      - '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/webapp:latest'
+  - name: gcr.io/cloud-builders/gcloud
+    args:
+      - run
+      - deploy
+      - '${_ENVIRONMENT}-webapp'
+      - '--image'
+      - '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/webapp:$COMMIT_SHA'
+      - '--region'
+      - '${_REGION}'
+  - name: curlimages/curl
+    args:
+      - '-X'
+      - POST
+      - '-H'
+      - 'Content-Type: application/json'
+      - '-f'
+      - '-d'
+      - >-
+        {"text":"🚀 *$REPO_NAME* deployed · `$BRANCH_NAME` · $TRIGGER_NAME · $BUILD_ID"}
+      - >-
+        https://hooks.slack.com/triggers/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
+    id: notify-slack
+timeout: 600s
+options:
+  logging: CLOUD_LOGGING_ONLY
+  dynamicSubstitutions: true
34 changes: 34 additions & 0 deletions cloudbuild-worker.yaml
@@ -0,0 +1,34 @@
+# Cloud Build config for worker (dev-worker trigger).
+# _REGION and _ENVIRONMENT are set by the trigger (Terraform).
+steps:
+  - name: gcr.io/cloud-builders/docker
+    args:
+      - build
+      - '-f'
+      - src/worker/Dockerfile
+      - '-t'
+      - '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/worker:$COMMIT_SHA'
+      - '-t'
+      - '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/worker:latest'
+      - .
+  - name: gcr.io/cloud-builders/docker
+    args:
+      - push
+      - '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/worker:$COMMIT_SHA'
+  - name: gcr.io/cloud-builders/docker
+    args:
+      - push
+      - '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/worker:latest'
+  - name: gcr.io/cloud-builders/gcloud
+    args:
+      - run
+      - deploy
+      - '${_ENVIRONMENT}-worker'
+      - '--image'
+      - '${_REGION}-docker.pkg.dev/${PROJECT_ID}/edvise-api/worker:$COMMIT_SHA'
+      - '--region'
+      - '${_REGION}'
+timeout: 600s
+options:
+  logging: CLOUD_LOGGING_ONLY
+  dynamicSubstitutions: true
60 changes: 60 additions & 0 deletions config/local_inst_data.example.json
@@ -0,0 +1,60 @@
+[
+  {
+    "inst_id": "inst-uuid-here",
+    "name": "Example institution",
+    "state": "XX",
+    "retention_days": null,
+    "pdp_id": "",
+    "edvise_id": null,
+    "batches": [
+      {
+        "batch_id": "batch-uuid-here",
+        "inst_id": "inst-uuid-here",
+        "file_names_to_ids": {
+          "example_course.csv": "file-id-course",
+          "example_student.csv": "file-id-student"
+        },
+        "name": "example_batch_1",
+        "created_by": "uploader-uuid-here",
+        "deleted": false,
+        "completed": true,
+        "deletion_request_time": null,
+        "created_at": "2025-01-15T12:00:00",
+        "updated_at": "2025-01-15T12:00:00",
+        "updated_by": ""
+      }
+    ],
+    "files": [
+      {
+        "name": "example_course.csv",
+        "data_id": "file-id-course",
+        "batch_ids": ["batch-uuid-here"],
+        "inst_id": "inst-uuid-here",
+        "uploader": "uploader-uuid-here",
+        "source": "MANUAL_UPLOAD",
+        "schemas": ["COURSE"],
+        "deleted": false,
+        "deletion_request_time": null,
+        "retention_days": null,
+        "sst_generated": false,
+        "valid": true,
+        "uploaded_date": "2025-01-15T11:58:00"
+      },
+      {
+        "name": "example_student.csv",
+        "data_id": "file-id-student",
+        "batch_ids": ["batch-uuid-here"],
+        "inst_id": "inst-uuid-here",
+        "uploader": "uploader-uuid-here",
+        "source": "MANUAL_UPLOAD",
+        "schemas": ["STUDENT"],
+        "deleted": false,
+        "deletion_request_time": null,
+        "retention_days": null,
+        "sst_generated": false,
+        "valid": true,
+        "uploaded_date": "2025-01-15T11:57:00"
+      }
+    ]
+  }
+]
24 changes: 18 additions & 6 deletions pyproject.toml
@@ -8,7 +8,7 @@ dependencies = [
"databricks-sdk~=0.38.0",
"pydantic~=2.10",
"fastapi[standard]~=0.115.4",
-"google-cloud-storage~=2.18.2",
+"google-cloud-storage==2.19.0",
"paramiko~=3.5.0",
"cloud-sql-python-connector[pymysql]~=1.14.0",
"sqlalchemy~=2.0.36",
@@ -26,13 +26,16 @@ dependencies = [
"pandas~=2.0",
"six~=1.16.0",
"thefuzz[speedup]~=0.22.1",
-"databricks-sql-connector~=3.5.0",
+"databricks-sql-connector[pyarrow]~=4.2.0",
"pandera~=0.13",
-"mlflow~=2.15.0"
+"mlflow~=2.22",
+"cachetools",
+"types-cachetools",
+"edvise~=0.2.1",
]

[project.urls]
-Repository = "https://github.com/datakind/sst-app-api"
+Repository = "https://github.com/datakind/edvise-api"

[dependency-groups]
dev = [
@@ -50,9 +53,15 @@ dev = [
requires = ["hatchling"]
build-backend = "hatchling.build"

+[tool.hatch.metadata]
+allow-direct-references = true
+
[tool.uv]
default-groups = ["dev"]

+[tool.uv.sources]
+edvise = { git = "https://github.com/datakind/edvise.git", rev = "develop" }
+
[tool.ruff]
line-length = 88
indent-width = 4
@@ -84,8 +93,11 @@ lines-after-imports = 1
[tool.pytest.ini_options]
minversion = "8.0"
addopts = ["--verbose", "--import-mode=importlib"]
-filterwarnings = ["ignore::DeprecationWarning"]
-testpaths = ["tests"]
+filterwarnings = [
+"ignore::DeprecationWarning",
+"ignore::FutureWarning:pandera",
+]
+testpaths = ["src"]

[tool.mypy]
files = ["src"]
5 changes: 5 additions & 0 deletions src/webapp/Dockerfile
@@ -13,6 +13,11 @@ WORKDIR /app
# Add project files
ADD uv.lock pyproject.toml /app/

+# Install git and ca-certificates
+RUN apt-get update \
+&& apt-get install -y --no-install-recommends git ca-certificates \
+&& rm -rf /var/lib/apt/lists/*
+
# Install dependencies
RUN uv sync --frozen --no-install-project

26 changes: 24 additions & 2 deletions src/webapp/README.md
@@ -51,7 +51,7 @@ In the long-term, look into a way to have the API key --> token conversion be ha

## Databases

-All data is stored in MySQL databases for dev/staging/prod, these are databases in GCP's Cloud SQL. In the local environment, the database is sqlite. The main file you'll want to look at for database table definitions is [src/webapp/database.py](https://github.com/datakind/sst-app-api/blob/develop/src/webapp/database.py).
+All data is stored in MySQL databases for dev/staging/prod, these are databases in GCP's Cloud SQL. In the local environment, the database is sqlite. The main file you'll want to look at for database table definitions is [src/webapp/database.py](https://github.com/datakind/edvise-api/blob/develop/src/webapp/database.py).

At time of writing, the databases the API cares about and tracks, are as follows:

@@ -112,7 +112,7 @@ Enter into the root directory of the repo.

You're now in your virtual env with all your dependencies added.

-For all of the following, the steps above are pre-requisites and you should be in the root folder of `sst-app-api/`.
+For all of the following, the steps above are pre-requisites and you should be in the root folder of `edvise-api/`.

### Spin up the app locally:

Expand Down Expand Up @@ -168,3 +168,25 @@ The process to upload a file involves three API calls:
## Local VSCode Debugging

From the Run & Debug panel (⇧⌘D on 🍎) you can run the [debug launch config](../../.vscode/launch.json) for the webapp or worker modules. This will allow you to set breakpoints within the source code while the applications are running.

## Local edvise development override

Production uses a pinned Git reference for `edvise`. For local development, use an
editable install after syncing the environment.

1. Clone `edvise` alongside `edvise-api` (so `../edvise` exists).
2. Run `uv sync`.
3. Override locally: `uv pip install -e ../edvise`

To revert back to the pinned Git dependency, run `uv sync --reinstall-package edvise`.
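
To check which copy of `edvise` an environment is using after the override or the revert, one option is to read the `direct_url.json` record that installers write under PEP 610. This is only a sketch and not part of the repository: `classify_direct_url` and `describe_install` are hypothetical helper names, and it assumes the installer recorded that file for editable and Git installs.

```python
import json
from importlib import metadata
from typing import Optional


def classify_direct_url(text: Optional[str]) -> str:
    """Classify the contents of a PEP 610 direct_url.json record."""
    if text is None:
        # No record: the package did not come from a direct URL.
        return "index"
    info = json.loads(text)
    if info.get("dir_info", {}).get("editable"):
        return "editable"  # e.g. `uv pip install -e ../edvise`
    if "vcs_info" in info:
        return "vcs"  # e.g. the Git source declared in pyproject.toml
    return "direct"  # some other direct URL (local directory or archive)


def describe_install(package: str = "edvise") -> str:
    """Report how `package` was installed in the current environment."""
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return "missing"
    # read_text returns None when the file is not part of the distribution.
    return classify_direct_url(dist.read_text("direct_url.json"))
```

Calling `describe_install()` inside the project environment would be expected to return `"editable"` after the local override and `"vcs"` after reverting, provided the assumption about `direct_url.json` holds.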

+## Local institutions (optional)
+
+You can seed the local database with institution, batch, and file metadata that matches dev or staging (names, UUIDs, batch membership) without checking secrets into Git.
+
+1. Copy `config/local_inst_data.example.json` to `config/local_inst_data.json`. The latter is gitignored.
+2. Edit `local_inst_data.json` to match your needs. Use the example file as the schema: one array element per institution, with `inst_id`, `name`, and optionally `state`, `pdp_id`, `batches`, and `files`.
+
+If the file is missing, startup skips this step and the default local seed in code still applies.
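
The behaviour described above can be sketched as a small loader. The actual startup code is not shown in this diff, so the names `load_local_inst_data` and `REQUIRED_KEYS` are hypothetical, and the cross-check between batches and files is an assumed sanity check rather than confirmed behaviour.

```python
import json
from pathlib import Path
from typing import Any

# Hypothetical constant: the README lists `inst_id` and `name` as required.
REQUIRED_KEYS = ("inst_id", "name")


def load_local_inst_data(path: Path) -> list[dict[str, Any]]:
    """Return the institution entries from the seed file, or [] if it is absent."""
    if not path.exists():
        # Mirrors the documented behaviour: a missing file skips this step.
        return []

    entries = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array with one element per institution")

    for index, entry in enumerate(entries):
        missing = [key for key in REQUIRED_KEYS if not entry.get(key)]
        if missing:
            raise ValueError(f"institution #{index} is missing {missing}")

        # Assumed sanity check (not confirmed by the source): every file id a
        # batch refers to should also appear as a `data_id` under `files`.
        known_ids = {f["data_id"] for f in entry.get("files", [])}
        for batch in entry.get("batches", []):
            unknown = set(batch.get("file_names_to_ids", {}).values()) - known_ids
            if unknown:
                raise ValueError(
                    f"batch {batch.get('batch_id')} references {sorted(unknown)}"
                )
    return entries
```

For example, `load_local_inst_data(Path("config/local_inst_data.json"))` returns an empty list on a fresh checkout, since only the `.example.json` file is committed.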

+**Limitation:** Endpoints that read uploaded CSV (for example EDA) load blobs from GCS under the bucket name `dev_<institution_uuid_hex>`, not from this JSON. To exercise those flows locally you still need GCP credentials and the corresponding objects in that bucket, or you rely on tests/mocks instead.
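
To find the bucket those endpoints would read from for a given seeded institution, the name can be derived from its `inst_id`. This is a minimal sketch, assuming the hex form is simply the UUID with hyphens removed (what Python's `uuid.UUID(...).hex` produces); the function name `eda_bucket_name` and the `env` parameter are hypothetical.

```python
import uuid


def eda_bucket_name(inst_id: str, env: str = "dev") -> str:
    """Build `<env>_<institution_uuid_hex>` from an institution UUID string."""
    # uuid.UUID accepts hyphenated or plain hex input and raises ValueError
    # for anything else, such as the "inst-uuid-here" placeholder.
    return f"{env}_{uuid.UUID(inst_id).hex}"


print(eda_bucket_name("12345678-1234-5678-1234-567812345678"))
# prints "dev_12345678123456781234567812345678"
```

Validating through `uuid.UUID` also catches a seed file that still contains the example placeholders before any GCS call is attempted.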
1 change: 1 addition & 0 deletions src/webapp/config.py
@@ -51,6 +51,7 @@
"DATABRICKS_HOST_URL": "",
# The service account that is used in Databricks to access GCP buckets.
"DATABRICKS_SERVICE_ACCOUNT_EMAIL": "",
+"GCP_CACHE_BUCKET": "",
}

