5735 - go parser shadow tables by jtimpe · Pull Request #5832 · raft-tech/TANF-app

jtimpe · 2026-05-05T18:56:56Z

Summary of Changes

Pull request closes #5735

Updates the go parser to use shadow tables (or any specified table prefix) when the GO_PARSER_SHADOW_MODE environment variable is true. This variable defaults to true.
When GO_PARSER_SHADOW_MODE=false, the go parser writes to the production parser tables.
Creates a shadow datafile alongside the regular datafile when a file is uploaded (and GO_PARSER_SHADOW_MODE=true).
Queues the parse task for both the python parser and the go parser whenever a file is uploaded.
Codex added a preliminary status and record count update for the shadow datafile (and the production datafile when the variable is false).
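
The prefix switch described above can be sketched in a few lines. This is only an illustration of the behavior, not the PR's Go implementation: the helper names, the `shadow_` default prefix argument, and the set of strings treated as false are assumptions.

```python
import os

# Hypothetical default prefix; the PR says any table prefix can be specified.
SHADOW_PREFIX = "shadow_"


def shadow_mode_enabled(env=None):
    """Return True unless GO_PARSER_SHADOW_MODE is explicitly set to a falsy string.

    The variable defaults to true, as stated in the summary. The exact set of
    accepted falsy spellings here is an assumption.
    """
    env = os.environ if env is None else env
    raw = env.get("GO_PARSER_SHADOW_MODE", "true")
    return raw.strip().lower() not in {"false", "0", "no", "off"}


def resolve_table(base_name, env=None, prefix=SHADOW_PREFIX):
    """Return the table the go parser would write to for a production table name."""
    return f"{prefix}{base_name}" if shadow_mode_enabled(env) else base_name
```

For example, `resolve_table("parser_error", {"GO_PARSER_SHADOW_MODE": "False"})` returns the production name, while an unset variable yields the prefixed name.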

How to Test

Set GO_PARSER_SHADOW_MODE=True in your backend .env, then start the services

cd tdrs-frontend && docker-compose up --build
cd tdrs-backend && docker-compose up --build

Submit files across all program types and sections

Query the postgres db for the shadow tables

psql -h localhost -U tdpuser -d tdrs_test
select * from shadow_data_files_datafile;
select * from shadow_parsers_datafilesummary;
select COUNT(*) from shadow_parser_error;
select COUNT(*) from shadow_search_indexes_{tanf_t1|tanf_t2|tribal_tanf_t1|ssp_m1|etc};

Compare the record counts and datafile/datafilesummary data to the production tables

select * from data_files_datafile;
select * from parsers_datafilesummary;
select COUNT(*) from parser_error;
select COUNT(*) from search_indexes_{tanf_t1|tanf_t2|tribal_tanf_t1|ssp_m1|etc};

Reparse submitted files. Re-check record counts and metadata.
Bring the containers down, then change GO_PARSER_SHADOW_MODE to False.
Submit files across program types and sections. Check record counts and metadata. Nothing should be written to the shadow tables, only to the production tables. Records will be doubled there because both the python parser and the go parser write to them; run docker compose stop celery to avoid this. *The go parser writes to the same datafile and datafilesummary row, which causes some issues in the frontend (no case_aggregates are written because that has not yet been implemented).*
Reparse submitted files. Re-check record counts and metadata.
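
The count comparison in the steps above can also be scripted. The sketch below is a minimal illustration under stated assumptions: it uses the standard-library `sqlite3` module as a stand-in connection (the project database is Postgres, so a real check would use a Postgres driver), and the function names are hypothetical.

```python
import sqlite3


def count_rows(conn, table):
    """Return COUNT(*) for a table.

    Table names cannot be bound as query parameters, so only pass trusted,
    hard-coded names such as those listed in the steps above.
    """
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


def compare_counts(conn, base_tables, prefix="shadow_"):
    """Return {table: (production_count, shadow_count)} for tables whose counts differ."""
    mismatches = {}
    for table in base_tables:
        prod = count_rows(conn, table)
        shadow = count_rows(conn, prefix + table)
        if prod != shadow:
            mismatches[table] = (prod, shadow)
    return mismatches
```

An empty result means every production table and its shadow counterpart hold the same number of rows; it does not compare the row contents themselves.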

Deliverables

More details on how the deliverables herein are assessed are included here.

Deliverable 1: Accepted Features

Checklist of ACs:

shadow_mode config option added to pipeline.yaml
When enabled, all table names prefixed with shadow_
Django migration creates shadow tables mirroring production schema
Go parser writes correctly to shadow tables
Shadow tables can be truncated independently of production tables
Integration tests updated to verify against shadow table models
Testing Checklist has been run and all tests pass
README is updated, if necessary

lfrohlich and/or adpennington confirmed that ACs are met.

Deliverable 2: Tested Code

Are all areas of code introduced in this PR meaningfully tested?
- If this PR introduces backend code changes, are they meaningfully tested?
- If this PR introduces frontend code changes, are they meaningfully tested?
Are code coverage minimums met?
- Frontend coverage: [insert coverage %] (see CodeCov Report comment in PR)
- Backend coverage: [insert coverage %] (see CodeCov Report comment in PR)

Deliverable 3: Properly Styled Code

Are backend code style checks passing on CircleCI?
Are frontend code style checks passing on CircleCI?
Are code maintainability principles being followed?

Deliverable 4: Accessible

Does this PR complete the epic?
Are links included to any other gov-approved PRs associated with epic?
Does PR include documentation for Raft's a11y review?
Did automated and manual testing with iamjolly and ttran-hub using Accessibility Insights reveal any errors introduced in this PR?

Deliverable 5: Deployed

Was the code successfully deployed via automated CircleCI process to development on Cloud.gov?

Deliverable 6: Documented

Does this PR provide background for why coding decisions were made?
If this PR introduces backend code, is that code easy to understand and sufficiently documented, both inline and overall?
If this PR introduces frontend code, is that code easy to understand and sufficiently documented, both inline and overall?
If this PR introduces dependencies, are their licenses documented?
Can reviewer explain and take ownership of these elements presented in this code review?

Deliverable 7: Secure

Does the OWASP Scan pass on CircleCI?
Do manual code review and manual testing detect any new security issues?
If new issues detected, is investigation and/or remediation plan documented?

Deliverable 8: User Research

Research product(s) clearly articulate(s):

the purpose of the research
methods used to conduct the research
who participated in the research
what was tested and how
impact of research on TDP
(if applicable) final design mockups produced for TDP development

jtimpe · 2026-05-05T20:16:59Z

+        "user": models.ForeignKey(
+            "users.User",
+            on_delete=models.CASCADE,
+            related_name="+",


related_name="+" disables the reverse relation, so you cannot access user.shadow_data_files the way you can user.data_files. This can be changed if necessary (see the Django docs).

jtimpe · 2026-05-05T20:22:35Z

+from django.db import models
+
+
+def create_shadow_model(


subclassing the "production" model (TANF_T1 for example) uses Django's multi-table inheritance, which creates a model with a one-to-one pointer to the parent model's table rather than copying the model fields into the new model. this helper copies all the fields and relationships, but doesn't preserve the @property methods. since we're not using the python properties on the go side, i think that's okay, but it is something to be aware of: the shadow models are more limited.
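
The trade-off described in this comment (fields are copied, properties are not) can be illustrated without Django. The sketch below is a plain-Python analogy using `dataclasses`, not the PR's `create_shadow_model` helper; the `Record` class, its fields, and `create_shadow_class` are hypothetical.

```python
from dataclasses import dataclass, fields, make_dataclass


@dataclass
class Record:
    """Hypothetical stand-in for a production model."""

    rpt_month_year: int
    case_number: str

    @property
    def year(self):
        # Derived value that lives on the class, not in the field list.
        return self.rpt_month_year // 100


def create_shadow_class(base, prefix="Shadow"):
    """Build a new class holding copies of base's fields, without inheriting from it."""
    copied = [(f.name, f.type) for f in fields(base)]
    return make_dataclass(f"{prefix}{base.__name__}", copied)


ShadowRecord = create_shadow_class(Record)
```

`ShadowRecord` accepts the same field values as `Record` but is not a subclass of it and has no `year` property, which mirrors the limitation noted above.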

jtimpe · 2026-05-05T21:04:56Z


  dump-parser-schema:
-    desc: Dump selected parser table CREATE TABLE statements to schema-tmp.sql
+    desc: Dump selected parser table CREATE TABLE statements to parser schema.sql, using GO_PARSER_SHADOW_MODE to select shadow or production tables


kinda torn on this change (along with the generated schema.sql, query.sql, query.sql.go, and models.go). we don't really use any of these, it's more of a gut-check that changes to the python models have been appropriately represented in the go schemas. i don't know if that should check the shadow tables or not. the task switches based on the GO_PARSER_SHADOW_MODE flag, but the generated files are all shadow-specific. if it's not useful, i can revert it

keep original schema.sql versioned

unversion models.go

change task to dump to schema.temp.sql for sqlc diff

generate models.go and models.temp.go for sqlc diff

more helpful to validate against schema yml definition and storage/writer/tanf|ssp|tribal record serializers (still two steps) - generate serializer based on yml file?

define parsedrecord as an interface (look at decoders/decoder.go and decoder/csv.go), put serializer on validationresult telling each record how to serialize itself

separate ticket?

Did we write a ticket up to tighten up our django migration validation against the Go schemas yet?

jtimpe · 2026-05-06T18:18:54Z

+        current_app.send_task(
+            GO_PARSER_TASK_NAME,
+            args=[data_file_id],
+            queue=GO_PARSER_QUEUE,


doesn't handle reparses

jtimpe · 2026-05-06T18:41:59Z


-	var df DataFilesDatafile
+	var df ShadowDataFilesDatafile
 	err := pool.QueryRow(ctx, query, id).Scan(


look into easier way to map the struct

codecov · 2026-05-07T20:24:53Z

Codecov Report

❌ Patch coverage is 98.82353% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.02%. Comparing base (60b5d6c) to head (f4ba65a).
⚠️ Report is 7 commits behind head on develop.

Files with missing lines	Patch %	Lines
tdrs-backend/tdpservice/common/shadow_models.py	90.00%	2 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #5832      +/-   ##
===========================================
+ Coverage    93.98%   94.02%   +0.04%     
===========================================
  Files          538      543       +5     
  Lines        24618    24776     +158     
  Branches       620      620              
===========================================
+ Hits         23137    23296     +159     
+ Misses        1368     1367       -1     
  Partials       113      113

Flag	Coverage Δ
dev-backend	`94.31% <98.82%> (+0.04%)`	⬆️
dev-frontend	`91.84% <ø> (ø)`

Flags with carried forward coverage won't be shown.

Files with missing lines	Coverage Δ
...rvice/data_files/migrations/0027_shadowdatafile.py	`100.00% <100.00%> (ø)`
tdrs-backend/tdpservice/data_files/models.py	`83.51% <100.00%> (+0.63%)`	⬆️
...drs-backend/tdpservice/data_files/test/test_api.py	`98.76% <100.00%> (+0.02%)`	⬆️
tdrs-backend/tdpservice/data_files/views.py	`93.75% <100.00%> (+0.14%)`	⬆️
...ns/0017_shadowdatafilesummary_shadowparsererror.py	`100.00% <100.00%> (ø)`
tdrs-backend/tdpservice/parsers/models.py	`88.04% <100.00%> (+0.40%)`	⬆️
tdrs-backend/tdpservice/scheduling/parser_task.py	`96.45% <100.00%> (+0.36%)`	⬆️
...end/tdpservice/scheduling/test/test_parser_task.py	`99.70% <100.00%> (+0.03%)`	⬆️
...wprogramaudit_t1_shadowprogramaudit_t2_and_more.py	`100.00% <100.00%> (ø)`
...ckend/tdpservice/search_indexes/models/__init__.py	`100.00% <100.00%> (ø)`
... and 5 more

... and 1 file with indirect coverage changes

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8a6d79b...f4ba65a.

raftmsohani · 2026-05-12T13:28:25Z

+
+	summaryStatusAccepted           = "Accepted"
+	summaryStatusAcceptedWithErrors = "Accepted with Errors"
+	summaryStatusPartiallyAccepted  = "Partially Accepted"


Go uses "Partially Accepted" while Django enum is "Partially Accepted with Errors", will this cause any problem?

I believe "Partially Accepted with Errors" is the friendly/detail name, but the enum/db value is Partially Accepted. You can see it encoded in this way in the original migration

('Partially Accepted with Errors', 'Partially Accepted')

These statuses can largely be removed right? The go parser will only ever need to set a rejected status based on specific validation/exceptions right?
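
One way to settle the value-versus-label question in this thread is to check the Go constants against the stored values of the Django choices. Django documents each `choices` entry as a (stored value, human-readable label) pair; the sketch below reads the tuple quoted above under that convention, so the result is worth confirming against the actual model before relying on either string. The helper names and the `("Accepted", "Accepted")` entry are assumptions added for illustration.

```python
def stored_values(choices):
    """Return the set of stored values from (value, label) choice pairs."""
    return {value for value, _label in choices}


def unknown_statuses(go_statuses, choices):
    """Return the Go status strings that are not stored values in the choices."""
    return sorted(set(go_statuses) - stored_values(choices))


# The second entry is the tuple quoted in the thread; the first is a
# hypothetical entry added so the example has a matching status.
CHOICES = [
    ("Accepted", "Accepted"),
    ("Partially Accepted with Errors", "Partially Accepted"),
]

MISSING = unknown_statuses(["Accepted", "Partially Accepted"], CHOICES)
```

A non-empty `MISSING` list flags constants that would not match a stored value if the tuple ordering is (value, label).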

elipe17 · 2026-05-14T15:50:51Z

+	if err != nil {
+		return fmt.Errorf("update %s result for datafile_id=%d: %w", tableName, datafileID, err)
+	}
+	totalCreatedInt4, err := int64ToInt4(totalCreated)


Why do we need the conversion? Seems like we should just let the "total" counts be int32 to prevent the extra conversion steps. Do the compiled queries require the pgtype or can we give it the exact Go int32 and let the DB engine handle the conversion if needed?

i think this is because sqlc generated the queries for a nullable integer column, which is pgtype.Int4. if the field was NOT NULL then it would generate an int32 and we wouldn't have to convert. there's a sqlc config option emit_pointers_for_null_types which might handle this, but i didn't test it out

Hmm, that is a nice override. Might save our code from any conversions in the future and push it off to the library or db engine. Will you write up an exploratory snack ticket for this?
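
For readers unfamiliar with why the conversion step exists, the sketch below shows the general shape of a checked 64-bit to nullable 32-bit conversion. It is a Python illustration, not the PR's Go `int64ToInt4`: the `Int4` class is a stand-in for a nullable 32-bit integer type, and treating `None` as SQL NULL is an assumption.

```python
from dataclasses import dataclass

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


@dataclass(frozen=True)
class Int4:
    """Stand-in for a nullable 32-bit integer: a value plus a validity flag."""

    int32: int = 0
    valid: bool = False


def int64_to_int4(value):
    """Convert a count to Int4, rejecting values outside the 32-bit range."""
    if value is None:
        # Assumption: a missing count maps to NULL (valid=False).
        return Int4()
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError(f"{value} does not fit in a 32-bit integer column")
    return Int4(int32=value, valid=True)
```

If the column were NOT NULL, or if the generator emitted pointer types as suggested in the thread, the wrapper type and this conversion could likely be dropped; that option is untested per the comment above.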

elipe17

Two requests for tickets, otherwise LGTM!

jtimpe added 5 commits May 5, 2026 13:32

dump-parser-schema to shadow models

18ad7b5

create shadow copies of django models

4686409

go parser write to shadow tables if config option enabled

4fb41a1

queue parser run for both python and go parsers

43f9b45

prelim submission state and record counts

fe1c0e2

jtimpe self-assigned this May 5, 2026

jtimpe commented May 5, 2026

Comment thread tdrs-services/parser/internal/storage/writer/writer.go Outdated

jtimpe commented May 5, 2026

jtimpe added 4 commits May 6, 2026 08:22

Merge branch 'develop' into 5735-go-parser-shadow-tables

a5c6f39

QueueName -> queue (merge conflict)

55c9d6f

merge conflict missing env var

5e0f941

env var for queue name

6310a0c

jtimpe added the raft review This issue is ready for raft review label May 6, 2026

jtimpe requested review from elipe17, mattcoleanderson and raftmsohani May 6, 2026 13:59

elipe17 reviewed May 6, 2026

Comment thread tdrs-backend/tdpservice/scheduling/parser_task.py Outdated

jtimpe added 6 commits May 6, 2026 09:47

fix git secrets err

f981931

write go parser queue error to logentry

c4ac9c7

shadow df model

70411e0

update df to use shadow

e923e70

fix go tests

35ec100

refactor test_ofa_system_admin_permissions

349df86

jtimpe commented May 6, 2026

Comment thread tdrs-backend/tdpservice/data_files/views.py

jtimpe commented May 6, 2026

Comment thread tdrs-services/parser/internal/server/celery/celery.go

jtimpe commented May 6, 2026

Comment thread tdrs-services/parser/internal/server/celery/celery.go Outdated

jtimpe commented May 6, 2026

control go parse with shadow_mode var

f0c3a65

jtimpe requested a review from elipe17 May 7, 2026 20:37

elipe17 reviewed May 7, 2026

Comment thread tdrs-services/parser/internal/db/datafile.go Outdated

elipe17 reviewed May 7, 2026

Comment thread tdrs-services/parser/internal/db/datafile.go Outdated

elipe17 reviewed May 7, 2026

Comment thread tdrs-services/parser/internal/server/celery/celery.go Outdated

elipe17 reviewed May 7, 2026

Comment thread tdrs-services/parser/internal/server/celery/celery.go Outdated

raftmsohani reviewed May 12, 2026

jtimpe and others added 9 commits May 13, 2026 13:13

Merge branch 'develop' into 5735-go-parser-shadow-tables

ee533b3

fix tests

dd76d32

use parsing result to set df state

1b74556

move queries to query.sql

8b424d0

Merge branch 'develop' into 5735-go-parser-shadow-tables

b5c76e2

add production schemas and queries

f0dee28

add prod model counterparts to models

b2635b0

fix test

7c4b615

add prod tables to schema, update task

f13a49d

jtimpe requested review from elipe17 and raftmsohani May 14, 2026 13:31

elipe17 reviewed May 14, 2026

Comment thread tdrs-services/parser/internal/db/datafile.go Outdated

jtimpe and others added 4 commits May 15, 2026 14:47

Update tdrs-services/parser/internal/db/datafile.go

b076f55

Co-authored-by: Eric Lipe <125676261+elipe17@users.noreply.github.com>

rm unused summary statuses

03a5e0a

Merge branch 'develop' into 5735-go-parser-shadow-tables

da8b566

Merge branch 'develop' into 5735-go-parser-shadow-tables

f4ba65a

jtimpe requested a review from elipe17 May 19, 2026 13:05

elipe17 approved these changes May 19, 2026

elipe17 added 3 commits May 21, 2026 11:35

Merge branch 'develop' of https://github.com/raft-tech/TANF-app into 5735-go-parser-shadow-tables

85bf2b9

- Use correct parser error table name

985d2cd

Merge branch '5735-go-parser-shadow-tables' of https://github.com/raft-tech/TANF-app into 5735-go-parser-shadow-tables

9676309
