feat: asynchronous checks#62
Conversation
fix: logic bug where success was never set to true
| """Add a rule result and update the overall success.""" | ||
| self.rule_results.append(rule_result) | ||
| if not rule_result.success and rule_result.level == "failure": | ||
| if rule_result.success and rule_result.level == "failure": |
There was a problem hiding this comment.
if you read the statement it's
if not X and ...:
X = false
so it's would always be a no-op. In the DB all checks have a success of true regardless of it the checks are passing
There was a problem hiding this comment.
sorry for coming back to this. The condition was checking rule_result.success, and then setting self.success, so not changing the same var.
It looked more correct before, it feels wrong checking for rule_result.success True and level Failure.
There was a problem hiding this comment.
okay I will double check
There was a problem hiding this comment.
okay great catch nico! there is an error with this line but it's
| if rule_result.success and rule_result.level == "failure": | |
| if not rule_result.success and rule_result.level == "error": |
| db.session.commit() | ||
|
|
||
| with UnitOfWork() as uow: | ||
| # as we are in the task, run the check synchronously | ||
| ChecksAPI.run_check(config, record, uow, sync=True) | ||
| uow.commit() |
There was a problem hiding this comment.
Why are we mixingg manual commits with UnitOfWork?
There was a problem hiding this comment.
good question. I was imagining that the uow would automatically commit when exiting the scope but this is not the behaviour. @slint what's the best practise here?
There was a problem hiding this comment.
It actually surprisingly doesn't commit, so it has to be explicit... there's a similar behavior with SQLAlchemy's session so I think this where this design is coming from.
There was a problem hiding this comment.
we could pass an optional uow to the run_check APIs, and treat that class as a service. The decorator will take care of the commit.
| return result_run | ||
|
|
||
| @classmethod | ||
| def run_check(cls, config, record, uow, is_draft=None, sync=False): |
There was a problem hiding this comment.
Hmm I think run_check is mixing execution and persistence, which can causes a problem in the async flow. The task recieves a check_run_id, but then run_check queries by config_id, record_id and is_draft instead of using that id, so it may update or overwrite the wrong row if multiple runs exist. I think it would be safer to separate concerns so the execution just returns a result, and the task updates the existing CheckRun directly by id.
There was a problem hiding this comment.
It shouldn't be possible for there to be multiple CheckRuns as they are only created in _create_or_update_check_run which handles this. I don't think I need to make any other changes from your comment.
For clarity, refer to the diagram in the original message in this PR and note if you think there's anything wrong with when the status is being updated
There was a problem hiding this comment.
My concern is that it feels brittle that the task receives a check_run_id, but run_check(sync=True) then re-queries instead of updating that specific row directly, since the async task already have owns the CheckRun conceptually.
I was thinking it might be cleaner if run_check just returned the result/state, and the task then updated the existing CheckRun directly by id.
There was a problem hiding this comment.
Can we please refer to the diagram so we're on the same page :)
If run_check (in api.py) just returned the result state, then the component would have to update the db (which is not correct).
What you're suggesting I think would to run all checks via a task, so that the task is the only thing updating the DB, however Alex already said that this is not correct and that instead we should consider creating a current_checks_api or a service (which I haven't proceeded with as it requires more design thinking about how we want the service/proxy to work)
There was a problem hiding this comment.
Good idea referring to the diagram 😄 What I meant isn’t that the return value from api.py should change, but that in the async execution flow the task already has the check_run_id, so it could update/save that specific CheckRun directly instead of run_check(sync=True) re-querying it.
There was a problem hiding this comment.
but then we would need to make the params of run_check fully optional as you can either call it with (config, record) or (check_run_id) with the additional implied checking of params
|
|
||
|
|
||
| class AsyncCheck(Check): | ||
| """Example of an async check.""" |
There was a problem hiding this comment.
Is this code that we need to change later?
There was a problem hiding this comment.
I'll drop this commit before we merge, it's just to have something to test with and give an example of how to configure it
| db.session.commit() | ||
|
|
||
| with UnitOfWork() as uow: | ||
| # as we are in the task, run the check synchronously | ||
| ChecksAPI.run_check(config, record, uow, sync=True) | ||
| uow.commit() |
There was a problem hiding this comment.
we could pass an optional uow to the run_check APIs, and treat that class as a service. The decorator will take care of the commit.
❤️ Thank you for your contribution!
Description
Celery tasks are used to execute the check, however we do not want to duplicate the logic of executing a task, and so in the task context the asynchronous task is run synchronously. A visual explanation of this is shown below (see
run_check(..., sync=true)).Checklist
Ticks in all boxes and 🟢 on all GitHub actions status checks are required to merge:
Frontend
Reminder
By using GitHub, you have already agreed to the GitHub’s Terms of Service including that: