Skip to content
This repository was archived by the owner on Apr 15, 2026. It is now read-only.

Fix prediction response timestamp population and webhook finalization#219

Merged
michaeldwan merged 3 commits into
mainfrom
correct-response-formatting
Sep 17, 2025
Merged

Fix prediction response timestamp population and webhook finalization#219
michaeldwan merged 3 commits into
mainfrom
correct-response-formatting

Conversation

@tempusfrangit
Copy link
Copy Markdown
Contributor

@tempusfrangit tempusfrangit commented Sep 17, 2025

This PR addresses critical issues with timestamp population in prediction responses and consolidates webhook finalization logic to ensure consistent behavior across sync and async predictions.

Core Infrastructure Changes

Time Format Standardization

  • Add TimeFormat constant to config package using microsecond precision with explicit timezone offset
  • Ensures consistent timestamp formatting across all prediction responses and webhook payloads
  • Reverted to not time constant to avoid upstream issues.

PredictionResponse Structure Overhaul

  • Remove deprecated WebhookURL field from PredictionResponse struct
  • Add dedicated CreatedAt, StartedAt, CompletedAt timestamp fields with proper JSON tags
  • Change Metrics field from any to map[string]any for type safety
  • Add finalizeResponse() method to set CompletedAt timestamp and calculate predict_time metric
  • Add populateFromRequest() method to copy core fields (ID, Input, CreatedAt, StartedAt) from request

Prediction Lifecycle Management

Manager API Restructuring

  • Rename Manager.Predict() to Manager.PredictSync() for clarity
  • Update Manager.PredictAsync() to return initial populated response instead of void
  • Refactor internal predict() method to return both channel and initial response
  • Add async parameter to assignReqToRunner() to control setup wait behavior

Runner Prediction Flow

  • Update runner.predict() signature to return initial response populated with request fields
  • Fix timestamp population in predict() method by setting CreatedAt/StartedAt on pending request
  • Add proper populateFromRequest() calls in handleResponseWebhooksAndCompletion after response overwrites
  • Ensure terminal responses call finalizeResponse() before sending through safeSend()

Webhook and Response Handling

  • Consolidate terminal webhook logic into unified sendTerminalWebhook() method
  • Fix race conditions by ensuring response finalization before webhook sending
  • Update subprocess monitoring to use populateFromRequest() for failed prediction responses
  • Ensure all webhook payloads include complete timestamp information

Test Infrastructure Improvements

Response Validation Framework

  • Add comprehensive unit tests for finalizeResponse() covering edge cases and error conditions
  • Add unit tests for populateFromRequest() method with field overwrite scenarios
  • Update all integration tests to validate CreatedAt, StartedAt, CompletedAt fields
  • Add ValidateTerminalWebhookResponse helper function for consistent terminal response validation

Test Harness Enhancements

  • Introduce testHarnessResponse type with string Logs field for easier test assertions
  • Update webhook receiver to use testHarnessResponse for simplified log comparisons
  • Retrofit existing tests to use new validation patterns without changing test logic

Bug Fixes

Timestamp Population Issues

  • Fix missing timestamps in sync prediction responses by ensuring populateFromRequest() calls
  • Resolve timestamp loss during response overwrites in handleResponseWebhooksAndCompletion
  • Ensure webhook payloads contain complete timestamp information for terminal events

Metrics and Finalization

  • Fix missing predict_time metric in terminal responses by calling finalizeResponse()
  • Ensure Metrics map initialization before metric calculation
  • Handle time parsing errors gracefully in finalizeResponse() method

Concurrency and Error Handling

  • Fix potential race conditions in webhook sending by proper response finalization order
  • Ensure cleanup of failed predictions includes populated response fields
  • Update error responses in runner stop scenarios to include request-derived fields

Backward Compatibility

  • Maintain existing API signatures where possible
  • Preserve webhook payload structure while adding timestamp fields
  • Keep existing test behavior while enhancing validation capabilities

Files Changed: 21 files changed, 478 insertions(+), 277 deletions(-)

@tempusfrangit tempusfrangit requested a review from a team as a code owner September 17, 2025 07:49
Copilot AI review requested due to automatic review settings September 17, 2025 07:49
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances the prediction response system by adding dedicated timestamp fields and fixing missing timestamp population across sync and async prediction flows. It consolidates the PredictionResponse type from the server package into the runner package while adding proper timestamp handling, metrics calculation, and webhook finalization.

  • Adds dedicated CreatedAt, StartedAt, CompletedAt timestamp fields with consistent microsecond precision formatting
  • Implements response finalization methods to ensure complete timestamp and metrics population before webhooks
  • Refactors Manager prediction methods to return initial responses for async predictions and fix sync response handling

Reviewed Changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
internal/config/config.go Adds TimeFormat constant for consistent microsecond precision timestamps
internal/runner/types.go Adds timestamp fields to PredictionResponse, implements finalizeResponse() and populateFromRequest() methods
internal/runner/types_test.go Adds comprehensive unit tests for response finalization and population methods
internal/runner/runner.go Updates predict() to return initial response and ensures proper timestamp population
internal/runner/manager.go Refactors prediction methods to handle initial responses and consolidates webhook logic
internal/runner/manager_test.go Updates test to use renamed PredictSync method
internal/runner/runner_test.go Updates tests to handle new predict() return signature
internal/server/types.go Removes PredictionResponse struct (consolidated into runner package)
internal/server/server.go Simplifies prediction handling and removes duplicate response conversion logic
internal/tests/*.go Updates test files to use new response types and adds terminal response validation

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment thread internal/runner/runner.go Outdated
Comment thread internal/runner/types.go Outdated
Comment thread internal/runner/runner.go Outdated
@tempusfrangit tempusfrangit force-pushed the correct-response-formatting branch 2 times, most recently from 5416212 to 7306e46 Compare September 17, 2025 08:15
Copilot AI review requested due to automatic review settings September 17, 2025 08:15
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment thread internal/runner/types.go Outdated
Comment thread internal/runner/runner.go Outdated
Comment thread internal/runner/runner.go Outdated
@tempusfrangit tempusfrangit force-pushed the correct-response-formatting branch from 7306e46 to 1824718 Compare September 17, 2025 08:20
This PR addresses critical issues with timestamp population in prediction responses and consolidates webhook finalization logic to ensure consistent behavior across sync and async predictions.

Core Infrastructure Changes

Time Format Standardization
- Add TimeFormat constant to config package using microsecond precision with explicit timezone offset
- Ensures consistent timestamp formatting across all prediction responses and webhook payloads

PredictionResponse Structure Overhaul
- Remove deprecated WebhookURL field from PredictionResponse struct
- Add dedicated CreatedAt, StartedAt, CompletedAt timestamp fields with proper JSON tags
- Change Metrics field from any to map[string]any for type safety
- Add finalizeResponse() method to set CompletedAt timestamp and calculate predict_time metric
- Add populateFromRequest() method to copy core fields (ID, Input, CreatedAt, StartedAt) from request

Prediction Lifecycle Management

Manager API Restructuring
- Rename Manager.Predict() to Manager.PredictSync() for clarity
- Update Manager.PredictAsync() to return initial populated response instead of void
- Refactor internal predict() method to return both channel and initial response
- Add async parameter to assignReqToRunner() to control setup wait behavior

Runner Prediction Flow
- Update runner.predict() signature to return initial response populated with request fields
- Fix timestamp population in predict() method by setting CreatedAt/StartedAt on pending request
- Add proper populateFromRequest() calls in handleResponseWebhooksAndCompletion after response overwrites
- Ensure terminal responses call finalizeResponse() before sending through safeSend()

Webhook and Response Handling
- Consolidate terminal webhook logic into unified sendTerminalWebhook() method
- Fix race conditions by ensuring response finalization before webhook sending
- Update subprocess monitoring to use populateFromRequest() for failed prediction responses
- Ensure all webhook payloads include complete timestamp information

Test Infrastructure Improvements

Response Validation Framework
- Add comprehensive unit tests for finalizeResponse() covering edge cases and error conditions
- Add unit tests for populateFromRequest() method with field overwrite scenarios
- Update all integration tests to validate CreatedAt, StartedAt, CompletedAt fields
- Add ValidateTerminalWebhookResponse helper function for consistent terminal response validation

Test Harness Enhancements
- Introduce testHarnessResponse type with string Logs field for easier test assertions
- Update webhook receiver to use testHarnessResponse for simplified log comparisons
- Retrofit existing tests to use new validation patterns without changing test logic

Bug Fixes

Timestamp Population Issues
- Fix missing timestamps in sync prediction responses by ensuring populateFromRequest() calls
- Resolve timestamp loss during response overwrites in handleResponseWebhooksAndCompletion
- Ensure webhook payloads contain complete timestamp information for terminal events

Metrics and Finalization
- Fix missing predict_time metric in terminal responses by calling finalizeResponse()
- Ensure Metrics map initialization before metric calculation
- Handle time parsing errors gracefully in finalizeResponse() method

Concurrency and Error Handling
- Fix potential race conditions in webhook sending by proper response finalization order
- Ensure cleanup of failed predictions includes populated response fields
- Update error responses in runner stop scenarios to include request-derived fields

Backward Compatibility

- Maintain existing API signatures where possible
- Preserve webhook payload structure while adding timestamp fields
- Keep existing test behavior while enhancing validation capabilities
…esponse handling

- Consolidate response update logic into single operation to avoid duplicate overwrites
- Extract webhook sending logic into dedicated sendStatusWebhook function
- Extract terminal completion handling into handleTerminalCompletion function
- Ensure proper lock context around populateFromRequest calls accessing pending.request
- Maintain race condition protection by using local response copy for safeSend
- Remove TODO comment about function being "a mess" - now clean and maintainable

This eliminates the scattered response object handling and reduces lock contention
while preserving all existing functionality and thread safety guarantees.
Copilot AI review requested due to automatic review settings September 17, 2025 08:24
@tempusfrangit tempusfrangit force-pushed the correct-response-formatting branch from 1824718 to a621d08 Compare September 17, 2025 08:24
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 7 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment thread internal/tests/harness_test.go
Comment thread internal/tests/harness_test.go
Comment thread internal/tests/harness_test.go
Comment thread internal/tests/harness_test.go
Comment thread internal/tests/harness_test.go
Comment thread internal/runner/types.go Outdated
Comment thread internal/runner/types.go Outdated
@tempusfrangit
Copy link
Copy Markdown
Contributor Author

I'll note that prediction ID is not sent by "real cog" in the response. Coglet does do this. It's a clear divergence - @michaeldwan we may want to consider not sending ID in the response.

@michaeldwan michaeldwan requested a review from Copilot September 17, 2025 15:22
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 2 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines 64 to 66
runner.PredictionSucceeded,
},
},
Copy link

Copilot AI Sep 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test case definition appears to be incomplete or malformed. The struct has fields for allowedPredictionStatuses but is missing other required fields like a name field that other test cases have. This could cause the test to behave unexpectedly.

Copilot uses AI. Check for mistakes.
Comment thread internal/runner/types_test.go
Copy link
Copy Markdown
Contributor

@michaeldwan michaeldwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks sensible! going to merge and run the tests that found the issue yesterday

@michaeldwan michaeldwan merged commit 5d32bf9 into main Sep 17, 2025
23 checks passed
@michaeldwan michaeldwan deleted the correct-response-formatting branch September 17, 2025 16:17
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants