feat(ai): add transcription support #1470

Merged
hwbrzzl merged 8 commits into master from bowen/ai-transcription
May 17, 2026
Conversation

@hwbrzzl hwbrzzl commented May 16, 2026

Summary

  • Add a fluent AI transcription API that lets applications choose a provider, override the model, set a language hint, apply a timeout, and request speaker diarization.
  • Add transcription file helpers for uploads, paths, and storage-backed files, and return typed transcript text, segments, and usage metadata from the framework response.
  • Implement OpenAI /audio/transcriptions support with default transcription models, diarized segment parsing, provider capability validation, and regression coverage for nil-file and empty-response cases.

Closes goravel/goravel#963

Why

This branch adds the missing speech-to-text workflow to Goravel's AI package, so applications can transcribe audio through the same fluent request style already used for chat, audio, and image generation. The new contracts and helpers make it possible to pass uploaded files or stored audio directly into a provider-backed transcription request without dropping down to provider-specific code.

audio, err := ctx.Request().File("audio")
if err != nil {
	return ctx.Response().Json(http.StatusBadRequest, http.Json{"message": err.Error()})
}

response, err := facades.AI().Transcription(
	transcription.FromUpload(audio),
).Provider("openai").
	Language("en").
	Diarize().
	Timeout(30 * time.Second).
	Generate()
if err != nil {
	return ctx.Response().Json(http.StatusInternalServerError, http.Json{"message": err.Error()})
}

return ctx.Response().Success().Json(http.Json{
	"text":     response.Text(),
	"segments": response.Segments(),
})

The OpenAI implementation now handles both standard and diarized transcription responses, preserves usage metadata, and rejects invalid file inputs before making a provider request. That keeps the new API aligned with the rest of the framework while covering the edge cases and follow-up fixes added during review.
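As a rough illustration of handling both response kinds with one decoder, the sketch below parses a standard payload and a diarized payload into the same struct and rejects an empty body. The JSON field names (`text`, `segments`, `speaker`, `start`, `end`) are an assumed shape for illustration, and `parseTranscript` is a hypothetical helper; the actual provider payload and the code in ai/openai/provider.go may differ.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// segment and transcript model an ASSUMED response shape, not a confirmed one.
type segment struct {
	Speaker string  `json:"speaker"`
	Start   float64 `json:"start"`
	End     float64 `json:"end"`
	Text    string  `json:"text"`
}

type transcript struct {
	Text     string    `json:"text"`
	Segments []segment `json:"segments"` // stays empty for non-diarized responses
}

// parseTranscript decodes a response body and treats an empty body as an error.
func parseTranscript(body []byte) (transcript, error) {
	var t transcript
	if len(body) == 0 {
		return t, errors.New("empty transcription response")
	}
	if err := json.Unmarshal(body, &t); err != nil {
		return t, fmt.Errorf("decode transcription response: %w", err)
	}
	return t, nil
}

func main() {
	standard := []byte(`{"text":"hello world"}`)
	diarized := []byte(`{"text":"hi there","segments":[` +
		`{"speaker":"A","start":0,"end":0.8,"text":"hi"},` +
		`{"speaker":"B","start":0.9,"end":1.5,"text":"there"}]}`)

	for _, body := range [][]byte{standard, diarized, nil} {
		t, err := parseTranscript(body)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("text=%q segments=%d\n", t.Text, len(t.Segments))
	}
}
```

Because `segments` is simply absent from a standard response, a single struct covers both cases and callers can check `len(t.Segments)` to tell them apart.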

@hwbrzzl hwbrzzl requested a review from a team as a code owner May 16, 2026 05:05
Copilot AI review requested due to automatic review settings May 16, 2026 05:05

codecov Bot commented May 16, 2026

Codecov Report

❌ Patch coverage is 77.90698% with 38 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.32%. Comparing base (c78a1d9) to head (dafa584).

Files with missing lines            Patch %   Lines
ai/openai/provider.go               78.49%    14 Missing and 6 partials ⚠️
ai/setup/stubs.go                   0.00%     6 Missing ⚠️
ai/transcription/transcription.go   40.00%    6 Missing ⚠️
ai/response.go                      73.33%    2 Missing and 2 partials ⚠️
ai/application.go                   83.33%    1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1470      +/-   ##
==========================================
+ Coverage   69.26%   69.32%   +0.05%     
==========================================
  Files         373      374       +1     
  Lines       29523    29686     +163     
==========================================
+ Hits        20450    20580     +130     
- Misses       8135     8159      +24     
- Partials      938      947       +9     
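The patch figures in the report can be cross-checked with a little arithmetic: the per-file missing and partial counts sum to the 38 uncovered lines, and 38 uncovered lines at 77.90698% coverage implies a patch of about 172 lines. The sketch below reproduces that check. It assumes that partially covered lines count as uncovered in the patch total, which is what the per-file sum suggests; the type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// fileMiss holds the "Missing" and "partials" counts copied from the report table.
type fileMiss struct {
	name     string
	missing  int
	partials int
}

// reportFiles mirrors the five rows of the "Files with missing lines" table.
var reportFiles = []fileMiss{
	{"ai/openai/provider.go", 14, 6},
	{"ai/setup/stubs.go", 6, 0},
	{"ai/transcription/transcription.go", 6, 0},
	{"ai/response.go", 2, 2},
	{"ai/application.go", 1, 1},
}

// uncovered sums missing and partial lines across all files.
func uncovered(files []fileMiss) int {
	total := 0
	for _, f := range files {
		total += f.missing + f.partials
	}
	return total
}

// patchLines estimates the patch size from the uncovered count and coverage percentage.
func patchLines(uncoveredLines int, coveragePct float64) int {
	return int(math.Round(float64(uncoveredLines) / (1 - coveragePct/100)))
}

func main() {
	miss := uncovered(reportFiles)
	total := patchLines(miss, 77.90698)
	covered := total - miss
	fmt.Printf("uncovered=%d total=%d covered=%d coverage=%.5f%%\n",
		miss, total, covered, 100*float64(covered)/float64(total))
}
```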

Copilot AI left a comment

Pull request overview

Adds a fluent AI transcription workflow to the framework, including contracts, facade/application plumbing, helper constructors, OpenAI /audio/transcriptions support, response types, errors, mocks, config stubs, and tests.

Changes:

  • Added transcription request/response/provider contracts and application request handling.
  • Implemented OpenAI transcription generation, diarized segment parsing, usage mapping, and config defaults.
  • Added helper package APIs, generated mocks, and regression tests.

Reviewed changes

Copilot reviewed 16 out of 20 changed files in this pull request and generated 3 comments.

Show a summary per file
File                                    Description
mocks/ai/TranscriptionResponse.go       Adds generated transcription response mock.
mocks/ai/TranscriptionRequest.go        Adds generated transcription request mock.
mocks/ai/TranscriptionProvider.go       Adds generated transcription provider mock.
mocks/ai/AI.go                          Adds mocked AI.Transcription.
errors/list.go                          Adds transcription-related AI errors.
contracts/ai/transcription.go           Defines transcription request and segment contracts.
contracts/ai/response.go                Adds transcription response contract.
contracts/ai/provider.go                Adds transcription prompt/provider contracts.
contracts/ai/config.go                  Adds transcription model config.
contracts/ai/ai.go                      Adds transcription entrypoint to AI contract.
ai/transcription/transcription.go       Adds transcription helper constructors.
ai/transcription/transcription_test.go  Tests transcription helper constructors.
ai/transcription_request.go             Implements fluent transcription request.
ai/setup/stubs.go                       Updates generated AI config stub.
ai/response.go                          Implements transcription response storage/accessors.
ai/response_transcription_test.go       Tests transcription response behavior.
ai/openai/provider.go                   Implements OpenAI transcription support.
ai/openai/provider_test.go              Adds OpenAI transcription tests.
ai/application.go                       Wires transcription through application/provider resolution.
ai/application_test.go                  Tests transcription application/request flow.
Files not reviewed (4)
  • mocks/ai/AI.go: Language not supported
  • mocks/ai/TranscriptionProvider.go: Language not supported
  • mocks/ai/TranscriptionRequest.go: Language not supported
  • mocks/ai/TranscriptionResponse.go: Language not supported

Comment thread ai/openai/provider.go Outdated
Comment thread ai/openai/provider.go
Comment thread ai/setup/stubs.go Outdated
Comment thread ai/openai/provider.go
Comment thread ai/transcription/transcription.go
Comment thread ai/response_transcription_test.go Outdated
Copilot AI review requested due to automatic review settings May 16, 2026 09:29

Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 20 changed files in this pull request and generated 3 comments.

Files not reviewed (4)
  • mocks/ai/AI.go: Language not supported
  • mocks/ai/TranscriptionProvider.go: Language not supported
  • mocks/ai/TranscriptionRequest.go: Language not supported
  • mocks/ai/TranscriptionResponse.go: Language not supported

Comment thread ai/openai/provider.go
Comment thread ai/transcription/transcription.go
Comment thread ai/openai/provider.go
Copilot AI review requested due to automatic review settings May 16, 2026 10:37

Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 22 changed files in this pull request and generated 2 comments.

Files not reviewed (4)
  • mocks/ai/AI.go: Language not supported
  • mocks/ai/TranscriptionProvider.go: Language not supported
  • mocks/ai/TranscriptionRequest.go: Language not supported
  • mocks/ai/TranscriptionResponse.go: Language not supported

Comment thread ai/image/image.go
Comment thread ai/transcription/transcription.go

hwbrzzl commented May 16, 2026

Pull request overview

Copilot reviewed 18 out of 22 changed files in this pull request and generated 2 comments.

Files not reviewed (4)

@copilot review the remaining files.

Copilot AI commented May 16, 2026

Pull request overview

Copilot reviewed 18 out of 22 changed files in this pull request and generated 2 comments.

...

Reviewed the remaining files. Found two issues to fix (commit 3829d56):

  1. ai/image/image.go – The Of helper and its test were unintentionally removed in the last optimize commit. Restored both, since removing an exported function is a breaking API change unrelated to transcription.

  2. ai/transcription/transcription_test.go – FromStorage and WithDisk were untested. Added TestFromStorage and TestFromStorageWithDisk that verify the wrapper returns a valid attachment and that content resolution routes through the storage path (confirmed by the expected StorageFacadeNotSet error when no storage backend is configured).

All other changed files (contracts/ai/, ai/transcription_request.go, ai/response.go, mocks/ai/) look correct and consistent with the rest of the codebase.

Copilot AI review requested due to automatic review settings May 17, 2026 02:24

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 24 changed files in this pull request and generated 2 comments.

Files not reviewed (4)
  • mocks/ai/AI.go: Language not supported
  • mocks/ai/TranscriptionProvider.go: Language not supported
  • mocks/ai/TranscriptionRequest.go: Language not supported
  • mocks/ai/TranscriptionResponse.go: Language not supported

Comment thread ai/image/image.go
Comment thread ai/transcription/transcription.go
@hwbrzzl hwbrzzl merged commit edd87ef into master May 17, 2026
20 checks passed
@hwbrzzl hwbrzzl deleted the bowen/ai-transcription branch May 17, 2026 02:43
Development

Successfully merging this pull request may close these issues.

feat support where has with throgh in orm

3 participants