AudioQnA Application

AudioQnA is an example that demonstrates how to integrate Generative AI (GenAI) models to perform question answering (QnA) on audio files and to return spoken responses. It converts the audio input to text with Automatic Speech Recognition (ASR), generates an answer to the user's query with a large language model (LLM), and converts that answer back to speech with Text-to-Speech (TTS).

Table of Contents

Architecture
Deployment Options
Validated Configurations

Architecture

The AudioQnA example is implemented using the component-level microservices defined in GenAIComps. The flowchart below shows how information flows between the microservices in this example.

---
config:
  flowchart:
    nodeSpacing: 400
    rankSpacing: 100
    curve: linear
  themeVariables:
    fontSize: 50px
---
flowchart LR
    %% Colors %%
    classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef invisible fill:transparent,stroke:transparent;
    style AudioQnA-MegaService stroke:#000000

    %% Subgraphs %%
    subgraph AudioQnA-MegaService["AudioQnA MegaService "]
        direction LR
        ASR([ASR MicroService]):::blue
        LLM([LLM MicroService]):::blue
        TTS([TTS MicroService]):::blue
    end
    subgraph UserInterface[" User Interface "]
        direction LR
        a([User Audio Query]):::orchid
        UI([UI server<br>]):::orchid
    end



    WSP_SRV{{whisper service<br>}}
    SPC_SRV{{speecht5 service <br>}}
    LLM_gen{{LLM Service <br>}}
    GW([AudioQnA GateWay<br>]):::orange


    %% Questions interaction
    direction LR
    a[User Audio Query] --> UI
    UI --> GW
    GW <==> AudioQnA-MegaService
    ASR ==> LLM
    LLM ==> TTS

    %% Backend model service flow
    direction LR
    ASR <-.-> WSP_SRV
    LLM <-.-> LLM_gen
    TTS <-.-> SPC_SRV
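
The chain of `ASR ==> LLM ==> TTS` edges in the flowchart can be read as a simple function pipeline, where each stage consumes the previous stage's output. The sketch below only illustrates that data flow and is not the GenAIComps implementation: the functions `asr`, `llm`, `tts`, `run_pipeline`, and `audioqna` are hypothetical stand-ins, and the stubs do no real speech or language processing.

```python
from typing import Callable, List


# Hypothetical stand-ins for the three microservices. A real deployment would
# call the whisper, LLM, and speecht5 backends over the network instead.
def asr(audio: bytes) -> str:
    """Pretend transcription: decode the bytes as UTF-8 text."""
    return audio.decode("utf-8")


def llm(question: str) -> str:
    """Pretend generation: wrap the question in a canned answer."""
    return f"You asked: {question}"


def tts(answer: str) -> bytes:
    """Pretend synthesis: encode the answer text back into bytes."""
    return answer.encode("utf-8")


def run_pipeline(audio: bytes, stages: List[Callable]) -> bytes:
    """Feed each stage's output into the next stage, in order."""
    data = audio
    for stage in stages:
        data = stage(data)
    return data


def audioqna(audio: bytes) -> bytes:
    """Audio in, audio out: ASR, then LLM, then TTS."""
    return run_pipeline(audio, [asr, llm, tts])
```

Keeping the stages in a list mirrors the diagram: swapping a backend (for example, a different TTS engine) means replacing one entry without touching the others.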

Deployment Options

The table below lists the currently available deployment options. Each entry describes in detail how to deploy this example on the selected hardware.

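Category	Deployment Option	Description
On-premise Deployments	Docker Compose	AudioQnA deployment on Xeon
On-premise Deployments	Docker Compose	AudioQnA deployment on Gaudi
On-premise Deployments	Docker Compose	AudioQnA deployment on AMD EPYC
On-premise Deployments	Docker Compose	AudioQnA deployment on AMD ROCm
On-premise Deployments	Kubernetes	Helm Charts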

Validated Configurations

Deploy Method	LLM Engine	LLM Model	Hardware
Docker Compose	vLLM, TGI	meta-llama/Meta-Llama-3-8B-Instruct	Intel Gaudi
Docker Compose	vLLM, TGI, GPT-SoVITS	meta-llama/Meta-Llama-3-8B-Instruct	Intel Xeon
Docker Compose	vLLM, TGI	meta-llama/Meta-Llama-3-8B-Instruct	AMD EPYC
Docker Compose	vLLM, TGI	Intel/neural-chat-7b-v3-3	AMD ROCm
Helm Charts	vLLM, TGI	meta-llama/Meta-Llama-3-8B-Instruct	Intel Gaudi
Helm Charts	vLLM, TGI	meta-llama/Meta-Llama-3-8B-Instruct	Intel Xeon
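
The table above can also be captured as data, for example to check a planned deployment against the validated combinations. The sketch below is a minimal illustration and not part of the AudioQnA code base: the `ValidatedConfig` type and the `is_validated` helper are hypothetical names, and the rows simply mirror the table, with the engine column copied as listed.

```python
from typing import NamedTuple, Tuple


class ValidatedConfig(NamedTuple):
    """One row of the Validated Configurations table (hypothetical type)."""
    deploy_method: str
    engines: Tuple[str, ...]
    model: str
    hardware: str


LLAMA = "meta-llama/Meta-Llama-3-8B-Instruct"

# Rows copied from the table above, in the same order.
VALIDATED = [
    ValidatedConfig("Docker Compose", ("vLLM", "TGI"), LLAMA, "Intel Gaudi"),
    ValidatedConfig("Docker Compose", ("vLLM", "TGI", "GPT-SoVITS"), LLAMA, "Intel Xeon"),
    ValidatedConfig("Docker Compose", ("vLLM", "TGI"), LLAMA, "AMD EPYC"),
    ValidatedConfig("Docker Compose", ("vLLM", "TGI"), "Intel/neural-chat-7b-v3-3", "AMD ROCm"),
    ValidatedConfig("Helm Charts", ("vLLM", "TGI"), LLAMA, "Intel Gaudi"),
    ValidatedConfig("Helm Charts", ("vLLM", "TGI"), LLAMA, "Intel Xeon"),
]


def is_validated(deploy_method: str, engine: str, hardware: str) -> bool:
    """Return True if some table row matches all three values."""
    return any(
        row.deploy_method == deploy_method
        and engine in row.engines
        and row.hardware == hardware
        for row in VALIDATED
    )
```

For instance, `is_validated("Helm Charts", "vLLM", "AMD ROCm")` is false because the table lists no Helm Charts row for AMD ROCm.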
