This document describes the architecture of the Graph RAG (Retrieval-Augmented Generation) application that combines OpenAI GPT with GraphDB knowledge graphs for intelligent jaguar conservation queries. The application demonstrates:
- Conversational AI using Microsoft Agent Framework DevUI that queries structured data using SPARQL
- Ontology-Aware Knowledge Extraction that transforms unstructured text into RDF knowledge graphs
This is a true semantic Graph RAG implementation using formal ontologies (RDFS/OWL), not just labeled property graphs (LPG).
┌───────────────────────────────────────────────────────────┐
│                 Microsoft Agent Framework                 │
│  ┌─────────────────────────────────────────────────────┐  │
│  │                    DevUI Server                     │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │  │
│  │  │ Auto-       │  │ Interactive │  │ Built-in    │  │  │
│  │  │ opening     │  │ Chat        │  │ Debugging   │  │  │
│  │  │ Browser     │  │ Interface   │  │ Tools       │  │  │
│  │  └─────────────┘  └─────────────┘  └─────────────┘  │  │
│  └─────────────────────────────────────────────────────┘  │
└─────────────────────────────┬─────────────────────────────┘
                              │
┌─────────────────────────────▼─────────────────────────────┐
│                 Microsoft Agent Framework                 │
│  ┌─────────────────────────────────────────────────────┐  │
│  │                 Jaguar Query Agent                  │  │
│  │   ┌─────────────────────────────────────────────┐   │  │
│  │   │                System Prompt                │   │  │
│  │   │  - Graph RAG Instructions                   │   │  │
│  │   │  - SPARQL Guidelines                        │   │  │
│  │   │  - Response Formatting                      │   │  │
│  │   └─────────────────────────────────────────────┘   │  │
│  │                                                     │  │
│  │   ┌─────────────────────────────────────────────┐   │  │
│  │   │                OpenAI Client                │   │  │
│  │   │  - GPT-4 Integration                        │   │  │
│  │   │  - Function Calling                         │   │  │
│  │   │  - Thread Management                        │   │  │
│  │   └─────────────────────────────────────────────┘   │  │
│  │                                                     │  │
│  │   ┌─────────────────────────────────────────────┐   │  │
│  │   │                GraphDB Tool                 │   │  │
│  │   │  - SPARQL Query Execution                   │   │  │
│  │   │  - Query Validation                         │   │  │
│  │   │  - Result Processing                        │   │  │
│  │   └─────────────────────────────────────────────┘   │  │
│  └─────────────────────────────────────────────────────┘  │
└─────────────────────────────┬─────────────────────────────┘
                              │
┌─────────────────────────────▼─────────────────────────────┐
│                     External Systems                      │
│  ┌─────────────────┐    ┌─────────────────────┐           │
│  │ OpenAI API      │    │ GraphDB             │           │
│  │ - GPT-4         │    │ - RDF Triple Store  │           │
│  │ - Responses     │    │ - SPARQL Engine     │           │
│  │ - Function      │    │ - Jaguar Ontology   │           │
│  │   Calling       │    │ - Conservation      │           │
│  └─────────────────┘    │   Data              │           │
│                         └─────────────────────┘           │
└───────────────────────────────────────────────────────────┘
In addition to querying existing knowledge graphs, this system provides ontology-driven knowledge extraction from unstructured text:
┌───────────────────────────────────────────────────────────┐
│            Ontology-Aware Knowledge Extraction            │
│                                                           │
│        ┌─────────────┐             ┌─────────────┐        │
│        │   Jaguar    │             │   Jaguar    │        │
│        │  Ontology   │             │   Corpus    │        │
│        │   (.ttl)    │             │   (.txt)    │        │
│        └──────┬──────┘             └──────┬──────┘        │
│               │                           │               │
│               └─────────────┬─────────────┘               │
│                             │                             │
│                  ┌──────────▼──────────┐                  │
│                  │        GPT-5        │                  │
│                  │  Semantic Analysis  │                  │
│                  │ - Domain Context    │                  │
│                  │ - Disambiguation    │                  │
│                  │ - Relationship      │                  │
│                  │   Inference         │                  │
│                  └──────────┬──────────┘                  │
│                             │                             │
│                  ┌──────────▼──────────┐                  │
│                  │     RDF Turtle      │                  │
│                  │     Generation      │                  │
│                  │ - Aligned to        │                  │
│                  │   Ontology          │                  │
│                  │ - Valid Structure   │                  │
│                  └──────────┬──────────┘                  │
│                             │                             │
│                  ┌──────────▼──────────┐                  │
│                  │       GraphDB       │                  │
│                  │       Import        │                  │
│                  │   (Text Snippet)    │                  │
│                  └─────────────────────┘                  │
└───────────────────────────────────────────────────────────┘
Key Innovation: The LLM uses the formal ontology to understand domain semantics, enabling it to:
- Disambiguate entities (wildlife jaguars vs. cars vs. guitars)
- Extract only relevant information aligned with the ontology
- Infer relationships between entities based on context
- Generate valid RDF that conforms to the ontology structure
Why This Requires RDF/Ontologies:
- LPG databases lack formal semantic definitions
  - No standardized class hierarchies (RDFS/OWL)
  - No property domain/range constraints
  - No reasoning or inference capabilities
- LLMs need semantic structure, not just node/edge labels
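
The domain/range point above can be illustrated with a small, dependency-free sketch. It is not the project's code: the `ex:` classes, the `ex:livesIn` property and the instance names are hypothetical and are not taken from `jaguar_ontology.ttl`. Note also that RDFS itself uses domain and range for inference; this sketch treats them as conformance checks on extracted triples, which is one possible way to use them.

```python
# Hypothetical mini-ontology; names are illustrative only and are
# NOT taken from jaguar_ontology.ttl.
SUBCLASS_OF = {
    "ex:Jaguar": "ex:Animal",
    "ex:Animal": None,
    "ex:Habitat": None,
}

# property -> (domain class, range class)
PROPERTIES = {
    "ex:livesIn": ("ex:Animal", "ex:Habitat"),
}


def is_a(cls, ancestor):
    """Return True if cls equals ancestor or is a transitive subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def conforms(triple, types):
    """Check one (subject, property, object) triple against domain and range.

    `types` maps each entity to its asserted class.
    """
    subject, prop, obj = triple
    if prop not in PROPERTIES:
        return False  # the ontology does not define this property
    domain, range_ = PROPERTIES[prop]
    return is_a(types.get(subject), domain) and is_a(types.get(obj), range_)


types = {
    "ex:jaguar_001": "ex:Jaguar",
    "ex:habitat_001": "ex:Habitat",
    "ex:jaguar_xk": "ex:Car",  # a car that happens to be called Jaguar
}

wildlife_ok = conforms(("ex:jaguar_001", "ex:livesIn", "ex:habitat_001"), types)
car_ok = conforms(("ex:jaguar_xk", "ex:livesIn", "ex:habitat_001"), types)
print(wildlife_ok)  # True
print(car_ok)       # False
```

A labeled property graph could store the same three nodes, but nothing in the graph itself would say that `ex:livesIn` only makes sense for animals.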
- DevUI Server: Built-in development web interface
  - Auto-opening Browser: Automatically launches at http://localhost:8000
  - Interactive Chat: Real-time conversation interface
  - Built-in Debugging: Visual inspection of agent behavior
  - Zero Configuration: No frontend code required
- Jaguar Query Agent: Graph RAG specialist agent
  - System Prompt: Graph RAG instructions and SPARQL guidelines
  - OpenAI Client: GPT-4 integration with function calling
  - Thread Management: Conversation context preservation
- text2knowledge.ipynb: Jupyter notebook for ontology-aware extraction
  - Loads formal ontology (jaguar_ontology.ttl)
  - Processes unstructured text corpus
  - Uses GPT-5 for semantic entity extraction
  - Generates RDF Turtle aligned with ontology
  - Demonstrates concept understanding vs. pattern matching
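
As a rough illustration of the notebook's approach, the sketch below shows one way the ontology and a text passage might be combined into a single extraction prompt. The function name `build_extraction_prompt`, the prompt wording and the inline sample strings are assumptions made for this example; the actual notebook may structure its prompt differently.

```python
def build_extraction_prompt(ontology_ttl: str, text: str) -> str:
    """Combine the ontology and a text passage into one extraction prompt."""
    return (
        "You are extracting an RDF knowledge graph.\n"
        "Use ONLY the classes and properties defined in this ontology:\n\n"
        f"{ontology_ttl.strip()}\n\n"
        "Ignore passages outside the ontology's domain "
        "(for example, cars or guitars named Jaguar).\n"
        "Return valid RDF Turtle and nothing else.\n\n"
        f"Text:\n{text.strip()}\n"
    )


# Tiny inline stand-ins (hypothetical content). A real run would read
# data/jaguar_ontology.ttl and data/jaguar_corpus.txt instead.
sample_ontology = """
@prefix ex: <http://example.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
ex:Jaguar a rdfs:Class .
"""
sample_text = "A jaguar was photographed by a trail camera. The Jaguar E-Type is a sports car."

prompt = build_extraction_prompt(sample_ontology, sample_text)
print(prompt)
```

Because the ontology travels with every request, the model sees the class and property definitions it is expected to align with, rather than guessing a schema from the text.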
- GraphDB Tool: Core Graph RAG component
  - Executes SPARQL queries against knowledge graph
  - Validates query syntax and ontology compliance
  - Processes and formats query results
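
The execute-and-process steps of such a tool can be sketched as follows. This is a minimal illustration, not the contents of `jaguar_tool.py`: the function names are hypothetical, and it uses the standard library's `urllib` instead of Requests so that it runs without extra packages. It assumes the repository is reachable at `<GRAPHDB_URL>/repositories/<GRAPHDB_REPOSITORY>` and returns results in the SPARQL JSON results format.

```python
import json
import urllib.parse
import urllib.request


def build_sparql_request(base_url, repository, query):
    """Build (but do not send) a SPARQL query request for a GraphDB repository."""
    endpoint = f"{base_url.rstrip('/')}/repositories/{urllib.parse.quote(repository)}"
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def flatten_bindings(results):
    """Turn SPARQL JSON results into a list of {variable: value} dicts."""
    variables = results["head"]["vars"]
    return [
        {var: row[var]["value"] for var in variables if var in row}
        for row in results["results"]["bindings"]
    ]


def run_query(base_url, repository, query, timeout=30):
    """Send the query and return flattened rows (needs a running GraphDB)."""
    request = build_sparql_request(base_url, repository, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return flatten_bindings(json.load(response))


# Offline demo: inspect the request and flatten a hand-written result.
request = build_sparql_request(
    "http://localhost:7200", "your_repo_name", "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"
)
sample = {
    "head": {"vars": ["name", "location"]},
    "results": {"bindings": [{"name": {"type": "literal", "value": "J1"}}]},
}
rows = flatten_bindings(sample)
print(request.full_url)
print(rows)
```

Unbound variables are simply omitted from a row, which keeps the text handed back to the LLM compact.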
- OpenAI API: GPT-4/GPT-5 for natural language processing and semantic extraction
- GraphDB: RDF triple store with jaguar conservation ontology
- DevUI Launch: `python3 main.py` starts the DevUI server
- User Interaction: User types queries in the DevUI interface
- Agent Processing:
  - LLM Analysis: OpenAI GPT analyzes the natural language query
  - SPARQL Generation: The LLM generates an appropriate SPARQL query
  - Tool Call Detection: Agent Framework detects the need for the GraphDB tool
- Knowledge Graph Query:
  - SPARQL Execution: The `query_jaguar_database` tool executes the query
  - Result Processing: Raw SPARQL results are processed and formatted
- Response Generation:
  - LLM Interpretation: OpenAI GPT interprets the GraphDB results
  - Natural Language Response: Generates a human-readable response
  - Markdown Formatting: Applies formatting with code blocks
- DevUI Display:
  - Response Delivery: Shows the formatted response in DevUI
  - Context Preservation: Conversation history is maintained automatically
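
The flow above can be condensed into a small sketch with the LLM and GraphDB replaced by stubs. In the real application the Agent Framework drives these steps through function calling; the linear `answer_question` function and the `fake_*` helpers here are hypothetical and exist only to make the order of operations concrete.

```python
def answer_question(question, generate_sparql, run_query, summarize):
    """Run one Graph RAG turn: question -> SPARQL -> rows -> answer."""
    sparql = generate_sparql(question)   # LLM turns the question into SPARQL
    rows = run_query(sparql)             # tool executes it against the graph
    answer = summarize(question, rows)   # LLM interprets the rows
    # Show the query alongside the answer so it can be verified in GraphDB.
    return f"{answer}\n\nSPARQL used:\n\n    {sparql}"


# Stubs standing in for the OpenAI client and the GraphDB tool.
def fake_generate_sparql(question):
    return "SELECT ?jaguar WHERE { ?jaguar a ex:Jaguar }"


def fake_run_query(sparql):
    return [{"jaguar": "ex:jaguar_001"}, {"jaguar": "ex:jaguar_002"}]


def fake_summarize(question, rows):
    return f"Found {len(rows)} jaguars."


reply = answer_question(
    "How many jaguars are in the graph?",
    fake_generate_sparql,
    fake_run_query,
    fake_summarize,
)
print(reply)
```

Passing the three stages in as callables keeps the sketch testable without network access and mirrors the separation between LLM calls and the GraphDB tool.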
DevUI provides automatic state management:

    # DevUI handles all state management automatically
    serve(entities=[query_agent], auto_open=True)

Microsoft Agent Framework handles conversation management:

    # Agent Framework manages threads and context automatically
    response = asyncio.run(agent.run(user_message, thread=thread, store=True))

The Graph RAG pattern combines the LLM with knowledge graph queries:

    # LLM generates SPARQL, tool executes against GraphDB
    tools=[query_jaguar_database]
    tool_choice="auto"

State management requires no manual configuration:

    # DevUI handles all state management automatically
    # No manual state configuration required
    serve(entities=[query_agent], auto_open=True)

- Agent Framework: Microsoft Agent Framework manages conversation context
- OpenAI Responses API: Server-side thread persistence
- DevUI Interface: Built-in conversation history and state management
Environment variables (`.env`):

    # OpenAI Configuration
    OPENAI_API_KEY=your_openai_api_key
    OPENAI_RESPONSES_MODEL_ID=gpt-4
    # GraphDB Configuration
    GRAPHDB_URL=http://localhost:7200
    GRAPHDB_REPOSITORY=your_repo_name

Agent settings:

    # Hardcoded in create_jaguar_query_agent() function
    settings = OpenAISettings(
        api_key=os.getenv("OPENAI_API_KEY", ""),
        model_id=os.getenv("OPENAI_RESPONSES_MODEL_ID", "gpt-4")
    )

- API Keys: Stored in `.env`, never committed to repository
- Input Validation: User inputs validated before processing
- Error Handling: Errors logged but sanitized for user display
- SPARQL Injection Prevention: Tool validates SPARQL syntax
- Read-Only Access: Agent only reads from GraphDB
- No Data Storage: No user data persisted beyond conversation
- Secure APIs: Use HTTPS for all external communications
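
One way to back the read-only guarantee is to reject any query that contains a SPARQL Update keyword before it reaches GraphDB. The sketch below is a conservative heuristic written for this document, not the project's validator: `is_read_only` is a hypothetical name, it does not parse SPARQL, and it can reject a legitimate query that merely mentions a word such as "add" in a literal or IRI.

```python
import re

# SPARQL 1.1 Update operations that would modify the store.
_UPDATE_KEYWORDS = re.compile(
    r"\b(INSERT|DELETE|LOAD|CLEAR|CREATE|DROP|COPY|MOVE|ADD)\b", re.IGNORECASE
)
# The four SPARQL query forms, all of which only read data.
_READ_FORMS = re.compile(r"\b(SELECT|ASK|CONSTRUCT|DESCRIBE)\b", re.IGNORECASE)


def is_read_only(query: str) -> bool:
    """Heuristic: accept read queries, reject anything with an update keyword."""
    if _UPDATE_KEYWORDS.search(query):
        return False
    return bool(_READ_FORMS.search(query))


select_ok = is_read_only("SELECT ?s WHERE { ?s ?p ?o } LIMIT 5")
delete_ok = is_read_only("DELETE WHERE { ?s ?p ?o }")
print(select_ok)  # True
print(delete_ok)  # False
```

A keyword filter like this is best treated as a second line of defense; restricting the GraphDB account used by the agent to read access is the more robust control.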
- Microsoft Agent Framework DevUI
- Automatic state management
- Single-user design
- OpenAI API rate limits
- Multi-User Support: Session-based user management
- Distributed State: Redis or database backend
- Horizontal Scaling: Multiple agent instances
- Query Caching: Cache frequent SPARQL patterns
- Async Processing: Non-blocking GraphDB queries
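
The query-caching idea could start as small as the sketch below, which memoizes results by exact query text using `functools.lru_cache`. The names `make_cached_runner` and `fake_run_query` are hypothetical, and the stub replaces the real GraphDB call. Cached results go stale when the graph changes, so a real deployment would also need expiry or invalidation.

```python
from functools import lru_cache


def make_cached_runner(run_query, maxsize=128):
    """Wrap run_query so that repeating the same query text hits a cache."""

    @lru_cache(maxsize=maxsize)
    def _cached(query_text):
        # Store rows as a tuple so the cached sequence itself is immutable.
        return tuple(run_query(query_text))

    def runner(query):
        # Leading and trailing whitespace does not change a query's meaning.
        return list(_cached(query.strip()))

    return runner


# Demo with a stub standing in for the real GraphDB call.
executed = []


def fake_run_query(query_text):
    executed.append(query_text)
    return [{"count": "3"}]


runner = make_cached_runner(fake_run_query)
first = runner("SELECT (COUNT(?j) AS ?count) WHERE { ?j a ex:Jaguar }")
second = runner("  SELECT (COUNT(?j) AS ?count) WHERE { ?j a ex:Jaguar }  ")
print(len(executed))  # 1: the second call was served from the cache
```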
- DevUI request/response logging
- Agent Framework conversation logging
- GraphDB query execution logging
- Response Times: Graph RAG query processing time
- Query Success Rate: SPARQL execution success rate
- Tool Usage: GraphDB tool call frequency
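
These three metrics can be collected with a thin wrapper around the tool call. The sketch below is an assumption about how that might look, not existing project code; `ToolMetrics` and the demo callables are hypothetical.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ToolMetrics:
    """Counts tool calls and failures and records per-call durations."""

    calls: int = 0
    failures: int = 0
    durations: list = field(default_factory=list)

    def record(self, func, *args, **kwargs):
        """Call func, timing it and counting success or failure."""
        self.calls += 1
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            self.failures += 1
            raise
        finally:
            self.durations.append(time.perf_counter() - start)

    @property
    def success_rate(self):
        """Fraction of calls that returned without raising."""
        if self.calls == 0:
            return 1.0
        return (self.calls - self.failures) / self.calls


def failing_query(query):
    raise ValueError("malformed query")


metrics = ToolMetrics()
metrics.record(lambda query: [{"jaguar": "ex:jaguar_001"}], "SELECT ?jaguar WHERE { ?jaguar a ex:Jaguar }")
try:
    metrics.record(failing_query, "SELEC ?jaguar")
except ValueError:
    pass

print(metrics.calls, metrics.failures, metrics.success_rate)  # 2 1 0.5
```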
- DevUI Interface: Test queries through the DevUI
- SPARQL Validation: Verify generated queries in GraphDB
- Response Quality: Check natural language response accuracy
- Microsoft Agent Framework: Agent management and DevUI
- OpenAI: GPT API client
- Requests: GraphDB HTTP communication
- python-dotenv: Environment variable management
- GraphDB: RDF triple store
- SPARQL: Query language for knowledge graphs
- RDF/Turtle: Ontology format
graph_RAG/
├── main.py                        # DevUI entry point
├── text2knowledge.ipynb           # Ontology-aware knowledge extraction notebook
├── src/
│   └── agents/
│       ├── jaguar_query_agent.py  # Agent creation with DevUI integration
│       └── jaguar_tool.py         # GraphDB tool implementation
├── data/
│   ├── jaguar_ontology.ttl        # Jaguar conservation ontology (RDFS/OWL)
│   ├── jaguars.ttl                # Jaguar instance data (RDF)
│   └── jaguar_corpus.txt          # Mixed-content text corpus for extraction
├── docs/
│   ├── agent_design.md            # Agent design documentation
│   └── architecture.md            # Architecture documentation
├── .env                           # Environment variables
├── requirements.txt               # Python dependencies
└── README.md                      # Project documentation
- True Semantic Graph RAG: Uses formal ontologies (RDFS/OWL), not just LPG
- Hybrid Intelligence: Combines LLM reasoning with structured data
- Ontology-Driven Extraction: Intelligent knowledge mining from unstructured text
- Real-time Queries: Live data from knowledge graphs
- Context Awareness: Maintains conversation context
- Standards-Based: W3C standards (RDF, SPARQL, RDFS, OWL)
- Extensible: Easy to add new tools and capabilities
- Formal Ontologies: Machine-readable domain semantics
- Class Hierarchies: RDFS/OWL inheritance and reasoning
- Property Definitions: Domain/range constraints
- Semantic Interoperability: Universal URIs and namespaces
- Reasoning Capabilities: Built-in inference engines
- LLM-Friendly: Rich semantic structure for AI understanding
- Data-Driven Insights: Access to structured conservation data
- Natural Language Interface: Easy querying of complex data
- Knowledge Mining: Extract conservation data from research papers
- Educational Tool: Demonstrates semantic Graph RAG capabilities
- Research Support: Facilitates conservation research queries