chatbot-evaluation

Here are 9 public repositories matching this topic...

SparkBeyond / agentune

Tune your AI agent to best meet its KPI through a cyclic process of analysis, improvement, and simulation

customer-support customer-service conversational-agents ai-agents chatbot-evaluation agent-simulator kpi-analysis agent-evaluation agent-optimization sales-agents customer-facing-agents kpi-optimization

Updated Jan 14, 2026
Python

ricky-ma / ChatbotAnalytics

Exploratory data analysis and interactive model-understanding and evaluation tool for chatbot training data and feedback

maintenance user-feedback confirmation chatbot-predictions chatbot-evaluation

Updated Feb 2, 2024
Jupyter Notebook

alexostrovsky01 / chatbot-testing-framework

An open-source framework for robust, LLM-powered testing and tracing of conversational AI applications.

testing tracing chatbot-application conversational-ai chatbot-testing chatbot-evaluation llm chatbot-tracing

Updated Sep 14, 2025
Python

aron-radvanyi / GENERATIVE_AI_Embedding_Metrics_Calculator

Jupyter-based tool for calculating the Greedy Matching, Vector Extrema, and Average Embedding evaluation metrics for generative AI chatbots

nlp machine-learning natural-language-processing ai deep-learning word2vec chatbot question-answering chatbot-application embedding evaluation-metrics embedding-evaluation chatbot-evaluation generative-ai embedding-metrics

Updated Jul 31, 2023
Jupyter Notebook

Chatbot-TRACER / TRACER-evaluation

Evaluation results and experimental data for TRACER, demonstrating its effectiveness in discovering chatbot functionalities and detecting errors with coverage analysis and mutation testing.

mutation-testing experimental-data conversational-ai chatbot-evaluation ai-testing

Updated Nov 28, 2025

zichenzha0 / Text-Reference-AIChatbot

AI chatbot evaluation benchmark for mental health and suicide prevention with rule-based ethical alignment, inclusivity scoring, and sentiment analysis. Research paper submitted to AI & Society (Springer Nature), currently under review.

inclusivity benchmark mental-health research-paper springernature ai-ethics suicide-prevention lgbtq social-work chatbot-evaluation ai-societies

Updated May 2, 2026
Python

Eis4TY / Eval-Any-Agent

Self-hosted LLM/Agent batch evaluation platform with OpenAI-compatible API testing, streaming response parsing, LLM-as-a-Judge scoring, Docker deployment, and CSV/XLSX export.

Updated May 12, 2026
TypeScript

prajaktapandit7 / conversational-AI-evaluation

Structured evaluation of 30 support bot conversations measuring containment, escalation rate, intent accuracy, and CSAT correlation, with LLM-assisted qualitative coding, edge cases, and recommendations.

ai-agents conversational-ai chatbot-evaluation cx-analytics qualitative-coding ai-product-evaluation bot-performance containment-rate

Updated Feb 19, 2026

nhidang912 / ChatbotEvaluation

First publication & part of Master’s thesis at HCMUS: End-to-end automatic evaluation framework for retrieval-augmented chatbots (accepted at ACIIDS 2026)

data-synthesis chatbot-evaluation hallucination-detection llm-as-a-judge

Updated Mar 12, 2026
Python

