6 changes: 4 additions & 2 deletions ai/generative-ai-service/sentiment-categorization/README.md
@@ -3,8 +3,10 @@
The Customer Message Analyzer is a tool designed to analyze customer messages through unsupervised categorization, sentiment analysis, and summary reporting. It helps businesses understand customer feedback without requiring extensive manual labeling or analysis.


Reviewed: 01.04.2025

Reviewed: 19.09.2025

<img width="2542" height="1202" alt="image" src="https://github.com/user-attachments/assets/bdb7dbb0-78ec-4896-bb93-927bf75c31d9" />

# When to use this asset?

Customer service teams, product managers, and marketing professionals would use this asset when they need to quickly understand large volumes of customer feedback, identify trends, and make data-driven decisions to improve products or services.
@@ -0,0 +1,6 @@
[client]
showSidebarNavigation = false
toolbarMode = "minimal"

[server]
headless = true
121 changes: 51 additions & 70 deletions ai/generative-ai-service/sentiment-categorization/files/README.md
@@ -1,67 +1,33 @@
# Customer-Agent Conversation Analysis and Categorization Demo
This demo showcases an AI-powered solution for analyzing batches of customer messages, categorizing them into hierarchical levels, extracting sentiment scores, and generating structured reports.

## Key Features
* **Hierarchical Categorization**: Automatically categorizes messages into three levels of hierarchy:
+ Primary Category: High-level categorization
+ Secondary Category: Mid-level categorization, building upon primary categories
+ Tertiary Category: Low-level categorization, providing increased specificity and detail
* **Sentiment Analysis**: Extracts sentiment scores for each message, ranging from very negative (1) to very positive (10)
* **Structured Reporting**: Generates a comprehensive report analyzing the batch of messages, including:
+ Category distribution across all three levels
+ Sentiment score distribution
+ Summaries of key findings and insights

## Data Requirements
* Customer messages should be stored in a CSV file(s) within a folder named `data`.
* Each CSV file should contain a column with the message text.

## Python Version
This project requires **Python 3.13** or later. You can check your current Python version by running:
```
python --version
```
or
```
python3 --version
```
This demo showcases an AI-powered solution for analyzing batches of customer messages, categorizing them into hierarchical levels, extracting sentiment scores, and generating structured reports. The latest version adds a professional, corporate UI theme, CSV upload/validation in the sidebar, and step-aware progress feedback during processing.

## Getting Started
To run the demo, follow these steps:
1. Clone the repository using `git clone`.
2. *(Optional but recommended)* Create and activate a Python virtual environment:
- On Windows:
```
python -m venv venv
venv\Scripts\activate
```
- On macOS/Linux:
```
python3 -m venv venv
source venv/bin/activate
```
3. Place your CSV files containing customer messages in the `data` folder. Ensure each includes a column with the message text.
4. Install dependencies using `pip install -r requirements.txt`.
5. Run the application using `streamlit run app.py`.
## Key Features
- Hierarchical Categorization
- Primary Category: High-level categorization
- Secondary Category: Mid-level categorization, building upon primary categories
- Tertiary Category: Low-level categorization, providing increased specificity and detail
- Sentiment Analysis
- Extracts sentiment scores for each message, from very negative (1) to very positive (10)
- Structured Reporting
- Category distribution across all three levels
- Sentiment score distribution
- Summaries of key findings and insights
- CSV Upload and Validation
- Upload CSV in the sidebar; validates required columns ID and Message before running
- Displays a preview in the sidebar and a full interactive table in the main area
- Execution Progress and Status
- Step-aware progress bar with status text showing the currently running stage and total steps
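
The step-aware progress described above comes down to turning a completed-step count into a fraction between 0 and 1 and pairing it with a status line. The following is a minimal sketch of that arithmetic, not the app's actual code; the helper names `progress_fraction` and `status_text` are hypothetical and exist only for illustration.

```python
def progress_fraction(completed_steps: int, total_steps: int) -> float:
    """Return completed/total clamped to [0.0, 1.0].

    A total of zero (or less) is treated as a single step so the
    division is always defined.
    """
    return min(max(completed_steps, 0) / max(total_steps, 1), 1.0)


def status_text(step_name: str, completed_steps: int, total_steps: int) -> str:
    """Build the status line shown next to the progress bar."""
    return f"Running step: {step_name} ({completed_steps}/{total_steps})"


if __name__ == "__main__":
    # Hypothetical step names, used only to exercise the helpers.
    steps = ["summarize", "categorize", "report"]
    for done, name in enumerate(steps):
        print(status_text(name, done, len(steps)), progress_fraction(done, len(steps)))
```

Clamping keeps the value valid for a progress widget even if the workflow reports more steps than were counted up front.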

## Example Use Cases
* Analyze customer feedback from surveys, reviews, or social media platforms to identify trends and patterns.
* Inform product development and customer support strategies by understanding customer sentiment and preferences.
* Optimize marketing campaigns by targeting specific customer segments based on their interests and concerns.

## Technical Details
* The solution leverages Oracle Cloud Infrastructure (OCI) GenAI, a suite of AI services designed to simplify AI adoption.
* Specifically, this demo utilizes the Cohere R+ model, a state-of-the-art language model optimized for natural language processing tasks.
* All aspects of the demo, including:
+ Hierarchical categorization
+ Sentiment analysis
+ Structured report generation are powered by GenAI, ensuring accurate and efficient analysis of customer messages.

- Built on Oracle Cloud Infrastructure (OCI) GenAI services
- End-to-end flow powered by GenAI for:
- Hierarchical categorization
- Sentiment analysis
- Structured report generation

## Project Structure

The repository is organized as follows:

```plaintext
│ app.py # Main Streamlit application entry point
│ README.md # Project documentation
@@ -70,31 +36,46 @@ The repository is organized as follows:
├───backend
│ │ feedback_agent.py # Logic for feedback processing agents
│ │ feedback_wrapper.py # Wrappers and interfaces for feedback functionalities
│ │ message_handler.py # Utilities for handling and preprocessing messages
│ │
│ ├───data
│ │ complaints_messages.csv # Example dataset of customer messages
│ │
│ └───utils
│ config.py # Configuration and setup for the project
│ llm_config.py # Model- and LLM-related configuration
│ prompts.py # Prompt templates for language models
└───pages
SentimentByCat.py # Additional Streamlit page for sentiment by category
└───data
complaints_messages.csv # Example dataset of customer messages
```
## Output
The demo will display an interactive dashboard with the generated report, providing valuable insights into customer messages, including:
* Category distribution across all three levels
* Sentiment score distribution
* Summaries of key findings and insights

## Contributing
We welcome contributions to improve and expand the capabilities of this demo. Please fork the repository and submit a pull request with your changes.
## Getting Started
1. Clone the repository using git clone.
2. (Optional) Create and activate a Python virtual environment:
- Windows:
- python -m venv venv
- venv\Scripts\activate
- macOS/Linux:
- python3 -m venv venv
- source venv/bin/activate
3. Place your CSV files in the data folder. Ensure each includes the required columns ID and Message.
4. Install dependencies with pip install -r requirements.txt.
5. Run the application with `streamlit run app.py`.

## Data Requirements
- Input format: CSV with two columns: ID and Message
- The app validates:
- File extension is CSV
- Both required columns are present
- The full dataset is visualized in the main view after successful validation.
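
The two checks listed above can be sketched with the standard library alone. This is an illustrative assumption rather than the app's implementation (the app reads the upload with pandas); the function name `validate_feedback_csv` and the wording of the messages are hypothetical.

```python
import csv
import io

REQUIRED_COLUMNS = ("ID", "Message")


def validate_feedback_csv(filename: str, text: str) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    # Check 1: the file extension is CSV (case-insensitive).
    if not filename.lower().endswith(".csv"):
        problems.append("File extension must be .csv")
    # Check 2: both required columns appear in the header row.
    header = next(csv.reader(io.StringIO(text)), [])
    missing = [column for column in REQUIRED_COLUMNS if column not in header]
    if missing:
        problems.append("Missing required columns: " + ", ".join(missing))
    return problems


if __name__ == "__main__":
    print(validate_feedback_csv("feedback.csv", "ID,Message\n1,Great service\n"))
    print(validate_feedback_csv("feedback.txt", "ID,Text\n1,Great service\n"))
```

Returning a list of problems instead of raising on the first failure lets a UI report every issue with the upload at once.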

## Output
The dashboard displays an interactive report with:
- Category distribution across all three levels
- Sentiment score distribution
- Summaries of key findings and insights
- Step-by-step execution status and overall progress of the analysis run
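
As a rough illustration of what the category and sentiment distributions contain, the sketch below counts values over a list of per-message records. The record layout is an assumption: `category_level_1`, `category_level_2`, and `sentiment_score` mirror field names that appear in `app.py`, `category_level_3` is hypothetical, and the sample records are made up.

```python
from collections import Counter

# Assumed field names for the three category levels (the third is hypothetical).
LEVELS = ("category_level_1", "category_level_2", "category_level_3")


def category_distribution(records):
    """Count how many messages fall into each category, per level."""
    return {
        level: Counter(r[level] for r in records if level in r)
        for level in LEVELS
    }


def sentiment_distribution(records):
    """Count messages per sentiment score on the 1-10 scale, including zeros."""
    counts = Counter(r["sentiment_score"] for r in records)
    return {score: counts.get(score, 0) for score in range(1, 11)}


if __name__ == "__main__":
    # Made-up sample records for illustration only.
    sample = [
        {"category_level_1": "Billing", "category_level_2": "Refunds",
         "category_level_3": "Late refund", "sentiment_score": 2},
        {"category_level_1": "Billing", "category_level_2": "Invoices",
         "category_level_3": "Wrong amount", "sentiment_score": 3},
        {"category_level_1": "Delivery", "category_level_2": "Delays",
         "category_level_3": "Lost parcel", "sentiment_score": 2},
    ]
    print(category_distribution(sample)["category_level_1"])
    print(sentiment_distribution(sample))
```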

## License
Copyright (c) 2025 Oracle and/or its affiliates.

Licensed under the Universal Permissive License (UPL), Version 1.0.

See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.
See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.
216 changes: 208 additions & 8 deletions ai/generative-ai-service/sentiment-categorization/files/app.py
@@ -1,16 +1,216 @@
import json
import pandas as pd
import streamlit as st
import plotly.express as px
from backend.feedback_wrapper import FeedbackAgentWrapper

st.set_page_config(
page_title="Hello",
page_icon="👋",
st.set_page_config(page_title="Feedback Dashboard", page_icon="📊", layout="wide")

def load_styles():
    try:
        with open("styles.css", "r", encoding="utf-8") as f:
            st.markdown(f"<style>{f.read()}</style>", unsafe_allow_html=True)
    except Exception:
        st.markdown(
            "<style>.main-header{border-bottom:3px solid #C74634;padding:1rem 0}</style>",
            unsafe_allow_html=True,
        )

load_styles()

st.markdown(
"""
<div class="main-header">
<div class="header-content">
<h1>Customer Feedback Dashboard</h1>
<p>Analyze customer sentiment, insights, and trends across categories</p>
</div>
</div>
""",
unsafe_allow_html=True,
)

st.sidebar.markdown('<div class="stMarkdown"><h3>Data Input</h3></div>', unsafe_allow_html=True)
uploaded_file = st.sidebar.file_uploader("Upload CSV (required columns: ID, Message)", type=["csv"])

data_list = None
df_uploaded = None
valid_file = False

if uploaded_file is not None:
    try:
        df_uploaded = pd.read_csv(uploaded_file)
        required_columns = {"ID", "Message"}
        if not required_columns.issubset(set(df_uploaded.columns)):
            st.sidebar.error(f"CSV must include columns: {', '.join(required_columns)}")
        else:
            valid_file = True
            st.sidebar.success("File uploaded and validated successfully.")
            st.sidebar.markdown("</div>", unsafe_allow_html=True)

            data_list = df_uploaded.values.tolist()
    except Exception as e:
        st.sidebar.error(f"An error occurred while processing the file: {e}")

st.sidebar.markdown('<div class="stMarkdown"><h4>Run</h4></div>', unsafe_allow_html=True)
if "flow_completed" not in st.session_state:
    st.session_state.flow_completed = True

start_button = st.sidebar.button(
    "Start",
    disabled=not (st.session_state.flow_completed and valid_file),
)

st.write("# Welcome to Streamlit! 👋")
st.markdown('<div class="comparison-container">', unsafe_allow_html=True)

if df_uploaded is not None and valid_file:
    st.markdown("### Uploaded Data")
    dataset_exp = st.expander(uploaded_file.name, expanded=True)
    dataset_exp.dataframe(df_uploaded, height=200, use_container_width=True)

def display_category(data):
    if not isinstance(data, dict) or "categories" not in data:
        st.warning("No category data found in the report.")
        return

    st.markdown('<div class="metrics-grid">', unsafe_allow_html=True)
    st.markdown("</div>", unsafe_allow_html=True)

    for category in data.get("categories", []):
        with st.container():
            st.markdown(
                f'<div class="response-card finetuned"><div class="response-header">'
                f'<div class="model-name">{category.get("category_level_1", "Unknown")}</div>'
                f'</div><div class="response-content">',
                unsafe_allow_html=True,
            )
            st.write(category.get("summary", ""))

st.sidebar.success("Select a demo above.")
            col1, col2, col3 = st.columns(3)
            with col1:
                avg = category.get("average_sentiment_score", None)
                if avg is not None:
                    st.metric("Avg Sentiment Score", avg, delta=None)
                    st.progress(min(max(avg / 10, 0.0), 1.0))
            with col2:
                high = category.get("highest_sentiment_message", {})
                st.success(f"Highest Sentiment: {high.get('sentiment_score', 'N/A')}")
                st.write(f"“{high.get('summary', '')}”")
            with col3:
                low = category.get("lowest_sentiment_message", {})
                st.error(f"Lowest Sentiment: {low.get('sentiment_score', 'N/A')}")
                st.write(f"“{low.get('summary', '')}”")

            st.markdown("#### Key Insights")
            for insight in category.get("key_insights", []):
                st.info(f"• {insight}")

            st.markdown("#### Subcategories Breakdown")
            for subcategory in category.get("subcategories", []):
                with st.expander(
                    f"{subcategory.get('category_level_2','(Unknown)')} "
                    f"(Avg: {subcategory.get('average_sentiment_score','N/A')})"
                ):
                    st.write(subcategory.get("summary", ""))
            st.markdown("</div></div>", unsafe_allow_html=True)

def display_sentiment(df: pd.DataFrame):
    if df.empty:
        st.warning("No sentiment data to display.")
        return
    fig = px.bar(
        df,
        x="id",
        y="sentiment_score",
        color="sentiment_score",
        text="topic",
        labels={"id": "Id", "sentiment_score": "Sentiment Score (1-10)"},
        title="Sentiment Scores per Feedback Category",
    )
    fig.update_layout(
        margin=dict(l=10, r=10, t=50, b=10),
        legend_title_text="Score",
    )
    fig.update_traces(textposition="inside")
    st.plotly_chart(fig, use_container_width=True)

if start_button and st.session_state.flow_completed and valid_file:
    st.session_state.flow_completed = False

    agent = FeedbackAgentWrapper(data_list)
    steps, edges = agent.get_nodes_edges()
    steps = steps[1:-1]
    outputs = []
    current_step = steps[0] if steps else "summarize"

    status_placeholder = st.empty()
    progress_bar = st.progress(0)
    total_steps = len(steps) if steps else 1
    step_counter = 0

    while current_step != "FINALIZED":
        status_placeholder.markdown(
            f'<div class="response-meta">Running step: <strong>{current_step}</strong> '
            f'({step_counter}/{total_steps})</div>',
            unsafe_allow_html=True,
        )
        next_step, output = agent.run_step_by_step()
        if not output:
            current_step = "FINALIZED"
        else:
            outputs.append(output)
            current_step = next_step
            step_counter += 1
            progress_bar.progress(min(step_counter / max(total_steps, 1), 1.0))

    progress_bar.progress(1.0)
    status_placeholder.markdown(
        f'<div class="response-meta">Completed {step_counter} of {total_steps} steps.</div>',
        unsafe_allow_html=True,
    )

    def find_report(objs):
        for o in objs:
            for v in o.values():
                if isinstance(v, dict) and "reports" in v:
                    return v["reports"]
        return None

    report_list = find_report(outputs) or []
    if report_list:
        try:
            categories = json.loads(report_list[0])
            st.markdown("### Report")
            display_category(categories)
        except json.JSONDecodeError:
            st.error("Report is not valid JSON.")

    def find_summaries(objs):
        for o in objs:
            if "summarize" in o and "messages_info" in o["summarize"]:
                return o["summarize"]["messages_info"]
        return []

    summaries = find_summaries(outputs)
    try:
        df = pd.DataFrame([s if isinstance(s, dict) else s.dict() for s in summaries])
    except Exception:
        df = pd.DataFrame()

    if not df.empty:
        st.markdown("### Sentiment Overview")
        display_sentiment(df)

    st.session_state.flow_completed = True

# Footer
st.markdown(
"""
This is a demo!
"""
)
<div class="oracle-footer">
© Oracle Corporation | Technology Engineering | OCI Generative AI Services
</div>
""",
unsafe_allow_html=True,
)

st.markdown("</div>", unsafe_allow_html=True)