Market Pulse is an AI-powered market intelligence platform that ingests public social signals, enriches them with AI, clusters related demand patterns, and serves ranked opportunities in a web feed.
This README is written for contributors. It explains how to run the project, how the system is structured, and how to contribute safely and consistently.
- Collects raw signals from external platforms.
- Enriches signals with AI sentiment, summary, and topics.
- Generates embeddings and clusters semantically related signals.
- Computes cluster-level opportunity metrics.
- Serves cluster and signal data via Next.js API routes.
- Renders an interactive feed UI for exploration.
market-pulse/
├── frontend/ # Next.js app + API routes
├── worker/ # Scheduled ingestion and processing pipeline
├── supabase/
│ └── schema.sql # Postgres schema to initialize Supabase
├── docker-compose.yml # Local multi-service runtime
├── .env.example # Shared env template
└── README.md
- Framework: Next.js App Router.
- UI: React + Tailwind CSS.
- API: Route handlers under frontend/app/api/*.
- Data source: Supabase Postgres (server-side via frontend/lib/server/supabase.ts).
- Runtime: Python 3.11+.
- Scheduler: worker/scheduler.py runs the pipeline every 10 minutes.
- Orchestration: worker/tasks.py.
- Ingestion: worker/ingestion/*.
- Processing: worker/processing/* (AI, embeddings, clustering, scoring).
- Persistence: Supabase client + Postgres tables.
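The "run the pipeline every 10 minutes" behavior can be pictured with a loop like the one below. This is a minimal sketch, not the contents of worker/scheduler.py: `run_forever`, `run_pipeline`, `max_runs`, and `sleep` are hypothetical names chosen for illustration, and the real orchestration lives in worker/tasks.py.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("scheduler_sketch")

INTERVAL_SECONDS = 10 * 60  # the pipeline is described as running every 10 minutes


def run_forever(
    run_pipeline: Callable[[], None],
    interval: float = INTERVAL_SECONDS,
    max_runs: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call run_pipeline repeatedly, aiming for one start every `interval` seconds.

    A failed run is logged and does not stop the loop. `max_runs` and `sleep`
    are hypothetical parameters that exist only so the loop can be tested.
    Returns the number of runs attempted.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        started = time.monotonic()
        try:
            run_pipeline()
        except Exception:
            # Keep the scheduler alive; the next tick gets a fresh attempt.
            logger.exception("pipeline run failed")
        runs += 1
        if max_runs is not None and runs >= max_runs:
            break
        # Subtract the time the run took so start times stay roughly aligned.
        elapsed = time.monotonic() - started
        sleep(max(0.0, interval - elapsed))
    return runs
```

Catching exceptions per run is a design choice in this sketch: one bad upstream response should not take down the whole worker.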
- supabase/schema.sql defines the current schema.
- Core tables:
  - signals
  - clusters
- Docker + Docker Compose
- Node.js 20+
- Python 3.11+
- A Supabase project
- API credentials for data/AI providers you want to run
Copy .env.example to .env:
    cp .env.example .env
Minimum required for both services:
- SUPABASE_URL
- SUPABASE_SERVICE_ROLE_KEY
Worker-specific values (recommended for full pipeline):
- GOOGLE_API_KEY
- PH_TOKEN
- STACK_OVERFLOW_API_KEY
- REDDIT_CLIENT_ID
- REDDIT_SECRET
- REDDIT_USER_AGENT
- ENVIRONMENT
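A startup check along these lines can catch a missing variable before the pipeline makes its first request. It is a sketch under assumptions: the variable names come from the lists above, but `check_env` and `missing_vars` are hypothetical helpers, not functions that exist in this repo.

```python
import os
from typing import Iterable, List, Mapping

# Variable names taken from the lists above.
REQUIRED = ("SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY")
WORKER_OPTIONAL = (
    "GOOGLE_API_KEY",
    "PH_TOKEN",
    "STACK_OVERFLOW_API_KEY",
    "REDDIT_CLIENT_ID",
    "REDDIT_SECRET",
    "REDDIT_USER_AGENT",
    "ENVIRONMENT",
)


def missing_vars(names: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset or blank in `env`."""
    return [name for name in names if not env.get(name, "").strip()]


def check_env(env: Mapping[str, str] = os.environ) -> List[str]:
    """Raise if a required variable is missing; return the missing optional ones.

    Hypothetical helper: optional variables are only reported, because not
    every environment has credentials for every provider.
    """
    missing_required = missing_vars(REQUIRED, env)
    if missing_required:
        raise RuntimeError("Missing required env vars: " + ", ".join(missing_required))
    return missing_vars(WORKER_OPTIONAL, env)
```

The check reports variable names only and never prints their values, which keeps secrets out of logs.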
Security notes:
- Never expose SUPABASE_SERVICE_ROLE_KEY in client-side code.
- Never commit real secrets into git.
- Create your Supabase project.
- Open Supabase SQL Editor.
- Run all SQL from supabase/schema.sql.
- Populate .env from .env.example.
    docker compose up --build
Endpoints:
- Frontend: http://localhost:3000
- Health: http://localhost:3000/api/health
Worker:
    cd worker
    python -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
    python scheduler.py
Frontend:
    cd frontend
    npm install
    npm run dev
If running the frontend directly, ensure env vars are available in frontend/.env.local or your shell.
- Fork/branch from main.
- Keep PRs focused and small.
- Write clear commit messages.
- Update docs when behavior/config changes.
- Open a PR with context, screenshots (if UI), and test notes.
Suggested branch naming:
- feat/<short-description>
- fix/<short-description>
- chore/<short-description>
- Prefer simple, readable code over clever code.
- Keep functions focused and side effects explicit.
- Avoid unrelated refactors in feature/fix PRs.
- Keep API response shapes stable when possible.
- Use lib/api.ts for client-fetch patterns.
- Keep UI components in app/feed/sections focused/presentational.
- Treat ingestion output as untrusted input; validate fields defensively.
- Keep pipeline steps idempotent where possible.
- Log key step boundaries and failures with useful context.
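The first two worker guidelines (validate untrusted ingestion output, keep steps idempotent) can be combined in one normalization step. The sketch below is illustrative only: the field names `source`, `external_id`, `text`, and `dedupe_key` are assumptions and may not match the columns in supabase/schema.sql, and `normalize_signal` is a hypothetical helper.

```python
import hashlib
from typing import Any, Dict, Mapping, Optional


def normalize_signal(raw: Mapping[str, Any]) -> Optional[Dict[str, str]]:
    """Return a cleaned signal, or None if the raw item is unusable.

    Field names here are assumed for illustration, not taken from the schema.
    """
    source = raw.get("source")
    external_id = raw.get("external_id")
    text = raw.get("text")

    # Reject anything that is missing, not a string, or blank.
    for value in (source, external_id, text):
        if not isinstance(value, str) or not value.strip():
            return None

    source = source.strip().lower()
    external_id = external_id.strip()

    # A deterministic key derived from stable identifiers means re-ingesting
    # the same item maps to the same row instead of creating a duplicate.
    dedupe_key = hashlib.sha256(f"{source}:{external_id}".encode("utf-8")).hexdigest()

    return {
        "source": source,
        "external_id": external_id,
        "text": text.strip(),
        "dedupe_key": dedupe_key,
    }
```

Deriving the key from the source and its external ID, rather than from the text, keeps the step idempotent even when the upstream content is edited between runs.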
The repo currently has limited formal test automation.
Before opening a PR, run at least:
- Frontend dev build/lint flow (npm run dev, npm run lint if configured)
- Worker pipeline smoke run with representative env vars
- Manual API checks for:
  - /api/health
  - /api/clusters
  - /api/signals
  - /api/clusters/:clusterId/signals
If you add a bug fix, include reproduction and verification steps in the PR description.
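The manual API checks can be partly scripted. The sketch below assumes the frontend is running at http://localhost:3000 and that each endpoint answers with HTTP 200 and a JSON body; neither the status codes nor the response shapes are documented here, so treat both as assumptions. `smoke_check` and `http_get` are hypothetical helpers.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Tuple

BASE_URL = "http://localhost:3000"  # frontend URL used elsewhere in this README
PATHS = ("/api/health", "/api/clusters", "/api/signals")


def http_get(url: str, timeout: float = 5.0) -> Tuple[int, bytes]:
    """Fetch a URL and return (status code, body), including for HTTP errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()


def smoke_check(
    paths: Iterable[str] = PATHS,
    base_url: str = BASE_URL,
    get: Callable[[str], Tuple[int, bytes]] = http_get,
) -> Dict[str, bool]:
    """Map each path to True if it returned 200 with a parseable JSON body."""
    results: Dict[str, bool] = {}
    for path in paths:
        try:
            status, body = get(base_url + path)
            json.loads(body)  # raises ValueError on non-JSON bodies
            results[path] = status == 200
        except (OSError, ValueError):
            # OSError covers connection failures (urllib's URLError subclasses it).
            results[path] = False
    return results


if __name__ == "__main__":
    for path, ok in smoke_check().items():
        print(("ok  " if ok else "FAIL"), path)
```

/api/clusters/:clusterId/signals is left out of the default paths because it needs a real cluster ID; pass a concrete path via `paths` once you have one.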
- Using the Supabase publishable key where the service role key is required.
- Forgetting to run supabase/schema.sql before starting services.
- Introducing breaking API response changes without updating frontend consumers.
- Assuming all external API credentials are available in every environment.
- Hacker News and Product Hunt ingestion
- AI semantic enrichment and clustering
- Reddit ingestion hardening
- Vector DB integration for high-scale similarity search
- Queue-based worker orchestration (Celery/RabbitMQ or equivalent)
- Auth and personalized dashboards
- Additional source connectors (X, LinkedIn, Telegram)
If you get stuck while contributing:
- Open an issue with logs, env context (without secrets), and reproduction steps.
- Tag the affected area clearly: frontend, worker, database, or infra.