Code for the paper "Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching" (COLING 2025)
The official implementation of "Quality over Quantity: Boosting Data Efficiency Through Ensembled Multimodal Data Curation" (AAAI 2025).
Our project for the "Data Intelligence Applications" exam at Politecnico di Milano. The project was about Social Influence and Pricing techniques applied to networks.
Decision intelligence platform for industrial manufacturing. Connects to CRM, ERP, and CMMS systems and monitors industry and macroeconomic conditions to compute leading indicators, generate predictions, and deliver daily executive briefings.
MCP integration to use Cassis with Snowflake when Cassis isn't natively connected to the warehouse. Cassis generates grounded SQL, Snowflake executes it.
A structured intelligence layer for internet attention.
PM observability KPI framework for a real-time clinical risk-stratification stack (Mobile SDK → APIM → Kafka → ETL → XGBoost → 3rd-party).
Patent intelligence for AI agents — patent search, USPTO data, patent landscape & pgvector prior-art search. MCP + x402.
Weather & climate intelligence for AI agents — current weather, forecast, historical, climate normals, alerts, agricultural & travel weather. MCP + x402.
Data visualizations through Tableau for insightful analytics and decision-making using Walmart retail data.
Trade-Based Money Laundering investigation reports generated in minutes — not weeks. This actor is built for AML compliance officers, trade finance banks, and financial investigators who need a scored TBML risk assessment backed by data from 14+ authoritative sources and five independent forensic algorithms.
Cyber attack surface report for any domain — enter a target and get a full external risk assessment in minutes.
Gati Shakti AI Digital Twin is an AI-powered governance intelligence platform designed to revolutionize infrastructure planning and execution in India. It acts as a cognitive digital twin of national infrastructure, enabling real-time coordination across ministries, predictive decision-making, and citizen-driven governance.
Sensorium is a real-time data intelligence platform that ingests, processes, and visualizes sensor data through a resilient Python service, a high-performance Node.js/Express API, and an intuitive React.js dashboard—transforming raw data into actionable insights for smarter decision-making.
DomainKits connects Claude to live domain data. Built-in workflows help verify domains from multiple angles before making decisions. Supports domain search, analysis, brand conflict detection, valuation, trend discovery,
AI-powered job screening system that helps match candidates with job openings
Django-based job-search analytics platform with funnel metrics, data quality checks, workbook exports, evidence documentation, and 133 passing tests.
n8n community node for CrawlSnap — structured, on-demand data intelligence APIs (VectorSnap, PulseSnap, SubdoSnap).
The Cognitive Node of the Automated Data Intelligence Platform (ADIP). An AI-powered analytical infrastructure that turns raw data from the Ingestion Engine into automated insights, forecasts, and applied LLM reasoning, all served via Streamlit.