    Repositories list

    • 0000 Updated Jun 2, 2026
    • hlqa

      Public
      Collaborative Reasoning and Human-Centered QA in the GenAI Era
      HTML
      MIT License
      0000 Updated Jun 1, 2026
    • Tutorial website for "Temporal Information Retrieval and Question Answering in the Age of LLMs" at WWW 2026 Conference
      HTML
      0100 Updated May 30, 2026
    • Rankify

      Public
      🔥 Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation 🔥. Our toolkit integrates 40 pre-retrieved benchmark dat…
      Python
      6967616 Updated May 6, 2026
    • Survey of datasets, methods, and tools for Temporal Question Answering.
      0300 Updated Apr 17, 2026
    • HintEval

      Public
      HintEval💡: A Comprehensive Framework for Hint Generation and Evaluation for Questions
      Python
      Apache License 2.0
      33600 Updated Apr 16, 2026
    • A benchmark and task for answering questions by inferring answers from indirect (answer-supporting) evidence rather than answer-containing passages.
      Python
      MIT License
      1100 Updated Apr 15, 2026
    • This repository contains materials, code, and resources for the SIGIR 2026 tutorial on temporal information retrieval and extraction, covering foundational conc…
      HTML
      0000 Updated Apr 13, 2026
    • HTML
      0000 Updated Apr 13, 2026
    • Q-DAPS

      Public
      Estimate question difficulty for LLMs via answer plausibility scoring and entropy-based uncertainty.
      MIT License
      0000 Updated Apr 9, 2026
    • 0000 Updated Apr 8, 2026
    • Code and experiments for studying convergence-based passage construction in inferential QA, demonstrating improvements over cosine similarity in retrieval-augme…
      Python
      MIT License
      1100 Updated Apr 8, 2026
    • Official repository for studying how pretraining exposure drives popularity bias in large language models.
      Python
      MIT License
      1100 Updated Apr 8, 2026
    • Official repository for "Are LLM-Based Retrievers Worth Their Cost? An Empirical Study of Efficiency, Robustness, and Reasoning Overhead", accepted at SIGIR 202…
      0100 Updated Apr 4, 2026
    • SustainableQA: A Comprehensive Question Answering Dataset for Corporate Sustainability and EU Taxonomy Reporting
      Python
      Other
      24410 Updated Mar 20, 2026
    • RecencyQA

      Public
      How often do Answers Change? Estimating Recency Requirements in Question Answering
      MIT License
      0200 Updated Feb 15, 2026
    • Parse

      Public
      An Open-Domain Reasoning Question Answering Benchmark for Persian
      HTML
      MIT License
      1200 Updated Feb 7, 2026
    • tempoeval

      Public
      Python
      MIT License
      1300 Updated Feb 5, 2026
    • TempRetriever: Fusion-based Temporal Dense Passage Retrieval for Time-Sensitive Questions. Accepted at WSDM 2026 (main track).
      0100 Updated Dec 1, 2025
    • RankArena

      Public
      RankArena: A Unified Platform for Evaluating Retrieval, Reranking and RAG with Human and LLM Feedback — CIKM ’25, Seoul, Nov 10–14, 2025.
      Python
      Apache License 2.0
      1300 Updated Nov 26, 2025
    • Python
      Creative Commons Zero v1.0 Universal
      21130 Updated Nov 11, 2025
    • DeAR (Deep Agent Rank): Dual-Stage Document Reranking with Reasoning Agents. Accepted at EMNLP Findings 2025.
      Python
      0810 Updated Oct 23, 2025
    • HintQA

      Public
      Exploring Hint Generation Approaches in Open-Domain Question Answering
      Jupyter Notebook
      MIT License
      23000 Updated Sep 19, 2025
    • How Good are LLM-based Rerankers? Accepted at EMNLP Findings 2025
      Apache License 2.0
      01200 Updated Aug 28, 2025
    • Evaluating Robustness of LLMs in Question Answering on Multilingual NOisy OCR Data
      Python
      MIT License
      0600 Updated Aug 20, 2025
    • ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages
      Python
      MIT License
      11500 Updated Aug 19, 2025
    • Wrong Answers Can Also Be Useful: PlausibleQA — A QA Dataset with Answer Plausibility Scores
      MIT License
      2900 Updated Jul 27, 2025
    • WikiHint

      Public
      WikiHint: A Human-Annotated Dataset for Hint Ranking and Generation
      Python
      MIT License
      1400 Updated Jul 27, 2025
    • TriviaHG

      Public
      A Dataset for Automatic Hint Generation from Factoid Questions
      Python
      MIT License
      22800 Updated Jul 27, 2025
    • Detecting Temporal Ambiguity in Questions
      MIT License
      0400 Updated Nov 26, 2024