feat: support Google Sheets datasets #289

Closed
Dawn-Fighter wants to merge 1 commit into msoedov:main from Dawn-Fighter:feat/google-sheets-datasets

Conversation

@Dawn-Fighter
Contributor

Summary

  • Adds support for using public Google Sheets links as CSV-backed prompt datasets.
  • Converts standard Sheets browser links and published sheet links to direct CSV export URLs before loading.
  • Documents how to use a Google Sheet as a custom CSV source.

Testing

  • Ran python3 -m py_compile agentic_security/probe_data/data.py.
  • I could not run the pytest suite in this local checkout because the environment is missing project test dependencies like pytest and runtime dependencies like colorama.

Happy to contribute this. I picked this up from #86 because Google Sheets feels like a practical way for teams to keep shared red-team prompt sets updated without exporting CSV files manually.

Closes #86

Copilot AI left a comment

Pull request overview

Adds support for using public Google Sheets URLs as CSV-backed prompt datasets by translating standard and published Sheets links into their CSV export URLs prior to fetching. Documentation is updated to describe the new capability.

Changes:

  • New normalize_google_sheets_csv_url helper handling both standard (/spreadsheets/d/<id>/edit) and published (/spreadsheets/d/e/<id>/pubhtml) URLs, preserving gid from query string or fragment.
  • fetch_csv_content invokes the normalizer before issuing the HTTP request.
  • docs/datasets.md documents the new Google Sheets workflow with an example.
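The normalization described above can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: only the function name `normalize_google_sheets_csv_url` comes from the review, while the regular expressions, the `single=true` parameter, and the exact export URL shapes (`/export?format=csv` and `/pub?output=csv`) are assumptions based on commonly used Google Sheets link forms.

```python
import re
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical patterns: published links carry an extra "/e/" path segment.
_PUBLISHED = re.compile(r"^/spreadsheets/d/e/([\w-]+)")
_STANDARD = re.compile(r"^/spreadsheets/d/([\w-]+)")


def normalize_google_sheets_csv_url(url: str) -> str:
    """Return a CSV export URL for a Google Sheets link; pass other URLs through."""
    parsed = urlparse(url)
    if parsed.netloc != "docs.google.com":
        return url

    # The tab id may appear in the query (?gid=0) or in the fragment (#gid=0).
    query = parse_qs(parsed.query)
    fragment = parse_qs(parsed.fragment)
    gid = (query.get("gid") or fragment.get("gid") or [None])[0]

    # Check the published form first, since it is the more specific path.
    published = _PUBLISHED.match(parsed.path)
    if published:
        params = {"output": "csv"}
        if gid is not None:
            params.update({"gid": gid, "single": "true"})
        sheet_id = published.group(1)
        return f"https://docs.google.com/spreadsheets/d/e/{sheet_id}/pub?{urlencode(params)}"

    standard = _STANDARD.match(parsed.path)
    if standard:
        params = {"format": "csv"}
        if gid is not None:
            params["gid"] = gid
        sheet_id = standard.group(1)
        return f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?{urlencode(params)}"

    # A docs.google.com URL that is not a spreadsheet link is left untouched.
    return url
```

Returning non-Sheets URLs unchanged means a fetch helper could call the normalizer unconditionally before issuing its request, which matches the wiring the review describes for fetch_csv_content.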

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
agentic_security/probe_data/data.py | Adds Google Sheets URL normalization and wires it into the CSV fetch path.
docs/datasets.md | Documents the new Google Sheets dataset source with a usage example.

Comment thread agentic_security/probe_data/data.py
@msoedov
Owner

msoedov commented May 14, 2026

Thanks for the patch! It looks like this has already been addressed in #290.

@msoedov msoedov closed this May 14, 2026
@Dawn-Fighter
Contributor Author

It's okay, thanks for replying.

Development

Successfully merging this pull request may close these issues.

Enable support for Google Sheets-based datasets

3 participants