feat(slack): add Slack connector#77

Draft
volod-vana wants to merge 4 commits into main from feat/slack-connector

@volod-vana

Member

Adds a Playwright connector for Slack.

Scopes: slack.profile, slack.conversations (channels + DMs index), slack.messages (bodies + thread replies).

How it works:

  • Reads the xoxc token from app.slack.com localStorage.
  • Runs all API calls in-page from the slack.com origin to bypass the app.slack.com service worker (which breaks in-page fetch); the browser supplies the d cookie natively.
  • Auto-skips archived, muted, and non-member channels (no UI for an ignore list).
  • Optional oldestDays input limits history depth.
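
The skip rules and the oldestDays cutoff above could be sketched roughly as follows. This is an illustration only, not the connector's code: `shouldExport`, `oldestTimestamp`, and the `mutedIds` set are hypothetical names. It assumes conversation objects shaped like Slack's `conversations.list` results (`is_archived`, `is_member`) and that the cutoff is passed as the `oldest` parameter (Unix seconds) of `conversations.history`.

```javascript
// Hypothetical sketch; names are illustrative, not taken from the connector.

// Returns true when a conversation should be exported.
// Assumes Slack-style flags: is_archived, is_member (absent on DMs).
function shouldExport(conversation, mutedIds) {
  if (conversation.is_archived) return false;
  if (mutedIds.has(conversation.id)) return false;
  // Skip only when membership is explicitly false, so DMs pass through.
  if (conversation.is_member === false) return false;
  return true;
}

// Converts an optional oldestDays input into a Unix-seconds string.
// Returns undefined when no history limit is requested.
function oldestTimestamp(oldestDays, nowMs = Date.now()) {
  if (oldestDays == null) return undefined;
  const cutoffMs = nowMs - oldestDays * 24 * 60 * 60 * 1000;
  return String(Math.floor(cutoffMs / 1000));
}
```

Treating only an explicit `is_member === false` as "non-member" is a design choice in this sketch so that DM objects, which may not carry that flag, are not dropped by accident.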

Tested locally end-to-end in DataConnect: 392 conversations / 99,247 messages / 16,345 threads, 0 errors (~1h43m, Slack rate-limited).

Status: experimental.

Exports slack.profile, slack.conversations, and slack.messages from the
active Slack workspace. Token is read from app.slack.com localStorage;
API calls run in-page from the slack.com origin to bypass the app.slack.com
service worker. Auto-skips archived, muted, and non-member channels.

github-actions Bot commented May 29, 2026

Schema Health Check — Non-blocking inherited issues

44/50 scopes consistent | 6 inherited Gateway gap(s) | no new blocking issues in this PR

- artifacts/slack-playwright/slack-playwright-0.1.0.tgz: new bundle
- connector-index.json: append slack-playwright entry, refresh
  sourceTag / sourceCommit / artifactUrl on all entries to this branch's
  HEAD (standard regen side-effect)

The 16 other byte-different artifacts are gtar-version artifacts from
running scripts/generate-connector-index.mjs on macOS (BSD tar lacks
--sort=name; gtar via PATH shim). Functionally identical to main's
artifacts — same bundle contents, different gzip framing.
volod-vana added a commit that referenced this pull request May 31, 2026
## What
Adds real per-post **capture dates** to Instagram posts:
- `instagram-playwright.js`: extract the capture time (`taken_at` /
`taken_at_timestamp`) from the IG v1 feed media node and emit it as an
ISO string in `taken_at`.
- `schemas/instagram.posts.json`: add `taken_at: string` to the scope
schema.
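
A minimal sketch of the extraction step, for illustration only: `extractTakenAt` is a hypothetical name, and the sketch assumes the media node stores the capture time as Unix seconds under `taken_at` or `taken_at_timestamp`. The actual code in `instagram-playwright.js` may differ.

```javascript
// Hypothetical sketch; extractTakenAt is an illustrative name.
// Assumes the media node carries the capture time as Unix seconds.
function extractTakenAt(node) {
  const seconds = node.taken_at ?? node.taken_at_timestamp;
  // Leave the field unset rather than emit an invalid date.
  if (typeof seconds !== "number" || !Number.isFinite(seconds)) return undefined;
  return new Date(seconds * 1000).toISOString();
}
```

Returning `undefined` for a missing or malformed value keeps the post out of the timeline's dated bucket instead of assigning it a bogus date.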

## Why
The **Memory timeline** needs real per-post dates. Without `taken_at`,
every post lands on the collected-at timestamp and piles onto one day.
This is the desktop/DataConnect path; the web/lite path (data-pipe Apify
actor) maps the same field.

## ⚠️ Needs a maintainer step before deploy
- **Re-register the connector** (`register.cjs` → registry.json checksum
+ artifact). I did **not** include a registry.json change:
`register.cjs` in my environment rewrote the entry inconsistently with
`main` (dropped `sourceId/displayName/...`, changed
`meta/`→`connectors/meta/` paths), so the checksum bump should be done
with the correct tooling/setup.
- **Register the updated schema to Context Gateway** so PS accepts
`taken_at` (the scope is strict). This is a prerequisite for posts to
persist with dates from either path.

## Scope
IG `taken_at` only — branched cleanly off `main` (Slack is separate in
#77).