fix(youtube): login detection, watch-history selector drift, error envelope #79
Open
volod-vana wants to merge 2 commits into
…-formed error envelope
Connecting YouTube failed even after a successful manual login. Two bugs:
1. Login-detection race. After the headed-browser login + goHeadless resume, the
connector judged login status on the FIRST frame via settleLoggedInSession,
before YouTube's SPA re-rendered the signed-in avatar — so a successful login
read as 'not logged in'. The programmatic path already retried; the headed
fallback did not.
- settleLoggedInSession now waits for the topbar to render a decisive element
(avatar OR sign-in link) before judging.
- The headed path retries (mirrors the programmatic path) instead of a single
first-frame check.
2. Error envelope missing timestamp. The login-failure return was { error } with
no timestamp, so DataConnect classified it as 'Protocol violation: timestamp
is required' instead of a clean login failure. Both error returns now include
success:false + timestamp.
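
The two fixes above can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: `waitForDecisiveState`, `isLoggedIn`, `errorEnvelope`, and the `probe` callback are hypothetical names, and the ISO-8601 timestamp format is an assumption (the commit only says a timestamp is required).

```javascript
// Hypothetical helpers; the real settleLoggedInSession is not shown in this PR text.

// Poll `probe` until the topbar shows a decisive element.
// `probe` is assumed to resolve to 'logged-in' (avatar found),
// 'logged-out' (sign-in link found), or null (nothing rendered yet).
async function waitForDecisiveState(probe, { timeoutMs = 10000, intervalMs = 250 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const state = await probe();
    if (state === 'logged-in' || state === 'logged-out') return state;
    if (Date.now() >= deadline) return 'unknown';
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Retry the whole check instead of trusting a single first-frame read.
async function isLoggedIn(probe, { attempts = 3, ...waitOptions } = {}) {
  for (let i = 0; i < attempts; i++) {
    if ((await waitForDecisiveState(probe, waitOptions)) === 'logged-in') return true;
  }
  return false;
}

// Every error return carries success:false and a timestamp.
// Assumption: an ISO-8601 string is an acceptable timestamp format.
function errorEnvelope(message) {
  return { success: false, error: message, timestamp: new Date().toISOString() };
}
```

A caller would then return `errorEnvelope('Login failed')` whenever `isLoggedIn(...)` resolves to `false`, so the failure is reported as a login failure rather than a protocol violation.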
Bump connector + registry to 1.0.1; refresh checksums.
Validated structurally (node -c; validate-connector.cjs 41 passed / 0 failed).
NOT verified against a live Google login — needs a real run.
Schema Health Check: Non-blocking inherited issues
41/47 scopes consistent | 6 inherited Gateway gap(s) | no new blocking issues in this PR
The history scrape returned 0 items even with watch history ON. Confirmed live
on youtube.com/feed/history: the lockups exist (yt-lockup-view-model), but every
INNER selector the extractor relied on had drifted and YouTube now obfuscates
those class names:
- content-id-{videoId} class -> gone (videoId now only in the href)
- a.yt-lockup-view-model__content-image -> gone
- a.yt-lockup-metadata-view-model__title -> gone
- h3.yt-lockup-metadata-view-model__heading-reset -> gone
So linkEl was null -> 'continue' skipped every item -> 0 results.
Rewrote extraction with STABLE structural selectors (validated live, 7/7 items):
- link: a[href*='/watch'] (videoId parsed from ?v=, content-id-* fallback)
- title: h3[title] / h3 text
- channel: [aria-label^='Go to channel'] (the one selector that survived)
- header: ytd-item-section-header-renderer h2 (Today / Yesterday / date)
- views: metadata span matching /views/
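
As a rough sketch of the link and views rules listed above, the pure-string parts of the extraction might look like this. The function names are hypothetical, and the DOM lookups themselves (a[href*='/watch'], the metadata spans) are assumed to happen elsewhere and pass plain strings in.

```javascript
// Hypothetical helpers; they take strings already read from the DOM.

// Parse the video ID from a watch link's href (?v=...), falling back to a
// content-id-{videoId} class when the href carries no v parameter.
function parseVideoId(href, className = '') {
  try {
    // The base URL lets relative hrefs such as "/watch?v=abc" parse.
    const url = new URL(href, 'https://www.youtube.com');
    const v = url.searchParams.get('v');
    if (v) return v;
  } catch {
    // Malformed href: fall through to the class fallback.
  }
  const match = /(?:^|\s)content-id-([\w-]+)/.exec(className);
  return match ? match[1] : null;
}

// Pick the first metadata text that mentions views, or null if none does.
function pickViews(metadataTexts) {
  return metadataTexts.find((text) => /views/.test(text)) ?? null;
}
```

Returning `null` rather than skipping silently makes it easier to log which field failed when selectors drift again.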
Refresh registry script checksum.
Three connector bugs found while wiring YouTube into Memory, all in connectors/google/youtube-playwright.js. The first two were diagnosed from the actual DataConnect.log; the third was confirmed live against youtube.com/feed/history in a logged-in browser.

1. Login-detection race (blocked connecting at all)
After the headed-browser login + goHeadless({ resumeUrl }), the connector judged login on the first frame via settleLoggedInSession(), before YouTube's SPA re-rendered the avatar, so a successful manual login read as "not logged in." The programmatic path retried; the headed fallback checked once. settleLoggedInSession() now waits for the topbar to render a decisive element (avatar or sign-in link) before judging.

✅ e2e validated, connect now completes: 20 subscriptions, 100 likes, 5 watch later → classified outcome: success.

2. Error envelope missing timestamp

The login-failure return was { error } with no timestamp, so Data Connect rejected it as "Protocol violation: timestamp is required" instead of a clean "Login failed." Both error returns now include success: false + timestamp.

3. Watch-history scraping returned 0 (selector drift)
Even with watch history on, history came back empty. Inspected the live page: the yt-lockup-view-model items exist, but every inner selector the extractor used had drifted (YouTube now obfuscates those class names), so linkEl was null → continue skipped every item → 0 results.

- content-id-{id} class -> videoId parsed from a[href*="/watch"] ?v= (class fallback kept)
- a.yt-lockup-view-model__content-image -> a[href*="/watch"]
- h3.yt-lockup-...__heading-reset -> h3[title] / h3 text
- #header #title -> ytd-item-section-header-renderer h2
- [aria-label^="Go to channel"] (unchanged)

Follows the connector skill guidance: structural selectors / ARIA, never obfuscated class names.
Versioning / testing
1.0.0 → 1.0.1; checksums refreshed. node -c parses; validate-connector.cjs → 41 passed / 0 failed.