Hi Leonid,
Perplexity made several breaking changes since the July 2025 push that take this lib offline. I needed it working for a personal export, so I forked and patched in 4 areas:
- login: cookie banner selectors/text changed; the DOM-based login-success detector (`#ask-input`) is unreliable because Perplexity now renders the search UI to logged-out users too. Replaced with `/api/auth/session` polling. Also added explicit instructions for the user to type the 6-digit code (NOT click the magic link in the email: that authenticates the user's regular browser, not the Puppeteer-controlled one, so the script never sees a session).
- library enumeration: `data-testid="thread-title"` is gone, and `/library` only renders the ~20 sidebar threads. Replaced with observe-and-replay of the `/rest/thread/list_ask_threads` POST and offset pagination, which pulls the full archive.
- per-thread fetch: the SPA's natural request uses `limit=10`, silently truncating any thread with >10 turns. Switched to a direct API call with `limit=1000` and `has_next_page`/offset pagination. Side benefits: ~10× faster (no per-thread page navigation) and avoids the detached-frame errors that pile up when scraping hundreds of threads.
- resilience: try/catch per conversation with page-recreation recovery on detached-Frame / Target-closed errors. Cookies persist on the browser context, so no re-login during recovery.
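
For anyone skimming, here is a minimal sketch of the `has_next_page`/offset loop described above. It is not the code from the fork: `fetchAll`, `FetchPage`, the `entries` field name, and the `maxPages` cap are hypothetical names chosen for illustration, and the real response shape may differ. The page fetcher is injected so the loop can be run without a browser.

```typescript
// Assumed page shape: only `has_next_page` comes from the description above;
// `entries` is a hypothetical field name for the items on one page.
interface Page<T> {
  entries: T[];
  has_next_page: boolean;
}

// Hypothetical fetcher signature: given an offset and a limit, return one page.
type FetchPage<T> = (offset: number, limit: number) => Promise<Page<T>>;

// Collect every item by advancing the offset until the server reports
// that there is no next page.
async function fetchAll<T>(
  fetchPage: FetchPage<T>,
  limit = 1000,
  maxPages = 10000, // safety cap so a misbehaving endpoint cannot loop forever
): Promise<T[]> {
  const all: T[] = [];
  let offset = 0;
  for (let i = 0; i < maxPages; i++) {
    const page = await fetchPage(offset, limit);
    all.push(...page.entries);
    // Stop on the server's signal, or on an empty page (offset would not advance).
    if (!page.has_next_page || page.entries.length === 0) {
      return all;
    }
    offset += page.entries.length;
  }
  throw new Error(`gave up after ${maxPages} pages`);
}
```

In a real run, `fetchPage` would wrap the actual POST made from the authenticated page context; the request body and headers are deliberately left out here because they are not specified in this issue.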
Fork: https://github.com/osedlacek/perplexport
Happy to break this into smaller PRs if you'd like to bring the changes upstream. Otherwise, no worries — the fork is here as a working option for anyone who lands on this repo.