feat: add devto article scraping PoC snippet #377
Open: ewbenigno wants to merge 2 commits into he4rt:4.x from ewbenigno:feat/article-scraping-poc
docs/superpowers/specs/2026-06-27-devto-article-scraping-poc.md (175 changes: 175 additions & 0 deletions)
@@ -0,0 +1,175 @@
# PoC: dev.to Reactions History Extraction Script

## Context

dev.to does not expose "Reactions History" data through its public API. This
JavaScript snippet is run manually in the browser console on an article's
`/stats` page to extract that information.

## How to use

1. Open the `/stats` page of a dev.to article (requires login)
2. Open the browser console (F12 → Console)
3. Paste the script below and press Enter
4. The JSON is copied to the clipboard automatically

## Script

```javascript
(() => {
  // Validation: is this the right page?
  if (!window.location.pathname.endsWith('/stats')) {
    console.error('❌ Run this script on the /stats page of a dev.to article');
    return;
  }

  // Strip only the trailing "/stats": replace('/stats', '') would remove the first
  // match anywhere in the URL (e.g. in a username that starts with "stats").
  const articlePath = window.location.pathname.replace(/\/stats$/, '');
  const articleUrl = window.location.origin + articlePath;
  const articleSlug = articlePath.split('/').filter(Boolean).pop();

  function copyToClipboard(text) {
    if (navigator.clipboard && document.hasFocus()) {
      return navigator.clipboard.writeText(text);
    }
    // Legacy fallback for when focus is in DevTools
    const textarea = document.createElement('textarea');
    textarea.value = text;
    textarea.style.position = 'fixed';
    textarea.style.opacity = '0';
    document.body.appendChild(textarea);
    textarea.focus();
    textarea.select();
    try {
      const ok = document.execCommand('copy');
      document.body.removeChild(textarea);
      return ok ? Promise.resolve() : Promise.reject(new Error('execCommand failed'));
    } catch (err) {
      document.body.removeChild(textarea);
      return Promise.reject(err);
    }
  }

  function finish(result) {
    const jsonOutput = JSON.stringify(result, null, 2);

    // The promise settles after the synchronous logs below, so by the time the
    // warning appears the JSON has already been printed above it.
    copyToClipboard(jsonOutput)
      .then(() => console.log('✅ JSON copied to the clipboard!'))
      .catch(() => {
        console.warn('⚠️ Could not copy to the clipboard. Copy the JSON printed above.');
      });

    console.log(`📊 ${result.articleSlug}`);
    console.log(`   Total: ${result.totalReactions} reactions`);
    Object.entries(result.summary).forEach(([type, count]) => {
      console.log(`   ${type}: ${count}`);
    });
    console.log(jsonOutput);

    return result;
  }

  // Find the "Reactions History" section
  const headers = document.querySelectorAll('h2');
  let rhHeader = null;
  headers.forEach(h => {
    if (h.textContent.trim() === 'Reactions History') rhHeader = h;
  });

  // Edge case: article with no reactions / no section
  if (!rhHeader) {
    console.warn('⚠️ No "Reactions History" section found.');
    return finish({
      articleUrl,
      articleSlug,
      extractedAt: new Date().toISOString(),
      totalReactions: 0,
      reactions: [],
      summary: {}
    });
  }

  const container = rhHeader.parentElement;
  const entries = container.querySelectorAll('.fs-sm.py-2.flex.items-center');

  const reactions = [];
  const summary = {};

  entries.forEach(entry => {
    const imgs = entry.querySelectorAll('img');
    const rightSide = entry.querySelector('.flex-1');
    if (!rightSide) return;

    const infoDivs = rightSide.children;

    // Reaction type: text before the line break in the first child div
    const rawText = infoDivs[0]?.textContent.trim() || '';
    const reactionType = rawText.split('\n')[0].trim();

    // User
    const link = infoDivs[0]?.querySelector('a');
    const username = link?.textContent.trim() || 'unknown';
    const userProfileUrl = link?.href || '';

    // Avatar
    const userAvatarUrl = imgs[0]?.src || '';

    // Date (kept as the original string, not parsed, to avoid timezone bugs)
    const date = infoDivs[1]?.textContent.trim() || '';

    reactions.push({
      type: reactionType,
      username,
      userProfileUrl,
      userAvatarUrl,
      date
    });

    summary[reactionType] = (summary[reactionType] || 0) + 1;
  });

  return finish({
    articleUrl,
    articleSlug,
    extractedAt: new Date().toISOString(),
    totalReactions: reactions.length,
    reactions,
    summary
  });
})();
```

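The URL and slug derivation at the top of the script can be tried outside the browser. The following is a minimal sketch, not code from this PR: `deriveArticleInfo` is a hypothetical helper, the sample URL is made up, and it relies only on the standard `URL` class. It strips just the trailing `/stats` segment before taking the last path segment as the slug.

```javascript
// Hypothetical helper (not part of the PR): derives the article URL and slug
// from a /stats URL string instead of reading window.location.
function deriveArticleInfo(href) {
  const url = new URL(href);
  // Remove only a trailing "/stats" so other occurrences in the path survive.
  const articlePath = url.pathname.replace(/\/stats$/, '');
  return {
    articleUrl: url.origin + articlePath,
    // Last non-empty path segment is treated as the slug.
    articleSlug: articlePath.split('/').filter(Boolean).pop(),
  };
}

// Made-up example URL, for illustration only.
const info = deriveArticleInfo('https://dev.to/someuser/my-article-1a2b/stats');
console.log(info.articleUrl);
console.log(info.articleSlug);
```

Taking the URL as a parameter keeps the logic testable in Node, where `window` does not exist.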
## Edge cases covered

- **Article with no reactions or no "Reactions History" section**: the script
  still builds the structured JSON, with `totalReactions: 0`, `reactions: []`,
  and `summary: {}`; when the section is missing it also logs a warning.
- **Automatic copy fails** (e.g. browser focus is on the DevTools panel, not
  the page): the script tries a fallback via `execCommand('copy')` before
  giving up; if that also fails, the full JSON printed to the console can be
  copied manually.
- **Wrong page**: if the script is run outside a URL ending in `/stats`, it
  logs an explanatory error and stops.

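The empty-result case can be illustrated without a DOM. This is a rough sketch under stated assumptions: `buildResult` is a hypothetical helper that mirrors the object shape the script passes to `finish`, and the URL and slug passed in are placeholders.

```javascript
// Hypothetical helper (not part of the PR): assembles the export object from
// reactions that have already been extracted.
function buildResult(articleUrl, articleSlug, reactions) {
  const summary = {};
  // Count reactions per type, as the script does while walking the entries.
  for (const { type } of reactions) {
    summary[type] = (summary[type] || 0) + 1;
  }
  return {
    articleUrl,
    articleSlug,
    extractedAt: new Date().toISOString(),
    totalReactions: reactions.length,
    reactions,
    summary,
  };
}

// With no reactions the structure is still complete, just empty.
const empty = buildResult('https://dev.to/someuser/some-article', 'some-article', []);
console.log(empty.totalReactions, empty.reactions, empty.summary);
```

Returning the same shape in every case means a consumer never has to special-case a missing `reactions` or `summary` key.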
## Expected output

```json
{
  "articleUrl": "https://dev.to/user/article-slug",
  "articleSlug": "article-slug",
  "extractedAt": "2026-06-27T00:00:00.000Z",
  "totalReactions": 6,
  "reactions": [
    {
      "type": "like",
      "username": "Some User",
      "userProfileUrl": "https://dev.to/someuser",
      "userAvatarUrl": "https://media2.dev.to/dynamic/image/...",
      "date": "Jun 27"
    }
  ],
  "summary": {
    "like": 2,
    "fire": 1,
    "unicorn": 1
  }
}
```
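
Whoever pastes the copied JSON elsewhere may want to sanity-check it first. The sketch below assumes a hypothetical `checkExport` helper and made-up data. The two invariants follow from how the script builds the object: `totalReactions` is `reactions.length`, and `summary` counts each reaction exactly once. The sample above is abbreviated (one reaction listed), so it would not pass these checks as written.

```javascript
// Hypothetical consumer-side check (not part of the PR).
function checkExport(data) {
  const problems = [];
  if (data.totalReactions !== data.reactions.length) {
    problems.push('totalReactions does not match reactions.length');
  }
  // Every reaction is tallied once, so the summary counts must add up.
  const counted = Object.values(data.summary).reduce((a, b) => a + b, 0);
  if (counted !== data.reactions.length) {
    problems.push('summary counts do not add up to reactions.length');
  }
  return problems;
}

// Made-up export, round-tripped through JSON as it would be after a paste.
const pasted = JSON.stringify({
  totalReactions: 2,
  reactions: [{ type: 'like' }, { type: 'fire' }],
  summary: { like: 1, fire: 1 },
});
console.log(checkExport(JSON.parse(pasted)));
```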
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Incomplete page validation.
`endsWith('/stats')` allows any domain. Add `window.location.hostname.includes('dev.to')` to ensure this runs only on dev.to.
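
One possible shape for that guard, as a sketch rather than the PR's code: `isDevtoStatsPage` is a hypothetical helper, and it assumes the site is served from the bare `dev.to` hostname. It uses an exact hostname comparison because `includes('dev.to')` would also accept hosts such as `dev.to.example.com`.

```javascript
// Hypothetical guard (not part of the PR): true only for /stats pages on dev.to.
// Assumption: the site is served from the exact hostname "dev.to".
function isDevtoStatsPage(href) {
  const { hostname, pathname } = new URL(href);
  // Exact match: a substring check would let look-alike hosts through.
  return hostname === 'dev.to' && pathname.endsWith('/stats');
}

console.log(isDevtoStatsPage('https://dev.to/someuser/some-article/stats'));
console.log(isDevtoStatsPage('https://dev.to.example.com/someuser/some-article/stats'));
```

In the browser the same check would take `window.location.href` as its argument.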