feat(CLAUDE.md): created for opinions and oral_args submodules #1862

Open: grossir wants to merge 1 commit into main from 1793-create-opinion-claude-md

grossir commented Mar 19, 2026

#1793

Juriscraper has a lot of quirks that confuse both human users and agents.

This CLAUDE.md attempts to create guidelines for the development process (issue creation, development, code review). In a sense, it could be used by humans too.

I have tried to keep it concise and limit it to non-obvious points that we value in scraper development.

grossir moved this to PRs to Review in Sprint (Case Law) Mar 19, 2026
grossir assigned grossir and Luis-manzur and unassigned grossir Mar 19, 2026
grossir requested a review from Luis-manzur March 19, 2026 01:36

grossir commented Mar 19, 2026

I think it needs more guidance on backscraping, `download_backwards`, etc. The agent got a little confused today while implementing a backscraper.
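
For context on the backscraping point, one common pattern is to split the requested date range into fixed-size windows and query the source once per window. Below is a minimal sketch of only that splitting step, assuming inclusive windows. `date_windows` and `days_interval` are hypothetical names used for illustration; this is not Juriscraper's API, and the `download_backwards` hook itself is not reproduced here.

```python
from datetime import date, timedelta


def date_windows(start: date, end: date, days_interval: int = 30):
    """Split the inclusive range [start, end] into consecutive windows.

    Hypothetical helper for illustration only; not part of Juriscraper.
    """
    if days_interval < 1:
        raise ValueError("days_interval must be at least 1")

    windows = []
    cursor = start
    while cursor <= end:
        # Each window covers days_interval days, clipped to the range end.
        window_end = min(cursor + timedelta(days=days_interval - 1), end)
        windows.append((cursor, window_end))
        # The next window starts the day after this one ends (no overlap).
        cursor = window_end + timedelta(days=1)
    return windows


if __name__ == "__main__":
    for window_start, window_end in date_windows(date(2024, 1, 1), date(2024, 3, 15)):
        print(window_start, window_end)
```

Non-overlapping windows avoid fetching the same record twice; a source whose search endpoint treats the end date as exclusive would need the boundaries adjusted.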

Luis-manzur left a comment

small changes

- Sentry links let reviewers count occurrences, see affected courts, and trace exact failing records. Any proposed fix should address those specific occurrences.
- Show data examples (screenshots, SQL counts, specific records) to illustrate the problem
- Reviewers need real edge cases to verify fixes against. Without examples, PRs get reviewed blindly and bugs slip through.
- If the issue is getting to big, use the `<details><summary>Title of collapsible section</summary>A lot of content</details>` tags to organize extra info.

suggestion

Suggested change
- - If the issue is getting to big, use the `<details><summary>Title of collapsible section</summary>A lot of content</details>` tags to organize extra info.
+ - If the issue is getting too big, use the `<details><summary>Title of collapsible section</summary>A lot of content</details>` tags to organize extra info.

- Use `date_filed_is_approximate = True` when exact dates aren't available; don't set unrealistic dates
- `"Unknown"` status when status can't be determined, not `"Published"`
- Try to parse the most amount of data possible. If a source shows the author of an opinion, the disposition, a summary, etc; try to pick it up. If there is some interesting field you don't know where to fit in the accepted return keys, highlight it to the user so they can research.
- If a source shows different opinion types for the same case, consider using `ClusterSite` as a base class to group those together.

No mention of `ClusterSite` mechanics: the doc suggests considering `ClusterSite` but doesn't explain how it works or when to use it instead of `OpinionSiteLinear`.
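
On the first two bullets of this hunk (approximate dates and `"Unknown"` status), a minimal sketch of what the guidance could look like when building a case record. It assumes a plain dict per case; `build_case`, its parameters, the `"name"`/`"url"`/`"date"`/`"status"` key names, and the ISO date format are illustrative assumptions, and only `date_filed_is_approximate` and the `"Unknown"` value come from the text above. It does not cover `ClusterSite`.

```python
from datetime import date
from typing import Optional


def build_case(
    name: str,
    url: str,
    filed: Optional[date] = None,
    fallback: Optional[date] = None,
    status: Optional[str] = None,
) -> dict:
    """Build one case record (hypothetical helper, not Juriscraper's API).

    filed: exact filing date, if the source shows one.
    fallback: a realistic stand-in date (e.g. the first day of a known month).
    """
    if filed is None and fallback is None:
        raise ValueError("need an exact date or a realistic fallback date")

    case_date = filed if filed is not None else fallback
    return {
        "name": name,
        "url": url,
        "date": case_date.isoformat(),
        # Flag the date instead of presenting a guess as exact.
        "date_filed_is_approximate": filed is None,
        # Do not default to "Published" when the source is silent.
        "status": status or "Unknown",
    }


if __name__ == "__main__":
    print(build_case("Doe v. Roe", "https://example.com/1.pdf", fallback=date(2024, 5, 1)))
```

Requiring a fallback rather than inventing one keeps the choice of stand-in date visible at the call site, where a reviewer can check that it is realistic.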

- Real data captures encoding quirks, edge cases, and unexpected fields that synthetic data misses.
- Auto-generate example compare files via: `python -m unittest -v tests.local.test_ScraperExampleTest juriscraper.opinions`
- Manually created compare files drift from actual scraper output. The test command generates them from saved responses.
- Test both scraper AND backscraper with `sample_caller`.

Consider adding an example command here; something like `python sample_caller.py -c juriscraper.opinions.united_states.state.conn` would help.

Luis-manzur assigned grossir and unassigned Luis-manzur Mar 24, 2026
Luis-manzur moved this from PRs to Review to PRs to improve in Sprint (Case Law) Mar 24, 2026