GitHub Search App with Parallel and LanceDB#19

Open
elviskahoro wants to merge 2 commits into main from elvis/github-search

elviskahoro (Collaborator) commented Apr 26, 2026

Summary

TL;DR: githubsearch.app

…rate descriptions with Parallel and load them into LanceDB for Vector Search
elviskahoro changed the title from "Limit BigQuery results to single top repository per month" to "GitHub Search App with Parallel and LanceDB" Apr 26, 2026
This change adds a LIMIT 1 clause to the BigQuery query in the repos_with_stars resource. This optimization ensures that only the repository with the highest star count is retrieved for each month, rather than processing all repositories above the minimum star threshold.

Benefits:
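- Reduces the data returned by BigQuery and transferred to the pipeline (LIMIT alone does not reduce the bytes a query scans, so on-demand query cost may be unchanged)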
- Improves pipeline performance by fetching fewer rows per month
- Simplifies downstream processing to focus on top-performing repositories
- Maintains accurate star count metrics for the most-starred repo per month

The query now returns a single row per month sorted by star count in descending order, providing a cleaner signal for tracking repository popularity trends.
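As an illustration of the change described above, here is a minimal sketch of what a per-month query with `LIMIT 1` might look like. It is not the code from this PR: the helper name `build_top_repo_query`, the `min_stars` parameter, the `githubarchive.month.YYYYMM` table naming, and the use of `WatchEvent` rows as star counts are all assumptions made for the example.

```python
def build_top_repo_query(year: int, month: int, min_stars: int) -> str:
    """Build SQL that returns the single most-starred repository for one month.

    Hypothetical helper: the table name and column names below are assumed,
    not taken from the repos_with_stars resource in this PR.
    """
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    if min_stars < 0:
        raise ValueError(f"min_stars must be non-negative, got {min_stars}")

    # Assumed monthly table naming, e.g. githubarchive.month.202403
    table = f"githubarchive.month.{year:04d}{month:02d}"

    return (
        "SELECT repo.name AS repo_name, COUNT(*) AS stars\n"
        f"FROM `{table}`\n"
        "WHERE type = 'WatchEvent'\n"
        "GROUP BY repo_name\n"
        # Keep only repositories at or above the minimum star threshold.
        f"HAVING COUNT(*) >= {int(min_stars)}\n"
        # Sort before limiting so the one surviving row is the top repository.
        "ORDER BY stars DESC\n"
        "LIMIT 1"
    )


if __name__ == "__main__":
    print(build_top_repo_query(2024, 3, 100))
```

The ordering matters here: `LIMIT 1` only yields the most-starred repository because it follows `ORDER BY stars DESC`. Without the sort, the single returned row would be arbitrary.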
elviskahoro force-pushed the elvis/github-search branch from 92136d8 to 594bcda on April 26, 2026 23:43
2 participants