feat(gitter): git commit graph and patch ID caching #4728

Merged
Ly-Joey merged 42 commits into google:master from Ly-Joey:feat-gitter on Feb 24, 2026

Conversation

@Ly-Joey Ly-Joey commented Feb 3, 2026

Contributor

Building on top of #4508 & co.

Key changes:

  • Added foundational structures to save commit details to disk
  • The full git history is now parsed into an in-memory graph for commit traversal
  • Parallelised patch ID calculation. Patch IDs are loaded from the cache, so subsequent requests to the same repo only need to compute patch IDs for new commits.
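
To illustrate the caching idea in the last bullet, here is a minimal sketch, not the PR's actual code. `PatchIDCache`, `FillPatchIDs`, and the `compute` callback are hypothetical names, and the cache is an in-memory map standing in for whatever on-disk format the PR uses. It computes patch IDs only for commits missing from the cache, with a bounded number of concurrent workers.

```go
package main

import "sync"

// PatchIDCache is a hypothetical in-memory stand-in for the persisted cache;
// it maps commit hash -> patch ID.
type PatchIDCache struct {
	mu  sync.Mutex
	ids map[string]string
}

// NewPatchIDCache returns an empty cache.
func NewPatchIDCache() *PatchIDCache {
	return &PatchIDCache{ids: make(map[string]string)}
}

// Get returns the cached patch ID for a commit, if present.
func (c *PatchIDCache) Get(commit string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	id, ok := c.ids[commit]
	return id, ok
}

// Put stores the patch ID for a commit.
func (c *PatchIDCache) Put(commit, id string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.ids[commit] = id
}

// FillPatchIDs computes patch IDs only for commits that are not yet cached,
// running at most `workers` computations at once. It returns the number of
// commits that had to be computed.
func FillPatchIDs(cache *PatchIDCache, commits []string, workers int, compute func(string) string) int {
	if workers < 1 {
		workers = 1
	}
	// Collect the distinct commits that are missing from the cache.
	seen := make(map[string]bool)
	var missing []string
	for _, c := range commits {
		if _, ok := cache.Get(c); ok || seen[c] {
			continue
		}
		seen[c] = true
		missing = append(missing, c)
	}
	sem := make(chan struct{}, workers) // bounds concurrency
	var wg sync.WaitGroup
	for _, c := range missing {
		wg.Add(1)
		sem <- struct{}{}
		go func(commit string) {
			defer wg.Done()
			defer func() { <-sem }()
			cache.Put(commit, compute(commit))
		}(c)
	}
	wg.Wait()
	return len(missing)
}
```

Calling `FillPatchIDs` a second time with a longer commit list only invokes `compute` for the commits added since the first call, which is the behaviour the bullet describes for repeat requests to the same repo.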

API updates:

  • POST /cache: triggers a clone / fetch and stores the computed commit details
  • /getgit renamed to GET /git
    • [x] change the endpoint in Python
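
As a rough sketch of the route layout above (not the PR's handlers), the two endpoints could be wired up with the standard library as follows. The handler bodies are placeholders, and the `url` query parameter, `newGitterMux`, `requireMethod`, and `statusFor` are assumptions made for illustration.

```go
package main

import (
	"net/http"
	"net/http/httptest"
)

// requireMethod rejects requests that do not use the expected HTTP method.
func requireMethod(method string, next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != method {
			w.Header().Set("Allow", method)
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		next(w, r)
	}
}

// repoHandler is a placeholder: it only validates the (assumed) "url"
// parameter. A real handler would clone or fetch the repository here.
func repoHandler(w http.ResponseWriter, r *http.Request) {
	if r.URL.Query().Get("url") == "" {
		http.Error(w, "missing url parameter", http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// newGitterMux registers the two routes named in the PR description.
func newGitterMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/cache", requireMethod(http.MethodPost, repoHandler))
	mux.HandleFunc("/git", requireMethod(http.MethodGet, repoHandler))
	return mux
}

// statusFor sends one in-process request to h and returns the status code,
// which is handy for exercising the routes without opening a socket.
func statusFor(h http.Handler, method, target string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, target, nil))
	return rec.Code
}
```

With this layout, a request to the old `/getgit` path falls through to the mux's default 404, so callers that were not updated fail loudly rather than silently.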

WIP (To be added in follow-up PRs to keep this more manageable to review):

  • [ ] POST /affected-commits: accepts a list of events and returns the list of affected commits
    • /affected-commits should also accept a list of ranges as a valid request
    • Probably refactor the last-fetch-time check and refetch logic, because every operation relies on it.
  • [ ] Check if a repository exists (/git-good)
  • [ ] Resolve refs to commit hashes
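
The affected-commits walk is deferred to a follow-up PR, so the following is only a guess at its general shape, under a deliberately simplified assumption: a commit counts as affected if it is reachable from an "introduced" commit by following child edges without passing through the "fixed" commit. `CommitGraph`, `AddCommit`, and `Affected` are hypothetical names, and the real semantics (multiple events, merges, cherry-picks) are likely more involved.

```go
package main

import "sort"

// CommitGraph is a hypothetical in-memory graph keyed by commit hash.
// It stores parent -> children edges so history can be walked forward.
type CommitGraph struct {
	children map[string][]string
}

// NewCommitGraph returns an empty graph.
func NewCommitGraph() *CommitGraph {
	return &CommitGraph{children: make(map[string][]string)}
}

// AddCommit records a commit and links it to each of its parents.
func (g *CommitGraph) AddCommit(hash string, parents ...string) {
	if _, ok := g.children[hash]; !ok {
		g.children[hash] = nil
	}
	for _, p := range parents {
		g.children[p] = append(g.children[p], hash)
	}
}

// Affected returns, in sorted order, every commit reachable from
// `introduced` via child edges, stopping at (and excluding) `fixed`.
// This is a simplification chosen for illustration only.
func (g *CommitGraph) Affected(introduced, fixed string) []string {
	if _, ok := g.children[introduced]; !ok || introduced == fixed {
		return nil
	}
	seen := map[string]bool{introduced: true}
	queue := []string{introduced}
	var out []string
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		out = append(out, cur)
		for _, child := range g.children[cur] {
			if child == fixed || seen[child] {
				continue
			}
			seen[child] = true
			queue = append(queue, child)
		}
	}
	sort.Strings(out)
	return out
}
```

For a history `a -> b -> c -> d` with a side branch `b -> e`, `Affected("b", "d")` yields `b`, `c`, and `e`: the side branch never received the fix, so it stays in the result.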

Comment thread go/cmd/gitter/persistence.go Fixed
Comment thread go/cmd/gitter/persistence.go Dismissed
Comment thread go/cmd/gitter/gitter.go Fixed
Comment thread go/cmd/gitter/gitter.go Dismissed
@Ly-Joey Ly-Joey changed the title from "Gitter" to "feat(gitter): git commit graph and patch ID caching, affected commits walking" on Feb 9, 2026
Comment thread go/cmd/gitter/repository.go Dismissed
Comment thread go/cmd/gitter/persistence.go Dismissed
@another-rex

Contributor

/gemini review

We now have:
- mutex locking on patch-ID-related data in the repo struct
- repo-level RW locking
- singleflight at the operation level
- request-level concurrency control
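
Two of the layers listed above can be sketched with the standard library alone. This is an illustration rather than the PR's code: `flightGroup` is a hand-rolled stand-in for `golang.org/x/sync/singleflight` (used here only to avoid an external dependency), and `limiter`, `Waiting`, and the key names are hypothetical.

```go
package main

import "sync"

// call tracks one in-flight operation and how many callers are waiting on it.
type call struct {
	wg      sync.WaitGroup
	val     string
	waiters int
}

// flightGroup collapses concurrent calls with the same key into a single
// execution (operation-level deduplication).
type flightGroup struct {
	mu    sync.Mutex
	calls map[string]*call
}

// Do runs fn once per key at a time. Callers that arrive while a call is in
// flight wait for it and receive its result with shared == true.
func (g *flightGroup) Do(key string, fn func() string) (val string, shared bool) {
	g.mu.Lock()
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	if c, ok := g.calls[key]; ok {
		c.waiters++
		g.mu.Unlock()
		c.wg.Wait()
		return c.val, true
	}
	c := &call{}
	c.wg.Add(1)
	g.calls[key] = c
	g.mu.Unlock()

	c.val = fn()

	// Unregister the key before waking waiters so later callers start fresh.
	g.mu.Lock()
	delete(g.calls, key)
	g.mu.Unlock()
	c.wg.Done()
	return c.val, false
}

// Waiting reports how many callers are blocked on the in-flight call for key.
func (g *flightGroup) Waiting(key string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	if c, ok := g.calls[key]; ok {
		return c.waiters
	}
	return 0
}

// limiter caps how many requests are processed at once (request-level
// control). Its capacity is the channel's buffer size.
type limiter chan struct{}

// TryAcquire takes a slot if one is free and reports whether it succeeded.
func (l limiter) TryAcquire() bool {
	select {
	case l <- struct{}{}:
		return true
	default:
		return false
	}
}

// Release frees a slot previously taken with TryAcquire.
func (l limiter) Release() { <-l }
```

The point of stacking the layers is that they guard different things: the limiter bounds total work, the flight group avoids duplicate fetches of the same repo, and the locks (not shown) protect the shared repo state itself.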
@Ly-Joey Ly-Joey changed the title from "feat(gitter): git commit graph and patch ID caching, affected commits walking" to "feat(gitter): git commit graph and patch ID caching" on Feb 20, 2026
@Ly-Joey

Ly-Joey commented Feb 20, 2026

Contributor Author

To keep this PR moving along (and so I don't have to keep rebasing and resolving merge conflicts), I've decided to split the affected-commit logic (mainly graph walking and cherry-pick detection) into another branch and a future PR.

@Ly-Joey Ly-Joey requested a review from another-rex February 20, 2026 04:40
@Ly-Joey

Ly-Joey commented Feb 20, 2026

Contributor Author

/gemini review

@gemini-code-assist

This comment was marked as off-topic.

@Ly-Joey Ly-Joey marked this pull request as ready for review February 20, 2026 04:42
@gemini-code-assist

Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

Comment thread go/cmd/gitter/gitter.go Outdated
Comment thread go/cmd/gitter/gitter.go
another-rex previously approved these changes Feb 23, 2026
@michaelkedar

Member

Just a note: if the Go importer gets merged first, change the /getgit endpoint here too.

@Ly-Joey Ly-Joey requested a review from another-rex February 23, 2026 03:54

@another-rex another-rex left a comment

Contributor

Nice!

@Ly-Joey Ly-Joey enabled auto-merge (squash) February 24, 2026 04:36
@Ly-Joey Ly-Joey merged commit d6f19ae into google:master Feb 24, 2026
19 checks passed
tymzd pushed a commit to tymzd/osv.dev that referenced this pull request Apr 13, 2026

4 participants