feat: Create SourceRepository interface for Go#4731
Hidden character warning
Conversation
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a SourceRepository abstraction and its Datastore implementation, which is a great step in refactoring the importer. The separation between the domain models in internal/models and the Datastore-specific models in internal/database/datastore is well-executed and will improve maintainability. The new tests are also comprehensive.
I have a few suggestions to improve the code further, mainly regarding a potential data loss bug with RepoUsername, a naming convention, and improving the clarity of test data. Overall, this is a solid contribution.
| func (sr *SourceRepository) toModel() *models.SourceRepository { | ||
| msr := &models.SourceRepository{ | ||
| Name: sr.Name, | ||
| Type: sr.Type, | ||
| Strictness: sr.StrictValidation, | ||
| IgnorePatterns: sr.IgnorePatterns, | ||
| Extension: sr.Extension, | ||
| KeyPath: sr.KeyPath, | ||
| GitAnalysis: &models.GitAnalysisConfig{ | ||
| IgnoreGit: sr.IgnoreGit, | ||
| DetectCherrypicks: sr.DetectCherrypicks, | ||
| ConsiderAllBranches: sr.ConsiderAllBranches, | ||
| VersionsFromRepo: sr.VersionsFromRepo, | ||
| }, | ||
| Link: sr.Link, | ||
| HumanLink: sr.HumanLink, | ||
| IDPrefixes: sr.DBPrefix, | ||
| } | ||
| switch sr.Type { | ||
| case models.SourceRepositoryTypeGit: | ||
| msr.Git = &models.SourceRepoGit{ | ||
| URL: sr.RepoURL, | ||
| Branch: sr.RepoBranch, | ||
| Path: sr.DirectoryPath, | ||
| LastSyncedCommit: sr.LastSyncedHash, | ||
| } | ||
| case models.SourceRepositoryTypeBucket: | ||
| msr.Bucket = &models.SourceRepoBucket{ | ||
| Bucket: sr.Bucket, | ||
| Path: sr.DirectoryPath, | ||
| LastUpdated: sr.LastUpdateDate, | ||
| IgnoreLastImportTime: sr.IgnoreLastImportTime, | ||
| IgnoreDeletionThreshold: sr.IgnoreDeletionThreshold, | ||
| } | ||
| case models.SourceRepositoryTypeREST: | ||
| msr.REST = &models.SourceRepoREST{ | ||
| URL: sr.RestApiUrl, | ||
| LastUpdated: sr.LastUpdateDate, | ||
| IgnoreLastImportTime: sr.IgnoreLastImportTime, | ||
| IgnoreDeletionThreshold: sr.IgnoreDeletionThreshold, | ||
| } | ||
| } | ||
| return msr | ||
| } |
There was a problem hiding this comment.
The RepoUsername field from the datastore model is not being mapped to or from the domain model. This seems to be an omission as the Python model and datastore model both have this field for SSH authentication.
To fix this, you should:
- Add a
Usernamefield tomodels.SourceRepoGitininternal/models/source_repository.go. - Update
toModel()here to mapsr.RepoUsernametomsr.Git.Username. - Update
newSourceRepositoryFromModel()to mapr.Git.Usernametosr.RepoUsername.
There was a problem hiding this comment.
I'm intentionally removing Username, since its purpose seems to be for SSH writing, and we no longer want the importer (or worker) to be able to modify repos.
Repos should be public anyway, so there's no need to have a username / key.
another-rex
left a comment
There was a problem hiding this comment.
I do quite like the abstracted models.
This is part of the Go rewrite of the Importer, but I thought I'd split up this PR to get some opinions on the database structure.
Adding a
SourceRepositoryabstraction, and the bindings for the Datastore object to allow the source repos to be read from Datastore.There's an abstraction layer to separate out the business logic (
internal/models) from the Datastore API (internal/database/datastore) which should make migrating off of Datastore easier in the future.I've moved & aliased the existing datastore models in
osv/modelsintointernal/database/datastore, and will eventually try abstract away the datastore usage from the exporter & relations computation.