|
I tried to fix this more generally in #757 to improve count performance everywhere. If you want to check whether it fixes your issues, there are debs/rpms to test here: https://github.com/furlongm/patchman/releases/tag/v4.0.10-dev1 Note that it will run DB migrations, so either run it against a test/backup DB or back up your current DB before installing. |
|
Thank you. I confirm it works fine in our setup with the version you linked, so I am dropping this PR. |
|
Thanks for confirming. Here is another prerelease with even more SQL optimizations and a few bug fixes. Let me know if you have any issues with it: https://github.com/furlongm/patchman/releases/tag/v4.0.10-dev3 |
|
And another release (some more optimizations): https://github.com/furlongm/patchman/releases/tag/v4.0.10-dev4 |
|
Sorry, I only saw yesterday evening that you had replied further in this thread :/ With the latest stable version, a few reports fail to be processed, but I cannot get any logs or details on what went wrong. And we have poor performance in
I applied a patch from ChatGPT, but I am not completely sure it is correct:
index 2d8f4f1..9fbc123 100644
--- a/hosts/models.py
+++ b/hosts/models.py
@@ -163,21 +163,45 @@ class Host(models.Model):
     def get_host_repo_packages(self):
-        if self.host_repos_only:
-            hostrepos_q = Q(mirror__repo__in=self.repos.all(),
-                            mirror__enabled=True,
-                            mirror__repo__enabled=True,
-                            mirror__repo__hostrepo__enabled=True)
-        else:
-            hostrepos_q = \
-                Q(mirror__repo__osrelease__osvariant__host=self,
-                  mirror__repo__arch=self.arch,
-                  mirror__enabled=True,
-                  mirror__repo__enabled=True) | \
-                Q(mirror__repo__in=self.repos.all(),
-                  mirror__enabled=True,
-                  mirror__repo__enabled=True)
-        return Package.objects.select_related().filter(hostrepos_q).distinct()
+        """
+        Return packages available in repositories assigned to this host.
+
+        This implementation avoids the very expensive multi-table JOIN +
+        DISTINCT used previously, which could take >100s on large repos.
+        """
+
+        # Determine repository IDs enabled for this host
+        if self.host_repos_only:
+            repo_ids = (
+                self.repos
+                .filter(
+                    hostrepo__enabled=True,
+                    enabled=True,
+                    mirror__enabled=True,
+                )
+                .values_list("id", flat=True)
+            )
+        else:
+            repo_ids = Repository.objects.filter(
+                Q(osrelease__osvariant__host=self, arch=self.arch) |
+                Q(hostrepo__host=self),
+                enabled=True,
+            ).values_list("id", flat=True)
+
+        # Restrict to package names already installed on the host
+        # This dramatically reduces the dataset size.
+        host_package_names = self.packages.values_list("name_id", flat=True)
+
+        return (
+            Package.objects
+            .filter(
+                mirrorpackage__mirror__repo_id__in=repo_ids,
+                mirrorpackage__mirror__enabled=True,
+                name_id__in=host_package_names,
+            )
+            .only(
+                "id", "name_id", "arch_id", "version", "release",
+                "epoch", "packagetype", "category_id"
+            )
+        )
But it clearly targets the slow query I had. |
We had a crash when trying to access the Hosts page on Patchman 4 installed with MySQL: MySQL was trying to write more than 10 GB of temporary tables to disk, filling up /tmp and causing the query to fail for lack of free space.
This patch is courtesy of ChatGPT (with minor edits), but it seems correct judging from the results I get when running it on our Patchman instance. Performance also seems decent (page load under 2 seconds with 1000+ hosts).
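
For anyone following along, here is a minimal sketch of the general idea using Python's built-in sqlite3 module. The tables, columns, and function names are hypothetical and much simpler than Patchman's real schema, so treat it as an illustration, not as the actual query. It shows that joining packages through a many-to-many table returns one row per matching mirror, which is why the original query needed DISTINCT (on a large database, that deduplication step is the kind of work that can spill into temporary tables), while filtering with an IN subquery returns each package once.

```python
import sqlite3


def build_db():
    """Create a tiny in-memory database (hypothetical schema, not Patchman's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE package (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE mirror (id INTEGER PRIMARY KEY, repo_id INTEGER, enabled INTEGER);
        CREATE TABLE mirror_package (mirror_id INTEGER, package_id INTEGER);

        INSERT INTO package VALUES (1, 'bash'), (2, 'curl'), (3, 'vim');
        -- Repo 10 has two mirrors, repo 20 has one.
        INSERT INTO mirror VALUES (1, 10, 1), (2, 10, 1), (3, 20, 1);
        -- 'bash' is present on both mirrors of repo 10.
        INSERT INTO mirror_package VALUES (1, 1), (2, 1), (1, 2), (3, 3);
        """
    )
    return conn


def packages_via_join(conn, repo_id, distinct):
    """Select package ids by joining through the many-to-many table."""
    select = "SELECT DISTINCT p.id" if distinct else "SELECT p.id"
    sql = (
        select
        + " FROM package p"
        " JOIN mirror_package mp ON mp.package_id = p.id"
        " JOIN mirror m ON m.id = mp.mirror_id"
        " WHERE m.repo_id = ? AND m.enabled = 1"
        " ORDER BY p.id"
    )
    return [row[0] for row in conn.execute(sql, (repo_id,))]


def packages_via_subquery(conn, repo_id):
    """Select package ids with an IN subquery; no DISTINCT is needed."""
    sql = (
        "SELECT p.id FROM package p"
        " WHERE p.id IN ("
        "   SELECT mp.package_id FROM mirror_package mp"
        "   JOIN mirror m ON m.id = mp.mirror_id"
        "   WHERE m.repo_id = ? AND m.enabled = 1"
        " )"
        " ORDER BY p.id"
    )
    return [row[0] for row in conn.execute(sql, (repo_id,))]


if __name__ == "__main__":
    conn = build_db()
    print("join, no DISTINCT:", packages_via_join(conn, 10, distinct=False))
    print("join + DISTINCT:  ", packages_via_join(conn, 10, distinct=True))
    print("IN subquery:      ", packages_via_subquery(conn, 10))
```

In this toy data the plain join lists the package that sits on two mirrors twice, whereas the DISTINCT and IN-subquery forms each list it once. Whether a given ORM filter compiles to a join or to a subquery depends on how it is written, so it is worth checking the generated SQL (for example with `str(queryset.query)` in Django) before relying on either shape.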