Skip to content

feat: allow empty community_id for check config#55

Open
edivalentinitu wants to merge 1 commit into
inveniosoftware:masterfrom
edivalentinitu:null-comm-id
Open

feat: allow empty community_id for check config#55
edivalentinitu wants to merge 1 commit into
inveniosoftware:masterfrom
edivalentinitu:null-comm-id

Conversation

@edivalentinitu
Copy link
Copy Markdown

@edivalentinitu edivalentinitu commented Mar 17, 2026

This PR should address the need for global checks by:

  • allowing null community_id value in checks_config table -> this will indicate a global check
  • getting the check configs will always query the global checks before trying to search for the community defined checks

Solves the first part of this issue: #47

@edivalentinitu edivalentinitu marked this pull request as ready for review March 17, 2026 10:39
Comment thread invenio_checks/api.py Outdated
Comment on lines +37 to +49
generic_configs = CheckConfig.query.filter(
CheckConfig.community_id.is_(None),
CheckConfig.enabled.is_(True),
).all()

all_configs = generic_configs
if community_ids and "None" not in community_ids:
all_configs.extend(
CheckConfig.query.filter(
CheckConfig.community_id.in_(community_ids),
CheckConfig.enabled.is_(True),
).all()
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that having two separate DB queries is a bit heavy; it locks us out of any query optimizations the DB could do on its own.
Could we use a single query here?

Comment thread invenio_checks/alembic/1773672705_allow_empty_community_id.py Outdated
@edivalentinitu edivalentinitu force-pushed the null-comm-id branch 2 times, most recently from 86670e4 to 2d5864d Compare March 23, 2026 08:19
Copy link
Copy Markdown

@max-moser max-moser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Comment thread invenio_checks/api.py Outdated
"""
conditions = [CheckConfig.community_id.is_(None)]

if community_ids and "None" not in community_ids:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: when can "None" be in community_ids and is it intended that if something like community_ids=["None", "id1", "id2"] happens that conditions contains only the value from line 37? if that is intended then why?

Copy link
Copy Markdown
Author

@edivalentinitu edivalentinitu Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if community_ids=["None", "id1", "id2"] happens, this will lead to an error when searching with in_ query. do we want that error to be raised or to still return only the "generic" checks? normally "None" should not be in that array.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I missed that!
Can it ever happen under normal circumstances that the string "None" gets entered here?
In my dev instance all the IDs are UUIDs:

In [8]: [current_communities.service.record_cls.get_record(c.id).id for c in current_communities.service.record_cls.model_cls.query.all() if not c.is_deleted]
Out[8]: 
[UUID('f0012a50-a081-4ebc-9310-0fa8bc5cd669'),
 UUID('aa28abee-6f84-4ad5-9d38-03fbc4ce3aff'),
 UUID('bb03fd54-7db8-4367-bf2e-0f80f770da5a'),
 UUID('e4248c1a-4e1f-4441-909b-2c8d211e1a83'),
 UUID('0d9cee69-ef6f-4365-acd0-048113b69ef0'),
 UUID('cef97299-8f0e-4498-b8c6-a7a7205a6a8d'),
 UUID('81800efa-18fd-428e-a873-fbb509fd29d8'),
 UUID('3fbed832-1720-488a-9fe8-2096a7aa831f'),
 UUID('8ae7fbc4-9bb1-4437-859f-2d14186ee4d5'),
 UUID('83f6fb9b-6184-41f5-a2af-caf0e73a596f')]

Even if community names or slugs can be entered here, would we actually want to handle that value specifically?

Copy link
Copy Markdown
Author

@edivalentinitu edivalentinitu Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on a second thought, i remembered that that "None" check was a relic from an early wip-implementation of this feature - this shouldn't be necessary now as community_ids should either be

  • None (null) or empty array OR
  • an array of UUIDs

anything other than that will result in an error during the query which is actually the expected behaviour, so i removed that check. but i am open for feedback if this can be improved


"""Allow empty community id."""

import sqlalchemy as sa
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
import sqlalchemy as sa

Unused import

Copy link
Copy Markdown

@max-moser max-moser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@max-moser max-moser requested a review from utnapischtim April 28, 2026 12:40
* alembic script to allow null community_id column
* update get_configs to always include generic configs
Comment thread invenio_checks/models.py
id = db.Column(UUIDType, primary_key=True, default=uuid.uuid4)
community_id = db.Column(
UUIDType, db.ForeignKey(CommunityMetadata.id), nullable=False
UUIDType, db.ForeignKey(CommunityMetadata.id), nullable=True
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
UUIDType, db.ForeignKey(CommunityMetadata.id), nullable=True
UUIDType, db.ForeignKey(CommunityMetadata.id), index=True, nullable=True

minor: e.g. on Zenodo we have about 2.5k rows in the table, so not really an issue yet, but I think it's one of the classic mistakes of assuming FKs are indexed (they're not! 🤯)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would need a change in the alembic script too or?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, we stumbled upon this because @carlinmack and @mairasalazar are doing some work on expanding checks to allow:

  1. running them asynchronously (cc @mesemus since you mentioned something about this in a telecon presentation)
  2. running them against other types of entities, e.g. communities

We were thinking of doing a breakout room in the next telecon (05/05) to discuss this further.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

created an issue here #63

@utnapischtim utnapischtim mentioned this pull request May 4, 2026
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants